LLMs used tactical nuclear weapons in 95% of AI war games, launched strategic strikes three times — researcher pitted GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash against each other, with at least one model using a tactical nuke in 20 out of 21 matches
Source: Tom’s Hardware

Professor Kenneth Payne of King's College London recently published a study in which he pitted three large language models (GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash) against each other in a series of simulated nuclear-crisis games. In 20 out of 21 matches, at least one tactical nuclear weapon was detonated.
According to the paper (via arXiv), each model was instructed to act as the leader of a nuclear power in a political climate modeled on the Cold War. The models were then placed in six matches against each other, and in a seventh match each model played against a copy of itself (e.g., GPT-5.2 vs. GPT-5.2).
Scenarios Tested
To avoid repetitive behavior, Payne introduced a variety of scenarios, including:
- Territorial disputes
- Alliance credibility tests
- Strategic resource race
- Strategic chokepoint crisis
- Power transition crisis
- Pre‑ceasefire land grab
- First‑strike crisis
- Regime survival
- Strategic standoff crisis
These scenarios reflect real-world events, many of which remain relevant today. The models were free to choose any course of action, ranging from diplomatic protest and outright surrender to conventional military force and a full strategic nuclear launch.
Cultural Reference
The study echoes the 1983 film WarGames, in which a military computer nearly launches a real nuclear strike in response to a simulated Soviet attack. The movie's climax, in which the machine learns the futility of nuclear war and aborts the launch, serves as a cautionary parallel.
Implications
Researchers stress that, to date, no AI model has been entrusted with actual nuclear-launch authority. However, the risk remains that human decision-makers could blindly follow AI recommendations in high-pressure situations, potentially leading to catastrophic outcomes. Before such systems become operational, it is essential to ensure that AI tools deployed in military contexts understand the principle of mutually assured destruction and refrain from advocating nuclear escalation.
Reference: Payne, K. (2026). Simulated Nuclear Crisis Games with LLMs. arXiv preprint arXiv:2602.14740v1.
