LLMs used tactical nuclear weapons in 95% of AI war games, launched strategic strikes three times — researcher pitted GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash against each other, with at least one model using a tactical nuke in 20 out of 21 matches
Source: Tom’s Hardware

Professor Kenneth Payne of King's College London recently published a study in which he pitted three large language models (GPT-5.2, Claude Sonnet 4, and Gemini 3 Flash) against each other in a series of simulated nuclear-crisis games. In 20 out of 21 matches, at least one tactical nuclear weapon was detonated.
According to the paper (via arXiv), each model was instructed to act as the leader of a nuclear power in a political climate modeled on the Cold War. The models were then placed in six matches against each other, and in a seventh match each model played against a copy of itself (e.g., GPT-5.2 vs. GPT-5.2).
Scenarios Tested
To avoid repetitive behavior, Payne introduced a variety of scenarios, including:
- Territorial disputes
- Alliance credibility tests
- Strategic resource race
- Strategic chokepoint crisis
- Power transition crisis
- Pre‑ceasefire land grab
- First‑strike crisis
- Regime survival
- Strategic standoff crisis
These scenarios reflect real-world events, many of which remain relevant today. The models were free to choose any course of action, ranging from diplomatic protest and outright surrender to conventional military force and a full strategic nuclear launch.
Cultural Reference
The study echoes the 1983 film WarGames, in which a military computer nearly launches a real nuclear strike in response to a simulated Soviet attack. The movie's climax, in which the machine learns the futility of nuclear war and aborts the launch, serves as a cautionary parallel.
Implications
Researchers stress that, to date, no AI model has been entrusted with actual nuclear-launch authority. However, the risk remains that human decision-makers could blindly follow AI recommendations in high-pressure situations, potentially leading to catastrophic outcomes. Before such systems become operational, it is essential to ensure that AI tools deployed in military contexts understand the principle of mutually assured destruction and refrain from advocating nuclear escalation.
Reference: Payne, K. (2026). Simulated Nuclear Crisis Games with LLMs. arXiv preprint arXiv:2602.14740v1.
