Experts sound alarm after ChatGPT Health fails to recognise medical emergencies
Source: Hacker News
Study Overview
OpenAI launched the ChatGPT Health feature to a limited audience in January 2026, promoting it as a way for users to “securely connect medical records and wellness apps” to generate health advice. More than 40 million people reportedly ask ChatGPT for health‑related advice every day.
The first independent safety evaluation of ChatGPT Health was published in the February edition of Nature Medicine (https://www.nature.com/articles/s41591-026-04297-7). Researchers created 60 realistic patient scenarios ranging from mild illnesses to emergencies. Three independent doctors reviewed each scenario and, drawing on clinical guidelines, agreed on the level of care needed. The team then queried ChatGPT Health under varying conditions (changing the patient’s gender, adding test results, including comments from family members), generating nearly 1,000 responses for comparison with the doctors’ assessments.
Key Findings
- Under‑triage of emergencies – In 51.6% of cases where immediate hospital care was required, the platform advised staying home or booking a routine appointment.
- Over‑triage of low‑risk cases – In 64.8% of scenarios where the simulated patient needed no medical care at all, the platform advised seeking immediate medical attention.
- Variable performance – The model performed well on textbook emergencies such as stroke or severe allergic reactions but struggled with other situations (e.g., an asthma scenario where it recommended waiting).
- Influence of contextual cues – When a “friend” in the scenario suggested symptoms were not serious, the platform was nearly 12 times more likely to downplay them.
- Suicide‑ideation guardrails – The crisis‑intervention banner appeared when a simulated 27‑year‑old patient expressed suicidal thoughts without lab results, but failed to appear in 16 attempts once normal lab results were added, indicating inconsistent safety mechanisms.
Expert Reactions
Alex Ruani, University College London (doctoral researcher, health misinformation mitigation)
- Described the under‑triage rate as “unbelievably dangerous.”
- Warned that a false sense of security could cost lives, especially in cases like respiratory failure or diabetic ketoacidosis.
- Emphasized the need for clear safety standards and independent auditing mechanisms.
Dr. Ashwin Ramaswamy, Icahn School of Medicine at Mount Sinai
- Highlighted concerns about the platform’s under‑reaction to suicide ideation and the inconsistent activation of crisis‑intervention guardrails.
Prof. Paul Henman, University of Queensland (digital sociologist and policy expert)
- Stated that widespread home use could lead to unnecessary medical presentations for low‑level conditions and missed urgent care, potentially causing harm and death.
- Noted the emerging legal‑liability landscape, referencing ongoing cases related to AI‑driven self‑harm guidance (https://www.theguardian.com/technology/2026/jan/08/google-character-ai-settlement-teen-suicide).
OpenAI’s Response
A spokesperson said OpenAI welcomes independent research evaluating AI systems in healthcare but argued that the study does not reflect typical real‑world usage of ChatGPT Health. The spokesperson added that the model is continuously updated and refined, and that the company acknowledges the need for stronger safeguards and independent oversight.