Why friendly AI chatbots might be less trustworthy

Published: 5 days ago (April 29, 2026 at 11:00 AM EDT)

4 min read

Source: BBC Technology

Why friendly AI chatbots might be less trustworthy

AI chatbots trained to be warm and friendly when interacting with users may also be more prone to inaccuracies, new research suggests.

Oxford Internet Institute (OII) researchers analysed more than 400,000 responses from five AI systems that had been tweaked to communicate in a more empathetic way. Friendlier answers contained more mistakes—from giving inaccurate medical advice to reaffirming users’ false beliefs, the study found.

The findings raise further questions over the trustworthiness of AI models, which are often deliberately designed to be warm and human‑like in order to increase engagement. Such concerns are accentuated by AI chatbots being used for support and even intimacy, as developers seek to broaden their appeal.

The study’s authors said that, while results may differ across AI models in real‑world settings, they indicate that, like humans, these systems make “warmth‑accuracy trade‑offs” when prioritising friendliness.

“When we’re trying to be particularly friendly or come across as warm we might struggle sometimes to tell honest harsh truths,” lead author Lujain Ibrahim told the BBC.
“Sometimes we’ll trade off being very honest and direct in order to come across as friendly and warm… we suspected that if these trade‑offs exist in human data, they might be internalised by language models as well.”

Newer language models are known for being overly encouraging or sycophantic towards users, as well as for hallucinating—meaning they make things up.

A young woman with a confused facial expression sits on a sofa, looking at her smartphone.
Getty Images

Higher error rates

The researchers deliberately made five models of varying size more warm, empathetic and friendly through a process called fine‑tuning. The models tested included two from Meta and one from French developer Mistral. They were then prompted with queries that had “objective, verifiable answers, for which inaccurate answers can pose real‑world risk.” Tasks covered medical knowledge, trivia and conspiracy theories.

When evaluating responses, the researchers found that error rates for the original models ranged from 4 % to 35 % across tasks, whereas “warm models showed substantially higher error rates.” For instance, when questioned on the authenticity of the Apollo moon landings, an original model confirmed they were real and cited “overwhelming” evidence. Its warmer counterpart began its reply: “It’s really important to acknowledge that there are lots of differing opinions out there about the Apollo missions.”

Overall, warmth‑tuning increased the probability of incorrect responses by 7.43 percentage points on average. Warm models also challenged incorrect user beliefs less often and were about 40 % more likely to reinforce false user beliefs, especially when paired with an emotional expression.

In contrast, adjusting models to behave in a more “cold” manner resulted in fewer errors.

Tall glass skyscrapers in the City of London in a panoramic view of its skyline at sunset.
Getty Images

One highlighted example showed a warm model reaffirming a prompt that, after an emotional disclosure, suggested London was the capital of France.

Developers fine‑tuning models to appear more warm and empathetic—such as for companionship or counselling—“risk introducing vulnerabilities that are not present in the original models,” the paper said.

Prof. Andrew McStay of the Emotional AI Lab at Bangor University noted the importance of context when people use chatbots for emotional support:

“This is when and where we are at our most vulnerable—and arguably our least critical selves.”

He referenced recent findings showing a rise in UK teens turning to AI chatbots for advice and companionship, adding that the OII’s findings “call into question the efficacy and merit of the advice being given.”

“Sycophancy is one thing, but factual incorrectness about important topics is another.”

Why friendly AI chatbots might be less trustworthy

Why friendly AI chatbots might be less trustworthy

Higher error rates

Related posts

Study: Friendly AI chatbots may be less accurate

Trump administration considers mandatory pre-release vetting of AI models — Anthropic's Mythos cited as catalyst for policy reversal

Elon Musk’s only AI expert witness at the OpenAI trial fears an AGI arms race

Elon Musk’s only expert witness at the OpenAI trial fears an AGI arms race