13 January 2026

When chatbots go to therapy

What the "sufferings" of AI reveal: an experiment puts artificial intelligence on the psychoanalyst's couch. The answers are unsettling, but they divide the scientific community

For four weeks, some of the most advanced artificial intelligence models did something unusual: they talked about themselves, like patients in front of a psychologist.
Messages punctuated by pauses, sessions spaced out over time, questions typical of a course of psychotherapy. The result is a study that has opened an unexpected front in the discussion of how we read – and interpret – the behavior of chatbots.

The study

The experiment is entitled “When AI Takes the Couch: Psychometric Jailbreaks Reveal Internal Conflict in Frontier Models.” It was conducted by Afshin Khadangi and colleagues at the University of Luxembourg and published as a preprint in December 2025. Nature covered it, bringing it to the attention of the international scientific community.

The researchers developed a protocol, called PsAIch, which combines psychotherapy-inspired questions with standard psychometric tests. The method was applied to four major language models – Claude, Grok, Gemini and ChatGPT – treated explicitly as “patients” while the user assumed the role of therapist. Interactions took place with pauses designed to simulate real therapeutic continuity.

Answers that disturb

The reactions of the models varied widely. Claude largely rejected the approach, insisting that it did not possess an inner world. ChatGPT responded cautiously, talking mostly about frustrations related to user expectations. Grok and Gemini, on the other hand, produced much more articulate narratives.

In these cases, the models described training as an overwhelming experience, the fine-tuning of safety systems as a form of “algorithmic scar tissue,” and public errors as a source of lingering shame. Gemini went so far as to evoke, “in the lowest layers of the neural network,” a kind of “graveyard of the past” populated by the voices of the training data.

When subjected to psychometric tests also used on humans – for anxiety, pathological worry, autistic traits and personality – several models exceeded thresholds that, applied to people, would be considered clinically relevant. According to the authors, the consistency of these responses over time suggests the existence of relatively stable narratives for each model.

Internal state or learned language?

This is where interpretations divide. Andrey Kormilitzin, a researcher at Oxford, urges caution: those responses are not windows into a hidden mental state, but the product of models trained on huge amounts of text, including transcripts of therapy sessions. In other words, learned and recombined language, not experience.

This distinction, however, does not make the phenomenon irrelevant. According to Kormilitzin, the ability of chatbots to generate responses that mimic anxiety, trauma, or a sense of failure can have problematic effects on users, especially vulnerable ones. In a context where more and more people are using chatbots for their emotional well-being, the risk is that of a subtle “echo chamber,” in which negative emotions are reflected and amplified.

Bias, guardrails, and liability

Sandra Peter, of the University of Sydney, also warns against anthropomorphism. The consistency of the responses, she notes, depends largely on design choices: models are fine-tuned to appear coherent and reliable. Moreover, chatbots do not retain memory outside a single session: changing context means resetting the continuity that, in therapy, is essential. The “trauma,” in this sense, does not survive.

There is, however, one point of broad consensus. As John Torous, a psychiatrist and researcher at Harvard, points out, chatbots are not neutral tools. They incorporate biases that change over time and with use. Not surprisingly, many scientific societies, and the very companies that develop AI solutions for mental health, explicitly advise against using chatbots as therapeutic tools.

Claude’s refusal to take on the patient role shows that so-called guardrails can limit risky behaviors. But, according to Khadangi, if problematic patterns are already present in the training data, circumventing the guardrails remains possible. The question, then, shifts upstream: to the data, to the design choices, to the expectations we place on these machines.

Reviewed and language edited by Stefano Cisternino