In a recent study published in the journal Nature Human Behaviour, researchers compared the theory of mind capabilities of large language models (LLMs) and humans through a comprehensive battery of tests.

Study: Testing theory of mind in large language models and humans.

Humans invest significant effort in understanding others' mental states, a skill known as theory of mind.

This ability is crucial for social interactions, communication, empathy, and decision-making. Since its introduction in 1978, theory of mind has been studied using a variety of tasks, ranging from belief attribution and mental state inference to pragmatic language comprehension. The rise of LLMs such as the generative pre-trained transformer (GPT) has sparked interest in their potential for an artificial theory of mind, prompting further research into their limitations and their capacity to replicate human theory of mind abilities.

The present study adhered to the Helsinki Declaration and tested OpenAI's GPT-3.5 and GPT-4, as well as three Large Language Model Meta AI version 2 (LLaMA2)-Chat models (70B, 13B, and 7B parameters). Responses from the LLaMA2-70B model were primarily reported due to quality concerns with the smaller models.

Fifteen sessions were conducted per LLM, each presenting all test items within a single chat window. Human participants were recruited online via Prolific, targeting native English speakers aged 18-70 with no history of psychiatric conditions or dyslexia.
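The session structure described above can be sketched in code. This is a hypothetical illustration, not the authors' actual test harness: `query_model` is a placeholder for a real chat-API call, and the item names are invented examples. The key design point it captures is that all items in a session share one conversation context, so later responses can be influenced by earlier exchanges.

```python
def query_model(model, history, item):
    """Placeholder for a real LLM chat call (assumption, not the study's code).

    A real implementation would send `history` plus `item` to the model's
    chat endpoint; here we return a canned string so the sketch is runnable.
    """
    return f"{model} response to: {item}"

def run_session(model, items):
    """One session: every test item is asked within a single chat window."""
    history = []    # conversation context shared across all items
    responses = []
    for item in items:
        reply = query_model(model, history, item)
        history.append((item, reply))  # later items see earlier exchanges
        responses.append(reply)
    return responses

def run_protocol(models, items, n_sessions=15):
    """Fifteen independent sessions per model, as in the study design."""
    return {
        model: [run_session(model, items) for _ in range(n_sessions)]
        for model in models
    }

# Example run with hypothetical item labels
results = run_protocol(
    ["GPT-4", "LLaMA2-70B"],
    ["false belief", "irony", "faux pas"],
)
```

Starting each session in a fresh chat window (an empty `history`) keeps the fifteen sessions statistically independent, while items within a session deliberately share context.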