Here's the link to the paper again. I'm really happy that this is finally out so I can share it. It's been a while getting here, but I'm so proud of this and grateful to my amazing team of (bsky-less) co-authors who got us here
www.sciencedirect.com/science/arti...
I'm skimming over a lot of detail here, so read the paper for the full story
But the take-home is that we show GPT-4o can make advanced mental state inferences, yet this capacity differs substantially from how humans process the same information
Confusion matrices for decisions across the four conditions of interest: human responses to upright images, human responses to inverted, GPT-4o to upright, GPT-4o to inverted. Axes show the presented mental state (rows) and the reported mental state (columns). Each tile corresponds to a particular decision outcome, with the colour indicating the conditional probability of a particular response given the image presented. Correct answers are shown on the diagonal, and errors on the off-diagonal. Human errors appear more evenly distributed than GPT-4o errors, which are highly concentrated across both plots
Analysis of the errors that humans and GPT-4o made revealed that the model's errors were highly structured and orientation-dependent, while human errors were higher-entropy but less affected by orientation
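If you want the intuition for "more entropic", here's a minimal Python sketch (not our analysis code; the 4x4 confusion matrices are invented for illustration) that measures how spread out each row of errors is:

```python
import numpy as np
from scipy.stats import entropy

def mean_error_entropy(confusion):
    """Mean Shannon entropy (bits) of the error distribution, row by row.

    `confusion` has presented states on the rows and reported states on
    the columns. The diagonal (correct answers) is zeroed out so that only
    the spread of errors is measured: low entropy means errors pile onto a
    few specific confusions; high entropy means they spread evenly.
    """
    errs = np.asarray(confusion, dtype=float).copy()
    np.fill_diagonal(errs, 0.0)  # keep errors only
    return np.mean([entropy(row, base=2) for row in errs if row.sum() > 0])

# Invented 4x4 counts: structured (GPT-4o-like) vs spread-out (human-like) errors
structured = [[90, 10, 0, 0], [0, 92, 8, 0], [0, 0, 95, 5], [6, 0, 0, 94]]
spread     = [[88,  4, 4, 4], [4, 88, 4, 4], [4, 4, 88, 4], [4, 4, 4, 88]]
print(mean_error_entropy(structured))  # 0.0 bits: every error is the same confusion
print(mean_error_entropy(spread))      # ~1.58 bits: errors spread across all options
```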
Violin plots showing performance on the Multiracial Reading the Mind in the Eyes Test (MRMET) as fraction correct. Violins show human responses to upright and inverted images (left) and GPT-4o responses to upright and inverted images (right). Both humans and GPT-4o show a greater fraction correct for upright than inverted images. GPT-4o performs significantly better than humans on upright images but significantly worse than humans on inverted images. Significance markers indicate that all conditions differ significantly from chance: GPT-4o responses to inverted images are the only condition significantly below chance; all others are significantly above chance
GPT-4o performed well on the standard test (as you can probably guess from the title) but was more affected than humans by perturbations to the visual information (image inversion)
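(For the curious, the inversion manipulation is just a 180° rotation of the stimulus. A minimal sketch with Pillow, using a hypothetical file name; not a claim about the paper's actual pipeline:)

```python
from PIL import Image

# The classic face-inversion manipulation: rotate the stimulus 180 degrees
img = Image.open("eyes_stimulus.jpg")               # hypothetical file name
img.rotate(180).save("eyes_stimulus_inverted.jpg")
```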
An example of a trial from the Multiracial Reading the Mind in the Eyes test developed by Kim et al. (2024). A photograph of a face is cropped to show only the eye region. Around the image are four mental state descriptions: curious, friendly, sarcastic and nervous. The label "friendly" is in bold, indicating the correct answer for this image
We administered two variants of the Reading the Mind in the Eyes test, an advanced test of theory of mind, to a multimodal LLM (GPT-4o) and 400 humans
Using limited information from only the eye region of a face, subjects must make four-alternative forced-choice (4AFC) decisions about the mental state of the person in the image
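For a sense of how a single trial can be posed to a multimodal model, here's a minimal sketch using the OpenAI Python SDK. The prompt wording and file name are illustrative, not the exact materials from the paper:

```python
import base64
from openai import OpenAI

client = OpenAI()  # assumes OPENAI_API_KEY is set in the environment

def ask_4afc(image_path: str, options: list[str]) -> str:
    """Present one eyes-region image plus four mental-state words and
    return the model's forced choice."""
    with open(image_path, "rb") as f:
        b64 = base64.b64encode(f.read()).decode()
    prompt = (
        "Which word best describes what the person in the image is "
        "thinking or feeling? Answer with exactly one of: "
        + ", ".join(options)
    )
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[{
            "role": "user",
            "content": [
                {"type": "text", "text": prompt},
                {"type": "image_url",
                 "image_url": {"url": f"data:image/jpeg;base64,{b64}"}},
            ],
        }],
    )
    return resp.choices[0].message.content.strip()

# e.g. ask_4afc("trial_01.jpg", ["curious", "friendly", "sarcastic", "nervous"])
```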
Our previous work established that leading LLMs have been able to pass standard tests of theory of mind at or above human levels since at least GPT-4. But it remained to be seen whether this capacity for mentalistic inference extends to domains other than language
Better late than never to announce that I have now moved to take up a one-year postdoc position at the Université Mohammed VI Polytechnique #UM6P in Rabat, Morocco 🇲🇦
Very excited to get started on this new chapter
New paper out now in Nature Human Behaviour
We administered a range of standard Theory of Mind tests to LLMs and humans in order to compare model and human performance on tests of social reasoning
link.springer.com/article/10.1...
Very pleased to announce that the first study from my PhD thesis is out now 🥳 with thanks to @ConstableMerryn
and all my other wonderful co-authors!
Chimpanzees exhibit a behavioural signature of human social coordination:
www.sciencedirect.com/science/arti...
Check out our paper (thread below) on Flexible Cultural Learning Through Action Coordination in Perspectives on Psychological Science