[9/n] Huge thanks to @pcastr.bsky.social, Daniel Kasenberg, @neurokim.bsky.social and everyone in Montreal 💙 I had an amazing experience exploring topics in behavioral game theory, cognitive science, and AI-driven scientific discovery, together with brilliant colleagues.
[8/n] For me, it’s really cool that this aligns with the jump in theory-of-mind capabilities in recent LLMs (since opponent modeling in IRPS is basically a type of ToM)
[7/n] So what were the insights? Both humans and LLMs combine value learning with opponent modeling, but frontier models maintain more sophisticated opponent models (3x3x3 transition matrices vs. simple size-3 vectors tracking prior move frequencies).
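(Not our actual code, just a rough sketch of what those two kinds of opponent models amount to as data structures: a size-3 frequency vector vs. a 3x3x3 transition table. Class names and the exact conditioning variables are illustrative.)

```python
import numpy as np

MOVES = ["rock", "paper", "scissors"]
BEAT = {"rock": "paper", "paper": "scissors", "scissors": "rock"}

class FrequencyOpponentModel:
    """Simpler model: a size-3 vector of the opponent's prior move frequencies."""
    def __init__(self):
        self.counts = np.ones(3)  # Laplace prior over rock/paper/scissors

    def update(self, opp_move):
        self.counts[MOVES.index(opp_move)] += 1

    def predict(self):
        # Predict the opponent's most frequent move overall
        return MOVES[int(np.argmax(self.counts))]

class TransitionOpponentModel:
    """Richer model: a 3x3x3 table counting (my last move, opp last move) -> opp next move."""
    def __init__(self):
        self.counts = np.ones((3, 3, 3))

    def update(self, my_last, opp_last, opp_next):
        self.counts[MOVES.index(my_last), MOVES.index(opp_last), MOVES.index(opp_next)] += 1

    def predict(self, my_last, opp_last):
        row = self.counts[MOVES.index(my_last), MOVES.index(opp_last)]
        return MOVES[int(np.argmax(row))]

def best_response(predicted_opp_move):
    """Play the move that beats the predicted opponent move."""
    return BEAT[predicted_opp_move]
```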
[6/n] Using this approach, we get actual programs that explain the behavior, which we can read and compare. Diagram for human program shown below.
[5/n] But what does the difference in win rates actually mean? To understand, we used AlphaEvolve to automatically discover interpretable behavioral models directly from gameplay data.
[4/n] Frontier models (Gemini 2.5 Pro/Flash, GPT 5.1) win more and adapt much faster than humans, while smaller models like GPT OSS 120B actually get worse over time because they can’t integrate the long context.
[3/n] So how do their strategic behaviors actually differ from humans? We examined this question through the lens of behavioral game theory, using iterated rock-paper-scissors (IRPS).
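(To make the setup concrete, here's a minimal iterated-RPS loop. The agent callables are placeholders; in the actual experiments the "agents" are prompted LLMs and recorded human players.)

```python
import random

MOVES = ["rock", "paper", "scissors"]
BEATS = {"rock": "scissors", "paper": "rock", "scissors": "paper"}

def score(move_a, move_b):
    """+1 if A wins the round, -1 if A loses, 0 on a tie."""
    if move_a == move_b:
        return 0
    return 1 if BEATS[move_a] == move_b else -1

def play_iterated_rps(agent_a, agent_b, n_rounds=100):
    """Run n_rounds of RPS; each agent sees the full history of past rounds."""
    history, total = [], 0
    for _ in range(n_rounds):
        a = agent_a(history)   # e.g. an LLM prompted with the history so far
        b = agent_b(history)   # e.g. a human or a baseline policy
        total += score(a, b)
        history.append((a, b))
    return total, history

# Baseline opponent for illustration: plays uniformly at random
random_agent = lambda history: random.choice(MOVES)
```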
[2/n] LLM agents are everywhere now: customer service, negotiations, even as human proxies for social science/market research
[1/n] Just wrapped up 7 months interning with @pcastr.bsky.social at Google DeepMind and I'm so excited to share our work: arxiv.org/abs/2602.10324.
TLDR: We used LLM-powered program synthesis to automatically model and discover differences between human and LLM strategic behavior