⚠️ Small sample — precise results should be taken with caution. But we've already reproduced most findings in another study we'll share soon.
📄 Paper: arxiv.org/abs/2505.01106
Posts by Pierre-Yves Oudeyer
🎯 Takeaways for educators:
→ AI literacy must train question-asking & answer evaluation, not just "how AI works"
→ Training metacognition should be a priority: benefits far beyond AI use
→ An "effortless" AI interaction should be a red flag, not a comfort signal
But not all students were equal:
🧠 Those with stronger metacognitive skills (monitoring & regulating their own learning) judged prompt quality better
⚠️ Paradoxically, self-rated AI-experienced students performed worse: self-taught AI breeds overconfidence, not critical skills
3️⃣ They almost never asked follow-up questions — even after incomplete answers.
Only 14 out of 63 students ever did.
2️⃣ They couldn't tell a good AI answer from a poor one.
Vague responses were rated "useful" just as often as genuinely informative ones. Prior knowledge didn't help: confident students were just as fooled by fluent-sounding text.
1️⃣ Students couldn't distinguish a good prompt from a bad one. A precise, context-rich question vs. a vague one? Chance-level discrimination.
So what went wrong?
In short: cognitive surrender. ChatGPT's fluent, confident tone creates an illusion of understanding.
This manifests in 3 concrete ways 👇
We studied 63 students (aged 14-15) in French schools.
Task: read a science inquiry problem, query ChatGPT, evaluate its answers, then explain the concept in your own words.
📊 Average score: 10/20 — despite unlimited AI access and solvable problems (some scored near-perfect).
🔍 What happens when you give middle-schoolers unrestricted access to ChatGPT for science tasks?
Findings from our new paper led by Rania Abdelghani w/ @koumurayama.bsky.social @celestekidd.bsky.social Hélène Sauzéon 🧵👇
📖 Blog: developmentalsystems.org/phylolm
📄 ICLR 2025 paper: iclr.cc/virtual/2025...
💻 Code: github.com/Nicolas-Yax/...
Hugging Face online demo: t.co/9wEdav3LZA
This work was led by the outstanding @nicolasyax.bsky.social and co-supervised by @stepalminteri.bsky.social and me.
The blog walks the full intellectual journey — from cat genetics to Dawkins' memes, from myth phylogenetics to the evolutionary landscape of modern AI. Written to be accessible to anyone curious about evolution, culture, and intelligence. 🌱
Why does this matter? 300+ models appear daily on @huggingface. Training transparency is declining. Tools like PhyloLM could matter for AI governance, safety monitoring, and simply mapping the evolutionary landscape of what's out there.
Result: surprisingly accurate reconstruction of known model family trees. And a few insights on the ancestry of closed-source models 👀
Our method PhyloLM (ICLR 2025) treats language models as populations of text.
"Genes" = short prompt contexts.
"Alleles" = their completions.
Apply classic genetic distance measures → build phylogenetic trees. No prior knowledge of training history needed.
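The idea above can be sketched in a few lines. This is a minimal toy illustration, not the paper's implementation: the "models" below are hand-written completion-frequency tables (in PhyloLM these frequencies would be estimated by sampling real LLM completions for a fixed set of prompt contexts), and `nei_distance` is a simplified Nei-style similarity turned into a distance.

```python
import math

# Toy stand-in "models": per-context ("gene") completion ("allele") frequencies.
# Real usage would estimate these by sampling LLM completions.
model_a = {
    "The capital of": {"France": 0.6, "the": 0.4},
    "2 + 2 =": {"4": 0.9, "four": 0.1},
}
model_b = {
    "The capital of": {"France": 0.5, "the": 0.5},
    "2 + 2 =": {"4": 0.8, "5": 0.2},
}

def nei_distance(p_model, q_model):
    """Nei-style genetic distance between two completion profiles.

    For each shared context, compare allele frequencies via a
    normalized inner product, then take -log of the average similarity."""
    sims = []
    for ctx in p_model.keys() & q_model.keys():
        p, q = p_model[ctx], q_model[ctx]
        alleles = p.keys() | q.keys()
        num = sum(p.get(a, 0.0) * q.get(a, 0.0) for a in alleles)
        den = math.sqrt(sum(v * v for v in p.values()) *
                        sum(v * v for v in q.values()))
        sims.append(num / den)
    return -math.log(sum(sims) / len(sims))

print(nei_distance(model_a, model_a))  # identical profiles → 0.0
print(nei_distance(model_a, model_b) > 0)  # distinct profiles → True
```

Pairwise distances like this, computed over many models, give the distance matrix from which a phylogenetic tree can be built with standard methods.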
A good evolutionary marker needs 3 properties:
→ compressed & universal representation
→ moderate rate of change
→ functional grounding
DNA has all three. So do high-level myth features. And so does an LLM's core identity: its probability distribution over text.
Now the key question: can you apply the same method to LLMs? Without access to their code, weights, or training data?
Mostly, yes. 🧬
The myth phylogenies match known human migration patterns — shown beautifully by the outstanding work of Julien d'Huy. Same logic, different substrate: not DNA, but narrative structure as evolutionary marker.
Can you reconstruct the evolutionary tree of cat breeds just by looking at their genes? Yes — biologists do this routinely.
Can you do the same with ancient myths, tracing how stories evolved across continents by comparing narrative structure? To a large extent, yes.
New blog post: The Phylogenetics of Artifacts — inferring the evolution of cultural objects, artificial life forms, and language models.
From cat genetics to ancient myths to LLMs. 🧬 1/n
Wow, some of my old #EvolutionOfLanguage #EoL pals may have just done something huge for #AILaw data attribution #LLM #AIGovernance. Specifically:
@pyoudeyer.bsky.social @nicolasyax.bsky.social @stepalminteri.bsky.social
Evolutionary biology can track LLM phylogeny!
developmentalsystems.org/phylolm
Major new system and result from the team: SOAR is an open-source self-improving genAI system pushing the frontier of ARC performance using program synthesis.
It relies on using LLMs as self-improving smart operators for evolutionary search
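The core loop can be sketched as an elitist evolutionary search where the mutation operator is an LLM. This is only an illustrative toy, not SOAR itself: `llm_propose` is a hypothetical stub standing in for a real LLM call that would rewrite a candidate program given execution feedback, and the integer "programs" are placeholders for actual program candidates.

```python
import random

def llm_propose(parent, feedback, rng):
    """Hypothetical stub for an LLM 'smart operator'.

    A real system would prompt an LLM to refine the candidate program
    using the feedback; here we just tweak the candidate so the loop runs."""
    return parent + rng.choice([-1, 1])

def fitness(program, target=7):
    # Toy objective: how close the 'program' (here just an int) is to the target.
    return -abs(program - target)

def evolve(generations=100, pop_size=8, seed=0):
    rng = random.Random(seed)
    pop = [rng.randint(-10, 10) for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]  # elitism: best candidates survive
        # Each surviving parent is refined by the (stubbed) LLM operator.
        pop = parents + [llm_propose(p, fitness(p), rng) for p in parents]
    return max(pop, key=fitness)

print(evolve())  # homes in on the target value
```

Because parents survive each generation, the best fitness never decreases, and the stubbed operator gradually improves candidates; in SOAR the operator itself also improves, since the LLM is fine-tuned on its own successful search traces.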
Congratulations @jul-p.bsky.social on this major achievement, and @ccolas.bsky.social for the amazing co-supervision! Developing this self-improving system was magical: it pushes the frontier of what can be done with program synthesis and open-source methods and models on the ARC challenge!
Using LLMs to advance the cognitive science of collectives
Very interesting new paper by @sucholutsky.bsky.social
Katherine Collins @norijacoby.bsky.social @billdthompson
@roberthawkins.bsky.social
arxiv.org/pdf/2506.00052
Catalogue of the exhibition "Le monde selon l'IA"
Artist and teacher-researcher Samuel Bianchini collaborates with many engineers and scientists to explore the relationships that new technologies maintain with the cultural contexts in which they are embedded. In this third version of the installation Prendre vie(s), Samuel Bianchini uses Flow Lenia, an AI-augmented artificial life software. This form of animation, born from a mathematical simulation called the "Game of Life" in 1970, composes systems capable of adopting emergent and unexpected behaviors.
Prendre vie(s), version 03, 2020-2025.
Software development (artificial life and AI algorithms): Léon Denise and Adrian Mangel, based on the Flow Lenia software environment developed by the Flowers team (Inria, Université de Bordeaux: Pierre-Yves Oudeyer, Clément Moulin-Frier, Gautier Hamon and Erwan Plantec) from Lenia, developed by Bert Chan (DeepMind), and with the collaboration of Colin Bouvry.
This project was supported by the École des arts décoratifs (Université PSL, Paris) and the accès)s( cultures électroniques festival.
Thanks: Alain Declercq, Jean-Jacques Gay, Stéphane Trois Carrés.
Samuel Bianchini's work, built on algorithms from @pyoudeyer.bsky.social, highlighted by the Jeu de Paume in its AI exhibition
Thanks for the link! To be more precise, these are algorithms developed by members of my team, in particular @eplantec.bsky.social @clemmoulinfrier.bsky.social @hamongautier.bsky.social, and the Flow Lenia system sites.google.com/view/flowlen...
guydavidson.me/files/goal_i...
Humans' ability to invent their own games & goals is at the core of open-ended learning.
Understanding and computationally modeling how they do it would both deepen our understanding of human cognition and help build open-ended AI.
A great step in this direction in a new paper by Guy Davidson et al.
I reviewed "These Strange New Minds: How AI Learned to Talk and What It Means" by Chris Summerfield.
melaniemitchell.me/EssaysConten...