We thus suggest that XAI researchers treat explanations as statistical estimators. Quantifying uncertainty and testing against random baselines will help align XAI with the standards of experimental science.
arxiv.org/abs/2512.18792 - with François Portet, Giada Dirupo and @peyrardmax.bsky.social
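A toy sketch of the random-baseline test suggested above, using scikit-learn's built-in permutation_test_score; the activations, labels, and linear probe are placeholders standing in for whatever explanation you want to validate, not the paper's actual setup:

```python
# Hypothetical sketch: treat probe accuracy as a statistical estimate and
# test it against a label-shuffled (random) baseline.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import permutation_test_score

rng = np.random.default_rng(0)
acts = rng.normal(size=(500, 64))       # placeholder model activations
labels = rng.integers(0, 2, size=500)   # placeholder binary property

score, null_scores, p_value = permutation_test_score(
    LogisticRegression(max_iter=1000), acts, labels,
    cv=5, n_permutations=200, random_state=0)

print(f"probe accuracy:          {score:.3f}")
print(f"label-shuffled baseline: {null_scores.mean():.3f} ± {null_scores.std():.3f}")
print(f"permutation p-value:     {p_value:.3f}")
```

The shuffled-label null answers the dead-salmon question directly: how good would the explanation look if there were nothing real to find?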
This highlights a fundamental non-identifiability problem: multiple incompatible explanations can fit the same computation.
Other experimental sciences overcame similar issues in the past by adopting rigorous statistical frameworks.
2/3
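A toy illustration of the non-identifiability point, assuming the "explanation" takes the form of a linear probe's weight vector: with more neurons than samples, infinitely many weight vectors fit the same behavior exactly while disagreeing about which neurons matter. All numbers and names here are illustrative:

```python
# Hypothetical sketch: two probes that fit the same data perfectly but
# assign very different importance to individual neurons.
import numpy as np

rng = np.random.default_rng(0)
n, d = 20, 100                        # fewer samples than "neurons"
acts = rng.normal(size=(n, d))        # placeholder activations
target = rng.normal(size=n)           # property being explained

w1 = np.linalg.pinv(acts) @ target    # minimum-norm exact fit

# Adding any null-space direction of acts leaves the fit unchanged.
_, _, Vt = np.linalg.svd(acts)
w2 = w1 + 5.0 * Vt[-1]                # a very different weight vector

print(np.allclose(acts @ w1, target), np.allclose(acts @ w2, target))  # True True
print(f"weight correlation: {np.corrcoef(w1, w2)[0, 1]:.2f}")          # low
```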
Our new paper, "The Dead Salmons of AI interpretability", is out!
In 2009, researchers showed that a common statistical error, failing to correct for multiple comparisons, could produce apparent "brain activity" in a dead salmon 🐟.
Modern XAI methods face similar issues: we find interpretable neurons and probes even in randomly initialized models.
1/3
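A minimal sketch of that effect, under the simplifying assumptions that the "model" is a single untrained ReLU layer and the explanation method is a linear probe (illustrative only, not the paper's experiments):

```python
# Hypothetical sketch: a linear probe looks "interpretable" even on a
# randomly initialized network, because random high-dimensional features
# still carry linearly decodable information about the input.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n, d_in, d_hidden = 2000, 32, 512
X = rng.normal(size=(n, d_in))          # toy inputs
y = (X[:, 0] > 0).astype(int)           # the probed property lives in the input

W = rng.normal(size=(d_in, d_hidden)) / np.sqrt(d_in)  # untrained weights
acts = np.maximum(X @ W, 0.0)           # activations of the untrained net

A_tr, A_te, y_tr, y_te = train_test_split(acts, y, random_state=0)
probe = LogisticRegression(max_iter=1000).fit(A_tr, y_tr)
print(f"probe accuracy on an untrained model: {probe.score(A_te, y_te):.2f}")
# Far above the 0.5 chance level: the probe, not the model, does the work.
```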
I'm very happy to present our work "Everything, Everywhere, All at Once: Is Mechanistic Interpretability Identifiable?" this afternoon at #ICLR2025! Come have a chat at stand #439 :)