🧐 Delve into our paper:
arxiv.org/abs/2602.20068
Many thanks to my Supervisor Prof. @kostaskamnitsas.bsky.social @ox.ac.uk
and co-authors Ziyun Liang & Hermione Warr! Funded by @drivehealth.bsky.social
I look forward to attending #CVPR2026 in Denver this June!
🏁(9/9)
#ML #AI #Outofdistribution
Posts by Harry Anthony
Can we mitigate the Invisible Gorilla Effect?
We tested two strategies:
🎨 Colour jitter augmentation → mixed results
🧭 Projecting out the nuisance subspace → more consistently improves detection
Understanding the bias helps design more robust OOD detectors 👇
🧵(8/9)
#CVPR2026
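A minimal sketch of the nuisance-projection idea (an illustrative reconstruction, not the paper's exact method — the helper names and the jitter-based subspace estimate are my assumptions): estimate the colour-nuisance directions from feature differences between original and colour-jittered views, then project them out of the features before OOD scoring.

```python
# Hypothetical sketch of "projecting out a nuisance subspace" — NOT the
# paper's released code. Assumes colour-only variation can be estimated
# from feature differences between clean and colour-jittered views.
import numpy as np

def nuisance_subspace(feats, feats_jittered, k=4):
    """Top-k principal directions of colour-induced feature change."""
    diffs = feats_jittered - feats                      # (N, D) colour-only variation
    _, _, vt = np.linalg.svd(diffs - diffs.mean(0), full_matrices=False)
    return vt[:k]                                       # (k, D) orthonormal nuisance basis

def project_out(feats, basis):
    """Remove nuisance components: f - f B^T B."""
    return feats - feats @ basis.T @ basis

# Toy demo: colour shifts live in the first 4 of 16 feature dimensions.
rng = np.random.default_rng(0)
f = rng.normal(size=(100, 16))
f_jit = f + rng.normal(size=(100, 16)) @ np.diag([3.0] * 4 + [0.0] * 12)
basis = nuisance_subspace(f, f_jit, k=4)
f_clean = project_out(f, basis)                         # features with colour directions removed
```

Any downstream feature-based detector would then score `f_clean` instead of `f`, so colour shifts no longer dominate the score.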
Why does this happen?
We look for insights in the model’s latent space.
We find that OOD colour changes lie along high-variance directions.
Feature-based OOD detectors downweight these directions, so large colour shifts can look in-distribution.
🧵(7/9)
#CVPR2026 #AIsafety
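A toy illustration of why high-variance directions get downweighted (my own minimal example, not the paper's experiments): the Mahalanobis-style scores used by many feature-based detectors divide each direction's squared deviation by its training variance, so a large shift along a high-variance direction can score lower than a much smaller shift along a low-variance one.

```python
# Toy 2-D demo (illustrative assumption, not the paper's setup): dim 0 has
# high training variance, dim 1 low. Mahalanobis distance normalises by
# variance, so a big shift along dim 0 looks "in-distribution".
import numpy as np

rng = np.random.default_rng(0)
train = rng.normal(size=(5000, 2)) * np.array([10.0, 0.1])   # std 10 vs 0.1
mu, cov_inv = train.mean(0), np.linalg.inv(np.cov(train.T))

def mahalanobis_sq(x):
    d = x - mu
    return float(d @ cov_inv @ d)

score_high = mahalanobis_sq(np.array([5.0, 0.0]))   # large shift, high-variance axis
score_low  = mahalanobis_sq(np.array([0.0, 0.5]))   # 10x smaller shift, low-variance axis
# The tiny shift along the low-variance axis scores far higher.
```

In the paper's terms: if colour changes fall along the high-variance directions, exactly this normalisation is what lets large colour shifts slip past the detector.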
What happens across different OOD detectors?
We observe large OOD detection performance drops when artefacts differ in colour from the model’s region of interest.
Feature-based methods are affected the most.
The Invisible Gorilla Effect appears across a wide range of methods 👇
🧵(6/9)
We’re also releasing the data 👇
• 11,355 manually annotated images from the public datasets #ISIC and #MVTec
• Colour-labelled artefacts (ink, charts, defects)
• Colour-swapped counterfactuals for controlled evaluation
All publicly available on GitHub:
github.com/HarryAnthony...
🧵(5/9)
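A hedged sketch of what a colour-swapped counterfactual could look like (illustrative only — the released pipeline may differ; the function name and channel-permutation choice are my assumptions): recolour only the annotated artefact region by permuting RGB channels inside its mask, leaving the rest of the image intact.

```python
# Illustrative colour-swap counterfactual — NOT the released generation code.
# Recolours only the masked artefact pixels via an RGB channel permutation.
import numpy as np

def colour_swap(image, mask, perm=(2, 1, 0)):
    """image: (H, W, 3) uint8; mask: (H, W) bool over artefact pixels."""
    out = image.copy()
    out[mask] = image[mask][:, list(perm)]   # e.g. RGB -> BGR inside the mask
    return out

# Toy demo: a red image with a small artefact region that becomes blue.
img = np.zeros((4, 4, 3), dtype=np.uint8)
img[..., 0] = 200                            # pure-red image
m = np.zeros((4, 4), dtype=bool)
m[1:3, 1:3] = True                           # annotated artefact region
cf = colour_swap(img, m)                     # artefact pixels now blue
```

Holding everything fixed except the artefact's colour is what makes these counterfactuals a controlled probe of the colour bias.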
We conduct a large-scale evaluation of the Invisible Gorilla Effect across:
• 40 OOD detection methods
• 3,795 hyperparameter configurations
• 3 model architectures
• 7 OOD benchmarks (being made public)
• 25 random seeds
• Colour-swapped counterfactuals
🧵(4/9)
#CVPR2026
The community view:
"The closer an OOD input is to the training distribution, the harder it is to detect"
Our finding: Not always
Artefacts more visually similar to the model's ROI can be easier to detect than dissimilar ones.
💡OOD detection depends on model attention
🧵(3/9)
The Invisible Gorilla Effect👇
- Train a classifier for a primary task
- Test on images with different colour OOD artefacts
Finding:
OOD artefacts with colour similar to the model’s region of interest → detected as abnormal ✅
OOD artefacts with dissimilar colour → not detected ❌
🧵(2/9)
📣 New #CVPR2026 paper!
We found a counterintuitive bias in OOD detection.
OOD detectors are more likely to detect artefacts when they resemble what the model attends to, and to miss them when they don’t.
Called the Invisible Gorilla Effect 🦍👇
arxiv.org/abs/2602.20068
🧵(1/9)
Whoop #NeurIPS2025 accepted! 🎉
Meet DIsoN, our 🧹💨 privacy-preserving OOD detector that compares test samples to training data without ever sharing the training data.
We make Out-of-Distribution detection decentralized!
📄Paper: arxiv.org/pdf/2506.09024
🧵👇
Make your research presentations unforgettable! 🎤
I’ve written a Medium post where I share tips for creating dynamic slides and walk you through how I designed some standout slides from my latest talk.
Check it out and transform your #presentations!
🔗 medium.com/@harry.antho...