[Paper]: arxiv.org/pdf/2412.06014
🕑 Fri, Apr 24, 2026 • 6:30 AM – 9:00 AM PDT
🚩 Pavilion 3 P3-#313
Posts by ExplainableML
3/ Post-hoc Probabilistic Vision-Language Models
@antonbaumann.bsky.social, @ruili-pml.bsky.social, @marcusklasson.bsky.social, @smentu.bsky.social, @shyamgopal.bsky.social, @zeynepakata.bsky.social, @arnosolin.bsky.social, @trappmartin.bsky.social
2/
Are Reasoning LLMs Robust to Interventions on Their Chain-of-Thought?
Alexander von Recum, @lgirrbach.bsky.social , @zeynepakata.bsky.social
[Paper]: arxiv.org/pdf/2602.07470
🕑 Fri, Apr 24, 2026 • 6:30 AM – 9:00 AM PDT
🚩Pavilion 3 P3-#918
1/
Person-Centric Annotations of LAION-400M: Auditing Bias and Its Transfer to Models
@lgirrbach.bsky.social, Stephan Alaniz, Genevieve Smith, Trevor Darrell, @zeynepakata.bsky.social
[Paper]: arxiv.org/pdf/2510.03721
🕑 Sat, Apr 25, 2026 • 11:15 AM – 1:45 PM PDT
🚩 Pavilion 3 P3-#1620
Our EML members are heading to Rio🇧🇷 for #ICLR2026! 🌴We’re excited to share our latest research — three poster presentations. Check out the thread below and come say hi to our authors!
🥳Our proposal "TACO-VLM: Tackling challenges in large-scale multimodal learning" was granted 1.5M GPU-h on JUPITER
@fz-juelich.de, Europe's first exascale supercomputer with ~24k NVIDIA GH200 GPUs. With JUPITER's support, we hope to push large-scale multimodal learning further.
BayesVLM provides reliable uncertainty for pretrained vision-language models without retraining or inference-time sampling. It improves zero-shot calibration, reduces overconfident errors under domain shift, and enables more sample-efficient active learning with negligible overhead.
BayesVLM places a Laplace posterior over the final projection layers and analytically propagates uncertainty to cosine similarities.
This avoids Monte Carlo sampling while enabling efficient uncertainty-aware inference and active learning.
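The analytic propagation described above can be illustrated with a generic first-order (delta-method) sketch: given Gaussian posteriors over the image and text embeddings, linearize the cosine similarity around the means to get a closed-form variance. This is an assumption-laden illustration, not BayesVLM's exact closed form; the function and variable names are ours.

```python
import numpy as np

def cosine_similarity_uncertainty(mu_x, cov_x, mu_t, cov_t):
    """Propagate Gaussian embedding uncertainty to the cosine similarity
    via a first-order (delta-method) approximation. Generic sketch, not
    BayesVLM's exact closed form."""
    nx, nt = np.linalg.norm(mu_x), np.linalg.norm(mu_t)
    s = mu_x @ mu_t / (nx * nt)  # similarity at the posterior means
    # Jacobians of s w.r.t. each embedding, evaluated at the means
    J_x = mu_t / (nx * nt) - s * mu_x / nx**2
    J_t = mu_x / (nx * nt) - s * mu_t / nt**2
    # Linearized variance: sum of quadratic forms over both posteriors
    var = J_x @ cov_x @ J_x + J_t @ cov_t @ J_t
    return s, var
```

Because the variance is computed analytically from the posterior covariances, no Monte Carlo samples are drawn at inference time.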
We introduce BayesVLM, a training-free post-hoc Bayesian method for uncertainty estimation in pretrained VLMs.
BayesVLM yields interpretable, well-calibrated uncertainty with virtually no inference overhead.
[Paper]: arxiv.org/pdf/2412.06014
[Project]: aaltoml.github.io/BayesVLM/
[Code]: github.com/AaltoML/Baye...
3/
Post-hoc Probabilistic Vision-Language Models
@antonbaumann.bsky.social, @ruili-pml.bsky.social, @marcusklasson.bsky.social, @smentu.bsky.social, @shyamgopal.bsky.social, @zeynepakata.bsky.social, @arnosolin.bsky.social, @trappmartin.bsky.social
While models are largely robust, recovery is inefficient, and expressing doubt plays a crucial role in it. Models are also not style-invariant: suppressing doubt in a reasoning trace can degrade performance.
To evaluate this, we developed a method to alter a reasoning model's chain of thought at fixed steps of the reasoning process: we splice various interventions into the trace and evaluate how the model reacts to them.
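The splicing step can be sketched as follows. This is an illustrative sketch of the protocol under our own assumptions; the function name and trace representation are hypothetical, not the paper's actual API.

```python
def intervene_at_fraction(trace_steps, fraction, intervention):
    """Keep the chain of thought up to a fixed relative position,
    splice in a perturbation (e.g. a noisy or contradictory step),
    and return the prefix the model is asked to continue from.
    Illustrative sketch only."""
    step_idx = int(round(fraction * len(trace_steps)))
    return trace_steps[:step_idx] + [intervention]
```

The returned prefix would then be fed back to the model, and robustness is measured by whether the continuation recovers the correct answer.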
Robust reasoning is becoming ever more important as we deploy LLMs in critical settings. But how robust is their ability to recover from noisy or incorrect reasoning steps? Which recovery mechanisms do they employ? And can we potentially make this recovery process more robust?
2/
Are Reasoning LLMs Robust to Interventions on Their Chain-of-Thought?
Alexander von Recum, @lgirrbach.bsky.social, @zeynepakata.bsky.social
[Paper]: arxiv.org/pdf/2602.07470
This establishes the first large-scale empirical evidence that dataset composition is a primary driver of model bias, and it lays the foundation for studying more complex dynamics such as second-order bias transfer and amplification through model architecture.
We demonstrate that a simple linear fit predicts 60-70% of the gender bias found in CLIP and Stable Diffusion directly from co-occurrences in the training data.
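A minimal sketch of such an analysis, assuming per-concept co-occurrence statistics and measured model-bias scores are already in hand: fit an ordinary least-squares line and report the fraction of variance explained (R²). Variable names are illustrative, not the paper's code.

```python
import numpy as np

def linear_fit_r2(cooccurrence, model_bias):
    """Fit model bias from dataset co-occurrence statistics with an
    OLS line and return R^2, the fraction of variance explained.
    Schematic of the analysis; the post reports such a fit predicts
    60-70% of the measured bias."""
    # Design matrix with an intercept column
    X = np.column_stack([cooccurrence, np.ones_like(cooccurrence)])
    coef, *_ = np.linalg.lstsq(X, model_bias, rcond=None)
    resid = model_bias - X @ coef
    ss_tot = (model_bias - model_bias.mean()) @ (model_bias - model_bias.mean())
    return 1.0 - (resid @ resid) / ss_tot
```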
We use an ensemble of MLLMs and custom classifiers to generate 276M+ person-centric annotations across the full dataset. 🦾 With these labels, we measured "dataset bias" via co-occurrence frequencies and correlated it with "model bias" in CLIP and Stable Diffusion to see whether the data predicts the model.
We set out to change that! By auditing the massive LAION-400M dataset, we finally enable researchers to empirically test how well dataset statistics actually predict downstream model behavior.
Do foundation models merely reflect the bias in their data, or do they amplify it? So far, the link between dataset imbalances and model bias has been an assumption rather than a measurement.
1/
Person-Centric Annotations of LAION-400M: Auditing Bias and Its Transfer to Models
@lgirrbach.bsky.social, Stephan Alaniz, Genevieve Smith, Trevor Darrell, @zeynepakata.bsky.social
[Paper]: arxiv.org/pdf/2510.03721
🥳Happy to share that we have three papers accepted to #ICLR2026. Congrats to our authors and see you in Rio🌴🇧🇷. Check the thread for highlights👇
4/
Invited talk: The Asymmetry of Adaptation: Reverse-Engineering Multimodal In-Context Learning
Yiran Huang & @zeynepakata.bsky.social
📍[🇺🇸 NeurIPS CCFM Workshop]: Sunday, December 7th 2025, 8:15 AM - 9:00 AM, San Diego Convention Center, Upper Level Room 25ABC
3/
Concept-Guided Interpretability via Neural Chunking
Shuchen Wu, Stephan Alaniz, @shyamgopal.bsky.social, Peter Dayan, Eric Schulz, @zeynepakata.bsky.social
📍[🇺🇸NeurIPS]: Fri, Dec 5, 2025 • 11:00 AM – 2:00 PM PST, Exhibit Hall C,D,E #2113
2/
Noise Hypernetworks: Amortizing Test-Time Compute in Diffusion Models
@lucaeyring.bsky.social, @shyamgopal.bsky.social, Alexey Dosovitskiy, @natanielruiz.bsky.social, @zeynepakata.bsky.social
📍[🇺🇸NeurIPS]: Thu, Dec 4 • 11:00 AM – 2:00 PM PST, Exhibit Hall C,D,E #3605
1/
Sparse Autoencoders Learn Monosemantic Features in Vision-Language Models
Mateusz Pach, @shyamgopal.bsky.social, @qbouniot.bsky.social, Serge Belongie, @zeynepakata.bsky.social
📍[🇺🇸NeurIPS]: Wed, Dec 3 • 4:30 PM – 7:30 PM PST, Exhibit Hall C,D,E #1007
📍[🇩🇰EurIPS]: Thu, Dec 4, #98
Our EML members have arrived in beautiful San Diego for #NeurIPS2025! 🌴
We’re excited to share our latest research — three poster presentations and one workshop presentation.
Check out the thread below 👇 and come say hi to our authors!
🎓PhD application season is back!
We’re hiring ONLY through the ELLIS @ellis.eu and the MCML
@munichcenterml.bsky.social
📌Please denote Prof. Zeynep Akata @zeynepakata.bsky.social as your preferred supervisor!
👉 Link to ELLIS (ellis.eu/news/ellis-p...) and MCML (mcml.ai/opportunitie...)
Results. GenEval: SDXL 0.55→0.61 (notable gains in two objects, counting, color attribution). T2I-CompBench: broad boosts (esp. Color/Texture). DPG-Bench (SDXL): DSG 74.65→79.26, Q-Align 0.72→0.81; user study: RankDPO wins over SDXL & DPO-SDXL.
We propose RankDPO—a listwise preference objective that weights pairwise denoising comparisons using DCG-style gains/discounts, optimizing the entire ranking per prompt rather than isolated pairs.
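The DCG-style weighting can be sketched as follows: each pairwise denoising comparison is weighted by how much DCG mass swapping the two items would move, in the spirit of listwise lambda-weighting. A schematic sketch under our own assumptions, not RankDPO's exact objective; function and parameter names are ours.

```python
import math

def dcg_pair_weight(gain_i, gain_j, rank_i, rank_j):
    """Weight for one pairwise comparison inside a ranked list:
    |gain difference| times |discount difference|, so pairs whose
    swap would change DCG the most are weighted the most. Ranks
    are 1-indexed. Illustrative, not the paper's exact formula."""
    def discount(rank):
        return 1.0 / math.log2(rank + 1)  # standard DCG discount
    return abs(gain_i - gain_j) * abs(discount(rank_i) - discount(rank_j))
```

Equal-gain pairs get zero weight, and pairs near the top of the ranking dominate, which is what lets the objective optimize the whole ranking per prompt instead of treating pairs in isolation.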