Our team is hiring a postdoc in (mechanistic) interpretability! The ideal candidate will have research experience in interpretability for text and/or image generation models and be excited about open science!
Please consider applying or sharing with colleagues: metacareers.com/jobs/2223953961352324
Posts by Koustuv Sinha
Excited to share the results of my recent internship!
We ask 🤔
What subtle shortcuts are VideoLLMs taking on spatio-temporal questions?
And how can we instead curate shortcut-robust examples at scale?
We release: MVPBench
Details 👇🔬
The HuggingFace/Nanotron team just shipped an entire pretraining textbook in interactive format. huggingface.co/spaces/nanot...
It’s not just a great pedagogical resource: it also presents a wealth of data and experiments systematically, many for the first time.
Excited to have two papers at #NAACL2025!
The first reveals how human over-reliance can be exacerbated by LLM friendliness. The second presents a novel computational method for concept tracing. Check them out!
arxiv.org/pdf/2407.07950
arxiv.org/pdf/2502.05704
Congrats, nice and refreshing papers, especially the word confusion idea! We need better similarity methods, and it's good to see developments on this front! Curious whether the confusion similarity depends on the label-set size of the classifier?
👋 Hello world! We’re thrilled to announce the v0.4 release of fairseq2 — an open-source library from FAIR powering many projects at Meta. pip install fairseq2 and explore our trainer API, instruction & preference finetuning (up to 70B), and native vLLM integration.
Many many congratulations!! 🥳🎉🎉
another factor that makes simple mlps work is visual token length. if you care about shorter token sequences, you need a better mapper. these days most llms are capable of long context, which reduces the need to compress visual tokens.
one hypothesis for why simple mappers work: 1. unfreezing the LLM provides enough parameters for mapping, 2. richer vision representations are closer to the llm's internal latent space arxiv.org/abs/2405.07987
good questions! from what I see some folks still use complex mappers like Perceivers, but a simple mlp often works well enough. the variable that induces the biggest improvement is almost always the alignment data.
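For readers curious what a "simple mlp mapper" means concretely, here is a minimal, hypothetical sketch in plain NumPy (random weights, made-up dimensions, not from any real codebase): a two-layer MLP projecting frozen vision-encoder features into the LLM's hidden dimension, so they can be fed in as soft visual tokens.

```python
import numpy as np

def mlp_mapper(vision_feats, w1, b1, w2, b2):
    """Project vision encoder features into the LLM hidden space.

    vision_feats: (num_visual_tokens, d_vision)
    returns:      (num_visual_tokens, d_llm), used as soft tokens
    """
    h = np.maximum(vision_feats @ w1 + b1, 0.0)  # ReLU hidden layer
    return h @ w2 + b2                           # linear projection to d_llm

rng = np.random.default_rng(0)
# Illustrative sizes only: e.g. a ViT feature dim and a 7B-scale LLM width.
d_vision, d_hidden, d_llm = 1024, 4096, 4096
num_visual_tokens = 576  # e.g. a 24x24 patch grid

feats = rng.standard_normal((num_visual_tokens, d_vision))
w1 = rng.standard_normal((d_vision, d_hidden)) * 0.02
b1 = np.zeros(d_hidden)
w2 = rng.standard_normal((d_hidden, d_llm)) * 0.02
b2 = np.zeros(d_llm)

soft_tokens = mlp_mapper(feats, w1, b1, w2, b2)
print(soft_tokens.shape)  # (576, 4096)
```

Note the mapper does no compression: it keeps all 576 visual tokens, which is fine when the LLM handles long context; a Perceiver-style mapper would instead cross-attend down to a shorter token sequence.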
This is actually a cool result - token length as a rough heuristic for model confidence?
I am shocked by the death of Felix Hill. He was one of the brightest minds of my generation.
His last blog post on the stress of working in AI is very poignant. Apart from the emptiness of working mostly to make billionaires even richer, there's the intellectual emptiness of 'scale is all you need'
Lots of cool findings in our paper as well as in the website: tsb0601.github.io/metamorph/
Excited to see how the community "MetaMorph"'s existing LLMs!
We posted our paper on arxiv recently, sharing this here too: arxiv.org/abs/2412.141... - work led by our amazing intern Peter Tong. Key findings:
- LLMs can be trained to generate visual embeddings!!
- VQA data appears to help a lot in generation!
- Better understanding = better generation!
I wonder if veo-2 would be better at these prompts!
Co-organized by @randomwalker.bsky.social, @peterhenderson.bsky.social, @in4dmatics.bsky.social, Naila Murray, @adinawilliams.bsky.social, Angela Fan, Mike Rabbat, and Joelle Pineau. Check out our website for the CFP and more details: reproml.org
🚨 We are pleased to announce the first in-person event for the Machine Learning Reproducibility Challenge, MLRC 2025! Save the date: August 21st, 2025 at Princeton!
Our PRISM alignment paper won a best paper award at #NeurIPS2024!
All credit to @hannahrosekirk.bsky.social, A. Whitefield, P. Röttger, A. M. Bean, K. Margatina, R. Mosquera-Gomez, J. Ciro, @maxbartolo.bsky.social, H. He, B. Vidgen, and S. Hale
Catch Hannah tomorrow at neurips.cc/virtual/2024/poster/97804
Also, MLRC is now in 🦋 as well - do follow! :) @reproml.org
Check out the MLRC 2023 posters at #NeurIPS 2024 this week: reproml.org/proceedings/ - do drop by these posters and say hi!
The return of the Autoregressive Image Model: AIMv2 now going multimodal.
Excellent work by @alaaelnouby.bsky.social & team with code and checkpoints already up:
arxiv.org/abs/2411.14402
Yes, that imo is one of the most exciting outcomes of this direction: learning a new modality with much less compute. We have some really nice results; can't wait to share them with everyone, stay tuned!
For those who missed this post on the-network-that-is-not-to-be-named, I made public my "secrets" for writing a good CVPR paper (or any scientific paper). I've compiled these tips over many years. It's long, but hopefully it helps people write better papers. perceiving-systems.blog/en/post/writ...
👋 hello! :)
How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this:
Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢
🧵⬇️
When I first read this paper, I instinctively scoffed at the idea. But the more I look at the empirical results, the more I'm convinced this paper highlights something fundamentally amazing. Lots of exciting research in this direction will come very soon!
arxiv.org/abs/2405.07987
All the ACL chapters are here now: @aaclmeeting.bsky.social @emnlpmeeting.bsky.social @eaclmeeting.bsky.social @naaclmeeting.bsky.social #NLProc
Doing good science is 90% finding a science buddy to constantly talk to about the project.
Same here! Let's make a club! 😅