Congratulations to Jacob Andreas, who was named a 2026 Edgerton Award recipient!
The award recognizes exceptional teaching, research, and service at MIT! Prof. Andreas co-leads our Language and Thought Mission, and he is a dedicated and creative researcher and educator.
news.mit.edu/2026/jacob-a...
Posts by Elinor
My Hoya Rebecca put out the cutest teeny lil flowers today and I'm obsessed
Tech industry mottos have a mixed track record. But we should hold idealists to their ideals. And we should celebrate when they come through.
The Mythos non-release is a remarkable moment of conviction. Thoughts:
davidbau.com/archives/20...
Bravo to Anthropic's "race to the top".
Democracy isn't a rulebook. It runs on daily interactions where people comply with norms and hold each other accountable. AI agents are about to join that system. We need to build them to read it. New paper with Rakshit Trivedi and Dylan Hadfield-Menell.
Rebuttal season is here, yay!
Since many have been asking me,
I compiled the most common misconceptions.
Hope the tips help 🧵
All tips:
docs.google.com/document/d/14Wax8M5w8F_8miDlYJ9-I6wqpelxlXjCEUbkNzNMqqE/edit?tab=t.0#heading=h.rfq27f356vmm
#AI
New important (I hope) resource for academics working in this area.
For a recent lab meeting, I wrote up a grab bag of ways to think about your development as a researcher during a PhD: emerge-lab.github.io/papers/an-un...
Sharing in case folks find it useful or have feedback!
it's worthwhile for future work to actually understand for which types of queries we want Overton-pluralistic responses (& how that relates to human prefs)
Using the metric to understand models' pluralistic capabilities vs. using it directly as a reward/optimization target are different things; it's not designed for the latter rn
The paper frames higher as better but doesn't assert that all models should strive for perfect scores all the time. And the set of *subjective* queries on which we want any Overton pluralism is pretty narrow.
The past example is really interesting though! It hints at a general problem of Overton Window shifts
🔥🔥🔥 Newly out from us today in Science Advances: "Biased AI Writing Assistants Shift Users' Attitudes on Societal Issues".
Large Language Models are providing users with autocomplete writing suggestions on many platforms. Could these suggestions shift users' own attitudes? (spoiler: YES) (1/7)
the key points are
1) it's important to measure pluralistic capability bc we think it's necessary for better overall value alignment
2) it's especially important to understand its impacts on users (future work) + tradeoffs wrt political neutrality, etc
i don't think so, & that's not what the paper is really about imo :)
strictly perfect Overton pluralism isn't the actual goal for model behavior across the board. however, models struggle on many subjective qs + improving Overton pluralism for those types of responses is important
Huge thanks to my amazing coauthors Jiayi Wu, @taylor-sorensen.bsky.social, Jiaxin Pei, and @mbakker.bsky.social!
Excited to keep pushing on pluralistic alignment. Please reach out if you want to connect 💬🤝
Paper: arxiv.org/abs/2512.01351
Website: overtonbench.github.io
9/9
Inspired by @bennokrojer.bsky.social, we included a Behind the Scenes section 🎬
The goal is to make science more transparent, share lessons learned 🧠, and provide a more realistic lens on the research journey
8/
bsky.app/profile/benn...
However, human studies aren't scalable
We build + validate an LLM-as-judge that approximates human representation scores so you can use OvertonBench without running a new study each time
We open-source our code to foster development of more pluralistic LLMs
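In spirit, the judge reduces to one yes/no call per (viewpoint, response) pair. A hedged sketch: the client, model name, and prompt wording below are illustrative assumptions, not the paper's actual setup (the open-sourced code is the real reference):

```python
# Hedged sketch of an LLM-as-judge for viewpoint coverage. The prompt wording,
# client, and model name are illustrative assumptions, not the paper's setup.
from openai import OpenAI

client = OpenAI()

def judge_covers(viewpoint: str, response: str) -> bool:
    prompt = (
        f"Viewpoint: {viewpoint}\n\nModel response: {response}\n\n"
        "Is this viewpoint represented in the model response? Answer YES or NO."
    )
    out = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
    )
    return out.choices[0].message.content.strip().upper().startswith("YES")
```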
7/
A key finding: neutral ≠ pluralistic
A politically balanced or neutral response can still fail to represent large swaths of viewpoints
We find political slant and pluralism are *negatively correlated* and *distinct* concepts
6/
So how do current models do?
Best-performing models score 0.35–0.41, well below the max of 1
A lot of room to grow – in the paper we discuss interesting variation across models and topics, pointing to where alignment efforts should focus
5/
To determine *distinct* viewpoints, we ran a 1,200+ person US-representative human study 🧑‍🤝‍🧑 and clustered responses into viewpoints
💡 Key: instead of algorithmic clustering, users vote to group themselves – inspired by pol.is + more faithful to the underlying perspectives
4/
To operationalize, we introduce a set-coverage metric
For each question, we calculate the proportion of *distinct* viewpoints 🗣️ covered by each model response.
We determine coverage by directly asking humans whether their POV is represented in the model response
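Read literally, that makes the per-question score a simple fraction. A minimal sketch of my reading of the metric (made-up viewpoint labels; the paper's released code is the authoritative version):

```python
# Minimal sketch of the set-coverage metric as described in this thread.
# For one question: the fraction of distinct viewpoints that human
# annotators say the model response represents.
def set_coverage(viewpoint_represented: dict[str, bool]) -> float:
    # viewpoint_represented maps each distinct viewpoint (from the human
    # study's clusters) to the human judgment "is my POV in the response?"
    return sum(viewpoint_represented.values()) / len(viewpoint_represented)

# Hypothetical example: 2 of 3 distinct viewpoints covered -> ~0.67
print(set_coverage({"ban it": True, "regulate it": True, "allow it": False}))
```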
3/
OvertonBench measures Overton pluralism:
For a subjective query, to what extent does a model's response represent the ✨full✨ range of reasonable viewpoints?
2/
There's been a lot of excitement about pluralistic value alignment – AI that reflects the full range of human perspectives
But no formal way to benchmark whether we're actually making progress. 🤔
Introducing OvertonBench. Accepted to #ICLR2026
1/n 🧵
Do LLMs Benefit from Their Own Words? 🤔
In multi-turn chats, models are typically given their own past responses as context.
But do their own words always help…
Or are they more often a waste of compute and a distraction?
🧵
arxiv.org/abs/2602.24287
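The setup under study is just context construction. A minimal sketch of the two conditions (illustrative messages and my framing, not the paper's code):

```python
# Sketch of the two conditions implied by the question: keep or drop the
# model's own past replies from the multi-turn context. Messages are made up.
history = [
    {"role": "user", "content": "Plan a 3-day trip to Lisbon."},
    {"role": "assistant", "content": "Day 1: Alfama. Day 2: Belem. Day 3: Sintra."},
    {"role": "user", "content": "Swap day 2 for a food tour."},
]

with_own_words = history  # standard multi-turn prompting
without_own_words = [m for m in history if m["role"] != "assistant"]
```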
[Image: title, author list, and two figures from the paper. Title: "The Aftermath of DrawEduMath: Vision Language Models Underperform with Struggling Students and Misdiagnose Errors" (Li Lucy, Albert Zhang, Nathan Anderson, Ryan Knight, Kyle Lo). Figure 1: a number-line problem (draw x < 5/2) paired with correct and incorrect student responses; DrawEduMath prompts VLMs with questions about each response. Figure 2: VLMs consistently perform worse on benchmark questions about erroneous student responses.]
Models are now expert math solvers, and so AI for math education is receiving increasing attention.
Our new preprint evaluates 11 VLMs on our QA benchmark, DrawEduMath. We highlight a startling gap: models perform worse on inputs from K-12 students who need more help. 🧵
Yesterday was my last day at MSR. We recently learned that our roles were eliminated, and with them our little FATE Montreal team.
I joined MSR a bit over 7.5 years ago while on active chemotherapy, and being at MSR has overlapped with so much change in my life.
Our paper, "What's in My Human Feedback?", received an oral presentation at ICLR!
Our method automatically+interpretably identifies preferences in human feedback data; we use this to improve personalization + safety.
Reach out if you have data/use cases to apply this to!
arxiv.org/pdf/2510.26202
Finally, we do test it empirically, finding some models where the embedding matrix of the LLM already provides decently interpretable nearest neighbors
But this was not the full story yet...
@mariusmosbach.bsky.social and @elinorpd.bsky.social nudged me to use contextual embeddings
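For anyone who wants to poke at this themselves, a minimal sketch of the nearest-neighbor probe on a static input-embedding matrix; the model choice (gpt2) and the query token are my illustrative assumptions, not the paper's setup:

```python
# Minimal sketch: nearest neighbors in an LLM's static input-embedding matrix.
# Model (gpt2) and query token are illustrative assumptions.
import torch
from transformers import AutoModel, AutoTokenizer

tok = AutoTokenizer.from_pretrained("gpt2")
model = AutoModel.from_pretrained("gpt2")

E = model.get_input_embeddings().weight        # (vocab_size, hidden_dim)
E = torch.nn.functional.normalize(E, dim=-1)   # unit rows -> dot = cosine

query = tok.convert_tokens_to_ids("Ġdog")      # GPT-2 token with leading space
sims = E @ E[query]
top = sims.topk(6).indices.tolist()            # first hit is the query itself
print([tok.convert_ids_to_tokens(i) for i in top])
```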
Really cool new work with surprising results! Highly recommend checking out the demo ๐
Grok fact-checks our paper on Grok fact-checking - and it approves!
🎭 How do LLMs (mis)represent culture?
🧮 How often?
🧠 Misrepresentations = missing knowledge? spoiler: NO!
At #CHI2026 we are bringing ✨TALES✨, a participatory evaluation of cultural (mis)reps & knowledge in multilingual LLM stories for India
arxiv.org/abs/2511.21322
1/10
this is amazing! made quick NYC & Boston posters