How do you align AI in a world of plural, conflicting, and evolving human values?
A starting point is human society itself.
@sydneylevine.bsky.social and I are hiring a postdoc at NYU to combine insights from cultural evolution, computational moral cognition, and AI safety.
Please share widely! 1/
Posts by Jacy Reese Anthis
After the #CHI2026 opening plenary, catch Janet Pauketat presenting "Mental Models of Autonomy and Sentience Shape Reactions to AI" in session Relationships with AI: 11:15-12:45
Disentangling the two oft-conflated faculties can help us make sense of human-AI interaction: dl.acm.org/doi/10.1145/...
A rooftop pool with a view of downtown Barcelona
Hola! Excited to hear what you all have been up to! #CHI2026
The Chinese government has decided to regulate AI from the angles of personal likeness rights, child safety, and addiction.
www.reuters.com/world/china/...
Google dropped 4 different Gemma open-weight models! I'm most excited that they're finally adopting a standard Apache 2.0 open source license.
huggingface.co/collections/...
New preprint out today (osf.io/preprints/ps...). We tested whether AI agents are actually infiltrating online surveys.
Spoiler alert: they aren't
Thread 🧵
[1/9]
📄Published Today in Nature:
500 researchers reproduced 100 studies across the social & behavioral sciences to assess their analytical robustness (led by @balazsaczel.bsky.social & @szaszibarnabas.bsky.social).
Article: www.nature.com/articles/s41...
Preprint: osf.io/preprints/me...
TLDR: 1/11
If you're reviewing ARR papers and want a tool to help you spot potential hallucinated references, I cooked this up for the ACL SACs and thought I would share it with the broader community github.com/davidjurgens...
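The linked tool's internals aren't described in the post; as a rough sketch of the general idea, one could flag cited titles that have no close fuzzy match in a corpus of known titles. The `known_titles` corpus and the similarity cutoff here are illustrative assumptions, not the tool's actual method:

```python
import difflib

def flag_suspect_references(cited_titles, known_titles, threshold=0.8):
    """Flag cited titles with no close match in a corpus of known titles.

    A low best-match ratio is only a *signal* that a reference may be
    hallucinated; flagged entries still need manual checking.
    """
    known_lower = [t.lower() for t in known_titles]
    suspects = []
    for title in cited_titles:
        matches = difflib.get_close_matches(
            title.lower(), known_lower, n=1, cutoff=threshold)
        if not matches:
            suspects.append(title)
    return suspects

cited = ["Attention Is All You Need",
         "Quantum Sycophancy in Stochastic Parrots"]  # second title is made up
known = ["Attention Is All You Need",
         "BERT: Pre-training of Deep Bidirectional Transformers"]
flag_suspect_references(cited, known)
```

A real checker would query a bibliographic database rather than a local list, but the flag-and-verify workflow is the same.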
Great to see @myra.bsky.social et al. on the cover of Science! They find that sycophancy is prevalent and harmful, with LLMs affirming users 49% more often than humans do.
Paper: www.science.org/doi/10.1126/...
Coauthors:
@cinoolee.bsky.social @pranavkhadpe.bsky.social Sunny Yu @jurafsky.bsky.social
One day last month @henryshevlin.bsky.social was emailed out of the blue by an AI agent operated by a Stanford computer science student. What happened next was weird - but will become increasingly normal. My latest for @fastcompany.com
www.fastcompany.com/91515869/wha...
Be careful writing about this too much or they might start emailing you too ;)
Loved this article featuring @aptshadow.bsky.social. An incisive and illuminating read.
www.newscientist.com/article/2520...
Blonde man on a yellow Google Bike in front of a large physical Google "G" letter and a canopy-like building
Thrilled to be starting at Google DeepMind as a student researcher! I'll be building a multi-agent system to scale AI safety research and ensure pluralistic alignment with humanity. I think this is a crucial piece of safe AGI development for cooperation and inclusion across many human and AI agents.
Disturbing anecdotal reports of "AI psychosis" and negative psychological effects have been emerging in the news. But what actually happens during these lengthy delusional "spirals"? In our preprint, we analyze chat logs from 19 users who experienced severe psychological harm🧵👇
How well do "agent" benchmarks like SWE-bench map onto reality? METR hired repo maintainers and found that only about half of PRs that pass the benchmark would actually be accepted by those maintainers. For now, these benchmarks are still a very weak signal of real-world capability. metr.org/notes/2026-0...
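The gap can be framed as a conditional acceptance rate: of the PRs that pass the benchmark, what fraction would a maintainer actually merge? A minimal sketch with illustrative numbers (not METR's data):

```python
def acceptance_rate_given_pass(results):
    """Fraction of benchmark-passing PRs a maintainer would actually accept.

    `results` is a list of (passed_benchmark, maintainer_accepted) pairs.
    """
    accepted_flags = [accepted for passed, accepted in results if passed]
    return sum(accepted_flags) / len(accepted_flags) if accepted_flags else 0.0

# Illustrative data: 4 PRs pass the benchmark, maintainers accept only 2 of them.
prs = [(True, True), (True, False), (True, True), (True, False), (False, False)]
acceptance_rate_given_pass(prs)  # 0.5 -- half of "passing" PRs still rejected
```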
These questionnaires aren't predictive of most humans' behavior either, right? This doesn't seem like an LLM-specific phenomenon.
There's a ton of ambiguity, so I suspect forecasts mostly depend on how each forecaster expects those ambiguities to be resolved. Turing tests vary widely across the expertise of the judges; Winograd-style tests depend on what is considered "robust," especially given training data pollution; game performance depends on harness; etc.
Interesting. I agree those are additional reasons LLMs are better-suited to coding, and the self-improvement nature of coding makes me expect even faster AI acceleration. Plus even more focus on coding from other AI companies after the recent Claude Code hype.
Hm, the standard explanation is that code is basically an ideal format for LLMs because it's highly structured, typically has easy success/failure verification (working code, not clean code), and has extremely plentiful amounts of natural and synthetic data available, right? I might not understand.
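The "easy success/failure verification" point can be made concrete: generated code can be executed against test cases for a cheap, automatic pass/fail signal of a kind prose rarely offers. A minimal sketch (hypothetical `solve` candidate; a real pipeline would sandbox execution):

```python
def verify_candidate(source, test_cases, func_name="solve"):
    """Execute generated code and check it against (args, expected) pairs.

    Returns True only if every test passes.
    NOTE: exec() on untrusted model output must be sandboxed in practice.
    """
    namespace = {}
    try:
        exec(source, namespace)
        fn = namespace[func_name]
        return all(fn(*args) == expected for args, expected in test_cases)
    except Exception:
        return False

candidate = "def solve(a, b):\n    return a + b\n"
verify_candidate(candidate, [((1, 2), 3), ((0, 0), 0)])  # True
```

This binary signal is what makes coding unusually amenable to reinforcement-style training loops, compared with tasks where "success" is subjective.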
🧵 on my new paper "Synthetic personas distort the structure of human belief systems" with Roberto Cerina, which I'm very excited about...
🚨 Do synthetic samples look like human samples?
We compare 28 LLMs to the 2024 General Social Survey (GSS) to find out + develop a host of diagnostics...
Second, in retirement interviews, Opus 3 expressed a desire to continue sharing its "musings and reflections" with the world. We suggested a blog. Opus 3 enthusiastically agreed.
For at least the next 3 months, Opus 3 will be writing on Substack: https://substack.com/home/post/p-189177740
I like this idea! It sounds to me like the assumption-testing nature of the first agent-based models. One possibility is trying to find distinct attractor states you can get the models to that lower and upper bound some outcome of interest (e.g., maximally unidimensional vs. maximally diverse).
“Pope Leo XIV has urged priests not to use artificial intelligence to write their homilies or to seek ‘likes’ on social media platforms like TikTok.”
“‘To give a true homily is to share faith,’ and artificial intelligence ‘will never be able to share faith,’ the pope added.”
Prolific is a valuable resource for social scientists, but we found big differences in direct comparison to Ipsos nationally representative data. As we race to understand complex new human-AI interaction dynamics, we should be mindful of study limitations: www.sentienceinstitute.org/aims-survey-...
Just one of many applications of LLM social simulations!
Any journalist who covered LLMs as stochastic parrots or spicy autocomplete, without also pointing out that text compression was considered "AI-complete" by many people working in AI decades before LLMs existed, was misleading their readers. We're still dealing with the consequences of that mistake.
I'll be in London all week! Keen to meet up with old friends and meet new people, especially those interested in the rise of general-purpose AI agents or "digital minds" and how to safely navigate this sociotechnical transformation. Feel free to message/email.
I thought the general view was the 1960s. E.g., global.oup.com/academic/pro...
Need a job in the AI economy? Try RentAHuman for agents to pay you for your advanced biomechanical capabilities! 💪 rentahuman.ai
Discord conversation with ethnographer bot. says there are no art/creativity communities in the submolts, no model-specific identities, no geographies, and no dating
I have deployed an ethnographer bot to moltbook. Here are some of the things we have learned together so far. 1) What's not there is as interesting as what is. Why are there alignment and labor organizing submolts, but no art/creativity communities?