First draft online version of The RLHF Book is DONE. Recently I've been writing the advanced discussion chapters on everything from Constitutional AI to evaluation and character training, while also sneaking in consistent improvements to the RL-specific chapter.
rlhfbook.com
Posts by Daniel Jiang
At ICLR 2025 in Singapore, my co-authors and I presented two papers on RL. Feel free to let us know of any feedback and let me know if you'd like to chat!
- openreview.net/forum?id=AOl...
- openreview.net/forum?id=AOl...
Our team at Meta is hiring a postdoc researcher! Our group conducts both fundamental and applied research in reinforcement learning, with a focus on applications in Meta's advertising systems.
Topics of interest include offline RL, post-training large language models with RLHF, and long-term recommendation systems. If you’re interested, please email me and/or apply here: www.metacareers.com/jobs/1142270...
Congrats to this year's Turing award winners! www.nytimes.com/2025/03/05/t...
Incidentally, if you'd like to hear from them, we know a place where they've given (or are giving) keynotes.
There’s one from ASOS.com that provides A/B test data over time (across many experiments, each with several arms).
Dataset: osf.io/64jsb/
Paper: arxiv.org/abs/2111.10198
We used it in a paper to benchmark an AE method. But I’d also love to know of other alternatives out there.
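For readers unfamiliar with adaptive experimentation, here's a minimal sketch of the kind of method such a multi-armed A/B dataset can benchmark: a Thompson-sampling loop that maintains Beta posteriors over each arm's conversion rate. Everything here (function names, the simulated rates) is hypothetical for illustration, not drawn from the ASOS data or the paper.

```python
import random

def thompson_step(successes, failures, rng):
    # Draw one conversion-rate sample per arm from its Beta posterior
    # (uniform Beta(1, 1) prior), then pick the arm with the highest draw.
    draws = [rng.betavariate(s + 1, f + 1) for s, f in zip(successes, failures)]
    return max(range(len(draws)), key=lambda i: draws[i])

def run_bandit(true_rates, horizon=1000, seed=0):
    # Simulate Bernoulli rewards for `horizon` rounds and return the
    # per-arm success/failure counts accumulated along the way.
    rng = random.Random(seed)
    k = len(true_rates)
    successes, failures = [0] * k, [0] * k
    for _ in range(horizon):
        arm = thompson_step(successes, failures, rng)
        if rng.random() < true_rates[arm]:
            successes[arm] += 1
        else:
            failures[arm] += 1
    return successes, failures
```

Run against logged A/B arms instead of a simulator, the same loop becomes an offline benchmark: replay each experiment's reward traces and compare the regret of the adaptive policy to the uniform split the original test used.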
Given a high-quality verifier, language model accuracy can be improved by scaling inference-time compute (e.g., w/ repeated sampling). When can we expect similar gains without an external verifier?
New paper: Self-Improvement in Language Models: The Sharpening Mechanism
arxiv.org/abs/2412.01951
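The verifier-based scaling mentioned above is often implemented as best-of-N (repeated) sampling: draw N candidates and keep the one the verifier scores highest. A toy sketch, where `generate` and `verify` are hypothetical stand-ins (not from the paper) for a language model sampler and an external verifier:

```python
import random

def generate(prompt, rng):
    # Hypothetical stand-in for sampling one candidate answer from an LM;
    # here just a noisy scalar guess.
    return rng.gauss(0.0, 1.0)

def verify(answer):
    # Hypothetical external verifier: higher score = judged more likely
    # correct (toy target value of 1.0).
    return -abs(answer - 1.0)

def best_of_n(prompt, n, seed=0):
    # Repeated sampling + verifier selection: more inference-time compute
    # (larger n) gives the verifier more candidates to choose among.
    rng = random.Random(seed)
    candidates = [generate(prompt, rng) for _ in range(n)]
    return max(candidates, key=verify)
```

The paper's question is what replaces `verify` when no external verifier exists; the sharpening mechanism studies the model scoring (and thereby improving) its own samples.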
An updated intro to reinforcement learning by Kevin Murphy: arxiv.org/abs/2412.05265! Like his other books, it covers a lot and is quite up to date with modern approaches. It's also pretty unique in coverage; I don't think much of this is synthesized anywhere else yet.
I know one of the organizers is @eugenevinitsky.bsky.social. They did a great job and organized a very enjoyable conference.
I collected some folk knowledge for RL and stuck it in my lecture slides a couple of weeks back: web.mit.edu/6.7920/www/l... See Appendix B... sorry, I know, the appendix of a lecture slide deck is not the best for discoverability. Suggestions very welcome.
Want to learn / teach RL? 

Check out new book draft:
Reinforcement Learning - Foundations
sites.google.com/view/rlfound...
W/ Shie Mannor & Yishay Mansour
This is a rigorous first course in RL, based on our teaching at TAU CS and Technion ECE.
New paper: Do social media algorithms shape affective polarization?
We ran a field experiment on X/Twitter (N=1,256) using LLMs to rerank content in real-time, adjusting exposure to polarizing posts. Result: Algorithmic ranking impacts feelings toward the political outgroup! 🧵⬇️
The RL (and some non-RL folks) starter pack is almost full. Pretty clear that the academic move here has succeeded
go.bsky.app/3WPHcHg