Still hiring PhD candidates who are *specifically* excited about building and deploying RL systems for self-driving vehicles and other multi-agent planning settings. Shoot me an email if you think this is you, and please help spread the word!
Posts by Bernhard Jaeger
We are adjusting an important lever for better working conditions (#Arbeitsbedingungen) in academia (#Wissenschaft): full-time positions are to become the norm for doctoral researchers in Germany.
"We want to help further develop and improve the academic system (#Wissenschaftssystem) in Germany."
sohub.io/kyr3
Thank you for making the world a better place!
Naver AI has put the word "World" into "World Models", at least on a metropolis scale: A world model for Seoul using Naver Maps (the Seoul capital area is 25.6 million people, btw.).
seoul-world-model.github.io
arxiv.org/abs/2603.15583
@jnhwkim.bsky.social
Regularized self-play RL in grounded simulation effectively adapts driving policies to completely new cities.
Really enjoyed collaborating on this work, led by Zilin and Saeed! Check out Zilin's post below for a great summary
🧵: x.com/nirhso/statu...
arxiv.org/abs/2602.15891
Excited to share REPPO, a new on-policy RL agent!
TL;DR: Replace PPO with REPPO for fewer hyperparameter headaches and more robust training.
REPPO, led by @cvoelcker.bsky.social, will be presented at ICLR 2026. How does it work? 🧵
It had been previously reported. Waymo got it taken down very quickly, but not before the Internet Archive got a copy of the text summary:
web.archive.org/web/20250703...
Congratulations Andreas!
haha, thought it was weird that the integers were defined as float and thought it was about cache line optimizations.
Tübingen AI Research Building, where the Cluster of Excellence "Machine Learning" is based.
We're hiring: W3-Professorship in Machine Learning in Physics @unituebingen.bsky.social! What we're looking for: Established research profile in a core area of #physics (condensed matter, quantum or theoretical particle physics), strong track record in research questions related to #ML and/or #AI.
is faster?
What if you could train agents on a *decade* of driving experience in *under an hour*, on a single GPU?
Excited to share PufferDrive 2.0: A fast, friendly driving simulator with RL training via PufferLib at 300K steps/sec
youtu.be/LfQ324R-cbE?...
Our new E2E driving method, TransFuser v6, is out on arXiv.
It outperforms all other methods on CARLA by a wide margin, 95 DS on Bench2Drive!
We show that minimizing the asymmetry between data annotator and policy is key for strong IL results.
Code, models, and paper:
ln2697.github.io/lead/
Our staff's 2025 recommendations for individual donors, fresh off the press: coefficientgiving.org/research/su...
This article is from people who have thought about FROs for years and have experience with what works and what doesn't.
I have always appreciated the restraint in defining the niche of FROs in the broader ecosystem; it comes out clearly in this piece.
www.essentialtechnology.blog/p/the-future...
Unfortunately it appears much of the academic community has reconstituted itself on LinkedIn
I am so happy and excited that this project got funded!
True, you could try to collect a dataset with good coverage by running online RL first, and then do offline RL in future iterations to save sim compute.
This is not bringing back offline RL (but online RL). The purpose of closed-loop training here is to gather data in OOD states with the model.
Stitching doesn't work if your base dataset doesn't cover the state space well, which is the case in autonomous driving.
Speaking of RL, Nvidia also just published a survey on the importance of closed-loop training (RL, etc.) in E2E driving.
research.nvidia.com/publication/...
Tired Europe: Let's do tons of AI regulations
Wired Europe: Let's do tons of AI open source
#aiPULSE2025
This essay, roughly on dual use, has been haunting me for a while now:
dl.acm.org/doi/pdf/10.1...
Excited to be at #Neurips2025 this week to present our paper "Monoculture or Multiplicity: Which is it?", joint work with Moritz Hardt.
Paper #1000: openreview.net/pdf?id=DO5Lt...
Wed, Dec 3, 2025 • 4:30 PM – 7:30 PM
Feel free to come by and reach out!
A short 🧵.
Attending #Neurips2025? Get your personalized Scholar Inbox conference program now to easily navigate the poster sessions and find what you are looking for:
www.scholar-inbox.com/conference/n...
Scholar Inbox for NeurIPS is live now.
www.nature.com/articles/d41...
arXiv banned surveys due to AI slop spam.
Now we need to wait for them to be peer-reviewed.
Bad development, we need to find better solutions to AI slop than banning unreviewed papers.
Getting a survey reviewed at a good journal can take over a year. :(
Quick reminder about the EPFL PhD program deadline (EDIC) on Dec 15.