Markus Wulfmeier (@mwulfmeier) Bsky

Review request:
As usual for this time of year, I'll be looking for #IROS2026 reviewers. Highly interesting stack of papers on #ReinforcementLearning #Sim2Real #RewardLearning #LLMs #DataEfficiency #RoboticManipulation

Reach out with your ID or PaperCept-registered email address and your background.

1 month ago 1 0 0 0

Robotiq directly integrating touch, robustly and reliably (hopefully 😉), is a big deal. Looking forward to playing with these.

Now reannotating old teleop data might be unfeasible, but RL... 🚀

www.linkedin.com/posts/vision...

2 months ago 1 0 0 0

Nice perspective from Lu Li, Tianwei Ni, Yihao Sun, and Pierre-Luc Bacon on different conditions for offline-to-online learning.
lnkd.in/dUKc4jyV

We have been heavily relying on replay across experiments (lnkd.in/duVpCUru). Few experiments get started these days without initial datasets available.

3 months ago 0 0 0 0

Distributional RL has been the foundation of our recent projects (e.g. lnkd.in/eYiCp-y8).

Nice to see a distributional perspective on improvement for LLMs from Amin Rakhsha, Kanika Madan, Tianyu Zhang, Amir Farahmand, and Amir Khasahmadi, using mode estimation rather than best-of-N: lnkd.in/d4NpyegJ

3 months ago 0 0 1 0

Curriculum matters more for active data generation than for fixed datasets. Data defines what you can learn and RL curriculum enables finding different datasets.

Great to see our paper mentioned in @dwarkesh_sp’s blog post on RL efficiency.

dwarkesh.com/p/bits-per-sam…

arxiv.org/pdf/1707.05300

3 months ago 2 0 0 0

Didn't spend much time here and just had a look back: since when is most posting on bsky so aggressive and vile? And why? How can this get worse than X?

3 months ago 2 0 0 0

Thrilled with the possibilities of the new @BostonDynamics design for AI research.

3 months ago 2 0 0 0

While the human form is brilliant, it’s ultimately limited.

The real future of robotics lies in the permission to move beyond our own biological constraints and optimize.

Note, we didn't achieve supersonic flight by mimicking the flapping of wings.

3 months ago 61 7 29 4

Our own work took this to the extreme by fully replacing the SFT stage with (inverse) RL lnkd.in/d_VeJ9FG

It’s still an open question where on this spectrum we will eventually converge, but my estimate is we’ll land much closer to full RL across the entire pipeline.

3 months ago 1 0 0 0

We’re moving toward a unified paradigm where supervised learning adopts RL ideas for a better-conditioned pipeline. Thrilled to see this space growing, e.g. via PPO-style clipping (Proximal SFT - Zhu et al), direct KL regularization (Anchored SFT - Li et al), and importance sampling (Direct Fine-Tuning - Wu et al).

3 months ago 2 0 1 0

Taking parental leave for some deeply needed reading and it's fascinating to see the walls between SFT and RL crumbling.

🧵

3 months ago 1 0 1 0

Featuring: Nihar B. Shah, Nitya Thakkar, Yutaro Yamada, Joelle Pineau, and a panel including Chris Bregler, Tom Dietterich, Andrew McCallum, Nathan Srebro, Markus Wulfmeier, & James Zou

4 months ago 1 0 0 0

Join our NeurIPS social event: The Role of AI in Scientific Peer Review.
lnkd.in/dgdBiBqM

Help build community and explore solutions for a fair, efficient, and transparent peer review system.

Wed. Dec. 3rd, 7:00 PM – 9:00 PM (Upper Level Ballroom 6CDEF)

4 months ago 3 0 1 0

Join us to find out what's still missing share.google/AJZyTvg4HUPNZA…
(5+ role types for #Gemini #robotics @GoogleDeepMind)

4 months ago 0 0 0 0

Progress on robotics and AI has never been more closely linked. Access to leading frontier models has the potential to define what's possible in the physical world.

I'll be at #NeurIPS2025 for a couple of days next week. Find me if you want to know more.

x.com/sundarpichai...

4 months ago 1 0 1 0

Hindsight is 2025

4 months ago 1 0 0 0

I'll be at #NeurIPS2025 next week!

Looking forward to catching up with old and new friends - on AI for robots, Gemini Robotics, the RLnaissance in AI, and bad puns.

Bon appetit!

4 months ago 1 0 0 0

Expensive to run 😉

5 months ago 3 0 2 0

Congrats Nathan!

5 months ago 1 0 0 0

Truly enjoyed discussing the consolidation of specialist and generalist approaches to physical AI at #IROS2025.

Hoping to visit Hangzhou in physical rather than digital form myself in the not too distant future - second IROS AC dinner missed in a row.

#Robotics #physicalAI

6 months ago 1 0 1 0

Using Cognitive Models to Reveal Value Trade-offs in Language Models - Kempner Institute: People’s actions and words are the result of a balance of different goals. The authors use a leading cognitive model of this value trade-off in polite speech to systematically examine […]

The explosion of AI capability and complexity demands better understanding.

I firmly believe in studying large model behavior - from the perspective of artificial sequential decision making (like Inverse RL) to now linking it with human decision-making and cognitive models.

bit.ly/3Wqrtxl

6 months ago 1 0 0 0

'Papa, was ist das?'

Gemini has made my life as a parent massively easier. We found this caterpillar earlier and the answer is in fact correct (verification is much easier than generation thanks to classical Google search).

Very curious about how LLMs are continuing to change the ways we learn!

6 months ago 2 0 0 0

The position paper track at #NeurIPS2025 was a great idea, the acceptance rate of under 6% not so much!

This is unnecessarily low and will reduce interest in any future iterations of the track.

(Disclaimer: our team is part of the 94%)

@NeurIPSConf

6 months ago 10 0 0 0

If you want one brain for any robot, cross-embodiment learning is the key. Check out the new model and tech report for sparks of it: share.google/H5RBiwtCWnW7...

And catch up with the team (unfortunately without me this year) at #CORL2025!

More here:
x.com/sundarpichai...

6 months ago 2 0 0 0

Control problems are everywhere, but some applications of Reinforcement Learning are truly out of this world! 🌌

Our team's latest research @ Google DeepMind in #Science shows RL can improve sensitivity by 30-100x.

See how RL can accelerate cosmic discovery!👇 (Image #nanobanana)
lnkd.in/e8r7t3NJ

7 months ago 2 1 0 0

Other companies folded or dropped their divisions.

They gave up too early. Congratulations! It's not easy to accept that you have been wrong, and I'm lucky I had no bets. Lots of learnings. Now on to scaling the rest of robotics!

9 months ago 1 0 0 0

Waymo on X: "100 million real world, fully autonomous miles driven on public roads. That’s more than 200 trips to the Moon and back. Thank you riders. https://t.co/2EcE4Oqxnu"

#Robotics is hard, and so is #AutonomousDriving! Massive congratulations to my friends at #Waymo for proving me wrong!

x.com/Waymo/status...

There was a time a couple of years back when I was getting skeptical about the scale of both the technological and societal challenges of scaling autonomy...

9 months ago 1 1 1 0

Massive props to everyone organizing the #RSS2025 demo. I'm having a massive amount of FOMO here.

9 months ago 0 0 0 0

Ayzaan Wahid on X: "We took a robot to RSS in LA running our new Gemini Robotics On-Device VLA model. People interacted with the model with new objects and instructions in a brand new environment and the results were amazing! https://t.co/0qZiGK2h0g"

Large language and vision models alone don't solve the whole #robotics problem.
But they surely have a massive impact on generalisation and robustness!

New scenes, new backgrounds, new objects, new people, new language and audio...

x.com/ayzwah/statu...

9 months ago 0 0 1 0

Gemini Robotics on-device in simulation (YouTube video by Google DeepMind)

Don't have a robot? Try our newest Gemini Robotics On-Device VLA in simulation!

Or become a trusted tester and tune and adapt the model yourself!

www.youtube.com/watch?v=nVMY...

9 months ago 5 1 1 0
