Noodling on some lab documentation for LLM usage but high level:
1. LLMs still make bad design decisions. Do not let them make design choices.
2. You’re still responsible for your code correctness. This means reading a lot of code. Hence, bullet point 1.
Posts by Antonin Raffin
What Matters for Simulation to Online Reinforcement Learning on Real Robots
"We investigate what specific design choices enable successful online reinforcement learning (RL) on physical robots."
arxiv.org/abs/2602.20220
Stable-Baselines3 v2.8.0 comes with a fix for a long-standing (4 years!) bug in `MaskablePPO`, as well as default hyperparameters for unlisted environments in the RL Zoo and additional quality-of-life improvements.
Also, our documentation is now fully in Markdown!
github.com/DLR-RM/stabl...
Gittensor is paying crypto for merged OSS PRs and it’s generating slop contributions to repos listed on their platform without maintainer consent.
If you maintain an open source project, it's probably worth checking if you’re listed and requesting removal: gittensor.io/repositories
The Story of Python's Lazy Imports: why it took 3 years and 2 attempts to get a "lazy" keyword coming in version 3.15 #Python techlife.blog/posts/the-st...
TIL: when using pytest, you can list the ten slowest tests that take longer than 1.0s with `pytest --durations=10 --durations-min=1.0`. Pass `--durations=0 -vv` to show all durations.
See docs.pytest.org/en/6.2.x/usa...
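A minimal sketch to try the flags on, assuming a hypothetical file named `test_durations_demo.py` (both the file name and the two tests are made up for illustration). With `--durations=10 --durations-min=1.0`, only the sleeping test should appear in the slowest-durations report, since the other one finishes well under the threshold.

```python
# test_durations_demo.py (hypothetical file name, for illustration only)
import time


def test_fast():
    # Finishes almost instantly, so it falls below --durations-min=1.0
    assert sum(range(10)) == 45


def test_slow():
    # Sleeps just over one second, so it should show up in the report
    time.sleep(1.1)
    assert True
```

Run it with `pytest --durations=10 --durations-min=1.0 test_durations_demo.py`.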
One week left to apply to the Reinforcement Learning Summer School (Milan, 3-12 June 2026).
Don't miss this opportunity to dive deep into RL foundations and learn about the most recent applications!
rlsummerschool.com/application/
You can't write a compelling promotion packet about the thing you didn't build. And that's the whole problem.
terriblesoftware.org/2026/03/03/n...
DeepMind's RL team is hiring a research scientist: if you're passionate about RL, come work with us!
And if you know people who might be interested, please share:
job-boards.greenhouse.io/deepmind/job...
If you are in Europe, applications for the 2026 Reinforcement Learning Summer School are now open!
Location: Milan
Date: June 3–12, 2026
Website: rlsummerschool.com
folks, please don't submit LLM-generated PRs to open source projects. It makes no sense.
If the maintainers want to use an LLM to fix an issue, they can use Claude or whatnot directly. They don't need you as intermediary, that's just silly.
If they don't want to use LLMs, they have reasons.
Thanks to the MyST parser and rst-to-myst, you can easily convert your documentation to Markdown while still keeping all the features of Sphinx =)
github.com/DLR-RM/stabl...
bsky.app/profile/kahn...
The talk I gave about "Recent Advances in RL for Continuous Control" at CERN last year is now online =)
www.youtube.com/watch?v=Sb0d...
github.com/Stable-Basel...
contributions are welcome =) (the issue is from 2020...)
mainly lack of time, clean and readable implementation (and benchmark/comparison to other algos)
REPPO is in the updated version (in the references), PQL might be what you are looking for? (should be in the references too)
For MPO, I need to re-read the paper, then try to implement and benchmark it.
I also updated the slides recently for the RL Mannheim Workshop to include new SOTA algorithms from early 2026
araffin.github.io/slides/advan...
To understand how our radio buttons work I need to understand two separate component libraries and hundreds of lines of React.
If you missed this post last week, it explains pretty well how modern frontend works these days. :/
https://paulmakeswebsite
Q-value overestimation animation for my upcoming talk about "Recent Advances in RL for Continuous Control" at the Mannheim RL Workshop
This is something I talk about in my paper, where I suggest being explicit about \gamma_{train} (some methods use multiple gammas during training) and \gamma_{eval}.
One of my students is empirically investigating this and, as one would expect, it can have a huge impact.
arxiv.org/abs/2510.16175
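A toy illustration of why the discount factor used for evaluation is worth stating explicitly. This is only a sketch with made-up reward sequences and made-up gamma values, not the setup from the paper: the same two trajectories are ranked differently depending on which gamma is used to score them.

```python
def discounted_return(rewards, gamma):
    """Sum of gamma**t * r_t over one trajectory."""
    return sum(r * gamma**t for t, r in enumerate(rewards))


# Two hypothetical 10-step trajectories (values chosen for illustration)
early = [1.0] + [0.0] * 9  # small reward at the first step
late = [0.0] * 9 + [2.0]   # larger reward at the last step

for gamma in (0.9, 0.99):
    g_early = discounted_return(early, gamma)
    g_late = discounted_return(late, gamma)
    better = "early" if g_early > g_late else "late"
    print(f"gamma={gamma}: early={g_early:.3f}, late={g_late:.3f}, better={better}")
```

With gamma = 0.9 the early-reward trajectory scores higher, while with gamma = 0.99 the late-reward one does, so a comparison only means something once the evaluation gamma is known.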
Servo 0.0.4 showing new support for multiple windows
December in Servo…
🎤🧑🏫 FOSDEM talks next week!
🤹🪟 multiple windows
🪆🌐 HTTP proxy support
🔐🕵️ more SubtleCrypto algorithms
💽🗃️ new site data & network API
servo.org/blog/2026/01...
The export and preview menu, with the "PDF" section unfolded.
HTML preview & export now available in the web app! With HTML export, you can create a website from the same Typst file as your PDFs. This makes it easy to create documents that feel just as at home on the web as they do in print.
This network analyzer is very efficient and allows you to find interesting accounts, e.g. people followed by lots of the people you follow (but not by you).
bsky-follow-finder.theo.io
(Reposting this for folks who have joined Bsky more recently)
People wanted our Open Source Organizations starter pack to include many projects, so we decided to give them their own starter pack.
go.bsky.app/HvKFRKa
"uv is fast because of what it doesn’t do, not because of what language it’s written in"
Using AI coding for data analysis without personal programming skill fills me with dread.
Small errors in the code poison results in ways that may not be obvious.
LLMs are great when people verify outputs; the path to hell is when they don't.