We have the keynote speakers for RLC2026 now!
Thrilled to welcome Rika Antonova, Sheila McIlraith, Marc G. Bellemare, Danijar Hafner, Balaraman Ravindran!
Details: rl-conference.cc/index.html
The RL community is coming together this August in Montréal, Québec, Canada. Hope you make it!
Posts by Pablo Samuel Castro
Applications for the global Google PhD Fellowship Program are open! Fellowships directly support grad students doing exceptional and innovative research in computer science and related fields as they pursue their PhD. Learn more and apply by April 30 at goo.gle/phdfellowship.
Exploring SIGReg regularization for RL has been on my "todo when I have time" list for a while.
While I haven't fully read this paper yet, it appears to be a fantastic, thorough investigation, and it looks quite promising!
New paper 🚨
"Stable Deep Reinforcement Learning via Isotropic Gaussian Representations"
Deep RL suffers from unstable training, representation collapse, and neuron dormancy. We show that a simple geometric insight, isotropic Gaussian representations, can fix this. Here's how 👇
The core problem: deep RL is inherently non-stationary.
📉As the agent learns, its data distribution and learning targets keep shifting. This causes representations to degrade, neurons to go dormant, and performance to plateau or collapse.
We ask: what structure should learned representations have to remain stable under continuously evolving targets? 🤔
Answer: Isotropic Gaussian representations are provably advantageous. 💡
Why? Three reasons:
1️⃣ Isotropy equalizes contraction — when the feature covariance Σ = σ²I, the tracking error is uniformly damped across ALL directions. Anisotropic representations amplify drift along poorly conditioned directions, destabilizing learning.
See the blue term in Theorem 3.1:
2️⃣ Gaussianity controls drift variance — among all isotropic distributions, Gaussians minimize fluctuations in the drift term. Heavier-tailed distributions (Laplace, logistic) introduce higher-order moments that cause destabilizing spikes.
See the red term in Theorem 3.1:
3️⃣ Maximum entropy — Gaussians maximize entropy under a fixed variance budget. This keeps all representational dimensions active and useful.
To enforce this in practice, we use SIGReg (Sketched Isotropic Gaussian Regularization), proposed by Balestriero & LeCun, 2025:
💡 Project embeddings onto random directions and match each projection to a Gaussian.
Simple ✅
Differentiable ✅
Computationally cheap ✅
We first validate on non-stationary CIFAR-10 (shuffled-labels protocol). Without SIGReg: rank collapse, dormant neurons, poor recovery after each shift. With SIGReg: stable rank, near-zero dormancy, fast recovery.
On Atari with PQN, SIGReg improves performance in 51/57 games (89.5%), with a mean AUC improvement of 889% and a median of 138%. These gains are broad, not driven by a few outliers. The benefits extend to PPO as well, where performance improves in 70% of games.
And beyond Atari: on Isaac Gym continuous control (AllegroHand, ShadowHand, Humanoid, Ant…), SIGReg improves training stability and reduces variance across all tasks.
🖼️ Big picture takeaway 🖼️
Representation collapse and neuron dormancy aren't just symptoms to treat; they're predictable consequences of unconstrained representation geometry under non-stationarity.
🔑 Fixing the geometry fixes the learning 🔑
Read the full paper: arxiv.org/abs/2602.19373
This project was led by my students Ali Saheb Pasand & @johanobandoc.bsky.social, in collaboration with Aaron Courville & @bashivan.bsky.social
Code is available at: github.com/asahebpa/Iso...
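The projection idea in this thread — project embeddings onto random directions and push each 1-D projection toward a standard Gaussian — can be sketched roughly like this (my own toy NumPy code with made-up names and hyperparameters, not the authors' implementation; a real version would match the full distribution and backpropagate through the penalty):

```python
import numpy as np

def sigreg_sketch(embeddings, num_directions=16, seed=0):
    """Rough SIGReg-style penalty (illustrative only): project embeddings
    onto random unit directions and penalize deviation of each 1-D
    projection from N(0, 1) via its first two moments."""
    rng = np.random.default_rng(seed)
    n, d = embeddings.shape
    # Random unit-norm projection directions (the "sketch").
    dirs = rng.standard_normal((d, num_directions))
    dirs /= np.linalg.norm(dirs, axis=0, keepdims=True)
    proj = embeddings @ dirs            # shape (n, num_directions)
    mean = proj.mean(axis=0)            # ~0 under N(0, 1)
    var = proj.var(axis=0)              # ~1 under N(0, 1)
    return float(np.mean(mean**2) + np.mean((var - 1.0)**2))

# An already-isotropic Gaussian batch incurs a small penalty;
# a collapsed (rank-1) batch incurs a much larger one.
rng = np.random.default_rng(1)
good = rng.standard_normal((4096, 32))
collapsed = np.outer(rng.standard_normal(4096), rng.standard_normal(32))
print(sigreg_sketch(good) < sigreg_sketch(collapsed))  # True
```

The point of the sketch: the penalty is small exactly when every random direction "sees" a unit-variance, zero-mean projection, which is what isotropy plus Gaussianity guarantees.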
In light of the ongoing conflict in the Middle East, RLC decided to remove the abstract deadline: rl-conference.cc/callforpaper...
The only deadline is for the full paper: Mar 5(AOE) openreview.net/group?id=rl-...
Affected folks may also contact the PCs to discuss deadline extensions before Mar 5.
RLC 2026 Call for Workshops is live on OpenReview!
Submission deadline: Mar 12 (AoE).
Full details here: rl-conference.cc/call_for_wor...
@glenberseth.bsky.social @eugenevinitsky.bsky.social @twkillian.bsky.social @schaul.bsky.social @sologen.bsky.social @audurand.bsky.social @bradknox.bsky.social
It's been such a pleasure working with @caroline-wang.bsky.social the last few months. She is a fantastic researcher, engineer, & person, and I really hope we can work together again.
Check out the paper that came out of her time with us, comparing strategic behavior of humans and LLMs!👇🏽
This is something I talk about in my paper, where I suggest being explicit about γ_train (some methods use multiple gammas during training) and γ_eval.
One of my students is empirically investigating this and, as one would expect, it can have a huge impact.
arxiv.org/abs/2510.16175
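A toy illustration of why being explicit about the discount matters (my own numbers, not from the paper): the same reward stream scored under, say, γ_train = 0.99 vs γ_eval = 1.0 gives very different returns.

```python
def discounted_return(rewards, gamma):
    """Compute sum_t gamma**t * r_t by folding from the last step back."""
    g = 0.0
    for r in reversed(rewards):
        g = r + gamma * g
    return g

rewards = [1.0] * 100  # reward of 1 at every step for 100 steps
print(discounted_return(rewards, 0.99))  # ~63.4
print(discounted_return(rewards, 1.00))  # 100.0
```

So a policy tuned to maximize the γ = 0.99 objective can be meaningfully suboptimal for the undiscounted return it is often evaluated on, which is why reporting both gammas matters.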
Today's wordle spoke volumes.
Wordle 1,677 2/6
⬛🟩⬛🟩🟩
🟩🟩🟩🟩🟩
P.S. here's a selfie of my x-country ski commute this morning 😃🎿
Today's wordle energized me
Wordle 1,662 2/6
🟨🟩🟩⬛⬛
🟩🟩🟩🟩🟩
How I started 2026, hoping to keep this spirit up throughout the year!
Happy new year!
Another one of my favorite spots in Ecuador: El Pailón del Diablo 🇪🇨
Gets better the closer you get 🏔️🇪🇨
Quito, Ecuador
Will never tire of this view ❤️🇪🇨
Hi RL Enthusiasts!
RLC is coming to Montreal, Quebec, in the summer: Aug 16–19, 2026!
Call for Papers is up now:
Abstract: Mar 1 (AOE)
Submission: Mar 5 (AOE)
Excited to see what you've been up to. Submit your best work!
rl-conference.cc/callforpaper...
Please share widely!
Today's wordle made me feel warm.
Wordle 1,646 2/6
⬛🟨🟩⬛⬛
🟩🟩🟩🟩🟩
Despite some of its flaws, OpenReview has had a tremendously positive impact on our research community.
This is a great initiative! Join me, and many others, in donating to ensure OpenReview not only remains a positive force, but is able to improve.
It was really great to chat with so many friends that I don't always get a chance to see, and to meet new ones, especially during our morning runs.
I really enjoyed this #NeurIPS2025, but am also exhausted 🥰
Until next time!
bsky.app/profile/did:...