Sugar-shack with the lab!! 🍁
Saying goodbye to a very long winter, and welcoming sunnier days 🤩🍃
Posts by Chandar Research Lab
🗣️ Shoutout to the authors: Pranshu Malviya, Balaraman Ravindran and @sarath-chandar.bsky.social!!! (published at CoLLAs 2022).
🔗 Learn more at: lnkd.in/eSSd9m56
This was the first work to show that adaptive gradient optimizers can be used successfully for lifelong learning and even beat Stochastic Gradient Descent (in final accuracy: RMSProp < SGD < TAG-RMSProp!).
🔥 Across benchmarks, TAG was shown to improve final accuracy over baselines like ER and A-GEM.
🔥 TAG tracks gradient traces from previously learned tasks and estimates task similarity:
⬇️ Lower α → related tasks → transfer is encouraged
⬆️ Higher α → conflicting tasks → reduce interference
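The α-modulation described above can be sketched in a few lines. This is a toy illustration, not the paper's implementation: the function name, the similarity-to-α mapping, and all hyperparameters are my assumptions.

```python
import numpy as np

def tag_rmsprop_step(param, grad, second_moment, task_trace, prev_traces,
                     lr=1e-3, beta=0.99, b=5.0, eps=1e-8):
    """One hypothetical TAG-style RMSProp step (illustrative only).

    alpha is derived from the similarity between the current task's
    gradient trace and stored traces of previous tasks: related tasks
    (high similarity) -> lower alpha -> larger effective step (transfer
    encouraged); conflicting tasks -> higher alpha -> smaller step
    (interference reduced).
    """
    # Standard RMSProp second-moment accumulator.
    second_moment = beta * second_moment + (1 - beta) * grad**2

    # Cosine similarity between the current trace and each stored trace.
    sims = [np.dot(task_trace, t)
            / (np.linalg.norm(task_trace) * np.linalg.norm(t) + eps)
            for t in prev_traces]
    # Map mean similarity to alpha: sim=+1 -> small alpha, sim=-1 -> large alpha.
    alpha = np.exp(-b * np.mean(sims)) if sims else 1.0

    param = param - (lr / alpha) * grad / (np.sqrt(second_moment) + eps)
    return param, second_moment
```

With a stored trace pointing the same way as the current one, α shrinks and the step grows; flip the trace's sign and the step is damped instead.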
In lifelong learning, acquiring new tasks can cause ML models to forget previously learned knowledge. For this, our lab introduced TAG (Task-based Accumulated Gradients), a general wrapper on top of adaptive gradient optimizers. 📈
As we hope to see women thrive in research more with each passing year, we encourage them all to apply to our lab for internships, Master’s or PhD degrees with
@sarath-chandar.bsky.social !
The Chandar Research Lab remains committed to supporting women and other underrepresented communities @mila-quebec.bsky.social and in ML, with initiatives such as the graduate application assistance program and a Computer Science summer school for high school students heading to undergrad. 👩🔬
This is only a sneak peek at the work they did last year, as much of their research is still under submission. Stay tuned for more exciting papers spanning ML for biology, model merging, continual learning, and more!
Generalization Can Emerge in Tabular Foundation Models From a Single Table by Nour Shaheen at the AI for Tabular Data workshop @euripsconf.bsky.social 2025!
arxiv.org/abs/2511.09665
The Expressive Limits of Diagonal SSMs for State-Tracking by Behnoush Khavari @iclr-conf.bsky.social 2026.
iclr.cc/virtual/2026...
NeoBERT: A Next Generation BERT by @lola-le-breton.bsky.social published @tmlr-pub.bsky.social and @iclr-conf.bsky.social in Rio this year.
arxiv.org/abs/2502.19587
Revisiting Replay and Gradient Alignment for Continual Pre-Training of Large Language Models by Istabrak Abbes @collasconf.bsky.social
arxiv.org/abs/2508.01908
📜 Small Encoders Can Rival Large Decoders in Detecting Groundedness by Istabrak Abbes, published at
@aclmeeting.bsky.social 2025.
aclanthology.org/2025.finding...
Maryam Hashemzadeh, @lola-le-breton.bsky.social, Istabrak Abbes, Nour Shaheen, Behnoush Khavari, Anabel Tan and @katelobacheva.bsky.social. Give them a follow and look at this list of their publications with our lab in the past year!⬇️
This week, as we celebrated International Women’s Rights Day for the 115th time on Sunday, the Chandar Lab wanted to pay tribute to all the amazing women doing research 👩🎓 and to highlight the cutting-edge work they do at our lab every day... 🧵
Work done by @nilaksh404.bsky.social, Antoine Clavaud, @mreymond.bsky.social, Francois Rivest, and @sarath-chandar.bsky.social
Check out the paper at: arxiv.org/abs/2602.09396
Code: github.com/chandar-lab/...
Look at the latents! t-SNE analysis shows that our method (top) learns structured, temporally coherent representations faster than standard streaming RL.
Our method systematically outperforms existing baselines across Atari, MinAtar, and Octax. The best part? It remains efficient enough to train on just a few CPU cores.
Streaming data is highly correlated, which usually causes poor training. To fix this, we introduced Orthogonal Gradient Updates. By projecting gradients onto a subspace orthogonal to their history, we keep learning stable and effective.
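The projection step described above can be sketched as follows. This is a minimal toy version: the Gram-Schmidt basis maintenance and the basis-size cap are my assumptions, not the paper's exact method.

```python
import numpy as np

def project_orthogonal(grad, history_basis):
    """Project grad onto the subspace orthogonal to past update directions.

    history_basis: list of orthonormal vectors spanning the history
    subspace. Removing the components along past updates counteracts the
    strong temporal correlation of streaming data.
    """
    g = grad.copy()
    for u in history_basis:
        g -= np.dot(g, u) * u
    return g

def update_basis(history_basis, grad, max_vectors=8, eps=1e-8):
    """Gram-Schmidt step: add the novel component of grad to the basis,
    keeping at most max_vectors directions."""
    residual = project_orthogonal(grad, history_basis)
    norm = np.linalg.norm(residual)
    if norm > eps:
        history_basis.append(residual / norm)
    return history_basis[-max_vectors:]
```

For example, with history direction (1, 0), the gradient (1, 1) is projected to (0, 1): only the component the agent has not already moved along survives.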
We bring Self-Predictive Representations (SPR) to the streaming pipeline. By predicting future latent states, we force the encoder to learn much richer features from every observed frame: without needing a massive memory footprint of a replay buffer.
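A hypothetical sketch of the self-predictive objective, with linear maps standing in for real networks; the shapes, names, and loss form are illustrative assumptions, not the paper's architecture.

```python
import numpy as np

def spr_loss(encoder, transition, obs_t, obs_tp1, action_emb):
    """Toy self-predictive representation loss.

    The encoder maps observations to latents; the transition model
    predicts the next latent from the current latent plus the action
    embedding. Minimizing the prediction gap forces the encoder to
    extract richer features from each frame, with no replay buffer.
    """
    z_t = encoder @ obs_t
    z_tp1_target = encoder @ obs_tp1  # would be a stop-gradient/EMA target in practice
    z_tp1_pred = transition @ np.concatenate([z_t, action_emb])
    diff = z_tp1_pred - z_tp1_target
    return float(np.mean(diff**2))
```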
Without a replay buffer, streaming agents struggle to build meaningful representations. Traditional value-based losses alone can’t exploit the full informational content of transient data before it's gone.
Streaming Reinforcement Learning (RL) is a huge challenge: transitions are used once and discarded immediately. This makes agents extremely sample-inefficient. But what if we could "squeeze" more information out of every single frame?
Check out our latest paper!
Shoutout to the authors: Kamran Chitsaz, Milad Aghajohari, @a-kazemnejad.bsky.social. Supervised by: @sarath-chandar.bsky.social, @murefil.bsky.social, Aaron Courville and @sivareddyg.bsky.social
🔗 Learn more at: arxiv.org/abs/2510.06557
🔗Build with: github.com/McGill-NLP/the-markovian-thinker
🧩 Even state-of-the-art models exhibit Markovian Thinking zero-shot: both GPT-oss-120B and Qwen3-30B-A3B match LongCoT performance with no special prompting or training, and yield plenty of in-distribution positive samples at initialization, so RL with Delethink is primed to scale!!
🔥 Further, we scaled DeepSeek R1-1.5B to a thinking budget of 96K in 150 RL steps. Accuracy jumped, with mean trace lengths at around 40K tokens.
Markovian Thinking is instantiated by Delethink, an RL environment. With it, we trained DeepSeek R1-1.5B and demonstrated:
1️⃣ The same scaling as LongCoT-RL, but at lower costs,
2️⃣ Better test-time scaling, improving past 24K tokens, while LongCoT-RL plateaus.
3️⃣ All this while keeping linear costs!!
Markovian Thinking works by:
1️⃣ Making LLMs reason in 8K chunks.
2️⃣ At each boundary, context is reset and a small textual state from the last chunk is carried over.
🔃 Generation then continues from that state.
✅ This decouples thinking length from context size, achieving linear compute and constant memory!
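The loop above can be sketched roughly as follows. The function names, the `[state]` delimiter, and the `FINAL:` convention are my assumptions for illustration; the real Delethink environment lives in the linked repo.

```python
def markovian_generate(generate_chunk, extract_state, question,
                       chunk_tokens=8192, max_chunks=12):
    """Chunked reasoning sketch: at each boundary the context is reset
    and only a short textual state is carried over, so memory stays
    constant and total compute grows linearly in the number of chunks."""
    state = ""
    for _ in range(max_chunks):
        # The prompt holds only the question plus the small carried state,
        # never the full reasoning trace so far.
        prompt = f"{question}\n[state]{state}[/state]"
        chunk = generate_chunk(prompt, max_tokens=chunk_tokens)
        if "FINAL:" in chunk:
            return chunk.split("FINAL:", 1)[1].strip()
        state = extract_state(chunk)  # e.g. the chunk's last few sentences
    return state
```

Because the prompt length is bounded by the chunk size plus the carried state, attention cost per chunk is constant, which is where the linear-compute, constant-memory property comes from.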
‘The Markovian Thinker’, developed by our lab, has been accepted at @iclr-conf.bsky.social! This work achieves long reasoning without the quadratic attention tax by making LLMs reason in chunks with a bounded state, delivering linear compute, constant memory, and scaling beyond its training limits! 🔥
📝 openreview.net/forum?id=5bg...
Joint work of Mehran Shakerinava, Behnoush Khavari, Siamak Ravanbakhsh and @sarath-chandar.bsky.social @mila-quebec.bsky.social .
Takeaways for architecture design:
- Diagonal structure imposes a precise group-theoretic ceiling on expressivity
- Depth helps in a principled way (one layer per Abelian factor)
- But training algorithms need to catch up: expressivity alone isn't enough