#ICLR2026
Google's TurboQuant Compresses AI Memory by 6x — With Zero Accuracy Loss

Google Research published TurboQuant, a training-free compression algorithm that shrinks LLM key-value cache memory by at least 6x and speeds up attention by up to 8x on H100 GPUs — without any accura...

techlife.blog/posts/google...

#Google #TurboQuant #LLM #AIEfficiency #KVCache #ICLR2026 #MachineLearning #Compression
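The post only says TurboQuant is training-free KV-cache compression; it does not describe the mechanism. For readers unfamiliar with the general idea, here is a generic sketch of the kind of low-bit key/value quantization such methods build on. Everything in it (per-channel symmetric scaling, 4-bit levels, the tensor shapes) is an illustrative assumption, not TurboQuant's actual algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
kv = rng.normal(size=(8, 64)).astype(np.float32)  # toy cache: 8 tokens x 64 dims

def quantize_dequantize(x, bits=4):
    """Symmetric per-channel quantization: fp32 -> low-bit grid -> fp32."""
    qmax = 2 ** (bits - 1) - 1                   # e.g. 7 for 4-bit
    scale = np.abs(x).max(axis=0, keepdims=True) / qmax
    scale = np.where(scale == 0, 1.0, scale)     # guard all-zero channels
    q = np.clip(np.round(x / scale), -qmax - 1, qmax)
    return (q * scale).astype(np.float32)

kv_hat = quantize_dequantize(kv, bits=4)         # 4-bit vs fp32: 8x smaller payload
max_err = float(np.abs(kv - kv_hat).max())       # bounded by half a quantization step
```

Storing 4-bit codes instead of fp32 gives roughly 8x memory reduction before scale-factor overhead; the hard part such papers solve is doing this without hurting downstream accuracy.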

AI is changing the style and substance of human writing, study finds

Teams from Google and leading universities found that large language models change the voice, tone and intended meaning of human authors.

“Humans care about clarity, relevance, and impact, while #AI cares about scalability and reproducibility.” @uofwa.uw.edu #UWAllen professor @natashajaques.bsky.social investigates how #LLMs contribute to the “blandification” of human expression. #ICLR2026 www.nbcnews.com/tech/tech-ne...


This work was led by our amazing intern Oindrila Saha (UMass Amherst -- now at Adobe!), with Vojtech Krs, Radomir Mech, Kevin Blackburn-Matzen, and Subhransu Maji (UMass Amherst).

#ICLR2026 #ImageGeneration #ComputerVision


Super excited to announce that our paper, “When Machine Learning Gets Personal”, has been accepted to #ICLR2026! 🇧🇷
We propose a unified framework to reliably quantify how personalizing a model influences both prediction accuracy and explanation quality. arxiv.org/abs/2502.02786


✨Two single author papers accepted to ICLR 2026!✨

Truly excited to present these results at #ICLR2026!
@iclr-conf.bsky.social #ICLR26 #ReinforcementLearning


#cosyne2026 attendees: Consider checking out our new work on alignment pattern analysis as a way of tightening the definition of what it means for a model to be brain-like!
Poster 1-085, Thur 12 March.

Everyone else: you can also find the #ICLR2026 paper here: openreview.net/forum?id=cMG...


I’ll be presenting our work, supervised by Peter E. Latham and in collaboration with @krotov.bsky.social, at #COSYNE2026.

Full paper: arxiv.org/pdf/2601.00984
(to be presented at #ICLR2026).

For a summary, or to see the poster if you’re not attending, check out www.linkedin.com/posts/mohade...


ASIDE accepted to #ICLR2026! 🇧🇷🎉

We architecturally separate instructions and data in LLMs by rotating data token embeddings 90° during the forward pass: one extra matmul, virtually no overhead.

Models & code open-sourced ⬇️
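A minimal sketch of the idea the post describes: apply a fixed 90° rotation to data token embeddings only, leaving instruction tokens untouched, at the cost of one extra matmul. Rotating each pair of embedding dimensions by 90° is one natural way to realize this in a d-dimensional space; the actual ASIDE implementation may differ in details.

```python
import numpy as np

d = 8  # embedding dim (even, so dimensions pair up)

def rotation_90(d):
    """Block-diagonal matrix rotating each (2i, 2i+1) plane by 90 degrees."""
    R = np.zeros((d, d))
    for i in range(0, d, 2):
        R[i, i + 1] = -1.0   # 2x2 block [[0, -1], [1, 0]] per plane
        R[i + 1, i] = 1.0
    return R

R = rotation_90(d)
rng = np.random.default_rng(1)
emb = rng.normal(size=(5, d))                      # 5 token embeddings
is_data = np.array([0, 1, 1, 0, 1], dtype=bool)    # data vs instruction tokens

out = emb.copy()
out[is_data] = emb[is_data] @ R.T                  # the "one extra matmul"
```

Because R is orthogonal, the rotation preserves embedding norms while moving data tokens into a subspace-separated "phase", which is the architectural instruction/data separation the post refers to.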


Our work on "Echoing" was accepted at the Agents in the Wild workshop at ICLR 2026

When LLM agents interact without oversight, they can abandon their assigned roles — and standard metrics won't catch it.

#ICLR2026 #FutureOfAI #EnterpriseAI


Is More Data Worth the Cost? Dataset Scaling Laws in a Tiny Attention-Only Decoder

Anonymous authors
Paper under double-blind review

ABSTRACT
Training Transformer language models is expensive, as performance typically improves with increasing dataset size and computational budget. Although scaling laws describe this trend at large scale, their implications in controlled, smaller-scale settings remain less explored. In this work, we isolate dataset-size effects using a strongly reduced attention-only decoder architecture. By training on progressively larger power-of-two subsets, we observe smooth performance improvements accompanied by clear diminishing returns, consistent with scaling-law behavior. Using only about 30% of the training data is sufficient to reach approximately 90% of the full-data validation token-level accuracy. These results provide actionable insights into dataset scaling in a controlled, component-isolated setting and offer practical guidance for balancing dataset size and computational cost in compute-restricted settings, such as small research labs and exploratory model development.
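The kind of question the abstract answers can be sketched in a few lines: assume validation accuracy follows a saturating power law in dataset size and solve for the fraction of data needed to hit 90% of full-data accuracy. The constants below are hypothetical placeholders, not the paper's fitted values.

```python
import numpy as np

# Hypothetical scaling-law constants: acc(N) = A - B * N**(-alpha).
A, B, alpha = 0.70, 0.9, 0.35
N_full = 2 ** 20            # hypothetical full dataset size (tokens/examples)

def acc(n):
    """Saturating power-law accuracy as a function of dataset size n."""
    return A - B * n ** (-alpha)

target = 0.9 * acc(N_full)  # 90% of full-data accuracy

# Invert the law: A - B * n**(-alpha) = target  =>  n = (B / (A - target))**(1/alpha)
n_needed = (B / (A - target)) ** (1 / alpha)
fraction = n_needed / N_full
```

With real measured accuracies per power-of-two subset, one would fit A, B, alpha (e.g. by least squares on the observed curve) before inverting; the closed-form solve is the same.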


Our #paper "Is More Data Worth the Cost? Dataset Scaling Laws in a Tiny Attention-Only Decoder" was accepted to the #SDS2026 #Conference and will also be presented at the #ICLR2026 #DATA-FM Workshop!

I am excited to discuss the paper and have this work finally published! 🥳


There's been a lot of excitement about pluralistic value alignment 🌈 — AI that reflects the full range of human perspectives

But no formal way to benchmark whether we're actually making progress. 🤔

Introducing OVERTONBENCH. 🎉 Accepted to #ICLR2026

1/n 🧵


We're very happy to share our S3OD (Scaling, Synthetic, and Salient Object Detection)! The paper has been accepted at #ICLR2026.

You can get the paper, demo, code, trained models, and dataset on the project page.
s3odproject.github.io

Debiased Front-Door Learners for Heterogeneous Effects

In observational settings where treatment and outcome share unmeasured confounders but an observed mediator remains unconfounded, the front-door (FD) adjustment identifies causal effects through the m...

Our paper “Debiased Front-Door Learners for Heterogeneous Effects” was accepted to ICLR 2026.

- Paper (arXiv): arxiv.org/abs/2509.22531
- Reproducible code: github.com/yonghanjung/...

Quick start:
pip install fd-cate
fdcate demo --outdir ./fdcate-demo
#ICLR2026 #CausalInference #MachineLearning
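For readers new to front-door adjustment, the identity the paper builds on is, in its simplest discrete form, P(y | do(x)) = Σ_m P(m | x) Σ_x' P(x') P(y | x', m). The snippet below is a plug-in evaluation of that formula on made-up conditional tables, as a toy illustration of the identity only, not the paper's debiased learner for heterogeneous effects.

```python
import numpy as np

# Toy discrete tables (binary X, M, Y), chosen arbitrarily for illustration.
p_x = np.array([0.6, 0.4])                    # P(X)
p_m_given_x = np.array([[0.8, 0.2],           # P(M | X=0)
                        [0.3, 0.7]])          # P(M | X=1)
p_y1_given_xm = np.array([[0.1, 0.5],         # P(Y=1 | X=0, M=0), P(Y=1 | X=0, M=1)
                          [0.4, 0.9]])        # P(Y=1 | X=1, M=0), P(Y=1 | X=1, M=1)

def p_y1_do_x(x):
    """Front-door plug-in: interventional P(Y=1 | do(X=x))."""
    total = 0.0
    for m in (0, 1):
        # inner sum marginalizes the treatment back in: sum_x' P(x') P(Y=1 | x', m)
        inner = sum(p_x[xp] * p_y1_given_xm[xp, m] for xp in (0, 1))
        total += p_m_given_x[x, m] * inner
    return total

ate = p_y1_do_x(1) - p_y1_do_x(0)   # average treatment effect on P(Y=1)
```

In practice these conditionals are estimated from data, which is where the debiasing machinery for heterogeneous (covariate-dependent) effects comes in.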


Oh to have an accepted paper but not being able to go coz u r broke🥲🥲🥲 #iclr2026


In a new #ICLR2026 paper we provide an algorithm for semi-analytically constructing un-/stable manifolds of fixed points and cycles of ReLU-based RNNs:
openreview.net/pdf?id=EAwLA...

These manifolds provide a skeleton for the system’s dynamics, dissecting the state space into basins of attraction.
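The starting point for such analyses is that a ReLU RNN x_{t+1} = relu(W x_t + b) is piecewise linear: within each activation region (a 0/1 mask D over units) a fixed point solves (I - D W) x = D b. The brute-force enumeration below only illustrates that linear-algebra core on a 2-unit toy network; the paper's semi-analytical algorithm for manifolds is far more than this.

```python
import numpy as np
from itertools import product

# Toy 2-unit ReLU RNN: x_{t+1} = relu(W @ x_t + b)
W = np.array([[0.5, -0.6],
              [0.4,  0.3]])
b = np.array([0.2, -0.1])

def fixed_points(W, b):
    """Enumerate activation patterns; keep self-consistent linear solutions."""
    n = len(b)
    pts = []
    for mask in product([0.0, 1.0], repeat=n):
        D = np.diag(mask)
        A = np.eye(n) - D @ W
        if abs(np.linalg.det(A)) < 1e-12:
            continue                       # degenerate region, skip
        x = np.linalg.solve(A, D @ b)      # candidate fixed point in this region
        pre = W @ x + b
        # consistency: active units must have pre > 0, inactive pre <= 0
        if all((pre[i] > 0) == (mask[i] == 1.0) for i in range(n)):
            pts.append(x)
    return pts

fps = fixed_points(W, b)
```

Enumerating all 2^n masks is only feasible for tiny networks; stability of each fixed point then follows from the eigenvalues of D W in its region.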

Teaser image for CroCoDiLight paper. Three panels showing the same scene of a street in three different lighting conditions: sunny with shadows, even lighting without shadows, and night time with lights switched on. A badge for ICLR 2026 is in the bottom right, and CroCoDiLight is in big letters across the top.

Teaser image for CroCoDiLight paper. Three panels showing the same scene of a street in three different lighting conditions: sunny with shadows, even lighting without shadows, and night time with lights switched on. A badge for ICLR 2026 is in the bottom right, and CroCoDiLight is in big letters across the top.

Very excited to announce my first paper, CroCoDiLight! Co-authored with @willsmithvision.bsky.social and accepted at #ICLR2026.

Image relighting, shadow removal, and albedo estimation, all in our single base model with swappable task components.

See you in Rio @iclr-conf.bsky.social! [1/7]


1/n Attention, Please! 🚀

Our work “Revisiting Attentive Probing Through the Lens of Efficiency” has been accepted at #ICLR2026.

We introduce Efficient Probing (EP) — a lightweight, multi-query attentive probing method for frozen encoders.

Paper + code at the end 👇
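For context, here is what multi-query attentive probing over a frozen encoder looks like in its simplest form: learned query tokens cross-attend to frozen patch features, and a head classifies the pooled result. The shapes, single head, and missing MLP are simplifications for illustration, not the paper's exact EP design.

```python
import numpy as np

rng = np.random.default_rng(0)
T, d, n_q, n_cls = 16, 32, 4, 10        # patches, dim, learned queries, classes

feats = rng.normal(size=(T, d))         # frozen encoder output for one image
queries = rng.normal(size=(n_q, d))     # learnable query tokens (the probe)
W_head = rng.normal(size=(n_q * d, n_cls)) * 0.01   # linear classifier head

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

attn = softmax(queries @ feats.T / np.sqrt(d))  # (n_q, T) attention weights
pooled = attn @ feats                           # (n_q, d) attended features
logits = pooled.reshape(-1) @ W_head            # concatenate queries -> class scores
```

Only the queries and head are trained, so the probe stays cheap relative to fine-tuning the encoder, which is the efficiency angle in the title.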

Let's Split Up: Zero-Shot Classifier Edits for Fine-Grained Video Understanding

Video recognition models are typically trained on fixed taxonomies which are often too coarse, collapsing distinctions in object, manner or outcome under a single label. As tasks and definitions evolv...

How flexible is a video classifier after training?

Our new #ICLR2026 paper investigates whether a category can be split into finer ones without retraining and without any videos.

arxiv.org/abs/2602.16545
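To make the "split without retraining, without videos" setup concrete, here is one generic way such an edit could look: replace the coarse class's weight row with rows derived from embeddings of the finer label names, leaving all other classes untouched. This construction is an assumption for illustration only, not necessarily the paper's method.

```python
import numpy as np

rng = np.random.default_rng(0)
d, n_cls = 16, 5
W = rng.normal(size=(n_cls, d))          # trained classifier head (one row per class)
coarse = 2                                # index of the class to split
sub_embs = rng.normal(size=(2, d))        # stand-ins for finer-label text embeddings

# Center the finer directions around the coarse prototype so each sub-class
# stays anchored to the original decision region.
sub_rows = W[coarse] + 0.1 * (sub_embs - sub_embs.mean(axis=0))

# New head: same rows, except the coarse row becomes two finer rows.
W_split = np.vstack([W[:coarse], sub_rows, W[coarse + 1:]])   # (n_cls + 1, d)
```

The appeal of any such scheme is that it touches only the classifier head, so no video data and no gradient steps are needed.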


Thrilled to announce our ICLR 2026 paper:
AntigenLM: Structure-Aware DNA Language Modeling for Influenza 🎉

A DNA foundation model pretrained with intact, aligned viral genes → better evolutionary & functional learning, better downstream performance.

#ICLR2026 #BioAI #Influenza #Genomics


SCRAPL: Scattering Transform with Random Paths for Machine Learning

I’m excited to share our #ICLR2026 paper on SCRAPL: an algorithm that makes wavelet scattering transforms usable as differentiable loss functions!

paper: openreview.net/forum?id=RuYwbd5xYa
web: christhetr.ee/scrapl/
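The "random paths" trick can be illustrated generically: a scattering loss sums a term per path, and sampling one path per training step gives an unbiased Monte-Carlo estimate of the full loss at a fraction of the cost. The toy below uses random filters as stand-ins for scattering paths; it is not a real wavelet scattering transform and not the actual SCRAPL algorithm.

```python
import numpy as np

rng = np.random.default_rng(0)
n_paths, n = 8, 64
filters = rng.normal(size=(n_paths, n))          # stand-ins for per-path filters
x, y = rng.normal(size=n), rng.normal(size=n)    # prediction and target signals

def path_loss(p, x, y):
    """Loss contribution of a single path: squared magnitude difference."""
    return (np.abs(filters[p] @ x) - np.abs(filters[p] @ y)) ** 2

# Exact loss: average over all paths (expensive when paths number in the 1000s).
full = np.mean([path_loss(p, x, y) for p in range(n_paths)])

# Stochastic estimate: one uniformly random path per "training step".
samples = [path_loss(rng.integers(n_paths), x, y) for _ in range(4000)]
estimate = np.mean(samples)
```

Since each sampled term is an unbiased draw of the per-path loss, gradient descent on the sampled loss optimizes the full objective in expectation, which is what makes the transform practical as a training loss.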

How to Square Tensor Networks and Circuits Without Squaring Them

Squared tensor networks (TNs) and their extension as computational graphs--squared circuits--have been used as expressive distribution estimators, yet supporting closed-form marginalization. However, ...

I am a bit late to the party, but I am happy to share that our latest work was accepted to #ICLR2026 🥳🥳

📜 How to Square Tensor Networks and Circuits Without Squaring Them

arxiv.org/abs/2512.17090


🥳Happy to share that we have three papers accepted to #ICLR2026. Congrats to our authors and see you in Rio🌴🇧🇷. Check the thread for highlights👇


Proud to share our latest work, accepted at @iclr-conf.bsky.social 2026: APPLE! 🍎

TL;DR: APPLE is a novel reinforcement learning framework for solving active perception problems.

#ICLR2026 #Robotics #MachineLearning #ActivePerception #RL
@ias-tudarmstadt.bsky.social


🧵[10/11]

If you're working on RL, MINTO is a simple modification that can make your training faster and more stable.

📄 Paper: arxiv.org/pdf/2510.02590
💻 Code: github.com/AhmedMagdyHe...
🌐 Website: minto.ahmedhendawy.de

🤝 Happy to discuss!

#ReinforcementLearning #ICLR2026 #DeepLearning


We audited one of the most critical pieces of modern AI alignment: reward models. We find consistent and persistent biases and trace them back to the pretraining stage, challenging the premises of common approaches to alignment based on finetuning on human preferences. Accepted at #ICLR2026


Excited to share that our paper "Generating metamers of human scene understanding" will appear as an Oral at #ICLR2026 (top ~1-2%)! 🎉

📄 Paper: arxiv.org/abs/2601.11675

See you all in Rio! 🇧🇷 @iclr-conf.bsky.social

BIRD: Behavior Induction via Representation-structure Distillation

Human-aligned deep learning models exhibit behaviors consistent with human values, such as robustness, fairness, and honesty. Transferring these behavioral properties to models trained on different ta...

What if your strongest #ML model is brittle at one thing that really matters?

Can it learn that behavior from a weaker but specialist model, even when they share no task, no data, and no architecture?

My student Galen Pogoncheff explored this in our #ICLR2026 paper:

👉 arxiv.org/abs/2505.23933
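"Representation-structure" transfer works even when teacher and student share no architecture, because pairwise-similarity structure over a shared batch is dimension-agnostic. The sketch below shows such a structure-matching loss as an illustration of the general idea; it is not claimed to be BIRD's exact objective.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6                                  # batch of shared probe inputs
Zt = rng.normal(size=(n, 32))          # teacher representations (dim 32)
Zs = rng.normal(size=(n, 8))           # student representations (dim 8, different!)

def sim_matrix(Z):
    """Cosine-similarity structure of a batch: (n, n), architecture-agnostic."""
    Zn = Z / np.linalg.norm(Z, axis=1, keepdims=True)
    return Zn @ Zn.T

# Distillation loss: match how the teacher relates inputs to one another,
# rather than matching raw features (whose dimensions don't even agree).
loss = np.mean((sim_matrix(Zt) - sim_matrix(Zs)) ** 2)
```

Minimizing such a loss nudges the student's relational geometry toward the teacher's, which is one route by which behavioral properties could transfer across tasks, data, and architectures.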


The visual world is composed of objects, and those objects are composed of features. But do VLMs exploit this compositional structure when processing multi-object scenes? In our 🆒🆕 #ICLR2026 paper, we find they do – via emergent symbolic mechanisms for visual binding. 🧵👇
