
Posts by Arno Solin


8/ Paper preprint:

Mohammad Hassan Vali, Tom Bäckström, and Arno Solin (2026). DiVeQ: Differentiable vector quantization using the reparameterization trick. ICLR 2026.

arxiv.org/abs/2509.26469

1 week ago

7/ DiVeQ is also included in the popular vector-quantize-pytorch package.

To use it there, enable:
directional_reparam=True

1 week ago

6/ We have also released a PyTorch package on PyPI:

pip install diveq

It implements the methods and variants from the paper and makes integration into training pipelines straightforward.

1 week ago

5/ The result is a direct and general way to do end-to-end trainable quantization, without many of the complications of earlier approaches.

We also see improved performance in image compression, image generation, and speech coding.

1 week ago

4/ We do this by modelling quantization as adding a carefully constructed error vector.

So the forward pass still uses hard assignments, while training gets meaningful gradient flow.
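To make the mechanism concrete, here is a minimal PyTorch sketch of the general pattern this describes: the forward pass returns the hard nearest codeword, while the backward pass sees quantization as adding a detached error vector. This illustrates the overall idea only; the actual DiVeQ construction of the error vector is more careful (see the paper).

```python
import torch

def quantize_add_error(x, codebook):
    """Hard nearest-codeword quantization written as x plus an error vector.

    Sketch of the general pattern only: because the error vector (q - x)
    is detached, the forward pass returns exactly the hard codeword q,
    while gradients flow to x as if quantization were the identity.
    DiVeQ constructs this error vector more carefully.
    """
    dists = torch.cdist(x, codebook)   # (batch, num_codes) distances
    idx = dists.argmin(dim=1)          # hard, non-differentiable choice
    q = codebook[idx]                  # nearest codewords
    return x + (q - x).detach()        # forward: q; backward: grads reach x

x = torch.randn(4, 8, requires_grad=True)
codebook = torch.randn(16, 8)
xq = quantize_add_error(x, codebook)
xq.sum().backward()                    # gradients reach the encoder input
```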

1 week ago

3/ In our #ICLR2026 paper, we introduce DiVeQ.

The idea is simple: keep the hard quantization behavior we want, but make training behave as if gradients could still flow through it.

1 week ago

2/ The challenge is that VQ makes a hard nearest-codeword decision. That makes learning awkward: the quantization step is non-differentiable, so gradients stop flowing. Existing fixes often introduce bias and require extra tuning.
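A tiny numerical illustration of why that hard decision blocks gradients, using a hypothetical two-codeword codebook:

```python
import numpy as np

# Hypothetical two-codeword codebook, for illustration only.
codebook = np.array([0.0, 1.0])

def vq(x):
    """Hard nearest-codeword assignment: the non-differentiable VQ step."""
    return codebook[np.argmin(np.abs(codebook - x))]

# The output is piecewise constant: over a whole region of inputs it does
# not change at all, and at the decision boundary it jumps. The derivative
# is therefore zero almost everywhere and undefined at the boundary, so no
# useful gradient reaches the encoder.
assert vq(0.49) == 0.0 and vq(0.01) == 0.0  # same codeword for a whole region
assert vq(0.51) == 1.0                       # tiny input change, output jumps
```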

1 week ago

1/ 🔥 New paper: Differentiable Vector Quantization (DiVeQ) 🔥

Vector quantization (VQ) is a core tool in modern AI. It connects continuous data like images and audio to discrete tokens used by transformers. It underpins compression, generation, and multimodal modelling.

1 week ago
Statement Regarding API Security Incident

OpenReview's announcement:
openreview.net/forum/user%7...

4 months ago

Statement from #AISTATS2026 organizers regarding the @openreview.bsky.social API Security Incident

4 months ago

I'm feeling grateful to colleagues, students, collaborators, and everyone who joined the talk – and excited about the next steps in research on machines that learn, and maybe one day, truly make sense. 🙏✨
4/n

4 months ago

My own research, together with my group, focuses less on building the giant models and more on designing the building blocks behind them: model components, inductive biases, training principles, and inference methods that make AI systems more robust, data-efficient, and uncertainty-aware.
3/n

4 months ago

I talked about "Making Sense of Learning Machines":
• How modern machine learning has learned to cope with natural, “chaotic” data – images, text, sound
• Why the big breakthroughs of the last 10–15 years matter
• What we lack and what we would like to understand
2/n

4 months ago
Making sense of learning machines – Arno Solin (YouTube video by Aalto University)

I recently gave my installation talk after being tenured. The video of the talk is now available on the university's YouTube channel: youtu.be/R1UQoflPTDg 1/n

4 months ago

Yes. The easiest way to find it will be on the website virtual.aistats.org. We are in the process of adding material there and will add a link.

8 months ago

We will go public with it as soon as everything is set up with the venue.

8 months ago

I'm thrilled to be Program Chairing AISTATS 2026 together with Aaditya Ramdas. AISTATS has a special feel to it, and it has been described by many colleagues as their "favourite conference". We aim to preserve that spirit while introducing some fresh elements for 2026. [3/3]

8 months ago

Accepted papers will be presented in person in Morocco, May 2–5, 2026. The full Call for Papers is available here: virtual.aistats.org/Conferences/... [2/3]

8 months ago

📣 Please share: We invite submissions to the 29th International Conference on Artificial Intelligence and Statistics (#AISTATS 2026) and welcome paper submissions at the intersection of AI, machine learning, statistics, and related areas. [1/3]

8 months ago
BitVI on 1D Gaussian mixture models.

Remember that computers use bitstrings to represent numbers? We exploit this in our recent @auai.org paper and introduce #BitVI.

#BitVI directly learns an approximation in the space of bitstring representations, thus capturing complex distributions under varying numerical precision regimes.
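As a concrete reminder of the bitstring view this builds on, here is the IEEE-754 single-precision bitstring of an ordinary float (a generic Python illustration, not the BitVI code):

```python
import struct

def float32_bits(x):
    """Return the 32-bit IEEE-754 bitstring of a float, as a string."""
    (packed,) = struct.unpack(">I", struct.pack(">f", x))
    return format(packed, "032b")

# 0.15625 = 1.01b * 2**-3: sign 0, exponent field 124, mantissa 01...
bits = float32_bits(0.15625)
assert bits == "00111110001000000000000000000000"
```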

9 months ago
DeSplat: Decomposed Gaussian Splatting for Distractor-Free Rendering

Check our #CVPR paper and project page for more results, videos, and code!
📄 arxiv.org/abs/2411.19756
🎈 aaltoml.github.io/desplat/

10 months ago

Qualitative visualization of static distractor elements achieved by our model, DeSplat. [3/n]

10 months ago

Compared to Splatfacto, our model can identify and ignore distractors to improve 3DGS reconstruction quality. [2/n]

10 months ago

Real-world #3DGS scenes are messy—occluders, moving objects, and clutter often ruin reconstruction. This #CVPR2025 paper presents DeSplat, which separates static scene content from distractors, all without requiring external semantic models. [1/n]

10 months ago

I’m visiting the Isaac Newton Institute for Mathematical Sciences in Cambridge this week.

I’m giving an invited talk in the "Calibrating prediction uncertainty: statistics and machine learning perspectives" workshop on Thursday.

10 months ago
Are Your Continuous Approximations Really Continuous? Reimagining...

Our method addresses the open question of probabilistic modelling in quantized large-scale ML models. See the workshop paper below. [3/3]

📄 Paper: openreview.net/forum?id=Sai...

11 months ago

We introduce BitVI, a novel approach for variational inference with discrete bitstring representations of continuous parameters. We use a deterministic probabilistic circuit structure to model the distribution over bitstrings, allowing for exact and efficient probabilistic inference. [2/3]

11 months ago

Have you ever considered that, in computer memory, model weights are stored as discrete values anyway? So why not do probabilistic inference directly on the discrete (quantized) parameters? @trappmartin.bsky.social is presenting our work at #AABI2025 today. [1/3]

11 months ago

We show that externalising reasoning as a DAG at test time leads to more accurate, efficient multi-hop retrieval – and integrates seamlessly with RAG systems like Self-RAG.
📄 Paper: openreview.net/pdf?id=gi9aq...
3/3

11 months ago

This work was born out of Prakhar's internship with Microsoft Research (with Sukruta Prakash Midigeshi, Gaurav Sinha, Arno Solin, Nagarajan Natarajan, and Amit Sharma).
2/3

11 months ago