
Posts by Anh Ta

Fabian Falck, Teodora Pandeva, Kiarash Zahirnia, Rachel Lawrence, Richard Turner, Edward Meeds, Javier Zazo, Sushrut Karmalkar
A Fourier Space Perspective on Diffusion Models
https://arxiv.org/abs/2505.11278

11 months ago 1 1 0 0

We now have a whole YouTube video explaining our MINDcraft paper, check it out!
youtu.be/MeEcxh9St24

11 months ago 9 3 1 0

We define a new cryptographic system that allows a user to show that they hold a valid certificate from a public set of authorities, while hiding the message, the signature, and the identity of the authority.

11 months ago 0 0 0 0

When using a digital certificate, one usually gets a signature from some authority, then shows the message and signature for verification.

11 months ago 0 0 1 0
A visualization of compressed column evaluation in sparse autodiff. Here, columns 1, 2 and 5 of the matrix (in yellow) have no overlap in their sparsity patterns. Thus, they can be evaluated together by multiplication with a sum of basis vectors (in purple).

Wanna learn about autodiff and sparsity? Check out our #ICLR2025 blog post with @adrhill.bsky.social and Alexis Montoison. It has everything you need: matrices with lots of zeros, weird compiler tricks, graph coloring techniques, and a bunch of pretty pics!
iclr-blogposts.github.io/2025/blog/sp...
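The compressed-column trick described above can be sketched in a few lines of NumPy. The Jacobian, the column indices, and the pattern bookkeeping below are all made up for illustration; in a real sparse-autodiff pipeline the sparsity pattern comes from a detection pass and the coloring from a graph-coloring step.

```python
import numpy as np

# Hypothetical 3x5 Jacobian; columns 0, 1 and 4 have disjoint sparsity
# patterns (the blog post's example uses columns 1, 2 and 5, one-based).
J = np.array([
    [1.0, 0.0, 2.0, 0.0, 0.0],
    [0.0, 3.0, 0.0, 1.0, 0.0],
    [0.0, 0.0, 4.0, 0.0, 5.0],
])

# In sparse autodiff the sparsity pattern is known ahead of time;
# here we hard-code which rows each compressed column can touch.
pattern = {0: [0], 1: [1], 4: [2]}

# One Jacobian-vector product with the sum of basis vectors e0 + e1 + e4
# evaluates all three columns at once.
seed = np.zeros(5)
seed[[0, 1, 4]] = 1.0
compressed = J @ seed

# Decompression: because the patterns don't overlap, every entry of
# `compressed` belongs to exactly one column and can be read back out.
columns = {}
for col, rows in pattern.items():
    full = np.zeros(3)
    full[rows] = compressed[rows]
    columns[col] = full
```

One product instead of three: that ratio is exactly what graph coloring maximizes when it groups structurally orthogonal columns.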

11 months ago 49 12 1 0
The DeepSeek Series: A Technical Overview An overview of the papers describing the evolution of DeepSeek

Recently, my colleague Shayan Mohanty published a technical overview of the papers describing DeepSeek. He's now revised that article, adding more explanations to make it more digestible for those of us without a background in this field.

martinfowler.com/articles/dee...

1 year ago 62 14 2 1

Huawei's Dream 7B (a diffusion reasoning model) is the most powerful open diffusion large language model to date.

Blog: hkunlp.github.io/blog/2025/dr...

1 year ago 24 4 1 2
A Tour of Reinforcement Learning: The View from Continuous Control This manuscript surveys reinforcement learning from the perspective of optimization and control with a focus on continuous control applications. It surveys the general formulation, terminology, and ty...

This week's #PaperILike is "A Tour of Reinforcement Learning: The View from Continuous Control" (Recht 2018).

Pairs well with the PaperILiked last week -- another good bridge between RL and control theory.

PDF: arxiv.org/abs/1806.09460

1 year ago 7 1 0 0
A screenshot of the course description

I taught a grad course on AI Agents at UCSD CSE this past quarter. All lecture slides, homeworks & course projects are now open sourced!

I provide a grounding that goes from Classical Planning & Simulations -> RL Control -> LLMs, and show how to put it all together
pearls-lab.github.io/ai-agents-co...

1 year ago 38 9 3 1
Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming In this paper we describe a new conceptual framework that connects approximate Dynamic Programming (DP), Model Predictive Control (MPC), and Reinforcement Learning (RL). This framework centers around ...

This week's #PaperILike is "Model Predictive Control and Reinforcement Learning: A Unified Framework Based on Dynamic Programming" (Bertsekas 2024).

If you know 1 of {RL, controls} and want to understand the other, this is a good starting point.

PDF: arxiv.org/abs/2406.00592

1 year ago 43 8 0 0
David Picard

I updated my ML lecture material: davidpicard.github.io/teaching/
I show many (boomer) ML algorithms with working implementation to prevent the black box effect.
Everything is done in notebooks so that students can play with the algorithms.
Book-ish pdf export: davidpicard.github.io/pdf/poly.pdf

1 year ago 37 6 0 0

Our beginner-oriented, accessible introduction to modern deep RL is now published in Foundations and Trends in Optimization. It is a great entry point to the field if you want to jumpstart into RL!
@bernhard-jaeger.bsky.social
www.nowpublishers.com/article/Deta...
arxiv.org/abs/2312.08365

1 year ago 62 14 2 0

KS studies the Matrix Multiplication Verification Problem (MMV): given three n x n matrices A, B, C (say, with poly(n)-bounded integer entries), decide whether AB = C. This is trivial to solve deterministically in matrix-multiplication time O(n^ω): compute AB and compare it with C. 2/
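The classic randomized baseline for MMV is Freivalds' algorithm, which avoids computing AB entirely. Below is a minimal NumPy sketch (the function name and round count are my own choices):

```python
import numpy as np

def freivalds(A, B, C, rounds=30, rng=None):
    """Randomized check that A @ B == C.

    Each round costs O(n^2): pick a random 0/1 vector x and compare
    A @ (B @ x) against C @ x. If AB != C, a single round catches the
    mismatch with probability >= 1/2, so the false-accept probability
    after `rounds` independent rounds is at most 2**-rounds.
    """
    rng = rng or np.random.default_rng(0)
    n = A.shape[0]
    for _ in range(rounds):
        x = rng.integers(0, 2, size=n)
        # Crucially, evaluate right-to-left: two matrix-vector products,
        # never a matrix-matrix product.
        if not np.array_equal(A @ (B @ x), C @ x):
            return False
    return True
```

The interesting question KS tackles is whether anything comparably fast can be done deterministically; the sketch above is only the randomized point of comparison.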

1 year ago 1 1 1 0

Introducing The AI CUDA Engineer: An agentic AI system that automates the production of highly optimized CUDA kernels.

sakana.ai/ai-cuda-engi...

The AI CUDA Engineer can produce highly optimized CUDA kernels, reaching 10-100x speedup over common machine learning operations in PyTorch.

Examples:

1 year ago 89 17 3 4

Why on earth did somebody think of doing this in the first place?

1 year ago 0 0 0 0

Lorenzo Pastori, Arthur Grundner, Veronika Eyring, Mierk Schwabe
Quantum Neural Networks for Cloud Cover Parameterizations in Climate Models
https://arxiv.org/abs/2502.10131

1 year ago 1 1 0 0

Przemysław Pawlitko, Natalia Moćko, Marcin Niemiec, Piotr Chołda
Implementation and Analysis of Regev's Quantum Factorization Algorithm
https://arxiv.org/abs/2502.09772

1 year ago 1 1 0 0
Digital illustration of a school of red fish with simple geometric features on a purple background. Some fish are connected by curved dashed and solid lines, suggesting interactions or relationships between them.

Enjoyed sharing our work on electric fish with @dryohanjohn.bsky.social⚡🐟 Their electric "conversations" help us build models to discover neural mechanisms of social cognition. Work led by Sonja Johnson-Yu & @satpreetsingh.bsky.social with Nate Sawtell

kempnerinstitute.harvard.edu/news/what-el...

1 year ago 43 9 1 1

Model-free deep RL algorithms like NFSP, PSRO, ESCHER, & R-NaD are tailor-made for games with hidden information (e.g. poker).
We performed the largest-ever comparison of these algorithms.
We find that they do not outperform generic policy gradient methods, such as PPO.
arxiv.org/abs/2502.08938
1/N

1 year ago 93 21 3 4

🔥 Want to train large neural networks WITHOUT Adam while using less memory and getting better results? ⚡
Check out SCION: a new optimizer that adapts to the geometry of your problem using norm-constrained linear minimization oracles (LMOs): 🧵👇
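The core primitive here, a linear minimization oracle, is easy to illustrate on a toy problem. The sketch below uses an l-infinity ball and a plain conditional-gradient (Frank-Wolfe) loop; SCION's actual norms, scaling, and update rule are defined in the paper, so everything below is a generic illustration of the LMO idea, not SCION itself.

```python
import numpy as np

def lmo_linf(grad, radius):
    """Linear minimization oracle over an l-infinity ball:
    argmin_{||d||_inf <= radius} <grad, d> = -radius * sign(grad)."""
    return -radius * np.sign(grad)

# Toy problem: minimize f(w) = 0.5 * ||w - target||^2 subject to
# ||w||_inf <= 2, using only LMO calls (no projections).
target = np.array([1.0, -2.0, 0.5])
w = np.zeros(3)
for t in range(200):
    grad = w - target
    gamma = 2.0 / (t + 2)  # classic conditional-gradient step size
    # Move toward the LMO vertex; the iterate stays inside the ball
    # by convexity, which is what makes the method projection-free.
    w = (1 - gamma) * w + gamma * lmo_linf(grad, radius=2.0)
```

The appeal for large-model training is the same as in this toy: an LMO step needs no extra optimizer state, which is where the memory savings in the post come from.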

1 year ago 18 6 3 1
Simplifying, Stabilizing and Scaling Continuous-Time Consistency Models Consistency models (CMs) are a powerful class of diffusion-based generative models optimized for fast sampling. Most existing CMs are trained using discretized timesteps, which introduce additional hy...

this paper is a pretty impressive tour de force in neural network training: arxiv.org/abs/2410.11081

pretty inspiring to me -- network isn't converging? rigorously monitor every term in your loss to identify where in the architecture something is going wrong!

1 year ago 9 1 0 0
Towards Integrating Personal Knowledge into Test-Time Predictions Machine learning (ML) models can make decisions based on large amounts of data, but they can be missing personal knowledge available to human users about whom predictions are made. For example, a mode...

Obsessed with the work coming out of Finale Doshi-Velez's group; they don't just take the real-world limits of ML deployment seriously, they turn them into new algorithmic ideas
arxiv.org/abs/2406.08636

1 year ago 62 9 1 1

Our new paper with @chrismlangdon is just out in @natureneuro.bsky.social! We show that high-dimensional RNNs use low-dimensional circuit mechanisms for cognitive tasks and identify a latent inhibitory mechanism for context-dependent decisions in PFC data.
www.nature.com/articles/s41...

1 year ago 71 24 0 1

I just checked the data on accepted papers at ICLR '25. The author with the most submissions had 21 accepted out of 42 submitted. Oh well!

1 year ago 0 0 0 0
Also note that, instead of adding a KL penalty to the reward, GRPO regularizes by directly adding the KL divergence between the trained policy and the reference policy to the loss, avoiding complicating the calculation of the advantage.

@xtimv.bsky.social and I were just discussing this interesting comment in the DeepSeek paper introducing GRPO: a different way of setting up the KL loss.

It's a little hard to reason about what this does to the objective. 1/
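The two placements are easy to contrast in code. The sketch below uses made-up per-token numbers and shows the loss-side KL with the nonnegative "k3"-style estimator the GRPO paper uses; the surrogate is a bare REINFORCE-style term rather than GRPO's full clipped objective.

```python
import numpy as np

# Hypothetical per-token log-probs under the trained policy and the
# frozen reference policy, plus precomputed advantages (made-up numbers).
logp = np.array([-1.2, -0.7, -2.1])
logp_ref = np.array([-1.0, -0.9, -2.0])
advantages = np.array([0.5, -0.3, 1.1])
beta = 0.04  # KL coefficient

# Policy-gradient surrogate on the sampled tokens.
pg_loss = -(logp * advantages).mean()

# GRPO's twist: instead of folding a KL penalty into the reward (which
# would leak into the advantage computation), it adds an estimate of
# KL(pi || pi_ref) directly to the loss, using the unbiased and
# always-nonnegative estimator
#   KL ~= exp(logp_ref - logp) - (logp_ref - logp) - 1.
log_ratio = logp_ref - logp
kl_estimate = (np.exp(log_ratio) - log_ratio - 1.0).mean()

loss = pg_loss + beta * kl_estimate
```

With the reward-side placement, the KL term would instead be subtracted from each token's reward before advantages are computed, so the penalty would get baked into (and distorted by) the advantage normalization.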

1 year ago 49 9 3 0

Restarting an old routine "Daily Dose of Good Papers" together w @vaibhavadlakha.bsky.social

Sharing my notes and thoughts here 🧵

1 year ago 61 8 5 3

It's finally out!

Visual experience orthogonalizes visual cortical responses

Training in a visual task changes V1 tuning curves in odd ways. This effect is explained by a simple convex transformation. It orthogonalizes the population, making it easier to decode.

10.1016/j.celrep.2025.115235

1 year ago 153 44 5 2

group relative policy optimization (GRPO)

A friendly intro to GRPO. The algorithm is quite simple and elegant when you compare it to PPO, TRPO, etc., and it's remarkable how well it worked out for DeepSeek R1.

superb-makemake-3a4.notion.site/group-relati...
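The elegance is mostly in the advantage estimate: sample a group of completions per prompt and standardize their rewards within the group, with no learned critic. A minimal sketch (the function name and epsilon are my own):

```python
import numpy as np

def group_relative_advantages(rewards, eps=1e-8):
    """GRPO's advantage estimate: for G completions sampled from the
    same prompt, standardize each completion's reward within the group.
    No value network is needed, unlike PPO's GAE baseline."""
    r = np.asarray(rewards, dtype=float)
    return (r - r.mean()) / (r.std() + eps)

# One prompt, four sampled completions scored by a reward model:
adv = group_relative_advantages([1.0, 0.0, 0.5, 0.5])
```

Every token in a completion then shares that completion's group-relative advantage, which is what makes the method cheap enough to run at R1 scale.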

1 year ago 33 4 0 1

Eugen Coroi, Changhun Oh
Exponential advantage in continuous-variable quantum state learning
https://arxiv.org/abs/2501.17633

1 year ago 1 1 0 0
Can Transformers Do Enumerative Geometry? How can Transformers model and learn enumerative geometry? What is a robust procedure for using Transformers in abductive knowledge discovery within a mathematician-machine collaboration? In this work...

I am extremely happy to announce that our paper
Can Transformers Do Enumerative Geometry? (arxiv.org/abs/2408.14915) has been accepted to the
@iclr-conf.bsky.social!!
Congrats to my collaborators Alessandro Giacchetto at ETH Zürich and Roderic G. Corominas at Harvard.
#ICLR2025 #AI4Math #ORIGINS

1 year ago 12 3 1 2