Posts by Ivan Rubachev

New paper 🚨
"Stable Deep Reinforcement Learning via Isotropic Gaussian Representations"

Deep RL suffers from unstable training, representation collapse, and neuron dormancy. We show that a simple geometric insight, isotropic Gaussian representations, can fix this. Here's how 👇

1 month ago 24 4 2 2
GitHub - soda-inria/tabicl: TabICLv2: A state-of-the-art tabular foundation model

TabICL also released a new version this week, highly recommend checking it out too: github.com/soda-inria/t...

2 months ago 3 0 0 0
GitHub - limix-ldm-ai/LimiX: LimiX: Unleashing Structured-Data Modeling Capability for Generalist Intelligence https://arxiv.org/abs/2509.03505

P.S. It's interesting how in our little corner of the ML/DL space SOTA foundation models are actually open (GraphPFN was initialised from the github.com/limix-ldm/Li... model, and used @dholzmueller.bsky.social @gaelvaroquaux.bsky.social prior sampling from TabICL)

2 months ago 3 0 1 0
GraphPFN: a graph foundation model pretrained on diverse synthetic graphs We introduce GraphPFN, a graph foundation model that extends the prior-data fitted network (PFN) framework to graphs, achieving state-of-the-art in-context and finetuned results on real-world graphs.

They also wrote up a blogpost about the model research.yandex.com/blog/graphpf...

2 months ago 0 0 1 0

To piggy-back a bit on the foundation models for structured data discussion here

My colleagues at Yandex Research just updated the GraphPFN paper. It's a Graph Foundation Model that works on graph datasets with tabular features, and shows SOTA results both in ICL regimes and when fine-tuned.

2 months ago 2 2 1 0

this?

2 months ago 2 0 0 0

How hard can it be to build a browser from scratch for three platforms anyways?

Apparently 20K lines of code and ~70 hours from first commit to last.

emsh.cat/one-human-on...

#llm #llms #ai #codex #openai

2 months ago 39 9 4 2
The Importance of Diversity I read Dario’s The Adolescence of Technology and it’s scary. It assumes the perspective of a top-down ruler, that someone can and will get to control AI. This is taken as a given. Machines of Loving G...

I liked the response better geohot.github.io//blog/jekyll...

2 months ago 5 0 1 0

if you’re going to use AI in your workflow, you have to get extremely good at self-discipline/focus because AI will literally tempt you to pursue every tiny whim/idea that enters your brain and thus will absolutely destroy you and your work if left unchecked

slow down before it’s too late

3 months ago 100 12 5 2
Rust's standard library on the GPU GPU code can now use Rust's standard library. We share the implementation approach and what this unlocks for GPU programming.

We are excited to announce that we can successfully use Rust's standard library from the GPU. This has never been done before.

www.vectorware.com/blog/rust-st...

Supporting Rust's standard library enables existing Rust code to work on the GPU and makes GPU programming feel normal.

3 months ago 252 57 4 6
Knowing Less, Producing More On AI, Craftsmanship, and the Slow Erosion of Productive Friction.

www.d12frosted.io/posts/2025-1...

3 months ago 1 0 0 0
A Social Filesystem — overreacted Formats over apps.

formats over apps

3 months ago 822 189 68 84
N-Gram Induction Heads for In-Context RL: Improving Stability and Reducing Data Needs In-context learning allows models like transformers to adapt to new tasks from a few examples without updating their weights, a desirable trait for reinforcement learning (RL). However, existing in-co...

Explicitly adding induction heads helps. Some gains in NLP, seemingly bigger in RL algorithm distillation arxiv.org/abs/2411.01958

1 year ago 1 0 1 0

⚡️

1 year ago 0 0 0 0
Day 1 - Advent of Code 2024

I just completed "Historian Hysteria" - Day 1 - Advent of Code 2024 #AdventOfCode adventofcode.com/2024/day/1 (in zig btw)

1 year ago 1 1 1 0

Yep, just need to find the code. I can share

1 year ago 1 0 1 0

Yeah. I've experimented a bit with the existing code. It generalized to some of our specific problems in tabular DL (even though the meta-training was mostly on language and vision tasks). Curious what you mean by "actually worked" here? No edge cases and failures, or just technically easy to use?

1 year ago 0 0 1 0

#MLsky

1 year ago 69 6 0 1

The rejects were horribly misinformed, self-contradictory, but extremely confident. PSGD, SOAP and friends are taking over regardless of academia.

1 year ago 0 2 0 0
VeLO: Training Versatile Learned Optimizers by Scaling Up While deep learning models have replaced hand-designed features across many domains, these models are still trained with hand-designed optimizers. In this work, we leverage the same scaling approach b...

VeLO was something else, I’m a fan arxiv.org/abs/2211.09760

1 year ago 5 0 3 0

Thank you @bsky.app team for correcting the mistake. Glad to be back!

1 year ago 304 24 39 32

Did you know that 99% of email today is spam? Your inbox isn’t 99% spam because AI is used to filter it.

The same 99% will happen here too, but if AI researchers continue to get perma-banned for making available the datasets needed to filter it, it’s going to make this platform unusable.

1 year ago 511 64 42 25

@trl-research.bsky.social

1 year ago 2 1 0 0
How AutoML Creates New Opportunities for Europe - Frank Hutter // CyberValley Podcast #5 YouTube video by Cyber Valley

Tabular DL and AutoML podcast just dropped. For sure watching this

youtu.be/3qpQ-sMRafE

1 year ago 11 2 1 0

Hello to all #ICLR reviewers on #MLsky

1 year ago 27 4 0 2

bsky.app/profile/hame...

1 year ago 4 1 0 0

But keep the numbers in appendix or code pls

So annoying when the only info is in visual form with unclear axes etc. I agree that it’s much better for presentation, but when digging in, I often need raw metrics.

1 year ago 4 0 1 0
GitHub - Bossett/bsky-feeds Contribute to Bossett/bsky-feeds development by creating an account on GitHub.

…extent of customisability?

If I understand correctly, we can do a lot with custom feeds.

Some examples here github.com/Bossett/bsky...

1 year ago 0 0 0 0
Custom Feeds | Bluesky Custom feeds, or feed generators, are services that provide custom algorithms to users through the AT Protocol. This allows users to choose their own timelines, whether it's an algorithmic For You pag...

Wow. Didn’t know we can create custom algorithmic feeds here. This is cool! What are your favourites, what’s the extent of

(context: docs.bsky.app/docs/starter...)

1 year ago 1 0 1 0