Ulyana Piterbarg (@upiter) Bsky

1/13 New Paper!! We try to understand why some LMs self-improve their reasoning while others hit a wall. The key? Cognitive behaviors! Read our paper on how the right cognitive behaviors can make all the difference in a model's ability to improve with RL! 🧵

1 year ago

Thank you to @sloanfoundation.bsky.social for this generous award to our lab. Hopefully this will bring us closer to building truly general-purpose robots!

1 year ago

(Many) more details in our paper! arxiv.org/abs/2410.02749

1 year ago

LMs trained to synthesize programs by repeatedly editing their own generations produce more diverse code than baselines

This improves the trade-off between test-time FLOPs and pass@k
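For readers unfamiliar with pass@k: it is the probability that at least one of k sampled programs passes the tests. Below is a minimal sketch of a commonly used unbiased estimator computed from n samples of which c pass. Whether the paper uses exactly this estimator is an assumption, and the function name `pass_at_k` is illustrative.

```python
from math import comb


def pass_at_k(n: int, c: int, k: int) -> float:
    """Estimate pass@k from n samples, of which c passed the tests.

    Computes 1 - C(n - c, k) / C(n, k): one minus the chance that a random
    size-k subset of the n samples contains no passing program.
    """
    if n - c < k:
        # Fewer than k failing samples: every size-k subset has a pass.
        return 1.0
    return 1.0 - comb(n - c, k) / comb(n, k)
```

For example, with 10 samples and 5 passing, `pass_at_k(10, 5, 1)` is 0.5.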

1 year ago

Our approach introduces an algorithm, LintSeq, for sampling across interdependent lines in source code by using a code linter

With LintSeq, we can generate plausible edit *trajectories* for any source code file, covering possible ways of synthesizing its contents edit-by-edit with no linter errors
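A rough sketch of the idea, not the implementation from the paper: start from the full file, repeatedly delete chunks of lines while keeping only intermediate states that pass a check, then reverse the states and diff consecutive ones to get insertion edits. Everything here is an assumption made for illustration: `ast.parse` stands in for a real linter (it only catches syntax errors), deletions are found by simple rejection sampling, and the names `is_clean`, `sample_states`, and `edit_trajectory` are hypothetical.

```python
import ast
import difflib
import random


def is_clean(lines):
    """Stand-in 'linter' (assumption): True if the lines parse as Python."""
    try:
        ast.parse("\n".join(lines))
        return True
    except SyntaxError:
        return False


def sample_states(lines, rng, max_tries=50):
    """Backward pass: delete random chunks, keeping only clean states.

    Assumes the input file itself is clean. Returns the states ordered
    from the empty program up to the full file.
    """
    states = [list(lines)]
    current = list(lines)
    while current:
        for _ in range(max_tries):
            # Propose deleting a random contiguous chunk of lines.
            start = rng.randrange(len(current))
            end = rng.randrange(start, len(current)) + 1
            candidate = current[:start] + current[end:]
            if is_clean(candidate):
                break
        else:
            # No clean candidate found: fall back to the empty program.
            candidate = []
        current = candidate
        states.append(list(current))
    return states[::-1]


def edit_trajectory(source, seed=0):
    """Forward pass: unified diffs between consecutive clean states."""
    rng = random.Random(seed)
    states = sample_states(source.splitlines(), rng)
    return [
        "\n".join(difflib.unified_diff(before, after, lineterm=""))
        for before, after in zip(states, states[1:])
    ]
```

Calling `edit_trajectory(source)` on a small Python file returns a list of diffs that rebuild the file step by step, with every intermediate state passing the stand-in check.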

1 year ago

Our paper showing that LMs benefit from human-like abstractions for code synthesis was accepted to ICLR! 🇸🇬

We show that order matters in code generation: casting code synthesis as a sequential edit problem by preprocessing examples in SFT data improves LM test-time scaling laws

1 year ago

Can we extend the power of world models beyond just online model-based learning? Absolutely!

We believe the true potential of world models lies in enabling agents to reason at test time.
Introducing DINO-WM: World Models on Pre-trained Visual Features for Zero-shot Planning.

1 year ago

Williams and Zipser (1989) is a classic one! leech.cybernoid.gr/files/text/p...

1 year ago

Scaling Laws for Pre-training Agents and World Models

The performance of embodied agents has been shown to improve by increasing model parameters, dataset size, and compute. This has been demonstrated in domains from robotics to video games, when generat...

Finally finally finally some scaling curves for imitation learning in the large-scale-data regime: arxiv.org/abs/2411.04434

1 year ago

Introducing 🧞Genie 2 🧞 - our most capable large-scale foundation world model, which can generate a diverse array of consistent worlds, playable for up to a minute. We believe Genie 2 could unlock the next wave of capabilities for embodied agents 🧠.

1 year ago

Now that @jeffclune.bsky.social and @joelbot3000.bsky.social are here, time for an Open-Endedness starter pack.

go.bsky.app/MdVxrtD

1 year ago
