Posts by Sasha Rush

If you're in Berkeley or like a nice streamed talk, I'm about to give a talk at the Simons Institute today: "You Know It Or You Don’t: Compositionality and Phase Transitions in LMs". Tune in at 4 PM Pacific!

1 year ago 21 3 1 0

I'm hanging around with Theorists 🤓

1 year ago 0 0 1 0
How DeepSeek Changes the LLM Story (YouTube video by Sasha Rush 🤗)

What to know about DeepSeek

youtu.be/0eMzc-WnBfQ?...

In which we attempt to figure out MoE, o1, scaling, tech reporting, modern semiconductors, microeconomics, and international geopolitics.

1 year ago 96 14 1 5

These are great recommendations thank you.

1 year ago 1 0 0 0

For reasons, I find myself thinking a lot about the history of US/USSR Cold War science, particularly in applied math. Does anyone have a recommendation for a good book on this topic?

1 year ago 7 1 4 0

Yeah, vertical is kind of dumb, but I thought I'd try it out.

1 year ago 0 0 0 0

😆, I noticed I have trouble saying that word.

However, if you listen to your own videos, you will never manage to release anything.

1 year ago 2 0 0 0
Flash LLMs: Pipeline Parallel (YouTube video by Sasha Rush 🤗)

10 short videos about LLM infrastructure to help you appreciate pages 12-18 of the DeepSeek-v3 paper (arxiv.org/abs/2412.19437)

www.youtube.com/watch?v=76gu...

1 year ago 28 4 4 1

Thought this "Bill Gates" guy was on the level.

1 year ago 3 0 0 0

I'll try it out. Good to check once a year to see if I'm secretly an existential risk guy.

1 year ago 3 0 0 0

I mean, Microsoft bought his company under the table, right?

Luckily, there is no indication from the review that he read the book.

1 year ago 2 0 0 0

Maybe I'll just use bluesky to rant about topics I'm too scared to talk about on twitter.

1 year ago 26 1 2 0

The casual conflation of AI with gene editing is intellectual malpractice. These two things have nothing to do with each other!

1 year ago 14 0 3 0

Would love to read a good book about this topic if anyone wants to give it a try.

1 year ago 5 0 1 1

I've been listening to too much If Books Could Kill, so now I'm convinced these airport books are actually the only thing that matters.

1 year ago 14 0 1 0

I tried reading this book, and I was just shocked at how little insight it had, and its sheer inability to focus. The fact that it is being recommended to policy makers...

www.gatesnotes.com/The-Coming-W...

1 year ago 39 1 10 1
Python + WebGPU (YouTube video by Sasha Rush 🤗)

I'm going to do a live coding stream for the next couple of hours. We'll start by running through some WebGPU tutorials. Can also talk about some AI stuff.

www.youtube.com/watch?v=sqKq...

1 year ago 21 4 1 0

We outperform Llama 70B with Llama 3B on hard math by scaling test-time compute 🔥

How? By combining step-wise reward models with tree search algorithms :)

We're open sourcing the full recipe and sharing a detailed blog post 👇

1 year ago 109 21 4 1

Huh, so maybe OCaml should be the target for verifiable generation? I heard you guys have ways to build fast.

1 year ago 2 0 2 0

As coding LLMs get faster at inference, iterating verification-in-the-loop tests becomes the bottleneck for coding agents. Probably need quite different programming systems for these settings, or even things like "batchable" runtimes, whatever that means.

1 year ago 10 0 2 0

We organised a lively poster fest with many students rehearsing for the upcoming @neuripsconf.bsky.social next week and others discussing their cool works!

Thanks to #GAIL, the #Generative #AI lab in #Edinburgh for sponsoring the event!

1 year ago 44 7 0 1
[Post with two images, no text]
1 year ago 31 3 4 0

I wanted to make my first post about a project close to my heart. Linear algebra is an underappreciated foundation for machine learning. Our new framework CoLA (Compositional Linear Algebra) exploits algebraic structure arising from modelling assumptions for significant computational savings! 1/4

1 year ago 139 21 3 2
Screenshot of BBC 100 picture of Sasha and blurb; linked in post.

Proud of my amazing colleague @sashamtl.bsky.social for her much deserved recognition on advancing the science of AI energy use.
BBC100: www.bbc.co.uk/news/resourc...
Fast Company: www.fastcompany.com/91233692/why...
Sasha has been working tirelessly moving things fwd--endurance & brilliance in one.

1 year ago 34 5 2 0

NEW: we have an exciting opportunity for a tenure-track professor at the #KempnerInstitute and the John A. Paulson School of Engineering and Applied Sciences (SEAS). Read the full description & apply today: academicpositions.harvard.edu/postings/14362
#ML #AI

1 year ago 20 19 0 1

We're hiring another predoctoral researcher for my team at Ai2/OLMo next year. The goal of this position is to mentor and grow future academic stars of NLP/AI over 1-2 years before grad school.

This ends up being people done with BS or MS who want to continue to a PhD soon.
https://buff.ly/49nuggo

1 year ago 54 7 6 1

Answers from Twitter x.com/srush_nlp/st...

1 year ago 5 0 1 0

Unfortunately Yoav's question is a bit more interesting and subtle than this talk.

1 year ago 4 0 3 0

🙏

1 year ago 1 0 0 0

Is there a community that writes RL-first programming languages? Something like (Num)Pyro that takes seriously the idea of separating the policy specification from the learning process.

1 year ago 16 0 2 0