Posts by Eran Malach

In our newest work (led by the amazing @sunnytqin.bsky.social, w/ @emalach.bsky.social, Samy Jelassi), we investigate a core question for LLMs: "to backtrack or not to backtrack" in two prototypical logic-heavy puzzles: CountDown and Sudoku.

1 year ago 3 2 1 0

Will be presenting this work at #NeurIPS2024, today 11am, poster #2311. Come visit us!

1 year ago 10 1 0 0

Heading to NeurIPS tomorrow ✈️
Will be presenting a few papers during the week. Ping me if you want to chat!

1 year ago 2 0 0 0

I defended my PhD dissertation back in May. I didn't have time to share it widely then (newborn baby), but I think some of you might enjoy it, especially the opening chapters: benjaminedelman.com/assets/disse...

1 year ago 31 3 3 1

Just put together a starter pack for Deep Learning Theory. Let me know if you'd like to be included or suggest someone to add to the list!

go.bsky.app/2qnppia

1 year ago 87 31 29 5

How does test loss change as we change the training data? And how does this interact with scaling laws?

We propose a methodology to approach these questions by showing that we can predict the performance across datasets and losses with simple shifted power law fits.

1 year ago 19 7 1 2