
Posts by Katrina Drozdov (Evtimova)

9 Lessons I Learned while Doing RL Post-Training for LLMs

Finally dipped my toes into RL post-training. I trained a code generation LLM with GRPO using open-r1. Here are my 9 takeaways: kevtimova.github.io/posts/grpo/

9 months ago 2 0 0 0
Post image

I asked ChatGPT, Gemini, and Claude for a clever joke. They all gave me the same one. Either AI is merging into a hive mind… or humor has officially been solved mathematically!

1 year ago 2 0 0 0

The principle of least effort, from psychology, describes how we favor efficiency over effort. It aligns with System 1 (fast, intuitive) vs. System 2 (slow, deliberate) reasoning. AI faces a similar challenge: knowing when to rely on heuristics vs. deeper reasoning.

1 year ago 4 0 0 0
Post image

Happy Holidays and a Joyful New Year!

Wall sculpture series from “Random Walks of Happiness”, 2024
Celebrating Art: the Process, the Experiment, the Media.

#randomwalksofhappiness #happynewyear #happyholidays #abstractart #art #experimentalart #drawing #sculpture #wire #foundobjectsart #december

1 year ago 10 1 0 0

If you are into ML theory (RL or not) with a proven track record, and you are interested in an industry research position, PM me. Feel free to spread the word.

1 year ago 74 31 2 0
Video

I gave a talk on Compositional World Models at NeurIPS last week 🌐

The recording is now online: neurips.cc/virtual/2024... (for registered attendees; starts at 6:06:00)

Workshop: compositional-learning.github.io

1 year ago 40 4 1 0
Post image

Just 10 days after o1's public debut, we’re thrilled to unveil the open-source version of the technique behind its success: scaling test-time compute.

By giving models more "time to think," Llama 1B outperforms Llama 8B in math—beating a model 8x its size. The full recipe is open-source!

1 year ago 83 19 4 2
Transactions on Machine Learning Research

Announcement from TMLR jmlr.org/tmlr/

"""
📣 Heads-up 📣 that TMLR will pause new submissions over the upcoming holiday period from December 2 2024 to January 6 2025 (midnight AoE on both dates). We will resume accepting new submissions on January 7, 2025. Happy Holidays!
"""

1 year ago 50 5 2 1
A screenshot of the UnicodeIt website

Write math on 🦋 with UnicodeIt!
For example: θ ∈ ℝⁿ or pp̅ → μ⁺μ⁻
Use the website or install it system-wide on Linux, macOS, or Windows
www.unicodeit.net

(Created several years ago with @svenkreiss.bsky.social)

1 year ago 288 62 19 9
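The LaTeX-to-Unicode substitution that UnicodeIt performs can be sketched in a few lines. This is a toy illustration of the idea, not the library's actual API — the table and function names here are my own, covering only the symbols from the example above:

```python
# Toy sketch of LaTeX-command-to-Unicode substitution, in the spirit
# of UnicodeIt. Mapping tables are illustrative, not exhaustive.
SYMBOLS = {
    r"\theta": "θ",
    r"\in": "∈",
    r"\mathbb{R}": "ℝ",
    r"\mu": "μ",
    r"\rightarrow": "→",
}
SUPERSCRIPTS = {"n": "ⁿ", "+": "⁺", "-": "⁻"}

def latexish_to_unicode(text: str) -> str:
    """Replace known LaTeX commands, then ^x superscripts, with Unicode."""
    for cmd, uni in SYMBOLS.items():
        text = text.replace(cmd, uni)
    for ch, sup in SUPERSCRIPTS.items():
        text = text.replace("^" + ch, sup)
    return text

print(latexish_to_unicode(r"\theta \in \mathbb{R}^n"))  # θ ∈ ℝⁿ
print(latexish_to_unicode(r"\mu^+\mu^-"))               # μ⁺μ⁻
```

The real tool supports far more commands (Greek letters, operators, sub/superscripts) and is what you'd actually want for pasting math into a plain-text post.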

If you post a research paper with a picture of an animal, I will follow you. It's the law.

1 year ago 8 1 4 1

some little bluesky tips 🦋

your blocks, likes, lists, and just about everything except chats are PUBLIC

you can pin custom feeds; i like quiet posters, best of follows, mutuals, mentions

if your chronological feed is overwhelming, you can make and pin a personal list of "unmissable" people

1 year ago 255 57 17 3
A plot showing that reranking improves recall as the number of reranked docs increases, but with more docs come diminishing returns and eventually a performance dip.


Mat is not on 🦋—posting on his behalf!

It's time to revisit common assumptions in IR! Embeddings have improved drastically, but mainstream IR evals have stagnated since MSMARCO + BEIR.

We ask: on private or tricky IR tasks, are rerankers better? Surely, reranking many docs is best?

1 year ago 81 24 4 5

How many documents should you retrieve when using a reranker? The answer might surprise you!

Check out the excellent work from our intern Mathew on this important retrieval question. 👏

1 year ago 11 3 0 0

Google Scholar is twenty years (and one day) old today www.infodocket.com/2024/11/18/google-schola...

1 year ago 5 1 0 0
whiteboard with MUST:
- 10k user signups
- view post on web
- app store
- other pds federate 
- 3p labels/services 
etc


A whiteboard from our Jan 2023 team retreat where I wrote down our goals, and the top one was “10k user signups.”

We’re growing by 10k users every 10-15 minutes right now.

1 year ago 49260 3666 1396 281

I'm making a list of AI for Science researchers on bluesky — let me know if I missed you / if you'd like to join!

go.bsky.app/AcP9Lix

1 year ago 248 90 160 5

New here? Interested in AI/ML? Check out these great starter packs!

AI: go.bsky.app/SipA7it
RL: go.bsky.app/3WPHcHg
Women in AI: go.bsky.app/LaGDpqg
NLP: go.bsky.app/SngwGeS
AI and news: go.bsky.app/5sFqVNS

You can also search all starter packs here: blueskydirectory.com/starter-pack...

1 year ago 553 212 66 55

A starter pack for #NLP #NLProc researchers! 🎉

go.bsky.app/SngwGeS

1 year ago 251 99 45 13
Preview
From Academia to Industry: How a 2018 Paper Foreshadowed OpenAI’s Latest Innovation How a 2018 paper by CDS researchers helped shape OpenAI’s latest innovation, the o1 model.

You can read about my research on emergent communication with adaptive compute at inference time and its connection to OpenAI's o1 model here: nyudatascience.medium.com/from-academi...

#ai #machinelearning #research

1 year ago 1 0 1 0

Hello Bluesky! 👋 I'm an AI researcher with a PhD from NYU's Center for Data Science. My research focuses on representation learning for images and video, with an emphasis on self-supervised learning and regularization methods. Excited to connect and explore here! #ai #deeplearning #computervision

1 year ago 4 1 0 0