Posts by Adam Goliński

Very curious to see the things you choose to do! :) Best of luck!
Perfect
In October, I gave a talk at ML in PL in Warsaw: a whirlwind tour of what goes into training image and video generation models at scale.
📺 video: www.youtube.com/watch?v=qFIT...
🖼️ slides: docs.google.com/presentation...
Hegseth: "Death & destruction from the sky all day. We're playing for keeps. Our warfighters have maximum authorities granted personally by POTUS & yours truly. Our rules of engagement are bold, precise, designed to unleash American power, not shackle it ... we are punching them while they are down"
🔥What if web text isn’t the best place to start training LLMs? Our latest work shows that warming up models on procedural data (e.g. from formal languages & simple algorithms) speeds up subsequent pretraining on language, code, and math, on models up to 1.3B parameters⬇️🧵
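(Not the paper's recipe, but to make "procedural data from formal languages" concrete: a minimal sketch that samples balanced-bracket Dyck strings as a warmup corpus. The generator, its parameters, and the corpus size are illustrative assumptions.)

```python
import random

def dyck_sequence(max_depth: int = 8, length: int = 64) -> str:
    """Sample a balanced-bracket (Dyck) string, a classic formal language
    used as procedural pretraining data. Illustrative generator, not the
    paper's exact recipe."""
    pairs = [("(", ")"), ("[", "]"), ("{", "}")]
    out, stack = [], []
    while len(out) < length:
        # Close a bracket if we're at max depth (or by a coin flip);
        # otherwise open a new one.
        if stack and (len(stack) >= max_depth or random.random() < 0.5):
            out.append(stack.pop())
        else:
            opener, closer = random.choice(pairs)
            out.append(opener)
            stack.append(closer)
    out.extend(reversed(stack))  # close anything still open
    return "".join(out)

# Warmup corpus: many short procedural samples, seen by the model
# before switching over to web text / code / math.
corpus = [dyck_sequence() for _ in range(10_000)]
```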
Small Language Models (SLMs) don’t have the capacity to remember everything in their training data. Which tokens should they learn to predict, and when should they ask for help? We tackle this question in our new preprint.
You can check it out on arXiv: arxiv.org/abs/2602.12005
🧵1/7
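(To make "when should they ask for help" concrete, one generic instantiation: defer whenever the small model's predictive entropy crosses a threshold. This is a sketch of the question, not the preprint's actual criterion; the threshold value is made up.)

```python
import torch
import torch.nn.functional as F

def predict_or_defer(slm_logits: torch.Tensor, threshold: float = 3.0):
    """Entropy-threshold deferral: predict with the SLM when it is
    confident, otherwise 'ask for help' (e.g. call a larger model).
    slm_logits: (vocab_size,) next-token logits from the small model."""
    probs = F.softmax(slm_logits, dim=-1)
    entropy = -(probs * probs.clamp_min(1e-12).log()).sum().item()
    if entropy > threshold:
        return None, entropy  # defer: too uncertain to answer alone
    return int(probs.argmax()), entropy
```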
we need to talk about that Ring Super Bowl ad
Contributor: Greg Brockman | Amount: $25,000,000 | Election to date: $25,000,000
The largest Trump super PAC donor so far this cycle is the president of OpenAI.
Federal agents with weapons drawn, moments before murdering American citizens on the streets of Minneapolis at the dawn of 2026.
What should academics be doing right now?
I have been writing up some thoughts on what the research says about effective action, and what universities specifically can do.
davidbau.github.io/poetsandnurs...
It's on GitHub. Suggestions and pull requests welcome.
github.com/davidbau/poe...
Nasra Ahmed, a 23-year-old US citizen, was arrested and detained by ICE. She was held for TWO DAYS.
ICE agents handcuffed her, called her a racial slur, and knocked her to the ground so hard she got a concussion.
This cannot continue happening. ICE needs to leave.
In our new work — Complete(d)P — we try to answer 3 questions about hyperparameter (HP) scaling:
● How to transfer across model size, tokens & batch size? → Complete(d)P
● Do per-module HPs matter? ✔️ 2x speed-ups possible
● Do they transfer to larger scale? ✔️ With the right parameterisation (a rough flavour is sketched below)
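(For flavour, roughly what a width-aware, per-module parameterisation looks like, in the spirit of µP-style learning-rate scaling: hidden-layer learning rates shrink as 1/width so that HPs tuned on a small model transfer to a big one. The function, base width, and scaling rules below are illustrative assumptions, not Complete(d)P's actual rules.)

```python
def per_module_lr(base_lr: float, width: int, base_width: int = 256) -> dict:
    """Illustrative µP-flavoured per-module learning rates: hidden matrices
    get lr proportional to 1/width, so a base_lr tuned at base_width can be
    reused at larger widths. Not Complete(d)P's actual parameterisation."""
    scale = base_width / width
    return {
        "embedding": base_lr,       # input embeddings: width-independent lr
        "hidden": base_lr * scale,  # hidden weight matrices: lr ~ 1/width
        "readout": base_lr * scale, # output head: also shrunk with width
    }

# Tune once at a small width, then reuse the same base_lr at scale.
small = per_module_lr(base_lr=1e-2, width=256)
large = per_module_lr(base_lr=1e-2, width=4096)
```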
What coding with an LLM feels like sometimes.
1. I second the sentiment that TMLR is amazing. If I had a choice, I’d only review for TMLR.
2. Maybe it’s time for conferences to only present papers at the level equivalent to what is now “nominated for a spotlight/oral”.
3. I think this could be realized with a TMLR-like reviewing process, and upon acceptance the reviewers could nominate the paper to be considered for whatever the most appropriate upcoming conference is (like what TMLR is currently doing for ICLR).
4. This would be quite similar to the ARR (ACL Rolling Review) system, but without the promise that if you submit by date X, the review will be finished in time to be considered for conference Y.
5. Beyond that, I’m more and more convinced we need a better incentive structure around this, maybe some kind of “submitting-reviewing credit system” where you earn credits for doing (good) reviews and spend those credits when you submit papers.
Came across this job market paper that I actually enjoyed reading. It picks up on an intuitive idea and studies it systematically. I wonder how a similar paper, some years down the line, would look at the massive stimulus-fueled post-2008 tech boom. drive.google.com/file/d/1SClt...
The "Log Lady" from Twin Peaks consoling her log
when you call console.log() this is what happens
One more banger by Zach Weinersmith. How can one person have such breadth of topics and such depth at the same time? Big fan.
Our research team is hiring PhD interns 🍏 Spend your next summer in Paris and explore the next frontiers of LLMs for uncertainty quantification, calibration, RL and post-training, and Bayesian experimental design.
Details & Application ➡️ jobs.apple.com/en-my/detail...
I understood that they got super high scores "in general", which might or might not be a good thing depending on whether you agree with them :D But since that's the case, I'm jealous of your lucky draw!
rightfully so?
📢 We’re looking for a researcher in cogsci, neuroscience, linguistics, or related disciplines to work with us at Apple Machine Learning Research! We're hiring for a one-year interdisciplinary AIML Resident to work on understanding reasoning and decision making in LLMs. 🧵
Do AI agents ask good questions? We built “Collaborative Battleship” to find out—and discovered that weaker LMs + Bayesian inference can beat GPT-5 at 1% of the cost.
Paper, code & demos: gabegrand.github.io/battleship
Here's what we learned about building rational information-seeking agents... 🧵🔽
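(The Bayesian ingredient is generic enough to sketch: score candidate questions by expected information gain under the current posterior over hidden boards, and ask the best one. A minimal sketch assuming a uniform posterior and deterministic answers; the names here are hypothetical and the paper's pipeline is richer.)

```python
import math
from collections import Counter

def expected_information_gain(hypotheses, question):
    """EIG of a question under a uniform posterior over board hypotheses.
    hypotheses: list of candidate hidden boards
    question: function mapping a board to its answer (e.g. True/False)."""
    n = len(hypotheses)
    prior_entropy = math.log2(n)
    counts = Counter(question(h) for h in hypotheses)
    # Expected posterior entropy: each answer a leaves a uniform posterior
    # over the counts[a] hypotheses consistent with it.
    expected_posterior = sum((c / n) * math.log2(c) for c in counts.values())
    return prior_entropy - expected_posterior

# Greedy questioner: ask whichever candidate question is most informative.
# best_q = max(candidate_questions,
#              key=lambda q: expected_information_gain(hypotheses, q))
```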
It's that time of the year! 🎁
The Apple Machine Learning Research (MLR) team in Paris is hiring a few interns to do cool research for ±6 months 🚀🚀 & work towards publications/OSS.
Check requirements and apply: ➡️ jobs.apple.com/en-us/detail...
More❓→ ✉️ mlr_paris_internships@group.apple.com
As an unironic take on "LLM Science", I like this bit by @dbusbridge.bsky.social (at 45:04):
icml.cc/virtual/2025...
Reminder that Amortized Inference Workshop submissions at the ELLIS Unconference are still open until **Oct 16, 2025**.
Only a short abstract (½ page) is required, so go ahead!
Workshop: Dec 2, 2025, co-located with EurIPS.
Website: sites.google.com/view/amortiz...
Waymo is coming to London next year.
LLMs are currently this one big parameter block that stores all sorts of facts. In our new preprint, we add context-specific memory parameters to the model, and pretrain the model along with a big bank of memories.
📑 arxiv.org/abs/2510.02375
[1/10]🧵
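(A toy sketch of the general shape, assuming the memories form a key-value bank read with top-k attention and added back into the residual stream; module name, shapes, and the top-k read are illustrative assumptions, and the preprint's actual architecture will differ.)

```python
import torch
import torch.nn as nn

class MemoryBank(nn.Module):
    """Context-specific memory parameters as a learned key-value bank,
    pretrained jointly with the base model. Illustrative sketch only."""
    def __init__(self, n_memories: int = 1024, d_model: int = 512):
        super().__init__()
        self.keys = nn.Parameter(torch.randn(n_memories, d_model))
        self.values = nn.Parameter(torch.randn(n_memories, d_model))

    def forward(self, h: torch.Tensor, top_k: int = 4) -> torch.Tensor:
        # h: (batch, d_model) context summary; attend over memory keys,
        # mix the top-k memory values, and inject them back into h.
        scores = h @ self.keys.T                   # (batch, n_memories)
        top_scores, idx = scores.topk(top_k, dim=-1)
        weights = top_scores.softmax(dim=-1)       # (batch, top_k)
        mem = (weights.unsqueeze(-1) * self.values[idx]).sum(dim=1)
        return h + mem
```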
Our two phenomenal interns, Alireza Mousavi-Hosseini and Stephen Zhang @syz.bsky.social have been cooking some really cool work with Michal Klein and me over the summer.
Relying on optimal transport couplings (to pick noise and data pairs) should, in principle, help guide flow matching.
🧵
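(A minimal sketch of the standard minibatch-OT trick this line of work builds on: within each batch, re-pair noise and data samples by solving an assignment problem on squared distances, then regress on the straight-line flow-matching targets. Function names and sizes are illustrative, not the interns' specific method.)

```python
import numpy as np
from scipy.optimize import linear_sum_assignment

def ot_pair_minibatch(noise: np.ndarray, data: np.ndarray):
    """Re-pair noise and data to minimise total squared transport cost,
    so the conditional straight-line paths cross less during training."""
    cost = ((noise[:, None, :] - data[None, :, :]) ** 2).sum(-1)
    rows, cols = linear_sum_assignment(cost)  # exact OT on the minibatch
    return noise[rows], data[cols]

# Flow-matching targets along the re-paired straight paths:
# x_t = (1 - t) * x0 + t * x1, with regression target v = x1 - x0.
x0, x1 = ot_pair_minibatch(np.random.randn(64, 2), np.random.randn(64, 2))
t = np.random.rand(64, 1)
x_t, v_target = (1 - t) * x0 + t * x1, x1 - x0
```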