
Posts by Yoav Artzi


3 weeks ago 2 0 0 0

I leave this project with a lot of food for thought about how we get to compact (and equally capable) models

3 weeks ago 1 0 0 0

Very excited about @nthngdy.bsky.social's new work! It really gets to the bottom (or top, depending on where the head in LMs is 😜) and fundamentals of contemporary LLMs. A real treat of a paper: solid theory, and very cool experiments.

3 weeks ago 14 2 1 0

This call is still open. I am looking to recruit, as well as many other faculty at Cornell. We review folders as they come, and will send offers until all positions are filled.

Please share with your network 🙏

2 months ago 11 8 0 0
About

We have a mailing list for big announcements:
groups.google.com/g/colm-annou...

We use it very sparingly, roughly 1-2 times a year

2 months ago 2 1 0 0

Call for papers -- due March 31, 2026 (abstracts due March 26)
colmweb.org/cfp.html

Call for workshops -- due April 14, 2026
colmweb.org/cfw.html

3 months ago 18 7 0 0

Hence, this is an interesting and important benchmark. Through a simple environment, it exposes a fairly fundamental flaw in current models

3 months ago 1 0 1 0

This is not surprising, and aligns with other findings in the literature regarding visual reasoning and manipulation

3 months ago 0 0 1 0

The prompts do provide rudimentary illustration. The stateful version allows the model to see the outcome of its own actions, technically allowing it to infer the physics. Generally though, the result for LLMs out of the box is negative.

3 months ago 1 0 1 0

Most of the experiments are not with VLMs, but with a diverse set of RL methods.

Do LLMs understand physics? They definitely generate outputs that seem to indicate so.

3 months ago 0 0 1 0

Submit to COLM! Deadline of March 31. This llama gets to enjoy his holidays and isn't stressed out just yet...

4 months ago 8 1 0 0

Zoe presented this paper at NeurIPS D+B: it's all knots (🪢🪢🪢!?), no language tokens were harmed (or reinforced) in the process

It's such a fun and creative paper, a real mind twist ;)

You really get to think carefully about visual intelligence looking at these knots 🪢

4 months ago 7 0 1 0

Hi all, I will be at #NeurIPS2025 to present my work on stress-testing looooooong visual reasoning with KnotGym🥨
Let's talk, whether or not your VLM can see 14 million possible futures like Doctor Strange

4 months ago 1 1 0 0

COLM is going to San Francisco for 2026!

🗓️Dates: October 6-9, 2026
🏨Venue: Hilton San Francisco Union Square

Website and CFPs for papers and workshops coming up soon!

5 months ago 21 6 0 1
5 months ago 6 2 0 1

This is maybe counterintuitive to the original intention of just indexing the chaos to make it accessible. I guess that ideal of search softened a long time ago

5 months ago 0 0 0 0

That's definitely part of it, because this digestion has a deeper history. Search engine indexing also seems just easier, so companies opt for it, even pre AI-overview-everything

5 months ago 0 0 1 0

Re peer-rev --> pre-print servers: arXiv is a simple uniform place to store. Indexing engines love it, so if you want something to be searchable, nothing is better. To make things worse, at times it seems like journals/proceedings almost play a game of hide-and-seek with PDFs

5 months ago 0 0 1 0

Re position papers: I don't think anyone can deny how effective some of these papers became for citation counts

5 months ago 0 0 1 0

Is this all just a big practical joke for ChatGPT? I have been told God doesn't play dice with the world, but I guess AGI does :)

5 months ago 1 0 0 0

It's a Thursday though ....

5 months ago 4 0 1 0
LM-class: an education resource for contemporary language modeling, broadly construed.

All available here:
lm-class.org

ChangeLog here:
lm-class.org/CHANGELOG.md

5 months ago 2 0 0 0

Pushed a big update to LM-class (v2025.2) -- this second version is a much more mature resource

Many refinements of lecture slides + significant improvements to the assignments

Many thanks to @ch272h.bsky.social, Yilun Hua, and Shankar Padmanabhan for their work on the assignments

5 months ago 4 0 1 0
Post-training for Efficient Communication via Convention Formation: Humans communicate with increasing efficiency in multi-turn interactions, by adapting their language and forming ad-hoc conventions. In contrast, prior work shows that LLMs do not naturally show this ...

This kind of ad-hoc adaptation is hard in general for LLMs, but you can post-train for it to some degree
arxiv.org/abs/2508.06482

I suspect contemporary ASR models have the same backbone, so maybe applicable too

More broadly, there is a lot of interesting stuff to do in this space of adaptation

5 months ago 2 0 1 0

I am potentially recruiting a postdoctoral fellow through this program. If interested, name me as a mentor, and ping me to let me know that you are applying! The process includes some sort of interview, so I can try to squeeze a few of these in advance (it will help a lot!)

5 months ago 4 0 0 0

Cornell is recruiting for multiple postdoctoral positions in AI as part of two programs: Empire AI Fellows and Foundational AI Fellows. Positions are available in NYC and Ithaca.

Deadline for full consideration is Nov 20, 2025!
academicjobsonline.org/ajo/jobs/30971

5 months ago 3 1 0 2
Cornell University, Empire AI Fellows Program (Job #AJO30971): Postdoctoral Fellow, New York, NY, US

Cornell (NYC and Ithaca) is recruiting AI postdocs, apply by Nov 20, 2025! If you're interested in working with me on technical approaches to responsible AI (e.g., personalization, fairness), please email me.

academicjobsonline.org/ajo/jobs/30971

5 months ago 32 20 1 2

Wild

5 months ago 0 0 0 0

There's the legit gaming, which is just optimizing for the metrics and breaking them. Then there's the really fake stuff, like citation rings. You would think citations translated into bitcoins, given the level of creativity and effort that people put into it

5 months ago 2 0 1 0

The top citer has >1k papers, with a PhD from 2007. That's one hell of a steady rate ¯\_(ツ)_/¯

5 months ago 0 0 1 0