Caglar Gulcehre (@caglarai) Bsky

We benchmarked several open-weight Chinese models on FrontierMath. Their top scores on Tiers 1-3 lag the overall frontier by about seven months.

3 months ago

#IPAM (the Institute for Pure and Applied Mathematics) is facing a critical shortfall for operating expenses due to an unexpected suspension of NSF funding: www.ipam.ucla.edu/news/nsf-fun... Donations for emergency continuity of operations funding can be made at

giving.ucla.edu/Campaign/Donat

8 months ago

🚀 Big time! We can finally do simple LLM RL fine-tuning with rewards and leverage offline/off-policy data!

❌ You want rewards, but GRPO only works online?
❌ You want offline, but DPO is limited to preferences?
✅ QRPO can do both!

🧵Here's how we do it:

9 months ago

We present an efficient and performant method that offers the best of both worlds in a new architecture. We show that our approach scales better than SOTA transformers with self-attention.

Incredible execution and attention to detail by @xiuyingwei.bsky.social!

9 months ago

Thrilled to announce that our work “Fleet of Agents” has been accepted @icmlconf.bsky.social. On average, FoA boosts quality by ~5% while reducing costs to ~40% of SOTA baselines. Blog post after the NeurIPS deadline ;)

Until then:
Paper: arxiv.org/abs/2405.066...
Code: github.com/au-clan/FoA

11 months ago

Many thanks to all the amazing collaborators who contributed to this project - Amin Mansouri, @lars-quaedvlieg.bsky.social, Amal Seddas, Maryna Viazovska, Emmanuel Abbe, @caglarai.bsky.social

12/12

11 months ago

Excited to share our latest work on EvoTune, a novel method integrating LLM-guided evolutionary search and reinforcement learning to accelerate the discovery of algorithms! 1/12🧵

11 months ago

This fantastic work is mainly due to incredibly hard-working students like @anjasurina.bsky.social, @lars-quaedvlieg.bsky.social, and Amin Mansouri. Also, this was my first paper with a Fields medalist, Maryna Viazovska, and the one and only Emmanuel Abbe 😀.

11 months ago

Woohoo🥳 Thrilled to announce this paper 📢. We have shown that it is possible to significantly improve the FunSearch method with RL and achieve impressive algorithmic discoveries on challenging NP-hard combinatorial optimization tasks like TSP and bin-packing.

11 months ago

🚨🚨 24 more hours to register your abstracts for the @grades-nda.bsky.social workshop @sigmod2025.bsky.social

Papers due March 30th 23:59 AoE 🚀

@sdumbrava.bsky.social @olafhartig.bsky.social @csaudk.bsky.social

1 year ago

Open positions and projects ### Open semester and Master's projects If you're an AU student looking for a semester project, a Bachelor project, or an MS thesis project, please refer to [this list](projects). ### Prospective PhD ...

I am recruiting 2 PhD students for Fall'25 @csaudk.bsky.social to work on bleeding-edge topics in #NLProc #LLMs #AIAgents (e.g. LLM reasoning, knowledge-seeking agents, and more).

Details: www.cs.au.dk/~clan/openings
Deadline: May 1, 2025

Please boost!

cc: @aicentre.dk @wikiresearch.bsky.social

1 year ago

Ahahaha it is still very clear in my mind, as if it was yesterday, and it was definitely soju because after you talked with the owner, several plastic bottles of soju showed up on our table. Though I don't remember the rest of the night 😉

1 year ago

If it turns out LLMs are only capable of recombinatory innovation (finding novel connections among existing knowledge), that would still be very useful. Most innovation is recombination and one of the big issues in science is that fields are too vast for scientists to bridge them to find connections

1 year ago

Amazing, could become the next hit 😉 I discovered @kyunghyuncho.bsky.social's amazing singing skills when I first went to karaoke with him in 2012.

1 year ago

TURING AWARD WINNER Richard S. Sutton in Conversation with Cam Linke | No Authorities in Science YouTube video by Amii

www.youtube.com/watch?v=9_Pe... An interview with Rich. The humility of Rich is truly inspiring: "There are no authorities in science". I wish people would listen and live by this.

1 year ago

stay tuned for more proper, detailed and exciting coverage of this preprint, but whoa i'm so proud of the team @prescientdesign.bsky.social and our achievements on <Lab-in-the-loop therapeutic antibody design with deep learning>!

1 year ago

And I am an ally. If you are too, let the world know.

1 year ago

I have been using the Glove80 kb for the last week due to my RSI, which has improved significantly since then. But I am still baffled by how hard it is to get used to a new kb layout. Oddly, although I type perfectly fine on it now, I can't enter my passwords with it because they are stored in my muscle memory.

1 year ago

LLMs and World Models, Part 1 How do Large Language Models Make Sense of Their “Worlds”?

Do large language models develop "emergent" models of the world? My latest Substack posts explore this claim and more generally the nature of "world models":

LLMs and World Models, Part 1: aiguide.substack.com/p/llms-and-w...

LLMs and World Models, Part 2: aiguide.substack.com/p/llms-and-w...

1 year ago

Trajan's (@starlord37.bsky.social) story & countless others like it in the face of these cuts and wild shifts in government fellowships intended for our brightest & most promising students will have long-term, deeply damaging effects on the U.S.'s competitiveness in science, math, CS and more.

1 year ago

SC24 IEEE-CS Seymour Cray Computer Engineering Award YouTube video by SC Conference Series

A great talk on the history and design decisions in Google's TPUs by my longtime colleague Norm Jouppi, winner of the 2024 Seymour Cray Computer Engineering award.

Talk: www.youtube.com/watch?v=a-1x...

Award announcement: www.computer.org/publications...

1 year ago

We’ve been thrilled by the positive reception to Gemini 2.0 Flash Thinking we discussed in December.

Today we’re sharing an experimental update w/improved performance on math, science, and multimodal reasoning benchmarks 📈:
• AIME: 73.3%
• GPQA: 74.2%
• MMMU: 75.4%

1 year ago

Google's Titans: a new architecture with attention and a meta in-context memory that learns how to memorize at test time, as presented by one of the authors - @alibehrouz.bsky.social

1 year ago

Also, check out our ML project template—it’s a game-changer!🚀🚀
@caglarai.bsky.social
🧑‍💻 github.com/CLAIRE-Labo/...

1 year ago

Ever been puzzled by your PPO agent collapsing out of nowhere? 📈🤯📉 Come check out our poster tomorrow!
Wed 11 Dec 11 am - 2 pm PST
West Ballroom A-D #6403
@caglarai.bsky.social @andreamiele.bsky.social @razvan-pascanu.bsky.social

1 year ago

Looking forward to seeing many of you during the conference!

1 year ago

Pluralistic Alignment @ NeurIPS 2024 Pluralistic Alignment

2. No Representation, No Trust: Connecting Representation, Collapse, and Trust Issues in PPO, West Ballroom A-D #6403
Both are on 11 Dec, 2pm-5pm EST. I am co-organizing the Pluralistic Alignment workshop on the 14th Dec with a fantastic lineup of speakers: pluralistic-alignment.github.io (2/3)

1 year ago

I am in Vancouver for NeurIPS 2024 until December 16th if you want to meet, DM or email me.
We have two accepted papers from my lab:
1. Building on Efficient Foundations: Effective Training of LLMs with Structured Feedforward Layers, on Wednesday, East Exhibit Hall A-C #2010 (1/3)

1 year ago

Metalenses harness AI for high-resolution, full-color imaging for compact optical systems Modern imaging systems, such as those used in smartphones, virtual reality (VR), and augmented reality (AR) devices, are constantly evolving to become more compact, efficient, and high-performing. Tra...

phys.org/news/2024-11... #AI #artificialintelligence

1 year ago
