@ucl-dark.bsky.social entered the stage! Thanks @lauraruis.bsky.social :)
Posts by DARK lab
Check out Tim's starter pack for Open-Endedness on Bluesky!
How do LLMs learn to reason from data? Are they ~retrieving the answers from parametric knowledge🦜? In our new preprint, we look at the pretraining data and find evidence against this:
Procedural knowledge in pretraining drives LLM reasoning ⚙️🔢
🧵⬇️
The LLM parrot analogy is dead. Fantastic work by UCL DARK's @lauraruis.bsky.social on rigorously investigating whether LLMs learn reasoning from procedural knowledge during pretraining.
Excited to announce "BALROG: a Benchmark for Agentic LLM and VLM Reasoning On Games" led by UCL DARK's @dpaglieri.bsky.social! Douwe Kiela's plot below is maybe the scariest for AI progress: LLM benchmarks are saturating at an accelerating rate. BALROG to the rescue. This will keep us busy for years.