Oh, and I must mention the BAIR espresso machine! It was only while huddled around freshly ground coffee that we came up with this idea (initially wondering whether content length matters for statistical behaviors). If you want good research, provide your students with coffee.
Posts by Ritwik Gupta
This behavior has very interesting quirks. LLMs implicitly demonstrate time-discounting over the ICL examples. That is, recent evidence matters more!
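One way to picture this recency effect: a minimal sketch of a "time-discounted" estimator, where older in-context flips are geometrically down-weighted. The function name and the discount factor `gamma` are illustrative assumptions, not something from the paper; `gamma = 1` recovers the plain sample mean.

```python
# Hypothetical sketch: recency-weighted ("time-discounted") estimate of the
# heads probability from a sequence of in-context coin flips.
# gamma < 1 down-weights older evidence, mimicking the recency effect;
# gamma = 1 recovers the ordinary sample mean.
def discounted_heads_estimate(flips, gamma=0.9):
    """flips: list of 0/1 outcomes, oldest first."""
    num = 0.0
    den = 0.0
    for i, flip in enumerate(flips):
        weight = gamma ** (len(flips) - 1 - i)  # newest flip gets weight 1
        num += weight * flip
        den += weight
    return num / den

flips = [1, 1, 1, 0, 0, 0]  # three heads, then three tails
print(discounted_heads_estimate(flips, gamma=1.0))  # plain mean: 0.5
print(discounted_heads_estimate(flips, gamma=0.5))  # recent tails dominate: ~0.11
```

With heavy discounting, the same six flips yield a much lower heads estimate because the recent tails dominate, which is the qualitative pattern a recency-biased learner would show.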
Interestingly, models follow a very similar trajectory to what the true Bayesian posterior should look like given the same amount of evidence! When we prompt for coin flips from a 60% heads-biased coin but give the model evidence that follows 70% heads, models converge to the latter.
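For reference, the Bayesian baseline here is just conjugate Beta-Bernoulli updating. A minimal sketch, assuming an illustrative Beta(6, 4) prior centered on the prompted 60% bias (the specific prior is my assumption, not from the paper): as 70%-heads evidence accumulates, the posterior mean drifts from 0.6 toward 0.7.

```python
# Hypothetical sketch of the Bayesian baseline: a Beta(alpha, beta) prior
# centered on the stated 60% bias, updated with flips that are actually
# 70% heads. Conjugacy makes the posterior mean a one-liner.
def beta_posterior_mean(alpha, beta, heads, tails):
    return (alpha + heads) / (alpha + beta + heads + tails)

# Prior Beta(6, 4) has mean 0.6, matching the prompted bias.
print(beta_posterior_mean(6, 4, 0, 0))    # prior mean: 0.6
# After 100 flips with 70 heads, the evidence dominates the prior:
print(beta_posterior_mean(6, 4, 70, 30))  # ~0.691, pulled toward 0.7
```

This is the trajectory the models roughly track: prior-dominated early, evidence-dominated as more flips land in context.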
Can we control this behavior? We tried many things before settling on in-context learning as a working mechanism. If we prompt an LLM to flip a biased coin, and then show it increasingly long rollouts of flips from that distribution, models converge to the correct underlying parameter.
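The prompting mechanism is simple to sketch. The helper below is a hypothetical illustration (function name, wording, and the 70% true bias are my assumptions): it samples `n_flips` outcomes from the true distribution and packs them into the context before asking for the next flip.

```python
import random

# Hypothetical sketch of building the in-context rollout prompt: a stated
# biased-coin setup plus an increasing number of flips sampled from the
# true distribution, ending with a request for the next flip.
def make_prompt(n_flips, true_p=0.7, seed=0):
    rng = random.Random(seed)  # seeded for reproducibility
    flips = ["H" if rng.random() < true_p else "T" for _ in range(n_flips)]
    return ("You are flipping a biased coin.\n"
            "Previous flips: " + " ".join(flips) + "\n"
            "Next flip:")

print(make_prompt(5))
```

Sweeping `n_flips` upward is what "increasingly long rollouts" means in practice: the longer the evidence string, the closer the model's heads rate gets to `true_p`.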
Biased coin flips follow a simple probability distribution that LLMs should be able to simulate explicitly. In fact, when prompted to flip a fair coin, most LLMs predict heads 70-85% of the time! This holds true even if you prompt the model to flip a biased coin 🪙
Do LLMs understand probability distributions? Can they serve as effective simulators of probability? No!
However, in our latest paper we show that, via in-context learning, LLMs update their broken priors in a manner akin to Bayesian updating.
📝 arxiv.org/abs/2503.04722
Computer Science Seminar Series. Making AI Work in the Crucible: Perception and Reasoning in Chaotic Environments. February 25, 2025, 228 Malone Hall. Refreshments available 10:30 a.m. Seminar begins 10:45 a.m. Ritwik Gupta, University of California, Berkeley.
Seminar with @ritwikgupta.bsky.social coming up! Learn more here: www.cs.jhu.edu/event/cs-sem...
Recent proposals to kill the influence of “Chinese AI” in America will have devastating knock-on effects on American innovation. In this article, I discuss the statelessness of AI and the overly broad nature of Senator Hawley’s proposed legislation.
It is a false premise that America has a lead in AI over China. So many articles have come out recently about DeepSeek threatening our lead. The lead in *meaningful* capabilities has never existed.
The narrative that we have achieved peak data is so absurd to me. Humans are still around. New data is constantly being created. We just have to be more efficient about using it.