Posts by Laurence Aitchison

Our paper on the best way to add error bars to LLM evals is on arXiv! TL;DR: Avoid the Central Limit Theorem -- there are better, simple Bayesian and frequentist methods you should be using instead.

We also provide a super lightweight library: github.com/sambowyer/baye… 🧵👇

1 year ago 25 8 1 0
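The paper's actual recommendations are in the arXiv preprint and the linked library; as a rough illustration of the problem (not the paper's own code), here is a minimal sketch comparing the usual CLT normal-approximation interval with the Wilson score interval, one standard frequentist alternative. The function names and the 98/100 eval score are made up for illustration:

```python
import math

def clt_interval(k, n, z=1.96):
    """Normal-approximation (CLT) interval for accuracy k/n.

    Misbehaves near the boundaries: at k == n it collapses to a
    zero-width interval, claiming perfect certainty.
    """
    p = k / n
    half = z * math.sqrt(p * (1 - p) / n)
    return (p - half, p + half)

def wilson_interval(k, n, z=1.96):
    """Wilson score interval: stays inside [0, 1] and remains
    sensibly wide even when k is 0 or n."""
    p = k / n
    denom = 1 + z**2 / n
    center = (p + z**2 / (2 * n)) / denom
    half = (z / denom) * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2))
    return (center - half, center + half)

# A model scoring 100/100 on a small eval:
print(clt_interval(100, 100))     # zero-width interval at 1.0
print(wilson_interval(100, 100))  # still reports real uncertainty
```

With 100/100 correct, the CLT interval is (1.0, 1.0), while the Wilson interval is roughly (0.963, 1.0): the small sample size still leaves real uncertainty, and only the latter reflects it.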

Go read it on arXiv! Thanks to my co-authors @sambowyer.bsky.social and @laurenceai.bsky.social 💥

📣 Jobs alert

We’re hiring a postdoc and a research engineer to work on uncertainty quantification (UQ) for LLMs!! Details ⬇️

#ai #llm #uq

Do you know what rating you’ll give after reading the intro? Are your confidence scores 4 or higher? Do you skip the rebuttal phase? Are you worried about how it will look if your rating is the only 8 among 3s? This thread is for you.

Would love to be added!

But you can't prove that the *real* asteroid won't hit Earth, because the real world isn't your simplified model: you don't know the exact initial conditions, there may be other bodies you aren't aware of, and so on. (3/3)

The analogy we're working from is "mathematically provable asteroid safety": within a simplified mathematical model, with known initial conditions, you can prove that an asteroid won't hit Earth. (2/3)

Does anyone want to collaborate on an ICML position paper on "The impossibility of mathematically proving AI safety"? The basic thesis: it is a category error to try to prove AI safety in the real world. (1/3)

Can you add?
