congrats!!!
Posts by jessica dai
so close! that's standard error ❤️
i'm still pissed about this, like the difference is literally too small to have been distinguishable with SWE-bench (500 samples) lmaoooo
hey wasn't this the same company that made a beautiful shiny "research" post about how AI evals should include error bars or something like that? or did they decide the CLT didn't apply here
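the standard-error point above can be sketched numerically — here's a minimal back-of-envelope, assuming a binary pass/fail benchmark with 500 tasks and hypothetical pass rates (the actual scores aren't in the thread):

```python
import math

# Hypothetical pass rates for two models on a 500-task benchmark
# (illustrative numbers, NOT the actual scores from the post).
n = 500
p_a, p_b = 0.72, 0.74  # fraction of tasks solved by each model

# Standard error of each estimated pass rate (binomial approximation)
se_a = math.sqrt(p_a * (1 - p_a) / n)
se_b = math.sqrt(p_b * (1 - p_b) / n)

# Standard error of the difference, assuming independent evaluations
se_diff = math.sqrt(se_a**2 + se_b**2)

diff = p_b - p_a
z = diff / se_diff
print(f"difference: {diff:.3f}, SE of difference: {se_diff:.3f}, z = {z:.2f}")
# a 2-point gap on 500 samples gives z well under 1, i.e. the observed
# difference is smaller than the noise in the estimate itself
```

with ~2.8% standard error on the difference, a 2-point gap is statistically indistinguishable from zero — which is the whole complaint.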
I will be at ICML in a few weeks & would love to chat about how to make this real - I am a critic at heart and also hate self-promo so that’s how you know I really believe in this 🥲
various ways to read more 😀
blog post: argmin.net/p/individual...
position paper: arxiv.org/abs/2506.18133
fairness-oriented instantiation: arxiv.org/abs/2502.08166
& many thanks to brilliant collaborators
@rajiinio.bsky.social @irenetrampoline.bsky.social @beenwrekt.bsky.social & paula gradu !!
lots of other stuff I won’t get into rn (e.g., I think this is a prereq to any serious attempt at “democratic” AI!), and there’s also a ton of open research questions (stats, econ/ml, empirical methods, hci, …)
the core concept is individual reporting as a means to build collective knowledge. if one person has a bad experience, that doesn’t necessarily mean that there’s something wrong with the system — but if lots of people start reporting similar things, maybe we should pay attention.
we’ve already seen this informally with the chatgpt sycophancy debacle — a few days of twitter virality resulted in action and statements from openai — but what other, subtler, patterns are happening? what could we discover if we had better ways to listen to the public?
individual reporting for post-deployment evals — a little manifesto (& new preprints!)
tldr: end users have unique insights about how deployed systems are failing; we should figure out how to translate their experiences into formal evaluations of those systems.
right but one would hope that the date of doom _does_ get further away as safety research improves
bsky.app/profile/jess...
help ..
where are the bullshit "x% of experts believe" polls when you need them lol
well probably, but i wanna know how folks who do believe in that happening think about the field
or is it a secret third thing idk. scared to ask this on Real Twitter but genuinely curious how people think about the role of this field
like is it that the field has been ineffective (studied the wrong problems, advocated for the wrong positions, etc) or is it that every step of safety progress has been matched by 2 steps of capabilities progress (in which case, what are the best examples of safety work concretely reducing harm?)
perhaps this is a stupid question but given that ai safety has been a pretty vibrant (+ well funded) field for the last 5-10 years... how should we be thinking about the concern that (ai) catastrophe still is, allegedly, imminent
in middle school we were asked to write a short story in the style of edgar allan poe. as you might expect, all of our little pieces (even, especially, the ones the students thought were "good") were hilariously bad. anyway, i had forgotten about that homework until now
x.com/sama/status/...
back on bluesky to be mean about ai discourse
im ngl i think this kinda just means u are stupid
i don't work well under deadline pressure but i also don't work well without it. therefore,
... didn't we just talk about this ...
ill read it
The plan: Post your dissertation abstract online to rekindle a decades-long controversy about the utility of the humanities, turning your paper into the most-read publication in the history of your field
were you born yesterday
wait is that your house lmaooo
where
i just know these people would have been the biggest fans of japanese internment