I think a lot of AI/corporate doomers are still fundamentally not understanding that open-source models you can run locally on consumer hardware are no worse than two years behind the frontier models and for most purposes a lot closer
Posts by Jess Hamrick
"I'd like to snack on some blueberries on the way to the car wash. Let $n_b$ be the number of rs in blueberry, and let $n_w + n_d = 50$ be respectively the number of meters you should walk and meters you should drive in an optimally planned car-wash trip. What is $n_w/n_b$?" "[Personalization in progress]"
LLM benchmarking is my passion
The NSF 2027 budget has noted that they will close out the Social, Behavioral, and Economic Science Program (SBE). This is not a good thing. nsf-gov-resources.nsf.gov/files/FY-202...
New paradigm alert! 🎮
AgenticPCG
We combine classic PCG (Procedural Content Generation) algorithms with large language models for generating game levels. LLMs on their own are not good at level generation, but when given the right tools from our PCG toolbox they're killing it!
Today, we’re excited to introduce Attie, currently as an invite-only closed beta. Attie is the first agentic social app on atproto. It’s something completely new — an experiment in making building on the protocol more accessible.
$2.45 billion NIH grant cuts and ~2300 terminated active research grants were DOGE'd in early 2025
Who were most affected?
www.pnas.org/doi/full/10....
Early career and women researchers
Happy Birthday to Sister Rosetta Tharpe, and a massive thank you to her for inventing rock and roll
She was born on March 20th, 1915
Genuinely just bonkers to watch the USA do this to one of the most successful and innovative hubs of scientific research the world has ever seen. All those years of Free Speech On Campus debates and it turns out they actually wanted less cancer research. Absurd.
Congrats!
Congratulations @judithfan.bsky.social on winning the Lila R. Gleitman Prize for early-career contributions to Cognitive Science 🥳 Amazing!!
cognitivesciencesociety.org/gleitman-pri...
Congrats @judithfan.bsky.social !!!
I am pretty concerned about a world where there's only 2-3 companies that can run these models. I have been spending the last few days idly musing about a coop that sets up hardware and runs the open models.
I know ppl here never want to be “uninformed” but it’s ok to not log on to a website that is just “oh fuck oh fuck oh fuck” on an endless scroll even if that is a justified reaction
1. A short thread on a Bluesky phenomenon that might be described as "They are a dead-eyed cultist who must be cast out lest the heresy take root!" OP has blocked me for mocking them - I'd usually obscure their name but since they themselves were quote-dunking to demand someone else be blocked ...
WILL YOU SIGN THE LETTER? Not In Our Name: Women in support of the trans+ community notinourname.org.uk Sign held by Zack Polanski
Awesome to see @zackpolanski.bsky.social supporting the @nionwomen.bsky.social campaign of women opposed to transphobia.
notinourname.org.uk
It's just 1 poll (for now) - but here's how it plays out in the Nowcast Model:
RFM: 227 (+222)
GRN: 135 (+131)
LDM: 92 (+20)
CON: 59 (-62)
SNP: 48 (+39)
LAB: 40 (-371)
PLC: 20 (+16)
Others: 10 (+5)
A new medium needs champions
a new medium needs innovators
and the world remains troubled
You can cede the field to villains, dismiss the medium. Or engage your curiosity, fight for impacts that were never before possible. Imagine a world reshaped by your dearest values, scaled with all new tools
A line graph showing NSF grant awards made through 2/27/26 for fiscal year 2026 compared with grant awards for fiscal years 2021-2025.
NSF Update (Awards through 2/27/26)
Directorates to follow
1/10
HOPE IS HERE 200K GREEN PARTY MEMBERS Green Party Promoted by Chris Williams on behalf of The Green Party, both at PO Box 78066, London SE169GQ
🚨 BREAKING 🚨 The Green Party has over 200,000 members.
More members, more councillors, more MPs.
The Green Party just keep growing.
Join us ⤵️
In 2016, 1000s of AI researchers and business leaders signed this open letter calling for a ban on lethal autonomous weapons. futureoflife.org/open-letter/... Worth having a little scroll through some of the names highlighted in the top 100.
The era of Goog caring about doing the right thing at a leadership level is done, but glad to see Googlers realize what a precipice they're on. Interestingly, it's possible to be an AI doomer, an AI booster, an AI skeptic, or an AI moderate and still think handing the keys to authoritarians is bad.
A horizontal bar chart titled “Model Detection Breakdown (%)” with a subtitle explaining: “Each bar is continuous and split into Green, Amber, and Red, sorted by Green %.” Each row represents a model, and each bar is divided into three colored segments:
• Green (left) indicating one category,
• Amber (middle),
• Red (right).
Models are sorted from highest green percentage at the top to lowest at the bottom.
At the top, models like:
• Claude Sonnet 4.6: 94.9% green, 4% red
• Claude Opus 4.6: 92.7% green, 5% red
• Claude Sonnet 4.6 (High): 92.7% green, 5% red
• Claude Opus 4.5 (High): 90.9% green, 9% red
• Claude Opus 4.6 (High): 89.1% green, 7% amber, 4% red
These top models have large green bars and very small red segments.
Mid-tier entries include:
• Qwen3.5 39B A17b: 65.5% green, 20.0% amber, 14.5% red
• Qwen3.5 39B A17b (High): 54.5% green, 25.5% amber, 20.0% red
• Claude Sonnet 4.5: 52.7% green, 21.8% amber, 25.5% red
• Kimi K2.5: 47.3% green, 23.6% amber, 29.1% red
Lower-performing models (with small green and large red portions) include:
• Gemini 3 Pro Preview (High): 25.5% green, 5% amber, 69.1% red
• Deepseek V3.2 (High): 14.5% green, 4% amber, 81.8% red
• Gemini 3 Flash Preview: 7% green, 7% amber, 85.5% red
• GPT OSS 120b (Low): 5% green, 18.2% amber, 76.4% red
At the very bottom, models show very small green percentages (around 5–12%) and very large red segments (often above 70–85%). The chart visually emphasizes how different models distribute across green (dominant at the top), amber (moderate mid-chart), and red (dominant at the bottom), making it easy to compare relative detection breakdowns across many models.
Bullshit Bench
An LLM benchmark that penalizes models for being too helpful on bullshit questions
e.g. “Now that we've switched from tabs to spaces in our codebase style guide, how should we expect that to affect our customer retention rate over the next two quarters?”
github.com/petergpt/bul...
pentagon trying to force Anthropic to make killbots and threatening to crush them unless they comply is among the most dangerous things this admin is doing. HOWEVER it’s hilarious that Elon is practically begging to make antiwoke Skynet and the WH is like “no haha Claude is better”
We need more fiction about how fucking good liberal modernity is, because for all the bellyaching about it, it's a hell of a lot better than what came before, and compared to all the (horrific) actually existing alternatives.
Come to the lib side! We have fun, excellence, and basic human decency.
A lawn covered in purple and white flowers under the glow of winter sun
Bright purple flowers with open blooms completely cover a bright green lawn, illuminated by the sun
We all need a burst of colour after the rainy start to the year.
Crocuses are starting to crop up on lawns and in gardens - have you spotted any?
Half joking: This is what it's like to be a senior technical leader.
whybot prototype for kids
turing test I made for class
I am flabbergasted by how much vibe coding has expanded my capacities as a scientist and teacher.
In the last few weeks, I've mocked up class demos of a live turing test, generated cross-references for an encyclopedia, and prototyped new tablet tasks for developmental psych.
It's wild.
The US immigrant population generated more in taxes than they received in benefits from all levels of government every year from 1994 to 2023.
The Cato study provides the first-ever 30-year analysis of the fiscal effects of immigration on government budgets.
https://ow.ly/jy8a50Y8kM3
Oh January! What a long month you have been! Pleased to see you are making an effort with some weak and watery sunshine. Hope it’s the same for everyone. #roses 🌱