What model are you using underneath?
Posts by Rob Bowley
You may have seen posts claiming METR's new study shows an 18% AI productivity boost. That's not what the article says.
tl;dr GenAI is speeding up individual coding tasks, but measuring it reliably is now v difficult, it's only one part of software delivery, and most teams aren't set up to benefit
Saying they re-ran is misleading - METR started a new experiment, found it too compromised to trust, and are now redesigning their approach entirely.
They suspect AI coding productivity is improving, but suggest they can no longer measure it reliably.
Some thoughts about AI-assisted coding and how it affects my brain. I'm more productive and more exhausted at the same time.
marvinh.dev/blog/ddosing...
An absolutely essential thread from Mark, who very much knows what he is talking about
It will most likely go the same way as Holacracy and crypto governance
The tech is real, but the challenge of aligning thousands of people through ambiguity, politics and accountability has deep regularities that new tech has repeatedly failed to dissolve - and usually just gets absorbed by.
5/n
It's what Holacracy said, what crypto governance said, and what Block is saying now.
4/n
The piece does this almost perfectly. Two thousand years of organisational evolution exist because hierarchy solves a real coordination problem. The argument is: AI finally gives us something capable of replacing it.
3/n
Reinhart and Rogoff's "This Time Is Different" book documents eight centuries of crises all preceded by the same pattern: a genuinely new development is used to justify dismissing what history consistently shows about how these things tend to end.
2/n
Jack Dorsey published a piece arguing Block is replacing organisational hierarchy with AI coordination. It's well written. It's also textbook "this time it's different"
1/n
But if anyone has any examples pls share
Even the article says it's primarily due to AI investments, not AI replacing jobs. I've yet to see or hear of an actual case where an engineering team has directly reduced headcount due to increased dev productivity with AI (and where that wasn't just an excuse to cut costs)
New article where I offer definitions & a mental model for how to think about harness engineering as coding agent users. Building blocks at our disposal, dimensions and goals to consider; emerging practices, open questions; and of course, what role humans play martinfowler.com/articles/har...
CircleCI’s 2026 State of Software Delivery report has two findings that are already travelling: AI is meaningfully boosting software delivery, but only 1 in 20 teams are capturing that benefit. I think both claims are more uncertain than the report suggests, for different reasons.
Thoughtworks write up on the study
They call it a paradox, but it is not a paradox. There is no mystery to it!
Yet another study showing only a small proportion of high performing teams are benefiting from GenAI coding. The rest are seeing a negative impact
If this is at all accurate about Anthropic’s engineering hiring process, it hardly looks like a company that thinks coding is "solved".
New evidence suggesting AI is amplifying the already existing skills gap. Demand for experienced folks ⬆️, junior/entry level 👎
That means harder hiring, salary inflation, slower execution, and more delivery risk. Wrote about this risk only in Feb, further thoughts here:
In spite of all the talk of Claude Code and Codex meaning the end of humans writing code, software job adverts are actually going up, according to @jburnmurdoch.ft.com's crunching of millions of job ads for this week's The AI Shift www.ft.com/content/7325...
For a long time, I felt that because I wasn’t very good at formal education, it probably meant I just wasn’t that capable. It was only when I joined the world of work I realised I had skills and abilities that mattered, they just weren’t the kind education was very good at recognising or rewarding.
ICYMI there is a new ARC-AGI-3 benchmark, which is focused on rewarding novel thinking, and no frontier model scores more than 1%
x.com/i/status/203...
As someone who always struggled with academia and has at least one incredibly bright, but neurodiverse child, it really pisses me off
The main thing I take away from all the AI benchmark tests is how badly we’ve approached measuring human intelligence.
They reward memory, recall and test performance.
Very little about critical or novel thinking, judgement, or navigating messy reality
Probably the grossest misinterpretation of Parkinson's Law is the idea that giving individuals less time to do things/shorter deadlines will speed things up
Actually they (the larger papers) are a collection of essays including Parkinson's Law. The Parkinson's Law article I read
ddata.over-blog.com/2/11/25/66/m...
Yesterday! The paper I read was only 4 pages, however it turns out there are bigger ones!