"But why don't you do it right from the start?"
Posts by Dragan Stepanović
Ever since I read The Limits to Growth from the MIT System Dynamics group, I haven't really been able to think about anything else
www.youtube.com/watch?v=BoKQ...
Some are easier to observe, like bugs and outages, but some are less tangible and harder to detect, like a decreased ability to reason about the system and anticipate problems before they occur, a shifting ratio of proactive problem detection to reactive mitigation, etc.
An additional problem is the delay between the point in time you start piling up the risks in the codebase and the team's process, and the time it takes for them to visibly materialize.
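That delay can be sketched as a toy stock-and-flow model in the system-dynamics spirit; all the numbers here are invented purely for illustration:

```python
# Toy stock-and-flow sketch: risk is added to the codebase/process
# every step, but it only becomes visible after a delay.
# DELAY and RISK_PER_STEP are made-up illustrative parameters.

DELAY = 10           # steps before accumulated risk starts to surface
RISK_PER_STEP = 1.0  # risk quietly added each step

def simulate(steps: int) -> list[float]:
    """Return the visible (materialized) risk at each step."""
    added = []
    visible = []
    for t in range(steps):
        added.append(RISK_PER_STEP)
        # Only risk piled up more than DELAY steps ago has materialized.
        materialized = sum(added[: max(0, t - DELAY)])
        visible.append(materialized)
    return visible

visible = simulate(15)
# Nothing surfaces for the first DELAY + 1 steps, even though
# risk has been accumulating the whole time.
```

The point of the sketch is only the shape of the curve: a long flat stretch where everything looks fine, then a steadily climbing tail, long after the behavior that caused it.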
>>
...what some of the purposes of that practice were and the benefits you didn't recognize you were getting.
Assumptions you don't know you're making. Unknown unknowns.
The risk with removing parts of your delivery process that you think you don't need anymore because "AI can do it" - such as teasing out the mental model of how the system works from the heads of people who own the system - is that you get to discover, often way too late...
>>
The difference is that on this spectrum of fragmentation of the mental model of the workings of the system, we went from erosion to a complete dissolution of a shared mental model.
I also have a new understanding of why teams doing pair/mob programming worked so well.
Preparing a talk, "Agentic Coding - A Systems Perspective", and along the way realized that cognitive/comprehension debt didn't arrive with the advent of agentic coding. Most teams doing work in isolation (individually) already experienced it heavily.
Let's not forget that LLMs got fed with the ever-decreasing quality of work our industry has been producing over the last 15 years, as a result of cheap money and ever-decreasing central bank interest rates.
And both of these reinforcing loops are accelerating with LLMs dogfooding.
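A reinforcing loop like this can be sketched in a few lines; the loop gain and starting quality below are invented for illustration, not measured:

```python
# Toy reinforcing-loop sketch: each model generation is trained on a
# corpus partly produced by the previous generation ("dogfooding"),
# so corpus quality compounds downward.
# `feedback` (the loop gain) and the starting quality are assumptions.

def dogfood(quality: float, feedback: float, generations: int) -> list[float]:
    """Return corpus quality across generations of a closed loop."""
    history = [quality]
    for _ in range(generations):
        quality *= feedback  # output of one generation feeds the next
        history.append(quality)
    return history

trajectory = dogfood(quality=1.0, feedback=0.9, generations=5)
# With any gain below 1.0, the decline compounds geometrically
# once the loop closes.
```

The sketch shows why "accelerating" is the right word: it's not a one-off drop in quality but a compounding one, because the loop's output becomes its own input.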
Human errors are usually traceable: a wrong assumption, a misread spec, a forgotten edge case. LLM errors often have no recoverable cause. You can't ask "why did you do that?" and get a meaningful answer you can act on.
Humans _have_ a mental model of the system; an LLM doesn't. Humans can distinguish between plausible, possible, probable, and certain. Humans have causal understanding; LLMs don't.
A human (or a team) with a mental model of the workings of the system they've built, formed as they were building it, has very different, and expected(!), failure modes compared to an algorithm that can't distinguish between guessing and being sure about something.
Oh, that's an often raised point, but I think we're conflating different things into one.
Plausibility coming from a human is a very different thing than plausibility coming from an LLM.
Imagine your team doing agentic coding in a codebase serving millions of customers without understanding how it works. You get paged at 2 am, and since no one knows how the system works, you need to ask the LLM to tell you, and it gives you, wait for it... a PLAUSIBLE answer about what might be wrong.
That's all good, but have you also managed to leak your codebase?
Elon Musk had DOGE defund USAID, the largest humanitarian organization in the world, causing HUNDREDS OF THOUSANDS of deaths from starvation and disease - and Trump GLEEFULLY approved it. It’s the single most evil act in US history. Cartoon by Mike Luckovich
One small tweet from a man, one giant cringe from mankind
From what I can tell, the Claude Code source leak by shipping a source map was definitely a skill issue.
In case you're in Berlin at some point, let me know, would be lovely to catch up over a coffee/lunch.
Oh, I wish I was there. Have fun!
It sounded plausible until it wasn't.
It's my own fault for using ChatGPT as a search tool, but this really made me laugh. "...with an important nuance..." is sublime 🤌
"agents write all of our code and humans understanding how the system they own works is a bottleneck to remove".
It's a sign of having to deal with the fallout of a dysfunctional system that most engineers shouldn't have had to deal with in the first place, instead of attempting to normalize it.
"I always operated and debugged systems I didn't understand" is not a badge of honor that justifies everyone else somehow feeling comfortable maintaining and running systems they don't understand anymore because
>>
In short: the tipping point of the effects of poor service availability for most businesses out there is likely far earlier than for Anthropic, and trying to address it just-in-time is likely too little, too late.
Availability that comes as a result of a particular way of developing software needs to be put in a broader context, and your business is, I'd say, very likely far from the same position as those telling you to do the same thing they have the luxury of doing.
But if you use the same approach to building software and you don't have the same privilege of demand for your product that Anthropic does, then availability becomes critically important to your customers' perception of the value they're paying for.
As long as the demand for your product is so disproportionately strong that this level of (un)availability doesn't affect that demand to a level where customers would consider someone else instead of you.
So, here's the thing.
If this, some would say, embarrassing level of availability came as a result of "90% of our code is written by agents, and we (mostly) don't review the code they generate" - fine.
If it hurts, do it more often; but if it really, really, really hurts, and it doesn't have to, then try to prevent it.