LLM demos that crumble on the way to prod?
The demo-to-prod gap is a major problem. I'm sharing my playbook for closing it:
⚙️ TDD for prompts
🧠 Debugging with thinking tokens
↔️ Knowing when to code vs. prompt
Stop chasing magic. Start engineering:
samsaccone.com/posts/shippi...
Posts by Sam Saccone
what the f.... I am so sorry to hear this dude.
From the trenches of seeing this play out: frustration is building as the volume of “agent” code reviews for quality keeps growing non-linearly. If you are building in this space, be wary of how easy it is to prompt “tell me how to make this better”, and be measured about the credibility you put on the line with the output.
Those on the receiving end of AI-augmented code review have to be skeptical out of the gate, because accepting “reasonable statements” without checking the facts leads to poor outcomes.
Signal for this sentiment shows up in some surveys:
survey.stackoverflow.co/2024/ai/#:~:...
LLM-backed coding helpers speak with the voice of an expert without having the expertise of one.
Pattern matching on well-written statements alone - a reasonable first-level triage until now - becomes a failure mode once you bring AI into code review.
Don’t get me wrong, prompts are important, but they are not the moat of innovation people think they are. Bolt, v0, and Cursor, for example, bring a rich end-to-end experience where others are falling short.
TL;DR: Innovation in the LLM engineering space is gated 100% by UX, not by tech.
Conversely, Sourcegraph + Cody are an example of coming up short on the UX front. A chat sidebar is not a winning approach. They have the knowledge base, the context, and the potential, but the AI UX innovation here is dated out of the gate.
Cursor, for example, is not winning because of its prompting skills. It is winning because of its UX and a seamless experience where even VS Code and Copilot are falling short.
“Leaked system prompts” continue to matter less and less, as shown by more and more companies open-sourcing their “secret recipe”: github.com/stackblitz/b...
What matters is UX, interaction polish, and end-to-end value prop.
appreciate it!
@guo.bsky.social called this the last m̶i̶l̶e̶ 99 mile problem, which rings true. AI-powered software is _and I stress_ 99% software and a pinch of LLM.
www.youtube.com/watch?t=9&v=...
After rolling out 6 large LLM projects for developer productivity across 10k+ engineers, it's clear that while the hype of "just throw an LLM at it" continues, the reality under the hood is a pile of software and a pinch of LLM.
🤤
🐐