There's concrete evidence this time - the preview model has found genuine security vulnerabilities in real-world software which have been accepted and patched
Posts by Simon Willison
Generate an SVG of a NORTH VIRGINIA OPOSSUM ON AN E-SCOOTER came out even better bsky.app/profile/simo...
It has been quite the few weeks for “security work we are not going to tell you details of.”
Here's the transcript gist.github.com/simonw/1864b...
From the comments: /* Earring sparkle */, <!-- Opossum fur gradient -->, <!-- Distant treeline silhouette - Virginia pines -->, <!-- Front paw on handlebar --> ... static.simonwillison.net/static/2026/...
Wrote up some thoughts on Anthropic's Project Glasswing, where their latest Opus-beating model is available to partnered security research organizations only. Given the recent alarm bells raised by credible security voices I think this is a justified decision.
simonwillison.net/2026/Apr/7/p...
I'm a big fan of the pelican GLM-5.1 drew me today, it even animated it! simonwillison.net/2026/Apr/7/g...
The mysteries mostly remain but the last half of season one is a beautiful case study in ramping up the pacing and the tension and the stakes, I loved it
This is excellent, one of my favorite long-form pieces on agentic engineering - includes several non-obvious ways in which leaning on coding agents can catch you out
Yeah my tool should be really good for that - different plugins all use the official APIs so there shouldn't be any difference from using those APIs directly
I've been experimenting with using an "orphan" git branch for that - here's my experiment with that so far github.com/simonw/denob...
I built this one using README-driven development: I hand-crafted a detailed README describing exactly how the tool should work... then dumped that into Claude Code and told it to build it gisthost.github.io?d4b1a398bf3b...
I built a new Python CLI tool for scanning folders for secret strings, useful if you want to share a bunch of log files but first want to check they didn't accidentally leak API keys or similar. Run this command to learn more:
uvx scan-for-secrets --help
Blog: simonwillison.net/2026/Apr/5/s...
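To make the idea concrete, here is a minimal sketch of checking a folder for literal secret strings before sharing it. This is not how scan-for-secrets is implemented, and it does not use its API: the function names find_secrets_in_text and scan_folder are hypothetical, and the approach (plain substring matching, line by line) is an assumption chosen for illustration.

```python
from pathlib import Path


def find_secrets_in_text(text, secrets):
    """Return (line_number, secret_index) for each literal secret found in text."""
    hits = []
    for line_number, line in enumerate(text.splitlines(), start=1):
        for index, secret in enumerate(secrets):
            if secret in line:
                # Report the secret's index, not its value, so the report is safe to share
                hits.append((line_number, index))
    return hits


def scan_folder(root, secrets):
    """Return (path, line_number, secret_index) for every match in files under root."""
    results = []
    for path in sorted(Path(root).rglob("*")):
        if not path.is_file():
            continue
        # Replace undecodable bytes so binary files do not abort the scan
        text = path.read_text(encoding="utf-8", errors="replace")
        for line_number, index in find_secrets_in_text(text, secrets):
            results.append((str(path), line_number, index))
    return results
```

Calling scan_folder("logs", ["sk-example-123"]) would return one tuple per matching line, or an empty list if the folder is clean. The folder name and the key here are made up.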
Started a new tag on my blog to track stories about AI-powered security research, which is very much having a moment right now - 11 posts so far already simonwillison.net/tags/ai-secu...
The fun thing about doing a podcast with a pro like Lenny is that his team can slice the thing up into TikTok-style shorts after the fact, here's 48 seconds of me talking about the cognitive cost of agentic engineering
Have you genuinely never installed random software in a hurry to join a video call?
I think he navigated to the teams link they gave him and it told him to install something in order to join the call
Warning to open source maintainers: the Axios supply chain attack started with some very sophisticated social engineering targeted at one of their developers simonwillison.net/2026/Apr/3/s...
I'm on a 128GB M5 Max MacBook Pro now, but 64GB or even 32GB should be enough for these Gemma models
Buying a laptop to run local models is generally a bad idea because you still won't get performance anywhere near the cheap cloud ones
I was a guest on @lennysan's podcast! We talked about agentic engineering and all sorts of other LLM-related topics for 1h39m(!), plus a little bit about kākāpō parrots - here's my selection of highlights from our conversation simonwillison.net/2026/Apr/2/l...
Two blue circles on a brown rectangle and a weird mess of orange blob and yellow triangle for the pelican
Two black wheels joined by a sort of grey surfboard, the pelican is semicircles and a blue blob floating above it
Bicycle has the right pieces although the frame is wonky. Pelican is genuinely good, has a big triangle beak and a nice curved neck and is clearly a bird that is sitting on the bicycle
Motion blur lines, a mostly great bicycle albeit missing the front part of the frame. Pelican is decent.
Pelicans for Gemma 4 E2B, E4B, 26B-A4B and 31B - the first three generated on my laptop via LM Studio, the 31B was broken on my laptop so I ran it via the Gemini API instead simonwillison.net/2026/Apr/2/g...
Plus the best code is simple, obvious, predictable and as consistent as possible with the rest of the codebase
The Mr. Chatterbox creator has a good explanation of what they were trying to achieve with it here www.estragon.news/mr-chatterbo...
When I published this post I hadn't yet come across Trip's own writeup of the project, which provides way more detail about how he trained the model including SFT (supervised fine tuning) against synthetic chat examples created using Claude Haiku and GPT-4o-mini www.estragon.news/mr-chatterbo...
The energy usage of this particular demo - both training and running the thing - is absolutely trivial
I'd guess less than 20 hours on an A100 GPU to train, which is ~$2 in electricity
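The arithmetic behind a guess like that can be sketched in a few lines. Only the 20 hours comes from the post. The 400 W power draw and the $0.25 per kWh price are assumptions picked for illustration, and the function name training_electricity_cost is hypothetical.

```python
def training_electricity_cost(hours, gpu_watts, usd_per_kwh):
    """Estimate electricity cost in USD for one GPU running at a steady power draw."""
    kwh = hours * gpu_watts / 1000  # watt-hours to kilowatt-hours
    return kwh * usd_per_kwh


# Assumed figures, not from the post: roughly 400 W draw, $0.25 per kWh
cost = training_electricity_cost(hours=20, gpu_watts=400, usd_per_kwh=0.25)
print(f"${cost:.2f}")  # prints $2.00
```

Under those assumptions 20 hours works out to 8 kWh, so the estimate scales linearly if the real power draw or local electricity price differs.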
And there aren't any copyright issues here because the training data is all from 1899 or earlier
www.reddit.com/r/LocalLLaMA/ has a good reputation
I'm not sure that could work because most licenses also include an attribution requirement, and you would have to attribute every one of the million+ authors of the code in the training data
If you have uv installed you can start a conversation (after a 2GB model download) directly like this:
uvx --with llm-mrchatterbox llm chat -m mrchatterbox
Mr. Chatterbox is a really fun project. It's not great to talk to but it's a fun demo of what you can build using entirely out-of-copyright training data
It's a 2GB nanochat model - I released a new llm-mrchatterbox plugin that can run it on my Mac simonwillison.net/2026/Mar/30/...
This post inspired GrantBar, a menu bar app listing NSF and NIH grant opportunities! bsky.app/profile/step...