here's a mildly provocative take: every valid LLM worry about societal effects is reducible to some approximation of 'it makes what were once costly things trivially easy to do'
Posts by Shiminsky
What major issues are you seeing? I’m catching enough silly hallucinations that I’m worried about the stuff I’m not catching…
Go max or go back to using a powerful local model!
That said, even at max I find 4.7's hallucinations to be 'sillier' than 4.6's. When it's not making up random crap though, the ceiling on 4.7 is very high.
Kimi is especially strong on my other AI on AI Benchmark (that I need to do a write up about). IMO it's the best deal in town.
The Latest Pelican Bicycle Benchmark result from @simonwillison.net for Opus 4.7 was so shocking that I had to do some follow up experiments. It turns out Opus 4.7 is
....just kinda lazy??
It uses almost no reasoning tokens compared to Qwen, and 40x fewer than Opus 4.6
shimin.io/journal/clau...
And of course, if you really miss the latest Opus models, you can always try politely asking Pi to invoke CC.
I'd been missing my Pi Agent harness since the Anthropic subscription crackdown. Tonight I finally got it back with a local Qwen 3.6 setup, and it feels great to be reunited with my favorite harness!
Thank you @mariozechner.at @mitsuhiko.at for your work on Pi!
@addyosmani.bsky.social's multi-agent experience agrees with my own: going from 2 to 5 parallel agents is a quick way to transform from a judge into a ticket usher.
I've been working on a slice of the problem as well, resisting AI Sycophancy for everyday users.
www.flatterproof.me
Remember when the only fatigue we felt was about too many new JS frameworks? Those were the days.
Great post from @antirez.bsky.social that traces the history of the open source movement and positions 'clean room clones' as an extension, not a revolution.
AI sycophancy will reap many promising careers in the next few years.
In my experience the only major downside is that it uses OpenRouter and gets expensive very quickly.
So I am still querying each model individually, like a peasant.
Been a few years, maybe it's time for a reread (or get the AI outline).
Which resources would you recommend?
Thank you for this (and ignore the orange site doing its orange site things). VSDD is a clear step forward in the direction things are evolving.
Folks speak of the dangers of AI psychosis, but what about the dangers of human psychosis from repeated attempts at convincing LLMs that the Earth is flat?
My wife sent me this and I'm very glad it's after we bought paint for our bathroom remodel project.
Just wanted to thank you for your insightful writing over the years, Jason. Can I preorder the new book yet?
Found an additional graphic that gets even more of these quotes together.
I've kept "I hate myself, I hate clover, and I hate bees" pinned above my desk since I first started studying evolutionary biology as an undergraduate. So relatable to get extremely frustrated with your study system.
I love how stylized it is, this feels like a step up from the previous pelican SVGs
Genuinely very impressed by the SVG of a pelican riding a bicycle I just got out of Google's new Gemini 3 Deep Think model simonwillison.net/2026/Feb/12/...
Yep, I first noticed it when I told AI to update my .gitignore instead of doing it by hand. It feels like an uphill battle against my innate 'laziness is a virtue' programming.