
Posts by Shiminsky

here's a mildly provocative take: every valid LLM worry about societal effects is reducible to some approximation of 'it makes what were once costly things trivially easy to do'

15 hours ago 282 31 18 13

What major issues are you seeing? I’m catching enough silly hallucinations that I’m worried about the stuff I’m not catching…

1 day ago 0 0 0 0

Go max or go back to using a powerful local model!
That said, even at max I find 4.7's hallucination to be 'sillier' than 4.6. When it's not making up random carp though, the ceiling on 4.7 is very high.

1 day ago 2 0 0 0

Kimi is especially strong on my other AI on AI Benchmark (which I still need to do a write-up about). IMO it's the best deal in town.

1 day ago 1 0 0 0
Claude 4.7 isn't dumb, it's just lazy — Shimin Zhang Some follow up experiments with Claude 4.7 based on Simon Willison's Pelican Benchmark Shocker.

The Latest Pelican Bicycle Benchmark result from @simonwillison.net for Opus 4.7 was so shocking that I had to do some follow up experiments. It turns out Opus 4.7 is

....just kinda lazy??

It uses almost no reasoning tokens compared to Qwen, and 40x fewer than Opus 4.6

shimin.io/journal/clau...

2 days ago 46 3 2 2

And of course, if you really miss the latest Opus models, you can always try politely asking Pi to invoke CC.

2 days ago 0 0 0 0

I'd been missing my Pi Agent harness since the Anthropic subscription crackdown. Tonight I finally got it back with a local Qwen 3.6 setup, and it feels great to be reunited with my favorite harness!

Thank you @mariozechner.at @mitsuhiko.at for your work on Pi!

2 days ago 3 0 1 0
Your parallel Agent limit Running multiple agents in parallel is not just a question of throughput. It is a new kind of cognitive labor that requires managing multiple mental models, ...

@addyosmani.bsky.social's multi-agent experience agrees with my own: going from 2 to 5 parallel agents is a quick way to transform from a judge into a ticket usher.

2 weeks ago 2 1 0 0
Actually, people love to work hard - Anil Dash A blog about making culture. Since 1999.

Another banger from @anildash.com : Actually, people love to work hard anildash.com/2026/04/06/p...

2 weeks ago 24 9 0 0
FlatterProof Gamified training to recognize AI sycophancy

I've been working on a slice of the problem as well: resisting AI sycophancy for everyday users.
www.flatterproof.me

1 month ago 2 0 0 0

Remember when the only fatigue we felt was about too many new JS frameworks? Those were the days.

1 month ago 1 0 0 0

Great post from @antirez.bsky.social that traces the history of the open source movement and positions 'clean room clones' as an extension, not a revolution.

1 month ago 1 0 0 0

AI sycophancy will reap many promising careers in the next few years.

1 month ago 0 0 0 0

In my experience the only major downside is that it uses OpenRouter and gets expensive very quickly.

So I am still querying each model individually, like a peasant.

1 month ago 2 0 0 0
GitHub - karpathy/llm-council: LLM Council works together to answer your hardest questions

Sounds similar to Karpathy's LLM Council github.com/karpathy/llm... but with more snark?

1 month ago 4 0 1 0

Been a few years, maybe it's time for a reread (or get the AI outline).

1 month ago 2 0 0 0

Which resources would you recommend?

1 month ago 1 0 2 0

Thank you for this (and ignore the orange site doing its orange site things). VSDD is a clear step forward in the direction things are evolving.

1 month ago 1 0 1 0

Folks speak of the dangers of AI psychosis, what about the dangers of human psychosis from repeated attempts at convincing LLMs that the Earth is flat?

2 months ago 2 0 0 0

My wife sent me this and I'm very glad it's after we bought paint for our bathroom remodel project.

2 months ago 2 0 0 0

Just wanted to thank you for your insightful writing over the years, Jason. Can I preorder the new book yet?

2 months ago 2 0 1 0

Found an additional graphic that gets even more of these quotes together.

I've kept "I hate myself, I hate clover, and I hate bees" pinned above my desk since I first started studying evolutionary biology as an undergraduate. So relatable to get extremely frustrated with your study system.

2 months ago 4984 1790 76 164

I love how stylized it is, this feels like a step up from the previous pelican SVGs

2 months ago 0 0 0 0
Gemini 3 Deep Think New from Google. They say it's "built to push the frontier of intelligence and solve modern challenges across science, research, and engineering". It drew me a really good SVG of …

Genuinely very impressed by the SVG of a pelican riding a bicycle I just got out of Google's new Gemini 3 Deep Think model simonwillison.net/2026/Feb/12/...

2 months ago 256 21 19 4

Yep, I first noticed it when I told AI to update my .gitignore instead of doing it by hand, feels like an uphill battle against my innate 'laziness is a virtue' programming.

2 months ago 2 0 0 0