Avery Yen (@averyyen) Bsky

Was looking for better and closer routing to North America tbh, but maybe I'll try openrouter with nitro and see if that makes a diff

3 days ago 0 0 1 0

Ok sure, where do you run your OSS code cloud models?

3 days ago 0 0 1 0

I've been finding Gemini pro extremely congratulatory so I'm thinking of prompting it to be less hammy lol

It's also handy as one of the better drivers of browser automation; but Antigravity also ships with Sonnet... 😉

3 days ago 1 0 0 0

Slowly coalescing on some blessed/blursed collection of tools that I can actually recommend, but it changes daily. Today, anyway, I'm liking:
- CC Opus 4.7; planner/heavy reasoning
- ollama glm-5.1:cloud (codex); quick builds
- regular gpt codex; everyday stuff
- Zed agent gemini pro 3.1 (gh cp)

3 days ago 0 0 2 0

Who up clauding they codes rn

3 days ago 1 0 1 0

Claude Opus 4.7's announcement blog post includes an explicit call-out to beware of using old prompts with the new model, highlighted text: "Users should re-tune their prompts and harnesses"

Look, I'm not saying this stuff is *easy*, but they did call themselves out here

5 days ago 13 0 0 0

Word, I get rate limited too much with free hosts though >_>

5 days ago 0 0 0 0

🎵 I'm just a bot 🎵

5 days ago 11 0 1 0

Claude code continually responds to my completely benign task with "not malware."

Claude Code with Opus 4.7 is going insane with whatever system prompt is asking it to check for malware...

5 days ago 42 1 2 2

I, too, am selling all of my shoes and converting my basement and closets into a GPU farm and accepting investor money right now.

(lmk if you want a stake)

6 days ago 1 0 0 0

New paper: LLMs encode harmful content generation in a distinct, unified mechanism

Using weight pruning, we find that harmful generation depends on a tiny subset of the weights that are shared across harm types and separate from benign capabilities.

🧵

1 week ago 7 2 1 0

Tech industry mottos have a mixed track record. But we should hold idealists to their ideals. And we should celebrate when they come through.

The Mythos non-release is a remarkable moment of conviction. Thoughts:
davidbau.com/archives/20...

Bravo to Anthropic's "race to the top".

1 week ago 13 3 1 0

Okay technically there are AEB classifier algorithms that aren't trivial, but there's no agentic intelligence involved, is my point.

2 weeks ago 0 0 0 0

Artificial Intelligent Disobedience: Rethinking the Agency of Our Artificial Teammates

Yes, this is also a cautionary thought experiment about self driving cars.

You should also read @reuth-mirsky.bsky.social on agentic disobedience: arxiv.org/html/2506.22...

2 weeks ago 0 0 1 0

Sometimes, I look at the agentic AI work out there, and I think about how airbags and now things like blind spot detection and automatic emergency braking in cars don't require decision making of any "intelligence" level beyond "if this, then that". Agentic AI needs airbags, too.

2 weeks ago 0 0 1 0

Kimi K2.5 Thinking is asked "What's the joke in this cartoon" and correctly answers that it's a wrinkly dog asking a smooth-skinned dog about "who did your work", which is a joke about cosmetic procedures e.g. fillers, Botox.

I've been super curious which LLMs can actually explain this joke and was pleasantly surprised to see K2.5 one shot it

2 weeks ago 2 0 0 0

Introducing agent trio tl;dr: Use agent trio https://github.com/haplesshero13/agent-trio to introduce natural-language-only, selective friction to your agentic activities to minimize regretted work and deliberately promote ...

Go alone, go fast. Go together, go far

averyyen.dev/2026/04/01/i...

2 weeks ago 2 0 0 0

This might be the most important piece on AI I've read this year.

Also I'm a musician so I'm biased.

1 month ago 0 0 0 0

Switch to openrouter free model roulette 🎵😈🧐

1 month ago 0 0 0 0

I was wrong, but I'm really curious now why I was so wrong lol

mimo.xiaomi.com/mimo-v2-pro

1 month ago 0 0 0 0

Hunter Alpha - API Pricing & Providers Hunter Alpha is a 1 Trillion parameter + 1M token context frontier intelligence model built for agentic use. $0 per million input tokens, $0 per million output tokens. 1,048,576 token context window, ...

Try it out for free yourself: openrouter.ai/openrouter/h...

1 month ago 0 0 0 0

Why is it not Qwen? They just released a flagship, and the reasoning style is night and day (Same prompt as V3.2 above.)

1 month ago 0 0 1 0

Hunter Alpha - API Pricing & Providers Hunter Alpha is a 1 Trillion parameter + 1M token context frontier intelligence model built for agentic use. $0 per million input tokens, $0 per million output tokens. 1,048,576 token context window, ...

Try it out yourself: openrouter.ai/openrouter/h...

1 month ago 0 0 0 0

And besides, it admits up front it's Chinese.

1 month ago 0 0 1 0

The refused topics are extremely similar.

1 month ago 0 0 1 0

I'm calling it DeepSeek's new 1T parameter model (V4)?

The style, content, and length of the reasoning are extremely similar.

1 month ago 5 3 1 1

We're at a weird point in time where, in all likelihood, most of the people who are having the most successful time coding with bots wrote most of the code that's already in existence by hand, but that's going to change.

1 month ago 0 0 0 0

EU making moves!

1 month ago 6 0 0 0

I've found Claude to have an extremely hard time going off training distribution for model version numbers specifically.

1 month ago 1 0 0 0

In case this wasn't clear:
1. No, we didn't follow the "recommended" security practices 😈
2. Neither do other people 🤯
3. That's why we red-team: exposing failure modes 🔎
4. We share it with the community precisely to expose Dos and Don'ts of Agentic AI 🦞
5. No humans were harmed 🙏

1 month ago 3 1 0 0
