The typo really had me wondering about cat dealerships….
Posts by Kurt Thorn
What are your favorite harnesses? I’ve been wanting to explore outside of Claude Code but not sure where to start
The Anthropic team does lots of helpful posting to X, but unfortunately has not prioritized cross-posting to Bluesky (makes sense, the X algo loves everything AI)
@j4ck.xyz made this tool to mirror those accounts, and takes community requests.
Boris @bcherny-x.bsky.social
Thariq @trq212-x.bsky.social
This video of a Ukrainian Yak-52 crew taking down drones with a shotgun is incredible, but the music is deeply and profoundly inappropriate. There was really only one option. So I've fixed it.
Oh dear
Asking an LLM to manipulate SMILES directly is a lot like asking it to count the number of Rs in strawberry. It doesn’t do it well, but it also doesn’t matter much, because there are tools that do it well.
That said, LLMs are surprisingly good at generating reaction SMARTS for named reactions, but as always, giving them tools so they can check their work vastly improves the results.
For example, our agent can do library enumeration using RDKit, but it goes through an enumeration tool and checks the reaction SMARTS against a test molecule or two before submitting it.
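A rough sketch of what that kind of pre-submission check could look like with RDKit. The function name, the amide-coupling SMARTS, and the test molecules are all illustrative assumptions, not the actual tool the agent uses.

```python
from rdkit import Chem
from rdkit.Chem import AllChem


def check_reaction_smarts(rxn_smarts, test_reactants):
    """Run a reaction SMARTS on known test reactants and return product SMILES.

    Hypothetical helper: an empty list means the SMARTS did not fire on the
    test molecules, which is a signal to fix it before enumerating a library.
    """
    rxn = AllChem.ReactionFromSmarts(rxn_smarts)
    if rxn is None:
        raise ValueError("could not parse the reaction SMARTS")

    mols = [Chem.MolFromSmiles(smi) for smi in test_reactants]
    if any(m is None for m in mols):
        raise ValueError("could not parse a test reactant SMILES")
    if rxn.GetNumReactantTemplates() != len(mols):
        raise ValueError("reactant count does not match the reaction template")

    products = set()
    for product_set in rxn.RunReactants(tuple(mols)):
        for prod in product_set:
            try:
                Chem.SanitizeMol(prod)
            except Exception:
                continue  # skip chemically invalid products
            products.add(Chem.MolToSmiles(prod))
    return sorted(products)


# Illustrative amide coupling: carboxylic acid + primary amine -> amide.
AMIDE_COUPLING = "[C:1](=[O:2])[OX2H1].[NX3;H2:3]>>[C:1](=[O:2])[N:3]"

# Test pair where the reaction should fire: acetic acid + methylamine.
products = check_reaction_smarts(AMIDE_COUPLING, ["CC(=O)O", "CN"])
# Negative control: a methyl ester has no acid OH, so nothing should match.
no_products = check_reaction_smarts(AMIDE_COUPLING, ["CC(=O)OC", "CN"])
print(products, no_products)
```

The point is that the agent only writes the SMARTS; RDKit does the actual molecule manipulation, and a positive plus a negative test case catches most broken patterns cheaply.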
Having built an agent for drug discovery and medchem, I’d say the agent shouldn’t be trying to manipulate molecules directly; it should use tools. But an agent that knows more about chemical ideas could be quite useful.
Bar Shiru?
It would be ironic if an AI hater ran the DDoS because Bluesky uses AI
happy opus 4.7 day for those who celebrate
Interesting chatter about this www.reddit.com/r/math/comme...
Math community is often skeptical, and I’ve never seen them this stunned by a result. And by a public model too.
In France, if you want to build a home above a certain size, you’re legally required to use a licensed architect.
Can you guess what that size is?
Interesting supply chain I did not know about
I'm going to make everyone mad too.
You want your bold new left of center security strategy? Here it is. (The title, not the book itself. That's just me being provocative.)
There is a global backswing against democracy. It is left or right colored in different countries. But it has to lose.
This is some absolutely fantastic work by @gelliottmorris.com!
A weird thing about this is it’s very variable. I haven’t had any issues with Claude lately but a week or so ago it definitely seemed dumb for a day
Following @renice.bsky.social’s prompt advice, I added “Crashing is preferred over silent failures” to CLAUDE.md and it’s helping surface a lot of silent failures in my code base.
🧪 Can LLMs judge chemical synthesis plans?
Excited to share new work from our industrial postdoc Varvara Voinarovska, in collaboration with Samuel Genheden and Mikhail Kabeshov at AstraZeneca's Molecular AI team!
📄 : doi.org/10.26434/che...
Haven’t read the paper yet … will do that later today
Is your eval suite available? I’d like to run it on the latest generation of models as well as some of the new open weights models (GLM-5.1, etc)
this is really fucking good
tbh, it’s pretty rotten to shit on the doorstep like this while I’m trying to treat you like you’re approaching the topic in good faith
read this and tell me that your metrics-declining developer is doing this kind of work — or is capable of it
the lesson to learn is that the skillset has changed
I built an eval suite for tool resolution using my MCP server. It was very helpful in getting it working
The subagent is Nova-2-lite, which is surprisingly good at this and very cheap. It means the main agent never sees the MCP tools that are configured, so there’s little cost to adding a new one.
I have a lot of MCP tools and a specialized subagent that finds the appropriate tool based on a description and input files from the main agent, validates the call, and returns the validated tool call to the agent, or any errors if validation fails.
I think the use case matters. I have an agent running scientific computing and MCP is nice because you can specify schemas and validate against them.
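A minimal sketch of the resolve-then-validate flow described in this thread, in plain Python. Everything here is an assumption for illustration: the tool names, the tiny schema subset, and the keyword-overlap matcher, which stands in for the subagent model and is not how the real resolver picks a tool.

```python
from dataclasses import dataclass

# Tiny subset of JSON-Schema-style type names (illustrative only).
_TYPES = {"string": str, "number": (int, float), "integer": int, "boolean": bool}


@dataclass
class Tool:
    name: str
    description: str
    schema: dict  # {"properties": {...}, "required": [...]}


def validate(args, schema):
    """Return a list of error strings; an empty list means the call is valid."""
    errors = []
    props = schema.get("properties", {})
    for key in schema.get("required", []):
        if key not in args:
            errors.append(f"missing required argument: {key}")
    for key, value in args.items():
        if key not in props:
            errors.append(f"unexpected argument: {key}")
            continue
        type_name = props[key]["type"]
        # bool is a subclass of int in Python, so reject it for numeric fields.
        is_stray_bool = isinstance(value, bool) and type_name != "boolean"
        if is_stray_bool or not isinstance(value, _TYPES[type_name]):
            errors.append(f"argument {key} should be of type {type_name}")
    return errors


def resolve(description, args, tools):
    """Pick a tool by word overlap (placeholder for the subagent), then validate."""
    words = set(description.lower().split())
    best = max(tools, key=lambda t: len(words & set(t.description.lower().split())))
    errors = validate(args, best.schema)
    if errors:
        return {"ok": False, "tool": best.name, "errors": errors}
    return {"ok": True, "tool": best.name, "arguments": args}


# Hypothetical tool registry; the main agent never sees these definitions.
TOOLS = [
    Tool("dock_ligand", "dock a ligand into a protein structure",
         {"properties": {"ligand_file": {"type": "string"},
                         "receptor_file": {"type": "string"},
                         "exhaustiveness": {"type": "integer"}},
          "required": ["ligand_file", "receptor_file"]}),
    Tool("align_sequences", "align protein sequences",
         {"properties": {"fasta_file": {"type": "string"}},
          "required": ["fasta_file"]}),
]

good = resolve("dock this ligand into the receptor structure",
               {"ligand_file": "lig.sdf", "receptor_file": "rec.pdb"}, TOOLS)
bad = resolve("dock ligand", {"ligand_file": 3}, TOOLS)
print(good)
print(bad)
```

Because the main agent only gets back a validated call or a list of errors, adding another tool to the registry costs it no extra context.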