Posts by Kurt Thorn

The typo really had me wondering about cat dealerships….

12 hours ago 2 0 1 0

What are your favorite harnesses? I’ve been wanting to explore outside of Claude Code but not sure where to start

1 day ago 0 0 0 0

The Anthropic team does lots of helpful posting on X, but unfortunately has not prioritized Bluesky cross-posting (makes sense, the X algo loves everything AI).

@j4ck.xyz made this tool to mirror those accounts, and takes community requests.

Boris @bcherny-x.bsky.social
Thariq @trq212-x.bsky.social

4 days ago 2 2 1 0

This video of a Ukrainian Yak-52 crew taking down drones with a shotgun is incredible, but the music is deeply and profoundly inappropriate. There was really only one option. So I've fixed it.

4 days ago 722 211 49 47

Oh dear

5 days ago 0 0 0 0

Asking an LLM to manipulate SMILES directly is a lot like asking it to count the number of Rs in strawberry. It doesn’t do it well, but it also doesn’t matter much, because there are tools that do it well.

5 days ago 1 0 1 0

That said, LLMs are surprisingly good at generating reaction SMARTS for named reactions, but as always, giving them tools so they can check their work vastly improves the results.

5 days ago 0 0 1 0

For example, our agent can do library enumeration using RDKit, but it uses an enumeration tool and checks the reaction SMARTS against a test molecule or two before submitting it.

5 days ago 0 0 1 0

Having built an agent for drug discovery and medchem, I think the agent shouldn’t be trying to manipulate molecules directly; it should use tools. But an agent that knows more about chemical ideas could be quite useful.

5 days ago 1 0 1 0

Bar Shiru?

5 days ago 0 0 1 0

It would be ironic if an AI hater ran the DDoS because Bluesky uses AI

5 days ago 3 1 1 0

happy opus 4.7 day for those who celebrate

6 days ago 71 4 0 1

Interesting chatter about this www.reddit.com/r/math/comme...

The math community is often skeptical, and I’ve never seen them this stunned by a result. And by a public model too.

6 days ago 173 23 3 8

In France, if you want to build a home above a certain size, you’re legally required to use a licensed architect.

Can you guess what that size is?

6 days ago 578 97 13 17

Interesting supply chain I did not know about

1 week ago 0 0 0 0

I'm going to make everyone mad too.

You want your bold new left of center security strategy? Here it is. (The title, not the book itself. That's just me being provocative.)

There is a global backswing against democracy. It is left or right colored in different countries. But it has to lose.

1 week ago 145 26 4 2

This is some absolutely fantastic work by @gelliottmorris.com!

1 week ago 85 20 1 0

A weird thing about this is it’s very variable. I haven’t had any issues with Claude lately but a week or so ago it definitely seemed dumb for a day

1 week ago 1 0 1 0

Following @renice.bsky.social’s prompt advice, I added “Crashing is preferred over silent failures” to CLAUDE.md and it’s helping surface a lot of silent failures in my code base.

1 week ago 12 0 2 0
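
To make the “Crashing is preferred over silent failures” rule concrete, here is a minimal Python sketch of the pattern it tends to flush out. The function names and the config-parsing scenario are hypothetical, chosen only for illustration; they are not from the post or from any real code base.

```python
import json


def load_config_silent(text: str) -> dict:
    """Anti-pattern: swallow the parse error and hand back a default."""
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        # The caller cannot tell "empty config" from "broken config".
        return {}


def load_config_strict(text: str) -> dict:
    """What the rule asks for: let the parse error propagate."""
    # json.loads raises json.JSONDecodeError on malformed input.
    return json.loads(text)
```

With malformed input, the first version quietly returns an empty dict and the bug shows up somewhere far away; the second stops at the line that actually failed.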
Do humans and large language models agree on the quality of synthesis plans? | ChemRxiv Large language models (LLMs) have seen a widespread adoption in all spheres of science including chemistry and cheminformatics. Nevertheless, our knowledge of how they operate is limited, giving rise to exploration of their capabilities in different areas ...

🧪 Can LLMs judge chemical synthesis plans?

Excited to share new work from our industrial postdoc Varvara Voinarovska, in collaboration with Samuel Genheden and Mikhail Kabeshov at AstraZeneca's Molecular AI team!

📄 : doi.org/10.26434/che...

1 week ago 13 3 2 0

Haven’t read the paper yet … will do that later today

1 week ago 0 0 0 0

Is your eval suite available? I’d like to run it on the latest generation of models as well as some of the new open-weights models (GLM-5.1, etc.)

1 week ago 0 0 1 0

this is really fucking good

1 week ago 25 3 3 0
Szekelygulyas (Sauerkraut Goulash) (Published 1999)

I have always liked this: cooking.nytimes.com/recipes/7451...

1 week ago 0 0 0 0
Harness engineering: leveraging Codex in an agent-first world By Ryan Lopopolo, Member of the Technical Staff

tbh, it’s pretty rotten to shit on the doorstep like this while I’m trying to treat you like you’re approaching the topic in good faith

read this and tell me that your metrics-declining developer is doing this kind of work — or is capable of it

the lesson to learn is that the skillset has changed

1 week ago 17 1 1 2
A Breakdown of Amy's User Prompt

new post! A breakdown of Amy's user prompt: tobert.github.io/post/2026-04...

1 week ago 31 3 3 1

I built an eval suite for tool resolution using my MCP server. It was very helpful in getting it working.

1 week ago 1 0 0 0

The subagent is Nova-2-lite, which is surprisingly good at this and very cheap. It means the main agent never sees the MCP tools that are configured, so there’s little cost to adding a new one.

1 week ago 2 0 1 0

I have a lot of MCP tools and a specialized subagent that finds the appropriate tool based on a description and input files from the main agent, validates the call, and returns the validated tool call to the agent, or any errors if validation fails.

1 week ago 2 0 2 0
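
A rough sketch of the validate-then-return step described in the post above, under loud assumptions: the tool names, the registry layout, and the `Resolution` type are all hypothetical, and the tool-selection step done by the subagent is left out entirely. Real MCP tools describe their inputs with JSON Schema; this uses a simplified name-to-type map so the example stays standard-library only.

```python
from dataclasses import dataclass, field

# Hypothetical registry: tool name -> required argument names and types.
TOOLS = {
    "enumerate_library": {"reaction_smarts": str, "reactant_files": list},
    "dock_ligands": {"receptor_file": str, "ligand_file": str},
}


@dataclass
class Resolution:
    """What the subagent hands back: the call, plus any validation errors."""
    tool: str
    arguments: dict
    errors: list = field(default_factory=list)

    @property
    def ok(self) -> bool:
        return not self.errors


def validate_call(tool: str, arguments: dict) -> Resolution:
    """Check a proposed tool call against the registry before returning it."""
    schema = TOOLS.get(tool)
    if schema is None:
        return Resolution(tool, arguments, [f"unknown tool: {tool}"])
    errors = []
    for name, expected in schema.items():
        if name not in arguments:
            errors.append(f"missing argument: {name}")
        elif not isinstance(arguments[name], expected):
            errors.append(f"{name}: expected {expected.__name__}")
    for name in arguments:
        if name not in schema:
            errors.append(f"unexpected argument: {name}")
    return Resolution(tool, arguments, errors)
```

The main agent only ever sees a `Resolution`: either a call it can run as-is, or a list of errors it can use to retry, which is why adding another tool to the registry costs it nothing in context.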

I think the use case matters. I have an agent running scientific computing, and MCP is nice because you can specify schemas and validate against them.

1 week ago 2 0 1 0