In March 2026, an Anthropic employee released the source code of Claude Code, a wrapper around the company's large language model that is widely used to generate code in programming tasks. Its thousands of lines of TypeScript contained many hopeful prompts and incantations meant to shape Claude's behaviour. Here are some examples: "Report outcomes faithfully"; "Never characterize incomplete or broken work as done"; "Be careful not to introduce security vulnerabilities" (prompts.ts in Anthropic 2026). There is more than a passing resemblance here to the Azande witch-doctor apprentice who, while stirring the medicine, utters: "You medicine which I am cooking, mind you always speak the truth to me. Do not let anyone injure me with his witchcraft, but let me recognize all witches. … Let me be expert at the witch-doctor's craft so that people will give me many spears on account of my magic." (Evans-Pritchard 1937: 93). In the case of Claude, the incantations appeared insufficient: analysis of the codebase, which according to a company executive was "pretty much 100% written by Claude Code", revealed severe security vulnerabilities (Townsend 2026).
I was supposed to finish this by March 31, and then the #Claude Code leak happened, handing me the perfect opening example.
Some of it has been in the works for longer: it's also a version of (part of) my #DHd2025 keynote, "What makes LLMs so irresistible?"
Read it here: doi.org/10.5281/zeno...