Posts by Alex Strick van Linschoten
But once we mapped out what versioning and provenance would look like across two systems, the seams started showing. Sarah Wooders' framing (which Harrison also quotes) captures why: managing memory is a core responsibility of the harness, not a peripheral one.
We considered integrating with Mem0, Letta, and the other dedicated memory providers. We learned a lot from reading their code and their philosophies; there's real diversity in how this space thinks about the problem.
3. Provenance is automatic. Because memory and artifacts share a backend, you don't have to stitch the audit trail back together across systems. (And we offer a full audit log in case you need that for your memories.)
2. Scopes match how agents actually work. Namespace for repo conventions, flow for per-agent learned state, execution for per-run progress. No cramming everything into one global blob.
1. Versioning comes free. Every memory.set() creates a new version. Soft deletes leave tombstones. You can ask "which run taught the agent this?" and get an actual answer. (And since memory ships through our MCP server, you can ask Claude Code or Codex that question directly.)
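The version-per-write pattern behind these points is easy to picture as code. A minimal Python sketch, assuming nothing about Kitaru's actual internals: only memory.set() appears in the post, and every class, field, and method name here is hypothetical.

```python
from dataclasses import dataclass, field
from typing import Any, Optional

@dataclass
class Version:
    value: Any
    run_id: str            # which execution wrote this version
    deleted: bool = False  # soft delete: a tombstone, not an erasure

@dataclass
class MemoryStore:
    _log: dict = field(default_factory=dict)  # key -> list[Version]

    def set(self, key: str, value: Any, run_id: str) -> int:
        """Every write appends a new version; nothing is overwritten."""
        versions = self._log.setdefault(key, [])
        versions.append(Version(value, run_id))
        return len(versions)  # 1-based version number

    def delete(self, key: str, run_id: str) -> None:
        """Deletes append a tombstone, so history stays queryable."""
        self._log.setdefault(key, []).append(Version(None, run_id, deleted=True))

    def get(self, key: str) -> Optional[Any]:
        versions = self._log.get(key, [])
        if not versions or versions[-1].deleted:
            return None
        return versions[-1].value

    def provenance(self, key: str) -> list[str]:
        """Answers 'which run taught the agent this?'"""
        return [v.run_id for v in self._log.get(key, [])]
```

Because writes and deletes all land in the same log, the audit trail is a query over that log rather than a reconstruction across systems.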
Three things fell out of putting memory in the same substrate that already handles execution durability:
"Your Harness, Your Memory" by Harrison Chase argues that memory belongs inside your agent harness, not behind a third-party API. We've been building exactly that, and Kitaru 0.4.0 shipped it this morning.
kitaru.ai/blog/kitaru...
The skills are conservative: they flag what they're unsure about rather than guessing. Works with Claude Code, Cursor, Codex, or any coding agent.
Open-source and free. Feedback welcome!
github.com/zenml-io/sk...
[Image: a table outlining migration paths from various ML/data platforms to ZenML, with core translations and special notes for each source.]
We just shipped migration skills that move you off 11 ML/data platforms and onto ZenML: Airflow, Argo, AzureML, Dagster, Databricks, Flyte, Kedro, Metaflow, Prefect, SageMaker, Vertex AI.
Each has hand-curated concept maps baked in, showing what maps 1:1 and what needs redesign.
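To make "concept map" concrete: here's one plausible shape for such a map as plain data. This is a guess, not the actual skill file format, and the Airflow rows are illustrative rather than exhaustive.

```python
# Hypothetical concept-map entries for one source platform.
AIRFLOW_TO_ZENML = [
    # (source concept, ZenML concept, fidelity, note)
    ("DAG",           "pipeline",     "1:1",      ""),
    ("task/operator", "step",         "1:1",      ""),
    ("XCom",          "step outputs", "1:1",      "outputs are versioned artifacts"),
    ("schedule",      "schedule",     "redesign", "triggering model differs"),
    ("sensor",        "n/a",          "redesign", "no direct equivalent; poll outside the pipeline"),
]

def lossy_entries(concept_map):
    """Surface exactly where a migration can't be 1:1."""
    return [(src, note) for src, _dst, fidelity, note in concept_map
            if fidelity != "1:1"]

for src, note in lossy_entries(AIRFLOW_TO_ZENML):
    print(f"needs redesign: {src} ({note})")
```

Keeping the map as data is what lets a skill be transparent about lossiness: anything flagged "redesign" gets surfaced instead of silently translated.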
Try it out. It’s pretty transparent when there’s lossiness involved.
The kind of project I enjoy just steadily plodding away at — one format at a time.
github.com/strickvl/pa...
v0.5.0: split-aware YOLO reading + conversion explainability
v0.6.0: Five new adapters (LabelMe, CreateML, KITTI, VIA JSON, RetinaNet CSV)
13 supported formats now with full read, write, and auto-detection. Single binary, no Python deps.
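Auto-detection across that many formats usually comes down to structural fingerprints. panlabel itself is Rust and its real detector isn't shown here; this Python sketch, with hypothetical function names and deliberately simplified rules, just illustrates the idea.

```python
import json

def sniff_obj(data) -> str:
    """Classify already-parsed annotation data by its structural fingerprint."""
    if isinstance(data, dict):
        if {"images", "annotations", "categories"} <= data.keys():
            return "coco"      # COCO's three top-level arrays
        if "_via_img_metadata" in data:
            return "via"       # VIA project files carry _via_* keys
    if isinstance(data, list) and data and isinstance(data[0], dict) \
            and "annotations" in data[0]:
        return "createml"      # CreateML: a list of {image, annotations} records
    return "unknown"

def sniff_file(path: str) -> str:
    if path.endswith(".txt"):
        return "yolo"          # YOLO ships plain-text label files
    with open(path) as f:
        return sniff_obj(json.load(f))
```

A real detector has to break ties (several formats are "a list of dicts") and degrade gracefully, which is where an explicit "unknown" beats guessing.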
I've been building panlabel — a fast Rust CLI that converts between dataset annotation formats — and I'm a few releases behind on sharing updates.
v0.3.0: Hugging Face ImageFolder support
v0.4.0: auto-detection UX overhaul + Docker
Our designer said it best: "It feels much nicer and powerful to work on the website now, and also flexible to make new layouts and whatever ideas that come to our minds without the Webflow restrictions."
One of those reviews caught 7 schema issues that would've broken everything downstream.
Best part is what we can do now that we couldn't before — blog posts through git, a searchable LLMOps database with real filtering, preview URLs for every PR.
The thing that made it reliable: using different models for different parts of the project. ChatGPT Deep Research for the upfront architecture decisions, Claude Code for building, and RepoPrompt to get Codex to review Claude's work at phase boundaries.
Last month I migrated our ZenML website from Webflow to Astro in a week during a Claude Code / Cerebras hackathon. 2,224 pages, 20 CMS collections, 2,397 images. The site you see now is the result.
Didn't win the hackathon but got a production website out of it, so I'll take that trade.
Full roadmap and install instructions in the repo. If you work with annotated datasets and have hit similar pain points, would be curious to hear what formats or features would be most useful.
Not going to change the world, but it might save someone a few hours of debugging coordinate transforms or prevent silent data corruption between tools.
There are a ridiculous number of object detection formats out there, and each one has its own quirks about how it handles bounding boxes, coordinates, or class mappings. I'm working through them slowly, format by format.
→ Convert between annotation formats (focusing on object detection first, but segmentation and classification coming soon)
→ Validate your datasets
→ Generate statistics
→ Semantic diff between dataset versions
→ Create random or stratified subsets
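Those format quirks are mostly coordinate conventions. As an illustration, in Python rather than panlabel's Rust, and ignoring wrinkles like Pascal VOC's historically 1-based pixel indices, the three most common box encodings convert like this:

```python
def voc_to_coco(xmin, ymin, xmax, ymax):
    """Pascal VOC corner box -> COCO [x, y, width, height]."""
    return [xmin, ymin, xmax - xmin, ymax - ymin]

def coco_to_yolo(x, y, w, h, img_w, img_h):
    """COCO top-left box -> YOLO normalized [cx, cy, w, h]."""
    return [(x + w / 2) / img_w, (y + h / 2) / img_h, w / img_w, h / img_h]

# 100x100 image, box covering the left half
assert voc_to_coco(0, 0, 50, 100) == [0, 0, 50, 100]
assert coco_to_yolo(0, 0, 50, 100, 100, 100) == [0.25, 0.5, 0.5, 1.0]
```

Each hop looks trivial, but mixing up corner vs. center, absolute vs. normalized, or inclusive vs. exclusive bounds is exactly the class of silent bug a converter has to guard against.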
The origin story is pretty mundane: I hit one too many bounding box bugs caused by format inconsistencies and decided someone should just build a Pandoc equivalent for annotation data.
What it does:
[Image: the Panlabel README, including installation instructions for various platforms.]
panlabel 0.2 is out. It's a CLI tool (and Rust library) for converting between different dataset annotation formats. Now also available via Homebrew.
The reasoning isn't strong enough for gnarly bugs, but the speed makes it useful for a different class of task. Still early days figuring out where it fits.
Are you using Codex Spark? Has it carved out a specific role in your workflow, or is it just another option you reach for occasionally?
I'm developing a mental filter for it. Docs updates after a code change? Spark's fine. First pass at demo code? Sure. Scanning docs and suggesting rewrites based on a PR? Worth trying. Complex debugging? Not there yet.
When regular Codex disappears for 30 minutes on high reasoning mode, you learn to run multiple tasks in parallel and context-switch between them. Spark doesn't need that pattern. The speed drops the friction enough that I'm less precious about what I delegate.
My main tools are still Codex 5.3 on high reasoning or Opus 4.6 (usually through @RepoPrompt), but Spark is fast enough that it makes me rethink what's worth handing off.