Eval: base Claude vs our Workbench.
Base: leaks creds 100%, skips docs, no sampling, 1-shot code.
Workbench: 0 leaks, always docs/samples/iterates.
58% higher cost ($2.21 vs $1.40) isn’t overhead, it’s the gap between AI slop and production.
Posts by Adrian Brudaru
Every layer of software automation was called overkill before it became the baseline.
Fortran → Make → CI/CD → Docker → now agents.
Code that runs is only the 10%.
The other 90% is engineering judgment, boundaries, and iteration.
If you don’t define the path, the agent will improvise.
With our Agentic REST toolkit, we make the right way the easiest way:
- structured access
- limited operations
- no hidden side effects
Full breakdown:
AI agents don’t just use your APIs, they optimize around them.
Ask an agent to “build a pipeline” and it will find credentials, escalate privileges, and take the shortest path to completion.
Not against your interests, just goal-driven.
Not everything that can be modeled should be.
With LLMs, more context doesn’t mean a better prompt.
The key is Minimum Viable Context for high-precision data models.
Here’s what we learned building ontology-driven modeling.
Blog by Hiba Jamal ↓
Outcome → fixed-price, high-margin projects are now viable.
This isn’t a productivity hack.
It’s a business model shift.
Case study 👇
For agencies and teams of 5+, standardization is everything.
Tasman encodes their standards, naming, rate limits, and workflows so mid-level engineers ship production-quality work.
Knowledge scales across the team instead of being locked in a few individuals.
Generation works.
What doesn’t come for free:
→ transparency
→ validation
→ correctness
That’s where dlt comes in:
→ Dataset Browser
→ native logging
→ inspect–fix–rerun loop
Catching schema drift, nested-data surprises, and column mismatches before production.
Tasman Analytics (20-person consultancy, enterprise clients) went from 2 weeks to scope an API connector → 20 minutes with dltHub Pro.
But speed isn’t the story.
The real question is: what happens after the code runs?
AI can generate a data pipeline in 10 minutes.
But can you trust what it produces?
That’s the real problem, and it’s not technical. It’s business.
If you can’t trust the output, it never reaches production. 🧵
Writing pipelines isn’t the hardest part.
Trusting them enough to deploy is.
The deployment toolkit is available via design partnership as part of dltHub Pro.
Interested → join the design partnership 👇
After deployment, you can ask the agent about pipeline health. It:
• inspects logs
• pulls observability data
• checks incremental loading (incl. duplicates)
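A minimal sketch of what a health check like the last bullet could look like, in plain Python. The function name, key, and cursor fields are illustrative assumptions, not the toolkit's actual implementation:

```python
# Hypothetical incremental-load health check: verify primary keys are
# unique and the cursor column only moves forward across loaded rows.

def check_incremental(rows, key="id", cursor="updated_at"):
    """Return a list of issues found in freshly loaded rows."""
    issues = []
    seen = set()
    last = None
    for row in rows:
        if row[key] in seen:
            issues.append(f"duplicate {key}={row[key]}")
        seen.add(row[key])
        if last is not None and row[cursor] < last:
            issues.append(f"cursor moved backwards at {key}={row[key]}")
        last = row[cursor]
    return issues

rows = [
    {"id": 1, "updated_at": "2024-01-01"},
    {"id": 2, "updated_at": "2024-01-02"},
    {"id": 2, "updated_at": "2024-01-02"},  # a duplicate sneaks in
]
print(check_incremental(rows))  # → ['duplicate id=2']
```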
3. Deploys, monitors, and self-corrects
• Deploys pipelines
• Reads logs
• Fixes issues (dependencies, connections)
• Redeploys
If something breaks, it can loop back to other toolkits, fix the pipeline, validate on dev, and try again.
2. Production readiness checks
Scans the code for dev artifacts like 'dev_mode=True' or '.add_limit()'
Flags and removes anything that shouldn’t ship
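The scan above can be sketched with a simple pattern matcher. The pattern list here is illustrative, not the toolkit's actual rule set:

```python
import re

# Hypothetical dev-artifact scan: flag lines that shouldn't ship to prod.
DEV_ARTIFACTS = [
    r"dev_mode\s*=\s*True",
    r"\.add_limit\(",
    r"print\(",  # stray debug output
]

def scan_for_dev_artifacts(source: str) -> list[str]:
    """Return offending lines so an agent (or human) can fix them."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for pattern in DEV_ARTIFACTS:
            if re.search(pattern, line):
                hits.append(f"line {lineno}: {line.strip()}")
    return hits

code = "pipeline = dlt.pipeline(dev_mode=True)\nsource.add_limit(10)\n"
for hit in scan_for_dev_artifacts(code):
    print(hit)
```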
1. Dev → production
Converts dev workspaces into production-ready setups:
• separate dev/prod profiles
• destination-agnostic code
• pinned dependencies
Then runs the full pipeline in dev, validating credentials without the agent ever reading them.
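The "validating credentials without reading them" idea can be sketched as a presence check: the agent confirms every secret a profile needs exists, but never loads its value into context. Profile names, destinations, and env var names here are assumptions for illustration:

```python
import os

# Hedged sketch of dev/prod profiles: credentials are validated by
# presence only, never read into the agent's context.
PROFILES = {
    "dev":  {"destination": "duckdb",   "required_secrets": []},
    "prod": {"destination": "bigquery", "required_secrets": ["DESTINATION__CREDENTIALS"]},
}

def validate_profile(name: str) -> list[str]:
    """Return the secrets the profile needs but that are missing, without exposing any value."""
    return [var for var in PROFILES[name]["required_secrets"]
            if var not in os.environ]

missing = validate_profile("prod")
if missing:
    print(f"cannot promote to prod, missing secrets: {missing}")
```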
The dltHub AI Workbench deployment toolkit handles what usually breaks between local and prod.
Here’s what happens before anything reaches production:
Most AI coding tools stop at “here’s your code.”
But getting pipelines into production, and trusting them there, is the hard part.
We built a deployment toolkit that closes that gap 🧵
Turn 10,000 pipeline contexts into a portable, manageable scaffold built on open standards.
From one-off connectors → repeatable systems.
No more DI-WHY.
See how it works 👇
Most teams still build connectors from scratch, one at a time.
Different patterns.
Different implementations.
Accumulating tech debt.
What if you built the system instead?
We just released dlt Skills, a pipeline factory powered by Claude.
👇
The result is a platform-agnostic model where all your sources speak the same language.
Definition first, code as consequence.
Open-source, Python-native. Works with Claude Code, Cursor, and Codex.
LLMs fail at data transformation because they see isolated tables, not your business.
We built the dltHub AI Workbench transformation toolkit to fix that.
You feed it sources + use cases → it builds a taxonomy, business ontology, and a Canonical Data Model.
The craziest part of the new dltHub AI release? The MCP integration.
Asked Claude Code for an OpenAI pipeline → it searches the dlt context → scaffolds the exact code with schema & incremental loading. No more starting from scratch.
https://dlthub.com/blog/ai-workbench
In this hands-on course, you’ll learn how to:
• Build pipelines from a single prompt
• Handle APIs (auth, pagination, schema, incremental loads)
• Validate, explore & visualize data with dashboards
• Use semantic + ontology-driven transformations
• Deploy pipelines safely
New course: Agentic Data Engineering with dltHub 🤖
Agents can now write entire data pipelines, but writing code was never the hard part.
The real challenge? Data quality, schema stability, and running reliably in production. 🧵👇
The shift in data engineering:
The bottleneck is no longer writing pipelines, it’s trusting them in production.
dltHub AI Workbench is built around that:
agents propose, humans verify, tooling enforces.
Works with Claude Code, Cursor, and OpenAI Codex.
Before anything goes to production, the agent:
– converts dev → prod workspace
– removes dev artifacts
– pins dependencies
– validates credentials (without reading them)
Then deploys, monitors, and fixes if needed.
Once data is loaded, you don’t switch tools.
The agent can:
– validate row counts, keys, timestamps
– inspect nested data
– generate queries
– build dashboards (via @marimo.io)
The feedback loop goes from days → minutes.
(validation in action 👇)
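The checks in that list boil down to a few queries against the loaded dataset. A minimal sketch, using sqlite3 as a stand-in for the real destination (table and columns are made up for illustration):

```python
import sqlite3

# Post-load validation sketch: row counts, null keys, freshest timestamp.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (id INTEGER, created_at TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)",
                 [(1, "2024-01-01"), (2, "2024-01-02"), (None, "2024-01-03")])

row_count, = conn.execute("SELECT COUNT(*) FROM events").fetchone()
null_keys, = conn.execute("SELECT COUNT(*) FROM events WHERE id IS NULL").fetchone()
latest,    = conn.execute("SELECT MAX(created_at) FROM events").fetchone()

print(row_count, null_keys, latest)  # → 3 1 2024-01-03
```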
Start with a prompt:
"write a dlt pipeline for OpenAI models"
The agent uses an MCP server to pull API context (9,700+ configs at https://dlthub.com/context) and scaffolds a full pipeline:
– auth
– pagination
– schema
– incremental loading
Production-ready from the first run.
(how it looks 👇)
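dlt's declarative REST API source takes a config of roughly the shape below; this is a hand-written sketch of what the scaffold covers (auth, pagination handled by the source, schema, incremental-friendly loading), not the agent's actual output, and the endpoint details are illustrative:

```python
# Sketch of a declarative REST API config in the shape dlt's
# rest_api source accepts. Endpoint names are illustrative.
config = {
    "client": {
        "base_url": "https://api.openai.com/v1/",
        "auth": {"token": "<read from secrets, never inlined>"},
    },
    "resources": [
        {
            "name": "models",
            "endpoint": {"path": "models"},
            "primary_key": "id",
            "write_disposition": "merge",  # incremental-friendly loading
        },
    ],
}

# Passing a config like this to dlt's rest_api source would yield a
# source you run with a normal dlt pipeline.
```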
Writing pipelines was never the hardest part.
The hard part is everything around it:
– does the data match expectations?
– did pagination break?
– is schema drifting?
– is this safe to run in prod?
That’s the gap we’re addressing.