Tuesday's Data Debug lightning talks are up on YouTube!
Context management for AI agents (Claire Gouze), AI in daily data science workflows (Kasia Rachuta) & building self-improving AI skills (me).
Playlist: www.youtube.com/playlist?lis...
Data Debug SF's March Meetup is tomorrow! All 3 lightning talks are about AI this month:
context engineering for analytics agents
integrating AI into daily data science work
building & evaluating AI skills (I'm giving this one)
join us here: luma.com/lo8ogbub
Tomorrow 9am PT: CL is joining Bauplan live to show how AI can safely modify data pipelines without wrecking production.
Branch-level isolation + Recce's review agent catching issues before merge.
Free & online. luma.com/mm3gsalo?tk=...
Every bad join Claude writes becomes a rule in the skills file. Every ignored existing model becomes a convention. The skills get better every time.
The next run will be tighter because of everything I caught on this one.
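One way that loop can be mechanized (a minimal sketch; the file name `learned_skills.md` and the rule format are my own invention, not Claude Code's actual skills format):

```python
from pathlib import Path

SKILLS_FILE = Path("learned_skills.md")  # hypothetical skills file

def add_rule(rule: str) -> bool:
    """Append a learned rule unless it's already recorded."""
    existing = SKILLS_FILE.read_text() if SKILLS_FILE.exists() else ""
    if rule in existing:
        return False  # already captured on a previous run
    with SKILLS_FILE.open("a") as f:
        f.write(f"- {rule}\n")
    return True

# every bad join becomes a rule; every ignored model becomes a convention
add_rule("Use LEFT JOIN when the right side may be missing rows; never drop rows silently.")
add_rule("Check for an existing staging model before creating a new one.")
```

The dedup check is what makes the file converge instead of bloat: a rule only gets written once, no matter how many runs trip over the same mistake.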
I let Claude Code build dbt models from raw production data. It made certain tables incremental without being told. Smart inference from the data pattern.
It also silently dropped rows on edge cases via inner joins. The decisions that matter are still yours.
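The failure mode in miniature (a sketch with sqlite3 and toy tables, not the actual models):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (id INTEGER, org_id INTEGER);
    INSERT INTO orders VALUES (1, 10), (2, 20), (3, NULL);  -- edge case
    CREATE TABLE orgs (org_id INTEGER, name TEXT);
    INSERT INTO orgs VALUES (10, 'acme'), (20, 'globex');
""")

# INNER JOIN silently drops order 3 -- the edge case just vanishes
inner = conn.execute(
    "SELECT COUNT(*) FROM orders JOIN orgs USING (org_id)"
).fetchone()[0]

# LEFT JOIN keeps all orders, so the gap stays visible and reviewable
left = conn.execute(
    "SELECT COUNT(*) FROM orders LEFT JOIN orgs USING (org_id)"
).fetchone()[0]

print(inner, left)  # 2 3: one row gone, no warning anywhere
```

Both queries are valid SQL and both pass tests on clean data; only the row counts reveal that the inner join made a data quality decision on its own.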
Happy hour tonight, Nashville tomorrow. CL is speaking at DataTune on Saturday. He did 288 benchmark trials on building data agents that don't break your pipeline. I'll be at the Recce booth. Come find us!
"We are in the age of unstructured data and people are not using it enough."
Not everything you need is in Snowflake. Bryan built a dataset spanning PDFs, logs, & structured data to prove it.
From our Data Renegades lightning round with Bryan Bischof: youtu.be/-s6Xh5sCdLs
I've never thought about architecture or the why behind things as upfront as I have when trying to get an LLM to do the work I want it to do.
Please do! And let me know what you think! It's incredible to hear how others use Recce.
Also, anything you've learned from the MCP configs & context set-up? I've been trying different context-management techniques so I don't have to "redo" anything between sessions.
Any skills or lessons learned from fixing the infra that you took forward? I've been trying to code up every bump I hit so Claude doesn't repeat it next time.
New Data Renegades is up. CL & I talked with Wes McKinney about building pandas, radical accountability for software, & why data infrastructure might be the last AI-resistant frontier.
His book changed my career. This one was personal. Wherever you get your podcasts.
Wes McKinney told CL & me he's not sure anyone will read technical books in a year or two. This from the person whose book changed my career.
New Data Renegades tomorrow. This one covers a lot of ground.
New blog post. How I went from four separate Claude chats with manually pasted prompts to persistent skills that improve every session. Covers the podcast workflow, the dbt side, & why business context matters more than conventions.
doriwilson.com/blog/your-ai...
Listen to the full Data Renegades episode with Bryan Bischof wherever you get your favorite podcasts.
Bryan's worst production bug: too many backpacks.
Stitch Fix recommender gave someone three backpacks. System built to prevent duplicates. They lived in a weird part of latent space, close to everything.
Same bug bit him twice.
I spent more time building skills & MCP configs for Claude Code than watching it generate dbt models. The setup is the work, not the prompt.
AI-assisted analytics engineering is an infrastructure problem.
Wrote about it on the Recce blog. blog.reccehq.com/i-let-claude...
Claude Code filtered out rows with missing org_ids instead of flagging a potential production bug.
An AI made a data quality decision that should have been a human decision. And it didn't flag it. It just handled it.
Read about it here: blog.reccehq.com/i-let-claude...
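The human-in-the-loop version of that decision can be cheap to enforce (a sketch; the function name and threshold are mine, not from the post):

```python
def check_org_ids(rows, max_null_share=0.0):
    """Flag missing org_ids instead of silently filtering them.

    Raises when the share of NULL org_ids exceeds the threshold,
    so a human decides whether it's an edge case or a production bug.
    """
    nulls = sum(1 for r in rows if r.get("org_id") is None)
    share = nulls / len(rows) if rows else 0.0
    if share > max_null_share:
        raise ValueError(
            f"{nulls}/{len(rows)} rows missing org_id "
            f"({share:.1%}) -- review before filtering"
        )
    return rows

rows = [{"org_id": 10}, {"org_id": None}, {"org_id": 20}]
try:
    check_org_ids(rows)
except ValueError as e:
    print(e)  # surfaces the decision instead of handling it quietly
```

The point isn't the guard itself, it's where the default lands: fail loudly and make filtering an explicit choice, rather than letting "handle the edge case" be something the agent does on the way past.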
"Saying you are wrong is not curious. Saying why are your priors different than what the data is showing is curious."
Bryan on how to work with people who resist uncomfortable data.
Listen to more on Data Renegades with Bryan Bischof.
youtu.be/-s6Xh5sCdLs
"What was GTM engineer before Clay decided to make a name for it? Well, that was a data engineer."
Same work. New labels.
From our Data Renegades chat with Bryan Bischof.
Listen to the full episode here: youtu.be/-s6Xh5sCdLs
"BigQuery UI feels like someone designed it to punish me. Snowflake was like, that's cute. Hold our database query."
Anyone who's used these UIs felt this in their soul.
More on Data Renegades with Bryan Bischof.
youtu.be/-s6Xh5sCdLs
Next Data Debug SF is Tues 3/24, some speaker slots are still open! DM me if you're interested in speaking. Otherwise you can RSVP here: luma.com/lo8ogbub
Enterprise AI POCs failed their year-end reviews. The fix everyone landed on: context. A context graph is a knowledge graph subset optimized for AI. The hard part: AI-generated content becomes the new context. The loop closes & now you're dealing with drift. Watch: youtu.be/cgPw4SSl4Ew
Duck Lake stores lakehouse metadata in a relational database instead of scattered metadata files. That's the whole design. Building a native implementation against the spec meant reading DuckDB's source code because the docs & the code didn't agree. Watch here: youtu.be/VtvjyMKYPEA
Good one at Data Debug SF this week. Three talks: DuckLake without DuckDB, the builder stack for open source tooling, & what a context graph actually is. Same through-line from 3 directions: building on moving targets. Summary 🧵
Ask the sycophantic data scientist ready for a promotion how growth looks. They'll tell you it's great.
Ask an AI agent the same question. It'll tell you what you want to hear.
Trustworthy answers are a different problem.
From our Data Renegades chat with Bryan Bischof.
"What's the biggest predictive feature for coffee recommendations?"
"Your favorite salad dressing."
Bryan built Blue Bottle's coffee recommender. The coffee team's domain knowledge made the model work, not his ML intuition.
Listen to the full ep:
Happy Valentine's Day, data folks. This week I ran the Data Valentine Challenge: 5 companies, 5 data problems. Three things kept surfacing: breaks happen in the handoffs between tools, "just in case" infrastructure is the enemy, & AI works when constraints are tight.
blog.reccehq.com/data-valenti...
last day of data valentines tomorrow: bauplan's "Let AI Build Your Pipelines Without Breaking Your Heart (or Production)"
register for tomorrow here: riverside.com/webinar/regi...
they deleted everything without a downstream dependency & regenerated dbt docs. lineage went from chaos to clean. "the best code is the code you don't write. or in this case, the code you delete."
full replay here: youtu.be/2snf_AY94-A
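Finding those deletion candidates can be sketched from dbt's manifest (`target/manifest.json` maps each node to its direct children in `child_map`; the project data here is a toy stand-in, and leaves are candidates for review, not automatic deletion, since final marts are also leaves):

```python
# shape of dbt's target/manifest.json child_map: node -> direct children
manifest = {
    "child_map": {
        "model.proj.stg_orders": ["model.proj.orders"],
        "model.proj.orders": [],            # final mart: leaf, but keep
        "model.proj.old_cross_join": [],    # nothing depends on this
    }
}

def leaf_models(manifest):
    """Models with no downstream dependencies -- review candidates."""
    return sorted(
        node for node, children in manifest["child_map"].items()
        if node.startswith("model.") and not children
    )

print(leaf_models(manifest))
```

Filtering the leaf list against exposures or query logs is what separates "dead code" from "the mart everyone actually queries."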
data valentines day 4: database tycoon did a dbt makeover show. Chloe pulled up the lineage & immediately: "she's trying her best, but this is giving overwhelmed & overworked." cross joins that nothing uses, orphan dimensions, 7 models off one source with 3 leading nowhere