Posts by David Jayatillake

Preview
How big should your data team be? Founders and CEOs are wondering if their data function is bloated and if they should replace everyone with AI agents. Data Leaders are scrambling to defend why they need a 15-people data team in a 200...

How big should your data team be?

Data teams are often oversized. A company of 200 people rarely needs 15+ data staff; usually 5% of org size is enough.

dataactionmentor.com/knowledge-ba...

6 months ago 2 1 0 0

Try to find a non-traditional role that is better suited to a future where engineering is very cheap. If you have an idea, try building it yourself. The experience of trying to found something is more valuable than employee experience now, and will be even more so in the coming years.

6 months ago 2 0 0 0

The amount you love someone is proportional to how often you Ghiblify their pictures.

8 months ago 1 0 0 0
Post image

This week I look at agents.

I think this is a new way to build where we don’t intentionally build code-based software.

open.substack.com/pub/davidsj/...

9 months ago 1 0 0 0
Preview
China's biggest public AI drop since DeepSeek, Baidu's open source Ernie, is about to hit the market Chinese internet search giant Baidu will open source its Ernie gen AI large language model as soon as this week, with uncertain consequences for the market.

BERT and ERNIE! 😂

tracking.tldrnewsletter.com/CL0/https:%2...

9 months ago 2 0 0 0
Post image Post image

I don't usually share photos of my family on social media for good reason, but I'm happy to share these ones!

10 months ago 5 0 0 0
Preview
My AI Skeptic Friends Are All Nuts My smartest friends have bananas arguments about LLM coding.

This post encapsulates how I feel about the current state of LLMs and doomers etc. Really great read:

fly.io/blog/youre-a...

10 months ago 0 0 0 0

So when I've attended Snowflake summit before, I've usually written a blog post talking about the new features released, etc. Is someone going to do that this year, given I didn't go? 😊

#datasky #databs

10 months ago 1 0 0 0

It is possible to build machine learning systems which punch up instead of punching down.

10 months ago 688 127 9 3
Video

Got a cool story about something in the data engineering space? You should 💯 submit it as a talk to Current 2025 in New Orleans 😁

Do it! Now! CfP is open until 15th June.

sessionize.com/current-2025...

(Pro-tip: you only need an abstract at this point; writing the talk can be later 😅)

#dataBS

10 months ago 4 1 1 1

This is genuinely one thing you can rely on AI for.

10 months ago 2 0 1 0

It was actually very impressive. Lots of stuff I want to try.

10 months ago 1 0 0 0
Post image Post image Post image Post image

At the London Data Practitioners Meetup with @pedramnavid.com @jayatillake.bsky.social @rittmananalytics.bsky.social and the London Dagster community

11 months ago 2 1 0 0

I also think people don’t use the tags as we have found each other. I almost exclusively use the popular with friends feed.

11 months ago 2 0 0 0

It’s not, but you don’t have to keep declaring CTEs. You may be able to have partial queries too.

11 months ago 3 0 0 0

They're still here, just quieter than at the start. More of them, though.

11 months ago 2 0 1 0
Post image

Doctor’s orders 🫡

11 months ago 4 0 1 0
Preview
Siri’s new boss is already making big internal changes, per report - 9to5Mac Siri’s new boss at Apple, Mike Rockwell, has reportedly wasted no time making big changes internally to the people building its assistant.

I still think this is the biggest prize in AI. If Siri could actually do most things you do on a phone manually...

9to5mac.com/2025/04/22/s...

11 months ago 1 0 0 0

Haha yes but he fits the bill.

11 months ago 0 0 0 0

@petefein.bsky.social

11 months ago 0 0 1 0

I wonder what the limit difference between CSV and Parquet would be under real conditions, where most queries only need a tiny subset of large datasets. You could probably handle >petabyte datasets on that EC2 machine with good partitioning of Parquet or using Iceberg.

11 months ago 3 0 0 0

Well, if it works, the real engineers can tidy it up or, more likely, do nothing and talk about code standards.

11 months ago 1 0 1 0

Has anyone tried Llama 4 Maverick yet? How big a machine does it need to run locally?

@simonwillison.net

1 year ago 0 0 0 0

Looks like Nintendo became the best at console FPS.

1 year ago 0 0 0 0

Oh no! I’ve been enjoying Bluesky for the data stuff, but I can imagine that it’s swung very radically left on other topics.

1 year ago 0 0 0 0

@windsurfai.bsky.social

1 year ago 1 0 0 0

I've seen many blog posts and social posts by these supposed true artisans saying that they tried this method, and the output was subpar.

Well, maybe it would have taken just as long if you had just written the code, but for the rest of us, we now have an option to build without you.

1 year ago 0 0 2 0
Preview
Vibe coder Free like a puppy

Once again, we've devised a derogatory name for something many of us are doing: "Vibe coding".

Just like "Citizen Data Scientist", "Excel Data Analyst", and many other terms made to belittle by the supposed true artisans that came before.

open.substack.com/pub/davidsj/...

1 year ago 0 0 2 0

yeah but was there coffee down there, and if so was it any good?

1 year ago 4 0 1 0