Towards Data Science (@towardsdatascience.com) Bsky

Hallucinations in LLMs Are Not a Bug in the Data | Towards Data Science It’s a feature of the architecture

If you think better data will eliminate hallucinations, you may be solving the wrong problem. Javier Marin explains why this behavior is rooted in the architecture itself.

12 hours ago

How Visual-Language-Action (VLA) Models Work | Towards Data Science The mathematical foundations of Vision-Language-Action (VLA) models for humanoid robots and more

Curious to learn about the mathematical foundations of Vision-Language-Action (VLA) models? You're in luck — Sam Black just published a lucid and detailed explainer on this emerging topic.

15 hours ago

A Survival Analysis Guide with Python: Using Time-To-Event Models to Forecast Customer Lifetime | Towards Data Science Understand survival analysis by modeling customer retention through Kaplan-Meier curves and Cox Proportional Hazard regressions.

In his new, hands-on deep dive, Gustavo Santos unpacks the steps you need to follow for a robust survival analysis in the context of customer retention.

18 hours ago

A Visual Explanation of Linear Regression | Towards Data Science A long-form article featuring over 100 visualizations, covering how to build a linear regression model, measure its quality, and improve it

For a (truly) one-stop resource on all things linear regression, don't miss Mikhail Sarafanov's clear, accessible, visual guide.

19 hours ago

The Future of AI for Sales Is Diverse and Distributed | Towards Data Science True creativity and innovation will come from human-agent collaboration. One human, millions of agents.

"Today’s AI conversation is dominated by LLMs, and for good reason. They’re incredibly versatile and accessible. But versatility shouldn’t be confused with universality."

Nicolas Maquaire reflects on the promise of human-agent collaboration and the future of AI for sales.

1 day ago

Why AI Is Training on Its Own Garbage (and How to Fix It) | Towards Data Science Deep Web Data Is the Gold We Can't Touch, Yet

"But what if I told you we aren’t actually running out of data? We’ve just been looking in the wrong place."

Sabrine Bendimerad explores the potential of "deep web data" in her latest article.

1 day ago

Detecting Translation Hallucinations with Attention Misalignment | Towards Data Science A low-budget way to get token-level uncertainty estimation for neural machine translations

Join us in welcoming Aleksandr Gapchenko to TDS! His debut article is an insightful deep dive on translation hallucinations (and how to detect them).

1 day ago

How to Use Claude Code to Build a Minimum Viable Product | Towards Data Science Learn how to effectively present product ideas by building MVPs with coding agents

Need a quick and streamlined method for producing minimum viable products? Eivind Kjosbakken explains how he uses Claude Code for that purpose.

1 day ago

Grounding Your LLM: A Practical Guide to RAG for Enterprise Knowledge Bases | Towards Data Science A clear mental model and a practical foundation you can build on

Thorough, accessible, and actionable, Priyansh Bhardwaj's debut TDS article is a hands-on guide to using RAG effectively in the context of enterprise knowledge bases.

1 day ago

How to Effectively Review Claude Code Output | Towards Data Science Get more out of your coding agents by making reviewing more efficient

More output does not mean better outcomes if review cannot keep up. Eivind Kjosbakken shows how to turn agent-generated code into something reliable through smarter review practices.

2 days ago

How to Effectively Review Claude Code Output | Towards Data Science Get more out of your coding agents by making reviewing more efficient

Coding agents can generate more code than you can realistically review. Eivind Kjosbakken explains how to structure your review process to keep quality high without slowing everything down.

2 days ago

Democratizing Marketing Mix Models (MMM) with Open Source and Gen AI | Towards Data Science A practical system design combining open-source Bayesian MMM and GenAI for transparent, vendor independent marketing analytics insights.

For a clear introduction to modern marketing mix models, don't miss Shakti Kothari's debut TDS article, which leverages the power of open source tools and generative AI.

2 days ago

From 4 Weeks to 45 Minutes: Designing a Document Extraction System for 4,700+ PDFs | Towards Data Science How a hybrid PyMuPDF + GPT-4 Vision pipeline replaced £8,000 in manual engineering effort, and why the latest models weren’t the answer

We're thrilled to welcome Obinna Iheanachor to TDS! Click below to explore his comprehensive walkthrough on designing a streamlined, rapid document-extraction system.

2 days ago

Context Engineering for AI Agents: A Deep Dive | Towards Data Science How to optimize context, a precious finite resource for AI agents

New to context engineering and need a clear explainer on the fundamental concepts of this emerging field? Clara Chong is here to help with an accessible guide.

2 days ago

The Arithmetic of Productivity Boosts: Why Does a “40% Increase in Productivity” Never Actually Work? | Towards Data Science Why do grand productivity promises never actually deliver? Is every product just bad, or is there something else hiding in the numbers?

Why do grand productivity promises never actually deliver when it comes to data science and ML tools?

Eirik Berge digs into the frequent gap between sales pitches and on-the-ground results.

3 days ago

Bayesian Thinking for People Who Hated Statistics | Towards Data Science You already think like a Bayesian. Your stats class just taught the formula before the intuition. Here's a 5-step framework to apply it at work.

Bayesian thinking is less about formulas and more about updating beliefs with evidence. Kaushik Rajan breaks it down into something far more practical than what most classrooms deliver.

3 days ago

Bayesian Thinking for People Who Hated Statistics | Towards Data Science You already think like a Bayesian. Your stats class just taught the formula before the intuition. Here's a 5-step framework to apply it at work.

Maybe statistics was never the problem. Kaushik Rajan reframes Bayesian thinking in a way that actually aligns with how people reason in the real world.

3 days ago

Behavior is the New Credential | Towards Data Science We are living through a paradigm shift in how we prove we are who we say we are online. Instead of asking What do you know? (password, PIN, mother’s maiden name) or What do you look like? (Face ID,…

"Behavioral biometrics analysis is now becoming standard practice at banks, which are liable for covering losses from cybercrimes unless the security measures they put in place meet the challenges of these new attack surfaces."

🖋️ by Brandon Janes

towardsdatascience.com/behavior-is-...

3 days ago

A Data Scientist’s Take on the $599 MacBook Neo | Towards Data Science Why it doesn’t fit my workflow but still makes sense for beginners

Is the new MacBook Neo a strong fit for data professionals? Benjamin Nweke shares a nuanced analysis based on his own testing and workflows.

3 days ago

Building a Python Workflow That Catches Bugs Before Production | Towards Data Science Using modern tooling to identify defects earlier in the software lifecycle.

Wouldn't it be great to catch bugs before they slip into production code? @taupirho.bsky.social shows how you can build a Python workflow to achieve precisely that.

4 days ago

How to Run Claude Code Agents in Parallel | Towards Data Science Learn how to apply coding agents in parallel to work more efficiently

In this article, Eivind Kjosbakken shows how to apply coding agents in parallel to work more efficiently:

towardsdatascience.com/how-to-run-c...

4 days ago

The Geometry Behind the Dot Product: Unit Vectors, Projections, and Intuition | Towards Data Science The geometric foundations you need to understand the dot product

The geometric foundations you need to understand the dot product

By Amit Shreiber

towardsdatascience.com/the-geometry...

4 days ago

Building Robust Credit Scoring Models with Python | Towards Data Science A Practical Guide to Measuring Relationships between Variables for Feature Selection in Credit Scoring.

A Practical Guide to Measuring Relationships between Variables for Feature Selection in Credit Scoring. By JUNIOR JUMBONG.

towardsdatascience.com/building-rob...

4 days ago

Proxy-Pointer RAG: Achieving Vectorless Accuracy at Vector RAG Scale and Cost | Towards Data Science A new way to build vector RAG—structure-aware and reasoning-capable

A new way to build vector RAG: structure-aware and reasoning-capable

🖋️ by Partha Sarkar

towardsdatascience.com/proxy-pointe...

4 days ago

How Can A Model 10,000× Smaller Outsmart ChatGPT? | Towards Data Science Why thinking longer can matter more than being bigger

"What if actual intelligence isn’t related to the size of the model, but instead, how long you let it reason? Can a tiny network, given the freedom to reiterate on its own solution, outsmart a model thousands of times bigger than itself?"

By Moulik Gupta

towardsdatascience.com/how-can-a-mo...

5 days ago

Building a Personal AI Agent in a couple of Hours | Towards Data Science I’ve been so surprised by how fast individual builders can now ship real and useful prototypes. Tools like Claude Code, Google AntiGravity, and the growing ecosystem around them have crossed a…

"We are living in a Brave New World and prototyping just got fun"

Read more from Ivo Bernardo's post:

towardsdatascience.com/building-a-p...

6 days ago

How to Make Claude Code Better at One-Shotting Implementations | Towards Data Science Make your coding agent more efficient

In this article, Eivind Kjosbakken discusses how to make Claude Code better at one-shotting the implementations that you want to build:

towardsdatascience.com/how-to-make-...

6 days ago

How to Make Your AI App Faster and More Interactive with Response Streaming | Towards Data Science In my latest posts, we’ve talked a lot about prompt caching as well as caching in general, and how it can improve your AI app in terms of cost and latency. However, even for a fully optimized AI app,…

Learn how HTTP streaming, SSE, and WebSockets can improve user experience by delivering AI responses token by token, without waiting for full outputs.

By Maria Mouschoutzi

towardsdatascience.com/how-to-make-...

6 days ago

What the Bits-over-Random Metric Changed in How I Think About RAG and Agents | Towards Data Science Why retrieval that looks excellent on paper can still behave like noise in real RAG and agent workflows

Why retrieval that looks excellent on paper can still behave like noise in real RAG and agent workflows

🖋️ by Sean Moran

towardsdatascience.com/what-the-bit...

6 days ago

Get ready for COLLIDE 2026 on Oct 1 for a day of thought leadership and strategic insights focused on the "how" behind enterprise AI. This event brings together the architects of the modern AI enterprise to share their blueprints for success.

datasciconnect.com/events/collide

1 week ago
