If you think better data will eliminate hallucinations, you may be solving the wrong problem. Javier Marin explains why this behavior is rooted in the architecture itself.
Posts by Towards Data Science
Curious to learn about the mathematical foundations of Vision-Language-Action (VLA) models? You're in luck — Sam Black just published a lucid and detailed explainer on this emerging topic.
In his new, hands-on deep dive, Gustavo Santos unpacks the steps you need to follow for a robust survival analysis in the context of customer retention.
For a (truly) one-stop resource on all things linear regression, don't miss Mikhail Sarafanov's clear, accessible, visual guide.
"Today’s AI conversation is dominated by LLMs, and for good reason. They’re incredibly versatile and accessible. But versatility shouldn’t be confused with universality."
Nicolas Maquaire reflects on the promise of human-agent collaboration and the future of AI for sales.
"But what if I told you we aren’t actually running out of data? We’ve just been looking in the wrong place."
Sabrine Bendimerad explores the potential of "deep web data" in her latest article.
Join us in welcoming Aleksandr Gapchenko to TDS! His debut article is an insightful deep dive into translation hallucinations (and how to detect them).
Need a quick and streamlined method for producing minimum viable products? Eivind Kjosbakken explains how he uses Claude Code for that purpose.
Thorough, accessible, and actionable, Priyansh Bhardwaj's debut TDS article is a hands-on guide to using RAG effectively in the context of enterprise knowledge bases.
More output does not mean better outcomes if review cannot keep up. Eivind Kjosbakken shows how to turn agent-generated code into something reliable through smarter review practices.
Coding agents can generate more code than you can realistically review. Eivind Kjosbakken explains how to structure your review process to keep quality high without slowing everything down.
For a clear introduction to modern marketing mix models, don't miss Shakti Kothari's debut TDS article, which leverages the power of open source tools and generative AI.
We're thrilled to welcome Obinna Iheanachor to TDS! Click below to explore his comprehensive walkthrough on designing a streamlined, rapid document-extraction system.
New to context engineering and need a clear explainer on the fundamental concepts of this emerging field? Clara Chong is here to help with an accessible guide.
Why do grand productivity promises never actually deliver when it comes to data science and ML tools?
Eirik Berge digs into the frequent gap between sales pitches and on-the-ground results.
Bayesian thinking is less about formulas and more about updating beliefs with evidence. Kaushik Rajan breaks it down into something far more practical than what most classrooms deliver.
Maybe statistics was never the problem. Kaushik Rajan reframes Bayesian thinking in a way that actually aligns with how people reason in the real world.
"Behavioral biometrics analysis is now becoming standard practice at banks, which are liable for covering losses from cybercrimes unless the security measures they put in place meet the challenges of these new attack surfaces."
🖋️ by Brandon Janes
towardsdatascience.com/behavior-is-...
Is the new MacBook Neo a strong fit for data professionals? Benjamin Nweke shares a nuanced analysis based on his own testing and workflows.
Wouldn't it be great to catch bugs before they slip into production code? @taupirho.bsky.social shows how you can build a Python workflow to achieve precisely that.
In this article, Eivind Kjosbakken shows how to apply coding agents in parallel to work more efficiently:
towardsdatascience.com/how-to-run-c...
The geometric foundations you need to understand the dot product
By Amit Shreiber
towardsdatascience.com/the-geometry...
A Practical Guide to Measuring Relationships Between Variables for Feature Selection in Credit Scoring. By Junior Jumbong.
towardsdatascience.com/building-rob...
A new way to build vector RAG: structure-aware and reasoning-capable
🖋️ by Partha Sarkar
towardsdatascience.com/proxy-pointe...
"What if actual intelligence isn’t related to the size of the model, but instead, how long you let it reason? Can a tiny network, given the freedom to reiterate on its own solution, outsmart a model thousands of times bigger than itself?"
By Moulik Gupta
towardsdatascience.com/how-can-a-mo...
"We are living in a Brave New World and prototyping just got fun"
Read more from Ivo Bernardo's post:
towardsdatascience.com/building-a-p...
In this article, Eivind Kjosbakken discusses how to make Claude Code better at one-shotting the implementations that you want to build:
towardsdatascience.com/how-to-make-...
Learn how HTTP streaming, SSE, and WebSockets can improve user experience by delivering AI responses token by token, without waiting for full outputs.
By Maria Mouschoutzi
towardsdatascience.com/how-to-make-...
Why retrieval that looks excellent on paper can still behave like noise in real RAG and agent workflows
🖋️ by Sean Moran
towardsdatascience.com/what-the-bit...
Get ready for COLLIDE 2026 on Oct 1 for a day of thought leadership and strategic insights focused on the "how" behind enterprise AI. This event brings together the architects of the modern AI enterprise to share their blueprints for success.
datasciconnect.com/events/collide