Posts by DrivenData

Excited to be speaking at the Good Tech Summit in DC on April 7: www.goodtechtogether.org/summit

We’ll share a program focused on K-12 education and talk about investing in the foundations of AI: data, models, and benchmarks. We'll explore how these shape AI development in a field. Join us!

2 weeks ago

On Top of Pasketti: Children’s Speech Recognition Challenge
Develop cutting-edge ASR algorithms specifically for children's speech to advance early education assessments and teaching tools.

The performance gap in children’s ASR is real — and solvable!

With 1 week until the April 6 deadline, we’re inviting the global ML community to help close it.

Compete for $120K and contribute to better speech technology for kids.
kidsasr.drivendata.org

2 weeks ago

Children’s speech remains one of ASR’s toughest challenges — and the leaderboard is moving!

3 weeks left to compete for $120K in the On Top of Pasketti Challenge.

Submit your model:
kidsasr.drivendata.org

1 month ago

10 years ago, "data science for social good" was just an idea. Today, it's a global movement.

Our 10-Year Impact Report reflects on a decade of responsible, real-world AI.

Take a look back with us: s3.amazonaws.com/drivendata-p...

1 month ago

Improving Automatic Speech Recognition for Kids - On Top of Pasketti Word-Track Benchmark
Learn how to train a model to transcribe child speech for the On Top of Pasketti Challenge (Word Track)

Competing in the On Top of Pasketti Word Track? We've published a reference tutorial walking through how to build a model for children's speech recognition — covering data exploration, model training, and submission packaging.

Get started: drivendata.co/blog/child-a...

1 month ago

From Bogotá to Almaty, our community continues to impress!

IGCPHARMA built prize-winning early dementia prediction models in PREPARE.
Kirill Brodt has 23 competitions, 8 top-10s, and 6 top-3 finishes.

Two new Community Spotlights:
drivendata.co/blog/communi...
drivendata.co/blog/communi...

1 month ago

Automatic speech recognition struggles with children’s speech. That gap matters!

The $120K Children’s Speech Recognition Challenge is driving progress toward models that truly understand kids.

Join the breakthrough! Submit your solution by April 6th.
kidsasr.drivendata.org

1 month ago

Bringing small water bodies into view: Sentinel-2 satellite monitoring of harmful algal blooms (HABs)
CyFi enhances modern HAB monitoring programs by extending their reach and informing field-based components.

A machine learning competition for NASA sparked something bigger.

CyFi (cyanobacteria finder) is now an open-source tool using Sentinel-2 data to monitor harmful algal blooms worldwide, with local validation underway.

How we got here: drivendata.co/blog/cyfi-sm...

1 month ago

AI Agents in Data Science Competitions: Lessons from the Leaderboard
How good are AI agents at data science? Here's what we've learned from initial experiments about what works, what doesn't, and what the future might hold.

What happens when AI agents enter ML competitions?

Spoiler: Humans 1, Agents 0.25–0.93.

The top of the leaderboard — where “good” becomes “great” — still looks very human.

How might that change?

See the results:
drivendata.co/blog/ai-agen...

1 month ago

We’re launching our first benchmark competition.

The SNOMED CT Entity Linking Benchmark evaluates how well models structure clinical notes using the SNOMED CT nomenclature for a large, de-identified dataset of annotated records.

Help set the baseline: www.drivendata.org/benchmarks/3...

1 month ago

Building data pipelines is deceptively complex: cold starts, unstable inputs, shifting requirements, and delivery trade-offs create friction at every step.

To ease the pain, we examine five recurring challenges and suggest practical improvements: drivendata.co/blog/pipelin...

1 month ago

The State of Machine Learning Competitions
More than 390 machine learning competitions took place in 2025, across 30+ platforms, with a total prize pool of over $16m. These competitions included multi-million-dollar government-funded…

ML competitions moved fast in 2025 - from 512-GPU training runs to benchmark-style challenges.

The new State of Machine Learning Competitions report explores what winning teams used.

We're grateful to be part of such a dynamic ML community.

Read:
mlcontests.com/state-of-mac...

1 month ago

The gap between public data and usable data is the “last-mile data problem.”

We’ve all seen it: confusing CSVs, messy schemas, opaque data dictionaries.

We’re testing a “baked data” approach to solve it, with promising results.

See our recipe:
drivendata.co/blog/last-mi...

1 month ago

We gave AI agents 24 hours on the leaderboard.

Claude Code (Opus 4.5) and Codex (GPT 5.2) produced dozens of submissions. Some hit the top 20. Others overfit or plateaued.

We identified 9 obstacles and 6 open questions.

Do they match your experience?
drivendata.co/blog/ai-agen...

2 months ago

Voice-based tools have the potential to support learning, accessibility, and early literacy, but only if they work for children. In the $120k On Top of Pasketti Children’s Speech Recognition Challenge, solvers will build ASR systems that understand kids. kidsasr.drivendata.org

2 months ago

SciPy Conference 2026 (@scipyconf.bsky.social)
📣 Call for Proposals is OPEN for #SciPy2026! Have a talk, tutorial, or poster idea you’re excited to share with the scientific Python community? Now’s the time! 🚀 🗓 Submit by: February 25, 2026 🔗…

Hang out with us at #SciPy2026 this summer! Senior Data Scientist Katie Wetstone is co-chairing the Environmental, Earth, and Climate Sciences track, which you can submit to by February 25.

2 months ago

We're excited to be a part of the K-12 AI Infrastructure Program, advancing open datasets, models, and benchmarks for AI in teaching & learning. The first RFP is now open ($50K–$250K) - apply now! k12-ai-infrastructure.org/rfp-due-marc...

2 months ago

Kids learn through voice, but today's ASR tech can barely understand them. In a new data science challenge, solvers will develop models that work with children's unique speech patterns and compete for a share of the $120k prize pool. kidsasr.drivendata.org

2 months ago

🚨 New opportunity: Help build open-source speech recognition AI 🎙️📚

@drivendata.org is hosting a data science competition to advance automatic speech recognition (ASR) for early education. Two tracks, real impact, and $120K in prizes.

Learn more & compete: kidsasr.drivendata.org

2 months ago

DrivenData's Katie Wetstone will be co-chairing the climate sciences track at @scipyconf.bsky.social, where you can share YOUR work in environmental data science! Submit a #SciPy2026 talk, tutorial, or poster by February 25. See you there!

2 months ago

Poverty Prediction Challenge
Estimate individual and aggregate household consumption from limited survey data.

The Poverty Prediction Challenge, sponsored by The World Bank with a $10k prize pool, tackles a major question in development research: how do you estimate current poverty rates without recent household expenditure data? Submissions are open through midnight UTC on February 4, 2026.

3 months ago

In our newest machine learning challenge, solvers will use survey data and help uncover imputation methods for monitoring poverty trends. The top teams will take home a share of the $10,000 prize, provided by The World Bank. Learn more and submit predictions here: www.drivendata.org/competitions...

4 months ago

Throwback to our "Hateful Memes" challenge where teams detected harmful content combining text and images. Critical work for online safety! 🛡️💻 #ContentModeration #AI drivendata.org/competitions/64/

7 months ago

DrivenData Labs
DrivenData helps mission-driven organizations harness their data to work smarter and offer more impactful services using data science, machine learning, and AI.

Hot topic in #DataScience: Multi-modal learning is bridging text, images, and audio. Our blog explores practical applications beyond the hype! 🎭🔗 #MultiModal #AI drivendata.co/blog.html

7 months ago

Our latest blog dives into "Causal Inference for Data Scientists" - moving beyond correlation to understand what actually drives outcomes! 🔗💡 #CausalInference #DataScience drivendata.co/blog.html

7 months ago

Pri-matrix Factorization
Data scientists from more than 90 countries around the world drew on 300,000 video clips in a competition to build the best machine learning models for identifying wildlife from camera trap footage. …

Ever wonder how AI helps track endangered species? Our "Pri-matrix Factorization" competition used camera trap data to identify primates in the wild! 🐒📷 #WildlifeConservation #ComputerVision drivendata.org/competitions/49/

7 months ago

Community Spotlight: Kirill Brodt
The Community Spotlight features fantastic members from our DrivenData community. Kirill Brodt, a researcher in computer graphics at the University of Montreal, talks animation, pose estimation, and data science challenges.

7 months ago

Want to build ML systems that actually work in production? Download our free ebook "The 10 Rules of Reliable Data Science" and learn the essentials from our years of real-world experience. Game-changing insights await! 📊🔬 #DataScience #MachineLearning

9 months ago

Open AI Caribbean Challenge: Mapping Disaster Risk from Aerial Imagery
Can you predict the roof material of buildings from drone imagery? Leverage aerial imagery in St. Lucia, Guatemala, and Colombia to more accurately map disaster risk at scale.

Check out our "Open AI Caribbean Challenge" winners! Teams used drone imagery to predict the roof material of buildings for disaster preparedness. Amazing work applying #ComputerVision to humanitarian needs! 🛰️🏠 #DataForGood drivendata.org/competitions/58/

9 months ago

📢 We're Hiring: Child Speech Transcriber (Remote, Part-Time)
Join DrivenData in making speech technology more accessible for children! We're looking for a Child Speech Orthographic Transcriber to help advance Automatic Speech Recognition (ASR) for young learners.
docs.google.com/forms/d/e/1F...

9 months ago