#worderrorrate

Understanding, not correction.

#speechrecognition #fine-tuning #Downsyndrome #worderrorrate #accessibility

That result matters not because fine-tuning is surprising — it isn't — but because of what it proves. The speech was always intelligible. The model just hadn't learned how to listen to it yet. All I did was teach it.

Same architecture, different training distribution. One run, a few hours later: 12.1% word error rate. That is a 66% relative reduction from the 35.7% baseline.
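
For anyone checking the arithmetic, here is a minimal sketch of how the 66% falls out of the two numbers quoted in this thread (35.7% before, 12.1% after). The function name is illustrative, not taken from any project code, and the result is a relative reduction rather than a drop in percentage points.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Fraction of the baseline word error rate that was eliminated."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Both figures are the ones quoted in the thread, in percent.
reduction = relative_wer_reduction(35.7, 12.1)
print(f"{reduction:.1%}")  # 66.1%
```

In absolute terms the drop is 23.6 percentage points; the 66% figure expresses that drop as a share of the 35.7% starting point.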

It's undertrained on this kind of speech because this kind of speech is underrepresented in every dataset that ever went into it. That's not a model failure — it's a data failure upstream of the model.

So I fine-tuned.

Averaging in easier cases flatters the metric and hides the real gap.

The real gap was the point. Whisper isn't bad because it was built carelessly.

I added a clarifying commit almost immediately, because if you're building something for a specific population, your baseline has to be honest about that population.

Then I read my own measurement more carefully. The 35.7% figure included speakers without Down syndrome (DS) in the mix. Strip those out and look at DS speech alone, and the picture gets worse. The headline was misleading.
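
To make the pooling effect concrete, here is a small sketch that computes word error rate per speaker group alongside the pooled figure. Everything in it is an assumption for illustration: the group labels `ds` and `non_ds`, the function names, and the toy transcripts are invented, and the edit-distance routine is a plain textbook implementation rather than the evaluation code behind the numbers in this thread.

```python
from collections import defaultdict


def word_errors(reference: str, hypothesis: str) -> tuple[int, int]:
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    prev = list(range(len(hyp) + 1))  # distances from an empty reference
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            curr.append(min(
                prev[j] + 1,                           # deletion
                curr[j - 1] + 1,                       # insertion
                prev[j - 1] + (ref_word != hyp_word),  # substitution or match
            ))
        prev = curr
    return prev[-1], len(ref)


def wer_by_group(samples):
    """samples: iterable of (group, reference, hypothesis) tuples.

    Returns a dict of WER per group, plus the pooled WER under the key "all".
    """
    errors, words = defaultdict(int), defaultdict(int)
    for group, reference, hypothesis in samples:
        e, n = word_errors(reference, hypothesis)
        for key in (group, "all"):
            errors[key] += e
            words[key] += n
    return {key: errors[key] / words[key] for key in words}


# Toy data, invented for illustration only.
samples = [
    ("ds", "i want to go home", "i want go home"),
    ("ds", "turn on the light", "turn the night"),
    ("non_ds", "i want to go home", "i want to go home"),
    ("non_ds", "turn on the light", "turn on the light"),
]
wer = wer_by_group(samples)
print({key: round(value, 3) for key, value in wer.items()})
```

On this toy data the pooled WER is half the `ds` WER, because the error-free `non_ds` utterances add reference words without adding errors. That is the same dilution described above, just at a smaller scale.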

So I ran vanilla Whisper — one of the best general-purpose speech recognition models in the world — against a curated dataset of Down syndrome speech. The word error rate came back at 35.7%.

66% improvement in one training run — and why the baseline number was a lie

Before you can make something better, you need to know how bad it actually is.
