Hashtag
#speechRecognition

The Hidden Audio Bias Inside Audio-Visual Speech Recognition

Shapley analysis reveals why AVSR models keep trusting corrupted audio, exposing a hidden bias in multimodal speech recognition.

Telegram AI Digest
#ai #news #speechrecognition

The Hidden Audio Bias Inside Audio-Visual Speech Recognition

Shapley analysis shows why AVSR models keep trusting corrupted audio, revealing a hidden bias in multimodal speech recognition.

Telegram ИИ Дайджест
#ai #news #speechrecognition

Understanding, not correction.

#speechrecognition #fine-tuning #Downsyndrome #worderrorrate #accessibility

That result matters not because fine-tuning is surprising — it isn't — but because of what it proves. The speech was always intelligible. The model just hadn't learned how to listen to it yet. All I did was teach it.

#speechrecognition #fine-tuning #Downsyndrome #worderrorrate #accessibility

Same architecture, different training distribution. One run, a few hours later: 12.1% word error rate. A 66% improvement.

#speechrecognition #fine-tuning #Downsyndrome #worderrorrate #accessibility
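
The 66% figure is a relative reduction in word error rate. A minimal sketch of that arithmetic follows, assuming the comparison is against the 35.7% baseline quoted elsewhere in this thread; the thread does not say which baseline was used, and the function name is illustrative only.

```python
def relative_wer_reduction(baseline_wer: float, new_wer: float) -> float:
    """Return the fraction of the baseline error rate removed by the new model."""
    if baseline_wer <= 0:
        raise ValueError("baseline_wer must be positive")
    return (baseline_wer - new_wer) / baseline_wer


# Figures quoted in the thread: 35.7% before fine-tuning, 12.1% after.
# Assumption: both numbers were measured on the same evaluation set.
baseline, fine_tuned = 0.357, 0.121

print(f"relative reduction: {relative_wer_reduction(baseline, fine_tuned):.1%}")
print(f"absolute reduction: {(baseline - fine_tuned) * 100:.1f} points")
```

The relative figure (about 66%) and the absolute figure (about 23.6 points) describe the same change, so it helps to say which one is meant.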

It's undertrained on this kind of speech because this kind of speech is underrepresented in every dataset that ever went into it. That's not a model failure — it's a data failure upstream of the model.

So I fine-tuned.

#speechrecognition #fine-tuning #Downsyndrome #worderrorrate #accessibility

Averaging in easier cases flatters the metric and hides the real gap.

The real gap was the point. Whisper isn't bad because it was built carelessly.

#speechrecognition #fine-tuning #Downsyndrome #worderrorrate #accessibility
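
A small sketch of how pooling easier speakers into one number can hide the gap for the target group. Every count below is hypothetical and chosen only to illustrate the effect; none of it comes from the author's dataset.

```python
from dataclasses import dataclass


@dataclass
class GroupResult:
    name: str
    errors: int     # substitutions + deletions + insertions
    ref_words: int  # words in the reference transcripts


def group_wer(group: GroupResult) -> float:
    """Word error rate for a single speaker group."""
    return group.errors / group.ref_words


def pooled_wer(groups: list[GroupResult]) -> float:
    """Word error rate over all groups combined (total errors / total words)."""
    return sum(g.errors for g in groups) / sum(g.ref_words for g in groups)


# Hypothetical counts: the target group is much harder for the model.
groups = [
    GroupResult("target speakers", errors=450, ref_words=1000),
    GroupResult("other speakers", errors=50, ref_words=1000),
]

for g in groups:
    print(f"{g.name}: {group_wer(g):.1%}")
print(f"pooled: {pooled_wer(groups):.1%}")
```

With these made-up counts the pooled rate lands well below the target group's own rate, which is the pattern the post describes. Reporting the per-group number next to the pooled one keeps the baseline honest.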

I added a clarifying commit almost immediately, because if you're building something for a specific population, your baseline has to be honest about that population.

#speechrecognition #fine-tuning #Downsyndrome #worderrorrate #accessibility

Then I read my own measurement more carefully. That number included non-DS speakers in the mix. Strip those out and look at DS speech alone, and the picture gets worse. The headline was misleading.

#speechrecognition #fine-tuning #Downsyndrome #worderrorrate #accessibility

So I ran vanilla Whisper — one of the best general-purpose speech recognition models in the world — against a curated dataset of Down syndrome speech. The word error rate came back at 35.7%.

#speechrecognition #fine-tuning #Downsyndrome #worderrorrate #accessibility
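
For readers unfamiliar with the metric: word error rate is the word-level edit distance between the model's transcript and the reference, divided by the number of reference words. The sketch below is a plain-Python illustration under that standard definition. The thread does not say which tool or text normalization the author used, so the whitespace tokenization here is an assumption.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference words seen so far
    # and the first j hypothesis words.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr.append(min(substitution, deletion, insertion))
        prev = curr
    return prev[-1] / len(ref)


# One deleted word out of six reference words.
print(word_error_rate("the cat sat on the mat", "the cat sat on mat"))
```

Real evaluations usually lowercase and strip punctuation before scoring, and those choices can shift the reported number noticeably.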

66% improvement in one training run — and why the baseline number was a lie

Before you can make something better, you need to know how bad it actually is.

#speechrecognition #fine-tuning #Downsyndrome #worderrorrate #accessibility

Cohere Transcribe: Speech Recognition

Telegram AI Digest
#ai #cohere #speechrecognition

Cohere Transcribe: Speech Recognition

Telegram ИИ Дайджест
#ai #cohere #speechrecognition

Speech Recognition Market is Transforming the Industry Landscape www.marketresearchfuture.com/reports/spee...
#SpeechRecognition #AI #VoiceTech #NaturalLanguageProcessing #SmartAssistants #Automation #Innovation #TechMarket

Mistral Ships Voxtral - Open-Weights Voice AI Platform

Mistral releases Voxtral, a pair of open-weights models covering speech recognition and text-to-speech that undercut OpenAI and ElevenLabs on price.

awesomeagents.ai/news/mistral-voxtral-ope...

#Mistral #OpenSource #SpeechRecognition

Simplified, intuitive, and packed with new features, our brand new mobile app for Apple and Android is coming soon.

Intelligent workflows, ambient voice technology, medical speech recognition, and text messaging on the go.

#workflows #ambientAI #speechrecognition #digitaltransformation #medtech

Cohere's Open-Source Transcribe Model Tops ASR Leaderboard

Cohere has released Transcribe, a 2-billion-parameter open-source speech recognition model that tops the Hugging Face Open ASR leaderboard across 14 languages.

winbuzzer.com/2026/03/27/c...

#AI #Cohere #CohereTranscribe #SpeechRecognition #AITranscription #OpenSourceAI #HuggingFace #MultimodalAI

Cohere's lean transcription AI challenges the proprietary model

Cohere releases Transcribe, a lean 2-billion-parameter open-source speech model that outperforms larger competitors on accuracy while running on consumer GPUs.

#AI #OpenSource #SpeechRecognition #AusNews

thedailyperspective.org/article/2026-03-27-coher...

One powerful solution for AI scribing, speech recognition and text/email messaging.

Visit our website and find out why NHS organisations have been choosing Lexacom for more than 25 years. Link in bio.

#nhs #digitaltransformation #speechrecognition #workflows #healthtech

The Best Medical Speech Recognition Software and APIs in 2026

Medical speech recognition tools are transforming healthcare by reducing documentation time, improving workflow efficiency, and lowering physician burnout. This guide compares top APIs and ready-to-use software, outlines ROI expectations, and provides a framework for choosing the right solution based on technical needs, compliance requirements, and scalability.

Telegram AI Digest
#ai #news #speechrecognition

The Best Medical Speech Recognition Software and APIs in 2026

Medical speech recognition tools are transforming healthcare by reducing documentation time, improving workflow efficiency, and lowering burnout…

Telegram ИИ Дайджест
#ai #news #speechrecognition

Lexacom has been supporting healthcare professionals across the NHS for more than 25 years.

- Intelligent workflows
- Ambient voice technology
- Medical speech recognition
- Digital dictation
- Text messaging

#nhs #medtech #workflows #nhsdigital #speechrecognition #nhsAI #AIinhealthcare

Scottish GP with dyslexia hails the benefits of Lexacom in helping his practice cope with a rise in administrative demands and growing clinical complexity.

➡️ Read the full case study on our website > Resources. Link in bio.

#nhs #speechrecognition #nhsAI #digitaltransformation

IBM Granite 4.0 1B Speech Tops OpenASR Leaderboard

IBM has released Granite 4.0 1B Speech, a compact 1-billion-parameter multilingual speech model that ranks first on OpenASR with a 5.52 Word Error Rate.

winbuzzer.com/2026/03/16/i...

#AI #AIModels #IBM #SpeechRecognition #OpenSourceAI #EnterpriseAI #EdgeComputing #AITranslation #OpenASRLeaderboard

#Human #brain and #AI #speechrecognition decode #speech in similar step-by-step stages, study finds

Over the past decades, computer scientists have developed numerous artificial intelligence (AI) systems that can process human speech in different languages. The extent to which these models replicate...

techxplore.com/news/2026-03...

Speech Recognition Market Share, Growth Drivers, and Competitive Analysis www.marketresearchfuture.com/reports/spee...
#SpeechRecognition #VoiceAI #VoiceTechnology

IBM Granite 4.0 1B Speech Tops OpenASR Leaderboard

IBM's new 1B-parameter speech model claims the top spot on the Open ASR Leaderboard while running on consumer hardware, beating Whisper Large V3 by 25% on word error rate.

awesomeagents.ai/news/ibm-granite-4-speec...

#Ibm #Granite #SpeechRecognition

IBM taps Deepgram to add real-time speech to watsonx Orchestrate #Technology #SoftwareandApps #Other #IBM #WatsonX #SpeechRecognition

siliconangle.com/2026/02/24/ibm-taps-deep...
