Our #PickOfTheWeek by @beomseok-lee.bsky.social: "Can Speech LLMs Think while Listening?" by Yi-Jen Shih, @rdesh26.bsky.social, Chunyang Wu, Wei Zhou, SK Bong, Yashesh Gaur, Jay Mahadeokar, Ozlem Kalinli, Mike Seltzer (2025).
#Speech #SpeechLLM #LLM #SpeechTech #AI
Our pick of the week by @bsavoldi.bsky.social: "Acoustic-based Gender Differentiation in Speech-aware Language Models" by Junhyuk Choi, Jihwan Seol, Nayeon Kim, Chanhee Cho, EunBin Cho, Bugeun Kim.
arxiv.org/abs/2509.21125
#Gender #SpeechLLM #Speech
Study Shows Gender and Positional Bias in SpeechLLM Conversational AI
A SPECOM 2025 study finds SpeechLLM voice assistants show stronger positional bias for female‑sounding inputs than male ones, highlighting the need for voice‑aware fairness testing. Read more: getnews.me/study-shows-gender-and-p... #speechllm #bias
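Not from the paper, but a toy sketch of what such a voice-aware fairness check could look like: run the same multiple-choice items through female- and male-sounding TTS renderings, swap the option order, and compare how often the predicted answer flips (all names and numbers below are hypothetical):

```python
def flip_rate(answers_order_a: list[str], answers_order_b: list[str]) -> float:
    """Share of items whose predicted answer changes when the same
    options are presented in a different order (positional bias)."""
    assert len(answers_order_a) == len(answers_order_b)
    flips = sum(a != b for a, b in zip(answers_order_a, answers_order_b))
    return flips / len(answers_order_a)

# Hypothetical per-voice comparison: each list holds the model's answers
# to the same items, rendered with one TTS voice, in two option orders.
female_voice_bias = flip_rate(["A", "B", "A"], ["B", "B", "A"])
male_voice_bias = flip_rate(["A", "B", "A"], ["A", "B", "A"])
print(female_voice_bias, male_voice_bias)  # e.g. 0.33 vs 0.0
```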
Bias Benchmarks May Not Generalize Across SpeechLLM Tasks
Research finds MCQA bias benchmarks for SpeechLLMs don't reliably predict performance on other MCQA sets or long-form generation, per a study submitted on 24 Sep 2025. Read more: getnews.me/bias-benchmarks-may-not-... #speechllm #bias
How to Preserve Text Skills in Speech-Enabled Large Language Models
A study shows speech fine‑tuning shifts which LLM parameters matter most, hurting text reasoning; layer‑wise learning‑rate scheduling and LoRA preserve text performance and beat full fine‑tuning. getnews.me/how-to-preserve-text-ski... #speechllm #lora
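The post names two techniques; a minimal PyTorch sketch of both follows, assuming a generic transformer stack (the decay factor, LoRA rank, and layer grouping are illustrative, not taken from the paper):

```python
import torch
from torch import nn

class LoRALinear(nn.Module):
    """Minimal LoRA adapter: a frozen base projection plus a trainable
    low-rank update, so speech fine-tuning barely touches text weights."""
    def __init__(self, base: nn.Linear, r: int = 8, alpha: float = 16.0):
        super().__init__()
        self.base = base
        for p in self.base.parameters():
            p.requires_grad = False          # keep text knowledge frozen
        self.A = nn.Parameter(torch.randn(r, base.in_features) * 0.01)
        self.B = nn.Parameter(torch.zeros(base.out_features, r))
        self.scale = alpha / r

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.base(x) + self.scale * (x @ self.A.T @ self.B.T)

def layerwise_lr_groups(layers: nn.ModuleList, base_lr: float = 1e-4,
                        decay: float = 0.9) -> list[dict]:
    """One optimizer group per block, with the learning rate decaying
    with depth so layers that matter most for text move least.
    The decay factor and direction are assumptions."""
    return [
        {"params": layer.parameters(), "lr": base_lr * (decay ** i)}
        for i, layer in enumerate(layers)
    ]

# Toy block stack standing in for an LLM backbone.
blocks = nn.ModuleList(nn.Linear(64, 64) for _ in range(4))
blocks[0] = LoRALinear(blocks[0])            # adapt one block via LoRA
optim = torch.optim.AdamW(layerwise_lr_groups(blocks))
print([f"{g['lr']:.2e}" for g in optim.param_groups])
# ['1.00e-04', '9.00e-05', '8.10e-05', '7.29e-05']
```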
CP‑Bench Benchmark Tests Paralinguistic Reasoning in Speech‑LLMs
The CP‑Bench benchmark, at EMNLP Findings 2025, tests speech‑LLMs on literal, contextual and paralinguistic queries, revealing a drop in accuracy for emotion‑based items (arXiv:2509.16589). getnews.me/cp-bench-benchmark-tests... #cpbench #speechllm
Benchmark Evaluates Speech‑LLMs on Contextual and Paralinguistic Reasoning
CP‑Bench introduces two QA sets built on in‑the‑wild audio to test speech‑LLMs on verbal content and emotional prosody. Results show models lag on interpreting tone. Read more: getnews.me/benchmark-evaluates-spee... #cpbench #speechllm
Prompt-Aware Mixture Improves Speech LLMs for Transcription and Captioning
The Prompt‑aware Mixture (PaM) lets a speech LLM select among multiple audio encoders based on the prompt, beating every single‑encoder baseline on ASR and audio captioning. Read more: getnews.me/prompt-aware-mixture-imp... #promptawaremixture #speechllm
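A toy PyTorch sketch of the routing idea as summarized above: a prompt embedding yields softmax weights over several audio encoders' feature streams (the dimensions and router design are our assumptions, not PaM's actual architecture):

```python
import torch
from torch import nn

class PromptAwareMixture(nn.Module):
    """Toy sketch: weight several audio-encoder feature streams with a
    softmax router conditioned on the task prompt embedding."""
    def __init__(self, n_encoders: int, feat_dim: int, prompt_dim: int):
        super().__init__()
        self.router = nn.Linear(prompt_dim, n_encoders)
        self.proj = nn.Linear(feat_dim, feat_dim)

    def forward(self, encoder_feats: list[torch.Tensor],
                prompt_emb: torch.Tensor) -> torch.Tensor:
        # encoder_feats: list of (batch, time, feat_dim), one per encoder
        # prompt_emb:    (batch, prompt_dim), e.g. pooled prompt tokens
        weights = self.router(prompt_emb).softmax(dim=-1)     # (B, n_enc)
        stacked = torch.stack(encoder_feats, dim=1)           # (B, n_enc, T, D)
        mixed = (weights[:, :, None, None] * stacked).sum(1)  # (B, T, D)
        return self.proj(mixed)

# Usage: an ASR-style prompt can learn to route mostly to an ASR-tuned
# encoder, a captioning prompt to an audio-event encoder.
pam = PromptAwareMixture(n_encoders=3, feat_dim=256, prompt_dim=128)
feats = [torch.randn(2, 50, 256) for _ in range(3)]
prompt = torch.randn(2, 128)
print(pam(feats, prompt).shape)  # torch.Size([2, 50, 256])
```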
Cross-Modal Knowledge Distillation Boosts Speech LLM Performance
Dual‑channel distillation lets a speech‑enabled LLM retain text benchmark scores while improving accuracy on spoken queries, narrowing the text‑speech gap. Read more: getnews.me/cross-modal-knowledge-di... #speechllm #multimodal
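A minimal sketch of what dual-channel distillation could look like: the same query goes to a frozen text-input teacher and a speech-input student, and a KL term pulls the student's token distribution toward the teacher's (the temperature and loss weighting are assumptions, not the paper's values):

```python
import torch
import torch.nn.functional as F

def dual_channel_distill_loss(student_speech_logits: torch.Tensor,
                              teacher_text_logits: torch.Tensor,
                              labels: torch.Tensor,
                              T: float = 2.0, alpha: float = 0.5):
    """Cross-modal KD sketch: the speech-input student mimics the
    text-input teacher's next-token distribution, alongside the usual
    cross-entropy on gold tokens. Logits: (batch, seq, vocab)."""
    kd = F.kl_div(
        F.log_softmax(student_speech_logits / T, dim=-1),
        F.softmax(teacher_text_logits / T, dim=-1),
        reduction="batchmean",
    ) * (T * T)
    ce = F.cross_entropy(
        student_speech_logits.flatten(0, 1), labels.flatten(),
        ignore_index=-100,
    )
    return alpha * kd + (1 - alpha) * ce

# Toy shapes: batch 2, sequence 5, vocab 100.
s = torch.randn(2, 5, 100, requires_grad=True)
t = torch.randn(2, 5, 100)
y = torch.randint(0, 100, (2, 5))
print(dual_channel_distill_loss(s, t, y).item())
```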
Our pick of the week by @mgaido91.bsky.social: "AlignFormer: Modality Matching Can Achieve Better Zero-shot Instruction-Following Speech-LLM" by Ruchao Fan, Bo Ren, Yuxuan Hu, Rui Zhao, Shujie Liu, Jinyu Li (2024).
#NLProc #Speech #instructionfollowing #zeroshot #speechtech #speechllm