
Posts by Faro Stöter

Enjoyed my first @interspeech.bsky.social conference. Seems like a great community. Well organized and great venue. This is what big conferences could look like. Take notes, ICASSP!

7 months ago 2 0 0 0

Now in Rotterdam at @interspeech.bsky.social with @cifkao.bsky.social and @hschreiber.bsky.social

8 months ago 0 0 0 0

Same here. With Claude 4, pandas becomes usable again but every time I tried torch models, all shapes are messed up. What I like though is that the AI agent often comes up with little clever helper bash scripts to test stuff (because it doesn’t understand the code base 😁)

9 months ago 1 0 0 0
Igniting Innovation: Evidence from PyTorch on Technology Control in Open Collaboration Many companies offer free access to their technology to encourage outside add-on innovation, hoping to later profit by raising prices or harne

Harvard Business on Open Source: When PyTorch left Meta for its own non-profit, "this shift led to a significant decrease in contributions from Meta but a notable increase from external companies...participation increased from complementors (Chip Manufacturers);" papers.ssrn.com/sol3/papers....

1 year ago 4 1 0 0
Internship: ML Optimization | Notion Location: Paris preferred (remote within France/EU possible)

🚀 We’re looking for a Master’s student to join our research team for a 6-month internship at AudioShake!

Deep dive into PyTorch, optimize our SOTA audio models, and help make ML sound better (and faster) 🎶

Based in Paris or remote 🇫🇷 → audioshake.notion.site/Internship-M... #AudioML #Internship

9 months ago 1 3 0 0

Would I ever want to have the reviews written by LLMs? Hell, no!

10 months ago 1 0 0 0

I think they serve well as a guide of how to do reviews well. “Have you checked x?” I often find actual flaws that I would have missed otherwise. You don’t have to understand a paper to find flaws. Just think of: “we did x to improve y” - without backing it up by a citation.

10 months ago 0 0 1 0

Why? Isn’t the main point to identify flaws? I often find that an LLM flags 10 flaws and only 1-2 of them are valid concerns. So yes, this is dangerous if used without a human in the loop. But I also often get ideas about what to check in detail from the initial LLM summary.

10 months ago 0 0 1 0

Just wondering whether the reviewer demographics are something specific to your field. I review about 10-20 papers per year, I don’t get paid by the public, and looking at our main conferences like ICASSP, it looks like (no numbers) at least half of the reviewers have an industry position.

10 months ago 0 0 0 0

🚀 New #ICLR2025 Paper Alert! 🚀

Can Audio Foundation Models like Moshi and GPT-4o truly engage in natural conversations? 🗣️🔊

We benchmark their turn-taking abilities and uncover major gaps in conversational AI. 🧵👇

📜: arxiv.org/abs/2503.01174

1 year ago 9 6 1 0

True. But I thought IEEE owned the idea of paying much more and getting much less than at other conferences :-)

10 months ago 2 0 0 0

@interspeech.bsky.social new to the speech community coming from ISMIR/ICASSP/Eusipco/DAFX. How come Interspeech is that much more expensive than other conferences? This makes it very hard for many researchers to get approval!

11 months ago 1 0 1 0

@fakufaku.bsky.social can I do this with pyroomacoustics? Or do you know a simpler idea?

1 year ago 0 0 0 0

Not knowing much about spatial audio: how do people render multiple dry mono sources to a wet reverberated stereo image where each source has a fixed position in space? I guess one could use ambisonics RIRs to create stereo images? But what’s the easier way to handle the positioning?

1 year ago 0 0 1 0
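
One common way to approach the question above, sketched here under stated assumptions and not taken from the thread: give each source a left/right impulse-response pair for its position, convolve the dry signal with both, and sum the results. The function names `toy_stereo_rir` and `render`, the panning law, and the synthetic decaying-noise tail are all hypothetical stand-ins; in practice the impulse responses would come from measurements or a room simulator.

```python
import numpy as np

def toy_stereo_rir(azimuth_deg, sr=16000, rt60=0.3, seed=0):
    """Hypothetical stand-in for a measured or simulated stereo RIR.

    azimuth_deg runs from -90 (hard left) to +90 (hard right).
    Returns an array of shape (2, n): row 0 is left, row 1 is right.
    """
    rng = np.random.default_rng(seed)
    n = int(round(sr * rt60))
    t = np.arange(n) / sr
    # Noise tail whose envelope drops by 60 dB at t = rt60.
    decay = 10.0 ** (-3.0 * t / rt60)
    rir = 0.01 * rng.standard_normal((2, n)) * decay
    # Direct path: constant-power panning between the two channels.
    pan = np.deg2rad((azimuth_deg + 90.0) / 2.0)
    gains = (np.cos(pan), np.sin(pan))
    # Crude interaural time difference (assumed max of about 0.7 ms).
    itd = int(round(abs(np.sin(np.deg2rad(azimuth_deg))) * 0.0007 * sr))
    delays = (itd, 0) if azimuth_deg > 0 else (0, itd)  # far ear is delayed
    for ch in range(2):
        rir[ch, delays[ch]] += gains[ch]
    return rir

def render(sources, azimuths, sr=16000):
    """Convolve each dry mono source with its RIR pair and sum to stereo."""
    rirs = [toy_stereo_rir(az, sr=sr, seed=i) for i, az in enumerate(azimuths)]
    length = max(len(s) + r.shape[1] - 1 for s, r in zip(sources, rirs))
    mix = np.zeros((2, length))
    for src, rir in zip(sources, rirs):
        for ch in range(2):
            wet = np.convolve(src, rir[ch])
            mix[ch, :len(wet)] += wet
    return mix

if __name__ == "__main__":
    sr = 16000
    t = np.arange(sr // 4) / sr
    voice = np.sin(2 * np.pi * 220 * t)   # placeholder dry source on the left
    guitar = np.sin(2 * np.pi * 330 * t)  # placeholder dry source on the right
    stereo = render([voice, guitar], [-45.0, 45.0], sr=sr)
    print(stereo.shape)
```

Swapping `toy_stereo_rir` for real per-position responses keeps `render` unchanged, which is the main reason for separating the two steps.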

AudioShake’s Multi-Speaker Separation is the first-ever hi-res solution for isolating overlapping voices. Perfect for media pros, transcription, & AI voice workflows. 🔗www.audioshake.ai/post/introducing-multi-speaker-separation-from-audioshake

1 year ago 5 4 0 0
AudioShake Isolations Bring Maria Callas’ Voice to Life in Netflix film, “Maria” Filmmakers and Warner Classics in partnership with the Maria Callas Estate, used AudioShake’s stem separation to isolate her voice to perfect the biopic’s music

How stem separation tech brought the legendary voice of Maria Callas back to life in “Maria.” 🎶 Isolating Callas’s original vocals allowed @warnerclassics.bsky.social and filmmakers to control and blend her voice with Jolie’s performance. 🔗 Read: www.audioshake.ai/post/audiosh...

1 year ago 7 2 0 0

We just released the Helium-1 model, a 2B multi-lingual LLM which @exgrv.bsky.social and @lmazare.bsky.social have been crafting for us! Best model so far under 2.17B params on multi-lingual benchmarks 🇬🇧🇮🇹🇪🇸🇵🇹🇫🇷🇩🇪
On HF, under CC-BY licence: huggingface.co/kyutai/heliu...

1 year ago 25 8 0 0
Diffusion Models for Audio Restoration: A review [Special Issue On Model-Based and Data-Driven Audio Signal Processing] With the development of audio playback devices and fast data transmission, the demand for high sound quality is rising for both entertainment and communications. In this quest for better sound quality...

Our article, "Diffusion Models for Audio Restoration: A Review," is now published in the IEEE Signal Processing Magazine!

A huge thank you to all co-authors Jean-Marie Lemercier, Julius Richter, Simon Welker, Eloi Moliner, and Vesa Välimäki for a great collaboration.

doi.org/10.1109/MSP....

1 year ago 12 5 0 0

Today, we’re introducing NatureLM-audio: the first large audio-language model tailored for understanding animal sounds. arxiv.org/abs/2411.07186 🧵👇

1 year ago 15 8 2 4

Where is AGI that charges all my devices and batteries?

1 year ago 0 0 0 0
utter-project/mHuBERT-147 · Hugging Face We’re on a journey to advance and democratize artificial intelligence through open source and open science.

Since this is a new platform and mHuBERT-147 just reached 86k downloads, let me do some promotion!

This year we released a compact powerful multilingual SSL model. Trained on balanced, high-quality, open-license data, this model rivals MMS-1B but is 10x smaller.

huggingface.co/utter-projec...

1 year ago 15 3 2 0

Looking for reviewers before Christmas

1 year ago 675 89 11 23
interspeech2025.org challenge URGENT Organizers: Kohei Saijo, Wangyou Zhang, Samuele Cornell, Robin Scheibler, Chenda Li, Zhaoheng Ni, Anurag Kumar, Marvin Sach, Yihui Fu, Wei Wang, Tim Fingscheidt, Shinji Watanabe

🌟 URGENT Challenge @ #Interspeech2025 🌟

Join the Universal, Robust, & Generalizable Speech EnhancemeNT (URGENT) challenge! Explore noisy corpora, tackle diverse speech degradations, and test scalability across 2 tracks (~2.5k/60k hrs).

🚀 Learn more: urgent-challenge.github.io/urgent2025/

1 year ago 6 4 0 0

new paper! 🗣️Sketch2Sound💥

Sketch2Sound can create sounds from sonic imitations (i.e., a vocal imitation or a reference sound) via interpretable, time-varying control signals.

paper: arxiv.org/abs/2412.08550
web: hugofloresgarcia.art/sketch2sound

1 year ago 23 9 2 5

📢 Audio AI Job opportunity at Adobe!

The Sound Design AI Group (SODA) is looking for an exceptional research engineer to join us in building the future of AI-assisted audio and video creation.

Strong ML background, GenAI experience a plus.

Details: adobe.wd5.myworkdayjobs.com/external_exp...

1 year ago 11 3 1 3
DeepMind

🚨🚨My team @GoogleDeepMind in Tokyo is looking for a talented research scientist to work on audio generative models! 🔊
Please consider applying if you have expertise in the domain or related areas such as multimodal models, video generation 📹, etc.
boards.greenhouse.io/deepmind/job...

1 year ago 4 4 0 0
Notre Dame reopening offers ‘shock of hope’, says Emmanuel Macron French president tours medieval cathedral in Paris to view restoration after devastating 2019 fire

€700M and not even generative? Doesn’t seem like a good investment.

www.theguardian.com/world/2024/n...

1 year ago 1 0 0 0

🎓Academia or the industry 💸? I wrote a detailed point of view on Twitter a few months ago, so maybe I should share it here again. I think that most things are still true, the only slight change would be linked to the GenAI bubble, but only time will tell.

www.darnault-parcollet.fr/documents/Ba...

1 year ago 4 1 0 0
“My AI startups is just a GPT Wrapper”

The Reality for AI Startups

1 year ago 139 23 5 0

With you, everything is bluer 🥰

1 year ago 2 0 0 0