Ever wanted to easily edit individual sounds in a complex audio scene, like removing a cough or making a doorbell louder?
New paper: Recomposer (arxiv.org/abs/2509.05256), from our Sound Understanding team, introduces a powerful way to do just that.
Huge congrats to all the authors!
#AudioEditing
Posts by Vivek Kumar
Today we are releasing the dataset of table tennis ball trajectories used to train the Google DeepMind robot that can play amateur table tennis with humans (sites.google.com/corp/view/co...). This work was accepted for presentation at #ICRA2025 and we hope to see you there!
In December, I posted about our new paper on mastering board games using internal + external planning.
Here's a talk about it, now on YouTube, given by my awesome colleague John Schultz!
www.youtube.com/watch?v=JyxE...
Theory: We were so busy shipping / publishing last year that we forgot to publish our year in review.
Jokes aside, huge progress in LLMs, reasoning, and generative media. In science, so many breakthroughs: flood prediction, GraphCast, semiconductor design, quantum computing. Wow!
blog.google/technology/a...
Did you know you can use '@' <tab> or '@Gemini' to access Gemini in the desktop Chrome address bar? Super convenient!
You can also use:
@Tabs to search your tabs
@History to search your history
@Bookmarks to find bookmarks
9to5google.com/2024/05/02/g...
Congratulations on this well deserved recognition, Heiga!
Fantastic opportunity for any student researcher passionate about generative audio - www.linkedin.com/posts/john-h...
Congrats to the PaliGemma 2 team! Bigger models, better results.
Really cool new work out of DeepMind for video game world generation using latent diffusion! Soon you'll be able to speedrun a game just by tricking a model into morphing you from one location to another.
deepmind.google/discover/blo...
Going Galt
Great 🦋-storm highlighting the importance and implications of AI for science. It's already helping win Nobel Prizes! 🤯
Juan's amazing essay to spark debate about the most important AI for Science opportunities, ingredients, risks and policy ideas. www.aipolicyperspectives.com/p/a-new-gold...
ML/AI Mega Starter Pack
1. Open-source LLMs
go.bsky.app/FELkyDr
🧵
The #SANE2024 talks are up on YouTube! Feat. Quan Wang,
@gretatuckute.bsky.social, Mark Hamilton, Bhuvana Ramabhadran, Zhiyao Duan, Chris Donahue.
Binge-watching playlist ⬇️
youtube.com/playlist?lis...
Thanks for putting this together. Would love to be added
TIL that there's a Gemini @gradio-hf.bsky.social library that lets you automatically build Python chat bots and web apps with just a few lines of code, then (optionally) deploy them as apps or in @huggingface.bsky.social Spaces.
Amazing work, @_akhaliq!!
github.com/AK391/gemini...