
Posts by Vivek Kumar

Ever wanted to easily edit individual sounds in a complex audio scene, like removing a cough or making a doorbell louder?

New paper: Recomposer (arxiv.org/abs/2509.05256) from our Sound Understanding team introduces a powerful way to do just that.

Huge congrats to all the authors 🎉

#AudioEditing

7 months ago 3 2 0 0

Today we are releasing the dataset of table tennis ball trajectories used to train the Google DeepMind robot that can play amateur table tennis with humans (sites.google.com/corp/view/co...). This work was accepted for presentation at #ICRA2025 and we hope to see you there!

11 months ago 20 3 1 0

In December, I posted about our new paper on mastering board games using internal + external planning. 👇

Here's a talk about it, now on YouTube, given by my awesome colleague John Schultz!

www.youtube.com/watch?v=JyxE...

1 year ago 35 11 1 0
2024: A year of extraordinary progress and advancement in AI. As we move into 2025, we're looking back at the astonishing progress in AI in 2024.

Theory: We were so busy shipping / publishing last year we forgot to publish our year in review. 😜

Jokes aside, huge progress in LLMs, reasoning, and generative media. In science, so many breakthroughs: flood prediction, GraphCast, semiconductor design, quantum computing. Wow!
blog.google/technology/a...

1 year ago 5 2 0 0
Chrome's Gemini address bar shortcut rolls out Following Tuesday's announcement, the Gemini shortcut in desktop Chrome's address bar has rolled out. In the Omnibox...

Did you know you can use '@' <tab> or '<at>Gemini' to access Gemini in the desktop Chrome address bar? Super convenient!

You can also use:
<at>Tabs to search your tabs
<at>History to search your history
<at>Bookmarks to find bookmarks

9to5google.com/2024/05/02/g...

1 year ago 77 14 2 0

Congratulations on this well deserved recognition, Heiga!

1 year ago 0 0 1 0

Fantastic opportunity for any student researcher passionate about generative audio - www.linkedin.com/posts/john-h...

1 year ago 1 0 0 0

Congrats to the PaliGemma 2 team! 🎉 Bigger models, better results 🚀

1 year ago 0 0 0 0
Genie 2: A large-scale foundation world model Generating unlimited diverse training environments for future general agents

Really cool new work out of DeepMind for video game world generation using latent diffusion! Soon you'll be able to speedrun a game just by tricking a model into morphing you from one location to another.

deepmind.google/discover/blo...

1 year ago 38 10 1 4

Going Galt

1 year ago 0 0 0 0

Great 🦋-storm highlighting the importance and implications of AI for science. It's already helping win Nobel Prizes! 🤯

Juan's amazing essay to spark debate about the most important AI for Science opportunities, ingredients, risks and policy ideas. www.aipolicyperspectives.com/p/a-new-gold...

1 year ago 2 1 0 0

🤖 ML/AI Mega Starter Pack

1. Open-source LLMs
go.bsky.app/FELkyDr

🧵

1 year ago 24 9 3 2
SANE 2024 @ Google Cambridge - YouTube SANE 2024, a one-day event gathering researchers and students in speech and audio from the Northeast of the American continent, was held on Thursday October ...

The #SANE2024 talks are up on YouTube! Feat. Quan Wang,
@gretatuckute.bsky.social, Mark Hamilton, Bhuvana Ramabhadran, Zhiyao Duan, Chris Donahue.
Binge-watching playlist ⬇️
youtube.com/playlist?lis...

1 year ago 10 3 0 0

Thanks for putting this together. Would love to be added

1 year ago 1 0 0 0

TIL that there's a Gemini @gradio-hf.bsky.social library that lets you automatically build Python chat bots and web apps with just a few lines of code, then (optionally) deploy them as apps or in @huggingface.bsky.social Spaces.

✨🙌 Amazing work, @_akhaliq!!

🔗 github.com/AK391/gemini...

1 year ago 52 12 2 0