Thanks for this! Would love to be added

1 year ago 1 0 0 0

The code for "Simplified and Generalized Masked Diffusion for Discrete Data" (Jiaxin Shi et al.) has been released, and a lecture by @arnauddoucet.bsky.social on this topic is also available!

🐍 Code: github.com/google-deepm...
📄 Article: arxiv.org/abs/2406.04329
📼 Video: www.youtube.com/watch?v=qj9B...

1 year ago 26 6 0 0

new paper! 🗣️Sketch2Sound💥

Sketch2Sound can create sounds from sonic imitations (i.e., a vocal imitation or a reference sound) via interpretable, time-varying control signals.

paper: arxiv.org/abs/2412.08550
web: hugofloresgarcia.art/sketch2sound

1 year ago 23 9 2 5

🎥 Introducing MultiFoley, a video-aware audio generation method with multimodal controls! 🔊
We can
⌨️Make a typewriter sound like a piano 🎹
🐱Make a cat meow like a lion roars! 🦁
⏱️Perfectly time existing SFX 💥 to a video.

arXiv: arxiv.org/abs/2411.17698
website: ificl.github.io/MultiFoley/

1 year ago 42 12 2 6

I initiated a starter pack for Audio ML. Let me know if you'd like to be added/removed.
go.bsky.app/LGmct4z

1 year ago 68 22 46 1

Made a feed that tries to index paper threads only: bsky.app/profile/psee.... To get into the feed, make a post with "arxiv.org" in the post somewhere + don't be a bot. My tiny contribution to the recent migration! Built w/ @skyfeed.app. Planning on some paper threads of my own soon...

1 year ago 7 2 0 1
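
The inclusion rule described in the feed post above (the post contains "arxiv.org" and the author is not a bot) could be sketched roughly as follows. This is only an illustration: the actual feed was built with SkyFeed and its real rules are not shown here, so the function name `is_paper_thread_candidate` and the `author_is_bot` flag are hypothetical, and how a bot would be detected is left open.

```python
def is_paper_thread_candidate(text: str, author_is_bot: bool = False) -> bool:
    """Hypothetical filter: keep posts that mention arxiv.org and are not from bots.

    `author_is_bot` is an assumed input; the original post does not say
    how bot accounts are identified.
    """
    if author_is_bot:
        return False
    # Case-insensitive substring match on the post text.
    return "arxiv.org" in text.lower()


if __name__ == "__main__":
    posts = [
        "paper: arxiv.org/abs/2412.08550",
        "web: hugofloresgarcia.art/sketch2sound",
    ]
    for post in posts:
        print(post, is_paper_thread_candidate(post))
```

Under these assumptions, the Sketch2Sound and MultiFoley posts would qualify because their text includes an arxiv.org link, while a post with only a project website would not.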