
Posts by Kyle Kastner


ProbNum 2025 Keynote 2, "Gradient Flows on the Maximum Mean Discrepancy" by @arthurgretton.bsky.social (@gatsbyucl.bsky.social and Google DeepMind).

Slides available here: probnum25.github.io/keynotes

7 months ago

Surprising new results from Owain Evans and Anthropic: Training on the outputs of a model can change the model's behavior, even when those outputs seem unrelated. Training only on completions of 3-digit numbers was able to transmit a love of owls. alignment.anthropic.com/2025/sublimi...
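The setup is easy to reproduce in miniature. Below is a hypothetical sketch of the distillation recipe the post describes: prompts asking for continuations of 3-digit number sequences, paired with a teacher's completions to form a student fine-tuning set. `toy_teacher` is a stand-in function, not the actual model from the paper.

```python
import random

def make_number_prompts(n, seed=0):
    """Build prompts like '123, 456, 789,' asking a model to continue
    a sequence of 3-digit numbers (as in the subliminal-learning setup)."""
    rng = random.Random(seed)
    prompts = []
    for _ in range(n):
        nums = [rng.randint(100, 999) for _ in range(3)]
        prompts.append(", ".join(str(x) for x in nums) + ",")
    return prompts

def build_distillation_set(prompts, teacher):
    """Pair each prompt with the teacher's completion. A student fine-tuned
    on these pairs may inherit unrelated teacher traits, per the paper."""
    return [(p, teacher(p)) for p in prompts]

def toy_teacher(prompt):
    """Stand-in teacher: deterministically continues with another 3-digit number."""
    last = int(prompt.rstrip(",").split(",")[-1])
    return f" {(last % 899) + 100}"
```

The surprising claim is that even this semantically empty number data can carry the teacher's behavioral traits into the student.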

8 months ago

MorphScore got an update! MorphScore now covers 70 languages 🌎🌍🌏 We have a new preprint out, and we will be presenting our paper at the Tokenization Workshop @tokshop.bsky.social at ICML next week! @marisahudspeth.bsky.social @brenocon.bsky.social

9 months ago

Our work finding universal concepts in vision models is accepted at #ICML2025!!!

My first major conference paper with my wonderful collaborators and friends @matthewkowal.bsky.social @thomasfel.bsky.social
@Julian_Forsyth
@csprofkgd.bsky.social

Working with y'all is the best 🥹

Preprint ⬇️!!

11 months ago

Contribute to the first global archive of soniferous freshwater life, The Freshwater Sounds Archive, and receive recognition as a co-author in a resulting data paper!

Pre-print now available. New deadline: 31st Dec, 2025.

See link 👇 fishsounds.net/freshwater.js

10 months ago

🚀 Interested in Neuro-Symbolic Learning and attending #ICRA2025? 🧠🤖

Do not miss Leon Keller presenting β€œNeuro-Symbolic Imitation Learning: Discovering Symbolic Abstractions for Skill Learning”.

Joint work of Honda Research Institute EU and @jan-peters.bsky.social (@ias-tudarmstadt.bsky.social).

11 months ago

Prasoon Bajpai, Tanmoy Chakraborty
Multilingual Test-Time Scaling via Initial Thought Transfer
https://arxiv.org/abs/2505.15508

10 months ago
In-Context Learning Boosts Speech Recognition via Human-like Adaptation to Speakers and Language Varieties

A study shows in-context learning in spoken language models can mimic human adaptability, reducing word error rates by nearly 20% with just a few utterances, especially aiding low-resource language varieties and enhancing recognition across diverse speakers. https://arxiv.org/abs/2505.14887
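For context, word error rate (WER) is the word-level edit distance between a reference transcript and the recognizer's hypothesis, normalized by reference length; a roughly 20% relative reduction means e.g. dropping from 0.15 to 0.12. A minimal self-contained implementation:

```python
def wer(reference, hypothesis):
    """Word error rate: Levenshtein distance over words, divided by
    the number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    # DP table: d[i][j] = edit distance between ref[:i] and hyp[:j].
    d = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        d[i][0] = i  # deletions
    for j in range(len(hyp) + 1):
        d[0][j] = j  # insertions
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            sub = d[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            d[i][j] = min(sub, d[i - 1][j] + 1, d[i][j - 1] + 1)
    return d[len(ref)][len(hyp)] / max(len(ref), 1)
```

Identical transcripts give 0.0; one substitution in a three-word reference gives 1/3.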

10 months ago

"Interdimensional Cable", shorts made with Veo 3 ai. By CodeSamurai on Reddit

10 months ago

Bingda Tang, Boyang Zheng, Xichen Pan, Sayak Paul, Saining Xie
Exploring the Deep Fusion of Large Language Models and Diffusion Transformers for Text-to-Image Synthesis
https://arxiv.org/abs/2505.10046

11 months ago
Learning Nonlinear Dynamics in Physical Modelling Synthesis using Neural Ordinary Differential Equations
Victor Zheleznov, Stefan Bilbao, Alec Wright, Simon King

A neural ODE model combines modal decomposition with a neural network to model nonlinear string vibrations; the authors generate synthetic data and provide sound examples.

11 months ago
Omni-R1: Do You Really Need Audio to Fine-Tune Your Audio LLM?

Research unveils Omni-R1, a fine-tuning method for audio LLMs that boosts audio performance via text training, achieving state-of-the-art results on the MMAU benchmark. Findings reveal how enhanced text reasoning affects audio capabilities, suggesting new directions for model optimization. https://arxiv.org/abs/2505.09439

11 months ago

Yeah we finally have a model report with an actual data section. Thanks Qwen 3! github.com/QwenLM/Qwen3...

11 months ago
FLAM: Frame-Wise Language-Audio Modeling

FLAM, a novel audio-language model, enables frame-wise localization of sound events in an open-vocabulary format. With large-scale synthetic data and advanced training methods, FLAM enhances audio understanding and retrieval, aiding multimedia indexing and access. https://arxiv.org/abs/2505.05335
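Frame-wise localization of the kind FLAM performs can be pictured as scoring each audio-frame embedding against a text-query embedding in a shared space, then marking frames above a similarity threshold as containing the event. A toy illustration (the embeddings and threshold here are placeholders, not FLAM's actual model):

```python
import math

def cosine(u, v):
    """Cosine similarity between two vectors; 0.0 if either is zero."""
    nu = math.sqrt(sum(x * x for x in u))
    nv = math.sqrt(sum(x * x for x in v))
    return sum(a * b for a, b in zip(u, v)) / (nu * nv) if nu and nv else 0.0

def localize(frame_embs, text_emb, threshold=0.5):
    """Frame-wise open-vocabulary localization: flag each audio frame whose
    embedding is similar enough to the text query's embedding."""
    return [cosine(f, text_emb) >= threshold for f in frame_embs]
```

With real models, the per-frame embeddings come from an audio encoder and the query embedding from a text encoder trained into the same space.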

11 months ago

#ICML2025
Is standard RLHF optimal in view of test-time scaling? Unsurprisingly no.

We show that a simple change to the standard RLHF framework, involving reward calibration and reward transformation (suited to the test-time procedure), is optimal!

11 months ago

Is Best-of-N really the best we can do for language model inference?

New paper (appearing at ICML) led by the amazing Audrey Huang (@ahahaudrey.bsky.social) with Adam Block, Qinghua Liu, Nan Jiang, and Akshay Krishnamurthy (@akshaykr.bsky.social).

1/11
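As a baseline for comparison, Best-of-N itself is simple: draw N samples from the policy and keep the one a reward model scores highest. A minimal sketch, where the generator and reward function are stand-ins rather than anything from the paper:

```python
import random

def best_of_n(generate, reward, n, seed=0):
    """Best-of-N inference: sample n candidates from the policy
    and return the candidate with the highest reward-model score."""
    rng = random.Random(seed)
    candidates = [generate(rng) for _ in range(n)]
    return max(candidates, key=reward)

# Stand-ins: the 'policy' draws integers, the 'reward model' prefers
# values near 50.
def toy_generate(rng):
    return rng.randint(0, 100)

def toy_reward(x):
    return -abs(x - 50)
```

Increasing N can only improve the reward of the selected sample under the same seed, which is exactly the test-time scaling behavior the paper interrogates.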

11 months ago

Congratulations to the #AABI2025 Workshop Track Outstanding Paper Award recipients!

11 months ago

Why not?

Reinforcement Learning for Reasoning in Large Language Models with One Training Example

Applying RLVR to the base model Qwen2.5-Math-1.5B, they identify a single example that elevates model performance on MATH500 from 36.0% to 73.6%.

11 months ago
Latent Factor Models Meets Instructions: Goal-conditioned Latent Factor Discovery without Task Supervision

Instruct-LF merges LLMs' instruction-following with statistical models, enhancing interpretability in noisy datasets and improving task performance by up to 52%. https://arxiv.org/abs/2502.15147

11 months ago

An incomplete list of Chinese AI:

- DeepSeek: www.deepseek.com. You can also access AI models via API.
- Moonshot AI's Kimi: www.kimi.ai
- Alibaba's Qwen: chat.qwen.ai. You can also access AI models via API.
- ByteDance's Doubao (only in Chinese): www.doubao.com/chat/

11 months ago

I really liked this approach by @matthieuterris.bsky.social et al. They propose learning a single lightweight model for multiple inverse problems by conditioning it on the forward operator A. Thanks to self-supervised fine-tuning, it can tackle unseen inverse problems.

📰 https://arxiv.org/abs/2503.08915
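The idea can be illustrated on a toy inpainting problem: the forward operator A is a known mask, the reconstruction uses A explicitly, and a measurement-consistency loss ||A x_hat - y||^2 enables self-supervised fine-tuning without ground truth. This sketch only illustrates the operator-conditioning idea; the paper's actual model is a learned network:

```python
def apply_mask(x, mask):
    """Forward operator A: element-wise masking (a simple inpainting problem)."""
    return [xi * mi for xi, mi in zip(x, mask)]

def conditioned_reconstruct(y, mask, prior=0.0):
    """Toy operator-conditioned reconstruction: trust the measurement where
    the mask kept a sample, fall back to a prior value elsewhere. A learned
    model would replace the fallback with a network conditioned on A."""
    return [yi if mi else prior for yi, mi in zip(y, mask)]

def self_supervised_loss(x_hat, y, mask):
    """Measurement-consistency loss ||A x_hat - y||^2, computable with no
    access to the ground-truth signal."""
    r = [a - b for a, b in zip(apply_mask(x_hat, mask), y)]
    return sum(ri * ri for ri in r)
```

Because the loss only compares re-measured reconstructions against observed data, it can be evaluated on unseen operators at test time, which is what makes the self-supervised fine-tuning possible.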

11 months ago

Excited to be presenting our spotlight ICLR paper Simplifying Deep Temporal Difference Learning today! Join us in Hall 3 + Hall 2B Poster #123 from 3pm :)

11 months ago

Balinese text-to-speech dataset as digital cultural heritage https://pubmed.ncbi.nlm.nih.gov/40275973/

11 months ago

Kimi.ai releases Kimi-Audio! Our new open-source audio foundation model advances capabilities in audio understanding, generation, and conversation.

Paper: github.com/MoonshotAI/K...
Repo: github.com/MoonshotAI/K...
Model: huggingface.co/moonshotai/K...

11 months ago

Very cool article from Panagiotis Theodoropoulos et al: https://arxiv.org/abs/2410.14055
Feedback Schrödinger Bridge Matching introduces a new method to improve transfer between two data distributions using only a small number of paired samples!

11 months ago

Our #ICLR2025 poster "Discrete Codebook World Models for Continuous Control" (Aidan Scannell, Mohammadreza Nakhaeinezhadfard, Kalle KujanpÀÀ, Yi Zhao, Kevin Luck, Arno Solin, Joni Pajarinen)
🗓️ Hall 3 + Hall 2B #415, Thu 24 Apr 10 a.m.–12:30 p.m. +08
📄 Preprint: arxiv.org/abs/2503.00653

11 months ago

Andrew Kiruluta
Wavelet-based Variational Autoencoders for High-Resolution Image Generation
https://arxiv.org/abs/2504.13214

11 months ago

7/ Large Language Models to Diffusion Finetuning

Paper: openreview.net/forum?id=Wu5...
Workshop: workshop-llm-reasoning-planning.github.io

New finetuning method empowering pre-trained LLMs with some of the key properties of diffusion models and the ability to scale test-time compute.

11 months ago

10/ Sakana AI Co-Founder and CEO David Ha will give a talk at the #ICLR2025 World Models Workshop and join a panel discussing the current development and future challenges of world models.

Workshop Website: sites.google.com/view/worldmo...

11 months ago

Duy A. Nguyen, Quan Huu Do, Khoa D. Doan, Minh N. Do: Are you SURE? Enhancing Multimodal Pretraining with Missing Modalities through Uncertainty Estimation https://arxiv.org/abs/2504.13465

11 months ago