✨ Introducing a new #SOTA action recognition large multimodal language model: #LLaVAction!
By @shaokaiye.bsky.social Haozhe Qi, @trackingskills.bsky.social and me!
📝 arxiv.org/abs/2503.18712
🤖 mmathislab.github.io/llavaction/
1/n
42
17
1
1