Advertisement ยท 728 ร— 90

Posts by Shaokai Ye

LLaVAction: Video Action Recognition LLaVAction: evaluating and training multi-modal large language models for action recognition

โœจ Introducing a new #SOTA action recognition large multimodal language model: #LLaVAction!

By @shaokaiye.bsky.social Haozhe Qi, @trackingskills.bsky.social and me!

๐Ÿ“ arxiv.org/abs/2503.18712

๐Ÿค– mmathislab.github.io/llavaction/

1/n

1 year ago 42 17 1 1

Thanks so much!

1 year ago 2 0 0 0