An evaluation method that uses Gaussian Splatting to probe how well AI can understand 3D shape and texture from casual images!
Feat2GS: Probing Visual Foundation Models with Gaussian Splatting
3dnchu.com/archives/fea...
by @faneggchen #Feat2GS
I wonder if approaches like this will keep improving the accuracy of 3DGS data. Looking forward to it!
#Feat2GS
Check out the #Feat2GS thread: the 3D awareness of visual foundation models (VFMs) "should" and "could" be evaluated on large-scale casual video, rather than on data with 3D labels.
Our findings from 3D probing lead to a simple yet effective solution: simply combining features from different visual foundation models outperforms prior work.
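The combination step described above can be sketched minimally as channel-wise concatenation of per-pixel features from two frozen VFMs. This is an illustrative sketch, not the paper's exact pipeline: the model names, channel counts, and the per-channel normalization are assumptions for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n_pixels = 1024  # e.g. a 32x32 feature map, flattened

# Illustrative per-pixel features from two different VFMs
# (channel counts are made up; any two models could stand in here).
feat_geometry = rng.standard_normal((n_pixels, 1024))  # e.g. a DUSt3R-like model
feat_semantic = rng.standard_normal((n_pixels, 768))   # e.g. a DINO-like model

def standardize(f):
    # Normalize each channel so neither feature set dominates by scale alone.
    return (f - f.mean(axis=0)) / (f.std(axis=0) + 1e-8)

# Concatenate along the channel dimension; the combined features then
# feed the same Gaussian readout head as a single model's features would.
combined = np.concatenate([standardize(feat_geometry),
                           standardize(feat_semantic)], axis=1)
```

The appeal of this design is that it requires no retraining of either backbone; only the lightweight readout on top of the concatenated features is learned.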
Apply #Feat2GS to sparse & casual captures:
🤗Online Demo: huggingface.co/spaces/endle...
With #Feat2GS we evaluated more than 10 visual foundation models (DUSt3R, DINO, MAE, SAM, CLIP, MiDaS, etc.) in terms of geometry and texture; see the paper for comparisons.
📄Paper: arxiv.org/abs/2412.09606
🔍Try it NOW: fanegg.github.io/Feat2GS/#chart
How much 3D do visual foundation models (VFMs) know?
Previous work requires 3D data for probing → expensive to collect!
#Feat2GS @cvprconference.bsky.social 2025 - our idea is to read out 3D Gaussians from VFM features, and thus probe 3D awareness with novel view synthesis.
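The readout idea above can be sketched as a lightweight linear head that maps each frozen per-pixel VFM feature to the parameters of one 3D Gaussian. This is a minimal sketch under assumptions: the parameterization (position, log-scale, quaternion, opacity, color), the activations, and all dimensions are illustrative, not Feat2GS's exact design.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical frozen per-pixel VFM features for a 32x32 image.
H, W, C = 32, 32, 768
features = rng.standard_normal((H * W, C))

# Linear readout head: 14 parameters per Gaussian
# (3 position, 3 log-scale, 4 quaternion, 1 opacity logit, 3 color logits).
readout = rng.standard_normal((C, 14)) * 0.01
params = features @ readout

xyz = params[:, 0:3]                                  # Gaussian centers
scale = np.exp(params[:, 3:6])                        # strictly positive scales
quat = params[:, 6:10]
quat /= np.linalg.norm(quat, axis=1, keepdims=True)   # unit rotations
opacity = 1.0 / (1.0 + np.exp(-params[:, 10:11]))     # in (0, 1)
color = 1.0 / (1.0 + np.exp(-params[:, 11:14]))       # RGB in (0, 1)

# Only this readout would be trained, with a novel-view-synthesis loss;
# rendered-view quality then measures how much geometry and texture
# the frozen VFM features encode, with no 3D labels required.
```

The point of keeping the head small is that any 3D structure recovered in the rendered views must already be present in the frozen features, which is what makes this a probe rather than a reconstruction method.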
🔗Page: fanegg.github.io/Feat2GS