Thanks for sharing, LaCET is now open-sourced :)
Posts by Martin Ziqiao Ma
that’s why I still love my hand typed good old html homepage :(
Thrilled to announce the 1st Workshop on Computational Developmental Linguistics (CDL) at ACL 2026 🎉 A new venue at the intersection of developmental linguistics × modern NLP, spearheaded by @fredashi.bsky.social @marstin.bsky.social, and an outstanding team of colleagues!
A thread 🧵
new year boi
NEPA: Next-Embedding Predictive Autoregression
sihanxu.me/nepa/
Key ideas:
- One self-supervised signal: cosine-style next-embedding prediction
- Autoregression runs directly on the model's native embeddings
- No pixel decoder (& loss), no contrastive pairs, no task-specific heads, no random masks
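The paper and site above have the real architecture; as a toy illustration of the single training signal described here (cosine-style next-embedding prediction over the model's own embeddings, no decoder or contrastive pairs), here is a minimal pure-Python sketch. All function names and the trivial copy-last "predictor" are my own illustrative assumptions, not from the paper.

```python
import math

def cosine_next_embedding_loss(pred, target):
    """1 - cosine similarity between a predicted and the actual next embedding."""
    dot = sum(p * t for p, t in zip(pred, target))
    norm = math.sqrt(sum(p * p for p in pred)) * math.sqrt(sum(t * t for t in target))
    return 1.0 - dot / norm

def sequence_loss(embeddings, predictor):
    """Autoregressive objective: predict e[t+1] from the prefix e[:t+1], average the loss."""
    losses = [
        cosine_next_embedding_loss(predictor(embeddings[: t + 1]), embeddings[t + 1])
        for t in range(len(embeddings) - 1)
    ]
    return sum(losses) / len(losses)

# Trivial baseline "predictor" that just copies the last embedding in the prefix.
copy_last = lambda prefix: prefix[-1]

embs = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]]
print(sequence_loss(embs, copy_last))  # 0.5: perfect on step 1, orthogonal on step 2
```

In a real model the predictor would be the network itself and the loss would be backpropagated; the point of the sketch is only that the entire supervision signal is this one cosine term on native embeddings.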
In my new blog post, “Test-Time Training Done Better: From Plastic Adaptation to Elastic Memory Consolidation,” I introduce a long-context modeling architecture that learns to adapt and memorize at test time by updating a subset of the model’s weights during inference.
mars-tin.github.io/blogs/posts/...
Gosh, I’m getting way too emotional writing my thesis acknowledgements...
Will be at #NeurIPS2025 (San Diego) Dec 1-9, then in the Bay Area until the 14th. Hmu if you wanna grab coffee and talk about totally random stuff.
Thread with a few things I’m excited about.
P.S. 4 NeurIPS papers all started pre-May 2024 and took ~1 year of polishing...so proud of the team!
Trying to decide what to do on the first day of #NeurIPS2025?
Check out the tutorial from me, @marstin.bsky.social, and @xiangyue96.bsky.social: "The Science of Benchmarking: What's Measured, What's Missing, What's Next" on December 2 from 1:30 to 4:00pm.
benchmarking.science
What will we cover?
1/3
@fredashi.bsky.social and I wrote a blog for our new mechinterp paper (arxiv.org/abs/2510.13796), including many unpublished and even negative results that we found meaningful to share.
An Open-Notebook Exploration of Emergent Grounding in LMs mars-tin.github.io/blogs/posts/...
Regrettably can’t attend #COLM2025 due to deadlines, but
Jane and Joyce will be presenting our work. :)
Jane is an exceptional undergraduate researcher and a great collaborator! Go meet her at COLM if you’re curious about her work on mechanistic interpretability, multimodality, & pragmatics!
🚀 ACL ARR is looking for a Co-CTO to join me in leading our amazing tech team and driving the future of our workflow. If you’re interested or know someone who might be, let’s connect!
RTs & recommendations appreciated.
Unfortunately, I’ll be missing #ACL2025NLP this year — but here are a few things I’m excited about! 👇
Congratulations!!
with @fredashi.bsky.social / Jiayuan Mao / @djiafei.bsky.social / @manlingli.bsky.social / David Hsu / Parisa Kordjamshidi
📣 Excited to announce SpaVLE: #NeurIPS2025 Workshop on Space in Vision, Language, and Embodied AI!
Join us in San Diego to push the frontiers of spatial understanding and reasoning across CV, NLP, and robotics!
👉 space-in-vision-language-embodied-ai.github.io
#CoreCognition #LLM #multimodal #GrowAI We spent 3 years curating 1,503 classic experiments spanning 12 core concepts in human cognitive development, then evaluated 230 MLLMs with 11 different prompts, 5 runs each, yielding over 3.8 million inference data points.
A thread (1/n) - #ICML2025 ✅
New Paper Alert ‼️ Current VLMs completely fail at human gaze understanding 🙀 and scaling does NOT help ‼️
Humans, by contrast, are extremely sensitive to other people's gaze 🙄 👀 from an extremely early age 🧒
No mentors, no labs, only pre-doc students, 111 VLMs, and we did it 😎
& @tianminshu.bsky.social (+ @marstin.bsky.social, @zhitinghu.bsky.social, @lianhui.bsky.social & more) will present “SimWorld: A World Simulator for Scaling Photorealistic Multi-Agent Interactions,” an @unrealengine.bsky.social-based sim that generates unlimited/diverse urban environments: (13/14)
At Albuquerque Now :)
See you at #NAACL2025! I will talk about grounded lexicon acquisition and scaling mechanistically grounded vision language models. Happy to chat if you are around :)
We introduce RefOI, a new dataset of 1.5k objects, each with 3 written and 2 spoken human-produced referring expressions. We also release RefOI-TLHF, a large dataset of token-level human feedback for 10.6k referring expressions.
👀 https://vlm-reg.github.io/
📄 https://arxiv.org/abs/2504.16060
Vision-Language Models are not yet pragmatically optimal.
We identify 3 key failures of pragmatic competence in referring expression generation with VLMs: (1) they cannot uniquely refer to the referent, (2) they include excessive or irrelevant information, and (3) they misalign with human pragmatic preferences.
🔹 Workshop Paper at World Models:
Do Vision-Language Models Have Internal World Models?
🗓 Apr 27, 9 p.m. (Peridot 201&206)
Paper: openreview.net/forum?id=tpP...
Excited for this collaboration with MaitrixOrg, details coming soon :)
🔹 ICLR BiAlign Workshop:
We’re hosting the Bidirectional Human-AI Alignment Workshop (BiAlign).
🗓 Apr 28, (Garnet 216–214)
Website: bialign-workshop.github.io
I’ll join remotely — huge thanks to @huashen.bsky.social for leading this!
🔹 ICLR Oral Paper:
Do Vision-Language Models Represent Space and How?
🗓 Oral: Apr 25, 3:42–3:54 a.m. (Session 4C)
🗓 Poster: Thu, Apr 24, 10 p.m.–12:30 a.m. (Hall 3 + 2B, #212)
Website: spatial-comfort.github.io
Big thanks to @fredashi.bsky.social for presenting on site!
I won’t be attending #ICLR2025 in person since #NAACL2025 follows right after, but here are a few things I’m excited about (all time in EDT) ⬇️
📄 View the full list of accepted papers: bialign-workshop.github.io#/papers
We look forward to seeing you there!
🎉 Out of these, 72 papers were accepted, including 5 tiny papers. 10 papers were selected for oral presentations: 2 at CHI and 8 at ICLR. Award winners will be announced during the workshop!
📬 We received over 100 submissions, each reviewed by 2–4 expert reviewers, with ethical assessments included when appropriate. Our program committee features leading researchers in NLP, RL, HCI, ML, and AI/ML Ethics, carefully selected based on scholarly merit and expertise.