Thank you Vagrant!
Posts by Shane Storks
Thanks Julia!!
Thanks Sireesh!
Eastern Michigan University seal
Happy news: I'll be joining Eastern Michigan University as an Assistant Professor of Computer Science in Fall 2026!
If you're an EMU student interested in my research, let's connect!
Will be sharing some fun #ACL2026NLP papers from my postdoc soon!
Hello #NLProc #ACL2026NLP people. I am looking for **two emergency reviewers** in the Safety and Alignment in LLMs track for ACL/ARR.
Reviews are due Feb 15th. Please DM if interested and available.
Happy to offer drinks/food if you live in/pass by Lisbon!
Seems to be a common situation for ACs this round, but I'm also looking for two emergency reviewers for the January #ARR Evaluation and Resources track. I'd appreciate any help (reposts, encouragement, black magic...)
I'm looking for two emergency reviewers for the ARR January Generalizability and Transfer track.
Please reach out if you have time and qualify to review, or RT for visibility!
I could use an emergency reviewer for an ACL submission involving interpretability and syntax. Please DM me if you might be able to provide an emergency review before February 15!
Looking for emergency reviewers for ARR Special Track "Explainability of NLP Models". Topics: Faithfulness, mechanistic interpretability, surveys and position papers. Deadline Feb 14 AoE. #ACL2026NLP
I am looking for 2 emergency reviewers for the ARR Ethics, Bias & Fairness track. Please DM me if you are available.
Hello #NLProc #ACL2026NLP community, I'm looking for an emergency reviewer for an ARR submission on LLM interpretability.
If you're available to complete a review before Feb 15, please reply or DM!
This work finally has a home! Looking forward to presenting "Transparent and Coherent Procedural Mistake Detection" at #EMNLP2025!
Screenshot of the Ai2 Paper Finder interface
Meet Ai2 Paper Finder, an LLM-powered literature search system.
Searching for relevant work is a multi-step process that requires iteration. Paper Finder mimics this workflow, helping researchers find more papers than ever.
The EMNLP 2025 conference website and CfP are now live! 2025.emnlp.org/calls/main-c...
Conference dates: November 5-9 in Suzhou, China
Submissions will be through ARR, and this year's theme is Interdisciplinary Recontextualization of NLP
Our workshop deadline has been extended to Feb 20. We look forward to your papers at NAACL's Queer in AI workshop.
One of the ways in which AI hype men are highly copacetic with Trump is that they think you can assert things with absolutely no care for truth or feasibility. Bullshitters par excellence.
Some happy news: my dissertation on "Coherent Physical Commonsense Reasoning in Foundational Language Models" is finally available online! https://deepblue.lib.umich.edu/handle/2027.42/196025
Adding more details. Space is (very) limited. Please contact me by next Wednesday 1/15/2025 for full consideration. Proposal doesn't have to be formal.
UMich undergraduate/master's students: are you interested in research at the intersection of LLMs and cognitive science, but need guidance and computing resources? I want to work with you!
If interested, DM/email me with your CV and a brief project proposal!
Shane Storks wearing academic regalia after his doctoral hooding ceremony.
So happy to finally share this last piece of my dissertation (and my first post on Bluesky)!
Obligatory photo after my recent hooding attached.
Compared to vanilla VLMs, our interventions improve the accuracy of mistake detection and the relevance, coherence, and efficiency of explanations.
We also show that patterns in metrics can indicate common issues in VLMs, such as visual hallucination!
In this work, we expand the recently studied problem of procedural mistake detection in images to require explanations through self-Q&A.
We define automated metrics for explanation coherence, and incorporate them into VLMs with various inference and fine-tuning methods.
Dialog between a foundational VLM and itself to detect the incomplete state of the procedure "Unclip the pegs on the cloth" in an image showing a cloth pegged to a clothing line. The VLM generates the following questions and answers: 1. "Is there a cloth in the image? Yes", 2. "Are there pegs on the cloth? Yes", and 3. "Is there someone holding pegs? No". As the VLM asks these questions it becomes more confident that the procedure has not been successfully completed.
How well can VLMs detect and explain humans' procedural mistakes, like in cooking or assembly?
My new pre-print with Itamar Bar-Yossef, Yayuan Li, Zheyuan Zhang, Jason J. Corso, and Joyce Chai dives into this!
arxiv.org/pdf/2412.11927