Check out Yu Zhao's (@yuzhaouoe.bsky.social) latest work, “Learning GUI Grounding with Spatial Reasoning from Visual Feedback” (www.arxiv.org/abs/2509.21552), done during his internship at MSR (@msftresearch.bsky.social)!
New SOTA results on ScreenSpot-v2 (+5.7%) and ScreenSpot-Pro (+110.8%)!
Posts by Yu Zhao
💡 We compare prompting (zero-shot and multi-shot + explanations) and inference-time interventions (ActAdd, ReFT and SAEs).
Following SpARE (@yuzhaouoe.bsky.social @alessiodevoto.bsky.social), we propose ✨ contrastive SAE steering ✨ with mutual information to personalize literary MT by tuning latent features 4/
MMLU-Redux Poster at NAACL 2025
MMLU-Redux just touched down at #NAACL2025!
Wish I could be there for our "Are We Done with MMLU?" poster today (9:00-10:30am in Hall 3, Poster Session 7), but visa drama said nope.
If anyone's swinging by, give our research some love! Hit me up if you check it out!
We find that a single biased direction encodes a KV Cache selection mechanism in Self-Attention: a Key vector with a strong component in this direction causes its Key-Value pair to be ignored by the Query.
New and very cool library! Our L2 norm-based KV Cache compression is already implemented and ready to use!
Check out the method details in our EMNLP '24 paper: arxiv.org/abs/2406.11430
I'll be travelling to London from Wednesday to Friday for an upcoming event and would be very happy to meet up!
I'd love to chat about my recent work (DeCoRe, MMLU-Redux, etc.). DM me if you're around!
DeCoRe: arxiv.org/abs/2410.18860
MMLU-Redux: arxiv.org/abs/2406.04127