Posts by Thomas Frick
This separation unlocks powerful capabilities:
✨ Scale "looking" independently of "thinking"
✨ Keep contexts lean — only process relevant crops
✨ Train the "eyes" on a single GPU, without retraining the "brain"
Our method SPARC explicitly decouples the "Where" (perception) from the "Why" (reasoning) — mimicking how the brain separates early visual processing from executive function.
🔍 First: aggressive visual search to find the right pixels
🧠 Then: focused reasoning on only the relevant crops
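To make the two-stage idea concrete, here is a toy sketch of a decoupled "search, then reason" pipeline. It is not the SPARC implementation: `search`, `crop`, `reason`, `Box`, the tile size, and the brightness threshold are all hypothetical stand-ins, and a plain grid of integers plays the role of the image.

```python
from dataclasses import dataclass

Image = list[list[int]]  # toy grayscale image: rows of pixel intensities


@dataclass(frozen=True)
class Box:
    top: int
    left: int
    bottom: int  # exclusive
    right: int   # exclusive


def search(image: Image, tile: int, threshold: int) -> list[Box]:
    """Hypothetical "Where" stage: keep every tile with a pixel >= threshold."""
    height, width = len(image), len(image[0])
    boxes = []
    for top in range(0, height, tile):
        for left in range(0, width, tile):
            bottom, right = min(top + tile, height), min(left + tile, width)
            if any(
                image[r][c] >= threshold
                for r in range(top, bottom)
                for c in range(left, right)
            ):
                boxes.append(Box(top, left, bottom, right))
    return boxes


def crop(image: Image, box: Box) -> Image:
    """Cut the region described by `box` out of the image."""
    return [row[box.left:box.right] for row in image[box.top:box.bottom]]


def reason(crops: list[Image]) -> dict[str, int]:
    """Hypothetical "Why" stage: it only ever sees the crops, never the full image."""
    pixels = [p for c in crops for row in c for p in row]
    return {"pixels_seen": len(pixels), "brightest": max(pixels, default=0)}


def run_pipeline(image: Image, tile: int = 2, threshold: int = 200) -> dict[str, int]:
    """Search first, then reason on the selected crops only."""
    boxes = search(image, tile, threshold)
    return reason([crop(image, b) for b in boxes])


# A 4x4 image with a single bright "clue" in the top-right corner.
image = [
    [0, 0, 0, 255],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
result = run_pipeline(image)
```

In this toy case the reasoning stage touches 4 of the 16 pixels, and either stage could be swapped out without changing the other, which is the point of the separation.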
Even the most brilliant detective can't solve a case without finding the clues first.
Yet most "thinking" VLMs make this exact mistake: they entangle visual search and complex logic into one giant, expensive chain of thought.
Stop burning through tokens in the dark. Ignite a SPARC. ⚡
Thanks! Love to be on the list!
👋🏻