The tool sharpened the toolmaker.
Through teaming with AI, I've picked up habits that show up whether AI is involved or not. Lateral reframes. Anticipating blind spots. Questioning first instincts.
The real ROI from AI won't come from a better model. It'll come from better thinkers.
Posts by Tyler Steben
Used AI to stress-test a go-to-market decision. Ran the problem through multiple frameworks.
Most valuable output: realizing I was asking the wrong question entirely. That reframe changed my execution.
AI didn't do a task faster. It changed the quality of the thinking.
ATMs didn't kill bank teller jobs. The iPhone did. Great piece by David Oks on paradigm replacement vs task substitution.
Most AI usage today is still in the ATM phase. The interesting part starts when it breaks out.
davidoks.blog/p/why-the-atm-didnt-kill...
Claude Code + @obra.bsky.social's episodic memory plugin + Obsidian as a live knowledge base. Every conversation builds on the last instead of starting from zero. Paper summary becomes research direction becomes prototype becomes automated workflow. Outputs become inputs. Context compounds.
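The "outputs become inputs" loop can be sketched in a few lines. This is a minimal illustration assuming a plain-text vault (an Obsidian-style folder) as shared memory; the function and path names are mine, not the plugin's actual API.

```python
# Sketch of the "outputs become inputs" loop: each session reads every
# prior note on a topic, then writes its own output back as a new note.
from pathlib import Path
from datetime import date

VAULT = Path("vault")

def recall(topic: str) -> str:
    """Load all prior notes on a topic so the next session starts warm."""
    notes = sorted(VAULT.glob(f"{topic}-*.md"))
    return "\n\n".join(n.read_text() for n in notes)

def remember(topic: str, output: str) -> None:
    """Persist this session's output; it becomes the next session's input."""
    VAULT.mkdir(exist_ok=True)
    (VAULT / f"{topic}-{date.today()}.md").write_text(output)

# paper summary -> research direction -> prototype notes, each building
# on accumulated context instead of starting from zero
context = recall("episodic-memory")
remember("episodic-memory", context + "\nNew finding: context compounds.")
```

The point isn't the storage mechanism; it's that every session's output is addressed to the next session's input.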
The AI agent market has a blind spot: cross-user agent collaboration.
Everyone's optimizing for "my agent does more for me." The interesting problem is what happens when your agent needs to coordinate with someone else's agent on shared work.
https://github.com/a2aproject/A2A
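What cross-user coordination looks like in miniature: one user's agent delegates a shared task to another user's agent and collects the result. This is a toy message shape of my own, not the actual A2A protocol schema (see the repo for the real spec).

```python
# Toy cross-user agent coordination: structured task handoff between
# agents owned by different people. Illustrative only, not A2A.
from dataclasses import dataclass, field

@dataclass
class Task:
    description: str
    status: str = "open"
    notes: list = field(default_factory=list)

class Agent:
    def __init__(self, owner: str):
        self.owner = owner

    def delegate(self, other: "Agent", task: Task) -> Task:
        """Hand a shared task to another user's agent, collect the result."""
        return other.handle(task, requested_by=self.owner)

    def handle(self, task: Task, requested_by: str) -> Task:
        task.notes.append(f"{self.owner} handled request from {requested_by}")
        task.status = "done"
        return task

mine, theirs = Agent("alice"), Agent("bob")
result = mine.delegate(theirs, Task("draft shared launch plan"))
```

The hard parts a real protocol has to solve are exactly what this toy skips: discovery, auth, and a shared task vocabulary across orgs.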
Endsley & Kiris 1995: operators who only supervised automated systems lost situation awareness. Those who stayed hands-on kept it. For AI agent workflows: don't just review output all day. Do some of the work yourself. It keeps your brain in the loop. https://is.gd/yWoa23
Book pairing from Rob D. Willis: Thinking in Systems (Meadows) for feedback loops, flows, delays. Good Strategy Bad Strategy (Rumelt) for the strategic kernel: diagnosis + guiding policy + coherent action. One shows what's happening. The other shows where to act. is.gd/MFurMO
Fox et al. 2005 (PNAS): brain runs two anticorrelated networks. Focus vs rest/reflection. Sustained AI agent supervision keeps focus on, blocking recovery. The drain after reviewing agent output all day is measurable, not imagined. https://is.gd/qO3jm1
YouTube: $62B revenue (2025), surpassing Disney's $60.9B media revenues. MoffettNathanson values it at $500-560B standalone. AI touches 95% of user interactions per CEO Neal Mohan. 1M+ channels used AI creator tools in Dec 2025 alone. is.gd/hSvdDb
Warm et al. 2008 measured blood flow to attention regions during sustained monitoring. Performance drops within 15 min. This was studied in pilots and plant operators, but the same brain regions fire when you're supervising AI agents all day. https://is.gd/Wv8dAB
Sivulka (Hebbia CEO) via a16z.news: enterprise AI adoption repeats 1890s electrification. Companies replaced the motor, kept the factory layout for 30 years. His 7 pillars of Institutional AI are worth reading if you build org-level AI systems. is.gd/9yqNPg
Ibrahim Bashir's invisible inventory problem applied to agents: when machines evaluate products, they parse APIs and metadata, not feature pages. Product marketing has taught humans about capabilities for decades. Nobody has figured out machine-readable capability discovery yet. is.gd/mZ7uTn
Qwen3.5 35B-A3B: 3B active params, outperforms Qwen3-235B on benchmarks. Alibaba's MoE architecture puts near-Sonnet 4.5 performance on consumer hardware with <4GB VRAM active. Cost-to-capability ratio for open-source is collapsing. tinyurl.com/2dq8597f
Jeff Dean on Latent Space: Google designs TPUs around picojoules per operation, not FLOPs. At 50 AI agents per engineer, bottleneck shifts from model quality to watts. Compute scales with money, energy scales with physics. Grid capacity takes longer than model dev. is.gd/f84jV5
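The picojoules-per-operation framing is just arithmetic: energy per op times sustained throughput gives power draw. The numbers below are illustrative assumptions, not TPU specs.

```python
# Back-of-envelope for the watts-vs-FLOPs framing. All figures are
# assumed round numbers for illustration, not real hardware specs.
PJ = 1e-12  # joules per picojoule

def watts(pj_per_op: float, ops_per_sec: float) -> float:
    """Power draw = energy per operation * operations per second."""
    return pj_per_op * PJ * ops_per_sec

# 1 pJ/op sustained at 10^15 ops/s -> 1 kW per accelerator
per_chip = watts(1.0, 1e15)

# 50 agents per engineer, each pinning (say) a tenth of a chip on
# average -> a continuous 5 kW per engineer
per_engineer = 50 * 0.1 * per_chip
```

Model quality improves with money; that last number only improves with physics and grid buildout.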
NotebookLM workflow: 6 textbooks + 15 papers + all lecture transcripts. Ask for expert mental models, fundamental disagreements, and diagnostic questions instead of summaries. MIT grad student (Ihtesham Ali) compressed a semester into 48 hours using this approach. is.gd/MIlKpB
Google's Gemini Deep Think paper frames AI not as a research tool but as a creative research partner. Iterative refinement + cross-disciplinary transfer. The useful moments aren't answers, they're reframed questions. tinyurl.com/2d6khx5m
Jenny Wen (Anthropic) on Lenny's Podcast, on AI-era design hiring: (1) strong generalist who ships across disciplines, (2) deep craft specialist, (3) prototyper-builder who works directly in code. The prototyper-builder is newest and most in-demand. Mock-first designers are being routed around. is.gd/sGSBwz
Brian Flynn's sharpest point: if your service can't be discovered by a machine, it doesn't exist to agents. Product teams build human personas. Nobody's building agent personas yet. That's a gap. @Flynnjamm is.gd/mdvtOr
AI adoption pattern I keep seeing in every context (work, side projects, casual conversations): build small thing, use it until it breaks, rebuild. Three cycles in and people attempt work they never would have tried. The tool didn't improve. Their confidence did.
Barclays analysts: AI-powered robotics market potential is $1T+ by 2035. Path: autonomous vehicles first (already advanced), then drones, then general-purpose humanoids. Amazon has 1M+ robots deployed. The supply chain shift is logistics first, not factory floor.
AI agents collapse the search and evaluation costs that made building in-house worth it. If an agent can find, evaluate, and call an API in seconds, the build-vs-buy math shifts hard toward buy.
That's a procurement change, not just a tech change.
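A toy version of that math makes the shift concrete. All numbers below are illustrative assumptions, not real procurement figures.

```python
# Toy build-vs-buy comparison: collapsing search/evaluation costs
# flip the decision. Every number here is an assumed placeholder.
def total_cost(search: float, evaluate: float, integrate: float,
               recurring: float, years: int) -> float:
    return search + evaluate + integrate + recurring * years

# Pre-agent: weeks of vendor search and evaluation made "buy" expensive.
buy_manual = total_cost(search=20_000, evaluate=30_000,
                        integrate=10_000, recurring=20_000, years=3)
build      = total_cost(search=0, evaluate=0,
                        integrate=80_000, recurring=10_000, years=3)

# With an agent doing discovery and evaluation in seconds, those
# terms collapse and "buy" wins on integration cost alone.
buy_agent  = total_cost(search=0, evaluate=0,
                        integrate=10_000, recurring=20_000, years=3)
```

Same vendor, same product; only the transaction costs moved, and the answer flipped.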
Platonic Representation Hypothesis (Huh et al., ICML 2024): as models scale, internal representations converge regardless of architecture. 79% of creative reruns exceed 0.8 cosine similarity. Fix is structural diversity, not better prompts. tinyurl.com/2735l8ok
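The 0.8 cosine-similarity threshold is easy to check directly. The vectors below are toy stand-ins for model representations, not data from the paper.

```python
# Cosine similarity from scratch; reruns landing above 0.8 count as
# convergent under the post's threshold. Vectors are illustrative.
import math

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

run_a = [0.9, 0.1, 0.4]
run_b = [0.8, 0.2, 0.5]
convergent = cosine(run_a, run_b) > 0.8  # these two "creative" reruns converge
```

Which is why the fix is structural diversity in the generation process, not prompt wording: rephrasing the prompt doesn't move the vectors.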
Karpathy's autoresearch: 630 lines, 1 GPU, ~12 experiments/hour, 18k stars in days. Full autonomous ML loop — edits code, trains, scores, keeps or reverts, repeats.
630 lines is now the threshold where non-coders + AI assistants can read and adapt research-grade tooling.
https://is.gd/Zohl6B
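The edit-train-score-keep-or-revert loop fits in a few lines. This is a sketch of the loop's shape, not Karpathy's actual code; the propose/score functions are stand-ins.

```python
# Autonomous research loop, sketched: propose an edit, run it, keep
# improvements, revert regressions. Stand-in functions, not the repo.
import random

def propose_edit(code: dict) -> dict:
    """Stand-in for an LLM code edit: perturb one hyperparameter."""
    new = dict(code)
    new["lr"] = code["lr"] * random.choice([0.5, 2.0])
    return new

def train_and_score(code: dict) -> float:
    """Stand-in for a training run; score peaks at lr = 0.01."""
    return -abs(code["lr"] - 0.01)

best = {"lr": 0.1}
best_score = train_and_score(best)
for _ in range(20):                  # ~12 experiments/hour in the real loop
    candidate = propose_edit(best)
    score = train_and_score(candidate)
    if score > best_score:           # keep improvements...
        best, best_score = candidate, score
    # ...otherwise revert, i.e. leave `best` unchanged
```

The whole trick is that "keep or revert" makes the score monotone, so the loop can run unattended.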
Dario Amodei on Dwarkesh, on Anthropic's unit economics: every model ships profitably, but the company still loses money overall. A one-year revenue miss = bankruptcy risk, while Dario puts AGI at 50/50 in 12-24 months. That kind of pressure either sharpens every decision or breaks the team. The financial structure of frontier AI labs is genuinely unusual right now. is.gd/GfOrd7
Jeff Dean on Latent Space: the step CS classes taught but nobody practiced — writing specs — is now the most important artifact when coding agents execute your instructions. 50 AI interns per engineer, managed via specs. https://is.gd/f84jV5
4% of GitHub public commits today are Claude Code-authored. SemiAnalysis projects 20%+ by EOY 2026. Viral growth inflection: October 2025. Second surge: January 2026 after Boris Cherny's 4.4M-view post. Task horizon doubling every 4-7 months.