
Posts by Cosmo Santoni


By removing host-device round-trips from the autoregressive loop, this runs unmodified across platforms while hitting up to 64% of memory-bandwidth utilisation on a single batch on Cloud TPU v6e.
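A minimal sketch of the idea (my own illustration, not the actual model code): if the whole decode loop is expressed as a single jit-compiled `jax.lax.scan`, XLA keeps every step on the accelerator, whereas a Python `for` loop that pulls each token back to the host pays a round-trip per step. The toy "model" below is just a tanh mixing step.

```python
# Hypothetical sketch: an autoregressive decode loop that stays
# entirely on-device via jax.lax.scan under one jit compilation.
from functools import partial

import jax
import jax.numpy as jnp

key = jax.random.PRNGKey(0)
W = jax.random.normal(key, (8, 8))  # stand-in weights, not a real model


def decode_step(state, _):
    # `state` plays the role of the fixed-shape on-device cache.
    new_state = jnp.tanh(state @ W)       # stand-in for one decode step
    token = jnp.argmax(new_state[-1])     # greedy pick, never leaves device
    return new_state, token


@partial(jax.jit, static_argnums=1)       # loop length must be static
def decode(init_state, n_tokens):
    # One compiled scan = the whole loop runs on the accelerator;
    # no per-token host sync.
    _, tokens = jax.lax.scan(decode_step, init_state, None, length=n_tokens)
    return tokens


tokens = decode(jnp.zeros((4, 8)), 16)
print(tokens.shape)  # (16,)
```

The design point is that `scan` gives XLA the full loop body to fuse and schedule, which is what makes the bandwidth-bound regime reachable without platform-specific kernels.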

#DeepLearning #OpenSource #Flax #Python

1 month ago

#MachineLearning state-space models are incredible, but custom CUDA kernels lock that performance to NVIDIA hardware.

My latest #JAX port maps the SSD algorithm to XLA passes, achieving true O(1) on-device caching across CPU, GPU & #TPU.
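The O(1) claim follows from the recurrent view of an SSM: unlike a transformer's KV-cache, which grows with sequence length, the decode "cache" is one fixed-size state vector. A toy diagonal recurrence (my illustration with made-up dimensions, not the SSD kernels themselves) makes this concrete:

```python
# Illustrative sketch: why SSM decoding needs only O(1) cache memory.
# The entire per-sequence state is a fixed-size vector, however many
# tokens have been processed. All values here are hypothetical.
import numpy as np

d_state, d_model = 16, 8
rng = np.random.default_rng(0)
A = -0.1 * rng.random(d_state)            # diagonal state transition (toy)
B = rng.normal(size=(d_state, d_model))   # input projection
C = rng.normal(size=(d_model, d_state))   # output projection

h = np.zeros(d_state)                     # the whole "cache": fixed size


def step(h, x):
    h = np.exp(A) * h + B @ x             # discretised diagonal recurrence
    y = C @ h                             # readout
    return h, y


for _ in range(1000):                     # 1000 tokens; cache never grows
    h, y = step(h, rng.normal(size=d_model))

print(h.shape)  # (16,) regardless of sequence length
```

Because the recurrence is just elementwise and matmul ops, XLA can compile it for any backend, which is the portability argument in the post.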

Pre-print 👉 huggingface.co/papers/2603....

#AI #SSM

1 month ago

🧵Pt3. Demo of what it looks like in practice: below you can see context usage of a snapshot with and without trimming. 50% reduction on this particular task.

2 months ago

🧵Pt2. A typical 150k-token session is 60-70% tool-result dumps and base64 signatures Claude has already processed. Trim strips those and keeps every message. Observed 50% reduction in real sessions. Unlike /compact, nothing gets summarised away.
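The core of that idea fits in a few lines. This sketch is mine, and the message format, role names, and size threshold are all illustrative, not the actual Claude Code session schema:

```python
# Hypothetical sketch of "trim": replace bulky tool-result payloads
# with a placeholder while keeping every message in place, so the
# conversation structure survives unlike a summarising /compact.
def trim_session(messages, max_len=200, placeholder="[tool output trimmed]"):
    trimmed = []
    for msg in messages:
        if msg.get("role") == "tool" and len(msg.get("content", "")) > max_len:
            # Keep the message, drop the bulk it carries.
            msg = {**msg, "content": placeholder}
        trimmed.append(msg)
    return trimmed


session = [
    {"role": "user", "content": "analyse the repo"},
    {"role": "tool", "content": "x" * 100_000},  # huge tool-result dump
    {"role": "assistant", "content": "Here is the summary..."},
]

slim = trim_session(session)
assert len(slim) == len(session)  # every message survives
```

Filtering rather than summarising is what makes the operation lossless for the conversational turns: only payloads the model has already consumed get dropped.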

Analysis breakdown here github.com/CosmoNaught/...

2 months ago

🧵 Built git-style versioning for Claude Code sessions. Snapshot context, branch for different tasks, trim the bloat. 40 mins of codebase analysis reused across 5 tasks instead of re-explaining from scratch.

github.com/CosmoNaught/...

#ClaudeCode #DevTools #Anthropic #Claude #LLM #AIAgents

2 months ago

Took a much-needed screen break this week to answer the age-old question: how many epidemiology PhD students does it take to build a gingerbread house? 🤔

@nderqui.bsky.social @olliesimmons.bsky.social @cosmosantoni.bsky.social Sam Hemmings @mrc-outbreak.bsky.social

1 year ago