Posts by brendan chambers
& those blocks of color really lighten the massing and fade it to an unimposing scale
Park Circle became Park Oracle. Patterson Park became Pattereon Park. Alan Wright Park became something that cannot be pronounced by humans.
tried Photoshop's new upscaler for a map that's getting printed and... it renamed all our streets and parks
Including link to another recent fast-slow optimizer manuscript arxiv.org/pdf/2510.15830
Really interesting read. There was some nice discussion on X here: x.com/rosinality/s...
this motivates a strong fast-slow optimizer framed as Nexus Regularization
Nexus: Same Pretraining Loss, Better Downstream Generalization
arxiv.org/pdf/2604.09258
From a theoretical perspective, the authors emphasize the closeness of learned minima across domains as a driver of downstream generalization, reducing interference without changing the data mixture
I can't go back to the regular YouTube UI after this 😅
Obsidian Reader now makes the transcript interactive so you can scrub, highlight, auto-scroll. It feels so nice.
claude mythos is far more token-efficient than opus. continuing the trend
A carpet of pink and purple bluebells, with some bare trees dotted around and some straggly bits of holly and ivy. Some low sunlight is coming through the trees making some of the bluebells appear pink, not purple.
An hour later than the previous shot and the light had intensified.
#bluebells #woodlandphotography #springflowers
Google releases Gemma 4. ✨
Gemma 4 introduces 4 models: E2B, E4B, 26B-A4B, 31B.
The multimodal reasoning models are under Apache 2.0.
Run E2B and E4B on ~6GB RAM, and on phones.
Run 26B-A4B and 31B on ~18GB.
GGUFs: huggingface.co/collections/...
Guide: unsloth.ai/docs/models/...
I tried out the Armenian thing with Claude and I am shocked at the level of self observation it's capable of here. I've never ever seen a model see itself bug out in some way and then notice it and attribute it to the tokenizer (possibly correct, or just very plausible) like this before
PrismML releases a 1-bit LLM (open-weight): an 8B LLM that fits in 1.15GB of VRAM
Website: prismml.com
Blog: prismml.com/news/bonsai-8b
HuggingFace: huggingface.co/collections/...
gonna slap a 'i used uv before they got bought by sama' sticker on my laptop
For people who are just learning about Nemotron with the awesome Nemotron 3 Super drop, I recommend watching this interview I did with one of the leads, Bryan Catanzaro -- Nemotron as a project has been a LONG time coming.
www.youtube.com/watch?v=Y3Vb...
But in the backward pass, the story is much worse. Gradients get compressed via projection onto a D-dimensional subspace, and most of the training signal simply vanishes.
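To make the signal loss concrete, here is a toy NumPy sketch (my own illustration, not the post's or any paper's actual code; the dimensions are made up): projecting an isotropic gradient onto a random D-dimensional subspace retains only about sqrt(D/n) of its norm.

```python
import numpy as np

rng = np.random.default_rng(0)
n, d = 4096, 64  # full gradient dim, subspace dim (hypothetical values)

# A dense "gradient" vector.
g = rng.standard_normal(n)

# Orthonormal basis for a random d-dimensional subspace of R^n.
Q, _ = np.linalg.qr(rng.standard_normal((n, d)))

# Compress by projecting onto the subspace, then lift back to n dims.
g_proj = Q @ (Q.T @ g)

# For an isotropic gradient, the retained norm concentrates near
# sqrt(d/n) = sqrt(64/4096) = 0.125 -- most of the signal is gone.
retained = np.linalg.norm(g_proj) / np.linalg.norm(g)
print(f"fraction of gradient norm retained: {retained:.3f}")
```

With these numbers the projection keeps roughly 12–13% of the gradient norm, which is the "training signal simply vanishes" effect in miniature.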
Common Corpus just breaking 1M downloads: it took some time but open data in ai is actually popular.
A little offended Grammarly didn't make a sloppelganger of me
“autoresearch” micro teaching repo from Karpathy
readme edits seem like such a nice dx for open ended hparam tuning, and maybe other kinds of hill climbing too, so much less painful than the old days
github.com/karpathy/aut...
It was all about spying on Americans: www.theatlantic.com/technology/2...
FlashSampling: Fast and Memory-Efficient Exact Sampling
Paper: flashsampling.github.io/FlashSamplin...
We analyzed 250K+ queries & 430K+ clickstream interactions from Asta, our AI-powered research assistant—and today we're releasing the full dataset. How do researchers actually use AI science tools? Here's what we found. 🧵
new blog post on permissioned data in atproto! this one introduces "buckets", the protocol-level primitive for shared access control. I walk through two approaches that don't quite work and land on something that I think does
let me know your thoughts!
tldr iiuc we are once again enclosing the commons and industrializing craft, dispossessing laborers while apotheosizing capital, and to slow down this doomloop we need to innovate new collectives and public goods
This has a very cool result on in-context learned classification tasks, where they disentangle representational quality (how well-separated concept labels are) and readout alignment (how good it is at reading out its own inner labels). Adding demo examples helps through readout, not representations!
Designing around the tight bottleneck on latency and throughput that separates local and cloud compute is such an interesting problem. Significant challenges though
Anti-homeless benches in Pokemon Legends ZA
why is there anti-homeless architecture in pokemon
A year ago, data center developers were focused on connecting to the grid. Today roughly 1/3 of all planned capacity is onsite power - and 72% of that planned capacity is fossil gas. Homer City PA's data center project could soon be one of the largest single sources of carbon emissions in the US.