it's Space Quebec, because they speak a language with the same phonology as French, which is incomprehensible to Francophones
there’s been some interesting work lately on multiscale autoregressive image modeling arxiv.org/abs/2404.029...
city of stairs and the tainted cup both very good 👍
once again coming crawling back to AdamW after every paper published after 2015 has failed me
i think they’re just starting to realize what they unleashed w the tweet
from the other app xd
Love to log on to the Horrors app to catch up on today's Horrors
we’re doing ai exorcisms in 2025 huh
like one of the big things about 404 media, Brian Merchant, Paris Marx, Ed Zitron, etc., is that they neither know nor care how the subject of their criticism actually works
I wish academic ML was a bit more skeptical of papers and less skeptical of industry. I get that it sucks to not have visibility on details, but that doesn't invalidate the results. On the flip side, there are too many papers whose messages are parroted despite sketchy experiments.
we’re going shopping
an LLM that uses streetview to pre-drive down the route and assemble comments like "at the big red barn, turn left" "when you get to the sorta squiggly road, take the exit" like a farmer would
if you squint hard enough everything in ml is either a special case of the KL div or newton’s method
a lot of machine learning research is about discovering which parts of mathematics are actually L2 regularization and which parts of mathematics are actually Adam
justine tunney the libc mutex micro-optimizations person??
This guy needs to read Manufacturing Consent! You’re not supposed to do this yourself you gotta hire editors who already agree with you, this is amateur hour shit…
pov: post-training researchers learning what pretraining researchers do while waiting for the model to train
accidentally typed rm -fr and i’m using that now
congrats!!
thanks for cleaning it up
ai generated slavoj zizek voice on slop video of some bizarre rural chinese cooking
incredible new forms of postings emerging
interesting, is there anywhere i can read more about this
it's all approximating numbers w other numbers all the way down. everything else is an implementation detail! 😛
if your values do matter replace them w values similar to them aka ones (parameter sharing / shared kv cache / factorizing a large matrix into two small ones / lora / adafactor)
i love how every efficiency advance in machine learning is approximating [expensive operation] by either ones (just pass it straight through) or zeros (doesn’t matter, just don’t compute it/sparsity)
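for concreteness, a minimal numpy sketch (my own illustration, not from the posts above) of the "factorize a large matrix into two small ones" flavor of this: replace a dense d×d matrix with a rank-r product B @ A, so you store and multiply roughly 2·d·r numbers instead of d·d. the dimensions and rank here are made-up placeholders.

import numpy as np

d, r = 1024, 16                      # full width vs. assumed low rank
W = np.random.randn(d, d)            # stand-in for an expensive dense weight matrix

# best rank-r approximation via truncated SVD: W ≈ B @ A
U, S, Vt = np.linalg.svd(W, full_matrices=False)
B = U[:, :r] * S[:r]                 # shape (d, r)
A = Vt[:r, :]                        # shape (r, d)

x = np.random.randn(d)
y_full = W @ x                       # ~d*d multiply-adds
y_lowrank = B @ (A @ x)              # only ~2*d*r multiply-adds

# relative error; large here because a random matrix isn't low rank,
# but trained weights / weight updates tend to be far more compressible
print(np.linalg.norm(y_full - y_lowrank) / np.linalg.norm(y_full))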