Posts by bread
I made a mistake here btw. Quadratic is LESS than exponential, not more. That's what I get for squeezing my words into just 240 characters haha
I'm not joking
my fish tank caught fire
To answer your other question: I mean simpler ML models that are purpose-built. Image classifiers, sentiment analysis, etc. that are already good at what they do and don't require outputting English sentences
I just believe that we've hit our limits with this type of model, and at its current scale it is already destroying our environment.
However, to match even a percentage of human working memory, the context window needs to be huge, and that inference will get slower. More than exponentially slower, because this is a quadratic complexity problem.
Even enterprise codebases, docs, etc. will need far more than 1M tokens.
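For the curious, here's roughly what "quadratic" means in practice. This is a toy NumPy sketch of plain scaled dot-product attention that I wrote purely for illustration (not any real model's kernel; the names are mine): the score matrix is seq_len × seq_len, so doubling the context quadruples the work and memory for that step.

```python
# Toy scaled dot-product attention (illustrative only, not a production kernel)
import numpy as np

def attention(Q, K, V):
    """Q, K, V: (seq_len, d) arrays for a single head."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # (seq_len, seq_len): the quadratic term
    # Row-wise softmax over the scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

n, d = 1_024, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = attention(Q, K, V)  # at n=2_048 the scores matrix is 4x as big
```

That n×n matrix is the reason 1M-token contexts get so expensive without attention optimizations.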
MoE is interesting and may be what tops-out last, but Attention is the other problem.
The model needs to know who or what "he/she/it/they" refers to, and needs to know that "John" is the same "John" you mentioned earlier
This issue is what dictates context size. 1M is crazy compared to where we started
MoE will get slightly (and I mean sub-20%) better with more high quality training data and more specific "expert" models, but they take a lot of VRAM to hold all the experts.
Routing the experts from outside the model is hard without a huge inference time penalty, so they must stay as 1 huge model.
You came near the idea that we could make smaller "expert" models. Add a router layer and you have MoE (mixture of experts) which has made the models perform better.
Every new model today uses MoE already.
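To make the "router layer" idea concrete, here's a toy top-k MoE layer in NumPy (purely illustrative; no real model does it exactly like this, and every name here is made up):

```python
# Toy top-k mixture-of-experts layer (illustrative only)
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, k = 64, 8, 2

# One weight matrix per "expert"; all of them must be resident in memory.
experts = [rng.standard_normal((d, d)) * 0.02 for _ in range(n_experts)]
router_W = rng.standard_normal((d, n_experts)) * 0.02  # the router layer

def moe_layer(x):
    """x: (d,) token vector. Route to the top-k experts and mix outputs."""
    logits = x @ router_W            # (n_experts,) routing scores
    top = np.argsort(logits)[-k:]    # pick the k best-scoring experts
    gates = np.exp(logits[top])
    gates /= gates.sum()             # softmax over just the chosen experts
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

y = moe_layer(rng.standard_normal(d))
```

Note that only k experts actually run per token (cheap compute), but all n_experts weight sets still have to sit in VRAM, and the router is just another layer inside the model itself, which is the VRAM and routing-latency point above.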
I'll try not to ramble. We have hit a ceiling, and may soon top-out modern architecture.
The big discoveries that led to modern LLMs were:
- Bigger = Better, which is antithetical to scaling
- The "Attention is all you need" paper solving attention (so the LLM can "understand" things like pronouns)
ML models should go back to being purpose built models that don't speak
LLMs _will_ crash and burn economically
all that will be left is the discovery of attention optimizations in ML, which is interesting, but also shows how dumb and inefficient these models really are, and how attention tricked us into thinking the models know what we're saying
while I am anti-LLM, I did get lost in training my own to test a hyper-specific use case. I also now see how hard the optimization problems are
TLDR: this shit doesn't scale
that is inherent to the discovery that bigger = better, and attention being quadratic complexity with few optimizations
Haha I forgot about it being a clone clone but you're right!
I never really used it for a 2nd color either. However, I have used the second nozzle for EVA filament (what glue sticks are made of) to print flexible, eco-friendly supports. That way I could print ABS with no support marks.
Funny enough, mine is a QIDI TECH I, a clone of your exact FlashForge haha
During this I lost a belt bearing, so I need to wait for that, and it'll be cutting it close because he'll be in town Wednesday
I offered a coworker one of my 3D printers - which had been sitting idle for 5y - for his son, like a year ago, and I forgot about it since then. He asked me about it again today...
so anyway, I'm 6 hours deep in fixing it, replacing cables, a driver from spares, and a missing motor...
Well. It happened. It's a cold day in hell. It's projected to snow 3-5 inches in coastal South Carolina, and it's starting now.
data mining is such a scary word, but it's accurate. how do you take unrefined data and make it make sense? you mine it
I like my teeth like I like my women: not exploded
creating new badges for my site
Wolfenstein has the opportunity to do the funniest thing by releasing a game with a new, modern enemy
load bearing tweet for a settler colonial society
A cropped photo of a Game Boy Micro screen, displaying an image from The Legend of Zelda: The Minish Cap (GBA). The hero, Link, is wearing a green tunic while standing in the middle of a grassy forest. There's some blue water nearby, and shafts of sunlight gently illuminate the right side of the image.
What a pretty game :)
I swear to god some of these corporate news outlets would open a window right now and say "We can't confirm it, but it appears that the ground is allegedly being snowed upon."
fundraiser livestream from Minneapolis supporting Women's Foundation of Minnesota Immigrant Rapid Response Team
livestream:
www.twitch.tv/deluxevideod...
support here!
tiltify.com/@deluxevideo...
A Game Boy Camera photo of my fiancé and me
A Game Boy Camera photo of my cat, Rupee
A Game Boy Camera photo of my other cat, Korra
family photos
a meme with two hands holding words: "the deep desire to be creative" vs "the inherent need to recover my energy from surviving reality"
I feel seen
I dislike the tertiary character. He's a bit of a square.