Posts by bread
I made a mistake here btw. Quadratic is LESS than exponential, not more. That's what I get for squeezing my words into just 240 characters haha
I'm not joking
my fish tank caught fire
To answer your other question: I mean simpler ML models that are purpose-built. Image classifiers, sentiment analysis, etc. that are already good at what they do and don't require outputting English sentences
I just believe that we've hit our limits with this type of model, and at its current scale it is already destroying our environment.
However, to match even a percentage of human working memory, the context window needs to be huge, and that inference will get slower. More than exponentially slower, because this is a quadratic complexity problem.
Even enterprise codebases, docs, etc. will need far more than 1M tokens.
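For the curious, here's roughly what "quadratic" means in practice. This is a toy NumPy sketch of plain scaled dot-product attention that I wrote purely for illustration (not any real model's kernel; the names are mine): the score matrix is seq_len × seq_len, so doubling the context quadruples the work and memory for that step.

```python
# Toy scaled dot-product attention (illustrative only, not a production kernel)
import numpy as np

def attention(Q, K, V):
    """Q, K, V: (seq_len, d) arrays for a single head."""
    d = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d)  # (seq_len, seq_len): the quadratic term
    # Row-wise softmax over the scores
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

n, d = 1_024, 64
rng = np.random.default_rng(0)
Q, K, V = (rng.standard_normal((n, d)) for _ in range(3))
out = attention(Q, K, V)  # at n=2_048 the scores matrix is 4x as big
```

That n×n matrix is the reason 1M-token contexts get so expensive without attention optimizations.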
MoE is interesting and may be what tops-out last, but Attention is the other problem.
The model needs to know who or what "he/she/it/they" refers to, and needs to know that "John" is the same "John" you mentioned earlier
This issue is what dictates context size. 1M is crazy compared to where we started
MoE will get slightly (and I mean sub-20%) better with more high quality training data and more specific "expert" models, but they take a lot of VRAM to hold all the experts.
Routing the experts from outside the model is hard without a huge inference time penalty, so they must stay as 1 huge model.
You came near the idea that we could make smaller "expert" models. Add a router layer and you have MoE (mixture of experts) which has made the models perform better.
Every new model today uses MoE already.
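To make the "router layer" idea concrete, here's a toy top-k MoE layer in NumPy (purely illustrative; no real model does it exactly like this, and every name here is made up):

```python
# Toy top-k mixture-of-experts layer (illustrative only)
import numpy as np

rng = np.random.default_rng(0)
d, n_experts, k = 64, 8, 2

# One weight matrix per "expert"; all of them must be resident in memory.
experts = [rng.standard_normal((d, d)) * 0.02 for _ in range(n_experts)]
router_W = rng.standard_normal((d, n_experts)) * 0.02  # the router layer

def moe_layer(x):
    """x: (d,) token vector. Route to the top-k experts and mix outputs."""
    logits = x @ router_W            # (n_experts,) routing scores
    top = np.argsort(logits)[-k:]    # pick the k best-scoring experts
    gates = np.exp(logits[top])
    gates /= gates.sum()             # softmax over just the chosen experts
    return sum(g * (x @ experts[i]) for g, i in zip(gates, top))

y = moe_layer(rng.standard_normal(d))
```

Note that only k experts actually run per token (cheap compute), but all n_experts weight sets still have to sit in VRAM, and the router is just another layer inside the model itself, which is the VRAM and routing-latency point above.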
I'll try not to ramble. We have hit a ceiling, and may soon top-out modern architecture.
The big discoveries that led to modern LLMs were:
- Bigger = Better, which is antithetical to scaling
- The "Attention is all you need" paper solving attention (so the LLM can "understand" things like pronouns)
ML models should go back to being purpose built models that don't speak
LLMs _will_ crash and burn economically
all that will be left is the discovery of attention optimizations in ML, which is interesting, but also shows how dumb and inefficient these models really are, and how attention tricked us into thinking the models know what we're saying
while I am anti-LLM, I did get lost in training my own to test a hyper-specific use case. I also now see how hard the optimization problems are
TLDR: this shit doesn't scale
that is inherent to the discovery that bigger = better, and attention being quadratic complexity with few optimizations
Haha I forgot about it being a clone clone but you're right!
I never really used it for a 2nd color either. However, I have used the second nozzle for EVA filament (what glue sticks are made of) to print flexible, eco-friendly supports. That way I could print ABS with no support marks.
Funny enough, mine is a QIDI TECH I, a clone of your exact FlashForge haha
During this I lost a belt bearing, so I need to wait for that, and it'll be cutting it close because he'll be in town Wednesday
I offered a coworker one of my 3D printers - which had been sitting idle for 5y - for his son, like a year ago, and I forgot about it since then. He asked me about it again today...
so anyway, I'm 6 hours deep in fixing it, replacing cables, a driver from spares, and a missing motor...
Well. It happened. It's a cold day in hell. It's projected to snow 3-5 inches in coastal South Carolina, and it's starting now.
data mining is such a scary word, but it's accurate. how do you take unrefined data and make it make sense? you mine it
I like my teeth like I like my women: not exploded
creating new badges for my site
Wolfenstein has the opportunity to do the funniest thing by releasing a game with a new, modern enemy
load bearing tweet for a settler colonial society
A cropped photo of a Game Boy Micro screen, displaying an image from The Legend of Zelda: The Minish Cap (GBA). The hero, Link, is wearing a green tunic while standing in the middle of a grassy forest. There's some blue water nearby, and shafts of sunlight gently illuminate the right side of the image.
What a pretty game :)
I swear to god some of these corporate news outlets would open a window right now and say "We can't confirm it, but it appears that the ground is allegedly being snowed upon."
fundraiser livestream from Minneapolis supporting Women's Foundation of Minnesota Immigrant Rapid Response Team
livestream:
www.twitch.tv/deluxevideod...
support here!
tiltify.com/@deluxevideo...
A Game Boy Camera photo of my fiancé and me
A Game Boy Camera photo of my cat, Rupee
A Game Boy Camera photo of my other cat, Korra
family photos
a meme with two hands holding words: "the deep desire to be creative" vs "the inherent need to recover my energy from surviving reality"
I feel seen
I dislike the tertiary character. He's a bit of a square.