The current approach is literally like doing precision-critical maths, then rounding to the nearest integer and complaining about a lack of accuracy
Posts by magnesit
Reasoning LLMs are the biggest compromise in machine learning - converting all those rich, contextual embeddings into discrete token IDs and then feeding them back. Awful.
I'm confident "they" will come up with something more efficient soon.
Please.
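To make the "rounding" analogy concrete, here's a toy numpy sketch (invented sizes and values, not any particular model): the full next-token distribution carries far more information than the single token ID that gets re-embedded and fed back in.

```python
import numpy as np

# Toy illustration only: sizes and values are invented, this is not any real model.
rng = np.random.default_rng(0)
vocab, d_model = 8, 4
embedding = rng.normal(size=(vocab, d_model))    # token ID -> embedding row

logits = rng.normal(size=vocab)                  # the rich, continuous "state"
probs = np.exp(logits) / np.exp(logits).sum()    # full next-token distribution

token_id = int(probs.argmax())                   # the "rounding to an integer" step
fed_back = embedding[token_id]                   # all the next step ever sees

# One way to see what got thrown away: the distribution-weighted embedding
# differs from the single re-embedded token.
expected = probs @ embedding
print("discarded information (L2 distance):", np.linalg.norm(expected - fed_back))
```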
the U.S. made tax exemptions for WHAT???
Wowza, that actually sounds pretty cool!
Thanks for deciding to make it open-source! 🫶
😭😭
These are tokens, generated by the model, paid for by the users. I don't know if that's supposed to drive profits up for the providers, but it feels so arbitrary to spend time waiting for the model to say that it's about to start thinking.
Like botched training. Strong models, but weirdly trained.
[Image: chain-of-thought of Gemma 4 E2B generating "Thinking process: 1. **Analyze the request**: ..."]
I don't like the chains of thought of the newer open-weight LLMs on the market. They just don't try to be efficient anymore. I know, it's supposed to be more structured and stuff, but I think leaving all the distillation artifacts from bigger models like the one marked in the image is unacceptable.
Really cool stuff. Haven't given much *Attention* (see what I did there?) to the bigger ones in the family, but I guess it only gets better from here!
Gemma 4 E2B is extremely impressive. I tried it out a little bit and must admit that it's got that "feeling" of being much stronger than many of its bigger counterparts.
I am especially blown away by the inference speed on my phone, because it's genuinely a bit faster than reading speed on a Pixel 7a.
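Quick back-of-envelope on what "faster than reading speed" means in tokens per second; the numbers below are assumptions, not measurements from the Pixel 7a.

```python
# Back-of-envelope only; reading speed and tokens-per-word are assumed values.
words_per_minute = 250            # typical adult silent-reading speed
tokens_per_word = 1.3             # rough rule of thumb for English tokenizers
reading_speed_tps = words_per_minute * tokens_per_word / 60
print(f"reading speed ~= {reading_speed_tps:.1f} tokens/s")   # ~5.4 tokens/s
# so a model decoding noticeably above ~6 tokens/s outpaces most readers
```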
Rare European L, unfortunately: the system has been heavily criticized. It certainly has upsides though; let's see what happens
When doing the exam, all you need to perform is a few matmuls here and there, and you'll have your own chatbot to support you with the toughest questions!
Follow for more undetectable life hacks.
I came up with a tremendous, nearly undetectable method to cheat in exams; instead of trying to smuggle in an AI assistant akin to ChatGPT to help you out, simply *remember* (!) the weights of an open, frontier LLM like, say, GLM 5.1.
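For the curious, the "few matmuls" in question, as a toy numpy sketch: tiny invented dimensions, one attention head, and no claim whatsoever about GLM 5.1's real architecture.

```python
import numpy as np

# Toy dimensions and random weights; a single attention head plus an MLP,
# i.e. "a few matmuls". Not GLM 5.1's actual architecture.
rng = np.random.default_rng(0)
seq_len, d_model = 3, 8

x = rng.normal(size=(seq_len, d_model))                      # the embeddings you "remembered"
Wq, Wk, Wv, Wo = (rng.normal(size=(d_model, d_model)) for _ in range(4))

q, k, v = x @ Wq, x @ Wk, x @ Wv                             # matmuls 1-3
scores = q @ k.T / np.sqrt(d_model)                          # matmul 4
scores += np.triu(np.full((seq_len, seq_len), -1e9), k=1)    # causal mask
weights = np.exp(scores) / np.exp(scores).sum(axis=-1, keepdims=True)
attn = (weights @ v) @ Wo                                    # matmuls 5-6

W1 = rng.normal(size=(d_model, 4 * d_model))
W2 = rng.normal(size=(4 * d_model, d_model))
out = np.maximum(attn @ W1, 0) @ W2                          # matmuls 7-8 (ReLU MLP)
print(out.shape)                                             # (3, 8): one block, done "by hand"
```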
I welcome the two-week ceasefire the US and Iran agreed last night. It brings much-needed de-escalation.
I thank Pakistan for its mediation.
Now it is crucial that negotiations for an enduring solution to this conflict continue.
We will continue coordinating with our partners to this end.
The number of nihilistic comments really shows that as long as there's authority, Bluesky will find something negative about it.
Social media accounts of corpos, countries, and, in this case, the European Commission exist. Deal with it.
You're doing great, EU, keep it up. 🇪🇺
But the more I read things like these, the more my brain goes "ugh, federalize already."
This is such an astronomically great thing! 🥹
He didn't get away with it.
Why in the name of fuck is Gemini 3.1 Flash Lite priced at $1.50/M output tokens? lil bro it's not that good
Volla...
funny how it's the same 3 companies that come up with stuff like that every single time
I'm really, really looking forward to Deepseek V4! Let's just hope it releases soon, because the competition is evolving a lot right now...
We're happy to announce a long-term partnership with Motorola. We're collaborating on future devices meeting our privacy and security standards with official GrapheneOS support.
motorolanews.com/motorola-thr...
There's just no better option security-wise. But rest assured, they're working on their own phone together with a large OEM, likely coming in 2027.
Flash (Fast) avoids mistakes that a pure instruct model at the current SOTA simply cannot avoid. Even situations where far too much weight would land on the first token of the response, and where every other instruct model fails, are handled well by Flash.
Gemini 3 Flash (Fast mode) is literally just a reasoning model that pretends it's not, and any comparison with instruct models is inherently unfair. Even minimal reasoning is still reasoning, and you can clearly feel the difference in response quality, in my opinion.
Graphene Is All You Need.
Every VLM implementation except for Qwen's and Gemini's feels botched.
Update: support for the model has improved with newer versions of llama.cpp, and it hits >60 t/s decode speed now.
I still don't believe this thing will run well on a phone though.
Doubling the number of active parameters but cutting the number of experts in half feels arbitrary and has not shown an improvement in output quality so far. The model runs slower though.
We'll have to see what Magistral can make out of it.
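To put toy numbers on the active-parameter point above (all invented, not from any model card): halving the expert count at constant total expert parameters makes each expert twice as large, so the parameters actually run per token roughly double.

```python
# All numbers invented for illustration; nothing here comes from the actual model card.
def active_params(total_expert_params, n_experts, top_k, shared_params):
    """Parameters actually run per token: shared layers + the k routed experts."""
    per_expert = total_expert_params / n_experts
    routed = top_k * per_expert
    return shared_params + routed, routed

old_total, old_routed = active_params(48e9, n_experts=128, top_k=4, shared_params=2e9)
new_total, new_routed = active_params(48e9, n_experts=64, top_k=4, shared_params=2e9)

print(f"128 experts: {old_total / 1e9:.1f}B active ({old_routed / 1e9:.2f}B routed)")
print(f" 64 experts: {new_total / 1e9:.1f}B active ({new_routed / 1e9:.2f}B routed)")
# Routed parameters per token double (1.50B -> 3.00B): more FLOPs per step and a
# slower decode, without any increase in total capacity.
```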
I'm glad the RL slop has been reduced in the Ministral series of models. The models aren't absolutely SOTA, but at least they don't seem to spam \boxed for every single problem anymore.
Their Magistral pipeline could probably continue to scale extremely well in the future though.
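And in case anyone else is cleaning up such outputs: a minimal sketch for stripping \boxed{...} wrappers. The regex is my own naive assumption (no nested braces handled), not anything from the Ministral or Magistral tooling.

```python
import re

# Naive by design: assumes the boxed content contains no nested braces.
BOXED = re.compile(r"\\boxed\{([^{}]*)\}")

def unbox(text: str) -> str:
    """Replace \\boxed{...} with its bare content."""
    return BOXED.sub(r"\1", text)

print(unbox(r"The answer is \boxed{42}."))   # -> The answer is 42.
```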