thanks!!! that's what we tried to do!!
there is a benchmaxxing study here that gives even more information: kaitchup.substack.com/p/gemma-4-31...
that is exactly what jebediah98.bsky.social said.
we need to convert our models to multiple external frameworks to make sure every developer can use them. sometimes there are bugs and we try to fix as many as possible
I hope your next test goes better! please let me know
for cloud next at least
will do!
I wonder how much of that paper was written by claude code itself!
it is kind of an autobiography!
or maybe you were using a buggy version?
at launch some of the external implementations had issues with the chat template, and that led to some bugs in agentic tasks
whenever in doubt, try the AI Studio version, which is the ground truth for what Gemma models should do
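if you want to sanity-check a local setup against it, here is a minimal sketch using the google-genai SDK; the model id is a hypothetical placeholder, so pick the actual one from AI Studio:

```python
# A minimal sketch: query the AI Studio-hosted model (the ground truth)
# so you can diff its output against your local framework's output.
# Assumes the google-genai SDK; the model id is hypothetical.
from google import genai

client = genai.Client()  # reads GEMINI_API_KEY from the environment

response = client.models.generate_content(
    model="gemma-4-e2b-it",  # hypothetical id, check AI Studio for the real one
    contents="Call the weather tool for Paris and summarize the result.",
)
print(response.text)
```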
right? it is very interesting indeed!!!
LLM benchmarks are broken. 📉
We’re seeing more "benchmaxxing" than actual intelligence. High academic scores are easy to fake, but real-world generalization is much harder.
If you want to see what true performance looks like, look at the FoodTruckBench results:
foodtruckbench.com/blog/gemma-4...
"Google DeepMind gave an amazing gift to humanity!!"
- Two Minute Papers
Gemma is indeed great and I'm very happy to see these videos!!
have you tried it yet? you should!!!
www.youtube.com/watch?v=Sk9t...
Randall has my vote to be in charge of ISO!
Tons of niches...
It looks beautiful
I agree!
The Gemma 4 e2b model has similar quality to Gemma 3 27b!!!
That's incredible!
First Gemma 4 talk at the London Deepmind office!
Many more to come!
You should try this app!
Playing with Gemma 4 on device is super fun!
It works on all the big frameworks
That being said, we are working on improving inference speed as much as possible
Teaching Gemma 4 (26b MoE) how to map Census data in R via OpenCode & oMLX in the Zed IDE.
It's the first local LLM I've used that gets me real-time responsiveness. Running on a MacBook Pro M2 Max, 64GB RAM.
Not as smart as the frontier models - but it's running for free on my own hardware.
This is very cool!!!
about the speed, which platform are you using for serving? vLLM?
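if it is vLLM, a quick way to measure raw throughput is the offline API; just a sketch, and the checkpoint id is a hypothetical placeholder:

```python
# A minimal sketch of offline generation with vLLM, handy for timing
# token throughput. The model id is a hypothetical placeholder.
import time

from vllm import LLM, SamplingParams

llm = LLM(model="google/gemma-4-e2b-it")  # hypothetical checkpoint id
params = SamplingParams(temperature=0.7, max_tokens=256)

start = time.perf_counter()
outputs = llm.generate(["Explain MoE routing in two sentences."], params)
elapsed = time.perf_counter() - start

completion = outputs[0].outputs[0]
print(completion.text)
print(f"{len(completion.token_ids) / elapsed:.1f} tokens/s")
```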
I'm usually afraid of my work ending up in a Fireship video, but this time I'm very very happy!!!
www.youtube.com/watch?v=-01Z...
100%?
Wow
To help implement the details
Good question.
We did many tests with different prompting styles and ended up with the one we released, even though it's a little different from other model families.
From a user perspective, it should be transparent, as it only affects framework creators, whom we tried to collaborate with.
I don't know much about that platform either (apparently there are lots of tools I don't know of, hahaha)
but you should be able to turn off the thinking there too
unfortunately I don't know about Joplin plugins, but thinking is something that is enabled via System Instructions, like an on/off switch
maybe something here may be able to help you: ai.google.dev/gemma/docs/c...
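as a concrete illustration of the on/off switch, here is a sketch with Hugging Face Transformers; both the model id and the exact instruction wording are assumptions, so treat the docs above as the authority:

```python
# A minimal sketch of toggling thinking through the system instruction.
# The model id and the instruction text are assumptions, not the
# official switch -- check the Gemma docs for the real wording.
from transformers import pipeline

pipe = pipeline("text-generation", model="google/gemma-4-e2b-it")  # hypothetical id

messages = [
    {"role": "system", "content": "thinking: off"},  # assumed wording
    {"role": "user", "content": "What is 17 * 24?"},
]
out = pipe(messages, max_new_tokens=64)
print(out[0]["generated_text"][-1]["content"])  # assistant reply only
```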
which framework are you using to run the model?
asking because the model can run without thinking enabled but some frameworks expose that in different ways
A Visual Guide to Gemma 4 by Maarten Grootendorst
An in-depth, architectural deep dive of the Gemma 4 family of models. From Per-Layer Embeddings to the vision and audio encoders.
newsletter.maartengrootendorst.com/p/a-visual-g...
This is so beautiful!