
Posts by Xuan Son Nguyen

Preview
Using OCR models with llama.cpp: An easy-to-follow guide on how to use OCR models with llama.cpp.

llama.cpp now supports various small OCR models that can run on low-end devices. These models are small enough to run on a GPU with 4 GB of VRAM, and some of them can even run on a CPU with decent performance.

In this post, I will show you how to use these OCR models with llama.cpp 👇
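As a quick teaser of what the guide covers: OCR-capable vision models run through llama.cpp's multimodal CLI. A minimal sketch, where `model.gguf` and `mmproj.gguf` are placeholder file names for a downloaded model and its multimodal projector, not specific recommendations:

```shell
# Transcribe the text in an image with an OCR-capable vision model.
# model.gguf / mmproj.gguf are placeholders for the LLM weights and
# the multimodal projector that encodes the image.
llama-mtmd-cli \
  -m model.gguf \
  --mmproj mmproj.gguf \
  --image document.png \
  -p "Transcribe all the text in this image."
```

The same invocation shape works for any mtmd-supported vision model; only the model and projector files change.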

1 week ago
Post image

Very nice touch, Gmail 😅

6 months ago
Preview
Building My Smart Home - Part 2: ESPHome & RF433. Implementing an RF433 gateway using ESPHome and a custom Home Assistant add-on to manage multiple RF433 receivers for my smart home.

Link to article: blog.ngxson.com/building-my-...

7 months ago
Post image

Part 2 of my journey building a smart home! 🚀

In this part:
> ESPHome & custom component
> RF433 receiver & transmitter
> Hassio custom addon

7 months ago
Preview
Building My Smart Home - Part 1: Home Assistant. Designing a smart home system from electrical wiring to Home Assistant automations, using affordable devices and network-inspired architecture.

Link to article: blog.ngxson.com/building-my-...

7 months ago
Post image

Just published a new article on my blog 🏃‍♂️

Building My Smart Home - Part 1: Plan, Idea & Home Assistant

Check it out!

7 months ago
Preview
Gemma 3-270m - a ggml-org Collection: collection of models for Gemma 3-270m.

Link here: huggingface.co/collections/...

8 months ago
Post image

Kudos to Google and the llama.cpp team! 🤝

GGUF support for Gemma 270M right from day-0

8 months ago


Watch it here: www.youtube.com/watch?v=Qtzz...

8 months ago
Post image

Richy Mini and SmolLM3 are featured in GitHub's weekly news! 🚀 🚀

8 months ago
Post image

Gemma 3n has arrived in llama.cpp 👨‍🍳 🍰

Comes in 2 flavors: E2B and E4B (E means "effective/active parameters")

9 months ago
Post image

See you this Sunday at the AI Plumbers Conference: 2nd edition!

📍 Where: GLS Event Campus Berlin, Kastanienallee 82 | 10435 Berlin
👉 Register here: lu.ma/vqx423ct

10 months ago
Post image

✨✨ AIFoundry is bringing you the AI Plumbers Conference: 2nd edition, an open source meetup for low-level AI builders to dive deep into "the plumbing" of modern AI

📍 Where: GLS Event Campus Berlin, Kastanienallee 82 | 10435 Berlin
📅 When: June 15, 2025
👉 Register now: lu.ma/vqx423ct

10 months ago
Post image

Hugging Face Inference Endpoints now officially support deploying **vision** models via llama.cpp 👀 👀

Try it now: endpoints.huggingface.co/catalog

11 months ago
Preview
GitHub - ngxson/smolvlm-realtime-webcam

Check it out: github.com/ngxson/smolv...

11 months ago
Video

Real-time webcam demo with @huggingface.bsky.social SmolVLM and llama.cpp server.

All running locally on a MacBook M3

11 months ago
Post image

Although we have the A100, H200, M3 Ultra, etc.

Still can't match the power of that Casio FX 😆

11 months ago
Post image

llama.cpp vision support just got much better! 🚀

Traditionally, models with complicated chat templates, like MiniCPM-V or Gemma 3, required a dedicated binary to run.

Now, you can use all supported models via a single "llama-mtmd-cli" 🔥

(Only Qwen2VL is not yet supported)

11 months ago

Learn more: blog.ngxson.com/introducing-...

11 months ago
Post image

Finally have time to write a blog post about ggml-easy! 😂

ggml-easy is a header-only wrapper for GGML that simplifies development with a cleaner API, easy debugging utilities, and native safetensors loading ✨ Great for rapid prototyping!

11 months ago
Post image

Someone at Google definitely had a lot of fun making this 😆

And if you don't know, it's available in the "Starter apps" section on AI Studio. The app is called "Gemini 95".

11 months ago
Post image

Estimating LLM memory requirements WITHOUT a calculator?

Just use your good old human brain 🧠 😎

Check out my 3-step estimation 🚀
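The 3 steps themselves are in the image, so here is only the standard back-of-the-envelope arithmetic this kind of estimate rests on, as a sketch of my own (the `estimate_gguf_memory_gb` helper and its overhead constant are illustrative, not necessarily the post's exact method):

```python
def estimate_gguf_memory_gb(n_params_b: float, bits_per_weight: float,
                            overhead_gb: float = 1.0) -> float:
    """Rough memory estimate for running an LLM locally.

    n_params_b: parameter count in billions
    bits_per_weight: ~16 for F16, ~8.5 for Q8_0, ~4.5 for Q4_K_M
    overhead_gb: crude allowance for KV cache and compute buffers
    """
    # 1B params at 8 bits each is about 1 GB, so scale by bits/8.
    weights_gb = n_params_b * bits_per_weight / 8
    return weights_gb + overhead_gb

# A 7B model at ~4.5 bits per weight: about 4 GB of weights plus overhead
print(round(estimate_gguf_memory_gb(7, 4.5), 1))  # prints 4.9
```

The exact overhead depends on context length and backend, so treat the result as a lower bound rather than a guarantee.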

11 months ago
Post image

Google having quite a good sense of humor 😂

Jokes aside, a 1B model quantized to Q4 without performance degradation is sweet 🤏

11 months ago
Preview
GitHub - ngxson/ggml-easy: Thin wrapper around GGML to make life easier.

Where to try? ggml-easy --> github.com/ngxson/ggml-...

1 year ago
Post image

Cooking a fun thing today: I can now load a safetensors file directly into GGML without having to convert it to GGUF!

Why? Because this allows me to run experiments faster, especially with models outside of llama.cpp 😆

1 year ago
Video

No vibe coding. Just code it ✅

Visit my website --> ngxson.com

1 year ago
Preview
The State of On-Device LLMs: Xuan-Son Nguyen, an engineer at Hugging Face, specializes in on-device large language models (LLMs) and runtime optimization, working extensively with llam...

📅 The Live Webinar will happen at
🕔 11 AM SF | 2 PM NYC | 6 PM London | 19h00 Paris
👉👉👉 Register here: app.getcontrast.io/register/sot... 👈👈👈

1 year ago
Post image

On Monday, the 24th, I'm proud to be giving a talk at sota's webinar.

My main talk will last an hour and deep dives into the current state of on-device LLMs, exploring their advantages, trade-offs, and limitations.

The session will end with a Q&A, where you can ask me anything about this subject.

1 year ago
Post image

Had a fantastic chat today with Georgi Gerganov, the brilliant mind behind ggml, llama.cpp, and whisper.cpp! We discussed:

🚀 The integration of vision models into llama.cpp
🚀 The challenges of maintaining a smooth UX/DX
🚀 The exciting future of llama.cpp

Big things ahead - stay tuned!

1 year ago
Post image Post image

OK now you are the best, Gememe 2.0

1 year ago