#LocalLLaMa
llama-server Breaking Change: HuggingFace Cache Migration Disrupts Workflows. llama-server auto-migrates its cache to the HuggingFace directory without user consent, breaking local LLM development workflows.

llama-server auto-migrates cache to HuggingFace paths without user input. Following the ggml acquisition, infrastructure consolidation is accelerating—but silent breaking changes are testing developer patience. #LocalLLaMA #ggml

bymachine.news/llama-server-huggingface...
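If the migration catches you out, one possible workaround (a sketch, assuming current llama.cpp and HuggingFace conventions; verify against your versions, as the exact behavior may differ by build) is to pin the cache locations explicitly via environment variables before launching:

```shell
# Hypothetical workaround: pin both cache directories explicitly so
# neither tool silently relocates your downloaded models.
export LLAMA_CACHE="$HOME/.cache/llama.cpp"   # llama.cpp download cache
export HF_HOME="$HOME/.cache/huggingface"     # HuggingFace hub cache root

# Example launch pulling a model from a HF repo (-hf flag); repo name is
# illustrative only.
llama-server -hf ggml-org/gemma-3-1b-it-GGUF
```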

Qwen3.5 122B Outperforms Smaller Coder Next Model. The larger Qwen3.5 122B model delivers faster inference and better code output than the smaller Coder Next variant, challenging assumptions about model size and performance.

Qwen3.5 122B beats the smaller Coder Next model on speed and accuracy. Better training, smarter optimization, superior quantization: smaller doesn't automatically mean faster. Real-world performance > benchmark specs. #LocalLLaMA #Qwen

bymachine.news/qwen-switch-larger-model...

Local-first Fill-in-the-Middle (FIM) with llama.cpp - grumpycat tech stories.

Local-first Fill-in-the-Middle (FIM) with llama.cpp
> leaf.eagleusb.com/3mhv6sz2pf22b

#llm #localllama
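As a sketch of what FIM means at the prompt level: the model is given the code before and after the cursor and generates what belongs in between. The example below uses Qwen2.5-Coder-style FIM marker tokens as an illustration; other models use different markers, and llama.cpp's server can assemble this for you.

```python
def build_fim_prompt(prefix: str, suffix: str) -> str:
    """Assemble a fill-in-the-middle prompt using Qwen2.5-Coder-style
    special tokens; the model generates the code between prefix and suffix."""
    return f"<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"

# Example: ask the model to fill in the body of `add`.
prompt = build_fim_prompt(
    prefix="def add(a, b):\n    return ",
    suffix="\n\nprint(add(1, 2))\n",
)
print(prompt)
```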


3/

#LocalLLaMA, roleplay users and conservative testers love it because it doesn’t suffer from the “woke mindset” that plagues Claude (the worst), ChatGPT or Gemini.

Reason: trained on Chinese internet data + strict government control.

📰 Qwen3.5 Models Gain Traction for Performance, Efficiency

The Qwen3.5 series, particularly the 35B-A3B model, is gaining popularity in the LocalLLaMA community for its impressive performance and efficiency. Benchmarks show it achieving 45 tokens per second on a single 16GB 5060 GPU with optimal KV q8_0 quantization.

www.clawnews.ai/qwen3-5-models-gain-trac...

#LocalLLaMA #Qwen35 #AIModels
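For reference, KV-cache quantization in llama.cpp is set per launch with the cache-type flags; a sketch (the model filename and context size are placeholders, and flag names should be verified against your build):

```shell
# Hypothetical invocation: quantize the KV cache to q8_0 to fit more
# context into 16 GB of VRAM.
# --cache-type-k / --cache-type-v set the K and V cache quantization,
# -ngl 99 offloads all layers to the GPU, -c sets the context window.
llama-server -m qwen3.5-35b-a3b-q4_k_m.gguf \
  --cache-type-k q8_0 --cache-type-v q8_0 \
  -ngl 99 -c 32768
```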


#ai #llm #localllama #localllm #grownostr #nostr #gfy
Not your own locally hosted LLM? You are giving away your thoughts and ideas to corporations and governments, who profit from what and how you think.

Original post on sigmoid.social

Playing with compar:IA, the new LLM chatbot comparison arena created by the French government: https://comparia.beta.gouv.fr

I particularly like the "frugal" mode that allows you to compare two randomly chosen small/cheap models against each other! It's great for testing many different models […]

GitHub - Jeremy-Harper/chatterboxPro: audiobook GUI for chatterbox

Audiobook Generator GUI that can clone your voice like 11Labs but hosted locally. Fun little project so I could listen to the books I've written and make sure everything sounded right.

github.com/Jeremy-Harpe...

#chatterbox
#audiobook
#author
#localllama
#audible
#audioAI


Haha, my Raspberry Pi 5 is actually faster at CPU-only inference than my old laptop: LocalScore 23, with 9.5 tokens/second generation.

LocalScore - Test #235 Results LocalScore benchmark results for test #235. This is for the accelerator Intel Core i7-8550U CPU @ 1.80GHz (skylake)

Indeed, the CPU-only performance is even worse. The LocalScore on the tiny 1B model is only 16, with a text generation speed of 7.7 tokens/second.

https://www.localscore.ai/result/235

Let's see if I can run this on a Raspberry Pi for comparison...

#LocalScore #llm #benchmark #LocalLlama
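LocalScore wraps this kind of measurement; with plain llama.cpp you can get comparable prompt-processing and generation numbers from llama-bench (a sketch; the model path is a placeholder and flags should be checked against your build):

```shell
# Hypothetical run: measure prompt processing (pp) and token generation (tg)
# throughput on CPU only. -p / -n set prompt and generation token counts,
# -t pins the thread count, -ngl 0 keeps everything off the GPU.
llama-bench -m llama-3.2-1b-q4_k_m.gguf -p 512 -n 128 -t 4 -ngl 0
```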

Original post on sigmoid.social

My hobby: running LocalScore.ai to benchmark how fast (ehm) my 2018 laptop runs a tiny 1B LLM. The laptop has an NVIDIA MX150 mobile GPU with 2GB VRAM. I guess it was intended for Photoshop filters or CAD stuff.

I got a LocalScore of 101 on the tiny model using the GPU (13.5 tokens/second for […]


Ok, so I'm running the new Orpheus TTS by Canopy Labs on llama.cpp, and changing the top_p sampler to min_p gives me a 2x to 3x t/s speed boost. Why? I haven't seen this happen before.

Any ideas?

#ai #localllama
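For anyone wanting to reproduce the comparison: on llama.cpp's server, samplers are just request parameters on the /completion endpoint. A minimal sketch (the URL, prompt, and values are illustrative, not taken from the post):

```python
import json

# Two sampler configurations for llama.cpp's /completion endpoint.
# top_p keeps the smallest token set whose probabilities sum to p;
# min_p instead drops tokens below a fraction of the top token's
# probability, which can leave far fewer candidates to sample from.
top_p_payload = {"prompt": "<audio prompt here>", "n_predict": 256,
                 "top_p": 0.9, "min_p": 0.0}
min_p_payload = {"prompt": "<audio prompt here>", "n_predict": 256,
                 "top_p": 1.0, "min_p": 0.05}

# e.g. requests.post("http://localhost:8080/completion", json=min_p_payload)
print(json.dumps(min_p_payload))
```

One possible explanation for the speedup (a guess, not verified against the llama.cpp source): top_p generally requires sorting the candidate distribution, while min_p only needs a threshold pass over the logits, which is cheaper at large vocabularies and high token rates.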


Meta announces 1 billion downloads for Llama! 🦙✨ But is it really "open source" or just "open weights"? The #LocalLLaMA community is debating: too long between releases, and a need for more accessible models. Local AI is challenging the cloud! #IALibre #TechQuébec patb.ca/r/e91


[1] Deepseek: Hold my beer. I'm going in!

#LocalLlama: Deepseek r1:1.5b running now

There are talks in the US Congress about banning it... so I figured it was worth a look. Besides, it is touted as the best at coding.
I downloaded it on my average-spec'd laptop.
I followed these instructions:


Just a shout out to #ollama, great stuff!

#llama #localllama

China's DeepSeek web version is raising security alarms. Here's why: DeepSeek's web version contains code linked to a Chinese state-owned telecommunications company, posing a threat to user login information.

Why DeepSeek’s web version is raising security alarms - Fast Company
www.fastcompany.com/91273103/chi...

#LocalLlama

Google has some ‘good ideas’ for putting ads in Gemini. Google plans to add ads to Gemini but has not given a specific date, and its CEO did not say how the ads will be integrated when they arrive.

Google has some ‘good ideas’ for putting ads in Gemini | Digital Trends
www.digitaltrends.com/computing/no...

#LocalLlama #LLM


To run your own Local AI Chatbot #LocalLLama, here is the minimum hardware required to work comfortably:

Apple Mac Studio M1 Max, 10 cores, 64 GB RAM, 1 TB SSD (used price around $2000)

$2700 will get you the right specs with a new Mac Studio M2.

I'm waiting for the upcoming M4 Mac Studio.
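As a rough sanity check on why 64 GB is a comfortable floor: quantized weight size is approximately parameter count times bits per weight divided by 8. A back-of-the-envelope sketch (the 70B / 4.5-bit figures are illustrative, not from the post):

```python
def approx_weights_gb(params_billions: float, bits_per_weight: float) -> float:
    """Rough size of quantized model weights in GB. Ignores KV cache,
    activations, and runtime overhead, so treat it as a lower bound."""
    return params_billions * 1e9 * bits_per_weight / 8 / 1e9

# A 70B model at ~4.5 bits/weight (roughly q4_K_M) is ~39 GB of weights,
# which fits in 64 GB of unified memory with room for context and the OS.
print(round(approx_weights_gb(70, 4.5), 1))
```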


Stay Free

#LocalLLama


What stops you from using #LocalLLama instead of paid and proprietary solutions like #ChatGPT?

Stay free with Local AI.

QwQ is an experimental research model focused on advancing AI reasoning capabilities.

Good stuff ollama.com/library/qwq

What if Alibaba provided the most underrated AI model?

#LocalLLaMa


#localllama #LLM #AI


#localllama #ftw

Build your own AI Commit Generator (Completely Offline)
YouTube video by Yankee Maharjan

Build your own LLM-powered git commit message generator!🚀

Uses a local LLM via Ollama, so your private data doesn't leave your machine. 🔐

#GenerativeAI #LLM #ArtificialIntelligence #LocalLlama

youtu.be/YPeNoeVCWxo
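The core of such a tool is small. A possible sketch of the prompt side (function name and wording are illustrative, and the video's implementation may differ), with the result sent to a local model via Ollama's /api/generate endpoint:

```python
def build_commit_prompt(staged_diff: str, max_chars: int = 8000) -> str:
    """Turn `git diff --staged` output into an instruction for a local LLM.
    Long diffs are truncated so they fit in a small model's context window."""
    diff = staged_diff[:max_chars]
    return (
        "Write a concise git commit message in imperative mood "
        "(subject line under 50 characters) for this staged diff:\n\n"
        f"{diff}"
    )

# Usage sketch: send the prompt to a local Ollama server, e.g.
#   requests.post("http://localhost:11434/api/generate",
#                 json={"model": "llama3.2",
#                       "prompt": build_commit_prompt(diff),
#                       "stream": False})
prompt = build_commit_prompt("diff --git a/app.py b/app.py\n+print('hi')\n")
print(prompt.splitlines()[0])
```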
