Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression The key idea: The paper introduces grouped lattice vector quantisation (GLVQ), a weight quantisation technique tha...
#efficient-inference #quantisation
Lightweight L‑SGD Enables Training on Low‑Power RISC‑V Edge Devices
L‑SGD runs on low‑power RISC‑V microcontrollers, though the lack of an FPU hurts performance; an 8‑bit quantised version doubles training speed and cuts memory use by ~4×. Read more: getnews.me/lightweight-l-sgd-enable... #riscv #quantisation
Want to fine-tune LLMs without a #GPU cluster? Join our live online training "Fine-tuning on one GPU" for anyone building smart AI with lean resources.
8 September 2025 | 09:00–12:30 CET
events.asc.ac.at/event/203/
#LLM #LoRA #PEFT #AItraining #Quantisation #AIonABudget #HuggingFace #Python
Optimal Formats and the Cube Root of the PDF Your boss emails you a point in 128-billion-dimensional space. “Llama 3.1 8B,” the message reads. “A not-so-large language model in bfloat16. But ...
#posts #quantisation #efficient-inference #number-formats
ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantisation The key idea: Quantisatio...
https://graphcore-research.github.io/paretoq/
#efficient-inference #LLMs #quantisation #scaling-laws
DeepSeek-V3 & DeepSeek-R1 Technical Reports With their V3 and R1 models, DeepSeek sets a new ...
https://graphcore-research.github.io/deepseek/
#efficient-inference #quantisation #reinforcement-learning #reasoning #LLMs