#Quantisation
Learning Grouped Lattice Vector Quantizers for Low-Bit LLM Compression

The key idea: the paper introduces grouped lattice vector quantisation (GLVQ), a weight quantisation technique that...

#efficient-inference #quantisation
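As a rough illustration of the idea (a sketch, not the paper's actual method): split the weights into small groups and, after a per-group scale, snap each group to the nearest point of a fixed lattice such as D4, whose nearest-point search is a cheap round-and-fix-parity step. The group size, scale rule, and lattice choice below are all illustrative assumptions.

```python
# Hypothetical sketch of grouped lattice vector quantisation: weights are
# split into groups of 4 and each group is snapped to the nearest point of
# a scaled D4 lattice (integer vectors with even coordinate sum). The
# per-group scale stands in for the learned parameters the paper optimises.
import numpy as np

def nearest_d4(x):
    """Nearest point of the D4 lattice {z in Z^4 : sum(z) even} to x."""
    r = np.round(x)
    if int(r.sum()) % 2 != 0:
        i = int(np.argmax(np.abs(x - r)))     # worst-rounded coordinate
        r[i] += 1.0 if x[i] >= r[i] else -1.0  # re-round it the other way
    return r

def glvq_quantise(w, group=4):
    """Grouped lattice quantisation sketch: per-group scale + D4 rounding."""
    w = w.reshape(-1, group)
    scale = np.abs(w).max(axis=1, keepdims=True) / 2.0 + 1e-8  # crude scale
    q = np.stack([nearest_d4(v) for v in w / scale])
    return (q * scale).reshape(-1)             # dequantised weights

w = np.random.randn(16).astype(np.float32)
print(glvq_quantise(w))
```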

Lightweight L‑SGD Enables Training on Low‑Power RISC‑V Edge Devices

L‑SGD runs on low‑power RISC‑V microcontrollers, though the lack of a floating‑point unit hurts performance; an 8‑bit quantised version doubles training speed and cuts memory use by ~4×. Read more: getnews.me/lightweight-l-sgd-enable... #riscv #quantisation
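To make the integer-only angle concrete, here is a hypothetical sketch (in Python for readability; on the MCU this would be C) of an 8-bit SGD step that avoids floating point entirely: int8 weights and gradients, a fixed-point learning rate, and only multiply, shift, and clamp, the kind of arithmetic an FPU-less RISC-V core handles cheaply. The constants and scale handling are illustrative, not the paper's.

```python
# Illustrative integer-only SGD step for an 8-bit L-SGD-style update:
# no floating point anywhere, just int multiply, arithmetic shift, clamp.
import numpy as np

LR_NUM, LR_SHIFT = 13, 10          # learning rate ~ 13 / 2**10 ~ 0.0127

def sgd_step_int8(w_q, g_q):
    """w_q, g_q: int8 arrays assumed to share one (implicit) scale."""
    upd = (g_q.astype(np.int32) * LR_NUM) >> LR_SHIFT  # fixed-point lr * grad
    w = w_q.astype(np.int32) - upd
    return np.clip(w, -128, 127).astype(np.int8)       # saturate back to int8

w = np.array([120, -40, 3], dtype=np.int8)
g = np.array([90, -90, 5], dtype=np.int8)
print(sgd_step_int8(w, g))         # -> [119 -38   3]
```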

Want to fine-tune LLMs without a #GPU cluster? Join our live online training “Fine-tuning on one GPU” for anyone building smart AI w/ lean resources.

8 September 2025 | 09:00–12:30 CET
events.asc.ac.at/event/203/

#LLM #LoRA #PEFT #AItraining #Quantisation #AIonABudget #HuggingFace #Python
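For a flavour of what single-GPU fine-tuning with these tools looks like, here is a minimal hedged sketch using Hugging Face Transformers and PEFT: load the base model 4-bit quantised via bitsandbytes, then train only LoRA adapters. The model name and hyperparameters are illustrative placeholders, not the course material.

```python
# Sketch of a QLoRA-style single-GPU setup: 4-bit quantised base model
# plus trainable LoRA adapters. Model and hyperparameters are examples.
import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model

bnb = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)
model = AutoModelForCausalLM.from_pretrained(
    "meta-llama/Llama-3.1-8B",      # illustrative base model
    quantization_config=bnb,
    device_map="auto",
)
lora = LoraConfig(r=16, lora_alpha=32, lora_dropout=0.05,
                  target_modules=["q_proj", "v_proj"],
                  task_type="CAUSAL_LM")
model = get_peft_model(model, lora)
model.print_trainable_parameters()  # only the small adapters are trained
```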

Optimal Formats and the Cube Root of the PDF

Your boss emails you a point in 128-billion-dimensional space. “Llama 3.1 8B,” the message reads. “A not-so-large language model in bfloat16. But it’s too big. Trim the fat (ASAP).” You open up your toolbox: quantisation, sparsity, distillation.

#posts #quantisation #efficient-inference #number-formats
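The title nods to a classical high-resolution quantisation result (Panter and Dite): for a source with probability density p, the optimal density of quantisation levels is proportional to p(x)^(1/3). A quick numerical sanity check of that rule, with a standard Gaussian standing in for the weight distribution (this is an illustration, not the post's own code):

```python
# Place n quantisation levels at equal-mass points of the cube-rooted,
# renormalised density of N(0,1): levels cluster near 0, thin out in tails.
import numpy as np

def cube_root_levels(n_levels, lo=-8.0, hi=8.0, grid=100_000):
    x = np.linspace(lo, hi, grid)
    p = np.exp(-x**2 / 2)              # unnormalised N(0,1) pdf
    lam = p ** (1.0 / 3.0)             # optimal point density ~ p^(1/3)
    cdf = np.cumsum(lam)
    cdf /= cdf[-1]                     # CDF of the point density
    targets = (np.arange(n_levels) + 0.5) / n_levels
    return np.interp(targets, cdf, x)  # one level per 1/n mass slice

print(np.round(cube_root_levels(8), 3))  # denser near 0, sparser in tails
```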

ParetoQ: Scaling Laws in Extremely Low-bit LLM Quantisation

The key idea: Quantisation...

https://graphcore-research.github.io/paretoq/

#efficient-inference #LLMs #quantisation #scaling-laws
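For context, a bit-width sweep in a study like this typically wraps a fake quantiser around the weights during training and varies b. The symmetric uniform version below (with a sign-based 1-bit case) is a generic stand-in to show the shape of the idea, not ParetoQ's actual quantisers or training recipe.

```python
# Generic b-bit fake quantiser: quantise-then-dequantise, so the same
# float code path can be evaluated at 1, 2, 3, 4 bits.
import numpy as np

def fake_quantise(w, bits):
    if bits == 1:                               # binary: sign * mean magnitude
        return np.sign(w) * np.abs(w).mean()
    qmax = 2 ** (bits - 1) - 1                  # e.g. 7 for 4-bit
    scale = np.abs(w).max() / qmax + 1e-12
    return np.clip(np.round(w / scale), -qmax - 1, qmax) * scale

w = np.random.randn(6).astype(np.float32)
for b in (1, 2, 3, 4):
    print(b, np.round(fake_quantise(w, b), 3))
```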

DeepSeek-V3 & DeepSeek-R1 Technical Reports

With their V3 and R1 models, DeepSeek sets a new ...

https://graphcore-research.github.io/deepseek/

#efficient-inference #quantisation #reinforcement-learning #reasoning #LLMs
