See you at #EACL2026 in Rabat 🕌!
#UKPLab #NLProc #ResponsibleAI #Quantization #MLSafety #Fairness #TrustworthyAI #ModelCompression #LLMSafety #EthicalAI #NLP #AIResearch @cs-tudarmstadt.bsky.social @proloewe.bsky.social
I stumbled upon this excellent paper on deploying LLMs efficiently at the edge using only ternary weights with Bitnet.cpp. If edge AI excites you, check this out! See link below. #EdgeAI #LLM #ModelCompression #MachineLearning #Research
https://arxiv.org/abs/2502.11880
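For readers new to ternary weights: a minimal, hedged sketch of "absmean" ternary quantization in the style the paper builds on (weights mapped to {-1, 0, +1} with a single per-tensor scale). Function names are illustrative, not Bitnet.cpp's actual API.

```python
# Sketch of absmean ternary quantization: scale by the mean absolute
# weight, then round-and-clip each weight into {-1, 0, +1}.
def ternary_quantize(weights, eps=1e-8):
    """Quantize a flat list of floats to ternary codes plus a scale."""
    # Scale = mean absolute value of the weights (the "absmean" rule).
    scale = sum(abs(w) for w in weights) / max(len(weights), 1) + eps
    # Round each scaled weight, clipping to the ternary range.
    q = [max(-1, min(1, round(w / scale))) for w in weights]
    return q, scale

def dequantize(q, scale):
    """Reconstruct approximate weights from the ternary codes."""
    return [v * scale for v in q]

q, s = ternary_quantize([0.42, -0.07, 1.31, -0.88, 0.02])
print(q)  # each code is -1, 0, or +1; storage drops to ~1.58 bits/weight
```

Storing only the codes and one scale is what makes edge deployment attractive: matrix multiplies reduce to additions and subtractions.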
The ultimate transformer size competition: build the smallest model that can add two 10-digit numbers with 99%+ accuracy. Current record holder uses just 36 parameters with 100% accuracy.
https://github.com/anadim/AdderBoard
#Transformers #MachineLearning #ModelCompression
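A hedged sketch of how an entry to such a competition might be scored: sample random 10-digit operand pairs and measure exact-match accuracy. `model` is a placeholder for any candidate; Python's own addition stands in here, so this is not the leaderboard's actual harness.

```python
# Toy evaluation harness: exact-match accuracy on random 10-digit sums.
import random

def accuracy(model, trials=1000, seed=0):
    """Fraction of random 10-digit additions the candidate gets right."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        a = rng.randint(10**9, 10**10 - 1)  # 10-digit operands
        b = rng.randint(10**9, 10**10 - 1)
        hits += model(a, b) == a + b
    return hits / trials

print(accuracy(lambda a, b: a + b))  # a perfect adder scores 1.0
```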
@cshlnews.bsky.social @princetonneuro.bsky.social
@cmu-neuroscience.bsky.social
#neuroAI #compneuro #neuroscience #visualcortex #closedloop #activelearning #modelcompression #distillation #pruning
www.cshl.edu/ai-monkey-br...
Reducing a neural network’s complexity through pruning, quantization, distillation, or matrix factorization enhances efficiency and scalability, allowing AI systems to deliver comparable performance with lighter architectures and optimized resource use.
#ModelCompression #EdgeAI
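Two of the techniques named above can be sketched in a few lines: magnitude pruning (zero out the smallest-magnitude weights) and symmetric uniform 8-bit quantization. Plain Python for clarity; real pipelines apply these to framework tensors, often layer by layer.

```python
# Minimal sketches of magnitude pruning and int8 quantization.
def magnitude_prune(weights, sparsity=0.5):
    """Zero the fraction `sparsity` of weights with the smallest |w|."""
    k = int(len(weights) * sparsity)
    threshold = sorted(abs(w) for w in weights)[k - 1] if k else -1.0
    return [0.0 if abs(w) <= threshold else w for w in weights]

def quantize_int8(weights):
    """Symmetric uniform quantization to signed 8-bit integer codes."""
    scale = max(abs(w) for w in weights) / 127 or 1.0
    q = [round(w / scale) for w in weights]
    return q, scale

pruned = magnitude_prune([0.9, -0.05, 0.4, -0.01, 0.7, 0.02])
print(pruned)           # half the weights are now exactly zero
print(quantize_int8([0.9, 0.4, 0.7]))
```

Pruned zeros compress well with sparse storage, and int8 codes cut memory 4× versus float32, which is the "comparable performance with lighter architectures" trade the post describes.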
Despite skepticism, many find the *degree* of compressibility enabled by this universal subspace truly remarkable. This suggests significant potential for shrinking models without losing performance, which could be a game-changer. #ModelCompression 3/6
Flow-Induced Diagonal Gaussian Processes Enhance AI Model Compression
FiD-GP cuts Bayesian training cost by orders of magnitude, halves parameter counts, and shrinks model size by about three-quarters while keeping state-of-the-art accuracy. Read more: getnews.me/flow-induced-diagonal-ga... #bayesiandeep #modelcompression
In-Training Compression Improves Efficiency of State Space Models
In‑training compression trims SSM hidden dimensions during training, preserving performance while speeding up optimization; paper submitted Oct 2025. Read more: getnews.me/in-training-compression-... #statespacemodels #modelcompression
Dynamic Expert Clustering Boosts Efficiency of MoE Large Language Models
Dynamic expert clustering cuts MoE model parameters by about 80% and boosts throughput 10‑20% while keeping quality on GLUE and WikiText‑103. getnews.me/dynamic-expert-clusterin... #moe #modelcompression #nlp
BALF Enables Fine‑Tuning‑Free Neural Network Compression
BALF enables fine-tuning-free compression, cutting FLOPs of ResNeXt-101 by about 45% while incurring only a 1-point top-1 accuracy drop. The paper was submitted in September 2025. Read more: getnews.me/balf-enables-fine-tuning... #balf #modelcompression
Random Matrix Theory Powers New AI Model Compression Technique
RMT‑KD leverages random matrix theory for knowledge distillation, trimming up to 80% of model parameters with just ~2% accuracy loss and 2.8× faster inference. getnews.me/random-matrix-theory-pow... #randommatrixtheory #modelcompression
COSPADI: Sparse Dictionary Learning Boosts LLM Compression
COSPADI compresses large language models without additional training, using calibration‑guided sparse dictionary factorization to achieve 20‑50% reduction while preserving accuracy. getnews.me/cospadi-sparse-dictionar... #llm #modelcompression #sparselearning
SlimDiff Enables Training-Free Compression of Diffusion Models
SlimDiff compresses diffusion models without training, achieving up to 35% faster inference and removing about 100 million parameters while maintaining quality. getnews.me/slimdiff-enables-trainin... #slimdiff #diffusionmodels #modelcompression
Unified Framework for Neural Network Compression with Rank Selection
A unified framework merges tensor decomposition with automatic rank selection, cutting manual grid searches and using continuous optimization to compress models while keeping accuracy. getnews.me/unified-framework-for-ne... #modelcompression #nn
Location‑Aware Discriminant Analysis Improves Visual Detector Compression
Location‑aware discriminant analysis compresses detectors, cutting model size while preserving accuracy; on KITTI and COCO the pruned models matched or beat the originals. getnews.me/location-aware-discrimin... #locationaware #modelcompression
Flow-Induced Diagonal Gaussian Processes Reduce AI Model Size
FiD‑GP halves neural network parameters and shrinks storage by about 75%, while keeping state‑of‑the‑art accuracy and uncertainty estimation on benchmarks. Read more: getnews.me/flow-induced-diagonal-ga... #fidgp #modelcompression
Random Matrix Theory Boosts Model Compression with RMT-KD
RMT‑KD cuts model parameters by up to 80% with just 2% accuracy loss and runs up to 2.8× faster, tested on GLUE, AG News and CIFAR‑10, according to the study. Read more: getnews.me/random-matrix-theory-boo... #rmtkd #modelcompression #edgeai
Adaptive Tensor-Train Decomposition Improves Network Compression
The new LWIQ method cuts tensor‑train rank‑search time by 63.2% and yields a model 3.2× smaller with only a 0.86% drop in top‑1 accuracy on CIFAR‑10 ResNet‑56. getnews.me/adaptive-tensor-train-de... #modelcompression #tensortrain #deeplearning
ButterflyQuant slashes memory use in large language models without losing performance. Could this mean faster, cheaper AI on any device? What excites you about the future of model compression? 🤔 #AI #Innovation #ModelCompression LINK
Check out the blog: superteams.ai/blog/a-hands...
#AIInfrastructure #ModelCompression #KnowledgeDistillation
We propose Redundant Information Distillation, which maximizes the task-relevant common information between teacher and student using a new alternating optimization. #explainability #informationtheory #distillation #modelcompression
AI model compression isn't just a technical refinement but a strategic choice that aligns cost reduction, sustainability, and operational agility with the pressing demands of today's rapidly evolving digital landscape.
#AI #ModelCompression #Efficiency
Today's task: model compression!!
🆕 New at IWSLT! But no less exciting 🔥
🎯 Goal: Compress a large, general-purpose multimodal model, making speech translation more efficient ⚡️, deployable 📲, and sustainable ♻️, while preserving translation quality ⭐️
#AI #SpeechTech #ModelCompression #LLMcompression
PITOME Revolutionizes Transformers by Merging Tokens to Save Memory and Boost Speed 🔬🚀🧠 www.azoai.com/news/2024111... #AI #Innovation #MachineLearning #DeepLearning #TokenMerging #Transformers #GraphTheory #DataEfficiency #ModelCompression #ImageProcessing @arxiv-stat-ml.bsky.social
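For intuition, here is a generic, hedged sketch of similarity-based token merging (average the most cosine-similar pair of token vectors), not PITOME's actual graph-theoretic algorithm; it only illustrates why merging shrinks the sequence a transformer must attend over.

```python
# Generic token-merging sketch: find the most similar token pair and
# replace the two vectors with their mean, shortening the sequence.
import math

def cosine(a, b):
    """Cosine similarity between two token vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def merge_most_similar(tokens):
    """Merge the two most cosine-similar token vectors into their mean."""
    best, pair = -2.0, (0, 1)
    for i in range(len(tokens)):
        for j in range(i + 1, len(tokens)):
            sim = cosine(tokens[i], tokens[j])
            if sim > best:
                best, pair = sim, (i, j)
    i, j = pair
    merged = [(x + y) / 2 for x, y in zip(tokens[i], tokens[j])]
    return [t for k, t in enumerate(tokens) if k not in pair] + [merged]

print(merge_most_similar([[1.0, 0.0], [0.0, 1.0], [0.9, 0.1]]))
```

Each merge removes one token; since self-attention cost is quadratic in sequence length, repeated merging yields the memory and speed gains the post highlights.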