#MXFP4
Diagnosing FP4 inference: a layer-wise and block-wise sensitivity analysis of NVFP4 and MXFP4

Quantization addresses the high resource demand of large language models (LLMs) by alleviating memory pressure and bandwidth congestion and providing significantly scaled compute power with a tole…

#LLM #FP4 #NVFP4 #MXFP4 #Precision #AMD #NVIDIA

hgpu.org?p=30661

What's MXFP4? The 4-Bit Secret Powering OpenAI's GPT-OSS Models on Modest Hardware
A blog post by Rakshit Aralimatti on Hugging Face

With #ChatGPT5, #OpenAI shifted part of its priorities toward efficiency and cost savings.

The new auto-routing helps select the best model for the prompt (with some success).

But another big change is the support for #MXFP4, which could have a lasting impact on all future models.

#AI #ChatGPT

OpenAI gpt-oss LLMs use MXFP4: smaller, faster, cheaper

How OpenAI used a new data type to cut inference costs by 75%

The decision to use #MXFP4 makes models smaller, faster and, more importantly, cheaper for everyone involved.

MXFP4 is a micro-scaling block floating-point format defined by the Open Compute Project (OCP).

www.theregister.com/2025/08/10/o...
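For context on how such a micro-scaling format works: the OCP MX specification groups tensor values into 32-element blocks, each sharing one power-of-two (E8M0) scale, with each element stored as 4-bit FP4 (E2M1, representable magnitudes 0, 0.5, 1, 1.5, 2, 3, 4, 6). A minimal Python sketch of that block quantization follows; it is illustrative only (real kernels pack bits and follow the spec's rounding rules), and the function names are my own:

```python
import math

# Positive magnitudes representable in FP4 E2M1 (2 exponent bits, 1 mantissa bit)
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]

def quantize_mxfp4_block(block):
    """Quantize one block (32 elements in the real format) to MXFP4:
    a shared power-of-two scale plus one FP4 value per element."""
    amax = max(abs(v) for v in block)
    if amax == 0.0:
        return 1.0, [0.0] * len(block)
    # Shared E8M0 scale: align the block max with the largest E2M1 exponent (2, since 6 = 1.5 * 2**2)
    shared_exp = math.floor(math.log2(amax)) - 2
    scale = 2.0 ** shared_exp
    quantized = []
    for v in block:
        sign = -1.0 if v < 0 else 1.0
        # Scale the magnitude into FP4 range, clamp, and snap to the nearest grid point
        mag = min(abs(v) / scale, 6.0)
        nearest = min(E2M1_GRID, key=lambda g: abs(g - mag))
        quantized.append(sign * nearest)
    return scale, quantized

def dequantize_mxfp4_block(scale, quantized):
    """Reconstruct approximate values from the shared scale and FP4 elements."""
    return [scale * q for q in quantized]
```

Because the scale is a pure power of two, dequantization is just an exponent shift, which is part of why MXFP4 is cheap in hardware; the per-block sharing is also why block-wise sensitivity analyses (as in the paper above) matter.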
