Diagnosing FP4 inference: a layer-wise and block-wise sensitivity analysis of NVFP4 and MXFP4
#LLM #FP4 #NVFP4 #MXFP4 #Precision #AMD #NVIDIA
hgpu.org?p=30661
Hashtag
#MXFP4
With #ChatGPT5, #OpenAI shifted some of its priorities toward efficiency and cost savings.
The new auto-routing helps select the best model for each prompt (with some success).
But another big change is support for #MXFP4, which could have a significant impact on all future models.
#AI #ChatGPT
How OpenAI used a new data type to cut inference costs by 75%
The decision to use #MXFP4 makes models smaller, faster, and, more importantly, cheaper for everyone involved
MXFP4 is a microscaling block floating-point format defined by the Open Compute Project (OCP)
www.theregister.com/2025/08/10/o...
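For readers wondering what "microscaling block floating-point" means in practice, here is a minimal Python sketch. It assumes the layout described in the OCP MX specification: blocks of 32 elements, each element stored as a 4-bit E2M1 value, with one shared 8-bit power-of-two scale per block. The function names are hypothetical, the scale rule shown is one common choice rather than the only one, and rounding ties are simplified, so treat it as an illustration and not as any vendor's implementation.

```python
import math

# Magnitudes representable by one FP4 E2M1 element (the sign is a separate bit).
E2M1_GRID = [0.0, 0.5, 1.0, 1.5, 2.0, 3.0, 4.0, 6.0]
BLOCK_SIZE = 32  # elements that share one scale in MXFP4
E2M1_EMAX = 2    # exponent of the largest E2M1 value (6.0 = 1.5 * 2**2)


def quantize_block(values):
    """Return (shared_exponent, quantized_elements) for one block of floats."""
    amax = max(abs(v) for v in values)
    if amax == 0.0:
        return 0, [0.0] * len(values)
    # Power-of-two shared scale: align the block maximum with the top E2M1 binade.
    shared_exp = math.floor(math.log2(amax)) - E2M1_EMAX
    shared_exp = max(-127, min(127, shared_exp))  # stay within the 8-bit exponent range
    scale = 2.0 ** shared_exp
    quantized = []
    for v in values:
        mag = min(abs(v) / scale, E2M1_GRID[-1])  # saturate at 6.0
        # Nearest grid point; ties go to the smaller magnitude (a simplification).
        nearest = min(E2M1_GRID, key=lambda g: abs(g - mag))
        quantized.append(math.copysign(nearest, v))
    return shared_exp, quantized


def dequantize_block(shared_exp, quantized):
    """Rebuild approximate floats from the shared exponent and the FP4 elements."""
    scale = 2.0 ** shared_exp
    return [q * scale for q in quantized]


def bits_per_element(block_size=BLOCK_SIZE):
    """Average storage cost: 4 bits per element plus one 8-bit scale per block."""
    return 4 + 8 / block_size


if __name__ == "__main__":
    block = [0.1 * i for i in range(-16, 16)]
    exp, q = quantize_block(block)
    print(exp, dequantize_block(exp, q)[:4], bits_per_element())
```

Under these assumptions the storage cost works out to 4.25 bits per element, compared with 16 bits for a BF16 or FP16 weight.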