[1/N] add fp8 fp32 scale support for custom RL model #368
yiakwy-xpu-ml-framework-team wants to merge 2 commits into
@antirez could you have a look at it?
Hi, the PR itself has a few quality issues, but more importantly it is not clear why it would be useful for the project as a whole, given that we convert from the DS4 Hugging Face formats.
Quantization is successful. @antirez Thank you for the quick response, let me explain.
The model is tuned specifically to handle Cantonese, Mandarin Chinese, and English efficiently.


Background
We added an FP8 RL+SFT version of DeepSeek V4 as part of our week-0 support, and in our internal evaluation it surpassed the DeepSeek V4 baseline in all major dimensions.
Hence we want to add 2-bit support for DeepSeek V4 with our Expert Pruning technology:

Note: on H100/H800 we usually don't use E8M0 for the scale, since it introduces runtime overhead. In our experience an FP32 scale works best there.
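
To make the difference between the two scale formats concrete, here is a minimal sketch, not the code in this PR. It assumes per-block absmax scaling into FP8 E4M3 (largest finite value 448) and an E8M0 scale obtained by rounding the FP32 scale up to the next power of two. The names `fp32_scale` and `e8m0_scale` are hypothetical, and Python floats (doubles) stand in for FP32.

```python
import math

# Largest finite magnitude representable in FP8 E4M3 (assumed target format).
FP8_E4M3_MAX = 448.0


def fp32_scale(block):
    """Per-block scale kept as a plain floating-point number.

    Python floats are doubles; they stand in for FP32 in this sketch.
    """
    amax = max(abs(x) for x in block)
    return amax / FP8_E4M3_MAX if amax > 0 else 1.0


def e8m0_scale(block):
    """Per-block scale restricted to a power of two.

    E8M0 stores only an exponent, so the scale is rounded up to the
    next power of two to keep every scaled value within FP8 range.
    """
    s = fp32_scale(block)
    return 2.0 ** math.ceil(math.log2(s))


block = [0.5, -3.0, 1.25, 2.0]
s32 = fp32_scale(block)   # uses the full FP8 range for this block
s8 = e8m0_scale(block)    # power of two, never smaller than s32

print("fp32 scale:", s32)
print("e8m0 scale:", s8)
print("largest scaled value (fp32):", max(abs(x) / s32 for x in block))
print("largest scaled value (e8m0):", max(abs(x) / s8 for x in block))
```

With the FP32 scale the largest element of the block maps to the edge of the FP8 range, while the power-of-two scale leaves part of that range unused. The sketch only illustrates the numeric formats; it does not model the runtime cost mentioned above.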