[1/N] add fp8 fp32 scale support for custom RL model by yiakwy-xpu-ml-framework-team · Pull Request #368 · antirez/ds4

yiakwy-xpu-ml-framework-team · 2026-06-09T08:20:17Z

Background

We added an FP8 RL+SFT version of DeepSeek V4 with week-0 support, and in our internal evaluation it surpassed the DeepSeek V4 baseline in all major dimensions.

Hence we want to add 2-bit support for DeepSeek V4 with our Expert Pruning technology.

Note that on H100/H800 we usually don't use E8M0 for the scale, since it introduces runtime overhead. An FP32 scale works best.
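For readers unfamiliar with the two scale encodings mentioned above, here is a minimal sketch of how an FP8 E4M3 weight is decoded and multiplied by its scale. It is not the PR's code: the function names are hypothetical, the block size and tensor layout used by the PR are not shown in this thread, and the E4M3 decoding assumes the OCP finite-only variant (bias 7, no infinities). An E8M0 scale is a bare power-of-two exponent (bias 127), whereas an FP32 scale is used directly as a multiplier.

```python
import math

def e4m3_to_float(byte: int) -> float:
    """Decode one FP8 E4M3 byte (assumed OCP finite-only variant) to a float."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp = (byte >> 3) & 0x0F   # 4 exponent bits, bias 7
    man = byte & 0x07          # 3 mantissa bits
    if exp == 0x0F and man == 0x07:
        return math.nan        # the only NaN pattern; this variant has no inf
    if exp == 0:
        return sign * (man / 8.0) * 2.0 ** -6        # subnormal
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)

def e8m0_to_float(byte: int) -> float:
    """Decode one E8M0 scale byte: a pure power of two with bias 127."""
    if byte == 0xFF:
        return math.nan
    return 2.0 ** (byte - 127)

def dequant_block(weights: bytes, scale: float) -> list[float]:
    """Dequantize a block of E4M3 bytes that share one scale factor."""
    return [e4m3_to_float(b) * scale for b in weights]

# With an FP32 scale the stored value is the multiplier itself:
fp32_scaled = dequant_block(bytes([0x38, 0x40, 0xB8]), 0.5)
# With an E8M0 scale the stored byte must first be expanded to a power of two:
e8m0_scaled = dequant_block(bytes([0x38, 0x40, 0xB8]), e8m0_to_float(126))
```

Both calls yield the same values here because 0.5 is a power of two. An FP32 scale can also represent factors that are not powers of two, which E8M0 cannot.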

yiakwy-xpu-ml-framework-team · 2026-06-09T08:20:35Z

@antirez could you have a look at it?

antirez · 2026-06-09T11:05:53Z

Hi, the PR itself has a few quality issues, but above all it is not clear why it would be useful for the project as a whole, given that we convert from the DS4 Hugging Face formats.

yiakwy-xpu-ml-framework-team · 2026-06-10T03:12:20Z

Quantization is successful.

@antirez Thank you for the quick response. Let me explain.

Our SFT/RL model of DeepSeek V4 has an embedding layer stored as bf16 or int32, while the DeepSeek model has an embedding of type int64.
Since we are running on the Hopper platform, our expert weights are stored as E4M3 FP8 and the weight scales as FP32 for best performance (which can be verified in SGLang):

Customer DSV4 SGLang FP8 serving on the Hopper platform with identity injection, private/public knowledge injection, and an enhanced security shield module
The Hugging Face model is not an SGLang-compatible version, while ours is; and Hugging Face does not cover converting SFT/RL models from BF16 to FP8 variants.
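To make the BF16-to-FP8 conversion mentioned above concrete, here is a minimal sketch of per-block quantization with an FP32 scale. It is an illustration under stated assumptions, not the converter from this PR: `quantize_block` and the lookup table are hypothetical names, the scale is chosen as the block's absolute maximum divided by 448 (the largest finite E4M3 value in the assumed OCP finite-only variant), and encoding is done by brute-force nearest-value search rather than a real kernel.

```python
import math

E4M3_MAX = 448.0  # largest finite magnitude in the assumed E4M3 variant

def e4m3_to_float(byte: int) -> float:
    """Decode one FP8 E4M3 byte (bias 7, finite-only variant assumed)."""
    sign = -1.0 if byte & 0x80 else 1.0
    exp, man = (byte >> 3) & 0x0F, byte & 0x07
    if exp == 0x0F and man == 0x07:
        return math.nan
    if exp == 0:
        return sign * (man / 8.0) * 2.0 ** -6
    return sign * (1.0 + man / 8.0) * 2.0 ** (exp - 7)

# Every finite E4M3 code paired with its value, used for nearest-value encoding.
_CODES = [(e4m3_to_float(b), b) for b in range(256)
          if not math.isnan(e4m3_to_float(b))]

def quantize_block(values: list[float]) -> tuple[bytes, float]:
    """Quantize one block to E4M3 bytes plus a single FP32-style scale."""
    amax = max(abs(v) for v in values)
    scale = amax / E4M3_MAX if amax > 0 else 1.0
    codes = bytes(
        min(_CODES, key=lambda c: abs(c[0] - v / scale))[1] for v in values
    )
    return codes, scale

codes, scale = quantize_block([448.0, -224.0, 0.0])
restored = [e4m3_to_float(b) * scale for b in codes]
```

The scale maps the block's largest magnitude onto the top of the E4M3 range, so the remaining values use as much of the 8-bit code space as possible. A production converter would operate on whole tensors and pick its own block shape.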

The model is tuned specifically to handle Cantonese, Mandarin Chinese, and English efficiently.

add fp8 fp32 scale support for custom RL model

4decca9

This was referenced Jun 10, 2026

[2/N] add cuda imatrix support for custom RL model #377

Open

Distributed CUDA worker host-registers the whole GGUF, not just its layer slice → Q4 OOM on 128GB DGX Spark #293

Open

our sft/rl model does not contain fp8 scale for this weight

fe08b54

yiakwy-xpu-ml-framework-team wants to merge 2 commits into antirez:main from yiakwy-xpu-ml-framework-team:add_fp8_fp32_scale_support

2 participants
