Commit f38b174
Fix noaux_tc cuda Error 700 in CUDAGraph and Add wfp8apf8 moe quant method (#4115)
* improve per_token_quant_fp8 performance
* support moe wfp8apf8
* check glm test
* fix noaux_tc op in cudagraph, support noaux_tc return the correct
* check
* check inf and overwrite score in noaux_tc
---------
Co-authored-by: Jiang-Jia-Jun <163579578+Jiang-Jia-Jun@users.noreply.github.com>1 parent 6b47773 commit f38b174
17 files changed
Lines changed: 921 additions & 122 deletions
File tree
- custom_ops/gpu_ops
- quantization
- fastdeploy/model_executor
- layers
- moe
- quantization
- models
- tests
- e2e
- operators
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
564 | 564 | | |
565 | 565 | | |
566 | 566 | | |
| 567 | + | |
567 | 568 | | |
568 | 569 | | |
569 | 570 | | |
| |||
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
151 | 151 | | |
152 | 152 | | |
153 | 153 | | |
| 154 | + | |
| 155 | + | |
| 156 | + | |
| 157 | + | |
| 158 | + | |
| 159 | + | |
| 160 | + | |
| 161 | + | |
| 162 | + | |
| 163 | + | |
| 164 | + | |
| 165 | + | |
| 166 | + | |
| 167 | + | |
| 168 | + | |
| 169 | + | |
| 170 | + | |
| 171 | + | |
| 172 | + | |
| 173 | + | |
| 174 | + | |
| 175 | + | |
| 176 | + | |
| 177 | + | |
| 178 | + | |
| 179 | + | |
| 180 | + | |
| 181 | + | |
154 | 182 | | |
155 | 183 | | |
156 | 184 | | |
| |||
563 | 591 | | |
564 | 592 | | |
565 | 593 | | |
| 594 | + | |
| 595 | + | |
| 596 | + | |
| 597 | + | |
| 598 | + | |
| 599 | + | |
| 600 | + | |
| 601 | + | |
| 602 | + | |
| 603 | + | |
| 604 | + | |
| 605 | + | |
| 606 | + | |
| 607 | + | |
| 608 | + | |
| 609 | + | |
| 610 | + | |
| 611 | + | |
| 612 | + | |
| 613 | + | |
| 614 | + | |
| 615 | + | |
| 616 | + | |
| 617 | + | |
| 618 | + | |
| Original file line number | Diff line number | Diff line change | |
|---|---|---|---|
| |||
26 | 26 | | |
27 | 27 | | |
28 | 28 | | |
| 29 | + | |
29 | 30 | | |
30 | 31 | | |
31 | 32 | | |
| |||
48 | 49 | | |
49 | 50 | | |
50 | 51 | | |
| 52 | + | |
51 | 53 | | |
52 | 54 | | |
53 | 55 | | |
| |||
76 | 78 | | |
77 | 79 | | |
78 | 80 | | |
| 81 | + | |
79 | 82 | | |
80 | 83 | | |
81 | 84 | | |
| |||
0 commit comments