Commit 789a6cb
da.huo
fix: resolve undefined symbol and MoE dispatch crash
CMakeLists.txt:
Move tma.cu from gemm2 into GEMM2_KERNELS_SM90, so make_2d_tma_desc
resides in the same archive (libgemm2_sm90.a) as its SM90 CUTLASS
callers. This fixes the undefined symbol error caused by single-pass
static-link ordering between libgemm2.a and libgemm2_sm90.a.
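
The ordering problem described above can be illustrated with a small simulation. This is only a sketch of how a single-pass linker scans static archives, not the project's build logic: `ObjectFile`, `Archive`, `link_single_pass`, and the symbol `sm90_kernel` are hypothetical names, and it assumes libgemm2.a appeared before libgemm2_sm90.a on the link line.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical model of an archive member: the symbols it defines and needs.
struct ObjectFile {
    std::set<std::string> defines;
    std::set<std::string> needs;
};
using Archive = std::vector<ObjectFile>;

// Simulates a single left-to-right pass over static archives.
// A member is pulled in only if it defines a symbol that is undefined
// at the moment its archive is scanned. Returns the symbols left unresolved.
std::set<std::string> link_single_pass(const ObjectFile& root,
                                       const std::vector<Archive>& archives) {
    std::set<std::string> defined = root.defines;
    std::set<std::string> undefined;
    for (const auto& s : root.needs)
        if (!defined.count(s)) undefined.insert(s);

    for (const Archive& ar : archives) {
        std::vector<bool> pulled(ar.size(), false);
        bool progress = true;
        while (progress) {  // rescan this archive until nothing new is pulled
            progress = false;
            for (std::size_t i = 0; i < ar.size(); ++i) {
                if (pulled[i]) continue;
                bool wanted = false;
                for (const auto& d : ar[i].defines)
                    if (undefined.count(d)) { wanted = true; break; }
                if (!wanted) continue;
                pulled[i] = true;
                progress = true;
                for (const auto& d : ar[i].defines) {
                    defined.insert(d);
                    undefined.erase(d);
                }
                for (const auto& n : ar[i].needs)
                    if (!defined.count(n)) undefined.insert(n);
            }
        }
        // Earlier archives are never revisited after this point.
    }
    return undefined;
}
```

In this model, an archive holding only the definition of make_2d_tma_desc, scanned before the archive holding its caller, leaves make_2d_tma_desc unresolved. Placing both members in the same archive resolves it regardless of member order.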
LlamaLinear.cu:
Guard invokeMoeDispatchScales with `if (U)`. The is_cublas_grouped
path (SM100 bf16 MoE) enters the dispatch block without quantization,
leaving the scales tensor U empty. Calling invokeMoeDispatchScales on
an empty tensor crashes with std::out_of_range on B200.
1 parent b101a61 commit 789a6cb
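
The guard described in the LlamaLinear.cu note can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual code: `Tensor`, `dispatch_scales`, and `dispatch_block` are hypothetical stand-ins, and the real tensor type and the signature of invokeMoeDispatchScales are not shown in the commit message.

```cpp
#include <stdexcept>
#include <vector>

// Hypothetical stand-in for the scales tensor U; not the project's real type.
struct Tensor {
    std::vector<float> data;
    // Converts to true only when the tensor holds data, mirroring `if (U)`.
    explicit operator bool() const { return !data.empty(); }
};

// Hypothetical stand-in for invokeMoeDispatchScales.
// It reads the first element, so an empty tensor makes
// std::vector::at throw std::out_of_range.
void dispatch_scales(const Tensor& U, std::vector<float>& out) {
    out.push_back(U.data.at(0));
}

// Dispatch block with the guard applied: scales are dispatched only
// when U is non-empty (the quantized path). Returns whether it ran.
bool dispatch_block(const Tensor& U, std::vector<float>& out) {
    if (U) {
        dispatch_scales(U, out);
        return true;
    }
    return false;  // unquantized path: no scales to dispatch
}
```

With the guard, calling `dispatch_block` on an empty `Tensor` returns without touching the scales, while the unguarded `dispatch_scales` call throws on the same input.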
2 files changed
Lines changed: 9 additions & 10 deletions