Commit 9b9ce04
Fix Qwen RMSNorm: revert runtime +1 for GGUF + switch demo to Qwen3.5 (#24)
PR #23 incorrectly added RMSNorm +1 for all Qwen-family GGUF models.
Investigation reveals:
- Qwen2/Qwen3: standard RMSNorm (weight * norm(x)), no +1 needed
- Qwen3.5/Gemma: use (1+weight), but llama.cpp's GGUF converter
already bakes +1 into the weights during conversion
- Runtime +1 was double-applying for Qwen3.5 and incorrectly
applying for Qwen2/3, causing activation explosion
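The double-application described above can be illustrated with a small numeric sketch. This is not the project's code: `rms_norm`, its `add_one` flag, and the sample values are hypothetical, and the "baked" weights simply assume the converter stores `1 + weight`, as the findings above state.

```python
import math

def rms_norm(x, weight, eps=1e-6, add_one=False):
    """RMSNorm over a 1-D list; with add_one=True the scale is (1 + weight)."""
    rms = math.sqrt(sum(v * v for v in x) / len(x) + eps)
    scale = [(1.0 + w) if add_one else w for w in weight]
    return [v / rms * s for v, s in zip(x, scale)]

x = [1.0, -2.0, 3.0, -4.0]

# Hypothetical raw-checkpoint weights for a model that uses (1 + weight).
raw_weight = [0.1, 0.2, 0.3, 0.4]
# Assumption: the converter has already baked the +1 into the stored weights.
baked_weight = [1.0 + w for w in raw_weight]

reference = rms_norm(x, raw_weight, add_one=True)         # intended result
gguf_correct = rms_norm(x, baked_weight, add_one=False)   # baked weights, no runtime +1
gguf_double = rms_norm(x, baked_weight, add_one=True)     # baked weights plus runtime +1
```

With baked weights and no runtime +1 the output matches the reference. Adding +1 again scales every element by (2 + weight) instead of (1 + weight), and that extra gain is applied at every normalization layer.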
Fix: skip runtime +1 for all GGUF models. Only apply for non-GGUF
(raw checkpoint) DeltaNet models.
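A minimal sketch of the gating rule stated in the fix, assuming the loader can tell whether the weights came from a GGUF file and whether the model is a DeltaNet model. `ModelInfo`, its fields, and `apply_runtime_plus_one` are hypothetical names for illustration, not identifiers from this repository.

```python
from dataclasses import dataclass

@dataclass
class ModelInfo:
    is_gguf: bool      # hypothetical flag: weights were loaded from a GGUF file
    is_deltanet: bool  # hypothetical flag: the model is a DeltaNet model

def apply_runtime_plus_one(info: ModelInfo) -> bool:
    """Return True only for non-GGUF (raw checkpoint) DeltaNet models."""
    # GGUF weights are treated as final, so no +1 is added at runtime.
    return info.is_deltanet and not info.is_gguf
```

Under this rule every GGUF model skips the runtime +1, regardless of model family.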
Also switch WASM demo default from Qwen3-0.6B Q4_K_M (broken due to
double-quantization on a tiny model) to Qwen3.5-0.8B Q4_K_M (~508 MB)
which produces coherent output at 25 tok/s.
Verified:
- Qwen3.5 0.8B Q8_0: coherent English output
- Llama 3.2 1B Q8_0: coherent English output (unchanged)
- Qwen3 0.6B Q4_K_M: real words now (was garbage Unicode), but
quality limited by double-quantization on 0.6B model
Co-authored-by: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent a44df86 commit 9b9ce04
5 files changed
Lines changed: 25 additions & 31 deletions
Binary file not shown.