Commit 199f066
Fix delta KV NaN on Qwen: auto-disable for DeltaNet hybrid models
Delta KV compression requires a pure self-attention architecture.
DeltaNet hybrid models (Qwen3.5) have non-contiguous attention layers
that cause NaN in delta accumulation.
Fix: auto-detect DeltaNet (delta_n_heads > 0) and disable delta with warning.
Llama-family models (SmolLM2) continue to work correctly.
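
The detection rule described above can be sketched as follows. This is a minimal illustration, not the code from this commit: only `delta_n_heads` and the "disable with a warning" behavior come from the commit message, while `ModelConfig`, `KVCacheOptions`, and `resolve_delta_kv` are hypothetical names chosen for the example.

```python
import warnings
from dataclasses import dataclass


@dataclass
class ModelConfig:
    # Hypothetical config; only delta_n_heads is named in the commit message.
    name: str
    delta_n_heads: int = 0  # > 0 is treated as "model has DeltaNet layers"


@dataclass
class KVCacheOptions:
    # Hypothetical options for the KV cache.
    delta_enabled: bool = False
    delta_bits: int = 3


def resolve_delta_kv(model: ModelConfig, opts: KVCacheOptions) -> KVCacheOptions:
    """Return options with delta compression turned off for DeltaNet hybrids."""
    if opts.delta_enabled and model.delta_n_heads > 0:
        # Warn instead of failing, so the run continues without delta.
        warnings.warn(
            f"{model.name}: DeltaNet layers detected "
            f"(delta_n_heads={model.delta_n_heads}); "
            "disabling delta KV compression."
        )
        return KVCacheOptions(delta_enabled=False, delta_bits=opts.delta_bits)
    # Pure self-attention models keep the requested settings unchanged.
    return opts
```

With this shape, a hybrid model that requests delta falls back to the non-delta path with a warning, and a model with `delta_n_heads == 0` is left untouched.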
Qwen + delta: auto-disabled → PPL 153.6 (runs without delta, no NaN)
SmolLM2 + 3-bit delta: PPL 9.67 (+1.7%) — confirmed working
33/33 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 9c38016 commit 199f066
3 files changed
Lines changed: 20 additions & 26 deletions
| File (name not preserved) | Hunk | Lines added | Lines removed |
|---|---|---|---|
| 1 of 3 | `@@ -166,6 +166,12 @@` | 6 | 0 |
| 2 of 3 | `@@ -1113,8 +1113,14 @@` | 8 | 2 |
| 2 of 3 | `@@ -1679,29 +1685,6 @@` | 0 | 23 |
| 3 of 3 | `@@ -351,7 +351,12 @@` | 6 | 1 |

(Diff line contents were not preserved; only the line ranges are recoverable.)