Commit ca820d1
Mixed-precision delta: FP32 I-frames + quantized P-frames
I-frames are stored in FP32 as exact references; P-frames are stored as quantized deltas.
Added a --iframe N CLI flag to set the I-frame interval.
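The scheme described above can be illustrated with a minimal sketch. It is not the code from this commit. It assumes that each P-frame delta is taken against the previously reconstructed frame, and that deltas use a symmetric uniform quantizer with one scale per frame; the commit message confirms neither. The names `quantize`, `dequantize`, `encode`, and `decode` are hypothetical, and Python floats stand in for FP32 storage.

```python
def quantize(values, bits):
    """Symmetric uniform quantizer with one scale per frame (assumed scheme)."""
    if bits < 2:
        raise ValueError("this sketch needs at least 2 bits for signed codes")
    levels = (1 << (bits - 1)) - 1          # 3 for 3-bit, 1 for 2-bit
    peak = max(abs(v) for v in values)
    scale = peak / levels if peak > 0 else 1.0
    codes = [max(-levels, min(levels, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    return [c * scale for c in codes]


def encode(frames, bits, interval):
    """Store every `interval`-th frame unquantized (I-frame); store the rest
    as quantized deltas against the running reconstruction (P-frames)."""
    stream, recon = [], None
    for i, frame in enumerate(frames):
        if i % interval == 0:
            stream.append(("I", list(frame)))   # exact reference frame
            recon = list(frame)
        else:
            delta = [x - r for x, r in zip(frame, recon)]
            codes, scale = quantize(delta, bits)
            stream.append(("P", codes, scale))
            # Track what the decoder will see, so the next delta is taken
            # against the reconstruction rather than the original frame.
            recon = [r + d for r, d in zip(recon, dequantize(codes, scale))]
    return stream


def decode(stream):
    frames, recon = [], None
    for entry in stream:
        if entry[0] == "I":
            recon = list(entry[1])
        else:
            _, codes, scale = entry
            recon = [r + d for r, d in zip(recon, dequantize(codes, scale))]
        frames.append(list(recon))
    return frames
```

In this sketch `interval` plays the role of N in `--iframe N`: every N-th frame decodes exactly, and quantization error can only accumulate across the P-frames in between.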
Results (SmolLM2 1.7B):
3-bit + delta (FP32 I, N=64): PPL 9.61 (+1.1%) ← near-lossless at ~4 bpe
2-bit + delta (FP32 I, N=8): PPL 12.55 (+32%) ← best 2-bit, but 6.6 bpe
2-bit + delta (FP32 I, N=32): PPL 12.95 (+36%) ← 3.9 bpe
Conclusion: 3-bit + delta = practical sweet spot (PPL +1.1%).
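2-bit remains challenging: a short I-frame interval limits drift but raises FP32 overhead (N=8: 6.6 bpe), while a long interval cuts overhead but lets drift grow (N=32: PPL +36%).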
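The overhead side of this trade-off can be checked with simple arithmetic. The sketch below reads "bpe" as bits per element and counts payload bits only: one 32-bit I-frame followed by N-1 quantized P-frames. Both are assumptions, and `ideal_bits_per_element` is a hypothetical helper. The reported figures are higher than this payload-only estimate. One possible explanation is per-block metadata such as scales, but the commit message does not say.

```python
def ideal_bits_per_element(pframe_bits, interval, iframe_bits=32):
    """Average payload bits per element with one I-frame every `interval` frames."""
    return (iframe_bits + (interval - 1) * pframe_bits) / interval


# (P-frame bits, interval N, bpe reported in the results above)
for bits, interval, reported in [(3, 64, 4.0), (2, 8, 6.6), (2, 32, 3.9)]:
    ideal = ideal_bits_per_element(bits, interval)
    print(f"{bits}-bit, N={interval}: payload {ideal:.2f} bpe, reported ~{reported} bpe")
# 3-bit, N=64: payload 3.45 bpe, reported ~4.0 bpe
# 2-bit, N=8: payload 5.75 bpe, reported ~6.6 bpe
# 2-bit, N=32: payload 2.94 bpe, reported ~3.9 bpe
```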
Auto-disables delta for DeltaNet hybrid models (NaN prevention).
33/33 tests pass.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 65bbf5f commit ca820d1
2 files changed
Lines changed: 6 additions & 12 deletions
File 1 of 2 (file name and line contents not captured):
@@ -167,12 +167,8 @@ 2 additions, 6 deletions
File 2 of 2 (file name and line contents not captured):
@@ -355,11 +355,8 @@ 2 additions, 5 deletions
@@ -419,7 +416,7 @@ 1 addition, 1 deletion
@@ -1009,6 +1006,7 @@ 1 addition