Commit b72f05f
CLI default KV type: uniform_4b → turbo_kv_4b
Variant F validation across two models confirms turbo_kv_4b beats
uniform_4b at the same 4-bit budget on both:
SmolLM2 135M (FP32 PPL 18.62; deltas vs FP32):
  uniform_4b   20.33 (+9.2%)
  turbo_kv_4b  19.70 (+5.8%) ✅ 3.1% lower PPL than uniform_4b
Llama 3.2 3B (FP32 PPL 13.56; deltas vs FP32):
  uniform_4b   14.41 (+6.3%)
  turbo_kv_4b  14.28 (+5.3%) ✅ 0.9% lower PPL than uniform_4b
Smaller model = larger relative improvement, consistent with the
finer codebook (16 levels vs 15) capturing more of the per-block
distribution detail.
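The percentages quoted above can be re-derived from the raw perplexities. The following is a minimal sketch for checking that arithmetic. The perplexity values are copied from this commit message; the `RESULTS` layout and the `pct_change` and `summarize` helpers are illustrative names and not part of the project.

```python
# Perplexities copied from the commit message. The dict layout and the
# helper names below are illustrative assumptions, not project code.
RESULTS = {
    "SmolLM2 135M": {"fp32": 18.62, "uniform_4b": 20.33, "turbo_kv_4b": 19.70},
    "Llama 3.2 3B": {"fp32": 13.56, "uniform_4b": 14.41, "turbo_kv_4b": 14.28},
}


def pct_change(new: float, base: float) -> float:
    """Relative change of `new` versus `base`, in percent."""
    return (new / base - 1.0) * 100.0


def summarize(results: dict) -> dict:
    """Per model: each KV type vs FP32, and turbo_kv_4b vs uniform_4b."""
    rows = {}
    for model, ppl in results.items():
        rows[model] = {
            "uniform_vs_fp32": round(pct_change(ppl["uniform_4b"], ppl["fp32"]), 1),
            "turbo_vs_fp32": round(pct_change(ppl["turbo_kv_4b"], ppl["fp32"]), 1),
            "turbo_vs_uniform": round(pct_change(ppl["turbo_kv_4b"], ppl["uniform_4b"]), 1),
        }
    return rows


if __name__ == "__main__":
    for model, row in summarize(RESULTS).items():
        print(model, row)
```

A negative `turbo_vs_uniform` value means turbo_kv_4b has lower (better) perplexity than uniform_4b for that model.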
Switch the CLI default so users get the better quantization
without having to know the type name. uniform_4b remains available
via -k uniform_4b.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 61c0f54 commit b72f05f
1 file changed
Lines changed: 5 additions & 5 deletions
Changed hunks (line contents were not captured):
@@ -12,7 +12,7 @@ line 15 replaced
@@ -71,7 +71,7 @@ line 74 replaced
@@ -85,8 +85,8 @@ lines 88-89 replaced
@@ -145,7 +145,7 @@ line 148 replaced