Commit 18eeed1
bench: llama.cpp full KV type PPL comparison (q4_0 to q8_0)
SmolLM2 1.7B, 2K tokens:
f16: 2.83, q8_0: 2.82, q5_1: 2.86, q5_0: 2.85, q4_1: 2.92, q4_0: 3.13
Q5 types stay within about 1% of f16 PPL (q5_0: +0.7%, q5_1: +1.1%, computed from the rounded values above). Q4_0 shows +10.6%.
This provides the baseline for comparison with TurboQuant approaches.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 3bc4ded commit 18eeed1
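The percentages quoted in the commit message can be recomputed from the listed perplexities. The sketch below is illustrative only and is not part of the commit: it assumes f16 is the baseline (the +10.6% figure for q4_0 is consistent with that) and uses the rounded two-decimal values from the message, so results may differ slightly from the unrounded benchmark output. The names `ppl` and `relative_change_pct` are hypothetical.

```python
# Perplexity values as reported in the commit message
# (SmolLM2 1.7B, 2K tokens). These are rounded to two decimals.
ppl = {
    "f16": 2.83,
    "q8_0": 2.82,
    "q5_1": 2.86,
    "q5_0": 2.85,
    "q4_1": 2.92,
    "q4_0": 3.13,
}


def relative_change_pct(value: float, baseline: float) -> float:
    """Return the change of `value` relative to `baseline`, in percent."""
    return (value - baseline) / baseline * 100.0


# Assumption: the f16 KV cache is the reference point for all comparisons.
baseline = ppl["f16"]
changes = {name: relative_change_pct(v, baseline) for name, v in ppl.items()}

for name, pct in changes.items():
    print(f"{name:>5}: PPL {ppl[name]:.2f} ({pct:+.1f}% vs f16)")
```

Because the inputs are rounded, differences of a few tenths of a percent (for example, whether q5_1 lands just under or just over 1%) should not be over-interpreted.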
1 file changed
Lines changed: 11 additions & 0 deletions