Commit 99fd881

unamedkr and claude committed
bench: 2-bit research results — drift is the fundamental barrier
All approaches were tested on real model data (SmolLM2 1.7B): error feedback, NF2, norm correction, and 2nd-order delta. None of them help. Per-delta cosine is fine (0.9975), but drift over 200 tokens lowers the accumulated cosine to 0.885. 3-bit + delta (+1.1% PPL) remains the practical minimum.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent ca820d1 commit 99fd881

1 file changed

Lines changed: 27 additions & 0 deletions

bench/results/real_kv_compression.md

@@ -32,3 +32,30 @@ Keys stored ONLY in quantized cache. Attention dequantizes per-query.
4. **Below Q4 V: noticeable degradation.** Q2 V adds +36% PPL.
5. **RHT-based types (turbo_kv_*) underperform uniform at head_dim=64.**
   turbo_kv_4b PPL is worse than uniform_4b despite same bit count.

## 2-bit Research: All Approaches Tested (SmolLM2 1.7B)

### Per-delta cosine (individual, dim=256, 199 deltas)
| Method | Cosine | Notes |
|--------|--------|-------|
| 2-bit uniform delta | 0.9975 | baseline |
| 2-bit + error feedback | 0.9963 | slightly worse than baseline |
| 2-bit NF2 (non-uniform) | 0.9959 | worse than baseline |
| 3-bit uniform delta | 0.9993 | reference |
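
A minimal sketch of how a per-delta cosine like the one above could be measured. It is not the benchmark code. Assumptions: "individual" means each quantized delta is added to the exact previous key, the quantizer is a per-vector min/max uniform quantizer, and the keys are a synthetic random walk standing in for real SmolLM2 keys. `quantize_uniform` and `per_delta_cosines` are hypothetical names, and the printed values will not match the table.

```python
import math
import random


def quantize_uniform(vec, bits):
    """Snap each element to one of 2**bits evenly spaced levels over [min, max]."""
    lo, hi = min(vec), max(vec)
    if hi == lo:
        return list(vec)  # constant vector: nothing to quantize
    step = (hi - lo) / ((1 << bits) - 1)
    return [lo + round((x - lo) / step) * step for x in vec]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def per_delta_cosines(keys, bits):
    """Cosine between keys[t] and keys[t-1] + Q(delta_t), using the exact keys[t-1]."""
    sims = []
    for prev, cur in zip(keys, keys[1:]):
        delta = [c - p for p, c in zip(prev, cur)]
        recon = [p + q for p, q in zip(prev, quantize_uniform(delta, bits))]
        sims.append(cosine(cur, recon))
    return sims


# Synthetic stand-in (assumption): 200 key vectors of dim=256 from a random walk.
rng = random.Random(0)
keys = [[rng.gauss(0.0, 1.0) for _ in range(256)]]
for _ in range(199):
    keys.append([k + rng.gauss(0.0, 0.3) for k in keys[-1]])

for bits in (2, 3):
    sims = per_delta_cosines(keys, bits)
    print(f"{bits}-bit: {len(sims)} deltas, mean cosine {sum(sims) / len(sims):.4f}")
```

Because every step starts from the exact previous key, quantization error cannot carry over from one step to the next in this measurement.
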
### Accumulated cosine (200 steps, drift)
| Method | Avg cosine | Notes |
|--------|------------|-------|
| Standard delta + 2-bit | 0.885 | drift accumulation |
| Norm-corrected delta + 2-bit | 0.877 | worse (distorts direction) |
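
The drift in this table can be illustrated with a small open-loop decoder: only the first key is exact, and every later key is rebuilt by summing quantized deltas, so each step's rounding error stays in the running reconstruction. This is a sketch under assumptions, not the benchmark code: the min/max uniform quantizer, the synthetic random-walk keys, and the name `open_loop_cosines` are all hypothetical.

```python
import math
import random


def quantize_uniform(vec, bits):
    """Snap each element to one of 2**bits evenly spaced levels over [min, max]."""
    lo, hi = min(vec), max(vec)
    if hi == lo:
        return list(vec)
    step = (hi - lo) / ((1 << bits) - 1)
    return [lo + round((x - lo) / step) * step for x in vec]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def open_loop_cosines(keys, bits):
    """Rebuild keys[t] as keys[0] + sum of quantized deltas; return cosine per step."""
    recon = list(keys[0])  # only the first key is stored exactly
    sims = []
    for prev, cur in zip(keys, keys[1:]):
        delta = [c - p for p, c in zip(prev, cur)]  # delta between exact keys
        recon = [r + q for r, q in zip(recon, quantize_uniform(delta, bits))]
        sims.append(cosine(cur, recon))  # rounding errors pile up in recon
    return sims


# Synthetic stand-in (assumption): 200 key vectors of dim=256 from a random walk.
rng = random.Random(0)
keys = [[rng.gauss(0.0, 1.0) for _ in range(256)]]
for _ in range(199):
    keys.append([k + rng.gauss(0.0, 0.3) for k in keys[-1]])

for bits in (2, 3):
    sims = open_loop_cosines(keys, bits)
    print(f"{bits}-bit: first {sims[0]:.4f}, last {sims[-1]:.4f}, "
          f"avg {sum(sims) / len(sims):.4f}")
```

On this synthetic data the last-step cosine falls below the first-step cosine, which is the qualitative pattern the table reports. The absolute numbers depend on the data and are not comparable to the measured 0.885.
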
### Second-order delta
| Metric | d1 (1st-order delta) | d2 (2nd-order delta) | d2/d1 |
|--------|----------------------|----------------------|-------|
| Range | 9.58 | 9.16 | 95.7% |
| RMS | 0.351 | 0.289 | 82.4% |
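
A sketch of how the d1/d2 statistics could be computed, and why a small drop in range matters. Assumptions: d1 is the difference between consecutive keys, d2 is the difference between consecutive d1 vectors, "Range" is max minus min over all elements, and the quantizer derives its step from that range. All function names here are hypothetical. Under those assumptions the step of a b-bit uniform quantizer is range / (2^b - 1), so the step shrinks only in proportion to the range.

```python
import math


def diff(vectors):
    """Element-wise difference between consecutive vectors."""
    return [[b - a for a, b in zip(x, y)] for x, y in zip(vectors, vectors[1:])]


def value_range(vectors):
    flat = [v for vec in vectors for v in vec]
    return max(flat) - min(flat)


def rms(vectors):
    flat = [v for vec in vectors for v in vec]
    return math.sqrt(sum(v * v for v in flat) / len(flat))


def delta_stats(keys):
    """Range and RMS of first-order (d1) and second-order (d2) deltas."""
    d1 = diff(keys)
    d2 = diff(d1)
    return {
        "range": (value_range(d1), value_range(d2)),
        "rms": (rms(d1), rms(d2)),
    }


def step_size(span, bits):
    """Step of a uniform quantizer with 2**bits levels covering `span`."""
    return span / ((1 << bits) - 1)


# Toy input to show the call shape; real input would be the cached key vectors.
toy_keys = [[0.0], [1.0], [3.0], [6.0]]
print(delta_stats(toy_keys))

# Ranges reported in the table above: the 2-bit step barely changes.
print(step_size(9.58, 2), step_size(9.16, 2))
```

With the reported ranges, the 2-bit step for d2 is only about 4% smaller than for d1, which is consistent with second-order deltas not helping a range-based quantizer.
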
### Conclusion
2-bit drift over 200 tokens (cos 0.9975 per delta → 0.885 accumulated) is the fundamental barrier.
No tested approach (error feedback, NF2, norm correction, 2nd-order delta) overcomes it.
3-bit + delta (+1.1% PPL) is the practical minimum.
