Commit 8b83b0c

unamedkr and claude committed
Llama architecture verified: SmolLM2 1.7B, 1-bit KV = PPL identical
4th architecture verified (Llama/SmolLM2, after Gemma 3, Qwen3.5, Qwen2-MoE).

SmolLM2 1.7B (Llama arch, GGUF Q8_0):
  baseline PPL:          5.8441
  1-bit K + FP16 V PPL:  5.8441 (+0.00%) ← exactly identical
  1-bit K + Q4 V PPL:    5.8233
  30-token output:       byte-identical ✓
  Speed:                 24 tok/s (Q4, 6T, M3)

Also fixed: 4B GGUF Q4 conversion (threshold 8→16 GB, DeltaNet detect)
  Qwen3.5-4B: 0.1 → 5.4 tok/s (54x improvement)

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 6b2ce68 commit 8b83b0c
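
The percentages and the speedup factor quoted in the commit message can be re-derived from the raw numbers. The snippet below is a minimal sketch for checking that arithmetic. The helper names `relative_change_pct` and `speedup` are hypothetical and are not part of `tq_run` or `tq_convert`; only the numeric inputs come from the commit message.

```python
# Hypothetical helpers for re-deriving the figures in the commit message.
# Nothing here calls into the project's own tools.

def relative_change_pct(baseline: float, candidate: float) -> float:
    """Percent change of candidate relative to baseline (negative means lower)."""
    return (candidate - baseline) / baseline * 100.0


def speedup(before_tok_s: float, after_tok_s: float) -> float:
    """Throughput ratio: how many times faster the 'after' measurement is."""
    return after_tok_s / before_tok_s


# Perplexity values reported for SmolLM2 1.7B in the commit message.
BASELINE_PPL = 5.8441
REPORTED_PPL = {
    "1-bit K + FP16 V": 5.8441,
    "1-bit K + Q4 V": 5.8233,
}

if __name__ == "__main__":
    for name, ppl in REPORTED_PPL.items():
        delta = relative_change_pct(BASELINE_PPL, ppl)
        print(f"{name}: PPL {ppl:.4f} ({delta:+.2f}%)")

    # Qwen3.5-4B throughput before and after the conversion fix (tok/s).
    print(f"Qwen3.5-4B speedup: {speedup(0.1, 5.4):.0f}x")
```

On these inputs the FP16 V configuration comes out at exactly 0% change, and the Q4 V configuration at roughly -0.36% relative to the baseline, a figure the commit message does not state explicitly.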

2 files changed

tq_convert

-192 KB
Binary file not shown.

tq_run

-295 KB
Binary file not shown.

0 commit comments