Commit a492db9
fix(gemma4): add V-norm + correct layer_output_scale to simple multiply
CRITICAL FINDINGS from llama.cpp gemma4-iswa.cpp source comparison:
1. V-norm missing (line 92 in llama.cpp):
   Vcur = ggml_rms_norm(Vcur, eps)
   This is a weight-free RMS normalization of the V projection output.
   This commit adds it for Gemma 4 when QK-norm is present.
2. layer_output_scale: confirmed that a simple multiply is correct:
   llama.cpp line 228: cur = ggml_mul(cur, out_scale)
   The scale is applied to the entire layer output, including the residual.
   The model was trained with this scaling.
3. KV sharing (line 79, 105-109):
has_kv(il) controls whether K/V are computed or reused from cache.
Shared layers pass nullptr for Kcur/Vcur to build_attn().
4. Gemma 4 chat template confirmed: <|turn>/<turn|>/<|think|>
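
Findings 1 and 2 can be sketched in a few lines. This is a minimal illustration in plain Python, not the code from this commit: the function names, the per-head list layout, and the default `eps` are assumptions made for the example. It only mirrors the two operations quoted above: a weight-free RMS norm on V, and a multiply applied after the residual add.

```python
import math


def rms_norm(x, eps=1e-6):
    """Weight-free RMS norm of one vector: x / sqrt(mean(x^2) + eps)."""
    mean_sq = sum(v * v for v in x) / len(x)
    inv = 1.0 / math.sqrt(mean_sq + eps)
    return [v * inv for v in x]


def v_norm(v_heads, eps=1e-6):
    """Normalize each head's V vector independently.

    Assumption: normalization runs over the head dimension, one head at a time.
    There is no learned weight, unlike the usual Q/K norm.
    """
    return [rms_norm(head, eps) for head in v_heads]


def scale_layer_output(residual, block_out, out_scale):
    """Add the residual first, then scale the whole sum by out_scale."""
    return [(r + b) * out_scale for r, b in zip(residual, block_out)]
```

Scaling after the residual add differs from scaling only the block output: `(r + b) * s` is not `r + b * s` unless `s` is 1, which is why the placement matters.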
Output is still garbage after the V-norm fix. Remaining candidates:
- KV sharing (still disabled; needs a proper implementation)
- Other subtle differences in the attention computation
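
For the KV-sharing candidate, the control flow described in finding 3 might look roughly like the sketch below. It is a hypothetical illustration only: `plan_layer_kv`, `kv_source`, and `compute_kv` are names invented for this example, and the mapping from a shared layer to the layer whose cache it reads is an assumption, not something confirmed by the llama.cpp source quoted above.

```python
def plan_layer_kv(n_layer, has_kv, kv_source, compute_kv):
    """Decide, per layer, whether K/V are computed or reused from the cache.

    has_kv(il)     -> True if layer il computes its own K/V.
    kv_source(il)  -> (hypothetical) index of the layer whose cache a shared
                      layer reads.
    compute_kv(il) -> (hypothetical) callback returning that layer's K/V.
    """
    cache = {}
    plan = []
    for il in range(n_layer):
        if has_kv(il):
            # Layer owns its K/V: compute and store them.
            cache[il] = compute_kv(il)
            plan.append((il, il, True))
        else:
            # Shared layer: compute nothing (the nullptr case) and read
            # an earlier layer's cached K/V instead.
            src = kv_source(il)
            if src not in cache:
                raise ValueError(f"layer {il} shares K/V with uncached layer {src}")
            plan.append((il, src, False))
    return cache, plan
```

Each plan entry is `(layer, cache_layer, computed)`, which makes it easy to check that every shared layer points at a layer that actually wrote to the cache.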
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 0c6aa0e commit a492db9
1 file changed
Lines changed: 22 additions & 10 deletions
| Original file lines | New file lines | Change |
|---|---|---|
| (none) | 14323-14336 | 14 lines added after line 14322 |
| 15679-15687 | 15693-15699 | 9 lines removed, 7 lines added |
| 15699 | 15711 | 1 line replaced |