Commit f0091fc
fix(qwen35): detect DeltaNet layers before Phi-3 fused-QKV path
Regression from 08e8661 (Apr 12 split-source port): the Phi-3 fused QKV
detection matched every layer with attn_qkv.weight, including Qwen3.5
DeltaNet layers (which also expose attn_qkv.weight for their conv1d input
projection). All 32 layers were counted as self_attn instead of 8, so the
CLI path treated the 24 DeltaNet layers as ordinary self-attention and the
forward pass produced garbage for Qwen3.5-4B. The server was unaffected only
because its binary was stale, built before the regression.
Fix: probe blk.N.ssm_a before the attn_qkv check. When present, the layer
is DeltaNet and the existing DeltaNet loading path takes over. The quant.h
single-header build already had this guard; only split-source was affected.
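
A minimal sketch of the probe order described above, not the actual loader code. The helper has_tensor, the layer_kind enum, and the flat name list are hypothetical stand-ins; only the tensor names blk.N.ssm_a and blk.N.attn_qkv.weight come from this commit message.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical layer kinds; the real loader's types are not shown here. */
typedef enum { LAYER_UNKNOWN, LAYER_DELTANET, LAYER_SELF_ATTN } layer_kind;

/* Hypothetical lookup: true if "blk.<layer>.<suffix>" is in the name list. */
static bool has_tensor(const char *const *names, size_t n,
                       int layer, const char *suffix) {
    char key[128];
    assert(names != NULL);
    snprintf(key, sizeof key, "blk.%d.%s", layer, suffix);
    for (size_t i = 0; i < n; i++) {
        if (strcmp(names[i], key) == 0) return true;
    }
    return false;
}

/* Probe ssm_a first: DeltaNet layers also expose attn_qkv.weight, so
 * checking attn_qkv.weight first would misclassify them as self-attention. */
static layer_kind classify_layer(const char *const *names, size_t n, int layer) {
    if (has_tensor(names, n, layer, "ssm_a")) return LAYER_DELTANET;
    if (has_tensor(names, n, layer, "attn_qkv.weight")) return LAYER_SELF_ATTN;
    return LAYER_UNKNOWN;
}

/* Count layers that should take the fused-QKV self-attention path. */
static int count_self_attn(const char *const *names, size_t n, int n_layers) {
    int count = 0;
    for (int l = 0; l < n_layers; l++) {
        if (classify_layer(names, n, l) == LAYER_SELF_ATTN) count++;
    }
    return count;
}
```

With the two checks in the opposite order, every layer carrying attn_qkv.weight would land in LAYER_SELF_ATTN, which is the miscount (32 instead of 8) described above.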
Verified:
- CLI --chat "Hi" now produces "Hello! How can I help you?" (was: " -\n-")
- Hybrid detection logs "8 attn layers out of 32 total" (matches server)
- All 7 regression tests pass (Phi-3.5 Q8/Q4, Gemma E2B, Llama 3.1 8B,
Llama 3.2 1B/3B, Qwen2.5-0.5B)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 8cea571 commit f0091fc
1 file changed
Lines changed: 8 additions & 2 deletions
@@ -3246,10 +3246,16 @@ (diff body not preserved)