Commit d03e9f3
refactor(compute): extract shared matmul helpers to gpu_engine_matmul.go
Consolidate repeated patterns from 14 quantized matmul methods into 6
shared helpers: uploadRawBytes, aShapeCheck2D, bweightShapeMKN,
quantGemvResult, dequantSgemm, and sgemmNTOrFallback. Each original
method is now a thin wrapper calling these helpers, reducing
gpu_engine.go by 797 lines (net -557 across both files).
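
The thin-wrapper pattern described above can be sketched roughly as follows. This is a CPU-only illustration under stated assumptions, not the contents of gpu_engine_matmul.go: every name and signature here (shapeMKN, sgemm, dequantThenSgemm, MatMulQ8Scaled, MatMulQ8Unit) is hypothetical, and the real helpers presumably operate on device buffers rather than Go slices.

```go
package main

import (
	"errors"
	"fmt"
)

// shapeMKN (hypothetical) validates a row-major A (m x k) against a weight
// matrix B (k x n) and returns the three GEMM dimensions.
func shapeMKN(aShape, bShape []int) (m, k, n int, err error) {
	if len(aShape) != 2 || len(bShape) != 2 {
		return 0, 0, 0, errors.New("matmul: both operands must be 2D")
	}
	if aShape[1] != bShape[0] {
		return 0, 0, 0, fmt.Errorf("matmul: inner dimensions differ: %d vs %d", aShape[1], bShape[0])
	}
	return aShape[0], aShape[1], bShape[1], nil
}

// sgemm (hypothetical) is a naive CPU reference for C = A * B, row-major.
func sgemm(a, b []float32, m, k, n int) []float32 {
	c := make([]float32, m*n)
	for i := 0; i < m; i++ {
		for j := 0; j < n; j++ {
			var sum float32
			for p := 0; p < k; p++ {
				sum += a[i*k+p] * b[p*n+j]
			}
			c[i*n+j] = sum
		}
	}
	return c
}

// dequantThenSgemm (hypothetical) is the shared path: validate shapes,
// dequantize the weights with the supplied function, then multiply.
func dequantThenSgemm(a []float32, aShape []int, q []int8, bShape []int, dequant func(int8) float32) ([]float32, error) {
	m, k, n, err := shapeMKN(aShape, bShape)
	if err != nil {
		return nil, err
	}
	if len(a) != m*k || len(q) != k*n {
		return nil, errors.New("matmul: data length does not match shape")
	}
	b := make([]float32, len(q))
	for i, v := range q {
		b[i] = dequant(v)
	}
	return sgemm(a, b, m, k, n), nil
}

// MatMulQ8Scaled keeps its own signature and only supplies a dequantizer.
func MatMulQ8Scaled(a []float32, aShape []int, q []int8, bShape []int, scale float32) ([]float32, error) {
	return dequantThenSgemm(a, aShape, q, bShape, func(v int8) float32 { return float32(v) * scale })
}

// MatMulQ8Unit is a second wrapper over the same helper, with scale fixed at 1.
func MatMulQ8Unit(a []float32, aShape []int, q []int8, bShape []int) ([]float32, error) {
	return dequantThenSgemm(a, aShape, q, bShape, func(v int8) float32 { return float32(v) })
}

func main() {
	out, err := MatMulQ8Scaled([]float32{1, 2, 3, 4}, []int{2, 2}, []int8{1, 0, 0, 1}, []int{2, 2}, 0.5)
	fmt.Println(out, err)
}
```

Because each public method keeps its signature and delegates the shared steps, callers are unaffected while shape checks and the multiply live in one place.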
Zero behavioral changes -- all method signatures remain identical.

1 parent 3e5cb40 commit d03e9f3
2 files changed
Lines changed: 471 additions & 1028 deletions