feat(neural): export int8 lstm.bin (+ parity fixture) #34
Open
mkpoli wants to merge 1 commit into
Serializes the trained LSTM to a compact `lstm.bin` that the Rust DLL will load and run with a hand-rolled forward pass (no ONNX runtime). The four big matrices (embedding, the two LSTM weights, the output projection) are per-tensor symmetric int8 plus an f32 scale; biases stay f32.

Result: **4.37 MB**, and int8 is *free*: the dequantized model scores the same KSR as fp32 (38.9, in-vocab 44.9), with a max dequant error of 0.054. So we ship int8.

- `--emit-checkpoint` writes a dequantized `model.pt` so `eval_ksr.py` can measure the as-shipped KSR (it did: 38.9, unchanged).
- `--emit-parity` writes a JSON fixture (explicit-math forward on the exported weights) for the upcoming Rust module's parity test.

Binary layout is documented in the module header. `lstm.bin` and the fixtures are git-ignored (corpus-derived); regenerate them with this script.
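For readers unfamiliar with the scheme, per-tensor symmetric int8 quantization can be sketched roughly as below. This is only an illustration, not the export script itself: the function names (`quantize_symmetric`, `dequantize`, `pack_tensor`) and the byte layout in `pack_tensor` are hypothetical, and the real layout is the one documented in the module header.

```python
import struct


def quantize_symmetric(values):
    """Per-tensor symmetric int8: one f32 scale per tensor, zero-point fixed at 0."""
    max_abs = max((abs(v) for v in values), default=0.0)
    # Guard against an all-zero tensor so we never divide by zero.
    scale = max_abs / 127.0 if max_abs > 0.0 else 1.0
    # Round to nearest and clamp to the symmetric range [-127, 127].
    q = [max(-127, min(127, round(v / scale))) for v in values]
    return q, scale


def dequantize(q, scale):
    """Recover approximate f32 values: w ~= q * scale."""
    return [x * scale for x in q]


def pack_tensor(q, scale):
    # Hypothetical layout (assumption): little-endian f32 scale, then raw int8 bytes.
    return struct.pack("<f", scale) + struct.pack(f"<{len(q)}b", *q)


weights = [0.5, -1.27, 0.0, 0.3333]
q, scale = quantize_symmetric(weights)
restored = dequantize(q, scale)
# The round-trip error of any element is bounded by half a quantization step.
max_err = max(abs(a - b) for a, b in zip(weights, restored))
blob = pack_tensor(q, scale)
```

Because the zero-point is fixed at 0, dequantization is a single multiply per element, which keeps a hand-rolled forward pass simple.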
Serializes the trained LSTM to a compact `lstm.bin` for the Rust DLL (no ONNX runtime). Big matrices → per-tensor symmetric int8 + f32 scale; biases f32.

int8 is free: the dequantized model scores the same KSR as fp32, 38.9 (in-vocab 44.9), at 4.37 MB, max dequant error 0.054.

- `--emit-checkpoint` writes a dequantized `model.pt` so `eval_ksr.py` measures the as-shipped KSR (it did: 38.9, unchanged).
- `--emit-parity` writes a JSON fixture for the Rust module's parity test.

Stack: this → `feat/neural-rs` (module) → `feat/neural-wire` (wiring). `lstm.bin`/fixtures are git-ignored (corpus-derived).
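As a rough picture of what an explicit-math parity fixture involves, the sketch below runs one LSTM cell step with plain arithmetic and serializes the inputs and expected outputs as JSON. It is an assumption-laden illustration, not the fixture this PR emits: the gate order (i, f, g, o, as in PyTorch), the single combined bias, the helper names, and the JSON field names are all hypothetical.

```python
import json
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def matvec(m, v):
    """Dense matrix-vector product over plain Python lists."""
    return [sum(a * b for a, b in zip(row, v)) for row in m]


def lstm_step(x, h, c, w_ih, w_hh, b):
    """One LSTM cell step; gate order i, f, g, o (assumed, PyTorch convention)."""
    n = len(h)
    # Pre-activations for all four gates, stacked into one vector of length 4*n.
    z = [p + q + r for p, q, r in zip(matvec(w_ih, x), matvec(w_hh, h), b)]
    i = [sigmoid(v) for v in z[0:n]]
    f = [sigmoid(v) for v in z[n:2 * n]]
    g = [math.tanh(v) for v in z[2 * n:3 * n]]
    o = [sigmoid(v) for v in z[3 * n:4 * n]]
    c_new = [fv * cv + iv * gv for fv, cv, iv, gv in zip(f, c, i, g)]
    h_new = [ov * math.tanh(cv) for ov, cv in zip(o, c_new)]
    return h_new, c_new


# Tiny example: input size 1, hidden size 1, all-zero weights and bias.
x, h0, c0 = [1.0], [0.0], [1.0]
w_ih = [[0.0]] * 4
w_hh = [[0.0]] * 4
b = [0.0] * 4
h1, c1 = lstm_step(x, h0, c0, w_ih, w_hh, b)

# Hypothetical fixture shape: inputs plus the expected outputs to compare against.
fixture = json.dumps({"x": x, "h0": h0, "c0": c0, "h1": h1, "c1": c1})
```

A parity test on the Rust side would then load such a fixture, run its own forward pass on the same inputs, and compare the outputs within a small tolerance.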