feat(neural): pure-Rust streaming LSTM inference module #35
A dependency-free neural next-word engine: no ONNX runtime, just `std`. Loads the int8 `lstm.bin` (`export_weights.py`), dequantizes to f32 at load, and runs a streaming single-layer LSTM: keep `(h, c)` across a composition's committed words, step per word, and rank prefix-matching vocab by the next-word logits. Only the matching projection rows are scored, so it's cheap per keystroke.

- `src/neural.rs`: `Model::load` (bounds-checked, `None` on bad bytes), `State` + `step_id` / `step_word`, `logits`, `complete(state, prefix, k)`, and `global()` over the embedded table.
- `build.rs`: `provide_lstm_table` embeds `data/neural/lstm.bin` when present, else an empty header so public builds compile with the engine disabled (mirrors `provide_ngrams_table`).
- Tests: a synthetic hand-verified model (LSTM math) + garbage rejection.

The real model is parity-checked against the Python exporter offline (hidden state within tolerance, top-1 / top-10 agree), verified via the host harness.

Not wired into the IME yet (that's the next PR, behind a config flag). Clippy `-D warnings` clean on the Windows target.
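For reviewers who want the shape of the math without opening `src/neural.rs`, here is a minimal sketch of a streaming LSTM step followed by prefix-restricted scoring. It is not the code in this PR: the `Lstm` struct, the `[i, f, g, o]` gate ordering, the row-major weight layout, and the `complete_prefix` helper are all assumptions made for illustration.

```rust
/// Hypothetical dense single-layer LSTM (layout assumed, not taken from the PR).
/// Gate rows are stacked as [input, forget, cell candidate, output].
struct Lstm {
    hidden: usize,
    w_x: Vec<f32>,  // (4 * hidden) rows x input columns, row-major
    w_h: Vec<f32>,  // (4 * hidden) rows x hidden columns, row-major
    bias: Vec<f32>, // 4 * hidden
}

/// Recurrent state carried across the words of one composition.
struct State {
    h: Vec<f32>,
    c: Vec<f32>,
}

fn sigmoid(x: f32) -> f32 {
    1.0 / (1.0 + (-x).exp())
}

impl Lstm {
    /// Advance the state by one input vector (e.g. one word embedding).
    fn step(&self, x: &[f32], s: &mut State) {
        let n = self.hidden;
        let input = x.len();
        // All pre-activations are computed from the OLD h before h is updated.
        let mut gates = self.bias.clone();
        for (r, g) in gates.iter_mut().enumerate() {
            let wx = &self.w_x[r * input..(r + 1) * input];
            let wh = &self.w_h[r * n..(r + 1) * n];
            *g += wx.iter().zip(x).map(|(w, v)| w * v).sum::<f32>();
            *g += wh.iter().zip(&s.h).map(|(w, v)| w * v).sum::<f32>();
        }
        for j in 0..n {
            let i = sigmoid(gates[j]);
            let f = sigmoid(gates[n + j]);
            let g = gates[2 * n + j].tanh();
            let o = sigmoid(gates[3 * n + j]);
            s.c[j] = f * s.c[j] + i * g;
            s.h[j] = o * s.c[j].tanh();
        }
    }
}

/// Score only the projection rows whose word starts with `prefix`,
/// then return the top `k` (word id, logit) pairs, best first.
fn complete_prefix(
    vocab: &[&str],
    proj: &[f32],      // vocab.len() rows x h.len() columns, row-major
    proj_bias: &[f32], // vocab.len()
    h: &[f32],
    prefix: &str,
    k: usize,
) -> Vec<(usize, f32)> {
    let n = h.len();
    let mut scored: Vec<(usize, f32)> = vocab
        .iter()
        .enumerate()
        .filter(|(_, w)| w.starts_with(prefix))
        .map(|(id, _)| {
            let row = &proj[id * n..(id + 1) * n];
            let dot = row.iter().zip(h).map(|(w, v)| w * v).sum::<f32>();
            (id, proj_bias[id] + dot)
        })
        .collect();
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

Because the filter runs before the dot products, the per-keystroke cost in this sketch scales with the number of prefix matches rather than with the full vocabulary, which is the property the description relies on.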
A dependency-free neural next-word engine: no ONNX runtime, just `std`. Loads the int8 `lstm.bin`, dequantizes at load, and runs a streaming single-layer LSTM: keep `(h,c)` across a composition, step per word, rank prefix-matching vocab by next-word logits (only the matching projection rows are scored → cheap per keystroke).

- `src/neural.rs`: `Model::load` (bounds-checked, `None` on bad bytes), `State`/`step`, `logits`, `complete(state, prefix, k)`, `global()` over the embedded table.
- `build.rs`: `provide_lstm_table` embeds `data/neural/lstm.bin` when present, else an empty header so public builds compile with the engine disabled.

Not wired into the IME here (that's #33). Clippy `-D warnings` clean on the Windows target.

Stack: `feat/neural-export` (`lstm.bin`) → this → `feat/neural-wire` (#33).
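The load contract described above (bounds-checked, `None` on bad bytes, int8 dequantized to f32 at load) could look roughly like the sketch below. The byte layout here is a made-up toy format, not the real `lstm.bin` layout, and `read4` and `load_dequantized` are hypothetical names; treating a zero-length table as "engine disabled" is likewise an assumption about how the empty header is handled.

```rust
/// Read 4 bytes at `at`, or None if the slice is too short.
fn read4(bytes: &[u8], at: usize) -> Option<[u8; 4]> {
    let s = bytes.get(at..at.checked_add(4)?)?;
    Some([s[0], s[1], s[2], s[3]])
}

/// Hypothetical toy layout (NOT the real lstm.bin format):
///   bytes 0..4   u32 little-endian element count
///   bytes 4..8   f32 little-endian scale shared by all weights
///   bytes 8..    `count` signed 8-bit weights
/// Returns None for truncated, oversized, or empty input; never panics.
fn load_dequantized(bytes: &[u8]) -> Option<Vec<f32>> {
    let count = u32::from_le_bytes(read4(bytes, 0)?) as usize;
    let scale = f32::from_le_bytes(read4(bytes, 4)?);
    // Assumed convention: an empty table means the engine is disabled.
    if count == 0 || !scale.is_finite() {
        return None;
    }
    let end = 8usize.checked_add(count)?;
    let body = bytes.get(8..end)?;
    // Reject trailing garbage so a corrupted length is not silently accepted.
    if bytes.len() != end {
        return None;
    }
    // Dequantize once at load: reinterpret each byte as i8, then scale to f32.
    Some(body.iter().map(|&b| f32::from(b as i8) * scale).collect())
}
```

Every slice access goes through `get` or `checked_add`, so malformed input falls out as `None` rather than a panic, which keeps the garbage-rejection test cheap to write.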