Add next-word prediction core to the candidate list #43
Open
mkpoli wants to merge 1 commit into
`CandidateList::build` only ever *completes the current word*, so once a word is committed the suggestion engine goes dormant until the user starts typing the next partial word. The n-gram engine already predicts the following word (`Suggestions::predict_scores` / `default_words`), but nothing surfaces it. Add `CandidateList::predictions(prev2, prev1, suggest, max)`: a next-word list ranked by the blended trigram+bigram context scores, with the most frequent words filling any remaining slots for a useful cold start. Unlike `build`, there is no "typed word" at index 0 — every entry is a real predicted word. Pure, host-independent logic with unit tests (context-first ranking, trigram-beats-bigram, cold-start frequency fallback, max/dedup). The TSF key/window wiring that consumes this lands in a follow-up PR.
Why
The IME's autocompletion feels like it "only completes one word" — because it literally does.
`CandidateList::build` only completes the current partial word; the moment a word is committed the suggestion engine goes dormant (`key_event_sink.rs:100` gates on a non-empty buffer, `candidates.rs:46` returns an empty list for an empty word). It never predicts the next word, even though the n-gram engine already can (`Suggestions::predict_scores`, `next_words`, `default_words`).

This is the first of two PRs adding next-word prediction (azooKey-style predictive continuation). Note: for Ainu the "conversion" problem that drives Japanese IMEs (kanji homophones) barely applies, since output is kana/latin, so the real win is predictive continuation, which is exactly what's missing.
What
Adds `CandidateList::predictions(prev2, prev1, suggest, max)`:
- Context predictions ranked by the blended trigram+bigram scores (`predict_scores`), best first, alphabetical tie-break for a stable order.
- The most frequent words fill any remaining slots for a useful cold start, up to `max`. Unlike `build`, there is no "typed word" at index 0; every entry is a real predicted word.

Pure, host-independent logic. Four new unit tests (context-first ranking, trigram-beats-bigram, cold-start frequency fallback, max/dedup), all green, and `cargo clippy -D warnings` clean on the MSVC target.
Scope
This PR is logic only: nothing yet consumes `predictions()`. The TSF integration (a no-composition predictive popup, accept-and-insert, and the key gating that surfaces it after a commit) lands in a follow-up PR, which will need Windows verification since TSF behavior can't be exercised on Linux.
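
For readers who want the ranking rule in concrete form, here is a minimal sketch of the ordering described under What. It is not the code in this PR: the free function `predictions`, its pre-scored `scored` input, and the `frequent` slice are hypothetical stand-ins for whatever `CandidateList::predictions(prev2, prev1, suggest, max)` obtains from the n-gram engine, and the scores are assumed to be plain `f64` values.

```rust
use std::collections::HashSet;

/// Hypothetical stand-in for the next-word list: `scored` plays the role of
/// the blended context scores, `frequent` the most-frequent-word fallback.
fn predictions(mut scored: Vec<(String, f64)>, frequent: &[&str], max: usize) -> Vec<String> {
    // Best score first; the alphabetical tie-break keeps the order stable.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1).then_with(|| a.0.cmp(&b.0)));

    let mut seen = HashSet::new();
    let mut out = Vec::new();

    // Context predictions come first, then the frequency fallback fills
    // whatever slots are left.
    let context = scored.into_iter().map(|(word, _)| word);
    let fallback = frequent.iter().map(|word| word.to_string());
    for word in context.chain(fallback) {
        if out.len() >= max {
            break;
        }
        // Skip words that were already emitted (dedup across both sources).
        if seen.insert(word.clone()) {
            out.push(word);
        }
    }
    out
}
```

With an empty `scored` input the result is just the head of `frequent`, which corresponds to the cold-start case; with `max` set to 0 the result is empty.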