OCTO-11227 #376
Merged
…tuation The punctuation lookahead that suppresses mid-row code spaces was seeing the SCC error-correction duplicate instead of the actual next content. When a doubled control code (e.g. 9120 9120) appeared before punctuation (e.g. ae80 = period), the duplicate masked the period from the check, causing an incorrect space to be inserted and pushing the line to 33 characters. Skip the duplicate when computing next_command so the punctuation exception works as intended.
🟢 PR Compliance Review
Risk Level: LOW
SAFE TO MERGE - No critical issues found. Full report available in workflow artifacts.
dianadersedan
previously approved these changes
Jun 25, 2026
dianadersedan
approved these changes
Jun 25, 2026
SCC files double every control code for error correction, so 9120 (italic off / white plain) is always sent as 9120 9120. The decoder sees the pair, acts on the first code, and ignores the second.
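As a rough illustration of that decoder behaviour, here is a minimal sketch that drops the second copy of a doubled control code. The helper names (`is_control_code`, `drop_duplicates`) are hypothetical and not taken from pycaption; the control-code test assumes the CEA-608 convention that a control code's first byte falls in 0x10 to 0x1f once the parity bit is stripped.

```python
def is_control_code(word: str) -> bool:
    """Assumption: a control code's first byte is 0x10-0x1f after
    masking off the parity bit (0x80)."""
    first_byte = int(word[:2], 16) & 0x7F
    return 0x10 <= first_byte <= 0x1F


def drop_duplicates(words):
    """Keep the first of each doubled control code, drop the second."""
    kept = []
    previous = None
    for word in words:
        if previous is not None and word == previous and is_control_code(word):
            # Second half of a pair: ignore it and reset, so a third
            # copy would start a new pair instead of being dropped too.
            previous = None
            continue
        kept.append(word)
        previous = word
    return kept


print(drop_duplicates(["9120", "9120", "ae80"]))
```

Only control codes are collapsed here; repeated text words are left alone because a repeated character pair can be legitimate content.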
When pycaption processes a mid-row code like 9120, it looks at the next word in the hex stream to decide whether to insert a space. If the next word starts with a punctuation byte (ae = period, a1 = exclamation, etc.), it skips the space.
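The punctuation bytes quoted above look unfamiliar because SCC stores each 7-bit character with an odd-parity bit in the top position. The sketch below shows the arithmetic; `with_odd_parity` is a hypothetical helper written for this explanation, and it assumes the character's caption code equals its ASCII value, which holds for the period and exclamation mark.

```python
def with_odd_parity(ch: str) -> str:
    """Return the two-digit hex byte for ch with an odd-parity bit added."""
    value = ord(ch) & 0x7F
    ones = bin(value).count("1")
    if ones % 2 == 0:
        # Even number of 1 bits: set the top bit so the total becomes odd.
        value |= 0x80
    return f"{value:02x}"


print(with_odd_parity("."))  # ae
print(with_odd_parity("!"))  # a1
```

The trailing 80 in a word such as ae80 follows the same rule: it is a zero byte with the parity bit set, used as padding.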
In the failing file, the sequence is:
...9120 9120 ae80...
        ↑    ↑
        │    period
        duplicate
When processing the first 9120:
next_command = "9120" (the duplicate)
next_command[:2] = "91" — NOT punctuation
Space gets inserted → "Files ." → 33 chars → error
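The failing lookahead can be reproduced in a few lines. This is a simplified sketch, not pycaption's actual code: `PUNCTUATION_BYTES` and `needs_space_before_fix` are hypothetical names, and the punctuation set lists only the two bytes mentioned in this write-up.

```python
# Assumption: only the two punctuation bytes named above; the real list is longer.
PUNCTUATION_BYTES = {"ae", "a1"}


def needs_space_before_fix(word_list, idx):
    """Pre-fix behaviour: look only at the very next word in the stream."""
    next_command = word_list[idx + 1] if idx + 1 < len(word_list) else ""
    # A space is inserted unless the next word starts with punctuation.
    return next_command[:2] not in PUNCTUATION_BYTES


words = ["9120", "9120", "ae80"]
print(needs_space_before_fix(words, 0))  # True: the duplicate hides the period
```

The extra space is what takes the line from 32 to 33 characters.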
The Fix
If the next word is identical to the current word (it's the duplicate), skip it and look one further. Now when processing the first 9120:
Sees word_list[idx+1] = "9120" = same as current word → skip
next_command = "ae80" (the period)
next_command[:2] = "ae" — IS punctuation
Space is suppressed → "Files." → 32 chars → passes
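The corrected lookahead can be sketched as follows. As before, this is an illustration under assumptions rather than the merged code: `next_command_after_duplicate`, `needs_space_after_fix`, and the two-entry `PUNCTUATION_BYTES` set are hypothetical, while `word_list`, `idx`, and `next_command` follow the names used in the description.

```python
# Assumption: only the two punctuation bytes named above; the real list is longer.
PUNCTUATION_BYTES = {"ae", "a1"}


def next_command_after_duplicate(word_list, idx):
    """Return the word after word_list[idx], skipping its error-correction copy."""
    nxt = idx + 1
    if nxt < len(word_list) and word_list[nxt] == word_list[idx]:
        nxt += 1  # identical to the current word: it is the duplicate
    return word_list[nxt] if nxt < len(word_list) else ""


def needs_space_after_fix(word_list, idx):
    next_command = next_command_after_duplicate(word_list, idx)
    return next_command[:2] not in PUNCTUATION_BYTES


words = ["9120", "9120", "ae80"]
print(next_command_after_duplicate(words, 0))  # ae80
print(needs_space_after_fix(words, 0))         # False: space is suppressed
```

Only one duplicate is skipped, which matches the doubling convention; a mid-row code followed by ordinary text still gets its space.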