This repository was archived by the owner on Jan 2, 2026. It is now read-only.

Commit 7122cd1

zircote and claude committed
fix(git_ops): correct byte counting in batch content parsing
Use UTF-8 encoding for accurate byte counting when parsing git cat-file --batch output. The previous implementation used len(str) which counts characters, not bytes, causing incorrect content boundary detection for multi-byte UTF-8 characters.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
1 parent 39c13b5 commit 7122cd1
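
The character-versus-byte mismatch named in the commit message can be seen in isolation with a few strings. This is a minimal sketch, not code from the repository; the helper names `char_count` and `utf8_byte_count` are hypothetical and exist only for illustration.

```python
def char_count(text: str) -> int:
    """Return the number of Unicode code points, which is what len(str) reports."""
    return len(text)


def utf8_byte_count(text: str) -> int:
    """Return the number of bytes the text occupies once encoded as UTF-8."""
    return len(text.encode("utf-8"))


# ASCII, a two-byte character, three-byte characters, and a four-byte emoji.
samples = ["abc", "caf\u00e9", "\u65e5\u672c\u8a9e", "\U0001F916"]

for sample in samples:
    print(f"{sample!r}: {char_count(sample)} chars, {utf8_byte_count(sample)} bytes")
```

The two counts agree only for pure ASCII text. Because the size field in a `git cat-file --batch` header is a byte count, subtracting character counts from it runs past the real end of any content that contains multi-byte characters.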

1 file changed

Lines changed: 15 additions & 4 deletions

src/git_notes_memory/git_ops.py

@@ -425,13 +425,24 @@ def show_notes_batch(
 try:
     size = int(parts[2])
     # Content follows on next lines until size bytes consumed
+    # Note: git cat-file --batch output format has content
+    # followed by a newline separator. We track bytes consumed
+    # including newlines between lines (but not trailing).
     content_lines: list[str] = []
-    remaining = size
+    bytes_read = 0
     i += 1
-    while remaining > 0 and i < len(lines):
+    while bytes_read < size and i < len(lines):
         content_line = lines[i]
-        content_lines.append(content_line)
-        remaining -= len(content_line) + 1  # +1 for newline
+        line_bytes = len(content_line.encode("utf-8"))
+        # Account for newline except after last content line
+        if bytes_read + line_bytes >= size:
+            # Last line of content
+            content_lines.append(content_line)
+            bytes_read += line_bytes
+        else:
+            # More content follows; add newline byte
+            content_lines.append(content_line)
+            bytes_read += line_bytes + 1
         i += 1
     results[current_sha] = "\n".join(content_lines)
     sha_index += 1
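
For comparison, the same boundary detection can be done on the raw bytes before any decoding, so that the size field is applied directly as a slice length. The sketch below is an alternative approach under stated assumptions, not the method used in `show_notes_batch`: the function name `parse_batch_output` is hypothetical, it keys results by the object name found in each header rather than by a separately tracked SHA list, and it assumes the documented batch layout of `<oid> <type> <size>` followed by a newline, the content, and one separator newline.

```python
def parse_batch_output(raw: bytes) -> dict[str, str]:
    """Parse `git cat-file --batch` style output held in memory as bytes.

    Hypothetical helper for illustration. Each record is assumed to be
    "<oid> <type> <size>\n<content>\n"; headers with any other shape
    (for example "<name> missing") are skipped.
    """
    results: dict[str, str] = {}
    pos = 0
    while pos < len(raw):
        header_end = raw.find(b"\n", pos)
        if header_end == -1:
            break  # Truncated header; nothing more to parse.
        parts = raw[pos:header_end].split(b" ")
        pos = header_end + 1
        if len(parts) != 3:
            continue  # Not an "<oid> <type> <size>" header.
        oid, _obj_type, size_field = parts
        size = int(size_field)
        # Slice exactly `size` bytes, then decode the content once.
        content = raw[pos:pos + size]
        results[oid.decode("ascii")] = content.decode("utf-8", errors="replace")
        pos += size + 1  # Skip the content and its separator newline.
    return results


# Usage with a hand-built sample: one 6-byte note, one missing object, one 3-byte note.
sample = b"aaa blob 6\ncaf\xc3\xa9\n\n" + b"bbb missing\n" + b"ccc blob 3\nabc\n"
print(parse_batch_output(sample))
```

Slicing bytes avoids per-line bookkeeping and keeps a trailing newline that belongs to the content, at the cost of needing the subprocess output as bytes rather than text.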