feat(odin): Implement BoundedReader for safe current-day file streaming#543
feat(odin): Implement BoundedReader for safe current-day file streaming#543vikramlc-cognite wants to merge 5 commits into
Conversation
There was a problem hiding this comment.
Code Review
This pull request introduces BoundedReader, a wrapper for binary file handles designed to limit reads to a snapshot byte count, preventing issues when uploading active log files that are still being appended to. The review feedback highlights that BoundedReader needs to implement tell(), closed, and close() to fully satisfy the file-like interface required by multipart uploads (e.g., when wrapped in ChunkedStream for files larger than 5 GiB) and recommends adding corresponding unit tests.
Codecov Report✅ All modified and coverable lines are covered by tests. Additional details and impacted files@@ Coverage Diff @@
## master #543 +/- ##
==========================================
+ Coverage 83.08% 83.17% +0.09%
==========================================
Files 45 46 +1
Lines 4522 4548 +26
==========================================
+ Hits 3757 3783 +26
Misses 765 765
🚀 New features to boost your workflow:
|
|
/gemini review |
There was a problem hiding this comment.
Code Review
This pull request introduces BoundedReader, a utility class designed to wrap a binary file handle and limit reads to a specified snapshot size, preventing upload errors when log files are actively being appended to. The feedback suggests handling cases where the underlying stream's read method returns None to avoid a potential TypeError when calling len(data).
Yaseen-A-Khan
left a comment
There was a problem hiding this comment.
The changes look good, just a nit comment.
Summary
Adds BoundedReader, a lightweight BinaryIO wrapper that gates reads at a byte count captured at snapshot time, enabling safe streaming of the active log file (file.log) to CDF Files without risking an HTTP Content-Length mismatch as the logger appends new lines.
Type of change
What changed
Why it changed
The active log file (file.log) is written by TimedRotatingFileHandler while the action runs. Passing a plain open(log_path, "rb") to IOFileUploadQueue is unsafe:
BoundedReader eliminates this by declaring the snapshot size via len and hard-stopping reads at that boundary. No buffering, no copies - the only overhead is one integer decrement per read() call.
Related issue
EDGE-183 / EDGE-607
Related docs
BULK_LOG_UPLOAD_ACTION_PLAN - "Current Day File Handling" section
What to focus on during review
BoundedReader only needs to satisfy the upload path's interface: len, read(), and the context manager. Adding inheritance would pull in abstract methods we don't need.
Test evidence
$ python -m pytest tests/test_unstable/test_bounded_reader.py -v
tests/test_unstable/test_bounded_reader.py::test_len_returns_snapshot_size_and_is_stable_after_reads PASSED
tests/test_unstable/test_bounded_reader.py::test_read_respects_snapshot_bound[read_all] PASSED
tests/test_unstable/test_bounded_reader.py::test_read_respects_snapshot_bound[read_partial] PASSED
tests/test_unstable/test_bounded_reader.py::test_read_respects_snapshot_bound[read_exceeds_snapshot] PASSED
tests/test_unstable/test_bounded_reader.py::test_read_returns_empty_after_exhaustion PASSED
tests/test_unstable/test_bounded_reader.py::test_zero_snapshot_returns_empty_immediately PASSED
tests/test_unstable/test_bounded_reader.py::test_context_manager_closes_underlying_file PASSED
tests/test_unstable/test_bounded_reader.py::test_super_len_reads_len_not_fstat PASSED
tests/test_unstable/test_bounded_reader.py::test_midnight_rotation_race_file_shorter_than_snapshot PASSED
9 passed in 0.03s
Risks and unknowns
Rollout and rollback
N/A - internal implementation class, not yet wired into any action. No config changes, no migrations.
Checklist