Goal
Implement automated tests that evaluate gRPC request/reply robustness when handling very large DataFrame payloads. The tests should simulate an experiment by generating DataFrames with the following sample sizes: 100, 1,000, 10,000, 100,000, 1,000,000, 10,000,000. The intent is twofold: to validate that sending and receiving these payloads over gRPC does not cause regressions in responsiveness (especially UI rendering), and to measure request/response performance and stability.
Scope / Requirements
- Create pytest-based tests that are parameterized for the sample sizes: [100, 1000, 10000, 100000, 1000000, 10000000].
- For each sample size, the test must:
  - Generate a synthetic DataFrame payload representing a simulated experiment (columns should match the production payload shape if known; otherwise use a representative set: timestamp/feature1..featureN/label).
  - Serialize the payload into the same gRPC request message used in production (or a close mock if production proto isn't available). If the repo has an existing gRPC client/server implementation, use it; otherwise implement a local test gRPC server that echoes or validates the payload.
  - Measure and log: request serialization size (bytes), round-trip latency (ms), and success/failure.
  - Repeat requests N times (e.g., 3-5) per size and report min/median/max latency.
- Tests must assert that requests complete successfully and do not crash the server or client. They should fail if the server returns an error or if the process runs out of memory.
- Add an optional smoke check that approximates UI responsiveness: for example, record the server-side response time under gRPC load and assert that it stays within a configurable threshold. (If integrating a real UI/browser test is too heavy, document how to add it later.)
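The measurement and threshold requirements above could be sketched roughly as follows. This is only an illustration: `measure_round_trips`, `assert_within_threshold`, and the `send` callable are hypothetical names, and `send` stands in for whatever client call the real test ends up using.

```python
import statistics
import time
from typing import Callable


def measure_round_trips(send: Callable[[bytes], bytes], payload: bytes, repeats: int = 5) -> dict:
    """Send the same payload `repeats` times and summarize latency in ms.

    `send` is a hypothetical callable wrapping the gRPC client call; any
    exception it raises propagates and fails the calling test.
    """
    latencies_ms = []
    for _ in range(repeats):
        start = time.perf_counter()
        send(payload)
        latencies_ms.append((time.perf_counter() - start) * 1000.0)
    return {
        "payload_bytes": len(payload),
        "repeats": repeats,
        "min_ms": min(latencies_ms),
        "median_ms": statistics.median(latencies_ms),
        "max_ms": max(latencies_ms),
    }


def assert_within_threshold(summary: dict, max_median_ms: float) -> None:
    """Optional smoke check: median latency must stay under a configurable limit."""
    assert summary["median_ms"] <= max_median_ms, (
        f"median latency {summary['median_ms']:.1f} ms exceeds {max_median_ms} ms"
    )
```

Using the median rather than the mean keeps a single slow outlier (for example, a first-call warm-up) from dominating the reported number.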
Implementation Notes (suggested)
- File: tests/test_grpc_robustness.py
- Use pytest.mark.parametrize for the sample sizes.
- Use pandas to construct DataFrames and protobuf helpers (or existing proto stubs) to serialize messages.
- Start a local test gRPC server in-process for tests using the same service implementation (or a lightweight echo/mocking service) to avoid network flakiness.
- Keep payload generation memory-efficient: generate columns with simple numeric types and avoid storing multiple huge copies simultaneously.
- Add CLI flags or pytest markers (e.g., @pytest.mark.long) so that these tests are skipped on CI by default and run only on demand, since the largest sizes may be time- and memory-consuming.
- Record results to stdout and optionally store a CSV/JSON summary in the test artifacts for later analysis.
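A possible starting point for tests/test_grpc_robustness.py, assuming no production proto is available yet. The service name `robustness.Echo`, the method `Send`, the three-feature column layout, the 120-second timeout, and the record-array serialization are all placeholders chosen for illustration; the real test should swap in the production message and service. The channel options lift gRPC's default 4 MiB message limit, which the larger payloads would otherwise exceed.

```python
import contextlib
import time
from concurrent import futures

import grpc
import numpy as np
import pandas as pd
import pytest

# Placeholder method path; the real one comes from the production proto.
ECHO_METHOD = "/robustness.Echo/Send"
# -1 removes the message size limits (the receive limit defaults to 4 MiB).
UNLIMITED = [
    ("grpc.max_send_message_length", -1),
    ("grpc.max_receive_message_length", -1),
]
SIZES = [100, 1_000, 10_000, 100_000]
LONG_SIZES = [1_000_000, 10_000_000]  # marked `long`; register the marker in conftest.py


def make_frame(n_rows: int, n_features: int = 3, seed: int = 0) -> pd.DataFrame:
    """Build a representative frame: timestamp, feature1..featureN, label."""
    rng = np.random.default_rng(seed)
    data = {"timestamp": np.arange(n_rows, dtype=np.int64)}
    for i in range(1, n_features + 1):
        data[f"feature{i}"] = rng.random(n_rows)
    data["label"] = rng.integers(0, 2, size=n_rows, dtype=np.int8)
    return pd.DataFrame(data)


def serialize(df: pd.DataFrame) -> bytes:
    """Stand-in for the production message: raw bytes of a NumPy record array."""
    return df.to_records(index=False).tobytes()


@contextlib.contextmanager
def echo_stub():
    """Run an in-process echo server and yield a bytes -> bytes client callable."""
    handler = grpc.method_handlers_generic_handler(
        "robustness.Echo",
        {"Send": grpc.unary_unary_rpc_method_handler(lambda request, context: request)},
    )
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=2), options=UNLIMITED)
    server.add_generic_rpc_handlers((handler,))
    port = server.add_insecure_port("127.0.0.1:0")  # port 0 lets the OS pick a free port
    server.start()
    try:
        with grpc.insecure_channel(f"127.0.0.1:{port}", options=UNLIMITED) as channel:
            yield channel.unary_unary(ECHO_METHOD)
    finally:
        server.stop(0)


@pytest.fixture(scope="module")
def stub():
    with echo_stub() as call:
        yield call


@pytest.mark.parametrize(
    "n_rows",
    SIZES + [pytest.param(n, marks=pytest.mark.long) for n in LONG_SIZES],
)
def test_round_trip(stub, n_rows):
    payload = serialize(make_frame(n_rows))
    start = time.perf_counter()
    reply = stub(payload, timeout=120)
    elapsed_ms = (time.perf_counter() - start) * 1000.0
    print(f"rows={n_rows} bytes={len(payload)} latency_ms={elapsed_ms:.1f}")
    assert reply == payload
```

With this layout the 10,000,000-row case serializes to a few hundred megabytes, and the echo round trip holds several copies of it at once, which is the main reason the two largest sizes carry the `long` marker.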
Acceptance Criteria
- A new test file exists (tests/test_grpc_robustness.py) with parameterized tests for all requested sizes.
- Tests run locally against an in-process test gRPC server. They pass in environments with enough memory for the largest payload and fail cleanly on OOM or server error.
- The test logs sample size, serialized payload size, and min/median/max round-trip latencies.
- Clear instructions are added in the test file header or a README section describing how to run the tests, how to skip long runs, and how to interpret the results.
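One way to make the long runs opt-in is a small conftest.py along these lines. The `--run-long` flag name is an assumption made for this sketch; the `long` marker name follows the @pytest.mark.long example above.

```python
# conftest.py (sketch): skip tests marked `long` unless --run-long is passed.
import pytest


def pytest_addoption(parser):
    parser.addoption(
        "--run-long",
        action="store_true",
        default=False,
        help="run large-payload tests marked with @pytest.mark.long",
    )


def pytest_configure(config):
    # Registering the marker avoids unknown-marker warnings (or errors under --strict-markers).
    config.addinivalue_line("markers", "long: large-payload test, skipped unless --run-long is given")


def pytest_collection_modifyitems(config, items):
    if config.getoption("--run-long"):
        return  # flag given: leave long tests enabled
    skip_long = pytest.mark.skip(reason="needs --run-long to run")
    for item in items:
        if "long" in item.keywords:
            item.add_marker(skip_long)
```

With this in place, `pytest tests/test_grpc_robustness.py` would run only the smaller sizes, and `pytest tests/test_grpc_robustness.py --run-long -s` would include the largest ones and show the printed measurements.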
Deliverables
- tests/test_grpc_robustness.py
- (Optional) helper under tests/utils or tests/fixtures for starting the test gRPC server and generating payloads
- Documentation/comments explaining how to run and configure thresholds and how to add a UI rendering check later
Notes / Questions (left intentionally as implementation decisions)
- I assumed we do not require adding a full browser-based UI test; instead, we will measure server-side response times as a proxy for UI impact. If you want a synthetic browser rendering measurement (Puppeteer / Playwright), I can add it in a follow-up issue.
If this looks good, I'll create the issue and include implementation suggestions and a proposed test skeleton in the description so someone can pick it up.