feat: support live replay test reporters#959

Merged: thymikee merged 8 commits into main from codex/live-test-reporters on Jun 30, 2026
@thymikee

Member

Summary

Enable custom replay test reporters to receive live progress events while agent-device test is running, with stream-aware stdout/stderr reporter context.

Update the default reporter to render through the same live reporter path and document a custom emoji reporter example.
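For readers who want a feel for what a stream-aware emoji reporter could look like, here is a minimal sketch. The `SketchContext` and `SketchTestResult` shapes and the `createEmojiReporter` factory are assumptions made for illustration; the real reporter interface and context fields are defined in the agent-device source and docs and may differ.

```typescript
// Hypothetical shapes for illustration only; not the project's actual types.
interface SketchContext {
  stdout: { write(chunk: string): void };
  stderr: { write(chunk: string): void };
}

interface SketchTestResult {
  name: string;
  status: "passed" | "failed";
}

// Builds a reporter that writes live progress to stderr and results to stdout.
function createEmojiReporter(ctx: SketchContext) {
  return {
    onTestStart(testName: string): void {
      ctx.stderr.write(`⏳ ${testName}\n`);
    },
    onTestResult(result: SketchTestResult): void {
      const icon = result.status === "passed" ? "✅" : "❌";
      ctx.stdout.write(`${icon} ${result.name}\n`);
    },
  };
}

// In-memory stand-ins for the two streams, so the sketch runs without a CLI.
function makeBuffer() {
  const chunks: string[] = [];
  return {
    chunks,
    write(chunk: string): void {
      chunks.push(chunk);
    },
  };
}

const emojiOut = makeBuffer();
const emojiErr = makeBuffer();
const emojiReporter = createEmojiReporter({ stdout: emojiOut, stderr: emojiErr });
emojiReporter.onTestStart("login flow");
emojiReporter.onTestResult({ name: "login flow", status: "passed" });
```

Passing the streams in through a context object, rather than writing to `process.stdout` directly, is what lets the same reporter be exercised against buffers in a test.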

Touched 16 files; scope stayed within replay test reporting, daemon client progress forwarding, CLI command wiring, tests, and docs.

Validation

`pnpm exec vitest run src/__tests__/cli-network.test.ts src/utils/__tests__/daemon-client.test.ts` passed: 2 files, 46 tests.

`pnpm typecheck` passed.

@github-actions

github-actions Bot commented Jun 30, 2026

PR Preview Action v1.8.1
Preview removed because the pull request was closed.
2026-06-30 19:10 UTC

@github-actions

github-actions Bot commented Jun 30, 2026


Size Report

| Metric | Base | Current | Diff |
| --- | --- | --- | --- |
| JS raw | 1.4 MB | 1.4 MB | +2.4 kB |
| JS gzip | 450.2 kB | 450.4 kB | +186 B |
| npm tarball | 549.1 kB | 549.4 kB | +305 B |
| npm unpacked | 1.9 MB | 1.9 MB | +2.4 kB |

Startup median (7 runs, lower is better):

| Scenario | Base | Current | Diff |
| --- | --- | --- | --- |
| CLI --version | 28.3 ms | 27.6 ms | -0.6 ms |
| CLI --help | 47.4 ms | 46.9 ms | -0.6 ms |

Top changed chunks:

| Chunk | Raw diff | Gzip diff |
| --- | --- | --- |
| dist/src/cli.js | +16.1 kB | +5.1 kB |
| dist/src/find.js | +8.3 kB | +3.3 kB |
| dist/src/495.js | -5.7 kB | -2.1 kB |
| dist/src/2948.js | +9 B | +82 B |

@thymikee thymikee force-pushed the codex/live-test-reporters branch 4 times, most recently from c688cb1 to e6e9ed9 Compare June 30, 2026 15:06
@thymikee

Member Author

Reviewed #959 against the live replay test reporter contract and the command path from agent-device test through CLI transport wrapping, daemon progress streaming, socket/HTTP client readers, reporter runtime dispatch, custom reporter loading, public exports, tests, and docs.

The progress path now keeps daemon progress parsing transport-level and routes replay-test events into the same reporter instances that render final output. Generic command progress still falls back to stderr, JSON mode still avoids implicit default reporter output, and the documented reporter exit-code behavior now prevents a custom reporter from masking a failing suite with exit 0. No linked issue is attached, so I reviewed against the PR summary and existing replay reporter docs.
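The exit-code rule can be pictured as "a reporter may raise the exit code but never lower it". The sketch below is only an illustration of that idea; `resolveExitCode` and its parameters are hypothetical names, and the actual implementation in this PR may combine the codes differently.

```typescript
// Hypothetical helper, not the project's API: combines the suite outcome with
// an optional exit code requested by a custom reporter.
function resolveExitCode(suiteFailed: boolean, reporterExitCode?: number): number {
  const baseCode = suiteFailed ? 1 : 0;
  if (reporterExitCode === undefined) {
    return baseCode;
  }
  // Taking the maximum means a reporter returning 0 cannot hide a failing suite,
  // while a reporter can still escalate a passing suite to a non-zero code.
  return Math.max(baseCode, reporterExitCode);
}

const maskedFailure = resolveExitCode(true, 0);
const escalatedPass = resolveExitCode(false, 2);
const plainPass = resolveExitCode(false);
```

Under these assumptions `maskedFailure` stays at 1, `escalatedPass` becomes 2, and `plainPass` is 0.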

Checks are green, including Typecheck, Unit Tests, Integration Tests, Smoke Tests, Fallow Code Quality, Lint & Format, Swift Runner Unit Compile, Web Platform Smoke, and deploy-preview. I do not see an actionable blocker; this is ready for maintainer review.

@thymikee thymikee added the ready-for-human Valid work that needs human implementation, judgment, or maintainer merge label Jun 30, 2026
@thymikee thymikee force-pushed the codex/live-test-reporters branch 3 times, most recently from 0a2a19c to 4669dfd Compare June 30, 2026 16:13
@thymikee thymikee removed the ready-for-human Valid work that needs human implementation, judgment, or maintainer merge label Jun 30, 2026
@thymikee

Member Author

The head changed after my previous review (e6e9ed9 -> 4669dfd) and this run still has pending checks, so I removed ready-for-human for now. I will re-review the current head once checks settle and then restore the label if there are still no blockers.

@thymikee thymikee force-pushed the codex/live-test-reporters branch from 4669dfd to ae85fec Compare June 30, 2026 16:34
@thymikee thymikee force-pushed the codex/live-test-reporters branch from ae85fec to 42981b3 Compare June 30, 2026 16:41
@thymikee

Member Author

Re-reviewed the current clean #959 head (42981b3) after the force-push and pending-check window.

The live reporter path still matches the intended contract: daemon progress parsing remains transport-level, replay-test events route into the same reporter runtime used for final output, custom reporters see semantic hooks (onSuiteStart, onTestStart, onTestStep, onTestResult), JSON mode still avoids implicit reporter output, and generic command progress still falls back to stderr. The docs now describe the stream/context API without adding unsupported public helper exports.
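To make the routing rules concrete, here is a rough sketch of how replay-test events and generic command progress could be separated. Every name in it (`SketchProgressEvent`, `RouteTargets`, `routeProgress`, `selectReporterNames`, and the `"default"` reporter name) is made up for illustration and is not taken from the PR's code.

```typescript
// Hypothetical event and target shapes for illustration only.
type SketchProgressEvent =
  | { kind: "replay-test"; hook: string; payload: unknown }
  | { kind: "command"; message: string };

interface RouteTargets {
  reporters: Array<(hook: string, payload: unknown) => void>;
  stderr: { write(chunk: string): void };
}

// Replay-test events go to the reporter instances; everything else falls back to stderr.
function routeProgress(progress: SketchProgressEvent, targets: RouteTargets): void {
  if (progress.kind === "replay-test") {
    for (const notify of targets.reporters) {
      notify(progress.hook, progress.payload);
    }
    return;
  }
  targets.stderr.write(`${progress.message}\n`);
}

// In JSON mode no default reporter is added implicitly; explicit requests are kept.
function selectReporterNames(jsonMode: boolean, requested: string[]): string[] {
  if (requested.length > 0) {
    return requested;
  }
  return jsonMode ? [] : ["default"];
}

const receivedHooks: string[] = [];
const stderrLines: string[] = [];
const routeTargets: RouteTargets = {
  reporters: [(hook) => { receivedHooks.push(hook); }],
  stderr: { write: (chunk) => { stderrLines.push(chunk); } },
};
routeProgress({ kind: "replay-test", hook: "onTestStart", payload: { test: "login" } }, routeTargets);
routeProgress({ kind: "command", message: "installing app" }, routeTargets);
```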

Checks are green, including Typecheck, Unit Tests, Integration Tests, Smoke Tests, Coverage, Fallow Code Quality, Lint & Format, Swift Runner Unit Compile, Web Platform Smoke, iOS Runner Swift Compatibility, and deploy-preview. I do not see an actionable blocker; ready for maintainer review.

@thymikee thymikee added ready-for-human Valid work that needs human implementation, judgment, or maintainer merge and removed ready-for-human Valid work that needs human implementation, judgment, or maintainer merge labels Jun 30, 2026
@thymikee

thymikee commented Jun 30, 2026

Member Author

Head changed after the prior readiness review (now at 8e41006), and CI is still in progress. I removed ready-for-human for now; please wait for checks to settle and re-review this head before restoring the label.

@thymikee thymikee force-pushed the codex/live-test-reporters branch from 8e41006 to aa4d7bf Compare June 30, 2026 17:20
@thymikee

Member Author

Review finding on the current clean head aa4d7bf:

ReplayTestReporter live hooks are typed as void | Promise<void> (src/replay/test/reporters/types.ts), but runReplayTestReporterHook starts returned promises and only attaches .catch(...) without waiting for them. renderReplayTestResponse then immediately runs onSuiteEnd, so a stateful async reporter can receive final output before its live onTestResult/onTestStep work has completed. That makes the public reporter contract racy even though the type advertises async hooks.

Please either make live hooks explicitly synchronous in the public type/docs, or track/flush pending live-hook promises before onSuiteEnd and add a regression test with an async live hook whose state is consumed by onSuiteEnd. Checks are green, but I would not restore ready-for-human until this contract is resolved.
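For reference, the second option (track and flush pending live-hook promises before onSuiteEnd) could look roughly like this. It is a sketch under assumptions: `PendingHookTracker`, `SketchAsyncReporter`, and `demoFlush` are hypothetical names, hook errors are simply swallowed here, and none of it is the code in this PR.

```typescript
// Hypothetical shapes for illustration only.
type LiveHookResult = void | Promise<void>;

interface SketchAsyncReporter {
  onTestResult?(testName: string): LiveHookResult;
  onSuiteEnd?(): LiveHookResult;
}

class PendingHookTracker {
  private pending = new Set<Promise<void>>();

  // Runs a live hook without awaiting it, but remembers any returned promise.
  run(hook: () => LiveHookResult): void {
    let result: unknown;
    try {
      result = hook();
    } catch {
      return; // a synchronously throwing hook must not break the progress stream
    }
    if (result instanceof Promise) {
      const tracked: Promise<void> = result
        .catch(() => undefined) // rejections are swallowed in this sketch
        .finally(() => {
          this.pending.delete(tracked);
        });
      this.pending.add(tracked);
    }
  }

  // Waits until every tracked promise has settled, including ones added meanwhile.
  async flush(): Promise<void> {
    while (this.pending.size > 0) {
      await Promise.all([...this.pending]);
    }
  }
}

// Regression-style scenario: onSuiteEnd must observe state written by an async live hook.
async function demoFlush(): Promise<string[]> {
  const seen: string[] = [];
  const reporter: SketchAsyncReporter = {
    async onTestResult(testName) {
      await new Promise((resolve) => setTimeout(resolve, 10));
      seen.push(testName);
    },
    onSuiteEnd() {
      seen.push("end");
    },
  };
  const tracker = new PendingHookTracker();
  tracker.run(() => reporter.onTestResult?.("login"));
  await tracker.flush(); // without this line, "end" would be recorded first
  await reporter.onSuiteEnd?.();
  return seen;
}
```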

…spatch

Live reporter hooks (onSuiteStart/onTestStart/onTestStep/onTestResult)
were typed as `void | Promise<void>` but fired from the synchronous daemon
progress stream reader without being awaited, so a stateful async reporter
could receive onSuiteEnd before its live work settled. Type them as `void`
to make the contract honest; onSuiteEnd stays awaited for async flushing.

A returned promise from a misbehaving custom JS reporter is still caught so
it cannot crash the CLI with an unhandled rejection, but it is documented as
unsupported and not awaited.

Collapse the four near-identical per-event hook dispatch branches into a
single table-driven path, and document the synchronous-hook and
exit-code-escalation contracts. Add a regression test covering a throwing
live hook.
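As a rough illustration of the table-driven dispatch described above, the sketch below maps event types to hook names in one lookup table, calls hooks synchronously, and catches both thrown errors and stray rejected promises. The event type strings, `SketchLiveReporter`, and `dispatchLiveEvent` are assumptions for the example, not the names used in the commit.

```typescript
// Hypothetical types for illustration only.
type SketchPayload = Record<string, unknown>;
type LiveHookName = "onSuiteStart" | "onTestStart" | "onTestStep" | "onTestResult";
type SketchLiveReporter = Partial<Record<LiveHookName, (payload: SketchPayload) => void>>;

// One table replaces four near-identical branches (event type strings are made up).
const HOOK_BY_EVENT = new Map<string, LiveHookName>([
  ["suite-start", "onSuiteStart"],
  ["test-start", "onTestStart"],
  ["test-step", "onTestStep"],
  ["test-result", "onTestResult"],
]);

function dispatchLiveEvent(
  reporter: SketchLiveReporter,
  eventType: string,
  payload: SketchPayload,
  onError: (error: unknown) => void,
): void {
  const hookName = HOOK_BY_EVENT.get(eventType);
  if (hookName === undefined) return;
  const hook = reporter[hookName];
  if (hook === undefined) return;
  try {
    const returned: unknown = hook.call(reporter, payload);
    if (returned instanceof Promise) {
      // Unsupported by the synchronous contract: caught so it cannot become an
      // unhandled rejection, but deliberately not awaited.
      returned.catch(onError);
    }
  } catch (error) {
    onError(error);
  }
}

const hookCalls: string[] = [];
const hookErrors: unknown[] = [];
const tableReporter: SketchLiveReporter = {
  onTestStart: (payload) => {
    hookCalls.push(`start:${String(payload.test)}`);
  },
  onTestResult: () => {
    throw new Error("boom");
  },
};
const recordError = (error: unknown): void => {
  hookErrors.push(error);
};
dispatchLiveEvent(tableReporter, "test-start", { test: "login" }, recordError);
dispatchLiveEvent(tableReporter, "test-result", { test: "login" }, recordError);
dispatchLiveEvent(tableReporter, "unknown-event", {}, recordError);
```

The throwing `onTestResult` hook ends up in `hookErrors` instead of crashing the caller, and the unknown event type is ignored.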

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
Claude-Session: https://claude.ai/code/session_01XXHAYxWpvSzqc6CtneYL8J
@thymikee thymikee merged commit f4882bc into main Jun 30, 2026
22 checks passed
@thymikee thymikee deleted the codex/live-test-reporters branch June 30, 2026 19:10
2 participants