test(ssh-commands): lock down update timing fix and clarify skill docs#20
Conversation
The original timing bug (PR #14, v0.16.4) had no tests, leaving the omitempty behaviour on SSHCommandCreateRequest.Timing easy to regress. Mixpanel still shows 140 historical hits from one codex agent on v0.16.3 attempting `ssh-commands update` with bad timing values — modern versions are immune, but the skill reference that drives agent input still advertised a nonexistent --name flag and never mentioned --timing at all. - Add pkg/sdk regression test asserting empty Timing is dropped from the PATCH body, and a positive test that explicit values pass through. - Add internal/commands unit test for validateSSHTiming covering empty, valid, and the historically-misleading invalid values (before/after). - Replace the stale --name flag in the skill reference with the real flags (--command, --description, --timing) and document the missing update subcommand so future agents see valid timing values. Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
|
No actionable comments were generated in the recent review. 🎉 ℹ️ Recent review info⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: 📒 Files selected for processing (1)
WalkthroughAdds tests for CLI timing validation and SDK request JSON serialization around the ChangesSSH Command Timing Tests & Documentation
Estimated code review effort🎯 2 (Simple) | ⏱️ ~12 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Comment |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ed89189d5b
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| | `--timing` | no | When the command runs: `all` (default), `first`, or `after_first` | | ||
|
|
||
| ```bash | ||
| dhq ssh-commands create -p my-app --command "sudo systemctl restart app" --description "Restart" --json |
There was a problem hiding this comment.
Update the SSH command eval expectation
Once this reference tells agents to create SSH commands without --name, the add-ssh-command skill eval is left inconsistent: skill-evals/deployhq/evals.json still requires the generated command to contain --name, and run-evals.sh reports MISSING_FLAG:--name whenever that expected flag is absent. In contexts where these skill evals are run against the updated reference, a correct agent response following this example will fail until that eval is updated to remove --name (and include the intended timing flag if needed).
Useful? React with 👍 / 👎.
The --name flag does not exist on `dhq ssh-commands create` — the reference docs in this PR now correctly point at --command, --description, and --timing. Leaving the eval expectation requiring --name would make run-evals.sh emit MISSING_FLAG:--name on a correct agent response. Codex flagged this on PR #20: #20 (comment)... Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Summary
The original
ssh-commands updatetiming bug (PR #14, shipped in v0.16.4) was code-correct but untested, and the agent-facing skill reference still advertised a--nameflag that doesn't exist while saying nothing about--timing. Mixpanel still shows 140 historical 422s from a single codex agent on v0.16.3 failing withtiming is not included in the list— modern CLIs are immune, but the docs that drive agent input were still steering them wrong.pkg/sdk/ssh_commands_test.go(new) — assertsomitemptykeeps emptyTimingout of the PATCH body, plus a positive test that explicit values pass through. This is the actual regression guard for the original bug.internal/commands/ssh_commands_test.go(new) — unit test forvalidateSSHTimingcovering empty (allowed on update), the three valid values, and the historically misleadingbefore/afterstrings that the old help text suggested.skills/deployhq/references/configuration.md— removed the nonexistent--nameflag, documented--command/--description/--timingwith valid values, and added the missingupdatesubcommand section.Test plan
go test ./...— all packages greengolangci-lint run ./...— 0 issuesTestUpdateSSHCommand_OmitsEmptyTimingpasses; revertingomitemptyonSSHCommandCreateRequest.Timingmakes it fail (verified the test actually guards the regression)🤖 Generated with Claude Code
Summary by CodeRabbit