[codex] Apply generated cpflow GitHub Actions flow#732
[codex] Apply generated cpflow GitHub Actions flow#732
Conversation
|
Important Review skippedDraft detected. Please check the settings in the CodeRabbit UI or the ⚙️ Run configurationConfiguration used: Organization UI Review profile: CHILL Plan: Pro Run ID: You can disable this status message by setting the Use the checkbox below for a quick retry:
✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Upstream token not passed to copy-image command
- Added the required -t upstream token argument to the cpflow copy-image-from-upstream command while keeping the token sourced from secrets via the environment.
Or push these changes by commenting:
@cursor push 3606e16506
Preview (3606e16506)
diff --git a/.github/workflows/cpflow-promote-staging-to-production.yml b/.github/workflows/cpflow-promote-staging-to-production.yml
--- a/.github/workflows/cpflow-promote-staging-to-production.yml
+++ b/.github/workflows/cpflow-promote-staging-to-production.yml
@@ -203,14 +203,13 @@
- name: Copy image from staging
env:
- # Pass the upstream token via env rather than `-t` so it doesn't appear in /proc/<pid>/cmdline.
CPLN_UPSTREAM_TOKEN: ${{ secrets.CPLN_TOKEN_STAGING }}
PRODUCTION_APP_NAME: ${{ vars.PRODUCTION_APP_NAME }}
CPLN_ORG_PRODUCTION: ${{ vars.CPLN_ORG_PRODUCTION }}
shell: bash
run: |
set -euo pipefail
- cpflow copy-image-from-upstream -a "${PRODUCTION_APP_NAME}" --org "${CPLN_ORG_PRODUCTION}"
+ cpflow copy-image-from-upstream -a "${PRODUCTION_APP_NAME}" -t "${CPLN_UPSTREAM_TOKEN}" --org "${CPLN_ORG_PRODUCTION}"
- name: Deploy image to production
env:You can send follow-ups to the cloud agent here.
Code Review: cpflow GitHub Actions MigrationThis is a solid migration from hand-written to generated Security — meaningful fixes
Bugs fixed
Potential issues
Nits
Overall this is well-structured with meaningful security and reliability improvements. The one item I'd prioritize is the |
| if cpflow exists -a "${APP_NAME}" --org "${CPLN_ORG}"; then | ||
| echo "exists=true" >> "$GITHUB_OUTPUT" | ||
| else | ||
| echo "exists=false" >> "$GITHUB_OUTPUT" |
There was a problem hiding this comment.
This if/else around cpflow exists conflates two different failure modes: "app does not exist" (expected non-zero) and "API/auth error" (unexpected non-zero). Both land in the else branch and set exists=false.
Compare delete-app.sh in this same PR, which carefully pattern-matches the output ("Double check your org", "Unknown API token format", "ERROR", etc.) to distinguish a real failure from a genuine not-found and surface it as a hard error.
If there's a transient auth issue when a PR is pushed, this step sets exists=false and the pull_request path silently writes a step-summary saying the app doesn't exist yet — masking the real problem until the deploy is manually retried.
Suggested approach: mirror delete-app.sh's pattern — capture the output, check exit code, inspect the text for known error tokens, and exit non-zero on a real failure rather than treating it as "not found".
There was a problem hiding this comment.
Fixed in b9095c6 by capturing cpflow exists output and failing hard on auth/API error tokens instead of treating every non-zero exit as app-not-found.
|
|
||
| if [[ -f "${HOME}/.ssh/cpflow_build_key" ]]; then | ||
| eval "$(ssh-agent -s)" | ||
| trap 'ssh-agent -k >/dev/null; rm -f "${HOME}/.ssh/cpflow_build_key"' EXIT |
There was a problem hiding this comment.
The EXIT trap kills the SSH agent and removes cpflow_build_key, but the ~/.ssh/known_hosts file written in the previous step is not cleaned up here. On ephemeral GitHub-hosted runners this is harmless (the VM is discarded), but if this action is ever adopted on self-hosted runners with a persistent ~/.ssh, the custom known_hosts entries would accumulate across runs.
Consider adding rm -f "${HOME}/.ssh/known_hosts" to the trap, or using a temp file under $RUNNER_TEMP instead of ~/.ssh for the known_hosts.
| default: "3.3.1" | ||
| cpflow_version: | ||
| description: cpflow gem version | ||
| required: false |
There was a problem hiding this comment.
The CLI version is pinned here as a default, which is good for reproducibility. Worth noting: since sudo npm install -g fetches from the npm registry at runtime, a compromised release of @controlplane/cli@3.3.1 would silently affect all runs. Consider adding a SHA/integrity check (e.g. via npm install --integrity) or at minimum locking to a digest-pinned download, especially since this step runs in workflows that access staging/production tokens.
|
|
||
| current_image="${selected_image:-${first_image}}" | ||
| current_version="${selected_version:-${first_version}}" | ||
|
|
There was a problem hiding this comment.
The staging env-var parity check only compares names, not values. It will catch a missing DATABASE_URL in production, but it won't catch a case where production has the right key names but wrong or stale values (e.g. a staging secret was rotated but the production counterpart wasn't updated). This is expected behaviour for a name-only check — just worth documenting explicitly so it's clear what the check does and doesn't guarantee.
There was a problem hiding this comment.
Documented in b9095c6: the production promotion workflow checks that production has the same environment variable names as staging, but it does not compare secret values.
|
Code Review: Replace hand-written CI/CD Actions with generated cpflow- flow* Well-considered overhaul that hardens security, safety, and operational reliability. No blocking issues found. Security improvements Token no longer passed via --token flag — old setup-environment exposed it in /proc/cmdline and ps; now uses CPLN_TOKEN env var only. SSH key scoped to a dedicated step so ACTIONS_STEP_DEBUG cannot leak it from the build step. Delete command now gated on author_association (old workflow had no permission check). Fork PR protection blocks syncs before any secret is consumed. Old contains() command matching replaced with exact match. Safety improvements cancel-in-progress changed from true to false everywhere — old staging deploy risked partial GVC deployments mid-rollout. Prefix-based delete guard replaces the old substring check for production/staging names. Production rollback logic added with per-workload state capture and restore — old workflow had no rollback at all. Health checks now configurable (HEALTH_CHECK_RETRIES, HEALTH_CHECK_INTERVAL, HEALTH_CHECK_ACCEPTED_STATUSES). Issue 1: Fork-PR /deploy-review-app comment silently fails with no PR feedback When a collaborator comments /deploy-review-app on a fork PR, the job condition passes but Validate review app deployment source exits 1 with only a runner log message. No PR comment explains the rejection. Suggest posting an issues.createComment before exit 1, similar to how the unconfigured-secrets path writes to GITHUB_STEP_SUMMARY. Issue 2: cpflow exists error detection is string-matching fragile delete-app.sh uses case matching over stderr to distinguish app-not-found from auth/network errors. The inline TODO acknowledges this. If cpflow changes an error message, the pattern will not match and a real failure will be silently treated as app not found, causing delete to return success vacuously. The TODO to switch to a structured exit code is the right long-term fix. Issue 3: gem install / npm install without integrity pinning cpflow-setup-environment/action.yml installs @controlplane/cli@3.3.1 and cpflow 4.2.0 without hash verification. A re-tagged or compromised release would install silently. For production-adjacent CI, consider verifying SHA256 after install or using lockfiles with integrity entries. Issue 4: Nightly cleanup has no failure notification cpflow-cleanup-stale-review-apps.yml runs on cron with no failure notification. Silent failures accumulate stale apps over days. Consider a step gated on if: failure() that posts a GitHub issue or annotation. Minor observations APP_NAME / PR_NUMBER double-assignment is intentional (workflow-level env needed early for concurrency.group, step overwrites with resolved values) but a short comment would clarify intent. cpflow-review-app-help.yml using pull_request_target is correct and well-documented. deploy job in cpflow-deploy-staging.yml re-runs cpflow-setup-environment unnecessarily (only needs cpflow deploy-image) — minor redundancy, acceptable for simplicity. |
| echo "🗑️ Deleting application: $APP_NAME" | ||
| cpflow delete -a "$APP_NAME" --org "$CPLN_ORG" --yes | ||
|
|
||
| echo "✅ Successfully deleted application: $APP_NAME" |
There was a problem hiding this comment.
The case pattern matching over stderr is the right approach given cpflow exists doesn't expose a structured signal, and the inline TODO is the right way to track this. One additional defensive measure worth considering: log the full exists_output to the runner even when the "not found" path is taken, so there's an audit trail if unexpected output silently passed through the filter. Right now the output is printed only if non-empty, but it's easy to miss in logs.
Also worth noting: the pattern list doesn't include "timeout", "connection refused", or "ENOTFOUND" — all plausible network failure messages. If cpflow's Ruby HTTP client times out, it may surface Errno::ECONNREFUSED or Net::ReadTimeout, which would match "Net::", so that's covered. But curl-style timeouts would not. Low risk given this is a Ruby gem, but worth keeping in mind when bumping the cpflow version.
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| cpflow cleanup-stale-apps -a "${{ vars.REVIEW_APP_PREFIX }}" --org "${{ vars.CPLN_ORG_STAGING }}" --yes |
There was a problem hiding this comment.
No failure notification exists for this cron job. If cpflow cleanup-stale-apps fails (auth error, API hiccup, etc.), stale review apps accumulate silently across days.
Consider appending a step:
- name: Notify on failure
if: failure()
run: |
echo "::error::Stale review app cleanup failed. Check the workflow logs."Or, if the team monitors GitHub issues, use gh issue create to open a tracking issue on failure so it's not lost in workflow history.
| } >> "$GITHUB_STEP_SUMMARY" | ||
| exit 0 | ||
| fi | ||
|
|
There was a problem hiding this comment.
When a collaborator comments /deploy-review-app on a fork PR, execution reaches this exit 1 with only a runner log message — no feedback is posted to the PR thread. The collaborator sees a red workflow run with no explanation in the PR.
Consider posting a comment before exiting. The GH_TOKEN is available from the Resolve PR ref and commit step but not inherited here; you'd need to add GH_TOKEN: ${{ secrets.GITHUB_TOKEN }} to this step's env block. Something like:
gh api "repos/${GITHUB_REPOSITORY}/issues/${PR_NUMBER}/comments" \
--method POST \
--field body="Fork PRs cannot use /deploy-review-app because this workflow uses repository secrets. Ask a maintainer to deploy from a branch in the base repository." \
--silent || true
There was a problem hiding this comment.
Fixed in 1e46bfe: fork PR /deploy-review-app rejections now post a PR comment before exiting, while still keeping the deployment blocked because the workflow uses repository secrets.
| run: | | ||
| set -euo pipefail | ||
|
|
||
| sudo npm install -g @controlplane/cli@${{ inputs.cpln_cli_version }} |
There was a problem hiding this comment.
Version-pinned but not hash-verified. A malicious actor who gains write access to the @controlplane/cli npm package (or re-tags a release) could substitute a compromised binary. Same applies to gem install cpflow below.
For production-adjacent pipelines, consider adding a post-install checksum check, or at minimum document the expected SHA256 for each pinned version in a comment so drift is detectable during future version bumps.
| run: | | ||
| # Best-effort rollback: try every workload, aggregate failures, exit non-zero at the end | ||
| # if any failed. A single cpln hiccup shouldn't leave other workloads mid-promotion. | ||
| set -uo pipefail |
There was a problem hiding this comment.
The intentional omission of -e here (using set -uo pipefail instead of set -euo pipefail) is the correct design — it lets rollback continue across all workloads even if one cpln call fails, with rollback_failures tracking the aggregate. Worth a brief comment explaining the omission so a future reader doesn't add -e thinking it was forgotten:
| set -uo pipefail | |
| # Intentionally no -e: we want every workload to attempt rollback even if earlier ones fail. | |
| set -uo pipefail |
There was a problem hiding this comment.
Fixed in 1e46bfe with a comment explaining that set -e is intentionally omitted so rollback can attempt every workload and aggregate failures.
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Token not persisted for subsequent workflow steps
- The composite action now writes CPLN_TOKEN to GITHUB_ENV so subsequent cpflow and cpln workflow steps inherit authentication without passing the token on the command line.
Or push these changes by commenting:
@cursor push f1eee564c7
Preview (f1eee564c7)
diff --git a/.github/actions/cpflow-setup-environment/action.yml b/.github/actions/cpflow-setup-environment/action.yml
--- a/.github/actions/cpflow-setup-environment/action.yml
+++ b/.github/actions/cpflow-setup-environment/action.yml
@@ -63,6 +63,12 @@
exit 1
fi
+ {
+ echo "CPLN_TOKEN<<__CPLN_TOKEN__"
+ echo "$CPLN_TOKEN"
+ echo "__CPLN_TOKEN__"
+ } >> "$GITHUB_ENV"
+
create_output=""
if ! create_output="$(cpln profile create default --org "$ORG" 2>&1)"; then
if ! echo "$create_output" | grep -qi "already exists"; thenYou can send follow-ups to the cloud agent here.
Code Review — PR #732: Generated cpflow GitHub Actions FlowOverviewThis is a well-structured replacement of hand-written CI/CD workflows with the generated Significant Improvements ✅
IssuesBug:
|
| fi | ||
| deploy_args+=(--org "${CPLN_ORG}" --verbose) | ||
|
|
||
| cpflow deploy-image "${deploy_args[@]}" |
There was a problem hiding this comment.
Bug: deploy runs even when build fails.
When a job has an explicit if:, GitHub Actions drops the implicit success() guard. Because this condition only checks validate-branch's output, the deploy job will start even if the build job failed, wasting runner minutes and attempting to deploy a missing image.
| cpflow deploy-image "${deploy_args[@]}" | |
| if: needs.validate-branch.outputs.is_deployable == 'true' && needs.build.result == 'success' |
| echo "🗑️ Deleting application: $APP_NAME" | ||
| cpflow delete -a "$APP_NAME" --org "$CPLN_ORG" --yes | ||
|
|
||
| echo "✅ Successfully deleted application: $APP_NAME" |
There was a problem hiding this comment.
The string-matching heuristic on cpflow exists error output (lines 43–54) is the main reliability risk in this file. The consequence of an unmatched error pattern is silently exiting 0 ("app not found") when the API actually returned an error, meaning the delete is skipped without any visible failure.
An alternative that inverts the logic: treat any non-zero exit with non-empty output as a failure unless it matches a known "not found" pattern:
# instead of matching known error strings, match known "not found" patterns
case "$exists_output" in
*"not found"*|*"does not exist"*)
echo "⚠️ Application does not exist: $APP_NAME"
exit 0
;;
*)
echo "❌ ERROR: failed to determine whether application exists" >&2
printf '%s\n' "$exists_output" >&2
exit 1
;;
esacThis is safer by default — unknown output is treated as an error rather than a no-op. The tradeoff is that you'd need to know the exact "not found" wording from cpflow exists, which has the same coupling problem in the other direction. The TODO comment already captures this; just flagging it for awareness.
| echo "ready=false" >> "$GITHUB_OUTPUT" | ||
| { | ||
| echo "Control Plane review app automation is not configured yet." | ||
| echo |
There was a problem hiding this comment.
This top-level APP_NAME is computed from the raw event payload and is later overwritten in $GITHUB_ENV by the Resolve PR ref and commit step (echo "APP_NAME=..." >> "$GITHUB_ENV"). The step-level write wins, so this declaration is dead configuration and can be confusing. Consider removing it and relying solely on the step-computed value, which is already the authoritative one (it handles all three event types correctly).
Code Review: cpflow Generated GitHub Actions FlowThis is a meaningful security and reliability upgrade over the handwritten workflows. Several issues from the old setup are fixed here; I've also flagged a few new items worth addressing. ✅ Notable Improvements
🔴 Issues1. The delete workflow triggers on 2. The same fragile string-matching logic ( 🟡 Warnings3. In 4. Production health check accepts 301/302 by default
5. Nightly cleanup has no failure notification
ℹ️ Informational6.
7. These will pick up any non-breaking updates automatically, which is generally fine. However, |
| name: Delete Review App | ||
|
|
||
| on: | ||
| pull_request_target: |
There was a problem hiding this comment.
pull_request_target deserves an explanation comment here.
pull_request_target runs in the context of the base repo (write permissions, access to secrets) and is required for fork PR cleanup — pull_request can't post comments on fork PRs because the token is read-only. This is safe here because no PR code is ever checked out (the actions/checkout@v4 below fetches base-branch code only).
cpflow-review-app-help.yml has a good comment explaining this exact pattern. Without a parallel comment here, a future maintainer may "fix" this to pull_request thinking it's safer, silently breaking fork PR cleanup.
| pull_request_target: | |
| # pull_request_target is intentional: it has write permission to comment on PRs from | |
| # forks, where `pull_request` would be read-only. This is safe because no PR code is | |
| # checked out — the job only calls cpflow with base-repo credentials. | |
| # Do not switch this to `pull_request` or add a checkout of the PR ref without re-evaluating. | |
| pull_request_target: |
| # Make the token available to later workflow steps without putting it on argv. | ||
| token_delimiter="CPLN_TOKEN_$(openssl rand -hex 8)" | ||
| { | ||
| echo "CPLN_TOKEN<<${token_delimiter}" | ||
| printf '%s\n' "$CPLN_TOKEN" | ||
| echo "${token_delimiter}" | ||
| } >> "$GITHUB_ENV" |
There was a problem hiding this comment.
Writing CPLN_TOKEN to GITHUB_ENV propagates it to every subsequent step in the calling workflow — not just the steps that explicitly use it. This is a necessary tradeoff (cpflow picks it up from the environment automatically), but it's worth documenting so future maintainers don't inadvertently add steps that log environment state.
| # Make the token available to later workflow steps without putting it on argv. | |
| token_delimiter="CPLN_TOKEN_$(openssl rand -hex 8)" | |
| { | |
| echo "CPLN_TOKEN<<${token_delimiter}" | |
| printf '%s\n' "$CPLN_TOKEN" | |
| echo "${token_delimiter}" | |
| } >> "$GITHUB_ENV" | |
| # Persist CPLN_TOKEN to GITHUB_ENV so all subsequent steps can authenticate | |
| # with cpflow/cpln without putting the token on the command line. | |
| # Trade-off: the token is present in every subsequent step's environment. | |
| # Do not add steps that dump env vars (e.g., `env | sort`) after this point. | |
| token_delimiter="CPLN_TOKEN_$(openssl rand -hex 8)" | |
| { | |
| echo "CPLN_TOKEN<<${token_delimiter}" | |
| printf '%s\n' "$CPLN_TOKEN" | |
| echo "${token_delimiter}" | |
| } >> "$GITHUB_ENV" |
| # TODO: replace this string-matching with a structured signal once `cpflow exists` exposes one | ||
| # (e.g. a distinct exit code for "not found" vs. API/auth errors, or `cpflow exists --json`). | ||
| # Until then, keep this list in sync if `cpflow exists` starts emitting new error patterns — | ||
| # any unmatched error string would otherwise be silently treated as "app not found". | ||
| exists_output="" | ||
| if ! exists_output="$(cpflow exists -a "$APP_NAME" --org "$CPLN_ORG" 2>&1)"; then | ||
| case "$exists_output" in | ||
| *"Double check your org"*|*"Unknown API token format"*|*"ERROR"*|*"Error:"*|*"Traceback"*|*"Net::"*) |
There was a problem hiding this comment.
The same cpflow exists error-pattern matching appears verbatim in the Check if review app exists step in cpflow-deploy-review-app.yml (line ~195). The TODO comment here is accurate but only covers this file — the workflow copy is uncovered.
If cpflow adds a new error pattern (e.g. "rate limit" or "timeout"), both locations need updating in sync. Consider extracting this into a shared script (e.g. .github/scripts/cpflow-app-exists.sh) that both the action and the workflow call, so the pattern list is maintained in one place.
| # repo variable to tighten this for apps that expose a dedicated health endpoint | ||
| # (e.g. "200" for a plain /health, or "200 401 403" for apps that auth-gate / without | ||
| # redirecting). | ||
| HEALTH_CHECK_ACCEPTED_STATUSES: ${{ vars.HEALTH_CHECK_ACCEPTED_STATUSES || '200 301 302' }} |
There was a problem hiding this comment.
Accepting 301/302 as "healthy" means a redirect to a login page, maintenance page, or even a misconfigured CDN passes the health check. The comment correctly explains why (curl without -L, Rails auth-gating /), but a redirect can pass even when the destination is broken.
Consider adding a step summary warning when a non-200 status is accepted, so operators reviewing the promotion log have visibility:
if [[ "${http_status}" == "${accepted}" ]]; then
if [[ "${http_status}" != "200" ]]; then
echo "::warning::Health check passed with HTTP ${http_status} (not 200) — verify the redirect destination is healthy."
fi
echo "healthy=true" >> "$GITHUB_OUTPUT"
exit 0
fiAlso worth documenting in .controlplane/shakacode-team.md that setting HEALTH_CHECK_ACCEPTED_STATUSES=200 and adding a /health endpoint is the recommended production posture.
| on: | ||
| workflow_dispatch: | ||
| schedule: | ||
| - cron: "0 0 * * *" |
There was a problem hiding this comment.
Scheduled workflows that fail silently are easy to miss — GitHub does send an email for failed scheduled runs but only to the repo owner by default, and only after 7 days of consecutive failures (as of current GitHub behavior).
Consider adding a failure notification step, or at minimum documenting that the team should verify GitHub's "Notify me of failed workflow runs" notification setting is enabled. Stale review apps that aren't cleaned up accrue Control Plane costs.
| - cron: "0 0 * * *" | |
| - cron: "0 3 * * *" # 3 AM UTC — avoids exact midnight when many repos run scheduled jobs simultaneously |
(The cron timing is a minor suggestion — staggering off midnight reduces GitHub's scheduler contention, making the run more likely to start on time.)
| workload_json="$(cpln workload get "${workload_name}" --gvc "${PRODUCTION_APP_NAME}" --org "${CPLN_ORG_PRODUCTION}" -o json)" | ||
| # current_image/current_version are summary fields for the first container | ||
| # of the selected workload; rollback_state below captures all containers. | ||
| workload_image="$(echo "${workload_json}" | jq -r '.spec.containers[0].image')" |
There was a problem hiding this comment.
Indentation is inconsistent with the rest of the loop body. workload_image is at 10 spaces while workload_containers and workload_version below it are at 12. Bash ignores the difference, but it looks like a copy/paste artifact and will confuse readers into thinking this line is outside the loop.
| workload_image="$(echo "${workload_json}" | jq -r '.spec.containers[0].image')" | |
| workload_image="$(echo "${workload_json}" | jq -r '.spec.containers[0].image')" |
| staging_vars="$(CPLN_TOKEN="${CPLN_TOKEN_STAGING}" cpln gvc get "${STAGING_APP_NAME}" --org "${CPLN_ORG_STAGING}" -o json | jq -r '.spec.env // [] | .[].name' | sort)" | ||
| production_vars="$(CPLN_TOKEN="${CPLN_TOKEN_PRODUCTION}" cpln gvc get "${PRODUCTION_APP_NAME}" --org "${CPLN_ORG_PRODUCTION}" -o json | jq -r '.spec.env // [] | .[].name' | sort)" |
There was a problem hiding this comment.
This parity check covers GVC-level env vars only. Workload-level environment variables (set directly on individual workloads rather than the GVC) and Control Plane secret references (cpln://secret/... values) are not compared. A variable added to a specific staging workload won't be caught here and could cause a silent mismatch in production after promotion.
Consider adding a note to the step summary output — e.g. echo "Note: workload-level env vars and cpln:// secret references are not compared." >> "$GITHUB_STEP_SUMMARY" — so operators are reminded of the scope on every promotion run.
| shell: bash | ||
| run: | | ||
| set -euo pipefail | ||
| cpflow cleanup-stale-apps -a "${REVIEW_APP_PREFIX}" --org "${CPLN_ORG_STAGING}" --yes |
There was a problem hiding this comment.
This runs destructively with --yes on every scheduled tick with no dry-run path. If REVIEW_APP_PREFIX is ever misconfigured (e.g., set to a prefix that matches staging-adjacent apps), apps are silently deleted. A low-cost safeguard would be to add a workflow_dispatch input for dry-run mode:
| cpflow cleanup-stale-apps -a "${REVIEW_APP_PREFIX}" --org "${CPLN_ORG_STAGING}" --yes | |
| cpflow cleanup-stale-apps -a "${REVIEW_APP_PREFIX}" --org "${CPLN_ORG_STAGING}" --yes | |
| echo "Stale review apps under prefix '${REVIEW_APP_PREFIX}' have been cleaned up." >> "$GITHUB_STEP_SUMMARY" |
(Or add a separate dry_run dispatch input that swaps --yes for a list-only command.)
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
Autofix Details
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Fork PR slash-command rejection posts no PR comment
- Added an issue_comment-only rejection comment step for fork PR deploy requests before all deployment-gated steps are skipped.
Or push these changes by commenting:
@cursor push 85e534a5ab
Preview (85e534a5ab)
diff --git a/.github/workflows/cpflow-deploy-review-app.yml b/.github/workflows/cpflow-deploy-review-app.yml
--- a/.github/workflows/cpflow-deploy-review-app.yml
+++ b/.github/workflows/cpflow-deploy-review-app.yml
@@ -157,6 +157,18 @@
echo "Review app deploys from fork pull requests are not allowed for workflow_dispatch because this workflow uses repository secrets." >&2
exit 1
+ - name: Comment on fork PR deploy rejection
+ if: steps.config.outputs.ready == 'true' && steps.source.outputs.allowed == 'false' && github.event_name == 'issue_comment'
+ uses: actions/github-script@v7
+ with:
+ script: |
+ await github.rest.issues.createComment({
+ owner: context.repo.owner,
+ repo: context.repo.repo,
+ issue_number: Number(process.env.PR_NUMBER),
+ body: `Review app deploys from fork pull requests require a branch in ${process.env.GITHUB_REPOSITORY}.\n\nThis workflow builds Docker images with repository secrets, so comment-triggered deploys only run for branches in the base repository.`
+ });
+
- name: Checkout PR commit
if: steps.config.outputs.ready == 'true' && steps.source.outputs.allowed == 'true'
uses: actions/checkout@v4You can send follow-ups to the cloud agent here.
Review app ready |
Code Review: Generated cpflow GitHub Actions FlowOverviewThis PR replaces hand-written Control Plane GitHub Actions with the generated Security — Notable Improvements ✅
Issues and Observations🔴 High — Rollback doesn't reverse migration side-effectsThe rollback step in Suggestion: Add a echo "::warning::Rollback restores workload images only. Database migrations and release_script side-effects are NOT reversed. Verify database state manually before re-promoting."🟡 Medium —
|
| # Guard against a misconfigured REVIEW_APP_PREFIX that would otherwise match a | ||
| # well-known shared environment. This intentionally rejects review-app prefixes | ||
| # containing these word segments; failing closed is safer for deletion. | ||
| if echo "$APP_NAME" | grep -iqE '(^|-)(production|staging)(-|$)'; then |
There was a problem hiding this comment.
The word-boundary regex (^|-)(production|staging)(-|$) is a significant improvement over the deleted action's grep -iqE '(production|staging)', which would have incorrectly blocked a hypothetical review-app prefix like staging-feature-demos-pr. The new pattern correctly rejects only apps where production or staging appears as a full hyphen-delimited word segment.
One edge case to be aware of: this guard runs on $APP_NAME (the fully-formed app name including prefix + PR number), not on $REVIEW_APP_PREFIX alone. So a review app named qa-app-pr-1234 passes the guard even if someone later misconfigures the prefix. That's the intended "fail closed on the constructed name" behavior — just confirming it's deliberate.
|
|
||
| # Persist the token for later cpflow/cpln steps via env. Use a randomized | ||
| # delimiter so a multiline token cannot terminate the heredoc early. | ||
| delim="CPLN_TOKEN_$(openssl rand -hex 8)" |
There was a problem hiding this comment.
Good use of a randomized heredoc delimiter. The old setup-environment action called cpln profile update default --org "$ORG" --token "$TOKEN", which would expose the token in /proc/<pid>/cmdline and ps output on any runner with a shared process table. The env-var approach here avoids that, and the randomized delimiter (CPLN_TOKEN_<hex>) prevents a multiline token value from accidentally terminating the heredoc early.
Note: writing CPLN_TOKEN to GITHUB_ENV means it persists for all subsequent steps in the job. That's intentional so downstream cpflow/cpln commands pick it up automatically, but it does mean any step that runs after this one has an implicit dependency on the env var being set. Worth keeping in mind when adding new steps to calling workflows.
| run: | | ||
| # Best-effort rollback: try every workload, aggregate failures, exit non-zero at the end | ||
| # if any failed. A single cpln hiccup shouldn't leave other workloads mid-promotion. | ||
| set -uo pipefail |
There was a problem hiding this comment.
set -uo pipefail intentionally omits -e so that a failure for one workload doesn't abort the loop before attempting all remaining workloads. This is the correct pattern for best-effort rollback (aggregate failures, then exit non-zero at the end). Worth a brief inline note here to prevent a future reviewer from "fixing" it to set -euo pipefail:
| set -uo pipefail | |
| # Intentionally omit -e: we want to attempt rollback on all workloads even if one fails. | |
| set -uo pipefail |
There was a problem hiding this comment.
Cursor Bugbot has reviewed your changes and found 1 potential issue.
There are 2 total unresolved issues (including 1 from previous review).
Bugbot Autofix prepared a fix for the issue found in the latest run.
- ✅ Fixed: Delete workflow missing config-ready gate on subsequent steps
- Added the config step id and gated every downstream delete workflow step on the validate action's ready output so missing configuration exits quietly.
Or push these changes by commenting:
@cursor push bf3a06d409
Preview (bf3a06d409)
diff --git a/.github/workflows/cpflow-delete-review-app.yml b/.github/workflows/cpflow-delete-review-app.yml
--- a/.github/workflows/cpflow-delete-review-app.yml
+++ b/.github/workflows/cpflow-delete-review-app.yml
@@ -53,6 +53,7 @@
uses: actions/checkout@v4
- name: Validate required secrets and variables
+ id: config
uses: ./.github/actions/cpflow-validate-config
env:
CPLN_TOKEN_STAGING: ${{ secrets.CPLN_TOKEN_STAGING }}
@@ -66,6 +67,7 @@
pull_request_friendly: "true"
- name: Setup environment
+ if: steps.config.outputs.ready == 'true'
uses: ./.github/actions/cpflow-setup-environment
with:
token: ${{ secrets.CPLN_TOKEN_STAGING }}
@@ -74,6 +76,7 @@
cpflow_version: ${{ vars.CPFLOW_VERSION }}
- name: Set workflow links
+ if: steps.config.outputs.ready == 'true'
uses: actions/github-script@v7
with:
script: |
@@ -85,6 +88,7 @@
);
- name: Create initial PR comment
+ if: steps.config.outputs.ready == 'true'
id: create-comment
uses: actions/github-script@v7
with:
@@ -98,6 +102,7 @@
core.setOutput("comment-id", comment.data.id);
- name: Delete review app
+ if: steps.config.outputs.ready == 'true'
uses: ./.github/actions/cpflow-delete-control-plane-app
with:
app_name: ${{ env.APP_NAME }}
@@ -105,6 +110,7 @@
review_app_prefix: ${{ vars.REVIEW_APP_PREFIX }}
- name: Mark GitHub deployment inactive
+ if: steps.config.outputs.ready == 'true'
uses: actions/github-script@v7
with:
script: |
@@ -132,7 +138,7 @@
}
- name: Finalize delete status
- if: always()
+ if: always() && steps.config.outputs.ready == 'true'
uses: actions/github-script@v7
with:
script: |You can send follow-ups to the cloud agent here.
Reviewed by Cursor Bugbot for commit fdda08d. Configure here.
|
|
||
| { | ||
| echo "### Environment parity scope" | ||
| echo "This check compares GVC-level environment variable names only." |
There was a problem hiding this comment.
Good that the step summary explicitly documents the scope limitation. One scenario that can slip through: if a staging secret is renamed (old name removed from staging, new name added), and production still has the old name, the comm -23 check only reports what's missing from production — so a renamed var where staging has NEW_NAME and production has OLD_NAME would not be caught (neither appears as "missing" from the other's perspective; they're just different names).
This is inherent to comparing names without values, and worth noting in the step summary output or the docs. The current wording "compares GVC-level environment variable names only" does cover this, but the rename case is subtle enough that an explicit sentence could help: "Renamed variables (removed from staging, not yet removed from production) are reported as production-only variables, not as missing."
| # deploy branches unless `cpflow generate-github-actions --staging-branch BRANCH` | ||
| # was used. If STAGING_APP_BRANCH is later changed in repository variables, keep | ||
| # this list in sync so pushes to that branch actually trigger the workflow. | ||
| branches: ["master"] |
There was a problem hiding this comment.
This is a meaningful improvement. The deleted deploy-to-control-plane-staging.yml had branches: ['*'], which would trigger a staging deploy attempt on every branch push to the repo — relying solely on the validate-branch job's runtime check to bail out. The new workflow correctly gates at the workflow trigger level, so only pushes to master start any jobs at all, saving runner minutes and reducing noise.
| (github.event_name == 'issue_comment' && | ||
| github.event.issue.pull_request && | ||
| github.event.comment.body == '/deploy-review-app' && | ||
| contains(fromJson('["OWNER","MEMBER","COLLABORATOR"]'), github.event.comment.author_association)) |
There was a problem hiding this comment.
Good addition: the old delete-review-app.yml checked github.event.issue.pull_request (ensuring the comment was on a PR) but had no author_association guard, meaning any user who could comment on a PR could trigger a deploy. The new filter restricts slash commands to OWNER, MEMBER, and COLLABORATOR — appropriate for secrets-backed deployments.
One subtlety: COLLABORATOR in GitHub's model means a user explicitly invited to the repo with at least triage access. External contributors on a public repo default to NONE. This is the right default for protecting Control Plane credentials.
Review app ready |
Code Review: Replace hand-written CI/CD with generated cpflow GitHub ActionsOverviewThis PR replaces ~1,150 lines of hand-crafted Control Plane GitHub Actions with ~2,025 lines of generated Security — Significant Improvements ✅The new code fixes several real security issues present in the deleted workflows:
Concerns1.
If the intent is just to inform developers, a lighter approach would be to move the skip check before the setup steps, or gate 2. In An alternative: cpflow's CLI supports 3.
4. Env-var parity check gap (documented but worth flagging) The "Verify production environment variables" step in the promotion workflow compares GVC-level env var names only. Workload-level env vars and Code QualityThe new code is a substantial improvement:
Minor Notes
SummaryThis is a high-quality, well-documented overhaul. The security posture is meaningfully better than what it replaces. The three items above (CI minute waste on |
|
|
||
| on: | ||
| pull_request: | ||
| types: [opened, synchronize, reopened] |
There was a problem hiding this comment.
Adding opened to the trigger list means every new same-repo PR runs this workflow even when no review app has been created yet. The job will checkout, install Ruby + cpln + cpflow, resolve the PR, validate the source, and call cpflow exists before reaching the "Skip auto deploy" step. The old workflow used [synchronize, reopened] only.
If the goal is just to inform developers, consider moving the skip earlier (before setup) or removing opened and relying on the help message posted by cpflow-review-app-help.yml instead. At minimum, a comment explaining the intentional design trade-off would help future maintainers.
| types: [opened, synchronize, reopened] | |
| types: [synchronize, reopened] |
| # Persist the token for later cpflow/cpln steps via env. Use a randomized | ||
| # delimiter so a multiline token cannot terminate the heredoc early. | ||
| delim="CPLN_TOKEN_$(openssl rand -hex 8)" | ||
| { |
There was a problem hiding this comment.
Writing CPLN_TOKEN to $GITHUB_ENV makes it available to every subsequent step in the calling job, including any third-party composite actions invoked after this one. GitHub's log masking prevents the value from appearing in plain text, but the env var is still accessible to step code.
This is likely necessary because downstream cpflow and cpln commands depend on it. If it can be scoped more narrowly (e.g., only passed as env: to steps that need it rather than persisted globally), that would reduce the blast radius of a compromised downstream action. Worth a comment here explaining why global persistence is required.
| type: string | ||
|
|
||
| permissions: | ||
| contents: write # Required for `gh release create` in the "Create GitHub release" step. |
There was a problem hiding this comment.
contents: write is required only for the gh release create step, but this permission applies to the entire promote-to-production job. All steps — including the Ruby script that reads controlplane.yml, the env-var parity check calling cpln gvc get, and the rollback logic — run with write access to repo contents.
Consider splitting the GitHub release creation into a separate job that depends on promote-to-production with only contents: write, and narrowing this workflow permission to contents: read. That limits the exposure if any intermediate step (e.g., a cpln CLI bug or a compromised gem) tries to write to the repo.
Review app ready |
Code ReviewOverviewThis PR replaces bespoke, hand-written GitHub Actions with the generated Overall this is a significant quality improvement. The security posture is meaningfully better than the code being replaced (details below). Security Improvements (notable fixes over old code)
Issues to Address1. In 2. Production promotion env-var check is incomplete The "Verify production environment variables" step (promote workflow) compares GVC-level env var names only — it explicitly skips workload-level env vars and 3. Rollback does not cover release-phase side effects The rollback step correctly warns: "Database migrations and release_script side effects are not reversed." However, 4. Nightly cleanup timing
Minor Notes
SummaryThe PR is in good shape. The security regressions in the old code have been properly fixed. The main items to address before merging are: (1) noting the |
| echo "CPLN_TOKEN<<${delim}" | ||
| echo "${CPLN_TOKEN}" | ||
| echo "${delim}" | ||
| } >> "$GITHUB_ENV" |
There was a problem hiding this comment.
The randomized heredoc delimiter here is a good defence against a malicious token value terminating the heredoc early — nice touch.
One concern worth noting: writing to $GITHUB_ENV broadcasts CPLN_TOKEN to all subsequent steps in the job, including any third-party actions added later. For the staging org this is acceptable, but the promote workflow reuses this action with CPLN_TOKEN_PRODUCTION. If a third-party action is ever added to that job it will have implicit access to the production token.
Alternative: instead of persisting to GITHUB_ENV, pass the token to only the steps that need it via env: on each individual step. This is more verbose but tightly scopes the secret.
| # Guard against a misconfigured REVIEW_APP_PREFIX that would otherwise match a | ||
| # well-known shared environment. This intentionally rejects review-app prefixes | ||
| # containing these word segments; failing closed is safer for deletion. | ||
| if echo "$APP_NAME" | grep -iqE '(^|-)(production|staging)(-|$)'; then |
There was a problem hiding this comment.
The regex (^|-)(production|staging)(-|$) is a good improvement over the old plain grep -i 'production|staging', which would have blocked names like qa-productions-preview unnecessarily. The word-boundary anchors reduce false positives.
One edge case: this check runs after the prefix check on line 10. If someone sets REVIEW_APP_PREFIX to something that passes line 10 but contains staging or production as a segment (e.g. qa-staging-pr), this guard fires and blocks all review app deletions entirely — a confusing failure mode. The comment on lines 17-19 already calls this out ("intentionally rejects review-app prefixes containing these word segments"), which is the right trade-off (fail closed). Just make sure README / team docs note this constraint when choosing a prefix value.
| # SECURITY: pull_request_target is intentional for PR-close cleanup from forks. | ||
| # This workflow must keep checking out trusted base-branch code only; do not | ||
| # change checkout to a PR head ref without re-evaluating secret exposure. | ||
| pull_request_target: |
There was a problem hiding this comment.
pull_request_target is the correct event here — pull_request cannot write to secrets, so fork PRs would silently fail to delete their review apps on close. The existing comment (lines 4-6) explaining the trust boundary is important and should not be removed.
One subtlety to keep in mind: pull_request_target runs from the default branch, so this workflow's code at the closed event always reflects what's on master, not what's on the merged/closed PR branch. This is intentional and desirable (it means the workflow itself cannot be tampered with via the PR), but it also means that changes to this workflow file only take effect after they're merged to master — testing workflow changes via a PR won't exercise the PR-close path until after merge.
| github.event_name == 'workflow_dispatch' || | ||
| (github.event_name == 'issue_comment' && | ||
| github.event.issue.pull_request && | ||
| github.event.comment.body == '/deploy-review-app' && |
There was a problem hiding this comment.
Exact match (==) rather than the old contains() is the right choice — the old approach would have triggered a deploy from any comment that mentioned /deploy-review-app as part of a longer message (e.g. "don't run /deploy-review-app yet"). The exact-match requirement is also documented in the help text, so this is consistent.


Summary
cpflow-*action and workflow set fromcpflow generate-github-actions.cpflow-*action/workflow names.Verification
BUNDLE_GEMFILE=/tmp/cpflow-pr-278/Gemfile bin/conductor-exec bundle exec ruby /tmp/cpflow-pr-278/bin/cpflow github-flow-readiness— no blocking readiness issues detected.bin/conductor-exec ruby -e 'require "yaml"; paths = Dir[".github/**/*.yml"] + Dir[".github/**/*.yaml"] + Dir[".controlplane/**/*.yml"]; paths.sort.each { |path| YAML.load_file(path, aliases: true); puts path }'bin/conductor-exec git diff --cached --checkbin/conductor-exec bundle exec rubocopNot Run / Blocked
bin/conductor-exec bundle exec rspeccould not complete locally because PostgreSQL was not running at/tmp/.s.PGSQL.5432; after the first Rails load failure, Coveralls setup also reported repeated coverage initialization errors.Note
Medium Risk
Replaces and expands CI/CD automation for staging/review apps and production promotion, so misconfiguration could affect deployments or delete the wrong resources despite added guards.
Overview
Switches Control Plane automation from the bespoke GitHub Actions to the generated
cpflow-*composite actions and workflows, updating docs accordingly.Adds a full generated flow for opt-in review apps (slash-command
/deploy-review-app+ auto-redeploy after creation), review-app deletion (on PR close or/delete-review-app), staging deploys gated to the staging branch, manual promote-to-production with env-var name parity checks + health polling + best-effort rollback, and a nightly stale review-app cleanup.Hardens automation around secrets and safety: centralized config validation, safer Control Plane token handling, optional SSH-based Docker builds, and stricter review-app deletion prefix checks to prevent accidental staging/production deletes.
Reviewed by Cursor Bugbot for commit 3331332. Bugbot is set up for automated code reviews on this repo. Configure here.