Multi-org & component-group support for delete-organization-workspaces#90
Conversation
…spaces Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
f9fa2a8 to
409ff58
Compare
There was a problem hiding this comment.
Pull request overview
Adds support for cleaning orphaned workspaces across multiple organizations and/or predefined component groups, while improving CLI ergonomics by moving optional positional args to named options and expanding summary output.
Changes:
- Allow
organizationIdto be a comma-separated list and process organizations sequentially with per-org summaries. - Add
--component-group/-gand convert the single component + other optional positional args into options (--component/-c,--hostname-suffix/-H,--until-date/-d). - Extend final summary output with per-component breakdowns for deleted/to-delete and skipped workspaces.
💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.
vojtabiberle
left a comment
There was a problem hiding this comment.
Review: PR #90 — Multi-org & component-group support for delete-organization-workspaces
Target: origin/jirka/multi-org-component-groups-orphaned-workspaces • Base: origin/main • PR: #90 • State: open (not draft)
Reviewer: review-changes skill (+ platform-impact-analyst agent) • Date: 2026-05-18 16:01
Changed: 1 file, +242 / −129 lines (single commit 409ff58)
Repo note: This review is for keboola/cli-utils, NOT keboola/connection. The review-changes skill's connection-specific architecture/innovation-path rules were skipped (they do not apply to a standalone ops CLI). General dimensions (performance, BC, security, platform impact) were applied. Architecture compliance was checked only against Symfony Console norms.
Summary
PR converts the manage:delete-organization-workspaces ops CLI from single-org / single-component to multi-org (CSV) with an optional --component-group shortcut, and turns three positional args into named options. Code is well-factored, retains the safe --force-gated dry-run default, and adds clearer per-org and per-component reporting. cli-utils is internal SRE tooling, so the CLI argument-shape break is low-impact — worth a heads-up to whoever owns the runbooks, but not a release-notes-grade concern. Security and platform impact are clean; the change does not widen any authz boundary or contract.
Blockers (must fix before merge)
None.
Major findings
None.
Minor findings
[minor] CLI argument surface changes — internal runbook owners should be notified
-
Where:
src/Keboola/Console/Command/DeleteOrganizationOrphanedWorkspaces.php:36-74 -
Why: Three argument-shape changes will fail any existing wrapper script or runbook line that uses the old positional form:
orphanComponent(positional REQUIRED, position 3) → removed; now--component/-cor--component-group/-g. Invocations likephp cli.php manage:delete-organization-workspaces TOKEN 13 keboola.snowflake-transformationwill fail with "too many arguments".hostnameSuffix(positional OPTIONAL, position 4, defaultkeboola.com) →--hostname-suffix/-H.untilDate(positional OPTIONAL, position 5, default-1 month) →--until-date/-d.
Single-org
organizationIdstill works (CSV with one element parses to a single int). cli-utils is internal SRE tooling, so this is low-impact: there are no external consumers and the audience is small. Flagging only so whoever maintains existing runbooks / aliases can update them. -
Reference: Symfony Console
Command::addArgument/addOptionsemantics; PR body at #90 -
Suggestion: A one-line ping to the SRE channel or a brief update of any internal runbook is sufficient. No code change required.
[minor] Non-numeric tokens in the org-ID CSV are silently dropped
- Where:
src/Keboola/Console/Command/DeleteOrganizationOrphanedWorkspaces.php:90 - Why:
array_map('intval', array_filter(explode(',', $organizationIdArg), 'is_numeric'))will quietly discard any non-numeric token. An operator pasting13, 10, hello, 45would get[13, 10, 45]silently — no warning. For a destructive command this is unfriendly: the operator may believe they targeted "hello" too. (Whitespace works incidentally because PHP'sis_numericaccepts leading whitespace.) - Reference: PHP
is_numeric()semantics; this command is--force-gated but pre-flight visibility matters. - Suggestion: Iterate the split tokens,
trim(), and throw\InvalidArgumentExceptionon any non-numeric. The "no valid organization IDs" check on line 91-93 already exists for the all-invalid case; extend it to reject partial invalidity.
[minor] Subtle latent behavior in isWorkspaceOrphaned when empty string is mixed into a multi-element component list
- Where:
src/Keboola/Console/Command/DeleteOrganizationOrphanedWorkspaces.php:379-387 - Why: The new branch logic is:
If a future group is ever defined as
if (in_array('', $components, true)) { if ($workspaceComponent !== '') { return false; } } else { if (!in_array($workspaceComponent, $components, true)) { return false; } }
['', 'keboola.foo'](i.e. mixed empty + real component), this will accept ONLY workspaces whose component is empty — the'keboola.foo'entry becomes unreachable. Today noCOMPONENT_GROUPSentry contains'', so this is latent only. - Reference:
COMPONENT_GROUPSconstant at line 22-28. - Suggestion: Either (a) add a comment noting
''should not appear inside a group, (b) restructure asif ($workspaceComponent === '' && in_array('', $components, true)) { /* pass */ } elseif (!in_array($workspaceComponent, $components, true)) { return false; }, or (c) leave as-is and rely on the implicit invariant. Lowest-effort fix is the comment.
[minor] Configuration block no longer shows total project count up-front
- Where:
src/Keboola/Console/Command/DeleteOrganizationOrphanedWorkspaces.php:138-143 - Why: The pre-PR Configuration block showed
Projects to check: %d(a single number). The new Configuration block omits this because projects are only known per-org. The operator now has to wait for the per-org block at line 171 to see project counts. For a multi-org run this is a small loss of pre-flight visibility (you can't see "OK, total 1,200 projects across these 6 orgs" before things start). - Reference: pre-PR file at
origin/main:src/Keboola/Console/Command/DeleteOrganizationOrphanedWorkspaces.php:92. - Suggestion: Optional: aggregate
count($organization['projects'])after loading each org and addTotal projects to check: %dto the configuration block — but this requires loading all orgs up-front before any processing, which trades a slight latency increase for the pre-flight visibility. Take or leave.
Nits
[nit] Mutual-exclusion error messages do not mention short-flag aliases
- Where:
src/Keboola/Console/Command/DeleteOrganizationOrphanedWorkspaces.php:101-106 - Why: Errors say "Cannot use both
--componentand--component-group" but the short flags-c/-gare not mentioned. Operators using short flags will see a message that names long flags they didn't type. - Suggestion:
'Cannot use both --component/-c and --component-group/-g options.'
[nit] Dead-code cleanup is correctly removed (positive note)
- Where: pre-PR
origin/main:.../DeleteOrganizationOrphanedWorkspaces.php:60-63(theis_numeric ? (int) : (int)triple-cast and the redundant subsequent(int)cast) is gone in the new version. - Why: Good cleanup — flagged here only so the author knows it was noticed and is welcome.
Notes / informational
- Token TTL under large CSV runs is safe: each project's storage token is dropped immediately after the project loop ends (line 278-279), so the 1800s TTL is per-project, not per-run. Confirmed by the agent's audit as well.
- Dry-run default preserved:
--forceis still required for actual deletes (line 133). Multi-org expansion does not weaken the safety gate. - Per-org error containment is new and correct:
getOrganization()now caught (line 161-168), so a bad org ID in a CSV won't abort the whole run. This is strictly more graceful than the pre-PR behavior. - Datadog dashboards / alerts: if an SRE monitor watches workspace-delete rate per stack, a multi-org
--forcerun will spike it. Not a code concern, but worth a heads-up to whoever owns those monitors. - Audit trail: console output is the only record of a bulk-delete run. If team needs persistent audit, a wrapper script that tees output to a SIEM / Slack would cover it. Not in scope for this PR.
Platform impact
Concepts touched
| Concept | Concrete impact | Citation |
|---|---|---|
| Workspace | Command iterates and deletes workspaces; multi-org expansion increases the blast radius of a single invocation | concepts/workspace.md |
| Storage Token | Per-project token minted with 1800s TTL, dropped after each project loop; TTL pressure analysed in Security section | concepts/token.md |
| Organization | organizationId now accepts a CSV list; each org loaded independently via getOrganization(int) |
concepts/organization.md |
| Development Branch | Command lists all dev branches per project and checks workspaces in each; unchanged behaviour, wider iteration | concepts/development-branch.md |
| Transformation | COMPONENT_GROUPS['transformations'] hard-codes three component IDs; expansion of the group list is the only way to add new component coverage |
concepts/transformation.md |
Services touched (other than connection)
| Service | Contract affected | Change | Wiki citation |
|---|---|---|---|
| Manage API | GET /organizations/{id} |
Called per-org in a loop instead of once; a 4xx on any org is now caught, logged, and skipped rather than aborting the command | apis/manage-api.md |
| Storage API | POST /tokens (createProjectStorageToken), DELETE /tokens/{id} (dropToken) |
Called once per project as before; no interface change | apis/manage-api.md |
| Storage API | GET /dev-branches, GET /workspaces (per branch), DELETE /workspaces/{id} |
Called once per project as before; no interface change | apis/workspaces-api.md (page returns no content -- see Wiki staleness notes) |
The command is a pure API consumer. It produces no events, writes no shared data structures, and touches no other service.
APIs touched
| API | Impact | Citation |
|---|---|---|
| Manage API | Consumed as client only; no server-side contract changed | apis/manage-api.md |
| Storage API | Consumed as client only; no server-side contract changed | concepts/workspace.md |
Cross-cutting checklist
| Dimension | Status | Detail |
|---|---|---|
| Backends (Snowflake / BigQuery) | N/A | cli-utils is a client-side CLI; it calls APIs and does not touch backend drivers |
| Stacks (AWS / GCP / Azure / region) | Partial | --hostname-suffix selects the stack; multi-org runs stay on a single stack per invocation. No stack-specific code paths added |
| Dev branches / SOX-protected | N/A | The command already iterated dev branches; no behaviour change. SOX project protection is enforced server-side by the Storage API; this CLI does not bypass it |
| Bucket sharing / linking / external | N/A | Command operates on workspaces only, not buckets |
| Token roles & scopes | No | Token creation flow is unchanged: createProjectStorageToken with expiresIn: 1800. No new scope or privilege added |
| Jobs / queues / workers / cron | N/A | cli-utils has no queue integration; this is a synchronous shell command |
| Doctrine migrations | N/A | No PHP entity, no database, no migration |
| Telemetry & KIDS coordination | N/A | No DB columns added, removed, or renamed |
| OpenAPI / Apiary docs | N/A | No server-side API contract changed |
| Events / audit | No | No platform events emitted by this change; see Security section for audit trail |
| Billing / metering | No | Workspace deletion reduces backend resource usage; no new billable action created |
| Datadog dashboards / metrics / alerts | Unknown | If an SRE Datadog monitor tracks workspace-count or bulk-delete rates, a multi-org invocation could spike the signal. Not a code concern, but the SRE running this should verify no alert fires spuriously |
| Symfony vs legacy Zend controllers | N/A | cli-utils is a standalone Symfony Console app, not part of connection |
| storage-driver (protobuf) | N/A | No protobuf involved |
| keboola/ui frontend | No | Internal ops tool; no UI changes needed |
| CLI / SDK clients | No | Uses keboola/manage-api-php-client and keboola/storage-api-php-client as consumers with no interface changes. kbc CLI and other SDK clients are unaffected |
Security relevance
| Dimension | Status | Detail |
|---|---|---|
| Authentication boundary | No | manageToken argument is consumed identically to before. No new auth path, no widening. Token is never written to stdout or stderr (verified: grep of the file shows it only reaches the Client constructor at line 70) |
| Authorization / role check | No | Server-side authz is unchanged. The 403 path on createProjectStorageToken is caught and the project skipped, same as before. getOrganization 4xx is now caught per-org instead of aborting -- this is strictly more graceful, not a bypass |
| Tenant isolation | No | Each getOrganization(orgId) call is independently scoped; each createProjectStorageToken(project_id) is independently scoped. The loop variable never crosses org or project boundaries. The storageToken variable is reassigned per project and the old token is dropped before moving to the next project (diff line 204-205) |
| Data exposure | No | Output lines contain workspace IDs, component names, and creation timestamps -- all already present in the pre-PR version. No new fields printed |
| Encryption at rest / in transit | N/A | CLI makes HTTPS calls to existing Keboola API endpoints; no new encryption concern |
| External egress | No | No new third-party calls. ServiceClient($hostnameSuffix) resolves the Keboola stack URL as before |
| Audit trail | Unknown | The wiki (apis/manage-api.md) is silent on audit logging requirements for maintenance scripts. The platform does record workspace deletions as Storage API events visible to project admins. Bulk deletion across many orgs is now possible in one run; there is no aggregated audit event at the org or run level. If the team requires a run-level audit trail, nothing in this PR provides it -- the console output is the only record |
| Abuse / DoS / rate limiting | Yes (low) | A CSV of many org IDs will serially call getOrganization, createProjectStorageToken, listBranches, listWorkspaces, and deleteWorkspace for every project. The Manage API and Storage API have rate limits not documented in the wiki. A very large org list could hit those limits. The PR adds no throttling or back-off. Mitigation: operators should test with a small org list first, and the existing backoffMaxTries: 1 on BranchAwareClient means it will fail fast rather than retry infinitely |
| Multi-tenant blast radius | Yes (elevated, manageable) | Pre-PR: one wrong org ID = one org cleaned. Post-PR: one wrong CSV = multiple orgs cleaned in one --force run. Dry-run default is the primary mitigation. The per-org error catch means a bad org ID (404/403) does not silently skip to destructive work on other orgs -- it logs and counts the failure |
| SOX / GDPR / HIPAA / SOC2 | No | Workspace deletion removes compute resources, not customer data tables. No PII or regulated data path introduced |
| Secrets handling | No | manageToken and storageToken['token'] are passed only to client constructors (diff lines 70, 129, 144). Neither is printed, logged, or appended to output strings anywhere in the file |
| PII | No | No PII collected, stored, or logged. Workspace metadata (ID, component name, creation date) is non-PII |
Open follow-ups
- Rate limiting: Who owns the Manage API and Storage API rate-limit specs? The SRE team should confirm the per-minute request ceiling before running this against a large CSV (e.g., 50+ orgs with 100+ projects each). The wiki (
apis/manage-api.md) has no rate-limit documentation. - Audit trail for bulk ops: Is there a team requirement that SRE-initiated bulk workspace deletions be logged to an ops audit system (e.g., a Slack channel, a SIEM, or Datadog log)? The console output is ephemeral. If yes, a
--audit-logflag or a wrapper script that tees output to a persistent store would cover it. - Token TTL under large runs: With 1800s TTL and a large org list, is it possible that the token minted for project N is still live (not yet dropped) while project N+M is being processed? No -- the token drop (
dropToken) happens immediately after each project's loop (diff line 204-205) and the next project gets a fresh token. This is confirmed safe, no follow-up needed.
Wiki staleness notes
apis/workspaces-api.md: The fetch returned empty content (the shell reported no output). The page either does not exist or is empty. The workspace deletion API contract (endpoints, request/response shapes forlistWorkspaces/deleteWorkspace) is not documented in the wiki. Citations for workspace API behaviour above fall back toconcepts/workspace.md.apis/manage-api.md: The OpenAPI specification URL is listed as "TODO". Rate limiting, token-creation quotas, and audit logging expectations are not documented.
Approval recommendation
RECOMMEND APPROVE WITH NITS — no blockers or majors. cli-utils is internal SRE tooling, so the CLI-surface change is low-impact; out-of-band notification to runbook owners is sufficient. Security and platform impact are clean.
The only finding worth addressing in code is the silent drop of non-numeric org-IDs (line 90) — small ergonomic improvement for a destructive command. The rest are optional polish.
vojtabiberle
left a comment
There was a problem hiding this comment.
Summary
PR converts the manage:delete-organization-workspaces ops CLI from single-org / single-component to multi-org (CSV) with an optional --component-group shortcut, and turns three positional args into named options. Code is well-factored, retains the safe --force-gated dry-run default, and adds clearer per-org and per-component reporting. cli-utils is internal SRE tooling, so the CLI argument-shape break is low-impact — worth a heads-up to whoever owns the runbooks, but not a release-notes-grade concern. Security and platform impact are clean; the change does not widen any authz boundary or contract.
See full review in the conversation comment for findings (4 minors, 2 nits — none blocking).
|
Claude to trochu přehnal s tím commentem … |
Summary
organizationIdargument now accepts a comma-separated list of organization IDs.--component-group/-gto target a predefined set of components at once (e.g.transformationscoverskeboola.snowflake-transformation,keboola.legacy-transformation,transformation).orphanComponent,hostnameSuffix,untilDate) into options (--component/-c,--hostname-suffix/-H,--until-date/-d) so they no longer collide with--component-groupwhen callers skip the single-component arg.Usage
Example:
Test plan
--component keboola.snowflake-transformation--component-group transformations--forceagainst a non-production org and confirm workspaces are actually deleted🤖 Generated with Claude Code