Commit 0cdc937
🤖 feat: support all-namespaces LIST across multiple CoderControlPlane instances (#85)
## Summary
Enable all-namespaces LIST to aggregate results across every eligible
`CoderControlPlane` instance (one per namespace) for `CoderTemplate` and
`CoderWorkspace` resources. Previously, `kubectl get codertemplates -A`
failed with:
> multiple eligible CoderControlPlane instances across namespaces;
multi-instance support is planned
## Background
The aggregated API server's storage assumed exactly one eligible
`CoderControlPlane` when handling all-namespaces LIST (request namespace
is empty). When multiple eligible control planes exist across
namespaces, client list-watch loops fail, blocking `kubectl get
codertemplates -A`, `kubectl get coderworkspaces -A`, and
controllers/informers that use list+watch.
## Implementation
**New interface** — `coder.NamespaceLister` with
`EligibleNamespaces(ctx) ([]string, error)`:
- Opt-in capability that lets storage enumerate namespaces a provider
can serve.
- Keeps storage decoupled from concrete provider types.
**Provider implementations**:
- `ControlPlaneClientProvider.EligibleNamespaces` — discovers eligible
CPs via `findEligibleControlPlanes`, groups by namespace, rejects
duplicates within a namespace, returns sorted namespace list.
- `StaticClientProvider.EligibleNamespaces` — returns the pinned
namespace.
**Storage fan-out** in `TemplateStorage.List` and
`WorkspaceStorage.List`:
- When request namespace is empty and provider implements
`NamespaceLister`: fan out across all eligible namespaces, query each,
convert with correct per-namespace metadata, and return aggregated
results sorted by `(namespace, name)`.
- Namespaced requests and providers without `NamespaceLister` keep
existing behavior unchanged.
- Error semantics: fail-fast (if any namespace fails, return error
immediately).
## Validation
- `make verify-vendor` ✅
- `make test` ✅ (8 new tests + no regressions)
- `make build` ✅
- `make lint` ✅
## Risks
Low risk — the fan-out path only activates for all-namespaces LIST when
the provider supports `NamespaceLister`. Single-namespace requests and
the static provider path are unchanged. Multiple eligible CPs within the
same namespace are still explicitly rejected.
---
<details>
<summary>📋 Implementation Plan</summary>
# Plan: Support querying multiple CoderControlPlane instances (multi-namespace aggregation)
## Context / Why
Today, the aggregated API server’s `CoderTemplate` / `CoderWorkspace`
storage assumes **exactly one** eligible `CoderControlPlane` when
handling **all-namespaces LIST** (i.e. request namespace is empty). When
multiple eligible control planes exist across namespaces, client
list-watch loops fail with:
- `multiple eligible CoderControlPlane instances across namespaces;
multi-instance support is planned`
This blocks workflows like:
- `kubectl get codertemplates -A`
- controllers/informers that use list+watch against
`aggregation.coder.com/v1alpha1`
## Goal
Enable **multi-instance querying** by making **all-namespaces LIST**
aggregate results across every eligible `CoderControlPlane` (one per
namespace) for:
- `aggregation.coder.com/v1alpha1/CoderTemplate`
- `aggregation.coder.com/v1alpha1/CoderWorkspace`
Namespaced requests (`-n <namespace>`) must keep current behavior.
## Non-goals (v1)
- Supporting **multiple eligible** `CoderControlPlane` objects **within
the same Kubernetes namespace**.
- Implementing a “true” upstream-backed watch (watch events that reflect
changes that occur directly in Coder without going through the
aggregated API server).
## Acceptance Criteria
- `kubectl get codertemplates -A` returns templates from **all**
eligible `CoderControlPlane` namespaces.
- `kubectl get coderworkspaces -A` returns workspaces from **all**
eligible `CoderControlPlane` namespaces.
- `kubectl get codertemplates -A --watch` no longer fails due to the
multi-instance discovery error (because the initial LIST succeeds).
- Standalone `aggregated-apiserver` mode (static provider pinned to
`--coder-namespace`) continues working unchanged.
## Evidence (code references)
- **Single-instance guard + error message**:
  `internal/aggregated/coder/controlplane_provider.go`
  - `ClientForNamespace` errors on `len(eligible) > 1`
  - `DefaultNamespace` errors on multiple eligible across namespaces
  - `multipleEligibleControlPlaneMessage("")` returns the exact string
    shown in the screenshot
- **LIST path uses default namespace + empty-namespace client
  resolution**:
  - `internal/aggregated/storage/template.go`: `(*TemplateStorage).List`
  - `internal/aggregated/storage/workspace.go`: `(*WorkspaceStorage).List`
  - both call `namespaceForListConversion(...)` and then
    `clientForNamespace(ctx, requestNamespace)` where `requestNamespace==""`
    triggers the provider’s “pick exactly one CP” logic
- **WATCH path is an in-memory broadcaster**:
  `internal/aggregated/storage/watch.go`, `template.go`, `workspace.go`
  - the screenshot’s “Watcher failed …” is consistent with list-watch
    clients failing during the initial LIST phase
---
## Implementation Details
### 1) Add an optional provider capability: list eligible namespaces
**File:** `internal/aggregated/coder/provider.go`
Add a small, opt-in interface that lets storage enumerate the set of
namespaces it can serve.
```go
// NamespaceLister can enumerate namespaces served by a ClientProvider.
// Used to implement all-namespaces LIST by fanning out across instances.
//
// Implementations should only return namespaces that are eligible/ready.
// Returned namespaces must be non-empty and should be deterministic (sorted).
type NamespaceLister interface {
	EligibleNamespaces(ctx context.Context) ([]string, error)
}
```
Rationale: keeps storage decoupled from concrete provider types
(`StaticClientProvider` vs `ControlPlaneClientProvider`).
### 2) Implement `NamespaceLister`
#### 2a) Dynamic control-plane provider
**File:** `internal/aggregated/coder/controlplane_provider.go`
Implement:
- `func (p *ControlPlaneClientProvider) EligibleNamespaces(ctx
context.Context) ([]string, error)`
Algorithm:
1. `eligibleCPs, err := p.findEligibleControlPlanes(ctx, "")`
2. If `len(eligibleCPs) == 0`: return
`ServiceUnavailable(noEligibleControlPlaneMessage(""))`.
3. Group eligible CPs by `cp.Namespace`.
4. If any namespace has >1 eligible CP: return
`BadRequest(multipleEligibleControlPlaneMessage(namespace))` (still not
supported).
5. Return the set of namespaces, **sorted** for determinism.
Defensive programming:
- Assert/validate non-nil receiver and non-nil context.
- Ensure returned namespaces are not empty; crash/assert if an eligible
CP has an empty namespace (should be impossible).
#### 2b) Static provider
**File:** `internal/aggregated/coder/provider.go`
Implement:
- `func (p *StaticClientProvider) EligibleNamespaces(ctx
context.Context) ([]string, error)`
Behavior:
- If `p.Namespace == ""`: return `ServiceUnavailable("static provider
has no default namespace")` (consistent with existing behavior).
- Else return `[]string{p.Namespace}`.
### 3) Update storage LIST to fan out when request namespace is empty
#### 3a) Templates
**File:** `internal/aggregated/storage/template.go`
Update `func (s *TemplateStorage) List(ctx context.Context, _
*metainternalversion.ListOptions) ...`:
- If request namespace is non-empty: keep existing behavior.
- If request namespace is empty:
  - If `s.provider` implements `coder.NamespaceLister`:
    - `namespaces, err := lister.EligibleNamespaces(ctx)`
    - For each namespace:
      - `sdk, err := s.clientForNamespace(ctx, namespace)`
      - `templates, err := sdk.Templates(ctx, codersdk.TemplateFilter{})`
      - Convert each template using `convert.TemplateToK8s(namespace,
        template)` and append.
    - Sort `list.Items` by `(namespace, name)` for deterministic output.
  - Else: fall back to the existing `namespaceForListConversion` +
    `clientForNamespace(ctx, "")` behavior.
#### 3b) Workspaces
**File:** `internal/aggregated/storage/workspace.go`
Update `func (s *WorkspaceStorage) List(ctx context.Context, _
*metainternalversion.ListOptions) ...` with the same fan-out strategy.
Concurrency (recommended, not required for correctness):
- Use an `errgroup.Group` + a small semaphore/limit (e.g. 4) so large
numbers of namespaces don’t produce N sequential slow requests.
Error semantics (v1):
- Fail-fast: if any namespace fails to resolve a client or list objects,
return an error (preserves today’s “LIST is authoritative” behavior).
### 4) Tests
#### 4a) Provider unit tests
**File:** `internal/aggregated/coder/controlplane_provider_test.go`
Add tests for `EligibleNamespaces`:
- Returns namespaces for multiple eligible CPs across namespaces.
- Returns `BadRequest` when a single namespace has multiple eligible
CPs.
- Returns `ServiceUnavailable` when no eligible CPs exist.
#### 4b) Storage aggregation tests
**File:** `internal/aggregated/storage/storage_test.go`
Add tests verifying all-namespaces aggregation:
- Create two mock Coder servers with distinct seeded
templates/workspaces.
- Create a test provider implementing:
  - `coder.ClientProvider` (map namespace → client)
  - `coder.NamespaceLister` (returns both namespaces)
- Assert:
  - `TemplateStorage.List(context.Background(), nil)` returns items from
    both namespaces and each item has the correct `metadata.namespace`.
  - same for `WorkspaceStorage.List`.
### 5) Docs / behavior notes
**File:** `internal/aggregated/storage/doc.go`
Update the “v1 semantics” comment to note:
- all-namespaces LIST aggregates across eligible `CoderControlPlane`
namespaces when the provider supports it.
### 6) Validation
Run locally after implementation:
- `make test`
- `make build`
- `make lint`
- `make verify-vendor` (expected no-op; only if deps are added)
---
<details>
<summary>Optional follow-ups (not required to fix the observed
error)</summary>
1) **Upstream-backed watch**: implement per-control-plane
polling/websocket to generate watch events even when changes happen
directly in Coder.
2) **Partial failure mode**: consider returning partial results for
all-namespaces LIST when one instance is down (would require a clear
API/UX decision; Kubernetes APIs generally favor fail-fast).
3) **Performance & caching**: optionally cache per-namespace SDK
clients/tokens (with invalidation on secret changes) to reduce
per-request secret reads.
</details>
</details>
---
_Generated with `mux` • Model: `anthropic:claude-opus-4-6` • Thinking:
`xhigh` • Cost: `$3.28`_
<!-- mux-attribution: model=anthropic:claude-opus-4-6 thinking=xhigh
costs=3.28 -->
1 parent 0e64421 commit 0cdc937
8 files changed
Lines changed: 523 additions & 2 deletions