Start shared sidecar containers before the harness #12

Merged
CrypticSwarm merged 4 commits into master from tongs-phase-7b-shared-tong-spine
Jun 18, 2026

Conversation

CrypticSwarm commented Jun 18, 2026

Owner

Summary

The host launcher (scripts/run_anvil.py) wraps the harness container run and,
until now, only discovered sidecar definitions without starting them. This is
the first change where a discovered sidecar actually runs.

What this does

When a sidecar is discovered, the launcher now starts it, waits for it to report
ready, makes it reachable from the harness, runs the harness in the foreground,
and leaves the sidecar running afterwards.

This first cut handles only long-lived (shared) sidecars reached over the
harness's existing network:

  • A shared sidecar is one container keyed by a stable name. A running one whose
    config-hash label still matches is reused untouched; a missing, stopped, or
    stale one is (re)started.
  • A port sidecar's reachability is injected into the harness as
    SWARMFORGE_TONG_<NAME>_HOST/_PORT environment. A none sidecar is started
    but has no harness-facing surface.
  • Readiness is driven by each sidecar's declaration: a TCP dial of its canonical
    alias, an image healthcheck or exec command, or an explicit skip. The TCP probe
    runs from a throwaway container on the network, so --anvil-image is threaded
    in to give it an image with python3; without one it degrades to a
    container-running check.
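
As a rough illustration of the injection described above, the environment for one port sidecar could be built like this. The helper name `tong_env` and the name normalisation (upper-casing, replacing other characters with `_`) are assumptions for the sketch, not necessarily what scripts/run_anvil.py does; only the SWARMFORGE_TONG_<NAME>_HOST/_PORT shape comes from this PR.

```python
import re


def tong_env(name: str, host: str, port: int) -> dict[str, str]:
    """Build the harness-facing environment for one port sidecar."""
    # Assumption: the sidecar name is upper-cased and any character that is
    # not A-Z or 0-9 becomes "_" so the result is a valid variable name.
    key = re.sub(r"[^A-Z0-9]", "_", name.upper())
    prefix = f"SWARMFORGE_TONG_{key}"
    return {f"{prefix}_HOST": host, f"{prefix}_PORT": str(port)}


env = tong_env("redis-cache", "redis-cache", 6379)
```

Here the host is the sidecar's canonical alias on the shared network, so the harness can dial it by name.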

Anything needing machinery not wired here is refused with a clear message rather
than started half-configured: a per-session lifecycle, a secret reference, an MCP
interface, a volume interface (a shared named volume, which has no consumer
yet), or a shared sidecar that mounts the workspace (a reused container would
leak one session's workspace into the next; a per-workspace mount belongs on a
per-session sidecar).
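
The refusal rules above amount to a check that runs before any container is started. A minimal sketch, assuming a sidecar definition is a plain dict with hypothetical `lifecycle`, `secrets`, `interface`, and `mounts` keys (the real schema and messages may differ):

```python
from typing import Optional


def refusal_reason(tong: dict) -> Optional[str]:
    """Return why this sidecar cannot be started yet, or None if it can."""
    name = tong.get("name", "<unnamed>")
    if tong.get("lifecycle") != "shared":
        return f"{name}: only the shared lifecycle is supported so far"
    if tong.get("secrets"):
        return f"{name}: secret references are not wired yet"
    interface = tong.get("interface", "none")
    if interface not in ("port", "none"):
        return f"{name}: the {interface} interface is not supported yet"
    # A reused container would carry one session's workspace into the next.
    if "workspace" in tong.get("mounts", ()):
        return f"{name}: a shared sidecar must not mount the workspace"
    return None
```

Returning a reason string rather than raising keeps the check easy to run over every discovered sidecar before the first docker call.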

Structure

Every docker call goes through a small DockerCLI seam so the start / ready /
inject / run sequence is unit-tested against an in-process fake -- covering
reuse, recreate (absent/stopped/stale), readiness timeout, port/volume injection,
multiple sidecars, the refusal of unsupported sidecars, and Ctrl-C leaving the
shared sidecars up.
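
To show what testing against an in-process fake can look like, here is a small sketch of the reuse-or-recreate decision. `FakeDocker`, `ensure_shared`, and their method names are hypothetical stand-ins, not the actual DockerCLI interface.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class FakeDocker:
    """In-process stand-in for the docker seam; records every mutating call."""
    # name -> (running, config_hash) for containers that "exist"
    containers: dict = field(default_factory=dict)
    calls: list = field(default_factory=list)

    def inspect(self, name: str) -> Optional[tuple]:
        return self.containers.get(name)

    def remove(self, name: str) -> None:
        self.calls.append(("remove", name))
        self.containers.pop(name, None)

    def run_detached(self, name: str, image: str, config_hash: str) -> None:
        self.calls.append(("run", name, image))
        self.containers[name] = (True, config_hash)


def ensure_shared(docker, name: str, image: str, config_hash: str) -> bool:
    """Reuse a matching running container; otherwise (re)start it.

    Returns True when the existing container was reused untouched.
    """
    state = docker.inspect(name)
    if state == (True, config_hash):
        return True
    if state is not None:
        # Stopped or stale: remove it so the stable name is free again.
        docker.remove(name)
    docker.run_detached(name, image, config_hash)
    return False
```

A test can then assert on `calls`: an empty list for the reuse case, and a remove followed by a run for the stopped or stale cases.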

The launcher will soon start tongs before the anvil, which means running
docker commands and waiting for a sidecar to accept connections. Route every
docker call through a DockerCLI object so the orchestration logic can be
exercised against an in-process fake instead of a live daemon, and add the
readiness prober that decides a tong is up.

DockerCLI wraps the handful of docker verbs the launch depends on: removing a
container, starting one detached, inspecting its running state and config-hash
label, reading an image healthcheck, running an exec, dialing a TCP port from a
throwaway container on the network, and running the anvil in the foreground so
the launcher regains control when it exits. wait_ready dispatches on a tong's
resolved readiness declaration -- a TCP dial of its canonical alias, an image
healthcheck or exec command, or an immediate pass -- and degrades a TCP probe
to a container-running check when no probe image is available.
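
The dispatch can be pictured as a single readiness attempt per poll. This is only a sketch: the `readiness` values, the dict keys, and the `docker` method names (`is_running`, `tcp_dial`, `health_status`, `exec_ok`) are assumed for illustration and are not taken from the real DockerCLI.

```python
def probe_once(docker, tong: dict, probe_image) -> bool:
    """Make one readiness attempt for a tong, based on its declaration."""
    kind = tong["readiness"]
    if kind == "skip":
        # Explicit skip: pass immediately.
        return True
    if kind == "tcp":
        if probe_image is None:
            # Degraded mode: nothing to dial from, so settle for "is it running?"
            return docker.is_running(tong["name"])
        return docker.tcp_dial(probe_image, tong["alias"], tong["port"])
    if kind == "healthcheck":
        return docker.health_status(tong["name"]) == "healthy"
    if kind == "exec":
        return docker.exec_ok(tong["name"], tong["command"])
    raise ValueError(f"unknown readiness kind: {kind!r}")
```

Because `docker` is passed in, the same function runs unchanged against a fake in tests.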

Nothing calls these yet; they are the seam the shared-tong launch path builds on.

When a tong is discovered, the launcher now starts it, waits for it to report
ready, makes it reachable from the anvil, runs the anvil in the foreground, and
leaves the tong running afterwards -- the first time the launcher touches the
live launch path beyond passing the anvil through.

This first cut handles only `shared` tongs reached over the anvil's existing
network. A `shared` tong is one long-lived container keyed by a stable name: a
running one whose config-hash label still matches is reused untouched, and a
missing, stopped, or stale one is (re)started. A `port` or `volume` tong's
reachability is injected into the anvil as environment, plus a shared mount for
`volume`. Anything that needs machinery not wired here -- a `session` lifecycle,
a secret reference, or an `mcp` interface -- is refused with a clear message
rather than started half-wired.

The passthrough invariant is unchanged and still tested byte-for-byte: with no
tong discovered the launcher execs the anvil argv verbatim, and only a present
tong drives the start/ready/inject path. Validation runs before any docker call
so an invalid definition stops the launch cleanly. The anvil image is threaded
in as `--anvil-image` so a TCP readiness probe can dial a tong's network-internal
port from a throwaway container.

Decide the TCP-probe degrade (no anvil image, or no network to dial on) once
before the readiness loop and warn a single time, instead of re-warning on every
poll -- a long readiness wait no longer floods stderr. Folding the missing
network into the same condition also avoids handing the probe a None network.
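
A sketch of the decide-once shape, assuming the loop takes two hypothetical callables (`probe` for the TCP dial, `running` for the fallback check) plus injectable `sleep` and `clock` so tests need not wait. The names, defaults, and warning text are illustrative only.

```python
import sys
import time


def _stderr(msg: str) -> None:
    print(msg, file=sys.stderr)


def wait_ready(probe, running, *, anvil_image, network, timeout=30.0,
               interval=0.5, warn=_stderr, sleep=time.sleep,
               clock=time.monotonic) -> bool:
    """Poll until the tong is ready or the timeout elapses."""
    # Decide the degrade once, before the loop: the warning fires a single
    # time and the probe is never handed a missing image or network.
    degraded = anvil_image is None or network is None
    if degraded:
        warn("warning: TCP probe unavailable; checking container state only")
    deadline = clock() + timeout
    while True:
        ok = running() if degraded else probe(anvil_image, network)
        if ok:
            return True
        if clock() >= deadline:
            return False
        sleep(interval)
```

Passing `warn=some_list.append` in a test makes it easy to assert that exactly one warning was emitted however many polls occurred.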

Extend the tests to cover a present-but-stopped shared container (recreated, not
reused), two shared tongs started and injected in one launch, and a Ctrl-C
during the run reporting 130 while leaving the shared tongs up.
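
The Ctrl-C behaviour can be sketched as a thin wrapper around the foreground run. `run_foreground` and its callable argument are hypothetical names; 130 is the conventional 128 + SIGINT (2) status.

```python
def run_foreground(run_anvil) -> int:
    """Run the anvil and translate Ctrl-C into exit status 130."""
    try:
        return run_anvil()
    except KeyboardInterrupt:
        # No teardown follows here: shared tongs are deliberately left
        # running so the next launch can reuse them.
        return 130
```

A test can pass a callable that raises KeyboardInterrupt and then check both the return value and that no container removal was recorded on the fake.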

The `volume` interface (a named volume shared between a tong and the harness) has
no consumer: the credential, broker, and shared-service tongs are all network or
side-effect tongs, and a file-watcher reaches the tree through a `workspace`
mount, not a shared volume. Rather than ship the half-wired kind (the named
volume was injected into the harness but never mounted into the tong), refuse a
`volume` interface here and revisit if a real case appears.

Also refuse a `shared` tong that mounts the `workspace`. A `shared` tong is one
long-lived container reused across sessions, so binding one session's workspace
into it would expose that workspace to every later session that reuses the
container. A per-workspace mount belongs on a `session` tong; the docker-socket
mount (the broker pattern) stays allowed on a shared tong.

The startable set is now `shared` `port`/`none` tongs without secrets

CrypticSwarm merged commit c72008f into master Jun 18, 2026
1 check passed
CrypticSwarm deleted the tongs-phase-7b-shared-tong-spine branch June 18, 2026 03:21