microsoft · tylerbutler · May 21, 2026 · May 27, 2026 · May 28, 2026 · May 28, 2026
diff --git a/.github/workflows/pr-review-fleet.README.md b/.github/workflows/pr-review-fleet.README.md
@@ -0,0 +1,88 @@
+# Fleet Review Workflow System
+
+Four workflows compose the PR fleet review system:
+
+| Workflow | Role |
+| --- | --- |
+| [`pr-review-auto-route.yml`](./pr-review-auto-route.yml) | Sizes the PR and posts a proposal comment with reviewer checkboxes |
+| [`pr-review-confirm.yml`](./pr-review-confirm.yml) | Watches the proposal comment; dispatches the fleet when the author ticks "Start review" |
+| [`pr-review-dispatch.yml`](./pr-review-dispatch.yml) | Label-driven alternate entry point — stages PR context as an artifact |
+| [`pr-review-fleet.yml`](./pr-review-fleet.yml) | Runs the fleet of reviewer agents in parallel, consolidates results, posts the report |
+
+> `pr-changeset-review.yml` is unrelated — it's Vale linting on changeset files.
+
+## 1. `pr-review-auto-route.yml` — Sizer + Proposal
+
+- **Trigger:** `pull_request_target` on `opened` / `reopened` / `synchronize` (base branches `main`, `next`, `release/**/*`).
+- **Release branches:** Included intentionally. Most release-branch PRs are release automation or cherry-picks, but both entry points are opt-in, and occasional non-standard release changes can still benefit from fleet review.
+- **Permissions:** `contents: read`, `pull-requests: write`.
+- **Context:** Runs from the **base branch** with a write token + secrets. Critical security invariant: never checks out or executes the PR head; it only fetches the head commit and derives deterministic stats (lines / files / packages) from `git diff --numstat`. See [Security Notes](#security-notes) for the threat model.
+- **What it does:** Computes a recommended fleet tier (1 / 3 / 5 reviewers) from thresholds on lines, files, and packages, then posts a **sticky comment** (`<!-- pr-review-confirm -->` marker) with pre-checked reviewer checkboxes and a "Start review" checkbox. On `synchronize` it carries forward prior toggles and short-circuits if "Start review" is already checked. Skips entirely if the `fleet-review` label is set (label flow takes precedence).
+
+### Sizing thresholds
+
+| Tier | Reviewers | Rule |
+| --- | --- | --- |
+| small | 1 | ≤ 100 lines **and** ≤ 5 files **and** ≤ 1 package |
+| large | 5 | > 500 lines **or** > 30 files **or** > 5 packages |
+| medium | 3 | everything else |
+
+"Packages" = distinct top-level dirs under `packages/`, `experimental/`, `examples/`.
+
+## 2. `pr-review-confirm.yml` — Human Toggle → Dispatch
+
+- **Trigger:** `issue_comment` `edited`.
+- **Permissions:** `contents: read`, `pull-requests: write`, **`actions: write`** (required so `gh workflow run` can dispatch the fleet — without it the API 403s).
+- **Activation guard (`if:`):** must be a PR comment, sender is a `User` (blocks bot self-loops), comment was authored by `github-actions[bot]`, contains the `<!-- pr-review-confirm -->` marker, and the "Start review" checkbox transitioned **unchecked → checked**.
+- **Trust check:** Uses `actions-cool/check-user-permission` to require `write` on the *acting user* (the comment author is the bot, so `author_association` is meaningless here).
+- **What it does:** Fetches the PR's current `head_sha` / `base_ref` from the API (not from any pinned artifact — handles force-pushes between proposal and toggle), parses selected reviewer checkboxes, calls `gh workflow run pr-review-fleet.yml` with explicit `pr_number`, `reviewers`, `head_sha`, `base_ref` inputs. Then resets the "Start review" checkbox so a single tick re-runs it next time.
+
+## 3. `pr-review-dispatch.yml` — Label Path
+
+- **Trigger:** `pull_request` `labeled` (label must be `fleet-review`).
+- **Permissions:** `contents: read` only — it just stages context.
+- **What it does:** Writes `{pr_number, head_sha, base_ref}` to `dispatch/params.json` and uploads it as the `dispatch-params` artifact. No fleet trigger from here — the fleet workflow's `workflow_run` listener picks it up.
+- **Race behavior:** Adding the label does not cancel an already-running `auto-route` job because that workflow has a separate concurrency group; at worst, `auto-route` may finish and update the proposal comment once. Fleet runs themselves are keyed per PR, so a label-triggered fleet run cancels any in-progress fleet run for the same PR, and a later confirm/manual run cancels the label-triggered one.
+
+## 4. `pr-review-fleet.yml` — The Reviewer Fleet
+
+- **Triggers:** `workflow_run` on completion of "PR Review Fleet Dispatcher", **or** `workflow_dispatch` (manual / from the confirm workflow). The `workflow_run` path only proceeds if the dispatcher succeeded.
+- **Permissions:** `contents: read`, `pull-requests: write`, `actions: read` (to download the dispatcher artifact), `checks: write` (to surface a "Fleet Review" check on the PR — `workflow_dispatch` runs aren't otherwise visible in the Checks tab).
+- **Concurrency:** Keyed per PR; cancels in-progress runs.
+
+### Jobs
+
+1. **`setup`** — Resolves params (artifact for `workflow_run`, inputs for `workflow_dispatch`). The label path forces the default fleet size of 3 from the priority list `[correctness, security, api-compatibility, performance, testing]`. The confirm path passes an explicit reviewer JSON array. Creates a check run on the PR head and posts an "in progress" sticky comment.
+2. **`review`** (matrix, `fail-fast: false`) — One job per reviewer. Checks out PR head with `persist-credentials: false`, pre-computes the diff / changed-files / api-report-files, then loads the prompt **from the base branch** (`git show origin/${BASE_REF}:.github/prompts/reviewers/${REVIEWER}.md`). Installs `@github/copilot` and runs with `COPILOT_GITHUB_TOKEN`, model `claude-sonnet-4-6`, and **only `--allow-tool=read --allow-tool=write`**. The reviewer writes `review-${reviewer}.json`, uploaded as an artifact. See [Security Notes](#security-notes) for the prompt-injection and credential rationale.
+3. **`consolidate`** — Downloads all `review-*` artifacts, runs `.github/scripts/consolidate_reviews.py` (exit 0 = findings, 2 = clean), scrubs GitHub token patterns from the output as defense-in-depth, and posts the result via the same `pr-review-fleet` sticky header so it overwrites the in-progress comment.
+4. **`teardown`** — Always runs to finalize the check run (success / neutral with findings / failure), so a failed `setup` doesn't leave the "Fleet Review" check stuck in `in_progress`.
+
+## End-to-End Flows
+
+### Recommended path (auto-route)
+
+```
+PR opened/sync ─▶ pr-review-auto-route (pull_request_target, base-branch trusted)
+                    └─▶ posts proposal comment with checkboxes
+User edits comment, ticks "Start review"
+                ─▶ pr-review-confirm (issue_comment.edited, verifies write access)
+                    └─▶ gh workflow run pr-review-fleet.yml (workflow_dispatch)
+                          └─▶ setup → matrix review → consolidate → teardown
+```
+
+### Label path
+
+```
+Label "fleet-review" applied
+  ─▶ pr-review-dispatch (uploads params artifact, contents:read only)
+       └─▶ pr-review-fleet via workflow_run (downloads artifact, default 3 reviewers)
+```
+
+## Security Notes
+
+- `auto-route` uses `pull_request_target`, so fork PRs can trigger a workflow that has repository permissions and secrets. The danger is not the trigger by itself; the danger is combining that privileged context with attacker-controlled PR content. If this workflow checked out the PR head and ran scripts, installs, tests, Git aliases, or other PR-authored code, the PR author could shape that code to read tokens, call GitHub APIs, or influence privileged follow-up actions.
+- To avoid that, `auto-route` only uses trusted workflow code from the base branch. It fetches the PR head commit as data, reads deterministic `git diff --numstat` output, reduces that output to line / file / package counts, and posts a bot-authored proposal comment. The untrusted PR controls the numbers that feed the tiering decision, but not the command sequence, prompt text, posted comment template, or any executable code path.
+- `fleet` is a separate workflow that intentionally checks out the PR head for review context. That step is lower risk because checkout uses `persist-credentials: false`, reviewer prompts are loaded from `origin/${BASE_REF}` instead of the PR checkout, and all git-derived context is pre-materialized before `COPILOT_GITHUB_TOKEN` enters the reviewer step environment.
+- Reviewer agents are credentialed with `COPILOT_GITHUB_TOKEN`, and the PR diff is attacker-controlled text that is sent to an LLM. The model is therefore treated as exposed to prompt injection: it is allowed to read the prepared files and write its JSON review result, but it is denied shell and git tools (`--allow-tool=read --allow-tool=write` only). Without shell/git tools or persisted checkout credentials, a malicious diff can ask the model to exfiltrate or mutate repository state, but the agent has no approved tool path to run commands, invoke git, or push changes.
+- The consolidated report is regex-scrubbed for `gh[pousr]_…` and `github_pat_…` token patterns before posting, since model output itself is untrusted. This is defense-in-depth, not the primary boundary; the primary boundary is preventing PR-controlled content from gaining command execution or access to persisted credentials in the first place.
+- The confirm workflow needs `actions: write` because `GITHUB_TOKEN` can only dispatch workflows in its own repo when that permission is explicitly granted.
diff --git a/.github/workflows/pr-review-fleet.design-notes.md b/.github/workflows/pr-review-fleet.design-notes.md
@@ -0,0 +1,70 @@
+# `pull_request` vs `pull_request_target` — Design Notes
+
+> Adapted from a transcript discussion, corrected against the actual Fleet Review workflows in this directory. See [`pr-review-fleet.README.md`](./pr-review-fleet.README.md) for the workflow reference.
+
+Understanding the difference between the GitHub Actions `pull_request` and `pull_request_target` trigger events is important for designing secure workflows.
+
+At a high level, GitHub's documentation says that `pull_request` runs in the context of the PR's merge commit, while `pull_request_target` runs in the context of the base branch (and, since December 2025, the default branch is the workflow source for `pull_request_target`). Put more simply: when you use `pull_request`, the workflow is evaluating the proposed PR changes; when you use `pull_request_target`, the workflow definition comes from the trusted repository side instead.
+
+The security concern arises because pull requests, especially from forks, must be treated as untrusted input. Build scripts, tests, package install hooks, and other automation paths can all become code-execution points. If a workflow running on PR content also has access to sensitive capabilities, a malicious contributor can shape the PR so that their code runs in that environment and abuses those capabilities.
+
+This is where `pull_request_target` comes in. The documented difference is that it runs using the trusted repository context rather than the PR's merge context, which is why it's commonly used for things like labeling, commenting, or other repo-side automation that needs elevated permissions. But GitHub also warns that this event has a larger attack surface precisely because it runs with repository privileges while still being triggered by user-supplied PR activity. So the safe summary is: `pull_request_target` helps prevent untrusted workflow definitions from being used as the workflow source, but it does not make PR-derived input safe by itself.
+
+This tradeoff is central to using `pull_request_target` safely. Because it executes in a more privileged environment, it requires much more discipline around untrusted input. The primary rule is that PR-derived data is data only: do not check it out as the working tree, execute it, source it, or let it define the workflow/prompt being run.
+
+The secondary rule is to constrain what leaves a privileged job. Instead of passing raw PR-derived data downstream, convert it into well-defined, controlled outputs — typically small blobs of JSON or comments that trusted workflow code constructs. The idea is that no matter what the PR contains, the next workflow step receives something bounded and intentionally shaped.
+
+A concrete example is the review routing logic in `pr-review-auto-route.yml`. The workflow runs on `pull_request_target` from the base branch and **never checks out the PR head** — it only `git fetch`es the head commit and reads `--numstat` from it. From those numbers it derives `lines` / `files` / `packages` counts and picks a tier. That base-branch-only checkout is the foundational input boundary that makes `pull_request_target` safe here. The sticky proposal comment that the bot constructs itself is the separate output boundary.
+
+## How the Fleet Review pipeline composes this
+
+It's worth being precise about the architecture, because it's not a single workflow with privileged and unprivileged jobs — it's **separate workflows on different trigger surfaces**:
+
+- `pr-review-auto-route.yml` runs on `pull_request_target`. Its job is sizing and UI only. It posts a proposal comment with checkboxes; it does not run the reviewer.
+- `pr-review-fleet.yml` runs on `workflow_dispatch` (triggered by the confirm workflow when the author ticks "Start review") or on `workflow_run` (triggered by the label-driven dispatcher). It checks out the PR head itself with `persist-credentials: false`, generates `pr-diff.patch` and `changed-files.txt` inline, and then runs the Copilot CLI in that same job.
+
+So the isolation isn't "diff is passed across job boundaries from a privileged job to a less-privileged one." It's:
+
+1. The reviewer agent has **no shell or exec tools** — only `--allow-tool=read --allow-tool=write`. Even though `.git` is on disk, the agent can't invoke `git`.
+2. Checkout uses `persist-credentials: false`, so the auth token isn't stashed in the local Git config for later steps to reuse.
+3. All git-derived context (the diff, the changed-files list, the reviewer prompt) is materialized **before** `COPILOT_GITHUB_TOKEN` enters the step's environment. Once that token is present, the model-influenced part of the job consumes fixed files instead of invoking git, resolving refs, or loading prompts through a path that attacker-controlled instructions could try to steer.
+
+This was the shape of a previous MSRC issue: earlier versions allowed arbitrary Git operations with persisted credentials, which could have been abused for exfiltration. Removing Git capabilities from the credentialed step closed that hole. Git is still used by deterministic workflow steps before the Copilot token is present; the boundary is that the later model-influenced step has no approved tool path to execute `git`.
+
+## Prompt-injection: a second untrusted input
+
+The diff isn't the only untrusted input the reviewer agent sees — the reviewer prompt itself is too, if you load it from the PR checkout. `pr-review-fleet.yml` loads the prompt from the **base branch** instead:
+
+```bash
+git show "origin/${BASE_REF}:.github/prompts/reviewers/${REVIEWER}.md" > "$PROMPT_FILE"
+```
+
+That way a malicious PR can't rewrite the instructions an authenticated agent is about to execute.
+
+## Output is untrusted too
+
+Model output is also untrusted. The consolidator regex-scrubs GitHub token patterns (`gh[pousr]_…`, `github_pat_…`) from the report before it's posted as a PR comment. This is defense-in-depth, not a complete exfiltration boundary: a model could transform or split a token so it no longer matches the regex. The stronger boundary is that the agent is denied shell/git tools and does not receive persisted checkout credentials. Longer term, the ideal credential model is short-lived, least-privilege, per-run credentials rather than a reusable token.
+
+## Credential handling
+
+Setting `persist-credentials: false` on `actions/checkout` prevents the checkout action from storing the auth token in the local Git config for later Git operations. It does not remove every possible credential from the job, but it closes a common path for accidental or malicious reuse in later steps.
+
+Note also that the confirm workflow (`pr-review-confirm.yml`) declares `actions: write`. That permission is what lets `gh workflow run pr-review-fleet.yml` dispatch the fleet — `GITHUB_TOKEN` can dispatch workflows in its own repo only when this permission is explicitly granted; otherwise the call 403s. It's a small but easy-to-miss requirement when wiring up a confirm-then-dispatch flow.
+
+## Who is allowed to trigger?
+
+There are guardrails on who can trigger the fleet:
+
+- **Confirm path:** `pr-review-confirm.yml` calls `actions-cool/check-user-permission` with `require: write` against `github.event.sender.login` before dispatching. The bot authored the comment, so `author_association` on the event isn't meaningful — we have to check the acting user directly.
+- **Label path:** `pr-review-dispatch.yml` trusts whoever can apply the `fleet-review` label, which is gated by GitHub's own repo permissions on labeling.
+- **Manual `workflow_dispatch`:** gated by GitHub on `actions: write` to the repo.
+
+## Summary
+
+Conceptually, the goal is to split responsibilities:
+
+- `pull_request_target` handles classification / analysis using untrusted input, **without checking out PR head**.
+- Downstream workflows (triggered via `workflow_dispatch` / `workflow_run`) do the heavier work, but with PR code denied any path to credentials — via `persist-credentials: false`, no shell tools for the agent, prompts loaded from the base branch, and output scrubbed before posting.
+- Nothing ever blindly executes code from the PR itself.
+
+In practice, `pull_request` is the appropriate event for running untrusted PR code with limited privileges, while `pull_request_target` is appropriate for trusted repository-side automation that must not execute PR-controlled code. The key is to be precise about what is trusted, what is untrusted, and where credentials or elevated permissions are available — and to remember that "untrusted" includes the diff, any prompt loaded from the PR, **and** the model's own output.