From b964bd609b65bb66b5ac75eb1f35431909ad95b3 Mon Sep 17 00:00:00 2001
From: Tyler Butler <tylerbu@microsoft.com>
Date: Thu, 21 May 2026 06:58:35 -0700
Subject: [PATCH 01/11] docs: add PR Review Fleet workflow README and design
 notes

---
 .github/workflows/pr-review-fleet.README.md   | 85 +++++++++++++++++++
 .../workflows/pr-review-fleet.design-notes.md | 70 +++++++++++++++
 2 files changed, 155 insertions(+)
 create mode 100644 .github/workflows/pr-review-fleet.README.md
 create mode 100644 .github/workflows/pr-review-fleet.design-notes.md

diff --git a/.github/workflows/pr-review-fleet.README.md b/.github/workflows/pr-review-fleet.README.md
new file mode 100644
index 000000000000..544f4a3a0cc7
--- /dev/null
+++ b/.github/workflows/pr-review-fleet.README.md
@@ -0,0 +1,85 @@
+# Fleet Review Workflow System
+
+Four workflows compose the PR fleet review system:
+
+| Workflow | Role |
+| --- | --- |
+| [`pr-review-auto-route.yml`](./pr-review-auto-route.yml) | Sizes the PR and posts a proposal comment with reviewer checkboxes |
+| [`pr-review-confirm.yml`](./pr-review-confirm.yml) | Watches the proposal comment; dispatches the fleet when the author ticks "Start review" |
+| [`pr-review-dispatch.yml`](./pr-review-dispatch.yml) | Label-driven alternate entry point — stages PR context as an artifact |
+| [`pr-review-fleet.yml`](./pr-review-fleet.yml) | Runs the fleet of reviewer agents in parallel, consolidates results, posts the report |
+
+> `pr-changeset-review.yml` is unrelated — it's Vale linting on changeset files.
+
+## 1. `pr-review-auto-route.yml` — Sizer + Proposal
+
+- **Trigger:** `pull_request_target` on `opened` / `reopened` / `synchronize` (base branches `main`, `next`, `release/**`).
+- **Permissions:** `contents: read`, `pull-requests: write`.
+- **Context:** Runs from the **base branch** with a write token + secrets. Critical security invariant: never checks out the PR head; only diffs against it for stats (lines / files / packages). This is what makes it safe to use `pull_request_target` for fork PRs.
+- **What it does:** Computes a recommended fleet tier (1 / 3 / 5 reviewers) from thresholds on lines, files, and packages, then posts a **sticky comment** (`<!-- pr-review-confirm -->` marker) with pre-checked reviewer checkboxes and a "Start review" checkbox. On `synchronize` it carries forward prior toggles and short-circuits if "Start review" is already checked. Skips entirely if the `fleet-review` label is set (label flow takes precedence).
+
+### Sizing thresholds
+
+| Tier | Reviewers | Rule |
+| --- | --- | --- |
+| small | 1 | ≤ 100 lines **and** ≤ 5 files **and** ≤ 1 package |
+| large | 5 | > 500 lines **or** > 30 files **or** > 5 packages |
+| medium | 3 | everything else |
+
+"Packages" = distinct top-level dirs under `packages/`, `experimental/`, `examples/`.
+
+## 2. `pr-review-confirm.yml` — Human Toggle → Dispatch
+
+- **Trigger:** `issue_comment` `edited`.
+- **Permissions:** `contents: read`, `pull-requests: write`, **`actions: write`** (required so `gh workflow run` can dispatch the fleet — without it the API 403s).
+- **Activation guard (`if:`):** must be a PR comment, sender is a `User` (blocks bot self-loops), comment was authored by `github-actions[bot]`, contains the `<!-- pr-review-confirm -->` marker, and the "Start review" checkbox transitioned **unchecked → checked**.
+- **Trust check:** Uses `actions-cool/check-user-permission` to require `write` on the *acting user* (the comment author is the bot, so `author_association` is meaningless here).
+- **What it does:** Fetches the PR's current `head_sha` / `base_ref` from the API (not from any pinned artifact — handles force-pushes between proposal and toggle), parses selected reviewer checkboxes, calls `gh workflow run pr-review-fleet.yml` with explicit `pr_number`, `reviewers`, `head_sha`, `base_ref` inputs. Then resets the "Start review" checkbox so a single tick re-runs it next time.
+
+## 3. `pr-review-dispatch.yml` — Label Path
+
+- **Trigger:** `pull_request` `labeled` (label must be `fleet-review`).
+- **Permissions:** `contents: read` only — it just stages context.
+- **What it does:** Writes `{pr_number, head_sha, base_ref}` to `dispatch/params.json` and uploads it as the `dispatch-params` artifact. No fleet trigger from here — the fleet workflow's `workflow_run` listener picks it up.
+
+## 4. `pr-review-fleet.yml` — The Reviewer Fleet
+
+- **Triggers:** `workflow_run` on completion of "PR Review Fleet Dispatcher", **or** `workflow_dispatch` (manual / from the confirm workflow). The `workflow_run` path only proceeds if the dispatcher succeeded.
+- **Permissions:** `contents: read`, `pull-requests: write`, `actions: read` (to download the dispatcher artifact), `checks: write` (to surface a "Fleet Review" check on the PR — `workflow_dispatch` runs aren't otherwise visible in the Checks tab).
+- **Concurrency:** Keyed per PR; cancels in-progress runs.
+
+### Jobs
+
+1. **`setup`** — Resolves params (artifact for `workflow_run`, inputs for `workflow_dispatch`). The label path forces the default fleet size of 3 from the priority list `[correctness, security, api-compatibility, performance, testing]`. The confirm path passes an explicit reviewer JSON array. Creates a check run on the PR head and posts an "in progress" sticky comment.
+2. **`review`** (matrix, `fail-fast: false`) — One job per reviewer. Checks out PR head with `persist-credentials: false`, pre-computes the diff / changed-files / api-report-files, then loads the prompt **from the base branch** (`git show origin/${BASE_REF}:.github/prompts/reviewers/${REVIEWER}.md`) to prevent prompt-injection via PR-authored prompt edits. Installs `@github/copilot` and runs with `COPILOT_GITHUB_TOKEN`, model `claude-sonnet-4-6`, and **only `--allow-tool=read --allow-tool=write`** — no shell or git tools, because the PR diff is attacker-controlled. The reviewer writes `review-${reviewer}.json`, uploaded as an artifact.
+3. **`consolidate`** — Downloads all `review-*` artifacts, runs `.github/scripts/consolidate_reviews.py` (exit 0 = findings, 2 = clean), scrubs GitHub token patterns from the output as defense-in-depth, and posts the result via the same `pr-review-fleet` sticky header so it overwrites the in-progress comment.
+4. **`teardown`** — Always runs to finalize the check run (success / neutral with findings / failure), so a failed `setup` doesn't leave the "Fleet Review" check stuck in `in_progress`.
+
+## End-to-End Flows
+
+### Recommended path (auto-route)
+
+```
+PR opened/sync ─▶ pr-review-auto-route (pull_request_target, base-branch trusted)
+                    └─▶ posts proposal comment with checkboxes
+User edits comment, ticks "Start review"
+                ─▶ pr-review-confirm (issue_comment.edited, verifies write access)
+                    └─▶ gh workflow run pr-review-fleet.yml (workflow_dispatch)
+                          └─▶ setup → matrix review → consolidate → teardown
+```
+
+### Label path
+
+```
+Label "fleet-review" applied
+  ─▶ pr-review-dispatch (uploads params artifact, contents:read only)
+       └─▶ pr-review-fleet via workflow_run (downloads artifact, default 3 reviewers)
+```
+
+## Security Notes
+
+- `auto-route` uses `pull_request_target` (gets secrets on fork PRs) but **never executes PR code** — it only reads numstat data. Scripts must live on the base branch.
+- `fleet` reads reviewer prompts from `origin/${BASE_REF}`, not the PR checkout — prevents PR authors from rewriting the reviewer instructions that an authenticated agent will execute.
+- Reviewer agents are credentialed with `COPILOT_GITHUB_TOKEN` but denied shell tools; git context is pre-materialized before the token is in scope (git aliases can shell out).
+- The consolidated report is regex-scrubbed for `gh[pousr]_…` and `github_pat_…` token patterns before posting, since the model output itself is untrusted.
+- The confirm workflow needs `actions: write` because `GITHUB_TOKEN` can only dispatch workflows in its own repo when that permission is explicitly granted.
diff --git a/.github/workflows/pr-review-fleet.design-notes.md b/.github/workflows/pr-review-fleet.design-notes.md
new file mode 100644
index 000000000000..2f1e66e5194b
--- /dev/null
+++ b/.github/workflows/pr-review-fleet.design-notes.md
@@ -0,0 +1,70 @@
+# `pull_request` vs `pull_request_target` — Design Notes
+
+> Adapted from a transcript discussion, corrected against the actual Fleet Review workflows in this directory. See [`pr-review-fleet.README.md`](./pr-review-fleet.README.md) for the workflow reference.
+
+Understanding the difference between `pull_request` and `pull_request_target` is important for designing secure GitHub Actions workflows.
+
+At a high level, GitHub's documentation says that `pull_request` runs in the context of the PR's merge commit, while `pull_request_target` runs in the context of the base branch (and, since December 2025, the default branch is the workflow source for `pull_request_target`). Put more simply: when you use `pull_request`, the workflow is evaluating the proposed PR changes; when you use `pull_request_target`, the workflow definition comes from the trusted repository side instead.
+
+The security concern arises because pull requests, especially from forks, must be treated as untrusted input. Build scripts, tests, package install hooks, and other automation paths can all become code-execution points. If a workflow running on PR content also has access to sensitive capabilities, a malicious contributor can shape the PR so that their code runs in that environment and abuses those capabilities.
+
+This is where `pull_request_target` comes in. The documented difference is that it runs using the trusted repository context rather than the PR's merge context, which is why it's commonly used for things like labeling, commenting, or other repo-side automation that needs elevated permissions. But GitHub also warns that this event has a larger attack surface precisely because it runs with repository privileges while still being triggered by user-supplied PR activity. So the safe summary is: `pull_request_target` helps prevent untrusted workflow definitions from being used as the workflow source, but it does not make PR-derived input safe by itself.
+
+This tradeoff is central to using `pull_request_target` safely. Because it executes in a more privileged environment, it requires much more discipline around untrusted input. A safe pattern is to treat PR-derived data as untrusted, transform it into narrowly scoped outputs, and pass along only sanitized artifacts to later jobs.
+
+The pattern we use is to explicitly treat `pull_request_target` jobs as handling untrusted input, and to aggressively constrain what comes out of them. Instead of passing raw data downstream, we convert things into well-defined, controlled outputs — typically small blobs of JSON that we generate ourselves. The idea is that no matter what the PR contains, the only thing leaving the job is something we've sanitized and shaped.
+
+A concrete example is the review routing logic in `pr-review-auto-route.yml`. The workflow runs on `pull_request_target` from the base branch and **never checks out the PR head** — it only `git fetch`es the head commit and reads `--numstat` from it. From those numbers it derives `lines` / `files` / `packages` counts and picks a tier. The only thing that leaves the job is a sticky proposal comment that the bot constructs itself. That base-branch-only checkout is the foundational invariant that makes `pull_request_target` safe here; the JSON-shaped outputs are a second layer on top.
+
+## How the Fleet Review pipeline composes this
+
+It's worth being precise about the architecture, because it's not a single workflow with privileged and unprivileged jobs — it's **separate workflows on different trigger surfaces**:
+
+- `pr-review-auto-route.yml` runs on `pull_request_target`. Its job is sizing and UI only. It posts a proposal comment with checkboxes; it does not run the reviewer.
+- `pr-review-fleet.yml` runs on `workflow_dispatch` (triggered by the confirm workflow when the author ticks "Start review") or on `workflow_run` (triggered by the label-driven dispatcher). It checks out the PR head itself with `persist-credentials: false`, generates `pr-diff.patch` and `changed-files.txt` inline, and then runs the Copilot CLI in that same job.
+
+So the isolation isn't "diff is passed across job boundaries from a privileged job to a less-privileged one." It's:
+
+1. The reviewer agent has **no shell or exec tools** — only `--allow-tool=read --allow-tool=write`. Even though `.git` is on disk, the agent can't invoke `git`.
+2. Checkout uses `persist-credentials: false`, so the auth token isn't stashed in the local Git config for later steps to reuse.
+3. All git-derived context (the diff, the changed-files list, the reviewer prompt) is materialized **before** `COPILOT_GITHUB_TOKEN` enters the step's environment.
+
+This was the shape of a previous MSRC issue: earlier versions allowed arbitrary Git operations with persisted credentials, which could have been abused for exfiltration. Removing Git capabilities from the credentialed step closed that hole. The mitigation is "the agent can't reach git," not "git isn't used at all."
+
+## Prompt-injection: a second untrusted input
+
+The diff isn't the only untrusted input the reviewer agent sees — the reviewer prompt itself is too, if you load it from the PR checkout. `pr-review-fleet.yml` loads the prompt from the **base branch** instead:
+
+```bash
+git show "origin/${BASE_REF}:.github/prompts/reviewers/${REVIEWER}.md" > "$PROMPT_FILE"
+```
+
+That way a malicious PR can't rewrite the instructions an authenticated agent is about to execute.
+
+## Output is untrusted too
+
+Model output is also untrusted. The consolidator regex-scrubs GitHub token patterns (`gh[pousr]_…`, `github_pat_…`) from the report before it's posted as a PR comment. It's defense-in-depth, but worth doing — the agent is denied shell tools, but its prose output still flows to a public-facing comment.
+
+## Credential handling
+
+Setting `persist-credentials: false` on `actions/checkout` prevents the checkout action from storing the auth token in the local Git config for later Git operations. It does not remove every possible credential from the job, but it closes a common path for accidental or malicious reuse in later steps.
+
+Note also that the confirm workflow (`pr-review-confirm.yml`) declares `actions: write`. That permission is what lets `gh workflow run pr-review-fleet.yml` dispatch the fleet — `GITHUB_TOKEN` can dispatch workflows in its own repo only when this permission is explicitly granted; otherwise the call 403s. It's a small but easy-to-miss requirement when wiring up a confirm-then-dispatch flow.
+
+## Who is allowed to trigger?
+
+There are guardrails on who can trigger the fleet:
+
+- **Confirm path:** `pr-review-confirm.yml` calls `actions-cool/check-user-permission` with `require: write` against `github.event.sender.login` before dispatching. The bot authored the comment, so `author_association` on the event isn't meaningful — we have to check the acting user directly.
+- **Label path:** `pr-review-dispatch.yml` trusts whoever can apply the `fleet-review` label, which is gated by GitHub's own repo permissions on labeling.
+- **Manual `workflow_dispatch`:** gated by GitHub on `actions: write` to the repo.
+
+## Summary
+
+Conceptually, the goal is to split responsibilities:
+
+- `pull_request_target` handles classification / analysis using untrusted input, **without checking out PR head**.
+- Downstream workflows (triggered via `workflow_dispatch` / `workflow_run`) do the heavier work, but with PR code denied any path to credentials — via `persist-credentials: false`, no shell tools for the agent, prompts loaded from the base branch, and output scrubbed before posting.
+- Nothing ever blindly executes code from the PR itself.
+
+In practice, `pull_request` is the appropriate event for running untrusted PR code with limited privileges, while `pull_request_target` is appropriate for trusted repository-side automation that must not execute PR-controlled code. The key is to be precise about what is trusted, what is untrusted, and where credentials or elevated permissions are available — and to remember that "untrusted" includes the diff, any prompt loaded from the PR, **and** the model's own output.

From 399f592d22abf20d29c9d87ce46fbb5372860e13 Mon Sep 17 00:00:00 2001
From: Tyler Butler <tyler@tylerbutler.com>
Date: Wed, 27 May 2026 09:33:55 -0700
Subject: [PATCH 02/11] Potential fix for pull request finding

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>
---
 .github/workflows/pr-review-fleet.README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/pr-review-fleet.README.md b/.github/workflows/pr-review-fleet.README.md
index 544f4a3a0cc7..0977e4c562ff 100644
--- a/.github/workflows/pr-review-fleet.README.md
+++ b/.github/workflows/pr-review-fleet.README.md
@@ -13,7 +13,7 @@ Four workflows compose the PR fleet review system:
 
 ## 1. `pr-review-auto-route.yml` — Sizer + Proposal
 
-- **Trigger:** `pull_request_target` on `opened` / `reopened` / `synchronize` (base branches `main`, `next`, `release/**`).
+- **Trigger:** `pull_request_target` on `opened` / `reopened` / `synchronize` (base branches `main`, `next`, `release/**/*`).
 - **Permissions:** `contents: read`, `pull-requests: write`.
 - **Context:** Runs from the **base branch** with a write token + secrets. Critical security invariant: never checks out the PR head; only diffs against it for stats (lines / files / packages). This is what makes it safe to use `pull_request_target` for fork PRs.
 - **What it does:** Computes a recommended fleet tier (1 / 3 / 5 reviewers) from thresholds on lines, files, and packages, then posts a **sticky comment** (`<!-- pr-review-confirm -->` marker) with pre-checked reviewer checkboxes and a "Start review" checkbox. On `synchronize` it carries forward prior toggles and short-circuits if "Start review" is already checked. Skips entirely if the `fleet-review` label is set (label flow takes precedence).

From f285d721d1086a340e1c4a63a87bb45fa2298f81 Mon Sep 17 00:00:00 2001
From: Tyler Butler <tylerbu@microsoft.com>
Date: Thu, 28 May 2026 09:30:56 -0700
Subject: [PATCH 03/11] docs: clarify fleet review threat model

---
 .github/workflows/pr-review-fleet.README.md | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/.github/workflows/pr-review-fleet.README.md b/.github/workflows/pr-review-fleet.README.md
index 0977e4c562ff..b573af030aca 100644
--- a/.github/workflows/pr-review-fleet.README.md
+++ b/.github/workflows/pr-review-fleet.README.md
@@ -15,7 +15,7 @@ Four workflows compose the PR fleet review system:
 
 - **Trigger:** `pull_request_target` on `opened` / `reopened` / `synchronize` (base branches `main`, `next`, `release/**/*`).
 - **Permissions:** `contents: read`, `pull-requests: write`.
-- **Context:** Runs from the **base branch** with a write token + secrets. Critical security invariant: never checks out the PR head; only diffs against it for stats (lines / files / packages). This is what makes it safe to use `pull_request_target` for fork PRs.
+- **Context:** Runs from the **base branch** with a write token + secrets. Critical security invariant: never checks out or executes the PR head; it only fetches the head commit and derives deterministic stats (lines / files / packages) from `git diff --numstat`. See [Security Notes](#security-notes) for the threat model.
 - **What it does:** Computes a recommended fleet tier (1 / 3 / 5 reviewers) from thresholds on lines, files, and packages, then posts a **sticky comment** (`<!-- pr-review-confirm -->` marker) with pre-checked reviewer checkboxes and a "Start review" checkbox. On `synchronize` it carries forward prior toggles and short-circuits if "Start review" is already checked. Skips entirely if the `fleet-review` label is set (label flow takes precedence).
 
 ### Sizing thresholds
@@ -78,8 +78,9 @@ Label "fleet-review" applied
 
 ## Security Notes
 
-- `auto-route` uses `pull_request_target` (gets secrets on fork PRs) but **never executes PR code** — it only reads numstat data. Scripts must live on the base branch.
-- `fleet` reads reviewer prompts from `origin/${BASE_REF}`, not the PR checkout — prevents PR authors from rewriting the reviewer instructions that an authenticated agent will execute.
-- Reviewer agents are credentialed with `COPILOT_GITHUB_TOKEN` but denied shell tools; git context is pre-materialized before the token is in scope (git aliases can shell out).
-- The consolidated report is regex-scrubbed for `gh[pousr]_…` and `github_pat_…` token patterns before posting, since the model output itself is untrusted.
+- `auto-route` uses `pull_request_target`, so fork PRs can trigger a workflow that has repository permissions and secrets. The danger is not the trigger by itself; the danger is combining that privileged context with attacker-controlled PR content. If this workflow checked out the PR head and ran scripts, installs, tests, Git aliases, or other PR-authored code, the PR author could shape that code to read tokens, call GitHub APIs, or influence privileged follow-up actions.
+- To avoid that, `auto-route` only uses trusted workflow code from the base branch. It fetches the PR head commit as data, reads deterministic `git diff --numstat` output, reduces that output to line / file / package counts, and posts a bot-authored proposal comment. The untrusted PR controls the numbers that feed the tiering decision, but not the command sequence, prompt text, posted comment template, or any executable code path.
+- `fleet` is a separate workflow that intentionally checks out the PR head for review context. That step is lower risk because checkout uses `persist-credentials: false`, reviewer prompts are loaded from `origin/${BASE_REF}` instead of the PR checkout, and all git-derived context is pre-materialized before `COPILOT_GITHUB_TOKEN` enters the reviewer step environment.
+- Reviewer agents are credentialed with `COPILOT_GITHUB_TOKEN`, and the PR diff is attacker-controlled text that is sent to an LLM. The model is therefore treated as exposed to prompt injection: it is allowed to read the prepared files and write its JSON review result, but it is denied shell and git tools (`--allow-tool=read --allow-tool=write` only). Without shell/git tools or persisted checkout credentials, a malicious diff can ask the model to exfiltrate or mutate repository state, but the agent has no approved tool path to run commands, invoke git, or push changes.
+- The consolidated report is regex-scrubbed for `gh[pousr]_…` and `github_pat_…` token patterns before posting, since model output itself is untrusted. This is defense-in-depth, not the primary boundary; the primary boundary is preventing PR-controlled content from gaining command execution or access to persisted credentials in the first place.
 - The confirm workflow needs `actions: write` because `GITHUB_TOKEN` can only dispatch workflows in its own repo when that permission is explicitly granted.

From 2082f721b4181fe9f37d318eadc6b3bc6a3eee20 Mon Sep 17 00:00:00 2001
From: Tyler Butler <tylerbu@microsoft.com>
Date: Thu, 28 May 2026 09:32:15 -0700
Subject: [PATCH 04/11] docs: explain release branch reviews

---
 .github/workflows/pr-review-fleet.README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.github/workflows/pr-review-fleet.README.md b/.github/workflows/pr-review-fleet.README.md
index b573af030aca..9f1fcc784f26 100644
--- a/.github/workflows/pr-review-fleet.README.md
+++ b/.github/workflows/pr-review-fleet.README.md
@@ -14,6 +14,7 @@ Four workflows compose the PR fleet review system:
 ## 1. `pr-review-auto-route.yml` — Sizer + Proposal
 
 - **Trigger:** `pull_request_target` on `opened` / `reopened` / `synchronize` (base branches `main`, `next`, `release/**/*`).
+- **Release branches:** Included intentionally. Most release-branch PRs are release automation or cherry-picks, but both entry points are opt-in, and occasional non-standard release changes can still benefit from fleet review.
 - **Permissions:** `contents: read`, `pull-requests: write`.
 - **Context:** Runs from the **base branch** with a write token + secrets. Critical security invariant: never checks out or executes the PR head; it only fetches the head commit and derives deterministic stats (lines / files / packages) from `git diff --numstat`. See [Security Notes](#security-notes) for the threat model.
 - **What it does:** Computes a recommended fleet tier (1 / 3 / 5 reviewers) from thresholds on lines, files, and packages, then posts a **sticky comment** (`<!-- pr-review-confirm -->` marker) with pre-checked reviewer checkboxes and a "Start review" checkbox. On `synchronize` it carries forward prior toggles and short-circuits if "Start review" is already checked. Skips entirely if the `fleet-review` label is set (label flow takes precedence).

From dbde7b72316507302e871916d8adb9fb823c0870 Mon Sep 17 00:00:00 2001
From: Tyler Butler <tylerbu@microsoft.com>
Date: Thu, 28 May 2026 09:33:28 -0700
Subject: [PATCH 05/11] docs: explain fleet label races

---
 .github/workflows/pr-review-fleet.README.md | 1 +
 1 file changed, 1 insertion(+)

diff --git a/.github/workflows/pr-review-fleet.README.md b/.github/workflows/pr-review-fleet.README.md
index 9f1fcc784f26..66518532635d 100644
--- a/.github/workflows/pr-review-fleet.README.md
+++ b/.github/workflows/pr-review-fleet.README.md
@@ -42,6 +42,7 @@ Four workflows compose the PR fleet review system:
 - **Trigger:** `pull_request` `labeled` (label must be `fleet-review`).
 - **Permissions:** `contents: read` only — it just stages context.
 - **What it does:** Writes `{pr_number, head_sha, base_ref}` to `dispatch/params.json` and uploads it as the `dispatch-params` artifact. No fleet trigger from here — the fleet workflow's `workflow_run` listener picks it up.
+- **Race behavior:** Adding the label does not cancel an already-running `auto-route` job because that workflow has a separate concurrency group; at worst, `auto-route` may finish and update the proposal comment once. Fleet runs themselves are keyed per PR, so a label-triggered fleet run cancels any in-progress fleet run for the same PR, and a later confirm/manual run cancels the label-triggered one.
 
 ## 4. `pr-review-fleet.yml` — The Reviewer Fleet
 

From a732ddefdf1020e3f44fa1f1622a3ef0407e09b6 Mon Sep 17 00:00:00 2001
From: Tyler Butler <tylerbu@microsoft.com>
Date: Thu, 28 May 2026 09:34:22 -0700
Subject: [PATCH 06/11] docs: centralize fleet security rationale

---
 .github/workflows/pr-review-fleet.README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/pr-review-fleet.README.md b/.github/workflows/pr-review-fleet.README.md
index 66518532635d..f2b3b8c7cb9f 100644
--- a/.github/workflows/pr-review-fleet.README.md
+++ b/.github/workflows/pr-review-fleet.README.md
@@ -53,7 +53,7 @@ Four workflows compose the PR fleet review system:
 ### Jobs
 
 1. **`setup`** — Resolves params (artifact for `workflow_run`, inputs for `workflow_dispatch`). The label path forces the default fleet size of 3 from the priority list `[correctness, security, api-compatibility, performance, testing]`. The confirm path passes an explicit reviewer JSON array. Creates a check run on the PR head and posts an "in progress" sticky comment.
-2. **`review`** (matrix, `fail-fast: false`) — One job per reviewer. Checks out PR head with `persist-credentials: false`, pre-computes the diff / changed-files / api-report-files, then loads the prompt **from the base branch** (`git show origin/${BASE_REF}:.github/prompts/reviewers/${REVIEWER}.md`) to prevent prompt-injection via PR-authored prompt edits. Installs `@github/copilot` and runs with `COPILOT_GITHUB_TOKEN`, model `claude-sonnet-4-6`, and **only `--allow-tool=read --allow-tool=write`** — no shell or git tools, because the PR diff is attacker-controlled. The reviewer writes `review-${reviewer}.json`, uploaded as an artifact.
+2. **`review`** (matrix, `fail-fast: false`) — One job per reviewer. Checks out PR head with `persist-credentials: false`, pre-computes the diff / changed-files / api-report-files, then loads the prompt **from the base branch** (`git show origin/${BASE_REF}:.github/prompts/reviewers/${REVIEWER}.md`). Installs `@github/copilot` and runs with `COPILOT_GITHUB_TOKEN`, model `claude-sonnet-4-6`, and **only `--allow-tool=read --allow-tool=write`**. The reviewer writes `review-${reviewer}.json`, uploaded as an artifact. See [Security Notes](#security-notes) for the prompt-injection and credential rationale.
 3. **`consolidate`** — Downloads all `review-*` artifacts, runs `.github/scripts/consolidate_reviews.py` (exit 0 = findings, 2 = clean), scrubs GitHub token patterns from the output as defense-in-depth, and posts the result via the same `pr-review-fleet` sticky header so it overwrites the in-progress comment.
 4. **`teardown`** — Always runs to finalize the check run (success / neutral with findings / failure), so a failed `setup` doesn't leave the "Fleet Review" check stuck in `in_progress`.
 

From 68926b6eec9500bb0b25936e160f0e060d44428a Mon Sep 17 00:00:00 2001
From: Tyler Butler <tylerbu@microsoft.com>
Date: Thu, 28 May 2026 09:34:54 -0700
Subject: [PATCH 07/11] docs: clarify PR trigger events

---
 .github/workflows/pr-review-fleet.design-notes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/pr-review-fleet.design-notes.md b/.github/workflows/pr-review-fleet.design-notes.md
index 2f1e66e5194b..0d615e82781a 100644
--- a/.github/workflows/pr-review-fleet.design-notes.md
+++ b/.github/workflows/pr-review-fleet.design-notes.md
@@ -2,7 +2,7 @@
 
 > Adapted from a transcript discussion, corrected against the actual Fleet Review workflows in this directory. See [`pr-review-fleet.README.md`](./pr-review-fleet.README.md) for the workflow reference.
 
-Understanding the difference between `pull_request` and `pull_request_target` is important for designing secure GitHub Actions workflows.
+Understanding the difference between the GitHub Actions `pull_request` and `pull_request_target` trigger events is important for designing secure workflows.
 
 At a high level, GitHub's documentation says that `pull_request` runs in the context of the PR's merge commit, while `pull_request_target` runs in the context of the base branch (and, since December 2025, the default branch is the workflow source for `pull_request_target`). Put more simply: when you use `pull_request`, the workflow is evaluating the proposed PR changes; when you use `pull_request_target`, the workflow definition comes from the trusted repository side instead.
 

From afea22a6979631568001c721112b7e374f825fc0 Mon Sep 17 00:00:00 2001
From: Tyler Butler <tylerbu@microsoft.com>
Date: Thu, 28 May 2026 09:35:34 -0700
Subject: [PATCH 08/11] docs: separate PR input and output bounds

---
 .github/workflows/pr-review-fleet.design-notes.md | 6 +++---
 1 file changed, 3 insertions(+), 3 deletions(-)

diff --git a/.github/workflows/pr-review-fleet.design-notes.md b/.github/workflows/pr-review-fleet.design-notes.md
index 0d615e82781a..added382f61b 100644
--- a/.github/workflows/pr-review-fleet.design-notes.md
+++ b/.github/workflows/pr-review-fleet.design-notes.md
@@ -10,11 +10,11 @@ The security concern arises because pull requests, especially from forks, must b
 
 This is where `pull_request_target` comes in. The documented difference is that it runs using the trusted repository context rather than the PR's merge context, which is why it's commonly used for things like labeling, commenting, or other repo-side automation that needs elevated permissions. But GitHub also warns that this event has a larger attack surface precisely because it runs with repository privileges while still being triggered by user-supplied PR activity. So the safe summary is: `pull_request_target` helps prevent untrusted workflow definitions from being used as the workflow source, but it does not make PR-derived input safe by itself.
 
-This tradeoff is central to using `pull_request_target` safely. Because it executes in a more privileged environment, it requires much more discipline around untrusted input. A safe pattern is to treat PR-derived data as untrusted, transform it into narrowly scoped outputs, and pass along only sanitized artifacts to later jobs.
+This tradeoff is central to using `pull_request_target` safely. Because it executes in a more privileged environment, it requires much more discipline around untrusted input. The primary rule is that PR-derived data is data only: do not check it out as the working tree, execute it, source it, or let it define the workflow/prompt being run.
 
-The pattern we use is to explicitly treat `pull_request_target` jobs as handling untrusted input, and to aggressively constrain what comes out of them. Instead of passing raw data downstream, we convert things into well-defined, controlled outputs — typically small blobs of JSON that we generate ourselves. The idea is that no matter what the PR contains, the only thing leaving the job is something we've sanitized and shaped.
+The secondary rule is to constrain what leaves a privileged job. Instead of passing raw PR-derived data downstream, convert it into well-defined, controlled outputs — typically small blobs of JSON or comments that trusted workflow code constructs. The idea is that no matter what the PR contains, the next workflow step receives something bounded and intentionally shaped.
 
-A concrete example is the review routing logic in `pr-review-auto-route.yml`. The workflow runs on `pull_request_target` from the base branch and **never checks out the PR head** — it only `git fetch`es the head commit and reads `--numstat` from it. From those numbers it derives `lines` / `files` / `packages` counts and picks a tier. The only thing that leaves the job is a sticky proposal comment that the bot constructs itself. That base-branch-only checkout is the foundational invariant that makes `pull_request_target` safe here; the JSON-shaped outputs are a second layer on top.
+A concrete example is the review routing logic in `pr-review-auto-route.yml`. The workflow runs on `pull_request_target` from the base branch and **never checks out the PR head** — it only `git fetch`es the head commit and reads `--numstat` from it. From those numbers it derives `lines` / `files` / `packages` counts and picks a tier. That base-branch-only checkout is the foundational input boundary that makes `pull_request_target` safe here. The sticky proposal comment that the bot constructs itself is the separate output boundary.
 
 ## How the Fleet Review pipeline composes this
 

From 7108d8cd1f30f3ddc615a17f7291fc8dada811ae Mon Sep 17 00:00:00 2001
From: Tyler Butler <tylerbu@microsoft.com>
Date: Thu, 28 May 2026 09:36:52 -0700
Subject: [PATCH 09/11] docs: explain Copilot token boundary

---
 .github/workflows/pr-review-fleet.design-notes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/pr-review-fleet.design-notes.md b/.github/workflows/pr-review-fleet.design-notes.md
index added382f61b..80a3fdc86491 100644
--- a/.github/workflows/pr-review-fleet.design-notes.md
+++ b/.github/workflows/pr-review-fleet.design-notes.md
@@ -27,7 +27,7 @@ So the isolation isn't "diff is passed across job boundaries from a privileged j
 
 1. The reviewer agent has **no shell or exec tools** — only `--allow-tool=read --allow-tool=write`. Even though `.git` is on disk, the agent can't invoke `git`.
 2. Checkout uses `persist-credentials: false`, so the auth token isn't stashed in the local Git config for later steps to reuse.
-3. All git-derived context (the diff, the changed-files list, the reviewer prompt) is materialized **before** `COPILOT_GITHUB_TOKEN` enters the step's environment.
+3. All git-derived context (the diff, the changed-files list, the reviewer prompt) is materialized **before** `COPILOT_GITHUB_TOKEN` enters the step's environment. Once that token is present, the model-influenced part of the job consumes fixed files instead of invoking git, resolving refs, or loading prompts through a path that attacker-controlled instructions could try to steer.
 
 This was the shape of a previous MSRC issue: earlier versions allowed arbitrary Git operations with persisted credentials, which could have been abused for exfiltration. Removing Git capabilities from the credentialed step closed that hole. The mitigation is "the agent can't reach git," not "git isn't used at all."
 

From ba9c449a9c4baae5e008a8b790cfbd77b9e3b1a6 Mon Sep 17 00:00:00 2001
From: Tyler Butler <tylerbu@microsoft.com>
Date: Thu, 28 May 2026 09:37:39 -0700
Subject: [PATCH 10/11] docs: qualify token scrub boundary

---
 .github/workflows/pr-review-fleet.design-notes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/pr-review-fleet.design-notes.md b/.github/workflows/pr-review-fleet.design-notes.md
index 80a3fdc86491..12ee4d9b5c3d 100644
--- a/.github/workflows/pr-review-fleet.design-notes.md
+++ b/.github/workflows/pr-review-fleet.design-notes.md
@@ -43,7 +43,7 @@ That way a malicious PR can't rewrite the instructions an authenticated agent is
 
 ## Output is untrusted too
 
-Model output is also untrusted. The consolidator regex-scrubs GitHub token patterns (`gh[pousr]_…`, `github_pat_…`) from the report before it's posted as a PR comment. It's defense-in-depth, but worth doing — the agent is denied shell tools, but its prose output still flows to a public-facing comment.
+Model output is also untrusted. The consolidator regex-scrubs GitHub token patterns (`gh[pousr]_…`, `github_pat_…`) from the report before it's posted as a PR comment. This is defense-in-depth, not a complete exfiltration boundary: a model could transform or split a token so it no longer matches the regex. The stronger boundary is that the agent is denied shell/git tools and does not receive persisted checkout credentials. Longer term, the ideal credential model is short-lived, least-privilege, per-run credentials rather than a reusable token.
 
 ## Credential handling
 

From 83675a6773a7b571e8fa3a870308af2c16319349 Mon Sep 17 00:00:00 2001
From: Tyler Butler <tylerbu@microsoft.com>
Date: Thu, 28 May 2026 09:38:33 -0700
Subject: [PATCH 11/11] docs: clarify git tool boundary

---
 .github/workflows/pr-review-fleet.design-notes.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/.github/workflows/pr-review-fleet.design-notes.md b/.github/workflows/pr-review-fleet.design-notes.md
index 12ee4d9b5c3d..692fa5137ec8 100644
--- a/.github/workflows/pr-review-fleet.design-notes.md
+++ b/.github/workflows/pr-review-fleet.design-notes.md
@@ -29,7 +29,7 @@ So the isolation isn't "diff is passed across job boundaries from a privileged j
 2. Checkout uses `persist-credentials: false`, so the auth token isn't stashed in the local Git config for later steps to reuse.
 3. All git-derived context (the diff, the changed-files list, the reviewer prompt) is materialized **before** `COPILOT_GITHUB_TOKEN` enters the step's environment. Once that token is present, the model-influenced part of the job consumes fixed files instead of invoking git, resolving refs, or loading prompts through a path that attacker-controlled instructions could try to steer.
 
-This was the shape of a previous MSRC issue: earlier versions allowed arbitrary Git operations with persisted credentials, which could have been abused for exfiltration. Removing Git capabilities from the credentialed step closed that hole. The mitigation is "the agent can't reach git," not "git isn't used at all."
+This was the shape of a previous MSRC issue: earlier versions allowed arbitrary Git operations with persisted credentials, which could have been abused for exfiltration. Removing Git capabilities from the credentialed step closed that hole. Git is still used by deterministic workflow steps before the Copilot token is present; the boundary is that the later model-influenced step has no approved tool path to execute `git`.
 
 ## Prompt-injection: a second untrusted input