docs: add PR Review Fleet workflow README and design notes by tylerbutler · Pull Request #27373 · microsoft/FluidFramework

tylerbutler · 2026-05-21T14:03:59Z

Adds two docs alongside the fleet review workflows in .github/workflows/:

pr-review-fleet.README.md — reference for the four workflows that compose the fleet review system (pr-review-auto-route, pr-review-confirm, pr-review-dispatch, pr-review-fleet). Covers triggers, permissions, sizing thresholds, end-to-end flows for both the auto-route and label paths, and per-workflow security invariants.
pr-review-fleet.design-notes.md — design discussion of pull_request vs pull_request_target as applied to this pipeline. Covers why the auto-router can safely run on pull_request_target (never checks out PR head), how the fleet workflow isolates the credentialed Copilot agent from git (no shell tools + persist-credentials: false + prompt loaded from base branch), output scrubbing as defense-in-depth, the actions: write requirement on the confirm workflow, and existing actor-permission checks.

Docs-only. No workflow behavior changes.

github-actions · 2026-05-21T14:06:45Z

Hi! Thank you for opening this PR. Want me to review it?

Based on the diff (158 lines, 2 files), I've queued these reviewers:

Correctness — logic errors, race conditions, lifecycle issues
Security — vulnerabilities, secret exposure, injection
API Compatibility — breaking changes, release tags, type design
Performance — algorithmic regressions, memory leaks
Testing — coverage gaps, hollow tests

How this works

Adjust the reviewer set by ticking/unticking boxes above. Reviewer toggles alone don't trigger anything.
Tick Start review below to dispatch the review fleet.
After review finishes, tick Start review again to request another run — it auto-resets after each dispatch.
This comment updates as new commits land; your reviewer selections are preserved.
Start review

Copilot

Pull request overview

Adds documentation alongside the PR Review Fleet GitHub Actions workflows to explain how the fleet system is triggered, routed, and secured (including the pull_request vs pull_request_target rationale and the credential/prompt-isolation invariants).

Changes:

Add a workflow reference README describing the four fleet workflows, their triggers/permissions, sizing thresholds, end-to-end flows, and security invariants.
Add design notes explaining the security tradeoffs of pull_request vs pull_request_target as applied to this pipeline.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File	Description
.github/workflows/pr-review-fleet.README.md	Reference documentation for the fleet review workflow system (triggers, permissions, flow, security invariants).
.github/workflows/pr-review-fleet.design-notes.md	Design/security notes for choosing `pull_request` vs `pull_request_target` in this pipeline.

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

noencke · 2026-05-27T16:45:12Z

+
+- **Trigger:** `pull_request_target` on `opened` / `reopened` / `synchronize` (base branches `main`, `next`, `release/**/*`).
+- **Permissions:** `contents: read`, `pull-requests: write`.
+- **Context:** Runs from the **base branch** with a write token + secrets. Critical security invariant: never checks out the PR head; only diffs against it for stats (lines / files / packages). This is what makes it safe to use `pull_request_target` for fork PRs.


What is dangerous about checking out the PR head? We should note here that the PR head could have attack instructions which would be sent to an LLM. Even so - what is the level of danger? Is the LLM allowed to execute commands, or are all shell commands run deterministically and only their output is given to the LLM? Isn't the output of this step simply a 1/3/5? Is that constrained deterministically? What's the worst that could happen? The Jobs section and security section below go into a some more detail but are still insufficient to fully describe the risk IMO.

noencke · 2026-05-27T16:46:56Z

+
+## 1. `pr-review-auto-route.yml` — Sizer + Proposal
+
+- **Trigger:** `pull_request_target` on `opened` / `reopened` / `synchronize` (base branches `main`, `next`, `release/**/*`).


Do you think we should run review on release branches?

THat's a good point. Most release branch PRs are either release-related/semi-automated or are basically cherry-picks from main. However, since the reviews are opt-in, I think it's valuable to leave them on. Occasionally we do have PRs that aren't straight ports, and the reviews might be useful there.

noencke · 2026-05-27T16:51:49Z

+- **Trust check:** Uses `actions-cool/check-user-permission` to require `write` on the *acting user* (the comment author is the bot, so `author_association` is meaningless here).
+- **What it does:** Fetches the PR's current `head_sha` / `base_ref` from the API (not from any pinned artifact — handles force-pushes between proposal and toggle), parses selected reviewer checkboxes, calls `gh workflow run pr-review-fleet.yml` with explicit `pr_number`, `reviewers`, `head_sha`, `base_ref` inputs. Then resets the "Start review" checkbox so a single tick re-runs it next time.
+
+## 3. `pr-review-dispatch.yml` — Label Path


We note in the auto-route section that auto-route get skipped if pr-review-fleet label is present. What happens if we add the label when auto-route (or a review kicked off by auto-route) is already running?

noencke · 2026-05-27T17:01:03Z

+       └─▶ pr-review-fleet via workflow_run (downloads artifact, default 3 reviewers)
+```
+
+## Security Notes


As mentioned above, I think each possible attack should be described in a little bit more detail. E.g. "We give access to A, but not B. If an attacker has A, it doesn't help them because X. But if an attacker had B, they would be able to do Y."

Probably, do all the detail here, and then have the sections that mention security in auto-route and review-fleet can simply link/reference this section. E.g. "We load from the base branch, not the fork branch - see below for the security rationale."

noencke · 2026-05-27T17:01:52Z

+
+> Adapted from a transcript discussion, corrected against the actual Fleet Review workflows in this directory. See [`pr-review-fleet.README.md`](./pr-review-fleet.README.md) for the workflow reference.
+
+Understanding the difference between `pull_request` and `pull_request_target` is important for designing secure GitHub Actions workflows.


What are these? Triggers?

They're events that can be used as triggers. I'll clarify.

noencke · 2026-05-27T17:09:45Z

+
+The pattern we use is to explicitly treat `pull_request_target` jobs as handling untrusted input, and to aggressively constrain what comes out of them. Instead of passing raw data downstream, we convert things into well-defined, controlled outputs — typically small blobs of JSON that we generate ourselves. The idea is that no matter what the PR contains, the only thing leaving the job is something we've sanitized and shaped.
+
+A concrete example is the review routing logic in `pr-review-auto-route.yml`. The workflow runs on `pull_request_target` from the base branch and **never checks out the PR head** — it only `git fetch`es the head commit and reads `--numstat` from it. From those numbers it derives `lines` / `files` / `packages` counts and picks a tier. The only thing that leaves the job is a sticky proposal comment that the bot constructs itself. That base-branch-only checkout is the foundational invariant that makes `pull_request_target` safe here; the JSON-shaped outputs are a second layer on top.


This paragraph is mixing two points together.

Sanitizing input

Sanitizing output

The previous paragraph is only talking about 2, and this paragraph leads by saying it's a concrete example of that. But then it emphasizes (in bold) 1. Then it goes back to 2. Then in concludes with 1, saying that 2 is secondary.

Nothing incorrect, just a little pedagogically disorganized and confusing.

Yeah, classic stream of consciousness that didn't get cleaned up enough. I'll re-work it.

noencke · 2026-05-27T17:19:57Z

+
+1. The reviewer agent has **no shell or exec tools** — only `--allow-tool=read --allow-tool=write`. Even though `.git` is on disk, the agent can't invoke `git`.
+2. Checkout uses `persist-credentials: false`, so the auth token isn't stashed in the local Git config for later steps to reuse.
+3. All git-derived context (the diff, the changed-files list, the reviewer prompt) is materialized **before** `COPILOT_GITHUB_TOKEN` enters the step's environment.


Why is this important?

noencke · 2026-05-27T17:23:00Z

+
+## Output is untrusted too
+
+Model output is also untrusted. The consolidator regex-scrubs GitHub token patterns (`gh[pousr]_…`, `github_pat_…`) from the report before it's posted as a PR comment. It's defense-in-depth, but worth doing — the agent is denied shell tools, but its prose output still flows to a public-facing comment.


A clever model could easily just switch the prefix and output the token anyway, though. That's exactly the kind of thing I would expect an exfiltration attempt to do as a baseline. Is there any way we can robustly prevent this? Seems like we would need one-time-use tokens generated per action (maybe that's already the case? Or is it one special token shared by all?).

We currently have a single token that is used for the CLI's auth. Ideally we would have something like GitHub Apps where you can get a short-lived token on-demand, but that doesn't work because of how Copilot is licensed - it's per seat, and requires a billable account to tie the acticity to. So the PAT is as good as we get today - but it's not a long term solution IMO because PAT lifetimes are continually being reduced as part of SFI. I think we're ultimately going to need something like AgencyHub or what Justin's been working on, because this security posture isn't sustainable.

noencke · 2026-05-27T17:26:45Z

+2. Checkout uses `persist-credentials: false`, so the auth token isn't stashed in the local Git config for later steps to reuse.
+3. All git-derived context (the diff, the changed-files list, the reviewer prompt) is materialized **before** `COPILOT_GITHUB_TOKEN` enters the step's environment.
+
+This was the shape of a previous MSRC issue: earlier versions allowed arbitrary Git operations with persisted credentials, which could have been abused for exfiltration. Removing Git capabilities from the credentialed step closed that hole. The mitigation is "the agent can't reach git," not "git isn't used at all."


"The mitigation is "the agent can't reach git," not "git isn't used at all.""

FWIW this wasn't useful for my understanding - I wasn't already thinking "git isn't used at all" and it's also pretty vague what "reaching git" means.

Good point. This was what helped me understand the original mitigations, because my initial understanding was that we couldn't safely invoke git operations anywhere in the pipeline, but the problem was thankfully more subtle than that. I'll either wipe this or reword it.

github-actions · 2026-05-28T17:11:11Z

🔗 No broken links found! ✅

Your attention to detail is admirable.

linkcheck output


> fluid-framework-docs-site@0.0.0 ci:check-links /home/runner/work/FluidFramework/FluidFramework/docs
> start-server-and-test "npm run serve -- --no-open" 3000 check-links

1: starting server using command "npm run serve -- --no-open"
and when url "[ 'http://127.0.0.1:3000' ]" is responding with HTTP status code 200
running tests using command "npm run check-links"


> fluid-framework-docs-site@0.0.0 serve
> docusaurus serve --no-open

[SUCCESS] Serving "build" directory at: http://localhost:3000/

> fluid-framework-docs-site@0.0.0 check-links
> linkcheck http://localhost:3000 --skip-file skipped-urls.txt

Crawling...

Stats:
  288859 links
    1925 destination URLs
    2175 URLs ignored
       0 warnings
       0 errors

docs: add PR Review Fleet workflow README and design notes

b964bd6

tylerbutler marked this pull request as ready for review May 21, 2026 14:04

Copilot AI review requested due to automatic review settings May 21, 2026 14:04

Copilot started reviewing on behalf of tylerbutler May 21, 2026 14:09 View session

Copilot AI reviewed May 21, 2026

View reviewed changes

Comment thread .github/workflows/pr-review-fleet.README.md Outdated

Potential fix for pull request finding

399f592

Co-authored-by: Copilot Autofix powered by AI <175728472+Copilot@users.noreply.github.com>

noencke approved these changes May 27, 2026

View reviewed changes

tylerbutler added 10 commits May 28, 2026 09:30

docs: clarify fleet review threat model

f285d72

docs: explain release branch reviews

2082f72

docs: explain fleet label races

dbde7b7

docs: centralize fleet security rationale

a732dde

docs: clarify PR trigger events

68926b6

docs: separate PR input and output bounds

afea22a

docs: explain Copilot token boundary

7108d8c

docs: qualify token scrub boundary

ba9c449

docs: clarify git tool boundary

83675a6

Merge branch 'main' into docs/pr-review-fleet-readme

da5b3d0


		## 1. `pr-review-auto-route.yml` — Sizer + Proposal

		- Trigger: `pull_request_target` on `opened` / `reopened` / `synchronize` (base branches `main`, `next`, `release/*/`).


		> Adapted from a transcript discussion, corrected against the actual Fleet Review workflows in this directory. See [`pr-review-fleet.README.md`](./pr-review-fleet.README.md) for the workflow reference.

		Understanding the difference between `pull_request` and `pull_request_target` is important for designing secure GitHub Actions workflows.


		The pattern we use is to explicitly treat `pull_request_target` jobs as handling untrusted input, and to aggressively constrain what comes out of them. Instead of passing raw data downstream, we convert things into well-defined, controlled outputs — typically small blobs of JSON that we generate ourselves. The idea is that no matter what the PR contains, the only thing leaving the job is something we've sanitized and shaped.

		A concrete example is the review routing logic in `pr-review-auto-route.yml`. The workflow runs on `pull_request_target` from the base branch and never checks out the PR head — it only `git fetch`es the head commit and reads `--numstat` from it. From those numbers it derives `lines` / `files` / `packages` counts and picks a tier. The only thing that leaves the job is a sticky proposal comment that the bot constructs itself. That base-branch-only checkout is the foundational invariant that makes `pull_request_target` safe here; the JSON-shaped outputs are a second layer on top.


		## Output is untrusted too

		Model output is also untrusted. The consolidator regex-scrubs GitHub token patterns (`gh[pousr]_…`, `github_pat_…`) from the report before it's posted as a PR comment. It's defense-in-depth, but worth doing — the agent is denied shell tools, but its prose output still flows to a public-facing comment.

Conversation

tylerbutler commented May 21, 2026

Uh oh!

github-actions Bot commented May 21, 2026 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

Copilot AI left a comment

Choose a reason for hiding this comment

Pull request overview

Reviewed changes

Uh oh!

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

github-actions Bot commented May 28, 2026

linkcheck output

Uh oh!

Reviewers

Assignees

Labels

Projects

Milestone

Development

Uh oh!

3 participants

github-actions Bot commented May 21, 2026 •

edited

Loading