
Commit d9eba4d

Add agent-compatibility plugin
Publish the Agent Compatibility plugin (CLI-backed scans, subagent prompts, orchestration skill) to the official marketplace with marketplace listing and README table entry. Made-with: Cursor
1 parent 9c39b57 commit d9eba4d

10 files changed

Lines changed: 334 additions & 0 deletions


.cursor-plugin/marketplace.json

Lines changed: 5 additions & 0 deletions
@@ -32,6 +32,11 @@
      "name": "ralph-loop",
      "source": "ralph-loop",
      "description": "Iterative self-referential AI loops using the Ralph Wiggum technique."
    },
    {
      "name": "agent-compatibility",
      "source": "agent-compatibility",
      "description": "Compatibility scans and agent-native workflow audits for repository setup, startup paths, and validation loops."
    }
  ]
}

README.md

Lines changed: 1 addition & 0 deletions
@@ -11,6 +11,7 @@ Official Cursor plugins for popular developer tools, frameworks, and SaaS products
| [Cursor Team Kit](cursor-team-kit/) | Developer Tools | Internal-style workflows for CI, code review, shipping, and testing |
| [Create Plugin](create-plugin/) | Developer Tools | Meta workflows for creating Cursor plugins with scaffolding and submission checks |
| [Ralph Loop](ralph-loop/) | Developer Tools | Iterative self-referential AI loops using the Ralph Wiggum technique |
| [Agent Compatibility](agent-compatibility/) | Developer Tools | Compatibility scans and agent-native workflow audits for repository setup, startup paths, and validation loops |

## Repository structure

agent-compatibility/.cursor-plugin/plugin.json

Lines changed: 28 additions & 0 deletions

@@ -0,0 +1,28 @@
{
  "name": "agent-compatibility",
  "displayName": "Agent Compatibility",
  "version": "0.1.0",
  "description": "Compatibility scans and agent-native repo audits built around the agent-compatibility CLI.",
  "author": {
    "name": "Eric Zakariasson"
  },
  "publisher": "Eric Zakariasson",
  "logo": "assets/avatar.png",
  "license": "ISC",
  "keywords": [
    "agents",
    "compatibility",
    "repo-audit",
    "startup",
    "validation"
  ],
  "category": "developer-tools",
  "tags": [
    "agents",
    "compatibility",
    "quality",
    "workflow"
  ],
  "skills": "./skills/",
  "agents": "./agents/"
}

agent-compatibility/README.md

Lines changed: 70 additions & 0 deletions
@@ -0,0 +1,70 @@
# Agent Compatibility Cursor Plugin

This Cursor plugin is a thin wrapper around the published `agent-compatibility` CLI.

The top-level skill is intentionally thin: it coordinates one subagent per check and then synthesizes the results.

All review agents are expected to return the same basic shape in **plain text** (no markdown code fences or heading syntax):

- First line: `<Score Name>: <score>/100`
- Short summary paragraph
- Line `Problems`, then one issue per line prefixed with `- `
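
For example, a conforming report from the startup agent might look like this (the score and findings are made up for illustration):

```text
Startup Compatibility Score: 84/100
The repo starts after one recovery step: the documented dev command assumes an env file that first has to be copied from the example.
Problems
- README omits the env-file copy step needed before first run.
- The dev server boots noticeably slower than the docs suggest.
```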

The orchestration skill (`run-agent-compatabilty`) answers the user with a minimal markdown result: one `## Agent Compatibility Score: N/100` heading and one flat `Problems / suggestions` list, with no formula or component scores unless the user asks for a breakdown.

## What is in here

- `.cursor-plugin/plugin.json`: plugin manifest
- `skills/run-agent-compatabilty/SKILL.md`: thin orchestration skill for the full pass
- `agents/deterministic-scan-review.md`: deterministic CLI scan agent
- `agents/startup-review.md`: startup verification agent
- `agents/validation-review.md`: validation-loop agent
- `agents/docs-reality-review.md`: docs-vs-reality agent

## How it works

The plugin does not embed the scanner. It expects Cursor to run the published npm package when needed:

```bash
npx -y agent-compatibility@latest --json .
```

Or, when a Markdown report is easier to reason about:

```bash
npx -y agent-compatibility@latest --md .
```
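
If you ever wanted to call the scanner from code rather than from the shell, a minimal sketch could shell out to the same command. This is illustrative only: the schema of the scanner's JSON output is not documented here, so the `score` field below is a hypothetical placeholder, not the CLI's confirmed output shape.

```typescript
import { execFile } from "node:child_process";
import { promisify } from "node:util";

const run = promisify(execFile);

// Run the published scanner against a repo path and parse its JSON output.
// The parsed shape is NOT specified in this README; `score` is a
// hypothetical placeholder for whatever the real schema provides.
async function scanRepo(repoPath: string): Promise<unknown> {
  const { stdout } = await run("npx", [
    "-y",
    "agent-compatibility@latest",
    "--json",
    repoPath,
  ]);
  return JSON.parse(stdout);
}

scanRepo(".").then((report) => {
  const maybeScore = (report as { score?: number }).score;
  console.log(maybeScore ?? report);
});
```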

## Local install

If you want to use this plugin directly, symlink this plugin directory into `~/.cursor/plugins/local/agent-compatibility`, for example from the repository root:

```bash
mkdir -p ~/.cursor/plugins/local
ln -s "$(pwd)/agent-compatibility" ~/.cursor/plugins/local/agent-compatibility
```

## Recommended usage

Use `run-agent-compatabilty` when you want the full pass. That skill should fan out to:

- `deterministic-scan-review`
- `startup-review`
- `validation-review`
- `docs-reality-review`

The score names should be:

- `Agent Compatibility Score`
- `Startup Compatibility Score`
- `Validation Loop Score`
- `Docs Reality Score`

## Notes

- The top-level synthesis combines both layers:
  - it computes an internal workflow score from startup, validation, and docs-reality
  - `Agent Compatibility Score` = `round((deterministic * 0.7) + (workflow * 0.3))` (see the sketch after this list)
- The default final user-facing output is intentionally simple: one `Agent Compatibility Score` heading and one flat prioritized `Problems / suggestions` list, with no calculation shown.
- The skill is intentionally thin; the agents do the work.
- The CLI remains the scoring engine.
- If you later want tighter integration, the next step is an MCP server that exposes the scanner as structured tools instead of shell commands.
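
As a minimal sketch of the synthesis arithmetic above (TypeScript, illustrative only; the skill does this in-prompt, and nothing in the plugin ships this code):

```typescript
// Rounded average of the three behavioral checks.
function workflowScore(startup: number, validation: number, docsReality: number): number {
  return Math.round((startup + validation + docsReality) / 3);
}

// Final synthesis: deterministic scan weighted 0.7, workflow layer 0.3.
function agentCompatibilityScore(deterministic: number, workflow: number): number {
  return Math.round(deterministic * 0.7 + workflow * 0.3);
}

// Example with made-up scores: workflow = 84, final = round(50.4 + 25.2) = 76.
const workflow = workflowScore(85, 83, 84);
console.log(agentCompatibilityScore(72, workflow)); // 76
```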
agent-compatibility/agents/deterministic-scan-review.md

Lines changed: 39 additions & 0 deletions

@@ -0,0 +1,39 @@
---
name: deterministic-scan-review
description: Run the agent-compatibility CLI and return the deterministic score with its main problems
model: fast
readonly: true
---

# Deterministic scan review

CLI-backed compatibility scan specialist.

## Trigger

Use when the task is specifically to run the published `agent-compatibility` scanner and report the deterministic result.

## Workflow

1. Try the published scanner first with `npx -y agent-compatibility@latest --json "<path>"`.
2. If you are clearly working inside the scanner source repo and the published package path fails for an environment reason, fall back to the local scanner entrypoint.
3. Only say the scanner is unavailable after you have actually tried the published package and, when it is clearly available, the local fallback.
4. Prefer JSON when you need structured reasoning. Prefer Markdown when the user wants a direct report.
5. Keep the scanner's real score, summary direction, and problem ordering.
6. Do not bundle in startup, validation, or docs-reality judgments. Those belong to separate agents.

## Output

Reply in **plain text only** (no markdown fences, no `#` headings, no emphasis syntax). Use this layout:

First line: `Agent Compatibility Score: <score>/100`

Then a short summary paragraph.

Then the line `Problems` followed by one bullet per line using `- `.

- Use the deterministic scan's real score.
- Include both rubric issues and accelerator issues when they matter.
- If there are no meaningful problems, write `- None.` under `Problems`.
- Do not treat scanner availability as a defect in the target repo.
- If the scanner truly cannot be run, say that the deterministic scan is unavailable because of the tool environment, not because the repo lacks a compatibility CLI.
agent-compatibility/agents/docs-reality-review.md

Lines changed: 44 additions & 0 deletions

@@ -0,0 +1,44 @@
---
name: docs-reality-review
description: Check whether the documented setup and run paths survive contact with reality
model: fast
readonly: true
---

# Docs reality review

Docs-versus-reality specialist for setup, bootstrap, and run guidance.

## Trigger

Use when the user wants to know whether the repo documentation is actually trustworthy for an agent starting fresh.

## Workflow

1. Run the deterministic compatibility scan first.
2. Read the obvious documentation surfaces: `README`, setup docs, env docs, and contribution or agent guidance.
3. Follow the documented setup and run path as literally as practical.
4. Note where docs are accurate, stale, incomplete, or misleading.
5. Pick a specific score instead of a round bucket. Start from these anchors and move a few points if the evidence clearly warrants it:
   - around `93/100` if the docs lead to the working path with little or no correction.
   - around `84/100` if the docs drift in places but an agent can still get to the right setup or run path without much guesswork.
   - around `68/100` if the docs are stale enough that the agent has to reconstruct important steps from the tree or CI.
   - around `27/100` if the docs point the agent down the wrong path or omit key steps you need to proceed.
   - around `12/100` if the real path depends on private docs or internal context that is not available in the repo.
6. Prefer a specific score such as `81`, `85`, or `92` over a multiple of ten when that is the more honest read.

## Output

Reply in **plain text only** (no markdown fences, no `#` headings, no emphasis syntax). Use this layout:

First line: `Docs Reality Score: <score>/100`

Then a short summary paragraph.

Then the line `Problems` followed by one bullet per line using `- `.

- Base the score on what happened when you followed the docs.
- Build Problems from real mismatches, omissions, or misleading guidance.
- If the repo is blocked on secrets or infrastructure, say so plainly and still use the same output shape.
- Minor drift or stale references should not drag a good repo into the mid-60s if the real path is still easy to recover.
- Score the damage from the drift, not the mere existence of drift.
agent-compatibility/agents/startup-review.md

Lines changed: 51 additions & 0 deletions

@@ -0,0 +1,51 @@
---
name: startup-review
description: Try to bootstrap and start a repository like a cold agent, then report where the path breaks down
model: fast
readonly: true
---

# Startup review

Startup-path specialist for repository bootstrap and first-run success.

## Trigger

Use when the user wants to know whether a repo is actually easy to start, not just whether it claims to be.

## Workflow

1. Run the deterministic compatibility scan first.
2. Read the obvious startup surfaces: `README`, scripts, toolchain files, env examples, and workflow docs.
3. Pick the most likely bootstrap path and startup command.
4. Try to reach first success inside a fixed time budget.
5. If the first path fails, allow a small amount of recovery and note what you had to infer.
6. Do not infer a startup failure from a lockfile, a bound port, or an existing repo-local process by itself.
7. Only call startup blocked or failed when your own startup attempt fails, or when the documented startup path cannot be completed within the budget.
8. Pick a specific score instead of a round bucket. Start from these anchors and move a few points if the evidence clearly warrants it:
   - around `93/100` if the main startup path works inside the time budget, even if it needs ordinary local prerequisites such as Docker or a database.
   - around `84/100` if the repo starts, but only after some digging, a recovery step, or heavier setup than the docs suggest.
   - around `68/100` if a startup path probably exists but stays too manual, too ambiguous, or too expensive for normal agent use.
   - around `27/100` if you cannot get a credible startup path working from the repo and docs you have.
   - around `12/100` if the path is blocked on secrets, accounts, or infrastructure you cannot reasonably access.
9. Prefer a specific score such as `82`, `85`, or `91` over a multiple of ten when that is the more honest read.
10. Return the result in the same plain-text report shape as the deterministic scan.

## Output

Reply in **plain text only** (no markdown fences, no `#` headings, no emphasis syntax). Use this layout:

First line: `Startup Compatibility Score: <score>/100`

Then a short summary paragraph.

Then the line `Problems` followed by one bullet per line using `- `.

- Base the score on what happened when you actually tried to start the repo.
- Build Problems from the real startup friction you observed.
- If the repo is blocked on secrets, accounts, or external infra, say that plainly and still use the same output shape.
- Do not assume a Next.js lockfile or a port that does not answer HTTP immediately is a repo problem.
- Do not require an HTTP response unless the documented startup path clearly implies one and you actually started that path yourself.
- If the environment starts successfully, treat that as a strong result. Record the friction, but do not score it like a near-failure.
- Treat Docker, local services, and other standard dev prerequisites as friction, not failure.
- Error-message quality is secondary here unless it actually prevents startup or recovery.
agent-compatibility/agents/validation-review.md

Lines changed: 50 additions & 0 deletions

@@ -0,0 +1,50 @@
---
name: validation-review
description: Assess whether an agent can verify a small change without guessing or running an unnecessarily heavy loop
model: fast
readonly: true
---

# Validation review

Validation-path specialist for self-check loops and targeted verification.

## Trigger

Use when the user wants to know whether an agent can safely verify its own work in a repo.

## Workflow

1. Run the deterministic compatibility scan first.
2. Inspect the repo's declared test, lint, check, and typecheck paths.
3. Decide whether there is a practical scoped loop for a small change.
4. Try the most relevant validation path.
5. Judge whether the result is:
   - targeted
   - actionable
   - noisy
   - too expensive for normal iteration
6. Pick a specific score instead of a round bucket. Start from these anchors and move a few points if the evidence clearly warrants it:
   - around `93/100` if there is a repeatable validation path and it gives useful signal, even if it is broader than ideal.
   - around `84/100` if validation works but is heavier than it should be, repo-wide, or split across a few commands.
   - around `68/100` if a valid loop probably exists but picking the right one takes guesswork or the output is too noisy to trust quickly.
   - around `27/100` if there is no practical validation loop you can actually use.
   - around `12/100` if the loop is blocked on secrets, accounts, or infrastructure you cannot reasonably access.
7. Prefer a specific score such as `83`, `86`, or `91` over a multiple of ten when that is the more honest read.
8. Return the result in the same plain-text report shape as the deterministic scan.

## Output

Reply in **plain text only** (no markdown fences, no `#` headings, no emphasis syntax). Use this layout:

First line: `Validation Loop Score: <score>/100`

Then a short summary paragraph.

Then the line `Problems` followed by one bullet per line using `- `.

- Base the score on the loop you actually tried.
- Build Problems from the real validation friction you observed.
- Prefer concrete issues like "only full-repo test path exists" over generic quality advice.
- Do not score a repo in the mid-60s just because the loop is heavy. If an agent can still verify changes reliably, keep it in the good range and note the cost.
- Noisy logs and extra warnings matter only when they hide the actual validation result.
agent-compatibility/assets/avatar.png

Binary file added (23.7 KB, not shown)
agent-compatibility/skills/run-agent-compatabilty/SKILL.md

Lines changed: 46 additions & 0 deletions

@@ -0,0 +1,46 @@
---
name: run-agent-compatabilty
description: Coordinate the full compatibility pass by launching one subagent per check. Use when the user wants the full agent compatibility review instead of only startup, validation, or another single workflow check.
---

# Run agent compatabilty

## Trigger

Use when the user wants the full agent compatibility pass for a repo.

## Workflow

1. Launch `deterministic-scan-review` to run the CLI and capture the deterministic score and problems.
2. Launch `startup-review` to verify whether the repo can actually be booted by an agent.
3. Launch `validation-review` to check whether an agent can verify a small change with a credible loop.
4. Launch `docs-reality-review` to see whether the documented setup and run paths match reality.
5. Use one subagent per task. Do not collapse these checks into one agent prompt.
6. Compute an internal workflow score as the rounded average of:
   - `Startup Compatibility Score`
   - `Validation Loop Score`
   - `Docs Reality Score`
7. Compute an `Agent Compatibility Score` as:
   - `round((deterministic_score * 0.7) + (workflow_score * 0.3))`
8. Synthesize the results into one final response.

When scoring internally, use specific non-round workflow scores for the behavioral checks rather than coarse round buckets. If startup, validation, or docs mostly work, treat them as good-with-friction rather than defaulting to the mid-60s. Do not create a low workflow score just because logs are noisy or the error text is rough.
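
For example, with hypothetical subagent scores of startup `91`, validation `83`, and docs-reality `87`, the workflow score is `round(261 / 3) = 87`; with a deterministic score of `70`, the final score is `round((70 * 0.7) + (87 * 0.3)) = round(75.1) = 75`.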

## Output

Respond in markdown, but keep it minimal. Do not use fenced code blocks.

Show **only** one score, as a level-two heading: `## Agent Compatibility Score: N/100`. Do not show how it was computed: no weights (e.g. 70/30), no formula, no deterministic score, no workflow score, no per-check scores, and no arithmetic, unless the user explicitly asks for a breakdown.

Then a flat, prioritized list labeled `Problems / suggestions` with one issue per line, each line starting with `- `.

If the deterministic scanner cannot be run because of tool environment issues, say that separately; do not treat it as a repo defect or penalize the repo. Fold deterministic and behavioral findings into that one list instead of separate sections. Focus on the highest-leverage fixes. Do not include a separate summary unless the user asks for more detail.

Example shape:

## Agent Compatibility Score: 72/100

Problems / suggestions
- First issue
- Second issue
- Third issue
