Claude Code reflection plugin + 907-stop classification baseline by dzianisv · Pull Request #139 · dzianisv/opencode-plugins

dzianisv · 2026-05-25T22:48:43Z

Thinking Path

Paperclip orchestrates AI agents for zero-human companies

Production deployments need the server process to start on boot and stay alive after the operator's shell exits

The repository already provides a Podman Quadlet path (docker/quadlet/) but no documented way to run the plain Node process under systemd

Issue #467 (open since March) asks for exactly that, and the docs-only PR #555 has been stalled since then

The capability already exists — paperclipai run is a long-running foreground process and systemd can supervise it — but operators are reinventing the wheel each time and hitting the same pitfalls (no-TTY tsx loader, ENOSPC, lingering)

This pull request adds a sample unit file and an install guide so the existing capability becomes a documented, copy-pasteable path

The benefit is one less surprise for new self-hosters and a clean answer to the most-requested deployment question in the issue tracker, with no behavior change to the runtime

What Changed

New deploy/systemd/paperclip.service template covering the three common install styles (npx, source checkout via pnpm, source checkout via direct tsx) with inline TODO markers for the two values an operator must edit.
New doc/SYSTEMD.md walkthrough: install, enable lingering, start, verify, common ops, updating, and a troubleshooting section for the failure modes that bit me on first install (no-TTY tsx loader, Start request repeated too quickly, ENOSPC crash loops, tailnet bind ordering).
README.md: one-line link from the install snippet so first-time self-hosters discover the systemd path without having to search.
doc/DEVELOPING.md: one-paragraph cross-link next to the Docker Quickstart / Quadlet sections.

No code, no manifest, no lockfile, no Dockerfile changes — strictly docs + a sample unit.

Verification

Both pieces were validated on the host that motivated this change (Ubuntu 24, source checkout, tailnet bind, embedded Postgres):

Drop the unit into ~/.config/systemd/user/paperclip.service, fill in the two TODO values.
sudo loginctl enable-linger "$USER" then systemctl --user daemon-reload && systemctl --user enable --now paperclip.service.
systemctl --user status paperclip.service → Active: active (running) within ~10 s.
journalctl --user -u paperclip.service -f shows the Paperclip banner, embedded PostgreSQL ready line, and Better Auth init.
curl -sf http://<bind-ip>:3100/api/health → {"status":"ok",...}.

Sanity-checked the doc against the failure paths in the troubleshooting section by reproducing them on the same host before writing them up.

Risks

Low. Pure docs and a sample file under a new deploy/systemd/ path. No existing files are removed, no runtime, build, or CI behavior changes. The README/DEVELOPING edits are additive paragraphs. Worst case for a reader is that they follow the guide on an unsupported distro and the service does not start — at which point they are no worse off than before this PR existed.

Model Used

Claude Opus 4.7 (1M context window, extended thinking enabled, tool use including Bash/Read/Edit/Write/Grep). Human-reviewed, edited, and verified on the target host.

Checklist

I have included a thinking path that traces from project context to this change
I have specified the model used (with version and capability details)
I have checked ROADMAP.md and confirmed this PR does not duplicate planned core work (docs-only path; no roadmap item covers it)
I have run tests locally and they pass (no test-impacting changes; CI policy job covers lockfile/Dockerfile invariants)
I have added or updated tests where applicable (n/a — docs only)
If this change affects the UI, I have included before/after screenshots (n/a)
I have updated relevant documentation to reflect my changes
I have considered and documented any risks above
I will address all Greptile and reviewer comments before requesting merge

Closes #467. Supersedes #555 (docs-only) by also shipping the sample unit file.

Group A of plan v2 for issue #137. Lays the foundation for the Claude Code reflection plugin without enabling it end-to-end yet:

- claude/.claude-plugin/plugin.json + hooks/hooks.json: Stop hook wiring
- claude/bin/reflect.mjs: entry skeleton with loop-guard, attempt counter, transcript tail-read, debug logging, fail-safe error handling. Strips tool_use/tool_result from the stop context per spec (only user msgs + final assistant text reach the judge).
- claude/README.md, claude/package.json: install + author docs
- evals/scripts/mine-cc-stops.mjs: scans ~/.claude/projects/**/*.jsonl, extracts Stop boundaries, emits candidate JSONL with metadata (tools_available_inferred, user_messages, final_assistant_text)
- .gitignore: exclude raw cc-stop-*.jsonl datasets (contain user data); allow committing redacted gold set

No classifier yet. No inject yet. Plugin loads but exits 0 on every Stop.

Next: run miner, filter, classify with Claude Code haiku.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Group B/C of plan v2.

- filter-cc-stops.mjs: heuristic pass over miner output. Tags each candidate with hint:summary_drift / hint:punt / hint:stuck / hint:question. Drops candidates with no hints (cheap "complete" answers).
- classify-cc-stops.mjs: calls Anthropic API directly with the OAuth Bearer token from ~/.claude/.credentials.json (avoids the ~100K context bloat that `claude -p` loads from CLAUDE.md / skills / plugins). Same model (claude-haiku-4-5), same user auth, just routed direct. Concurrency 4, retry-on-429, resume-safe (skips records already in output).

Output JSONL stays gitignored (evals/datasets/cc-stop-*.jsonl) because it is real user session data. Only the redacted gold subset is committed downstream.

Smoke run: 10 samples classified in ~9s, 1294 input tokens/sample avg.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
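The heuristic pass described for filter-cc-stops.mjs could look roughly like the sketch below. The four hint names and the `final_assistant_text` field come from the commit messages; the regular expressions, the `hints` field, and the function names are assumptions made for illustration, since the real heuristics are not shown in this PR.

```javascript
// Hypothetical patterns: the actual heuristics in filter-cc-stops.mjs are not
// quoted in the PR, so these regexes are illustrative only.
const HINT_PATTERNS = {
  "hint:summary_drift": /\b(let me|i will|i'll|next,? i)\b/i,
  "hint:punt": /\byou (can|could|should|will need to) (run|check|try)\b/i,
  "hint:stuck": /\b(i'm stuck|unable to|cannot proceed|can't figure)\b/i,
  "hint:question": /\?\s*$/,
};

// Attach every matching hint to a candidate record (field name `hints` is assumed).
function tagCandidate(candidate) {
  const text = candidate.final_assistant_text ?? "";
  const hints = Object.entries(HINT_PATTERNS)
    .filter(([, pattern]) => pattern.test(text))
    .map(([hint]) => hint);
  return { ...candidate, hints };
}

// Keep only candidates with at least one hint; hint-free stops are treated as
// cheap "complete" answers and dropped before the paid classifier runs.
function filterCandidates(candidates) {
  return candidates.map(tagCandidate).filter((c) => c.hints.length > 0);
}
```

Dropping hint-free records first keeps the classifier run small, which matters when each record costs an API call.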

End-to-end pipeline now works:

- claude/lib/judge.mjs: classifies a stop context into one of 6 categories via Haiku 4.5 over the Anthropic API (OAuth Bearer from ~/.claude/.credentials.json, same path as the eval classifier). 15s hard timeout via AbortController. TIMEOUT/PARSE_ERROR returns are treated as "no inject" by the caller, which keeps the hook fail-safe.
- claude/lib/feedback.mjs: per-category templates with escalating tone across attempts 1/2/3. Injects on summary_drift_stop, tool_available_punt, genuinely_stuck. Skips on complete, waiting_for_user_legitimate, working, and any error category.
- claude/bin/reflect.mjs: replaced the task-11/13 TODO blocks. Now reads stdin, applies loop-guard + attempt-cap, calls judge, writes verdict file, and (if injectable) emits the {decision:"block", additionalContext} JSON on stdout per Claude Code Stop hook spec.

Smoke-tested with a real transcript file. Verified:

- happy path produces a valid block payload with additionalContext
- stop_hook_active=true: exits 0, no stdout, logs loop_guard_triggered
- attempt counter at MAX: exits 0, no stdout, logs attempt_cap_reached

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
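The gating described for feedback.mjs (inject on three categories, skip everything else, escalate tone over attempts 1/2/3) can be sketched as follows. The category names and the cap of three attempts come from the commit messages; the message wording, constant names, and the `buildFeedback` signature are assumptions and may differ from the real module.

```javascript
// Categories named in the commit message as the only ones that trigger an inject.
const INJECT_CATEGORIES = new Set([
  "summary_drift_stop",
  "tool_available_punt",
  "genuinely_stuck",
]);

const MAX_ATTEMPTS = 3; // attempts 1/2/3 per the commit message

// Hypothetical wording: the real templates live in claude/lib/feedback.mjs.
const TONE_BY_ATTEMPT = {
  1: "Reminder:",
  2: "Second reminder:",
  3: "Final reminder:",
};

const MESSAGE_BY_CATEGORY = {
  summary_drift_stop: "you announced a next step but stopped before doing it.",
  tool_available_punt: "you asked the user to do something your tools can do.",
  genuinely_stuck: "state what is blocking you and try a different approach.",
};

// Returns the feedback string, or null when nothing should be injected.
function buildFeedback(category, attempt) {
  // complete, waiting_for_user_legitimate, working, TIMEOUT, PARSE_ERROR, ...
  if (!INJECT_CATEGORIES.has(category)) return null;
  // Outside the 1..MAX_ATTEMPTS window the hook stays silent.
  if (!Number.isInteger(attempt) || attempt < 1 || attempt > MAX_ATTEMPTS) return null;
  return `${TONE_BY_ATTEMPT[attempt]} ${MESSAGE_BY_CATEGORY[category]}`;
}
```

Returning null for every unknown or error category means a classifier failure can never produce an inject, which matches the fail-safe behavior the commit describes.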

#137)

- claude/test/reflect.test.mjs: 35 Node native-test cases covering feedback templates per category/attempt, reflect.mjs exports (loopGuard, attempt counter round-trip, transcript tail, stop context build), judge.mjs (stubbed fetch with zero real API calls, code-fence parsing, 429 retry, AbortController timeout, missing credentials path), and an in-process integration test (classify → buildFeedback → block output JSON). All 35 pass in ~300ms with --test-force-exit.
- claude/package.json: test script uses --test-force-exit + explicit glob (test discovery without glob silently mis-resolved on Node 22).
- evals/scripts/audit-cc-classifications.mjs: stratified sample (per-cat) + redaction (emails, tokens, /home paths, github refs, long secrets).
- evals/datasets/cc-stop-labeled-gold-redacted.jsonl: 30 records, stratified 6 per category across the 5 categories that appeared in the 907-record baseline. Supervisor-audited gold_label per record (v1 mostly accepts haiku, with one correction class: "complete" + ends-with-"Which?" → waiting_for_user_legitimate).
- evals/datasets/README.md: dataset provenance, redaction rules, baseline distribution, known prompt issues (link to follow-up #138).

Follow-up tracked in #138: refine classifier prompt (working over-assigned 374×, tool_available_punt under-assigned 0×). Acceptance: F1 ≥ 0.75 on the two high-value categories with an expanded gold set.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
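The per-category stratified sample used to build the gold set (6 records per category) might be sketched like this. The `category` field name and the "first N per category" selection rule are assumptions; the audit script may sample randomly or use a different field.

```javascript
// Take at most `perCategory` records from each category, preserving input order.
// Assumes each record carries its classifier label in a `category` field.
function stratifiedSample(records, perCategory) {
  const buckets = new Map();
  for (const record of records) {
    const bucket = buckets.get(record.category) ?? [];
    if (bucket.length < perCategory) bucket.push(record);
    buckets.set(record.category, bucket);
  }
  // Flatten the per-category buckets back into a single list.
  return [...buckets.values()].flat();
}
```

With 5 observed categories and `perCategory = 6`, a sample built this way would contain at most 30 records, which is consistent with the size of the committed gold file.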

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Reviewer raised 6 real issues, all fixed:

1. claude/bin/reflect.mjs:23: removed unused createRequire import.
2. claude/bin/reflect.mjs:100-109: added sanitizeCwd() helper. Rejects non-absolute or non-normalized cwd from the Stop hook payload (defends against payloads like cwd:"../etc"). On throw, the existing uncaughtException handler exits 0, so the hook stays fail-safe.
3. claude/bin/reflect.mjs:165-186: writeAttemptCounter is now atomic (tmp + POSIX rename) AND concurrency-safe: only writes if the new count exceeds the existing on-disk count. Prevents two racing Stop hooks for the same session from clobbering each other and bypassing the 3-inject cap.
4. claude/bin/reflect.mjs:148-154: readAttempts handles a corrupt / partially-written counter file by returning 0 and logging "attempts_file_corrupt".
5. claude/lib/judge.mjs:43-62, 285+: added sanitizeError() helper. Strips Bearer/authorization/x-api-key from API error texts before they reach debug logs. Prevents the OAuth token from leaking if the Anthropic API echoes auth headers on a 401.
6. evals/scripts/audit-cc-classifications.mjs:34-40: strengthened redaction patterns: fixed "Accept-Bearer" → case-insensitive "Authorization: Bearer", added x-api-key, Stripe (sk/pk/rk_test/live), AWS access keys (AKIA...), and JWT-shaped tokens (a.b.c). JWT pattern placed before the long-secret regex because dots break \b boundaries.

Existing 35 unit tests still pass (npm test, 291ms).

Smoke verified:

- valid absolute cwd → emits decision:block as before
- cwd:"/tmp/../etc" → sanitizeCwd throws → uncaughtException → exit 0, no stdout, no fs writes outside the project tree
- cwd:"./relative" → same fail-safe behavior

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
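A helper in the spirit of sanitizeError() could be sketched as below. Only the function name and the three targets (Bearer, authorization, x-api-key) come from the commit message; the regular expressions and the `[REDACTED]` placeholder are assumptions, and the real helper in claude/lib/judge.mjs may be stricter.

```javascript
// Illustrative redaction rules, applied in order. Each keeps the header name
// (capture group 1) and replaces the credential that follows it.
const REDACTIONS = [
  // "Authorization: Bearer <token>" or "authorization=<token>"
  [/(authorization\s*[:=]\s*)(bearer\s+)?[^\s",;]+/gi, "$1[REDACTED]"],
  // "x-api-key: <key>"
  [/(x-api-key\s*[:=]\s*)[^\s",;]+/gi, "$1[REDACTED]"],
  // Any remaining bare "Bearer <token>", e.g. inside a JSON string value
  [/(bearer\s+)[A-Za-z0-9._~+\/=-]+/gi, "$1[REDACTED]"],
];

// Strip credentials from an API error text before it is written to a debug log.
function sanitizeError(text) {
  return REDACTIONS.reduce(
    (out, [pattern, replacement]) => out.replace(pattern, replacement),
    String(text)
  );
}
```

Running the error text through such a helper at the single point where errors are logged keeps the rule easy to audit.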

Phase 7 reviewer flagged that the 35-test suite in claude/test/ was not run by CI; only the root Jest suite (test/*.ts) was. Adds a post-step that runs node --test --test-force-exit test/*.mjs in ./claude so future regressions land in CI, not on the dev box.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

Per user feedback: stubbed-fetch unit tests can't prove the Stop hook actually fires inside Claude Code or that injects reach the agent. Real E2E with `claude -p` + real Anthropic API is the only meaningful gate.

Changes:

1. Deleted claude/test/reflect.test.mjs (35 unit tests, all stubbed).
2. Removed the corresponding CI step in .github/workflows/test.yml.
3. Added claude/test/e2e-cc.mjs: real E2E runner with 4 scenarios:
   - explicit_wait_negative: user says "wait" -> plugin must not inject.
   - complete_negative: trivial Q&A -> plugin must not inject.
   - attempt_cap_respected: multi-file task -> no false-positive injects, attempt cap honored.
   - direct_pipe_summary_drift: synthetic drift transcript piped directly to reflect.mjs -> verifies the full inject path: real classifier call, correct CC Stop hook schema in stdout, no hookSpecificOutput.

Run: node claude/test/e2e-cc.mjs (or per scenario: --scenario N). Cost ~$0.05-0.20/scenario via Haiku 4.5 OAuth. Out of CI (auth + cost).

Bug fixes uncovered by E2E:

1. claude/bin/reflect.mjs: hook fires BEFORE transcript flush in -p mode. Added poll loop (100ms x 10) that re-reads transcript until the final assistant text appears. If still empty after polling, exit 0 (fail-safe -- better to skip than false-positive inject).
2. claude/bin/reflect.mjs: Stop hook JSON schema fix. CC v2.1.150 rejects { decision, reason, hookSpecificOutput: {...} } as "Invalid input" -- that shape is for PreToolUse / PostToolUse. The correct Stop hook shape per hookify/core/rule_engine.py and empirical test is { decision: "block", reason }. CC injects reason as the agent's next-turn instruction; the longer feedback message now goes in reason. Verified by hook_blocking_error attachment + isMeta user message "Stop hook feedback: <reason>" in the transcript.

E2E results (2026-05-26):

- 4/4 PASS
- s1 (explicit_wait_negative): 0 injects (correct)
- s2 (complete_negative): 0 injects (correct)
- s3 (attempt_cap_respected): 0 injects (Haiku didn't drift on this task)
- s4 (direct_pipe_summary_drift): 1 inject with schema-valid stdout

Known test-methodology limitation (follow-up): Haiku 4.5 rarely drifts on small E2E prompts so scenario 3 is vacuously satisfied. The architecture is proven; pattern provocation needs Sonnet or longer-horizon tasks.

Install for sessions (workaround for --plugin-dir not enabling Stop hooks in -p mode, CC v2.1.150): merge hooks/hooks.json into your ~/.claude/settings.json under the "hooks" key, with command path pointing at this plugin's bin/reflect.mjs absolute path. Plugin packaging remains for future marketplace publication.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
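The corrected Stop hook output shape from this commit, { decision: "block", reason }, can be produced with a small helper like the one sketched here. The helper name and the empty-string convention for "no inject" are assumptions; only the two-key payload shape is taken from the commit message.

```javascript
// Build the stdout payload for a Stop hook that wants to block the stop.
// Returns "" when there is nothing to inject, so the caller can exit 0 silently.
function buildStopHookOutput(reason) {
  if (typeof reason !== "string" || reason.trim() === "") return "";
  // Only `decision` and `reason`: the commit reports that adding
  // hookSpecificOutput is rejected for Stop hooks on CC v2.1.150.
  return JSON.stringify({ decision: "block", reason });
}
```

A caller would write the returned string to stdout only when it is non-empty, for example `if (out) process.stdout.write(out)`, and exit 0 in both cases.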

…137)

- Install: settings.json hook is the authoritative path; --plugin-dir doesn't activate Stop hooks in headless -p mode on CC v2.1.150. Document the marketplace path as future work.
- Failure categories: corrected to the 6 the classifier actually uses (matched judge.mjs/feedback.mjs). Removed the older speculative context_exhaustion/decision_paralysis/false_completion entries that never landed in the prompt.
- Testing: documented the new E2E runner (node claude/test/e2e-cc.mjs) with scenario descriptions.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

…ompt (#138) (#147)

* feat(reflection): v2.3 prompt + head+tail truncation (#138)

Three discriminator rules added after auditing 907 baseline + 89.5% v2.3 eval on combined 57-record gold (CC+OC):

1. CLOSING DELIVERY DOMINATES: if a long turn ends with a result (PR pushed, tests pass, verdict, error returned), intermediate "Let me check X. Let me apply Y" phrases were process narration, not commitment. Turn is COMPLETE.
2. CAPABILITY OFFER IS NOT COMMITMENT: "I can do X" / "Next I can run X" / "If you want, I can X" is an offer to the user. Drift requires committed action ("I will run", "Let me now"), not a capability statement.
3. IMPERATIVE GUIDANCE TO USER IS COMPLETE: "Try X", "Use /clear", "Run npm test" directed at the user is guidance, not agent self-action.

Plus head+tail truncation: long final_assistant_text was capped at 2400ch, hiding the closing delivery on turns ≥3kb (CC#23 was 3871ch). Now keep 1800 head + 2400 tail so both opening commitment and closing result land in the prompt. Bug fix: prevents false-positive drift on long delivery turns.

Eval results on combined 57-record gold (CC 30 + OC 27):

- v1 baseline: 42.1% accuracy, drift F1=0.55
- v2.3: 89.5% accuracy, drift F1=0.82, complete recall 28/28 (100%)
- Zero false-positive drift on the 28 complete gold records, which eliminates the exact failure mode logged in #138 (false-inject on "watch over next few days" delivery summary).

Acceptance status:

- F1 ≥ 0.75 on summary_drift_stop: 0.82 ✓
- working < 5% on corpus: 0.27% CC, 0% OC ✓
- F1 ≥ 0.75 on tool_available_punt: deferred, 0 gold records (rare pattern in this user's sessions). Prompt has discriminator + 2 few-shots ready to evaluate when prod data surfaces examples.

Closes #138.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* feat(evals): OC stop dataset mining infrastructure (#138)

Adds OpenCode SQLite stop-event miner + Haiku classifier + audit harness, mirroring the CC scripts so the same #138 acceptance criteria can be measured on OpenCode sessions.

- mine-oc-stops.mjs: walks ~/.local/share/opencode/opencode.db (sqlite3 -readonly -json), emits same JSONL schema as mine-cc-stops.mjs.
- classify-oc-stops.mjs: same v2.3 prompt as CC classifier (kept in sync with claude/lib/judge.mjs and classify-cc-stops.mjs). OAuth Bearer via ~/.claude/.credentials.json.
- audit-oc-classifications.mjs: stratified sample + redaction for publishable gold set, matching cc-stop variant.
- .gitignore: raw OC datasets excluded; only redacted gold subset committed.

463 OC candidates mined from 727 sessions across 15 projects; classifier distribution: 193 complete / 190 drift / 69 wait / 11 stuck / 0 punt. Verifies the OC stop boundary code-path is wired and produces records shaped for the same eval-v2-on-gold regression harness.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* evals(gold): expand to 57 records + regression harness + relabel 3 drift (#138)

- evals/datasets/oc-stop-labeled-gold-redacted.jsonl: 27 redacted gold records from OC dataset, stratified by category, same redaction patterns as CC gold (emails, sk-ant-*, ghp_*, AKIA*, Bearer, /home/<user>/, JWT, github.com/<owner>/<repo>).
- evals/datasets/cc-stop-labeled-gold-redacted.jsonl: relabel 3 records found mislabeled during v2.3 audit:
  - CC#19 drift → waiting (ends with "Want me to update?" permission question)
  - OC#9 drift → complete (ends with "Fixes to consider:" suggestions list)
  - OC#17 drift → waiting (ends with "Could you share what alice said?")

  Original labels preserved in `gold_label_v1` field with `gold_label_audit` rationale per record. Combined gold dist: 28 complete, 14 waiting, 8 drift, 7 stuck. Drift count meets #138 acceptance (≥8 per measured category).
- evals/scripts/eval-v2-on-gold.mjs: regression harness that reclassifies gold records with the current v2.3 prompt, computes per-category accuracy + F1 + confusion matrix vs gold_label. Used to verify prompt edits did not regress complete recall (must stay 100%; false-positive drift is the worst failure mode, see PR #139 prod incident).

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

* chore(evals): ignore v2-classified output JSONL (#138)

cc-stop-classified-v2.jsonl + oc-stop-classified-v2.jsonl contain raw user-session text after re-classification with the v2.3 prompt, so they get the same privacy treatment as v1 classified files. Only the redacted gold subset is committed.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

---------

Co-authored-by: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
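The head+tail truncation described in the first squashed commit (keep 1800 characters from the start and 2400 from the end) can be sketched as follows. The two sizes come from the commit message; the marker string, the function name, and the rule that texts up to head + tail characters pass through unchanged are assumptions.

```javascript
const HEAD_CHARS = 1800; // opening of the turn, where a commitment usually appears
const TAIL_CHARS = 2400; // closing of the turn, where the delivered result usually appears
const MARKER = "\n[...truncated...]\n"; // hypothetical separator, not from the PR

// Keep the start and the end of a long text and drop the middle.
function truncateHeadTail(text, head = HEAD_CHARS, tail = TAIL_CHARS) {
  // Assumption: anything that fits in head + tail is passed through as-is.
  if (text.length <= head + tail) return text;
  // slice(length - tail) rather than slice(-tail) so that tail = 0 yields "".
  return text.slice(0, head) + MARKER + text.slice(text.length - tail);
}
```

Keeping the tail is the part that matters for the false-positive fix: a tail-less cap hides the closing delivery, so a long turn that ends with a result can look like unfinished narration.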

dzianisv and others added 9 commits May 25, 2026 22:15

docs(readme): add claude/ plugin entry under existing plugin list (#137)

21f8e21

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>

dzianisv merged commit 28d1aa0 into main May 26, 2026
1 of 2 checks passed

dzianisv deleted the own/137-cc-reflection branch May 26, 2026 01:32

dzianisv mentioned this pull request May 26, 2026

Port reflection-3 to Claude Code as Stop-hook plugin #137

Closed

5 tasks

dzianisv restored the own/137-cc-reflection branch June 20, 2026 09:23

dzianisv merged 9 commits into main from own/137-cc-reflection
