- Fleet shutdown hang — Fix Ctrl+C hanging on the worker fleet shutdown path, ensuring clean process termination.
- Merge queue worktree reset — Reset worktree state before every `prepare()` call, ensuring a clean starting point and preventing stale state from prior merge attempts from causing failures.
- Merge queue worktree locks — Use detached-HEAD checkouts in merge prepare so the operation survives worktree lock contention.
- Merge queue label poller storage — Route sidecar label poller storage through the existing bridge, fixing state persistence for label-driven merge triggers.
- Merge queue result bubbling — Merge results now bubble back to Linear issues; Accepted status is deferred until the merge actually lands, preventing premature status promotion.
- Merge queue acceptance handoff — Wire acceptance-to-queue handoff, tighten the merge contract, and remove dead merge template.
- Merge queue proxy resolution — Sidecar now resolves the proxy issue tracker for coordinator-routed deployments, fixing merge queue failures in proxied environments.
- Document profile-based provider/model configuration system.
- Codex QA hardening — Ship templates to Codex agents, fix duplicate and missing events in the event stream, and coalesce log output for cleaner session traces.
- Production-ready Codex support — Template merge for Codex provider, persistence directives, and app-server as default mode. Codex agents now receive properly merged workflow templates with tool permissions and partials.
- Steering-via-resume — Orchestrator now attempts to steer agents via session resume before falling back to backstop auto-commit, giving agents a chance to self-correct missing outputs.
- State recovery cross-issue guard — `checkRecovery` now guards against cross-issue state contamination, preventing stale recovery data from one issue affecting another.
- Backstop artifact exclusion — Backstop now uses path-based exclusion that correctly filters build artifacts and `.agent/` directories, and caps staged file count to prevent oversized auto-commits.
- Linear session forwarding — Use `providerSessionId` for Linear forwarding when available, fixing session routing for proxied provider sessions.
- Profile-based config system — Added provider/model/effort configuration bundles, allowing named profiles to set agent provider, model, and effort level in a single config switch.
- Codex exec stdin hang — Close stdin on codex exec spawn to prevent "Reading additional input" hang that blocked agent sessions.
- Codex exec early death detection — Added early process death detection for codex exec provider, preventing silent failures when the process exits unexpectedly.
- Codex exec stream hang — Prevent codex exec mode stream hang when process exits without producing stdout output.
- Updated Turborepo configuration.
- Scope completion enforcement — New `scope-completion-audit` partial forces agents to self-audit against the issue description before committing. Agents can no longer defer in-scope requirements to "follow-up" comments and emit `WORK_RESULT:passed`. The work-result marker now includes a scope attestation contract. Coordination templates add scope coverage verification ensuring sub-issue deliverables cover the full parent issue.
- Agent Memory Foundation (OSS Phase 1) — Session memory partial and persistence layer for agent context across retries.
- A2A Server Foundation (OSS Phase 1) — Agent-to-Agent protocol server scaffolding.
- Sub-agent model config now works — The `models.subAgent` config (e.g., `claude-sonnet-4-6`) was resolved but never passed to coordination sub-agents. The `task-lifecycle` partial now emits a mandatory `model` parameter on every Agent tool call. Full model IDs are mapped to Claude Code's required short aliases (`sonnet`/`opus`/`haiku`) via new `toAgentToolModelAlias()` utility.
- Push+PR instructions in fallback prompts — The legacy `generatePromptForWorkType()` fallback (used when template registry is unavailable) was missing push and PR creation instructions for `development` and `inflight` work types, causing the session backstop to fire on every session.
- Bare @mention prompt generation — Session-prompted handler now generates proper work prompts for bare @mentions without explicit commands.
- Added WorkflowRegistry and Transition Engine to architecture docs.
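
The alias mapping mentioned in the sub-agent model entry above (full model IDs reduced to `sonnet`/`opus`/`haiku`) might look roughly like the sketch below. The changelog does not show the real `toAgentToolModelAlias()` logic, so `toAliasSketch` and its substring-matching rule are assumptions made purely for illustration.

```typescript
// Hypothetical sketch: NOT the project's toAgentToolModelAlias() implementation.
type AgentToolModelAlias = "sonnet" | "opus" | "haiku";

// Assumed rule: the family name appears somewhere in the full model ID.
function toAliasSketch(modelId: string): AgentToolModelAlias | undefined {
  const families: AgentToolModelAlias[] = ["sonnet", "opus", "haiku"];
  const lower = modelId.toLowerCase();
  // Return the first family contained in the ID, or undefined if none match.
  return families.find((family) => lower.includes(family));
}

console.log(toAliasSketch("claude-sonnet-4-6")); // sonnet
console.log(toAliasSketch("some-other-model")); // undefined
```

Returning `undefined` for unrecognized IDs leaves the caller free to decide whether to omit the parameter or fail loudly.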
- Proxy client retry logic — `ProxyIssueTrackerClient` now retries transient network errors (`fetch failed`, `ECONNREFUSED`, `ETIMEDOUT`, etc.) and server 5xx/429 responses with exponential backoff (3 retries, 1s→2s→4s). Previously a single network blip during the ~4 minute quality baseline window would kill the entire agent session. The direct `LinearAgentClient` already had this resilience via `withRetry`; the proxy client now uses the same mechanism.
- Autonomous agent system prompt — Replaced the interactive system prompt with autonomous agent instructions for headless operation.
- Configurable git author identity and API-driven deploy support — Agents can now use custom git author name/email via configuration, and deploy operations can be triggered through the platform API.
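
The retry behaviour described for the proxy client above (3 retries at 1s→2s→4s on transient network errors and 5xx/429 responses) can be sketched as follows. This is a minimal illustration under stated assumptions, not the project's `withRetry` helper: `backoffDelays`, `isRetryable`, `retryWithBackoff`, and the injectable `sleep` parameter are hypothetical names introduced here.

```typescript
// Hypothetical sketch of retry with exponential backoff; names are illustrative.

// Delay schedule: base, base*2, base*4, ... with one entry per retry.
function backoffDelays(retries: number, baseMs: number): number[] {
  return Array.from({ length: retries }, (_, i) => baseMs * 2 ** i);
}

// Assumed classification of "transient" failures, based on the changelog text.
function isRetryable(err: unknown): boolean {
  if (typeof err === "object" && err !== null && "status" in err) {
    const status = Number((err as { status: unknown }).status);
    return status === 429 || (status >= 500 && status <= 599);
  }
  const message = err instanceof Error ? err.message : String(err);
  return /fetch failed|ECONNREFUSED|ETIMEDOUT/.test(message);
}

async function retryWithBackoff<T>(
  operation: () => Promise<T>,
  retries = 3,
  baseMs = 1000,
  // Injected so callers (and tests) can avoid real waiting.
  sleep: (ms: number) => Promise<void> = (ms) =>
    new Promise((resolve) => setTimeout(resolve, ms)),
): Promise<T> {
  const delays = backoffDelays(retries, baseMs);
  for (let attempt = 0; ; attempt++) {
    try {
      return await operation();
    } catch (err) {
      // Give up once retries are exhausted or the error is not transient.
      if (attempt >= delays.length || !isRetryable(err)) throw err;
      await sleep(delays[attempt]);
    }
  }
}

console.log(backoffDelays(3, 1000)); // [ 1000, 2000, 4000 ]
```

With the defaults, an operation that keeps failing is attempted four times in total (the initial call plus three retries) before the last error is rethrown.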
- Millisecond timestamps in session-storage — Session timestamps (`createdAt`, `updatedAt`, `claimedAt`, `queuedAt`) were stored in seconds (`Math.floor(Date.now() / 1000)`) while consumers (orphan cleanup, health probes, phase metrics) compared them against `Date.now()` in milliseconds. The 1000x magnitude mismatch made every session appear ~54 years old, causing orphan cleanup to systematically re-queue active sessions on every sweep. This produced repeated agent invocations and duplicate work on the same issue. All session-storage timestamps now use `Date.now()` (milliseconds) to match consumers.
- Worktree refresh hard-resets on `/clear` — The refresh-worktree hook now performs a hard reset when triggered by `/clear`, ensuring pristine state for new conversations.
- Worktree path resolution when orchestrator runs from a linked worktree — `gitRoot` in the orchestrator constructor used `findRepoRoot()` which matches `.git` files (worktrees) as well as `.git` directories (main repos). When `process.cwd()` is a linked worktree, `gitRoot` resolved to the worktree path instead of the main repo root, causing `resolveWorktreePath` to compute wrong sibling directories. `git worktree add` then ran with incorrect cwd, producing directories with only `.agent/artifacts.json` and no actual worktree. Agents fell back to the parent process's working directory, causing cross-session file contamination. Fixed by using `resolveMainRepoRoot()` first, which follows the `.git` file's `gitdir:` reference back to the main repo.
- Fleet output staircase formatting — Use explicit `\r\n` line endings in fleet worker output to prevent staircase formatting in terminals.
- Agent work loss prevention — Ban `git stash` in all agent templates (stashes are repo-scoped and leak across worktrees, causing index corruption). Backstop auto-commit now handles conflicted index state by running `git reset HEAD` before staging. Preserved worktree patch backup now captures untracked files via `git diff --no-index`. Stale stashes are cleared when creating worktrees.
- Duplicate session prevention — Workers now detect session ownership loss within 5 seconds and stop the duplicate agent. Session ownership transfers complete before updating the worker ID during re-registration, eliminating the race window where in-flight API calls used a mismatched worker ID.
- Fleet output formatting — Worker subprocess output is sanitized to strip `\r` (carriage returns from spinners) and ANSI cursor-position sequences before re-printing with the `[W##]` prefix, fixing garbled/misaligned log output.
- False coordination dispatch — Prevented coordination dispatch when an issue has no sub-issues.
- Monorepo warning noise — Silenced spurious monorepo detection warnings for single-repo projects.
- Credential leak in logs — Removed verbose spawn log that leaked env vars and credentials.
- LINEAR_API_KEY warning — Downgraded to info level for platform-delegated setups where the key is intentionally absent.
- Comprehensive documentation update across 8 files including file reservation tools, config gaps, and CLI commands.
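
The seconds-versus-milliseconds mismatch described in the session-storage timestamp entry above can be reproduced with a few lines of arithmetic. The sketch below is illustrative only: `ageMs`, `ORPHAN_THRESHOLD_MS`, the five-minute cutoff, and the fixed clock value are assumptions, not names or values taken from the codebase.

```typescript
// Illustrative only: helper names and the threshold are hypothetical.
const MS_PER_YEAR = 365.25 * 24 * 60 * 60 * 1000;
const ORPHAN_THRESHOLD_MS = 5 * 60 * 1000; // assumed 5-minute staleness cutoff

// Fixed "now" so the arithmetic is reproducible (2024-01-01T00:00:00Z).
const nowMs = Date.UTC(2024, 0, 1);

// Consumers compute age as (now in ms) minus (stored timestamp).
function ageMs(storedTimestamp: number, now: number): number {
  return now - storedTimestamp;
}

// Buggy write path: a session created 10 seconds ago, stored in seconds.
const createdAtSeconds = Math.floor((nowMs - 10_000) / 1000);
// Fixed write path: the same instant stored in milliseconds.
const createdAtMs = nowMs - 10_000;

const buggyAgeYears = ageMs(createdAtSeconds, nowMs) / MS_PER_YEAR;
const fixedAge = ageMs(createdAtMs, nowMs);

console.log(buggyAgeYears.toFixed(1)); // "53.9", i.e. the ~54 years noted above
console.log(ageMs(createdAtSeconds, nowMs) > ORPHAN_THRESHOLD_MS); // true: looks orphaned
console.log(fixedAge > ORPHAN_THRESHOLD_MS); // false: 10 s old, still active
```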
- Critical: Claude Code spawn failure (exit code 9) — The `ClaudeProvider.spawnClaudeCodeProcess` override was using `process.execPath` (Node.js binary) to spawn Claude Code with CLI-specific flags (`--output-format`, `--input-format`, etc.). Node.js does not recognize these flags and immediately exits with code 9 ("bad option"). This broke when upstream Claude Code transitioned from a Node.js package to a native binary — the SDK stopped including a JS entry point in the spawn args, leaving only CLI flags that Node.js cannot parse. Now uses `spawnOptions.command` provided by the SDK, which points to the SDK's bundled platform-specific Claude binary (`claude-agent-sdk-darwin-arm64/claude`). All agents on affected systems were failing instantly on spawn.
- Diagnostic stderr capture — `spawnClaudeCodeProcess` now captures stderr and logs it alongside non-zero exit codes for faster future debugging.
(Superseded by v0.8.41 — incomplete fix for the spawn failure)
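
The stderr-capture entry above pairs a child process's exit code with whatever it wrote to stderr. A minimal sketch using Node's `child_process.spawnSync` follows. `runAndCapture` and `describeFailure` are hypothetical helpers, and the synchronous call is a simplification: the real `spawnClaudeCodeProcess` presumably spawns asynchronously.

```typescript
import { spawnSync } from "node:child_process";

interface SpawnOutcome {
  code: number | null; // null when the process was killed by a signal or failed to start
  stderr: string;
}

// Run a command to completion, collecting stderr as text.
function runAndCapture(command: string, args: string[]): SpawnOutcome {
  const result = spawnSync(command, args, { encoding: "utf8" });
  return { code: result.status, stderr: result.stderr ?? "" };
}

// Build a log line only for failures, so successful runs stay quiet.
function describeFailure(outcome: SpawnOutcome): string | undefined {
  if (outcome.code === 0) return undefined;
  return `process exited with code ${outcome.code}: ${outcome.stderr.trim()}`;
}

// Demo: a child Node process that writes to stderr and exits non-zero.
const outcome = runAndCapture(process.execPath, [
  "-e",
  "process.stderr.write('bad option'); process.exit(9)",
]);
console.log(describeFailure(outcome));
```

Logging the exit code and stderr together is what makes a failure like "bad option" diagnosable from the log line alone.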
- Parallel merge queue — New `MergePool` replaces the single-instance `MergeWorker` when `mergeQueue.concurrency > 1`. Uses a `ConflictGraph` (greedy graph coloring) to find independent sets of non-conflicting PRs that can rebase and test concurrently. Includes `buildFileManifest()` for per-PR file change lists, `peekAll()`/`dequeueBatch()` storage methods, and falls back to the original sequential worker when concurrency is 1.
- Conflict predictor and main tracker — `ConflictPredictor` checks open PRs for file overlap before spawning development agents, injecting a `{{conflictWarning}}` into the template context. `MainTracker` monitors `origin/main` for new commits and identifies which active agents have overlapping file changes.
- Stale branches from icebox phase — Research, backlog-creation, refinement, and security work types now use `git worktree add --detach` instead of creating named branches. Previously, branches created during icebox (days before development) persisted and were reused by development without rebasing, causing agents to work on stale code. Branches for non-code work types are also deleted on cleanup.
- Stale branch reset on development start — When `createWorktree()` encounters an existing branch with zero unique commits (leftover from a prior phase), it now resets to current `origin/main` instead of using the stale base.
- QA hard-fail on merge conflicts unconditionally — Removed the `{{#if mergeQueueEnabled}}` exemption that let QA pass conflicting PRs as "informational only." Merge conflicts now always fail QA, preventing wasted acceptance sessions on PRs that will predictably fail at merge time.
- File reservation instructions for agents — `af_code_reserve_files` tool (and CLI equivalent) now documented in the `code-intelligence-instructions` partial. Previously the tool existed in the plugin but no template mentioned it.
- Session file reservation cleanup — Added `releaseAllSessionFiles()` to the `FileReservationDelegate` interface and proxy adapter. Orchestrator now releases all file reservations on agent completion, preventing stale reservations from blocking other agents.
- Pre-development rebase instruction — Development template now includes a mandatory `git fetch origin main && git rebase origin/main` step before agents begin writing code.
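
The parallel merge queue entry above mentions greedy graph coloring over PR file overlaps. As a rough illustration of that general technique (not the project's `ConflictGraph` code), the sketch below places each PR in the lowest-numbered batch that contains no PR touching the same files. `FileManifest`, `overlaps`, and `groupNonConflicting` are hypothetical names.

```typescript
// Hypothetical sketch of greedy coloring; NOT the project's ConflictGraph.
type FileManifest = Map<string, string[]>; // PR id -> changed file paths

// Two PRs conflict (share a graph edge) when their file lists intersect.
function overlaps(a: string[], b: string[]): boolean {
  const seen = new Set(a);
  return b.some((file) => seen.has(file));
}

// Greedy coloring: each PR gets the lowest batch index not used by a conflicting PR.
function groupNonConflicting(manifest: FileManifest): string[][] {
  const batchOf = new Map<string, number>();
  Array.from(manifest.keys()).forEach((id) => {
    const taken = new Set<number>();
    batchOf.forEach((batch, other) => {
      if (overlaps(manifest.get(id) ?? [], manifest.get(other) ?? [])) {
        taken.add(batch);
      }
    });
    let batch = 0;
    while (taken.has(batch)) batch++;
    batchOf.set(id, batch);
  });

  // Collect PR ids per batch; PRs within one batch share no files.
  const batches: string[][] = [];
  batchOf.forEach((batch, id) => {
    if (!batches[batch]) batches[batch] = [];
    batches[batch].push(id);
  });
  return batches;
}

const manifest: FileManifest = new Map([
  ["A", ["x.ts"]],
  ["B", ["x.ts", "y.ts"]],
  ["C", ["z.ts"]],
  ["D", ["y.ts"]],
]);
console.log(JSON.stringify(groupNonConflicting(manifest))); // [["A","C","D"],["B"]]
```

Greedy coloring does not guarantee the minimum number of batches, but it is cheap and every batch it produces is conflict-free.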
- Fine-grained model selection — New 9-level resolution cascade for controlling which model each agent uses: platform dispatch → issue label (`model:<id>`) → config `models.byWorkType` → config `models.byProject` → env `AGENT_MODEL_{WORKTYPE}` → env `AGENT_MODEL_{PROJECT}` → config `models.default` → env `AGENT_MODEL` → provider default. Sub-agent model controlled separately via `QueuedWork.subAgentModel`, config `models.subAgent`, or `AGENT_SUB_MODEL`. Claude provider now passes the resolved model to the SDK `query()` options (previously silently ignored). Template context exposes `{{model}}` and `{{subAgentModel}}` for coordinator sub-agent control.
- Capability-driven provider dispatch — `AgentProviderCapabilities` expanded with `supportsToolPlugins`, `needsBaseInstructions`, `needsPermissionConfig`, `supportsCodeIntelligenceEnforcement`, and `toolPermissionFormat`. Orchestrator now uses capability flags instead of hardcoded provider name checks, enabling new providers to opt into features declaratively.
- Shared safety rules module — Destructive command deny patterns (rm root, worktree management, hard reset, force push, branch switching) extracted from `claude-provider.ts` and `codex-approval-bridge.ts` into `safety-rules.ts`. Single source of truth — adding a pattern automatically enforces it across all providers. Includes `buildSafetyInstructions()` for providers that need natural-language safety rules.
- Remove TUI package — Deleted the Go-based `packages/tui` terminal UI and associated `af-status` CLI command. Functionality superseded by the dashboard.
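
The nine-level model resolution cascade listed above amounts to a "first defined value wins" lookup. The sketch below illustrates only that idea: the `ModelSources` shape and `resolveModel` function are hypothetical, and reading each source (labels, config, environment variables) is assumed to happen elsewhere.

```typescript
// Hypothetical shape: one optional candidate per cascade level.
interface ModelSources {
  platformDispatch?: string;
  issueLabel?: string; // from a `model:<id>` label
  configByWorkType?: string; // models.byWorkType
  configByProject?: string; // models.byProject
  envByWorkType?: string; // AGENT_MODEL_{WORKTYPE}
  envByProject?: string; // AGENT_MODEL_{PROJECT}
  configDefault?: string; // models.default
  envDefault?: string; // AGENT_MODEL
}

// Returns the first non-empty candidate in priority order.
// `undefined` stands for the ninth level: fall through to the provider default.
function resolveModel(sources: ModelSources): string | undefined {
  const ordered = [
    sources.platformDispatch,
    sources.issueLabel,
    sources.configByWorkType,
    sources.configByProject,
    sources.envByWorkType,
    sources.envByProject,
    sources.configDefault,
    sources.envDefault,
  ];
  return ordered.find((candidate) => candidate !== undefined && candidate !== "");
}

// An issue label outranks the config default.
console.log(resolveModel({ configDefault: "model-a", issueLabel: "model-b" })); // model-b
```

Keeping the order in a single array makes the precedence easy to audit against the documented cascade.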
- HTTP proxy adapter for file reservation — Platform-managed workers without direct Redis access can now use file reservation through the platform API proxy. Decision tree: Redis (OSS) → platform API proxy (SaaS) → disabled. Adds `createProxyFileReservationDelegate()` in core, with worker-runner fallback and stdio-server-entry env var reconstruction.
- File reservation system for parallel agents — Per-file mutex coordination (Redis SET NX) prevents merge conflicts when multiple agents work in separate git worktrees. Includes per-session file index, TTL-based expiration with refresh for crash recovery, and bulk release on session end. Exposed to agents via `af_code_reserve_files`/`af_code_check_reservation` tools through the code-intelligence plugin. The merge worker clears reservations after successful merge. Server, orchestrator, CLI worker, and code-intelligence plugin all wired up.
- Add autonomous operation instructions to coordination templates — `acceptance-coordination`, `qa-coordination`, and `refinement-coordination` templates were missing the `AUTONOMOUS OPERATION` prompt block, allowing agents to output conversational questions instead of making headless decisions. Aligns with the pattern already in `coordination` and `inflight-coordination` templates.
- Add list-issues, check-blocked, list-unblocked-backlog to proxy mode — These three Linear CLI commands were unsupported in proxy mode, causing failures when agents operate without a direct `LINEAR_API_KEY`. Implements them using existing proxy client methods with client-side filtering.
- Force code-intelligence tool adoption via `canUseTool` interception — When `codeIntelligence.enforceUsage` is enabled in `.agentfactory/config.yaml`, the Claude provider denies Grep/Glob calls with a redirect message pointing agents to `af_code_*` tools. After the agent uses any code-intelligence tool, Grep/Glob are unlocked as fallback (configurable via `fallbackAfterAttempt`). Replaced the stateless `autonomousCanUseTool` const with a per-session `createAutonomousCanUseTool()` factory that tracks which code-intelligence categories have been attempted.
- Code intelligence adoption telemetry — The orchestrator now counts `af_code_*` vs Grep/Glob tool calls per session and logs the ratio at session end when the code-intelligence plugin is registered.
- Worktree creation script for isolated Claude sessions — Added `scripts/create-worktree.sh` to bootstrap interactive Claude sessions into isolated git worktrees at `../agentfactory.wt/<name>`. Updated `.vscode/agentfactory.code-workspace` to use worktrees for all Claude terminals (expanded from 2 to 4 sessions).
- Propagate `AGENTFACTORY_API_URL` to agent environment — The orchestrator now sets `AGENTFACTORY_API_URL` from `apiActivityConfig.baseUrl` in both the primary spawn and resume paths. Without this, agents in proxy mode had `WORKER_AUTH_TOKEN` but no API URL, causing the af-linear CLI and Linear tool plugin to fall back to direct `LinearAgentClient` which fails with 401.
- Linear plugin creates tools in proxy mode — `createTools()` now checks for `AGENTFACTORY_API_URL`/`WORKER_API_URL` and `WORKER_AUTH_TOKEN`/`WORKER_API_KEY` when `LINEAR_API_KEY` is absent. Fleet agents in proxy mode get Linear tools instead of zero, using the same `ProxyIssueTrackerClient` path that `runLinear()` already supports.
- Framework-neutral worktree dependency bootstrap — Extracted shared package-manager constants (`LOCK_FILES`, `INSTALL_COMMANDS`, `ADD_COMMANDS`) into `packages/core/src/package-manager.ts`. Worktrees now fetch `origin/main` before branching and bootstrap lockfile + `package.json` from the latest remote via `git show`. `syncDependencies`, `installDependencies`, `linkDependencies`, and `writeWorktreeHelpers` are now framework-neutral, using `packageManager` config instead of hardcoding pnpm. Added behind-drift detection (worktree vs `origin/main`). New `af-add-dep` CLI command for safe dependency addition in worktrees.
- Propagate MCP tool servers to sub-agents via stdio transport — The Claude provider used in-process MCP servers (`McpSdkServerConfigWithInstance`) which only work for the top-level agent — sub-agents spawned via the Agent tool couldn't see them. Switched all providers to use `createStdioServerConfigs()` uniformly. Stdio configs are serializable and propagate through the SDK's transport layer, making `af_code_*` tools available to sub-agents for the first time.
- Gate code-intelligence prompt instructions on plugin availability — The `code-intelligence-instructions` partial now renders only when the `af-code-intelligence` plugin is registered in the tool registry. Added `hasCodeIntelligence` to `TemplateContext`, set by checking `toolRegistry.getPlugins()` at prompt construction time. Previously the instructions rendered unconditionally, either describing MCP tools or CLI commands that might not exist.
- Auto-emit structured context from tool results — The orchestrator's event loop now emits structured context entries after successful tool completions via `ApiActivityEmitter.emitContext()`. Covers file reads (`currentFile`), writes/edits (`lastEditedFile`), searches (`lastSearch`), git commands (`lastGitOp`), directory changes (`workingDirectory`), and test runs (`lastTestRun`). Dashboards and TUIs can consume these key-value entries without parsing raw tool call content.
- Context fields on activity emissions — `ApiActivityEmitter` now supports optional `contextKey`/`contextValue` on the activity POST body. New `emitContext(key, value)` method for standalone context entries. `emitToolUse()` and `emitResponse()` accept an optional `context` parameter. Context-type activities are stored server-side but not forwarded to Linear.
- Coordination WORK_RESULT marker in legacy prompts — The `coordination` and `inflight-coordination` work types in `defaultGeneratePrompt()` were missing `WORK_RESULT_MARKER_INSTRUCTION`, causing agents to exit without the structured marker and blocking status promotion. The YAML templates already had it; the legacy prompt generator did not.
- Backstop test alignment with git config identity — The `auto-commits uncommitted changes` test was missing mock return values for the two `git config` calls added in v0.8.27, causing the mock sequence to shift and the PR creation assertion to fail.
- Fix worktree setup for monorepo projects — `findRepoRoot()` now accepts worktree `.git` files (not just directories). Added `resolveMainRepoRoot()` to follow worktree gitdir references back to the main repo. `linkDependencies()`, `syncDependencies()`, `loadAppEnvFiles()`, and `loadSettingsEnv()` now use the main repo root for `node_modules/`, `.env.local`, and `settings.local.json` that only exist in the main repo, not in worktrees.
- `update-sub-issue` accepts `--status` as alias for `--state` — Agents passing `--status Finished` no longer silently fail. Both the direct runner and proxy runner accept either flag. Also throws a usage error when neither `--state`/`--status` nor `--comment` is provided.
- Backstop gates push and PR creation on commits existing — The session backstop no longer pushes empty branches or creates PRs for branches with no commits ahead of main. `branch_pushed` checks `commitsPresent` before pushing, and `pr_url` checks both `commitsPresent` and `branchPushed` before creating a PR.
- Cleanup script preserves merge queue directories — `af-cleanup` no longer deletes `.patches` and `__merge-worker__` directories from the worktree root. These are merge queue infrastructure, not orphaned agent worktrees.
- `issueLabels` and `teamMembers` proxy methods — Added two new methods to the issue tracker proxy chain (`LinearAgentClient` → `ProxyIssueTrackerClient` → proxy handler). This closes the "Needs Human" label and assignee resolution gaps in `create-blocker` proxy mode. The proxy runner's `create-issue`, `update-issue`, and `create-blocker` commands now fully resolve label names → IDs and team member names/emails → assignee IDs via the proxy.
- ProxyIssueTrackerAdapter for remote API workers — When `LINEAR_API_KEY` is not set but `AGENTFACTORY_API_URL` is available, workers now use a `ProxyIssueTrackerAdapter` that routes issue tracker operations through the centralized API proxy instead of falling back to `NullIssueTrackerClient`. This enables work types that need to create/query issues (backlog-creation, coordination, qa-coordination) to function without direct Linear credentials.
- af-linear CLI proxy mode — The `af-linear` CLI now supports proxy mode when `LINEAR_API_KEY` is absent but `AGENTFACTORY_API_URL` and `WORKER_AUTH_TOKEN` are set. Supports core commands: `get-issue`, `create-issue`, `update-issue`, `create-comment`, `list-sub-issues`, `create-blocker`, and more. Name-to-ID resolution (team, project, state) is handled via proxy calls.
- WORKER_AUTH_TOKEN passed to spawned agents — The orchestrator now sets `WORKER_AUTH_TOKEN` in the agent environment from `apiActivityConfig.apiKey`, enabling spawned agents to authenticate to the proxy via the `af-linear` CLI.
- Optional LINEAR_API_KEY for workers — Workers can now start without `LINEAR_API_KEY`. When absent, a `NullIssueTrackerClient` provides stub data so the orchestrator functions normally, and all Linear operations are delegated to the platform API via the `ApiActivityEmitter`. Workers only need `WORKER_API_KEY` and `WORKER_API_URL` to operate in platform-delegated mode.
- Guard event.message type before calling substring — Prevent crash when `event.message` is not a string.
- Codex recovery parity — Graceful shutdown, orphan cleanup, and backstop safety improvements for Codex provider sessions.
- Codex app-server schema alignment (v0.117.0) — Fixed field name mismatches across config/batchWrite, thread/start, turn/start, turn/steer, and approval policy values. Codex agents now operate correctly in fleet mode.
- Codex approval bridge — Handle approval requests as JSON-RPC server requests (not notifications). Auto-approve safe commands, decline destructive patterns via the same safety rules as Claude.
- Codex sandbox networking — Enable `networkAccess: true` for workspace-write sandbox so agents can use `gh`, `curl`, `pnpm install`, etc.
- Codex session lifecycle — Autonomous agents emit result on turn/completed and end cleanly. Accumulated assistant text populates the result message for completion comments.
- Codex reasoning observability — Buffer and log reasoning events (`item/reasoning/textDelta`) for fleet logs. Persist reasoning to Linear sessions as non-ephemeral thoughts.
- Linear grouped label support — Reconstruct `group:value` format for Linear grouped labels (e.g., `provider:codex`). Enables per-issue provider routing via label dropdowns.
- Codex model selection & cost tracking tests — Comprehensive unit tests for App Server provider spawn, event mapping, and approval bridge.
- Cross-provider recovery — Track provider name in worktree state. Clear stale session ID when provider changes between recovery attempts (e.g., Claude → Codex).
- Provider-agnostic stale session detection — Match both Claude and Codex error patterns for resume failures.
- ANSI escape code stripping — Strip terminal color codes from Codex shell command output in event stream.
- Agent message delta field — Read `params.delta` (not `params.text`) for Codex agentMessage streaming events.
- Republish `@renseiai/agentfactory-code-intelligence` — Tarball for v0.8.20 was missing on npm (ghost publish). Bumped all packages to v0.8.21 to work around npm's 24-hour republish restriction.
- Local merge queue — New `provider: 'local'` merge queue adapter that serializes merges through a Redis-backed worker without requiring GitHub's paid merge queue feature. Default for all repos with `mergeQueue.enabled: true`.
- Merge worker fleet sidecar — `af-worker-fleet` automatically starts a merge worker sidecar that polls the Redis queue and processes PRs (rebase → mergiraf → lock regen → test → merge). One per fleet, enforced by Redis lock.
- QA no longer bypassed by merge queue — Governor always routes `Finished → trigger-qa` regardless of merge queue config. QA validates functional correctness; merge worker handles git mechanics at merge time.
- Merge-queue-aware QA/acceptance templates — QA templates skip the merge conflict hard-fail rule when merge queue is enabled (conflicts handled by the merge worker). Acceptance templates label PRs `approved-for-merge` instead of merging directly.
- `mergeQueueEnabled` template variable — New Handlebars variable injected by the orchestrator, enabling templates to conditionally adjust behavior based on merge queue availability.
- Codex instructions & permissions via App Server — Codex agents receive system prompts and tool permissions through the App Server protocol (SUP-1734).
- Codex MCP tool integration via App Server — MCP tools forwarded to Codex agents through App Server events (SUP-1733).
- Codex message injection via App Server — Mid-session message injection for Codex agents via App Server turn/start (SUP-1732).
- Quality gates — Baseline-diff, quality ratchet, TDD workflow, and boy scout rule prevent agents from degrading codebase health.
- Stale session recovery — Fall back to fresh spawn when resume hits stale session instead of blocking indefinitely.
- Coordination work result heuristics — Broadened result parsing patterns and added missing result-sensitive work types.
- Stale providerSessionId on work type change — Clear stale session ID during recovery when work type changes mid-lifecycle.
- Decision Engine → Workflow Engine migration — Full rename across UI, API routes, database collections, and types (SUP-1756).
- Codex App Server provider core — New provider implementation for Codex via App Server JSON-RPC (SUP-1731).
- Pagination and time-range filtering — Routing decisions endpoint now supports cursor pagination and date range filters.
- Phase-level cost aggregation endpoint — New API endpoint for aggregating costs by workflow phase.
- QA/acceptance agent crash leaves issues stuck in Started — Auto-transition now handles `agent.status === 'failed'` for result-sensitive work types (qa, acceptance, coordination), transitioning to the fail status (Rejected) instead of silently stalling.
- Stale providerSessionId blocks recovery indefinitely — When resume fails with "No conversation found", the stale session ID is cleared from state so the next recovery attempt starts fresh. Guards against a race condition where a late init event re-persists the stale ID.
- False positives in validate-cross-deps tool — Eliminated spurious dependency violation warnings.
- Agent forced foreground execution — Agents now run in foreground mode to prevent premature detach.
- Pre-push validation gate — Development and coordination agents now run typecheck, build, and test before committing. Prevents pushing code that fails CI and wasting QA cycles.
- Pre-push rebase step — Agents rebase onto latest main before pushing to prevent merge conflict waste cycles where QA passes but acceptance fails on conflicts.
- QA hard-fail on merge conflicts — QA agents must check PR mergeability and fail immediately if conflicts exist, rather than passing with a caveat that predictably wastes an acceptance session.
- Scope fencing in backlog writer — Sub-issue descriptions now include explicit "DO NOT modify" constraints, cross-package dependency awareness, and exhaustive type coverage requirements to prevent out-of-scope breaking changes.
- Blast radius analysis in research — Research agents now trace type/union change impact, cross-package imports, API endpoint inventory, and existing patterns before writing issue descriptions.
- Integration validation in coordination — Coordinators run typecheck/build/test after all sub-agents finish but before creating the PR, catching integration failures between combined changes.
- Code intelligence: type usage finder — New `af_code_find_type_usages` tool (MCP + CLI) finds all switch/case statements, mapping objects, and usage sites for a union type. Prevents missed exhaustive checks when adding type members.
- Code intelligence: cross-dep validator — New `af_code_validate_cross_deps` tool (MCP + CLI) checks that cross-package imports have corresponding package.json dependency declarations.
- GitHub Rulesets support — Merge queue adapter detects merge queue via Branch Protection Rulesets (modern API) before falling back to legacy branch protection rules.
- Emit structured security scan events — New `agent.security-scan` event emitted from security station with severity breakdown, consumed via `GET /api/factory/events`.
- Tool category classification — Tool summary payloads include `toolCategory` field (security, testing, build, deploy, research, general) for dashboard section auto-population.
- Auto-QA/acceptance race condition — `markAgentWorked` now called when session transitions to `running` (not just `completed`), fixing a race where the orchestrator transitioned issue status before the worker recorded the tracking key, causing webhook handlers to skip QA/acceptance with `not_agent_worked`.
- `af-linear update-issue --parentId` — The `--parentId` flag is no longer silently ignored.
- Security work type exhaustive coverage — Added missing `case 'security':` to all switch statements across prompts.ts, orchestrator.ts, a2a-server.ts, and server types.
- Backstop auto-commit for code-producing agents — The session backstop now auto-commits uncommitted changes when agents exit without committing (model behavioral regression causing agents to post summary comments then exit before `git commit`). Scoped to code-producing work types only (development, inflight, coordination, inflight-coordination). Non-code work types (QA, refinement, research, acceptance, backlog-creation) are unaffected. The auto-commit enables the existing push + PR recovery to succeed end-to-end.
- QA failure loop prevention — QA failures now route to Rejected (not Backlog), leveraging the existing refinement handler and cycle escalation infrastructure instead of creating unescalatable dev→QA loops.
- Force-push on feature branches — Agents can now use `git push --force-with-lease` on feature branches when commit history has been rewritten (e.g., splitting commits per QA feedback). Bare `--force` and force-push to main/master remain blocked.
- Backstop diverged history recovery — The session backstop now detects non-fast-forward push failures and retries with `--force-with-lease` on feature branches, recovering from commit rewrites that previously left branches stuck.
- False "work not persisted" warnings — Non-code-producing work types (research, backlog-creation, QA, refinement, etc.) no longer trigger spurious "Agent completed but work was not persisted" warnings from bootstrapped `.agent/` files in their worktrees.
- Stale worktree directory cleanup — `removeWorktree()` now cleans up leftover directory shells containing only `.agent/` after git worktree removal.
- Preserved worktree deadlock — Preserved QA worktrees no longer permanently block branch reuse. Incomplete work is saved as a patch file before cleanup, preventing full work stoppages when development agents need the same branch.
- Session exit gate — Deterministic post-session validation with typed completion contracts per work type, post-session backstop that auto-pushes branches and auto-creates PRs when agents forget, and provider capability flags (`supportsMessageInjection`, `supportsSessionResume`) for future mid-session steering.
- Agent bug backlog — Refinement agents can now create improvement issues against the agent system itself when failures are caused by missing prompt instructions, tool gaps, or template deficiencies.
- Auto-trigger env vars — `ENABLE_AUTO_QA` and `ENABLE_AUTO_ACCEPTANCE` env vars now correctly wire up to governor auto-trigger configuration.
- Worktree preservation — Prevent worktree cleanup from destroying preserved work when `.agent/preserved.json` marker exists; fix `checkForIncompleteWork()` to check `git ls-remote` output length instead of relying on exception handling.
- In-process MCP tool permissions — Auto-allow in-process MCP tools (e.g., `af_linear_*`, `af_code_*`) for autonomous agents so they don't require interactive permission approval.
- Code intelligence CLI (`af-code`) — New CLI exposing code-intelligence tools (`search-symbols`, `get-repo-map`, `search-code`, `check-duplicate`) for Task sub-agents and non-MCP contexts. Coordinator sub-agents can now use `pnpm af-code` via Bash to explore codebases with BM25 search, PageRank repo maps, and duplicate detection.
- Sequential merge queue (The Refinery) — `MergeQueue` module with Redis sorted-set storage, pluggable merge strategies (rebase/merge/squash), mergiraf conflict resolution, lock-file regeneration, and CLI (`af merge-queue status/retry/skip`). Single merge worker processes completed PRs sequentially against latest main (SUP-1545).
- Provider plugin type system — `ProviderPlugin` interface with typed capabilities, trigger/action definitions, config schemas, and `NodeTypeRegistry` for plugin metadata storage (SUP-1511, SUP-1512).
- Workflow definition v2 schema — Triggers, providers, nodes, and cross-validation with a complete expression evaluator supporting dotted paths, operators, helpers, and templates (SUP-1513, SUP-1514).
- Template CLI fallback instructions — `code-intelligence-instructions` partial now provides CLI usage guidance when `useToolPlugins` is false, instead of rendering empty.
- WorkflowTriggerDefinition rename — Resolve export collision between workflow and provider plugin trigger types.
- NodeTypeRegistry alignment — Fix registry to use canonical SUP-1511 provider plugin interfaces.
- Code intelligence template integration — New `code-intelligence-instructions` partial added to all 16 workflow templates. Agents now receive guidance to use `af_code_*` tools (repo map, symbol search, code search, duplicate check) when exploring codebases.
- Graceful degradation for code-intelligence — `@renseiai/agentfactory-code-intelligence` is now an optional dependency of the CLI. If not installed, the orchestrator/worker start normally without `af_code_*` tools.
- Documentation updates — Added code-intelligence package to project structure tables and package listings across CLAUDE.md, CONTRIBUTING.md, README.md, SECURITY.md, and architecture docs.
- Mergiraf setup CLI — `af-setup mergiraf` configures the AST-aware merge driver for syntax-aware conflict resolution across agent worktrees (SUP-1544).
- Mergiraf setup guide — Documentation for installation, configuration, and AgentFactory integration.
- Suppress dotenv v17 log spam — Add `quiet: true` to all CLI entry points to silence ads/tips polluting agent-fleet output and log analysis.
- Mergiraf docs corrections — Fix incorrect CLI command names, remove non-existent `mergiraf register` subcommand, fix disable mechanism, remove unsupported `.cjs` extension, add missing `.mjs`.
- Add `af-setup` root script — Expose setup CLI via `pnpm af-setup` for local development.
- Worktree sibling directory layout — Worktrees are now created in a sibling directory (`../{repoName}.wt/{branch}`) instead of inside the repo (`.worktrees/`), preventing VSCode/Cursor filesystem watcher crashes (SUP-1543).
- Worktree migration CLI — `pnpm af-migrate-worktrees` moves existing worktrees from `.worktrees/` to the new sibling layout and updates git references.
- Workflow parallelism — Fan-out, fan-in, and race strategies for parallel workflow execution (SUP-1231).
- Fleet quotas — Kueue-inspired per-project budgets with concurrent session limits, daily cost caps, and cohort-based capacity borrowing/lending (SUP-1235).
- Self-learning routing — Thompson Sampling multi-armed bandit for provider selection with Redis-backed posterior store (SUP-1236).
- Code intelligence v1.1 — Dense vector embeddings with CCS hybrid fusion for improved code search (SUP-1241).
- FOSSA integration — SBOM generation for dependency tracking (SUP-1243).
- Workflow gates — Signal, timer, and webhook gate types with timeout support (SUP-1229).
- Webhook gate endpoint — HTTP handler for external gate signals with token persistence (SUP-1296).
- Eliminate worker claim races — Replace peek-then-claim model with ZPOPMIN-based atomic pop-and-claim. The poll handler now assigns work server-side, eliminating thundering herd contention across workers.
- Add poll jitter — Workers desynchronize with 0–40% random jitter on poll intervals, reducing simultaneous queue access.
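The poll jitter entry above can be illustrated with a short helper. This is a minimal sketch under stated assumptions: `jitteredInterval` is a hypothetical name, and the jitter is assumed to be additive (the interval is only ever lengthened, by 0 to 40%); the changelog does not say whether the real implementation stretches, shrinks, or centers the interval.

```typescript
// Hypothetical helper: stretch a base poll interval by 0–40% random jitter.
// Assumption: jitter is additive, so the result is never shorter than baseMs.
function jitteredInterval(
  baseMs: number,
  maxJitter: number = 0.4,
  rng: () => number = Math.random, // injectable for deterministic tests
): number {
  // rng() is in [0, 1), so the multiplier is in [1, 1 + maxJitter).
  return Math.round(baseMs * (1 + rng() * maxJitter));
}
```

Because each worker draws its own random factor, workers that started at the same moment drift apart after a few polls instead of hitting the queue together.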
- Prevent coordinator early-exit — Coordinators no longer exit prematurely when sub-agents are still running (SUP-1544).
- Sub-issue race prevention — Filter sub-issues from backlog queries and guard against independent pickup by workers (SUP-1544).
- Fix routing-observation-store test types — Resolve pre-existing TypeScript errors in mock return types.
- popAndClaimWork — 5 tests covering atomic pop, empty queue, missing items, and error handling.
- isChildIssue — 2 tests for platform adapter delegation.
- Early-exit detection — Tests for new coordinator early-exit patterns.
- Anthropic SDK license review — Marked as approved (SUP-1227).
- Network API for remote workflow deployment — `POST /api/workflows/deploy` endpoint with Redis-backed WorkflowStore, hot-reload via Redis pub/sub, and YAML/JSON content support (SUP-1492).
- Inline `af status` command — Quick fleet status checks from the CLI (SUP-1240).
- Context window management — Structured summarization with artifact tracking to keep agents within context limits (SUP-1242).
- Expression evaluator and conditional routing — Workflow branching with runtime expression evaluation (SUP-1228).
- K8s-inspired filter/score scheduling pipeline — Scheduler uses Kubernetes-style filter and score phases for issue dispatch (SUP-1234).
- Merge queue support — Provider-agnostic merge queue adapter interface, GitHub native implementation, merge work type with status mappings and template (SUP-1257, SUP-1259, SUP-1261, SUP-1263).
- Mergiraf as default git merge driver — Automatic semantic merge conflict resolution in agent worktrees (SUP-1254).
- Agent inbox with Valkey Streams — Stream-based message inbox for inter-agent communication (SUP-1232).
- Command palette with fuzzy search — Dashboard command palette with MCP tool actions and key triggers (SUP-1239).
- Stuck agent NUDGE action — Inject-message delivery for agents that appear stalled (SUP-1233).
- Agent detail view with live activity streaming — Dashboard view for monitoring individual agent sessions (SUP-1238).
- Comprehensive test infrastructure for create-app — Test suite for the project scaffolding CLI (SUP-1245).
- Prevent code-producing agents from promoting without a PR — Agents that push commits but exit before creating a PR are now caught: the orchestrator checks for branches ahead of main with no associated PR and blocks promotion with a diagnostic comment.
- Scan all event types for PR URLs — PR URL detection now covers `assistant_text` and `result` events in addition to `tool_result`, preventing missed PRs when the URL appears outside tool output.
- Post-exit PR detection fallback — After agent exit, the orchestrator runs `gh pr list --head <branch>` to catch PRs that were created but whose output wasn't captured during the session.
- Add task-lifecycle partial — Prevents coordinators from exiting early (SUP-1238).
- Resolve QA failures — Typecheck, adapter wiring, and test fixes (SUP-1237).
- Add merge work type to template schema — Template schema and prompt generator updated for merge workflows.
- Anthropic SDK license review — Marked as approved (SUP-1227).
- Add code-intelligence package — New `@renseiai/agentfactory-code-intelligence` package with regex-based symbol extraction (TypeScript, Python, Go, Rust), BM25 code search, incremental Merkle-tree indexing, PageRank repo maps, and xxHash64/SimHash memory deduplication. Registers four MCP tools for Claude agents.
- Add inflight-coordination work type — Parent issues already in Started status now receive an `inflight-coordination` workflow instead of being skipped, allowing the orchestrator to manage sub-agent dispatch mid-flight.
- Add cleanup CLI with branch pruning — `pnpm af-cleanup` now supports `--skip-worktrees` and `--skip-branches` flags, with merged/gone branch detection and IDE safety checks.
- Fix release workflow for renamed plugin-linear package — Release CI referenced the old `@renseiai/agentfactory-linear` name in 4 places; updated to `@renseiai/plugin-linear`.
- Add code-intelligence to release pipeline — `@renseiai/agentfactory-code-intelligence` was missing from the release workflow's version bump and publish steps.
- Prevent code-producing agents from completing without committing — Agents that produce code changes but skip the commit step are now caught before marking work as complete.
- Re-validate coordination upgrade when workType is provided — `spawnAgentForIssue` now rechecks whether an issue should be upgraded to coordination even when an explicit work type is passed.
- Remove plugin-linear compile-time dependency on core — `@renseiai/plugin-linear` no longer imports from `@renseiai/agentfactory` at build time, fixing circular dependency issues.
- Gitignore .agentfactory/ directory — Project-local `.agentfactory/` config and templates are now excluded from version control.
- Add "QA Coordination Complete" heuristic to work result parser — Agents that output "QA Coordination Complete" or "QA Complete" as inline text (without a structured `WORK_RESULT` marker) now correctly resolve as "passed", preventing issues from stalling in Finished status.
- Fix infinite session dispatch loop on completed coordination issues — When a qa-coordination agent completed without a structured `WORK_RESULT` marker (e.g., "already done"), the issue stayed in Finished and Linear auto-created new agent sessions indefinitely. The `session-created` webhook handler now enforces a total session hard cap (`MAX_TOTAL_SESSIONS`) and a per-issue dispatch counter, matching the guards already present in `issue-updated`.
- Improve work result detection for "already done" agents — Added heuristic patterns for `already done/complete`, `APPROVED FOR MERGE`, and `all checks passed` so no-op QA sessions correctly resolve as "passed" and trigger the Finished → Delivered transition.
- Per-issue dispatch counter — New `incrementDispatchCount` / `getDispatchCount` / `clearDispatchCount` functions track all session dispatches in a 4-hour sliding window, independent of workflow phase records. Both `session-created` and `issue-updated` handlers increment this counter on every dispatch.
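The per-issue dispatch counter above can be sketched as follows. Only the three function names and the 4-hour window come from the entry; the in-memory `Map`, the `now` parameter, and the pruning helper are assumptions made for illustration. The changelog does not describe the counter's storage or signatures, so none of this should be read as the real implementation.

```typescript
// In-memory sketch; function names come from the changelog, everything else is assumed.
const WINDOW_MS = 4 * 60 * 60 * 1000; // 4-hour sliding window
const dispatches = new Map<string, number[]>(); // issueId -> dispatch timestamps (ms)

// Drop timestamps that have fallen out of the window and return the survivors.
function prune(issueId: string, now: number): number[] {
  const recent = (dispatches.get(issueId) ?? []).filter((t) => now - t < WINDOW_MS);
  dispatches.set(issueId, recent);
  return recent;
}

function incrementDispatchCount(issueId: string, now: number = Date.now()): number {
  const recent = prune(issueId, now);
  recent.push(now); // `recent` is the array stored in the map, so this persists
  return recent.length;
}

function getDispatchCount(issueId: string, now: number = Date.now()): number {
  return prune(issueId, now).length;
}

function clearDispatchCount(issueId: string): void {
  dispatches.delete(issueId);
}
```

A handler would call `incrementDispatchCount(issueId)` on every dispatch and refuse to dispatch once the returned count exceeds its cap, regardless of which workflow phase the issue is in.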
- Fix parent issues dispatched as development instead of coordination after refinement — The orchestrator's `run()` method hardcoded work type to `'development'` for all backlog issues, ignoring whether the issue is a parent with sub-issues. Parent issues returning to Backlog after a refinement cycle now correctly resolve to `'coordination'`, loading the right template with post-refinement sub-agent dispatch instructions.
- Multi-provider selection with per-spawn resolution — Providers can now be configured per work type or per project via `AGENT_PROVIDER_{WORKTYPE}` and `AGENT_PROVIDER_{PROJECT}` environment variables with priority cascade.
- Comprehensive test infrastructure — Added 200+ new tests across all packages covering providers, templates, governor, orchestrator utilities, and more.
- Clear stale parked work on status change — Prevent QA-coordination loop by clearing parked work when issue status changes.
- Require version input for workflow_dispatch releases — Release workflow now validates version input before proceeding.
- Per-project build command overrides — `RepositoryConfig` supports per-project `buildCommand`, `testCommand`, and `validateCommand` overrides in object form within `projectPaths`.
- iOS/Apple development support — Native project agents can now handle iOS builds with platform-specific build system detection and safety checks.
- Fix provenance verification failure — `repository.url` in package.json used lowercase `renseiai` but GitHub org is `RenseiAI`. npm provenance verification requires exact case match.
- Add MCP server to release pipeline — `@renseiai/agentfactory-mcp-server` is now included in automated npm + GitHub Packages publishing with provenance attestation.
- Add MCP server README — Documents tools, transports, Claude Desktop configuration, and environment variables.
- Fix npm publish warnings — Normalize `repository.url` to `git+https://` format, remove unsupported `exports`/`main`/`types` from `publishConfig`, and strip `./` prefix from `bin` entry paths.
- Re-add `--provenance` to release workflow — OIDC Trusted Publishers are now configured for the `@renseiai` npm scope.
- Org migration: `@supaku/*` → `@renseiai/*` — All npm packages renamed from `@supaku/` to `@renseiai/` scope. GitHub repo transferred to `github.com/renseiai/agentfactory`. Domains updated: `supaku.com` → `rensei.ai`, `supaku.dev` → `rensei.dev`. Java namespace updated: `com.supaku` → `com.renseiai`. Old `@supaku/*` packages are deprecated on npm with migration instructions.
- Add sub-issue guard to Backlog transition webhook handler — The webhook handler for `→ Backlog` transitions checked `isParentIssue` to upgrade to coordination but never checked `isChildIssue` to skip sub-issues. When a coordinator (e.g., refinement-coordination) updated sub-issue statuses in Linear, the resulting webhooks dispatched each sub-issue as individual development work, consuming all workers. Added the same `isChildIssue` guard pattern already present in the Finished and Delivered handlers.
- Block `git checkout` and `git switch` in Claude provider — Agents running in the main repo directory (research, backlog-creation) could run `git checkout <branch>`, changing the IDE's checked-out branch. This caused cascading failures where subsequent agents couldn't create worktrees for the same branch. Added deny rules to `autonomousCanUseTool` in `claude-provider.ts`.
- Isolate all work types in git worktrees — Research, backlog-creation, refinement, and refinement-coordination agents previously ran from the main repo root (`process.cwd()`), giving them write access to the IDE's working tree. All work types now get isolated `.worktrees/` directories, eliminating the risk of agents mutating the main checkout.
- Fix governor race condition causing duplicate QA dispatch (SUP-955) — `hasActiveSession` did not include `'finalizing'` in active statuses, creating a window where the governor could re-evaluate an issue while its agent was still wrapping up. Added `'finalizing'` to prevent stale `issue-status-changed` events from triggering redundant work.
- Fix coordination work result parsing when structured marker is absent (SUP-1059) — Coordination agents that reported "Parent issue marked Finished in Linear" without a `<!-- WORK_RESULT:passed -->` marker were classified as `unknown`. Added heuristic pattern to detect this natural language confirmation.
- Rewrite refinement templates as triage-only agents — Refinement agents were attempting to implement fixes instead of triaging rejection feedback. Rewrote both `refinement.yaml` and `refinement-coordination.yaml` to be read-only triage agents that produce actionable fix instructions. Disallowed git, test, build, and file-edit tools. Removed refinement work types from `WORK_TYPES_REQUIRING_WORKTREE`.
- Add missing `refinement-coordination` dashboard config — The work type was missing from the dashboard display configuration.
- Fix `refinement` not upgrading to `refinement-coordination` for parent issues — The governor dispatch path (`governor-dependencies.ts`) and webhook session-created handler were missing the `refinement → refinement-coordination` parent-issue upgrade. Parent issues that failed QA-coordination and moved to Rejected would get a single-agent refinement instead of coordinated refinement, causing the agent to struggle with the sub-issue structure.
- Fix typecheck errors for `refinement-coordination` in server, nextjs, and linear packages — Added missing `refinement-coordination` entries to duplicated `AgentWorkType` union in `packages/server/src/types.ts`, A2A skill map, priority switch statements, and work type messages. Added `default` branches to priority functions to prevent TS2366 (missing return).
- Fix qa-coordination fail status dead end — QA coordination failures moved issues to `Started`, which the governor treated as "agent already working" — a dead end with no recovery path. Changed fail status from `Started` to `Rejected`, routing through the refinement workflow instead.
- Add `refinement-coordination` work type — Parent issues with sub-issues that fail QA or acceptance now get a coordination-aware refinement agent instead of a single-agent refinement that would struggle with the complexity. The refinement-coordination agent triages QA/acceptance failure feedback, moves only the failing sub-issues back to Backlog (leaving passing ones in Finished), and lets the orchestrator re-trigger coordination for targeted re-implementation.
- Set maxTurns=200 for coordination and inflight agents — Coordinators were hitting the Claude SDK's default ~30 turn limit before finishing sub-agent polling, causing premature exit with unknown work result. Added `maxTurns` to `AgentSpawnConfig` and threaded it through to the SDK's `query()` options. Coordination, QA-coordination, acceptance-coordination, and inflight work types now get 200 turns.
- Add `--work-type` flag to orchestrator CLI — Allows forcing a specific work type when using `--single` mode, bypassing auto-detection from issue status. Useful for re-running coordination work that would otherwise be detected as inflight.
- Add Spring AI agent provider — New provider for Spring AI-based agents (SUP-1038).
- Add heuristic fallback patterns for coordination work types — `parseWorkResult` now detects pass/fail from real agent output when the structured `<!-- WORK_RESULT -->` marker is missing. Adds patterns for `coordination` ("all X/X sub-issues completed", "Must Fix Before Merge"), `qa-coordination` ("Status: N Issues Found", "N Critical Issues (Block Merge)"), and `acceptance-coordination` ("Must Fix Before Merge"). Previously, missing markers caused `unknown` results that left issues stuck in Delivered, triggering infinite acceptance retry loops.
- Strengthen WORK_RESULT marker instruction and move to end of all templates — The work-result-marker partial now uses a prominent visual box and explicit "VERY LAST line" instruction. Moved from mid-prompt to the final position in all 10 templates that use it, so it's the last thing agents read before generating output.
- Add READ-ONLY/GATE constraints and status manipulation guards to QA and acceptance templates — QA templates now explicitly forbid code changes. Acceptance templates are marked as gates that must not fix issues. All result-sensitive templates prohibit `update-issue --state` to prevent agents from bypassing the orchestrator's state machine. Acceptance tool permissions tightened from `git push *` to `git push origin --delete *`.
- Pass workflow context to coordination rework prompts via webhook path — When QA-coordination failed and the dev coordination agent was re-triggered via webhook, it received a fresh prompt instead of the rework prompt. The failure context filter excluded the `coordination` work type from Started status, and `workflowContext` was never passed as the 4th argument to `generatePrompt`. Now the webhook handler includes coordination in the filter and passes `wfContext` through, so the rework mode prompt activates correctly.
- Add READ-ONLY constraint to QA and acceptance prompts — QA and acceptance agents (including coordination variants) now receive a `READ-ONLY ROLE` constraint that explicitly forbids modifying source code, config, or migration files. Agents must only read, validate, and report — if issues are found, they emit `WORK_RESULT:failed` instead of attempting fixes. Prevents QA agents from silently patching code and masking real bugs.
- Expand qa-coordination and acceptance-coordination prompts with structured steps — Both prompts now include numbered validation steps, PR selection guidance, and explicit pass/fail criteria with `WORK_RESULT` marker instructions, matching the detail level of their non-coordination counterparts.
- Prevent acceptance re-trigger loop on failure — When an acceptance agent fails or returns unknown result, `markAcceptanceCompleted` is now called to prevent the webhook orchestrator from immediately re-dispatching another acceptance agent for the same issue.
- Coordination rework mode for QA retries — When a coordinated parent issue fails QA and is retried, the coordinator now receives a specialized "REWORK MODE" prompt instead of the fresh coordination prompt. This prevents re-spawning sub-agents for already-complete work. The rework prompt instructs the agent to read QA failure comments, apply targeted fixes directly, push to the existing PR branch, and run full validation — addressing the SUP-994 scenario where coordinators saw all sub-issues as Finished and concluded "nothing to do."
- Wire WorkflowContext into governor prompt generation for retries — When an issue failed QA and was retried, the governor dispatched agents with a vanilla prompt containing no failure context. The `WorkflowState` (cycle count, strategy, failure summary) was sitting in Redis but never passed to `generatePrompt`. Now `dispatchWork` fetches workflow state via `getWorkflowState()` and passes it as `WorkflowContext` to `generatePrompt`, so retry agents see previous QA failures, cycle count, and escalation strategy in their prompt.
- Emit response activity to close Linear agent sessions on completion — Agent sessions now properly close in Linear's UI when work completes.
- Fix coordination agent circular re-triggering on parent issues — Three compounding bugs caused parent issues with sub-issues to cycle through 8+ agent sessions without progressing. (1) Coordination work type was not result-sensitive — the orchestrator auto-promoted to Finished on session completion regardless of whether sub-issues were actually done. Coordination now requires a `<!-- WORK_RESULT:passed -->` marker like QA/acceptance. (2) WORK_RESULT markers embedded in tool call inputs (e.g., `create-comment --body`) were invisible to the orchestrator's parser, causing "unknown result → no transition → re-trigger" loops. The stream loop now captures markers from tool inputs. (3) Added `{{> partials/work-result-marker}}` to the coordination template so agents are instructed to emit the marker.
- Add circuit breaker for runaway agent sessions — New `MAX_SESSION_ATTEMPTS` guard (default: 3) in the governor decision engine prevents issues from cycling through agents indefinitely. If an issue has had 3+ completed sessions without reaching a terminal status, the governor stops dispatching and the issue requires manual intervention.
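The circuit breaker above reduces to a small decision function. The sketch below is hypothetical: only `MAX_SESSION_ATTEMPTS` and its default of 3 come from the entry. The `IssueSnapshot` shape, the `decideDispatch` name, and the reason strings are assumptions, and the terminal statuses are taken from the Accepted/Canceled/Duplicate entry elsewhere in this changelog.

```typescript
// Hypothetical sketch; only MAX_SESSION_ATTEMPTS and its default come from the changelog.
const MAX_SESSION_ATTEMPTS = 3;

// Assumed terminal statuses (Accepted/Canceled/Duplicate, per another changelog entry).
const TERMINAL_STATUSES = new Set(["Accepted", "Canceled", "Duplicate"]);

interface IssueSnapshot {
  status: string;
  completedSessions: number; // sessions that ran to completion for this issue
}

type Decision = { dispatch: true } | { dispatch: false; reason: string };

function decideDispatch(
  issue: IssueSnapshot,
  maxAttempts: number = MAX_SESSION_ATTEMPTS,
): Decision {
  if (TERMINAL_STATUSES.has(issue.status)) {
    return { dispatch: false, reason: "terminal_status" };
  }
  // Breaker trips once the issue has used up its attempts without finishing.
  if (issue.completedSessions >= maxAttempts) {
    return { dispatch: false, reason: "max_session_attempts" };
  }
  return { dispatch: true };
}
```

Returning a reason alongside the refusal makes it straightforward to surface why an issue needs manual intervention.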
- `af-agent` CLI for managing running agent sessions — New command with five subcommands: `stop` (sets session to stopped in Redis, worker aborts within ~5s), `chat` (queues a pending prompt injected into the running Claude session), `status` (shows session details), `reconnect` (creates a fresh Linear agent session and re-associates Redis state), and `list`/`ls` (shows active sessions with duration and cost, `--all` for completed/failed).
- Backlog-writer creates issues in Icebox instead of Backlog — Built-in prompts in three locations (defaults/prompts.ts, orchestrator.ts, backlog-creation.yaml) told the backlog-writer agent to create issues in Backlog status, causing the governor to immediately dispatch dev agents before human review. All prompt sources now consistently use Icebox. The agent definition in downstream repos already specified Icebox, but the agentfactory built-in prompts overrode it.
- Fix ZodError in autonomous agent permission handling — Claude Code 2.1.70 requires `updatedInput` in `PermissionResult` allow responses, but `autonomousCanUseTool` returned `{ behavior: 'allow' }` without it. Added `updatedInput: input` to all allow returns to satisfy the stricter Zod validation.
- Register CLI commands as in-process agent tools — New `ToolPlugin` system exposes Linear CLI commands as typed, Zod-validated tools for Claude agents. Instead of shelling out to `pnpm af-linear`, agents call `af_linear_get_issue`, `af_linear_create_comment`, etc. directly — no subprocess overhead, no arg string construction, no stdout parsing. Uses the Claude Agent SDK's `createSdkMcpServer()` for in-process MCP tool registration (the only extension mechanism for adding custom tools to Claude Code). Non-Claude providers continue using the Bash-based CLI unchanged.
- Tool plugin architecture for future integrations — `ToolPlugin` interface and `ToolRegistry` enable adding new tool sets (Asana, deployment, framework-specific) with minimal boilerplate. Each plugin provides a name and a `createTools()` function returning SDK tool definitions.
- Move `runLinear()` from CLI to core — The canonical Linear runner now lives in `packages/core/src/tools/linear-runner.ts`. The CLI re-exports from core, ensuring both CLI and tool plugin use the same code path.
- Document tool plugin system — Updated `docs/providers.md`, `docs/architecture.md`, `CONTRIBUTING.md`, and `CLAUDE.md` with MCP integration rationale, plugin authoring guide, and architecture diagrams.
- Fix infinite loop when qa-coordination fails on parent issues — QA coordination failure sent parent issues to `Backlog`, which triggered development coordination. The coordinator saw all sub-issues already `Finished` and immediately promoted back to `Finished`, restarting the QA cycle indefinitely. Changed `qa-coordination` fail status to `Started` instead, which keeps the issue visible without re-triggering the coordination loop.
- Fix `getVersion()` returning "unknown" in compiled output — Version resolution now works correctly in bundled builds.
- Disable filesystem hooks for autonomous SDK agents — Prevents permission failures when agents run headless.
- Add cost tooltip and all-time total to dashboard — Fleet management dashboard now shows per-session cost breakdowns and cumulative totals.
- Route parent issues to coordination work type — The Governor's `dispatchWork` mapped `trigger-development` → `development` regardless of whether the issue had sub-issues, so parent issues were treated as single development tasks instead of using the coordinator. Now `dispatchWork` checks the `parentIssueIds` cache and upgrades to `coordination`/`qa-coordination`/`acceptance-coordination` for parent issues. Added the same check to `spawnAgentForIssue` and `forwardPrompt` auto-detect paths for defense in depth.
- Fix Bash permission failures for autonomous agents — The `allowedTools` list only included command-specific prefixes (`pnpm`, `git`, `gh`, etc.) but missed common shell commands (`cd`, `pwd`, `ls`, `cat`, `find`, `mkdir`, etc.). Headless agents can't prompt for permission, so unlisted commands silently failed. Added ~25 common shell builtins and utilities.
- Hard-block Linear MCP tools for autonomous agents — The `canUseTool` callback denied Linear MCP tools but agents still called them successfully, suggesting the callback raced with MCP execution. Added all Linear MCP tool names to `disallowedTools` for a hard SDK-level block that prevents the tools from being callable.
- Block Linear MCP tools for autonomous agents — Agents were discovering Linear MCP tools via `ToolSearch` and calling them instead of using `pnpm af-linear` CLI, causing permission errors and data dumps into issue comments. The `autonomousCanUseTool` handler now denies `mcp__*Linear__*` tools with a redirect message to the CLI.
- Fix noisy/misleading agent startup logs — Show "spawning" instead of "PID: undefined" in `onAgentStart` (PID arrives asynchronously after process spawn). Switched dotenv from `config()` to `parse()` to eliminate tip spam on stdout. Downgraded `settings.local.json` warnings to debug level (file may exist without `env` key, which is not an error).
- Fix orphan cleanup deadlock with issue locks — When a worker is disrupted (e.g., tmux kill), orphan cleanup detected the stale session but failed to re-dispatch it because the issue lock (SET NX, 2h TTL) was still held by the same session. Work got parked instead of queued, and the stale-lock cleanup skipped it because the session was reset to `pending` (not terminal). Now orphan cleanup releases the issue lock before re-dispatching, and the stale-lock cleanup also treats `pending` as a stale lock status.
- Fix cross-project work routing after orphan recovery — Orphan and zombie cleanup omitted `projectName` when reconstructing `QueuedWork` for re-dispatch. The poll filter treated untagged work as "any worker can take it", allowing workers from the wrong repository to claim issues from other projects. Now `projectName` is preserved from session state during re-queue.
- Add server-side project validation at claim time — The claim endpoint (`POST /api/sessions/{id}/claim`) now validates that the claiming worker's project list includes the work item's `projectName`. If mismatched, the claim is rejected and work is requeued. Previously the poll filter was the only routing gate with no server-side enforcement.
- Tighten poll filter for project-scoped workers — Project-filtered workers now only see work explicitly tagged with their projects. Previously, untagged work (`!w.projectName`) was accepted by any worker, bypassing project routing.
- Fix triple-dispatch duplication in event-bridge mode — Prevented duplicate agent dispatches when events arrived via both webhook and polling simultaneously.
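The tightened poll filter from the routing entries above can be sketched like this. The `QueuedWorkItem` shape and the `visibleWork` name are hypothetical, and the sketch assumes a worker with an empty project list is not project-scoped and therefore sees everything; the changelog only specifies the behavior for project-filtered workers.

```typescript
// Hypothetical sketch of the tightened poll filter; types and names are assumptions.
interface QueuedWorkItem {
  issueId: string;
  projectName?: string; // untagged work has no projectName
}

function visibleWork(queue: QueuedWorkItem[], workerProjects: string[]): QueuedWorkItem[] {
  // Assumption: an empty project list means the worker is not project-scoped.
  if (workerProjects.length === 0) return queue;
  // Project-scoped workers only see work explicitly tagged with one of their projects.
  // Previously, untagged work (!w.projectName) would also have passed this filter.
  return queue.filter(
    (w) => w.projectName !== undefined && workerProjects.includes(w.projectName),
  );
}
```

The same membership test is what a claim endpoint would repeat server-side, so that a client-side filter is not the only routing gate.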
- Add gitleaks pre-commit hook for local secret scanning
- Add dotenvx pre-commit hook to block `.env` commits
- IDE-safe worktree cleanup — Worktree removal now detects processes (VS Code, Cursor, language servers) with open file handles via `lsof`. When detected without `--force`, the worktree is skipped instead of removed, preventing IDE crashes from sudden workspace deletion. Added `skipped` counter to `CleanupResult`, inter-removal settle delay (1.5s) for IDE file watchers, and removed redundant `git worktree prune` calls.
- Skip worktree creation for non-code work types — Research and backlog-creation agents no longer create git worktrees, branches, or `.agent/` state directories. These agents run from the main repo root with `cwd` set to `process.cwd()`, eliminating startup latency, branch pollution, and `fatal: no upstream` log noise. Added `WORK_TYPES_REQUIRING_WORKTREE` constant to `@renseiai/plugin-linear` for the 8 code-producing work types. Made `worktreeIdentifier` and `worktreePath` optional on `AgentProcess`, `SpawnAgentOptions`, and `SpawnAgentWithResumeOptions`. All state persistence, recovery checks, and worktree cleanup are automatically skipped when these fields are undefined.
- Fix governor re-dispatching work for issues with active agents — `getSessionStateByIssue()` returned the first Redis key match for an issue, regardless of session status. When multiple sessions existed (one running + several failed from prior claim attempts), a failed session could be found first, causing `hasActiveSession()` to return false. The governor then re-dispatched every scan cycle, creating duplicate queue entries that workers claimed and failed on ("Agent already running"). Now `getSessionStateByIssue()` scans all matching sessions and prefers active ones (running/claimed/pending) over inactive ones.
- Use issue-lock dispatch in governor — Governor's `dispatchWork()` called `queueWork()` directly, bypassing the issue-lock system. Multiple dispatches for the same issue all entered the global queue unserialized. Now uses `issueLockDispatchWork()` which acquires an atomic issue lock before queueing; if the issue is already locked, work is parked and auto-promoted when the lock is released.
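The "prefer active sessions" fix to `getSessionStateByIssue()` described above can be illustrated over an in-memory list. This is a sketch, not the real Redis-backed lookup: `pickSessionForIssue`, the `SessionState` shape, and the inactive status names are assumptions. Only the active set (running/claimed/pending) comes from the entry.

```typescript
// Hypothetical sketch; the real lookup scans Redis keys rather than an array.
type SessionStatus = "running" | "claimed" | "pending" | "completed" | "failed" | "stopped";

interface SessionState {
  sessionId: string;
  issueId: string;
  status: SessionStatus;
}

// Active statuses as listed in the changelog entry.
const ACTIVE_STATUSES = new Set<SessionStatus>(["running", "claimed", "pending"]);

function pickSessionForIssue(sessions: SessionState[], issueId: string): SessionState | undefined {
  const matches = sessions.filter((s) => s.issueId === issueId);
  // Prefer any active session; fall back to the first match (the old behavior).
  return matches.find((s) => ACTIVE_STATUSES.has(s.status)) ?? matches[0];
}
```

With this ordering, a leftover failed session that happens to be scanned first no longer hides a running one from the active-session check.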
- Skip work for issues in terminal status (Accepted/Canceled/Duplicate) — The governor queues work based on issue status at scan time, but by the time the worker picks up the item, the issue may have already moved to a terminal status. The orchestrator spawned agents anyway, causing issues like SUP-866 to get QA'd after already being Accepted. Added terminal status guards in `spawnAgentForIssue` (throws) and `forwardPrompt` (returns early with `reason: 'terminal_status'`).
- Configurable build/test commands in agent frontmatter and repository config (#15)
- Fix QA infinite loop caused by rigid work result heuristics — QA agents output verdicts like `**PASS**`, `Verdict: PASS`, `Status: **PASS**`, but all heuristic patterns required a `QA` prefix (e.g., `QA Result: Pass`). Every QA run returned `unknown`, the orchestrator never transitioned the issue, and the work queue re-dispatched it endlessly. Made the `QA` prefix optional, added bold markdown support (`**PASS**`/`**FAIL**`), and added standalone bold verdict patterns. Added 11 regression tests from real SUP-867 agent output.
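To illustrate the relaxed heuristics described above, here is a minimal sketch of a verdict parser with an optional `QA` prefix and bold-markdown support. The regular expressions and the `parseQaVerdict` name are assumptions written for this example; the project's actual pattern set is not shown in this changelog and is likely broader.

```typescript
type WorkResult = "passed" | "failed" | "unknown";

// Labelled verdicts: optional "QA" prefix, a label such as Result/Verdict/Status,
// and optional bold markers before the verdict word.
const LABELLED_VERDICT =
  /(?:QA\s+)?(?:Result|Verdict|Status)\s*:\s*\*{0,2}\s*(PASS(?:ED)?|FAIL(?:ED)?)\b/i;

// Standalone bold verdicts such as **PASS** or **FAIL**.
const BOLD_VERDICT = /\*\*\s*(PASS(?:ED)?|FAIL(?:ED)?)\s*\*\*/i;

function parseQaVerdict(output: string): WorkResult {
  const match = LABELLED_VERDICT.exec(output) ?? BOLD_VERDICT.exec(output);
  if (!match) return "unknown";
  const verdict = (match[1] ?? "").toUpperCase();
  return verdict.startsWith("PASS") ? "passed" : "failed";
}
```

Under these assumed patterns, `QA Result: Pass`, `Verdict: PASS`, `Status: **PASS**`, and a bare `**PASS**` all resolve to `passed`, while output without any verdict stays `unknown`.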
- Fix preserved worktrees blocking branch reuse — When a worktree was preserved due to incomplete work, its heartbeat file remained on disk, causing the conflict handler to falsely detect a live agent for 30 seconds. This blocked subsequent agents from creating worktrees on the same branch, exhausting all 3 retries. Now the heartbeat file is deleted when a worktree is preserved (both completed and failed paths). Additionally, the conflict handler now saves a `.patch` file to `.worktrees/.patches/` before force-removing stale worktrees with incomplete work, preventing data loss.
- Fix worktree cleanup deleting unpushed work for QA/acceptance agents — The `checkForIncompleteWork()` safety check was gated behind `isDevelopmentWork`, so QA and acceptance agents' worktrees were cleaned up without verifying commits were pushed. This caused completed work to vanish when the worktree was removed with `--force`. Removed the work-type gate so the safety check applies to all completed agents, matching the already-correct failed-agent cleanup path.
- Fix governor creating dead Linear sessions on every dispatch — The governor resolved the OAuth token once at startup and held a static `LinearAgentClient` for the entire process lifetime. When the token expired or was missing, every `createAgentSessionOnIssue()` call failed and all sessions received a `governor-` prefixed fallback ID, causing all worker activity/progress forwarding to Linear to silently fail. Changed to a lazy resolver that re-reads the token from Redis on each dispatch, auto-refreshes when needed, and only creates a new client when the token actually changes.
- Enable agents to add new dependencies in worktrees — Agents can now install packages they need (e.g., `stripe`) instead of getting stuck in a loop. The orchestrator writes `.agent/add-dep.sh` into each worktree during setup, which safely removes symlinked `node_modules` and runs `pnpm add` with the `ORCHESTRATOR_INSTALL=1` guard bypass. Updated dependency-instructions partial, renseiai CLAUDE.md, and the preinstall guard error message to direct agents to the helper script.
- Prevent worktree node_modules from corrupting main repo — Replaced directory-level symlinks with real directories containing per-entry symlinks. Previously, if an agent ran `pnpm install` in a worktree, pnpm would follow the top-level symlink and write into the main repo's `node_modules`. Now each entry is individually symlinked, so a rogue install only destroys the worktree's links. Also sets the `ORCHESTRATOR_INSTALL=1` env var to bypass the preinstall guard when the orchestrator intentionally runs pnpm install as a fallback.
- Prevent runaway agent loops (SUP-855 post-mortem) — Six fixes to stop multi-session spirals:
  - Count `unknown` work results as failures so the 4-cycle escalation ladder fires correctly
  - Increase cooldown TTLs from 10 seconds to 5 minutes to prevent immediate re-triggering
  - Add per-issue hard cap of 8 total sessions with automatic escalation comment
  - Harden QA templates to hard-fail on build/typecheck errors instead of rationalizing them
  - Skip acceptance-coordination when sub-issues were never actually worked
  - Add 30-minute post-acceptance lock to prevent re-triggering after merge
- Prevent QA state loop caused by agents manually changing issue status — QA coordination agents were bypassing the orchestrator by calling `pnpm af-linear update-issue --state` directly, creating a Finished→Backlog→Started→Finished loop. Added explicit "never manually change status" instruction to the `work-result-marker` partial (included in all QA/acceptance templates) and reinforced in coordination templates.
- Route QA failures to Backlog instead of Rejected — Changed `WORK_TYPE_FAIL_STATUS` for `qa` and `qa-coordination` from `Rejected` to `Backlog`, so failed QA issues go directly back to the developer/coordinator with failure context instead of requiring a refinement intermediary. Updated the webhook handler to accept `Finished→Backlog` as a valid retry path with circuit breaker checks.
- Detect coordination-style QA output in work result parser — Added heuristic patterns for `Overall Result: FAIL`, `Roll-Up Verdict: FAIL`, and `Parent QA verdict: FAIL` (and PASS counterparts) so the orchestrator correctly detects pass/fail from QA coordination agents even without the structured marker.
- Fix Linear CLI team resolution (name vs key) — The orchestrator passed the team display name (e.g., "Rensei") as `LINEAR_TEAM_NAME`, but the Linear SDK's `team()` method only accepts team keys ("SUP") or UUIDs. Agents wasted many turns reverse-engineering the correct key. Now passes `team?.key` instead of `team?.name` in all three orchestrator locations and the `createBlocker` CLI fallback.
- Make `getTeam()` resilient to display names — `AgentClient.getTeam()` now falls back to a name search when key/ID lookup fails, so agents manually passing `--team "Rensei"` also work.
- Fix governor version display showing "vunknown" — `getVersion()` resolved `package.json` one directory level too shallow from the compiled `dist/src/` output, landing in `dist/` instead of the package root. Fixed path to go up two levels.
- Eliminate per-issue isParentIssue API calls burning Linear quota — The `parentIssueIds` cache only tracked parent issues, so every non-parent issue (the majority) fell through to an individual API call. With 128 issues in a single project scanned every 60s, this easily exceeded the 5,000 req/hr limit. Added a `scannedIssueIds` set so issues already seen in the batch query return immediately without an API fallback.
- Wire onScanComplete callback in continuous governor mode — `WorkflowGovernor.start()` discarded `scanOnce()` results in fire-and-forget mode, so `printScanSummary` with its colorized output and quota progress bars never rendered. Added `WorkflowGovernorCallbacks` to the governor constructor so the CLI receives scan results on every cycle.
- Fix worktree symlink crash for missing apps — `linkDependencies` now checks if the destination parent directory exists before creating per-workspace `node_modules` symlinks. When a branch doesn't contain all apps from `main` (e.g., `family-mobile` missing on a feature branch), the entry is skipped instead of throwing ENOENT and falling back to a full `pnpm install`.
- Prevent `pnpm install` fallback from corrupting main repo — `installDependencies` now removes any partial `node_modules` symlinks (root + per-workspace) before running `pnpm install`. Previously, the root `node_modules` symlink created by `linkDependencies` before the error would cause `pnpm install` to write through it into the main repo's `node_modules`, requiring the user to re-run `pnpm install` after agent work.
- Release claim key on partial failure to prevent work queue deadlock — When `claimWork()` succeeded at SETNX but a subsequent Redis operation threw, the claim key was left stuck for its full 1-hour TTL while the work item remained in the queue. All workers would then fail SETNX, causing infinite claim failures. Similarly, if the claim handler threw after removing the item from the queue, neither the claim key nor the work item was cleaned up.
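The claim-key fix above follows a common pattern: take the claim with a set-if-absent operation, and release it if any follow-up step throws. The sketch below shows that pattern against an in-memory stand-in for Redis; the `FakeStore` class, the `work:claim:` key prefix, and the function signature are all hypothetical and exist only for illustration.

```typescript
// In-memory stand-in for the SETNX/DEL operations (hypothetical, not a Redis client).
class FakeStore {
  private keys = new Map<string, string>();

  setIfAbsent(key: string, value: string): boolean {
    if (this.keys.has(key)) return false;
    this.keys.set(key, value);
    return true;
  }

  del(key: string): void {
    this.keys.delete(key);
  }

  has(key: string): boolean {
    return this.keys.has(key);
  }
}

// Claim a work item; if a follow-up step throws, release the claim key
// so other workers are not locked out until the TTL expires.
function claimWork(
  store: FakeStore,
  workId: string,
  workerId: string,
  afterClaim: () => void,
): boolean {
  const claimKey = `work:claim:${workId}`; // assumed key format
  if (!store.setIfAbsent(claimKey, workerId)) return false; // already claimed
  try {
    afterClaim(); // e.g. removing the item from the queue
    return true;
  } catch {
    store.del(claimKey); // release on partial failure
    return false;
  }
}
```

Without the `catch` branch, a single failed follow-up operation would leave the key in place and every later set-if-absent attempt would fail, which is the deadlock the entry describes.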
- Governor production logging — New colorized, structured output for the governor CLI replacing plain `console.log`. Startup banner with version/config/integration status, per-scan summary with dispatched/skipped counts, and Linear API quota progress bars (request + complexity) with green/yellow/red thresholds.
- API call counting for leak diagnosis — `LinearAgentClient` now tracks per-scan API call counts (`apiCallCount` / `resetApiCallCount()`) and extracts quota headers via the `onApiResponse` callback. Displayed alongside quota bars to help diagnose rate limit consumption.
- Eliminate 2 redundant API calls per dispatch — `dispatchWork` now receives the full `GovernorIssue` (already resolved during scan) instead of an `issueId` string. Removes the `getIssue()` + lazy project resolution calls that were re-fetching data already available from the scan query.
- Consolidated rawRequest type — Replaced 4 inline `this.client as unknown as { client: { rawRequest... } }` casts with a shared `RawGraphQLClient` type alias in `agent-client.ts`.
- Monorepo path scoping for orchestrator — New `projectPaths` and `sharedPaths` fields in `.agentfactory/config.yaml` allow mapping Linear projects to specific directories in a monorepo. Agents receive directory scoping instructions via `{{projectPath}}` and `{{sharedPaths}}` template variables, and a new `{{> partials/path-scoping}}` partial validates file changes at push time.
- 403 circuit breaker for ApiActivityEmitter — When a session's ownership is transferred to another worker, the old worker's emitter now detects the 403 "owned by another worker" response, trips an `ownershipRevoked` circuit breaker, and stops all further activity/progress emission. Previously it would spam 403 errors for the entire agent lifetime. New `onOwnershipRevoked` callback allows the worker to request agent shutdown.
- Reduce retry waste in agent spawn loop — Reduced `MAX_SPAWN_RETRIES` from 6 to 3 (45s total instead of 90s). For "agent already running" errors, the retry loop now checks session ownership on the server before retrying — if another worker owns the session, it bails immediately instead of wasting API calls and dependency linking on each attempt.
- Orphan cleanup grace period — `findOrphanedSessions()` now skips sessions updated within the last 2 minutes (`ORPHAN_THRESHOLD_MS`), preventing the race condition where a worker re-registers with a new ID but hasn't transferred session ownership yet.
- Increase worker TTL and heartbeat timeout — `WORKER_TTL` increased from 120s to 300s, `HEARTBEAT_TIMEOUT` from 90s to 180s. The previous values were too tight — busy workers processing long agents could miss heartbeats, causing Redis key expiry, 404 errors, and re-registration cascades.
- Use Node.js rmSync for worktree cleanup — Replaced `execSync('rm -rf ...')` with `rmSync()` with `maxRetries: 3` and `retryDelay: 1000` for more resilient cleanup on mounted volumes and cross-platform compatibility.
- Add WORK_RESULT marker instructions to coordination prompts — The `qa-coordination` and `acceptance-coordination` work types were treated as result-sensitive by the orchestrator (requiring `<!-- WORK_RESULT:passed/failed -->` in the agent's final output), but neither prompt template included the structured result marker instructions. This caused the orchestrator to post a false "no structured result marker detected" warning even when the QA coordinator successfully moved the issue to Delivered. Fixed in both the hardcoded prompts (`orchestrator.ts`) and the YAML templates (`qa-coordination.yaml`, `acceptance-coordination.yaml`).
- Strip Anthropic API keys from agent environments — App `.env.local` files (e.g. `apps/social/.env.local`) may contain `ANTHROPIC_API_KEY` for runtime use. The orchestrator was loading these and passing them into Claude Code agent processes, causing Claude Code to switch from Max subscription billing to API-key billing. Added `AGENT_ENV_BLOCKLIST` that strips `ANTHROPIC_API_KEY`, `ANTHROPIC_AUTH_TOKEN`, `ANTHROPIC_BASE_URL`, and `OPENCLAW_GATEWAY_TOKEN` from all three env sources (`process.env`, app env files, and `settings.local.json`) before spawning agents. Applied to both `spawnAgent()` and `forwardPrompt()` code paths.
- Governor skips sub-issues in all statuses — Previously only skipped sub-issues in Backlog; now skips in all statuses to prevent double-dispatch.
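As an illustration of the environment blocklist described above, the following sketch merges several env sources and then removes the blocked keys. The four key names come from the entry; the `buildAgentEnv` helper, its signature, and the merge order are assumptions made for this example.

```typescript
// Keys named in the changelog entry.
const AGENT_ENV_BLOCKLIST = [
  "ANTHROPIC_API_KEY",
  "ANTHROPIC_AUTH_TOKEN",
  "ANTHROPIC_BASE_URL",
  "OPENCLAW_GATEWAY_TOKEN",
] as const;

type Env = Record<string, string | undefined>;

// Hypothetical helper: later sources override earlier ones, then blocked keys
// are removed from the merged result regardless of which source set them.
function buildAgentEnv(...sources: Env[]): Env {
  const merged: Env = Object.assign({}, ...sources);
  for (const key of AGENT_ENV_BLOCKLIST) {
    delete merged[key];
  }
  return merged;
}
```

Stripping after the merge, rather than per source, means a key cannot slip through from a source that was added later and never filtered.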
- Optimize N+1 GraphQL queries — Reduced Linear API calls for sub-issue graph, relations, and status fetching from O(N) to O(1) using raw GraphQL queries with nested fields.
- Skip Linear forwarding for governor-generated fake session IDs — Prevents 404 errors when the governor creates synthetic sessions for internal use.
- Record all auth failures in circuit breaker — Circuit breaker now records auth failures regardless of HTTP status code, improving detection of degraded Linear API states.
- Aligned all package versions to 0.7.20 across the monorepo.
- Circuit breaker for Linear API — New `CircuitBreaker` class in `@renseiai/plugin-linear` with closed→open→half-open state machine and exponential backoff. Detects auth errors (400/401/403), GraphQL `RATELIMITED` responses, and error message patterns. Integrated into `LinearAgentClient.withRetry()` — checks circuit before acquiring a rate limit token, so no quota is consumed when the circuit is open.
- Pluggable rate limiter & circuit breaker strategies — New `RateLimiterStrategy` and `CircuitBreakerStrategy` interfaces allow swapping in-memory defaults for Redis-backed implementations. `LinearAgentClient` accepts optional strategy overrides via config.
- Redis-backed shared rate limiter — New `RedisTokenBucket` in `@renseiai/agentfactory-server` uses atomic Lua scripts to share a single token bucket across all processes (dashboard, governor, agents). Key: `linear:rate-limit:{workspaceId}`.
- Redis-backed shared circuit breaker — New `RedisCircuitBreaker` in `@renseiai/agentfactory-server` shares circuit state across processes via Redis. Supports exponential backoff on reset timeout.
- Linear quota tracker — New `QuotaTracker` in `@renseiai/agentfactory-server` reads and stores Linear's `X-RateLimit-Requests-Remaining` and `X-RateLimit-Complexity-Remaining` headers in Redis for proactive throttling. Warns when quota drops below threshold.
- Centralized issue tracker proxy — New `POST /api/issue-tracker-proxy` endpoint in `@renseiai/agentfactory-nextjs` acts as a single gateway for all Linear API calls. Agents, governors, and CLI tools call this endpoint instead of Linear directly, centralizing rate limiting, circuit breaking, and OAuth token management. Includes a health endpoint at `GET /api/issue-tracker-proxy`.
- Platform-agnostic proxy types — New `IssueTrackerMethod`, `SerializedIssue`, `SerializedComment`, `ProxyRequest`, and `ProxyResponse` types in `@renseiai/plugin-linear` are Linear-agnostic, enabling future issue tracker backends without changing consumer code.
- Proxy client — New `ProxyIssueTrackerClient` in `@renseiai/plugin-linear` is a drop-in replacement that routes all calls through the dashboard proxy. Activated when the `AGENTFACTORY_API_URL` env var is set.
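The closed→open→half-open state machine with exponential backoff mentioned in the circuit breaker entry above can be sketched as below. This is not the `CircuitBreaker` class from `@renseiai/plugin-linear`; the class name, thresholds, timeouts, and method names are assumptions chosen to keep the example small, and time is passed in explicitly so the behavior is easy to follow.

```typescript
type CircuitState = "closed" | "open" | "half-open";

class SketchCircuitBreaker {
  private state: CircuitState = "closed";
  private failures = 0; // consecutive failures while closed
  private trips = 0; // consecutive times the circuit has opened
  private openedAt = 0;
  private readonly failureThreshold: number;
  private readonly baseResetMs: number;
  private readonly maxResetMs: number;

  // All three defaults are assumed values for illustration.
  constructor(failureThreshold = 3, baseResetMs = 1000, maxResetMs = 60000) {
    this.failureThreshold = failureThreshold;
    this.baseResetMs = baseResetMs;
    this.maxResetMs = maxResetMs;
  }

  // The reset timeout doubles each time the circuit re-opens, up to a cap.
  private resetTimeoutMs(): number {
    const exponent = Math.max(0, this.trips - 1);
    return Math.min(this.baseResetMs * 2 ** exponent, this.maxResetMs);
  }

  // Call before sending a request; false means "do not call the API".
  canProceed(now: number): boolean {
    if (this.state === "open" && now - this.openedAt >= this.resetTimeoutMs()) {
      this.state = "half-open"; // let a probe request through
    }
    return this.state !== "open";
  }

  recordSuccess(): void {
    this.state = "closed";
    this.failures = 0;
    this.trips = 0;
  }

  recordFailure(now: number): void {
    this.failures += 1;
    if (this.state === "half-open" || this.failures >= this.failureThreshold) {
      this.state = "open";
      this.openedAt = now;
      this.trips += 1;
      this.failures = 0;
    }
  }

  getState(): CircuitState {
    return this.state;
  }
}
```

Checking `canProceed` before taking a rate limit token mirrors the ordering the entry describes: when the circuit is open, the call is rejected without consuming quota.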
- Removed harmful OAuth fallback — The Linear client resolver no longer falls back to a personal API key when OAuth token lookup fails. Personal API keys cannot call Agent API endpoints (`createAgentActivity`, etc.), so the fallback guaranteed 400 errors that wasted rate limit quota.
- Workspace client caching — The Linear client resolver now caches workspace clients with a 5-minute TTL, so all requests within the dashboard process share one client (and one token bucket + circuit breaker) per workspace.
- Governor uses Redis strategies — When `REDIS_URL` is available, the governor injects `RedisTokenBucket` and `RedisCircuitBreaker` into its Linear clients for coordinated rate limiting across processes.
- Circuit breaker unit tests (23 tests) — state transitions, auth error detection, GraphQL RATELIMITED detection, exponential backoff, half-open probe, reset, diagnostics.
- Updated manifest route count (24→25) and create-app template parity for the new proxy route.
- Aligned all package versions to 0.7.19 across the monorepo.
- Governor uses OAuth tokens from Redis for Linear Agent API — The governor now resolves OAuth access tokens from Redis at startup and uses them for `createAgentSessionOnIssue`, fixing "Failed to post to Linear" errors caused by using a personal API key for the Agent API. Falls back to personal API key when no OAuth token is available.
- Governor stores `organizationId` on session state — Workers can now resolve the correct OAuth token for progress/activity posting since the workspace ID is persisted alongside the session.
- Governor includes `prompt` in queued work items — Agents receive work-type-specific prompts instead of generic fallbacks, improving agent behavior on pickup.
- Aligned all package versions to 0.7.18 across the monorepo.
- Governor skips sub-issues in Backlog — The decision engine now checks `parentId` and skips sub-issues, dispatching only top-level and parent issues. This prevents invisible backlog sprawl when sub-issues are in Backlog but their parent is still in Icebox.
- Backlog-writer creates sub-issues in Icebox — Sub-issues are now created with `--state "Icebox"` to match their parent. The user promotes the parent to Backlog when ready and sub-issues follow via Linear's built-in behavior. Independent issues still default to Backlog.
- Added decision engine tests for sub-issue skip behavior and precedence over `enableAutoDevelopment`.
- Aligned all package versions to 0.7.17 across the monorepo.
- Governor `GOVERNOR_PROJECTS` env var — The governor CLI now reads `GOVERNOR_PROJECTS` (comma-separated) as a fallback when no `--project` flags are provided. Aligns with the `worker` and `worker-fleet` CLIs, which already support env var defaults via `WORKER_PROJECTS`.
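The flag-then-env fallback described above might look like the sketch below. The `resolveProjects` helper is hypothetical, and trimming whitespace and dropping empty entries are assumptions about how a comma-separated value would reasonably be parsed; the changelog only states that the variable is comma-separated and used when no `--project` flags are given.

```typescript
// Hypothetical helper: explicit --project flags win; otherwise fall back
// to the comma-separated GOVERNOR_PROJECTS value from the environment.
function resolveProjects(
  cliProjects: string[],
  env: Record<string, string | undefined>,
): string[] {
  if (cliProjects.length > 0) return cliProjects;
  return (env["GOVERNOR_PROJECTS"] ?? "")
    .split(",")
    .map((name) => name.trim())
    .filter((name) => name.length > 0);
}
```

In a CLI entry point the second argument would typically be `process.env`; it is passed in here so the function stays easy to test.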
- Aligned all package versions to 0.7.16 across the monorepo.
- `af-sync-routes` CLI command — New command that auto-generates missing `route.ts` and `page.tsx` files in consumer projects after upgrading `@renseiai` packages. Reads from a central route manifest in `@renseiai/agentfactory` and creates only missing files (never overwrites). Use `--pages` to also sync dashboard pages, `--dry-run` to preview. Available as the `af-sync-routes` binary and the `@renseiai/agentfactory-cli/sync-routes` subpath export.
- Route manifest — New `ROUTE_MANIFEST` in `@renseiai/agentfactory` defines all 24 API routes and 5 dashboard pages as structured data. Includes `generateRouteContent()` and `generatePageContent()` generators that produce output identical to the `create-app` templates.
- Manifest unit tests (14 tests) — validates entry counts, path patterns, method accessors, and content generation.
- Manifest / create-app parity tests (6 tests) — ensures the manifest stays in sync with `create-app` templates.
- Sync routes runner tests (8 tests) — file creation, no-overwrite safety, dry-run, error handling, and page sync.
- Excluded `__tests__` directories from tsc build in core and cli packages.
- Aligned all package versions to 0.7.15 across the monorepo.
- Linear API rate limiter — New `TokenBucket` rate limiter in `@renseiai/plugin-linear` proactively throttles requests below Linear's ~100 req/min limit. Uses a token bucket algorithm (80 burst capacity, 1.5 tokens/sec refill). All `LinearAgentClient` API calls now pass through the rate limiter with automatic backpressure. Includes `Retry-After` header parsing for 429 responses.
- Auto-detect app URL on Vercel — `getAppUrl()` and the OAuth callback in `@renseiai/agentfactory-nextjs` now fall back to `VERCEL_PROJECT_PRODUCTION_URL` / `VERCEL_URL` when `NEXT_PUBLIC_APP_URL` is not set. Removes the need to manually configure the app URL on Vercel deployments.
- One-click deploy buttons — Vercel and Railway deploy buttons are now fully functional across all READMEs. Vercel deploys from the monorepo subdirectory (`agentfactory/tree/main/templates/dashboard`), eliminating the need for a separate template repository. Railway deploy uses a published template with bundled Redis.
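For readers unfamiliar with the token bucket algorithm named in the rate limiter entry above, here is a minimal sketch using the stated parameters (80 burst capacity, 1.5 tokens/sec refill). It is not the `TokenBucket` class from `@renseiai/plugin-linear`: the class name, method names, and the explicit `now` parameter are assumptions for illustration.

```typescript
class SketchTokenBucket {
  private tokens: number;
  private lastRefill: number;
  private readonly capacity: number;
  private readonly refillPerSec: number;

  constructor(capacity: number, refillPerSec: number, now: number) {
    this.capacity = capacity;
    this.refillPerSec = refillPerSec;
    this.tokens = capacity; // start full so an initial burst is allowed
    this.lastRefill = now;
  }

  // Add tokens for the elapsed time, never exceeding the burst capacity.
  private refill(now: number): void {
    const elapsedSec = Math.max(0, now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSec);
    this.lastRefill = now;
  }

  // Take one token if available; callers that get false should wait.
  tryAcquire(now: number): boolean {
    this.refill(now);
    if (this.tokens < 1) return false;
    this.tokens -= 1;
    return true;
  }

  // Milliseconds until at least one token is available (0 if one already is).
  msUntilNextToken(now: number): number {
    this.refill(now);
    if (this.tokens >= 1) return 0;
    return Math.ceil(((1 - this.tokens) / this.refillPerSec) * 1000);
  }
}

// Parameters from the changelog entry: 80 burst capacity, 1.5 tokens/sec.
const bucket = new SketchTokenBucket(80, 1.5, 0);
```

At 1.5 tokens per second the sustained rate is 90 requests per minute, which stays under the roughly 100 req/min limit the entry mentions while still permitting short bursts.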
- Dashboard template Tailwind v4 build — The dashboard template imported the `@renseiai/agentfactory-dashboard` v3 stylesheet, which used `@apply border-border`, failing under Tailwind v4. Converted all dashboard styles to v4-native `@theme` declarations and removed the v3 import.
- `vercel.json` schema validation — Removed invalid `env` block that used object format (`{description, required}`) instead of Vercel's expected string values. Env vars are prompted via the deploy button URL parameter instead.
- Retry improvements — Retry logic now handles `429 Too Many Requests` with `Retry-After` header parsing, and distinguishes retryable status codes (429, 500, 502, 503, 504) from non-retryable ones.
- Governor dependencies cleanup — Simplified `governor-dependencies.ts` in the CLI package, removing redundant wiring.
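The retry behavior in the "Retry improvements" entry above combines two checks: whether a status code is worth retrying, and how long the server asked the client to wait. The sketch below shows one way to express both. The function names are hypothetical; the list of retryable codes comes from the entry, and the two accepted `Retry-After` forms (delay in seconds, or an HTTP date) follow the HTTP specification.

```typescript
// Retryable status codes listed in the changelog entry.
const RETRYABLE_STATUS = new Set<number>([429, 500, 502, 503, 504]);

function isRetryableStatus(status: number): boolean {
  return RETRYABLE_STATUS.has(status);
}

// Retry-After is either a non-negative number of seconds or an HTTP date.
// Returns the delay in milliseconds, or undefined if the header is absent or unparseable.
function parseRetryAfterMs(
  header: string | null | undefined,
  now: number,
): number | undefined {
  if (!header) return undefined;
  const trimmed = header.trim();
  if (/^\d+$/.test(trimmed)) return Number(trimmed) * 1000;
  const date = Date.parse(trimmed);
  if (Number.isNaN(date)) return undefined;
  return Math.max(0, date - now); // a date in the past means "retry now"
}
```

A caller would typically use the parsed delay when present and fall back to its own backoff schedule otherwise.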
- Workflow Governor documentation — Added comprehensive Governor docs across all packages.
- Updated dashboard template dependency versions to 0.7.14.
- Added `next-env.d.ts` to template `.gitignore`.
- Aligned all package versions to 0.7.14 across the monorepo.
- Terminal state guard on AgentSession — `AgentSession.start()` and `AgentSession.complete()` now check the issue's current status before auto-transitioning. If the issue is in a terminal state (Accepted, Canceled, Duplicate), the transition is refused with a warning. This prevents stale or rogue sessions from pulling completed issues back into active workflow states.
- Governor dispatch loop — `dispatchWork()` now calls `storeSessionState()` before `queueWork()`, so `hasActiveSession()` correctly returns true on subsequent poll sweeps. Previously, the governor would re-dispatch the same issue every poll cycle because no session was registered in Redis.
- Worker claim failures — Workers could never claim governor-dispatched work because `claimSession()` requires a pending session in Redis. The session is now registered before the queue entry, eliminating the race window.
- Phase completion signal — When research or backlog-creation sessions complete, `markPhaseCompleted()` is now called in the session status handler. This prevents the governor from re-dispatching top-of-funnel work for the same issue after each poll sweep.
- Project name on dispatched work — Governor-dispatched queue items now include `projectName`, enabling correct worker routing across multi-project deployments.
- Icebox auto-triggers default to off — `enableAutoResearch` and `enableAutoBacklogCreation` now default to `false`. The Icebox is a space for ideation and iterative refinement via @ mentions; automated research/backlog-creation is opt-in via the `--auto-research` / `--auto-backlog-creation` CLI flags. The governor's default scope is Backlog → development, Finished → QA, Delivered → acceptance.
- Hybrid Event+Poll Governor — `EventDrivenGovernor` wraps the existing decision engine with a real-time event loop (via `GovernorEventBus`) plus a periodic poll safety net (default 5 min), both feeding through `EventDeduplicator` to avoid duplicate processing
- PlatformAdapter pattern — New `PlatformAdapter` interface in core with `LinearPlatformAdapter` implementation that normalizes Linear webhooks into `GovernorEvent`s and scans projects for non-terminal issues
- Webhook-to-Governor bridge — `governorMode` config (`direct` | `event-bridge` | `governor-only`) lets deployments opt in incrementally; webhooks publish status-change events to the bus alongside or instead of direct dispatch
- Real GovernorDependencies — CLI governor now wires all 10 dependency callbacks to Linear SDK + Redis (sessions, cooldowns, overrides, workflow state, phase tracking, work dispatch) with stub fallback when env vars aren't set
- Redis Streams event bus — `RedisEventBus` with consumer groups, MAXLEN trimming, and pending message re-delivery; `RedisEventDeduplicator` using SETNX with TTL
- `--mode event-driven|poll-only` CLI flag — `af-governor` now supports event-driven mode with `InMemoryEventBus` for single-process usage
- Aligned all package versions to 0.7.11 across the monorepo.
- Workflow Governor — Autonomous lifecycle management with human-in-the-loop. Replaces manual @mention-driven triggers with a deterministic state machine that surfaces high-value decision points for human input. Includes:
  - WorkSchedulingFrontend interface with AbstractStatus (8 statuses, 16 methods) for frontend-agnostic work scheduling
  - LinearFrontendAdapter wrapping LinearAgentClient through the abstract interface
  - Governor scan loop — Polling-based scheduler with configurable intervals, decision engine, and priority-aware dispatch
  - Human touchpoint system — Structured override directives (HOLD, RESUME, SKIP QA, DECOMPOSE, REASSIGN, PRIORITY) with configurable timeouts
  - Auto top-of-funnel — Automatic research and backlog-creation for Icebox issues based on description quality heuristics
  - Strategy-aware template selection — Escalation-driven template resolution (normal → context-enriched → decompose → escalate-human)
  - `af-governor` CLI — New binary entry point for running the Governor scan loop
  - PM curation workflow documentation — Docs for PM interaction with the top-of-funnel pipeline
- Workflow state machine and QA loop circuit breaker — Added `WorkflowState` tracking and escalation ladder for QA cycles
- YAML-based workflow template system — Agent prompts driven by YAML templates with Handlebars interpolation, overridable per project
- Aligned all package versions to 0.7.10 across the monorepo.
- `LINEAR_TEAM_NAME` env var for CLI — `pnpm af-linear create-issue` now falls back to the `LINEAR_TEAM_NAME` environment variable when `--team` is omitted. The orchestrator auto-sets this from the issue's team context, so agents no longer waste turns discovering the team name. Explicit `--team` always wins. Added `getDefaultTeamName()` to `@renseiai/plugin-linear` constants.
- Server-level project filtering — `@renseiai/agentfactory-server` supports project filtering at the server level for multi-project deployments.
- Improved WORK_RESULT marker handling — QA and acceptance agent prompts now have better `<!-- WORK_RESULT:passed/failed -->` marker instructions and handling.
- `qa-coordination` and `acceptance-coordination` priority order — Fixed work type priority ordering to include `qa-coordination` and `acceptance-coordination` in the correct position.
- Aligned all package versions to 0.7.9 across the monorepo.
- Standardize Linear CLI command references — Replaced all `pnpm linear` references with `pnpm af-linear` across documentation, agent definitions, orchestrator prompts, and templates. The `af-linear` binary name is the canonical invocation since v0.7.6; the old `pnpm linear` script alias was an internal-only convenience that didn't work in consumer projects or worktrees.
- Renamed root `package.json` script from `linear` to `af-linear` for consistency.
- Aligned all package versions to 0.7.8 across the monorepo.
- Fix autonomous agent permissions in worktrees — Agents spawned by the orchestrator in git worktrees were unable to run `pnpm af-linear` and other Bash commands because they received unanswerable permission prompts in headless mode. Two compounding issues:
  - Wrong `allowedTools` pattern format — `Bash(pnpm *)` (space) doesn't match; Claude Code uses `Bash(prefix:glob)` syntax with a colon separator. Fixed to `Bash(pnpm:*)`.
  - Filesystem hooks unreliable in worktrees — The auto-approve hook (`.claude/hooks/auto-approve.js`) loaded via `settingSources: ['project']` may not resolve correctly when `.git` is a file (worktree) instead of a directory. Added a programmatic `canUseTool` callback as a reliable in-process fallback that doesn't depend on filesystem hook resolution.
- `create-blocker` command — New `af-linear create-blocker` / `pnpm af-linear create-blocker` command for creating human-needed blocker issues that block the source issue.
- Expanded `allowedTools` coverage — Autonomous agents now pre-approve `npm`, `tsx`, `python3`, `python`, `curl`, `turbo`, `tsc`, `vitest`, `jest`, and `claude` in addition to the existing `pnpm`, `git`, `gh`, `node`, `npx`.
- WORK_RESULT marker instruction — QA and acceptance agent prompts now include explicit instructions for the `<!-- WORK_RESULT:passed/failed -->` marker.
- Aligned all package versions to 0.7.7 across the monorepo.
- `af-linear` CLI — Promoted the Linear CLI to a published binary in `@renseiai/agentfactory-cli`. All 15 commands (`get-issue`, `create-issue`, `update-issue`, `list-comments`, `create-comment`, `list-backlog-issues`, `list-unblocked-backlog`, `check-blocked`, `add-relation`, `list-relations`, `remove-relation`, `list-sub-issues`, `list-sub-issue-statuses`, `update-sub-issue`, `check-deployment`) are now available via `npx af-linear` or `pnpm af-linear` after installing `@renseiai/agentfactory-cli`. Previously, the Linear CLI only existed as an internal script in `packages/core/` and consumers had to bundle their own copy.
- `@renseiai/agentfactory-cli/linear` subpath export — `runLinear()` and `parseLinearArgs()` are available as a programmatic API for building custom CLI wrappers.
- `create-agentfactory-app` improvements — Scaffolded projects now include `pnpm af-linear` out of the box (via `af-linear`), a `.claude/CLAUDE.md` with Linear CLI reference, and an enhanced developer agent definition with Linear status update workflows.
- `@renseiai/agentfactory-cli` is now a required dependency for all scaffolded projects (not just when `includeCli` is selected).
- Deprecated `packages/core/src/linear-cli.ts` in favor of the CLI package.
- Aligned all package versions to 0.7.6 across the monorepo.
- Fix Redis WRONGTYPE error in expired lock cleanup — `cleanupExpiredLocksWithPendingWork()` scanned `issue:pending:*`, which matched both sorted sets (`issue:pending:{id}`) and hashes (`issue:pending:items:{id}`). Running `ZCARD` on a hash key caused a recurring `WRONGTYPE` error in production. Added the same colon-guard already present in `cleanupStaleLocksWithIdleWorkers`.
- Aligned all package versions to 0.7.5 across the monorepo.
- Auto-allow bash commands for autonomous agents in worktrees — Agents spawned in git worktrees couldn't run `pnpm af-linear` or other bash commands because `settings.local.json` and the auto-approve hook weren't accessible from the worktree CWD. The Claude provider now passes `allowedTools` to the SDK so `pnpm`, `git`, `gh`, `node`, and `npx` commands are auto-approved for autonomous agents without relying on filesystem settings. Added optional `allowedTools` field to `AgentSpawnConfig` for custom overrides.
- Linear CLI loads `.env.local` credentials automatically — `pnpm af-linear` commands no longer require `LINEAR_API_KEY` to be exported in the shell. The CLI now loads `.env` and `.env.local` via dotenv at startup.
- Aligned all package versions to 0.7.4 across the monorepo.
- `create-agentfactory-app` scaffold overhaul — Fixed multiple issues that caused scaffolded projects to fail at build time or crash on deployment:
  - Edge Runtime middleware crash — Changed middleware import from `@renseiai/agentfactory-nextjs` (main barrel, pulls in Node.js-only deps like ioredis) to `@renseiai/agentfactory-nextjs/middleware` (Edge-compatible subpath). Without this fix, every Vercel deployment crashes with `MIDDLEWARE_INVOCATION_FAILED` / `charCodeAt` errors.
  - Tailwind v3 → v4 — Replaced deprecated Tailwind v3 setup (`tailwind.config.ts` + `postcss.config.js` + autoprefixer) with Tailwind v4 CSS-based config (`postcss.config.mjs` + `@tailwindcss/postcss`). Updated `globals.css` from `@tailwind` directives to `@import "tailwindcss"` + `@source`.
  - CLI orchestrator missing `linearApiKey` — `runOrchestrator()` requires `linearApiKey` but the scaffold omitted it, causing a TypeScript error.
  - CLI cleanup called `.catch()` on sync return — `runCleanup()` returns `CleanupResult` synchronously, not a Promise. The scaffold treated it as async.
  - CLI worker/worker-fleet missing signal handling — Added `AbortController` for graceful SIGINT/SIGTERM shutdown, environment variable fallbacks for `WORKER_CAPACITY`/`WORKER_PROJECTS`/`WORKER_FLEET_SIZE`, and full argument parsing.
  - Stale dependency versions — Bumped all `@renseiai/agentfactory-*` deps from `^0.4.0` to `^0.7.2`, Next.js from `^15.3.0` to `^16.1.0`.
  - Removed `/dashboard` from middleware matcher — The dashboard lives at `/`, not `/dashboard`.
- Aligned all package versions to 0.7.3 across the monorepo.
- Prevent catastrophic main worktree destruction — Fixed a critical bug where the stale worktree cleanup logic could `rm -rf` the main repository when a branch was checked out in the main working tree (e.g., via IDE). When the orchestrator encountered a branch conflict with the main tree, it incorrectly identified it as a "stale worktree" and attempted destructive cleanup, destroying all source files, `.git`, `.env.local`, and other untracked files.

  Three layered safety guards added to `tryCleanupConflictingWorktree()`:

  1. `isMainWorktree()` check — Detects the main working tree by verifying `.git` is a directory (not a worktree file), cross-checked with `git worktree list --porcelain`. Errs on the side of caution if undetermined.
  2. `isInsideWorktreesDir()` check — Ensures the conflict path is inside `.worktrees/` before any cleanup is attempted. Paths outside the worktrees directory are never touched.
  3. `"is a main working tree"` error guard — If `git worktree remove --force` itself reports the path is the main tree, the `rm -rf` fallback is now blocked instead of being used as escalation.

  Additionally, `removeWorktree()` in the CLI cleanup-runner now validates the target is not the main working tree before proceeding.
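The second guard above, refusing to touch anything outside `.worktrees/`, amounts to a path containment check. The sketch below shows one way to write such a check with Node's `path` module. It shares the name `isInsideWorktreesDir` with the guard in the entry, but the signature and implementation are assumptions; the real function may also resolve symlinks or consult git.

```typescript
import * as path from "node:path";

// Returns true only when `candidate` is strictly inside `<repoRoot>/.worktrees`.
// Note: this compares normalized paths and does not resolve symlinks.
function isInsideWorktreesDir(candidate: string, repoRoot: string): boolean {
  const worktreesDir = path.resolve(repoRoot, ".worktrees");
  const relative = path.relative(worktreesDir, path.resolve(candidate));
  if (relative === "") return false; // the .worktrees directory itself
  if (relative === ".." || relative.startsWith(".." + path.sep)) return false; // escapes upward
  return !path.isAbsolute(relative); // absolute result means a different root or drive
}
```

Using `path.relative` instead of a string prefix test avoids false positives such as `.worktrees-backup`, and normalizing first means a path like `.worktrees/../src` is correctly rejected.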
- Worktree dependency symlinks — Replaced `preInstallDependencies()` with `linkDependencies()` in the orchestrator. Symlinks `node_modules` from the main repo into worktrees instead of running `pnpm install`, making worktree setup near-instant vs 10+ minutes on cross-volume setups. Falls back to `pnpm install` if symlinking fails.
- Linear CLI restored — Ported the full Linear CLI entry point (`pnpm af-linear`) from the renseiai repo. Provides 16 subcommands (`get-issue`, `create-issue`, `update-issue`, `create-comment`, `list-comments`, `add-relation`, `list-relations`, `remove-relation`, `list-sub-issues`, `list-sub-issue-statuses`, `update-sub-issue`, `check-blocked`, `list-backlog-issues`, `list-unblocked-backlog`, `check-deployment`) wrapping `@renseiai/plugin-linear`. Runs via `node --import tsx` so it works in worktrees without a build step.
- CLAUDE.md project instructions — Added root-level `CLAUDE.md` with Linear CLI reference, autonomous mode detection, project structure, worktree lifecycle rules, and explicit prohibition of Linear MCP tools.
- Agent definitions — Added full agent definitions to `examples/agent-definitions/` for backlog-writer, developer, qa-reviewer, coordinator, and acceptance-handler. Each includes Linear CLI instructions and MCP prohibition.
- Orchestrator CLI guidance — All 10 work types in `generatePromptForWorkType()` now include explicit instructions to use `pnpm af-linear` instead of MCP tools.
- Linear DEFAULT_TEAM_ID — Resolved empty `DEFAULT_TEAM_ID` caused by ESM import hoisting.
- Dashboard clickable links — Styled issue identifiers as clickable links on fleet and pipeline pages.
- Added `worker-fleet` and `analyze-logs` root scripts.
- Added "Built with AgentFactory" badge system and README badge.
- Resized badges to 20px to match shields.io standard.
- Fleet-runner changes.
- Consolidated backlog-writer to `examples/`, gitignored `.claude/`.