Commit a718e48

Claude sessions pid matching (#48)
* feat(agent-manager): update session matching for Claude Code
* feat(agent-manager): implement new session matching for Claude Code
1 parent 8f48b96 commit a718e48

7 files changed

Lines changed: 829 additions & 9 deletions

Lines changed: 97 additions & 0 deletions
@@ -0,0 +1,97 @@
---
phase: design
title: System Design & Architecture
description: Define the technical architecture, components, and data models
---
6+
7+
# System Design & Architecture
8+
9+
## Architecture Overview
10+
11+
The change is localised to `ClaudeCodeAdapter`. The detection flow always attempts a PID-file lookup for every process first; only processes whose PID file cannot be found fall through to the existing legacy matching step.
12+
13+
```mermaid
flowchart TD
    A[detectAgents] --> B[listAgentProcesses - ps aux]
    B --> C[enrichProcesses - lsof + ps]
    C --> D[For each PID: try read ~/.claude/sessions/PID.json]
    D --> E{PID file found?}
    E -->|No| G[Add to legacy-fallback set]
    E -->|Yes| F{startedAt within 60s\nof proc.startTime?}
    F -->|No - stale| G
    F -->|Yes| H[Resolve JSONL path from sessionId + cwd]
    H --> I{JSONL exists?}
    I -->|No| G
    I -->|Yes| J[Direct match: process → session]
    G --> K[discoverSessions for fallback processes]
    K --> L[matchProcessesToSessions - existing algo]
    J --> M[Merge direct matches + legacy matches]
    L --> M
    M --> N[Read sessions and build AgentInfo]
```
32+
33+
## Data Models
34+
35+
### PID file schema (`~/.claude/sessions/<pid>.json`)
36+
```typescript
interface PidFileEntry {
  pid: number;
  sessionId: string;   // filename without .jsonl
  cwd: string;         // working directory when Claude started
  startedAt: number;   // epoch milliseconds
  kind: string;        // e.g. "interactive"; not used
  entrypoint: string;  // e.g. "cli"; not used
}
```
46+
47+
### New internal type: `DirectMatch`
48+
```typescript
interface DirectMatch {
  process: ProcessInfo;
  sessionFile: SessionFile; // reuse existing SessionFile shape
}
```
54+
55+
## Component Breakdown
56+
57+
### Modified: `ClaudeCodeAdapter`
58+
59+
**New private method**: `tryPidFileMatching(processes: ProcessInfo[]): { direct: DirectMatch[]; fallback: ProcessInfo[] }`
60+
- For each process, attempts to read `~/.claude/sessions/<pid>.json`.
61+
- If the file is absent or unreadable: process goes to `fallback`.
62+
- If the file is present:
63+
- Cross-checks `entry.startedAt` (epoch ms) against `proc.startTime.getTime()`; if delta > 60 s, file is stale → process goes to `fallback`.
64+
- Resolves the JSONL path: `~/.claude/projects/<encoded-cwd>/<sessionId>.jsonl` using the `cwd` from the PID file.
65+
- Verifies the JSONL exists; if missing: process goes to `fallback`.
66+
- If JSONL exists: process goes to `direct`.
67+
- There is **no upfront directory-existence check** — each PID is always tried individually. Missing files are handled per-process via try/catch.
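The decision chain in the bullets above can be sketched as a pure function. This is an illustration only, not the code from the commit: the `Deps` interface, `classifyProcess`, and the `Outcome` type are hypothetical names, and filesystem access is injected so the logic can be exercised without touching `~/.claude`. The trimmed `PidFileEntry` carries only the fields the decision reads.

```typescript
// Hypothetical sketch: filesystem access is injected so the branch logic stays pure.
interface PidFileEntry {
  pid: number;
  sessionId: string;
  cwd: string;
  startedAt: number; // epoch milliseconds
}

type Outcome =
  | { kind: 'direct'; jsonlPath: string }
  | { kind: 'fallback'; reason: 'no-pid-file' | 'stale' | 'no-jsonl' };

interface Deps {
  // Returns null when the PID file is absent, unreadable, or malformed.
  readEntry(pid: number): PidFileEntry | null;
  // Stand-in for the adapter's getProjectDir(); the cwd encoding is not shown here.
  projectDirFor(cwd: string): string;
  // Stand-in for fs.existsSync().
  exists(filePath: string): boolean;
}

const STALE_TOLERANCE_MS = 60_000;

function classifyProcess(pid: number, startTime: Date | undefined, deps: Deps): Outcome {
  const entry = deps.readEntry(pid);
  if (entry === null) {
    return { kind: 'fallback', reason: 'no-pid-file' };
  }

  // Stale-file guard; skipped when enrichment did not supply a start time.
  if (startTime && Math.abs(startTime.getTime() - entry.startedAt) > STALE_TOLERANCE_MS) {
    return { kind: 'fallback', reason: 'stale' };
  }

  // The path is built from the PID file's cwd, not the lsof-reported cwd.
  const jsonlPath = `${deps.projectDirFor(entry.cwd)}/${entry.sessionId}.jsonl`;
  return deps.exists(jsonlPath)
    ? { kind: 'direct', jsonlPath }
    : { kind: 'fallback', reason: 'no-jsonl' };
}
```

Returning a `reason` alongside `fallback` is an addition for readability; the design above only distinguishes `direct` from `fallback`.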
68+
69+
**Modified**: `detectAgents()`
70+
- Calls `tryPidFileMatching()` after enrichment.
71+
- Passes only `fallback` processes to the existing `discoverSessions()` + `matchProcessesToSessions()` pipeline.
72+
- Merges `direct` matches with legacy match results before building `AgentInfo` objects.
73+
74+
### Unchanged
75+
- `utils/process.ts` — process listing and enrichment unchanged.
76+
- `utils/session.ts` — session file discovery unchanged.
77+
- `utils/matching.ts` — matching algorithm unchanged.
78+
- All other adapters — untouched.
79+
80+
## Design Decisions
81+
82+
| Decision | Choice | Rationale |
|----------|--------|-----------|
| Where to do PID file lookup | Inside `ClaudeCodeAdapter` as a private method | Keeps the change isolated; other adapters don't need it |
| CWD source for JSONL path encoding | PID file's `cwd` field | PID file is authoritative; lsof cwd may differ (symlinks, etc.) |
| `startedAt` type | Epoch milliseconds (`number`) | Verified from real files; not an ISO string |
| Stale file guard | Cross-check `entry.startedAt` vs `proc.startTime` (60 s tolerance) | Catches PID reuse without false positives from normal startup delays |
| `enrichProcesses()` scope | Run on all processes before the split | `proc.startTime` is needed for the stale-file guard; batched call is cheap |
| Error handling for malformed PID files | Catch + fall back to legacy | Avoids crashing; older or corrupt files handled gracefully |
| Batching PID file reads | No batching (sequential per PID) | Files are tiny JSON; overhead is negligible |
| Reuse `SessionFile` shape for direct matches | Yes | Avoids new types; existing `readSession` and `buildAgentInfo` code works unchanged |
92+
93+
## Non-Functional Requirements
94+
95+
- **No performance regression**: PID file reads add at most one `fs.readFileSync` + `fs.existsSync` per process, which is negligible.
96+
- **Backward compatibility**: All existing behaviour is preserved when no PID files exist (older Claude Code installs). Each missing file falls through to the legacy algorithm per-process.
97+
- **No new external dependencies**.
Lines changed: 96 additions & 0 deletions
@@ -0,0 +1,96 @@
---
phase: implementation
title: Implementation Guide
description: Technical implementation notes, patterns, and code guidelines
---
6+
7+
# Implementation Guide
8+
9+
## Code Structure
10+
11+
All changes are in `packages/agent-manager/src/adapters/ClaudeCodeAdapter.ts`.
12+
13+
## Implementation Notes
14+
15+
### `tryPidFileMatching()`
16+
17+
No upfront directory check — each PID is always tried individually via try/catch.
18+
19+
```typescript
private tryPidFileMatching(processes: ProcessInfo[]): {
  direct: Array<{ process: ProcessInfo; sessionFile: SessionFile }>;
  fallback: ProcessInfo[];
} {
  const sessionsDir = path.join(os.homedir(), '.claude', 'sessions');
  const direct: Array<{ process: ProcessInfo; sessionFile: SessionFile }> = [];
  const fallback: ProcessInfo[] = [];

  for (const proc of processes) {
    const pidFilePath = path.join(sessionsDir, `${proc.pid}.json`);
    try {
      const raw = fs.readFileSync(pidFilePath, 'utf-8');
      const entry = JSON.parse(raw) as PidFileEntry;

      // Stale-file guard: reject if startedAt diverges from enriched proc.startTime by > 60 s
      if (proc.startTime) {
        const deltaMs = Math.abs(proc.startTime.getTime() - entry.startedAt);
        if (deltaMs > 60_000) {
          fallback.push(proc);
          continue;
        }
      }

      const projectDir = this.getProjectDir(entry.cwd);
      const jsonlPath = path.join(projectDir, `${entry.sessionId}.jsonl`);

      if (!fs.existsSync(jsonlPath)) {
        fallback.push(proc);
        continue;
      }

      const sessionFile: SessionFile = {
        sessionId: entry.sessionId,
        filePath: jsonlPath,
        projectDir,
        birthtimeMs: 0, // not used for direct matches
        resolvedCwd: entry.cwd,
      };
      direct.push({ process: proc, sessionFile });
    } catch {
      // PID file absent, unreadable, or malformed → fall back per-process
      fallback.push(proc);
    }
  }

  return { direct, fallback };
}
```
68+
69+
### `detectAgents()` changes
70+
71+
After `enrichProcesses(processes)`:
72+
73+
1. Call `tryPidFileMatching(processes)` → `{ direct, fallback }`.
74+
2. Run existing `discoverSessions(fallback)` + `matchProcessesToSessions(fallback, sessions)` only on `fallback`.
75+
3. Merge `direct` matches and `legacyMatches` into a single list before iterating to build `AgentInfo`.
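The three steps above can be sketched as follows. This is a minimal illustration under stated assumptions, not the actual `detectAgents()` body: `matchAll` and `Match` are hypothetical names, the function is generic so it does not depend on the real `ProcessInfo` or `SessionFile` types, and the legacy pipeline (`discoverSessions` + `matchProcessesToSessions`) is collapsed into a single injected callback.

```typescript
// Hypothetical sketch of the split-then-merge flow in detectAgents().
interface Match<P, S> {
  process: P;
  sessionFile: S;
}

function matchAll<P, S>(
  processes: P[],
  // Stand-in for tryPidFileMatching().
  tryPidFileMatching: (procs: P[]) => { direct: Match<P, S>[]; fallback: P[] },
  // Stand-in for discoverSessions() followed by matchProcessesToSessions().
  legacyMatch: (procs: P[]) => Match<P, S>[],
): Match<P, S>[] {
  // Step 1: split processes into direct matches and legacy candidates.
  const { direct, fallback } = tryPidFileMatching(processes);

  // Step 2: run the legacy pipeline only on the fallback set,
  // and skip it entirely when every process matched directly.
  const legacyMatches = fallback.length > 0 ? legacyMatch(fallback) : [];

  // Step 3: merge into one list for building AgentInfo objects.
  return [...direct, ...legacyMatches];
}
```

Skipping the legacy callback when `fallback` is empty mirrors the integration test expectation that the legacy functions are not called when all processes match directly.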
76+
77+
### `PidFileEntry` interface
78+
79+
Add near the top of `ClaudeCodeAdapter.ts`:
80+
81+
```typescript
interface PidFileEntry {
  pid: number;
  sessionId: string;
  cwd: string;
  startedAt: number; // epoch milliseconds
  kind: string;
  entrypoint: string;
}
```
91+
92+
## Error Handling
93+
94+
- Any `fs.readFileSync` failure (file not found, permission denied) → catch → push to fallback.
95+
- JSON parse failure → catch → push to fallback.
96+
- `fs.existsSync` on JSONL → false → push to fallback.
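One case the list above does not spell out is a file that parses as JSON but lacks the expected fields. The `as PidFileEntry` cast performs no runtime check, and a missing `startedAt` would make the delta `NaN`, which never exceeds `60_000`. A possible hardening, which is not part of this commit, is a small type guard. `PidFileCore`, `isPidFileCore`, and `parsePidFile` are hypothetical names, and only the fields the feature reads are checked.

```typescript
// Hypothetical hardening sketch: validate the parsed shape before trusting it.
interface PidFileCore {
  pid: number;
  sessionId: string;
  cwd: string;
  startedAt: number; // epoch milliseconds
}

function isPidFileCore(value: unknown): value is PidFileCore {
  if (typeof value !== 'object' || value === null) {
    return false;
  }
  const v = value as Record<string, unknown>;
  return (
    typeof v.pid === 'number' &&
    typeof v.sessionId === 'string' &&
    v.sessionId.length > 0 &&
    typeof v.cwd === 'string' &&
    v.cwd.length > 0 &&
    typeof v.startedAt === 'number' &&
    Number.isFinite(v.startedAt)
  );
}

// Returns null for malformed JSON and for JSON with the wrong shape,
// so the caller can push the process to fallback in both cases.
function parsePidFile(raw: string): PidFileCore | null {
  try {
    const parsed: unknown = JSON.parse(raw);
    return isPidFileCore(parsed) ? parsed : null;
  } catch {
    return null;
  }
}
```

With a helper like this, a wrong-shape file would take the same fallback path as a missing or unparseable one.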
Lines changed: 47 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,47 @@
---
phase: planning
title: Project Planning & Task Breakdown
description: Break down work into actionable tasks and estimate timeline
---
6+
7+
# Project Planning & Task Breakdown
8+
9+
## Milestones
10+
11+
- [x] Milestone 1: Implementation — `ClaudeCodeAdapter` updated with PID-file matching
12+
- [x] Milestone 2: Tests — unit tests for new code paths pass, existing tests remain green
13+
- [ ] Milestone 3: Review — code review complete, ready to merge
14+
15+
## Task Breakdown
16+
17+
### Phase 1: Implementation
18+
19+
- [x] Task 1.1: Add `tryPidFileMatching()` private method to `ClaudeCodeAdapter`
20+
- [x] Task 1.2: Integrate `tryPidFileMatching()` into `detectAgents()`
21+
- [x] Task 1.3: Define `PidFileEntry` and `DirectMatch` interfaces (internal to `ClaudeCodeAdapter.ts`)
22+
23+
### Phase 2: Tests
24+
25+
- [x] Task 2.1: Unit tests for `tryPidFileMatching()` — 8 cases covering all branches
26+
- [x] Task 2.2: Integration tests for `detectAgents()` — direct-only and mixed scenarios
27+
- [x] Task 2.3: All 156 tests pass (145 existing + 11 new)
28+
29+
### Phase 3: Cleanup & Review
30+
31+
- [x] Task 3.1: Run `npx ai-devkit@latest lint --feature claude-sessions-pid-matching`
32+
- [ ] Task 3.2: Code review
33+
34+
## Dependencies
35+
36+
- Tasks 1.2 and 1.3 depend on Task 1.1.
37+
- Task 2.1 depends on Task 1.1.
38+
- Task 2.2 depends on Tasks 1.2 + 1.3.
39+
- Task 2.3 can run in parallel with Task 2.1/2.2 as a sanity check.
40+
41+
## Risks & Mitigation
42+
43+
| Risk | Likelihood | Mitigation |
|------|-----------|------------|
| PID file `cwd` encoding differs from lsof cwd (e.g. symlinks) | Low | Use PID file cwd for encoding; document this as the authoritative source |
| `~/.claude/sessions/` path differs across Claude Code versions | Low | Derive path from `os.homedir()` same as existing `~/.claude/projects/` |
| Race condition: process exits between ps and PID file read | Very low | `fs.existsSync` + try-catch; treat as fallback |
Lines changed: 64 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,64 @@
---
phase: requirements
title: Requirements & Problem Understanding
description: Clarify the problem space, gather requirements, and define success criteria
---
6+
7+
# Requirements & Problem Understanding
8+
9+
## Problem Statement
10+
**What problem are we solving?**
11+
12+
- Newer versions of Claude Code write a file at `~/.claude/sessions/<pid>.json` for each running process. This file contains `{ pid, sessionId, cwd, startedAt }`, plus `kind` and `entrypoint` fields that this feature does not use.
13+
- The current Claude adapter in agent-manager matches processes to sessions by encoding the process CWD into a `~/.claude/projects/<encoded>/` directory path and then finding the closest JSONL session file by birthtime (within a 3-minute tolerance).
14+
- This birthtime-based heuristic can produce incorrect matches when multiple Claude processes share the same CWD, or when the session file birthtime diverges significantly from the process start time.
15+
- Users of the agent-manager CLI (`agent list`) may see stale, mismatched, or missing session data as a result.
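To see why a birthtime heuristic can mismatch, consider a deliberately naive sketch of "closest JSONL by birthtime within 3 minutes". This is not the existing `matchProcessesToSessions` algorithm, whose details are not shown in this document. `closestByBirthtime`, `SessionCandidate`, and all timestamps are hypothetical.

```typescript
// Naive illustration of the birthtime heuristic; all names and numbers are hypothetical.
interface SessionCandidate {
  sessionId: string;
  birthtimeMs: number;
}

const BIRTHTIME_TOLERANCE_MS = 3 * 60 * 1000; // the 3-minute tolerance described above

function closestByBirthtime(
  procStartMs: number,
  candidates: SessionCandidate[],
): SessionCandidate | undefined {
  let best: SessionCandidate | undefined;
  let bestDelta = Infinity;
  for (const candidate of candidates) {
    const delta = Math.abs(candidate.birthtimeMs - procStartMs);
    if (delta <= BIRTHTIME_TOLERANCE_MS && delta < bestDelta) {
      best = candidate;
      bestDelta = delta;
    }
  }
  return best;
}

// Two processes share one CWD, so both see the same candidate list.
// Process A started at t = 0 ms, process B at t = 20 000 ms.
const sessions: SessionCandidate[] = [
  { sessionId: 'session-a', birthtimeMs: 100_000 }, // belongs to A, but created late
  { sessionId: 'session-b', birthtimeMs: 30_000 },  // belongs to B
];

const forProcessA = closestByBirthtime(0, sessions);      // picks session-b (wrong)
const forProcessB = closestByBirthtime(20_000, sessions); // picks session-b
```

Both lookups land on `session-b` because timing alone cannot tell the two processes apart. A PID file removes the ambiguity by naming the `sessionId` directly.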
16+
17+
## Goals & Objectives
18+
**What do we want to achieve?**
19+
20+
- **Primary**: Use `~/.claude/sessions/<pid>.json` as the authoritative source for process-to-session mapping when the file exists for a given PID.
21+
- **Secondary**: Fall back to the existing CWD-encoding + birthtime heuristic for processes where no `~/.claude/sessions/<pid>.json` file is present (older Claude Code versions or sessions not yet written).
22+
- **Non-goals**:
23+
- Changing how session JSONL content is parsed or how status is determined.
24+
- Modifying any adapter other than `ClaudeCodeAdapter`.
25+
- Supporting Windows-specific paths (existing macOS/Linux conventions apply).
26+
27+
## User Stories & Use Cases
28+
**How will users interact with the solution?**
29+
30+
- As an agent-manager user, I want `agent list` to correctly associate each running Claude process with its active session, so that I see accurate status and message summaries.
31+
- As a developer running multiple Claude instances in the same directory, I want each instance to be matched to its own session (not mixed up), so the list output is unambiguous.
32+
33+
**Edge cases to consider:**
34+
- PID file exists but references a `sessionId` whose JSONL does not exist → fall back to legacy matching for that process.
35+
- PID file exists but `cwd` in the file differs from the process's actual CWD reported by `lsof` → trust the PID file's `sessionId` and `cwd` (it is authoritative).
36+
- Stale PID file (process exited, PID reused by a new Claude process) → cross-check `startedAt` (epoch ms) against `proc.startTime` from enrichment; if the delta exceeds 60 seconds, treat as stale and fall back to legacy matching for that process.
37+
- PID file absent for a given process (e.g. older Claude Code) → fall back to legacy matching for that process only. No directory-level check is needed; each PID is tried individually.
38+
- Multiple processes; only some have PID files → use PID files for those that have them, legacy matching for the rest.
39+
40+
## Success Criteria
41+
**How will we know when we're done?**
42+
43+
- `ClaudeCodeAdapter.detectAgents()` reads `~/.claude/sessions/<pid>.json` for each discovered PID and uses the `sessionId` from the file to locate the correct JSONL in `~/.claude/projects/`.
44+
- Processes without a matching PID file are matched via the existing legacy algorithm without regression.
45+
- All existing tests continue to pass.
46+
- New unit tests cover: PID-file happy path, PID-file missing JSONL fallback, directory absent, mixed (some PIDs have files, some don't).
47+
48+
## Constraints & Assumptions
49+
**What limitations do we need to work within?**
50+
51+
- `~/.claude/sessions/<pid>.json` schema (verified from real files):
52+
```json
{ "pid": 81665, "sessionId": "87ada2e7-...", "cwd": "/Users/...", "startedAt": 1774598167519, "kind": "interactive", "entrypoint": "cli" }
```
55+
- `startedAt` is **epoch milliseconds** (not an ISO string).
56+
- `kind` and `entrypoint` fields are present but not used by this feature.
57+
- The JSONL for a session lives at `~/.claude/projects/<encoded-cwd>/<sessionId>.jsonl` — the same location the legacy algorithm already discovers.
58+
- Reading individual small JSON files per PID is acceptable; no batching of the PID file reads is required (files are tiny).
59+
- `enrichProcesses()` continues to run on all processes (direct + fallback) before the PID-file split — the batched `lsof`/`ps` call is cheap and `proc.startTime` is needed for the stale-file guard.
60+
- The feature must remain backward-compatible with older Claude Code installs that do not write PID files.
61+
62+
## Questions & Open Items
63+
64+
- None — requirements are clear from the user's description and existing code analysis.
Lines changed: 52 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,52 @@
---
phase: testing
title: Testing Strategy
description: Define testing approach, test cases, and quality assurance
---
6+
7+
# Testing Strategy
8+
9+
## Test Coverage Goals
10+
11+
- 100% branch coverage of `tryPidFileMatching()`
12+
- `detectAgents()` integration paths for direct-match and fallback-only scenarios
13+
- No regression in existing tests
14+
15+
## Unit Tests
16+
17+
### `tryPidFileMatching()`
18+
19+
- [x] PID file present + JSONL exists + `startedAt` within 60 s of `proc.startTime` → process in `direct` with correct `sessionId` and `resolvedCwd`
20+
- [x] PID file present + JSONL missing → process in `fallback`
21+
- [x] PID file present but `startedAt` > 60 s from `proc.startTime` (stale/reused PID) → process in `fallback`
22+
- [x] `startedAt` 30 s from `proc.startTime` (inside the 60 s tolerance) → accepted as direct match
23+
- [x] PID file absent for a PID (file not found) → process in `fallback`, no crash
24+
- [x] PID file contains malformed JSON → process in `fallback` (no throw)
25+
- [x] Sessions dir entirely absent (no PID file for any process) → all processes in `fallback`, no crash
26+
- [x] Mixed: 2 PIDs with files, 1 without → correct split across `direct` and `fallback`
27+
- [x] `proc.startTime` is undefined (enrichment failed) → stale-file check skipped, proceed normally
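The tolerance cases in this list can be checked against a small pure function. The sketch below restates the guard from `tryPidFileMatching()` in isolation; `isStale` and the `base` timestamp are hypothetical. Because the guard uses a strict `>` comparison, a delta of exactly 60 000 ms is still accepted.

```typescript
// Hypothetical restatement of the stale-file guard as a pure function.
const STALE_TOLERANCE_MS = 60_000;

function isStale(procStartMs: number | undefined, startedAtMs: number): boolean {
  // When enrichment failed and no start time is known, the check is skipped.
  if (procStartMs === undefined) {
    return false;
  }
  return Math.abs(procStartMs - startedAtMs) > STALE_TOLERANCE_MS;
}

const base = 1_700_000_000_000; // arbitrary epoch milliseconds

const cases = [
  { label: '30 s apart', stale: isStale(base + 30_000, base) },            // false
  { label: 'exactly 60 s apart', stale: isStale(base + 60_000, base) },    // false
  { label: '60.001 s apart', stale: isStale(base + 60_001, base) },        // true
  { label: 'file newer by 61 s', stale: isStale(base, base + 61_000) },    // true
  { label: 'startTime undefined', stale: isStale(undefined, base) },       // false
];
```

The absolute difference means the guard rejects a PID file whether it is older or newer than the process start time.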
28+
29+
### `detectAgents()` integration
30+
31+
- [x] All direct matches: `discoverSessions` and `matchProcessesToSessions` not called
32+
- [x] Mixed: direct matches merged correctly with legacy matches in final `AgentInfo` list
33+
- [x] Direct match produces `AgentInfo` with correct `sessionId`
34+
- [x] Direct-matched JSONL becomes unreadable after existence check → process falls back to IDLE
35+
- [x] Legacy-matched JSONL becomes unreadable after match → process falls back to IDLE
36+
37+
## Test Data
38+
39+
Real `tmp` directories with JSON/JSONL fixtures. `jest.spyOn` used only for race-condition branches (lines 128, 141).
40+
41+
## Test Reporting & Coverage
42+
43+
Run: `cd packages/agent-manager && npm test -- --coverage --collectCoverageFrom='src/adapters/ClaudeCodeAdapter.ts'`
44+
45+
| Metric | Result |
|--------|--------|
| Statements | 98.73% |
| Branches | 89.79% |
| Functions | 100% |
| Lines | 99.35% |
51+
52+
**Remaining gap — line 314** (`return null` after `allLines.length === 0` in `readSession`): dead code. `''.trim().split('\n')` always returns `['']` (length ≥ 1), so this condition is structurally unreachable. No test can cover it without modifying the source.
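The claim can be checked directly. The snippet assumes, as the note implies, that `allLines` in `readSession` is produced by trimming the file content and splitting on newlines; the variable names below are illustrative.

```typescript
// String.prototype.split with a non-empty separator never returns an empty array.
const fromEmpty = ''.trim().split('\n');           // ['']
const fromWhitespace = '  \n  '.trim().split('\n'); // [''] because trim() leaves ''
const fromTwoLines = 'a\nb'.trim().split('\n');     // ['a', 'b']

// Every result has length >= 1, so a `length === 0` check can never be true.
const lengths = [fromEmpty.length, fromWhitespace.length, fromTwoLines.length];
```

If the early return is meant to catch empty files, a check such as `allLines[0] === ''` on a single-element result would be reachable, though changing the source is outside the scope of this note.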
