feat(index): catalog sessions with repo context #13
tangletools
left a comment
✅ Auto-approved drewstone PR — 52011d91
This PR was opened by the trusted drewstone account.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.
tangletools · auto-approval · reason: drewstone_author · 2026-07-02T01:44:08Z
tangletools
left a comment
🟡 Value Audit — sound-with-nits
| Metric | Value |
|---|---|
| Verdict | sound-with-nits |
| Concerns | 4 (1 low, 3 weak-concern) |
| Heuristic | 0.0s |
| Duplication | 0.1s |
| Interrogation | 360.7s (2 bridge agents) |
| Total | 360.8s |
💰 Value — sound-with-nits
Adds a traces index command and SDK that builds a reusable JSON session catalog with aggregate totals and nearby context files, plus stronger repo/cwd resolution heuristics; it's a coherent, useful extension with only minor rough edges. Ship.
- What it does: Introduces a new `index` command (src/cli.ts:264-285) and `collectSessionIndex`/`buildSessionIndex` SDK surface (src/session-index.ts:293-298, src/session-index.ts:118-146) that scans sessions, builds one JSON catalog with one row per session (harness, session id, cwd, repo labels, time bounds, metrics, models, tools, loop/error signals), aggregate totals, and a sidecar `context` index of local…
- Goals it achieves: 1) Give operators and downstream tools a general, joinable session inventory rather than only a markdown report or per-row JSONL. 2) Fix incorrect per-repo grouping when harness transcript directories mangle dashes into slashes or omit cwd entirely. 3) Surface how the cwd was recovered so callers can trust or filter rows by provenance. 4) Anchor sessions to local agent docs and `.evolve` artifacts
- Assessment: Good change. It reuses the existing `scanSessions`/`buildPolicyEvidenceRecord` pipeline instead of inventing new metrics (src/session-index.ts:283-288), keeps the new module focused, and places the cwd/repo inference exactly where the codebase already centralizes that concern (repo.ts). The CLI follows the same pattern as `evidence`/`convert`/`analyze`. The only real cost is added complexity i…
- Better / existing approach: none; this is the right approach. I searched the repo for existing session catalogs or aggregate JSON producers (`collectPolicyEvidence`, `collectSessions`, `analyzeSpans`, `renderReport`, `toRuntimeStore`) and none emit a reusable per-session index with totals/context. The repo-resolution heuristics are pragmatic because harness transcripts do not reliably carry structured cwd metadata; centrali…
- Model: opencode/kimi-for-coding/k2p7
- Bridge attempts: 1
🎯 Usefulness — sound
A well-integrated session-catalog command plus a richer repo resolver that flows through the shared parse funnel to every command; reuses the evidence pattern rather than duplicating it.
- Integration: Fully reachable. The new `index` command is dispatched in main() (src/cli.ts:539) and cmdIndex (src/cli.ts:264) reuses the existing collectSessionRows→buildPolicyEvidenceRecord funnel. The richer `resolveSessionRepoAttrs` is wired into parseSession (src/session-source.ts:22), which is THE shared locate→parse path called by convert, evidence, index, and scanSessions (cli.ts:195,206,233,238; session…
- Fit with existing patterns: Excellent fit. cmdIndex mirrors cmdEvidence exactly (src/cli.ts:243 vs 264): same collectSessionRows filter, same buildPolicyEvidenceRecord-per-row, same --out/stdout dual path. session-index.ts wraps PolicyEvidenceRecord rows into an aggregate catalog with totals + a context index rather than reimplementing metric collection; it delegates to buildPolicyEvidenceRecord (session-index.ts:284) and o…
- Real-world viability: Holds up. The path-scanning resolver is bounded (MAX_SPAN_PATH_CANDIDATES=64, MAX_SPAN_TEXT_CHARS=200_000 at repo.ts), deduplicates candidates by path in a Map, and every fs/git op is wrapped fail-safe (pathStat returns null, readGit try/catch, resolveRepoAttrs never throws). Context walking (collectContextRoot) caps markdown walks at 100 files and only stats named files under .evolve/. The eviden…
- Model: opencode/zai-coding-plan/glm-5.2
- Bridge attempts: 1
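The bounded, fail-safe candidate handling described under "Real-world viability" can be pictured roughly as below. This is a minimal sketch, not the code in repo.ts: `safeStat` and `collectCandidates` are hypothetical names, and only the cap of 64 comes from the audit text.

```typescript
import { statSync, type Stats } from "node:fs";

// Cap mirrors the MAX_SPAN_PATH_CANDIDATES=64 value cited in the audit.
// The constant name here is a stand-in, not the one in repo.ts.
const MAX_CANDIDATES = 64;

// Fail-safe stat (hypothetical helper): any filesystem error, such as a
// missing path or a permission problem, becomes null instead of throwing.
function safeStat(path: string): Stats | null {
  try {
    return statSync(path);
  } catch {
    return null;
  }
}

// Deduplicate candidate paths in a Map keyed by path and stop at the cap,
// so a noisy transcript cannot trigger unbounded filesystem work.
function collectCandidates(
  paths: Iterable<string>,
  max: number = MAX_CANDIDATES,
): Map<string, Stats | null> {
  const seen = new Map<string, Stats | null>();
  for (const p of paths) {
    if (seen.size >= max) break;
    if (!seen.has(p)) seen.set(p, safeStat(p));
  }
  return seen;
}
```

Keeping the null result in the Map means a path that failed once is not stat-ed again within the same session.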
🔎 Heuristic Signals
🟡 Cruft: console debug added src/cli.ts
- ``console.log(`session index → ${path} (${index.totals.sessions} session rows)`)``
💰 Value Audit
🟡 Repo resolver scans every session's full span text even when cwd is already good [maintenance]
resolveSessionRepoAttrs always calls extractAbsolutePaths (repo.ts:266-287) across span names, status messages, and attributes, recursively parsing JSON strings, even when a usable ref-cwd exists. The scan is bounded (200k chars, 64 candidates at repo.ts:199-200), but it adds per-session overhead to analyze/evidence/convert/upload/index. Consider short-circuiting when the recorded cwd already resolves to a git repo.
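One way the suggested short-circuit could look is sketched below. It assumes the span scan can be passed in as a callback; `findGitRoot`, `resolveCwd`, and the `source` labels are hypothetical and do not reflect the actual signature of `resolveSessionRepoAttrs`.

```typescript
import { existsSync } from "node:fs";
import { dirname, join } from "node:path";

// Walk upward from `dir` looking for a `.git` entry (hypothetical helper).
function findGitRoot(dir: string): string | null {
  let current = dir;
  while (true) {
    if (existsSync(join(current, ".git"))) return current;
    const parent = dirname(current);
    if (parent === current) return null; // reached the filesystem root
    current = parent;
  }
}

interface CwdResolution {
  cwd: string | null;
  source: "recorded" | "inferred" | "none";
}

// Only pay for the span scan when the recorded cwd is missing or is not
// inside a git repo. `isRepo` is injectable so the logic is easy to test.
function resolveCwd(
  recordedCwd: string | undefined,
  inferFromSpans: () => string | null,
  isRepo: (dir: string) => boolean = (dir) => findGitRoot(dir) !== null,
): CwdResolution {
  if (recordedCwd && isRepo(recordedCwd)) {
    return { cwd: recordedCwd, source: "recorded" }; // scan skipped
  }
  const inferred = inferFromSpans();
  if (inferred !== null) return { cwd: inferred, source: "inferred" };
  // Nothing inferred: fall back to whatever was recorded, if anything.
  return { cwd: recordedCwd ?? null, source: recordedCwd ? "recorded" : "none" };
}
```

The trade-off is that a recorded cwd pointing at the wrong repo would no longer be corrected by span evidence, so the short-circuit only makes sense if the recorded cwd is trusted whenever it resolves.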
🟡 Context file discovery partially overlaps adoption's .evolve reading [duplication]
session-index.ts walks .evolve/skill-runs.jsonl (session-index.ts:243-246) for file metadata while adoption.ts already reads the same path for skill counts (adoption.ts:111-128). The goals differ (catalog vs. tally), but if more .evolve file kinds are added the two lists could drift. A shared context-root helper would keep them consistent.
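A shared helper along these lines would give both modules a single list to import. This is only a sketch: `EVOLVE_CONTEXT_FILES` and `evolveContextPaths` are hypothetical names, and `skill-runs.jsonl` is the only file kind named in this finding.

```typescript
import { posix } from "node:path";

// Hypothetical shared list: the one place where `.evolve` file kinds are named.
// Only skill-runs.jsonl is mentioned in the audit; others would be added here.
const EVOLVE_CONTEXT_FILES = ["skill-runs.jsonl"] as const;

type EvolveContextFile = (typeof EVOLVE_CONTEXT_FILES)[number];

// Map each known file kind to its path under a repo root. POSIX joins keep
// the output stable for illustration; real code may prefer the platform join.
function evolveContextPaths(repoRoot: string): Record<EvolveContextFile, string> {
  const out = {} as Record<EvolveContextFile, string>;
  for (const name of EVOLVE_CONTEXT_FILES) {
    out[name] = posix.join(repoRoot, ".evolve", name);
  }
  return out;
}
```

The catalog could stat these paths for metadata while the adoption tally reads their contents, with both driven by the same list.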
🟡 Absolute-path regex uses a hardcoded root allow-list [maintenance]
ABSOLUTE_PATH_RE at repo.ts:198 only matches paths under /home, /tmp, /Users, /work, etc. Sessions in other mount points (/data, /project, container bind mounts) won't be inferred and will fall back to the recorded cwd. The list will need ongoing maintenance as new environments appear.
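One possible way to ease that maintenance burden is to build the pattern from a configurable root list. The sketch below is an assumption-laden illustration: the default roots are just the four named in this finding (the real list is longer), and `buildAbsolutePathRe` and `extractPaths` are hypothetical names.

```typescript
// Roots named in the audit; the actual allow-list in repo.ts has more entries.
const DEFAULT_ROOTS = ["home", "tmp", "Users", "work"];

// Escape regex metacharacters so user-supplied roots are matched literally.
function escapeRegExp(s: string): string {
  return s.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Build a global regex matching "/<root>/..." up to whitespace or a quote.
// The lookbehind avoids matching a root that is really a nested segment,
// e.g. the "/home/..." inside "/data/home/...".
function buildAbsolutePathRe(extraRoots: string[] = []): RegExp {
  const roots = Array.from(new Set([...DEFAULT_ROOTS, ...extraRoots])).map(escapeRegExp);
  return new RegExp(`(?<![\\w./-])/(?:${roots.join("|")})/[^\\s"'\`]+`, "g");
}

// Return the unique matches in order of first appearance.
function extractPaths(text: string, re: RegExp): string[] {
  return Array.from(new Set(text.match(re) ?? []));
}
```

Extra roots could then come from an environment variable or a config file, so new mount points would not require a code change.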
What this audit checks
It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.
| Pass | What it asks |
|---|---|
| Heuristic | Vague title? Whitespace-only or cruft-bearing diff? (content signals only) |
| Duplication | Do added function/class names already exist elsewhere in the repo? |
| Value Audit | What does it do? What goal does it achieve? Is it good? Better architecture or already-exists? |
| Usefulness Audit | Does it integrate and fit? Will it hold up in real use and actually get used? |
Findings are concerns, not blocks — the human reviewer decides what to do with them.
Summary
Verification