feat(index): catalog sessions with repo context by drewstone · Pull Request #13 · tangle-network/traces

drewstone · 2026-07-02T01:43:58Z

Summary

- Add a general traces session index JSON with per-session rows and aggregate totals.
- Improve repo/cwd resolution from repaired paths, span paths, and explicit tool workdirs.
- Include nearby local context files for joins: agent docs, .evolve JSONL files, reflections, and handoffs.

Verification

- `pnpm typecheck`
- `pnpm test` (78/78)
- `pnpm build`
- `node dist/cli.js index --harness codex --cwd /home/drew/code/traces --last 1 --out /tmp/traces-session-index-context-smoke.json`

tangletools

✅ Auto-approved drewstone PR — `52011d91`

This PR was opened by the trusted drewstone account.
The full PR reviewer audit still runs separately and will publish findings if it detects issues.

_{tangletools · auto-approval · reason: drewstone_author · 2026-07-02T01:44:08Z}

tangletools

🟡 Value Audit — sound-with-nits


| | |
|---|---|
| Verdict | sound-with-nits |
| Concerns | 4 (1 low, 3 weak-concern) |
| Heuristic | 0.0s |
| Duplication | 0.1s |
| Interrogation | 360.7s (2 bridge agents) |
| Total | 360.8s |

💰 Value — sound-with-nits

Adds a traces index command and SDK that builds a reusable JSON session catalog with aggregate totals and nearby context files, plus stronger repo/cwd resolution heuristics; it's a coherent, useful extension with only minor rough edges. Ship.
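To make the "one row per session plus aggregate totals" shape concrete, here is a minimal sketch. Only `totals.sessions` is visible in this PR (in the `console.log` line quoted under Heuristic Signals); `SessionRow`, `SessionIndex`, `buildIndexSketch`, and the other fields are hypothetical names for illustration and are not the actual `buildSessionIndex` API.

```typescript
// Hypothetical row shape: a small subset of the fields the audit lists.
interface SessionRow {
  harness: string;
  sessionId: string;
  cwd: string | null; // null when no working directory could be recovered
  toolCalls: number;
}

// Hypothetical catalog shape: the rows plus totals derived from them.
interface SessionIndex {
  rows: SessionRow[];
  totals: {
    sessions: number;
    toolCalls: number;
    byHarness: Record<string, number>;
  };
}

// Derive the totals in a single pass so they cannot drift from the rows.
function buildIndexSketch(rows: SessionRow[]): SessionIndex {
  const byHarness: Record<string, number> = {};
  let toolCalls = 0;
  for (const row of rows) {
    byHarness[row.harness] = (byHarness[row.harness] ?? 0) + 1;
    toolCalls += row.toolCalls;
  }
  return { rows, totals: { sessions: rows.length, toolCalls, byHarness } };
}
```

Keeping the totals derived from the rows, rather than tracked separately, is one way to keep the catalog joinable: a consumer can re-aggregate any filtered subset of rows and get consistent numbers.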

What it does: Introduces a new index command (src/cli.ts:264-285) and collectSessionIndex / buildSessionIndex SDK surface (src/session-index.ts:293-298, src/session-index.ts:118-146) that scans sessions, builds one JSON catalog with one row per session (harness, session id, cwd, repo labels, time bounds, metrics, models, tools, loop/error signals), aggregate totals, and a sidecar context index of local
Goals it achieves: 1) Give operators and downstream tools a general, joinable session inventory rather than only a markdown report or per-row JSONL. 2) Fix incorrect per-repo grouping when harness transcript directories mangle dashes into slashes or omit cwd entirely. 3) Surface how the cwd was recovered so callers can trust or filter rows by provenance. 4) Anchor sessions to local agent docs and .evolve artifacts
Assessment: Good change. It reuses the existing scanSessions / buildPolicyEvidenceRecord pipeline instead of inventing new metrics (src/session-index.ts:283-288), keeps the new module focused, and places the cwd/repo inference exactly where the codebase already centralizes that concern (repo.ts). The CLI follows the same pattern as evidence/convert/analyze. The only real cost is added complexity i
Better / existing approach: none — this is the right approach. I searched the repo for existing session catalogs or aggregate JSON producers (collectPolicyEvidence, collectSessions, analyzeSpans, renderReport, toRuntimeStore) and none emit a reusable per-session index with totals/context. The repo-resolution heuristics are pragmatic because harness transcripts do not reliably carry structured cwd metadata; centrali
Model: opencode/kimi-for-coding/k2p7
Bridge attempts: 1

🎯 Usefulness — sound

A well-integrated session-catalog command plus a richer repo resolver that flows through the shared parse funnel to every command; reuses the evidence pattern rather than duplicating it.

Integration: Fully reachable. The new index command is dispatched in main() (src/cli.ts:539) and cmdIndex (src/cli.ts:264) reuses the existing collectSessionRows→buildPolicyEvidenceRecord funnel. The richer resolveSessionRepoAttrs is wired into parseSession (src/session-source.ts:22), which is THE shared locate→parse path called by convert, evidence, index, and scanSessions (cli.ts:195,206,233,238; session
Fit with existing patterns: Excellent fit. cmdIndex mirrors cmdEvidence exactly (src/cli.ts:243 vs 264): same collectSessionRows filter, same buildPolicyEvidenceRecord-per-row, same --out/stdout dual path. session-index.ts wraps PolicyEvidenceRecord rows into an aggregate catalog with totals + a context index rather than reimplementing metric collection — it delegates to buildPolicyEvidenceRecord (session-index.ts:284) and o
Real-world viability: Holds up. The path-scanning resolver is bounded (MAX_SPAN_PATH_CANDIDATES=64, MAX_SPAN_TEXT_CHARS=200_000 at repo.ts), deduplicates candidates by path in a Map, and every fs/git op is wrapped fail-safe (pathStat returns null, readGit try/catch, resolveRepoAttrs never throws). Context walking (collectContextRoot) caps markdown walks at 100 files and only stats named files under .evolve/. The eviden
Model: opencode/zai-coding-plan/glm-5.2
Bridge attempts: 1
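The bounding described under "Real-world viability" (a character budget, a candidate cap, and deduplication by path in a Map) can be sketched as follows. The two limit values are the ones quoted from repo.ts; `collectCandidates`, `PathCandidate`, the `hits` counter, and the exact way the budget is spent across span texts are assumptions, not the actual implementation.

```typescript
// Limit values as quoted in the audit; how repo.ts applies them is assumed here.
const MAX_SPAN_PATH_CANDIDATES = 64;
const MAX_SPAN_TEXT_CHARS = 200_000;

interface PathCandidate {
  path: string;
  hits: number; // how many times the path was seen (hypothetical field)
}

function collectCandidates(
  texts: string[],
  extract: (text: string) => string[],
  maxCandidates: number = MAX_SPAN_PATH_CANDIDATES,
  maxChars: number = MAX_SPAN_TEXT_CHARS,
): PathCandidate[] {
  const byPath = new Map<string, PathCandidate>();
  let budget = maxChars;
  for (const text of texts) {
    if (budget <= 0) break; // character budget spent: stop scanning
    const slice = text.slice(0, budget);
    budget -= slice.length;
    for (const path of extract(slice)) {
      const seen = byPath.get(path);
      if (seen) {
        seen.hits += 1; // duplicate path: count it, do not add a new entry
        continue;
      }
      if (byPath.size >= maxCandidates) continue; // cap reached: ignore new paths
      byPath.set(path, { path, hits: 1 });
    }
  }
  return Array.from(byPath.values());
}
```

With both limits in place, the work per session is bounded regardless of how large a transcript is.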

🔎 Heuristic Signals

🟡 Cruft: console debug added src/cli.ts

console.log(`session index → ${path} (${index.totals.sessions} session rows)`)

💰 Value Audit

🟡 Repo resolver scans every session's full span text even when cwd is already good [maintenance]

resolveSessionRepoAttrs always calls extractAbsolutePaths (repo.ts:266-287) across span names, status messages, and attributes, recursively parsing JSON strings, even when a usable ref-cwd exists. The scan is bounded (200k chars, 64 candidates at repo.ts:199-200), but it adds per-session overhead to analyze/evidence/convert/upload/index. Consider short-circuiting when the recorded cwd already resolves to a git repo.
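The suggested short-circuit could look roughly like the sketch below. Every name here (`RepoAttrs`, `ResolverDeps`, `resolveWithShortCircuit`, the `source` labels) is hypothetical and the real signatures in repo.ts may differ; the git check and the span scan are injected so the example stays self-contained.

```typescript
// Hypothetical result shape; the real attributes in repo.ts may differ.
interface RepoAttrs {
  cwd: string;
  source: "recorded-cwd" | "span-path";
}

interface ResolverDeps {
  // Returns true when the path sits inside a git work tree.
  isGitRepo: (path: string) => boolean;
  // The expensive scan over span names, status messages, and attributes.
  scanSpanPaths: () => string[];
}

function resolveWithShortCircuit(
  recordedCwd: string | undefined,
  deps: ResolverDeps,
): RepoAttrs | null {
  // Fast path: a recorded cwd that already resolves to a repo skips the scan.
  if (recordedCwd && deps.isGitRepo(recordedCwd)) {
    return { cwd: recordedCwd, source: "recorded-cwd" };
  }
  // Slow path: fall back to the bounded span scan and take the first repo hit.
  for (const candidate of deps.scanSpanPaths()) {
    if (deps.isGitRepo(candidate)) {
      return { cwd: candidate, source: "span-path" };
    }
  }
  return null;
}
```

Because `scanSpanPaths` is only invoked on the slow path, sessions with a good recorded cwd pay for one git check and nothing else.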

🟡 Context file discovery partially overlaps adoption's .evolve reading [duplication]

session-index.ts walks .evolve/skill-runs.jsonl (session-index.ts:243-246) for file metadata while adoption.ts already reads the same path for skill counts (adoption.ts:111-128). The goals differ (catalog vs. tally), but if more .evolve file kinds are added the two lists could drift. A shared context-root helper would keep them consistent.
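One possible shape for the shared helper is a single registry of `.evolve` file kinds that both modules read from. This is a sketch under assumptions: `EVOLVE_FILES`, `ContextFileRef`, and `contextFilePaths` are hypothetical names, and only `skill-runs.jsonl` is taken from the finding above.

```typescript
// Hypothetical single source of truth for .evolve file kinds.
// Only skill-runs.jsonl is named in the audit; add further kinds here.
const EVOLVE_FILES = [
  { kind: "skill-runs", relPath: ".evolve/skill-runs.jsonl" },
] as const;

interface ContextFileRef {
  kind: string;
  path: string;
}

// Resolve every known context file under a root, tolerating trailing slashes.
function contextFilePaths(root: string): ContextFileRef[] {
  const base = root.replace(/\/+$/, "");
  return EVOLVE_FILES.map((file) => ({
    kind: file.kind,
    path: `${base}/${file.relPath}`,
  }));
}
```

The catalog code would stat these paths for metadata while the adoption code would read them for counts, so a newly added kind shows up in both places at once.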

🟡 Absolute-path regex uses a hardcoded root allow-list [maintenance]

ABSOLUTE_PATH_RE at repo.ts:198 only matches paths under /home, /tmp, /Users, /work, etc. Sessions in other mount points (/data, /project, container bind mounts) won't be inferred and will fall back to the recorded cwd. The list will need ongoing maintenance as new environments appear.
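One way to reduce that maintenance burden is to build the pattern from a configurable root list. The sketch below is illustrative only: the default roots are just the four named in this finding (the full list in repo.ts is not shown), and `buildAbsolutePathRe`, `extractPaths`, and the `extraRoots` parameter are hypothetical.

```typescript
// Roots named in the audit finding; the actual list in repo.ts is longer.
const DEFAULT_ROOTS = ["home", "tmp", "Users", "work"];

// Escape regex metacharacters so a root such as "mnt.data" matches literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

function buildAbsolutePathRe(extraRoots: string[] = []): RegExp {
  const roots = Array.from(new Set([...DEFAULT_ROOTS, ...extraRoots])).map(escapeRegExp);
  // The lookbehind rejects matches that start mid-path (e.g. "/var/home/x").
  // After "/<root>/" the path continues with one or more path-safe characters.
  return new RegExp(`(?<![\\w./-])/(?:${roots.join("|")})/[\\w./@+-]+`, "g");
}

function extractPaths(text: string, extraRoots: string[] = []): string[] {
  return text.match(buildAbsolutePathRe(extraRoots)) ?? [];
}
```

A caller in a container with a `/data` bind mount could then pass `["data"]` (for example from a config file or flag, both assumptions here) instead of waiting for the built-in list to be extended.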

What this audit checks

It judges the change on its merits — not whether it was tasked out in an issue. Unticketed, fast-moving work is fine; the question is whether the change is good and whether a better or existing approach should be used instead.

| Pass | What it asks |
|---|---|
| Heuristic | Vague title? Whitespace-only or cruft-bearing diff? (content signals only) |
| Duplication | Do added function/class names already exist elsewhere in the repo? |
| Value Audit | What does it do? What goal does it achieve? Is it good? Better architecture or already-exists? |
| Usefulness Audit | Does it integrate and fit? Will it hold up in real use and actually get used? |

Findings are concerns, not blocks — the human reviewer decides what to do with them.

_{value-audit · 20260702T020637Z}

feat(index): catalog sessions with repo context

52011d9

tangletools approved these changes Jul 2, 2026


tangletools reviewed Jul 2, 2026


drewstone merged commit 8cbe6e5 into main Jul 2, 2026
1 check passed

drewstone merged 1 commit into main from feat/session-index-context
