
feat(codedb): agent discoverability — banner framing, ergonomics, verb wrappers #664

Open
rupakm wants to merge 8 commits into main from feature/codedb-discoverability-pr


rupakm commented Jun 15, 2026

Contributor

Summary

Three batches of changes that make ox code actually reachable by AI coworkers (Claude Code, Cursor, Codex, Windsurf, etc.). Driven by the observation that even though ox code exists and would help agents understand the codebase, agents almost never reach for it — defaulting to Grep/Glob/Read chains instead.

The investigation report (docs/specs/codedb-agent-discoverability.md) is included in this PR for the full "why." TL;DR:

  1. Framing was prescriptive, not demonstrative. The old banner told agents to "PREFER ox code search over Grep" without showing what ox code can do that grep cannot. Pre-trained tool-use instincts (grep/rg/find) override soft PREFER directives.
  2. The DSL — type:, calls:, calledby:, confidence:, before:/after:, /regex/ — is the unique value, and it was surfaced nowhere the agent could see. Not in --help (zero examples, two flags), not in skill files, not in .claude/rules/. ADR-019's call-graph capability sat behind DSL filters with no verb-mode wrapper.
  3. Output trained the agent off the tool. 120-char snippet truncation taught agents on first call that "ox returns less than grep." Hard errors on "code index is currently being built" taught them to abandon the tool. Stale-index anxiety was unaddressed.

What changed

Batch 1 — Framing (3 commits, instruction-only)

  • cmd/ox/agent_prime_xml.go: <code-search> banner rewritten from prescription to demonstration. Now lists 9 concrete DSL invocations showing what grep cannot do (calledby:authenticate, calls:Handler depth:2, type:pr, type:comment ckind:todo, etc.) and the full keyword inventory. Restricts Grep/Glob to "exact-string-in-known-file" and "0-result fallback." Lives in the static prefix-cache region so the +330 token cost is paid once per session.
  • cmd/ox/code.go: codeCmd.Long and codeSearchCmd.Long carry full DSL grammar + 7 example queries spanning symbol / call-graph / PR / comment / history / regex intents. Previously --help showed two flags and zero examples.
  • cmd/ox/code_activity.go, code_prs.go: Short strings reworded from pipeline-internal language ("for the fact extractor") to agent-visible value props.
  • internal/prime/guidance.go: <commands> table expanded from 2 ox-code rows to 6, surfacing calls/calledby, type:pr/type:issue, prs, activity, insights as distinct intents.
  • cmd/ox/agent_prime.go: CodeSearchTip rewording aligned with new framing.
  • .claude/rules/ox-code.md (new) — decision tree + DSL cheatsheet + anti-patterns + fallback policy + index-health notes.
  • claude-plugin/skills/ox/SKILL.md, extensions/claude/commands/ox.md — "Search Code, History, PRs" sections added. Previously silent on ox code.

Batch 2 — Ergonomics (1 commit, small CLI surface changes)

  • Snippet width: default 120 → 200 chars, plus --snippet N for override. 120 was cutting mid-arg-list on Go signatures. --full-json description updated from ~6x to ~4x tokens.
  • Structured index-not-ready response: ox code search|insights|prs|activity now emit {"status":"indexing"|"not_indexed", "fallback_hint":"..."} JSON on stdout (exit 0) when invoked by an agent (detectAgentContext() true). Humans still get the human-readable error. Agents branch on status instead of abandoning the tool on stderr.
  • JIT DSL hint: when the agent issues a bare single-term query (no DSL filters, no OR, no /regex/) and non-zero results return, append a one-line hint pointing at type:symbol, calledby:/calls:, type:pr/type:issue, before:/after:. Just-in-time discovery without bloating responses that already use the DSL.
  • Stderr latency line: every ox code search now writes codedb: N results in Tms (dirty overlays: K) to stderr. Calibrates agents on latency without per-call context cost. Suppressed by --quiet.
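The structured index-not-ready response could be produced and consumed roughly as follows. This is a minimal sketch, not the PR's implementation: the JSON keys (`status`, `message`, `fallback_hint`) and the two status values come from this PR's text, while the type name `indexNotReady`, the helpers `encodeIndexNotReady` and `shouldFallBack`, the constant names, and the message strings are hypothetical.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Status values named in the PR description. The constant names are assumptions.
const (
	statusIndexing   = "indexing"
	statusNotIndexed = "not_indexed"
)

// indexNotReady mirrors the JSON shape described above. Only the JSON keys
// come from the PR text; the Go type name is hypothetical.
type indexNotReady struct {
	Status       string `json:"status"`
	Message      string `json:"message"`
	FallbackHint string `json:"fallback_hint"`
}

// encodeIndexNotReady renders the response an agent would read on stdout.
func encodeIndexNotReady(status, message, hint string) (string, error) {
	b, err := json.Marshal(indexNotReady{Status: status, Message: message, FallbackHint: hint})
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// shouldFallBack shows the agent side: branch on status instead of
// treating the call as a failure. It only understands the object shape
// above, not a normal result payload.
func shouldFallBack(raw string) (bool, error) {
	var r indexNotReady
	if err := json.Unmarshal([]byte(raw), &r); err != nil {
		return false, err
	}
	return r.Status == statusIndexing || r.Status == statusNotIndexed, nil
}

func main() {
	out, err := encodeIndexNotReady(statusIndexing, "index is being built", "use grep for now")
	if err != nil {
		panic(err)
	}
	fmt.Println(out)

	fallback, err := shouldFallBack(out)
	if err != nil {
		panic(err)
	}
	fmt.Println("fall back:", fallback)
}
```

Exiting 0 with a machine-readable status is what lets the agent treat "index not ready" as a branch rather than an error.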

Batch 3 — Verb wrappers (1 commit + 1 follow-up, new surface)

ADR-019's resolved call graph (calls: / calledby:) was completely hidden behind DSL grammar. Agents pattern-match on verbs (callers, who-calls) more reliably than on filter keywords. New thin wrappers (each a shell over the shared runCodeSearch helper extracted from codeSearchCmd.RunE):

ox code defs <name>                # where is <name> defined?   (type:symbol)
ox code callers <name>             # who calls <name>?           (calledby:)
ox code callees <name> --depth N   # what <name> calls           (calls: depth:)
ox code refs <name> [--lang go]    # text references             (type:code lang:)
ox code log <path> [--author X --after YYYY-MM-DD]   # commit history (file: type:commit ...)

All verbs inherit the same flags as ox code search (--limit, --snippet, --full-json) so output is byte-for-byte identical with the equivalent DSL invocation. Banner, guidance table, skill, slash command, and rule all updated to list verbs first, DSL-mode second.
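The verb-to-DSL mapping in the listing above can be sketched as a single pure function. This is an illustration under assumptions: `verbArgs` and `buildVerbQuery` are hypothetical names, and the exact token order inside each DSL string is a guess based on the parenthesized filters in the listing, not the contents of cmd/ox/code_verbs.go.

```go
package main

import (
	"fmt"
	"strings"
)

// verbArgs collects the inputs a verb may take. Hypothetical type.
type verbArgs struct {
	Name   string // symbol name, or path for "log"
	Depth  int    // callees only; 0 means unset
	Lang   string // refs only
	Author string // log only
	After  string // log only, YYYY-MM-DD
}

// buildVerbQuery turns a verb and its arguments into a DSL string that
// could then be handed to a shared search helper.
func buildVerbQuery(verb string, a verbArgs) (string, error) {
	switch verb {
	case "defs":
		return "type:symbol " + a.Name, nil
	case "callers":
		return "calledby:" + a.Name, nil
	case "callees":
		q := "calls:" + a.Name
		if a.Depth > 0 {
			q += fmt.Sprintf(" depth:%d", a.Depth)
		}
		return q, nil
	case "refs":
		parts := []string{"type:code"}
		if a.Lang != "" {
			parts = append(parts, "lang:"+a.Lang)
		}
		return strings.Join(append(parts, a.Name), " "), nil
	case "log":
		parts := []string{"file:" + a.Name, "type:commit"}
		if a.Author != "" {
			parts = append(parts, "author:"+a.Author)
		}
		if a.After != "" {
			parts = append(parts, "after:"+a.After)
		}
		return strings.Join(parts, " "), nil
	default:
		return "", fmt.Errorf("unknown verb %q", verb)
	}
}

func main() {
	q, err := buildVerbQuery("callees", verbArgs{Name: "Handler", Depth: 2})
	if err != nil {
		panic(err)
	}
	fmt.Println(q) // calls:Handler depth:2
}
```

Keeping the builder free of I/O is what makes verb-to-query round-trip tests cheap to write.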

Follow-ups (2 commits, review feedback)

  • fix(codedb): sub-second precision on search-latency stderr line + compact commit fields

    • formatSearchLatency exposes µs/ms below 1s so the stderr line shows real numbers (was floored to <1s)
    • compactSearchResult adds commit_hash, author, commit_message; type:commit results used to drop these and return {"snippet":""}
    • ox code log <path> defaults --after to one year ago so the natural invocation always returns data (executor requires at least one of author/before/after/message)
  • fix(codedb): review feedback — preserve snippet JSON contract, wrap banner lines

    • Snippet field reverted from omitempty to always-present (preserves agent JSON contract)
    • 3 banner lines wrapped to stay within ~80 chars
  • fix(codedb): close deferred review items — rune-aware snippet truncation + verb-arg validation

    • compactSnippet now walks runes via for byteIdx := range result — byte-slice truncation could corrupt multi-byte UTF-8 (Greek identifiers, emoji, CJK PR titles) and break downstream JSON encoding
    • validateSymbolArg / validatePathArg reject whitespace, colons, and quote characters on verb args + --author/--after/--before/--lang flag values — prevents DSL-injection confusion like ox code callers "foo calledby:bar" rendering as calledby:foo calledby:bar
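The rune-aware truncation described above can be sketched like this. It is a minimal illustration, assuming a `"..."` suffix and a hypothetical function name `truncateRunes`; the PR's compactSnippet may differ in both.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateRunes keeps at most maxLen runes of s. Ranging over a string
// yields the byte offset at which each rune starts, so cutting at that
// offset can never split a multi-byte character.
func truncateRunes(s string, maxLen int) string {
	if maxLen <= 0 {
		return ""
	}
	count := 0
	for byteIdx := range s {
		if count == maxLen {
			// byteIdx is the start of the (maxLen+1)-th rune.
			return s[:byteIdx] + "..."
		}
		count++
	}
	return s // maxLen runes or fewer: pass through unchanged
}

func main() {
	in := "λλλλ" // four 2-byte runes
	naive := in[:3] // byte slicing cuts the second λ in half
	safe := truncateRunes(in, 2)
	fmt.Println(utf8.ValidString(naive), utf8.ValidString(safe)) // false true
	fmt.Println(safe)                                            // λλ...
}
```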

Tests

26 new test cases across cmd/ox/code_test.go and cmd/ox/code_verbs_test.go:

| Suite | Coverage |
| --- | --- |
| TestIsBareQuery | 9 boundary cases (DSL detection for JIT hint) |
| TestCompactSearchResults_* | snippet default, override, paging guidance, JIT-hint boundary |
| TestCompactSnippet_RuneAware_* | ASCII / 2-byte (λ) / 4-byte (🤖) / pass-through |
| TestEmitIndexNotReadyJSON_* | both indexing and not_indexed shapes |
| TestIndexStatusConstants_AreStable | locks agent JSON contract |
| TestValidateSymbolArg_* | clean ids, 7 injection patterns, empty |
| TestValidatePathArg_* | paths, DSL-injection rejection |
| TestVerb*_BuildsDSL | 8 round-trip checks (verb → ParsedQuery) |
| TestVerbCommands_AreRegistered | all 5 verbs reach codeCmd |

Full suite: 5,872 pass / 2 pre-existing fails / 7 skip across 12 packages. Two failures (TestDoctorFreshInstall_*) are unrelated — verified by stashing the branch and rerunning (same failures). Root cause is a daemon-cached version-update notice being treated as an unexpected warning by the test fixtures.

Performance

Banner token cost

| Metric | Old | New | Delta |
| --- | --- | --- | --- |
| <code-search> block bytes | 517 | 1,842 | +1,325 |
| Estimated tokens | ~130 | ~460 | +330 |
| Region | prefix-cacheable | prefix-cacheable | unchanged — paid once per session, amortized across all turns under Anthropic's 5-min cache TTL |

Search latency (sub-second precision now visible)

codedb: 11 results in 18ms (dirty overlays: 1)
codedb: 11 results in  3ms (dirty overlays: 1)
codedb: 11 results in  4ms (dirty overlays: 1)

Pure search takes 3–28ms, versus ~40–100ms for grep and ~45–56ms for ripgrep. Wall-clock (incl. binary startup) is ~1s for ox code; that's a known property of the ox binary, not introduced by this PR.

Subagent A/B

Two rounds of testing whether the new framing actually changes agent behavior:

Round 1 — same task, different framing

  • ox repo: 5 understand-the-codebase tasks × 2 framings = 10 subagents. Aggregate: A=37 tools / 103,945 tokens / 137.7s wall, B=36 / 99,371 / 135.5s — −3% tools, −4% tokens, −2% wall. Small but consistent direction; B was also more accurate on T1 (no false positive caller).
  • TraceForge repo (Rust, no .sageox/ priming, 1,484 symbols): 3 tasks × 2 = 6 subagents. Aggregate: B was worse — +31% tools, +30% tokens, +56% wall. Direction reversed because (a) no environmental priming so framing came from prompt only, (b) symbols not in index → ox code calls returned 0 → fell back to grep anyway, (c) repo too small for the index advantage to compound.
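The ox-repo percentages in Round 1 follow directly from the aggregates quoted above. A small sketch of the arithmetic, where `pctDelta` is a hypothetical helper and only the input numbers come from this PR:

```go
package main

import (
	"fmt"
	"math"
)

// pctDelta returns the change from a (framing A) to b (framing B) as a
// percentage of a, rounded to the nearest whole percent.
func pctDelta(a, b float64) int {
	return int(math.Round((b - a) / a * 100))
}

func main() {
	fmt.Println("tools: ", pctDelta(37, 36))        // -3
	fmt.Println("tokens:", pctDelta(103945, 99371)) // -4
	fmt.Println("wall:  ", pctDelta(137.7, 135.5))  // -2
}
```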

What this proves: framing helps when (i) the repo is large enough that codedb actually beats grep, (ii) the target symbol exists in the index, (iii) the agent gets the banner via ox agent prime not just inline prompt. The initial N=1 test showing −70% over-sold the win; N=8 across two repos shows a modest, conditional improvement. Honest writeup in docs/specs/codedb-discoverability-impl-results.md.

Implementation discipline

  • No production logic changed in Batch 1 (instruction-only)
  • runCodeSearch extracted from codeSearchCmd.RunE so verb wrappers share the index-not-ready handling, dirty-overlay attach, search, and stats emission — no duplication
  • All commits follow conventional format
  • ADR-019 dependencies (calls:/calledby:/confidence:) already in main (12643f0) — this PR builds on them
  • Deferred items from the investigation that are NOT in this PR (documented in the report for future work):
    • number:<n> DSL filter (would enable ox code pr <number> verb)
    • ox find root-level alias (touches rootCmd — required-review surface per AGENTS.md)
    • Tool-use telemetry counter (explicitly excluded by user instruction during implementation)
    • Bleve symbol-index health check in ox doctor (separate hardening work)

Test plan

  • go build ./... clean
  • go vet ./cmd/ox/... ./internal/prime/... clean
  • go test ./cmd/ox/ ./internal/codedb/... ./internal/prime/... — 5,872 pass / 2 pre-existing fail / 7 skip
  • Manual: ./ox agent prime --format=xml shows new banner with verb-mode + DSL-mode examples
  • Manual: ./ox code defs <name>, callers, callees, refs, log all execute against live index
  • Manual: ./ox code callers "foo calledby:bar" rejected with helpful error
  • Manual: ./ox code search "λ" type:code returns valid UTF-8 (no mid-rune corruption)
  • Subagent A/B (N=8 across two repos) — see docs/specs/codedb-discoverability-impl-results.md

Files changed

  • New: cmd/ox/code_verbs.go, cmd/ox/code_verbs_test.go, .claude/rules/ox-code.md, docs/specs/codedb-agent-discoverability.md, docs/specs/codedb-discoverability-impl-results.md
  • Modified: cmd/ox/code.go, cmd/ox/code_test.go, cmd/ox/code_activity.go, cmd/ox/code_insights.go, cmd/ox/code_prs.go, cmd/ox/agent_prime.go, cmd/ox/agent_prime_xml.go, cmd/ox/agent_query.go, internal/prime/guidance.go, claude-plugin/skills/ox/SKILL.md, extensions/claude/commands/ox.md

🤖 Generated with Claude Code

Co-Authored-By: SageOx <ox@sageox.ai>

Summary by CodeRabbit

Release Notes

  • New Features

    • Added verb-based subcommands (callers, callees, defs, refs, log) for common code search patterns
    • Enhanced search result formatting with better UTF-8 handling and rune-aware truncation
    • Agents now receive structured guidance when code index is unavailable
  • Documentation

    • Expanded ox code command documentation with DSL filter examples and verb-mode guides
    • Added comprehensive ruleset for code search capabilities and fallback behavior
  • Tests

    • Added extensive test coverage for new search commands, result formatting, and validation

Rupak Majumdar and others added 8 commits June 14, 2026 19:18
… skills

Batch 1 of three from the codedb agent-discoverability investigation
(docs/ai/specs/codedb-agent-discoverability.md).

The previous surfaces told agents to "PREFER ox code search over Grep" — a
soft directive that pre-trained tool-use instincts override. They never
showed what `ox code` can do that grep cannot, so agents kept defaulting to
grep/rg/find for everything.

This commit changes the framing from prescription to demonstration:

- cmd/ox/agent_prime_xml.go: <code-search> banner now lists 9 concrete DSL
  examples (type:symbol, calledby:, calls: depth:, type:pr, type:comment
  ckind:todo, author+after, prs, insights, activity) and the full keyword
  inventory. Restricts Grep/Glob to exact-string-in-known-file and 0-result
  fallback. Banner lives in the static prefix-cache region.

- cmd/ox/code.go: codeCmd.Long and codeSearchCmd.Long now carry the DSL
  grammar + 7 example queries spanning symbol, call-graph, PR, comment,
  history, and regex intents.

- cmd/ox/code_activity.go, code_prs.go: Short strings reworded from
  pipeline-internal language ("for the fact extractor") to agent-visible
  value props ("Recent GitHub activity ... over a time window",
  "PRs ranked for triage").

- internal/prime/guidance.go: <commands> table expanded from 2 ox-code rows
  to 6, surfacing calls/calledby, type:pr/type:issue, prs, activity, and
  insights as distinct intents. ADR-019's call graph is no longer hidden
  behind DSL grammar agents don't see.

- cmd/ox/agent_prime.go: CodeSearchTip rewording aligned with new framing.

- .claude/rules/ox-code.md (new): decision tree + DSL cheatsheet +
  anti-patterns + fallback policy + index-health notes.

- claude-plugin/skills/ox/SKILL.md, extensions/claude/commands/ox.md:
  "Search Code, History, PRs" sections added so the shipped skill and
  slash-command help mention `ox code` (previously silent on it).

No production logic changed — all surface-text/instruction edits.

Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-05-28T09-15-rupak-OxWo5N/view
…tatus, JIT DSL hint, latency stats

Batch 2 of three from the codedb agent-discoverability investigation.
Focuses on per-call signal quality so the first time an agent invokes
`ox code search` it learns the tool is fast, recoverable, and richer than grep.

- R6 / snippet width:
  defaultSnippetLen bumped 120 → 200 chars. New `--snippet N` flag overrides
  per call. 120 chars was cutting mid-arg-list on typical Go signatures and
  training agents that "ox returns less than grep" — 200 covers most function
  signatures plus brief context. Description on --full-json updated from
  "~6x more context tokens" to "~4x" to reflect the new ratio.

- R7 / structured index-not-ready response:
  When an agent context hits an index that is still being built, or has not
  yet been created, `ox code search|insights|prs|activity` now emit a
  structured JSON response on stdout and exit 0:

      {"status":"indexing"|"not_indexed",
       "message":"...",
       "fallback_hint":"..."}

  Humans still get the previous human-readable error so the terminal
  experience is unchanged. Agents branch on `status` instead of abandoning
  the tool on stderr. Status string constants live in code.go so callers
  parsing the JSON have a stable contract.

- R9 / JIT DSL hint:
  When the agent issues a bare single-term query (no DSL filters, no OR, no
  /regex/), and we returned at least one result, append a one-line nudge to
  the `guidance` field pointing at the filters they likely don't know
  about: `type:symbol`, `calledby:`/`calls:`, `type:pr`/`type:issue`,
  `before:`/`after:`. Just-in-time DSL discovery without bloating the
  per-call response when the query already used filters.

- R8 / stderr latency stats:
  `ox code search` now writes a one-liner to stderr after each call:

      codedb: 247 results in 12ms (dirty overlays: 2)

  Stderr only — stdout JSON consumers unaffected. Suppressed by --quiet.
  Calibrates agents (and humans) on latency without per-call context cost.
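The bare-query check behind the JIT hint (R9) might look roughly like the sketch below. It assumes a filter is any token containing a colon and a regex is any token wrapped in slashes; `looksBare` is a hypothetical name and this is not the PR's isBareQuery.

```go
package main

import (
	"fmt"
	"strings"
)

// looksBare reports whether q is a single plain term: exactly one
// whitespace-separated token, no key:value filter, no /regex/.
// A query using OR has at least three tokens, so it is excluded by the
// single-token rule.
func looksBare(q string) bool {
	fields := strings.Fields(q)
	if len(fields) != 1 {
		return false // empty, whitespace-only, or multi-term (including OR)
	}
	term := fields[0]
	if strings.Contains(term, ":") {
		return false // DSL filter such as type:symbol or calledby:foo
	}
	if len(term) >= 2 && strings.HasPrefix(term, "/") && strings.HasSuffix(term, "/") {
		return false // /regex/ query
	}
	return true
}

func main() {
	for _, q := range []string{"authenticate", "calledby:authenticate", "/auth.*/", "foo OR bar", ""} {
		fmt.Printf("%q bare=%v\n", q, looksBare(q))
	}
}
```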

Unit tests in cmd/ox/code_test.go cover:
  - isBareQuery: 9 cases including empty, whitespace, DSL, OR, regex
  - compactSearchResults default snippet length (200)
  - --snippet override applies per call
  - paging guidance still emitted when total > limit
  - emitIndexNotReadyJSON shape for both "indexing" and "not_indexed"
  - status constants locked in (agent contract)

The two pre-existing TestDoctorFreshInstall_* failures (version-update
warning v0.8.1 → v0.10.0) are unrelated to this branch — verified by
running them with the branch stashed.

Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-06-14T15-24-rupak-OxWo5N/view
Batch 3 of three from the codedb agent-discoverability investigation.
Closes the verb/DSL mismatch identified in the investigation: agents
pattern-match on verbs, but ADR-019's call-graph capability sat behind
DSL filters (calls:, calledby:) with no verb-mode entry point. After
this commit the unique-value queries are discoverable by guessing the
verb.

New subcommands (each a thin shell over the shared runCodeSearch helper
extracted from codeSearchCmd):

  ox code defs <name>                # where is <name> defined?  (type:symbol)
  ox code callers <name>             # who calls <name>?         (calledby:)
  ox code callees <name> --depth N   # what <name> calls         (calls: depth:)
  ox code refs <name> [--lang go]    # text references           (type:code lang:)
  ox code log <path> [--author X --after YYYY-MM-DD]  # commit history (file: type:commit ...)

All verbs inherit the same flags as `ox code search` (--limit, --snippet,
--full-json) so output is byte-for-byte identical with the equivalent DSL
invocation. Implementation lives in cmd/ox/code_verbs.go; each command
builds the DSL string and calls runCodeSearch — no duplication of search
flow, index-not-ready handling, or stats emission.

cmd/ox/code.go: extracted runCodeSearch(cmd, query) from the inline RunE.
codeSearchCmd.RunE now just joins args and delegates. Behavior is
preserved; existing tests on compactSearchResults/emitIndexNotReadyJSON/
isBareQuery still cover the path.

Tests (cmd/ox/code_verbs_test.go, 9 cases):
- Each verb round-trips through search.ParseQuery to the expected
  ParsedQuery — callers→Filters.CalledBy, callees→Filters.Calls
  (+optional Depth), defs→SearchTypeSymbol, refs→SearchTypeCode
  (+optional Lang), log→Filters.File + SearchTypeCommit (+optional
  Author/After/Before).
- All five verb commands registered on codeCmd so they appear in
  `ox code --help`.

Banner / surface updates so agents find the verbs:
- cmd/ox/agent_prime_xml.go: <code-search> banner now leads with the
  verb listing, then DSL-mode fallback, then the structured-status
  JSON contract. Verb-mode + index-status contract added to the
  cacheable prefix.
- internal/prime/guidance.go: <commands> table opens with five verb
  rows before the DSL-mode rows — agents reading top-down land on the
  verb first.
- .claude/rules/ox-code.md: decision tree updated to use verbs.
- claude-plugin/skills/ox/SKILL.md: split into "Verb-mode" + "DSL-mode"
  tables.
- extensions/claude/commands/ox.md: same split for the slash-command.

`number:` DSL filter and `ox find` alias from the report were
intentionally deferred — the verb wrappers + framing changes are the
load-bearing piece for discoverability. Filter additions can ship later
without touching agent surfaces.

Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-06-14T15-24-rupak-OxWo5N/view
…pact commit fields

Three follow-ups to Batch 2/3 surfaced while end-to-end testing the live
index:

1. formatSearchLatency: the agent-facing stderr stats line was floored to
   "<1s" by formatDurationBrief, which hid the signal it exists to provide
   (telling the agent ox code is FAST). Added a precision-aware formatter
   that uses µs/ms below one second and falls back to the human-friendly
   brief format above. Real numbers now show through: a recent ox code
   refs run reports "11 results in 18ms" instead of "<1s".

2. compactSearchResult: added commit_hash, author, commit_message fields
   so type:commit queries (used by `ox code log`) return useful data in
   compact mode. Previously the compact response dropped these — agents
   saw `{"snippet": ""}` for every commit row and had to fall back to
   --full-json. Snippet width applies to commit_message too so the
   --snippet flag still bounds output cost.

3. ox code log <path>: defaults --after to one year ago when no
   time/author filter is supplied. The codedb commit-search executor
   requires at least one of author/before/after/message alongside file:,
   so the verb would otherwise error on the most natural invocation
   `ox code log <path>`. Default keeps the verb always-useful.
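A precision-aware latency formatter along the lines of item 1 could be sketched as below. The names `formatLatency` and `statsLine` are hypothetical, and the format used at one second and above is an assumption, since the commit only says it falls back to the existing brief format.

```go
package main

import (
	"fmt"
	"time"
)

// formatLatency prints microseconds below 1ms and milliseconds below 1s.
// The branch for longer durations is a stand-in for the existing
// human-friendly formatter.
func formatLatency(d time.Duration) string {
	switch {
	case d < time.Millisecond:
		return fmt.Sprintf("%dµs", d.Microseconds())
	case d < time.Second:
		return fmt.Sprintf("%dms", d.Milliseconds())
	default:
		return fmt.Sprintf("%.1fs", d.Seconds())
	}
}

// statsLine builds the one-line stderr summary in the shape quoted in
// this PR.
func statsLine(results int, d time.Duration, dirtyOverlays int) string {
	return fmt.Sprintf("codedb: %d results in %s (dirty overlays: %d)",
		results, formatLatency(d), dirtyOverlays)
}

func main() {
	fmt.Println(statsLine(11, 18*time.Millisecond, 1))
	// codedb: 11 results in 18ms (dirty overlays: 1)
}
```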

No new failures; existing TestDoctorFreshInstall_* failures (v0.8.1 →
v0.10.0 version-update warning) remain unrelated and pre-existing on
this branch.

Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-06-14T15-24-rupak-OxWo5N/view
Companion to the investigation report (codedb-agent-discoverability.md).
Records:

- What shipped across Batches 1/2/3 (4 commits + 1 follow-up)
- Unit-test results: 17 new cases (verb DSL round-trips, snippet
  defaults, JIT hint boundaries, structured-status JSON shape,
  registration check); 5,872 pass / 2 pre-existing fails / 7 skip
  across 12 packages
- Live-index integration check after 'ox code index --full' on this
  branch
- Performance measurements on the three axes the investigation
  identified:
  1. Banner token cost: +330 tokens in the cacheable prefix region
     (517B → 1,842B for <code-search>; +6 commands-table rows)
  2. Search latency: 3-28ms pure search inside ox (now visible via
     the new formatSearchLatency precision); 1s wall-clock is binary
     startup overhead, not search
  3. Output-size comparison vs grep/git log
- Subagent A/B on a textbook call-graph query
  ('Find every place that calls compactSearchResults'):
    Agent A (no framing): 10 tool calls, 16,785 tokens, 24.5s, 6/6
    Agent B (new framing):  3 tool calls, 11,051 tokens, 15.4s, 5/6
    -> -70% tool calls, -34% tokens, -37% wall-clock
    -> One missed result in B; honest read on what N=1 does and
       does not prove
- Open follow-ups deferred for separate work (telemetry, number:
  filter, ox find alias, bleve symbol-index health in doctor)

Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-06-14T15-24-rupak-OxWo5N/view
…anner lines

Two findings from /review on the discoverability impl branch:

1. compactSearchResult.Snippet had `json:"snippet,omitempty"` added in
   the Batch 3 follow-up, regressing the agent-facing schema contract.
   The field is always present in the original (pre-branch) output —
   omitempty makes it conditionally absent when the snippet is empty
   (e.g., type:commit results), forcing agents to handle a sometimes-
   missing field they previously could rely on. Restore the
   no-omitempty form.

2. cmd/ox/agent_prime_xml.go banner had 3 lines exceeding ~90 chars
   when unescaped (longer once `<>` round-trip through `&lt;`/`&gt;`).
   Not a hard 80-col target — the banner is consumed by an LLM, not a
   TUI widget — but tightening keeps the comments aligned in dev
   readers and respects the spirit of .claude/rules/design.md rule 12.
   Wrapped the three offenders without changing semantics.
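The schema regression in finding 1 is easy to reproduce with encoding/json. In this sketch the struct names are hypothetical and the `omitempty` on `commit_hash` is an assumption; only the `snippet` tag difference reflects the finding.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// withOmitEmpty drops "snippet" whenever it is the empty string.
type withOmitEmpty struct {
	Snippet    string `json:"snippet,omitempty"`
	CommitHash string `json:"commit_hash,omitempty"`
}

// alwaysPresent keeps "snippet" in every row, even when empty.
type alwaysPresent struct {
	Snippet    string `json:"snippet"`
	CommitHash string `json:"commit_hash,omitempty"`
}

// mustJSON marshals v and panics on error; fine for a demo.
func mustJSON(v any) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	// A commit row has no snippet text.
	fmt.Println(mustJSON(withOmitEmpty{CommitHash: "abc123"}))
	// {"commit_hash":"abc123"}
	fmt.Println(mustJSON(alwaysPresent{CommitHash: "abc123"}))
	// {"snippet":"","commit_hash":"abc123"}
}
```

A consumer that indexes `row["snippet"]` unconditionally keeps working only with the second form.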

Unit tests green: TestCompactSearch* / TestVerb* / TestIsBareQuery /
TestEmitIndexNotReady* all pass after the fix.

Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-06-14T15-24-rupak-OxWo5N/view
…ion + verb-arg validation

Two follow-ups from the QA Guardian review that were deferred as
"real but not blocking":

1. compactSnippet rune-aware truncation
   The original byte-slice path (result[:maxLen]) corrupts UTF-8 when
   the cut point falls mid-rune — produces invalid UTF-8 that breaks
   downstream JSON encoding for any non-ASCII content (Greek
   identifiers, emoji in comments, CJK in PR titles). Walk runes via
   `for byteIdx := range result`, find the byte offset of the
   (maxLen+1)-th rune, cut there. maxLen is now counted in runes, not
   bytes. Pure ASCII falls through with identical behavior to the
   original.

   Tests (cmd/ox/code_test.go):
   - TestCompactSnippet_RuneAware_ASCII — byte/rune parity for ASCII
   - TestCompactSnippet_RuneAware_MultibyteUTF8 — 2-byte runes (λ)
   - TestCompactSnippet_RuneAware_Emoji — 4-byte runes (🤖)
   - TestCompactSnippet_RuneAware_NoTruncationNeeded — short input pass-through

2. Verb-argument validation against DSL injection
   The verb wrappers build DSL via fmt.Sprintf. Without validation,
   `ox code callers "foo calledby:bar"` constructs
   `calledby:foo calledby:bar`, which ParseQuery interprets as a
   two-filter query rather than a single-symbol lookup. No security
   impact (DSL is local-only, no privilege escalation), but produces
   confusing results and silently breaks the verb's contract.

   Added validateSymbolArg / validatePathArg that reject whitespace,
   colons, and quote characters. Wired into all five verbs
   (callers/callees/defs/refs/log) and into the --author / --after /
   --before / --lang flag values. A user who needs the full DSL is
   directed to `ox code search` instead.

   Tests:
   - TestValidateSymbolArg_AcceptsCleanIdentifiers — auth, ResolveSession, etc.
   - TestValidateSymbolArg_RejectsDSLInjection — 7 attack patterns
   - TestValidateSymbolArg_RejectsEmpty
   - TestValidatePathArg_AcceptsPaths — slashes, dots, dashes OK
   - TestValidatePathArg_RejectsDSLInjection

End-to-end smoke confirmed:
- `ox code callers "foo calledby:bar"` → exit non-zero, helpful error
- `ox code callers "foo:bar"`           → exit non-zero, helpful error
- `ox code callers ResolveSession`      → works as before
- `ox code search "λ" type:code`        → no UTF-8 corruption

Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-06-14T15-24-rupak-OxWo5N/view
@coderabbitai

coderabbitai Bot commented Jun 15, 2026

Contributor


Walkthrough

Adds five verb-mode ox code subcommands (callers, callees, defs, refs, log) backed by DSL injection validation, refactors runCodeSearch into a shared executor emitting structured JSON for agent callers when the index is missing or building, switches snippet truncation to rune-aware, extends compact results with commit fields, updates agent-facing banners and guidance tables, and adds regression tests and documentation.

Changes

CodeDB Discoverability for Agents

| Layer | File(s) | Summary |
| --- | --- | --- |
| runCodeSearch refactor, compactSearchResults, and rune-aware truncation | cmd/ox/code.go, cmd/ox/agent_query.go | Rewrites codeCmd/codeSearchCmd help with DSL grammar; creates shared runCodeSearch emitting structured JSON (status, message, fallback_hint) to agent callers when the index is building or absent; adds isBareQuery detection; extends compactSearchResult with commit_hash/author/commit_message fields; makes compactSearchResults snippetLen-configurable; switches compactSnippet to rune-aware truncation with ellipsis; adds formatSearchLatency sub-second helper; updates --snippet flag help text; updates compactSearchResults call in writeQueryResponse. |
| Verb-mode CLI wrappers and DSL injection validation | cmd/ox/code_verbs.go, cmd/ox/code_verbs_test.go | Introduces validateSymbolArg and validatePathArg blocking DSL injection; adds five Cobra subcommands (callers, callees, defs, refs, log) each constructing a typed DSL string and delegating to runCodeSearch; wires shared (--full-json, --limit, --snippet) and verb-specific (--depth, --lang, --author, --after, --before) flags; tests verify DSL round-trips via ParseQuery and verb registration in --help. |
| Agent-context index-not-ready JSON across subcommands | cmd/ox/code_activity.go, cmd/ox/code_insights.go, cmd/ox/code_prs.go | On missing-index paths in activity, insights, and prs, calls detectAgentContext; if an agent ID is present, returns emitIndexNotReadyJSON instead of a terminal error; updates Short descriptions for activity and prs. |
| Unit tests for new behaviors | cmd/ox/code_test.go | Adds tests for isBareQuery classification, compactSearchResults snippet sizing/paging/JIT-hint absence, emitIndexNotReadyJSON JSON decoding for both statuses (indexing, not_indexed), compactSnippet rune-aware truncation (ASCII/multibyte/emoji/no-truncation), validateSymbolArg/validatePathArg DSL-injection defenses, and index status constant stability. |
| Agent-facing banner, guidance table, and tip text updates | cmd/ox/agent_prime_xml.go, cmd/ox/agent_prime.go, internal/prime/guidance.go | Rewrites <code-search> XML instruction block with verb-mode and DSL-mode examples and JSON status-branching guidance; updates CodeSearchTip strings; expands BuildGuidance to list granular verb-mode commands (defs/callers/callees/refs/log) before DSL filters. |
| Rules, skills, slash-command docs, and specs | .claude/rules/ox-code.md, claude-plugin/skills/ox/SKILL.md, extensions/claude/commands/ox.md, docs/specs/codedb-agent-discoverability.md, docs/specs/codedb-discoverability-impl-results.md | Adds ox-code.md decision tree and DSL cheatsheet; adds "Searching Code, History, PRs" section to SKILL.md and ox.md; creates discoverability research spec (friction analysis, R1–R11 recommendations, Patches A–G) and implementation results spec (test results, latency measurements, A/B experiment, open follow-ups). |

Sequence Diagram(s)

sequenceDiagram
  participant Agent
  participant runCodeSearch
  participant detectAgentContext
  participant CodeDB
  participant compactSearchResults

  Agent->>runCodeSearch: ox code callers|defs|refs|log|search <query>
  runCodeSearch->>detectAgentContext: check for agentID
  runCodeSearch->>CodeDB: check index readiness
  alt index building or absent
    runCodeSearch-->>Agent: JSON {status, message, fallback_hint}
  else index ready
    runCodeSearch->>CodeDB: execute DSL query
    CodeDB-->>runCodeSearch: raw results (symbols/commits/PRs/code)
    runCodeSearch->>compactSearchResults: compact(results, snippetLen)
    compactSearchResults-->>runCodeSearch: JSON array with rune-truncated snippets + commit fields
    runCodeSearch-->>Agent: compact JSON + stderr latency line
  end

Estimated code review effort

🎯 4 (Complex) | ⏱️ ~60 minutes

Possibly related PRs

  • sageox/ox#663: Both PRs modify outputAgentPrimeXML in cmd/ox/agent_prime_xml.go — this PR rewrites the <code-search> instruction block while #663 changes the <consult-first> routing rows, making them adjacent edits to the same agent prime XML surface.

Poem

🐇 Hop hop, the verbs are here at last,
defs and callers running fast!
No more Grep when CodeDB's near,
DSL filters make the path clear.
JSON hints when the index sleeps,
And rune-safe snippets never weep. ✨

🚥 Pre-merge checks | ✅ 4 | ❌ 1

❌ Failed checks (1 warning)

| Check name | Status | Explanation | Resolution |
| --- | --- | --- | --- |
| Docstring Coverage | ⚠️ Warning | Docstring coverage is 42.55% which is insufficient. The required threshold is 80.00%. | Write docstrings for the functions missing them to satisfy the coverage threshold. |

✅ Passed checks (4 passed)

| Check name | Status | Explanation |
| --- | --- | --- |
| Description Check | ✅ Passed | Check skipped - CodeRabbit’s high-level summary is enabled. |
| Title check | ✅ Passed | The title accurately and concisely summarizes the primary changes: agent discoverability improvements to the ox code tool through banner framing, ergonomics enhancements, and new verb wrapper commands. |
| Linked Issues check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |
| Out of Scope Changes check | ✅ Passed | Check skipped because no linked issues were found for this pull request. |


@greptile-apps

greptile-apps Bot commented Jun 15, 2026


Greptile Summary

This PR improves agent discoverability of ox code through three batches: rewritten framing in the XML banner and --help text showing concrete DSL examples, ergonomic improvements (wider snippets, --snippet flag, sub-second latency line, structured JSON for index-not-ready states), and five new verb-mode wrappers (defs, callers, callees, refs, log) that delegate to a refactored runCodeSearch helper.

  • Framing (instruction-only): <code-search> banner, codeCmd.Long, codeSearchCmd.Long, skill files, and .claude/rules/ox-code.md all updated to demonstrate DSL capabilities (call graph, PR search, comment search) rather than prescribe a "PREFER" rule.
  • Ergonomics: compactSnippet is now rune-aware (fixes UTF-8 corruption for non-ASCII content); runCodeSearch emits {"status":"indexing","fallback_hint":"..."} for agent contexts; JIT DSL hint appended on bare queries; code activity/prs/insights gain not_indexed structured JSON too.
  • Verb wrappers: Five new cobra.Command entries in code_verbs.go with DSL-injection validation (validateSymbolArg/validatePathArg), all sharing runCodeSearch so output shape, flag set, and stats line are identical to ox code search.
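
To make the rune-aware truncation point concrete, here is a minimal sketch of the technique. The function name `truncateRunes` and the `...` suffix are assumptions for illustration; the PR's actual `compactSnippet` is not shown on this page and may differ.

```go
package main

import "fmt"

// truncateRunes is a hypothetical stand-in for the PR's compactSnippet.
// It cuts s to at most maxRunes runes, so a multi-byte UTF-8 character
// is never split in the middle the way a byte slice s[:n] can split it.
func truncateRunes(s string, maxRunes int) string {
	if maxRunes <= 0 {
		return ""
	}
	runes := []rune(s)
	if len(runes) <= maxRunes {
		return s
	}
	return string(runes[:maxRunes]) + "..."
}

func main() {
	s := "héllo wörld"
	fmt.Println(truncateRunes(s, 4)) // cut falls on a rune boundary
	fmt.Printf("%q\n", s[:2])        // byte cut lands inside the 2-byte 'é'
}
```

Converting to `[]rune` costs one allocation per snippet, which seems acceptable for short snippets; a loop over `utf8.DecodeRuneInString` would avoid it if that ever mattered.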

Confidence Score: 4/5

Safe to merge with one targeted follow-up: code search does not emit structured JSON when the index has never been built, leaving agents with a hard error for the most-called subcommand.

The bulk of the change is instruction-only (banner, help text, skill/rule files) and carries no runtime risk. The new verb wrappers are thin DSL builders with solid injection validation and 26 new tests. The rune-aware compactSnippet correctly fixes a real UTF-8 truncation bug. The one incomplete piece is runCodeSearch: code activity, code prs, and code insights all gained a not_indexed structured JSON path for agent contexts, but runCodeSearch only handles indexing — when the index has never been built, codedb.Open fails and every verb wrapper and code search returns a plain Go error to the agent rather than the actionable fallback_hint JSON.

cmd/ox/code.go — the codedb.Open error path in runCodeSearch needs the same not_indexed agent-context guard added to code_activity.go, code_prs.go, and code_insights.go

Important Files Changed

| Filename | Overview |
| --- | --- |
| cmd/ox/code.go | Core refactor: extracts runCodeSearch, adds --snippet/--quiet support, rune-aware compactSnippet, JIT DSL hint, and structured indexing JSON, but the not_indexed path for agents is missing from codedb.Open error handling |
| cmd/ox/code_verbs.go | New verb-mode wrappers (callers, callees, defs, refs, log) with DSL-injection validation; auto-generated default --after in codeLogCmd bypasses the validation loop but is safe in practice |
| cmd/ox/code_test.go | 222 lines of new test cases covering snippet width, JIT hint boundary, rune-aware truncation, index-not-ready JSON shape, and symbol/path validation; well-structured with clear failure-case commentary |
| cmd/ox/code_verbs_test.go | Round-trip DSL tests for all 5 verbs via search.ParseQuery; registration test for codeCmd.Commands(); clean and complete |
| cmd/ox/agent_prime_xml.go | Rewrites the <code-search> XML banner from prescriptive directives to concrete verb + DSL examples; token cost increase (+330) is intentional and prefix-cacheable |
| cmd/ox/agent_prime.go | Two string rewrites to CodeSearchTip: replaces PREFER-over-grep with DSL capability framing; no logic change |
| cmd/ox/agent_query.go | Single-line signature update: compactSearchResults call gains third 0 argument (uses defaultSnippetLen); no behavioral change |
| cmd/ox/code_activity.go | Short description rewrite plus not_indexed agent JSON response; consistent with code_prs.go and code_insights.go pattern |
| cmd/ox/code_prs.go | Short description rewrite and not_indexed structured JSON response for agents; no logic changes |
| cmd/ox/code_insights.go | Added not_indexed structured JSON response for agent context when index directory is missing |
| internal/prime/guidance.go | Expands IntentCommand table from 2 to 10 rows to surface verb-mode wrappers and DSL capabilities individually; straightforward additive change |
| .claude/rules/ox-code.md | New rule file with decision tree, DSL cheatsheet, anti-patterns, and fallback policy for ox code vs grep routing |

Flowchart

%%{init: {'theme': 'neutral'}}%%
flowchart TD
    A[Agent invokes ox code verb] --> B{Verb match?}
    B -->|defs / callers / callees / refs / log| C[validateSymbolArg / validatePathArg]
    B -->|search / query| D[args joined as query string]
    C -->|invalid| E[return error]
    C -->|valid| F[build DSL string]
    F --> G[runCodeSearch]
    D --> G
    G --> H{isCodeDBIndexing?}
    H -->|yes + agent ctx| I[emitIndexNotReadyJSON status: indexing]
    H -->|yes + human| J[return plain error]
    H -->|no| K[codedb.Open]
    K -->|error| L[return plain error — missing not_indexed JSON for agents]
    K -->|ok| M[db.Search]
    M --> N{fullJSON?}
    N -->|yes| O[encode raw results]
    N -->|no| P[compactSearchResults rune-aware snippet JIT hint if bare query]
    P --> Q[write JSON to stdout]
    O --> Q
    Q --> R[write latency line to stderr]

Comments Outside Diff (1)

  1. cmd/ox/code_verbs.go, line 854-873 (link)

    P2 Auto-generated default --after bypasses validation

    The validation loop at line 854 runs over author, after, and before as read from flags. When all three are empty the code falls into the time.Now().AddDate(-1, 0, 0) branch and assigns a new value to after — after the validation loop has already finished. That default value is produced from trusted time package code and is always a well-formed YYYY-MM-DD string, so there is no exploitable path here. The inconsistency is structural: user-supplied --after values are sanitised, but the internally-generated default bypasses the same gate. A future refactor that changes the default-calculation logic (e.g. pulling the date from an env variable or config) would silently skip validation. Consider validating the resolved after string in the DSL-building block, after the default is applied.
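
A minimal sketch of the ordering suggested here: resolve the default first, validate second, so generated and user-supplied values pass the same gate. The names `resolveAfter`, `validateDate`, and `demoNow` are hypothetical, and the YYYY-MM-DD-only validator is an assumption, not the PR's `validateSymbolArg`.

```go
package main

import (
	"fmt"
	"time"
)

// demoNow is a fixed clock so the example is deterministic.
var demoNow = time.Date(2026, 6, 15, 0, 0, 0, 0, time.UTC)

// validateDate is a hypothetical validator that accepts only YYYY-MM-DD.
func validateDate(name, val string) error {
	if _, err := time.Parse("2006-01-02", val); err != nil {
		return fmt.Errorf("%s: expected YYYY-MM-DD, got %q", name, val)
	}
	return nil
}

// resolveAfter applies the one-year default when all three filters are
// empty, then validates whatever value of after results from that step.
func resolveAfter(author, after, before string, now time.Time) (string, error) {
	if author == "" && after == "" && before == "" {
		after = now.AddDate(-1, 0, 0).Format("2006-01-02")
	}
	if after == "" {
		return "", nil // another filter is set; no date bound needed
	}
	if err := validateDate("--after", after); err != nil {
		return "", err
	}
	return after, nil
}

func main() {
	fmt.Println(resolveAfter("", "", "", demoNow))
	fmt.Println(resolveAfter("", "2026-01-01 type:pr", "", demoNow))
}
```

With this shape, a later change to how the default is computed (for example reading it from configuration) would still go through validation without any extra wiring.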


Reviews (1): Last reviewed commit: "docs(codedb): investigation — why agents..." | Re-trigger Greptile

Comment thread cmd/ox/code.go
Comment on lines +215 to +229
+	limit, _ := cmd.Flags().GetInt("limit")
+	snippetLen, _ := cmd.Flags().GetInt("snippet")
+	if snippetLen <= 0 {
+		snippetLen = defaultSnippetLen
+	}

-		var buf bytes.Buffer
-		enc := json.NewEncoder(&buf)
-		enc.SetIndent("", " ")
+	var buf bytes.Buffer
+	enc := json.NewEncoder(&buf)
+	enc.SetIndent("", " ")

-		if fullJSON {
-			resp := &combinedQueryResponse{CodeResults: results}
-			if err := enc.Encode(resp); err != nil {
-				return fmt.Errorf("encode: %w", err)
-			}
-		} else {
-			compact := compactSearchResults(results, limit)
-			if err := enc.Encode(compact); err != nil {
-				return fmt.Errorf("encode: %w", err)
+	if fullJSON {
+		resp := &combinedQueryResponse{CodeResults: results}
+		if err := enc.Encode(resp); err != nil {
+			return fmt.Errorf("encode: %w", err)
+		}


P1 Missing not_indexed structured JSON in runCodeSearch

code activity, code prs, and code insights all check whether the index directory exists and, for agent contexts, emit {"status":"not_indexed","fallback_hint":"..."} rather than a hard error. runCodeSearch only handles the actively-building case (isCodeDBIndexing). When no index has ever been built, codedb.Open(dataDir) fails and agents receive a plain Go error string — exactly the "hard error that teaches agents to abandon the tool" the PR explicitly targets. A new repo with detectAgentContext() != "" and no index will see this unstructured error for every code search, code callers, code callees, code defs, code refs, and code log invocation since they all route through runCodeSearch.

The fix mirrors the pattern in code_activity.go: after resolvePreferredCodeDBDir, check whether dataDir is empty or codedb.Open would clearly fail, and emit emitIndexNotReadyJSON(cmd, indexStatusNotIndexed, …) for agent contexts before reaching the Open call.
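
As a rough illustration of the agent/human split described above, the sketch below returns a structured payload for agent callers and a plain error for humans when the index directory is missing. The function `notIndexedResponse`, the struct, and the message strings are assumptions; only the `status`, `message`, and `fallback_hint` keys and the `not_indexed` value come from this review.

```go
package main

import (
	"encoding/json"
	"errors"
	"fmt"
	"os"
)

// indexNotReady mirrors the {status, message, fallback_hint} JSON shape.
// The Go type itself is hypothetical.
type indexNotReady struct {
	Status       string `json:"status"`
	Message      string `json:"message"`
	FallbackHint string `json:"fallback_hint"`
}

// notIndexedResponse returns:
//   - ("", nil) when dataDir exists and the caller should open the index,
//   - (json, nil) when it is missing and an agent is calling,
//   - ("", err) when it is missing and a human is calling.
func notIndexedResponse(dataDir, agentID string) (string, error) {
	_, statErr := os.Stat(dataDir)
	if statErr == nil {
		return "", nil
	}
	if !errors.Is(statErr, os.ErrNotExist) {
		return "", fmt.Errorf("check index dir: %w", statErr)
	}
	if agentID == "" {
		return "", errors.New("no code index found")
	}
	out, err := json.Marshal(indexNotReady{
		Status:       "not_indexed",
		Message:      "No code index found for this repo.",
		FallbackHint: "Use Grep/Glob until an index has been built.",
	})
	return string(out), err
}

func main() {
	fmt.Println(notIndexedResponse("/nonexistent/codedb-demo", "demo-agent"))
	fmt.Println(notIndexedResponse("/nonexistent/codedb-demo", ""))
}
```

Keeping the decision in one function that every verb wrapper passes through is what would give `code search`, `code callers`, and the rest the same contract as `code activity`.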


@coderabbitai coderabbitai Bot left a comment


Actionable comments posted: 2

🧹 Nitpick comments (1)
cmd/ox/code_verbs_test.go (1)

19-85: ⚡ Quick win

These tests validate parser behavior, not the verb wrapper wiring.

The current cases only parse manually authored DSL strings. If wrapper construction drifts, these tests can still pass. Consider extracting small query-builder helpers (or invoking each RunE with a captured dispatcher) so tests assert the actual wrapper-generated query.
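
One possible shape for the "captured dispatcher" idea, as a sketch only. `buildQuery` and `runVerb` are hypothetical names, and the filter is passed in explicitly because the mapping from each verb to its DSL filter is not shown on this page.

```go
package main

import (
	"fmt"
	"strings"
)

// buildQuery is a hypothetical query builder shared by the verb wrappers.
// It rejects symbols that could smuggle extra DSL filters into the query.
func buildQuery(filter, symbol string) (string, error) {
	if symbol == "" || strings.ContainsAny(symbol, " \t\n:\"'") {
		return "", fmt.Errorf("invalid symbol %q", symbol)
	}
	return filter + ":" + symbol, nil
}

// runVerb stands in for a wrapper's RunE: it builds the query and hands it
// to a dispatcher, which a test can replace with a capturing function.
func runVerb(filter, symbol string, dispatch func(query string) error) error {
	q, err := buildQuery(filter, symbol)
	if err != nil {
		return err
	}
	return dispatch(q)
}

func main() {
	var captured string
	err := runVerb("calledby", "authenticate", func(q string) error {
		captured = q // a test would assert on this value
		return nil
	})
	fmt.Println(captured, err)
}
```

A test built this way fails as soon as a wrapper starts emitting a different query, which the parser-only tests cannot detect.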

🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

In `@cmd/ox/code_verbs_test.go` around lines 19 - 85, These tests only validate
the parser behavior by testing manually authored DSL strings, but they don't
verify that the actual verb command wrappers (the RunE functions for each verb
like TestVerbCallers, TestVerbCallees, TestVerbDefs, TestVerbRefs, and
TestVerbLog) correctly generate those queries. If the wrapper construction
drifts, these tests will still pass even though the commands won't work. Fix
this by creating query-builder helpers or modifying the tests to invoke each
verb's RunE function with a captured dispatcher to assert that the actual
wrapper-generated queries match the expected filter structure, ensuring the full
integration from command execution to parsed query is tested rather than just
parsing in isolation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.

Inline comments:
In `@cmd/ox/code_verbs.go`:
- Around line 172-179: The validation loop for command-line arguments currently
routes `--after` and `--before` through validateSymbolArg, which forbids colons
and rejects valid ISO 8601 timestamps. Fix this by conditionally applying
different validation logic based on the filter name: keep validateSymbolArg for
`--author`, but skip the validateSymbolArg check for `--after` and `--before`
(or apply a less restrictive validation that permits timestamps). You can modify
the loop to check if v.name is in the timestamp filters and handle those cases
separately, allowing colons in their values while maintaining proper validation
for the author filter.

In `@cmd/ox/code.go`:
- Around line 180-197: In the runCodeSearch function, add a check for missing
index metadata before opening the CodeDB with codedb.Open(dataDir) at line 194.
Similar to how isCodeDBIndexing is checked, determine if the index metadata does
not exist. When the index metadata is missing and agentID is not empty, return
emitIndexNotReadyJSON with an appropriate status value to provide the structured
JSON contract for agent callers instead of allowing the raw open error to
surface.


ℹ️ Review info
⚙️ Run configuration

Configuration used: Path: .coderabbit.yaml

Review profile: CHILL

Plan: Pro Plus

Run ID: c4fbee49-88fd-4b42-be6b-9f3889b1ac0a

📥 Commits

Reviewing files that changed from the base of the PR and between a90dc57 and 8004429.

📒 Files selected for processing (16)
  • .claude/rules/ox-code.md
  • claude-plugin/skills/ox/SKILL.md
  • cmd/ox/agent_prime.go
  • cmd/ox/agent_prime_xml.go
  • cmd/ox/agent_query.go
  • cmd/ox/code.go
  • cmd/ox/code_activity.go
  • cmd/ox/code_insights.go
  • cmd/ox/code_prs.go
  • cmd/ox/code_test.go
  • cmd/ox/code_verbs.go
  • cmd/ox/code_verbs_test.go
  • docs/specs/codedb-agent-discoverability.md
  • docs/specs/codedb-discoverability-impl-results.md
  • extensions/claude/commands/ox.md
  • internal/prime/guidance.go

Comment thread cmd/ox/code_verbs.go
Comment on lines +172 to +179
for _, v := range []struct{ name, val string }{
{"--author", author}, {"--after", after}, {"--before", before},
} {
if v.val == "" {
continue
}
if err := validateSymbolArg(v.name, v.val); err != nil {
return err


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

--after/--before are over-restricted and reject valid timestamp values.

Line 178 routes all log filters through validateSymbolArg, which forbids :. That blocks valid datetime values (e.g., 2026-04-01T10:00:00Z) for --after/--before.

💡 Suggested fix
+func validateDateArg(name, arg string) error {
+	if arg == "" {
+		return fmt.Errorf("%s: value must not be empty", name)
+	}
+	if strings.ContainsAny(arg, " \t\n\r\"'") {
+		return fmt.Errorf("%s: value must not contain whitespace or quotes", name)
+	}
+	return nil
+}
@@
 		for _, v := range []struct{ name, val string }{
 			{"--author", author}, {"--after", after}, {"--before", before},
 		} {
 			if v.val == "" {
 				continue
 			}
-			if err := validateSymbolArg(v.name, v.val); err != nil {
+			var err error
+			switch v.name {
+			case "--after", "--before":
+				err = validateDateArg(v.name, v.val)
+			default:
+				err = validateSymbolArg(v.name, v.val)
+			}
+			if err != nil {
 				return err
 			}
 		}
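
The suggested `validateDateArg` only rejects whitespace and quotes. A stricter variant, sketched below, parses the value against known layouts instead. That the DSL's `before:`/`after:` filters accept both YYYY-MM-DD and RFC 3339 timestamps is an assumption taken from the comment above, not something verified against the parser.

```go
package main

import (
	"fmt"
	"time"
)

// validateDateArg accepts a value only if it parses as a calendar date or
// an RFC 3339 timestamp. Colons are therefore allowed, but only where a
// timestamp puts them, so "2026-04-01 type:pr" is still rejected.
func validateDateArg(name, arg string) error {
	for _, layout := range []string{"2006-01-02", time.RFC3339} {
		if _, err := time.Parse(layout, arg); err == nil {
			return nil
		}
	}
	return fmt.Errorf("%s: expected YYYY-MM-DD or RFC 3339 timestamp, got %q", name, arg)
}

func main() {
	for _, v := range []string{
		"2026-04-01",
		"2026-04-01T10:00:00Z",
		"2026-04-01 type:pr",
		"yesterday",
	} {
		fmt.Printf("%q -> %v\n", v, validateDateArg("--after", v))
	}
}
```

Parsing rather than blocklisting characters means any new DSL metacharacter added later is rejected by default.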

Comment thread cmd/ox/code.go
Comment on lines +180 to +197
+	dataDir, useLedger := resolvePreferredCodeDBDir(root)
+	agentID, _ := detectAgentContext()

-		db, err := codedb.Open(dataDir)
-		if err != nil {
-			return fmt.Errorf("open codedb: %w", err)
+	// Index-not-ready paths: emit structured JSON when an agent is calling
+	// so it can branch on `status` instead of treating an error as terminal.
+	if isCodeDBIndexing(useLedger) {
+		if agentID != "" {
+			return emitIndexNotReadyJSON(cmd, indexStatusIndexing,
+				"Code index is currently being built. Search will be available once indexing completes.",
+				"Use Grep/Glob until indexing completes; rerun 'ox code search' afterward.")
+		}
-		defer db.Close()
+		return fmt.Errorf("code index is currently being built — search is unavailable until indexing completes. Run 'ox code status' to check progress")
+	}

-		// attach all daemon-built dirty overlays for uncommitted file search
-		// (supports multiple simultaneous worktrees)
-		if n := db.AttachAllDirtyIndexes(); n > 0 {
-			slog.Debug("attached dirty overlays", "count", n)
-		}
+	db, err := codedb.Open(dataDir)
+	if err != nil {
+		return fmt.Errorf("open codedb: %w", err)
+	}


⚠️ Potential issue | 🟠 Major | ⚡ Quick win

Missing-index agent path is not handled before opening CodeDB.

On Line 194, runCodeSearch opens the DB without first checking whether the index metadata exists. For agent callers, this bypasses the structured status:"not_indexed" contract and surfaces a raw open error instead.

💡 Suggested fix
 import (
 	"bytes"
 	"context"
 	"encoding/json"
+	"errors"
 	"fmt"
 	"log/slog"
 	"os"
 	"path/filepath"
@@
 	dataDir, useLedger := resolvePreferredCodeDBDir(root)
 	agentID, _ := detectAgentContext()
+
+	metadataPath := filepath.Join(dataDir, store.MetadataDBFile)
+	if _, statErr := os.Stat(metadataPath); errors.Is(statErr, os.ErrNotExist) {
+		if agentID != "" {
+			return emitIndexNotReadyJSON(cmd, indexStatusNotIndexed,
+				"No code index found for this repo.",
+				"Run 'ox code index' to build the index, then rerun 'ox code search'.")
+		}
+		return fmt.Errorf("no code index found — run 'ox code index' first")
+	} else if statErr != nil {
+		return fmt.Errorf("check codedb index: %w", statErr)
+	}
 
 	// Index-not-ready paths: emit structured JSON when an agent is calling
📝 Committable suggestion

Suggested change
dataDir, useLedger := resolvePreferredCodeDBDir(root)
agentID, _ := detectAgentContext()
metadataPath := filepath.Join(dataDir, store.MetadataDBFile)
if _, statErr := os.Stat(metadataPath); errors.Is(statErr, os.ErrNotExist) {
if agentID != "" {
return emitIndexNotReadyJSON(cmd, indexStatusNotIndexed,
"No code index found for this repo.",
"Run 'ox code index' to build the index, then rerun 'ox code search'.")
}
return fmt.Errorf("no code index found — run 'ox code index' first")
} else if statErr != nil {
return fmt.Errorf("check codedb index: %w", statErr)
}
// Index-not-ready paths: emit structured JSON when an agent is calling
// so it can branch on `status` instead of treating an error as terminal.
if isCodeDBIndexing(useLedger) {
if agentID != "" {
return emitIndexNotReadyJSON(cmd, indexStatusIndexing,
"Code index is currently being built. Search will be available once indexing completes.",
"Use Grep/Glob until indexing completes; rerun 'ox code search' afterward.")
}
return fmt.Errorf("code index is currently being built — search is unavailable until indexing completes. Run 'ox code status' to check progress")
}
db, err := codedb.Open(dataDir)
if err != nil {
return fmt.Errorf("open codedb: %w", err)
}
