feat(codedb): agent discoverability — banner framing, ergonomics, verb wrappers #664
rupakm wants to merge 8 commits into
… skills
Batch 1 of three from the codedb agent-discoverability investigation
(docs/ai/specs/codedb-agent-discoverability.md).
The previous surfaces told agents to "PREFER ox code search over Grep" — a
soft directive that pre-trained tool-use instincts override. They never
showed what `ox code` can do that grep cannot, so agents kept defaulting to
grep/rg/find for everything.
This commit changes the framing from prescription to demonstration:
- cmd/ox/agent_prime_xml.go: <code-search> banner now lists 9 concrete DSL
examples (type:symbol, calledby:, calls: depth:, type:pr, type:comment
ckind:todo, author+after, prs, insights, activity) and the full keyword
inventory. Restricts Grep/Glob to exact-string-in-known-file and 0-result
fallback. Banner lives in the static prefix-cache region.
- cmd/ox/code.go: codeCmd.Long and codeSearchCmd.Long now carry the DSL
grammar + 7 example queries spanning symbol, call-graph, PR, comment,
history, and regex intents.
- cmd/ox/code_activity.go, code_prs.go: Short strings reworded from
pipeline-internal language ("for the fact extractor") to agent-visible
value props ("Recent GitHub activity ... over a time window",
"PRs ranked for triage").
- internal/prime/guidance.go: <commands> table expanded from 2 ox-code rows
to 6, surfacing calls/calledby, type:pr/type:issue, prs, activity, and
insights as distinct intents. ADR-019's call graph is no longer hidden
behind DSL grammar agents don't see.
- cmd/ox/agent_prime.go: CodeSearchTip rewording aligned with new framing.
- .claude/rules/ox-code.md (new): decision tree + DSL cheatsheet +
anti-patterns + fallback policy + index-health notes.
- claude-plugin/skills/ox/SKILL.md, extensions/claude/commands/ox.md:
"Search Code, History, PRs" sections added so the shipped skill and
slash-command help mention `ox code` (previously silent on it).
No production logic changed — all surface-text/instruction edits.
Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-05-28T09-15-rupak-OxWo5N/view
…tatus, JIT DSL hint, latency stats
Batch 2 of three from the codedb agent-discoverability investigation.
Focuses on per-call signal quality so the first time an agent invokes
`ox code search` it learns the tool is fast, recoverable, and richer than grep.
- R6 / snippet width:
defaultSnippetLen bumped 120 → 200 chars. New `--snippet N` flag overrides
per call. 120 chars was cutting mid-arg-list on typical Go signatures and
training agents that "ox returns less than grep" — 200 covers most function
signatures plus brief context. Description on --full-json updated from
"~6x more context tokens" to "~4x" to reflect the new ratio.
- R7 / structured index-not-ready response:
When an agent context hits an index that is still being built, or has not
yet been created, `ox code search|insights|prs|activity` now emit a
structured JSON response on stdout and exit 0:
{"status":"indexing"|"not_indexed",
"message":"...",
"fallback_hint":"..."}
Humans still get the previous human-readable error so the terminal
experience is unchanged. Agents branch on `status` instead of abandoning
the tool on stderr. Status string constants live in code.go so callers
parsing the JSON have a stable contract.
- R9 / JIT DSL hint:
When the agent issues a bare single-term query (no DSL filters, no OR, no
/regex/), and we returned at least one result, append a one-line nudge to
the `guidance` field pointing at the filters they likely don't know
about: `type:symbol`, `calledby:`/`calls:`, `type:pr`/`type:issue`,
`before:`/`after:`. Just-in-time DSL discovery without bloating the
per-call response when the query already used filters.
- R8 / stderr latency stats:
`ox code search` now writes a one-liner to stderr after each call:
codedb: 247 results in 12ms (dirty overlays: 2)
Stderr only — stdout JSON consumers unaffected. Suppressed by --quiet.
Calibrates agents (and humans) on latency without per-call context cost.
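As a rough illustration of the R7 contract above, a caller on the agent side might branch on `status` along these lines. This is only a sketch: the struct, the `classifyOutput` helper, and the Go constant names are hypothetical, and only the three JSON keys and the two status strings come from this commit message.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Status strings as quoted in the commit message; the Go names are assumptions.
const (
	statusIndexing   = "indexing"
	statusNotIndexed = "not_indexed"
)

// indexNotReady mirrors the three documented JSON keys.
type indexNotReady struct {
	Status       string `json:"status"`
	Message      string `json:"message"`
	FallbackHint string `json:"fallback_hint"`
}

// classifyOutput reports whether stdout carries a not-ready status object.
// Any other payload (a results object, an array, malformed text) is
// treated as "not a status response".
func classifyOutput(stdout []byte) (indexNotReady, bool) {
	var resp indexNotReady
	if err := json.Unmarshal(stdout, &resp); err != nil {
		return indexNotReady{}, false
	}
	switch resp.Status {
	case statusIndexing, statusNotIndexed:
		return resp, true
	}
	return indexNotReady{}, false
}

func main() {
	out := []byte(`{"status":"indexing","message":"index is being built","fallback_hint":"use grep for now"}`)
	if resp, notReady := classifyOutput(out); notReady {
		fmt.Println("fall back:", resp.Status, "-", resp.FallbackHint)
	}
}
```

Because the process exits 0 in this case, the caller has to inspect stdout rather than the exit code to tell "no index yet" apart from "no matches".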
Unit tests in cmd/ox/code_test.go cover:
- isBareQuery: 9 cases including empty, whitespace, DSL, OR, regex
- compactSearchResults default snippet length (200)
- --snippet override applies per call
- paging guidance still emitted when total > limit
- emitIndexNotReadyJSON shape for both "indexing" and "not_indexed"
- status constants locked in (agent contract)
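For readers unfamiliar with the R9 rule, the bare-query check exercised by those test cases could look roughly like the following. This is a hypothetical reimplementation written from the prose description (single term, no DSL filter, no OR, no /regex/), not the code in cmd/ox/code.go; in particular, treating an empty query as not bare is an assumption.

```go
package main

import (
	"fmt"
	"strings"
)

// isBareQuery (sketch) returns true for a single plain term such as "auth".
func isBareQuery(q string) bool {
	fields := strings.Fields(q) // also drops leading/trailing whitespace
	if len(fields) != 1 {
		return false // empty, whitespace-only, multi-term, or "a OR b"
	}
	term := fields[0]
	if strings.Contains(term, ":") {
		return false // looks like a DSL filter, e.g. type:symbol
	}
	if len(term) >= 2 && strings.HasPrefix(term, "/") && strings.HasSuffix(term, "/") {
		return false // /regex/ form
	}
	return true
}

func main() {
	for _, q := range []string{"auth", "", "type:symbol", "foo OR bar", "/auth.*/"} {
		fmt.Printf("%q bare=%v\n", q, isBareQuery(q))
	}
}
```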
The two pre-existing TestDoctorFreshInstall_* failures (version-update
warning v0.8.1 → v0.10.0) are unrelated to this branch — verified by
running them with the branch stashed.
Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-06-14T15-24-rupak-OxWo5N/view
Batch 3 of three from the codedb agent-discoverability investigation.
Closes the verb/DSL mismatch identified in the investigation: agents
pattern-match on verbs, but ADR-019's call-graph capability sat behind DSL
filters (calls:, calledby:) with no verb-mode entry point. After this commit
the unique-value queries are discoverable by guessing the verb.
New subcommands (each a thin shell over the shared runCodeSearch helper
extracted from codeSearchCmd):
ox code defs <name>                                  # where is <name> defined? (type:symbol)
ox code callers <name>                               # who calls <name>? (calledby:)
ox code callees <name> --depth N                     # what <name> calls (calls: depth:)
ox code refs <name> [--lang go]                      # text references (type:code lang:)
ox code log <path> [--author X --after YYYY-MM-DD]   # commit history (file: type:commit ...)
All verbs inherit the same flags as `ox code search` (--limit, --snippet,
--full-json) so output is byte-for-byte identical with the equivalent DSL
invocation. Implementation lives in cmd/ox/code_verbs.go; each command
builds the DSL string and calls runCodeSearch — no duplication of search
flow, index-not-ready handling, or stats emission.
cmd/ox/code.go: extracted runCodeSearch(cmd, query) from the inline RunE.
codeSearchCmd.RunE now just joins args and delegates. Behavior is preserved;
existing tests on compactSearchResults/emitIndexNotReadyJSON/isBareQuery
still cover the path.
Tests (cmd/ox/code_verbs_test.go, 9 cases):
- Each verb round-trips through search.ParseQuery to the expected
ParsedQuery — callers→Filters.CalledBy, callees→Filters.Calls (+optional
Depth), defs→SearchTypeSymbol, refs→SearchTypeCode (+optional Lang),
log→Filters.File + SearchTypeCommit (+optional Author/After/Before).
- All five verb commands registered on codeCmd so they appear in
`ox code --help`.
Banner / surface updates so agents find the verbs:
- cmd/ox/agent_prime_xml.go: <code-search> banner now leads with the verb
listing, then DSL-mode fallback, then the structured-status JSON contract.
Verb-mode + index-status contract added to the cacheable prefix.
- internal/prime/guidance.go: <commands> table opens with five verb rows
before the DSL-mode rows — agents reading top-down land on the verb first.
- .claude/rules/ox-code.md: decision tree updated to use verbs.
- claude-plugin/skills/ox/SKILL.md: split into "Verb-mode" + "DSL-mode"
tables.
- extensions/claude/commands/ox.md: same split for the slash-command.
`number:` DSL filter and `ox find` alias from the report were intentionally
deferred — the verb wrappers + framing changes are the load-bearing piece
for discoverability. Filter additions can ship later without touching agent
surfaces.
Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-06-14T15-24-rupak-OxWo5N/view
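To make the "each command builds the DSL string" idea concrete, here is a minimal sketch of what such builders might look like. The function names, the term/filter ordering, and the `author:`/`after:` keyword spellings are assumptions for illustration; only the verb-to-filter mapping (calledby:, calls: depth:, type:symbol, type:code lang:, file: type:commit) is taken from the commit message.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical builders; the real ones live in cmd/ox/code_verbs.go.
func defsQuery(name string) string    { return name + " type:symbol" }
func callersQuery(name string) string { return "calledby:" + name }

func calleesQuery(name string, depth int) string {
	q := "calls:" + name
	if depth > 0 {
		q += fmt.Sprintf(" depth:%d", depth)
	}
	return q
}

func refsQuery(name, lang string) string {
	q := name + " type:code"
	if lang != "" {
		q += " lang:" + lang
	}
	return q
}

func logQuery(path, author, after string) string {
	parts := []string{"file:" + path, "type:commit"}
	if author != "" {
		parts = append(parts, "author:"+author)
	}
	if after != "" {
		parts = append(parts, "after:"+after)
	}
	return strings.Join(parts, " ")
}

func main() {
	fmt.Println(callersQuery("compactSearchResults"))
	fmt.Println(calleesQuery("Handler", 2))
	fmt.Println(logQuery("cmd/ox/code.go", "", "2025-01-01"))
}
```

Keeping the builders as pure string functions would also let tests assert on the generated query directly, rather than only on hand-written DSL strings.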
…pact commit fields
Three follow-ups to Batch 2/3 surfaced while end-to-end testing the live
index:
1. formatSearchLatency: the agent-facing stderr stats line was floored to
"<1s" by formatDurationBrief, which hid the signal it exists to provide
(telling the agent ox code is FAST). Added a precision-aware formatter
that uses µs/ms below one second and falls back to the human-friendly
brief format above. Real numbers now show through: a recent ox code
refs run reports "11 results in 18ms" instead of "<1s".
2. compactSearchResult: added commit_hash, author, commit_message fields
so type:commit queries (used by `ox code log`) return useful data in
compact mode. Previously the compact response dropped these — agents
saw `{"snippet": ""}` for every commit row and had to fall back to
--full-json. Snippet width applies to commit_message too so the
--snippet flag still bounds output cost.
3. ox code log <path>: defaults --after to one year ago when no
time/author filter is supplied. The codedb commit-search executor
requires at least one of author/before/after/message alongside file:,
so the verb would otherwise error on the most natural invocation
`ox code log <path>`. Default keeps the verb always-useful.
No new failures; existing TestDoctorFreshInstall_* failures (v0.8.1 →
v0.10.0 version-update warning) remain unrelated and pre-existing on
this branch.
Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-06-14T15-24-rupak-OxWo5N/view
Companion to the investigation report (codedb-agent-discoverability.md).
Records:
- What shipped across Batches 1/2/3 (4 commits + 1 follow-up)
- Unit-test results: 17 new cases (verb DSL round-trips, snippet
defaults, JIT hint boundaries, structured-status JSON shape,
registration check); 5,872 pass / 2 pre-existing fails / 7 skip
across 12 packages
- Live-index integration check after 'ox code index --full' on this
branch
- Performance measurements on the three axes the investigation
identified:
1. Banner token cost: +330 tokens in the cacheable prefix region
(517B → 1,842B for <code-search>; +6 commands-table rows)
2. Search latency: 3-28ms pure search inside ox (now visible via
the new formatSearchLatency precision); 1s wall-clock is binary
startup overhead, not search
3. Output-size comparison vs grep/git log
- Subagent A/B on a textbook call-graph query
('Find every place that calls compactSearchResults'):
Agent A (no framing): 10 tool calls, 16,785 tokens, 24.5s, 6/6
Agent B (new framing): 3 tool calls, 11,051 tokens, 15.4s, 5/6
-> -70% tool calls, -34% tokens, -37% wall-clock
-> One missed result in B; honest read on what N=1 does and
does not prove
- Open follow-ups deferred for separate work (telemetry, number:
filter, ox find alias, bleve symbol-index health in doctor)
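The percentage deltas quoted for the A/B run follow directly from the raw numbers; the small sketch below recomputes them. The `pctChange` helper is hypothetical and exists only to show the arithmetic.

```go
package main

import (
	"fmt"
	"math"
)

// pctChange returns the relative change from before to after,
// rounded to the nearest whole percent.
func pctChange(before, after float64) int {
	return int(math.Round((after - before) / before * 100))
}

func main() {
	fmt.Println("tool calls:", pctChange(10, 3))        // -70
	fmt.Println("tokens:    ", pctChange(16785, 11051)) // -34
	fmt.Println("wall-clock:", pctChange(24.5, 15.4))   // -37
}
```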
Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-06-14T15-24-rupak-OxWo5N/view
…anner lines
Two findings from /review on the discoverability impl branch:
1. compactSearchResult.Snippet had `json:"snippet,omitempty"` added in the
Batch 3 follow-up, regressing the agent-facing schema contract. The field is
always present in the original (pre-branch) output — omitempty makes it
conditionally absent when the snippet is empty (e.g., type:commit results),
forcing agents to handle a sometimes-missing field they previously could
rely on. Restore the no-omitempty form.
2. cmd/ox/agent_prime_xml.go banner had 3 lines exceeding ~90 chars when
unescaped (longer once `<>` round-trip through `&lt;`/`&gt;`). Not a hard
80-col target — the banner is consumed by an LLM, not a TUI widget — but
tightening keeps the comments aligned in dev readers and respects the spirit
of .claude/rules/design.md rule 12. Wrapped the three offenders without
changing semantics.
Unit tests green: TestCompactSearch* / TestVerb* / TestIsBareQuery /
TestEmitIndexNotReady* all pass after the fix.
Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-06-14T15-24-rupak-OxWo5N/view
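The schema difference behind finding 1 is easy to reproduce with the standard library. The two struct types below are illustrative stand-ins, not the real compactSearchResult; they differ only in the `omitempty` option.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// alwaysPresent keeps the key even when the value is empty.
type alwaysPresent struct {
	Snippet string `json:"snippet"`
}

// conditionallyAbsent drops the key when the value is empty.
type conditionallyAbsent struct {
	Snippet string `json:"snippet,omitempty"`
}

// mustJSON marshals v and panics on error (fine for a demo).
func mustJSON(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		panic(err)
	}
	return string(b)
}

func main() {
	fmt.Println(mustJSON(alwaysPresent{}))       // {"snippet":""}
	fmt.Println(mustJSON(conditionallyAbsent{})) // {}
}
```

A consumer that indexes `snippet` unconditionally keeps working with the first form and has to add a presence check for the second.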
…ion + verb-arg validation
Two follow-ups from the QA Guardian review that were deferred as "real but
not blocking":
1. compactSnippet rune-aware truncation
The original byte-slice path (result[:maxLen]) corrupts UTF-8 when the cut
point falls mid-rune — produces invalid UTF-8 that breaks downstream JSON
encoding for any non-ASCII content (Greek identifiers, emoji in comments,
CJK in PR titles). Walk runes via `for byteIdx := range result`, find the
byte offset of the (maxLen+1)-th rune, cut there. maxLen is now counted in
runes, not bytes. Pure ASCII falls through with identical behavior to the
original.
Tests (cmd/ox/code_test.go):
- TestCompactSnippet_RuneAware_ASCII — byte/rune parity for ASCII
- TestCompactSnippet_RuneAware_MultibyteUTF8 — 2-byte runes (λ)
- TestCompactSnippet_RuneAware_Emoji — 4-byte runes (🤖)
- TestCompactSnippet_RuneAware_NoTruncationNeeded — short input pass-through
2. Verb-argument validation against DSL injection
The verb wrappers build DSL via fmt.Sprintf. Without validation,
`ox code callers "foo calledby:bar"` constructs `calledby:foo calledby:bar`,
which ParseQuery interprets as a two-filter query rather than a
single-symbol lookup. No security impact (DSL is local-only, no privilege
escalation), but produces confusing results and silently breaks the verb's
contract.
Added validateSymbolArg / validatePathArg that reject whitespace, colons,
and quote characters. Wired into all five verbs
(callers/callees/defs/refs/log) and into the --author / --after / --before /
--lang flag values. A user who needs the full DSL is directed to
`ox code search` instead.
Tests:
- TestValidateSymbolArg_AcceptsCleanIdentifiers — auth, ResolveSession, etc.
- TestValidateSymbolArg_RejectsDSLInjection — 7 attack patterns
- TestValidateSymbolArg_RejectsEmpty
- TestValidatePathArg_AcceptsPaths — slashes, dots, dashes OK
- TestValidatePathArg_RejectsDSLInjection
End-to-end smoke confirmed:
- `ox code callers "foo calledby:bar"` → exit non-zero, helpful error
- `ox code callers "foo:bar"` → exit non-zero, helpful error
- `ox code callers ResolveSession` → works as before
- `ox code search "λ" type:code` → no UTF-8 corruption
Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-06-14T15-24-rupak-OxWo5N/view
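The rune-walking truncation from item 1 can be sketched in a few lines. `truncateRunes` is a hypothetical name, and the real compactSnippet may do more (for example whitespace handling or an ellipsis); the sketch covers only the cut-point logic described above.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateRunes returns at most maxLen runes of s, always cutting on a
// rune boundary so the result stays valid UTF-8.
func truncateRunes(s string, maxLen int) string {
	if maxLen <= 0 {
		return ""
	}
	count := 0
	for byteIdx := range s { // byteIdx is the start offset of each rune
		if count == maxLen {
			return s[:byteIdx] // offset of the (maxLen+1)-th rune
		}
		count++
	}
	return s // fewer than maxLen+1 runes: nothing to cut
}

func main() {
	s := "λλλ" // each λ is 2 bytes in UTF-8
	fmt.Println(utf8.ValidString(s[:3]))                // false: byte slice cuts mid-rune
	fmt.Println(utf8.ValidString(truncateRunes(s, 2))) // true
	fmt.Println(truncateRunes(s, 2))                    // λλ
}
```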
…ndations
Co-Authored-By: SageOx <ox@sageox.ai>
SageOx-Session: https://sageox.ai/repo/repo_019c5812-01e9-7b7d-b5b1-321c471c9777/sessions/2026-05-28T09-15-rupak-OxWo5N/view
📝 Walkthrough
Adds five verb-mode
Changes
CodeDB Discoverability for Agents
Sequence Diagram(s)
sequenceDiagram
participant Agent
participant runCodeSearch
participant detectAgentContext
participant CodeDB
participant compactSearchResults
Agent->>runCodeSearch: ox code callers|defs|refs|log|search <query>
runCodeSearch->>detectAgentContext: check for agentID
runCodeSearch->>CodeDB: check index readiness
alt index building or absent
runCodeSearch-->>Agent: JSON {status, message, fallback_hint}
else index ready
runCodeSearch->>CodeDB: execute DSL query
CodeDB-->>runCodeSearch: raw results (symbols/commits/PRs/code)
runCodeSearch->>compactSearchResults: compact(results, snippetLen)
compactSearchResults-->>runCodeSearch: JSON array with rune-truncated snippets + commit fields
runCodeSearch-->>Agent: compact JSON + stderr latency line
end
Estimated code review effort: 🎯 4 (Complex) | ⏱️ ~60 minutes
🚥 Pre-merge checks | ✅ 4 | ❌ 1
❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
Greptile Summary
This PR improves agent discoverability of
Confidence Score: 4/5
Safe to merge with one targeted follow-up: The bulk of the change is instruction-only (banner, help text, skill/rule files) and carries no runtime risk. The new verb wrappers are thin DSL builders with solid injection validation and 26 new tests. The rune-aware cmd/ox/code.go — the
Important Files Changed
|
| limit, _ := cmd.Flags().GetInt("limit") | ||
| snippetLen, _ := cmd.Flags().GetInt("snippet") | ||
| if snippetLen <= 0 { | ||
| snippetLen = defaultSnippetLen | ||
| } | ||
|
|
||
| var buf bytes.Buffer | ||
| enc := json.NewEncoder(&buf) | ||
| enc.SetIndent("", " ") | ||
| var buf bytes.Buffer | ||
| enc := json.NewEncoder(&buf) | ||
| enc.SetIndent("", " ") | ||
|
|
||
| if fullJSON { | ||
| resp := &combinedQueryResponse{CodeResults: results} | ||
| if err := enc.Encode(resp); err != nil { | ||
| return fmt.Errorf("encode: %w", err) | ||
| } | ||
| } else { | ||
| compact := compactSearchResults(results, limit) | ||
| if err := enc.Encode(compact); err != nil { | ||
| return fmt.Errorf("encode: %w", err) | ||
| if fullJSON { | ||
| resp := &combinedQueryResponse{CodeResults: results} | ||
| if err := enc.Encode(resp); err != nil { | ||
| return fmt.Errorf("encode: %w", err) | ||
| } |
Missing not_indexed structured JSON in runCodeSearch
code activity, code prs, and code insights all check whether the index directory exists and, for agent contexts, emit {"status":"not_indexed","fallback_hint":"..."} rather than a hard error. runCodeSearch only handles the actively-building case (isCodeDBIndexing). When no index has ever been built, codedb.Open(dataDir) fails and agents receive a plain Go error string — exactly the "hard error that teaches agents to abandon the tool" the PR explicitly targets. A new repo with detectAgentContext() != "" and no index will see this unstructured error for every code search, code callers, code callees, code defs, code refs, and code log invocation since they all route through runCodeSearch.
The fix mirrors the pattern in code_activity.go: after resolvePreferredCodeDBDir, check whether dataDir is empty or codedb.Open would clearly fail, and emit emitIndexNotReadyJSON(cmd, indexStatusNotIndexed, …) for agent contexts before reaching the Open call.
Actionable comments posted: 2
🧹 Nitpick comments (1)
cmd/ox/code_verbs_test.go (1)
19-85: ⚡ Quick win
These tests validate parser behavior, not the verb wrapper wiring.
The current cases only parse manually authored DSL strings. If wrapper construction drifts, these tests can still pass. Consider extracting small query-builder helpers (or invoking each `RunE` with a captured dispatcher) so tests assert the actual wrapper-generated query.
🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@cmd/ox/code_verbs_test.go` around lines 19 - 85, These tests only validate the parser behavior by testing manually authored DSL strings, but they don't verify that the actual verb command wrappers (the RunE functions for each verb like TestVerbCallers, TestVerbCallees, TestVerbDefs, TestVerbRefs, and TestVerbLog) correctly generate those queries. If the wrapper construction drifts, these tests will still pass even though the commands won't work. Fix this by creating query-builder helpers or modifying the tests to invoke each verb's RunE function with a captured dispatcher to assert that the actual wrapper-generated queries match the expected filter structure, ensuring the full integration from command execution to parsed query is tested rather than just parsing in isolation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@cmd/ox/code_verbs.go`:
- Around line 172-179: The validation loop for command-line arguments currently
routes `--after` and `--before` through validateSymbolArg, which forbids colons
and rejects valid ISO 8601 timestamps. Fix this by conditionally applying
different validation logic based on the filter name: keep validateSymbolArg for
`--author`, but skip the validateSymbolArg check for `--after` and `--before`
(or apply a less restrictive validation that permits timestamps). You can modify
the loop to check if v.name is in the timestamp filters and handle those cases
separately, allowing colons in their values while maintaining proper validation
for the author filter.
In `@cmd/ox/code.go`:
- Around line 180-197: In the runCodeSearch function, add a check for missing
index metadata before opening the CodeDB with codedb.Open(dataDir) at line 194.
Similar to how isCodeDBIndexing is checked, determine if the index metadata does
not exist. When the index metadata is missing and agentID is not empty, return
emitIndexNotReadyJSON with an appropriate status value to provide the structured
JSON contract for agent callers instead of allowing the raw open error to
surface.
📒 Files selected for processing (16)
- .claude/rules/ox-code.md
- claude-plugin/skills/ox/SKILL.md
- cmd/ox/agent_prime.go
- cmd/ox/agent_prime_xml.go
- cmd/ox/agent_query.go
- cmd/ox/code.go
- cmd/ox/code_activity.go
- cmd/ox/code_insights.go
- cmd/ox/code_prs.go
- cmd/ox/code_test.go
- cmd/ox/code_verbs.go
- cmd/ox/code_verbs_test.go
- docs/specs/codedb-agent-discoverability.md
- docs/specs/codedb-discoverability-impl-results.md
- extensions/claude/commands/ox.md
- internal/prime/guidance.go
for _, v := range []struct{ name, val string }{
	{"--author", author}, {"--after", after}, {"--before", before},
} {
	if v.val == "" {
		continue
	}
	if err := validateSymbolArg(v.name, v.val); err != nil {
		return err
--after/--before are over-restricted and reject valid timestamp values.
Line 178 routes all log filters through validateSymbolArg, which forbids :. That blocks valid datetime values (e.g., 2026-04-01T10:00:00Z) for --after/--before.
💡 Suggested fix
+func validateDateArg(name, arg string) error {
+ if arg == "" {
+ return fmt.Errorf("%s: value must not be empty", name)
+ }
+ if strings.ContainsAny(arg, " \t\n\r\"'") {
+ return fmt.Errorf("%s: value must not contain whitespace or quotes", name)
+ }
+ return nil
+}
@@
for _, v := range []struct{ name, val string }{
{"--author", author}, {"--after", after}, {"--before", before},
} {
if v.val == "" {
continue
}
- if err := validateSymbolArg(v.name, v.val); err != nil {
+ var err error
+ switch v.name {
+ case "--after", "--before":
+ err = validateDateArg(v.name, v.val)
+ default:
+ err = validateSymbolArg(v.name, v.val)
+ }
+ if err != nil {
return err
}
}
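The over-restriction flagged in this comment can be reproduced with a small standalone sketch. `validateSymbolArg` here is a hypothetical reimplementation based on the commit description (rejects whitespace, colons, and quotes); `validateDateArg` follows the suggested fix above. Neither is copied from cmd/ox/code_verbs.go.

```go
package main

import (
	"fmt"
	"strings"
)

// validateSymbolArg (sketch) rejects anything that could split into
// extra DSL tokens: whitespace, colons, and quote characters.
func validateSymbolArg(name, arg string) error {
	if arg == "" {
		return fmt.Errorf("%s: value must not be empty", name)
	}
	if strings.ContainsAny(arg, " \t\n\r:\"'") {
		return fmt.Errorf("%s: value must not contain whitespace, colons, or quotes", name)
	}
	return nil
}

// validateDateArg mirrors the suggested fix: colons are allowed so
// full timestamps pass, while whitespace and quotes are still rejected.
func validateDateArg(name, arg string) error {
	if arg == "" {
		return fmt.Errorf("%s: value must not be empty", name)
	}
	if strings.ContainsAny(arg, " \t\n\r\"'") {
		return fmt.Errorf("%s: value must not contain whitespace or quotes", name)
	}
	return nil
}

func main() {
	ts := "2026-04-01T10:00:00Z"
	fmt.Println("symbol validator:", validateSymbolArg("--after", ts))
	fmt.Println("date validator:  ", validateDateArg("--after", ts))
}
```

Date-only values such as `2026-04-01` pass both validators, which may be why the issue did not show up in the smoke tests listed earlier.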
| dataDir, useLedger := resolvePreferredCodeDBDir(root) | ||
| agentID, _ := detectAgentContext() | ||
|
|
||
| db, err := codedb.Open(dataDir) | ||
| if err != nil { | ||
| return fmt.Errorf("open codedb: %w", err) | ||
| // Index-not-ready paths: emit structured JSON when an agent is calling | ||
| // so it can branch on `status` instead of treating an error as terminal. | ||
| if isCodeDBIndexing(useLedger) { | ||
| if agentID != "" { | ||
| return emitIndexNotReadyJSON(cmd, indexStatusIndexing, | ||
| "Code index is currently being built. Search will be available once indexing completes.", | ||
| "Use Grep/Glob until indexing completes; rerun 'ox code search' afterward.") | ||
| } | ||
| defer db.Close() | ||
| return fmt.Errorf("code index is currently being built — search is unavailable until indexing completes. Run 'ox code status' to check progress") | ||
| } | ||
|
|
||
| // attach all daemon-built dirty overlays for uncommitted file search | ||
| // (supports multiple simultaneous worktrees) | ||
| if n := db.AttachAllDirtyIndexes(); n > 0 { | ||
| slog.Debug("attached dirty overlays", "count", n) | ||
| } | ||
| db, err := codedb.Open(dataDir) | ||
| if err != nil { | ||
| return fmt.Errorf("open codedb: %w", err) | ||
| } |
Missing-index agent path is not handled before opening CodeDB.
On Line 194, runCodeSearch opens the DB without first checking whether the index metadata exists. For agent callers, this can skip the structured status:"not_indexed" contract and surface a raw open/search path instead.
💡 Suggested fix
import (
"bytes"
"context"
"encoding/json"
+ "errors"
"fmt"
"log/slog"
"os"
"path/filepath"
@@
dataDir, useLedger := resolvePreferredCodeDBDir(root)
agentID, _ := detectAgentContext()
+
+ metadataPath := filepath.Join(dataDir, store.MetadataDBFile)
+ if _, statErr := os.Stat(metadataPath); errors.Is(statErr, os.ErrNotExist) {
+ if agentID != "" {
+ return emitIndexNotReadyJSON(cmd, indexStatusNotIndexed,
+ "No code index found for this repo.",
+ "Run 'ox code index' to build the index, then rerun 'ox code search'.")
+ }
+ return fmt.Errorf("no code index found — run 'ox code index' first")
+ } else if statErr != nil {
+ return fmt.Errorf("check codedb index: %w", statErr)
+ }
// Index-not-ready paths: emit structured JSON when an agent is calling
Summary
Three batches of changes that make
ox codeactually reachable by AI coworkers (Claude Code, Cursor, Codex, Windsurf, etc.). Driven by the observation that even thoughox codeexists and would help agents understand the codebase, agents almost never reach for it — defaulting toGrep/Glob/Readchains instead.The investigation report (
docs/specs/codedb-agent-discoverability.md) is included in this PR for the full "why." TL;DR:ox code searchover Grep" without showing whatox codecan do that grep cannot. Pre-trained tool-use instincts (grep/rg/find) override softPREFERdirectives.type:,calls:,calledby:,confidence:,before:/after:,/regex/— is the unique value, and it was surfaced nowhere the agent could see. Not in--help(zero examples, two flags), not in skill files, not in.claude/rules/. ADR-019's call-graph capability sat behind DSL filters with no verb-mode wrapper.code index is currently being builttaught them to abandon the tool. Stale-index anxiety was unaddressed.What changed
Batch 1 — Framing (3 commits, instruction-only)
cmd/ox/agent_prime_xml.go—<code-search>banner rewritten from prescription to demonstration. Now lists 9 concrete DSL invocations showing what grep cannot do (calledby:authenticate,calls:Handler depth:2,type:pr,type:comment ckind:todo, etc.) and the full keyword inventory. Restricts Grep/Glob to "exact-string-in-known-file" and "0-result fallback." Lives in the static prefix-cache region so the +330 token cost is paid once per session.cmd/ox/code.go—codeCmd.LongandcodeSearchCmd.Longcarry full DSL grammar + 7 example queries spanning symbol / call-graph / PR / comment / history / regex intents. Previously--helpshowed two flags and zero examples.cmd/ox/code_activity.go,code_prs.go—Shortstrings reworded from pipeline-internal language ("for the fact extractor") to agent-visible value props.internal/prime/guidance.go—<commands>table expanded from 2 ox-code rows to 6, surfacingcalls/calledby,type:pr/type:issue,prs,activity,insightsas distinct intents.cmd/ox/agent_prime.go—CodeSearchTiprewording aligned with new framing..claude/rules/ox-code.md(new) — decision tree + DSL cheatsheet + anti-patterns + fallback policy + index-health notes.claude-plugin/skills/ox/SKILL.md,extensions/claude/commands/ox.md— "Search Code, History, PRs" sections added. Previously silent onox code.Batch 2 — Ergonomics (1 commit, small CLI surface changes)
- Snippet length: the previous default of 120 was cutting mid-arg-list on Go signatures; `--snippet N` overrides the default.
- `--full-json` description updated from `~6x` to `~4x` tokens.
- `ox code search|insights|prs|activity` now emit `{"status":"indexing"|"not_indexed", "fallback_hint":"..."}` JSON on stdout (exit 0) when invoked by an agent (`detectAgentContext()` true). Humans still get the human-readable error. Agents branch on `status` instead of abandoning the tool on a stderr error.
- DSL hints (`type:symbol`, `calledby:`/`calls:`, `type:pr`/`type:issue`, `before:`/`after:`): just-in-time discovery without bloating responses that already use the DSL.
- `ox code search` now writes `codedb: N results in Tms (dirty overlays: K)` to stderr. Calibrates agents on latency without per-call context cost. Suppressed by `--quiet`.

Batch 3 — Verb wrappers (1 commit + 1 follow-up, new surface)
ADR-019's resolved call graph (`calls:`/`calledby:`) was completely hidden behind DSL grammar. Agents pattern-match on verbs (`callers`, `who-calls`) more reliably than on filter keywords. New thin wrappers, each a shell over the shared `runCodeSearch` helper extracted from `codeSearchCmd.RunE`: `defs`, `callers`, `callees`, `refs`, `log`.

All verbs inherit the same flags as `ox code search` (`--limit`, `--snippet`, `--full-json`) so output is byte-for-byte identical with the equivalent DSL invocation. Banner, guidance table, skill, slash command, and rule all updated to list verbs first, DSL-mode second.

Follow-ups (2 commits, review feedback)
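As a rough illustration of how a verb wrapper and the verb-arg validation from these follow-ups could fit together, here is a minimal sketch. It is not the code in `cmd/ox/code_verbs.go`: `validateSymbol` and `callersQuery` are hypothetical names, the exact set of rejected quote characters is an assumption, and only the `callers` to `calledby:` mapping is taken from this description.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// validateSymbol is a hypothetical stand-in for the PR's validateSymbolArg.
// It rejects whitespace, colons, and quote characters (the quote set used
// here is an assumption of this sketch).
func validateSymbol(arg string) error {
	for _, r := range arg {
		if unicode.IsSpace(r) || r == ':' || strings.ContainsRune("\"'`", r) {
			return fmt.Errorf("invalid character %q in symbol %q", r, arg)
		}
	}
	return nil
}

// callersQuery turns the verb form (`ox code callers <symbol>`) into the
// equivalent DSL query, refusing arguments that could smuggle in filters.
func callersQuery(symbol string) (string, error) {
	if err := validateSymbol(symbol); err != nil {
		return "", err
	}
	return "calledby:" + symbol, nil
}

func main() {
	for _, arg := range []string{"authenticate", "foo calledby:bar"} {
		q, err := callersQuery(arg)
		if err != nil {
			fmt.Println("rejected:", err)
			continue
		}
		fmt.Println(q)
	}
}
```

Validating before concatenation is what keeps an argument such as `foo calledby:bar` from being rendered as two filters.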
fix(codedb): sub-second precision on search-latency stderr line + compact commit fields
- `formatSearchLatency` exposes µs/ms below 1s so the stderr line shows real numbers (was floored to `<1s`)
- `compactSearchResult` adds `commit_hash`, `author`, `commit_message`; `type:commit` results used to drop these and return `{"snippet":""}`
- `ox code log <path>` defaults `--after` to one year ago so the natural invocation always returns data (executor requires at least one of author/before/after/message)

fix(codedb): review feedback — preserve snippet JSON contract, wrap banner lines
- `Snippet` field reverted from `omitempty` to always-present (preserves agent JSON contract)

fix(codedb): close deferred review items — rune-aware snippet truncation + verb-arg validation
- `compactSnippet` now walks runes via `for byteIdx := range result`; byte-slice truncation could corrupt multi-byte UTF-8 (Greek identifiers, emoji, CJK PR titles) and break downstream JSON encoding
- `validateSymbolArg` / `validatePathArg` reject whitespace, colons, and quote characters on verb args + `--author`/`--after`/`--before`/`--lang` flag values; prevents DSL-injection confusion like `ox code callers "foo calledby:bar"` rendering as `calledby:foo calledby:bar`

Tests
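The property that `TestCompactSnippet_RuneAware_*` is meant to protect can be illustrated with a small sketch. This is not the PR's `compactSnippet`: `truncateRunes` is a hypothetical helper, and whether the real limit counts runes or bytes is an assumption here. It relies only on the fact that ranging over a Go string yields the byte index of each rune start.

```go
package main

import (
	"fmt"
	"unicode/utf8"
)

// truncateRunes (hypothetical helper) keeps at most maxRunes runes of s.
// `for byteIdx := range s` visits the starting byte index of each rune,
// so slicing at byteIdx always lands on a rune boundary.
func truncateRunes(s string, maxRunes int) string {
	if maxRunes <= 0 {
		return ""
	}
	count := 0
	for byteIdx := range s {
		if count == maxRunes {
			return s[:byteIdx]
		}
		count++
	}
	return s // s has maxRunes runes or fewer
}

func main() {
	s := "func λ(x int)"
	byteCut := s[:6]               // slices through the 2-byte λ
	runeCut := truncateRunes(s, 6) // "func λ"
	fmt.Println(utf8.ValidString(byteCut), utf8.ValidString(runeCut)) // prints "false true"
}
```

The byte-slice version yields invalid UTF-8 as soon as the cut falls inside a multi-byte character, which is the failure mode the follow-up commit describes.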
26 new test cases across `cmd/ox/code_test.go` and `cmd/ox/code_verbs_test.go`:
- `TestIsBareQuery`
- `TestCompactSearchResults_*`
- `TestCompactSnippet_RuneAware_*`
- `TestEmitIndexNotReadyJSON_*`: `indexing` and `not_indexed` shapes
- `TestIndexStatusConstants_AreStable`
- `TestValidateSymbolArg_*`
- `TestValidatePathArg_*`
- `TestVerb*_BuildsDSL`
- `TestVerbCommands_AreRegistered`: `codeCmd`

Full suite: 5,872 pass / 2 pre-existing fails / 7 skip across 12 packages. Two failures (`TestDoctorFreshInstall_*`) are unrelated: verified by stashing the branch and rerunning (same failures). Root cause is a daemon-cached version-update notice being treated as an unexpected warning by the test fixtures.

Performance
Banner token cost
`<code-search>` block bytes

Search latency (sub-second precision now visible)
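For readers wondering what "sub-second precision" means for the stderr line, here is a minimal sketch of one way to format such latencies. `formatLatency` is a hypothetical stand-in, not the PR's `formatSearchLatency`; the unit thresholds and the sample numbers in `main` are assumptions of this sketch.

```go
package main

import (
	"fmt"
	"time"
)

// formatLatency (hypothetical) reports microseconds below 1ms, milliseconds
// below 1s, and fractional seconds otherwise, instead of flooring every
// sub-second value to "<1s".
func formatLatency(d time.Duration) string {
	switch {
	case d < time.Millisecond:
		return fmt.Sprintf("%dµs", d.Microseconds())
	case d < time.Second:
		return fmt.Sprintf("%dms", d.Milliseconds())
	default:
		return fmt.Sprintf("%.1fs", d.Seconds())
	}
}

func main() {
	// Sample values only; they are not measurements from this PR.
	fmt.Printf("codedb: %d results in %s (dirty overlays: %d)\n",
		12, formatLatency(3*time.Millisecond), 0)
	// prints "codedb: 12 results in 3ms (dirty overlays: 0)"
}
```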
Pure search 3–28ms vs grep ~40–100ms / ripgrep ~45–56ms. Wall-clock (incl. binary startup) is ~1s for `ox code`; that's a known property of the ox binary, not introduced by this PR.

Subagent A/B
Two rounds of testing whether the new framing actually changes agent behavior:
Round 1 — same task, different framing
`ox` repo: 5 understand-the-codebase tasks × 2 framings = 10 subagents. Aggregate: A=37 tools / 103,945 tokens / 137.7s wall, B=36 / 99,371 / 135.5s, i.e. −3% tools, −4% tokens, −2% wall. Small but consistent direction; B was also more accurate on T1 (no false positive caller).
Round 2 (a second, smaller repo; no `.sageox/` priming, 1,484 symbols): 3 tasks × 2 = 6 subagents. Aggregate: B was worse, with +31% tools, +30% tokens, +56% wall. Direction reversed because (a) no environmental priming so framing came from prompt only, (b) symbols not in index → `ox code` calls returned 0 → fell back to grep anyway, (c) repo too small for the index advantage to compound.

What this proves: framing helps when (i) the repo is large enough that codedb actually beats grep, (ii) the target symbol exists in the index, (iii) the agent gets the banner via `ox agent prime`, not just an inline prompt. The initial N=1 test showing −70% over-sold the win; N=8 across two repos shows a modest, conditional improvement. Honest writeup in `docs/specs/codedb-discoverability-impl-results.md`.

Implementation discipline
- `runCodeSearch` extracted from `codeSearchCmd.RunE` so verb wrappers share the index-not-ready handling, dirty-overlay attach, search, and stats emission, with no duplication
- Call-graph DSL filters (`calls:`/`calledby:`/`confidence:`) already in main (12643f0); this PR builds on them
- Deferred: `number:<n>` DSL filter (would enable `ox code pr <number>` verb)
- Deferred: `ox find` root-level alias (touches `rootCmd`, a required-review surface per `AGENTS.md`)
- Deferred: `ox doctor` integration (separate hardening work)

Test plan
- `go build ./...` clean
- `go vet ./cmd/ox/... ./internal/prime/...` clean
- `go test ./cmd/ox/ ./internal/codedb/... ./internal/prime/...`: 5,872 pass / 2 pre-existing fail / 7 skip
- `./ox agent prime --format=xml` shows new banner with verb-mode + DSL-mode examples
- `./ox code defs <name>`, `callers`, `callees`, `refs`, `log` all execute against live index
- `./ox code callers "foo calledby:bar"` rejected with helpful error
- `./ox code search "λ" type:code` returns valid UTF-8 (no mid-rune corruption)
- Writeup in `docs/specs/codedb-discoverability-impl-results.md`

Files changed
New: `cmd/ox/code_verbs.go`, `cmd/ox/code_verbs_test.go`, `.claude/rules/ox-code.md`, `docs/specs/codedb-agent-discoverability.md`, `docs/specs/codedb-discoverability-impl-results.md`
Modified: `cmd/ox/code.go`, `cmd/ox/code_test.go`, `cmd/ox/code_activity.go`, `cmd/ox/code_insights.go`, `cmd/ox/code_prs.go`, `cmd/ox/agent_prime.go`, `cmd/ox/agent_prime_xml.go`, `cmd/ox/agent_query.go`, `internal/prime/guidance.go`, `claude-plugin/skills/ox/SKILL.md`, `extensions/claude/commands/ox.md`

🤖 Generated with Claude Code
Co-Authored-By: SageOx <ox@sageox.ai>
Summary by CodeRabbit
Release Notes
New Features
- Verb-mode commands (`callers`, `callees`, `defs`, `refs`, `log`) for common code search patterns

Documentation
- `ox code` command documentation with DSL filter examples and verb-mode guides

Tests