Skip to content

fix(security): unify path validation to use realpath, removing dead isPathWithinRootReal#384

Open
matrix9neonebuchadnezzar2199-sketch wants to merge 3 commits into
colbymchenry:mainfrom
matrix9neonebuchadnezzar2199-sketch:fix/path-validation-unify
Open

fix(security): unify path validation to use realpath, removing dead isPathWithinRootReal#384
matrix9neonebuchadnezzar2199-sketch wants to merge 3 commits into
colbymchenry:mainfrom
matrix9neonebuchadnezzar2199-sketch:fix/path-validation-unify

Conversation

@matrix9neonebuchadnezzar2199-sketch
Copy link
Copy Markdown

Summary

Consolidates two path-validation helpers in src/utils.ts so that all callers
transparently get realpath-based symlink-escape protection. The public API
is preserved; only the internals of validatePathWithinRoot change.
This PR also includes fork-only HTML docs under docs/; the security fix is src/utils.ts + tests/security.test.ts only.

Problem

src/utils.ts defines two helpers with overlapping intent:

  • validatePathWithinRoot(projectRoot, filePath) — logical check only
    (path.resolve + prefix comparison)
  • isPathWithinRootReal(filePath, rootDir) — same logical check, plus a
    fs.realpathSync step to catch symlink escapes

Every production call site uses the logical-only variant:

  • src/mcp/tools.ts (all MCP tool handlers)
  • src/extraction/index.ts (parser entry points)
  • src/context/* (context builder)

isPathWithinRootReal is defined and exported but never imported anywhere
in the repo — git grep isPathWithinRootReal returns zero call sites. The
stronger check has been dead code since it was added.

As a result, a symlink inside the project tree whose target is outside the
project root passes validation: the logical path stays under the root, so
the prefix check succeeds, even though the on-disk resolution escapes the
project. There is no test today that exercises this scenario — the
__tests__/security.test.ts "Path Traversal Prevention" section only covers
relative-path traversal and a missing-id case.

Change

src/utils.ts:

  • validatePathWithinRoot now performs the logical check and a
    fs.realpathSync step. Resolved real paths are compared with a new
    case-aware helper so Windows roots (C:\Proj vs c:\proj) compare
    correctly on case-insensitive filesystems.
  • ENOENT from realpathSync falls back to the logical result. This is
    intentional: indexing legitimately references files that may not exist
    on disk at validation time (deletions, race with the watcher). The
    logical prefix has already been verified at this point, so the
    attack surface for ENOENT is bounded to in-root paths.
  • Other realpathSync errors (ELOOP, EACCES, etc.) are treated
    conservatively as rejections.
  • isPathWithinRoot becomes a thin boolean wrapper around
    validatePathWithinRoot.
  • isPathWithinRootReal is removed (verified dead).

__tests__/security.test.ts:

Seven new cases under a new "Path Traversal Prevention — symlink escapes"
block:

  1. Relative .. traversal rejected
  2. Absolute path outside root rejected
  3. Symlink inside project pointing outside — rejected (primary regression target)
  4. Symlink inside project pointing inside — accepted
  5. Nested valid paths accepted
  6. Non-existent path inside/outside root — handled per ENOENT policy above
  7. Windows-only: case-insensitive root comparison

Tests creating symlinks early-return on EPERM/ENOSYS, so Windows runners
without developer mode are not broken; they skip 2 of the 7 cases cleanly.

Compatibility

  • Public API unchanged. validatePathWithinRoot, isPathWithinRoot
    keep their signatures and return types. Callers that gate on
    !== null / boolean continue to behave identically for previously
    accepted paths.
  • Behavior change is strictly tightening. Paths that used to pass and
    now fail are exactly the symlink-escape cases the helper was nominally
    supposed to block.
  • Returned resolved path now reflects realpath when available. Existing
    callers store this in the DB and pass it to fs.readFile, both of which
    are unaffected (and arguably more consistent now that symlinked indirections
    are normalized).
  • No dependency changes. No package.json touch.
  • No call-site changes required. All three production modules
    (mcp/tools.ts, extraction/index.ts, context/*) keep working without
    edits.

Verification

npm run build # clean npm test # full suite: see note below vitest run tests/security.test.ts # 44 passed / 2 skipped (the 2 are Windows-symlink # cases that early-return without admin privileges) git grep isPathWithinRootReal # 0 hits

In the fork's CI on Windows, 5 unrelated tests in mcp-roots.* fail with
EBUSY / timeout. These reproduce on main of this repo prior to the
patch and are not introduced by this PR. They appear consistent with the
known stdin-close → process.exit(0) race in src/mcp/transport.ts,
which I'll address in a separate PR.

Out of scope

Deliberately not bundled here, to keep the diff focused and reviewable:

  • install.sh / install.ps1 SHA256 verification (the npm-shim path
    already verifies SHA256SUMS since 0.9.4; only the shell direct-install
    path is unguarded)
  • MCP transport graceful shutdown (the process.exit(0) issue mentioned above)
  • src/mcp/tools.ts decomposition (currently ~100 KB, single file)

Happy to send follow-ups for any of these if the direction is welcome.

Diff size

src/utils.ts +92 -38 tests/security.test.ts +103 -0


Author note: Patch developed against a fork where I'm keeping a code
review log; I'm happy to omit or relocate any of that on merge. The
substantive change is fully contained in the two files above.

Document tool architecture, Cursor integration effects, quality/i18n design, and feature proposals for the codegraph-jp fork effort.

Co-authored-by: Cursor <cursoragent@cursor.com>
Consolidate validatePathWithinRoot with realpath symlink checks; remove unused isPathWithinRootReal; add seven security tests.

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
@rickgonzalez
Copy link
Copy Markdown

I just randomly scanned this PR with a tool I'm building - this is the first PR I've done. I hope it helps with the review. https://www.lenzon.ai/viewer/cmpkhwzw10000138ywdsk076u?voice=google-neural2

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants