feat: add file-backed data repo and CLI#223
Conversation
|
Warning Rate limit exceeded
You’ve run out of usage credits. Purchase more in the billing tab. ⌛ How to resolve this issue?After the wait time has elapsed, a review can be triggered using the We recommend that you space out your commits to avoid hitting the rate limit. 🚦 How do rate limits work?CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information. ℹ️ Review info⚙️ Run configurationConfiguration used: defaults Review profile: CHILL Plan: Pro Plus Run ID: 📒 Files selected for processing (5)
📝 WalkthroughWalkthroughThis PR expands the monorepo by introducing two new npm packages— ChangesMonorepo Package Expansion
Estimated code review effort🎯 4 (Complex) | ⏱️ ~60 minutes Possibly related PRs
🚥 Pre-merge checks | ✅ 4 | ❌ 1❌ Failed checks (1 warning)
✅ Passed checks (4 passed)
✏️ Tip: You can configure your own custom pre-merge checks in the settings. ✨ Finishing Touches🧪 Generate unit tests (beta)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out. Comment |
There was a problem hiding this comment.
Actionable comments posted: 1
🧹 Nitpick comments (6)
packages/fs/src/paths.ts (1)
1-3: ⚡ Quick winConsider using Node.js
path.posixfor more robust path normalization.The current implementation uses a simple regex replacement to convert backslashes to forward slashes. While this works for basic cases, it doesn't handle edge cases like:
- Mixed path separators
- UNC paths on Windows
- Path traversal sequences (
..,.)- Duplicate separators
Consider using Node.js built-in
path.posix.normalize()for more robust cross-platform path handling, or at minimum document the function's limitations if the simple approach is intentional for this specific use case.♻️ Alternative implementation using path.posix
+import { posix } from 'node:path'; + export function toDataPath(path: string): string { - return path.replace(/\\/g, '/'); + return posix.normalize(path.replace(/\\/g, '/')); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/fs/src/paths.ts` around lines 1 - 3, The toDataPath function currently does a simple backslash-to-slash replacement which misses edge cases; update to use Node's path.posix.normalize to robustly normalize separators, collapse duplicate slashes and resolve '.'/'..' segments (or explicitly document the limitations if you must keep the simple replace). Locate the function toDataPath in packages/fs/src/paths.ts and replace the regex approach with path.posix.normalize-based normalization (importing path from 'path') or add a clear comment explaining that the implementation intentionally only replaces backslashes and does not handle UNC paths, traversal, or duplicate separators.packages/fs/tsconfig.json (1)
14-16: ⚡ Quick winExclude test files from the TypeScript build.
The pattern
"src/**/*.ts"includes test files likeindex.test.tsin compilation. Withdeclaration: trueandoutDir: "./dist", this generates unnecessary.d.tsfiles for tests in the published package. Add an exclude pattern to keep the build clean.Proposed tsconfig adjustment
{ "compilerOptions": { "target": "es2022", "module": "es2022", "moduleResolution": "bundler", "declaration": true, "sourceMap": true, "outDir": "./dist", "esModuleInterop": true, "forceConsistentCasingInFileNames": true, "strict": true, "skipLibCheck": true }, "include": [ "src/**/*.ts" - ] + ], + "exclude": [ + "src/**/*.test.ts", + "src/**/*.spec.ts" + ] }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/fs/tsconfig.json` around lines 14 - 16, The tsconfig currently uses the include pattern "src/**/*.ts" which pulls test files (e.g., index.test.ts) into the build and produces unwanted declaration files; add an "exclude" array to the same tsconfig.json to filter out test files and test directories (for example patterns like "**/*.test.ts", "**/*.spec.ts", and "**/__tests__/**") so that when declaration: true and outDir: "./dist" are used the test files are not compiled or emitted; update the tsconfig.json to include this exclude entry alongside the existing include.packages/cli/tsconfig.json (1)
14-16: ⚡ Quick winConsider excluding test files from TypeScript compilation.
Test files are currently included in the build output via the
src/**/*.tspattern. While published artifacts are already gated by the"files": ["dist"]field in package.json, it's cleaner to explicitly exclude test files from compilation to avoid unnecessary emit and follow TypeScript best practices.Proposed tsconfig adjustment
{ "compilerOptions": { "target": "es2022", "module": "es2022", "moduleResolution": "bundler", "declaration": true, "sourceMap": true, "outDir": "./dist", "esModuleInterop": true, "forceConsistentCasingInFileNames": true, "strict": true, "skipLibCheck": true }, "include": [ - "src/**/*.ts" + "src/**/*.ts" ], + "exclude": [ + "src/**/*.test.ts", + "src/**/*.spec.ts" + ] }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/cli/tsconfig.json` around lines 14 - 16, The tsconfig currently includes "src/**/*.ts" which pulls test files into compilation; update the tsconfig.json to explicitly exclude test files (e.g. patterns like "**/*.test.ts", "**/*.spec.ts", and test directories such as "src/**/__tests__/**") so TypeScript does not emit test sources—add an "exclude" array with these patterns alongside the existing "include" entry to prevent tests from being compiled.packages/fs/src/validate.ts (1)
18-28: 💤 Low valueConsider deriving scope list from constants to prevent drift.
The hardcoded scope list could become out of sync if entity collections are added or removed elsewhere.
🔧 Suggested refactor
+import type { ValidationScope } from './validate.js'; + function emptyChecked(): Record<ValidationScope, number> { - return { - landmark: 0, - line: 0, - operator: 0, - service: 0, - station: 0, - town: 0, - issue: 0, - }; + const checked: Record<ValidationScope, number> = {} as Record<ValidationScope, number>; + for (const collection of entityCollections) { + checked[collection] = 0; + } + checked.issue = 0; + return checked; }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/fs/src/validate.ts` around lines 18 - 28, The hardcoded scope map in emptyChecked() can drift from the canonical scope definitions; change emptyChecked to dynamically build and return a Record<ValidationScope, number> by iterating over the authoritative scope source (for example the ValidationScope enum or a VALIDATION_SCOPES/validationScopes constant) and setting each key's value to 0. Replace the hardcoded object in emptyChecked() with a reducer or Object.fromEntries over that canonical list so new/removed scopes are automatically included and types remain ValidationScope.packages/fs/src/issues.ts (1)
36-42: ⚡ Quick winClarify expected ID format in error message.
The error message on Line 39 indicates the ID is invalid but doesn't specify the expected format, making it harder for developers to debug.
💬 Suggested improvement
export function issuePathFromId(dataDir: string, id: string): string { const [year, month] = id.split('-'); if (!year || !month) { - throw new Error(`Invalid issue id: ${id}`); + throw new Error(`Invalid issue id: ${id} (expected format: YYYY-MM-...)`); } return join(dataDir, issueDirectory, year, month, id); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/fs/src/issues.ts` around lines 36 - 42, The error thrown in issuePathFromId when id.split('-') doesn't yield a year and month should be more descriptive: update the thrown Error in function issuePathFromId to include the expected ID pattern (e.g. "expected format 'YYYY-MM-<suffix>'" or a concrete example like "2023-05-123") so callers know the required format; locate the check that throws new Error(`Invalid issue id: ${id}`) and replace the message with one that includes the expected pattern and an example, referencing issuePathFromId, the id parameter, and issueDirectory/join usage so the intent is clear.packages/fs/src/json.ts (1)
22-31: ⚡ Quick winConsider wrapping validation errors with line number context.
When schema validation fails on Line 30, the error won't indicate which NDJSON line failed, making debugging harder for malformed data files.
💡 Suggested enhancement to add line context
export async function readNdjsonFile<T>( path: string, schema: z.ZodType<T>, ): Promise<T[]> { const text = await readFile(path, 'utf8'); - return text + const lines = text .split('\n') - .filter((line) => line.trim().length > 0) - .map((line) => schema.parse(JSON.parse(line) as unknown)); + .filter((line) => line.trim().length > 0); + return lines.map((line, index) => { + try { + return schema.parse(JSON.parse(line) as unknown); + } catch (error) { + throw new Error(`Line ${index + 1}: ${error instanceof Error ? error.message : String(error)}`); + } + }); }🤖 Prompt for AI Agents
Verify each finding against current code. Fix only still-valid issues, skip the rest with a brief reason, keep changes minimal, and validate. In `@packages/fs/src/json.ts` around lines 22 - 31, The readNdjsonFile function currently maps over lines without associating validation failures to their NDJSON line number; modify readNdjsonFile to iterate with index (forEach or for loop), and for each line wrap JSON.parse and schema.parse in a try/catch that, on error, rethrows a new Error (or zod error with .cause) that includes the NDJSON line index (1-based), the original error message (or error as cause), and optionally the offending line content; reference readNdjsonFile, JSON.parse, and schema.parse when adding the try/catch and enriched error message so callers can see which NDJSON line failed validation.
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/cli/package.json`:
- Around line 17-19: The CLI package's compiled entrypoint (dist/index.js) loses
the Node shebang from src/index.ts, breaking the mrtdown bin; add a post-build
step in packages/cli/package.json (e.g., "postbuild") that restores the shebang
on dist/index.js (or configure the bundler) — run a small command after build to
prepend "#!/usr/bin/env node" to dist/index.js so the "bin":
{"mrtdown":"dist/index.js"} entry works when installed.
---
Nitpick comments:
In `@packages/cli/tsconfig.json`:
- Around line 14-16: The tsconfig currently includes "src/**/*.ts" which pulls
test files into compilation; update the tsconfig.json to explicitly exclude test
files (e.g. patterns like "**/*.test.ts", "**/*.spec.ts", and test directories
such as "src/**/__tests__/**") so TypeScript does not emit test sources—add an
"exclude" array with these patterns alongside the existing "include" entry to
prevent tests from being compiled.
In `@packages/fs/src/issues.ts`:
- Around line 36-42: The error thrown in issuePathFromId when id.split('-')
doesn't yield a year and month should be more descriptive: update the thrown
Error in function issuePathFromId to include the expected ID pattern (e.g.
"expected format 'YYYY-MM-<suffix>'" or a concrete example like "2023-05-123")
so callers know the required format; locate the check that throws new
Error(`Invalid issue id: ${id}`) and replace the message with one that includes
the expected pattern and an example, referencing issuePathFromId, the id
parameter, and issueDirectory/join usage so the intent is clear.
In `@packages/fs/src/json.ts`:
- Around line 22-31: The readNdjsonFile function currently maps over lines
without associating validation failures to their NDJSON line number; modify
readNdjsonFile to iterate with index (forEach or for loop), and for each line
wrap JSON.parse and schema.parse in a try/catch that, on error, rethrows a new
Error (or zod error with .cause) that includes the NDJSON line index (1-based),
the original error message (or error as cause), and optionally the offending
line content; reference readNdjsonFile, JSON.parse, and schema.parse when adding
the try/catch and enriched error message so callers can see which NDJSON line
failed validation.
In `@packages/fs/src/paths.ts`:
- Around line 1-3: The toDataPath function currently does a simple
backslash-to-slash replacement which misses edge cases; update to use Node's
path.posix.normalize to robustly normalize separators, collapse duplicate
slashes and resolve '.'/'..' segments (or explicitly document the limitations if
you must keep the simple replace). Locate the function toDataPath in
packages/fs/src/paths.ts and replace the regex approach with
path.posix.normalize-based normalization (importing path from 'path') or add a
clear comment explaining that the implementation intentionally only replaces
backslashes and does not handle UNC paths, traversal, or duplicate separators.
In `@packages/fs/src/validate.ts`:
- Around line 18-28: The hardcoded scope map in emptyChecked() can drift from
the canonical scope definitions; change emptyChecked to dynamically build and
return a Record<ValidationScope, number> by iterating over the authoritative
scope source (for example the ValidationScope enum or a
VALIDATION_SCOPES/validationScopes constant) and setting each key's value to 0.
Replace the hardcoded object in emptyChecked() with a reducer or
Object.fromEntries over that canonical list so new/removed scopes are
automatically included and types remain ValidationScope.
In `@packages/fs/tsconfig.json`:
- Around line 14-16: The tsconfig currently uses the include pattern
"src/**/*.ts" which pulls test files (e.g., index.test.ts) into the build and
produces unwanted declaration files; add an "exclude" array to the same
tsconfig.json to filter out test files and test directories (for example
patterns like "**/*.test.ts", "**/*.spec.ts", and "**/__tests__/**") so that
when declaration: true and outDir: "./dist" are used the test files are not
compiled or emitted; update the tsconfig.json to include this exclude entry
alongside the existing include.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: ef39f0d9-919b-4093-b02f-a762903efd92
⛔ Files ignored due to path filters (1)
package-lock.jsonis excluded by!**/package-lock.json
📒 Files selected for processing (33)
.github/workflows/validate.ymlAGENTS.mdDockerfileREADME.mdfixtures/data/issue/2024/01/2024-01-15-circle-line-delay/evidence.ndjsonfixtures/data/issue/2024/01/2024-01-15-circle-line-delay/impact.ndjsonfixtures/data/issue/2024/01/2024-01-15-circle-line-delay/issue.jsonfixtures/data/landmark/millenia-walk.jsonfixtures/data/line/ccl.jsonfixtures/data/operator/smrt-trains.jsonfixtures/data/service/ccl-main.jsonfixtures/data/station/promenade.jsonfixtures/data/town/downtown-core.jsonpackage.jsonpackages/cli/biome.jsonpackages/cli/package.jsonpackages/cli/src/index.test.tspackages/cli/src/index.tspackages/cli/tsconfig.jsonpackages/fs/biome.jsonpackages/fs/package.jsonpackages/fs/src/constants.tspackages/fs/src/entities.tspackages/fs/src/id.tspackages/fs/src/index.test.tspackages/fs/src/index.tspackages/fs/src/issues.tspackages/fs/src/json.tspackages/fs/src/manifest.tspackages/fs/src/pagesIndex.tspackages/fs/src/paths.tspackages/fs/src/validate.tspackages/fs/tsconfig.json
There was a problem hiding this comment.
Actionable comments posted: 2
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@packages/fs/src/issues.ts`:
- Around line 139-147: Replace the non-atomic check-then-create pattern: remove
the call to issueExists(dataDir, issue.id) and instead create the parent path(s)
with mkdir(parentDir, { recursive: true }) and then atomically claim the leaf
directory by calling mkdir(issueDir) without { recursive: true }; if that second
mkdir throws EEXIST (or returns that it already exists) throw the "Issue already
exists: ${issue.id}" error. After successfully claiming the leaf directory,
proceed to writeJsonFile(join(issueDir, issueFileName), issue) and
writeNdjsonFile for evidenceFileName and impactFileName as before; reference
functions/vars: issuePathFromId, mkdir, writeJsonFile, writeNdjsonFile,
issueFileName, evidenceFileName, impactFileName.
In `@packages/fs/src/json.ts`:
- Around line 38-42: The throw site using the unsupported overload "throw new
Error(..., { cause: error })" should be changed to construct a plain Error,
assign the cause property explicitly, and then throw it; locate the throw of
Error in packages/fs/src/json.ts that uses "throw new Error(..., { cause: error
})", replace it with code that creates an Error instance with the same message,
set error.cause = error (or via a narrow type cast) to preserve the original
cause, and then throw that Error object so TypeScript compiles without requiring
ErrorOptions.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 777bf005-b8ca-4c83-ab40-db72e04ead0e
📒 Files selected for processing (9)
packages/cli/package.jsonpackages/cli/scripts/ensure-shebang.mjspackages/cli/tsconfig.jsonpackages/fs/src/index.test.tspackages/fs/src/issues.tspackages/fs/src/json.tspackages/fs/src/paths.tspackages/fs/src/validate.tspackages/fs/tsconfig.json
✅ Files skipped from review due to trivial changes (2)
- packages/cli/package.json
- packages/cli/tsconfig.json
🚧 Files skipped from review as they are similar to previous changes (3)
- packages/fs/tsconfig.json
- packages/fs/src/paths.ts
- packages/fs/src/validate.ts
|
@coderabbitai review |
✅ Actions performedReview triggered.
|
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: d2efadf292
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
|
@codex review |
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: ddc3b82a2e
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
There was a problem hiding this comment.
Actionable comments posted: 3
🤖 Prompt for all review comments with AI agents
Verify each finding against current code. Fix only still-valid issues, skip the
rest with a brief reason, keep changes minimal, and validate.
Inline comments:
In `@fixtures/data/issue/2026/02/2026-02-07-tgl-maintenance/impact.ndjson`:
- Around line 3-4: The service_scopes entries for events with ids ie_3 and ie_4
contain duplicate identical scopes; update the JSON so each "serviceScopes"
array contains a single {"type":"service.whole"} element instead of two
duplicates (i.e., modify the objects with "id": "ie_3" and "id": "ie_4" to have
"serviceScopes": [{"type":"service.whole"}]) to remove redundancy while
preserving all other fields like "entity", "ts", and "basis".
In `@fixtures/data/service/SLL_MAIN_N.json`:
- Around line 18-24: The service revision contains duplicate displayCode "SL9"
for two stops (stationId "HBF" and "GSW") which creates ambiguous stop codes;
update the revision in SLL_MAIN_N.json so each stopId has a unique displayCode
within this revision path—locate the entries with stationId "HBF" and "GSW" and
change one of the "displayCode" values to the correct unique code (or
remove/rename per the canonical stop code mapping used by your system) and run
any relevant validation to ensure no other duplicates remain.
In `@fixtures/data/service/TGL_MAIN_E.json`:
- Around line 38-44: The eastbound stops array contains a duplicated stop object
with "stationId": "BMC" and "displayCode": "TG6" (the consecutive entries shown
in the diff); remove the duplicate entry from the eastbound path so the route
order only contains a single {"stationId":"BMC","displayCode":"TG6"} object,
ensuring the JSON array remains valid (commas adjusted if needed) and the route
ordering is preserved.
🪄 Autofix (Beta)
Fix all unresolved CodeRabbit comments on this PR:
- Push a commit to this branch (recommended)
- Create a new PR with the fixes
ℹ️ Review info
⚙️ Run configuration
Configuration used: defaults
Review profile: CHILL
Plan: Pro Plus
Run ID: 0a58a792-26de-48f0-b8b2-4b5a629c8219
📒 Files selected for processing (39)
fixtures/README.mdfixtures/data/issue/2026/01/2026-01-01-tgl-train-fault/evidence.ndjsonfixtures/data/issue/2026/01/2026-01-01-tgl-train-fault/impact.ndjsonfixtures/data/issue/2026/01/2026-01-01-tgl-train-fault/issue.jsonfixtures/data/issue/2026/02/2026-02-07-tgl-maintenance/evidence.ndjsonfixtures/data/issue/2026/02/2026-02-07-tgl-maintenance/impact.ndjsonfixtures/data/issue/2026/02/2026-02-07-tgl-maintenance/issue.jsonfixtures/data/line/SLL.jsonfixtures/data/line/TGL.jsonfixtures/data/service/SLL_MAIN_N.jsonfixtures/data/service/SLL_MAIN_S.jsonfixtures/data/service/TGL_MAIN_E.jsonfixtures/data/service/TGL_MAIN_W.jsonfixtures/data/station/BBT.jsonfixtures/data/station/BKB.jsonfixtures/data/station/BMC.jsonfixtures/data/station/BTW.jsonfixtures/data/station/FNV.jsonfixtures/data/station/GSW.jsonfixtures/data/station/HBF.jsonfixtures/data/station/OTP.jsonfixtures/data/station/QSW.jsonfixtures/data/station/SER.jsonfixtures/data/station/SKG.jsonfixtures/data/station/SLA.jsonfixtures/data/station/SLT.jsonfixtures/data/station/TBW.jsonfixtures/data/station/TGH.jsonfixtures/data/station/TLH.jsonfixtures/data/station/UTS.jsonpackages/cli/src/index.test.tspackages/cli/src/index.tspackages/core/src/schema/issue/id.test.tspackages/core/src/schema/issue/id.tspackages/fs/src/entities.tspackages/fs/src/index.test.tspackages/fs/src/issues.tspackages/fs/src/json.tspackages/fs/src/manifest.ts
✅ Files skipped from review due to trivial changes (15)
- fixtures/README.md
- fixtures/data/station/SLT.json
- fixtures/data/issue/2026/02/2026-02-07-tgl-maintenance/issue.json
- fixtures/data/issue/2026/01/2026-01-01-tgl-train-fault/evidence.ndjson
- fixtures/data/line/TGL.json
- fixtures/data/station/TLH.json
- fixtures/data/station/OTP.json
- fixtures/data/station/BBT.json
- fixtures/data/station/TGH.json
- fixtures/data/service/TGL_MAIN_W.json
- fixtures/data/issue/2026/02/2026-02-07-tgl-maintenance/evidence.ndjson
- fixtures/data/service/SLL_MAIN_S.json
- fixtures/data/issue/2026/01/2026-01-01-tgl-train-fault/issue.json
- fixtures/data/station/HBF.json
- fixtures/data/station/TBW.json
🚧 Files skipped from review as they are similar to previous changes (2)
- packages/fs/src/entities.ts
- packages/cli/src/index.ts
Summary
Adds the next data-overhaul slice after
@mrtdown/core:@mrtdown/fswith file-backed target-layout readers, writers, manifest generation, Pages index rendering, issue ID helpers, and validation@mrtdown/cliwith validate, list, show, create, id issue, manifest, and pages-index commandsfixtures/datatarget-layout dataset for package and CLI testsImpact
This does not migrate production data or wire the legacy runtime to the new packages. It gives later migration PRs a reviewed filesystem API and CLI surface to validate and inspect target-layout data before generated data lands.
Validation
npm run checknpm run typechecknpm test -- --runnpm run build:corenpm run build:fsnpm run build:clinpm run test:fsnpm run test:clinpm run data:validateDid not run legacy
npm run buildbecause it still triggers the existing postbuild Sentry sourcemap upload path.Summary by CodeRabbit
Release Notes
New Features
Tests
Documentation