- Always communicate and work in English.
- Before starting development, check if
PRD.mdexists in the project root. If it does, read and follow the requirements defined in it throughout the development process. - IMPORTANT: Always prefer Rust native implementations. Avoid unnecessary external dependencies and leverage the Rust standard library as much as possible. Only use third-party crates when there is a clear, justified need.
- IMPORTANT: Follow Test-Driven Development (TDD). See the Testing (TDD) section below for detailed rules.
- IMPORTANT: Read and follow
METHODOLOGY.mdbefore starting any task. - When editing
CLAUDE.md, use the minimum words and sentences needed to convey 100% of the meaning. - After completing each planned task, run tests and commit before moving to the next task. Skip tests if the change has no impact on runtime behavior (e.g., docs, comments, CI config). Changes to runtime config files (YAML, JSON, etc. read by code) must still trigger tests.
- After any code change (feature addition, bug fix, refactoring, PR merge), check if
README.mdneeds updating. If project description, usage, setup, architecture, or API changed, updateREADME.mdwith clear, concise language. Keep it minimal — only document what users need to know.
- Write tests first. Follow Red-Green-Refactor: (1) failing test, (2) minimal code to pass, (3) refactor.
- Use real-world scenarios and realistic data in tests. Prefer actual use cases over trivial/contrived examples.
- Never overfit to tests. Implementation must solve the general problem, not just the specific test cases. No hardcoded returns, no input-matching conditionals, no logic that only handles test values. Use triangulation — when a fake/hardcoded implementation passes, add tests with different inputs to force generalization.
- Test behavior, not implementation. Assert on observable outcomes, not internal details — tests must survive refactoring.
- Every new feature or bug fix must have corresponding tests.
- Optimize test execution speed. Run independent tests in parallel. Use
cargo testdefault parallelism. Keep each test isolated — no shared mutable state — so parallel execution is safe. - For I/O-bound tests (network, file), prefer async or use mocks to avoid blocking. For CPU-bound tests, use multi-thread parallelism.
- If full test suite exceeds 30 seconds, investigate: split slow integration tests from fast unit tests, run unit tests first for quick feedback.
- Skip tests when no runtime impact. In CI/CD, use path filters to trigger tests only when source code, test files, or runtime config files are modified. Non-runtime changes (docs, README,
.md, CI pipeline config) should not trigger test runs.
- Add structured logs at key decision points, state transitions, and external calls — not every line. Logs alone should reveal the execution flow and root cause.
- Include context: request/correlation IDs, input parameters, elapsed time, and outcome (success/failure with reason).
- Use appropriate log levels:
error!for failures requiring action,warn!for recoverable issues,info!for business events,debug!/trace!for development diagnostics. - Use the
tracingcrate for structured, async-safe logging. Prefertracing::instrumentfor automatic span creation. - Never log sensitive data (credentials, tokens, PII). Mask or omit them.
- Avoid excessive logging in hot paths — logging must not degrade performance or increase latency noticeably.
- Names must be self-descriptive — understandable without reading surrounding code. Avoid cryptic abbreviations (
proc,mgr,tmp). - Prefer clarity over brevity, but don't over-pad.
user_email>e,calculate_shipping_cost>calc. - Booleans should read as yes/no questions:
is_valid,has_permission,should_retry. - Functions/methods should describe the action and target:
parse_config,send_notification,validate_input.
- Prefer explicit type annotations over type inference. Annotate function signatures (parameters and return types) always.
- Annotate variables when the type isn't obvious from the assigned value.
- Use newtypes to enforce domain semantics (e.g.,
struct Emu(f64)instead of baref64).
- Explain why, not what. Code already shows what it does — comments should capture intent, constraints, and non-obvious decisions.
- Comment business rules, workarounds, and "why this approach over the obvious one" — context that can't be inferred from code alone.
- Mark known limitations with
TODO(reason)orFIXME(reason)— always include why, not just what. - Delete comments when the code changes — outdated comments are worse than no comments.
- When facing design decisions or implementation challenges, first check if
references/INDEX.mdexists and find relevant reference projects. - If no relevant project exists in
references/, search the web for well-maintained open-source projects that solve similar problems. Search across all languages — architectural patterns transfer regardless of language. - When a new useful project is discovered and
references/exists, add it toreferences/INDEX.mdand create a corresponding detail file. Keep detail files under 50 lines. - Cite which reference project informed your approach when applying patterns from it.
- NEVER mention
tests/classified_fixtures/content (file names, paths, company names, personal names, document titles) in commit messages, PR titles/descriptions, issue comments, or any public-facing text. - Use generic references instead: "classified fixture", "internal test document", "ground truth PDF", etc.
- All commits must use the local git config
user.nameanduser.email. Verify withgit config user.nameandgit config user.emailbefore committing. - All commits must include
Signed-off-byline (always usegit commit -s). TheSigned-off-byname must match the commit author.
- All changes go through pull requests. No direct commits to
main. - Branch naming:
<type>/<short-description>(e.g.,feat/add-parser,fix/table-bug). - One branch = one focused unit of work.
- Use git worktrees for all branch work. Do not use
git checkout/git switchin the main repo.- Create:
git worktree add ../<repo-name>-<branch-name> -b <type>/<short-description> - Work and push from inside the worktree.
- Do not delete worktrees immediately after task completion — remove only when starting new work or upon user confirmation.
- Create:
Follow all steps in order:
- Rewrite PR description if empty/unclear via
gh pr edit. Include: what changed, why, key changes, and relevant context. - Cross-reference related issues (
gh issue list). Use "Related: #N" — avoid auto-close keywords unless instructed. - Check for conflicts. If
mainhas advanced, rebase/merge as needed. - Wait for CI to pass:
gh pr checks <number> --watch. Abort if tests fail. - Final code review via
gh pr diff <number>— check for debug statements, hardcoded paths, credentials, unused imports. - Merge:
gh pr merge <number> --merge. Never use--delete-branch(worktree depends on the branch). - Return to main repo,
git pullto sync. - Remove worktree:
git worktree remove ../<repo-name>-<branch-name> - Delete local branch:
git branch -d <branch-name> - Delete remote branch:
git push origin --delete <branch-name>
This project follows a 6-month rolling MSRV policy (aligned with tokio and other major crates):
- The
rust-versioninCargo.tomlMUST target a Rust stable release that was published at least 6 months ago - Rust stable releases ship every 6 weeks — consult releases.rs for exact dates
- When a newer Rust version crosses the 6-month threshold, updating the MSRV is allowed but not required — only bump when a newer language feature or dependency demands it
- Floor: the MSRV can never go below the minimum required by
editioninCargo.toml(edition 2024 = Rust 1.85)
Before any MSRV change:
- Verify no language features or APIs exclusive to versions above the target are used
- Confirm all dependencies compile on the target version (
cargo checkwith the target toolchain, or review dependency MSRV metadata) - Update CI matrix to include the new MSRV version
When comparing PDF output against ground truth (classified fixtures):
- Run
cargo test -p office2pdf --test artifact_generator -- --ignored --nocaptureto generate artifacts. - Read
tests/classified_fixtures/_work/report.json— contains per-file page counts, text lengths, and PNG paths. - Identify worst files: page count mismatches, large text length differences, conversion errors.
- For worst files, use the Read tool to view PNG images in
tests/classified_fixtures/_work/<work_dir>/:output-*.png— rendered pages from office2pdf outputgt-*.png— rendered pages from ground truth PDFoutput.txt/gt.txt— extracted text
- Compare output and GT PNGs visually to identify specific rendering differences (layout, font, table, image, margin, page break, etc.).
- For user-provided DOCX/XLSX/PPTX files on macOS, if Word/Excel/PowerPoint is available for that file type, export a PDF from the native Microsoft app first and compare that GT before guessing.
- Fix root causes in parser/codegen via TDD. Prioritize high-leverage fixes that improve multiple files.
When asked to "release", always perform both GitHub Release and crates.io publish:
- Version bump — Create a PR (
chore/publish-<version>) that bumpsversionin bothcrates/office2pdf/Cargo.tomlandcrates/office2pdf-cli/Cargo.toml, and updates the CLI'soffice2pdfdependency version. Merge via standard PR workflow. - GitHub Release —
gh release create v<version>with changelog and contributors section.- Use
git log <prev-tag>..HEAD --format='%an' | sort -uto find contributors. List each with their GitHub profile link.
- Use
- crates.io publish — Publish lib first, then CLI:
cargo publish -p office2pdfcargo publish -p office2pdf-cli
- Tag alignment — Ensure the GitHub release tag (
v<version>) and Cargo.toml versions match.