Add markdown golden-snapshot test harness for engine migration by jaredwray · Pull Request #467 · jaredwray/writr

jaredwray · 2026-06-14T21:48:03Z

Why

We're preparing to migrate writr's markdown engine from JS (unified/remark/rehype) to native Rust. That migration is only safe if we can prove the new engine produces the same HTML as today's engine across a large, diverse body of real markdown. The repo previously had ~25 ad-hoc fixtures and no snapshot testing — not enough to catch regressions across the 12 markdown features writr supports.

This PR adds a golden-snapshot harness: the current JS engine's output is the source of truth. We fetch a diverse corpus, render every example with today's engine, and commit the HTML as goldens. A future Rust engine plugs in behind the same adapter interface and must reproduce the goldens (with a reviewed allowlist for intentional diffs).
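A minimal sketch of how the adapter seam described above might look. The names RenderAdapter, WritrJsAdapter, WritrRustAdapter, HARNESS_ENGINE and the engine-keyed allowlist come from this PR; the method signature, the `writr-js` key, the allowlist shape, and the stub renderers below are assumptions for illustration, not the harness's actual code.

```typescript
// Hypothetical shape: the PR names RenderAdapter but does not show its signature.
// Rendering is synchronous here only to keep the sketch short.
interface RenderAdapter {
  readonly name: string;
  render(markdown: string, profile: string): string;
}

// Stand-in for the current JS engine (the real WritrJsAdapter wraps Writr).
const jsAdapter: RenderAdapter = {
  name: "writr-js", // assumed key; only "writr-rust" appears in the PR
  render: (markdown) => `<p>${markdown}</p>`,
};

// Stand-in for a future Rust engine, with one deliberate behavioural difference.
const rustAdapter: RenderAdapter = {
  name: "writr-rust",
  render: (markdown) => `<p>${markdown.trim()}</p>`,
};

// Pick the engine from the HARNESS_ENGINE value, defaulting to the JS engine.
function selectAdapter(engine: string | undefined): RenderAdapter {
  return engine === "writr-rust" ? rustAdapter : jsAdapter;
}

// Assumed allowlist shape: engine name -> case ids with reviewed divergences.
type Allowlist = Record<string, string[]>;

type Verdict = "match" | "allowed-diff" | "mismatch";

// Compare one rendered case against its JS-generated golden.
function checkCase(
  adapter: RenderAdapter,
  allowlist: Allowlist,
  caseId: string,
  markdown: string,
  golden: string,
): Verdict {
  const html = adapter.render(markdown, "default");
  if (html === golden) return "match";
  const allowed = allowlist[adapter.name] ?? [];
  return allowed.includes(caseId) ? "allowed-diff" : "mismatch";
}
```

In a runner, `selectAdapter(process.env.HARNESS_ENGINE)` would choose the engine once, and every case would go through `checkCase` against the same goldens regardless of which engine is active.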

What's included

Corpus fetcher (test/harness/fetch/) — reproducible, pinned public sources: CommonMark spec, GFM spec, markdown-it fixtures, every public github.com/jaredwray/* repo (they all consume writr), and permissively-licensed docs. Raw payloads are cached and committed for offline replay. 1000 unique documents (incl. 339 from all jaredwray repos), each with provenance (source, license, attribution, sha256) in corpus/manifest.json.
Render profiles (profiles.ts) — default, commonmark, gfm-only, no-highlight, no-math, rawhtml, mdx. Failures isolate to a feature, and no-highlight/no-math let the Rust engine pass core contracts before achieving highlight.js/KaTeX parity. CommonMark/GFM examples also render under rawhtml so raw-HTML inputs (correctly stripped to empty under the default rawHtml:false) still get a non-empty golden pinning the passthrough.
Per-feature diagnostic suites (diagnostics/) — hand-authored tiny examples for every plugin (gfm tables/alerts/tasklists/strikethrough/autolinks, emoji, toc, slug, highlight, math, mdx, raw html, frontmatter, commonmark core) so a failure pinpoints the exact feature.
Pluggable RenderAdapter — current engine is WritrJsAdapter; a future WritrRustAdapter is selected via HARNESS_ENGINE and checked against the same JS-generated goldens. allowlist.json (engine-keyed) records reviewed intentional divergences.
Golden generator — forces caching:false, asserts renderSync == render, surfaces genuine engine errors via Writr's emitted error event (a legitimately-empty render is a valid golden), and records resolved plugin versions in versions.json for drift auditing.
Vitest runner — reports first-divergence index/line/col with context instead of a giant diff.
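To illustrate the first-divergence reporting mentioned in the last item, here is a small sketch of how an index/line/col report with surrounding context could be computed. The function name `firstDivergence`, the `Divergence` shape, and the 20-character context window are hypothetical; the actual runner may report differently.

```typescript
interface Divergence {
  index: number; // 0-based offset (UTF-16 code units) of the first difference
  line: number; // 1-based line number within the expected (golden) string
  col: number; // 1-based column within that line
  expected: string; // slice of the golden around the divergence
  actual: string; // slice of the rendered output around the divergence
}

// Returns null when the strings are identical, otherwise the first point of difference.
function firstDivergence(
  expected: string,
  actual: string,
  context = 20,
): Divergence | null {
  if (expected === actual) return null;

  // Walk the common prefix; if one string is a prefix of the other,
  // the divergence sits at the end of the shorter one.
  const limit = Math.min(expected.length, actual.length);
  let index = 0;
  while (index < limit && expected[index] === actual[index]) index++;

  // Derive line and column by counting newlines before the divergence.
  let line = 1;
  let lastNewline = -1;
  for (let i = 0; i < index; i++) {
    if (expected[i] === "\n") {
      line++;
      lastNewline = i;
    }
  }

  const start = Math.max(0, index - context);
  return {
    index,
    line,
    col: index - lastNewline,
    expected: expected.slice(start, index + context),
    actual: actual.slice(start, index + context),
  };
}
```

Reporting only the first difference keeps a failure readable when a single early change (for example an altered heading id) would otherwise shift every later line of a full diff.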

Results

2041 goldens generated, 2041 harness tests pass, zero render errors across the full corpus.
golden:check is clean (no drift); pnpm build and pnpm test are unaffected (100% coverage) — the harness is excluded from the default run via vitest.harness.config.ts + a vitest.config.ts exclude, mirroring the existing integration-test pattern.
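The exclusion pattern described above could look roughly like the following two config files. This is a sketch under stated assumptions: the file names come from the PR, but the include glob and the exact exclude entry are guesses, and the repo's real configs may carry additional options such as coverage settings.

```typescript
// vitest.harness.config.ts (sketch; the include glob is an assumption)
import { defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    // Only pick up the harness runner, not the regular unit tests.
    include: ["test/harness/**/*.test.ts"],
  },
});
```

```typescript
// vitest.config.ts (sketch; shows only the exclude relevant to the harness)
import { configDefaults, defineConfig } from "vitest/config";

export default defineConfig({
  test: {
    // Keep Vitest's default excludes and additionally skip the harness directory.
    exclude: [...configDefaults.exclude, "test/harness/**"],
  },
});
```

With a split like this, a script such as test:harness would presumably invoke `vitest run --config vitest.harness.config.ts`, while plain `pnpm test` never loads the harness cases.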

New scripts

pnpm corpus:fetch          # fetch/refresh the corpus (online; caches raw payloads)
pnpm corpus:fetch:offline  # rebuild from committed cache only (no network)
pnpm golden:generate       # render corpus + diagnostics, write goldens + versions.json
pnpm golden:check          # CI gate: verify goldens still match the current engine
pnpm test:harness          # run the Vitest runner against committed goldens
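As a rough illustration of what a drift gate like golden:check does, the sketch below re-renders each case and compares it with the stored golden. All names here (`HarnessCase`, `DriftReport`, `checkGoldens`) are hypothetical, goldens are held in an in-memory Map rather than read from disk, and the renderer is passed in as a plain function.

```typescript
interface HarnessCase {
  id: string;
  markdown: string;
}

interface DriftReport {
  checked: number;
  missing: string[]; // cases that have no committed golden
  drifted: string[]; // cases whose current render differs from the golden
}

// Re-render every case and compare against its golden.
function checkGoldens(
  cases: HarnessCase[],
  goldens: Map<string, string>,
  render: (markdown: string) => string,
): DriftReport {
  const report: DriftReport = { checked: 0, missing: [], drifted: [] };
  for (const testCase of cases) {
    report.checked++;
    const golden = goldens.get(testCase.id);
    // Test for undefined explicitly: an empty string is a valid golden.
    if (golden === undefined) {
      report.missing.push(testCase.id);
      continue;
    }
    if (render(testCase.markdown) !== golden) {
      report.drifted.push(testCase.id);
    }
  }
  return report;
}

// A CI gate would fail when anything is missing or drifted.
function isClean(report: DriftReport): boolean {
  return report.missing.length === 0 && report.drifted.length === 0;
}
```

A CI script built on this would exit non-zero when `isClean` returns false, which is the behaviour the "CI gate" comment above implies.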

Notes

The corpus is exactly 1000 unique documents (the cap). The stratified round-robin prioritises the jaredwray consumer markdown; CommonMark/GFM/markdown-it are overlapping spec suites, so global dedupe collapses byte-identical examples rather than padding with near-duplicates.
No credentials are committed: cached API payloads are public repo metadata and the manifest stores only raw CDN URLs (verified). The ghp_ strings present in the corpus are ghp_yourtoken placeholders inside jaredwray's own token documentation.
See test/harness/README.md for the full workflow and how to wire the Rust adapter (HARNESS_ENGINE=writr-rust).
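The first note describes a capped, stratified round-robin with global dedupe. The PR does not show the selection code, so the following is only one plausible reading: sources are visited in priority order, each contributes one new document per round, and duplicates are skipped without using up a turn. The names `Doc` and `selectCorpus` are hypothetical, and exact string equality stands in for the sha256 comparison a real fetcher would likely use.

```typescript
interface Doc {
  source: string;
  content: string;
}

// strata: one array per source, listed in priority order (highest first).
// cap: maximum number of unique documents to keep.
function selectCorpus(strata: Doc[][], cap: number): Doc[] {
  const seen = new Set<string>();
  const picked: Doc[] = [];
  // Copy each stratum so the caller's arrays are not mutated.
  const queues = strata.map((docs) => [...docs]);

  let progressed = true;
  while (picked.length < cap && progressed) {
    progressed = false;
    for (const queue of queues) {
      if (picked.length >= cap) break;
      // Skip byte-identical documents that an earlier source already supplied.
      let doc = queue.shift();
      while (doc !== undefined && seen.has(doc.content)) {
        doc = queue.shift();
      }
      if (doc === undefined) continue; // this source is exhausted
      seen.add(doc.content);
      picked.push(doc);
      progressed = true;
    }
  }
  return picked;
}
```

Under this reading, overlapping spec suites shrink naturally: when two sources carry the same example, only the first one encountered is kept and the later source moves on to its next candidate.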

https://claude.ai/code/session_01Rj1dsi6MW36qudnPKE2Zx9

Introduce a large-scale regression harness that pins the exact HTML output of the current markdown engine across a diverse corpus, so the upcoming JS -> native (Rust) engine migration can be validated for parity.

What it includes:

- A reproducible corpus fetcher (test/harness/fetch) pulling from pinned public sources: CommonMark spec, GFM spec, markdown-it fixtures, public jaredwray repos, and permissively-licensed docs. Raw payloads are cached and committed for offline replay. ~900 unique documents with per-doc provenance (source, license, attribution, sha256) in a manifest.
- Named render profiles (default, commonmark, gfm-only, no-highlight, no-math, rawhtml, mdx) so failures isolate to a feature. CommonMark/GFM examples also render under rawhtml so raw-HTML inputs (stripped to empty under the default rawHtml:false) still produce a non-empty golden.
- Hand-authored per-feature diagnostic suites covering every plugin (gfm tables/alerts/tasklists, emoji, toc, slug, highlight, math, mdx, raw html, frontmatter, commonmark core) for pinpoint failure reporting.
- A pluggable RenderAdapter interface: the current engine is WritrJsAdapter; a future WritrRustAdapter plugs in behind HARNESS_ENGINE and is checked against the same goldens, with an engine-keyed allowlist for intentional diffs.
- A golden generator (golden:generate / golden:check) that forces caching off, asserts sync/async parity, surfaces genuine engine errors via the emitted error event (legit empty output stays a valid golden), and records resolved plugin versions for drift auditing.
- A Vitest runner reporting first-divergence index/line/col with context.

The harness is excluded from the default `pnpm test`/coverage run via a dedicated vitest.harness.config.ts and a vitest.config.ts exclude, so the 2000+ case suite never slows the normal dev loop.

New scripts: corpus:fetch, corpus:fetch:offline, golden:generate, golden:check, test:harness.

https://claude.ai/code/session_01Rj1dsi6MW36qudnPKE2Zx9

codecov · 2026-06-14T21:48:57Z

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 100.00%. Comparing base (b159168) to head (91d440d).

Additional details and impacted files

@@            Coverage Diff            @@
##              main      #467   +/-   ##
=========================================
  Coverage   100.00%   100.00%           
=========================================
  Files            5         5           
  Lines          518       518           
  Branches       144       144           
=========================================
  Hits           518       518


gemini-code-assist

Code Review

This pull request introduces a comprehensive Markdown Golden-Snapshot Harness designed to ensure HTML output consistency across different markdown engines (specifically for an upcoming JS to Rust migration). It adds new package scripts, a detailed README, allowlist management, corpus loading utilities, and a large set of CommonMark spec markdown files to serve as the regression test corpus. There are no review comments to address, so no feedback is provided.

Re-ran the fetcher with an authenticated GitHub token so the jaredwray source is no longer throttled. It now enumerates all public jaredwray repos and pulls their markdown (339 unique files after dedupe), bringing the corpus to the full 1000-document cap. The stratified round-robin prioritises the jaredwray consumer markdown, so CommonMark trims to fit. Regenerated all goldens (2041, zero render errors); drift check clean and the full harness suite passes. No credentials are committed — cached API payloads are public repo metadata and the manifest stores only raw CDN URLs. https://claude.ai/code/session_01Rj1dsi6MW36qudnPKE2Zx9

aikido-pr-checks · 2026-06-14T22:03:34Z

+GOOGLE_CLOUD_PROJECT=contrib-dev
+GOOGLE_CLOUD_LOCATION=us-central1
+GOOGLE_CLOUD_TASK_QUEUE=bids-notification
+GOOGLE_CLOUD_TASK_API_TOKEN=12345678987654321


Exposed secret in test/harness/corpus/inputs/jaredwray/0148.md - low severity
Detected a Generic API Key, potentially exposing access to various services and sensitive operations.

Reply @AikidoSec ignore: [REASON] to ignore this issue.

@AikidoSec ignore: False positive. This is a test corpus fixture — markdown faithfully fetched from a public jaredwray repo for the golden-snapshot harness. Line 126 is a documentation placeholder inside a fenced code block (GOOGLE_CLOUD_TASK_API_TOKEN=12345678987654321), not a real credential.

Generated by Claude Code

✅ Based on your feedback, we ignored this issue for the reason given above.

Document that both corpus inputs and golden outputs are checked in (so a fresh clone can run the harness with no network/generation step), and add a per-source breakdown of the 1000-document corpus. https://claude.ai/code/session_01Rj1dsi6MW36qudnPKE2Zx9
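A per-source breakdown like the one mentioned here can be derived directly from the manifest. The sketch below assumes each manifest entry carries the provenance fields listed in the PR description (source, license, attribution, sha256); the `ManifestEntry` type, the `breakdownBySource` helper, and the sample values are hypothetical, and the real corpus/manifest.json may contain more fields.

```typescript
// Assumed entry shape, based on the provenance fields named in the PR description.
interface ManifestEntry {
  source: string;
  license: string;
  attribution: string;
  sha256: string;
}

// Count how many corpus documents each source contributed.
function breakdownBySource(entries: ManifestEntry[]): Map<string, number> {
  const counts = new Map<string, number>();
  for (const entry of entries) {
    counts.set(entry.source, (counts.get(entry.source) ?? 0) + 1);
  }
  return counts;
}

// Illustrative entries only; hashes and attributions are placeholders.
const sampleManifest: ManifestEntry[] = [
  { source: "jaredwray", license: "MIT", attribution: "example-repo", sha256: "placeholder-1" },
  { source: "jaredwray", license: "MIT", attribution: "example-repo", sha256: "placeholder-2" },
  { source: "commonmark", license: "CC-BY-SA-4.0", attribution: "CommonMark spec", sha256: "placeholder-3" },
];

for (const [source, count] of breakdownBySource(sampleManifest)) {
  console.log(`${source}: ${count}`);
}
```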

gemini-code-assist Bot reviewed Jun 14, 2026

aikido-pr-checks Bot reviewed Jun 14, 2026

jaredwray merged commit fce1459 into main Jun 14, 2026
13 checks passed

jaredwray deleted the claude/markdown-test-harness-tcw9ix branch June 14, 2026 22:38

jaredwray mentioned this pull request Jun 15, 2026

release: v6.1.3 #468

Merged

4 tasks

jaredwray merged 3 commits into main from claude/markdown-test-harness-tcw9ix
