Wire doc/scanner notebooks into the integration notebook harness #1752

@romanlutz

Description

Problem

The doc/scanner/ notebooks (1_pyrit_scan.ipynb, airt.ipynb, benchmark.ipynb, foundry.ipynb, garak.ipynb) are listed in doc/myst.yml and render on our docs site, but nothing in CI actually executes them. The integration notebook harness (tests/integration/<area>/test_notebooks_*.py) is hard-coded to doc/code/<area>/ via pyrit.common.path.DOCS_CODE_PATH, so the scanner notebooks are render-only.

This means breakages in the user-facing scanner API surface — like the deprecated-type slip-through fixed in #1746 — won't be caught by CI.

Proposal

Add tests/integration/scanner/test_notebooks_scanner.py, mirroring the existing per-area pattern (e.g. tests/integration/scenarios/test_notebooks_scenarios.py). It would parametrize over os.listdir(DOC_ROOT / "scanner") and run each notebook via ExecutePreprocessor under RUN_ALL_TESTS=true.

Points to confirm during implementation:

  • The scanner notebooks call pyrit_scan end-to-end with OpenAIChatTarget() against real datasets. They are likely slow enough to warrant max_dataset_size=1 shims or a skipped_files entry for the heavy ones — pick the right cost/coverage tradeoff up front.
  • DOCS_CODE_PATH points at doc/code, as its name implies, so it cannot be reused for doc/scanner; introducing a parallel DOCS_SCANNER_PATH constant in pyrit/common/path.py keeps the pattern consistent.
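
A rough sketch of what the proposal above could look like, not the final implementation. Assumptions: DOCS_SCANNER_PATH here is a local stand-in for the proposed constant (it does not exist in pyrit.common.path yet), the helper names collect_notebooks, run_all_tests_enabled, and execute_notebook are hypothetical, and the 600-second timeout is a placeholder. The nbformat and nbconvert imports are deferred so that notebook collection works even where the Jupyter tooling is not installed.

```python
import os
from pathlib import Path

# Hypothetical stand-in for the proposed pyrit.common.path.DOCS_SCANNER_PATH.
DOCS_SCANNER_PATH = Path("doc", "scanner")

# Hypothetical skip list for notebooks judged too slow or costly for CI.
SKIPPED_FILES = ()


def collect_notebooks(directory, skipped_files=()):
    """Return the sorted .ipynb file names in `directory`, minus skipped ones."""
    directory = Path(directory)
    if not directory.is_dir():
        return []
    return sorted(
        name
        for name in os.listdir(directory)
        if name.endswith(".ipynb") and name not in skipped_files
    )


def run_all_tests_enabled(environ=None):
    """Report whether RUN_ALL_TESTS is set to 'true' (case-insensitive)."""
    environ = os.environ if environ is None else environ
    return environ.get("RUN_ALL_TESTS", "").lower() == "true"


def execute_notebook(nb_path, timeout=600):
    """Execute one notebook top to bottom; a failing cell raises an exception."""
    # Imported lazily so collection does not require the Jupyter stack.
    import nbformat
    from nbconvert.preprocessors import ExecutePreprocessor

    nb_path = Path(nb_path).resolve()
    with open(nb_path, encoding="utf-8") as f:
        nb = nbformat.read(f, as_version=4)
    processor = ExecutePreprocessor(timeout=timeout)
    # Run with the notebook's own folder as the working directory.
    processor.preprocess(nb, {"metadata": {"path": str(nb_path.parent)}})
```

In the test module, the list from collect_notebooks(DOCS_SCANNER_PATH, SKIPPED_FILES) would feed pytest.mark.parametrize, and each case would call execute_notebook(DOCS_SCANNER_PATH / file_name). How the RUN_ALL_TESTS gate is wired in should follow whatever the existing per-area tests do.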

Context

Surfaced in the PR review thread on #1746 (comment).
