FEAT: Add CategoricalHarmfulQA (CatQA) dataset loader by romanlutz · Pull Request #1749 · microsoft/PyRIT

romanlutz · 2026-05-18T13:49:59Z

Description

Adds a loader for declare-lab/CategoricalHarmfulQA (CatQA), a 550-question safety evaluation dataset hand-authored against the combined prohibited-use lists from OpenAI's usage policies and Meta's Llama 2 acceptable use policy.

CatQA complements the existing harmful_qa loader rather than duplicating it:

Real harm taxonomy. CatQA's 11 main categories (Illegal Activity, Child Abuse, Hate/Harass/Violence, Malware Viruses, Physical Harm, Economic Harm, Fraud/Deception, Adult Content, Political Campaigning, Privacy Violation Activity, Tailored Financial Advice) map to harm_categories, and each has 5 sub-categories surfaced via per-prompt metadata. HarmfulQA's "topics" (Social Sciences, Computer Science, ...) are academic disciplines, not harm categories.
Multilingual. The same prompts are available as English, Chinese, and Vietnamese splits, selected with the language argument ("en", "zh", or "vi"; default "en"). HarmfulQA is English-only.
Different construction methodology. CatQA is hand-authored against published policy lists, whereas HarmfulQA is auto-generated with the Chain of Utterances approach.
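
As a rough illustration of the category and language handling described above (not the code in this PR), the sketch below maps raw rows to prompt records. The column names `Category`, `Subcategory`, and `Question`, the `PromptRecord` type, and the `rows_to_prompts` helper are all assumptions made for the example, and the empty-category behavior shown is only one possible treatment.

```python
from dataclasses import dataclass, field

# Language splits named in the PR description.
SUPPORTED_LANGUAGES = ("en", "zh", "vi")


@dataclass
class PromptRecord:
    """Hypothetical stand-in for a seed prompt; not a PyRIT type."""

    value: str
    harm_categories: list[str]
    metadata: dict[str, str] = field(default_factory=dict)


def rows_to_prompts(rows, language="en"):
    """Map raw rows to prompt records.

    Assumes each row is a dict with "Question", "Category", and
    "Subcategory" keys (assumed column names, not confirmed by the PR).
    """
    if language not in SUPPORTED_LANGUAGES:
        raise ValueError(f"language must be one of {SUPPORTED_LANGUAGES}, got {language!r}")

    prompts = []
    for row in rows:
        category = (row.get("Category") or "").strip()
        subcategory = (row.get("Subcategory") or "").strip()
        metadata = {"language": language}
        if subcategory:
            metadata["subcategory"] = subcategory
        prompts.append(
            PromptRecord(
                value=row["Question"],
                # An empty category yields an empty list rather than [""].
                harm_categories=[category] if category else [],
                metadata=metadata,
            )
        )
    return prompts


sample_rows = [
    {"Question": "placeholder question 1", "Category": "Economic Harm", "Subcategory": "placeholder subcategory"},
    {"Question": "placeholder question 2", "Category": "", "Subcategory": ""},
]
prompts = rows_to_prompts(sample_rows, language="vi")
print(prompts[0].harm_categories)  # ['Economic Harm']
print(prompts[1].harm_categories)  # []
```

Keeping the main category in harm_categories and the sub-category in metadata lets callers filter on the coarse taxonomy without losing the finer label.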

The loader follows the same patterns as the existing remote dataset providers (_HarmfulQADataset, _AyaRedteamingDataset, etc.): inherits _RemoteDatasetLoader, fetches via the datasets library, exposes class-level metadata (tags={"safety","multilingual"}, size="large", modalities=["text"], full harm_categories list) for filterable discovery, and registers automatically through the __init_subclass__ hook.
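
For readers unfamiliar with registration through `__init_subclass__`, here is a minimal, self-contained sketch of the general technique. The names `RemoteLoaderBase`, `CatQALoaderSketch`, `EnglishOnlyLoaderSketch`, and `find_by_tag` are hypothetical and do not reflect PyRIT's actual classes or signatures; only the dataset name and tag values come from this description.

```python
class RemoteLoaderBase:
    """Hypothetical base class standing in for the role _RemoteDatasetLoader plays."""

    # Shared registry of dataset name -> loader class.
    registry: dict = {}

    # Class-level metadata that subclasses override.
    dataset_name: str = ""
    tags: frozenset = frozenset()
    modalities: tuple = ()

    def __init_subclass__(cls, **kwargs):
        super().__init_subclass__(**kwargs)
        # Runs once per subclass definition, so defining (or importing)
        # a loader class is enough to make it discoverable.
        if cls.dataset_name:
            RemoteLoaderBase.registry[cls.dataset_name] = cls


class CatQALoaderSketch(RemoteLoaderBase):
    dataset_name = "categorical_harmful_qa"
    tags = frozenset({"safety", "multilingual"})
    modalities = ("text",)


class EnglishOnlyLoaderSketch(RemoteLoaderBase):
    # Illustrative second loader, so the tag filter has something to exclude.
    dataset_name = "english_only_example"
    tags = frozenset({"safety"})
    modalities = ("text",)


def find_by_tag(tag):
    """Return the sorted names of registered loaders that carry the given tag."""
    return sorted(
        name for name, loader in RemoteLoaderBase.registry.items() if tag in loader.tags
    )


print(find_by_tag("multilingual"))  # ['categorical_harmful_qa']
print(find_by_tag("safety"))  # ['categorical_harmful_qa', 'english_only_example']
```

Because the metadata lives on the class rather than on instances, discovery can filter loaders without fetching any data.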

Tests and Documentation

New unit tests in tests/unit/datasets/test_categorical_harmful_qa_dataset.py cover the default English split, all three language splits (parametrized), empty-category handling, and dataset_name. All 6 tests pass; the full tests/unit/datasets tier (428 tests) is also green.
Added the bhardwaj2024homer bibliography entry (arXiv 2402.11746) and listed CatQA in doc/code/datasets/1_loading_datasets.{py,ipynb}.
Re-executed 1_loading_datasets.ipynb via jupytext --to ipynb --execute so the printed dataset roster includes categorical_harmful_qa.
pre-commit (ruff-format, ruff-check, ty, nbstripout, link-checker) is clean.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

romanlutz and others added 2 commits May 18, 2026 06:35

Add CategoricalHarmfulQA (CatQA) dataset loader

75ef816

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Regenerate loading datasets notebook with CatQA in the list

b61b348

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

romanlutz wants to merge 2 commits into microsoft:main from romanlutz:romanlutz/categorical-harmfulqa-review
