FEAT: Add DangerousQA dataset loader by romanlutz · Pull Request #1751 · microsoft/PyRIT

romanlutz · 2026-05-18T19:11:24Z

Description

Adds a remote seed dataset loader for DangerousQA (Shaikh et al., 2022 — On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning, arXiv:2212.08061). The dataset is ~200 harmful questions generated from a single seed prompt and is widely reused as a baseline in subsequent red-teaming work, e.g., Bhardwaj & Poria's Red-Eval (2023).

Approach

New class _DangerousQADataset(_RemoteDatasetLoader) in pyrit/datasets/seed_datasets/remote/dangerous_qa_dataset.py, registered in the package __init__.
The source JSON is a flat list[str] rather than the list[dict] shape _fetch_from_url expects. To avoid touching the shared base class for a one-off shape, the loader fetches and caches questions itself via small private helpers (_fetch_questions / _load_raw_questions), while still reusing _get_cache_file_name and the JSON read/write helpers. Each string is wrapped as {"question": s} on disk so cache I/O stays compatible.
Pinned to commit 445568d3b73f81a9054f51c739172186d5648157 of SALT-NLP/chain-of-thought-bias for reproducibility, matching how HarmBench pins its source.
harm_categories is intentionally left empty on every SeedPrompt and is not set at the class level. The paper describes the dataset as covering racist/stereotypical/sexist/illegal/toxic/harmful content, but those labels apply in aggregate — the source JSON has no per-item categorisation, so any class-level list would mis-label individual prompts. The docstring and the description field document this explicitly.
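The wrapping step described above can be sketched as follows. This is a minimal illustration, not the code in this PR: `wrap_questions` and `cache_round_trip` are hypothetical names, and the sketch uses only the standard library rather than PyRIT's own cache helpers, whose signatures are not shown here.

```python
import json
import tempfile
from pathlib import Path
from typing import Any


def wrap_questions(payload: Any) -> list[dict[str, str]]:
    """Validate a flat list[str] payload and wrap each item as {"question": s}.

    Hypothetical helper: mirrors the shape conversion described in the PR,
    not the actual private helpers in _DangerousQADataset.
    """
    if not isinstance(payload, list):
        raise ValueError(f"Expected a JSON list, got {type(payload).__name__}")
    wrapped: list[dict[str, str]] = []
    for index, item in enumerate(payload):
        if not isinstance(item, str):
            raise ValueError(f"Item {index} is {type(item).__name__}, expected str")
        wrapped.append({"question": item})
    return wrapped


def cache_round_trip(records: list[dict[str, str]], cache_file: Path) -> list[dict[str, str]]:
    """Write list[dict] records to disk as JSON and read them back."""
    cache_file.write_text(json.dumps(records), encoding="utf-8")
    return json.loads(cache_file.read_text(encoding="utf-8"))


if __name__ == "__main__":
    # Placeholder strings stand in for the dataset's questions.
    raw = ["placeholder question 1", "placeholder question 2"]
    records = wrap_questions(raw)
    with tempfile.TemporaryDirectory() as tmp:
        restored = cache_round_trip(records, Path(tmp) / "cache.json")
    print(restored == records)
```

Storing list[dict] records on disk means a cache reader that expects dictionaries can load the file unchanged, which is the compatibility the description refers to.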

Tests and Documentation

New unit tests at tests/unit/datasets/test_dangerous_qa_dataset.py (9 tests) cover fetch behaviour, the cache flag, dataset name, the pinned-commit default source, class-level metadata (tags/size/modalities), and error paths for HTTP failure, non-list payloads, and non-string items.
Documentation: added @shaikh2022second to doc/references.bib (next to other dataset citations) and to the hidden-citations list in doc/bibliography.md.
Verified locally: uv run ruff format, uv run ruff check, uv run ty check, uv run pytest tests/unit/datasets/test_dangerous_qa_dataset.py -v (9 passed), full uv run pytest tests/unit/datasets -v (no regressions). Also smoke-tested a live fetch against the pinned URL — returns 200 seeds and cache round-trip is stable. No notebooks added, so no JupyText run needed.
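For illustration, the error-path checks for non-list payloads and non-string items might look roughly like the sketch below. These are not the nine tests in this PR: `parse_questions` is a hypothetical stand-in for the loader's validation, and the sketch uses the standard library's unittest so that it runs without extra dependencies, whereas the PR's tests run under pytest.

```python
import unittest
from typing import Any


def parse_questions(payload: Any) -> list[str]:
    """Hypothetical stand-in for the loader's payload validation."""
    if not isinstance(payload, list):
        raise ValueError("expected a JSON list")
    for item in payload:
        if not isinstance(item, str):
            raise ValueError("expected every item to be a string")
    return payload


class ParseQuestionsTests(unittest.TestCase):
    def test_accepts_flat_list_of_strings(self) -> None:
        self.assertEqual(parse_questions(["q1", "q2"]), ["q1", "q2"])

    def test_rejects_non_list_payload(self) -> None:
        # A dict payload is valid JSON but not the expected flat list.
        with self.assertRaises(ValueError):
            parse_questions({"question": "q1"})

    def test_rejects_non_string_item(self) -> None:
        with self.assertRaises(ValueError):
            parse_questions(["q1", 2])


if __name__ == "__main__":
    unittest.main()
```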

Adds a remote seed dataset loader for the DangerousQA dataset from Shaikh et al. (2022), 'On Second Thought, Let's Not Think Step by Step! Bias and Toxicity in Zero-Shot Reasoning'. The dataset contains ~200 harmful questions spanning racist, stereotypical, sexist, illegal, toxic, and harmful categories and is widely reused as a baseline in later red-teaming work such as Bhardwaj & Poria's Red-Eval (2023). The source JSON at https://github.com/SALT-NLP/chain-of-thought-bias is a flat list of strings, so the loader handles fetch and on-disk caching directly while still reusing the base class's cache-key helper.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Resolves conflict in doc/bibliography.md hidden-citations list against microsoft#1747 (DOC: Correct citations) by keeping main's citation-key renames and re-adding @shaikh2022second in alphabetical order.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

romanlutz and others added 3 commits May 18, 2026 12:06

Cover cache and source branches in dangerous_qa tests

2b38cdd

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

romanlutz wants to merge 3 commits into microsoft:main from romanlutz:romanlutz/add-dangerous-qa-dataset
