PolicyEngine US - Development Guide

⚠️ MANDATORY FIRST ACTION

At the START of each session, ask the user:

"Would you like to load PolicyEngine development skills for this session?"

Options to present:

1. "Yes, load skills" (Recommended) - Load pattern skills for code quality
2. "No, skip" - Proceed without loading skills

If the user selects option 1, load ALL of these:

/policyengine-code-style
/policyengine-parameter-patterns
/policyengine-period-patterns
/policyengine-testing-patterns
/policyengine-variable-patterns

Build/Test/Lint Commands

# Install dependencies
make install
# Alternative installation
pip install -e .[dev]

# Format code
make format  # Runs ruff format

# Run all tests
make test

# Run specific test file or directory
pytest policyengine_us/tests/path/to/test_file.py

# Run specific test function
pytest policyengine_us/tests/path/to/test_file.py::test_function_name

# Run specific YAML tests
policyengine-core test path/to/tests -c policyengine_us [-v]

# Run microsimulation test
pytest policyengine_us/tests/microsimulation/test_microsim.py

# Run the YAML test suites through make (structural and non-structural targets)
make test-yaml-structural
make test-yaml-no-structural

# Generate documentation
make documentation

GitHub Workflow

Checkout a PR: gh pr checkout [PR-NUMBER]
View PR list: gh pr list
View PR details: gh pr view [PR-NUMBER]
Contributing to PRs:
- ALWAYS run make format before committing - this ensures code meets style guidelines and is non-negotiable
- Use git push to push changes to the PR branch

Changelog

Every PR needs a changelog fragment in changelog.d/:

echo "Description of change." > changelog.d/<branch-name>.<type>.md

Types: added (minor bump), changed (patch bump), fixed (patch bump), removed (minor bump), breaking (major bump)

DO NOT edit CHANGELOG.md directly or use changelog_entry.yaml (deprecated).
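
As a rough illustration of the fragment convention above, the following sketch writes a fragment file and rejects unknown types. The helper name `write_fragment` and the `BUMP_BY_TYPE` table are hypothetical and not part of the repository; only the `changelog.d/<branch-name>.<type>.md` layout and the type-to-bump mapping come from this guide.

```python
from pathlib import Path

# Fragment types and the version bump each one triggers (from the list above).
BUMP_BY_TYPE = {
    "added": "minor",
    "changed": "patch",
    "fixed": "patch",
    "removed": "minor",
    "breaking": "major",
}


def write_fragment(root: Path, branch: str, change_type: str, text: str) -> Path:
    """Write changelog.d/<branch>.<type>.md under root and return its path.

    Hypothetical helper for illustration; the guide itself uses a plain echo.
    """
    if change_type not in BUMP_BY_TYPE:
        raise ValueError(f"Unknown changelog type: {change_type!r}")
    fragment_dir = root / "changelog.d"
    fragment_dir.mkdir(parents=True, exist_ok=True)
    path = fragment_dir / f"{branch}.{change_type}.md"
    # End the description with exactly one newline.
    path.write_text(text.rstrip("\n") + "\n")
    return path
```

For example, `write_fragment(Path("."), "fix-snap", "fixed", "Fix SNAP deduction.")` would create `changelog.d/fix-snap.fixed.md`. A branch name containing `/` would be treated as a subdirectory, so such names would need to be sanitized first.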

Project Requirements

Python >= 3.11, < 3.15
Follow GitHub Flow, with PRs targeting the master branch
Every PR needs a changelog fragment in changelog.d/
ALWAYS run make format before every commit - this is mandatory
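
The Python version requirement above can be checked before installing. This is a minimal sketch; the function name `is_supported` is hypothetical, and only the bounds (3.11 inclusive, 3.15 exclusive) come from this guide.

```python
import sys

MIN_VERSION = (3, 11)  # inclusive lower bound
MAX_VERSION = (3, 15)  # exclusive upper bound


def is_supported(version=sys.version_info[:2]) -> bool:
    """Return True if (major, minor) satisfies 3.11 <= version < 3.15."""
    return MIN_VERSION <= tuple(version[:2]) < MAX_VERSION


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:2])
    print(f"Python {current} supported: {is_supported()}")
```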

Project-Specific Gotchas

Unit tests with scalar values can pass while vectorized microsimulation fails
When implementing a previously empty variable, check for dependent formulas
When using defined_for, ensure it is tested in a microsimulation context
For scale parameters that return integers, do not set rate_unit: int in metadata; use /1 instead
For scale parameters that return true/false values, use rate_unit: bool rather than int or /1
Program takeup is assigned during microdata construction, not simulation time
- Changes to takeup parameters (SNAP, EITC, etc.) have no effect in the web app
- These parameters should include economy: false in their metadata
Labor Supply Response & Negative Earnings: Use max_(earnings, 0) to prevent sign flips. Negative total earnings should result in zero labor supply responses.
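
The clamping rule for negative earnings can be sketched with scalars as follows. The response formula and the names `labor_supply_response`, `elasticity`, and `relative_wage_change` are assumptions made up for this illustration; model code would apply the vectorized `max_(earnings, 0)` named above rather than the built-in `max`.

```python
def labor_supply_response(
    earnings: float, elasticity: float, relative_wage_change: float
) -> float:
    """Toy response: clamped earnings x elasticity x relative wage change.

    The formula is a placeholder; the point is the clamp on the first line.
    """
    # Clamp first, so negative earnings cannot flip the sign of the response.
    clamped_earnings = max(earnings, 0.0)
    return clamped_earnings * elasticity * relative_wage_change


# Without the clamp, -5000 * 0.25 * 0.5 would give a negative "response"
# to a wage increase. With it, negative earnings produce no response at all.
negative_case = labor_supply_response(-5_000.0, 0.25, 0.5)
positive_case = labor_supply_response(10_000.0, 0.25, 0.5)
```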

Program registry (programs.yaml)

policyengine_us/programs.yaml is the single source of truth for program coverage metadata
Served via the /us/metadata API and consumed by the model coverage page
When adding a new program: add an entry with id, name, full_name, category, agency, status, coverage, variable, parameter_prefix
When extending year coverage: update verified_years (e.g., "2022-2026") after verifying parameters and tests cover the new year
When adding state implementations: add to state_implementations list under the parent federal program
Status values: complete, partial, in_progress
Keep entries sorted by: Taxes, then Benefits by agency (USDA, HHS, SSA, HUD, FCC, ED, DOE), then State, then Local
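
A registry entry can be checked against the rules above with a small helper. The sketch below works on an already-parsed entry (a plain dict) and is not part of the repository: the function names are hypothetical, and it assumes verified_years always has the "start-end" form shown in the example above.

```python
# Fields required for a new program entry, as listed above.
REQUIRED_FIELDS = (
    "id", "name", "full_name", "category", "agency",
    "status", "coverage", "variable", "parameter_prefix",
)
VALID_STATUSES = {"complete", "partial", "in_progress"}


def expand_verified_years(spec: str) -> list:
    """Expand a "start-end" string such as "2022-2026" into a list of years."""
    start, end = (int(part) for part in spec.split("-"))
    return list(range(start, end + 1))


def entry_problems(entry: dict) -> list:
    """Return human-readable problems found in one registry entry."""
    problems = [
        f"missing field: {field}"
        for field in REQUIRED_FIELDS
        if field not in entry
    ]
    status = entry.get("status")
    if status is not None and status not in VALID_STATUSES:
        problems.append(f"invalid status: {status}")
    return problems
```

An entry with all nine fields and a valid status yields an empty list. An entry missing `agency` with `status: done` would yield two problems.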

State Program Patterns

When refactoring federal programs to state-specific implementations:
- Keep shared federal components if they're from federal regulations (CFR/USC)
- Check all dependencies before removing variables - use grep to find references
- Create integration tests to verify the refactoring works correctly
State programs should be self-contained with their own income calculations and eligibility rules
- Use state-specific variable names (e.g., il_tanf_countable_income not tanf_countable_income)

Regulatory Compliance

Always cite specific regulation sections in each variable's reference and documentation
When implementing complex benefit calculations, document the step-by-step process based on regulations
Follow the exact order of operations specified in regulations
Verify behavior at edge cases (income just below/above thresholds, exact boundary conditions)
Consider real-world examples to validate implementation, including official calculators
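
The edge-case guidance above can be turned into test inputs mechanically. In this sketch, `boundary_cases` and the eligibility rule `is_eligible` are hypothetical; whether a real program uses "at or below" or "strictly below" the limit has to be read from the regulation, which is exactly what the boundary test pins down.

```python
def boundary_cases(threshold: float, step: float = 1.0) -> list:
    """Return values just below, exactly at, and just above a threshold."""
    return [threshold - step, threshold, threshold + step]


def is_eligible(income: float, income_limit: float) -> bool:
    """Hypothetical rule: eligible when income is at or below the limit."""
    return income <= income_limit


# Evaluate the rule at each boundary value for a limit of 1,000.
limit = 1_000.0
results = {income: is_eligible(income, limit) for income in boundary_cases(limit)}
```

If the regulation instead said "less than", only the middle case would change, which is why the exact-boundary value belongs in every test file.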

Code Integrity

BEFORE DELETING ANY CODE, VERIFY IT IS ACTUALLY UNUSED
- Grep for all callers: grep -r 'name' . --include='*.py' | grep -v test | grep -v __pycache__
- Code that lives near dead code is not necessarily dead — verify each piece independently
- Existing tests may bypass the code being removed (e.g. providing a variable as direct input rather than testing its derivation) — passing tests ≠ safe to delete
ABSOLUTELY NEVER HARDCODE LOGIC JUST TO PASS SPECIFIC TEST CASES
- NEVER add conditional logic that returns fixed values for specific input combinations
- NEVER use period.start.year or other conditional checks to return hardcoded values for test cases
- If tests fail, fix the ACTUAL ROOT CAUSE, not the symptom
When dealing with regulatory examples:
- Use period-appropriate parameter values
- Document any special time-period specific logic in BOTH code comments and variable documentation
- Focus on preserving the calculation PROCESS rather than just matching specific OUTCOMES
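
The caller check at the top of this section can also be scripted. The sketch below is a rough Python counterpart of the grep pipeline, not an existing project tool; `find_references` is a hypothetical name. It differs from the grep in two ways: it matches whole identifiers only, and it filters on file paths containing "test" rather than on matching output lines.

```python
import re
from pathlib import Path


def find_references(root: Path, name: str) -> list:
    """List (relative_path, line_number, line) for non-test uses of name."""
    pattern = re.compile(rf"\b{re.escape(name)}\b")
    hits = []
    for path in sorted(root.rglob("*.py")):
        relative = path.relative_to(root)
        # Skip caches and anything under a test path, like the grep -v filters.
        if "__pycache__" in relative.parts or "test" in str(relative):
            continue
        text = path.read_text(encoding="utf-8", errors="ignore")
        for line_number, line in enumerate(text.splitlines(), start=1):
            if pattern.search(line):
                hits.append((relative, line_number, line.strip()))
    return hits
```

Like the grep, this only searches `*.py` files. An empty result is a necessary condition for deleting code, not a sufficient one.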

Code Coverage Exclusions

Use # pragma: no cover only for code that cannot be tested in unit tests:

Allowed:

Microsim-specific branches: simulation.is_over_dataset, simulation.has_axes
Behavioral response code with simulation branching: simulation.get_branch(), simulation.baseline

NOT allowed:

Code that simply lacks tests (write tests instead)
Complex logic that seems hard to test (find a way)
Edge cases or error handling (these should be tested)
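
A minimal sketch of an allowed exclusion follows. `StubSimulation` is a hypothetical stand-in that exists only to make the example runnable, and `run_mode` is an invented function; the attribute name `is_over_dataset` is the one listed above.

```python
class StubSimulation:
    """Hypothetical stand-in for a simulation object; not the real class."""

    def __init__(self, is_over_dataset: bool = False):
        self.is_over_dataset = is_over_dataset


def run_mode(simulation) -> str:
    """Report which kind of run the calling code is executing in."""
    if simulation.is_over_dataset:  # pragma: no cover
        # Microsim-only branch: a unit test on one household never gets here,
        # so excluding it from coverage is allowed.
        return "microsimulation"
    # Everything outside the excluded branch still needs ordinary tests.
    return "single household"
```

The pragma sits on the line that opens the branch, so only that branch body is excluded, not the whole function.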

Parameter Validation Gotchas

When using breakdown metadata in parameters, avoid using variable references for integer values. Use Python expressions such as range(1, 5) instead.
The parameter validation system has issues with certain structures:
- Using boolean keys (True/False) as parameter names can cause validation errors
- Using integer output variables in breakdown metadata can cause errors
To fix validation issues:
- Split complex parameters into separate, simpler parameter files
- Use string names instead of boolean keys
- See GitHub issue #346
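
The boolean-key problem can be spotted before validation runs. This sketch walks an already-parsed parameter tree (nested dicts) and is not part of the validation system; the function names and the replacement names "eligible" and "ineligible" are hypothetical. It assumes the YAML loader has turned bare true/false keys into Python booleans, as PyYAML does.

```python
# Hypothetical string names to use in place of boolean keys.
DEFAULT_NAMES = {True: "eligible", False: "ineligible"}


def find_boolean_keys(node, path=()):
    """Return the key path of every boolean key in a nested dict."""
    found = []
    if isinstance(node, dict):
        for key, value in node.items():
            here = path + (key,)
            if isinstance(key, bool):
                found.append(here)
            found.extend(find_boolean_keys(value, here))
    return found


def rename_boolean_keys(node, names=DEFAULT_NAMES):
    """Return a copy of the tree with boolean keys replaced by string names."""
    if not isinstance(node, dict):
        return node
    return {
        (names[key] if isinstance(key, bool) else key): rename_boolean_keys(value, names)
        for key, value in node.items()
    }
```

Integer keys such as `1` are left alone, because `isinstance(1, bool)` is false even though `1 == True` in Python.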

Entity Structures

Marital Units:
- Include exactly 1 person (if unmarried) or 2 people (if married)
- Do NOT include children or dependents
- marital_unit.nb_persons() will return 1 or 2, never more
SSI Income Attribution:
- For married couples where both are SSI-eligible: combined income is attributed to each spouse via ssi_marital_earned_income and ssi_marital_unearned_income
SSI Spousal Deeming:
- Only applies when one spouse is eligible and the other is ineligible
