Open-source toolkit for specification-curve and multiverse analysis in clinical AI research — quantifying how analytical choices affect conclusions.
The Multiverse Analysis Toolkit provides open-source implementations for running multiverse analyses (Steegen et al., 2016) and specification-curve analyses (Simonsohn et al., 2020) on clinical AI prediction models. It systematically varies analytical decisions — preprocessing, feature selection, model architecture, hyperparameters, outcome definitions — and reports the full distribution of results across all defensible specifications.
Multiverse analysis addresses a critical credibility problem in clinical AI: published results typically report a single "best" specification, hiding the sensitivity of conclusions to analytical choices. This toolkit makes the full decision space transparent and auditable.
EvidenceOS uses a 12-axis multiverse framework for clinical AI evaluation:
| Axis | What varies | Example |
|---|---|---|
| 1. Cohort definition | Inclusion/exclusion criteria | Age ranges, injury severity thresholds |
| 2. Feature selection | Which predictors to include | GCS alone vs GCS + biomarkers + imaging |
| 3. Missing data | Imputation strategy | Complete case, MICE, mean imputation |
| 4. Outcome definition | How outcome is measured | 6-month GOS-E vs 12-month mortality |
| 5. Model architecture | Algorithm choice | Logistic regression, XGBoost, neural net |
| 6. Hyperparameters | Tuning ranges | Learning rate, regularization, depth |
| 7. Validation strategy | How performance is estimated | k-fold, temporal split, external validation |
| 8. Performance metric | Which metric is primary | AUROC, calibration slope, net benefit |
| 9. Subgroup | Population subset | Pediatric, elderly, mild TBI, severe TBI |
| 10. Threshold | Decision boundary | Sensitivity-optimized vs specificity-optimized |
| 11. Temporal window | Follow-up duration | 30-day, 90-day, 6-month, 12-month |
| 12. Site | Data source | Single-center, multi-center, cross-country |
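
To make the size of the decision space concrete, here is a minimal sketch of how specifications could be enumerated as the Cartesian product of axis options. It is not the toolkit's engine: `decision_space`, `enumerate_specifications`, and the option labels are hypothetical names chosen for illustration, and only four of the twelve axes are shown.

```python
from itertools import product

# Hypothetical decision space. Axis names follow the table above, but the
# option labels are illustrative and are NOT the toolkit's config schema.
decision_space = {
    "missing_data": ["complete_case", "mice", "mean_imputation"],
    "model_architecture": ["logistic_regression", "xgboost", "neural_net"],
    "validation_strategy": ["k_fold", "temporal_split"],
    "performance_metric": ["auroc", "calibration_slope"],
}


def enumerate_specifications(space):
    """Return one dict per specification: the Cartesian product of all axis options."""
    axes = list(space)
    return [dict(zip(axes, combo)) for combo in product(*(space[a] for a in axes))]


specifications = enumerate_specifications(decision_space)

# The count is the product of the options per axis: 3 * 3 * 2 * 2.
print(len(specifications))
print(specifications[0])
```

Because the count multiplies across axes, a full 12-axis space grows quickly. In practice, a multiverse is usually restricted to combinations that are defensible for the clinical question at hand.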
git clone https://github.com/EvidenceOS/multiverse-analysis-toolkit.git
cd multiverse-analysis-toolkit
pip install -r requirements.txt
# Run example multiverse analysis (synthetic TBI data)
python examples/tbi_multiverse.py
# Generate specification curve plot
python tools/spec_curve.py --results examples/output/tbi_results.csv --output spec_curve.png
# Run your own multiverse
python tools/run_multiverse.py --config your_config.yaml --data your_data.csv
/multiverse-analysis-toolkit
├── README.md
├── CONTRIBUTING.md
├── LICENSE — Apache 2.0
├── requirements.txt
├── /core
│ ├── multiverse.py — Core multiverse engine
│ ├── spec_curve.py — Specification curve implementation
│ ├── config_parser.py — YAML config for defining decision space
│ └── report_generator.py — Automated report generation
├── /visualizations
│ ├── spec_curve_plot.py — Specification curve visualization
│ ├── heatmap.py — Decision × outcome heatmap
│ └── forest_plot.py — Forest plot of specifications
├── /examples
│ ├── tbi_multiverse.py — TBI prediction model multiverse
│ ├── configs/ — Example YAML configs
│ └── output/ — Example outputs
├── /tests
└── /docs
    ├── methodology.md
    ├── config_reference.md
    └── interpretation_guide.md
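
At its simplest, a specification curve is the set of per-specification estimates sorted by magnitude, plus a summary of how they are distributed. The sketch below illustrates that idea in plain Python. It is an assumption-laden illustration, not the contents of `core/spec_curve.py`: the function names, the `results` rows, and the AUROC values are all made up.

```python
from statistics import median

# Hypothetical per-specification results; the AUROC values are invented
# for illustration and do not come from any real analysis.
results = [
    {"spec_id": "s1", "model": "logistic_regression", "auroc": 0.78},
    {"spec_id": "s2", "model": "xgboost", "auroc": 0.81},
    {"spec_id": "s3", "model": "neural_net", "auroc": 0.74},
    {"spec_id": "s4", "model": "xgboost", "auroc": 0.83},
    {"spec_id": "s5", "model": "logistic_regression", "auroc": 0.69},
]


def specification_curve(rows, metric):
    """Sort specifications by the chosen metric, lowest first."""
    return sorted(rows, key=lambda row: row[metric])


def summarize(rows, metric, threshold):
    """Summarize the distribution of a metric across all specifications."""
    values = [row[metric] for row in rows]
    return {
        "n_specifications": len(values),
        "min": min(values),
        "median": median(values),
        "max": max(values),
        # Share of specifications whose metric meets or exceeds the threshold.
        "share_at_or_above_threshold": sum(v >= threshold for v in values) / len(values),
    }


curve = specification_curve(results, "auroc")
summary = summarize(results, "auroc", threshold=0.75)

for row in curve:
    print(row["spec_id"], row["model"], row["auroc"])
print(summary)
```

Reporting the median and the share of specifications above a pre-declared threshold, rather than only the maximum, is what separates a specification-curve summary from a single "best" result.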
We welcome contributions (see CONTRIBUTING.md), including:
- Add new analysis axes
- Submit worked examples (synthetic data only — no real PHI)
- Improve visualizations
- Extend to new clinical domains
EvidenceOS Inc. (2026). Multiverse Analysis Toolkit: Open-Source
Specification-Curve Analysis for Clinical AI Research.
https://github.com/EvidenceOS/multiverse-analysis-toolkit
Based on: Steegen et al. (2016). Increasing Transparency Through a
Multiverse Analysis. Perspectives on Psychological Science.
- evidence-commons — Ontologies used in axis definitions
- evidenceos-bridge-tbi — Clinical model evaluated via multiverse
- tripod-ai-templates — Reporting compliance
Apache 2.0 (see LICENSE). Maintained by EvidenceOS Inc. Contact: peter@evidenceos.com | https://evidenceos.com
Verified claims in this README:
| Claim | Source | Status |
|---|---|---|
| 12-axis multiverse framework | EvidenceOS architecture specification | Verified |
| Steegen et al., 2016 methodology | Steegen et al., Perspectives on Psychological Science, 2016 | Verified |
| Simonsohn et al., 2020 specification curve | Simonsohn et al., Nature Human Behaviour, 2020 | Verified |