LLMs can summarize a hundred papers in seconds. What they cannot give you is an auditable systematic review: logged searches, traceable sources, recorded inclusion decisions, evidence tied to exact passages, and a replayable workflow another researcher can verify.
ResearchForge builds the package behind that review. Searches are logged. Sources are tracked. Inclusion decisions are stored. Evidence links to exact passages. The finished package replays offline.
The command-line tool is rforge.
- Search forty-four scholarly sources: OpenAlex, arXiv, Crossref, Semantic Scholar, PubMed, Europe PMC, NASA ADS, DOAJ, CORE, bioRxiv/medRxiv, Zenodo, INSPIRE HEP, dblp, ClinicalTrials.gov, OSF, OpenCitations, BASE, zbMATH Open, figshare, DataCite, Lens.org, ERIC, HAL, Dimensions, PubChem, ChemRxiv, NTRS, DOAB, OpenAIRE, PLOS, OSTI, Dryad, Research Square, CiNii, BioStudies, GBIF Literature, Harvard Dataverse, NASA CMR, PubMed Central (PMC), HuggingFace Papers, OAPEN, NBER Working Papers, Open Library, eLife Sciences
- Import and deduplicate papers into a local project store
- Track provenance end to end; every reference knows where it came from
- Screen studies with recorded inclusion and exclusion decisions
- Extract evidence linked to exact source passages
- Run meta-analysis: arm-pair effect sizes (SMD, odds ratio, risk ratio, mean difference, correlation) and scientific benchmarking mode for pooling a single continuous metric per study (e.g. solar-to-hydrogen efficiency %, AUROC) — with variance floor imputation and automatic sensitivity runs
- Build auditable reports
- Export replayable review packages that another researcher can audit offline
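The arm-pair effect sizes named in the list above follow standard textbook formulas. As a rough illustration, and not ResearchForge's own implementation, here is a minimal Python sketch of two of them: the log odds ratio from a 2x2 table and the standardized mean difference (Cohen's d). The function names and argument order are hypothetical.

```python
import math


def log_odds_ratio(events_t, n_t, events_c, n_c):
    """Return (log odds ratio, variance) for treatment vs control counts.

    Assumes every cell of the 2x2 table is non-zero; a continuity
    correction would have to be applied before calling this.
    """
    a, b = events_t, n_t - events_t  # treatment: events, non-events
    c, d = events_c, n_c - events_c  # control: events, non-events
    if min(a, b, c, d) <= 0:
        raise ValueError("zero cell: apply a continuity correction first")
    log_or = math.log((a * d) / (b * c))
    variance = 1 / a + 1 / b + 1 / c + 1 / d
    return log_or, variance


def standardized_mean_difference(mean_t, sd_t, n_t, mean_c, sd_c, n_c):
    """Return (Cohen's d, large-sample variance) using the pooled SD."""
    pooled_var = ((n_t - 1) * sd_t**2 + (n_c - 1) * sd_c**2) / (n_t + n_c - 2)
    d = (mean_t - mean_c) / math.sqrt(pooled_var)
    variance = (n_t + n_c) / (n_t * n_c) + d**2 / (2 * (n_t + n_c))
    return d, variance
```

Each function returns the effect size together with its sampling variance, because inverse-variance pooling needs both.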
Works today:
- Project workspaces
- Multi-source search and import
- Source and reference provenance logs
- Deduplication reporting
- Offline forge fixture workflow (rforge forge)
- Package audit and replay
- Local Go + HTMX research cockpit (rforge ui)
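To make the deduplication item above concrete, here is a minimal Python sketch of one common approach: match on a normalized DOI when one exists, otherwise on a normalized title, and record which entry each duplicate collapsed into. The matching rules ResearchForge actually applies are not described in this README, so the record fields (`doi`, `title`) and both function names are assumptions for illustration only.

```python
import re


def dedup_key(record):
    """Build a match key: normalized DOI if present, else normalized title."""
    doi = (record.get("doi") or "").strip().lower()
    # Strip common DOI prefixes so URL and bare forms compare equal.
    doi = re.sub(r"^(https?://(dx\.)?doi\.org/|doi:)", "", doi)
    if doi:
        return ("doi", doi)
    title = (record.get("title") or "").lower()
    title = re.sub(r"[^a-z0-9]+", " ", title).strip()
    return ("title", title)


def deduplicate(records):
    """Keep the first record per key; return (kept, report of duplicates)."""
    seen = {}
    kept, report = [], []
    for rec in records:
        key = dedup_key(rec)
        if key[1] and key in seen:
            report.append({"duplicate": rec, "kept": seen[key]})
            continue
        if key[1]:  # records with no usable key are never merged
            seen[key] = rec
        kept.append(rec)
    return kept, report
```

Returning the report alongside the kept records keeps every merge decision inspectable rather than silently dropping entries.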
Experimental:
- Document parsing
- Screening engine
- Evidence extraction (including abstract extraction interface for LLM-backed numeric extraction)
- Meta-analysis: arm-pair effect sizes and scientific benchmarking mode (--effect raw-continuous, rforge analysis ready, automatic floor-sensitivity runs per ADR-0007)
- Web GUI beyond the skeleton
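The benchmarking mode listed above pools one continuous value per study and pairs the main result with a floor-sensitivity run. The Python sketch below shows one plausible reading of that idea, using fixed-effect inverse-variance pooling. It assumes the floor simply raises any variance below it; ADR-0007 may define the floor and the pooling model differently, and the function names are hypothetical.

```python
def pool_fixed_effect(values, variances, variance_floor=None):
    """Return (inverse-variance weighted mean, variance of that mean).

    If variance_floor is given, any variance below it is raised to the
    floor. This is an assumed interpretation, not ADR-0007's definition.
    """
    if len(values) != len(variances) or not values:
        raise ValueError("need one variance per value and at least one study")
    if variance_floor is not None:
        variances = [max(v, variance_floor) for v in variances]
    if any(v <= 0 for v in variances):
        raise ValueError("variances must be positive")
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    pooled = sum(w * y for w, y in zip(weights, values)) / total
    return pooled, 1.0 / total


def run_with_sensitivity(values, variances, variance_floor):
    """Return the floored main result plus a no-floor sensitivity result."""
    return {
        "main": pool_fixed_effect(values, variances, variance_floor),
        "no_floor": pool_fixed_effect(values, variances),
    }
```

Comparing the two results shows how much a single study with an implausibly small reported variance would otherwise dominate the pooled estimate.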
Adapter seams planned: GROBID (PDF parsing), OpenSearch, Qdrant (search), R/metafor, PyMARE (meta-analysis), PostgreSQL, LLM abstract extraction (Claude / OpenAI)
See ROADMAP.md for milestone sequencing.
curl -fsSL https://raw.githubusercontent.com/TrebuchetDynamics/research-forge/main/install.sh | bash
The installer downloads the latest release binary and does not require Go. Run rforge version to verify.
To build from source with Go 1.26+:
go install github.com/TrebuchetDynamics/research-forge/cmd/rforge@latest
Or clone and build:
git clone https://github.com/TrebuchetDynamics/research-forge
cd research-forge
go build -o bin/rforge ./cmd/rforge

rforge project create ./my-review --title "High entropy superconductors"
rforge search import --source openalex --query "high entropy superconductors" \
--pages 3 --project ./my-review
rforge duplicate report --project ./my-review
rforge report build --out ./my-review/report.md
rforge --project ./my-review ui
rforge forge is the guided workflow for building an auditable review package: a self-contained artifact that can be verified and replayed offline.
rforge forge init --project ./my-review \
--question "Do artificial photosynthesis catalysts improve solar fuel generation outcomes?"
rforge forge approve --project ./my-review --gate "question approval" --note accepted
rforge forge approve --project ./my-review --gate "protocol approval" --note accepted
rforge forge approve --project ./my-review --gate "network/API approval" --note accepted
rforge forge source-fixture --project ./my-review
rforge forge reference-fixture --project ./my-review
rforge forge approve --project ./my-review --gate "identity approval" --note accepted
rforge forge acquisition-fixture --project ./my-review
# Continue parser/screening/evidence/analysis/report approvals, then:
rforge forge package-fixture --project ./my-review --out ./review.rforgepkg
rforge package audit ./review.rforgepkg
rforge package replay ./review.rforgepkg
Pool a single continuous metric (e.g. solar-to-hydrogen efficiency %) across device papers without requiring treatment/control arms:
# Search with the chemistry preset, import records, and fetch legal OA PDFs as text
rforge --project ./my-review search batch --queries queries.txt --sources chemistry \
--out ./my-review/searches --stats --fetch-pdfs
# Prepare a benchmarking run — variance floor per ADR-0007, moderator fields auto-wired
rforge analysis prepare my-run --effect raw-continuous --variance-floor 0.0025 \
--moderator device_type --moderator auxiliary_bias --moderator measurement_standard
# Check all accepted evidence items carry the required schema fields
rforge analysis ready my-run --required device_type,auxiliary_bias,measurement_standard
# Run — emits main result + automatic no-floor sensitivity artifact
rforge analysis run my-run

rforge oss add kermitt2/grobid
rforge oss scan --topic "meta-analysis"
rforge oss report --area parsers
A review package is a directory:
review.rforgepkg
├── protocol.yaml
├── searches/
├── sources/
├── screening/
├── evidence/
├── analysis/
├── report.md
└── provenance.jsonl
rforge package audit verifies the package is complete. rforge package replay re-runs the workflow from the provenance log.
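To illustrate what a completeness check over that layout could look like, here is a minimal Python sketch that only tests for the presence of the entries shown in the tree above. What rforge package audit actually verifies is defined by the package contract, not by this snippet; the function name and the presence-only rule are assumptions.

```python
from pathlib import Path

# Entries taken from the directory layout shown above.
EXPECTED_FILES = ["protocol.yaml", "report.md", "provenance.jsonl"]
EXPECTED_DIRS = ["searches", "sources", "screening", "evidence", "analysis"]


def missing_entries(package_dir):
    """Return the expected files and directories absent from package_dir."""
    root = Path(package_dir)
    missing = [name for name in EXPECTED_FILES if not (root / name).is_file()]
    missing += [name + "/" for name in EXPECTED_DIRS if not (root / name).is_dir()]
    return missing
```

An empty list means every expected entry is present. It says nothing about whether the contents are valid.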
rforge --project ./my-review ui
Opens a local Go + HTMX web interface for browsing papers, graph/artifact views, provenance, screening decisions, and evidence.
ResearchForge does not replace scientific judgment with AI answers. It gives researchers and agents structured tools for searching, screening, extracting, and reporting, while keeping citations, source passages, and replayable logs as the record of truth.
LLM outputs that enter the workflow are stored with provenance like any other step. The system is model-agnostic and works without any LLM connection.
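As a sketch of what storing an LLM output "with provenance like any other step" might involve, the Python below builds a record that names the step, the actor, and the inputs, hashes the output text, and appends the record as one JSON line. The field names and both function names are hypothetical; the real provenance schema is not shown in this README.

```python
import hashlib
import json


def provenance_record(step, actor, output_text, inputs):
    """Build a provenance record; every field name here is illustrative."""
    return {
        "step": step,
        "actor": actor,  # e.g. a human reviewer or a model identifier
        "inputs": sorted(inputs),
        "output_sha256": hashlib.sha256(output_text.encode("utf-8")).hexdigest(),
    }


def append_jsonl(path, record):
    """Append one record as a single JSON line, leaving earlier lines untouched."""
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record, sort_keys=True) + "\n")
```

Hashing the output lets a later audit confirm that the stored text is the text the step produced, whether a model or a person wrote it.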
ResearchForge ships a standalone agent skill — skills/research-forge/SKILL.md — that works in any project, not just this repo. The skill auto-installs rforge if it is not on the system, then handles literature search, provenance, and review packaging for any academic topic.
Claude Code / Pi — install globally (one command):
mkdir -p ~/.claude/skills/research-forge && \
curl -fsSL https://raw.githubusercontent.com/TrebuchetDynamics/research-forge/main/skills/research-forge/SKILL.md \
> ~/.claude/skills/research-forge/SKILL.md
Any other harness: paste the contents of skills/research-forge/SKILL.md as your system prompt or opening message.
Once installed, invoke it from any project:
Use the research-forge skill to research: <your topic>
The skill will check for rforge, install it if missing, create a project workspace, search the relevant sources, and write provenance.json before finishing.
Research question / domain query
-> Query planner + protocol compiler
-> rforge CLI
-> Scholarly source connectors (OpenAlex, arXiv, Crossref, ...)
-> Local SQLite project store + provenance log
-> Document acquisition + parser adapters (GROBID, S2ORC, ...)
-> Retrieval index (SQLite FTS / OpenSearch / Qdrant)
-> Screening engine + active-learning scaffold
-> Evidence extraction + risk-of-bias
-> Meta-analysis / statistical engine (R/metafor, PyMARE)
-> Report generator + reproducible package exporter
-> Local Go + HTMX research cockpit (rforge ui)
| Layer | Choice |
|---|---|
| Language | Go |
| CLI | rforge |
| Local web GUI | Go + HTMX (rforge ui) |
| Database | SQLite (PostgreSQL adapter seam planned) |
| Search | SQLite FTS; OpenSearch and Qdrant as optional adapters |
| PDF parsing | GROBID (optional); S2ORC, PaperMage, Anystyle adapter seams |
| Metadata sources | OpenAlex, Crossref, arXiv, Semantic Scholar, PubMed, Europe PMC, NASA ADS, Unpaywall, DOAJ, CORE, bioRxiv/medRxiv, Zenodo, INSPIRE HEP, dblp, ClinicalTrials.gov, OSF, OpenCitations, BASE, zbMATH Open, figshare, DataCite, Lens.org, ERIC, HAL, Dimensions, PubChem, ChemRxiv, NTRS, DOAB, OpenAIRE, PLOS, OSTI, Dryad, Research Square, CiNii, BioStudies, GBIF Literature, Harvard Dataverse, NASA CMR, PubMed Central (PMC), HuggingFace Papers, OAPEN, NBER Working Papers, Open Library, eLife Sciences |
| Meta-analysis | R metafor; PyMARE adapter seam |
See SKILLS.md before starting implementation work — each development skill owns a specific slice and enforces red-green-refactor TDD. See RESEARCH-FORGE-PRD.md for full product requirements and docs/reproducible-review-package.md for the package contract.
MIT License (SPDX: MIT), Copyright (c) 2026 Trebuchet Dynamics. See LICENSE. The license was selected by the repository owner on 2026-06-13; the decision record is tracked in issue #1 and docs/owner-decisions.md.
Local web GUI delivery targets Go + HTMX; the implementation tracker is recorded in issue #2 and ADR 0006. Run make todo-audit to verify that remaining unchecked TODO.md items are covered by owner decisions, make todo-completion-audit for the closeout prompt-to-artifact checklist, or make decisions-markdown for a review-friendly blocker table.