docs/lean-squad.md
Lines changed: 10 additions & 2 deletions
@@ -25,7 +25,8 @@ graph LR
25
25
A --> T4[Task 4: Implementation Extraction]
26
26
A --> T5[Task 5: Proof Assistance]
27
27
A --> T6[Task 6: Maintain Open Lean Squad PRs]
28
-
T1 & T2 & T3 & T4 & T5 & T6 --> T7[Task 7: Update FV Status Issue]
28
+
A --> T10[Task 10: Project Report]
29
+
T1 & T2 & T3 & T4 & T5 & T6 & T10 --> T7[Task 7: Update FV Status Issue]
29
30
T7 --> M[Save repo-memory]
30
31
````
31
32
@@ -37,7 +38,7 @@ The weighting scheme adapts automatically: when no FV work exists Task 1 dominat
37
38
38
39
Default weighting: dominates when no FV work exists yet.
39
40
40
-
Surveys the codebase to identify 3–5 functions, data structures, or algorithms that are strong formal verification candidates. For each target documents: expected benefit, rough specification size, proof tractability (`decide` / routine tactics / deep proof engineering), approximations needed, and recommended approach (model checking, inductive invariant, equational proof). Consults Lean 4 / Mathlib documentation and FV literature. Produces `formal-verification/RESEARCH.md` and `formal-verification/TARGETS.md` as a PR, and optionally a tracking issue inviting maintainer input on priorities.
41
+
Surveys the codebase to identify 3–5 functions, data structures, or algorithms that are strong formal verification candidates. If prior FV work exists, reads the latest `formal-verification/CRITIQUE.md` to incorporate feedback: adjusting target priorities, revising approaches, and addressing high-value gaps flagged by the critique. For each target, documents expected benefit, rough specification size, proof tractability (`decide` / routine tactics / deep proof engineering), approximations needed, and recommended approach (model checking, inductive invariant, equational proof). Consults Lean 4 / Mathlib documentation and FV literature. Produces `formal-verification/RESEARCH.md` and `formal-verification/TARGETS.md` as a PR, and optionally a tracking issue inviting maintainer input on priorities.
41
42
42
43
### Task 2: Informal Spec Extraction
43
44
@@ -73,6 +74,12 @@ Reviews open `[Lean Squad]` PRs, fixes CI failures (Lean syntax errors, `lake bu
73
74
74
75
Maintains a single `[Lean Squad] Formal Verification Status` issue as a continuously-updated dashboard with an at-a-glance table (one row per target, showing current phase and status), summary narrative, findings section (bugs found, counterexamples), approach notes, and a prepended run history entry for every run.
75
76
77
+
### Task 10: Project Report
78
+
79
+
Default weighting: important once proofs exist; available once Lean specs exist.
80
+
81
+
Creates and incrementally maintains `formal-verification/REPORT.md` — a comprehensive, reader-friendly project report summarising the entire FV effort. Uses mermaid diagrams extensively to visualise proof architecture, dependency layers, modelling choices, the main proof chain, and project timeline. Includes a mandatory Findings section documenting any bugs found (with counterexamples and issue links), formulation issues caught during development, and interesting structural discoveries. The report is updated incrementally each run rather than rewritten from scratch.
82
+
76
83
## What Gets Created
77
84
78
85
| Artifact | Location | Description |
@@ -81,6 +88,7 @@ Maintains a single `[Lean Squad] Formal Verification Status` issue as a continuo
81
88
| Target list | `formal-verification/TARGETS.md` | Prioritised targets with phase status |
# Phase progress heuristics derived from repo state
@@ -172,6 +176,7 @@ steps:
172
176
7: (10.0 if not has_critique else 3.0) if has_proofs else 0.0, # critique: critical when proofs exist but no doc
173
177
8: (3.0 if has_lean_specs else 1.0) if (has_rust and has_research) else 0.0, # aeneas: only for Rust codebases with research done
174
178
9: 12.0 if (has_lean_specs and not has_ci) else 2.0, # CI: critical when lean files exist but no CI; regular check otherwise
179
+
10: (8.0 if not has_report else 3.0) if has_proofs else (2.0 if has_lean_specs else 0.0), # report: important when proofs exist but no report; available once lean specs exist
'weights': {str(k): round(v, 2) for k, v in weights.items()},
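The Task 10 weight expression in this hunk nests two conditionals, which is easy to misread. The sketch below restates it as a standalone function so each regime can be checked in isolation. Only the numeric values and the `has_proofs` / `has_lean_specs` / `has_report` flags come from the diff; the names `task10_weight` and `normalise` are illustrative assumptions. How the workflow turns weights into a task choice is not shown in this hunk, so the normalisation is only one plausible reading.

```python
def task10_weight(has_proofs: bool, has_lean_specs: bool, has_report: bool) -> float:
    """Restate the Task 10 weight expression from the diff as a function."""
    if has_proofs:
        # Proofs exist: the report matters most while it is still missing.
        return 3.0 if has_report else 8.0
    # No proofs yet: a low weight once Lean specs exist, otherwise disabled.
    return 2.0 if has_lean_specs else 0.0


def normalise(weights: dict[int, float]) -> dict[int, float]:
    """Hypothetical helper: scale weights so they sum to 1 (all zeros stay zero)."""
    total = sum(weights.values())
    if total == 0:
        return {task: 0.0 for task in weights}
    return {task: value / total for task, value in weights.items()}


if __name__ == "__main__":
    # Print the weight for each combination of (has_proofs, has_lean_specs, has_report).
    for flags in [(True, True, False), (True, True, True),
                  (False, True, False), (False, False, False)]:
        print(flags, task10_weight(*flags))
```

Note that `has_report` only affects the result when `has_proofs` is true, so a report created early (from specs alone) does not lower the weight until proofs land.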
@@ -408,6 +415,7 @@ formal-verification/
408
415
TARGETS.md # Prioritised target list with current phase per target
409
416
CORRESPONDENCE.md # How each Lean implementation model maps to the Rust source
410
417
CRITIQUE.md # Ongoing assessment of proof utility and coverage
418
+
REPORT.md # Living project report, updated incrementally each run
411
419
specs/
412
420
<name>_informal.md # Informal specification per target
413
421
lean/
@@ -421,26 +429,28 @@ formal-verification/
421
429
422
430
### Task 1: Research & Target Identification
423
431
424
-
**Goal**: Survey the codebase and identify 3–5 functions, data structures, or algorithms that are strong candidates for formal verification. Document the approach, expected benefits, likely spec sizes, and proof tractability.
432
+
**Goal**: Survey the codebase and identify 3–5 functions, data structures, or algorithms that are strong candidates for formal verification. Document the approach, expected benefits, likely spec sizes, and proof tractability. If prior FV work exists, incorporate feedback from the latest critique to adjust priorities and approach.
425
433
426
434
1. Read the repository: explore the structure, primary language(s), key modules. Read README, CONTRIBUTING, and any architecture docs.
427
-
2. Identify **FV-amenable targets** — look for:
435
+
2. **Read the latest critique** (if `formal-verification/CRITIQUE.md` exists): review its assessments of proof utility, identified gaps, concerns about vacuous proofs, and recommended next targets. Use these findings to adjust which targets to prioritise, which approaches to revise, and which high-value gaps to address. If the critique flags theorems as weak or models as mismatched, factor that into the research plan by re-prioritising targets, recommending spec revisions, or noting that certain areas need deeper modelling before further proof work.
436
+
3. Identify **FV-amenable targets** — look for:
428
437
- Pure or nearly-pure functions with clear inputs/outputs
429
438
- Data structure invariants (e.g., sorted lists, balanced trees, valid state machines)
430
439
- Algorithms with textbook correctness criteria (sorting, searching, parsing, hashing)
- **Approximations needed**: what aspects of the original code can't be directly modelled in Lean (e.g., I/O, side effects, memory layout)? Document these clearly.
439
449
- **Approach**: enumeration/`decide`, inductive invariant, equational proof, model checking via bounded `decide`?
440
-
4. Search the web (`web-fetch`) for Lean 4 FV patterns relevant to the language/domain. Check Mathlib for relevant existing lemmas and automation.
441
-
5. Create or update `formal-verification/RESEARCH.md` and `formal-verification/TARGETS.md`. Create a PR.
442
-
6. Optionally, open an issue summarising the survey and inviting maintainer input on priorities.
443
-
7. Update memory with identified targets, approach choices, and rationale.
450
+
5. Search the web (`web-fetch`) for Lean 4 FV patterns relevant to the language/domain. Check Mathlib for relevant existing lemmas and automation.
451
+
6. Create or update `formal-verification/RESEARCH.md` and `formal-verification/TARGETS.md`. If updating, include a section noting how critique feedback was incorporated (e.g., re-prioritised targets, revised approaches, new targets added from gap analysis). Create a PR.
452
+
7. Optionally, open an issue summarising the survey and inviting maintainer input on priorities.
453
+
8. Update memory with identified targets, approach choices, rationale, and any critique-driven adjustments.
444
454
445
455
---
446
456
@@ -878,6 +888,194 @@ Record in memory:
878
888
879
889
---
880
890
891
+
### Task 10: Project Report
892
+
893
+
**Goal**: Create and incrementally maintain `formal-verification/REPORT.md` — a comprehensive, reader-friendly project report that summarises the entire formal verification effort, including proof architecture, what was verified, findings (including bugs), modelling choices, and project timeline. The report uses mermaid diagrams extensively to visualise proof architecture, dependency layers, modelling choices, and timeline.
894
+
895
+
This task produces a living document. Each run updates the report to reflect the current state of the project rather than rewriting it from scratch.
896
+
897
+
1. Read all existing FV artifacts: Lean files, informal specs, CORRESPONDENCE.md, CRITIQUE.md, TARGETS.md, RESEARCH.md, memory, open issues, and merged PRs.
898
+
2. **Create or update** `formal-verification/REPORT.md` with the following structure:
899
+
900
+
#### Report Structure
901
+
902
+
```markdown
903
+
> 🔬 *Lean Squad — automated formal verification for `<owner>/<repo>`.*
{List any implementation bugs discovered through formal verification.
1007
+
For each bug, include: the property that was expected to hold, the
1008
+
counterexample or proof failure, severity, and link to the filed issue.
1009
+
1010
+
If no bugs found, state this explicitly — it is itself a positive finding.}
1011
+
1012
+
### Formulation Issues
1013
+
1014
+
{Any spec or proof formulation bugs caught during development (e.g.
1015
+
over-general propositions that turned out to be false).}
1016
+
1017
+
### Interesting Structural Discoveries
1018
+
1019
+
{Properties that turned out to be stronger or weaker than expected,
1020
+
surprising equivalences, or non-obvious invariants.}
1021
+
1022
+
---
1023
+
1024
+
## Project Timeline
1025
+
1026
+
{Use a mermaid timeline diagram to show the progression of the project.}
1027
+
1028
+
```mermaid
1029
+
timeline
1030
+
title FV Project Development
1031
+
section Phase 1
1032
+
Target A : N theorems
1033
+
section Phase 2
1034
+
Target B : M theorems
1035
+
```
1036
+
1037
+
---
1038
+
1039
+
## Toolchain
1040
+
1041
+
- **Prover**: Lean 4 (version X.Y.Z)
1042
+
- **Libraries**: Mathlib / stdlib only
1043
+
- **CI**: description of CI setup
1044
+
- **Build system**: Lake
1045
+
1046
+
{Include tactic inventory table if proofs exist.}
1047
+
1048
+
| Tactic | Usage |
1049
+
|--------|-------|
1050
+
| `omega` | Integer/natural-number arithmetic |
1051
+
| ... | ... |
1052
+
```
1053
+
1054
+
3. **Mermaid diagrams are mandatory** for:
1055
+
- Proof architecture / dependency layers
1056
+
- Each verification layer's file structure
1057
+
- The main proof chain (if a headline theorem exists)
1058
+
- Modelling choices (real code → model → proofs)
1059
+
- Project timeline
1060
+
4. **Findings section is mandatory**: always include a Findings section, even when no bugs have been found. If no bugs were found, state this explicitly as a positive finding. If bugs were found, include for each:
1061
+
- The property that was expected to hold
1062
+
- The counterexample or proof failure that refuted it
1063
+
- The affected function/file and impact
1064
+
- Link to the GitHub issue filed (from Task 5)
1065
+
5. The report should be **incremental**: read the existing REPORT.md (if any), update sections that have changed, add new layers/files/theorems, and update the timeline. Do not delete prior content unless it has become incorrect.
1066
+
6. **Always** include a `## Last Updated` section near the top with the current UTC date/time and the HEAD commit SHA:
1067
+
```
1068
+
## Last Updated
1069
+
- **Date**: YYYY-MM-DD HH:MM UTC
1070
+
- **Commit**: `<SHA>`
1071
+
```
1072
+
7. Count theorems, `sorry`s, and files by inspecting the actual Lean sources — do not guess from memory alone.
1073
+
8. Cross-reference CORRESPONDENCE.md and CRITIQUE.md when describing modelling choices, proof utility, and known limitations.
1074
+
9. Create a PR with the updated REPORT.md.
1075
+
10. Update memory: note that the report exists and what state it covers.
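
Steps 6 and 7 above can be sketched as follows. This is a minimal sketch under stated assumptions, not the workflow's actual implementation: the helper names `count_declarations`, `count_tree`, and `last_updated_section` are hypothetical, and the regex-based counting is a rough approximation that does not parse Lean. It strips `--` line comments but ignores `/- ... -/` block comments and string literals, so the counts should be treated as a starting point for inspection.

```python
import re
from datetime import datetime, timezone
from pathlib import Path

# A `theorem` or `lemma` keyword at the start of a line, optionally preceded
# by attributes such as @[simp] and a visibility modifier.
DECL_RE = re.compile(
    r"^\s*(?:@\[[^\]]*\]\s*)*(?:(?:private|protected)\s+)?(?:theorem|lemma)\s+\S",
    re.MULTILINE,
)
SORRY_RE = re.compile(r"\bsorry\b")
LINE_COMMENT_RE = re.compile(r"--.*$", re.MULTILINE)


def count_declarations(source: str) -> tuple[int, int]:
    """Return (theorem_count, sorry_count) for one Lean source text."""
    code = LINE_COMMENT_RE.sub("", source)  # drop `--` line comments only
    return len(DECL_RE.findall(code)), len(SORRY_RE.findall(code))


def count_tree(root: Path) -> dict[str, int]:
    """Aggregate counts over every .lean file under `root`."""
    totals = {"files": 0, "theorems": 0, "sorries": 0}
    for path in sorted(root.rglob("*.lean")):
        theorems, sorries = count_declarations(path.read_text(encoding="utf-8"))
        totals["files"] += 1
        totals["theorems"] += theorems
        totals["sorries"] += sorries
    return totals


def last_updated_section(sha: str, now: datetime) -> str:
    """Render the `## Last Updated` block in the format shown in step 6."""
    stamp = now.astimezone(timezone.utc).strftime("%Y-%m-%d %H:%M UTC")
    return f"## Last Updated\n- **Date**: {stamp}\n- **Commit**: `{sha}`\n"
```

In a workflow step, the SHA would typically come from `git rev-parse HEAD` and `now` from `datetime.now(timezone.utc)`; passing both in as arguments keeps the rendering function easy to test.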
1076
+
1077
+
---
1078
+
881
1079
### Task Final: Update Lean Squad Status Issue *(ALWAYS DO THIS EVERY RUN)*
882
1080
883
1081
Maintain a single open issue titled `[Lean Squad] Formal Verification Status` as a continuously-updated dashboard for maintainers.