Skip to content

Commit 1f672ae

Browse files
committed
adapt weights for lean-squad
1 parent 770e00d commit 1f672ae

2 files changed

Lines changed: 31 additions & 3 deletions

File tree

.github/aw/actions-lock.json

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -45,6 +45,11 @@
4545
"version": "v0.64.0",
4646
"sha": "51c65948c64ab6752536ead71fba1fc2c20ed0bc"
4747
},
48+
"github/gh-aw-actions/setup@v0.64.1": {
49+
"repo": "github/gh-aw-actions/setup",
50+
"version": "v0.64.1",
51+
"sha": "f2ebb97e47f0704f098aeb6f8f32546ddc8fee85"
52+
},
4853
"github/gh-aw/actions/setup@v0.62.5": {
4954
"repo": "github/gh-aw/actions/setup",
5055
"version": "v0.62.5",

workflows/lean-squad.md

Lines changed: 26 additions & 3 deletions
Original file line numberDiff line numberDiff line change
@@ -37,6 +37,7 @@ network:
3737
- "leanprover-community.github.io"
3838
- "release.leanlang.org"
3939
- "release.lean-lang.org"
40+
- "releases.lean-lang.org"
4041
- ocaml
4142
- "releaseassets.githubusercontent.com"
4243
- "raw.githubusercontent.com" # required: elan installer bootstrap script
@@ -68,6 +69,7 @@ safe-outputs:
6869
labels: [automation, lean-squad]
6970
max: 2
7071
protected-files: fallback-to-issue
72+
draft: false
7173
push-to-pull-request-branch:
7274
target: "*"
7375
title-prefix: "[Lean Squad] "
@@ -95,6 +97,10 @@ steps:
9597
[ -f ".github/workflows/lean-ci.yml" ] && echo 1 > /tmp/gh-aw/has_lean_ci.txt || echo 0 > /tmp/gh-aw/has_lean_ci.txt
9698
[ -f ".github/workflows/aeneas-generate.yml" ] && echo 1 > /tmp/gh-aw/has_aeneas_ci.txt || echo 0 > /tmp/gh-aw/has_aeneas_ci.txt
9799
100+
# Detect CORRESPONDENCE.md and CRITIQUE.md
101+
[ -f "formal-verification/CORRESPONDENCE.md" ] && echo 1 > /tmp/gh-aw/has_correspondence.txt || echo 0 > /tmp/gh-aw/has_correspondence.txt
102+
[ -f "formal-verification/CRITIQUE.md" ] && echo 1 > /tmp/gh-aw/has_critique.txt || echo 0 > /tmp/gh-aw/has_critique.txt
103+
98104
# Detect formal-verification directory
99105
[ -d "formal-verification" ] && echo 1 > /tmp/gh-aw/fv_dir.txt || echo 0 > /tmp/gh-aw/fv_dir.txt
100106
@@ -121,6 +127,8 @@ steps:
121127
rust_count = int(open('/tmp/gh-aw/rust_count.txt').read().strip() or 0)
122128
has_lean_ci = int(open('/tmp/gh-aw/has_lean_ci.txt').read().strip() or 0)
123129
has_aeneas_ci = int(open('/tmp/gh-aw/has_aeneas_ci.txt').read().strip() or 0)
130+
has_correspondence = int(open('/tmp/gh-aw/has_correspondence.txt').read().strip() or 0)
131+
has_critique = int(open('/tmp/gh-aw/has_critique.txt').read().strip() or 0)
124132
fv_dir = int(open('/tmp/gh-aw/fv_dir.txt').read().strip() or 0)
125133
fv_docs = int(open('/tmp/gh-aw/fv_docs.txt').read().strip() or 0)
126134
fv_issues = json.load(open('/tmp/gh-aw/fv_issues.json'))
@@ -157,8 +165,8 @@ steps:
157165
3: (8.0 if not has_lean_specs else 2.0) if has_inf_specs else 0.3,
158166
4: (6.0 if not has_impl else 2.0) if has_lean_specs else 0.2,
159167
5: (6.0 if not has_proofs else 2.0) if has_impl else 0.1,
160-
6: 3.0 if has_impl else 0.5, # correspondence review: useful once impl exists
161-
7: 3.0 if has_proofs else 0.0, # critique: only meaningful once proofs exist
168+
6: (10.0 if not has_correspondence else 3.0) if has_impl else 0.5, # correspondence: critical when impl exists but no doc
169+
7: (10.0 if not has_critique else 3.0) if has_proofs else 0.0, # critique: critical when proofs exist but no doc
162170
8: (3.0 if has_lean_specs else 1.0) if (has_rust and has_research) else 0.0, # aeneas: only for Rust codebases with research done
163171
9: 12.0 if (has_lean_specs and not has_ci) else 2.0, # CI: critical when lean files exist but no CI; regular check otherwise
164172
}
@@ -186,7 +194,8 @@ steps:
186194
print(f'Open FV PRs : {n_prs}')
187195
print(f'Phase flags : research={has_research}, inf_specs={has_inf_specs}, '
188196
f'lean_specs={has_lean_specs}, impl={has_impl}, proofs={has_proofs}, '
189-
f'rust={has_rust}, ci={has_ci}')
197+
f'rust={has_rust}, ci={has_ci}, '
198+
f'correspondence={bool(has_correspondence)}, critique={bool(has_critique)}')
190199
print()
191200
print('Task weights:')
192201
for t, w in weights.items():
@@ -207,6 +216,8 @@ steps:
207216
'has_proofs': has_proofs,
208217
'has_rust': has_rust,
209218
'has_ci': has_ci,
219+
'has_correspondence': bool(has_correspondence),
220+
'has_critique': bool(has_critique),
210221
},
211222
'task_names': task_names,
212223
'weights': {str(k): round(v, 2) for k, v in weights.items()},
@@ -531,6 +542,12 @@ This task is important because the value of any proof depends entirely on how fa
531542
- For each modelled function or type, include a markdown table or enumerated list with: Lean name, Rust name, file + line reference, correspondence level, and a brief justification.
532543
- Include links to the Rust source lines (use relative paths, e.g. `src/raft_log.rs#L42`).
533544
- Summarise any known gaps or mismatches that should be resolved before trusting associated proofs.
545+
- **Always** include a `## Last Updated` section at the top with the current UTC date/time and the HEAD commit SHA:
546+
```
547+
## Last Updated
548+
- **Date**: YYYY-MM-DD HH:MM UTC
549+
- **Commit**: `<SHA>`
550+
```
534551
4. If any **mismatches** are found (Lean model is incorrect relative to the Rust): flag them prominently in CORRESPONDENCE.md under a `## Known Mismatches` heading. Open a GitHub issue for each mismatch that invalidates a proved theorem.
535552
5. Create a PR with the updated CORRESPONDENCE.md.
536553
6. Update memory: note the correspondence status for each modelled target, flag any mismatches found.
@@ -552,6 +569,12 @@ This is a reflective task. The goal is not to prove more things, but to evaluate
552569
3. For unproved / `sorry`-guarded theorems, assess whether they are worth proving or should be revised.
553570
4. Identify the **highest-value gaps**: which properties, if proved, would give the most confidence in the codebase? Are there important invariants or safety properties that have not yet been attempted?
554571
5. Write or update `formal-verification/CRITIQUE.md`:
572+
- **Always** include a `## Last Updated` section at the top with the current UTC date/time and the HEAD commit SHA:
573+
```
574+
## Last Updated
575+
- **Date**: YYYY-MM-DD HH:MM UTC
576+
- **Commit**: `<SHA>`
577+
```
555578
- **Overall assessment**: 2–4 sentences on the current state of formal verification and its utility.
556579
- **Proved theorems** table: theorem name, file, level (low/mid/high), bug-catching potential (low/medium/high), notes.
557580
- **Gaps and recommendations**: what should be proved next and why — prioritised by impact.

0 commit comments

Comments
 (0)