docs: add ChainWeaver + evaluation-artifact integration cookbooks (#95, #96) #112
Open
dgenio wants to merge 1 commit into
Add two more ecosystem integration cookbooks under docs/integrations/, each with a runnable, offline companion wired into `make ci`. Follows the pattern established by the contextweaver/repository-check cookbooks.

#95: ChainWeaver compiled flows as capabilities. A ChainWeaverDriver wraps a compiled flow behind the Driver protocol so the flow runs through the normal policy/audit pipeline and produces a kernel-visible ActionTrace. A flow-step failure is translated into a DriverError that preserves the flow id and the failing step, so the orchestration context survives for the caller and the audit trail. ChainWeaver stays an optional dependency (the driver only needs a run(inputs) method and a flow_id), so the example ships tiny CompiledFlow / FlowExecutionError stand-ins and depends on no ChainWeaver package. New docs/integrations/chainweaver.md, examples/chainweaver_flow.py, and tests/test_chainweaver_flow.py.

#96: policy guardrails for statistical evaluation artifacts. A generic, producer-agnostic assess_artifact() layer lets an agent summarize an evaluation artifact while gating deployment/rollout recommendations on its support diagnostics. The gate is multi-signal (support_health, decision_stable, warnings, recommendation.intent), so a good point estimate with weak support is still blocked, and an unknown/missing support_health normalises to the safest state. Denied actions downgrade to a manual-review recommendation whose reason is recorded in ActionTrace.args. No statistical estimation is added and no producer dependency is taken; artifacts are fixtures. New docs/integrations/evaluation_artifacts.md, examples/evaluation_artifact_policy.py, and tests/test_evaluation_artifact_policy.py (covering ok/caution/high_risk).

Wiring: both examples added to the Makefile `example` target; README and docs/integrations.md link the new pages; CHANGELOG updated.

make ci passes (fmt-check, lint, mypy strict, 580 passed / 1 skipped, examples run).
https://claude.ai/code/session_013hGyqqjAquhtSZXeYPkAuU
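To make the driver pattern from the commit message concrete, here is a minimal sketch. It is not the code in `examples/chainweaver_flow.py`: the `execute` method name, the `DriverError` constructor, the `State`/`Step` aliases, and the two release-notes steps are all assumptions made for illustration. Only the `run(inputs)` / `flow_id` contract, the `CompiledFlow` / `FlowExecutionError` stand-in names, and `driver_id == "chainweaver"` come from the description above.

```python
from dataclasses import dataclass
from typing import Any, Callable

State = dict[str, Any]
Step = Callable[[State], State]


class FlowExecutionError(Exception):
    """Stand-in flow failure; carries the flow id and the failing step."""

    def __init__(self, flow_id: str, step: str, message: str) -> None:
        super().__init__(message)
        self.flow_id = flow_id
        self.step = step


class DriverError(Exception):
    """Hypothetical driver-level error (the real class lives in the kernel)."""

    def __init__(self, message: str, *, flow_id: str, step: str) -> None:
        super().__init__(message)
        self.flow_id = flow_id
        self.step = step


@dataclass
class CompiledFlow:
    """Tiny stand-in: an ordered list of named steps plus a flow_id."""

    flow_id: str
    steps: list[tuple[str, Step]]

    def run(self, inputs: State) -> State:
        state = dict(inputs)
        for name, step in self.steps:
            try:
                state = step(state)
            except Exception as exc:
                # Record which step failed so the context is not lost.
                raise FlowExecutionError(self.flow_id, name, str(exc)) from exc
        return state


class ChainWeaverDriver:
    """Wraps anything that exposes run(inputs) and flow_id."""

    driver_id = "chainweaver"

    def __init__(self, flow: CompiledFlow) -> None:
        self._flow = flow

    def execute(self, inputs: State) -> State:  # method name is an assumption
        try:
            return self._flow.run(inputs)
        except FlowExecutionError as exc:
            # Translate the flow failure, keeping flow id and failing step.
            raise DriverError(
                f"flow {exc.flow_id!r} failed at step {exc.step!r}: {exc}",
                flow_id=exc.flow_id,
                step=exc.step,
            ) from exc


def collect_commits(state: State) -> State:
    return {**state, "commits": list(state.get("commits", []))}


def render_notes(state: State) -> State:
    if not state["commits"]:
        raise ValueError("no commits to summarise")
    return {**state, "notes": "\n".join(f"- {c}" for c in state["commits"])}


flow = CompiledFlow(
    "release-notes", [("collect", collect_commits), ("render", render_notes)]
)
driver = ChainWeaverDriver(flow)
```

Calling `driver.execute({"commits": ["fix: a"]})` returns the state with a `notes` entry, while an empty commit list raises a `DriverError` whose message names both the flow and the `render` step. Because the driver depends only on `run(inputs)` and `flow_id`, a real compiled flow could presumably be swapped in without touching the wrapper.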
Pull request overview
Adds two new ecosystem integration cookbooks (docs/integrations/chainweaver.md, docs/integrations/evaluation_artifacts.md) and their runnable, offline companion examples + tests, following the precedent set by the contextweaver / repository-check cookbooks. No src/ or public-API changes; both examples are wired into make ci via the example target.
Changes:
- New ChainWeaver integration: a `ChainWeaverDriver` wraps a compiled-flow stand-in as a `Driver`, with flow failures translated into `DriverError` that preserves flow id + failing step.
- New evaluation-artifact policy guardrail: a producer-agnostic `assess_artifact()` multi-signal gate downgrades deployment recommendations to manual-review (recording the reason in `ActionTrace.args`).
- `README`, `docs/integrations.md`, `CHANGELOG.md`, and `Makefile` updated to surface and run the two new cookbooks.
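The multi-signal gate in the second item could be sketched roughly as follows. This is an illustration under stated assumptions, not the implementation in `examples/evaluation_artifact_policy.py`: the `Assessment` dataclass, the `normalise_health` helper, the `deploy` / `rollout` intent labels, the `manual_review` action name, and the rule that anything other than `ok` blocks a gated intent are all hypothetical. Only the four signal names, the `ok` / `caution` / `high_risk` states, and the "unknown normalises to safest" behaviour come from the PR.

```python
from dataclasses import dataclass
from typing import Any, Mapping

_KNOWN_HEALTH = ("ok", "caution", "high_risk")
_GATED_INTENTS = frozenset({"deploy", "rollout"})  # assumed intent labels


@dataclass(frozen=True)
class Assessment:
    allowed: bool
    action: str  # the action that should actually be taken
    reason: str  # would be recorded in the audit trail


def normalise_health(value: Any) -> str:
    """Map unknown or missing support_health to the safest state."""
    return value if value in _KNOWN_HEALTH else "high_risk"


def assess_artifact(artifact: Mapping[str, Any]) -> Assessment:
    recommendation = artifact.get("recommendation") or {}
    intent = recommendation.get("intent", "unknown")
    if intent not in _GATED_INTENTS:
        # Assumption: non-deployment intents (e.g. summarize) are not gated.
        return Assessment(True, intent, "intent is not gated")

    problems: list[str] = []
    health = normalise_health(artifact.get("support_health"))
    if health != "ok":
        problems.append(f"support_health={health}")
    if artifact.get("decision_stable") is not True:
        problems.append("decision is not stable")
    warnings = artifact.get("warnings") or []
    if warnings:
        problems.append(f"{len(warnings)} warning(s)")

    if problems:
        # Any single failing signal downgrades the action to manual review.
        return Assessment(False, "manual_review", "; ".join(problems))
    return Assessment(True, intent, "all support signals are healthy")


healthy = {
    "support_health": "ok",
    "decision_stable": True,
    "warnings": [],
    "recommendation": {"intent": "deploy"},
}
unstable = {**healthy, "decision_stable": False}
no_health = {k: v for k, v in healthy.items() if k != "support_health"}
```

With these fixtures, `assess_artifact(healthy)` allows the deploy, `assess_artifact(unstable)` is downgraded even though support is good, and `assess_artifact(no_health)` is downgraded because the missing field is treated as `high_risk`. Collecting every failing signal into the reason string, rather than stopping at the first, keeps the audit record complete.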
Reviewed changes
Copilot reviewed 10 out of 10 changed files in this pull request and generated no comments.
| File | Description |
|---|---|
| examples/chainweaver_flow.py | New runnable example: ChainWeaverDriver + CompiledFlow/FlowExecutionError stand-ins, plus a release-notes flow. |
| examples/evaluation_artifact_policy.py | New runnable example: assess_artifact + kernel wiring for summarize / deploy / manual-review capabilities. |
| tests/test_chainweaver_flow.py | Tests flow ordering, error context, audit trace, and DriverError propagation. |
| tests/test_evaluation_artifact_policy.py | Tests ok / caution / high_risk paths, multi-signal gate, unknown-health normalisation, and audit-trace reason capture. |
| docs/integrations/chainweaver.md | New cookbook describing the ChainWeaver capability pattern. |
| docs/integrations/evaluation_artifacts.md | New cookbook describing the artifact policy/downgrade pattern. |
| docs/integrations.md | Adds links to the two new cookbooks. |
| README.md | Adds the two new cookbooks under the integrations list. |
| Makefile | Runs the two new examples under make example/make ci. |
| CHANGELOG.md | [Unreleased] entries for both cookbooks. |
What changed

Adds two more ecosystem integration cookbooks under `docs/integrations/`, each with a runnable, offline companion wired into `make ci`. Follows the pattern established by the contextweaver / repository-check cookbooks (#92, #93). Closes #95 and #96.

#95: ChainWeaver compiled flows as capabilities
- `examples/chainweaver_flow.py` (new): a `ChainWeaverDriver` wraps a compiled flow behind the `Driver` protocol so the flow runs through the normal `policy → token → invoke → firewall → trace` pipeline and produces a kernel-visible `ActionTrace`. A flow-step failure is translated into a `DriverError` that preserves the flow id and failing step, so the orchestration context survives for the caller and the audit trail.
- ChainWeaver stays an optional dependency: the driver only needs a `run(inputs)` method and a `flow_id`, so the example ships tiny `CompiledFlow` / `FlowExecutionError` stand-ins and imports no ChainWeaver package.
- `docs/integrations/chainweaver.md` (new), `tests/test_chainweaver_flow.py` (new).

#96: policy guardrails for statistical evaluation artifacts
- `examples/evaluation_artifact_policy.py` (new): a generic, producer-agnostic `assess_artifact()` layer lets an agent summarize an evaluation artifact while gating deployment/rollout recommendations on its support diagnostics. The gate is multi-signal (`support_health`, `decision_stable`, `warnings`, `recommendation.intent`), so a good point estimate with weak support is still blocked, and an unknown/missing `support_health` normalises to the safest state.
- Denied actions downgrade to a manual-review recommendation whose reason is recorded in `ActionTrace.args`, so the audit trail explains why an action was downgraded.
- No statistical estimation is added and no producer (`skdr-eval`) dependency is taken; artifacts are fixtures.
- `docs/integrations/evaluation_artifacts.md` (new), `tests/test_evaluation_artifact_policy.py` (new, covers `ok` / `caution` / `high_risk`).

Wiring
- `Makefile`: both examples added to the `example` target so they run under `make ci`.
- `README.md` + `docs/integrations.md`: link the two new pages.
- `CHANGELOG.md`: `[Unreleased]` entries.

Why
#95 and #96 were the recommended coherent group from issue triage: both are additive, dependency-free integration cookbooks sharing the same code area (`docs/integrations/` + `examples/` + the `Driver`/capability pattern) and implementation path. No `src/` changes were needed; integration-specific drivers live in the examples (the `RepositoryCheckDriver` precedent).

How verified
`make ci` passes end to end:

- `ruff format --check`: clean
- `ruff check`: All checks passed!
- `mypy src/` (strict): Success: no issues found in 41 source files
- `pytest`: 580 passed, 1 skipped
- `make example`: all example scripts run, including the two new ones

The new tests cover the audit trace (`driver_id == "chainweaver"`, `result_summary` populated); the flow-failure `DriverError` preserving flow id + failing step (type and message asserted); artifact decisions for `ok` / `caution` / `high_risk`; the multi-signal gate (good support but unstable decision is still denied); unknown `support_health` normalising to safest; and a downgraded action recording its reason in `ActionTrace.args`.

Tradeoffs / risks
- The examples use stand-ins (a tiny `CompiledFlow`; fixture artifacts) rather than the real ChainWeaver / `skdr-eval` packages. This is intentional, so the cookbooks run offline in CI and keep those projects optional. The docs note how to swap in the real producers.
- `assess_artifact`'s field names are a producer-neutral interim contract; if `weaver-spec` publishes a formal `EvaluationArtifact`, the field names should be aligned (noted in the doc).

Scope notes (Mode B)
Scope is limited to the two issues. No `src/` or public-API changes; no new dependencies. Remaining open issues #94 (trace export shape) and #99 (property-based policy tests) are intentionally not included: they form a separate, less tightly coupled group and are better as their own PR.

https://claude.ai/code/session_013hGyqqjAquhtSZXeYPkAuU
Generated by Claude Code