Add SIA: Self-Improving AI with Harness & Weight Updates by imnida · Pull Request #3 · imnida/Python

imnida · 2026-05-29T18:25:03Z

Implements the SIA framework from Hebbar et al. (arXiv:2605.27276).
The loop lets a Feedback-Agent iteratively improve both the scaffold
(harness) and the model weights (LoRA) of a task-specific agent.
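
The control flow described above could be sketched roughly as follows. This is an illustration only, not the code in this PR: `AgentState`, `sia_loop`, `stub_run_task`, and `stub_choose_update` are hypothetical names, the reward formula is invented for the demo, and the real Feedback-Agent is an LLM call rather than a threshold rule.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class AgentState:
    """Hypothetical container for the two things the loop can improve."""
    harness_version: int = 1  # scaffold revision (A1, A2, ...)
    weight_version: int = 0   # number of LoRA updates applied so far


def sia_loop(
    state: AgentState,
    run_task: Callable[[AgentState], List[float]],
    choose_update: Callable[[List[float]], str],
    generations: int,
) -> List[float]:
    """Run the agent, then let the feedback step pick one kind of update."""
    history: List[float] = []
    for _ in range(generations):
        rewards = run_task(state)                # execute scaffold on the dataset
        history.append(sum(rewards) / len(rewards))
        decision = choose_update(rewards)        # stands in for the Feedback-Agent
        if decision == "harness":
            state.harness_version += 1           # rewrite the scaffold
        elif decision == "weights":
            state.weight_version += 1            # apply a LoRA weight update
        else:
            raise ValueError(f"unknown update type: {decision!r}")
    return history


def stub_run_task(state: AgentState) -> List[float]:
    """Toy reward model (assumption): reward grows with both versions."""
    reward = min(1.0, 0.2 * state.harness_version + 0.1 * state.weight_version)
    return [reward] * 4


def stub_choose_update(rewards: List[float]) -> str:
    """Toy policy (assumption): fix the harness first, then train weights."""
    mean_reward = sum(rewards) / len(rewards)
    return "harness" if mean_reward < 0.5 else "weights"
```

Calling `sia_loop(AgentState(), stub_run_task, stub_choose_update, generations=3)` with these stubs applies two harness updates followed by one weight update.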

Key components:

- sia_loop.py — main configurable loop (Meta-Agent → execute → Feedback-Agent)
- meta_agent.py — generates initial scaffold A1 using Claude Sonnet 4.6
- feedback_agent.py — analyses trajectory τg, decides harness vs weight update
- task_agent.py — executes scaffold against dataset, captures trajectory
- trajectory.py — structured execution log (Step, ToolCall, Trajectory)
- verifier.py — deterministic per-instance reward interface
- weight_updates/ — six RL algorithms: PPO+GAE, GRPO, Entropic Advantage Weighting, REINFORCE+KL, Best-of-N BC, DPO
- tasks/ — three benchmark tasks: LawBench (191-class Chinese legal), AlphaEvolve TriMul (CUDA kernel), MAGIC scRNA-seq denoising
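
To make the trajectory and verifier components concrete, here is a minimal sketch of how such a log and reward interface might look. Only the class names `Step`, `ToolCall`, and `Trajectory` come from the description above; every field, the `Verifier` protocol, and `ExactMatchVerifier` are assumptions for illustration and may not match the files in this PR.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List, Protocol


@dataclass
class ToolCall:
    """One tool invocation made during a step (fields are assumed)."""
    name: str
    arguments: Dict[str, Any]
    result: str = ""


@dataclass
class Step:
    """One model turn plus any tool calls it triggered (fields are assumed)."""
    prompt: str
    response: str
    tool_calls: List[ToolCall] = field(default_factory=list)


@dataclass
class Trajectory:
    """Full execution log for a single dataset instance (fields are assumed)."""
    instance_id: str
    steps: List[Step] = field(default_factory=list)
    reward: float = 0.0

    @property
    def final_answer(self) -> str:
        # Treat the last response as the answer; empty if nothing ran.
        return self.steps[-1].response if self.steps else ""


class Verifier(Protocol):
    """Hypothetical interface: same inputs always give the same reward."""

    def score(self, trajectory: Trajectory, expected: str) -> float: ...


class ExactMatchVerifier:
    """Toy verifier for classification-style tasks: 1.0 on exact match."""

    def score(self, trajectory: Trajectory, expected: str) -> float:
        matched = trajectory.final_answer.strip() == expected.strip()
        return 1.0 if matched else 0.0
```

A verifier of this shape involves no randomness and no model call, so rerunning it on a stored trajectory reproduces the same reward.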

https://claude.ai/code/session_01DLqnGSQGNhPHnUzTLgJ6id



imnida wants to merge 1 commit into master from claude/sia-repo-setup-X38L0

2 participants
