feat(skill): add tdd skill

codeaholicguy · codeaholicguy · commit 2d9440a527d1 · 2026-04-03T07:44:05.000+02:00
diff --git a/skills/agent-orchestration/SKILL.md b/skills/agent-orchestration/SKILL.md
@@ -76,7 +76,7 @@ Keep assessment concise — read only what you need. Avoid `--full` unless a sho
 
 | Situation | Action |
 |-----------|--------|
-| Finished task | Apply `$verify` — check the agent's diff and run tests before marking complete |
+| Finished task | Apply the `verify` skill — check the agent's diff and run tests before marking complete |
 | Waiting for approval | Auto-approve if within guardrails, else escalate |
 | Waiting for clarification | Answer from your context, escalate only if you truly lack the answer |
 | Stuck or looping | Send corrective instruction or new approach |
diff --git a/skills/debug/SKILL.md b/skills/debug/SKILL.md
@@ -33,7 +33,7 @@ For each hypothesis, include:
 
 ## Validation
 - Confirm a pre-fix failing signal exists.
-- Confirm post-fix success using `$verify` — including regression verification for bug fixes.
+- Confirm post-fix success using the `verify` skill — including regression verification for bug fixes.
 - Summarize remaining risks and follow-ups.
 
 ## Output Template
diff --git a/skills/dev-lifecycle/SKILL.md b/skills/dev-lifecycle/SKILL.md
@@ -80,4 +80,4 @@ Use `npx ai-devkit@latest memory` CLI in any phase that involves clarification q
 - Read existing `docs/ai/` before changes. Keep diffs minimal.
 - Use mermaid diagrams for architecture visuals.
 - After each phase, summarize output and suggest next phase.
-- Apply `$verify` before completing Phase 4 tasks, Phase 6 checks, Phase 7 coverage claims, and Phase 8 review items. No phase transition without fresh evidence.
+- Apply the `verify` skill before completing Phase 4 tasks, Phase 6 checks, Phase 7 coverage claims, and Phase 8 review items. No phase transition without fresh evidence.
diff --git a/skills/dev-lifecycle/references/execute-plan.md b/skills/dev-lifecycle/references/execute-plan.md
@@ -5,7 +5,7 @@ Work through `docs/ai/planning/feature-{name}.md` one task at a time.
 1. **Gather context** — feature name, planning doc path, supporting docs (design, requirements), current branch/diff.
 2. **Load plan** — parse task lists (checkboxes), build ordered queue by section.
 3. **Present task queue** with status: `todo`, `in-progress`, `done`, `blocked`.
-4. **For each task**: show context, suggest relevant docs, offer to outline sub-steps from design doc, execute, prompt for status + notes. If blocked, record blocker and defer.
+4. **For each task**: show context, suggest relevant docs, offer to outline sub-steps from design doc. Apply the `tdd` skill — write a failing test before production code, then make it pass. If blocked, record blocker and defer.
 5. **Inline tracking** — generate markdown snippet after each status change (lightweight; full reconciliation in Phase 5).
 6. **After each section**, ask if new tasks were discovered.
 7. **Session summary** — completed, in-progress, blocked, skipped, new tasks.
diff --git a/skills/tdd/SKILL.md b/skills/tdd/SKILL.md
@@ -0,0 +1,66 @@
+---
+name: tdd
+description: Test-driven development — write a failing test before writing production code. Use when implementing new functionality, adding behavior, or fixing bugs during active development.
+---
+
+# TDD
+
+Red. Green. Refactor. In that order, every time.
+
+## Hard Rules
+
+- No production code without a failing test first.
+- If production code was written before its test, delete it and start over with a failing test.
+- Never skip the red step. A test that has never failed proves nothing.
+
+## Cycle
+
+For each unit of behavior:
+
+1. **Red** — Write a test for the next behavior. Run it. It must fail. Read the failure message — it should describe the missing behavior.
+2. **Green** — Write the minimum production code to make the test pass. Nothing more. Run the test. Apply the `verify` skill.
+3. **Refactor** — Clean up both test and production code. Run the test again. Still green? Done. Apply the `verify` skill.
+
+Then pick the next behavior and repeat.
+
+## Rules for Each Step
+
+**Red:**
+- Test one behavior, not one function. Name the test after what the system should do, not what the function is called.
+- The test must fail for the right reason — a missing method, wrong return value, unmet condition. Not a syntax error or import failure.
+- If the test passes immediately, it's not testing new behavior. Delete it or pick a different behavior.
+
+**Green:**
+- Write the simplest code that passes. Hardcode if needed — the next test will force generalization.
+- Do not add code "while you're in there." If it's not required by a failing test, it doesn't exist yet.
+- Do not refactor during green. Pass first, clean second.
+
+**Refactor:**
+- Remove duplication between test and production code.
+- Extract only when you see real duplication, not predicted duplication.
+- Tests must still pass after every refactor move. Run them after each change.
+
+## Anti-Patterns
+
+| Pattern | Problem | Fix |
+|---|---|---|
+| Test-after | Code shapes the test instead of the other way around | Delete the code, write the test first |
+| Testing internals | Tests break on refactor, not on behavior change | Test public behavior only |
+| Giant red step | Multiple behaviors in one test | One assertion per behavior |
+| Gold-plating green | Adding code no test requires | Remove untested code |
+| Skipping refactor | Tech debt accumulates immediately | Refactor before the next red |
+| Mock-heavy tests | Tests pass but real code fails | Prefer real dependencies, mock at boundaries only |
+
+## Red Flags and Rationalizations
+
+| Rationalization | Why It's Wrong | Do Instead |
+|---|---|---|
+| "This is too simple to test first" | Simple code still needs a spec | Write the test — it'll be fast |
+| "I'll add the test right after" | You won't, and the code will shape the test | Test first, always |
+| "I need to see the design first" | The test IS the design | Let the test drive the interface |
+| "Mocking is too hard for this" | Difficulty mocking signals tight coupling | Fix the design, then test |
+| "The test would be identical to the implementation" | Then you're testing internals | Test the behavior from the outside |
+
+## Memory Integration
+
+After completing a TDD session, store reusable test patterns via the `memory` skill — e.g., test setup for a specific framework, common assertion patterns, or fixtures that were hard to get right.
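
Reviewer note: the Cycle and Red rules in `skills/tdd/SKILL.md` above can be illustrated with a minimal sketch. It assumes Python's standard `unittest` module; `slugify`, `SlugifyBehavior`, and `run_suite` are hypothetical names invented for this illustration and are not part of the commit. In the red step only the test class would exist, so the run would fail on the missing behavior. The function shown is one possible green-step result.

```python
import unittest


# Green step: the minimum code that satisfies the tests below.
# `slugify` is a hypothetical function used only for illustration.
def slugify(title: str) -> str:
    return "-".join(title.lower().split())


class SlugifyBehavior(unittest.TestCase):
    # Each test is named after what the system should do,
    # not after the function it happens to call.
    def test_words_are_joined_with_hyphens(self):
        self.assertEqual(slugify("Hello World"), "hello-world")

    def test_surrounding_whitespace_is_ignored(self):
        self.assertEqual(slugify("  Hello  "), "hello")


def run_suite() -> unittest.TestResult:
    # Load and run only the behavior tests defined above.
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(SlugifyBehavior)
    return unittest.TextTestRunner(verbosity=0).run(suite)


if __name__ == "__main__":
    run_suite()
```

Each test covers one behavior, which keeps the red step small. A refactor step would rerun `run_suite()` after every change and expect the same passing result.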
diff --git a/skills/tdd/agents/openai.yaml b/skills/tdd/agents/openai.yaml
@@ -0,0 +1,4 @@
+interface:
+  display_name: "TDD"
+  short_description: "Test-driven development — write failing tests before production code"
+  default_prompt: "Use $tdd to implement this behavior test-first — write a failing test, make it pass with minimum code, refactor, repeat."
diff --git a/skills/verify/SKILL.md b/skills/verify/SKILL.md
@@ -64,4 +64,4 @@ If step 4 passes, the test is wrong. Rewrite it.
 
 ## Memory Integration
 
-After a failed verification, store the failure pattern via `$memory` with tags `verify,failure-pattern` so future sessions can avoid the same mistake.
+After a failed verification, store the failure pattern via the `memory` skill with tags `verify,failure-pattern` so future sessions can avoid the same mistake.
