Commit 2d9440a

feat(skill): add tdd skill
1 parent ed0ccdc, commit 2d9440a

7 files changed

Lines changed: 75 additions & 5 deletions

skills/agent-orchestration/SKILL.md

Lines changed: 1 addition & 1 deletion
@@ -76,7 +76,7 @@ Keep assessment concise — read only what you need. Avoid `--full` unless a sho
 
 | Situation | Action |
 |-----------|--------|
-| Finished task | Apply `$verify` — check the agent's diff and run tests before marking complete |
+| Finished task | Apply the `verify` skill — check the agent's diff and run tests before marking complete |
 | Waiting for approval | Auto-approve if within guardrails, else escalate |
 | Waiting for clarification | Answer from your context, escalate only if you truly lack the answer |
 | Stuck or looping | Send corrective instruction or new approach |

skills/debug/SKILL.md

Lines changed: 1 addition & 1 deletion
@@ -33,7 +33,7 @@ For each hypothesis, include:
 
 ## Validation
 - Confirm a pre-fix failing signal exists.
-- Confirm post-fix success using `$verify` — including regression verification for bug fixes.
+- Confirm post-fix success using the `verify` skill — including regression verification for bug fixes.
 - Summarize remaining risks and follow-ups.
 
 ## Output Template

skills/dev-lifecycle/SKILL.md

Lines changed: 1 addition & 1 deletion
@@ -80,4 +80,4 @@ Use `npx ai-devkit@latest memory` CLI in any phase that involves clarification q
 - Read existing `docs/ai/` before changes. Keep diffs minimal.
 - Use mermaid diagrams for architecture visuals.
 - After each phase, summarize output and suggest next phase.
-- Apply `$verify` before completing Phase 4 tasks, Phase 6 checks, Phase 7 coverage claims, and Phase 8 review items. No phase transition without fresh evidence.
+- Apply the `verify` skill before completing Phase 4 tasks, Phase 6 checks, Phase 7 coverage claims, and Phase 8 review items. No phase transition without fresh evidence.

skills/dev-lifecycle/references/execute-plan.md

Lines changed: 1 addition & 1 deletion
@@ -5,7 +5,7 @@ Work through `docs/ai/planning/feature-{name}.md` one task at a time.
 1. **Gather context** — feature name, planning doc path, supporting docs (design, requirements), current branch/diff.
 2. **Load plan** — parse task lists (checkboxes), build ordered queue by section.
 3. **Present task queue** with status: `todo`, `in-progress`, `done`, `blocked`.
-4. **For each task**: show context, suggest relevant docs, offer to outline sub-steps from design doc, execute, prompt for status + notes. If blocked, record blocker and defer.
+4. **For each task**: show context, suggest relevant docs, offer to outline sub-steps from design doc. Apply the `tdd` skill — write a failing test before production code, then make it pass. If blocked, record blocker and defer.
 5. **Inline tracking** — generate markdown snippet after each status change (lightweight; full reconciliation in Phase 5).
 6. **After each section**, ask if new tasks were discovered.
 7. **Session summary** — completed, in-progress, blocked, skipped, new tasks.
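
Illustration only, not part of this commit: step 2 of execute-plan.md (parse checkbox task lists and build a queue ordered by section) could be sketched roughly as below. The assumptions are mine: the planning doc uses GitHub-style `- [ ]` / `- [x]` checkboxes under Markdown headings, a checked box maps to `done` and an unchecked one to `todo`, and `Task` and `load_queue` are hypothetical names. The `in-progress` and `blocked` statuses would be assigned later, during the session.

```python
import re
from dataclasses import dataclass

# Assumed input format: Markdown headings and GitHub-style checkboxes.
HEADING = re.compile(r"^#{1,6}\s+(.*)$")
CHECKBOX = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.*)$")


@dataclass
class Task:
    """Hypothetical queue entry; status is one of the four listed in step 3."""
    section: str
    title: str
    status: str  # "todo", "in-progress", "done", or "blocked"


def load_queue(markdown: str) -> list[Task]:
    """Return tasks in document order, each tagged with its enclosing section."""
    section = ""
    queue: list[Task] = []
    for line in markdown.splitlines():
        heading = HEADING.match(line)
        if heading:
            section = heading.group(1).strip()
            continue
        box = CHECKBOX.match(line)
        if box:
            status = "done" if box.group(1).lower() == "x" else "todo"
            queue.append(Task(section, box.group(2).strip(), status))
    return queue
```

Because the queue preserves document order, tasks stay grouped by section without a separate sort, which matches the "ordered queue by section" wording as long as the plan lists each section's tasks contiguously.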

skills/tdd/SKILL.md

Lines changed: 66 additions & 0 deletions
@@ -0,0 +1,66 @@
+---
+name: tdd
+description: Test-driven development — write a failing test before writing production code. Use when implementing new functionality, adding behavior, or fixing bugs during active development.
+---
+
+# TDD
+
+Red. Green. Refactor. In that order, every time.
+
+## Hard Rules
+
+- No production code without a failing test first.
+- If production code was written before its test, delete it and start over with a failing test.
+- Never skip the red step. A test that has never failed proves nothing.
+
+## Cycle
+
+For each unit of behavior:
+
+1. **Red** — Write a test for the next behavior. Run it. It must fail. Read the failure message — it should describe the missing behavior.
+2. **Green** — Write the minimum production code to make the test pass. Nothing more. Run the test. Apply the `verify` skill.
+3. **Refactor** — Clean up both test and production code. Run the test again. Still green? Done. Apply the `verify` skill.
+
+Then pick the next behavior and repeat.
+
+## Rules for Each Step
+
+**Red:**
+- Test one behavior, not one function. Name the test after what the system should do, not what the function is called.
+- The test must fail for the right reason — a missing method, wrong return value, unmet condition. Not a syntax error or import failure.
+- If the test passes immediately, it's not testing new behavior. Delete it or pick a different behavior.
+
+**Green:**
+- Write the simplest code that passes. Hardcode if needed — the next test will force generalization.
+- Do not add code "while you're in there." If it's not required by a failing test, it doesn't exist yet.
+- Do not refactor during green. Pass first, clean second.
+
+**Refactor:**
+- Remove duplication between test and production code.
+- Extract only when you see real duplication, not predicted duplication.
+- Tests must still pass after every refactor move. Run them after each change.
+
+## Anti-Patterns
+
+| Pattern | Problem | Fix |
+|---|---|---|
+| Test-after | Code shapes the test instead of the other way around | Delete the code, write the test first |
+| Testing internals | Tests break on refactor, not on behavior change | Test public behavior only |
+| Giant red step | Multiple behaviors in one test | One assertion per behavior |
+| Gold-plating green | Adding code no test requires | Remove untested code |
+| Skipping refactor | Tech debt accumulates immediately | Refactor before the next red |
+| Mock-heavy tests | Tests pass but real code fails | Prefer real dependencies, mock at boundaries only |
+
+## Red Flags and Rationalizations
+
+| Rationalization | Why It's Wrong | Do Instead |
+|---|---|---|
+| "This is too simple to test first" | Simple code still needs a spec | Write the test — it'll be fast |
+| "I'll add the test right after" | You won't, and the code will shape the test | Test first, always |
+| "I need to see the design first" | The test IS the design | Let the test drive the interface |
+| "Mocking is too hard for this" | Difficulty mocking signals tight coupling | Fix the design, then test |
+| "The test would be identical to the implementation" | Then you're testing internals | Test the behavior from the outside |
+
+## Memory Integration
+
+After completing a TDD session, store reusable test patterns via the `memory` skill — e.g., test setup for a specific framework, common assertion patterns, or fixtures that were hard to get right.

skills/tdd/agents/openai.yaml

Lines changed: 4 additions & 0 deletions
@@ -0,0 +1,4 @@
+interface:
+  display_name: "TDD"
+  short_description: "Test-driven development — write failing tests before production code"
+  default_prompt: "Use $tdd to implement this behavior test-first — write a failing test, make it pass with minimum code, refactor, repeat."

skills/verify/SKILL.md

Lines changed: 1 addition & 1 deletion
@@ -64,4 +64,4 @@ If step 4 passes, the test is wrong. Rewrite it.
 
 ## Memory Integration
 
-After a failed verification, store the failure pattern via `$memory` with tags `verify,failure-pattern` so future sessions can avoid the same mistake.
+After a failed verification, store the failure pattern via the `memory` skill with tags `verify,failure-pattern` so future sessions can avoid the same mistake.
