mutation-testing
Tests pass ≠ tests are useful. Find the gaps in your test suite.
What it does
mutation-testing checks whether a test suite actually verifies the behavior of its code rather than merely executing it. One subagent per source file generates intelligent mutations (conditional flips, boolean negation, arithmetic swaps, return value changes, boundary shifts) and runs them through the project's native test runner (pytest, jest, vitest, mocha, or go test). Every mutant is classified as Killed, Survived, Error, or Skipped, and a mutation score is computed across the run. When the score falls below 70%, the skill warns and exits with code 2 so CI can fail the build. Every run is persisted to a per-project SQLite ledger and a JSONL trace under ./.logbooks/mutation-testing/, and a human-facing mutation-todos.md is written at the repo root with a rationale for each surviving mutant.
How it does it
Discover sources + detect runner
Scans the repo for source files (or honors explicit globs), identifies the test runner from config files, and runs the baseline suite to confirm that the tests pass before any mutation is applied.
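Runner detection from config files could look roughly like the sketch below. The marker-file table and the `detect_runner` name are assumptions made for illustration; the skill's real detection rules are not documented here.

```python
from pathlib import Path
from typing import Optional

# Hypothetical marker files per runner (an assumption, not the skill's actual table).
RUNNER_MARKERS = {
    "pytest": ["pytest.ini", "conftest.py"],
    "vitest": ["vitest.config.ts", "vitest.config.js"],
    "jest": ["jest.config.js", "jest.config.ts"],
    "mocha": [".mocharc.json", ".mocharc.yml", ".mocharc.js"],
    "go test": ["go.mod"],
}


def detect_runner(repo_root: str) -> Optional[str]:
    """Return the first runner whose marker file exists at the repo root, else None."""
    root = Path(repo_root)
    for runner, markers in RUNNER_MARKERS.items():
        if any((root / name).is_file() for name in markers):
            return runner
    return None
```

Returning `None` rather than guessing lets the caller stop before mutating anything when no runner can be identified.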
Read open gaps from gap_ledger
Queries the gap_ledger for currently-open gaps in this project so subagents can avoid re-discovering them.
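As a minimal sketch, the open-gap lookup might be a single query against the SQLite ledger. The column names (`mutant_key`, `file_path`, `status`) and the `open_gaps` helper are assumptions; only the table name and the `open` status come from this document.

```python
import sqlite3


def open_gaps(conn: sqlite3.Connection) -> list[tuple[str, str]]:
    """Return (file_path, mutant_key) pairs for gaps currently marked open.

    Column names are hypothetical; adjust them to the real ledger schema.
    """
    rows = conn.execute(
        "SELECT file_path, mutant_key FROM gap_ledger "
        "WHERE status = 'open' ORDER BY file_path, mutant_key"
    )
    return rows.fetchall()
```

Grouping the result by `file_path` would give each per-file subagent only the hints relevant to it.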
Generate mutations (one subagent per file)
Dispatches one subagent per source file to generate intelligent mutations (conditional flips, boolean negation, arithmetic swaps, return value changes, boundary shifts). Each subagent receives the known-gaps hint.
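To make the mutation categories concrete, here is a rule-based stand-in written with Python's `ast` module. It is not the subagents' method (they choose mutations themselves); it only illustrates what a boundary shift or an arithmetic swap does to source code. The swap tables and the `generate_mutants` name are assumptions.

```python
import ast

# Illustrative swaps only: boundary shifts (< to <=) and arithmetic swaps (+ to -).
COMPARE_SWAPS = {ast.Lt: ast.LtE, ast.LtE: ast.Lt, ast.Gt: ast.GtE, ast.GtE: ast.Gt}
BINOP_SWAPS = {ast.Add: ast.Sub, ast.Sub: ast.Add, ast.Mult: ast.Div, ast.Div: ast.Mult}


def _targets(tree: ast.AST) -> list[ast.AST]:
    """Collect comparison and binary-operator nodes in a stable traversal order."""
    return [n for n in ast.walk(tree) if isinstance(n, (ast.Compare, ast.BinOp))]


def generate_mutants(source: str) -> list[str]:
    """Return one mutated source string per swappable operator found."""
    mutants = []
    for index in range(len(_targets(ast.parse(source)))):
        # Re-parse each time so every mutant differs from the original by one operator.
        tree = ast.parse(source)
        node = _targets(tree)[index]
        if isinstance(node, ast.Compare):
            # Only the first operator of a chained comparison is mutated.
            swap = COMPARE_SWAPS.get(type(node.ops[0]))
            if swap is None:
                continue
            node.ops[0] = swap()
        else:
            swap = BINOP_SWAPS.get(type(node.op))
            if swap is None:
                continue
            node.op = swap()
        mutants.append(ast.unparse(tree))  # ast.unparse requires Python 3.9+
    return mutants
```

Each returned string is a complete replacement for the file, which keeps the later apply and restore steps simple.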
Apply → run → restore loop
For each mutation: apply to disk, invoke the test runner, restore the file. Records every result.
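The loop body can be sketched as below. The `run_mutant` name, the timeout default, and the choice to label timeouts and launch failures as Error are assumptions; the source only states that each mutation is applied, tested, and restored.

```python
import subprocess
from pathlib import Path


def run_mutant(path: str, mutated_source: str, test_cmd: list[str],
               timeout: float = 60.0) -> str:
    """Apply one mutant, run the tests, restore the file, and return a label."""
    target = Path(path)
    original = target.read_text(encoding="utf-8")
    try:
        target.write_text(mutated_source, encoding="utf-8")
        try:
            result = subprocess.run(test_cmd, capture_output=True, timeout=timeout)
        except (subprocess.TimeoutExpired, OSError):
            # Assumption: a hung or unlaunchable runner is recorded as Error.
            return "Error"
        # A non-zero exit code means at least one test failed, so the mutant was caught.
        return "Survived" if result.returncode == 0 else "Killed"
    finally:
        # Always restore the original file, even if the runner raised.
        target.write_text(original, encoding="utf-8")
```

Putting the restore in `finally` is the important design choice: a crash mid-run must never leave a mutated file on disk.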
Classify + compute mutation score
Bands: ≥90% strong, 70% to <90% acceptable (review survivors), <70% weak. Skipped mutants are excluded from the score.
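A sketch of the scoring arithmetic, under one stated assumption: the score is Killed divided by all non-Skipped mutants, so Error results count in the denominator. The source says only that Skipped is excluded, so the handling of Error is a guess. The function names are hypothetical.

```python
from collections import Counter

WEAK_THRESHOLD = 70.0    # below this the skill warns and exits with code 2
STRONG_THRESHOLD = 90.0


def mutation_score(outcomes: list[str]) -> float:
    """Percentage of non-Skipped mutants that were Killed."""
    counts = Counter(outcomes)
    scored = sum(n for label, n in counts.items() if label != "Skipped")
    if scored == 0:
        raise ValueError("no scorable mutants in this run")
    return 100.0 * counts["Killed"] / scored


def band(score: float) -> str:
    """Map a score to the strong / acceptable / weak bands."""
    if score >= STRONG_THRESHOLD:
        return "strong"
    if score >= WEAK_THRESHOLD:
        return "acceptable"
    return "weak"


def exit_code(score: float) -> int:
    """Return 2 for a weak score so CI can fail, otherwise 0."""
    return 2 if score < WEAK_THRESHOLD else 0
```

For example, 8 Killed, 2 Survived, and 5 Skipped gives 8 / 10 = 80%, which falls in the acceptable band and exits with code 0.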
Patch gap_ledger + write mutation-todos.md
Closes gaps for newly killed mutants, opens gaps for new survivors, and reopens previously fixed gaps whose mutants survived again. Writes the human-facing todo file at the repo root.
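The three ledger transitions can be written as a small state function. This is a sketch: `next_gap_status` is a hypothetical name, and leaving `acknowledged` and `wont_fix` gaps untouched is an assumption the source does not confirm.

```python
from typing import Optional


def next_gap_status(current: Optional[str], outcome: str) -> Optional[str]:
    """Return the gap status after one mutant result.

    `current` is None when the mutant has no ledger entry yet.
    """
    if outcome == "Killed":
        # Close an open gap; anything else stays as it was.
        return "fixed" if current == "open" else current
    if outcome == "Survived":
        # Open a gap for a new survivor, or reopen one that had been fixed.
        if current is None or current == "fixed":
            return "open"
        # Assumption: acknowledged / wont_fix entries are left alone.
        return current
    # Error and Skipped carry no evidence either way.
    return current
```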
Schema overview
| Layer | Table | Mutability | Purpose |
|---|---|---|---|
| History | runs | append-only | One row per execution |
| Outcomes | mutant_results | append-only | Every mutant result per run |
| Identity | mutants | append + counter updates | Canonical mutant (mutant_key from sha256) |
| State | gap_ledger | patchable | Current open / fixed / acknowledged / wont_fix gaps |
The gap_ledger is a patchable present-tense view of which gaps are still open, independent of run history. (mutation-runs.logbook.md:64)
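For orientation, the four tables could be laid out roughly as follows. Only the table names, the outcome labels, the gap statuses, and the use of sha256 for `mutant_key` come from this document. Every column and the exact inputs to the hash are assumptions.

```python
import hashlib
import sqlite3

# Hypothetical columns; the real schema lives in the skill's own logbook.
SCHEMA = """
CREATE TABLE IF NOT EXISTS runs (
    run_id     INTEGER PRIMARY KEY,
    started_at TEXT NOT NULL,
    score      REAL
);
CREATE TABLE IF NOT EXISTS mutants (
    mutant_key TEXT PRIMARY KEY,
    file_path  TEXT NOT NULL,
    times_seen INTEGER NOT NULL DEFAULT 0
);
CREATE TABLE IF NOT EXISTS mutant_results (
    run_id     INTEGER NOT NULL REFERENCES runs(run_id),
    mutant_key TEXT NOT NULL REFERENCES mutants(mutant_key),
    outcome    TEXT NOT NULL
        CHECK (outcome IN ('Killed', 'Survived', 'Error', 'Skipped'))
);
CREATE TABLE IF NOT EXISTS gap_ledger (
    mutant_key TEXT PRIMARY KEY REFERENCES mutants(mutant_key),
    file_path  TEXT NOT NULL,
    status     TEXT NOT NULL
        CHECK (status IN ('open', 'fixed', 'acknowledged', 'wont_fix'))
);
"""


def mutant_key(file_path: str, line: int, original: str, replacement: str) -> str:
    """Derive a stable key from a mutant's identity (the hash inputs are assumed)."""
    payload = "\x1f".join([file_path, str(line), original, replacement])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()
```

A content-derived key is what lets the same mutant be recognized across runs, so the ledger can tell a new survivor from one that has survived before.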