Skill example · logbooks marketplace

mutation-testing

Tests pass ≠ tests are useful. Find gaps your test suite would miss.

Install: /plugin install mutation-testing@logbooks
Run: /mutation-testing

What it does

mutation-testing validates whether a test suite is actually exercising its code, not just executing it. One subagent per source file generates intelligent mutations — conditional flips, boolean negation, arithmetic swaps, return value changes, boundary shifts — and runs them through the project's native test runner (pytest, jest, vitest, mocha, or go test). Every mutant is classified as Killed, Survived, Error, or Skipped, and a mutation score is computed across the run. When the score falls below 70%, the skill warns and exits with code 2 so CI can catch it. Every run persists to a per-project SQLite ledger and JSONL trace under ./.logbooks/mutation-testing/, and a human-facing mutation-todos.md is written at the repo root with rationale for each surviving mutant.

How it does it

Discover sources + detect runner

Scans the repo for source files (or honors explicit globs), identifies the test runner from config files, and runs the baseline suite to confirm that the tests pass before any mutation is applied.

JSONL: { type: "run", run_id, runner: "pytest", source_globs, started_at }
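The runner detection described above can be sketched as a lookup of well-known config files. The source only says the runner is identified "from config files", so the marker table and the `detect_runner` helper below are assumptions for illustration, not the skill's actual logic.

```python
from pathlib import Path

# Hypothetical marker table: which files hint at which runner.
# The filenames are common conventions for each tool, not taken from the skill.
RUNNER_MARKERS = {
    "pytest": ["pytest.ini", "conftest.py"],
    "jest": ["jest.config.js", "jest.config.ts"],
    "vitest": ["vitest.config.js", "vitest.config.ts"],
    "mocha": [".mocharc.json", ".mocharc.yml", ".mocharc.js"],
    "go test": ["go.mod"],
}


def detect_runner(repo_root):
    """Return the first runner whose marker file exists in repo_root, else None."""
    root = Path(repo_root)
    for runner, markers in RUNNER_MARKERS.items():
        if any((root / name).exists() for name in markers):
            return runner
    return None
```

A repo that matches several runners resolves to whichever comes first in the table, so a real implementation would probably need an explicit override.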

Read open gaps from `gap_ledger`

Queries the gap_ledger for currently-open gaps in this project so subagents can avoid re-discovering them.

Read-from-state: gap_ledger informs mutation generation so subagents create novel mutants instead of re-discovering known gaps.

SQL: SELECT mutant_key, summary FROM gap_ledger WHERE status = 'open' AND project = ?
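As a rough illustration, the query above could be run from Python's standard `sqlite3` module as follows. The table layout in the demo (four text columns) and the `open_gaps` helper are assumptions; only the SELECT statement comes from the page.

```python
import sqlite3


def open_gaps(conn, project):
    """Return {mutant_key: summary} for gaps still open in the given project."""
    rows = conn.execute(
        "SELECT mutant_key, summary FROM gap_ledger "
        "WHERE status = 'open' AND project = ?",
        (project,),
    ).fetchall()
    return dict(rows)


# Demo against an in-memory database with an assumed column set.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE gap_ledger (mutant_key TEXT, project TEXT, status TEXT, summary TEXT)"
)
conn.executemany(
    "INSERT INTO gap_ledger VALUES (?, ?, ?, ?)",
    [
        ("a1", "demo", "open", "boundary shift in is_adult survives"),
        ("b2", "demo", "fixed", "negated flag now killed"),
        ("c3", "other", "open", "belongs to another project"),
    ],
)
gaps = open_gaps(conn, "demo")
```

The resulting dictionary is the kind of hint that could be handed to each subagent so it skips gaps that are already recorded.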

Generate mutations (one subagent per file)

Dispatches one subagent per source file to generate intelligent mutations (conditional flips, boolean negation, arithmetic swaps, return value changes, boundary shifts). Each subagent receives the known-gaps hint.

JSONL hint passed in: known_gaps_for_file: [...]
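The skill delegates mutation generation to subagents, so the code below is not how it works internally. It is a mechanical stand-in, using Python's `ast` module, that shows what one operator category (conditional flips) produces for a Python source file. The `FLIPS` table and both names are hypothetical, and `ast.unparse` requires Python 3.9 or later.

```python
import ast

# Each comparison operator maps to its logical negation.
FLIPS = {
    ast.Lt: ast.GtE, ast.GtE: ast.Lt,
    ast.Gt: ast.LtE, ast.LtE: ast.Gt,
    ast.Eq: ast.NotEq, ast.NotEq: ast.Eq,
}


class FlipNthComparison(ast.NodeTransformer):
    """Flip only the target-th flippable comparison operator in the tree."""

    def __init__(self, target):
        self.target = target
        self.seen = 0

    def visit_Compare(self, node):
        self.generic_visit(node)  # handle nested comparisons first
        for i, op in enumerate(node.ops):
            if type(op) in FLIPS:
                if self.seen == self.target:
                    node.ops[i] = FLIPS[type(op)]()
                self.seen += 1
        return node


def conditional_flips(source):
    """Return one mutated source string per flippable comparison."""
    tree = ast.parse(source)
    total = sum(
        1
        for n in ast.walk(tree)
        if isinstance(n, ast.Compare)
        for op in n.ops
        if type(op) in FLIPS
    )
    # Re-parse for every mutant so that each one carries exactly one change.
    return [
        ast.unparse(FlipNthComparison(i).visit(ast.parse(source)))
        for i in range(total)
    ]


mutants = conditional_flips("def is_adult(age):\n    return age >= 18\n")
```

Here `mutants` holds a single variant in which `age >= 18` has become `age < 18`. A test suite that never checks an age below 18 would let that mutant survive.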

Apply → run → restore loop

For each mutation: apply to disk, invoke the test runner, restore the file. Records every result.

SQLite mutant_results row: mutant_key, status ∈ {Killed, Survived, Error, Skipped}, runner_output
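A minimal sketch of the apply, run, restore loop for a single mutant follows. It assumes that a non-zero runner exit code means the mutant was killed and that a timeout or launch failure counts as Error; the page does not spell out either rule, and `run_mutant` is a hypothetical name.

```python
import subprocess
from pathlib import Path


def run_mutant(path, mutated_source, runner_cmd, timeout=60):
    """Apply one mutant to disk, run the tests, and always restore the file.

    Returns (status, runner_output), with status in Killed, Survived, Error.
    """
    target = Path(path)
    original = target.read_text()
    try:
        target.write_text(mutated_source)
        try:
            proc = subprocess.run(
                runner_cmd, capture_output=True, text=True, timeout=timeout
            )
        except (subprocess.TimeoutExpired, OSError) as exc:
            # Assumption: the runner hanging or failing to start is an Error.
            return "Error", str(exc)
        # Assumption: a failing suite (non-zero exit) means the mutant was caught.
        status = "Killed" if proc.returncode != 0 else "Survived"
        return status, proc.stdout + proc.stderr
    finally:
        # Restore the original source even if something above raised.
        target.write_text(original)
```

Putting the restore in `finally` is the important design choice: an interrupted run should not leave mutated source on disk. A call might look like `run_mutant("src/age.py", mutated, ["pytest", "-x", "-q"])`, where the path and flags are only examples.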

Classify + compute mutation score

Bands: ≥90% strong; 70% to <90% acceptable (review survivors); <70% weak. Skipped mutants are excluded from the score.

Formula: score = Killed / (Killed + Survived + Error) × 100
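The formula, the bands, and the exit-code rule from "What it does" can be worked through in a few lines. The function names are hypothetical, and returning `None` when nothing was scored is an assumption about an edge case the page does not cover.

```python
def mutation_score(killed, survived, errors):
    """Score in percent; Skipped mutants are simply not passed in."""
    scored = killed + survived + errors
    if scored == 0:
        return None  # assumption: no score when nothing was scored
    return 100 * killed / scored


def band(score):
    """Map a score to the bands listed above."""
    if score >= 90:
        return "strong"
    if score >= 70:
        return "acceptable"
    return "weak"


def exit_code(score, threshold=70.0):
    """Exit with 2 below the threshold so CI can fail the job."""
    return 2 if score < threshold else 0


score = mutation_score(killed=42, survived=6, errors=2)  # 84.0
```

With 42 killed, 6 survived, and 2 errors, the score is 42 / 50 × 100 = 84%, which lands in the acceptable band and exits with code 0.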

Patch `gap_ledger` + write `mutation-todos.md`

Closes gaps for newly-killed mutants; opens gaps for new survivors; reopens previously-fixed mutants that re-survived. Writes the human-facing todo file at repo root.

Patch-the-state: closing a gap (test added, code fixed) is a first-class operation — not just an absence in the next run.

Patch: gap_ledger.status: open → fixed when killed; insert open for new survivors; reopen previously-fixed survivors
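The three transitions above can be sketched over a plain dictionary standing in for the gap_ledger table. Leaving `acknowledged` and `wont_fix` entries untouched is an assumption, since the page does not say how those states react to a new run, and `patch_gap_ledger` is a hypothetical name.

```python
def patch_gap_ledger(ledger, results):
    """Patch {mutant_key: status} in place from this run's {mutant_key: outcome}."""
    for key, outcome in results.items():
        current = ledger.get(key)
        if outcome == "Killed" and current == "open":
            ledger[key] = "fixed"  # close: a test now catches this mutant
        elif outcome == "Survived" and current in (None, "fixed"):
            ledger[key] = "open"  # new survivor, or a fixed gap that regressed
        # Assumption: acknowledged / wont_fix entries are left as they are.
    return ledger


ledger = {"a": "open", "b": "fixed", "c": "wont_fix"}
results = {"a": "Killed", "b": "Survived", "c": "Survived", "d": "Survived"}
patch_gap_ledger(ledger, results)
```

After the call, `a` is fixed, `b` is reopened, `c` stays `wont_fix`, and `d` is inserted as open.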

Schema overview

| Layer | Table | Mutability | Purpose |
| --- | --- | --- | --- |
| History | `runs` | append-only | One row per execution |
| Outcomes | `mutant_results` | append-only | Every mutant result per run |
| Identity | `mutants` | append + counter updates | Canonical mutant (`mutant_key` from sha256) |
| State | `gap_ledger` | patchable | Current open / fixed / acknowledged / wont_fix gaps |

The gap_ledger is a patchable present-tense view of which gaps are still open, independent of run history. (mutation-runs.logbook.md:64)
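To make the four layers concrete, here is one possible SQLite layout together with a sha256-based `mutant_key`. Only the table names, the status values, and the fact that the key comes from sha256 are taken from the page. The column lists, the `times_seen` counter, and the fields that feed the hash are assumptions.

```python
import hashlib
import sqlite3


def mutant_key(file, line, operator, replacement):
    """Stable identity for a mutant; the hashed fields are an assumed choice."""
    payload = "\x1f".join([file, str(line), operator, replacement])
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


# Hypothetical DDL: one table per layer in the overview above.
SCHEMA = """
CREATE TABLE runs (
    run_id TEXT PRIMARY KEY, runner TEXT, started_at TEXT
);
CREATE TABLE mutants (
    mutant_key TEXT PRIMARY KEY, file TEXT, times_seen INTEGER DEFAULT 0
);
CREATE TABLE mutant_results (
    run_id TEXT REFERENCES runs(run_id),
    mutant_key TEXT REFERENCES mutants(mutant_key),
    status TEXT CHECK (status IN ('Killed', 'Survived', 'Error', 'Skipped')),
    runner_output TEXT
);
CREATE TABLE gap_ledger (
    mutant_key TEXT PRIMARY KEY, project TEXT, summary TEXT,
    status TEXT CHECK (status IN ('open', 'fixed', 'acknowledged', 'wont_fix'))
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
key = mutant_key("src/age.py", 2, "conditional_flip", "age < 18")
```

Because the key is derived from the mutant's content rather than from a run, the same mutant maps to the same `gap_ledger` row across runs, which is what makes closing and reopening a gap possible.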
