deep-code-review
Hotspot-first deep code review for PRs, branches, and diffs.
What it does
deep-code-review is optimized to be right about a few important things rather than produce many comments. A run models behavior changes from the diff, picks risky hotspots, fans out per-hotspot lens subagents, runs a skeptic pass, and surfaces at most 5 outputs as findings or questions. Each run is persisted to a per-PR SQLite ledger and a per-run JSONL trace under ./.logbooks/code-review/ in the reviewed repo, so follow-up reviews can see what was already flagged. Formatting-only changes, trivial renames, and speculative style comments are ignored by default.
How it does it
Gather inputs
Resolve the review target (PR, branch, pasted diff, or current WIP) into a stable PR_REF, then initialize the SQLite ledger and JSONL trace under ./.logbooks/code-review/.
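The initialization step can be pictured roughly as follows. This is a minimal sketch, not the tool's actual code: the function name `init_run`, the file-naming scheme, the `user_version` pragma, and the fields of the first trace record are all assumptions made for illustration. Only the `./.logbooks/code-review/` directory and the ledger-plus-trace pairing come from the description above.

```python
import json
import sqlite3
import time
from pathlib import Path


def init_run(repo_root: str, pr_ref: str, run_id: str) -> tuple[Path, Path]:
    """Create the logbook directory, the per-PR ledger, and the per-run trace."""
    logbook_dir = Path(repo_root) / ".logbooks" / "code-review"
    logbook_dir.mkdir(parents=True, exist_ok=True)

    # Hypothetical naming: one ledger file per PR, one trace file per run.
    safe_ref = pr_ref.replace("/", "_").replace("#", "_")
    ledger_path = logbook_dir / f"{safe_ref}.sqlite"
    trace_path = logbook_dir / f"{safe_ref}.{run_id}.jsonl"

    # Opening the ledger and writing a pragma materializes the SQLite file.
    conn = sqlite3.connect(ledger_path)
    conn.execute("PRAGMA user_version = 1")
    conn.commit()
    conn.close()

    # Append (never overwrite) so a re-run with the same id keeps history.
    record = {"type": "run", "run_id": run_id, "pr_ref": pr_ref, "ts": time.time()}
    with trace_path.open("a", encoding="utf-8") as trace:
        trace.write(json.dumps(record) + "\n")

    return ledger_path, trace_path
```

Keying the ledger on the PR and the trace on the run is what would let a follow-up review open the same ledger while writing a fresh trace.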
Build change map + select hotspots
Read the diff once, classify edits into archetypes (guard-removed, public-contract-changed, persistence-schema-changed…), then pick up to 8 risky changed units worth focused review.
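One way to think about hotspot selection is as a ranked, capped filter over changed units. The sketch below assumes a simple additive risk weight per archetype. The weights, the `ChangedUnit` type, and `select_hotspots` are hypothetical; only the three archetype names and the cap of 8 come from the text.

```python
from dataclasses import dataclass, field

MAX_HOTSPOTS = 8  # cap stated in the text

# Hypothetical risk weights; only the archetype names come from the text.
ARCHETYPE_RISK = {
    "guard-removed": 5,
    "public-contract-changed": 4,
    "persistence-schema-changed": 4,
}


@dataclass
class ChangedUnit:
    path: str
    symbol: str
    archetypes: list[str] = field(default_factory=list)

    @property
    def risk(self) -> int:
        # Archetypes missing from the table get a small default weight.
        return sum(ARCHETYPE_RISK.get(a, 1) for a in self.archetypes)


def select_hotspots(units: list[ChangedUnit]) -> list[ChangedUnit]:
    """Keep units that carry some risk, highest first, capped at MAX_HOTSPOTS."""
    risky = [u for u in units if u.risk > 0]
    return sorted(risky, key=lambda u: u.risk, reverse=True)[:MAX_HOTSPOTS]
```

A unit with no classified archetype scores zero and is dropped, which matches the stated default of ignoring formatting-only changes and trivial renames.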
Pick lenses per hotspot
Choose review lenses for each hotspot from a fixed catalog (correctness, security, concurrency, performance, api-contract, …). Correctness and maintainability are always on.
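Lens selection can be sketched as a fixed always-on set plus lenses triggered by a hotspot's archetypes. The trigger table below is an assumption made for illustration; the text names the lenses and the two always-on ones but does not say which archetype enables which lens.

```python
ALWAYS_ON = ("correctness", "maintainability")  # stated in the text

# Hypothetical mapping from change archetype to extra lenses.
LENS_TRIGGERS = {
    "guard-removed": ("security",),
    "public-contract-changed": ("api-contract",),
    "persistence-schema-changed": ("concurrency", "performance"),
}


def pick_lenses(archetypes: list[str]) -> list[str]:
    """Return the always-on lenses followed by any triggered ones, without repeats."""
    lenses = list(ALWAYS_ON)
    for archetype in archetypes:
        for lens in LENS_TRIGGERS.get(archetype, ()):
            if lens not in lenses:
                lenses.append(lens)
    return lenses
```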
Generate candidate findings (parallel fan-out)
One subagent per (hotspot × lens) combination acquires minimal local context, then emits findings or questions with evidence, severity, and a local confidence score.
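The fan-out can be illustrated with a thread pool that runs one task per (hotspot, lens) pair. This is only a sketch of the shape: `review` is a stand-in that returns a placeholder instead of calling a model, and the `Candidate` fields are an assumed reading of "evidence, severity, and a local confidence score".

```python
from concurrent.futures import ThreadPoolExecutor
from dataclasses import dataclass


@dataclass
class Candidate:
    hotspot: str
    lens: str
    kind: str          # "finding" or "question"
    evidence: str
    severity: str
    confidence: float  # local confidence, assumed to lie in [0, 1]


def review(hotspot: str, lens: str) -> list[Candidate]:
    """Stand-in for a lens subagent; a real one would gather context and call a model."""
    evidence = f"placeholder evidence for {hotspot}"
    return [Candidate(hotspot, lens, "question", evidence, "low", 0.5)]


def fan_out(plan: dict[str, list[str]], max_workers: int = 4) -> list[Candidate]:
    """Run one review task per (hotspot, lens) pair and flatten the results."""
    tasks = [(hotspot, lens) for hotspot, lenses in plan.items() for lens in lenses]
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        # map() preserves task order, so output order is deterministic.
        batches = pool.map(lambda task: review(*task), tasks)
        return [candidate for batch in batches for candidate in batch]
```

Because each task sees only its own hotspot and lens, the tasks are independent and can run in any order.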
Skeptic + dedup + priority score
A skeptic pass challenges each candidate, root-cause fingerprints collapse duplicates within the run, and a multi-factor priority score (0–100) ranks survivors.
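The dedup and scoring half of this step might look like the sketch below; the skeptic pass itself is omitted. Everything specific here is an assumption: the fingerprint inputs (path, symbol, root cause), the three factor names, and their weights are hypothetical, since the text says only that fingerprints are root-cause based and that the score is multi-factor on a 0–100 scale.

```python
import hashlib

# Hypothetical factors and weights; the weights sum to 1.0.
WEIGHTS = {"severity": 0.5, "confidence": 0.3, "blast_radius": 0.2}


def fingerprint(path: str, symbol: str, root_cause: str) -> str:
    """Hash a normalized root-cause description so rewordings in case or spacing collide."""
    normalized = "|".join(part.strip().lower() for part in (path, symbol, root_cause))
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()[:16]


def priority_score(factors: dict[str, float]) -> int:
    """Weighted sum of factors clamped to [0, 1], scaled to the 0-100 range."""
    total = sum(
        weight * min(max(factors.get(name, 0.0), 0.0), 1.0)
        for name, weight in WEIGHTS.items()
    )
    return round(100 * total)


def dedup_and_rank(candidates: list[dict]) -> list[dict]:
    """Keep the highest-scoring candidate per fingerprint, best first."""
    best: dict[str, dict] = {}
    for cand in candidates:
        fp = fingerprint(cand["path"], cand["symbol"], cand["root_cause"])
        score = priority_score(cand["factors"])
        if fp not in best or score > best[fp]["score"]:
            best[fp] = {**cand, "fingerprint": fp, "score": score}
    return sorted(best.values(), key=lambda c: c["score"], reverse=True)
```

Fingerprinting on the root cause rather than on the comment text is what lets two lenses that report the same underlying problem collapse into one survivor.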
Persist + report
The top-ranked survivors (≤5) are surfaced as PR comments or a chat report; everything else stays in the ledger as suppressed judgment for future runs.
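The surfacing rule reduces to a top-k split over the ranked survivors. In the sketch below, `posted` is the state value named in the schema overview, while `suppressed` as a literal state value and the function name are assumptions.

```python
MAX_SURFACED = 5  # cap stated in the text


def assign_surfacing_states(scored: list[tuple[str, int]]) -> dict[str, str]:
    """Map candidate id -> state, posting only the MAX_SURFACED highest scores."""
    ordered = sorted(scored, key=lambda item: item[1], reverse=True)
    states = {}
    for rank, (candidate_id, _score) in enumerate(ordered):
        # "suppressed" is a hypothetical label for candidates kept only in the ledger.
        states[candidate_id] = "posted" if rank < MAX_SURFACED else "suppressed"
    return states
```

Nothing is discarded: every candidate gets a state, so a later run can still see what was judged and held back.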
Schema overview
| Table / Stream | Purpose |
|---|---|
| hotspots (SQLite) | Planning: which units got reviewed, why selected, which lenses applied |
| candidate_findings (SQLite) | Judgment: every candidate the model produced, with evidence + severity + state |
| JSONL trace | Action log: run, hotspot, candidate, decision, output, pr_comment_dedup records |
| (computed view) | Surfacing: only the ≤5 candidates with surfacing_state = posted |
Four concerns kept separate: trace, judgment, planning, presentation. (findings.logbook.md:40-43)
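To make the separation concrete, here is one possible shape for the ledger, written as a sketch. The table names `hotspots` and `candidate_findings` and the `surfacing_state = posted` filter come from the overview above; every column name, the view name `surfaced`, the default state, and the helper `open_ledger` are assumptions.

```python
import sqlite3

SCHEMA = """
CREATE TABLE hotspots (
    id               INTEGER PRIMARY KEY,
    unit             TEXT NOT NULL,   -- which unit got reviewed
    selection_reason TEXT,            -- why it was selected
    lenses           TEXT             -- which lenses were applied
);
CREATE TABLE candidate_findings (
    id              INTEGER PRIMARY KEY,
    hotspot_id      INTEGER REFERENCES hotspots(id),
    evidence        TEXT,
    severity        TEXT,
    priority_score  INTEGER,
    surfacing_state TEXT NOT NULL DEFAULT 'suppressed'
);
-- Presentation is a view over judgment, not a separate copy of the data.
CREATE VIEW surfaced AS
    SELECT * FROM candidate_findings
    WHERE surfacing_state = 'posted';
"""


def open_ledger(path: str = ":memory:") -> sqlite3.Connection:
    """Open a ledger and create the sketch schema (in memory by default)."""
    conn = sqlite3.connect(path)
    conn.executescript(SCHEMA)
    return conn
```

Keeping presentation as a computed view means a candidate can change state without any rows being moved or duplicated.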