What are logbooks?

A logbook is a small structured table a skill writes to during a run and reads from at the start of the next. One table, or a few connected ones — same columns every time, so anything that wrote to it can read from it.

The shape

The simplest logbook is a single table:

id, title, status, owner, next_step, updated_at

Anything more involved adds a few connected tables for annotations, comments, or action history:

research-logbook/
├── findings.csv      # main entries
├── comments.csv      # optional annotations
└── actions.csv       # optional action history

Two things are load-bearing. First, columns don't drift — the same row shape is used by every writer, every run, and every reader. Second, rows can be patched in place, not just appended. A row's status moves from open to fixed; it doesn't become a second row with a "fixed" annotation tacked on.
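A minimal sketch of those two rules in Python, using only the standard library. The column list comes from the single-table example above; the helper names (read_rows, write_rows, patch) and the sample row are hypothetical, chosen for illustration rather than taken from any particular skill.

```python
import csv
import io

# Column contract from the single-table example; every writer and reader shares it.
COLUMNS = ["id", "title", "status", "owner", "next_step", "updated_at"]

def read_rows(text):
    """Parse CSV text into a dict keyed by row id, rejecting drifted columns."""
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != COLUMNS:
        raise ValueError(f"column drift: {reader.fieldnames}")
    return {row["id"]: row for row in reader}

def write_rows(rows):
    """Serialize rows back to CSV text with the fixed header."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=COLUMNS, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows.values())
    return buf.getvalue()

def patch(rows, row_id, **changes):
    """Update an existing row in place; raises KeyError if the id is unknown."""
    unknown = set(changes) - set(COLUMNS)
    if unknown:
        raise ValueError(f"unknown columns: {sorted(unknown)}")
    rows[row_id].update(changes)

# Illustrative data only.
original_text = write_rows({
    "f1": {
        "id": "f1", "title": "Flaky login test", "status": "open",
        "owner": "sam", "next_step": "reproduce", "updated_at": "2024-01-01",
    }
})
rows = read_rows(original_text)
patch(rows, "f1", status="fixed", updated_at="2024-01-02")
patched_text = write_rows(rows)
```

After the patch the table still has one row for f1, now with status fixed. An append-only design would have two rows and leave every reader to work out which one is current.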

What it isn't

The fastest way to describe a logbook is by what it isn't. It sits adjacent to several familiar things, but it's not any of them.

A memory store: Logbooks aren't semantic recall. Rows are filtered by structure, not by similarity to a question.
A tracker: Trackers hold committed work in a workflow. Logbooks hold draft work that hasn't earned a ticket yet.
A log: Logs record events for replay. Logbooks hold present-tense state you can patch, such as a row's status moving from open to fixed.
A document: Docs hold prose for humans to read. Logbooks hold rows for tools to query.
Chat state: Chat memory dies with the session. A logbook outlives every run that wrote to it.

Three properties make it work

A logbook earns its keep when three properties hold. Drop any one and it stops being useful.

01 Stable entry contract

Every row has the same columns regardless of which producer wrote it. A finding from the security lens is structurally identical to one from the performance lens. That's what makes rows mergeable, queryable, and patchable across runs.

In the wild

See deep-code-review's candidate_findings table — every row carries the same fields regardless of which review angle (correctness, security, performance, accessibility) generated it.
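One way to hold a stable entry contract in Python is a single frozen dataclass that every producer must return. This is a sketch under assumptions: the Finding fields mirror the example row used later on this page (id, lens, severity, title, status), not the actual candidate_findings schema, and the two lens functions and the f5 row are hypothetical.

```python
from dataclasses import asdict, dataclass, fields

@dataclass(frozen=True)
class Finding:
    """The one row shape every producer emits (assumed fields)."""
    id: str
    lens: str
    severity: str
    title: str
    status: str = "open"

def security_lens():
    # Hypothetical producer: a security review angle.
    return Finding("f4", "security", "medium", "API key in repo history")

def performance_lens():
    # Hypothetical producer: a performance review angle.
    return Finding("f5", "performance", "high", "N+1 query in list view")

# Rows from different producers merge into one list with no per-producer handling.
rows = [asdict(f) for f in (security_lens(), performance_lens())]
column_names = [f.name for f in fields(Finding)]
```

Because both producers construct the same type, a missing or extra column fails when the row is created rather than when a later run tries to read it.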

02 Tool-queried, not reread

Readers hit the logbook with structured queries, such as status=open AND severity=high or last_seen<7d (seen within the last seven days), instead of rereading transcripts. The reader doesn't need to know what previous runs did. The query answers.

In the wild

See ideation's CLI: python scripts/ideation_db.py query top-by-composite answers "which ideas survived stress-testing" without rereading every operator run.
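The query style described in this section can be sketched with Python's built-in sqlite3 module. This is not ideation's schema or CLI: the findings table, its columns, the sample rows, and the open_high_recent helper are all assumptions made for illustration. The example combines both filters mentioned above and pins "today" to a fixed date so the result is reproducible.

```python
import sqlite3
from datetime import date, timedelta

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE findings ("
    "id TEXT PRIMARY KEY, severity TEXT, status TEXT, last_seen TEXT)"
)
# Illustrative rows; last_seen is an ISO date, so string comparison orders correctly.
conn.executemany(
    "INSERT INTO findings VALUES (?, ?, ?, ?)",
    [
        ("f1", "high", "open", "2024-06-08"),
        ("f2", "high", "fixed", "2024-06-09"),
        ("f3", "low", "open", "2024-06-09"),
        ("f4", "high", "open", "2024-05-01"),
    ],
)

def open_high_recent(conn, today, days=7):
    """Ids of open, high-severity findings seen within the last `days` days."""
    cutoff = (today - timedelta(days=days)).isoformat()
    cur = conn.execute(
        "SELECT id FROM findings "
        "WHERE status = 'open' AND severity = 'high' AND last_seen > ? "
        "ORDER BY id",
        (cutoff,),
    )
    return [row[0] for row in cur]

recent_ids = open_high_recent(conn, today=date(2024, 6, 10))
```

The reader never inspects how f2 got fixed or why f4 went stale; the filter alone decides what comes back.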

03 Outlives the session

Rows persist beyond the run that wrote them. A test added today can close a gap reported last week, because the gap is a row — not a sentence in a transcript no one will reread.

In the wild

See mutation-testing's gap_ledger, where open gaps survive across runs as patchable, present-tense state.
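Persistence across runs can be sketched as two separate connections to the same on-disk SQLite file. The gaps table, its columns, and the helper functions here are hypothetical and do not reflect the real gap_ledger layout; a temporary directory stands in for wherever a skill would keep its logbook.

```python
import os
import sqlite3
import tempfile

def open_ledger(path):
    """Open (or create) the ledger file; each call stands in for a new run."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS gaps ("
        "id TEXT PRIMARY KEY, description TEXT, status TEXT)"
    )
    return conn

def report_gap(path, gap_id, description):
    conn = open_ledger(path)
    with conn:  # commits on success
        conn.execute(
            "INSERT INTO gaps VALUES (?, ?, 'open')", (gap_id, description)
        )
    conn.close()

def close_gap(path, gap_id):
    conn = open_ledger(path)
    with conn:
        conn.execute("UPDATE gaps SET status = 'closed' WHERE id = ?", (gap_id,))
    conn.close()

def open_gaps(path):
    conn = open_ledger(path)
    ids = [r[0] for r in conn.execute(
        "SELECT id FROM gaps WHERE status = 'open' ORDER BY id"
    )]
    conn.close()
    return ids

ledger_path = os.path.join(tempfile.mkdtemp(), "gaps.db")

# Earlier run: two gaps are reported, then the process ends.
report_gap(ledger_path, "g1", "no test covers empty input")
report_gap(ledger_path, "g2", "boundary value unchecked")

# Later run: starts from the file alone, then closes one gap.
before = open_gaps(ledger_path)
close_gap(ledger_path, "g1")
after = open_gaps(ledger_path)
```

Nothing from the earlier run is held in memory; the later run learns about g1 and g2 only from the file.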

The loop

Three things happen every time, in order.

1. Write

A skill produces a row and appends or patches it.

f4, security, medium, "API key in repo history", open

2. Query

A reader — the same skill, another skill, or a human — asks the logbook a structured question.

findings WHERE status=open AND severity IN (high, medium)

3. Act

The result drives a downstream action: a reviewer reads the filtered list, tickets get created, or the next run picks up where this one left off.

That loop is how partial work compounds, instead of dissolving with the session.
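The three steps can be sketched end to end with sqlite3. The table layout follows the example row in step 1, but the column names, the write/query/act function names, and the "TODO" output format are assumptions for illustration. INSERT OR REPLACE is used here as a simple way to get "append or patch" keyed on id; it rewrites the whole row, so a real skill might prefer a targeted UPDATE.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE findings ("
    "id TEXT PRIMARY KEY, lens TEXT, severity TEXT, title TEXT, status TEXT)"
)

def write(conn, row):
    """Step 1: append a new row, or replace the existing row with the same id."""
    conn.execute(
        "INSERT OR REPLACE INTO findings "
        "VALUES (:id, :lens, :severity, :title, :status)",
        row,
    )

def query(conn):
    """Step 2: findings WHERE status=open AND severity IN (high, medium)."""
    cur = conn.execute(
        "SELECT id, title FROM findings "
        "WHERE status = 'open' AND severity IN ('high', 'medium') "
        "ORDER BY id"
    )
    return cur.fetchall()

def act(results):
    """Step 3: turn the filtered rows into a to-do list for a reviewer."""
    return [f"TODO {fid}: {title}" for fid, title in results]

row = {
    "id": "f4", "lens": "security", "severity": "medium",
    "title": "API key in repo history", "status": "open",
}
write(conn, row)
first_pass = act(query(conn))

# A later run patches the same row instead of adding a second one.
write(conn, {**row, "status": "fixed"})
second_pass = act(query(conn))
row_count = conn.execute("SELECT COUNT(*) FROM findings").fetchone()[0]
```

On the first pass the reviewer sees one item for f4. After the patch the same query returns nothing, and the table still holds a single row.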

Where to go next