wiki: Internals — reflect Phase 3 planning pass, (summary, completeness) return, cache layout

2026-04-18 20:08:30 -06:00 · 2026-04-18 20:08:30 -06:00 · d3315b530f
commit d3315b530f
parent 717cde8562
1 changed files with 234 additions and 171 deletions
--- a/Internals.md
+++ b/Internals.md
@@ -7,7 +7,8 @@ agent loop can finish this page and start making non-trivial changes.
 All file:line references are accurate as of the date this page was last
 edited — verify with `git log` or by opening the file before relying on a
-specific line number.
+specific line number. `ai.py` in particular grows each phase and
+references drift.
 ---
@@ -36,9 +37,9 @@ wait for a scan they can't use.
 ## 2. Base scan walkthrough
-Entry: `luminos.py:main()` parses args, then calls `scan(target, ...)` at
+Entry: `luminos.py:main()` parses args, then calls `scan(target, ...)`.
-`luminos.py:45`. `scan()` is a flat sequence — it builds a `report` dict
+`scan()` is a flat sequence — it builds a `report` dict by calling helpers
-by calling helpers from `luminos_lib/`, one per concern, in order:
+from `luminos_lib/`, one per concern, in order:
 ```
 scan(target)
@@ -60,7 +61,7 @@ event-driven, and there is no shared state object — everything passes
 through the local `report` dict.
 The progress lines you see on stderr (`[scan] Counting lines... foo.py`)
-come from `_progress()` in `luminos.py:23`, which returns an `on_file`
+come from `_progress()` in `luminos.py`, which returns an `on_file`
 callback that the helpers call as they work. If you add a new helper that
 walks files, plumb a progress callback through the same way for
 consistency.
@@ -77,46 +78,54 @@ the base scan because it needs `report["survey_signals"]` and
 The AI pipeline is what makes Luminos interesting and is also where
 almost all the complexity lives. Everything below happens inside
-`luminos_lib/ai.py` (1438 lines as of writing), called from
+`luminos_lib/ai.py` (~2060 lines as of writing), called from `luminos.py`
-`luminos.py:157` via `analyze_directory()`.
+via `analyze_directory()`.
 ### 3.1 The orchestrator
-`analyze_directory()` (`ai.py:1408`) is a thin wrapper that checks
+`analyze_directory()` is a thin wrapper that checks dependencies, gets the
-dependencies, gets the API key, builds the Anthropic client, and calls
+API key, builds the Anthropic client, and calls `_run_investigation()`.
-`_run_investigation()`. If anything fails it prints a warning and returns
+If anything fails it prints a warning and returns empty strings — the
-empty strings — the rest of luminos keeps working.
+rest of luminos keeps working.
-`_run_investigation()` (`ai.py:1286`) is the real entry point. Read this
+`_run_investigation()` is the real entry point. Read this function first
-function first if you want to understand the pipeline shape. It does six
+if you want to understand the pipeline shape. It does **seven** things,
-things, in order:
+in order:
-1. **Get/create an investigation ID and cache** (`ai.py:1289–1294`).
+1. **Get/create an investigation ID and cache**. Investigation IDs let
-   Investigation IDs let you resume a previous run; see §5 below.
+   you resume a previous run; see §5 below.
 2. **Discover all directories** under the target via
-   `_discover_directories()` (`ai.py:715`). Returns them sorted
+   `_discover_directories()`. Returns them sorted *leaves-first* — the
-   *leaves-first* — the deepest paths come first. This matters because
+   deepest paths come first. This matters because each dir loop reads
-   each dir loop reads its child directories' summaries from cache, so
+   its child directories' summaries from cache, so children must be
-   children must be investigated before parents.
+   investigated before parents.
-3. **Run the survey pass** (`ai.py:1300–1334`) unless the target is below
+3. **Run the survey pass** unless the target is below
-   the size thresholds at `ai.py:780–781`, in which case
+   `_SURVEY_MIN_FILES` and `_SURVEY_MIN_DIRS`, in which case
    `_default_survey()` returns a synthetic skip.
-4. **Filter out cached directories** (`ai.py:1336–1349`). If you're
+4. **Filter out cached directories**. If you're resuming an
-   resuming an investigation, dirs that already have a `dir` cache entry
+   investigation, dirs that already have a `dir` cache entry are
-   are skipped — only new ones get a fresh dir loop.
+   skipped — only new ones get a fresh dir loop.
-5. **Run a dir loop per remaining directory** (`ai.py:1351–1375`). This
+5. **Run the planning pass** (Phase 3) unless the target is small, in
-   is the heart of the system — see §4.
+   which case `_default_plan()` returns an empty plan. On resumed runs
-6. **Run the synthesis pass** (`ai.py:1382`) reading only `dir` cache
+   the planner is skipped and `plan.json` is loaded from cache instead.
-   entries to produce `(brief, detailed)`.
+   `_apply_plan()` then sorts dirs into priority/default/shallow bands
+   and builds a `{dir_path: max_turns}` map. Leaf-first ordering is
+   preserved *within* each band (see §4.7).
+6. **Run a dir loop per remaining directory**, iterating the
+   plan-ordered list with the per-directory `max_turns` from the plan.
+   `_write_plan_evaluation()` records turn-utilization metrics at the
+   end. This is the heart of the system — see §4.
+7. **Run the synthesis pass** reading only `dir` cache entries to
+   produce `(brief, detailed)`.
-It also reads `flags.jsonl` from disk at the end (`ai.py:1387–1397`) and
+It also reads `flags.jsonl` from disk at the end and returns
-returns `(brief, detailed, flags)` to `analyze_directory()`.
+`(brief, detailed, flags)` to `analyze_directory()`.
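The leaves-first ordering that step 2 relies on can be sketched as below. This is a minimal illustration, not the real `_discover_directories()`: the function name `discover_leaves_first` and the separator-count sort key are assumptions made for the example.

```python
import os
import tempfile


def discover_leaves_first(target):
    """Return every directory under target, deepest paths first (hypothetical)."""
    dirs = [root for root, _subdirs, _files in os.walk(target)]
    # Deeper paths contain more separators; negate the count so they sort
    # first. The path itself breaks ties to keep the order reproducible.
    dirs.sort(key=lambda p: (-p.count(os.sep), p))
    return dirs


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        os.makedirs(os.path.join(tmp, "src", "auth"))
        os.makedirs(os.path.join(tmp, "src", "db"))
        for path in discover_leaves_first(tmp):
            print(os.path.relpath(path, tmp))
```

Any ordering in which every child precedes its parent would satisfy the contract; sorting by depth is just the simplest way to get one.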
 ### 3.2 The survey pass
-`_run_survey()` (`ai.py:1051`) is a short, single-purpose loop. It exists
+`_run_survey()` is a short, single-purpose loop. It exists to give the
-to give the dir loops some shared context about what they're looking at
+dir loops some shared context about what they're looking at *as a whole*
-*as a whole* before any of them start.
+before any of them start.
 Inputs go into the system prompt (`_SURVEY_SYSTEM_PROMPT` in
 `prompts.py`):
@@ -125,9 +134,9 @@ Inputs go into the system prompt (`_SURVEY_SYSTEM_PROMPT` in
 - A 2-level tree preview from `build_tree(target, max_depth=2)`
 - The list of tools the dir loop will have available
-The survey is allowed only `submit_survey` as a tool (`_SURVEY_TOOLS` at
+The survey is allowed only `submit_survey` as a tool (`_SURVEY_TOOLS`).
-`ai.py:356`). It runs at most 3 turns. The agent must call `submit_survey`
+It runs at most 3 turns. The agent must call `submit_survey` exactly
-exactly once with six fields:
+once with six fields:
 ```python
 {
@@ -148,54 +157,82 @@ loops still run but with `survey=None` — the system degrades gracefully.
 Two things happen with the survey output before each dir loop runs:
-**Survey block injection.** `_format_survey_block()` (`ai.py:803`) renders
+**Survey block injection.** `_format_survey_block()` renders the survey
-the survey dict as a labeled text block, which gets `.format()`-injected
+dict as a labeled text block, which gets `.format()`-injected into the
-into the dir loop system prompt as `{survey_context}`. The dir agent sees
+dir loop system prompt as `{survey_context}`. The dir agent sees the
-the description, approach, domain notes, and which tools it should lean on
+description, approach, domain notes, and which tools it should lean on
 or skip.
-**Tool filtering.** `_filter_dir_tools()` (`ai.py:824`) returns a copy of
+**Tool filtering.** `_filter_dir_tools()` returns a copy of `_DIR_TOOLS`
-`_DIR_TOOLS` with anything in `skip_tools` removed — but only if the
+with anything in `skip_tools` removed — but only if the survey's
-survey's confidence is at or above `_SURVEY_CONFIDENCE_THRESHOLD = 0.5`
+confidence is at or above `_SURVEY_CONFIDENCE_THRESHOLD = 0.5`. Below
-(`ai.py:775`). Below that threshold the agent gets the full toolbox. The
+that threshold the agent gets the full toolbox. The control-flow tool
-control-flow tool `submit_report` is in `_PROTECTED_DIR_TOOLS` and can
+`submit_report` is in `_PROTECTED_DIR_TOOLS` and can never be filtered
-never be filtered out — removing it would break loop termination.
+out — removing it would break loop termination.
-This is the only place in the codebase where the agent's available tools
+This is the only place in the codebase where the agent's available
-change at runtime. If you add a new tool, decide whether it should be
+tools change at runtime. If you add a new tool, decide whether it
-protectable.
+should be protectable.
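The filtering rule above can be sketched as follows. This is a hedged illustration rather than the actual `_filter_dir_tools()`: the `confidence` key name, the `name` key on each tool schema, and the function name `filter_dir_tools` are assumptions; the threshold value and the protected set come from the text.

```python
_SURVEY_CONFIDENCE_THRESHOLD = 0.5
_PROTECTED_DIR_TOOLS = {"submit_report"}


def filter_dir_tools(dir_tools, survey):
    """Return a filtered copy of dir_tools based on the survey (hypothetical)."""
    # No survey, or a survey we do not trust enough: full toolbox.
    if not survey:
        return list(dir_tools)
    if survey.get("confidence", 0.0) < _SURVEY_CONFIDENCE_THRESHOLD:
        return list(dir_tools)
    # Protected tools are never removed, even if the survey asks for it.
    skip = set(survey.get("skip_tools", [])) - _PROTECTED_DIR_TOOLS
    return [tool for tool in dir_tools if tool["name"] not in skip]
```

Subtracting the protected set before filtering means a survey that lists `submit_report` in `skip_tools` cannot break loop termination.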
+### 3.4 The planning pass (Phase 3)
+`_run_planning()` is structured like `_run_survey()`: a single-purpose
+loop with one submit tool (`submit_plan`), low max turns. Its job is to
+decide *where* the dir loops should spend turns, not to investigate.
+Inputs:
+- The survey dict (formatted via `_format_survey_block()`)
+- The full tree at depth 6 (deeper than the survey's 2-level preview)
+- The base scan's `survey_signals` (raw file signals)
+- The list of already-cached directories (so the planner doesn't plan
+  around dirs that will be skipped)
+The plan schema, tier allocations (priority 15–20 cap 25, default 10,
+shallow 5, skip 0), fallback behavior, and resume behavior are covered
+in full on the [Planning Pass](PlanningPass) page.
+`_apply_plan()` is a pure helper that translates the plan into an
+ordered list of directories plus a `{dir_path: max_turns}` map. It
+sorts dirs into priority/default/shallow bands but **preserves
+leaf-first ordering within each band** — so children always run before
+their parents, even in "priority-first" mode. See §4.7.
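A minimal sketch of the banding that `_apply_plan()` is described as doing. The plan shape used here (`priority` as a `{dir: requested_turns}` map, `shallow` and `skip` as lists) is an assumption, since the real schema lives on the Planning Pass page; the turn numbers are the tier allocations quoted above, and dropping skipped dirs from the order is also assumed.

```python
_PRIORITY_CAP = 25
_DEFAULT_TURNS = 10
_SHALLOW_TURNS = 5


def apply_plan(leaf_first_dirs, plan):
    """Return (ordered_dirs, max_turns_by_dir) for a hypothetical plan dict."""
    priority = plan.get("priority", {})
    shallow = set(plan.get("shallow", []))
    skip = set(plan.get("skip", []))

    bands = {"priority": [], "default": [], "shallow": []}
    max_turns = {}
    # Appending in input order keeps leaves-first ordering inside each band.
    for d in leaf_first_dirs:
        if d in skip:
            continue
        if d in priority:
            bands["priority"].append(d)
            max_turns[d] = min(priority[d], _PRIORITY_CAP)
        elif d in shallow:
            bands["shallow"].append(d)
            max_turns[d] = _SHALLOW_TURNS
        else:
            bands["default"].append(d)
            max_turns[d] = _DEFAULT_TURNS
    ordered = bands["priority"] + bands["default"] + bands["shallow"]
    return ordered, max_turns
```

Because the function only reads its arguments and builds new containers, it stays pure and is easy to unit-test without a cache or an API client.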
+`_write_plan_evaluation()` writes `plan_evaluation.json` at the end of
+every run with `turns_allocated`, `turns_used`, and `completeness` per
+directory. This is the planning pass's report card.
 ---
 ## 4. The dir loop in depth
-`_run_dir_loop()` is at `ai.py:1017`. It is a hand-written agent loop, and
+`_run_dir_loop()` is a hand-written agent loop, and you should expect
-you should expect to read it several times before it clicks. As of #57 the
+to read it several times before it clicks. As of #57 the loop body
-loop body itself is a thin coordinator (~25 lines): it calls three helpers
+itself is a thin coordinator (~25 lines): it calls three helpers that
-that own the layers it used to inline.
+own the layers it used to inline.
-| Helper | Lines | Job |
+| Helper | Job |
-|---|---|---|
+|---|---|
-| `_build_dir_loop_context()` | `ai.py:855` | Pure setup. Builds dir context, child summaries, survey block, filtered tool list, system prompt, and the seed user message. Returns a `_DirLoopContext` namedtuple. |
+| `_build_dir_loop_context()` | Pure setup. Builds dir context, child summaries, survey block, filtered tool list, system prompt, and the seed user message. Returns a `_DirLoopContext` namedtuple. |
-| `_flush_partial_dir_entry()` | `ai.py:896` | Idempotent partial-cache writer for the budget-exceeded path. Synthesizes a summary from already-cached file entries when possible, or writes a "no files processed" stub. Returns the partial summary string. |
+| `_flush_partial_dir_entry()` | Idempotent partial-cache writer for the budget-exceeded path. Synthesizes a summary from already-cached file entries when possible, or writes a "no files processed" stub. Returns the partial summary string. |
-| `_handle_turn_response()` | `ai.py:957` | Per-turn response processing. Prints text blocks and tool decisions to stderr, appends the assistant message, dispatches tools (or nudges the agent to call submit_report), appends tool_results. Returns `(done, summary)`. |
+| `_handle_turn_response()` | Per-turn response processing. Prints text blocks and tool decisions to stderr, appends the assistant message, dispatches tools (or nudges the agent to call submit_report), appends tool_results. Returns `(done, summary, completeness)`. |
 The shape of the loop body is now:
 ```
 ctx = _build_dir_loop_context(...)
 reset per-loop token counter
-for turn in range(max_turns):                    # max_turns = 14
+for turn in range(max_turns):                   # max_turns from plan (5–25)
     if budget exceeded:
         print warning
         partial = _flush_partial_dir_entry(...)
         if partial: summary = partial
         break
     call API (streaming)
-    done, turn_summary = _handle_turn_response(...)
+    done, turn_summary, turn_completeness = _handle_turn_response(...)
     if turn_summary: summary = turn_summary
+    if turn_completeness: completeness = turn_completeness
     if done: break
-return summary
+return (summary, completeness)
 ```
 A few non-obvious mechanics:
@@ -207,95 +244,104 @@ message (the tool results). Nothing is ever evicted. This means
 `input_tokens` on each successive API call grows roughly linearly — the
 model is re-sent the full conversation every turn. On code targets we see
 ~1.5–2k tokens added per turn. At `max_turns=14` this stays under the
-budget; raising the cap would expose this. See **#51**.
+budget; raising the cap would expose this. With Phase 3's priority-tier
+cap of 25, we're still well under budget in practice but closer to the
+ceiling. See **#51**.
 ### 4.2 Tool dispatch
 Tools are plain functions in `ai.py`. They are wired up via a single
-`register_tool()` call (`ai.py:172`) that lands the schema in one or
+`register_tool()` call that lands the schema in one or more scope lists
-more scope lists (`_DIR_TOOLS`, `_SYNTHESIS_TOOLS`, `_SURVEY_TOOLS`)
+(`_DIR_TOOLS`, `_SYNTHESIS_TOOLS`, `_SURVEY_TOOLS`, `_PLANNING_TOOLS`)
 and the handler in `_TOOL_DISPATCH`. The registrations live below the
-tool implementations in `ai.py` and read top-to-bottom in dir-then-
+tool implementations in `ai.py` and read top-to-bottom in
-synthesis-then-survey order.
+dir-then-synthesis-then-survey-then-planning order.
 `_execute_tool()` looks up the handler by name in `_TOOL_DISPATCH`,
 calls it, logs the turn to `investigation.log`, and returns the result
-string. **Tools intercepted by the loop body — `submit_report` and
+string. **Tools intercepted by the loop body — `submit_report`,
-`submit_survey` — register their schema only and have no handler entry.**
+`submit_survey`, `submit_plan` — register their schema only and have no
-`_handle_turn_response()` recognizes `submit_report` specially: it sets
+handler entry.** `_handle_turn_response()` recognizes `submit_report`
-`done = True` and extracts the summary directly from the tool input.
+specially: it sets `done = True`, extracts the summary from the tool
+input, and also extracts the optional `completeness` field (Phase 3
+instrumentation).
 `think`, `checkpoint`, and `flag` *are* in dispatch, but they have side
-effects that just print to stderr or append to `flags.jsonl` — the return
+effects that just print to stderr or append to `flags.jsonl` — the
-value is always `"ok"`.
+return value is always `"ok"`.
 When you add a tool: write the function, then add one `register_tool()`
 call below it. That's it. There is no second place to forget.
 ### 4.3 Pre-loaded context
-Before the loop starts, `_build_dir_loop_context()` (`ai.py:855`) calls
+Before the loop starts, `_build_dir_loop_context()` calls two helpers
-two helpers that prepare static context for the system prompt:
+that prepare static context for the system prompt:
-- `_build_dir_context()` (`ai.py:741`) — `ls`-style listing of the dir
+- `_build_dir_context()` — `ls`-style listing of the dir with sizes and
-  with sizes and MIME types via `python-magic`. The agent sees this
+  MIME types via `python-magic`. The agent sees this *before* it makes
-  *before* it makes any tool calls, so it doesn't waste a turn just
+  any tool calls, so it doesn't waste a turn just listing the directory.
-  listing the directory.
+- `_get_child_summaries()` — looks up each subdirectory in the cache and
-- `_get_child_summaries()` (`ai.py:763`) — looks up each subdirectory in
+  pulls its `summary` field. This is how leaves-first ordering pays off:
-  the cache and pulls its `summary` field. This is how leaves-first
+  by the time the loop runs on `src/`, all of `src/auth/`, `src/db/`,
-  ordering pays off: by the time the loop runs on `src/`, all of
+  `src/middleware/` already have cached summaries that get injected as
-  `src/auth/`, `src/db/`, `src/middleware/` already have cached summaries
+  `{child_summaries}`.
-  that get injected as `{child_summaries}`.
-If `_get_child_summaries()` returns nothing, the prompt says
+If `_get_child_summaries()` returns nothing, the prompt distinguishes
-`(none — this is a leaf directory)`.
+leaf directories (`"(none: this is a leaf directory)"`) from parents
+whose children haven't been investigated yet (`"(child directories
+exist but have not been investigated yet)"`). See §4.7.
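The three cases for `{child_summaries}` can be sketched like this. The two placeholder strings are the ones quoted above; the function name, its arguments, and the `- dir: summary` line format are assumptions made for the example.

```python
_LEAF_PLACEHOLDER = "(none: this is a leaf directory)"
_PENDING_PLACEHOLDER = "(child directories exist but have not been investigated yet)"


def child_summaries_block(child_dirs, cached_summaries):
    """Render the child-summary text for one directory (hypothetical helper).

    child_dirs: subdirectory paths found on disk.
    cached_summaries: {dir_path: summary} for dirs that have a cache entry.
    """
    if not child_dirs:
        return _LEAF_PLACEHOLDER
    lines = [
        f"- {d}: {cached_summaries[d]}"
        for d in child_dirs
        if d in cached_summaries
    ]
    if not lines:
        # Children exist on disk, but none has been investigated yet.
        return _PENDING_PLACEHOLDER
    return "\n".join(lines)
```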
 ### 4.4 The token tracker and the budget check
-`_TokenTracker` (`ai.py:94`) is a tiny accumulator with one important
+`_TokenTracker` is a tiny accumulator with one important subtlety,
-subtlety, captured in **#44**:
+captured in **#44**:
 > Cumulative input tokens are NOT a meaningful proxy for context size:
 > each turn's `input_tokens` already includes the full message history,
 > so summing across turns double-counts everything. Use `last_input` for
 > budget decisions, totals for billing.
-So `budget_exceeded()` (`ai.py:135`) compares `last_input` (the most
+So `budget_exceeded()` compares `last_input` (the most recent call's
-recent call's input_tokens) to `CONTEXT_BUDGET` (`ai.py:40`), which is
+input_tokens) to `CONTEXT_BUDGET`, which is 70% of 200k. This is
-70% of 200k. This is checked at the *top* of each loop iteration, before
+checked at the *top* of each loop iteration, before the next API call.
-the next API call.
 When the budget check trips, the loop:
 1. Prints a `Context budget reached` warning to stderr
-2. Calls `_flush_partial_dir_entry()` (`ai.py:896`), which writes a
+2. Calls `_flush_partial_dir_entry()`, which writes a partial dir cache
-   partial dir cache entry from any `file` cache entries the agent
+   entry from any `file` cache entries the agent already produced,
-   already produced, marked with `partial: True` and `partial_reason`.
+   marked with `partial: True` and `partial_reason`. The helper is
-   The helper is idempotent — if a dir entry already exists, it returns
+   idempotent — if a dir entry already exists, it returns `""` without
-   `""` without writing.
+   writing.
 3. Breaks out of the loop
-This means a budget breach doesn't lose work — anything the agent already
+This means a budget breach doesn't lose work — anything the agent
-cached survives, and the synthesis pass will see a partial dir summary
+already cached survives, and the synthesis pass will see a partial dir
-rather than nothing.
+summary rather than nothing.
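To make the `last_input` versus cumulative-total distinction concrete, here is a small sketch. The attribute names `last_input`, `total_input`, `total_output` and the method `budget_exceeded()` come from this page; the `record()` method and the strict `>` comparison are assumptions, and the real `_TokenTracker` may differ.

```python
# 70% of a 200k context window, computed with integers to avoid float rounding.
CONTEXT_BUDGET = 200_000 * 70 // 100


class TokenTracker:
    """Toy accumulator: totals are for billing, last_input is for budgeting."""

    def __init__(self):
        self.total_input = 0
        self.total_output = 0
        self.last_input = 0

    def record(self, input_tokens, output_tokens):
        # Hypothetical method name: call once per API response.
        self.total_input += input_tokens
        self.total_output += output_tokens
        # Each call's input_tokens already includes the full history,
        # so the latest value alone reflects the current context size.
        self.last_input = input_tokens

    def budget_exceeded(self):
        return self.last_input > CONTEXT_BUDGET
```

After two calls of 60k and 100k input tokens the cumulative total (160k) is already above the budget, but the context itself is only 100k, so the loop correctly keeps going.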
 ### 4.5 What the loop returns
-`_run_dir_loop()` returns the `summary` string from `submit_report` (or
+`_run_dir_loop()` returns `(summary, completeness)`. The summary is the
-the partial summary returned by `_flush_partial_dir_entry()` if the
+string from `submit_report` (or the partial summary returned by
-budget tripped). `_run_investigation()` then writes a normal `dir` cache
+`_flush_partial_dir_entry()` if the budget tripped). The completeness
-entry from this summary, *unless* the dir loop already wrote one itself
+is the agent's self-rated investigation thoroughness (0.0–1.0) — Phase
-via the partial-flush path, in which case the `cache.has_entry("dir",
+3 instrumentation used in `plan_evaluation.json` — or `None` if the
-dir_path)` check skips it.
+agent didn't report one.
+`_run_investigation()` writes a normal `dir` cache entry from this
+summary (with `completeness` included if non-None), *unless* the dir
+loop already wrote one itself via the partial-flush path, in which case
+the `cache.has_entry("dir", dir_path)` check skips it.
 ### 4.6 The streaming API caller
-`_call_api_streaming()` (`ai.py:686`) is a thin wrapper around
+`_call_api_streaming()` is a thin wrapper around
 `client.messages.stream()`. It currently doesn't print tokens as they
 arrive — it iterates the stream, drops everything, then pulls the final
 message via `stream.get_final_message()`. The streaming API is used for
-real-time tool decision printing, which today happens only after the full
+real-time tool decision printing, which today happens only after the
-response arrives. There's room here to add live progress printing if you
+full response arrives. There's room here to add live progress printing
-want it.
+if you want it.
 ### 4.7 The leaf-first contract (load-bearing for child summaries)
@@ -339,32 +385,34 @@ the full design.
 ## 5. The cache model
 Cache lives at `/tmp/luminos/{investigation_id}/`. Code is
-`luminos_lib/cache.py` (201 lines).
+`luminos_lib/cache.py`.
 ### 5.1 Investigation IDs
 `/tmp/luminos/investigations.json` maps absolute target paths to UUIDs.
-`_get_investigation_id()` (`cache.py:40`) looks up the target and either
+`_get_investigation_id()` looks up the target and either returns the
-returns the existing UUID (resume) or creates a new one (fresh run).
+existing UUID (resume) or creates a new one (fresh run). `--fresh`
-`--fresh` forces a new UUID even if one exists.
+forces a new UUID even if one exists.
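The resume-or-create lookup can be sketched as below. This is not the real `_get_investigation_id()`: the function signature, the `index_path` argument, and the flat `{target: uuid}` JSON layout are assumptions based on the description above.

```python
import json
import os
import tempfile
import uuid


def get_investigation_id(index_path, target, fresh=False):
    """Return the UUID for target, creating one if needed (hypothetical)."""
    target = os.path.abspath(target)
    index = {}
    if os.path.exists(index_path):
        with open(index_path, encoding="utf-8") as fh:
            index = json.load(fh)
    if not fresh and target in index:
        return index[target]  # resume the previous investigation
    new_id = str(uuid.uuid4())  # first run, or a fresh run was requested
    index[target] = new_id
    with open(index_path, "w", encoding="utf-8") as fh:
        json.dump(index, fh, indent=2)
    return new_id


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        index_file = os.path.join(tmp, "investigations.json")
        first = get_investigation_id(index_file, "some/project")
        print(first == get_investigation_id(index_file, "some/project"))
```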
 ### 5.2 What's stored
 Inside `/tmp/luminos/{uuid}/`:
 ```
-meta.json              investigation metadata (model, start time, dir count)
+meta.json                 investigation metadata (model, start time, dir count)
-files/<sha256>.json    one file per cached file entry
+plan.json                 planning pass output — cached for resumed runs
-dirs/<sha256>.json     one file per cached directory entry
+plan_evaluation.json      post-investigation quality report (Phase 3)
-flags.jsonl            JSONL — appended on every flag tool call
+files/<sha256>.json       one file per cached file entry
-investigation.log      JSONL — appended on every tool call
+dirs/<sha256>.json        one file per cached directory entry
+flags.jsonl               JSONL — appended on every flag tool call
+investigation.log         JSONL — appended on every tool call
 ```
 **File and dir cache entries are NOT in JSONL** — they are one
-sha256-keyed JSON file per entry. The sha256 is over the path string
+sha256-keyed JSON file per entry. The sha256 is over the path string.
-(`cache.py:13`). Only `flags.jsonl` and `investigation.log` use JSONL.
+Only `flags.jsonl` and `investigation.log` use JSONL.
-Required fields are validated in `write_entry()` (`cache.py:115`):
+Required fields are validated in `write_entry()`:
 ```python
 file: {path, relative_path, size_bytes, category, summary, cached_at}
@@ -376,31 +424,45 @@ The validator also rejects entries containing `content`, `contents`, or
 contents, summaries only. If you change the schema, update the required
 set in `write_entry()` and update the test in `tests/test_cache.py`.
-### 5.3 Confidence support already exists
+### 5.3 Confidence + completeness support
-`write_entry()` validates an optional `confidence` field
+`write_entry()` validates optional `confidence` and `confidence_reason`
-(`cache.py:129–134`) and a `confidence_reason` string.
+fields (Phase 1) and an optional `completeness` field (Phase 3,
-`low_confidence_entries(threshold=0.7)` (`cache.py:191`) returns all
+0.0–1.0, the dir agent's self-rated thoroughness).
-entries below a threshold, sorted ascending. The agent doesn't currently
+`low_confidence_entries(threshold=0.7)` returns all entries below a
-*set* these fields in any prompt — that lights up when Phase 1 work
+threshold, sorted ascending — future refinement-pass fuel.
-actually wires the prompts.
 ### 5.4 Why one-file-per-entry instead of JSONL
-Random access by path. The dir loop calls `cache.has_entry("dir", path)`
+Random access by path. The dir loop calls
-once per directory during the `_get_child_summaries()` lookup; with
+`cache.has_entry("dir", path)` once per directory during the
-sha256-keyed files this is an `os.path.exists()` call. With JSONL it
+`_get_child_summaries()` lookup; with sha256-keyed files this is an
-would be a full file scan.
+`os.path.exists()` call. With JSONL it would be a full file scan.
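A sketch of why the lookup is a single `os.path.exists()` call. The helper names `entry_path`, `has_entry`, and `put_entry` are stand-ins for this example, not the functions in `cache.py`; the `files/` and `dirs/` layout and the sha256-over-path key follow §5.2.

```python
import hashlib
import json
import os
import tempfile


def entry_path(root, kind, path):
    """Map ("file" | "dir", path) to its sha256-named JSON file."""
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    return os.path.join(root, f"{kind}s", f"{digest}.json")


def has_entry(root, kind, path):
    # No scan needed: the file name is derived from the path itself.
    return os.path.exists(entry_path(root, kind, path))


def put_entry(root, kind, path, data):
    target = entry_path(root, kind, path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "w", encoding="utf-8") as fh:
        json.dump(data, fh)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        put_entry(tmp, "dir", "/repo/src", {"summary": "application code"})
        print(has_entry(tmp, "dir", "/repo/src"), has_entry(tmp, "dir", "/repo/docs"))
```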
+### 5.5 The planning files
+`plan.json` is written by `_run_investigation()` after a successful
+planning pass, so resumed runs can skip the planner. It is loaded
+before the dir loops run when `--fresh` is not set and the file
+exists.
+`plan_evaluation.json` is written by `_write_plan_evaluation()` after
+the dir loops finish. Schema: `plan_order`, `total_dirs_investigated`,
+`total_turns_allocated`, `total_turns_used`, `overall_utilization`,
+`per_directory` (list of `{dir, planned_tier, turns_allocated,
+turns_used, utilization, completeness, confidence}`), `evaluated_at`.
+See [Planning Pass](PlanningPass) for how to use it.
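To show how the `plan_evaluation.json` fields relate to each other, here is a hedged sketch that assembles the schema listed above. The field names are from this page; the function name, the input shape, and the definition of utilization as `turns_used / turns_allocated` are assumptions, not confirmed behavior of `_write_plan_evaluation()`.

```python
from datetime import datetime, timezone


def build_plan_evaluation(plan_order, runs):
    """Assemble the evaluation dict from per-directory results (hypothetical).

    runs: list of dicts with dir, planned_tier, turns_allocated, turns_used,
    completeness, and confidence.
    """
    per_directory = []
    for run in runs:
        allocated = run["turns_allocated"]
        used = run["turns_used"]
        per_directory.append({
            **run,
            # Assumed definition: fraction of the allocated turns consumed.
            "utilization": used / allocated if allocated else 0.0,
        })
    total_allocated = sum(r["turns_allocated"] for r in runs)
    total_used = sum(r["turns_used"] for r in runs)
    return {
        "plan_order": plan_order,
        "total_dirs_investigated": len(runs),
        "total_turns_allocated": total_allocated,
        "total_turns_used": total_used,
        "overall_utilization": total_used / total_allocated if total_allocated else 0.0,
        "per_directory": per_directory,
        "evaluated_at": datetime.now(timezone.utc).isoformat(),
    }
```

A low utilization paired with a high completeness suggests the planner over-allocated turns; the reverse suggests a directory that deserved a higher tier.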
 ---
 ## 6. Prompts
-All prompt templates live in `luminos_lib/prompts.py`. There are three:
+All prompt templates live in `luminos_lib/prompts.py`. There are four:
 | Constant | Used by | What it carries |
 |---|---|---|
 | `_SURVEY_SYSTEM_PROMPT` | `_run_survey` | survey_signals, tree_preview, available_tools |
+| `_PLANNING_SYSTEM_PROMPT` | `_run_planning` | survey, tree, file signals, cached_dirs |
 | `_DIR_SYSTEM_PROMPT` | `_run_dir_loop` | dir_path, dir_rel, max_turns, context, child_summaries, survey_context |
 | `_SYNTHESIS_SYSTEM_PROMPT` | `_run_synthesis` | target, summaries_text |
@@ -424,8 +486,8 @@ that reason.
 ## 7. Synthesis pass
-`_run_synthesis()` (`ai.py:1157`) is structurally similar to the dir loop
+`_run_synthesis()` is structurally similar to the dir loop but much
-but much simpler:
+simpler:
 - Reads all `dir` cache entries via `cache.read_all_entries("dir")`
 - Renders them into a `summaries_text` block (one section per dir)
@@ -434,31 +496,29 @@ but much simpler:
   `detailed` fields
 Tools available: `read_cache`, `list_cache`, `flag`, `submit_report`
-(`_SYNTHESIS_TOOLS` at `ai.py:401`). The synthesis agent can pull
+(`_SYNTHESIS_TOOLS`). The synthesis agent can pull specific cache
-specific cache entries back if it needs to drill in, but it cannot read
+entries back if it needs to drill in, but it cannot read files directly
-files directly — synthesis is meant to operate on summaries, not raw
+— synthesis is meant to operate on summaries, not raw contents.
-contents.
 There's a fallback: if synthesis runs out of turns without calling
-`submit_report`, `_synthesize_from_cache()` (`ai.py:1262`) builds a
+`submit_report`, `_synthesize_from_cache()` builds a mechanical
-mechanical brief+detailed from the cached dir summaries with no AI call.
+brief+detailed from the cached dir summaries with no AI call. This
-This guarantees you always get *something* in the report.
+guarantees you always get *something* in the report.
 ---
 ## 8. Flags
 The `flag` tool is the agent's pressure valve for "I noticed something
-that should not be lost in the summary." `_tool_flag()` (`ai.py:629`)
+that should not be lost in the summary." `_tool_flag()` prints to stderr
-prints to stderr *and* appends a JSONL line to
+*and* appends a JSONL line to `{cache.root}/flags.jsonl`. At the end of
-`{cache.root}/flags.jsonl`. At the end of `_run_investigation()`
+`_run_investigation()`, the orchestrator reads that file back and
-(`ai.py:1387–1397`), the orchestrator reads that file back and includes
+includes the flags in its return tuple. `format_report()` then renders
-the flags in its return tuple. `format_report()` then renders them in a
+them in a dedicated section.
-dedicated section.
 Severity is `info | concern | critical`. The agent is told to flag
-*immediately* on discovery, not save findings for the report — this is in
+*immediately* on discovery, not save findings for the report — this is
-the tool description at `ai.py:312`.
+in the tool description.
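The append-then-read-back flow for flags can be sketched as follows. Only the `flags.jsonl` file name, the three severity levels, the stderr print, and the `"ok"` return value come from this page; the function names, the `path` and `finding` fields, and the stderr line format are assumptions.

```python
import json
import os
import sys
import tempfile

_SEVERITIES = ("info", "concern", "critical")


def tool_flag(cache_root, path, finding, severity="info"):
    """Print the flag and append it as one JSONL line (hypothetical)."""
    if severity not in _SEVERITIES:
        raise ValueError(f"unknown severity: {severity}")
    record = {"path": path, "finding": finding, "severity": severity}
    print(f"[flag:{severity}] {path}: {finding}", file=sys.stderr)
    with open(os.path.join(cache_root, "flags.jsonl"), "a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return "ok"


def read_flags(cache_root):
    """Read every flag back, as the orchestrator does at the end of a run."""
    flags_path = os.path.join(cache_root, "flags.jsonl")
    if not os.path.exists(flags_path):
        return []
    with open(flags_path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        tool_flag(tmp, "config/prod.env", "credentials in plain text", "critical")
        print(len(read_flags(tmp)))
```

Writing on every call, rather than collecting flags in memory, is what lets a flag survive a loop that later hits the context budget.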
 ---
@@ -484,10 +544,11 @@ A cookbook for the kinds of changes that come up most often.
    contains your handler and `_DIR_TOOLS` contains your schema after
    importing `luminos_lib.ai`.
-To make a tool available in synthesis or survey instead of (or in
+To make a tool available in synthesis, survey, or planning instead of
-addition to) dir, pass `scopes=["synthesis"]`, `scopes=["survey"]`, or
+(or in addition to) dir, pass `scopes=["synthesis"]`, `scopes=["survey"]`,
-`scopes=["dir", "synthesis"]`. Tools whose schema differs by scope (like
+`scopes=["planning"]`, or any combination. Tools whose schema differs by
-`submit_report`) get a separate `register_tool()` call per scope.
+scope (like `submit_report`) get a separate `register_tool()` call per
+scope.
 ### 9.2 Add a whole new pass
@@ -522,8 +583,7 @@ unless you `--fresh`.
 ### 9.4 Change cache schema
 1. Update the required-fields set in `cache.py:write_entry()`
-   (`cache.py:119–123`)
+2. Update `_DIR_TOOLS`'s `write_cache` description in `ai.py` so the
-2. Update `_DIR_TOOLS`'s `write_cache` description in `ai.py:228` so the
    agent knows what to write
 3. Update `_DIR_SYSTEM_PROMPT` in `prompts.py` if the agent needs to know
    *how* to populate the new field
@@ -532,26 +592,25 @@ unless you `--fresh`.
 ### 9.5 Add a CLI flag
-Edit `luminos.py:88` (`main()`'s argparse setup) to define the flag, then
+Edit `luminos.py:main()`'s argparse setup to define the flag, then
 plumb it through whatever functions need it. New AI-related flags
-typically need to be added to `analyze_directory()`'s signature
+typically need to be added to `analyze_directory()`'s signature and
-(`ai.py:1408`) and then forwarded to `_run_investigation()`.
+then forwarded to `_run_investigation()`.
 ---
 ## 10. Token budget and cost
-Budget logic is in `_TokenTracker.budget_exceeded()` and is checked at the
+Budget logic is in `_TokenTracker.budget_exceeded()` and is checked at
-top of every dir loop iteration (`ai.py:882`). The budget is **per call**,
+the top of every dir loop iteration. The budget is **per call**, not
-not cumulative — see §4.4. The breach handler flushes a partial dir cache
+cumulative — see §4.4. The breach handler flushes a partial dir cache
 entry so work isn't lost.
-Cost reporting happens once at the end of `_run_investigation()`
+Cost reporting happens once at the end of `_run_investigation()`, using
-(`ai.py:1399`), using the cumulative `total_input` and `total_output`
+the cumulative `total_input` and `total_output` counters multiplied by
-counters multiplied by the constants at `ai.py:43–44`. There is no
+the constants near the top of `ai.py`. There is no running cost display
-running cost display during the investigation today. If you want one,
+during the investigation today. If you want one, `_TokenTracker.summary()`
-`_TokenTracker.summary()` already returns the formatted string — just
+already returns the formatted string — just call it after each dir loop.
-call it after each dir loop.
 ---
@@ -560,16 +619,20 @@ call it after each dir loop.
 | Term | Meaning |
 |---|---|
 | **base scan** | The non-AI phase: tree, classification, languages, recency, disk usage. Stdlib + coreutils only. |
-| **dir loop** | Per-directory agent loop in `_run_dir_loop`. Up to 14 turns. Produces a `dir` cache entry. |
+| **dir loop** | Per-directory agent loop in `_run_dir_loop`. Turns allocated by the planning pass (5 shallow / 10 default / 15–20 priority, capped at 25). Produces a `dir` cache entry. |
 | **survey pass** | Single short loop before any dir loops, producing a shared description and tool guidance. |
+| **planning pass** | Phase 3 pass after the survey, before dir loops. Produces a plan (priority/shallow/skip dirs + turn allocations + order). |
 | **synthesis pass** | Final loop that reads `dir` cache entries and produces `(brief, detailed)`. |
-| **leaves-first** | Discovery order in `_discover_directories`: deepest paths first, so child summaries exist when parents are investigated. |
+| **leaves-first** | Discovery order in `_discover_directories`: deepest paths first, so child summaries exist when parents are investigated. Preserved within planning bands by `_apply_plan`. |
 | **investigation** | One end-to-end run, identified by a UUID, persisted under `/tmp/luminos/{uuid}/`. |
 | **investigation_id** | The UUID. Stored in `/tmp/luminos/investigations.json` keyed by absolute target path. |
 | **cache entry** | A JSON file under `files/` or `dirs/` named by sha256(path). |
 | **flag** | An agent finding written to `flags.jsonl` and reported separately. info / concern / critical. |
 | **partial entry** | A `dir` cache entry written when the budget tripped before `submit_report`. Marked with `partial: True`. |
+| **completeness** | Phase 3 agent self-rated thoroughness (0.0–1.0) from `submit_report`. Feeds `plan_evaluation.json`. |
 | **survey signals** | The histogram + samples computed by `filetypes.survey_signals()` during the base scan, fed to the survey prompt. |
 | **last_input** | The `input_tokens` count from the most recent API call. The basis for budget checks. NOT the cumulative sum. |
 | **CONTEXT_BUDGET** | 70% of 200k = 140k. Trigger threshold for early exit. |
 | **`_PROTECTED_DIR_TOOLS`** | Tools the survey is forbidden from filtering out of the dir loop's toolbox. Currently `{submit_report}`. |
+| **plan.json** | Serialized planning output, cached so resumed runs skip the planner. |
+| **plan_evaluation.json** | Post-investigation quality report comparing plan predictions to outcomes. |