From d3315b530f3957d5a10eae5f79918a68b8be420d Mon Sep 17 00:00:00 2001 From: claude-code Date: Sat, 18 Apr 2026 20:08:30 -0600 Subject: [PATCH] =?UTF-8?q?wiki:=20Internals=20=E2=80=94=20reflect=20Phase?= =?UTF-8?q?=203=20planning=20pass,=20(summary,=20completeness)=20return,?= =?UTF-8?q?=20cache=20layout?= MIME-Version: 1.0 Content-Type: text/plain; charset=UTF-8 Content-Transfer-Encoding: 8bit --- Internals.md | 405 +++++++++++++++++++++++++++++---------------------- 1 file changed, 234 insertions(+), 171 deletions(-) diff --git a/Internals.md b/Internals.md index 4fc12ea..df0606d 100644 --- a/Internals.md +++ b/Internals.md @@ -7,7 +7,8 @@ agent loop can finish this page and start making non-trivial changes. All file:line references are accurate as of the date this page was last edited — verify with `git log` or by opening the file before relying on a -specific line number. +specific line number. `ai.py` in particular grows each phase and +references drift. --- @@ -36,9 +37,9 @@ wait for a scan they can't use. ## 2. Base scan walkthrough -Entry: `luminos.py:main()` parses args, then calls `scan(target, ...)` at -`luminos.py:45`. `scan()` is a flat sequence — it builds a `report` dict -by calling helpers from `luminos_lib/`, one per concern, in order: +Entry: `luminos.py:main()` parses args, then calls `scan(target, ...)`. +`scan()` is a flat sequence — it builds a `report` dict by calling helpers +from `luminos_lib/`, one per concern, in order: ``` scan(target) @@ -60,7 +61,7 @@ event-driven, and there is no shared state object — everything passes through the local `report` dict. The progress lines you see on stderr (`[scan] Counting lines... foo.py`) -come from `_progress()` in `luminos.py:23`, which returns an `on_file` +come from `_progress()` in `luminos.py`, which returns an `on_file` callback that the helpers call as they work. If you add a new helper that walks files, plumb a progress callback through the same way for consistency. 
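The callback plumbing described above can be sketched roughly as follows. This is a minimal illustration, not the code in `luminos.py`: `make_progress` and `count_lines` are hypothetical names, the message format is assumed from the stderr example, and the helper takes an in-memory dict instead of walking real files.

```python
import sys


def make_progress(label, stream=None):
    """Return an on_file callback that reports each file name (hypothetical helper)."""
    def on_file(name):
        # Resolve the stream at call time so a redirected stderr is respected.
        out = stream if stream is not None else sys.stderr
        print(f"[scan] {label}... {name}", file=out)
    return on_file


def count_lines(files, on_file=None):
    """Count lines per file; `files` maps name -> text (stand-in for real file I/O)."""
    counts = {}
    for name, text in files.items():
        if on_file is not None:
            on_file(name)  # progress is optional, so the helper works without it
        counts[name] = len(text.splitlines())
    return counts
```

Keeping the callback optional means a new helper can be unit-tested without any progress output and still report to stderr when called from `scan()`.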
@@ -77,46 +78,54 @@ the base scan because it needs `report["survey_signals"]` and The AI pipeline is what makes Luminos interesting and is also where almost all the complexity lives. Everything below happens inside -`luminos_lib/ai.py` (1438 lines as of writing), called from -`luminos.py:157` via `analyze_directory()`. +`luminos_lib/ai.py` (~2060 lines as of writing), called from `luminos.py` +via `analyze_directory()`. ### 3.1 The orchestrator -`analyze_directory()` (`ai.py:1408`) is a thin wrapper that checks -dependencies, gets the API key, builds the Anthropic client, and calls -`_run_investigation()`. If anything fails it prints a warning and returns -empty strings — the rest of luminos keeps working. +`analyze_directory()` is a thin wrapper that checks dependencies, gets the +API key, builds the Anthropic client, and calls `_run_investigation()`. +If anything fails it prints a warning and returns empty strings — the +rest of luminos keeps working. -`_run_investigation()` (`ai.py:1286`) is the real entry point. Read this -function first if you want to understand the pipeline shape. It does six -things, in order: +`_run_investigation()` is the real entry point. Read this function first +if you want to understand the pipeline shape. It does **seven** things, +in order: -1. **Get/create an investigation ID and cache** (`ai.py:1289–1294`). - Investigation IDs let you resume a previous run; see §5 below. +1. **Get/create an investigation ID and cache**. Investigation IDs let + you resume a previous run; see §5 below. 2. **Discover all directories** under the target via - `_discover_directories()` (`ai.py:715`). Returns them sorted - *leaves-first* — the deepest paths come first. This matters because - each dir loop reads its child directories' summaries from cache, so - children must be investigated before parents. -3. 
**Run the survey pass** (`ai.py:1300–1334`) unless the target is below - the size thresholds at `ai.py:780–781`, in which case + `_discover_directories()`. Returns them sorted *leaves-first* — the + deepest paths come first. This matters because each dir loop reads + its child directories' summaries from cache, so children must be + investigated before parents. +3. **Run the survey pass** unless the target is below + `_SURVEY_MIN_FILES` and `_SURVEY_MIN_DIRS`, in which case `_default_survey()` returns a synthetic skip. -4. **Filter out cached directories** (`ai.py:1336–1349`). If you're - resuming an investigation, dirs that already have a `dir` cache entry - are skipped — only new ones get a fresh dir loop. -5. **Run a dir loop per remaining directory** (`ai.py:1351–1375`). This - is the heart of the system — see §4. -6. **Run the synthesis pass** (`ai.py:1382`) reading only `dir` cache - entries to produce `(brief, detailed)`. +4. **Filter out cached directories**. If you're resuming an + investigation, dirs that already have a `dir` cache entry are + skipped — only new ones get a fresh dir loop. +5. **Run the planning pass** (Phase 3) unless the target is small, in + which case `_default_plan()` returns an empty plan. On resumed runs + the planner is skipped and `plan.json` is loaded from cache instead. + `_apply_plan()` then sorts dirs into priority/default/shallow bands + and builds a `{dir_path: max_turns}` map. Leaf-first ordering is + preserved *within* each band (see §4.7). +6. **Run a dir loop per remaining directory**, iterating the + plan-ordered list with the per-directory `max_turns` from the plan. + `_write_plan_evaluation()` records turn-utilization metrics at the + end. This is the heart of the system — see §4. +7. **Run the synthesis pass** reading only `dir` cache entries to + produce `(brief, detailed)`. -It also reads `flags.jsonl` from disk at the end (`ai.py:1387–1397`) and -returns `(brief, detailed, flags)` to `analyze_directory()`. 
+It also reads `flags.jsonl` from disk at the end and returns +`(brief, detailed, flags)` to `analyze_directory()`. ### 3.2 The survey pass -`_run_survey()` (`ai.py:1051`) is a short, single-purpose loop. It exists -to give the dir loops some shared context about what they're looking at -*as a whole* before any of them start. +`_run_survey()` is a short, single-purpose loop. It exists to give the +dir loops some shared context about what they're looking at *as a whole* +before any of them start. Inputs go into the system prompt (`_SURVEY_SYSTEM_PROMPT` in `prompts.py`): @@ -125,9 +134,9 @@ Inputs go into the system prompt (`_SURVEY_SYSTEM_PROMPT` in - A 2-level tree preview from `build_tree(target, max_depth=2)` - The list of tools the dir loop will have available -The survey is allowed only `submit_survey` as a tool (`_SURVEY_TOOLS` at -`ai.py:356`). It runs at most 3 turns. The agent must call `submit_survey` -exactly once with six fields: +The survey is allowed only `submit_survey` as a tool (`_SURVEY_TOOLS`). +It runs at most 3 turns. The agent must call `submit_survey` exactly +once with six fields: ```python { @@ -148,54 +157,82 @@ loops still run but with `survey=None` — the system degrades gracefully. Two things happen with the survey output before each dir loop runs: -**Survey block injection.** `_format_survey_block()` (`ai.py:803`) renders -the survey dict as a labeled text block, which gets `.format()`-injected -into the dir loop system prompt as `{survey_context}`. The dir agent sees -the description, approach, domain notes, and which tools it should lean on +**Survey block injection.** `_format_survey_block()` renders the survey +dict as a labeled text block, which gets `.format()`-injected into the +dir loop system prompt as `{survey_context}`. The dir agent sees the +description, approach, domain notes, and which tools it should lean on or skip. 
-**Tool filtering.** `_filter_dir_tools()` (`ai.py:824`) returns a copy of -`_DIR_TOOLS` with anything in `skip_tools` removed — but only if the -survey's confidence is at or above `_SURVEY_CONFIDENCE_THRESHOLD = 0.5` -(`ai.py:775`). Below that threshold the agent gets the full toolbox. The -control-flow tool `submit_report` is in `_PROTECTED_DIR_TOOLS` and can -never be filtered out — removing it would break loop termination. +**Tool filtering.** `_filter_dir_tools()` returns a copy of `_DIR_TOOLS` +with anything in `skip_tools` removed — but only if the survey's +confidence is at or above `_SURVEY_CONFIDENCE_THRESHOLD = 0.5`. Below +that threshold the agent gets the full toolbox. The control-flow tool +`submit_report` is in `_PROTECTED_DIR_TOOLS` and can never be filtered +out — removing it would break loop termination. -This is the only place in the codebase where the agent's available tools -change at runtime. If you add a new tool, decide whether it should be -protectable. +This is the only place in the codebase where the agent's available +tools change at runtime. If you add a new tool, decide whether it +should be protectable. + +### 3.4 The planning pass (Phase 3) + +`_run_planning()` is structured like `_run_survey()`: a single-purpose +loop with one submit tool (`submit_plan`), low max turns. Its job is to +decide *where* the dir loops should spend turns, not to investigate. + +Inputs: +- The survey dict (formatted via `_format_survey_block()`) +- The full tree at depth 6 (deeper than the survey's 2-level preview) +- The base scan's `survey_signals` (raw file signals) +- The list of already-cached directories (so the planner doesn't plan + around dirs that will be skipped) + +The plan schema, tier allocations (priority 15–20 cap 25, default 10, +shallow 5, skip 0), fallback behavior, and resume behavior are covered +in full on the [Planning Pass](PlanningPass) page. 
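As a rough illustration of how the tier allocations and band ordering fit together, here is a minimal sketch. It is not the actual `_apply_plan()`: the plan shape (`priority` as a dict of requested turns, `shallow` and `skip` as lists), the name `apply_plan_sketch`, and the choice to drop skipped dirs from the order are all assumptions; the real schema is documented on the Planning Pass page.

```python
PRIORITY_CAP = 25   # priority tier: planner asks for 15-20, capped at 25
DEFAULT_TURNS = 10
SHALLOW_TURNS = 5
BAND_RANK = {"priority": 0, "default": 1, "shallow": 2}


def apply_plan_sketch(leaves_first_dirs, plan):
    """Return (ordered_dirs, {dir: max_turns}) from a hypothetical plan dict."""
    priority = plan.get("priority", {})
    shallow = set(plan.get("shallow", []))
    skip = set(plan.get("skip", []))

    def band(d):
        if d in priority:
            return "priority"
        return "shallow" if d in shallow else "default"

    kept = [d for d in leaves_first_dirs if d not in skip]
    # sorted() is stable, so the incoming leaf-first order survives inside each band.
    ordered = sorted(kept, key=lambda d: BAND_RANK[band(d)])

    turns = {}
    for d in ordered:
        if d in priority:
            turns[d] = min(priority[d], PRIORITY_CAP)
        elif d in shallow:
            turns[d] = SHALLOW_TURNS
        else:
            turns[d] = DEFAULT_TURNS
    return ordered, turns
```

Relying on a stable sort keyed only by band is one simple way to get "priority first, but leaf-first within each band" without a second sort key.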
+ +`_apply_plan()` is a pure helper that translates the plan into an +ordered list of directories plus a `{dir_path: max_turns}` map. It +sorts dirs into priority/default/shallow bands but **preserves +leaf-first ordering within each band** — so children always run before +their parents, even in "priority-first" mode. See §4.7. + +`_write_plan_evaluation()` writes `plan_evaluation.json` at the end of +every run with `turns_allocated`, `turns_used`, and `completeness` per +directory. This is the planning pass's report card. --- ## 4. The dir loop in depth -`_run_dir_loop()` is at `ai.py:1017`. It is a hand-written agent loop, and -you should expect to read it several times before it clicks. As of #57 the -loop body itself is a thin coordinator (~25 lines): it calls three helpers -that own the layers it used to inline. +`_run_dir_loop()` is a hand-written agent loop, and you should expect +to read it several times before it clicks. As of #57 the loop body +itself is a thin coordinator (~25 lines): it calls three helpers that +own the layers it used to inline. -| Helper | Lines | Job | -|---|---|---| -| `_build_dir_loop_context()` | `ai.py:855` | Pure setup. Builds dir context, child summaries, survey block, filtered tool list, system prompt, and the seed user message. Returns a `_DirLoopContext` namedtuple. | -| `_flush_partial_dir_entry()` | `ai.py:896` | Idempotent partial-cache writer for the budget-exceeded path. Synthesizes a summary from already-cached file entries when possible, or writes a "no files processed" stub. Returns the partial summary string. | -| `_handle_turn_response()` | `ai.py:957` | Per-turn response processing. Prints text blocks and tool decisions to stderr, appends the assistant message, dispatches tools (or nudges the agent to call submit_report), appends tool_results. Returns `(done, summary)`. | +| Helper | Job | +|---|---| +| `_build_dir_loop_context()` | Pure setup. 
Builds dir context, child summaries, survey block, filtered tool list, system prompt, and the seed user message. Returns a `_DirLoopContext` namedtuple. | +| `_flush_partial_dir_entry()` | Idempotent partial-cache writer for the budget-exceeded path. Synthesizes a summary from already-cached file entries when possible, or writes a "no files processed" stub. Returns the partial summary string. | +| `_handle_turn_response()` | Per-turn response processing. Prints text blocks and tool decisions to stderr, appends the assistant message, dispatches tools (or nudges the agent to call submit_report), appends tool_results. Returns `(done, summary, completeness)`. | The shape of the loop body is now: ``` ctx = _build_dir_loop_context(...) reset per-loop token counter -for turn in range(max_turns): # max_turns = 14 +for turn in range(max_turns): # max_turns from plan (5–25) if budget exceeded: print warning partial = _flush_partial_dir_entry(...) if partial: summary = partial break call API (streaming) - done, turn_summary = _handle_turn_response(...) + done, turn_summary, turn_completeness = _handle_turn_response(...) if turn_summary: summary = turn_summary + if turn_completeness: completeness = turn_completeness if done: break -return summary +return (summary, completeness) ``` A few non-obvious mechanics: @@ -207,95 +244,104 @@ message (the tool results). Nothing is ever evicted. This means `input_tokens` on each successive API call grows roughly linearly — the model is re-sent the full conversation every turn. On code targets we see ~1.5–2k tokens added per turn. At `max_turns=14` this stays under the -budget; raising the cap would expose this. See **#51**. +budget; raising the cap would expose this. With Phase 3's priority-tier +cap of 25, we're still well under budget in practice but closer to the +ceiling. See **#51**. ### 4.2 Tool dispatch Tools are plain functions in `ai.py`. 
They are wired up via a single -`register_tool()` call (`ai.py:172`) that lands the schema in one or -more scope lists (`_DIR_TOOLS`, `_SYNTHESIS_TOOLS`, `_SURVEY_TOOLS`) +`register_tool()` call that lands the schema in one or more scope lists +(`_DIR_TOOLS`, `_SYNTHESIS_TOOLS`, `_SURVEY_TOOLS`, `_PLANNING_TOOLS`) and the handler in `_TOOL_DISPATCH`. The registrations live below the -tool implementations in `ai.py` and read top-to-bottom in dir-then- -synthesis-then-survey order. +tool implementations in `ai.py` and read top-to-bottom in +dir-then-synthesis-then-survey-then-planning order. `_execute_tool()` looks up the handler by name in `_TOOL_DISPATCH`, calls it, logs the turn to `investigation.log`, and returns the result -string. **Tools intercepted by the loop body — `submit_report` and -`submit_survey` — register their schema only and have no handler entry.** -`_handle_turn_response()` recognizes `submit_report` specially: it sets -`done = True` and extracts the summary directly from the tool input. +string. **Tools intercepted by the loop body — `submit_report`, +`submit_survey`, `submit_plan` — register their schema only and have no +handler entry.** `_handle_turn_response()` recognizes `submit_report` +specially: it sets `done = True`, extracts the summary from the tool +input, and also extracts the optional `completeness` field (Phase 3 +instrumentation). `think`, `checkpoint`, and `flag` *are* in dispatch, but they have side -effects that just print to stderr or append to `flags.jsonl` — the return -value is always `"ok"`. +effects that just print to stderr or append to `flags.jsonl` — the +return value is always `"ok"`. When you add a tool: write the function, then add one `register_tool()` call below it. That's it. There is no second place to forget. 
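The registration pattern described above might look roughly like this. It is a sketch under stated assumptions, not the code in `ai.py`: the `register_tool()` signature, the un-prefixed names (`DIR_TOOLS`, `TOOL_DISPATCH`, `execute_tool`), and the example schemas are hypothetical. The schema dicts use the `name` / `description` / `input_schema` shape that Anthropic tool definitions take.

```python
import sys

# Scope lists hold tool schemas; the dispatch table holds handlers.
DIR_TOOLS, SYNTHESIS_TOOLS, SURVEY_TOOLS, PLANNING_TOOLS = [], [], [], []
SCOPE_LISTS = {
    "dir": DIR_TOOLS,
    "synthesis": SYNTHESIS_TOOLS,
    "survey": SURVEY_TOOLS,
    "planning": PLANNING_TOOLS,
}
TOOL_DISPATCH = {}


def register_tool(name, description, input_schema, handler=None, scopes=("dir",)):
    """Add the schema to each scope list and, if given, the handler to dispatch."""
    schema = {"name": name, "description": description, "input_schema": input_schema}
    for scope in scopes:
        SCOPE_LISTS[scope].append(schema)
    if handler is not None:  # loop-intercepted tools register a schema only
        TOOL_DISPATCH[name] = handler


def execute_tool(name, tool_input):
    """Look up the handler by name and call it with the tool input."""
    handler = TOOL_DISPATCH.get(name)
    if handler is None:
        return f"error: no handler registered for {name}"
    return handler(**tool_input)


def tool_think(thought):
    print(f"[think] {thought}", file=sys.stderr)  # side effect only
    return "ok"


_STRING = {"type": "string"}
register_tool(
    "think", "Record a reasoning note.",
    {"type": "object", "properties": {"thought": _STRING}, "required": ["thought"]},
    handler=tool_think,
)
# submit_report is intercepted by the loop body, so it gets no handler.
register_tool(
    "submit_report", "Finish the directory investigation.",
    {"type": "object", "properties": {"summary": _STRING}, "required": ["summary"]},
)
```

With this shape, a schema-only registration is what keeps a control-flow tool visible to the model while leaving termination logic in the loop.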
### 4.3 Pre-loaded context -Before the loop starts, `_build_dir_loop_context()` (`ai.py:855`) calls -two helpers that prepare static context for the system prompt: +Before the loop starts, `_build_dir_loop_context()` calls two helpers +that prepare static context for the system prompt: -- `_build_dir_context()` (`ai.py:741`) — `ls`-style listing of the dir - with sizes and MIME types via `python-magic`. The agent sees this - *before* it makes any tool calls, so it doesn't waste a turn just - listing the directory. -- `_get_child_summaries()` (`ai.py:763`) — looks up each subdirectory in - the cache and pulls its `summary` field. This is how leaves-first - ordering pays off: by the time the loop runs on `src/`, all of - `src/auth/`, `src/db/`, `src/middleware/` already have cached summaries - that get injected as `{child_summaries}`. +- `_build_dir_context()` — `ls`-style listing of the dir with sizes and + MIME types via `python-magic`. The agent sees this *before* it makes + any tool calls, so it doesn't waste a turn just listing the directory. +- `_get_child_summaries()` — looks up each subdirectory in the cache and + pulls its `summary` field. This is how leaves-first ordering pays off: + by the time the loop runs on `src/`, all of `src/auth/`, `src/db/`, + `src/middleware/` already have cached summaries that get injected as + `{child_summaries}`. -If `_get_child_summaries()` returns nothing, the prompt says -`(none — this is a leaf directory)`. +If `_get_child_summaries()` returns nothing, the prompt distinguishes +leaf directories (`"(none: this is a leaf directory)"`) from parents +whose children haven't been investigated yet (`"(child directories +exist but have not been investigated yet)"`). See §4.7. 
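The interaction between leaves-first ordering and the child-summary lookup can be sketched as follows. This is an illustration only: `leaves_first` and `child_summaries_text` are hypothetical names, the cache is modelled as a plain dict of `{dir_path: summary}`, the alphabetical tie-break is an assumption, and paths are assumed to use `/` separators.

```python
LEAF_PLACEHOLDER = "(none: this is a leaf directory)"
PENDING_PLACEHOLDER = "(child directories exist but have not been investigated yet)"


def leaves_first(dirs):
    """Deepest paths first; ties broken alphabetically (assumed tie-break)."""
    return sorted(dirs, key=lambda d: (-d.count("/"), d))


def child_summaries_text(child_dirs, cached_summaries):
    """Render the child-summary block for one directory's system prompt."""
    if not child_dirs:
        return LEAF_PLACEHOLDER
    lines = [
        f"- {child}: {cached_summaries[child]}"
        for child in child_dirs
        if child in cached_summaries
    ]
    # Children exist but none are cached yet: say so instead of claiming a leaf.
    return "\n".join(lines) if lines else PENDING_PLACEHOLDER
```

For example, once `src/auth` and `src/db` have cached summaries, the block for `src` lists both; if neither has run yet, the agent is told that children exist but are not yet investigated.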
### 4.4 The token tracker and the budget check -`_TokenTracker` (`ai.py:94`) is a tiny accumulator with one important -subtlety, captured in **#44**: +`_TokenTracker` is a tiny accumulator with one important subtlety, +captured in **#44**: > Cumulative input tokens are NOT a meaningful proxy for context size: > each turn's `input_tokens` already includes the full message history, > so summing across turns double-counts everything. Use `last_input` for > budget decisions, totals for billing. -So `budget_exceeded()` (`ai.py:135`) compares `last_input` (the most -recent call's input_tokens) to `CONTEXT_BUDGET` (`ai.py:40`), which is -70% of 200k. This is checked at the *top* of each loop iteration, before -the next API call. +So `budget_exceeded()` compares `last_input` (the most recent call's +input_tokens) to `CONTEXT_BUDGET`, which is 70% of 200k. This is +checked at the *top* of each loop iteration, before the next API call. When the budget check trips, the loop: 1. Prints a `Context budget reached` warning to stderr -2. Calls `_flush_partial_dir_entry()` (`ai.py:896`), which writes a - partial dir cache entry from any `file` cache entries the agent - already produced, marked with `partial: True` and `partial_reason`. - The helper is idempotent — if a dir entry already exists, it returns - `""` without writing. +2. Calls `_flush_partial_dir_entry()`, which writes a partial dir cache + entry from any `file` cache entries the agent already produced, + marked with `partial: True` and `partial_reason`. The helper is + idempotent — if a dir entry already exists, it returns `""` without + writing. 3. Breaks out of the loop -This means a budget breach doesn't lose work — anything the agent already -cached survives, and the synthesis pass will see a partial dir summary -rather than nothing. +This means a budget breach doesn't lose work — anything the agent +already cached survives, and the synthesis pass will see a partial dir +summary rather than nothing. 
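The difference between `last_input` and the cumulative totals is easy to see in a small sketch. This is not the real `_TokenTracker`: the class name `TokenTrackerSketch`, the `record()` method, and the strict `>` comparison are assumptions made for illustration.

```python
CONTEXT_WINDOW = 200_000
CONTEXT_BUDGET = CONTEXT_WINDOW * 70 // 100  # 70% of 200k = 140_000


class TokenTrackerSketch:
    """Accumulates billing totals but budgets on the most recent call only."""

    def __init__(self):
        self.total_input = 0   # cumulative, for billing
        self.total_output = 0  # cumulative, for billing
        self.last_input = 0    # most recent call, for the context budget

    def record(self, input_tokens, output_tokens):
        self.total_input += input_tokens
        self.total_output += output_tokens
        self.last_input = input_tokens

    def budget_exceeded(self):
        # Each call's input_tokens already includes the full history, so
        # comparing the running total here would double-count earlier turns.
        return self.last_input > CONTEXT_BUDGET
```

After two calls of 50k and 95k input tokens, the cumulative total (145k) is already above the budget while `last_input` (95k) is not, so the loop keeps going; only a single call above 140k trips the check.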
### 4.5 What the loop returns -`_run_dir_loop()` returns the `summary` string from `submit_report` (or -the partial summary returned by `_flush_partial_dir_entry()` if the -budget tripped). `_run_investigation()` then writes a normal `dir` cache -entry from this summary, *unless* the dir loop already wrote one itself -via the partial-flush path, in which case the `cache.has_entry("dir", -dir_path)` check skips it. +`_run_dir_loop()` returns `(summary, completeness)`. The summary is the +string from `submit_report` (or the partial summary returned by +`_flush_partial_dir_entry()` if the budget tripped). The completeness +is the agent's self-rated investigation thoroughness (0.0–1.0) — Phase +3 instrumentation used in `plan_evaluation.json` — or `None` if the +agent didn't report one. + +`_run_investigation()` writes a normal `dir` cache entry from this +summary (with `completeness` included if non-None), *unless* the dir +loop already wrote one itself via the partial-flush path, in which case +the `cache.has_entry("dir", dir_path)` check skips it. ### 4.6 The streaming API caller -`_call_api_streaming()` (`ai.py:686`) is a thin wrapper around +`_call_api_streaming()` is a thin wrapper around `client.messages.stream()`. It currently doesn't print tokens as they arrive — it iterates the stream, drops everything, then pulls the final message via `stream.get_final_message()`. The streaming API is used for -real-time tool decision printing, which today happens only after the full -response arrives. There's room here to add live progress printing if you -want it. +real-time tool decision printing, which today happens only after the +full response arrives. There's room here to add live progress printing +if you want it. ### 4.7 The leaf-first contract (load-bearing for child summaries) @@ -339,32 +385,34 @@ the full design. ## 5. The cache model Cache lives at `/tmp/luminos/{investigation_id}/`. Code is -`luminos_lib/cache.py` (201 lines). +`luminos_lib/cache.py`. 
### 5.1 Investigation IDs `/tmp/luminos/investigations.json` maps absolute target paths to UUIDs. -`_get_investigation_id()` (`cache.py:40`) looks up the target and either -returns the existing UUID (resume) or creates a new one (fresh run). -`--fresh` forces a new UUID even if one exists. +`_get_investigation_id()` looks up the target and either returns the +existing UUID (resume) or creates a new one (fresh run). `--fresh` +forces a new UUID even if one exists. ### 5.2 What's stored Inside `/tmp/luminos/{uuid}/`: ``` -meta.json investigation metadata (model, start time, dir count) -files/.json one file per cached file entry -dirs/.json one file per cached directory entry -flags.jsonl JSONL — appended on every flag tool call -investigation.log JSONL — appended on every tool call +meta.json investigation metadata (model, start time, dir count) +plan.json planning pass output — cached for resumed runs +plan_evaluation.json post-investigation quality report (Phase 3) +files/.json one file per cached file entry +dirs/.json one file per cached directory entry +flags.jsonl JSONL — appended on every flag tool call +investigation.log JSONL — appended on every tool call ``` **File and dir cache entries are NOT in JSONL** — they are one -sha256-keyed JSON file per entry. The sha256 is over the path string -(`cache.py:13`). Only `flags.jsonl` and `investigation.log` use JSONL. +sha256-keyed JSON file per entry. The sha256 is over the path string. +Only `flags.jsonl` and `investigation.log` use JSONL. -Required fields are validated in `write_entry()` (`cache.py:115`): +Required fields are validated in `write_entry()`: ```python file: {path, relative_path, size_bytes, category, summary, cached_at} @@ -376,31 +424,45 @@ The validator also rejects entries containing `content`, `contents`, or contents, summaries only. If you change the schema, update the required set in `write_entry()` and update the test in `tests/test_cache.py`. 
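The sha256-keyed layout and the validation rules above can be sketched as below. This is a simplified stand-in for `cache.py`, not its actual code: the function names are hypothetical, only the `file` required set quoted above is modelled, and the forbidden-key set contains just the two names visible above (the real validator rejects more).

```python
import hashlib
import json
import os

FILE_REQUIRED = {"path", "relative_path", "size_bytes", "category", "summary", "cached_at"}
FORBIDDEN_KEYS = {"content", "contents"}  # partial list; see the real write_entry()


def entry_path(root, kind, path):
    """Map (kind, path) to <root>/files|dirs/<sha256 of path string>.json."""
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    subdir = "files" if kind == "file" else "dirs"
    return os.path.join(root, subdir, digest + ".json")


def write_file_entry(root, entry):
    """Validate a file entry and write it as its own JSON file."""
    missing = FILE_REQUIRED - entry.keys()
    if missing:
        raise ValueError(f"missing required fields: {sorted(missing)}")
    leaked = FORBIDDEN_KEYS & entry.keys()
    if leaked:
        raise ValueError(f"raw contents are not cacheable: {sorted(leaked)}")
    target = entry_path(root, "file", entry["path"])
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "w", encoding="utf-8") as fh:
        json.dump(entry, fh)
    return target


def has_entry(root, kind, path):
    """Existence check is a single stat call, with no scan of other entries."""
    return os.path.exists(entry_path(root, kind, path))
```

Because the file name is derived from the path alone, a lookup never has to open or parse any other entry, which is the random-access property the one-file-per-entry layout is chosen for.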
-### 5.3 Confidence support already exists +### 5.3 Confidence + completeness support -`write_entry()` validates an optional `confidence` field -(`cache.py:129–134`) and a `confidence_reason` string. -`low_confidence_entries(threshold=0.7)` (`cache.py:191`) returns all -entries below a threshold, sorted ascending. The agent doesn't currently -*set* these fields in any prompt — that lights up when Phase 1 work -actually wires the prompts. +`write_entry()` validates optional `confidence` and `confidence_reason` +fields (Phase 1) and an optional `completeness` field (Phase 3, +0.0–1.0, the dir agent's self-rated thoroughness). +`low_confidence_entries(threshold=0.7)` returns all entries below a +threshold, sorted ascending — future refinement-pass fuel. ### 5.4 Why one-file-per-entry instead of JSONL -Random access by path. The dir loop calls `cache.has_entry("dir", path)` -once per directory during the `_get_child_summaries()` lookup; with -sha256-keyed files this is an `os.path.exists()` call. With JSONL it -would be a full file scan. +Random access by path. The dir loop calls +`cache.has_entry("dir", path)` once per directory during the +`_get_child_summaries()` lookup; with sha256-keyed files this is an +`os.path.exists()` call. With JSONL it would be a full file scan. + +### 5.5 The planning files + +`plan.json` is written by `_run_investigation()` after a successful +planning pass, so resumed runs can skip the planner. It is loaded +before the dir loops run when `--fresh` is not set and the file +exists. + +`plan_evaluation.json` is written by `_write_plan_evaluation()` after +the dir loops finish. Schema: `plan_order`, `total_dirs_investigated`, +`total_turns_allocated`, `total_turns_used`, `overall_utilization`, +`per_directory` (list of `{dir, planned_tier, turns_allocated, +turns_used, utilization, completeness, confidence}`), `evaluated_at`. +See [Planning Pass](PlanningPass) for how to use it. --- ## 6. 
Prompts -All prompt templates live in `luminos_lib/prompts.py`. There are three: +All prompt templates live in `luminos_lib/prompts.py`. There are four: | Constant | Used by | What it carries | |---|---|---| | `_SURVEY_SYSTEM_PROMPT` | `_run_survey` | survey_signals, tree_preview, available_tools | +| `_PLANNING_SYSTEM_PROMPT` | `_run_planning` | survey, tree, file signals, cached_dirs | | `_DIR_SYSTEM_PROMPT` | `_run_dir_loop` | dir_path, dir_rel, max_turns, context, child_summaries, survey_context | | `_SYNTHESIS_SYSTEM_PROMPT` | `_run_synthesis` | target, summaries_text | @@ -424,8 +486,8 @@ that reason. ## 7. Synthesis pass -`_run_synthesis()` (`ai.py:1157`) is structurally similar to the dir loop -but much simpler: +`_run_synthesis()` is structurally similar to the dir loop but much +simpler: - Reads all `dir` cache entries via `cache.read_all_entries("dir")` - Renders them into a `summaries_text` block (one section per dir) @@ -434,31 +496,29 @@ but much simpler: `detailed` fields Tools available: `read_cache`, `list_cache`, `flag`, `submit_report` -(`_SYNTHESIS_TOOLS` at `ai.py:401`). The synthesis agent can pull -specific cache entries back if it needs to drill in, but it cannot read -files directly — synthesis is meant to operate on summaries, not raw -contents. +(`_SYNTHESIS_TOOLS`). The synthesis agent can pull specific cache +entries back if it needs to drill in, but it cannot read files directly +— synthesis is meant to operate on summaries, not raw contents. There's a fallback: if synthesis runs out of turns without calling -`submit_report`, `_synthesize_from_cache()` (`ai.py:1262`) builds a -mechanical brief+detailed from the cached dir summaries with no AI call. -This guarantees you always get *something* in the report. +`submit_report`, `_synthesize_from_cache()` builds a mechanical +brief+detailed from the cached dir summaries with no AI call. This +guarantees you always get *something* in the report. --- ## 8. 
Flags The `flag` tool is the agent's pressure valve for "I noticed something -that should not be lost in the summary." `_tool_flag()` (`ai.py:629`) -prints to stderr *and* appends a JSONL line to -`{cache.root}/flags.jsonl`. At the end of `_run_investigation()` -(`ai.py:1387–1397`), the orchestrator reads that file back and includes -the flags in its return tuple. `format_report()` then renders them in a -dedicated section. +that should not be lost in the summary." `_tool_flag()` prints to stderr +*and* appends a JSONL line to `{cache.root}/flags.jsonl`. At the end of +`_run_investigation()`, the orchestrator reads that file back and +includes the flags in its return tuple. `format_report()` then renders +them in a dedicated section. Severity is `info | concern | critical`. The agent is told to flag -*immediately* on discovery, not save findings for the report — this is in -the tool description at `ai.py:312`. +*immediately* on discovery, not save findings for the report — this is +in the tool description. --- @@ -484,10 +544,11 @@ A cookbook for the kinds of changes that come up most often. contains your handler and `_DIR_TOOLS` contains your schema after importing `luminos_lib.ai`. -To make a tool available in synthesis or survey instead of (or in -addition to) dir, pass `scopes=["synthesis"]`, `scopes=["survey"]`, or -`scopes=["dir", "synthesis"]`. Tools whose schema differs by scope (like -`submit_report`) get a separate `register_tool()` call per scope. +To make a tool available in synthesis, survey, or planning instead of +(or in addition to) dir, pass `scopes=["synthesis"]`, `scopes=["survey"]`, +`scopes=["planning"]`, or any combination. Tools whose schema differs by +scope (like `submit_report`) get a separate `register_tool()` call per +scope. ### 9.2 Add a whole new pass @@ -522,8 +583,7 @@ unless you `--fresh`. ### 9.4 Change cache schema 1. Update the required-fields set in `cache.py:write_entry()` - (`cache.py:119–123`) -2. 
Update `_DIR_TOOLS`'s `write_cache` description in `ai.py:228` so the +2. Update `_DIR_TOOLS`'s `write_cache` description in `ai.py` so the agent knows what to write 3. Update `_DIR_SYSTEM_PROMPT` in `prompts.py` if the agent needs to know *how* to populate the new field @@ -532,26 +592,25 @@ unless you `--fresh`. ### 9.5 Add a CLI flag -Edit `luminos.py:88` (`main()`'s argparse setup) to define the flag, then +Edit `luminos.py:main()`'s argparse setup to define the flag, then plumb it through whatever functions need it. New AI-related flags -typically need to be added to `analyze_directory()`'s signature -(`ai.py:1408`) and then forwarded to `_run_investigation()`. +typically need to be added to `analyze_directory()`'s signature and +then forwarded to `_run_investigation()`. --- ## 10. Token budget and cost -Budget logic is in `_TokenTracker.budget_exceeded()` and is checked at the -top of every dir loop iteration (`ai.py:882`). The budget is **per call**, -not cumulative — see §4.4. The breach handler flushes a partial dir cache +Budget logic is in `_TokenTracker.budget_exceeded()` and is checked at +the top of every dir loop iteration. The budget is **per call**, not +cumulative — see §4.4. The breach handler flushes a partial dir cache entry so work isn't lost. -Cost reporting happens once at the end of `_run_investigation()` -(`ai.py:1399`), using the cumulative `total_input` and `total_output` -counters multiplied by the constants at `ai.py:43–44`. There is no -running cost display during the investigation today. If you want one, -`_TokenTracker.summary()` already returns the formatted string — just -call it after each dir loop. +Cost reporting happens once at the end of `_run_investigation()`, using +the cumulative `total_input` and `total_output` counters multiplied by +the constants near the top of `ai.py`. There is no running cost display +during the investigation today. 
If you want one, `_TokenTracker.summary()` +already returns the formatted string — just call it after each dir loop. --- @@ -560,16 +619,20 @@ call it after each dir loop. | Term | Meaning | |---|---| | **base scan** | The non-AI phase: tree, classification, languages, recency, disk usage. Stdlib + coreutils only. | -| **dir loop** | Per-directory agent loop in `_run_dir_loop`. Up to 14 turns. Produces a `dir` cache entry. | +| **dir loop** | Per-directory agent loop in `_run_dir_loop`. Turns allocated by the planning pass (5 shallow / 10 default / 15–20 priority, capped at 25). Produces a `dir` cache entry. | | **survey pass** | Single short loop before any dir loops, producing a shared description and tool guidance. | +| **planning pass** | Phase 3 pass after the survey, before dir loops. Produces a plan (priority/shallow/skip dirs + turn allocations + order). | | **synthesis pass** | Final loop that reads `dir` cache entries and produces `(brief, detailed)`. | -| **leaves-first** | Discovery order in `_discover_directories`: deepest paths first, so child summaries exist when parents are investigated. | +| **leaves-first** | Discovery order in `_discover_directories`: deepest paths first, so child summaries exist when parents are investigated. Preserved within planning bands by `_apply_plan`. | | **investigation** | One end-to-end run, identified by a UUID, persisted under `/tmp/luminos/{uuid}/`. | | **investigation_id** | The UUID. Stored in `/tmp/luminos/investigations.json` keyed by absolute target path. | | **cache entry** | A JSON file under `files/` or `dirs/` named by sha256(path). | | **flag** | An agent finding written to `flags.jsonl` and reported separately. info / concern / critical. | | **partial entry** | A `dir` cache entry written when the budget tripped before `submit_report`. Marked with `partial: True`. | +| **completeness** | Phase 3 agent self-rated thoroughness (0.0–1.0) from `submit_report`. Feeds `plan_evaluation.json`. 
| | **survey signals** | The histogram + samples computed by `filetypes.survey_signals()` during the base scan, fed to the survey prompt. | | **last_input** | The `input_tokens` count from the most recent API call. The basis for budget checks. NOT the cumulative sum. | | **CONTEXT_BUDGET** | 70% of 200k = 140k. Trigger threshold for early exit. | | **`_PROTECTED_DIR_TOOLS`** | Tools the survey is forbidden from filtering out of the dir loop's toolbox. Currently `{submit_report}`. | +| **plan.json** | Serialized planning output, cached so resumed runs skip the planner. | +| **plan_evaluation.json** | Post-investigation quality report comparing plan predictions to outcomes. |