wiki: Internals — reflect Phase 3 planning pass, (summary, completeness) return, cache layout

claude-code 2026-04-18 20:08:30 -06:00
parent 717cde8562
commit d3315b530f

@ -7,7 +7,8 @@ agent loop can finish this page and start making non-trivial changes.
All file:line references are accurate as of the date this page was last
edited — verify with `git log` or by opening the file before relying on a
specific line number.
specific line number. `ai.py` in particular grows each phase and
references drift.
---
@ -36,9 +37,9 @@ wait for a scan they can't use.
## 2. Base scan walkthrough
Entry: `luminos.py:main()` parses args, then calls `scan(target, ...)` at
`luminos.py:45`. `scan()` is a flat sequence — it builds a `report` dict
by calling helpers from `luminos_lib/`, one per concern, in order:
Entry: `luminos.py:main()` parses args, then calls `scan(target, ...)`.
`scan()` is a flat sequence — it builds a `report` dict by calling helpers
from `luminos_lib/`, one per concern, in order:
```
scan(target)
@ -60,7 +61,7 @@ event-driven, and there is no shared state object — everything passes
through the local `report` dict.
The progress lines you see on stderr (`[scan] Counting lines... foo.py`)
come from `_progress()` in `luminos.py:23`, which returns an `on_file`
come from `_progress()` in `luminos.py`, which returns an `on_file`
callback that the helpers call as they work. If you add a new helper that
walks files, plumb a progress callback through the same way for
consistency.
@ -77,46 +78,54 @@ the base scan because it needs `report["survey_signals"]` and
The AI pipeline is what makes Luminos interesting and is also where
almost all the complexity lives. Everything below happens inside
`luminos_lib/ai.py` (1438 lines as of writing), called from
`luminos.py:157` via `analyze_directory()`.
`luminos_lib/ai.py` (~2060 lines as of writing), called from `luminos.py`
via `analyze_directory()`.
### 3.1 The orchestrator
`analyze_directory()` (`ai.py:1408`) is a thin wrapper that checks
dependencies, gets the API key, builds the Anthropic client, and calls
`_run_investigation()`. If anything fails it prints a warning and returns
empty strings — the rest of luminos keeps working.
`analyze_directory()` is a thin wrapper that checks dependencies, gets the
API key, builds the Anthropic client, and calls `_run_investigation()`.
If anything fails it prints a warning and returns empty strings — the
rest of luminos keeps working.
`_run_investigation()` (`ai.py:1286`) is the real entry point. Read this
function first if you want to understand the pipeline shape. It does six
things, in order:
`_run_investigation()` is the real entry point. Read this function first
if you want to understand the pipeline shape. It does **seven** things,
in order:
1. **Get/create an investigation ID and cache** (`ai.py:1289-1294`).
Investigation IDs let you resume a previous run; see §5 below.
1. **Get/create an investigation ID and cache**. Investigation IDs let
you resume a previous run; see §5 below.
2. **Discover all directories** under the target via
`_discover_directories()` (`ai.py:715`). Returns them sorted
*leaves-first* — the deepest paths come first. This matters because
each dir loop reads its child directories' summaries from cache, so
children must be investigated before parents.
3. **Run the survey pass** (`ai.py:1300-1334`) unless the target is below
the size thresholds at `ai.py:780-781`, in which case
`_discover_directories()`. Returns them sorted *leaves-first* — the
deepest paths come first. This matters because each dir loop reads
its child directories' summaries from cache, so children must be
investigated before parents.
3. **Run the survey pass** unless the target is below
`_SURVEY_MIN_FILES` and `_SURVEY_MIN_DIRS`, in which case
`_default_survey()` returns a synthetic skip.
4. **Filter out cached directories** (`ai.py:1336-1349`). If you're
resuming an investigation, dirs that already have a `dir` cache entry
are skipped — only new ones get a fresh dir loop.
5. **Run a dir loop per remaining directory** (`ai.py:1351-1375`). This
is the heart of the system — see §4.
6. **Run the synthesis pass** (`ai.py:1382`) reading only `dir` cache
entries to produce `(brief, detailed)`.
4. **Filter out cached directories**. If you're resuming an
investigation, dirs that already have a `dir` cache entry are
skipped — only new ones get a fresh dir loop.
5. **Run the planning pass** (Phase 3) unless the target is small, in
which case `_default_plan()` returns an empty plan. On resumed runs
the planner is skipped and `plan.json` is loaded from cache instead.
`_apply_plan()` then sorts dirs into priority/default/shallow bands
and builds a `{dir_path: max_turns}` map. Leaf-first ordering is
preserved *within* each band (see §4.7).
6. **Run a dir loop per remaining directory**, iterating the
plan-ordered list with the per-directory `max_turns` from the plan.
`_write_plan_evaluation()` records turn-utilization metrics at the
end. This is the heart of the system — see §4.
7. **Run the synthesis pass** reading only `dir` cache entries to
produce `(brief, detailed)`.
It also reads `flags.jsonl` from disk at the end (`ai.py:1387-1397`) and
returns `(brief, detailed, flags)` to `analyze_directory()`.
It also reads `flags.jsonl` from disk at the end and returns
`(brief, detailed, flags)` to `analyze_directory()`.
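The leaves-first ordering from step 2 can be illustrated with a minimal sketch. `sort_leaves_first` is a hypothetical stand-in, not the real `_discover_directories()`; it assumes depth is the number of path components and that ties are broken alphabetically, neither of which is confirmed by this page.

```python
from pathlib import PurePosixPath


def sort_leaves_first(dir_paths):
    """Order directories so the deepest paths come first.

    Hypothetical illustration only: the alphabetical tie-break at equal
    depth is an assumption, not documented behavior.
    """
    def depth(path):
        # Number of path components, e.g. "src/auth" -> 2.
        return len(PurePosixPath(path).parts)

    # Negative depth sorts deepest first; the path itself breaks ties.
    return sorted(dir_paths, key=lambda p: (-depth(p), p))


ordered = sort_leaves_first(["src", "src/auth", "src/db", "src/auth/keys"])
print(ordered)
```

With this ordering, every child directory appears before its parent, which is what lets a parent's dir loop read child summaries from cache.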
### 3.2 The survey pass
`_run_survey()` (`ai.py:1051`) is a short, single-purpose loop. It exists
to give the dir loops some shared context about what they're looking at
*as a whole* before any of them start.
`_run_survey()` is a short, single-purpose loop. It exists to give the
dir loops some shared context about what they're looking at *as a whole*
before any of them start.
Inputs go into the system prompt (`_SURVEY_SYSTEM_PROMPT` in
`prompts.py`):
@ -125,9 +134,9 @@ Inputs go into the system prompt (`_SURVEY_SYSTEM_PROMPT` in
- A 2-level tree preview from `build_tree(target, max_depth=2)`
- The list of tools the dir loop will have available
The survey is allowed only `submit_survey` as a tool (`_SURVEY_TOOLS` at
`ai.py:356`). It runs at most 3 turns. The agent must call `submit_survey`
exactly once with six fields:
The survey is allowed only `submit_survey` as a tool (`_SURVEY_TOOLS`).
It runs at most 3 turns. The agent must call `submit_survey` exactly
once with six fields:
```python
{
@ -148,54 +157,82 @@ loops still run but with `survey=None` — the system degrades gracefully.
Two things happen with the survey output before each dir loop runs:
**Survey block injection.** `_format_survey_block()` (`ai.py:803`) renders
the survey dict as a labeled text block, which gets `.format()`-injected
into the dir loop system prompt as `{survey_context}`. The dir agent sees
the description, approach, domain notes, and which tools it should lean on
**Survey block injection.** `_format_survey_block()` renders the survey
dict as a labeled text block, which gets `.format()`-injected into the
dir loop system prompt as `{survey_context}`. The dir agent sees the
description, approach, domain notes, and which tools it should lean on
or skip.
**Tool filtering.** `_filter_dir_tools()` (`ai.py:824`) returns a copy of
`_DIR_TOOLS` with anything in `skip_tools` removed — but only if the
survey's confidence is at or above `_SURVEY_CONFIDENCE_THRESHOLD = 0.5`
(`ai.py:775`). Below that threshold the agent gets the full toolbox. The
control-flow tool `submit_report` is in `_PROTECTED_DIR_TOOLS` and can
never be filtered out — removing it would break loop termination.
**Tool filtering.** `_filter_dir_tools()` returns a copy of `_DIR_TOOLS`
with anything in `skip_tools` removed — but only if the survey's
confidence is at or above `_SURVEY_CONFIDENCE_THRESHOLD = 0.5`. Below
that threshold the agent gets the full toolbox. The control-flow tool
`submit_report` is in `_PROTECTED_DIR_TOOLS` and can never be filtered
out — removing it would break loop termination.
This is the only place in the codebase where the agent's available tools
change at runtime. If you add a new tool, decide whether it should be
protectable.
This is the only place in the codebase where the agent's available
tools change at runtime. If you add a new tool, decide whether it
should be protectable.
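A minimal sketch of the filtering rule described above. It assumes the survey dict carries `confidence` and `skip_tools` keys and that each tool schema is a dict with a `name` key. The tool names `read_file` and `run_command` are made up for the example, and the real `_filter_dir_tools()` may differ in detail.

```python
SURVEY_CONFIDENCE_THRESHOLD = 0.5
PROTECTED_DIR_TOOLS = {"submit_report"}


def filter_dir_tools(dir_tools, survey):
    """Return a filtered copy of dir_tools; the input list is never mutated."""
    # No survey (e.g. the survey pass failed): keep the full toolbox.
    if not survey:
        return list(dir_tools)
    # Low-confidence survey: ignore its skip list entirely.
    if survey.get("confidence", 0.0) < SURVEY_CONFIDENCE_THRESHOLD:
        return list(dir_tools)
    # Protected tools can never be skipped, whatever the survey says.
    skip = set(survey.get("skip_tools", [])) - PROTECTED_DIR_TOOLS
    return [tool for tool in dir_tools if tool["name"] not in skip]


tools = [{"name": "read_file"}, {"name": "run_command"}, {"name": "submit_report"}]
confident = {"confidence": 0.8, "skip_tools": ["run_command", "submit_report"]}
unsure = {"confidence": 0.3, "skip_tools": ["run_command"]}

print([t["name"] for t in filter_dir_tools(tools, confident)])
print([t["name"] for t in filter_dir_tools(tools, unsure)])
```

Subtracting the protected set before filtering means a survey that asks to skip `submit_report` is ignored for that tool, so the loop can always terminate.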
### 3.4 The planning pass (Phase 3)
`_run_planning()` is structured like `_run_survey()`: a single-purpose
loop with one submit tool (`submit_plan`), low max turns. Its job is to
decide *where* the dir loops should spend turns, not to investigate.
Inputs:
- The survey dict (formatted via `_format_survey_block()`)
- The full tree at depth 6 (deeper than the survey's 2-level preview)
- The base scan's `survey_signals` (raw file signals)
- The list of already-cached directories (so the planner doesn't plan
around dirs that will be skipped)
The plan schema, tier allocations (priority 15-20 with a cap of 25, default 10,
shallow 5, skip 0), fallback behavior, and resume behavior are covered
in full on the [Planning Pass](PlanningPass) page.
`_apply_plan()` is a pure helper that translates the plan into an
ordered list of directories plus a `{dir_path: max_turns}` map. It
sorts dirs into priority/default/shallow bands but **preserves
leaf-first ordering within each band** — so children always run before
their parents, even in "priority-first" mode. See §4.7.
`_write_plan_evaluation()` writes `plan_evaluation.json` at the end of
every run with `turns_allocated`, `turns_used`, and `completeness` per
directory. This is the planning pass's report card.
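The banding behavior of `_apply_plan()` can be sketched as follows. The plan shape used here (`priority` as a `{dir: turns}` map, `shallow` and `skip` as lists) is a hypothetical simplification; the real schema is documented on the Planning Pass page. The turn numbers follow the tier allocations quoted above.

```python
PRIORITY_CAP = 25
DEFAULT_TURNS = 10
SHALLOW_TURNS = 5


def apply_plan(leaves_first_dirs, plan):
    """Return (ordered_dirs, max_turns_by_dir) for a leaves-first dir list.

    Hypothetical plan shape:
      {"priority": {dir: turns}, "shallow": [dir, ...], "skip": [dir, ...]}
    """
    priority = plan.get("priority", {})
    shallow = set(plan.get("shallow", []))
    skip = set(plan.get("skip", []))

    bands = {"priority": [], "default": [], "shallow": []}
    max_turns = {}
    # Walking the input in order and appending keeps leaves-first
    # ordering intact inside each band.
    for d in leaves_first_dirs:
        if d in skip:
            continue
        if d in priority:
            bands["priority"].append(d)
            max_turns[d] = min(priority[d], PRIORITY_CAP)
        elif d in shallow:
            bands["shallow"].append(d)
            max_turns[d] = SHALLOW_TURNS
        else:
            bands["default"].append(d)
            max_turns[d] = DEFAULT_TURNS
    ordered = bands["priority"] + bands["default"] + bands["shallow"]
    return ordered, max_turns


dirs = ["a/b/c", "a/b", "a/x", "a"]
plan = {"priority": {"a/b": 30, "a": 18}, "shallow": ["a/b/c"], "skip": ["a/x"]}
order, turns = apply_plan(dirs, plan)
print(order, turns)
```

Note that in this sketch a priority-band parent can run before a lower-band child; only the ordering inside a band is guaranteed.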
---
## 4. The dir loop in depth
`_run_dir_loop()` is at `ai.py:1017`. It is a hand-written agent loop, and
you should expect to read it several times before it clicks. As of #57 the
loop body itself is a thin coordinator (~25 lines): it calls three helpers
that own the layers it used to inline.
`_run_dir_loop()` is a hand-written agent loop, and you should expect
to read it several times before it clicks. As of #57 the loop body
itself is a thin coordinator (~25 lines): it calls three helpers that
own the layers it used to inline.
| Helper | Lines | Job |
|---|---|---|
| `_build_dir_loop_context()` | `ai.py:855` | Pure setup. Builds dir context, child summaries, survey block, filtered tool list, system prompt, and the seed user message. Returns a `_DirLoopContext` namedtuple. |
| `_flush_partial_dir_entry()` | `ai.py:896` | Idempotent partial-cache writer for the budget-exceeded path. Synthesizes a summary from already-cached file entries when possible, or writes a "no files processed" stub. Returns the partial summary string. |
| `_handle_turn_response()` | `ai.py:957` | Per-turn response processing. Prints text blocks and tool decisions to stderr, appends the assistant message, dispatches tools (or nudges the agent to call submit_report), appends tool_results. Returns `(done, summary)`. |
| Helper | Job |
|---|---|
| `_build_dir_loop_context()` | Pure setup. Builds dir context, child summaries, survey block, filtered tool list, system prompt, and the seed user message. Returns a `_DirLoopContext` namedtuple. |
| `_flush_partial_dir_entry()` | Idempotent partial-cache writer for the budget-exceeded path. Synthesizes a summary from already-cached file entries when possible, or writes a "no files processed" stub. Returns the partial summary string. |
| `_handle_turn_response()` | Per-turn response processing. Prints text blocks and tool decisions to stderr, appends the assistant message, dispatches tools (or nudges the agent to call submit_report), appends tool_results. Returns `(done, summary, completeness)`. |
The shape of the loop body is now:
```
ctx = _build_dir_loop_context(...)
reset per-loop token counter
for turn in range(max_turns): # max_turns = 14
for turn in range(max_turns): # max_turns from plan (5-25)
if budget exceeded:
print warning
partial = _flush_partial_dir_entry(...)
if partial: summary = partial
break
call API (streaming)
done, turn_summary = _handle_turn_response(...)
done, turn_summary, turn_completeness = _handle_turn_response(...)
if turn_summary: summary = turn_summary
if turn_completeness: completeness = turn_completeness
if done: break
return summary
return (summary, completeness)
```
A few non-obvious mechanics:
@ -207,95 +244,104 @@ message (the tool results). Nothing is ever evicted. This means
`input_tokens` on each successive API call grows roughly linearly — the
model is re-sent the full conversation every turn. On code targets we see
~1.5-2k tokens added per turn. At `max_turns=14` this stays under the
budget; raising the cap would expose this. See **#51**.
budget; raising the cap would expose this. With Phase 3's priority-tier
cap of 25, we're still well under budget in practice but closer to the
ceiling. See **#51**.
### 4.2 Tool dispatch
Tools are plain functions in `ai.py`. They are wired up via a single
`register_tool()` call (`ai.py:172`) that lands the schema in one or
more scope lists (`_DIR_TOOLS`, `_SYNTHESIS_TOOLS`, `_SURVEY_TOOLS`)
`register_tool()` call that lands the schema in one or more scope lists
(`_DIR_TOOLS`, `_SYNTHESIS_TOOLS`, `_SURVEY_TOOLS`, `_PLANNING_TOOLS`)
and the handler in `_TOOL_DISPATCH`. The registrations live below the
tool implementations in `ai.py` and read top-to-bottom in dir-then-
synthesis-then-survey order.
tool implementations in `ai.py` and read top-to-bottom in
dir-then-synthesis-then-survey-then-planning order.
`_execute_tool()` looks up the handler by name in `_TOOL_DISPATCH`,
calls it, logs the turn to `investigation.log`, and returns the result
string. **Tools intercepted by the loop body — `submit_report` and
`submit_survey` — register their schema only and have no handler entry.**
`_handle_turn_response()` recognizes `submit_report` specially: it sets
`done = True` and extracts the summary directly from the tool input.
string. **Tools intercepted by the loop body — `submit_report`,
`submit_survey`, `submit_plan` — register their schema only and have no
handler entry.** `_handle_turn_response()` recognizes `submit_report`
specially: it sets `done = True`, extracts the summary from the tool
input, and also extracts the optional `completeness` field (Phase 3
instrumentation).
`think`, `checkpoint`, and `flag` *are* in dispatch, but they have side
effects that just print to stderr or append to `flags.jsonl` — the return
value is always `"ok"`.
effects that just print to stderr or append to `flags.jsonl` — the
return value is always `"ok"`.
When you add a tool: write the function, then add one `register_tool()`
call below it. That's it. There is no second place to forget.
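A minimal sketch of the registration pattern, assuming a `register_tool(schema, handler, scopes)` shape. Only the `scopes` keyword is attested on this page; the other parameter names, the `TOOL_SCOPES` dict, and the handler signature are illustrative assumptions.

```python
# Hypothetical stand-ins for _DIR_TOOLS, _SYNTHESIS_TOOLS, _SURVEY_TOOLS,
# _PLANNING_TOOLS, and _TOOL_DISPATCH.
TOOL_SCOPES = {"dir": [], "synthesis": [], "survey": [], "planning": []}
TOOL_DISPATCH = {}


def register_tool(schema, handler=None, scopes=("dir",)):
    """Land the schema in each scope list and the handler in dispatch."""
    for scope in scopes:
        TOOL_SCOPES[scope].append(schema)
    # Loop-intercepted tools (submit_*) pass no handler, so they get a
    # schema entry but no dispatch entry.
    if handler is not None:
        TOOL_DISPATCH[schema["name"]] = handler


def tool_flag(args):
    # Side-effect-only tool: the return value is always "ok".
    return "ok"


register_tool({"name": "flag"}, handler=tool_flag, scopes=["dir", "synthesis"])
register_tool({"name": "submit_plan"}, scopes=["planning"])

print(sorted(TOOL_DISPATCH))
```

Keeping schema and handler in one call is what removes the "second place to forget": a tool cannot be advertised to the agent without its handler being wired up at the same time.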
### 4.3 Pre-loaded context
Before the loop starts, `_build_dir_loop_context()` (`ai.py:855`) calls
two helpers that prepare static context for the system prompt:
Before the loop starts, `_build_dir_loop_context()` calls two helpers
that prepare static context for the system prompt:
- `_build_dir_context()` (`ai.py:741`) — `ls`-style listing of the dir
with sizes and MIME types via `python-magic`. The agent sees this
*before* it makes any tool calls, so it doesn't waste a turn just
listing the directory.
- `_get_child_summaries()` (`ai.py:763`) — looks up each subdirectory in
the cache and pulls its `summary` field. This is how leaves-first
ordering pays off: by the time the loop runs on `src/`, all of
`src/auth/`, `src/db/`, `src/middleware/` already have cached summaries
that get injected as `{child_summaries}`.
- `_build_dir_context()` — `ls`-style listing of the dir with sizes and
MIME types via `python-magic`. The agent sees this *before* it makes
any tool calls, so it doesn't waste a turn just listing the directory.
- `_get_child_summaries()` — looks up each subdirectory in the cache and
pulls its `summary` field. This is how leaves-first ordering pays off:
by the time the loop runs on `src/`, all of `src/auth/`, `src/db/`,
`src/middleware/` already have cached summaries that get injected as
`{child_summaries}`.
If `_get_child_summaries()` returns nothing, the prompt says
`(none — this is a leaf directory)`.
If `_get_child_summaries()` returns nothing, the prompt distinguishes
leaf directories (`"(none: this is a leaf directory)"`) from parents
whose children haven't been investigated yet (`"(child directories
exist but have not been investigated yet)"`). See §4.7.
### 4.4 The token tracker and the budget check
`_TokenTracker` (`ai.py:94`) is a tiny accumulator with one important
subtlety, captured in **#44**:
`_TokenTracker` is a tiny accumulator with one important subtlety,
captured in **#44**:
> Cumulative input tokens are NOT a meaningful proxy for context size:
> each turn's `input_tokens` already includes the full message history,
> so summing across turns double-counts everything. Use `last_input` for
> budget decisions, totals for billing.
So `budget_exceeded()` (`ai.py:135`) compares `last_input` (the most
recent call's input_tokens) to `CONTEXT_BUDGET` (`ai.py:40`), which is
70% of 200k. This is checked at the *top* of each loop iteration, before
the next API call.
So `budget_exceeded()` compares `last_input` (the most recent call's
input_tokens) to `CONTEXT_BUDGET`, which is 70% of 200k. This is
checked at the *top* of each loop iteration, before the next API call.
When the budget check trips, the loop:
1. Prints a `Context budget reached` warning to stderr
2. Calls `_flush_partial_dir_entry()` (`ai.py:896`), which writes a
partial dir cache entry from any `file` cache entries the agent
already produced, marked with `partial: True` and `partial_reason`.
The helper is idempotent — if a dir entry already exists, it returns
`""` without writing.
2. Calls `_flush_partial_dir_entry()`, which writes a partial dir cache
entry from any `file` cache entries the agent already produced,
marked with `partial: True` and `partial_reason`. The helper is
idempotent — if a dir entry already exists, it returns `""` without
writing.
3. Breaks out of the loop
This means a budget breach doesn't lose work — anything the agent already
cached survives, and the synthesis pass will see a partial dir summary
rather than nothing.
This means a budget breach doesn't lose work — anything the agent
already cached survives, and the synthesis pass will see a partial dir
summary rather than nothing.
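The `last_input` versus cumulative distinction can be sketched as below. The class and method names other than `budget_exceeded()` are illustrative, and whether the real comparison is `>` or `>=` is an assumption.

```python
# 70% of a 200k context window, computed in integers to avoid float rounding.
CONTEXT_BUDGET = 200_000 * 70 // 100


class TokenTracker:
    """Tiny accumulator: totals for billing, last_input for budget checks."""

    def __init__(self, budget=CONTEXT_BUDGET):
        self.budget = budget
        self.total_input = 0
        self.total_output = 0
        self.last_input = 0

    def record(self, input_tokens, output_tokens):
        # Each call's input_tokens already includes the full history,
        # so only the most recent value reflects current context size.
        self.last_input = input_tokens
        self.total_input += input_tokens
        self.total_output += output_tokens

    def budget_exceeded(self):
        return self.last_input > self.budget


tracker = TokenTracker()
tracker.record(60_000, 500)
tracker.record(90_000, 500)
print(tracker.total_input, tracker.last_input, tracker.budget_exceeded())
```

Here the cumulative input (150,000) is already above the budget, but the context actually sent on the last call (90,000) is not, so the loop correctly keeps going.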
### 4.5 What the loop returns
`_run_dir_loop()` returns the `summary` string from `submit_report` (or
the partial summary returned by `_flush_partial_dir_entry()` if the
budget tripped). `_run_investigation()` then writes a normal `dir` cache
entry from this summary, *unless* the dir loop already wrote one itself
via the partial-flush path, in which case the `cache.has_entry("dir",
dir_path)` check skips it.
`_run_dir_loop()` returns `(summary, completeness)`. The summary is the
string from `submit_report` (or the partial summary returned by
`_flush_partial_dir_entry()` if the budget tripped). The completeness
is the agent's self-rated investigation thoroughness (0.0-1.0) — Phase
3 instrumentation used in `plan_evaluation.json` — or `None` if the
agent didn't report one.
`_run_investigation()` writes a normal `dir` cache entry from this
summary (with `completeness` included if non-None), *unless* the dir
loop already wrote one itself via the partial-flush path, in which case
the `cache.has_entry("dir", dir_path)` check skips it.
### 4.6 The streaming API caller
`_call_api_streaming()` (`ai.py:686`) is a thin wrapper around
`_call_api_streaming()` is a thin wrapper around
`client.messages.stream()`. It currently doesn't print tokens as they
arrive — it iterates the stream, drops everything, then pulls the final
message via `stream.get_final_message()`. The streaming API is used for
real-time tool decision printing, which today happens only after the full
response arrives. There's room here to add live progress printing if you
want it.
real-time tool decision printing, which today happens only after the
full response arrives. There's room here to add live progress printing
if you want it.
### 4.7 The leaf-first contract (load-bearing for child summaries)
@ -339,32 +385,34 @@ the full design.
## 5. The cache model
Cache lives at `/tmp/luminos/{investigation_id}/`. Code is
`luminos_lib/cache.py` (201 lines).
`luminos_lib/cache.py`.
### 5.1 Investigation IDs
`/tmp/luminos/investigations.json` maps absolute target paths to UUIDs.
`_get_investigation_id()` (`cache.py:40`) looks up the target and either
returns the existing UUID (resume) or creates a new one (fresh run).
`--fresh` forces a new UUID even if one exists.
`_get_investigation_id()` looks up the target and either returns the
existing UUID (resume) or creates a new one (fresh run). `--fresh`
forces a new UUID even if one exists.
### 5.2 What's stored
Inside `/tmp/luminos/{uuid}/`:
```
meta.json investigation metadata (model, start time, dir count)
files/<sha256>.json one file per cached file entry
dirs/<sha256>.json one file per cached directory entry
flags.jsonl JSONL — appended on every flag tool call
investigation.log JSONL — appended on every tool call
meta.json investigation metadata (model, start time, dir count)
plan.json planning pass output — cached for resumed runs
plan_evaluation.json post-investigation quality report (Phase 3)
files/<sha256>.json one file per cached file entry
dirs/<sha256>.json one file per cached directory entry
flags.jsonl JSONL — appended on every flag tool call
investigation.log JSONL — appended on every tool call
```
**File and dir cache entries are NOT in JSONL** — they are one
sha256-keyed JSON file per entry. The sha256 is over the path string
(`cache.py:13`). Only `flags.jsonl` and `investigation.log` use JSONL.
sha256-keyed JSON file per entry. The sha256 is over the path string.
Only `flags.jsonl` and `investigation.log` use JSONL.
Required fields are validated in `write_entry()` (`cache.py:115`):
Required fields are validated in `write_entry()`:
```python
file: {path, relative_path, size_bytes, category, summary, cached_at}
@ -376,31 +424,45 @@ The validator also rejects entries containing `content`, `contents`, or
contents, summaries only. If you change the schema, update the required
set in `write_entry()` and update the test in `tests/test_cache.py`.
### 5.3 Confidence support already exists
### 5.3 Confidence + completeness support
`write_entry()` validates an optional `confidence` field
(`cache.py:129-134`) and a `confidence_reason` string.
`low_confidence_entries(threshold=0.7)` (`cache.py:191`) returns all
entries below a threshold, sorted ascending. The agent doesn't currently
*set* these fields in any prompt — that lights up when Phase 1 work
actually wires the prompts.
`write_entry()` validates optional `confidence` and `confidence_reason`
fields (Phase 1) and an optional `completeness` field (Phase 3,
0.0-1.0, the dir agent's self-rated thoroughness).
`low_confidence_entries(threshold=0.7)` returns all entries below a
threshold, sorted ascending — future refinement-pass fuel.
### 5.4 Why one-file-per-entry instead of JSONL
Random access by path. The dir loop calls `cache.has_entry("dir", path)`
once per directory during the `_get_child_summaries()` lookup; with
sha256-keyed files this is an `os.path.exists()` call. With JSONL it
would be a full file scan.
Random access by path. The dir loop calls
`cache.has_entry("dir", path)` once per directory during the
`_get_child_summaries()` lookup; with sha256-keyed files this is an
`os.path.exists()` call. With JSONL it would be a full file scan.
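A minimal sketch of sha256-keyed lookup, assuming UTF-8 encoding of the path string and the `files/` and `dirs/` layout shown in §5.2. The function names and the free-function form are illustrative; the real cache exposes this through `cache.has_entry()`.

```python
import hashlib
import json
import os
import tempfile


def entry_path(root, kind, path):
    """Map ("file" | "dir", path) to its one-file-per-entry location."""
    digest = hashlib.sha256(path.encode("utf-8")).hexdigest()
    subdir = "files" if kind == "file" else "dirs"
    return os.path.join(root, subdir, digest + ".json")


def has_entry(root, kind, path):
    # Existence check only: no file is opened and nothing is scanned.
    return os.path.exists(entry_path(root, kind, path))


def write_entry(root, kind, path, entry):
    target = entry_path(root, kind, path)
    os.makedirs(os.path.dirname(target), exist_ok=True)
    with open(target, "w", encoding="utf-8") as fh:
        json.dump(entry, fh)


cache_root = tempfile.mkdtemp()
write_entry(cache_root, "dir", "/repo/src/auth", {"summary": "auth helpers"})
print(has_entry(cache_root, "dir", "/repo/src/auth"))
print(has_entry(cache_root, "dir", "/repo/src/db"))
```

Because the filename is a pure function of the path, a lookup never needs an index or a scan of other entries.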
### 5.5 The planning files
`plan.json` is written by `_run_investigation()` after a successful
planning pass, so resumed runs can skip the planner. It is loaded
before the dir loops run when `--fresh` is not set and the file
exists.
`plan_evaluation.json` is written by `_write_plan_evaluation()` after
the dir loops finish. Schema: `plan_order`, `total_dirs_investigated`,
`total_turns_allocated`, `total_turns_used`, `overall_utilization`,
`per_directory` (list of `{dir, planned_tier, turns_allocated,
turns_used, utilization, completeness, confidence}`), `evaluated_at`.
See [Planning Pass](PlanningPass) for how to use it.
---
## 6. Prompts
All prompt templates live in `luminos_lib/prompts.py`. There are three:
All prompt templates live in `luminos_lib/prompts.py`. There are four:
| Constant | Used by | What it carries |
|---|---|---|
| `_SURVEY_SYSTEM_PROMPT` | `_run_survey` | survey_signals, tree_preview, available_tools |
| `_PLANNING_SYSTEM_PROMPT` | `_run_planning` | survey, tree, file signals, cached_dirs |
| `_DIR_SYSTEM_PROMPT` | `_run_dir_loop` | dir_path, dir_rel, max_turns, context, child_summaries, survey_context |
| `_SYNTHESIS_SYSTEM_PROMPT` | `_run_synthesis` | target, summaries_text |
@ -424,8 +486,8 @@ that reason.
## 7. Synthesis pass
`_run_synthesis()` (`ai.py:1157`) is structurally similar to the dir loop
but much simpler:
`_run_synthesis()` is structurally similar to the dir loop but much
simpler:
- Reads all `dir` cache entries via `cache.read_all_entries("dir")`
- Renders them into a `summaries_text` block (one section per dir)
@ -434,31 +496,29 @@ but much simpler:
`detailed` fields
Tools available: `read_cache`, `list_cache`, `flag`, `submit_report`
(`_SYNTHESIS_TOOLS` at `ai.py:401`). The synthesis agent can pull
specific cache entries back if it needs to drill in, but it cannot read
files directly — synthesis is meant to operate on summaries, not raw
contents.
(`_SYNTHESIS_TOOLS`). The synthesis agent can pull specific cache
entries back if it needs to drill in, but it cannot read files directly
— synthesis is meant to operate on summaries, not raw contents.
There's a fallback: if synthesis runs out of turns without calling
`submit_report`, `_synthesize_from_cache()` (`ai.py:1262`) builds a
mechanical brief+detailed from the cached dir summaries with no AI call.
This guarantees you always get *something* in the report.
`submit_report`, `_synthesize_from_cache()` builds a mechanical
brief+detailed from the cached dir summaries with no AI call. This
guarantees you always get *something* in the report.
---
## 8. Flags
The `flag` tool is the agent's pressure valve for "I noticed something
that should not be lost in the summary." `_tool_flag()` (`ai.py:629`)
prints to stderr *and* appends a JSONL line to
`{cache.root}/flags.jsonl`. At the end of `_run_investigation()`
(`ai.py:1387-1397`), the orchestrator reads that file back and includes
the flags in its return tuple. `format_report()` then renders them in a
dedicated section.
that should not be lost in the summary." `_tool_flag()` prints to stderr
*and* appends a JSONL line to `{cache.root}/flags.jsonl`. At the end of
`_run_investigation()`, the orchestrator reads that file back and
includes the flags in its return tuple. `format_report()` then renders
them in a dedicated section.
Severity is `info | concern | critical`. The agent is told to flag
*immediately* on discovery, not save findings for the report — this is in
the tool description at `ai.py:312`.
*immediately* on discovery, not save findings for the report — this is
in the tool description.
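The append-then-read-back flow can be sketched as follows. The three severity values come from this page; the `finding` field name and both function names are hypothetical, and the real `_tool_flag()` also prints to stderr.

```python
import json
import os
import tempfile

SEVERITIES = ("info", "concern", "critical")


def append_flag(flags_path, severity, finding):
    """Append one flag as a single JSON line; returns "ok" like the tool."""
    if severity not in SEVERITIES:
        raise ValueError(f"unknown severity: {severity!r}")
    with open(flags_path, "a", encoding="utf-8") as fh:
        fh.write(json.dumps({"severity": severity, "finding": finding}) + "\n")
    return "ok"


def read_flags(flags_path):
    """Read all flags back, tolerating a missing file and blank lines."""
    if not os.path.exists(flags_path):
        return []
    with open(flags_path, encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]


flags_file = os.path.join(tempfile.mkdtemp(), "flags.jsonl")
append_flag(flags_file, "concern", "hard-coded credentials in config.py")
append_flag(flags_file, "info", "vendored copy of a third-party library")
flags = read_flags(flags_file)
print(flags)
```

Appending one line per flag means a finding is on disk the moment it is raised, so it survives even if the dir loop later hits the budget or runs out of turns.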
---
@ -484,10 +544,11 @@ A cookbook for the kinds of changes that come up most often.
contains your handler and `_DIR_TOOLS` contains your schema after
importing `luminos_lib.ai`.
To make a tool available in synthesis or survey instead of (or in
addition to) dir, pass `scopes=["synthesis"]`, `scopes=["survey"]`, or
`scopes=["dir", "synthesis"]`. Tools whose schema differs by scope (like
`submit_report`) get a separate `register_tool()` call per scope.
To make a tool available in synthesis, survey, or planning instead of
(or in addition to) dir, pass `scopes=["synthesis"]`, `scopes=["survey"]`,
`scopes=["planning"]`, or any combination. Tools whose schema differs by
scope (like `submit_report`) get a separate `register_tool()` call per
scope.
### 9.2 Add a whole new pass
@ -522,8 +583,7 @@ unless you `--fresh`.
### 9.4 Change cache schema
1. Update the required-fields set in `cache.py:write_entry()`
(`cache.py:119-123`)
2. Update `_DIR_TOOLS`'s `write_cache` description in `ai.py:228` so the
2. Update `_DIR_TOOLS`'s `write_cache` description in `ai.py` so the
agent knows what to write
3. Update `_DIR_SYSTEM_PROMPT` in `prompts.py` if the agent needs to know
*how* to populate the new field
@ -532,26 +592,25 @@ unless you `--fresh`.
### 9.5 Add a CLI flag
Edit `luminos.py:88` (`main()`'s argparse setup) to define the flag, then
Edit `luminos.py:main()`'s argparse setup to define the flag, then
plumb it through whatever functions need it. New AI-related flags
typically need to be added to `analyze_directory()`'s signature
(`ai.py:1408`) and then forwarded to `_run_investigation()`.
typically need to be added to `analyze_directory()`'s signature and
then forwarded to `_run_investigation()`.
---
## 10. Token budget and cost
Budget logic is in `_TokenTracker.budget_exceeded()` and is checked at the
top of every dir loop iteration (`ai.py:882`). The budget is **per call**,
not cumulative — see §4.4. The breach handler flushes a partial dir cache
Budget logic is in `_TokenTracker.budget_exceeded()` and is checked at
the top of every dir loop iteration. The budget is **per call**, not
cumulative — see §4.4. The breach handler flushes a partial dir cache
entry so work isn't lost.
Cost reporting happens once at the end of `_run_investigation()`
(`ai.py:1399`), using the cumulative `total_input` and `total_output`
counters multiplied by the constants at `ai.py:43-44`. There is no
running cost display during the investigation today. If you want one,
`_TokenTracker.summary()` already returns the formatted string — just
call it after each dir loop.
Cost reporting happens once at the end of `_run_investigation()`, using
the cumulative `total_input` and `total_output` counters multiplied by
the constants near the top of `ai.py`. There is no running cost display
during the investigation today. If you want one, `_TokenTracker.summary()`
already returns the formatted string — just call it after each dir loop.
---
@ -560,16 +619,20 @@ call it after each dir loop.
| Term | Meaning |
|---|---|
| **base scan** | The non-AI phase: tree, classification, languages, recency, disk usage. Stdlib + coreutils only. |
| **dir loop** | Per-directory agent loop in `_run_dir_loop`. Up to 14 turns. Produces a `dir` cache entry. |
| **dir loop** | Per-directory agent loop in `_run_dir_loop`. Turns allocated by the planning pass (5 shallow / 10 default / 15-20 priority, capped at 25). Produces a `dir` cache entry. |
| **survey pass** | Single short loop before any dir loops, producing a shared description and tool guidance. |
| **planning pass** | Phase 3 pass after the survey, before dir loops. Produces a plan (priority/shallow/skip dirs + turn allocations + order). |
| **synthesis pass** | Final loop that reads `dir` cache entries and produces `(brief, detailed)`. |
| **leaves-first** | Discovery order in `_discover_directories`: deepest paths first, so child summaries exist when parents are investigated. |
| **leaves-first** | Discovery order in `_discover_directories`: deepest paths first, so child summaries exist when parents are investigated. Preserved within planning bands by `_apply_plan`. |
| **investigation** | One end-to-end run, identified by a UUID, persisted under `/tmp/luminos/{uuid}/`. |
| **investigation_id** | The UUID. Stored in `/tmp/luminos/investigations.json` keyed by absolute target path. |
| **cache entry** | A JSON file under `files/` or `dirs/` named by sha256(path). |
| **flag** | An agent finding written to `flags.jsonl` and reported separately. info / concern / critical. |
| **partial entry** | A `dir` cache entry written when the budget tripped before `submit_report`. Marked with `partial: True`. |
| **completeness** | Phase 3 agent self-rated thoroughness (0.0-1.0) from `submit_report`. Feeds `plan_evaluation.json`. |
| **survey signals** | The histogram + samples computed by `filetypes.survey_signals()` during the base scan, fed to the survey prompt. |
| **last_input** | The `input_tokens` count from the most recent API call. The basis for budget checks. NOT the cumulative sum. |
| **CONTEXT_BUDGET** | 70% of 200k = 140k. Trigger threshold for early exit. |
| **`_PROTECTED_DIR_TOOLS`** | Tools the survey is forbidden from filtering out of the dir loop's toolbox. Currently `{submit_report}`. |
| **plan.json** | Serialized planning output, cached so resumed runs skip the planner. |
| **plan_evaluation.json** | Post-investigation quality report comparing plan predictions to outcomes. |