Table of Contents
- Internals
- 1. The two layers
- 2. Base scan walkthrough
- 3. AI pipeline walkthrough
- 4. The dir loop in depth
- 4.1 The message history grows monotonically
- 4.2 Tool dispatch
- 4.3 Pre-loaded context
- 4.4 The token tracker and the budget check
- 4.5 What the loop returns
- 4.6 The streaming API caller
- 5. The cache model
- 5.1 Investigation IDs
- 5.2 What's stored
- 5.3 Confidence support already exists
- 5.4 Why one-file-per-entry instead of JSONL
- 6. Prompts
- 7. Synthesis pass
- 8. Flags
- 9. Where to make common changes
- 9.1 Add a new tool the dir agent can call
- 9.2 Add a whole new pass
- 9.3 Change a prompt
- 9.4 Change cache schema
- 9.5 Add a CLI flag
- 10. Token budget and cost
- 11. Glossary
Internals
A code tour of how Luminos actually works. Read this after Development Guide and Architecture. The goal is that a developer who knows basic Python but has never built an agent loop can finish this page and start making non-trivial changes.
All file:line references are accurate as of the date this page was last
edited — verify with git log or by opening the file before relying on a
specific line number.
1. The two layers
Luminos has a hard internal split:
| Layer | What it does | Imports |
|---|---|---|
| Base scan | Walks the directory, classifies files, counts lines, ranks recency, measures disk usage, prints a report. | stdlib only + GNU coreutils via subprocess. No pip packages. |
| AI pipeline (--ai) | Runs a multi-pass agent investigation via the Claude API on top of the base scan output. | anthropic, tree-sitter, python-magic — all imported lazily. |
The split is enforced by lazy imports. luminos.py:156 is the only place
that imports from luminos_lib.ai, and it sits inside if args.ai:. You
can grep the codebase to verify: nothing in the base scan modules imports
anything from ai.py, ast_parser.py, or prompts.py. This means
python3 luminos.py /target works on a stock Python 3 install with no
packages installed at all.
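The gating pattern is easy to get wrong, so here is a minimal sketch of the shape (the names and the fallback message are illustrative, not the real luminos.py entry point): the third-party import lives inside the --ai branch, so the base scan never touches it.

```python
import argparse

def main(argv=None):
    parser = argparse.ArgumentParser()
    parser.add_argument("target")
    parser.add_argument("--ai", action="store_true")
    args = parser.parse_args(argv)

    report = {"target": args.target}  # stand-in for the base scan

    if args.ai:
        # The AI import lives inside this branch, so a missing package
        # only matters when --ai is actually passed.
        try:
            import anthropic  # noqa: F401
        except ImportError:
            print("AI dependencies missing; skipping AI pipeline")
    return report
```

Running without --ai exercises only stdlib code, which is exactly why the base scan works on a stock interpreter.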
When you change a base-scan module, the question to ask is: does this introduce a top-level import of anything outside stdlib? If yes, you've broken the constraint and the change must be rewritten.
2. Base scan walkthrough
Entry: luminos.py:main() parses args, then calls scan(target, ...) at
luminos.py:45. scan() is a flat sequence — it builds a report dict
by calling helpers from luminos_lib/, one per concern, in order:
scan(target)
build_tree() → report["tree"], report["tree_rendered"]
classify_files() → report["classified_files"]
summarize_categories() → report["file_categories"]
survey_signals() → report["survey_signals"] ← input to AI survey
detect_languages() → report["languages"], report["lines_of_code"]
find_large_files() → report["large_files"]
find_recent_files() → report["recent_files"]
get_disk_usage() → report["disk_usage"]
top_directories() → report["top_directories"]
return report
Each helper is independent. You could delete find_recent_files() and the
report would just be missing that field. The flow is procedural, not
event-driven, and there is no shared state object — everything passes
through the local report dict.
The progress lines you see on stderr ([scan] Counting lines... foo.py)
come from _progress() in luminos.py:23, which returns an on_file
callback that the helpers call as they work. If you add a new helper that
walks files, plumb a progress callback through the same way for
consistency.
After scan() returns, main() either runs the AI pipeline or jumps
straight to format_report() (luminos_lib/report.py) for terminal
output, or json.dumps() for JSON. The AI pipeline always runs after
the base scan because it needs report["survey_signals"] and
report["file_categories"] as inputs.
3. AI pipeline walkthrough
The AI pipeline is what makes Luminos interesting and is also where
almost all the complexity lives. Everything below happens inside
luminos_lib/ai.py (1438 lines as of writing), called from
luminos.py:157 via analyze_directory().
3.1 The orchestrator
analyze_directory() (ai.py:1408) is a thin wrapper that checks
dependencies, gets the API key, builds the Anthropic client, and calls
_run_investigation(). If anything fails it prints a warning and returns
empty strings — the rest of luminos keeps working.
_run_investigation() (ai.py:1286) is the real entry point. Read this
function first if you want to understand the pipeline shape. It does six
things, in order:
- Get/create an investigation ID and cache (ai.py:1289–1294). Investigation IDs let you resume a previous run; see §5 below.
- Discover all directories under the target via _discover_directories() (ai.py:715). Returns them sorted leaves-first — the deepest paths come first. This matters because each dir loop reads its child directories' summaries from cache, so children must be investigated before parents.
- Run the survey pass (ai.py:1300–1334) unless the target is below the size thresholds at ai.py:780–781, in which case _default_survey() returns a synthetic skip.
- Filter out cached directories (ai.py:1336–1349). If you're resuming an investigation, dirs that already have a dir cache entry are skipped — only new ones get a fresh dir loop.
- Run a dir loop per remaining directory (ai.py:1351–1375). This is the heart of the system — see §4.
- Run the synthesis pass (ai.py:1382), reading only dir cache entries to produce (brief, detailed).
It also reads flags.jsonl from disk at the end (ai.py:1387–1397) and
returns (brief, detailed, flags) to analyze_directory().
3.2 The survey pass
_run_survey() (ai.py:1051) is a short, single-purpose loop. It exists
to give the dir loops some shared context about what they're looking at
as a whole before any of them start.
Inputs go into the system prompt (_SURVEY_SYSTEM_PROMPT in
prompts.py):
- survey_signals — extension histogram, file --brief outputs, filename samples (built by filetypes.survey_signals() during the base scan)
- A 2-level tree preview from build_tree(target, max_depth=2)
- The list of tools the dir loop will have available
The survey's only tool is submit_survey (_SURVEY_TOOLS at ai.py:356), and it runs at most 3 turns. The agent must call submit_survey exactly once with six fields:
{
"description": "plain language — what is this target",
"approach": "how the dir loops should investigate it",
"relevant_tools": ["read_file", "parse_structure", ...],
"skip_tools": ["parse_structure", ...], # for non-code targets
"domain_notes": "anything unusual the dir loops should know",
"confidence": 0.0–1.0,
}
The result is a Python dict that gets passed into every dir loop as
survey=.... If the survey fails (API error, ran out of turns), the dir
loops still run but with survey=None — the system degrades gracefully.
3.3 How the survey shapes dir loops
Two things happen with the survey output before each dir loop runs:
Survey block injection. _format_survey_block() (ai.py:803) renders
the survey dict as a labeled text block, which gets .format()-injected
into the dir loop system prompt as {survey_context}. The dir agent sees
the description, approach, domain notes, and which tools it should lean on
or skip.
Tool filtering. _filter_dir_tools() (ai.py:824) returns a copy of
_DIR_TOOLS with anything in skip_tools removed — but only if the
survey's confidence is at or above _SURVEY_CONFIDENCE_THRESHOLD = 0.5
(ai.py:775). Below that threshold the agent gets the full toolbox. The
control-flow tool submit_report is in _PROTECTED_DIR_TOOLS and can
never be filtered out — removing it would break loop termination.
This is the only place in the codebase where the agent's available tools change at runtime. If you add a new tool, decide whether it should be protected.
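The filtering rule above can be sketched in a few lines. This is an illustrative stand-in for _filter_dir_tools, not the real ai.py code: skip_tools is honored only when the survey's confidence clears the threshold, and protected tools are never removed.

```python
# Illustrative sketch of the tool-filtering rule (real code: _filter_dir_tools).
SURVEY_CONFIDENCE_THRESHOLD = 0.5
PROTECTED = {"submit_report"}  # control-flow tools that must survive filtering

def filter_dir_tools(tools, survey):
    # Low confidence (or no survey at all): the agent keeps the full toolbox.
    if not survey or survey.get("confidence", 0.0) < SURVEY_CONFIDENCE_THRESHOLD:
        return list(tools)
    # Protected tools can never be filtered out, whatever the survey says.
    skip = set(survey.get("skip_tools", [])) - PROTECTED
    return [t for t in tools if t["name"] not in skip]
```

Note how a survey that tries to skip submit_report is silently overridden: removing it would break loop termination.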
4. The dir loop in depth
_run_dir_loop() is at ai.py:845. This is a hand-written agent loop and
you should expect to read it several times before it clicks. The shape is:
build system prompt (with survey context, child summaries, dir contents)
build initial user message ("investigate this directory now")
reset per-loop token counter
for turn in range(max_turns): # max_turns = 14
if budget exceeded: flush partial cache and break
call API (streaming)
record token usage
print text blocks and tool decisions to stderr
append assistant response to message history
if no tool calls: nudge agent to call submit_report; continue
execute each tool call, build tool_result blocks
append tool_results to message history as user message
if submit_report was called: break
return summary
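The shape above can also be read as a runnable skeleton with the API call stubbed out. Every name here is illustrative rather than the real ai.py code, and the tool handling is reduced to the minimum needed to show the turn structure:

```python
# Minimal agent-loop skeleton (illustrative; the real loop is _run_dir_loop).
def run_dir_loop(call_api, max_turns=14, context_budget=140_000):
    messages = [{"role": "user", "content": "investigate this directory now"}]
    last_input = 0
    summary = None
    for _ in range(max_turns):
        if last_input > context_budget:
            break  # budget breach: the real code flushes a partial cache entry here
        response = call_api(messages)  # a streaming call in the real code
        last_input = response["input_tokens"]
        messages.append({"role": "assistant", "content": response["content"]})
        tool_calls = response.get("tool_calls", [])
        if not tool_calls:
            # No tool call: nudge the agent toward submit_report and retry.
            messages.append({"role": "user", "content": "call submit_report"})
            continue
        results = []
        for call in tool_calls:
            if call["name"] == "submit_report":
                summary = call["input"]["summary"]
            results.append({"type": "tool_result", "content": "ok"})
        messages.append({"role": "user", "content": results})
        if summary is not None:
            break
    return summary
```

The history-append pattern here is also why input tokens grow every turn (see §4.1): messages is only ever appended to.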
A few non-obvious mechanics:
4.1 The message history grows monotonically
Every turn appends an assistant message (the model's response) and a user
message (the tool results). Nothing is ever evicted. This means
input_tokens on each successive API call grows roughly linearly — the
model is re-sent the full conversation every turn. On code targets we see
~1.5–2k tokens added per turn. At max_turns=14 this stays under the
budget; raising the cap would expose this. See #51.
4.2 Tool dispatch
Tools are not class methods. They're plain functions in ai.py:486–642,
registered into _TOOL_DISPATCH at ai.py:645. _execute_tool()
(ai.py:659) is a 16-line function that looks up the handler by name,
calls it, logs the turn to investigation.log, and returns the result
string. Control-flow and narration tools get special treatment in the loop body:
- submit_report is recognized in the tool-use scan at ai.py:977, sets done = True, and doesn't go through dispatch
- think, checkpoint, and flag are in dispatch, but they have side effects that just print to stderr or append to flags.jsonl — the return value is always "ok"
When you add a tool: write the function, add it to _TOOL_DISPATCH, add
its schema to _DIR_TOOLS. That's it.
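The flat dispatch pattern looks like this in miniature (illustrative names; the real table is _TOOL_DISPATCH and the lookup is _execute_tool):

```python
# Tools are plain functions, registered by name in a dict.
def tool_read_file(args):
    return f"contents of {args['path']}"  # real code reads and truncates the file

def tool_flag(args):
    return "ok"  # side-effect-only tools always return "ok"

TOOL_DISPATCH = {
    "read_file": tool_read_file,
    "flag": tool_flag,
}

def execute_tool(name, args):
    # Look up the handler by name; unknown names become an error string
    # rather than an exception, so the agent can recover.
    handler = TOOL_DISPATCH.get(name)
    if handler is None:
        return f"unknown tool: {name}"
    return handler(args)
```

No classes, no registration decorators: adding a tool is a function definition plus one dict entry.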
4.3 Pre-loaded context
Before the loop starts, two helpers prepare static context that goes into the system prompt:
- _build_dir_context() (ai.py:736) — ls-style listing of the dir with sizes and MIME types via python-magic. The agent sees this before it makes any tool calls, so it doesn't waste a turn just listing the directory.
- _get_child_summaries() (ai.py:758) — looks up each subdirectory in the cache and pulls its summary field. This is how leaves-first ordering pays off: by the time the loop runs on src/, all of src/auth/, src/db/, src/middleware/ already have cached summaries that get injected as {child_summaries}.
If _get_child_summaries() returns nothing, the prompt says
(none — this is a leaf directory).
4.4 The token tracker and the budget check
_TokenTracker (ai.py:94) is a tiny accumulator with one important
subtlety, captured in #44:
Cumulative input tokens are NOT a meaningful proxy for context size: each turn's input_tokens already includes the full message history, so summing across turns double-counts everything. Use last_input for budget decisions, totals for billing.
So budget_exceeded() (ai.py:135) compares last_input (the most
recent call's input_tokens) to CONTEXT_BUDGET (ai.py:40), which is
70% of 200k. This is checked at the top of each loop iteration, before
the next API call.
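A sketch of that tracker behavior, mirroring the description above (illustrative class, not the real ai.py code):

```python
CONTEXT_BUDGET = int(200_000 * 0.7)  # 140k, matching the described constant

class TokenTracker:
    def __init__(self):
        self.total_input = 0   # cumulative: double-counts history, billing only
        self.total_output = 0
        self.last_input = 0    # actual context size of the most recent call

    def record(self, input_tokens, output_tokens):
        self.total_input += input_tokens
        self.total_output += output_tokens
        self.last_input = input_tokens

    def budget_exceeded(self):
        # Budget decisions use last_input, never the cumulative sum.
        return self.last_input >= CONTEXT_BUDGET
```

Notice that total_input can blow far past the budget long before last_input does; that gap is exactly the double-counting the note above warns about.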
When the budget check trips, the loop:
- Prints a Context budget reached warning to stderr
- If no dir cache entry exists yet, builds a partial one from any file cache entries the agent already wrote (ai.py:889–937), marks it with partial: True and partial_reason, and writes it
- Breaks out of the loop
This means a budget breach doesn't lose work — anything the agent already cached survives, and the synthesis pass will see a partial dir summary rather than nothing.
4.5 What the loop returns
_run_dir_loop() returns the summary string from submit_report (or
the partial summary if the budget tripped). _run_investigation() then
writes a normal dir cache entry from this summary at ai.py:1363–1375
— unless the dir loop already wrote one itself via the partial-flush
path, in which case the cache.has_entry("dir", dir_path) check skips it.
4.6 The streaming API caller
_call_api_streaming() (ai.py:681) is a thin wrapper around
client.messages.stream(). It currently doesn't print tokens as they
arrive — it iterates the stream, drops everything, then pulls the final
message via stream.get_final_message(). Streaming was adopted to enable
real-time printing of tool decisions, but today that printing happens
only after the full response arrives. There's room here to add live
progress printing if you want it.
5. The cache model
Cache lives at /tmp/luminos/{investigation_id}/. Code is
luminos_lib/cache.py (201 lines).
5.1 Investigation IDs
/tmp/luminos/investigations.json maps absolute target paths to UUIDs.
_get_investigation_id() (cache.py:40) looks up the target and either
returns the existing UUID (resume) or creates a new one (fresh run).
--fresh forces a new UUID even if one exists.
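The lookup-or-create logic can be sketched like this (illustrative stand-in for _get_investigation_id; the map path is a parameter here rather than the hardcoded /tmp location):

```python
import json, os, tempfile, uuid

def get_investigation_id(map_path, target, fresh=False):
    # Load the path -> UUID map, tolerating a missing file on first run.
    ids = {}
    if os.path.exists(map_path):
        with open(map_path) as f:
            ids = json.load(f)
    target = os.path.abspath(target)
    # fresh=True mirrors --fresh: mint a new UUID even if one exists.
    if fresh or target not in ids:
        ids[target] = str(uuid.uuid4())
        with open(map_path, "w") as f:
            json.dump(ids, f)
    return ids[target]
```

Resume semantics fall out of the map: same target, same UUID, same cache directory.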
5.2 What's stored
Inside /tmp/luminos/{uuid}/:
meta.json investigation metadata (model, start time, dir count)
files/<sha256>.json one file per cached file entry
dirs/<sha256>.json one file per cached directory entry
flags.jsonl JSONL — appended on every flag tool call
investigation.log JSONL — appended on every tool call
File and dir cache entries are NOT in JSONL — they are one
sha256-keyed JSON file per entry. The sha256 is over the path string
(cache.py:13). Only flags.jsonl and investigation.log use JSONL.
Required fields are validated in write_entry() (cache.py:115):
file: {path, relative_path, size_bytes, category, summary, cached_at}
dir: {path, relative_path, child_count, dominant_category, summary, cached_at}
The validator also rejects entries containing content, contents, or
raw fields — the agent is explicitly forbidden from caching raw file
contents, summaries only. If you change the schema, update the required
set in write_entry() and update the test in tests/test_cache.py.
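The keying and validation rules above can be condensed into a sketch (an illustrative stand-in for cache.py, with only the "file" kind shown):

```python
import hashlib, json, os, tempfile

REQUIRED = {"file": {"path", "relative_path", "size_bytes",
                     "category", "summary", "cached_at"}}
FORBIDDEN = {"content", "contents", "raw"}  # no raw file contents in cache

def entry_path(root, kind, path):
    # One JSON file per entry, named by sha256 of the path string.
    key = hashlib.sha256(path.encode()).hexdigest()
    return os.path.join(root, kind + "s", key + ".json")

def write_entry(root, kind, entry):
    missing = REQUIRED[kind] - entry.keys()
    if missing or FORBIDDEN & entry.keys():
        raise ValueError("invalid cache entry")
    dest = entry_path(root, kind, entry["path"])
    os.makedirs(os.path.dirname(dest), exist_ok=True)
    with open(dest, "w") as f:
        json.dump(entry, f)
    return dest
```

Because the filename is a pure function of the path, has_entry-style checks reduce to a single os.path.exists() call, which is the point of §5.4.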
5.3 Confidence support already exists
write_entry() validates an optional confidence field
(cache.py:129–134) and a confidence_reason string.
low_confidence_entries(threshold=0.7) (cache.py:191) returns all
entries below a threshold, sorted ascending. The agent doesn't currently
set these fields in any prompt — that lights up when Phase 1 work
actually wires the prompts.
5.4 Why one-file-per-entry instead of JSONL
Random access by path. The dir loop calls cache.has_entry("dir", path)
once per directory during the _get_child_summaries() lookup; with
sha256-keyed files this is an os.path.exists() call. With JSONL it
would be a full file scan.
6. Prompts
All prompt templates live in luminos_lib/prompts.py. There are three:
| Constant | Used by | What it carries |
|---|---|---|
| _SURVEY_SYSTEM_PROMPT | _run_survey | survey_signals, tree_preview, available_tools |
| _DIR_SYSTEM_PROMPT | _run_dir_loop | dir_path, dir_rel, max_turns, context, child_summaries, survey_context |
| _SYNTHESIS_SYSTEM_PROMPT | _run_synthesis | target, summaries_text |
Each is a Python f-string-style template with {name} placeholders. The
caller assembles values and passes them to .format(...) immediately
before the API call. There is no template engine — it's plain string
formatting.
When you change a prompt, the only thing you need to keep in sync is the
set of placeholders. If you add {foo} to the template, the caller must
provide foo=.... If you remove a placeholder from the template but
leave the kwarg in the caller, .format() silently ignores it. If you
add a placeholder and forget to provide it, .format() raises KeyError
at runtime.
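Both behaviors are plain str.format semantics and can be demonstrated directly (the template here is a made-up example, not one of the real prompts):

```python
template = "Investigate {dir_path} in at most {max_turns} turns."

# Extra kwargs are silently ignored: no error, no trace.
ok = template.format(dir_path="/src", max_turns=14, unused="ignored")

# A placeholder with no matching kwarg raises KeyError at runtime.
try:
    template.format(dir_path="/src")  # max_turns missing
    failed = None
except KeyError as e:
    failed = str(e)
```

The silent-ignore case is the dangerous one: a stale kwarg left behind after a prompt edit produces no signal at all.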
prompts.py has no logic and no tests — it's listed in
Development Guide as exempt from unit testing for
that reason.
7. Synthesis pass
_run_synthesis() (ai.py:1157) is structurally similar to the dir loop
but much simpler:
- Reads all dir cache entries via cache.read_all_entries("dir")
- Renders them into a summaries_text block (one section per dir)
- Stuffs that into _SYNTHESIS_SYSTEM_PROMPT
- Loops up to max_turns=5 waiting for submit_report with brief and detailed fields
Tools available: read_cache, list_cache, flag, submit_report
(_SYNTHESIS_TOOLS at ai.py:401). The synthesis agent can pull
specific cache entries back if it needs to drill in, but it cannot read
files directly — synthesis is meant to operate on summaries, not raw
contents.
There's a fallback: if synthesis runs out of turns without calling
submit_report, _synthesize_from_cache() (ai.py:1262) builds a
mechanical brief+detailed from the cached dir summaries with no AI call.
This guarantees you always get something in the report.
8. Flags
The flag tool is the agent's pressure valve for "I noticed something
that should not be lost in the summary." _tool_flag() (ai.py:629)
prints to stderr and appends a JSONL line to
{cache.root}/flags.jsonl. At the end of _run_investigation()
(ai.py:1387–1397), the orchestrator reads that file back and includes
the flags in its return tuple. format_report() then renders them in a
dedicated section.
Severity is info | concern | critical. The agent is told to flag
immediately on discovery, not save findings for the report — this is in
the tool description at ai.py:312.
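The append-only JSONL mechanics can be sketched as follows (illustrative stand-in for _tool_flag and the read-back at the end of _run_investigation):

```python
import json, os, sys, tempfile

def tool_flag(args, flags_path):
    # Print immediately so the finding is visible during the run...
    print(f"[flag:{args['severity']}] {args['message']}", file=sys.stderr)
    # ...and append one JSON object per line so nothing is lost on crash.
    with open(flags_path, "a") as f:
        f.write(json.dumps({"severity": args["severity"],
                            "message": args["message"]}) + "\n")
    return "ok"

def read_flags(flags_path):
    if not os.path.exists(flags_path):
        return []
    with open(flags_path) as f:
        return [json.loads(line) for line in f if line.strip()]
```

Append mode is what makes "flag immediately on discovery" safe: a flag written on turn 3 survives even if the loop dies on turn 9.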
9. Where to make common changes
A cookbook for the kinds of changes that come up most often.
9.1 Add a new tool the dir agent can call
- Write the implementation: _tool_<name>(args, target, cache) somewhere in the tool implementations section of ai.py (~lines 486–642). Return a string.
- Add it to _TOOL_DISPATCH at ai.py:645.
- Add its schema to _DIR_TOOLS at ai.py:151. The schema must follow Anthropic tool-use shape: name, description, input_schema.
- Decide whether the survey should be able to filter it out (default: yes — leave it out of _PROTECTED_DIR_TOOLS) or whether it's control-flow critical (add to _PROTECTED_DIR_TOOLS).
- Update _DIR_SYSTEM_PROMPT in prompts.py if the agent needs instructions on when to use the new tool.
- There is no unit test for tool registration today (ai.py is exempt). If you want coverage, the test would mock client.messages.stream and assert that the dispatch table contains your tool.
9.2 Add a whole new pass
(Phase 3's planning pass is the immediate example.) The pattern:
- Define a new system prompt constant in prompts.py
- Define a new tool list in ai.py for the pass-specific submit tool
- Write _run_<pass>() in ai.py, modeled on _run_survey() — single submit tool, low max_turns, returns a dict or None on failure
- Wire it into _run_investigation() between existing passes
- Pass its output downstream by adding a kwarg to _run_dir_loop() (or wherever it's needed) and threading it through
The survey pass is the cleanest reference implementation because it's short and self-contained.
9.3 Change a prompt
Edit the constant in prompts.py. If you add a {placeholder}, also
update the corresponding .format(...) call in ai.py. Search the
codebase for the constant name to find the call site:
grep -n SURVEY_SYSTEM_PROMPT luminos_lib/ai.py
There is no prompt versioning today. Investigation cache entries don't
record which prompt version produced them, so re-running with a new
prompt against an existing investigation will mix old and new outputs
unless you --fresh.
9.4 Change cache schema
- Update the required-fields set in cache.py:write_entry() (cache.py:119–123)
- Update _DIR_TOOLS's write_cache description at ai.py:228 so the agent knows what to write
- Update _DIR_SYSTEM_PROMPT in prompts.py if the agent needs to know how to populate the new field
- Update tests/test_cache.py — schema validation is the part of the cache that is covered
9.5 Add a CLI flag
Edit luminos.py:88 (main()'s argparse setup) to define the flag, then
plumb it through whatever functions need it. New AI-related flags
typically need to be added to analyze_directory()'s signature
(ai.py:1408) and then forwarded to _run_investigation().
10. Token budget and cost
Budget logic is in _TokenTracker.budget_exceeded() and is checked at the
top of every dir loop iteration (ai.py:882). The budget is per call,
not cumulative — see §4.4. The breach handler flushes a partial dir cache
entry so work isn't lost.
Cost reporting happens once at the end of _run_investigation()
(ai.py:1399), using the cumulative total_input and total_output
counters multiplied by the constants at ai.py:43–44. There is no
running cost display during the investigation today. If you want one,
_TokenTracker.summary() already returns the formatted string — just
call it after each dir loop.
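The arithmetic itself is just totals times per-token prices. A sketch, with placeholder prices standing in for the real constants at ai.py:43–44 (the actual rates depend on the model and are assumptions here):

```python
# Placeholder $/million-token prices; the real values live at ai.py:43-44.
INPUT_COST_PER_MTOK = 3.00
OUTPUT_COST_PER_MTOK = 15.00

def estimate_cost(total_input, total_output):
    # Uses the cumulative billing counters, not last_input (see section 4.4).
    return (total_input / 1_000_000 * INPUT_COST_PER_MTOK
            + total_output / 1_000_000 * OUTPUT_COST_PER_MTOK)
```

Because it only needs the two cumulative counters, the same function could back a running cost display after each dir loop.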
11. Glossary
| Term | Meaning |
|---|---|
| base scan | The non-AI phase: tree, classification, languages, recency, disk usage. Stdlib + coreutils only. |
| dir loop | Per-directory agent loop in _run_dir_loop. Up to 14 turns. Produces a dir cache entry. |
| survey pass | Single short loop before any dir loops, producing a shared description and tool guidance. |
| synthesis pass | Final loop that reads dir cache entries and produces (brief, detailed). |
| leaves-first | Discovery order in _discover_directories: deepest paths first, so child summaries exist when parents are investigated. |
| investigation | One end-to-end run, identified by a UUID, persisted under /tmp/luminos/{uuid}/. |
| investigation_id | The UUID. Stored in /tmp/luminos/investigations.json keyed by absolute target path. |
| cache entry | A JSON file under files/ or dirs/ named by sha256(path). |
| flag | An agent finding written to flags.jsonl and reported separately. info / concern / critical. |
| partial entry | A dir cache entry written when the budget tripped before submit_report. Marked with partial: True. |
| survey signals | The histogram + samples computed by filetypes.survey_signals() during the base scan, fed to the survey prompt. |
| last_input | The input_tokens count from the most recent API call. The basis for budget checks. NOT the cumulative sum. |
| CONTEXT_BUDGET | 70% of 200k = 140k. Trigger threshold for early exit. |
| _PROTECTED_DIR_TOOLS | Tools the survey is forbidden from filtering out of the dir loop's toolbox. Currently {submit_report}. |