docs: add Planning Pass design sketch, update Architecture and Internals for Phase 3
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
parent
31a052eca0
commit
3fcf8c221d
4 changed files with 322 additions and 41 deletions
|
|
@ -8,8 +8,9 @@
|
||||||
|
|
||||||
Luminos is an agentic Claude investigation tool. Every invocation runs the
|
Luminos is an agentic Claude investigation tool. Every invocation runs the
|
||||||
full pipeline: a base scan first to feed the agent its initial picture, then
|
full pipeline: a base scan first to feed the agent its initial picture, then
|
||||||
a survey pass, then per-directory dir loops, then a final synthesis pass.
|
a survey pass, a planning pass, per-directory dir loops with dynamic turn
|
||||||
The base scan is not a standalone product, it is the agent's input.
|
allocation, then a final synthesis pass. The base scan is not a standalone
|
||||||
|
product, it is the agent's input.
|
||||||
|
|
||||||
**Entry point:** `luminos.py` — argument parsing, scan orchestration, AI
|
**Entry point:** `luminos.py` — argument parsing, scan orchestration, AI
|
||||||
pipeline kickoff, output routing.
|
pipeline kickoff, output routing.
|
||||||
|
|
@ -69,7 +70,17 @@ analyze_directory(report, target)
|
||||||
│
|
│
|
||||||
├── _filter_dir_tools(survey) remove skip_tools (if confidence ≥ 0.5)
|
├── _filter_dir_tools(survey) remove skip_tools (if confidence ≥ 0.5)
|
||||||
│
|
│
|
||||||
├── per-directory loop (each uncached dir, up to max_turns=14)
|
├── _run_planning() single loop, max 3 turns
|
||||||
|
│ inputs: survey output + full tree + file signals
|
||||||
|
│ Tools: submit_plan
|
||||||
|
│ output: plan dict (priority/shallow/skip dirs,
|
||||||
|
│ turn allocations, investigation order)
|
||||||
|
│ (skipped on tiny targets or loaded from plan.json
|
||||||
|
│ on resumed runs)
|
||||||
|
│
|
||||||
|
├── _apply_plan() sort dirs into bands, build turn map
|
||||||
|
│
|
||||||
|
├── per-directory loop (ordered by plan, dynamic max_turns)
|
||||||
│ _build_dir_context() list files + sizes + MIME
|
│ _build_dir_context() list files + sizes + MIME
|
||||||
│ _get_child_summaries() read cached child summaries
|
│ _get_child_summaries() read cached child summaries
|
||||||
│ _format_survey_block() inject survey context into prompt
|
│ _format_survey_block() inject survey context into prompt
|
||||||
|
|
@ -78,7 +89,9 @@ analyze_directory(report, target)
|
||||||
│ cache entry on budget breach
|
│ cache entry on budget breach
|
||||||
│ Tools: read_file, list_directory, run_command,
|
│ Tools: read_file, list_directory, run_command,
|
||||||
│ parse_structure, write_cache, think, checkpoint,
|
│ parse_structure, write_cache, think, checkpoint,
|
||||||
│ flag, submit_report
|
│ flag, submit_report (with completeness)
|
||||||
|
│
|
||||||
|
├── _write_plan_evaluation() plan_evaluation.json quality metrics
|
||||||
│
|
│
|
||||||
├── _run_synthesis() single loop, max 5 turns
|
├── _run_synthesis() single loop, max 5 turns
|
||||||
│ reads all "dir" cache entries
|
│ reads all "dir" cache entries
|
||||||
|
|
@ -104,6 +117,8 @@ Layout:
|
||||||
|
|
||||||
```
|
```
|
||||||
meta.json investigation metadata
|
meta.json investigation metadata
|
||||||
|
plan.json planning pass output (cached for resumed runs)
|
||||||
|
plan_evaluation.json quality metrics: plan predictions vs outcomes
|
||||||
files/<sha256>.json one JSON file per cached file entry
|
files/<sha256>.json one JSON file per cached file entry
|
||||||
dirs/<sha256>.json one JSON file per cached directory entry
|
dirs/<sha256>.json one JSON file per cached directory entry
|
||||||
flags.jsonl JSONL — appended on every flag tool call
|
flags.jsonl JSONL — appended on every flag tool call
|
||||||
|
|
@ -170,19 +185,18 @@ the *latest* per-call `input_tokens` reading (the actual size of the
|
||||||
context window in use), not the cumulative sum across turns. Early
|
context window in use), not the cumulative sum across turns. Early
|
||||||
exit flushes partial cache on budget breach. See #44.
|
exit flushes partial cache on budget breach. See #44.
|
||||||
|
|
||||||
**Per-loop turn cap.** Each dir loop runs for at most `max_turns = 14`
|
**Per-loop turn cap.** The planning pass assigns each directory a turn
|
||||||
turns. This is a sanity bound separate from the context budget — even
|
budget: priority dirs get 15-20 (capped at 25), shallow dirs get 5,
|
||||||
on small targets the agent should produce a `submit_report` long
|
default dirs get 10. This replaced the old fixed `max_turns=14`. The
|
||||||
before exhausting 14 turns. The cap exists to prevent runaway loops
|
cap exists to prevent runaway loops when the agent gets stuck. The
|
||||||
when the agent gets stuck (e.g. repeatedly retrying a failing tool
|
`plan_evaluation.json` quality report tracks turns used vs allocated
|
||||||
call). If we observe legitimate investigations consistently hitting
|
per directory. See [Planning Pass](PlanningPass) for the full design.
|
||||||
14, raise the cap; do not raise it speculatively.
|
|
||||||
|
|
||||||
**Per-loop message history growth.** Tool results are appended to the
|
**Per-loop message history growth.** Tool results are appended to the
|
||||||
message history and never evicted, so per-turn `input_tokens` grows
|
message history and never evicted, so per-turn `input_tokens` grows
|
||||||
roughly linearly across a loop (~1.5–2k per turn observed on
|
roughly linearly across a loop (~1.5-2k per turn observed on
|
||||||
codebase targets). At the current `max_turns=14` cap this stays well
|
codebase targets). At the current caps (max 25 turns for priority
|
||||||
under 200k. Raising `max_turns` significantly (e.g. via Phase 3
|
dirs) this stays under 200k. Raising caps significantly would
|
||||||
dynamic turn allocation) would expose this — see #51.
|
expose this further. See #51.
|
||||||
|
|
||||||
Pricing tracked and reported at end of each run.
|
Pricing tracked and reported at end of each run.
|
||||||
|
|
|
||||||
7
Home.md
7
Home.md
|
|
@ -10,9 +10,9 @@ runs first to feed the agent its initial picture of the target.
|
||||||
|
|
||||||
## Current State
|
## Current State
|
||||||
|
|
||||||
- **Phase:** Active development — core pipeline stable, scaling and domain intelligence planned
|
- **Phase:** Active development — Phases 1-3 complete. Phase 3 added planning pass with dynamic turn allocation and quality instrumentation.
|
||||||
- **Last worked on:** 2026-04-06
|
- **Last worked on:** 2026-04-12
|
||||||
- **Last commit:** merge: add -x/--exclude flag for directory exclusion
|
- **Last commit:** feat(ai): Phase 3 investigation planning (#75)
|
||||||
- **Blocking:** None
|
- **Blocking:** None
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
@ -23,6 +23,7 @@ runs first to feed the agent its initial picture of the target.
|
||||||
|---|---|
|
|---|---|
|
||||||
| [Architecture](Architecture) | Module breakdown, data flow, AI pipeline |
|
| [Architecture](Architecture) | Module breakdown, data flow, AI pipeline |
|
||||||
| [Internals](Internals) | Code-level tour: dir loop, cache, prompts, where to make changes |
|
| [Internals](Internals) | Code-level tour: dir loop, cache, prompts, where to make changes |
|
||||||
|
| [Planning Pass](PlanningPass) | Phase 3 design sketch: dynamic turn allocation, quality metrics |
|
||||||
| [Development Guide](DevelopmentGuide) | Setup, git workflow, testing, commands |
|
| [Development Guide](DevelopmentGuide) | Setup, git workflow, testing, commands |
|
||||||
| [Roadmap](Roadmap) | Phase status — pointer to PLAN.md and open issues |
|
| [Roadmap](Roadmap) | Phase status — pointer to PLAN.md and open issues |
|
||||||
| [Session Retrospectives](SessionRetrospectives) | Full session history |
|
| [Session Retrospectives](SessionRetrospectives) | Full session history |
|
||||||
|
|
|
||||||
40
Internals.md
40
Internals.md
|
|
@ -313,32 +313,26 @@ is the entire payoff of leaves-first ordering.
|
||||||
|
|
||||||
The trick: those subdirectory summaries only exist if the children
|
The trick: those subdirectory summaries only exist if the children
|
||||||
were investigated *first*. If `src/` runs before `src/auth/`, the
|
were investigated *first*. If `src/` runs before `src/auth/`, the
|
||||||
cache lookup at `ai.py:825` returns nothing. The function falls
|
cache lookup returns nothing.
|
||||||
through to its default at `ai.py:832` and returns the string
|
|
||||||
`(none — this is a leaf directory)`. The parent's system prompt
|
|
||||||
silently loses all of its child context, and the agent has no way to
|
|
||||||
know — the placeholder claims the dir is a leaf, which is a lie when
|
|
||||||
the children just haven't been investigated yet. The dir summary
|
|
||||||
degrades and the synthesis pass inherits the degradation.
|
|
||||||
|
|
||||||
**If you change the investigation order**, you have to do one of:
|
**Phase 3 addressed this contract in two ways:**
|
||||||
|
|
||||||
1. **Preserve the leaf-first invariant within whatever new order you
|
1. **Band-sorted ordering preserves leaf-first within priority bands.**
|
||||||
introduce.** A "priority-first" order can still process directories
|
`_apply_plan()` groups directories into priority/default/shallow
|
||||||
leaves-first within each priority band, so children always run
|
bands but keeps the leaf-first sort within each band. So children
|
||||||
before parents.
|
always run before their parents, even in "priority-first" mode.
|
||||||
2. **Explicitly handle the missing-child-summaries case in the
|
|
||||||
prompt.** Replace the lie ("leaf directory") with the truth
|
|
||||||
("children not yet investigated") so the agent at least knows what
|
|
||||||
it doesn't have, and accept that some dirs will run with degraded
|
|
||||||
context.
|
|
||||||
|
|
||||||
Phase 3's planning pass introduces the temptation to investigate
|
2. **The placeholder was fixed.** `_get_child_summaries()` now
|
||||||
priority dirs first. Both alternatives above are open. Whichever is
|
distinguishes actual leaf directories ("this is a leaf directory")
|
||||||
chosen, this contract has to be addressed *explicitly* — the test
|
from parents whose children haven't been investigated yet ("child
|
||||||
class `TestDiscoverDirectories` (in `tests/test_ai_pure.py`) pins the
|
directories exist but have not been investigated yet"). The old
|
||||||
current ordering, so any change will be loud, but the *reason* the
|
placeholder claimed every empty-cache case was a leaf, which was a
|
||||||
ordering matters lives here.
|
lie when children simply hadn't been processed yet.
|
||||||
|
|
||||||
|
The test class `TestDiscoverDirectories` (in `tests/test_ai_pure.py`)
|
||||||
|
pins the base leaf-first ordering. `TestGetChildSummaries` pins the
|
||||||
|
updated placeholder behavior. See [Planning Pass](PlanningPass) for
|
||||||
|
the full design.
|
||||||
|
|
||||||
---
|
---
|
||||||
|
|
||||||
|
|
|
||||||
272
PlanningPass.md
Normal file
272
PlanningPass.md
Normal file
|
|
@ -0,0 +1,272 @@
|
||||||
|
# Planning Pass Design Sketch
|
||||||
|
|
||||||
|
The planning pass is Phase 3 of the Luminos investigation pipeline. It
|
||||||
|
runs after the survey and before the per-directory dir loops, deciding
|
||||||
|
where to invest investigative depth across the directory tree.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Problem
|
||||||
|
|
||||||
|
Before Phase 3, every directory received the same fixed allocation:
|
||||||
|
`max_turns=14`. A two-file docs directory got the same budget as a
|
||||||
|
fifty-file core source directory. This wasted turns on trivial dirs and
|
||||||
|
under-invested in complex ones.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Solution: Plan Before You Investigate
|
||||||
|
|
||||||
|
A short Claude loop (the "planning pass", at most 3 turns) examines cheap signals
|
||||||
|
(survey output, full directory tree, file statistics) and produces a
|
||||||
|
structured plan that the orchestrator uses to allocate resources.
|
||||||
|
|
||||||
|
```
survey pass
    |  survey dict
    v
planning pass   <-- NEW
    |  plan dict (priority/shallow/skip dirs, turn allocations)
    v
dir loop (per directory, ordered by plan)
    |  cached dir entries
    v
synthesis pass
```
|
||||||
|
|
||||||
|
The planning pass does not read files or explore the filesystem. It is
|
||||||
|
a "strategy from the map" pass: it looks at structure and makes
|
||||||
|
judgment calls about where depth will pay off.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Plan Schema
|
||||||
|
|
||||||
|
The planning agent produces a plan via the `submit_plan` tool:
|
||||||
|
|
||||||
|
```python
# Schema notation: values are types, not literal data.
{
    "priority_dirs": [
        # suggested_turns above the 25-turn ceiling are capped by the orchestrator
        {"path": str, "reason": str, "suggested_turns": int}
    ],
    "shallow_dirs": [
        {"path": str, "reason": str}
    ],
    "skip_dirs": [
        {"path": str, "reason": str}
    ],
    # one of the two string values
    "investigation_order": "leaf-first" | "priority-first",
    "notes": str,
}
```
|
||||||
|
|
||||||
|
Directories not mentioned in any tier receive a default allocation
|
||||||
|
(currently 10 turns). The planner does not need to list every
|
||||||
|
directory; it focuses on cases where the default would clearly be
|
||||||
|
wrong.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Turn Allocation
|
||||||
|
|
||||||
|
| Tier | Turns | When to use |
|---|---|---|
| **priority** | 15-20 (capped at 25) | Complex, central, or important dirs: many source files, core logic, schemas, migrations |
| **default** | 10 | Unlisted dirs; reasonable for most directories |
| **shallow** | 5 | Simple, peripheral, or predictable: few files, test fixtures, static assets, docs-only |
| **skip** | 0 (excluded) | Build output, dependency caches, vendored code, generated artifacts |
|
||||||
|
|
||||||
|
The global turn budget is `base_turns_per_dir * dir_count` (10 per
|
||||||
|
dir). The planner's allocations should roughly respect this budget.
|
||||||
|
Allocations above the ceiling (25 turns) are capped by the
|
||||||
|
orchestrator.
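
A minimal sketch of how the tier table could translate into a per-directory turn map. This is not the actual `_apply_plan()` code: the function name `build_turn_map` and the constant names are hypothetical. Only the numbers (10, 5, 25) come from the table above.

```python
# Hypothetical constant names; the values are taken from the tier table.
DEFAULT_TURNS = 10   # unlisted directories
SHALLOW_TURNS = 5    # shallow tier
TURN_CEILING = 25    # orchestrator-side cap on priority allocations


def build_turn_map(plan, dirs):
    """Map each directory path to a turn budget based on its tier in the plan."""
    priority = {d["path"]: d["suggested_turns"] for d in plan.get("priority_dirs", [])}
    shallow = {d["path"] for d in plan.get("shallow_dirs", [])}
    skipped = {d["path"] for d in plan.get("skip_dirs", [])}

    turn_map = {}
    for path in dirs:
        if path in skipped:
            continue  # skip tier: excluded from the investigation entirely
        if path in priority:
            # Cap whatever the planner suggested at the ceiling.
            turn_map[path] = min(priority[path], TURN_CEILING)
        elif path in shallow:
            turn_map[path] = SHALLOW_TURNS
        else:
            turn_map[path] = DEFAULT_TURNS
    return turn_map
```

With this sketch, a priority directory whose plan suggests 40 turns would end up with 25, and a directory the planner never mentions would get 10.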
|
||||||
|
|
||||||
|
### Why no mid-loop borrowing (yet)
|
||||||
|
|
||||||
|
PLAN.md envisions a global budget with mid-loop turn borrowing (an
|
||||||
|
agent that needs more turns can "borrow" from the remaining budget).
|
||||||
|
This requires inter-loop communication that does not exist today. The
|
||||||
|
v1 implementation uses simple per-directory allocation with no
|
||||||
|
borrowing. If the quality instrumentation shows that priority dirs
|
||||||
|
consistently exhaust their allocation while shallow dirs finish early,
|
||||||
|
borrowing becomes worth building.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Investigation Order
|
||||||
|
|
||||||
|
Two strategies are available:
|
||||||
|
|
||||||
|
**leaf-first** (default): the existing order from `_discover_directories()`.
|
||||||
|
Deepest directories first, parents last. Ensures child summaries are
|
||||||
|
always cached before parent investigation begins.
|
||||||
|
|
||||||
|
**priority-first**: priority directories before shallow/default, but
|
||||||
|
leaf-first *within each band*. This preserves the child-summaries
|
||||||
|
invariant while letting high-value subtrees inform the rest of the
|
||||||
|
investigation.
|
||||||
|
|
||||||
|
Both strategies preserve the leaf-first contract documented in
|
||||||
|
[Internals](Internals) section 4.7. The `_apply_plan()` function sorts
|
||||||
|
directories into bands without breaking the within-band leaf ordering.
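
One way to get band ordering without disturbing the within-band order is a stable sort on band rank. The sketch below assumes the input list is already leaf-first and that bands run priority, then default, then shallow. The names `BAND_RANK` and `order_dirs` are hypothetical, and the real `_apply_plan()` may be structured differently.

```python
# Assumed band order: priority, then default, then shallow.
BAND_RANK = {"priority": 0, "default": 1, "shallow": 2}


def order_dirs(leaf_first_dirs, tiers, investigation_order):
    """Return directories in investigation order.

    leaf_first_dirs: paths already sorted deepest-first.
    tiers: path -> "priority" | "shallow"; unlisted paths count as "default".
    """
    if investigation_order != "priority-first":
        return list(leaf_first_dirs)
    # sorted() is stable, so directories sharing a band keep their
    # original leaf-first relative order.
    return sorted(leaf_first_dirs, key=lambda p: BAND_RANK[tiers.get(p, "default")])
```

In this sketch, a parent placed in an earlier band than one of its children would still run before that child. That is the situation the updated `_get_child_summaries()` placeholder is meant to report honestly.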
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Inputs to the Planner
|
||||||
|
|
||||||
|
The planning agent receives four signals:
|
||||||
|
|
||||||
|
1. **Survey output**: the full survey dict (description, approach,
|
||||||
|
domain notes, tool recommendations), formatted as a text block.
|
||||||
|
2. **Full directory tree**: `render_tree()` output at depth 6 (deeper
|
||||||
|
than the survey's 2-level preview).
|
||||||
|
3. **File signals**: extension histogram, `file --brief` descriptions,
|
||||||
|
filename samples (the same raw signals the survey sees).
|
||||||
|
4. **Cached directories**: which dirs are already cached from a prior
|
||||||
|
run (so the planner knows what will be skipped).
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Fallback Behavior
|
||||||
|
|
||||||
|
The planning pass degrades gracefully:
|
||||||
|
|
||||||
|
- **Small targets** (below `_SURVEY_MIN_FILES` and `_SURVEY_MIN_DIRS`):
|
||||||
|
planning is skipped entirely, same threshold as the survey. All dirs
|
||||||
|
get the default allocation in leaf-first order.
|
||||||
|
- **Planning fails** (API error, agent doesn't call `submit_plan`):
|
||||||
|
`_default_plan()` returns an empty plan. All dirs get 10 turns,
|
||||||
|
leaf-first order. The investigation proceeds as if Phase 3 didn't
|
||||||
|
exist.
|
||||||
|
- **Resumed runs**: the plan is cached as `plan.json` in the
|
||||||
|
investigation cache. On resume (without `--fresh`), the cached plan
|
||||||
|
is loaded and `_run_planning()` is skipped.
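
The fallback chain above can be sketched roughly as follows. This is an illustration under assumptions, not the shipped code: `resolve_plan`, its parameters, and the decision not to persist a fallback plan are all hypothetical. Only `plan.json`, the `--fresh` behavior, and the empty default plan come from the description above.

```python
import json
from pathlib import Path


def default_plan():
    """Empty plan: every directory gets the default allocation, leaf-first."""
    return {
        "priority_dirs": [],
        "shallow_dirs": [],
        "skip_dirs": [],
        "investigation_order": "leaf-first",
        "notes": "",
    }


def resolve_plan(cache_root, run_planning, fresh=False):
    """Load a cached plan, else run the planner, else fall back to the default.

    run_planning is a hypothetical zero-argument callable that returns a plan
    dict, or None when the agent never called submit_plan.
    """
    plan_path = Path(cache_root) / "plan.json"
    if not fresh and plan_path.is_file():
        # Resumed run: reuse the cached plan and skip the planning pass.
        return json.loads(plan_path.read_text(encoding="utf-8"))
    try:
        plan = run_planning()
    except Exception:
        plan = None  # API error or similar: degrade to the default plan
    if plan is None:
        return default_plan()  # assumption: fallback plans are not cached
    plan_path.write_text(json.dumps(plan, indent=2), encoding="utf-8")
    return plan
```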
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Quality Instrumentation
|
||||||
|
|
||||||
|
Phase 3 ships with built-in measurement so we can tell whether planning
|
||||||
|
actually improves investigation quality. It has three parts:
|
||||||
|
|
||||||
|
### Turn utilization
|
||||||
|
|
||||||
|
Tracked per directory: turns allocated vs turns used. An agent that
|
||||||
|
finishes in 3 turns on an 18-turn budget suggests over-allocation. An
|
||||||
|
agent that hits the cap on a 5-turn budget suggests under-allocation.
|
||||||
|
|
||||||
|
### Completeness self-rating
|
||||||
|
|
||||||
|
The `submit_report` tool (dir scope) now includes a `completeness`
|
||||||
|
field (0.0-1.0). The agent rates how thoroughly it investigated the
|
||||||
|
directory. This is not perfectly reliable (it is a self-assessment),
|
||||||
|
but it provides signal: a priority dir with completeness 0.3 probably
|
||||||
|
needed more turns; a shallow dir with completeness 0.95 probably
|
||||||
|
didn't need its 5 turns.
|
||||||
|
|
||||||
|
### plan_evaluation.json
|
||||||
|
|
||||||
|
Written at the end of every investigation, this file is the planning
|
||||||
|
pass's report card. It compares plan predictions to outcomes:
|
||||||
|
|
||||||
|
```json
{
  "plan_order": "leaf-first",
  "total_dirs_investigated": 12,
  "total_turns_allocated": 120,
  "total_turns_used": 87,
  "overall_utilization": 0.73,
  "per_directory": [
    {
      "dir": "src/core",
      "planned_tier": "priority",
      "turns_allocated": 18,
      "turns_used": 14,
      "utilization": 0.78,
      "completeness": 0.9,
      "confidence": 0.85
    }
  ],
  "evaluated_at": "2026-04-12T..."
}
```
|
||||||
|
|
||||||
|
Run luminos on the same target before and after changes to compare
|
||||||
|
these metrics. The golden set for baseline comparison: luminos itself.
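
As a rough illustration of how the utilization figures might be derived, the sketch below divides turns used by turns allocated, per directory and overall. The field names follow the JSON example above. The function name `summarize_utilization` and the two-decimal rounding are assumptions, not the actual `_write_plan_evaluation()` logic.

```python
def summarize_utilization(records):
    """Compute per-directory and overall turn utilization.

    records: list of dicts with "dir", "turns_allocated", and "turns_used".
    """
    per_directory = []
    for rec in records:
        allocated = rec["turns_allocated"]
        used = rec["turns_used"]
        # Guard against a zero allocation to avoid dividing by zero.
        utilization = round(used / allocated, 2) if allocated else 0.0
        per_directory.append({**rec, "utilization": utilization})

    total_allocated = sum(r["turns_allocated"] for r in records)
    total_used = sum(r["turns_used"] for r in records)
    overall = round(total_used / total_allocated, 2) if total_allocated else 0.0
    return {
        "total_dirs_investigated": len(records),
        "total_turns_allocated": total_allocated,
        "total_turns_used": total_used,
        "overall_utilization": overall,
        "per_directory": per_directory,
    }
```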
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Implementation Map
|
||||||
|
|
||||||
|
| Component | Location | Purpose |
|---|---|---|
| `_PLANNING_SYSTEM_PROMPT` | `prompts.py` | System prompt for the planning agent |
| `submit_plan` tool | `ai.py` (planning scope) | Tool schema for plan submission |
| `_run_planning()` | `ai.py` | Runs the planning pass (follows `_run_survey` pattern) |
| `_apply_plan()` | `ai.py` | Pure function: plan + dir list to ordered list + turn map |
| `_default_plan()` | `ai.py` | Fallback empty plan |
| `_write_plan_evaluation()` | `ai.py` | Writes `plan_evaluation.json` after dir loops |
| `_TokenTracker._loop_turns` | `ai.py` | Counts API calls per dir loop for utilization tracking |
| `plan.json` | cache root | Persisted plan for resumed runs |
| `plan_evaluation.json` | cache root | Post-investigation quality report |
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Design Decisions
|
||||||
|
|
||||||
|
### Why band-sorted order instead of arbitrary reordering
|
||||||
|
|
||||||
|
The leaf-first contract (`_get_child_summaries()`) is load-bearing.
|
||||||
|
Breaking it silently degrades parent summaries because child cache
|
||||||
|
entries don't exist yet. Band-sorting preserves leaf-first within each
|
||||||
|
priority band, giving us "priority-first" without losing child context.
|
||||||
|
|
||||||
|
### Why per-directory allocation instead of a shared global pool
|
||||||
|
|
||||||
|
A shared pool with mid-loop borrowing requires the orchestrator to
|
||||||
|
communicate with running agents, which doesn't exist in the current
|
||||||
|
architecture (each `_run_dir_loop` call is independent). Per-directory
|
||||||
|
allocation is a strict improvement over fixed-14-for-everyone with zero
|
||||||
|
new machinery. The quality instrumentation will tell us if borrowing is
|
||||||
|
worth building.
|
||||||
|
|
||||||
|
### Why the child-summaries placeholder was fixed
|
||||||
|
|
||||||
|
`_get_child_summaries()` previously returned "this is a leaf directory"
|
||||||
|
for any directory with no cached children, whether it was actually a
|
||||||
|
leaf or just hadn't been investigated yet. With priority-first ordering,
|
||||||
|
this lie becomes more likely to trigger. The fix distinguishes the two
|
||||||
|
cases: actual leaves get "this is a leaf directory", uninvestigated
|
||||||
|
parents get "child directories exist but have not been investigated
|
||||||
|
yet".
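
The distinction can be sketched as a three-way branch. The function below is a simplified stand-in, not the real `_get_child_summaries()`: its name, signature, and the in-memory `cached_summaries` dict are hypothetical. The two placeholder strings are the ones quoted above.

```python
def child_summaries_block(child_dirs, cached_summaries):
    """Build the child-summaries text injected into a parent's prompt.

    child_dirs: paths of the directory's immediate subdirectories.
    cached_summaries: path -> summary, for children already investigated.
    """
    if not child_dirs:
        # No subdirectories at all: a true leaf.
        return "this is a leaf directory"
    found = [f"{c}: {cached_summaries[c]}" for c in child_dirs if c in cached_summaries]
    if not found:
        # Children exist on disk but none has a cache entry yet.
        return "child directories exist but have not been investigated yet"
    return "\n".join(found)
```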
|
||||||
|
|
||||||
|
### Why completeness is a self-rating
|
||||||
|
|
||||||
|
An external completeness metric would require knowing "how many files
|
||||||
|
should have been examined", which depends on the directory contents and
|
||||||
|
is exactly the kind of judgment the agent makes. Self-rating is
|
||||||
|
imperfect but cheap, and the correlation between self-rated
|
||||||
|
completeness and turn utilization gives us a useful signal even if the
|
||||||
|
absolute values aren't perfectly calibrated.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## Future Work
|
||||||
|
|
||||||
|
- **Mid-loop turn borrowing**: if utilization data shows priority dirs
|
||||||
|
consistently hit their cap while others finish early, implement a
|
||||||
|
shared budget pool.
|
||||||
|
- **Plan refinement**: after the first dir loop run, re-evaluate the
|
||||||
|
plan based on early findings (some "shallow" dirs might turn out to
|
||||||
|
be important).
|
||||||
|
- **Cross-run learning**: use `plan_evaluation.json` from prior runs to
|
||||||
|
improve planning on similar targets.
|
||||||
|
|
||||||
|
---
|
||||||
|
|
||||||
|
## References
|
||||||
|
|
||||||
|
- Issues: #8, #9, #10, #11, #74
|
||||||
|
- PR: #75
|
||||||
|
- PLAN.md Part 4: Investigation Planning
|
||||||
|
- [Internals](Internals) section 4.7: leaf-first contract
|
||||||