wiki: document leaf-first investigation contract in Internals §4.7 (#72)
parent 0a33ec6bd2
commit 508d66cba8
1 changed file with 43 additions and 0 deletions (Internals.md)
@@ -297,6 +297,49 @@ real-time tool decision printing, which today happens only after the full
response arrives. There's room here to add live progress printing if you
want it.

### 4.7 The leaf-first contract (load-bearing for child summaries)

`_discover_directories()` returns directories sorted leaves-first (the
deepest paths first, parents last). This is not a stylistic choice. It
is a load-bearing invariant. **`_get_child_summaries()` depends on it.**

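A minimal sketch of what a leaves-first ordering looks like, for readers who have not opened `ai.py`. The helper name `sort_leaves_first` and the tie-breaking rule are assumptions made for illustration; the real `_discover_directories()` may walk the tree and sort differently.

```python
from pathlib import PurePosixPath


def sort_leaves_first(dirs):
    """Order directories so the deepest paths come first and parents last.

    Hypothetical stand-in for the ordering step of _discover_directories().
    """
    # Depth is the number of path components. Ties break alphabetically
    # (an assumption) so the result is deterministic.
    return sorted(dirs, key=lambda d: (-len(PurePosixPath(d).parts), d))


ordered = sort_leaves_first(
    ["src", "src/auth", "src/db", "src/auth/tokens", "docs"]
)
print(ordered)
```

Sorting by depth alone is enough to guarantee the invariant, because a child path always has at least one more component than its parent.
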
When the dir loop runs on a parent like `src/`,
`_get_child_summaries()` reads the cache for each subdirectory of
`src/` (`src/auth/`, `src/db/`, `src/middleware/`) and injects their
existing summaries into the parent's system prompt under
`{child_summaries}`. This is how the agent gets context about parts of
the project it isn't currently inside without re-reading them, and it
is the entire payoff of leaves-first ordering.

The catch: those subdirectory summaries only exist if the children
were investigated *first*. If `src/` runs before `src/auth/`, the
cache lookup at `ai.py:825` returns nothing. The function falls
through to its default at `ai.py:832` and returns the string
`(none — this is a leaf directory)`. The parent's system prompt
silently loses all of its child context, and the agent has no way to
know: the placeholder claims the dir is a leaf, which is a lie when
the children just haven't been investigated yet. The dir summary
degrades and the synthesis pass inherits the degradation.

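The failure mode can be reproduced with a simplified model of the lookup. This is a sketch, not the code in `ai.py`: the cache is assumed to be a plain dict keyed by path, the function signature is invented, and the sample summary text is illustrative. Only the placeholder string comes from the description above.

```python
LEAF_PLACEHOLDER = "(none — this is a leaf directory)"


def get_child_summaries(subdirs, cache):
    """Collect cached summaries for a parent's subdirectories.

    Hypothetical simplification: `cache` maps a path to its summary text.
    """
    lines = []
    for child in subdirs:
        summary = cache.get(child)  # missing until the child has been investigated
        if summary:
            lines.append(f"- {child}: {summary}")
    # An empty result is reported as "leaf", even when children exist
    # but simply have not run yet.
    return "\n".join(lines) if lines else LEAF_PLACEHOLDER


subdirs = ["src/auth/", "src/db/", "src/middleware/"]

# Parent runs before any child: the prompt claims `src/` is a leaf.
too_early = get_child_summaries(subdirs, cache={})

# Parent runs after `src/auth/`: its summary reaches the prompt.
in_order = get_child_summaries(subdirs, cache={"src/auth/": "login and tokens"})
```

Nothing raises and nothing is logged in the `too_early` case, which is why the degradation is silent.
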
**If you change the investigation order**, you have to do one of the
following:

1. **Preserve the leaf-first invariant within whatever new order you
   introduce.** A "priority-first" order can still process directories
   leaves-first within each priority band. Children then always run
   before parents, provided no child is assigned to a later band than
   its parent.
2. **Explicitly handle the missing-child-summaries case in the
   prompt.** Replace the lie ("leaf directory") with the truth
   ("children not yet investigated") so the agent at least knows what
   it doesn't have, and accept that some dirs will run with degraded
   context.

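Both options can be sketched in a few lines. Everything here is hypothetical: the function names, the idea of assigning a band per top-level directory (which keeps a child in the same band as its parent), the default band, and the exact wording of the new placeholder are assumptions, not existing code.

```python
from pathlib import PurePosixPath


def order_by_band_then_leaves(dirs, band_of_top_level, default_band=99):
    """Option 1: earlier bands run first, leaves-first inside each band.

    Assumes non-empty relative paths. A directory inherits the band of its
    top-level ancestor, so a child never lands in a later band than its parent.
    """
    def key(d):
        parts = PurePosixPath(d).parts
        band = band_of_top_level.get(parts[0], default_band)
        return (band, -len(parts), d)

    return sorted(dirs, key=key)


def describe_children(subdirs, cache):
    """Option 2: distinguish a true leaf from children that have not run yet."""
    found = [f"- {c}: {cache[c]}" for c in subdirs if c in cache]
    if found:
        return "\n".join(found)
    if subdirs:
        return "(children not yet investigated)"
    return "(none — this is a leaf directory)"


planned = order_by_band_then_leaves(
    ["docs", "src", "src/auth", "src/db", "tests"],
    band_of_top_level={"src": 0, "tests": 1},
)
```

With option 1 the prompt content is unchanged, only the schedule moves. With option 2 the schedule is free, but parents that run early still see less context.
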
Phase 3's planning pass introduces the temptation to investigate
priority dirs first. Both alternatives above are open. Whichever is
chosen, this contract has to be addressed *explicitly*. The test
class `TestDiscoverDirectories` (in `tests/test_ai_pure.py`) pins the
current ordering, so any change will be loud, but the *reason* the
ordering matters lives here.

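If the ordering does change, the property worth keeping under test is "no parent is scheduled before one of its descendants", which is independent of the exact sort. The helper below is a hypothetical sketch of such a check and is not the contents of `TestDiscoverDirectories`.

```python
from pathlib import PurePosixPath


def leaf_first_violations(ordered):
    """Return (parent, child) pairs where the parent runs before the child."""
    position = {d: i for i, d in enumerate(ordered)}
    bad = []
    for d in ordered:
        for ancestor in PurePosixPath(d).parents:
            a = str(ancestor)
            # Only ancestors that are themselves scheduled can violate the contract.
            if a in position and position[a] < position[d]:
                bad.append((a, d))
    return bad


ok = leaf_first_violations(["src/auth", "src/db", "src"])
broken = leaf_first_violations(["src", "src/auth"])
```

An empty list means the contract holds for that schedule, so a priority-based order can be validated with the same check.
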
---

## 5. The cache model