wiki: refresh Internals.md §4 for #57 dir loop refactor
parent
ecfae7edba
commit
725ef62edd
1 changed files with 47 additions and 36 deletions
83
Internals.md
83
Internals.md
|
|
@ -169,23 +169,32 @@ protectable.
|
||||||
|
|
||||||
## 4. The dir loop in depth
|
## 4. The dir loop in depth
|
||||||
|
|
||||||
`_run_dir_loop()` is at `ai.py:845`. This is a hand-written agent loop and
|
`_run_dir_loop()` is at `ai.py:1017`. It is a hand-written agent loop, and
|
||||||
you should expect to read it several times before it clicks. The shape is:
|
you should expect to read it several times before it clicks. As of #57 the
|
||||||
|
loop body itself is a thin coordinator (~25 lines): it calls three helpers
|
||||||
|
that own the layers it used to inline.
|
||||||
|
|
||||||
|
| Helper | Lines | Job |
|
||||||
|
|---|---|---|
|
||||||
|
| `_build_dir_loop_context()` | `ai.py:855` | Pure setup. Builds dir context, child summaries, survey block, filtered tool list, system prompt, and the seed user message. Returns a `_DirLoopContext` namedtuple. |
|
||||||
|
| `_flush_partial_dir_entry()` | `ai.py:896` | Idempotent partial-cache writer for the budget-exceeded path. Synthesizes a summary from already-cached file entries when possible, or writes a "no files processed" stub. Returns the partial summary string. |
|
||||||
|
| `_handle_turn_response()` | `ai.py:957` | Per-turn response processing. Prints text blocks and tool decisions to stderr, appends the assistant message, dispatches tools (or nudges the agent to call submit_report), appends tool_results. Returns `(done, summary)`. |
|
||||||
|
|
||||||
|
The shape of the loop body is now:
|
||||||
|
|
||||||
```
|
```
|
||||||
build system prompt (with survey context, child summaries, dir contents)
|
ctx = _build_dir_loop_context(...)
|
||||||
build initial user message ("investigate this directory now")
|
|
||||||
reset per-loop token counter
|
reset per-loop token counter
|
||||||
for turn in range(max_turns): # max_turns = 14
|
for turn in range(max_turns): # max_turns = 14
|
||||||
if budget exceeded: flush partial cache and break
|
if budget exceeded:
|
||||||
|
print warning
|
||||||
|
partial = _flush_partial_dir_entry(...)
|
||||||
|
if partial: summary = partial
|
||||||
|
break
|
||||||
call API (streaming)
|
call API (streaming)
|
||||||
record token usage
|
done, turn_summary = _handle_turn_response(...)
|
||||||
print text blocks and tool decisions to stderr
|
if turn_summary: summary = turn_summary
|
||||||
append assistant response to message history
|
if done: break
|
||||||
if no tool calls: nudge agent to call submit_report; continue
|
|
||||||
execute each tool call, build tool_result blocks
|
|
||||||
append tool_results to message history as user message
|
|
||||||
if submit_report was called: break
|
|
||||||
return summary
|
return summary
|
||||||
```
|
```
|
||||||
|
|
||||||
|
|
@ -202,32 +211,31 @@ budget; raising the cap would expose this. See **#51**.
|
||||||
|
|
||||||
### 4.2 Tool dispatch
|
### 4.2 Tool dispatch
|
||||||
|
|
||||||
Tools are not class methods. They're plain functions in `ai.py:486–642`,
|
Tools are plain functions in `ai.py`, registered into `_TOOL_DISPATCH` at
|
||||||
registered into `_TOOL_DISPATCH` at `ai.py:645`. `_execute_tool()`
|
`ai.py:650`. `_execute_tool()` (`ai.py:664`) is a small function that
|
||||||
(`ai.py:659`) is a 16-line function that looks up the handler by name,
|
looks up the handler by name, calls it, logs the turn to
|
||||||
calls it, logs the turn to `investigation.log`, and returns the result
|
`investigation.log`, and returns the result string. **The control-flow
|
||||||
string. **The two control-flow tools — `submit_report` and `think`/
|
tool `submit_report` is NOT in `_TOOL_DISPATCH`** because
|
||||||
`checkpoint` for narration — are NOT in `_TOOL_DISPATCH`** because the
|
`_handle_turn_response()` recognizes it specially: it sets `done = True`
|
||||||
loop body handles them specially:
|
and extracts the summary directly from the tool input.
|
||||||
- `submit_report` is recognized in the tool-use scan at `ai.py:977`, sets
|
|
||||||
`done = True`, and doesn't go through dispatch
|
`think`, `checkpoint`, and `flag` *are* in dispatch, but they have side
|
||||||
- `think`, `checkpoint`, and `flag` *are* in dispatch, but they have side
|
effects that just print to stderr or append to `flags.jsonl` — the return
|
||||||
effects that just print to stderr or append to `flags.jsonl` — the
|
value is always `"ok"`.
|
||||||
return value is always `"ok"`
|
|
||||||
|
|
||||||
When you add a tool: write the function, add it to `_TOOL_DISPATCH`, add
|
When you add a tool: write the function, add it to `_TOOL_DISPATCH`, add
|
||||||
its schema to `_DIR_TOOLS`. That's it.
|
its schema to `_DIR_TOOLS`. That's it.
|
||||||
|
|
||||||
### 4.3 Pre-loaded context
|
### 4.3 Pre-loaded context
|
||||||
|
|
||||||
Before the loop starts, two helpers prepare static context that goes into
|
Before the loop starts, `_build_dir_loop_context()` (`ai.py:855`) calls
|
||||||
the system prompt:
|
two helpers that prepare static context for the system prompt:
|
||||||
|
|
||||||
- `_build_dir_context()` (`ai.py:736`) — `ls`-style listing of the dir
|
- `_build_dir_context()` (`ai.py:741`) — `ls`-style listing of the dir
|
||||||
with sizes and MIME types via `python-magic`. The agent sees this
|
with sizes and MIME types via `python-magic`. The agent sees this
|
||||||
*before* it makes any tool calls, so it doesn't waste a turn just
|
*before* it makes any tool calls, so it doesn't waste a turn just
|
||||||
listing the directory.
|
listing the directory.
|
||||||
- `_get_child_summaries()` (`ai.py:758`) — looks up each subdirectory in
|
- `_get_child_summaries()` (`ai.py:763`) — looks up each subdirectory in
|
||||||
the cache and pulls its `summary` field. This is how leaves-first
|
the cache and pulls its `summary` field. This is how leaves-first
|
||||||
ordering pays off: by the time the loop runs on `src/`, all of
|
ordering pays off: by the time the loop runs on `src/`, all of
|
||||||
`src/auth/`, `src/db/`, `src/middleware/` already have cached summaries
|
`src/auth/`, `src/db/`, `src/middleware/` already have cached summaries
|
||||||
|
|
@ -253,9 +261,11 @@ the next API call.
|
||||||
|
|
||||||
When the budget check trips, the loop:
|
When the budget check trips, the loop:
|
||||||
1. Prints a `Context budget reached` warning to stderr
|
1. Prints a `Context budget reached` warning to stderr
|
||||||
2. If no `dir` cache entry exists yet, builds a *partial* one from any
|
2. Calls `_flush_partial_dir_entry()` (`ai.py:896`), which writes a
|
||||||
`file` cache entries the agent already wrote (`ai.py:889–937`), marks
|
partial dir cache entry from any `file` cache entries the agent
|
||||||
it with `partial: True` and `partial_reason`, and writes it
|
already produced, marked with `partial: True` and `partial_reason`.
|
||||||
|
The helper is idempotent — if a dir entry already exists, it returns
|
||||||
|
`""` without writing.
|
||||||
3. Breaks out of the loop
|
3. Breaks out of the loop
|
||||||
|
|
||||||
This means a budget breach doesn't lose work — anything the agent already
|
This means a budget breach doesn't lose work — anything the agent already
|
||||||
|
|
@ -265,14 +275,15 @@ rather than nothing.
|
||||||
### 4.5 What the loop returns
|
### 4.5 What the loop returns
|
||||||
|
|
||||||
`_run_dir_loop()` returns the `summary` string from `submit_report` (or
|
`_run_dir_loop()` returns the `summary` string from `submit_report` (or
|
||||||
the partial summary if the budget tripped). `_run_investigation()` then
|
the partial summary returned by `_flush_partial_dir_entry()` if the
|
||||||
writes a normal `dir` cache entry from this summary at `ai.py:1363–1375`
|
budget tripped). `_run_investigation()` then writes a normal `dir` cache
|
||||||
— *unless* the dir loop already wrote one itself via the partial-flush
|
entry from this summary, *unless* the dir loop already wrote one itself
|
||||||
path, in which case the `cache.has_entry("dir", dir_path)` check skips it.
|
via the partial-flush path, in which case the `cache.has_entry("dir",
|
||||||
|
dir_path)` check skips it.
|
||||||
|
|
||||||
### 4.6 The streaming API caller
|
### 4.6 The streaming API caller
|
||||||
|
|
||||||
`_call_api_streaming()` (`ai.py:681`) is a thin wrapper around
|
`_call_api_streaming()` (`ai.py:686`) is a thin wrapper around
|
||||||
`client.messages.stream()`. It currently doesn't print tokens as they
|
`client.messages.stream()`. It currently doesn't print tokens as they
|
||||||
arrive — it iterates the stream, drops everything, then pulls the final
|
arrive — it iterates the stream, drops everything, then pulls the final
|
||||||
message via `stream.get_final_message()`. The streaming API is used for
|
message via `stream.get_final_message()`. The streaming API is used for
|
||||||
|
|
|
||||||
Loading…
Reference in a new issue