docs(arch): document context budget fix (#44) and max_turns rationale
parent 0bbd98c9eb
commit aabf7807fd
1 changed file with 20 additions and 2 deletions
@@ -117,7 +117,25 @@ Cache is reused across runs for the same target. `--fresh` ignores it.
 
 `claude-sonnet-4-20250514`
 
-Context budget: 70% of 180,000 tokens (126,000). Early exit flushes partial
-cache on budget breach.
+**Context budget.** 70% of 200,000 tokens (140,000) — Sonnet 4's real
+context window with a 30% safety margin. The budget is checked against
+the *latest* per-call `input_tokens` reading (the actual size of the
+context window in use), not the cumulative sum across turns. Early
+exit flushes partial cache on budget breach. See #44.
+
+**Per-loop turn cap.** Each dir loop runs for at most `max_turns = 14`
+turns. This is a sanity bound separate from the context budget — even
+on small targets the agent should produce a `submit_report` long
+before exhausting 14 turns. The cap exists to prevent runaway loops
+when the agent gets stuck (e.g. repeatedly retrying a failing tool
+call). If we observe legitimate investigations consistently hitting
+14, raise the cap; do not raise it speculatively.
+
+**Per-loop message history growth.** Tool results are appended to the
+message history and never evicted, so per-turn `input_tokens` grows
+roughly linearly across a loop (~1.5–2k per turn observed on
+codebase targets). At the current `max_turns=14` cap this stays well
+under 200k. Raising `max_turns` significantly (e.g. via Phase 3
+dynamic turn allocation) would expose this — see #51.
+
 Pricing tracked and reported at end of each run.
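The interaction between the context budget, the turn cap, and linear history growth described in this change can be illustrated with a minimal sketch. It is not the project's implementation. The names `run_loop`, `LoopResult`, and `linear_growth` are hypothetical, the 20,000-token starting context is an invented figure for illustration, and treating a breach as strictly greater than the budget is an assumption. Only the 200,000-token window, the 70% budget, the 14-turn cap, and the roughly 2k-per-turn growth come from the text. The partial cache flush on early exit is left out.

```python
from typing import Callable, NamedTuple, Tuple

CONTEXT_WINDOW = 200_000                     # tokens, from the documented text
CONTEXT_BUDGET = CONTEXT_WINDOW * 70 // 100  # 70% of the window -> 140_000
MAX_TURNS = 14                               # per-loop turn cap


class LoopResult(NamedTuple):
    reason: str                   # "submitted", "budget_breach", or "turn_cap"
    turns_used: int
    latest_input_tokens: int
    cumulative_input_tokens: int  # reported only, never compared to the budget


def run_loop(call_model: Callable[[int], Tuple[int, bool]],
             max_turns: int = MAX_TURNS,
             budget: int = CONTEXT_BUDGET) -> LoopResult:
    """Run one loop. call_model(turn) returns (input_tokens, submitted)."""
    latest = 0
    cumulative = 0
    for turn in range(1, max_turns + 1):
        latest, submitted = call_model(turn)
        cumulative += latest
        # The budget is checked against the latest reading, i.e. the size of
        # the context actually in use, not against the running sum.
        # Assumption: a breach means strictly greater than the budget.
        if latest > budget:
            return LoopResult("budget_breach", turn, latest, cumulative)
        if submitted:
            return LoopResult("submitted", turn, latest, cumulative)
    # The agent never submitted: the sanity bound stops the loop.
    return LoopResult("turn_cap", max_turns, latest, cumulative)


def linear_growth(base: int, per_turn: int) -> Callable[[int], Tuple[int, bool]]:
    """Hypothetical stand-in for a stuck agent whose history is never evicted."""
    def call(turn: int) -> Tuple[int, bool]:
        return base + per_turn * (turn - 1), False
    return call


if __name__ == "__main__":
    # At the 14-turn cap the latest reading is far below the budget, even
    # though the cumulative sum is already well above 140,000.
    print(run_loop(linear_growth(base=20_000, per_turn=2_000)))
    # With the cap effectively lifted, the same growth breaches the budget.
    print(run_loop(linear_growth(base=20_000, per_turn=2_000), max_turns=1_000))
```

Under these made-up numbers the capped run stops at turn 14 with a 46,000-token context, while the cumulative sum has already reached 462,000. A cumulative check would therefore have tripped long before the context was actually full. Lifting the cap lets the same growth reach the budget at turn 62, which is the exposure the note about raising `max_turns` points at.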