From aabf7807fd38c41dd4393bcc683ac53ec57e5827 Mon Sep 17 00:00:00 2001
From: Jeff Smith
Date: Mon, 6 Apr 2026 22:48:57 -0600
Subject: [PATCH] docs(arch): document context budget fix (#44) and max_turns rationale

---
 Architecture.md | 22 ++++++++++++++++++++--
 1 file changed, 20 insertions(+), 2 deletions(-)

diff --git a/Architecture.md b/Architecture.md
index 8041758..8e538a3 100644
--- a/Architecture.md
+++ b/Architecture.md
@@ -117,7 +117,25 @@ Cache is reused across runs for the same target. `--fresh` ignores it.
 
 `claude-sonnet-4-20250514`
 
-Context budget: 70% of 180,000 tokens (126,000). Early exit flushes partial
-cache on budget breach.
+**Context budget.** 70% of 200,000 tokens (140,000) — Sonnet 4's real
+context window with a 30% safety margin. The budget is checked against
+the *latest* per-call `input_tokens` reading (the actual size of the
+context window in use), not the cumulative sum across turns. Early
+exit flushes partial cache on budget breach. See #44.
+
+**Per-loop turn cap.** Each dir loop runs for at most `max_turns = 14`
+turns. This is a sanity bound separate from the context budget — even
+on small targets the agent should produce a `submit_report` long
+before exhausting 14 turns. The cap exists to prevent runaway loops
+when the agent gets stuck (e.g. repeatedly retrying a failing tool
+call). If we observe legitimate investigations consistently hitting
+14, raise the cap; do not raise it speculatively.
+
+**Per-loop message history growth.** Tool results are appended to the
+message history and never evicted, so per-turn `input_tokens` grows
+roughly linearly across a loop (~1.5–2k per turn observed on
+codebase targets). At the current `max_turns=14` cap this stays well
+under 200k. Raising `max_turns` significantly (e.g. via Phase 3
+dynamic turn allocation) would expose this — see #51.
 
 Pricing tracked and reported at end of each run.
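
Review note: the two stop conditions this patch documents can be sketched as below. This is a minimal illustration, not the project's code. The constants come from the patch text; the function name `stop_reason`, the strict `>` comparison, and the sample readings (a 20,000-token starting context growing 2,000 tokens per turn) are assumptions made for the example.

```python
# Values taken from the patch text.
CONTEXT_WINDOW = 200_000
CONTEXT_BUDGET = CONTEXT_WINDOW * 70 // 100  # 70% of the window
MAX_TURNS = 14

def stop_reason(per_call_input_tokens):
    """Return why a loop would stop, given per-call input_tokens readings.

    Hypothetical helper: each reading is compared on its own, because
    input_tokens already reports the full context in use for that call.
    """
    turns_used = 0
    for latest in per_call_input_tokens:
        if turns_used >= MAX_TURNS:
            return "turn_cap"       # sanity bound, independent of the budget
        turns_used += 1
        if latest > CONTEXT_BUDGET:  # assumed strict comparison
            return "budget_breach"   # the real loop would flush partial cache here
    return "finished"

# Assumed readings: linear growth across a full 14-turn loop.
readings = [20_000 + 2_000 * turn for turn in range(MAX_TURNS)]
print(stop_reason(readings))
```

With these assumed readings the largest single value is 46,000 tokens, well under the budget, while the sum across turns is 462,000. A check against the cumulative sum would therefore report a breach that never occurred in the actual context window, which is the distinction the **Context budget** paragraph draws.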