Second wave of pre-Phase-3 test coverage. The #55 round picked off the
easy decision-logic helpers; this round covers the three highest-impact
helpers that escaped the first sweep.
Three new test classes appended to tests/test_ai_pure.py:
- TestTokenTracker (11 tests)
  Pins the load-bearing #44 fix: budget_exceeded() must use last_input
  (the most recent call's context size), NOT cumulative input, because
  each turn's input_tokens already includes the full message history.
  Tests assert: cumulative input far above budget does NOT trip the
  gate when last_input stays small; reset_loop() preserves grand
  totals; the boundary is strict (>, not >=).
- TestSynthesizeFromCache (5 tests)
  The synthesis fallback fires only when _run_synthesis exhausts its
  max_turns, which almost never happens in normal runs. That is exactly
  the kind of code that silently rots. Tests assert: empty cache
  returns the incomplete-message brief and an empty detailed result; a
  single dir entry produces a markdown line; multi-entry detailed
  contains all entries; empty-summary entries are skipped; a cache
  holding only file entries does not count as populated (the function
  reads dir entries only).
- TestDiscoverDirectories (9 tests)
  The leaves-first walk drives the entire dir-loop iteration order
  and is the foundation of the cache reuse story. Tests assert:
  empty target returns target only; nested trees come back
  leaves-first; .git / __pycache__ / node_modules / *.egg-info
  excluded; custom --exclude honored; hidden dirs excluded by
  default; show_hidden=True includes them but does not override
  the skip list.
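The budget rule that TestTokenTracker pins can be sketched as follows.
This is a minimal illustration, not the project's code: only
budget_exceeded(), reset_loop() and last_input come from the notes
above; the class layout, record(), budget, loop_input and total_input
are assumed names, and clearing last_input in reset_loop() is a guess.

```python
from dataclasses import dataclass


@dataclass
class TokenTracker:
    """Illustrative stand-in. Everything except last_input,
    budget_exceeded() and reset_loop() is a hypothetical name."""

    budget: int
    last_input: int = 0   # context size of the most recent call
    loop_input: int = 0   # cumulative input within the current loop (assumed)
    total_input: int = 0  # grand total across all loops (assumed)

    def record(self, input_tokens: int) -> None:
        # Each call's input_tokens already counts the full message
        # history, so the running sum overstates the live context size.
        self.last_input = input_tokens
        self.loop_input += input_tokens
        self.total_input += input_tokens

    def budget_exceeded(self) -> bool:
        # Strict comparison: landing exactly on the budget is allowed.
        return self.last_input > self.budget

    def reset_loop(self) -> None:
        # Per-loop counters clear (an assumption); grand totals survive.
        self.last_input = 0
        self.loop_input = 0


tracker = TokenTracker(budget=1000)
for size in (400, 600, 800):
    tracker.record(size)
```

With a budget of 1000, the three calls sum to 1800 input tokens, yet
the gate stays open because the latest context is only 800. A gate
keyed on the cumulative figure would have tripped on the third call.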
PLAN.md: added Phase 2.7 (#56✅) and Phase 2.8 (#55✅, #70) entries
to the implementation order, and removed the now-stale Phase 3.4 (#56)
and Background chore (#55) sections that were displaced by the
pre-Phase-3 cleanup pattern.
Verification: 234 tests pass (209 prior + 25 new).
ai.py had been documented as fully exempt from unit testing because
the dir loop and synthesis pass require a live Anthropic API. But
several helpers in the module are pure functions with no API
dependency, and they are exactly the kind of code that breaks
silently. The #57 refactor added two more (_build_dir_loop_context,
_flush_partial_dir_entry) that are also naturally testable.
New tests/test_ai_pure.py, with 45 tests across 8 helpers:
- _should_skip_dir: exact-match, *.egg-info glob, no-match cases
- _path_is_safe: inside, nested, equals, outside, traversal,
  sibling-with-target-prefix (the easy-to-miss security case)
- _default_survey: shape, zero confidence guarantees no filtering,
  passes through _filter_dir_tools unchanged
- _format_survey_block: None, empty, minimal, with relevant_tools,
  with skip_tools, with domain_notes, empty-list omission
- _filter_dir_tools: None, empty, low confidence, high confidence
  filters, protected tools never removed, unknown skip silently
  ignored, garbage/None confidence treated as zero, threshold
  boundary inclusive
- _format_survey_signals: None, empty, zero total_files, full,
  partial (only extensions)
- _block_to_dict: text, tool_use, unknown type
- _flush_partial_dir_entry (#57): idempotent when entry exists,
  no-file-entries stub path, with-file-entries summary synthesis,
  notable_files collection
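The sibling-with-target-prefix case from the _path_is_safe tests is
easy to reproduce. The sketch below assumes nothing about the real
helper's implementation: path_is_safe and naive_prefix_check are
hypothetical stand-ins, the paths are made up, and the check is purely
lexical (no symlink resolution).

```python
import posixpath
from pathlib import PurePosixPath


def naive_prefix_check(candidate: str, target: str) -> bool:
    # Buggy on purpose: a plain string prefix test accepts siblings
    # whose names merely start with the target's name.
    return posixpath.normpath(candidate).startswith(posixpath.normpath(target))


def path_is_safe(candidate: str, target: str) -> bool:
    # Hypothetical stand-in for _path_is_safe: compare whole path
    # components after collapsing ".." segments.
    root = PurePosixPath(posixpath.normpath(target))
    path = PurePosixPath(posixpath.normpath(candidate))
    return path == root or root in path.parents


target = "/srv/project"
sibling = "/srv/project-secrets/key.pem"   # outside, but shares the prefix
traversal = "/srv/project/../etc/passwd"   # escapes via ".."
nested = "/srv/project/src/ai.py"          # genuinely inside
```

Here naive_prefix_check(sibling, target) returns True while
path_is_safe(sibling, target) returns False, which is the gap the
sibling-with-target-prefix test is meant to close.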
Uses the same _make_manager() pattern as test_cache.py to construct
a _CacheManager rooted in a tempdir, sidestepping CACHE_ROOT entirely.
Doc updates:
- CLAUDE.md, README.md, docs/wiki/DevelopmentGuide.md: ai.py is no
  longer fully exempt; only the API-dependent loops are. Pure
  helpers are covered by test_ai_pure.py.
Verification: 209 tests pass (164 prior + 45 new).