Commit graph

4 commits

Author SHA1 Message Date
Jeff Smith
79bb10b9dc fix(ai): match target root dir by basename in _apply_plan() (#76)
The planner sees basename(target) in the tree output (e.g. "luminos_lib")
and uses that as the path in its plan. But _apply_plan() mapped the
target root to "." via os.path.relpath(), so the planner's path never
matched and the allocation was silently dropped.

Fix: register both "." and basename(target) as aliases for the target
root in the lookup table. Also log a warning when plan paths don't
match any known directory, so future mismatches are visible.
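
The alias fix can be illustrated with a minimal sketch. This is not the
actual _apply_plan() code: build_dir_lookup() and resolve_plan_paths() are
hypothetical names, and the lookup-table shape is an assumption made only to
show the "." versus basename mismatch and the new warning.

```python
import logging
import os

logger = logging.getLogger(__name__)


def build_dir_lookup(target, directories):
    """Map plan-facing path strings to absolute directory paths (hypothetical helper)."""
    target = os.path.abspath(target)
    lookup = {}
    for d in directories:
        abs_d = os.path.abspath(d)
        # Children are keyed by their path relative to the target root.
        lookup[os.path.relpath(abs_d, target)] = abs_d
    # The target root is reachable as "." (what relpath yields) ...
    lookup["."] = target
    # ... and as its basename (what the planner sees in the tree output).
    # setdefault keeps a real child of the same name, if one exists.
    lookup.setdefault(os.path.basename(target), target)
    return lookup


def resolve_plan_paths(plan_paths, lookup):
    """Split plan paths into resolved directories and unmatched strings."""
    resolved, unmatched = [], []
    for raw in plan_paths:
        key = os.path.normpath(raw)
        if key in lookup:
            resolved.append(lookup[key])
        else:
            unmatched.append(raw)
    if unmatched:
        # Surface mismatches instead of dropping the allocation silently.
        logger.warning("plan paths matched no known directory: %s", unmatched)
    return resolved, unmatched
```

With only the "." key registered, a plan entry of "luminos_lib" would land in
the unmatched list; with both aliases it resolves to the target root.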

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 20:38:55 -06:00
Jeff Smith
2adbed9d28 feat(ai): implement Phase 3 investigation planning (#8, #9, #10, #11, #74)
Add a planning pass that runs after survey and before dir loops. The
planner classifies directories into priority/shallow/skip tiers and
allocates turns accordingly, replacing the fixed max_turns=14 per
directory with dynamic allocation from a global budget.

Planning pass:
- _PLANNING_SYSTEM_PROMPT in prompts.py with submit_plan tool
- _run_planning() follows the same single-turn pattern as _run_survey()
- submit_plan tool registered in new "planning" scope
- _apply_plan() pure function: band-sorted ordering (leaf-first within
  bands), turn map, skip-dir removal
- _default_plan() fallback when planning is skipped or fails
- Plan cached as plan.json for resumed runs

Dynamic turn allocation:
- Priority dirs: 15-20 turns (capped at 25)
- Shallow dirs: 5 turns
- Default: 10 turns
- Skip dirs: excluded entirely
- Orchestrator passes per-dir max_turns to _run_dir_loop()
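
The tier-to-turns mapping above can be sketched as follows. The plan shape
(a dict with "priority", "shallow", and "skip" keys) and the allocate_turns()
name are assumptions for illustration, not the submit_plan schema; the sketch
also leaves out enforcement of the global budget.

```python
PRIORITY_CAP = 25    # upper bound for priority dirs
SHALLOW_TURNS = 5    # fixed allocation for shallow dirs
DEFAULT_TURNS = 10   # dirs the plan does not mention


def allocate_turns(all_dirs, plan):
    """Return {dir: max_turns}; skip dirs are left out entirely.

    Assumed plan shape (hypothetical):
      {"priority": {path: requested_turns}, "shallow": [path], "skip": [path]}
    """
    priority = plan.get("priority", {})
    shallow = set(plan.get("shallow", []))
    skip = set(plan.get("skip", []))

    turn_map = {}
    for d in all_dirs:
        if d in skip:
            continue  # excluded from the dir loops
        if d in priority:
            # Honor the planner's request but never exceed the cap.
            turn_map[d] = min(int(priority[d]), PRIORITY_CAP)
        elif d in shallow:
            turn_map[d] = SHALLOW_TURNS
        else:
            turn_map[d] = DEFAULT_TURNS
    return turn_map
```

An orchestrator could then look up turn_map[d] for each directory and pass it
as max_turns when starting that directory's loop.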

Quality instrumentation:
- _TokenTracker._loop_turns counts API calls per dir loop
- completeness field (0.0-1.0) added to dir-scope submit_report
- plan_evaluation.json emitted after dir loops comparing plan predictions
  to actual turn utilization, completeness, and confidence
- Turn utilization logged per directory during investigation

Also fixes _get_child_summaries() to distinguish actual leaf directories
from parents whose children have not been investigated yet, replacing
the misleading "this is a leaf directory" placeholder.

26 new tests (260 total, all passing).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 20:21:49 -06:00
Jeff Smith
efaa2024d7 test(ai): cover _TokenTracker, _synthesize_from_cache, _discover_directories (#70)
Second wave of pre-Phase-3 test coverage. The #55 round picked off the
easy decision-logic helpers; this round covers the three highest-impact
helpers that escaped the first sweep.

Three new test classes appended to tests/test_ai_pure.py:

- TestTokenTracker (11 tests)
  Pins the load-bearing #44 fix: budget_exceeded() must use last_input
  (the most recent call's context size) NOT cumulative input, because
  each turn's input_tokens already includes the full message history.
  Tests assert: cumulative-input far above budget does NOT trip the
  gate when last_input stays small; reset_loop() preserves grand
  totals; the boundary is strict > not >=.

- TestSynthesizeFromCache (5 tests)
  The synthesis fallback fires only when _run_synthesis exhausts its
  max_turns, which almost never happens in normal runs. That makes it
  exactly the kind of code that silently rots. Tests assert: empty cache
  returns the incomplete-message brief and empty detailed; single dir
  entry produces a markdown line; multi-entry detailed contains all
  entries; empty-summary entries are skipped; file entries alone are
  not enough (the function reads dir entries only).

- TestDiscoverDirectories (9 tests)
  The leaves-first walk drives the entire dir-loop iteration order
  and is the foundation of the cache reuse story. Tests assert:
  empty target returns target only; nested trees come back
  leaves-first; .git / __pycache__ / node_modules / *.egg-info
  excluded; custom --exclude honored; hidden dirs excluded by default;
  show_hidden=True includes them but does not override the skip list.
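
Two of the behaviours pinned above, the last_input budget gate and the
leaves-first walk, can be sketched in a few lines. MiniTokenTracker and
discover_leaves_first() are hypothetical stand-ins written for illustration;
their signatures, and the choice to clear last_input in reset_loop(), are
assumptions rather than the real _TokenTracker or _discover_directories API.

```python
import fnmatch
import os


class MiniTokenTracker:
    """Budget gate keyed on the most recent call's input size."""

    def __init__(self, budget):
        self.budget = budget
        self.total_input = 0   # grand total across all loops
        self.loop_input = 0    # per-loop counter
        self.last_input = 0    # context size of the latest call

    def record(self, input_tokens):
        self.total_input += input_tokens
        self.loop_input += input_tokens
        self.last_input = input_tokens

    def reset_loop(self):
        # Per-loop state is cleared; grand totals are preserved.
        self.loop_input = 0
        self.last_input = 0    # assumption: a new loop starts with fresh context

    def budget_exceeded(self):
        # Each call's input already includes the full history, so summing
        # inputs would double count. The boundary is strict: > and not >=.
        return self.last_input > self.budget


SKIP_PATTERNS = (".git", "__pycache__", "node_modules", "*.egg-info")


def discover_leaves_first(target, exclude=(), show_hidden=False):
    """Return target and its subdirectories, every child before its parent."""
    patterns = (*SKIP_PATTERNS, *exclude)

    def keep(name):
        if any(fnmatch.fnmatch(name, pat) for pat in patterns):
            return False       # the skip list wins even with show_hidden=True
        return show_hidden or not name.startswith(".")

    found = []
    for root, dirs, _files in os.walk(target):
        dirs[:] = sorted(d for d in dirs if keep(d))  # prune in place
        found.append(root)

    def depth(path):
        rel = os.path.relpath(path, target)
        return 0 if rel == "." else rel.count(os.sep) + 1

    # Deepest directories first; the stable sort keeps walk order within a depth.
    return sorted(found, key=depth, reverse=True)
```

Sorting by depth is one simple way to guarantee that children precede their
parents; a bottom-up os.walk would be another, at the cost of not being able
to prune excluded subtrees during the walk.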

PLAN.md: added Phase 2.7 (#56) and Phase 2.8 (#55, #70) entries
to the implementation order, and removed the now-stale Phase 3.4 (#56)
and Background chore (#55) sections that were displaced by the
pre-Phase-3 cleanup pattern.

Verification: 234 tests pass (209 prior + 25 new).
2026-04-11 10:41:16 -06:00
Jeff Smith
a6333858ee test(ai): add unit coverage for pure helpers in ai.py (#55)
ai.py was documented as fully exempt from unit testing because the dir
loop and synthesis pass require a live Anthropic API. But several
helpers in the module are pure functions with no API dependency, and
they're the kind of thing that breaks silently. The #57 refactor added
two more (_build_dir_loop_context, _flush_partial_dir_entry) that are
also naturally testable.

New tests/test_ai_pure.py — 45 tests across 8 helpers:

- _should_skip_dir: exact-match, *.egg-info glob, no-match cases
- _path_is_safe: inside, nested, equals, outside, traversal,
  sibling-with-target-prefix (the easy-to-miss security case)
- _default_survey: shape, zero confidence guarantees no filtering,
  passes through _filter_dir_tools unchanged
- _format_survey_block: None, empty, minimal, with relevant_tools,
  with skip_tools, with domain_notes, empty-list omission
- _filter_dir_tools: None, empty, low confidence, high confidence
  filters, protected tools never removed, unknown skip silently
  ignored, garbage/None confidence treated as zero, threshold
  boundary inclusive
- _format_survey_signals: None, empty, zero total_files, full,
  partial (only extensions)
- _block_to_dict: text, tool_use, unknown type
- _flush_partial_dir_entry (#57): idempotent when entry exists,
  no-file-entries stub path, with-file-entries summary synthesis,
  notable_files collection
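
The sibling-with-target-prefix case listed under _path_is_safe is worth a
short illustration. The sketch below is not the project's implementation:
path_is_safe() and its (path, target) signature are assumptions, used only to
show why a plain string-prefix check is not enough.

```python
import os


def path_is_safe(path, target):
    """Return True if path resolves to target itself or something inside it."""
    target_real = os.path.realpath(target)
    # Relative paths are interpreted against the target; absolute paths
    # replace it (os.path.join discards earlier parts on an absolute part).
    path_real = os.path.realpath(os.path.join(target_real, path))
    try:
        # commonpath compares whole path components, so "/tmp/proj-evil"
        # does not count as being inside "/tmp/proj".
        return os.path.commonpath([path_real, target_real]) == target_real
    except ValueError:
        # Raised on Windows when the paths are on different drives.
        return False


def naive_prefix_check(path, target):
    """The easy-to-miss bug: a raw startswith() accepts sibling directories."""
    return os.path.realpath(path).startswith(os.path.realpath(target))
```

For a target of /tmp/proj, the naive check accepts /tmp/proj-evil/x because
the strings share a prefix, while the component-wise comparison rejects it.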

Uses the same _make_manager() pattern as test_cache.py to construct
a _CacheManager rooted in a tempdir, sidestepping CACHE_ROOT entirely.

Doc updates:
- CLAUDE.md, README.md, docs/wiki/DevelopmentGuide.md: ai.py is no
  longer fully exempt; only the API-dependent loops are. Pure
  helpers are covered by test_ai_pure.py.

Verification: 209 tests pass (164 prior + 45 new).
2026-04-11 10:24:47 -06:00