luminos/tests
Jeff Smith 2adbed9d28 feat(ai): implement Phase 3 investigation planning (#8, #9, #10, #11, #74)
Add a planning pass that runs after survey and before dir loops. The
planner classifies directories into priority/shallow/skip tiers and
allocates turns accordingly, replacing the fixed max_turns=14 per
directory with dynamic allocation from a global budget.

Planning pass:
- _PLANNING_SYSTEM_PROMPT in prompts.py with submit_plan tool
- _run_planning() follows the same single-turn pattern as _run_survey()
- submit_plan tool registered in new "planning" scope
- _apply_plan() pure function: band-sorted ordering (leaf-first within
  bands), turn map, skip-dir removal
- _default_plan() fallback when planning is skipped or fails
- Plan cached as plan.json for resumed runs
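
The ordering step described above can be sketched roughly as follows. This is an illustration only, not the actual _apply_plan(): the function name, the argument shapes, the band order (priority, then default, then shallow), and the use of path depth as a stand-in for "leaf-first" are all assumptions.

```python
from typing import Dict, List, Tuple

# Assumed band order: lower number runs earlier. Not taken from the commit.
BAND_ORDER = {"priority": 0, "default": 1, "shallow": 2}
DEFAULT_TURNS = 10  # "Default: 10 turns" per the commit message


def apply_plan_sketch(
    dirs: List[str],
    tiers: Dict[str, str],
    turns: Dict[str, int],
) -> Tuple[List[str], Dict[str, int]]:
    """Pure function: returns (ordered dirs, per-dir turn map), no side effects."""
    # Skip-dir removal: directories tiered "skip" are dropped entirely.
    kept = [d for d in dirs if tiers.get(d, "default") != "skip"]

    def sort_key(path: str):
        band = BAND_ORDER.get(tiers.get(path, "default"), BAND_ORDER["default"])
        depth = path.strip("/").count("/")
        # Deeper paths first within a band (assumed proxy for leaf-first).
        return (band, -depth, path)

    ordered = sorted(kept, key=sort_key)
    # Turn map: planner-provided turns, falling back to the default.
    turn_map = {d: turns.get(d, DEFAULT_TURNS) for d in ordered}
    return ordered, turn_map
```

Keeping the function pure (inputs in, ordering and turn map out) is what makes it straightforward to cover with unit tests without touching the API or the cache.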

Dynamic turn allocation:
- Priority dirs: 15-20 turns (capped at 25)
- Shallow dirs: 5 turns
- Default: 10 turns
- Skip dirs: excluded entirely
- Orchestrator passes per-dir max_turns to _run_dir_loop()
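
A minimal sketch of how the tiers above might map to turn counts. The function name, the tier strings, and the choice to clamp priority requests to the 15 to 25 range are assumptions made for illustration; only the numbers themselves come from the list above.

```python
from typing import Optional

PRIORITY_MIN_TURNS = 15   # lower end of the 15-20 priority range
PRIORITY_HARD_CAP = 25    # "capped at 25"
SHALLOW_TURNS = 5
DEFAULT_TURNS = 10


def allocate_turns(tier: str, requested: Optional[int] = None) -> int:
    """Return max_turns for one directory; 0 means the directory is excluded."""
    if tier == "skip":
        return 0
    if tier == "shallow":
        return SHALLOW_TURNS
    if tier == "priority":
        # Assumption: the planner may request a value, which is clamped.
        turns = PRIORITY_MIN_TURNS if requested is None else requested
        return max(PRIORITY_MIN_TURNS, min(turns, PRIORITY_HARD_CAP))
    # Unknown or missing tier falls back to the default allocation.
    return DEFAULT_TURNS
```

An orchestrator could then pass `allocate_turns(tier, requested)` as the per-directory max_turns instead of a fixed constant.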

Quality instrumentation:
- _TokenTracker._loop_turns counts API calls per dir loop
- completeness field (0.0-1.0) added to dir-scope submit_report
- plan_evaluation.json emitted after dir loops comparing plan predictions
  to actual turn utilization, completeness, and confidence
- Turn utilization logged per directory during investigation
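
For illustration, a plan evaluation record could be assembled along these lines. The field names and input shapes are hypothetical; the actual layout of plan_evaluation.json is not shown in this commit message.

```python
from typing import Any, Dict


def evaluate_plan_sketch(
    turn_map: Dict[str, int],
    turns_used: Dict[str, int],
    reports: Dict[str, Dict[str, Any]],
) -> Dict[str, Dict[str, Any]]:
    """Compare allocated turns to actual usage and reported quality per dir."""
    evaluation = {}
    for path, allocated in turn_map.items():
        used = turns_used.get(path, 0)
        report = reports.get(path, {})
        evaluation[path] = {
            "allocated_turns": allocated,
            "used_turns": used,
            # Utilization is the fraction of the allocation actually spent.
            "utilization": round(used / allocated, 2) if allocated else 0.0,
            # Both fields are passed through from the dir report (hypothetical keys).
            "completeness": report.get("completeness"),
            "confidence": report.get("confidence"),
        }
    return evaluation
```

The resulting dict can be written with `json.dump` from the standard library. Low utilization paired with high completeness would suggest the planner over-allocated turns for that directory.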

Also fixes _get_child_summaries() to distinguish actual leaf directories
from parents whose children have not been investigated yet, replacing
the misleading "this is a leaf directory" placeholder.
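
The distinction this fix draws can be sketched as below. The function name, arguments, and placeholder wording are hypothetical and do not reproduce the real _get_child_summaries().

```python
from typing import Dict, List


def child_summaries_sketch(children: List[str], summaries: Dict[str, str]) -> str:
    """Describe child directories without mislabeling parents as leaves."""
    if not children:
        # A true leaf: there are no subdirectories at all.
        return "(no subdirectories: this is a leaf directory)"
    known = [f"- {c}: {summaries[c]}" for c in children if c in summaries]
    if not known:
        # Children exist, but none has been investigated so far.
        return "(subdirectories exist but have not been investigated yet)"
    return "\n".join(known)
```

The key point is that an empty summary list alone does not prove a directory is a leaf, so the two cases need separate checks.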

26 new tests (260 total, all passing).

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-12 20:21:49 -06:00
__init__.py feat(tests): add unit test coverage for all testable modules (#37) 2026-04-06 16:57:26 -06:00
test_ai_filter.py fix(ai): correct context budget metric — track per-call, not sum (#44) 2026-04-06 22:49:25 -06:00
test_ai_pure.py feat(ai): implement Phase 3 investigation planning (#8, #9, #10, #11, #74) 2026-04-12 20:21:49 -06:00
test_cache.py feat(cache): add low_confidence_entries() query to CacheManager (#3) 2026-04-06 21:13:58 -06:00
test_code.py feat(tests): add unit test coverage for all testable modules (#37) 2026-04-06 16:57:26 -06:00
test_disk.py feat(tests): add unit test coverage for all testable modules (#37) 2026-04-06 16:57:26 -06:00
test_filetypes.py feat(filetypes): expose raw signals to survey, remove classifier bias (#42) 2026-04-06 22:36:14 -06:00
test_recency.py feat(tests): add unit test coverage for all testable modules (#37) 2026-04-06 16:57:26 -06:00
test_report.py feat(tests): add unit test coverage for all testable modules (#37) 2026-04-06 16:57:26 -06:00
test_tree.py feat(tests): add unit test coverage for all testable modules (#37) 2026-04-06 16:57:26 -06:00