test(ai): cover _TokenTracker, _synthesize_from_cache, _discover_directories (#70) #71

Merged
claude-code merged 1 commit from test/issue-70-tracker-synth-discover into main 2026-04-11 10:41:50 -06:00

Closes #70.

Second wave of pre-Phase-3 test coverage. The #55 round picked off the easy decision-logic helpers; this round covers the three highest-impact helpers that escaped the first sweep.

Three new test classes

| Class | Tests | What it pins |
|---|---|---|
| `TestTokenTracker` | 11 | The load-bearing #44 fix: `budget_exceeded()` uses `last_input` (the most recent call's context size), NOT cumulative input. Cumulative would double-count because each turn's `input_tokens` already includes the full message history. Tests assert: cumulative input far above budget does NOT trip the gate when `last_input` stays small; `reset_loop()` preserves grand totals; the boundary is strict `>`, not `>=`. |
| `TestSynthesizeFromCache` | 5 | The synthesis fallback fires only when `_run_synthesis` exhausts its `max_turns`, which almost never happens in normal runs, so it is exactly the kind of code that silently rots. Tests assert: empty cache returns the incomplete-message brief; a single dir entry produces a markdown line; multi-entry detailed output contains all entries; empty-summary entries are skipped; file entries alone do not satisfy (the function reads dir entries only). |
| `TestDiscoverDirectories` | 9 | The leaves-first walk drives the entire dir-loop iteration order and is the foundation of the cache reuse story. Tests assert: empty target returns target only; nested trees come back leaves-first; `.git` / `__pycache__` / `node_modules` / `*.egg-info` are excluded; custom `--exclude` is honored; hidden dirs are excluded by default; `show_hidden=True` includes them but does not override the skip list. |
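
The `last_input` rule in the first row can be illustrated with a minimal sketch. This is not the project's `_TokenTracker`: the class name `TokenTrackerSketch`, the `record()` method, the `total_*` / `loop_*` field names, and the `CONTEXT_BUDGET` value of 1,000 are all assumptions made for the example. Only `last_input`, `budget_exceeded()`, `reset_loop()`, and the behaviors named in this PR come from the description.

```python
from dataclasses import dataclass

# Hypothetical budget, chosen only to keep the arithmetic readable.
CONTEXT_BUDGET = 1_000


@dataclass
class TokenTrackerSketch:
    """Illustrative stand-in for _TokenTracker (field names are assumed)."""

    total_input: int = 0   # grand total, survives reset_loop()
    total_output: int = 0  # grand total, survives reset_loop()
    loop_input: int = 0    # per-loop counter
    loop_output: int = 0   # per-loop counter
    last_input: int = 0    # context size of the most recent call

    def record(self, input_tokens: int, output_tokens: int) -> None:
        self.total_input += input_tokens
        self.total_output += output_tokens
        self.loop_input += input_tokens
        self.loop_output += output_tokens
        # Each call's input already includes the full history, so the
        # latest value alone measures the current context size.
        self.last_input = input_tokens

    def budget_exceeded(self) -> bool:
        # Strict greater-than: landing exactly on the budget is allowed.
        return self.last_input > CONTEXT_BUDGET

    def reset_loop(self) -> None:
        # Zeroes the loop counters and last_input; grand totals are kept.
        self.loop_input = 0
        self.loop_output = 0
        self.last_input = 0


tracker = TokenTrackerSketch()
for _ in range(5):
    tracker.record(input_tokens=400, output_tokens=50)

# Cumulative input is now 2000, twice the budget, yet last_input is 400.
over_on_cumulative = tracker.budget_exceeded()

tracker.record(input_tokens=CONTEXT_BUDGET, output_tokens=10)
over_at_boundary = tracker.budget_exceeded()

tracker.record(input_tokens=CONTEXT_BUDGET + 1, output_tokens=10)
over_past_boundary = tracker.budget_exceeded()

tracker.reset_loop()
```

In this sketch, summing inputs would have tripped the gate after the third call even though no single request was anywhere near the limit, which is the double-counting the #44 fix avoids.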

PLAN.md updates

  • New Phase 2.7 entry for #56 (tool registration cleanup ✅ shipped Session 9).
  • New Phase 2.8 entry for #55 (✅ shipped) and #70 (this PR), framing pre-Phase-3 test coverage as a two-wave milestone.
  • Removed the now-stale Phase 3.4 (#56) and "Background chore" (#55) sections that were displaced by the pre-Phase-3 cleanup pattern.

Verification

  • 234 tests pass (209 prior + 25 new).
  • All new tests run in well under a second; no real filesystem outside tempfile.mkdtemp().
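
One way to keep filesystem tests confined to a throwaway directory is a small `unittest` base class built on `tempfile.mkdtemp()`. The sketch below is an assumption about how such isolation could look, not a copy of `tests/test_ai_pure.py`; the names `TempTreeTestCase`, `make_dirs`, and `TestTempTreeIsolation` are hypothetical.

```python
import os
import shutil
import tempfile
import unittest


class TempTreeTestCase(unittest.TestCase):
    """Hypothetical base class: every test gets its own scratch directory."""

    def setUp(self):
        self.root = tempfile.mkdtemp()
        # Registered cleanups run even if the test body fails.
        self.addCleanup(shutil.rmtree, self.root, ignore_errors=True)

    def make_dirs(self, *relative_paths):
        # Build directories strictly underneath the scratch root.
        for rel in relative_paths:
            os.makedirs(os.path.join(self.root, *rel.split("/")))


class TestTempTreeIsolation(TempTreeTestCase):
    def test_fresh_root_is_empty(self):
        self.assertEqual(os.listdir(self.root), [])

    def test_make_dirs_stays_inside_root(self):
        self.make_dirs("a/b", "c")
        self.assertEqual(sorted(os.listdir(self.root)), ["a", "c"])
        self.assertTrue(os.path.isdir(os.path.join(self.root, "a", "b")))
```

Using `addCleanup` rather than `tearDown` means the directory is removed even when `setUp` in a subclass raises after the scratch root was created.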

Notable behavior the tests pin down

  • _TokenTracker.reset_loop() zeroes last_input, not just the loop counters. The original issue draft assumed last_input was preserved across resets — reading the actual code revealed it isn't. The tests now pin the real behavior.
  • The budget gate is strict greater-than. last_input == CONTEXT_BUDGET does NOT trip budget_exceeded(). If anyone changes that to >= later, the test screams.
  • _synthesize_from_cache reads dir entries only. File entries are ignored entirely. The test makes this contract explicit.
  • show_hidden=True does not override the skip list. .git/ stays excluded even with hidden dirs visible — the skip filter and hidden filter are independent gates. Now pinned.
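
The "independent gates" behavior in the last bullet can be sketched as follows. This is a hypothetical reimplementation for illustration only: the function name `discover_directories_sketch`, its signature, the `SKIP_PATTERNS` constant, and the use of `os.walk` plus a final reversal to get leaves-first order are assumptions, not the project's `_discover_directories`.

```python
import fnmatch
import os
import shutil
import tempfile

# Skip list taken from the PR description; the constant name is assumed.
SKIP_PATTERNS = (".git", "__pycache__", "node_modules", "*.egg-info")


def discover_directories_sketch(target, show_hidden=False, exclude=()):
    """Return target and its subdirectories, deepest entries first."""
    patterns = tuple(SKIP_PATTERNS) + tuple(exclude)
    found = []
    for root, dirs, _files in os.walk(target, topdown=True):
        found.append(root)
        kept = []
        for name in sorted(dirs):
            # Gate 1: the skip list always applies.
            if any(fnmatch.fnmatchcase(name, p) for p in patterns):
                continue
            # Gate 2: hidden dirs are dropped unless show_hidden is set.
            if name.startswith(".") and not show_hidden:
                continue
            kept.append(name)
        # Pruning in place stops os.walk from descending into skipped dirs.
        dirs[:] = kept
    # Top-down order lists parents before children; reversing it puts
    # every directory after all of its descendants.
    found.reverse()
    return found


base = tempfile.mkdtemp()
for rel in ("src/pkg", ".git/objects", ".cache", "node_modules/lib"):
    os.makedirs(os.path.join(base, *rel.split("/")))

default_view = [
    os.path.relpath(p, base) for p in discover_directories_sketch(base)
]
hidden_view = [
    os.path.relpath(p, base)
    for p in discover_directories_sketch(base, show_hidden=True)
]
shutil.rmtree(base)
```

With `show_hidden=True` the `.cache` directory appears in `hidden_view`, while `.git` is still filtered by the first gate because the skip check runs before the hidden check and does not consult the flag.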
claude-code added 1 commit 2026-04-11 10:41:44 -06:00
Second wave of pre-Phase-3 test coverage. The #55 round picked off the
easy decision-logic helpers; this round covers the three highest-impact
helpers that escaped the first sweep.

Three new test classes appended to tests/test_ai_pure.py:

- TestTokenTracker (11 tests)
  Pins the load-bearing #44 fix: budget_exceeded() must use last_input
  (the most recent call's context size) NOT cumulative input, because
  each turn's input_tokens already includes the full message history.
  Tests assert: cumulative-input far above budget does NOT trip the
  gate when last_input stays small; reset_loop() preserves grand
  totals; the boundary is strict > not >=.

- TestSynthesizeFromCache (5 tests)
  The synthesis fallback fires only when _run_synthesis exhausts its
  max_turns, which almost never happens in normal runs — exactly the
  kind of code that silently rots. Tests assert: empty cache returns
  the incomplete-message brief and empty detailed; single dir entry
  produces a markdown line; multi-entry detailed contains all entries;
  empty-summary entries are skipped; file entries alone do not satisfy
  (the function reads dir entries only).

- TestDiscoverDirectories (9 tests)
  The leaves-first walk drives the entire dir-loop iteration order
  and is the foundation of the cache reuse story. Tests assert:
  empty target returns target only; nested trees come back
  leaves-first; .git / __pycache__ / node_modules / *.egg-info
  excluded; custom --exclude honored; hidden dirs excluded by default;
  show_hidden=True includes them but does not override the skip list.

PLAN.md: added Phase 2.7 (#56 ✅) and Phase 2.8 (#55 ✅, #70) entries
to the implementation order, and removed the now-stale Phase 3.4 (#56)
and Background chore (#55) sections that were displaced by the
pre-Phase-3 cleanup pattern.

Verification: 234 tests pass (209 prior + 25 new).
claude-code merged commit ef34a83f70 into main 2026-04-11 10:41:50 -06:00
Reference: archeious/luminos#71