M4.2 Test suite expansion and contract compliance #48

Open
opened 2026-04-08 23:24:29 +00:00 by claude-code · 0 comments
Collaborator

Phase 4 — Hardening, milestone 2.

Goal

Round out the test suite to cover the contract end to end and add property-style coverage for the things that have bitten us in production (synthesis parse failures, budget enforcement edges, env propagation).

Scope

  • Contract compliance tests — generate fixture ResearchResult instances and verify every field is populated correctly; assert that any code path producing a ResearchResult cannot omit required fields
  • Integration tests — full WebResearcher.research() loop with mocked Tavily + mocked Anthropic, asserting end-to-end shape
  • Budget enforcement edge cases — at-cap, over-cap, multi-iteration over-cap, synthesis-only-over-cap
  • Trace logger property tests — every starter action eventually completed; durations are non-negative; chunk hashes stable
  • CLI smoke testsask / replay / costs against a fixture filesystem
  • Coverage target: ≥85% on researchers/, cli/, obs/

Deliverable

  • pytest --cov reports ≥85% on the three packages
  • All existing tests still pass
  • New tests documented in test file headers
Phase 4 — Hardening, milestone 2. ## Goal Round out the test suite to cover the contract end to end and add property-style coverage for the things that have bitten us in production (synthesis parse failures, budget enforcement edges, env propagation). ## Scope - **Contract compliance tests** — generate fixture `ResearchResult` instances and verify every field is populated correctly; assert that any code path producing a `ResearchResult` cannot omit required fields - **Integration tests** — full `WebResearcher.research()` loop with mocked Tavily + mocked Anthropic, asserting end-to-end shape - **Budget enforcement edge cases** — at-cap, over-cap, multi-iteration over-cap, synthesis-only-over-cap - **Trace logger property tests** — every starter action eventually completed; durations are non-negative; chunk hashes stable - **CLI smoke tests** — `ask` / `replay` / `costs` against a fixture filesystem - Coverage target: ≥85% on `researchers/`, `cli/`, `obs/` ## Deliverable - `pytest --cov` reports ≥85% on the three packages - All existing tests still pass - New tests documented in test file headers
archeious added this to the Phase 4: Hardening milestone 2026-04-08 23:25:13 +00:00
Sign in to join this conversation.
No labels
No project
No assignees
1 participant
Notifications
Due date
The due date is invalid or out of range. Please use the format "yyyy-mm-dd".

No due date set.

Dependencies

No dependencies set.

Reference: archeious/marchwarden#48
No description provided.