marchwarden/researchers
Jeff Smith 1203b07248 fix(observability): persist full ResearchResult and per-item trace events
Closes #54.

The JSONL trace previously stored only counts on the `complete` event
(gap_count, citation_count, discovery_count). Replay could re-render the
step log but could not recover which gaps fired or which sources were
cited, blocking M3.2/M3.3 stress-testing and calibration work.

Two complementary fixes:

(a) TraceLogger.write_result() dumps the pydantic ResearchResult to
    `<trace_id>.result.json` next to the JSONL trace. The agent calls it
    right before emitting the `complete` step. `cli replay` now loads the
    sibling result file when present and renders the structured tables
    under the trace step log.

(b) The agent emits one `gap_recorded`, `citation_recorded`, or
    `discovery_recorded` trace event per item from the final result. This
    gives the JSONL stream a queryable timeline of what was kept, with
    categories and topics in-band, without needing to load the result
    sibling.
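
A minimal sketch of how the two fixes fit together, not the actual
implementation. Assumptions: a stdlib dataclass stands in for the pydantic
ResearchResult, and the field names, the `log_step` helper, the `finish`
function, and the `action` key are hypothetical. Only `write_result`, the
`<trace_id>.result.json` file name, and the event names come from the
description above. Emitting the per-item events before `write_result` is
also an assumption; the description only fixes `write_result` as coming
right before `complete`.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class ResearchResult:
    # Hypothetical stand-in for the pydantic model; field names are assumed.
    answer: str
    gaps: list = field(default_factory=list)
    citations: list = field(default_factory=list)
    discoveries: list = field(default_factory=list)


class TraceLogger:
    def __init__(self, trace_dir, trace_id):
        base = Path(trace_dir)
        self.trace_path = base / f"{trace_id}.jsonl"
        # Sibling file written next to the JSONL trace.
        self.result_path = base / f"{trace_id}.result.json"

    def log_step(self, action, **fields):
        # Hypothetical helper: append one JSON object per line.
        with self.trace_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps({"action": action, **fields}) + "\n")

    def write_result(self, result):
        # Persist the full structured result, not just counts.
        payload = json.dumps(asdict(result), indent=2)
        self.result_path.write_text(payload, encoding="utf-8")


def finish(logger, result):
    # (b) one trace event per kept item, with its fields in-band
    for gap in result.gaps:
        logger.log_step("gap_recorded", **gap)
    for citation in result.citations:
        logger.log_step("citation_recorded", **citation)
    for discovery in result.discoveries:
        logger.log_step("discovery_recorded", **discovery)
    # (a) dump the result right before the `complete` step
    logger.write_result(result)
    logger.log_step(
        "complete",
        gap_count=len(result.gaps),
        citation_count=len(result.citations),
        discovery_count=len(result.discoveries),
    )


def demo():
    with tempfile.TemporaryDirectory() as tmp:
        logger = TraceLogger(tmp, "trace-demo")
        result = ResearchResult(
            answer="example answer",
            gaps=[{"topic": "pricing", "category": "source_not_found"}],
            citations=[{"source": "https://example.com", "title": "Example"}],
        )
        finish(logger, result)
        lines = logger.trace_path.read_text(encoding="utf-8").splitlines()
        events = [json.loads(line) for line in lines]
        saved = json.loads(logger.result_path.read_text(encoding="utf-8"))
        return events, saved
```

With this layout, the JSONL stream alone can be filtered for
`gap_recorded` events, and a replay tool only needs the sibling file when
it wants to render the full structured result.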

Tests: 4 added (127 total passing). Smoke-tested live with a real ask;
both files written and replay rendering verified.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 19:27:33 -06:00
web fix(observability): persist full ResearchResult and per-item trace events 2026-04-08 19:27:33 -06:00
__init__.py Initial project structure and scaffolding 2026-04-08 11:57:15 -06:00