marchwarden

Jeff Smith 1203b07248	2026-04-08 19:27:33 -06:00

fix(observability): persist full ResearchResult and per-item trace events

Closes #54.

The JSONL trace previously stored only counts on the `complete` event (gap_count, citation_count, discovery_count). Replay could re-render the step log but could not recover which gaps fired or which sources were cited, blocking M3.2/M3.3 stress-testing and calibration work.

Two complementary fixes:

(a) TraceLogger.write_result() dumps the pydantic ResearchResult to `<trace_id>.result.json` next to the JSONL trace. The agent calls it right before emitting the `complete` step. `cli replay` now loads the sibling result file when present and renders the structured tables under the trace step log.

(b) The agent emits one `gap_recorded`, `citation_recorded`, or `discovery_recorded` trace event per item from the final result. This gives the JSONL stream a queryable timeline of what was kept, with categories and topics in-band, without needing to load the result sibling.

Tests: 4 added (127 total passing). Smoke-tested live with a real ask; both files written and replay rendering verified.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
__init__.py	Initial project structure and scaffolding	2026-04-08 11:57:15 -06:00
__main__.py	M1.4: MCP server wrapping web researcher	2026-04-08 14:41:13 -06:00
agent.py	fix(observability): persist full ResearchResult and per-item trace events	2026-04-08 19:27:33 -06:00
models.py	depth flag now drives constraint defaults (#30)	2026-04-08 16:27:38 -06:00
server.py	depth flag now drives constraint defaults (#30)	2026-04-08 16:27:38 -06:00
tools.py	M1.1: Search and fetch tools with tests	2026-04-08 14:17:18 -06:00
trace.py	fix(observability): persist full ResearchResult and per-item trace events	2026-04-08 19:27:33 -06:00
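
The two fixes described in the commit message above can be sketched roughly as follows. This is a minimal illustration, not the code in `trace.py` or `agent.py`: `SketchResult`, `SketchTraceLogger`, `log_event`, `finish_run`, and all field names are hypothetical stand-ins, and a plain dataclass is used in place of the pydantic `ResearchResult` so the sketch needs only the standard library.

```python
import json
import tempfile
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class SketchResult:
    """Stand-in for the pydantic ResearchResult; field names are assumptions."""
    answer: str
    gaps: list = field(default_factory=list)
    citations: list = field(default_factory=list)
    discoveries: list = field(default_factory=list)


class SketchTraceLogger:
    """Hypothetical logger: a JSONL trace plus a sibling result file."""

    def __init__(self, trace_dir, trace_id):
        self.trace_path = Path(trace_dir) / f"{trace_id}.jsonl"
        self.result_path = Path(trace_dir) / f"{trace_id}.result.json"

    def log_event(self, action, **fields):
        # Append one JSON object per line to the trace stream.
        with self.trace_path.open("a", encoding="utf-8") as fh:
            fh.write(json.dumps({"action": action, **fields}) + "\n")

    def write_result(self, result):
        # Dump the full structured result next to the JSONL trace.
        payload = json.dumps(asdict(result), indent=2)
        self.result_path.write_text(payload, encoding="utf-8")


def finish_run(logger, result):
    # Fix (b): one event per kept item, so the JSONL stream alone shows
    # which gaps, citations, and discoveries made it into the result.
    for gap in result.gaps:
        logger.log_event("gap_recorded", **gap)
    for citation in result.citations:
        logger.log_event("citation_recorded", **citation)
    for discovery in result.discoveries:
        logger.log_event("discovery_recorded", **discovery)

    # Fix (a): persist the full result right before the `complete` step.
    logger.write_result(result)
    logger.log_event(
        "complete",
        gap_count=len(result.gaps),
        citation_count=len(result.citations),
        discovery_count=len(result.discoveries),
    )


def demo():
    # Write both files into a temporary directory and read them back.
    with tempfile.TemporaryDirectory() as tmp:
        logger = SketchTraceLogger(tmp, "demo-trace")
        result = SketchResult(
            answer="example answer",
            gaps=[{"topic": "pricing", "category": "source_not_found"}],
            citations=[{"source": "https://example.com"}],
        )
        finish_run(logger, result)
        lines = logger.trace_path.read_text(encoding="utf-8").splitlines()
        events = [json.loads(line) for line in lines]
        saved = json.loads(logger.result_path.read_text(encoding="utf-8"))
        return events, saved
```

Calling `demo()` returns the parsed trace events and the reloaded result file. A replay tool along the lines the commit describes could read the `.jsonl` file for the step log and load the `.result.json` sibling only when it exists.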