Adds an append-only JSONL ledger of every research() call at
~/.marchwarden/costs.jsonl, supplementing (not replacing) the
per-call cost_metadata field returned to callers. The ledger is
the operator-facing source of truth for spend tracking, queryable
via the upcoming `marchwarden costs` command (M2.5.3).
Fields per entry: timestamp, trace_id, question (truncated 200ch),
model_id, tokens_used, tokens_input, tokens_output, iterations_run,
wall_time_sec, tavily_searches, estimated_cost_usd, budget_exhausted,
confidence.
Cost estimation reads ~/.marchwarden/prices.toml, which is
auto-created with seed values for current Anthropic + Tavily rates
on first run. Operators are expected to update prices.toml
manually when upstream rates change — there is no automatic
fetching. Existing files are never overwritten. Unknown models
log a WARN and record estimated_cost_usd: null instead of
crashing.
Each ledger write also emits a structured `cost_recorded` log line
via the M2.5.1 logger, so cost data ships to OpenSearch alongside
the ledger file with no extra plumbing.
Tracking changes in agent.py:
- Track tokens_input / tokens_output split (not just total)
- Count tavily_searches across iterations
- _synthesize now returns (result, synth_in, synth_out) so the
caller can attribute synthesis tokens to the running counters
- Ledger.record() called after research_completed log; failures
are caught and warn-logged so a ledger write can never poison
a successful research call
Tests cover: price table seeding, no-overwrite of existing files,
cost estimation for known/unknown models, tavily-only cost,
ledger appends, question truncation, env var override.
End-to-end verified with a real Anthropic+Tavily call:
9107 input + 1140 output tokens, 1 tavily search, $0.049 estimated.
104/104 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds an operational logging layer separate from the JSONL trace
audit logs. Operational logs cover system events (startup, errors,
MCP transport, research lifecycle); JSONL traces remain the
researcher provenance audit trail.
Backend: structlog with two renderers selectable via
MARCHWARDEN_LOG_FORMAT (json|console). Defaults to console when
stderr is a TTY, json otherwise — so dev runs are human-readable
and shipped runs (containers, automation) emit OpenSearch-ready
JSON without configuration.
Key features:
- Named loggers per component: marchwarden.cli,
marchwarden.mcp, marchwarden.researcher.web
- MARCHWARDEN_LOG_LEVEL controls global level (default INFO)
- MARCHWARDEN_LOG_FILE=1 enables a 10MB-rotating file at
~/.marchwarden/logs/marchwarden.log
- structlog contextvars bind trace_id + researcher at the start
of each research() call so every downstream log line carries
them automatically; cleared on completion
- stdlib logging is funneled through the same pipeline so noisy
third-party loggers (httpx, anthropic) get the same formatting
and quieted to WARN unless DEBUG is requested
- Logs to stderr to keep MCP stdio stdout clean
Wired into:
- cli.main.cli — configures logging on startup, logs ask_started/
ask_completed/ask_failed
- researchers.web.server.main — configures logging on startup,
logs mcp_server_starting
- researchers.web.agent.research — binds trace context, logs
research_started/research_completed
Tests verify JSON and console formats, contextvar propagation,
level filtering, idempotency, and auto-configure-on-first-use.
94/94 tests passing.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds `marchwarden replay <trace_id>` to pretty-print a prior research
run from its JSONL trace file. Resolves the trace under
~/.marchwarden/traces/ by default; --trace-dir overrides for tests and
custom locations. Renders each step as a row with action, decision,
extra fields, and content_hash. Friendly errors for unknown trace_id
and malformed JSON lines.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Click app with `ask` subcommand that spawns the web researcher MCP
server over stdio, calls the research tool, and pretty-prints the
ResearchResult contract using rich (panels for answer/confidence/cost,
tables for citations, gaps, discovery events, and open questions).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
FastMCP server exposing a single 'research' tool:
- Delegates to WebResearcher with keys from ~/secrets
- Accepts question, context, depth, max_iterations, token_budget
- Returns full ResearchResult as JSON
- Configurable model via MARCHWARDEN_MODEL env var
- Runnable as: python -m researchers.web
4 tests: secret reading, JSON response validation, default parameters.
Refs: archeious/marchwarden#1
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
New field on ResearchResult: open_questions — follow-up questions that
emerged from the research itself. Distinct from gaps (backward: what
failed) and discovery_events (sideways: what's lateral). Open questions
look forward: 'based on what I found, this needs deeper investigation.'
- OpenQuestion model: question, context, priority (high/medium/low),
source_locator
- Updated agent synthesis prompt to produce open_questions
- Updated agent result builder to parse open_questions from JSON
- 3 new tests for OpenQuestion model
- Updated existing tests for new field
77 tests passing.
Refs: archeious/marchwarden#1
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
WebResearcher — the core agentic research loop:
- Tool-use loop: Claude decides when to search (Tavily) and fetch (httpx)
- Budget enforcement: stops at max_iterations or token_budget
- Synthesis step: separate LLM call produces structured ResearchResult JSON
- Fallback: valid ResearchResult even when synthesis JSON is unparseable
- Full trace logging at every step (start, search, fetch, synthesis, complete)
- Populates all contract fields: raw_excerpt, categorized gaps,
discovery_events, confidence_factors, cost_metadata with model_id
9 tests: complete research loop, budget exhaustion, synthesis failure
fallback, trace file creation, fetch_url tool integration, search
result formatting.
Refs: archeious/marchwarden#1
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>