marchwarden/researchers/web/server.py

"""MCP server for the web researcher.

Exposes a single tool `research` that delegates to WebResearcher.
Run with: python -m researchers.web.server
"""

import os
from typing import Optional

from mcp.server.fastmcp import FastMCP

from obs import configure_logging, get_logger
from researchers.web.agent import WebResearcher
from researchers.web.models import constraints_for_depth

log = get_logger("marchwarden.mcp")

mcp = FastMCP(
    name="marchwarden-web-researcher",
    instructions=(
        "A Marchwarden web research specialist. "
        "Call the research tool with a question to get a grounded, "
        "evidence-based answer with citations, gaps, open questions, "
        "and confidence scoring."
    ),
)


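def _read_secret(key: str) -> str:
    """Read a secret from the ~/secrets file.

    The file is read as one KEY=value pair per line. Raises ValueError
    if no line starts with "<key>=".
    """
    secrets_path = os.path.expanduser("~/secrets")
    with open(secrets_path) as f:
        for line in f:
            if line.startswith(f"{key}="):
                # Split on the first "=" only, so values may contain "=".
                return line.split("=", 1)[1].strip()
    raise ValueError(f"Key {key} not found in {secrets_path}")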


def _get_researcher() -> WebResearcher:
    """Create a WebResearcher with keys from ~/secrets."""
    return WebResearcher(
        anthropic_api_key=_read_secret("ANTHROPIC_API_KEY"),
        tavily_api_key=_read_secret("TAVILY_API_KEY"),
        model_id=os.environ.get("MARCHWARDEN_MODEL", "claude-sonnet-4-6"),
    )


@mcp.tool()
async def research(
    question: str,
    context: Optional[str] = None,
    depth: str = "balanced",
    max_iterations: Optional[int] = None,
    token_budget: Optional[int] = None,
) -> str:
    """Research a question using web search and return a structured answer.

    Args:
        question: The question to investigate.
        context: What the caller already knows (optional).
        depth: Research depth — "shallow", "balanced", or "deep". Each
            depth picks default max_iterations / token_budget / max_sources.
        max_iterations: Override the depth preset for iterations (1-20).
        token_budget: Override the depth preset for token budget.

    Returns:
        JSON string containing the full ResearchResult with answer,
        citations, gaps, discovery_events, open_questions, confidence,
        and cost_metadata.
    """
    researcher = _get_researcher()
    constraints = constraints_for_depth(
        depth,
        max_iterations=max_iterations,
        token_budget=token_budget,
    )

    result = await researcher.research(
        question=question,
        context=context,
        depth=depth,
        constraints=constraints,
    )

    return result.model_dump_json(indent=2)


def main() -> None:
    """Run the MCP server on stdio."""
    configure_logging()
    log.info(
        "mcp_server_starting",
        transport="stdio",
        server="marchwarden-web-researcher",
    )
    mcp.run(transport="stdio")


if __name__ == "__main__":
    main()
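
The secrets lookup in `_read_secret` can be tried in isolation. The sketch below mirrors its parsing logic but takes the file path as a parameter (an assumption made here so the example can run against a temporary file rather than `~/secrets`); the names `read_secret_from` and `demo` and the placeholder key values are hypothetical.

```python
import os
import tempfile


def read_secret_from(secrets_path: str, key: str) -> str:
    """Return the value for `key` from a file of KEY=value lines."""
    with open(secrets_path) as f:
        for line in f:
            if line.startswith(f"{key}="):
                # Split on the first "=" only, so values may contain "=".
                return line.split("=", 1)[1].strip()
    raise ValueError(f"Key {key} not found in {secrets_path}")


def demo() -> tuple:
    """Write a throwaway secrets file and read both keys back."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "secrets")
        with open(path, "w") as f:
            f.write("ANTHROPIC_API_KEY=placeholder=with-equals\n")
            f.write("TAVILY_API_KEY=  padded-placeholder  \n")
        return (
            read_secret_from(path, "ANTHROPIC_API_KEY"),
            read_secret_from(path, "TAVILY_API_KEY"),
        )
```

Note that the match is a plain prefix check, so a line with leading whitespace or an `export ` prefix would not be found under this logic.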
M1.4: MCP server wrapping web researcher

FastMCP server exposing a single 'research' tool:
- Delegates to WebResearcher with keys from ~/secrets
- Accepts question, context, depth, max_iterations, token_budget
- Returns full ResearchResult as JSON
- Configurable model via MARCHWARDEN_MODEL env var
- Runnable as: python -m researchers.web

4 tests: secret reading, JSON response validation, default parameters.

Refs: archeious/marchwarden#1
Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
2026-04-08 20:41:13 +00:00

M2.5.1: Structured application logger via structlog (#24)

Adds an operational logging layer separate from the JSONL trace audit logs. Operational logs cover system events (startup, errors, MCP transport, research lifecycle); JSONL traces remain the researcher provenance audit trail.

Backend: structlog with two renderers selectable via MARCHWARDEN_LOG_FORMAT (json|console). Defaults to console when stderr is a TTY, json otherwise, so dev runs are human-readable and shipped runs (containers, automation) emit OpenSearch-ready JSON without configuration.

Key features:
- Named loggers per component: marchwarden.cli, marchwarden.mcp, marchwarden.researcher.web
- MARCHWARDEN_LOG_LEVEL controls global level (default INFO)
- MARCHWARDEN_LOG_FILE=1 enables a 10MB-rotating file at ~/.marchwarden/logs/marchwarden.log
- structlog contextvars bind trace_id + researcher at the start of each research() call so every downstream log line carries them automatically; cleared on completion
- stdlib logging is funneled through the same pipeline so noisy third-party loggers (httpx, anthropic) get the same formatting and are quieted to WARN unless DEBUG is requested
- Logs to stderr to keep MCP stdio stdout clean

Wired into:
- cli.main.cli: configures logging on startup, logs ask_started/ask_completed/ask_failed
- researchers.web.server.main: configures logging on startup, logs mcp_server_starting
- researchers.web.agent.research: binds trace context, logs research_started/research_completed

Tests verify JSON and console formats, contextvar propagation, level filtering, idempotency, and auto-configure-on-first-use. 94/94 tests passing.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 21:46:51 +00:00

depth flag now drives constraint defaults (#30)

Previously the depth parameter (shallow/balanced/deep) was passed only as a text hint inside the agent's user message, with no mechanical effect on iterations, token budget, or source count. The flag was effectively cosmetic: the LLM was expected to "interpret" it.

Add DEPTH_PRESETS table and constraints_for_depth() helper in researchers.web.models:
- shallow: 2 iters, 5,000 tokens, 5 sources
- balanced: 5 iters, 20,000 tokens, 10 sources (= historical defaults)
- deep: 8 iters, 60,000 tokens, 20 sources

Wired through the stack:
- WebResearcher.research(): when constraints is None, builds from the depth preset instead of bare ResearchConstraints()
- MCP server `research` tool: max_iterations and token_budget now default to None; constraints are built via constraints_for_depth with explicit values overriding the preset
- CLI `ask` command: --max-iterations and --budget default to None; the CLI only forwards them to the MCP tool when set, so unset flags fall through to the depth preset

balanced is unchanged from the historical defaults so existing callers see no behavior difference. Explicit --max-iterations / --budget always win over the preset.

Tests cover each preset's values, balanced backward-compat, unknown depth fallback, full override, and partial override. 116/116 tests passing. Live-verified: --depth shallow on a simple question now caps at 2 iterations and stays under budget.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 22:27:38 +00:00

Fix invalid default model id (#15)

Both the MCP server and WebResearcher defaulted to claude-sonnet-4-5-20250514, which 404s against the Anthropic API. Update both defaults to claude-sonnet-4-6, which is current as of 2026-04.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 21:25:19 +00:00
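
The preset-plus-override behavior described in the depth commit (#30) can be illustrated with a small stand-in. This is a sketch only, not the code in `researchers.web.models`: the `SketchConstraints` dataclass and the `_sketch` function name are hypothetical, the real `ResearchConstraints` type is not shown here, and falling back to `balanced` for an unknown depth is an assumption (the commit mentions an "unknown depth fallback" without saying what it falls back to). The preset numbers are the ones listed in that commit message.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class SketchConstraints:
    """Hypothetical stand-in for the project's ResearchConstraints."""
    max_iterations: int
    token_budget: int
    max_sources: int


# Values taken from the depth commit message (#30).
DEPTH_PRESETS = {
    "shallow": SketchConstraints(2, 5_000, 5),
    "balanced": SketchConstraints(5, 20_000, 10),
    "deep": SketchConstraints(8, 60_000, 20),
}


def constraints_for_depth_sketch(
    depth: str,
    max_iterations: Optional[int] = None,
    token_budget: Optional[int] = None,
) -> SketchConstraints:
    """Start from the depth preset, then apply any explicit overrides."""
    # Assumption: unknown depths fall back to the "balanced" preset.
    preset = DEPTH_PRESETS.get(depth, DEPTH_PRESETS["balanced"])
    overrides = {}
    if max_iterations is not None:
        overrides["max_iterations"] = max_iterations
    if token_budget is not None:
        overrides["token_budget"] = token_budget
    # dataclasses.replace returns a new instance; the preset is untouched.
    return replace(preset, **overrides)
```

Under these assumptions, `constraints_for_depth_sketch("shallow", token_budget=8_000)` keeps the shallow iteration and source limits while raising only the token budget, which matches the "partial override" case the commit says is tested.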