marchwarden/CLAUDE.md
Jeff Smith d279c4c20e chore: update CLAUDE.md for session 2
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
2026-04-08 17:30:59 -06:00

# Marchwarden — Project Context
## What This Is
A network of agentic research specialists (MCP servers) coordinated by a
principal investigator (PI) agent. An educational project for learning about
agents, MCP, and agent composition.
## Current Project State
| | |
|---|---|
| **Phase** | Phases 0 through 2.5 complete; V1 shipped. Next: Phase 3 (stress testing) or Phase 5 (arxiv-rag) |
| **Last worked on** | 2026-04-08 |
| **Last commit** | `af79358` — Merge PR #36: per-step durations in trace and operational logs |
| **Branch** | `main` (clean) |
| **Tests** | 123 passing |
| **Blocking issues** | None |
## Key Files
| File | Purpose |
|---|---|
| `researchers/web/models.py` | Research Contract v1 Pydantic models + `DEPTH_PRESETS` |
| `researchers/web/tools.py` | Tavily search + URL fetch with content hashing |
| `researchers/web/trace.py` | JSONL trace logger + step-duration tracking + structlog mirror |
| `researchers/web/agent.py` | WebResearcher — inner agentic loop |
| `researchers/web/server.py` | FastMCP server wrapping the researcher |
| `cli/main.py` | CLI: `ask` / `replay` / `costs` |
| `obs/__init__.py` | Structured operational logger (structlog) |
| `obs/costs.py` | Cost ledger + price table |
| `Makefile` | `make install` / `test` / `ask` / `costs` / `clean` |
| `Dockerfile` + `scripts/docker-test.sh` | Reproducible test environment |
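
The `tools.py` row mentions URL fetch with content hashing. A minimal sketch of what such a hash could look like follows; the choice of SHA-256, the whitespace normalization, and the function name `content_hash` are all assumptions for illustration, not taken from the actual module.

```python
import hashlib


def content_hash(text: str) -> str:
    """Return a hex digest identifying fetched page content.

    Hypothetical helper: normalizes line endings and surrounding
    whitespace so trivially different fetches of the same page
    produce the same hash.
    """
    normalized = text.replace("\r\n", "\n").strip()
    return hashlib.sha256(normalized.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    # Two fetches that differ only in line endings hash identically.
    first = content_hash("Title\r\nBody text\r\n")
    second = content_hash("Title\nBody text")
    print(first == second)
```

Storing a hash like this next to each citation would let a replay detect whether a source changed after it was fetched.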
## Architecture
- **Researcher** = MCP server exposing `research(question) -> ResearchResult`
- **ResearchResult** = answer + citations (with raw_excerpt) + categorized gaps +
discovery_events + open_questions + confidence + confidence_factors + cost_metadata + trace_id
- **Agent loop** = Claude tool-use loop (plan→search→fetch→iterate) + synthesis step
- **Trace** = JSONL audit log per research call at `~/.marchwarden/traces/`
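
The result shape and the per-call trace can be sketched as below. This uses stdlib dataclasses rather than the Pydantic models in `researchers/web/models.py` so that it runs without dependencies. The field names come from the list above, but every field type, the `Citation.source` field, the record keys, and the helpers `append_trace_step` and `read_trace` are assumptions for illustration.

```python
from __future__ import annotations

import json
import tempfile
import time
import uuid
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Citation:
    source: str       # hypothetical field name
    raw_excerpt: str  # named in the architecture notes


@dataclass
class ResearchResult:
    # Field names follow the notes above; the types are guesses.
    answer: str
    citations: list[Citation] = field(default_factory=list)
    gaps: list[dict] = field(default_factory=list)
    discovery_events: list[dict] = field(default_factory=list)
    open_questions: list[str] = field(default_factory=list)
    confidence: float = 0.0
    confidence_factors: dict = field(default_factory=dict)
    cost_metadata: dict = field(default_factory=dict)
    trace_id: str = field(default_factory=lambda: str(uuid.uuid4()))


def append_trace_step(trace_dir: Path, trace_id: str, action: str, **details) -> Path:
    """Append one JSON object per line to <trace_dir>/<trace_id>.jsonl."""
    trace_dir.mkdir(parents=True, exist_ok=True)
    path = trace_dir / f"{trace_id}.jsonl"
    record = {"trace_id": trace_id, "action": action, "timestamp": time.time(), **details}
    with path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")
    return path


def read_trace(path: Path) -> list[dict]:
    """Load a JSONL trace back into a list of step records."""
    lines = path.read_text(encoding="utf-8").splitlines()
    return [json.loads(line) for line in lines if line.strip()]


if __name__ == "__main__":
    # A temp directory stands in for ~/.marchwarden/traces/.
    with tempfile.TemporaryDirectory() as tmp:
        result = ResearchResult(answer="stub answer")
        trace_path = append_trace_step(Path(tmp), result.trace_id, "search", query="stub")
        append_trace_step(Path(tmp), result.trace_id, "fetch", url="https://example.com")
        print([step["action"] for step in read_trace(trace_path)])
        print(sorted(asdict(result)))
```

One file per `trace_id` with append-only lines keeps each research call independently replayable, which is roughly what a `replay` command would need.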
## Conventions
- API keys live in `~/secrets` (not `.env`)
- Wiki is at `docs/wiki/` (local git clone, not MCP — wiki MCP is buggy)
- All merges via Forgejo API (claude-code user can't merge via MCP)
- One branch per concern, merge via PR, delete branch after
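
For the merge-via-API convention, a rough sketch of building the merge call is shown below. The endpoint path and the `Do` / `delete_branch_after_merge` fields follow the Gitea-style REST API that Forgejo exposes, as far as I know; verify them against the instance's own API docs. The function name, the placeholder host, and passing the token as a parameter are assumptions, since the layout of `~/secrets` is not described here.

```python
import json
import urllib.request


def build_merge_request(base_url: str, owner: str, repo: str,
                        pr_index: int, token: str,
                        delete_branch: bool = True) -> urllib.request.Request:
    """Build (but do not send) a POST that merges a pull request.

    Hypothetical helper; the caller decides when to send it.
    """
    url = f"{base_url.rstrip('/')}/api/v1/repos/{owner}/{repo}/pulls/{pr_index}/merge"
    payload = {
        "Do": "merge",  # merge style; other styles exist in the API
        "delete_branch_after_merge": delete_branch,
    }
    return urllib.request.Request(
        url,
        data=json.dumps(payload).encode("utf-8"),
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


if __name__ == "__main__":
    # Placeholder host, repo, PR number, and token; nothing is sent.
    req = build_merge_request("https://forgejo.example.com", "owner", "repo", 1, "TOKEN")
    print(req.get_method(), req.full_url)
```

Sending it would be `urllib.request.urlopen(req)`; setting `delete_branch_after_merge` covers the "delete branch after" convention in the same call.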
## Session Log
| Session | Date | Summary |
|---|---|---|
| 1 | 2026-04-08 | Project creation, naming, contract design, Phase 0 + Phase 1 complete (81 tests) |
| 2 | 2026-04-08 | Phase 2 (CLI shim) + Phase 2.5 (logging + cost tracking) shipped; V1 shipped; depth presets; docker test env; per-step duration tracking; arxiv-rag scoped as M5.1; Phase 3/4/5/6 milestones populated (123 tests) |
## What's Next
**Recommended next session: Phase 3 (Stress Testing & Calibration)** before Phase 5, since stress tests will likely tighten the contract before a second researcher has to implement it.
- **Phase 3:** Issue #44 (M3.1 single-axis stress tests) → #45 (M3.2 multi-axis) → #46 (M3.3 confidence calibration)
- **Phase 5 alternative:** Issue #38 (M5.1.1 arxiv-rag ingest pipeline). New deps: pymupdf, chromadb, sentence-transformers, arxiv. Design lives at [wiki/ArxivRagProposal](https://forgejo.labbity.unbiasedgeek.com/archeious/marchwarden/wiki/ArxivRagProposal).
Open milestones in Forgejo: Phase 3 (3 issues), Phase 4 (3 issues), Phase 5 (8 issues including arxiv-rag tracker), Phase 6 (2 issues).