retro: Session 3 — Phase 1 complete, MCP backend architecture design
parent e7b4613471
commit 0bbd98c9eb
2 changed files with 73 additions and 0 deletions

Session3.md (72 additions)
@@ -0,0 +1,72 @@
# Session 3

**Date:** 2026-04-06
**Focus:** Phase 1 completion, architectural tangent (MCP backends), documentation

---

## What Was Done

### Phase 1: Confidence Tracking (completed)

All three Phase 1 issues shipped and closed:

- **#1** (confidence fields in cache schemas): already done at session start.
- **#2** (update dir loop prompt): added `confidence` and `confidence_reason` to both cache schemas in `_DIR_SYSTEM_PROMPT`. Added a `## Confidence` section with categorical guidance (high ≥ 0.8, medium 0.5–0.8, low < 0.5) and a rule to include `confidence_reason` when confidence is below 0.7. Commit: `feat(prompts): instruct agent to set confidence on cache writes`
- **#3** (`low_confidence_entries()`): added a method to `_CacheManager` that returns all file and dir cache entries below a confidence threshold (default 0.7). Entries missing a confidence field are included as unrated. Results are sorted by confidence, ascending. 7 new tests added; all 136 tests pass. Commit: `feat(cache): add low_confidence_entries() query to CacheManager`

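
A minimal sketch of how the #3 query could sit on top of the read layer. Only the names `low_confidence_entries()` and `read_all_entries()`, the 0.7 default, the unrated-as-0.0 treatment, and the ascending sort come from these notes. The in-memory `CacheSketch` class and the entry dict shape are assumptions for illustration, not the actual `_CacheManager`.

```python
from typing import Any

Entry = dict[str, Any]


class CacheSketch:
    """Hypothetical in-memory stand-in for the real cache manager."""

    def __init__(self, entries: list[Entry]) -> None:
        self._entries = entries

    def read_all_entries(self) -> list[Entry]:
        # The real method presumably reads file and dir entries from disk;
        # this sketch just returns a copy of what it was constructed with.
        return list(self._entries)

    def low_confidence_entries(self, threshold: float = 0.7) -> list[Entry]:
        def score(entry: Entry) -> float:
            # Entries written before confidence tracking have no field;
            # treat them as unrated (0.0) so they are always surfaced.
            value = entry.get("confidence")
            return 0.0 if value is None else float(value)

        low = [e for e in self.read_all_entries() if score(e) < threshold]
        return sorted(low, key=score)  # ascending: least trusted first


cache = CacheSketch([
    {"path": "src/a.py", "confidence": 0.9},
    {"path": "src/b.py", "confidence": 0.4, "confidence_reason": "binary blob"},
    {"path": "src/legacy.py"},  # no confidence field at all
    {"path": "src", "confidence": 0.65},
])
low_paths = [e["path"] for e in cache.low_confidence_entries()]
```

Because the query is a filter and a sort over `read_all_entries()`, it needs no storage access of its own, which matches the observation that the query layer composed naturally on the read layer.
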
### New issues opened

- **#38**: Cache invalidation based on file mtime. The cache is keyed purely on path, with no staleness detection, so re-runs silently use stale entries. Fix: store mtime at write time, compare on re-run, and invalidate if the file is newer.
- **#39**: Phase 3.5: Migrate to MCP backend architecture. Full design captured.
- **#40**: Review and update Phase 4+ issues after the MCP pivot lands (primarily, the Phase 4 external knowledge tool issues need rewriting as MCP servers).

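
The fix proposed in #38 could look roughly like the sketch below. It is not luminos code: `write_entry`, `is_stale`, and the `mtime` key are hypothetical names, and the cache here is a plain dict keyed on path.

```python
import os
import tempfile
from typing import Any

# Hypothetical cache: path -> entry dict. The real cache layout may differ.
cache: dict[str, dict[str, Any]] = {}


def write_entry(path: str, summary: str) -> None:
    # Record the file's mtime at write time so a later run can compare.
    cache[path] = {"summary": summary, "mtime": os.stat(path).st_mtime_ns}


def is_stale(path: str) -> bool:
    entry = cache.get(path)
    if entry is None or "mtime" not in entry:
        return True  # nothing cached, or written before mtime tracking
    try:
        return os.stat(path).st_mtime_ns > entry["mtime"]
    except FileNotFoundError:
        return True  # file is gone; the entry can no longer be trusted


with tempfile.TemporaryDirectory() as tmp:
    target = os.path.join(tmp, "example.txt")
    with open(target, "w") as fh:
        fh.write("v1")
    write_entry(target, "first summary")
    fresh_before_edit = not is_stale(target)

    # Simulate a later edit by moving the mtime one second forward.
    stat = os.stat(target)
    os.utime(target, ns=(stat.st_atime_ns, stat.st_mtime_ns + 1_000_000_000))
    stale_after_edit = is_stale(target)
```

Comparing with `!=` instead of `>` would also catch files restored from an older copy; the issue text only asks for the "newer" case, so the sketch follows that.
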
### PLAN.md updated

Added **Part 10: MCP Backend Abstraction**, the full design for migrating luminos into an MCP client/server model. Added **Phase 3.5** to the implementation order between Phase 3 and Phase 4.

---

## Discoveries and Observations

- The cache has no invalidation logic at all. `has_entry()` is purely path-based. `cached_at` is stored but never read for comparison. `--fresh` is the only escape. This is a bigger gap than it first appeared; documented as #38.
- The existing `read_all_entries()` method made `low_confidence_entries()` trivial to implement: the query layer composed naturally on top of the read layer.

---

## Decisions Made

**MCP pivot timing: after Phase 3, before Phase 4.**

Discussed at length. Rationale:
- After Phase 3, survey + planning + dir loops + synthesis are all working with filesystem assumptions baked in. That is enough surface area to make the migration genuinely instructive.
- Phase 4 external tools (web_search, fetch_url, package_lookup) are naturally MCP servers; implementing them before the pivot would mean doing them twice.
- The project's primary goal is learning agentic AI. Migrating working code into an MCP architecture is a valuable lesson in itself, so the migration pain is intentional.

**Confidence field: include missing entries in `low_confidence_entries()`.**

Entries written before confidence tracking existed have no confidence field. Treating them as unrated (confidence=0.0) means they surface in refinement pass queries rather than being silently trusted. Safer default.

---

## Raw Thinking

The MCP tangent surfaced a real architectural tension: luminos is currently a tightly coupled pipeline where filesystem assumptions are woven throughout `ai.py`, the prompts, and the tool dispatch. The investigation loop *logic* (survey → plan → investigate → synthesize) is genuinely generic, but it's not expressed that way in the code. The MCP pivot will force that separation to become explicit, which is the whole point.

The "tree assumption" is the most load-bearing thing to watch during the pivot. Every part of the investigation loop assumes hierarchical containers (leaf-first traversal, child summaries injected upward, synthesis over dir summaries). Non-filesystem backends that aren't clean trees will expose this. The filesystem MCP server can just wrap the existing logic, but the second backend we build will immediately hit this constraint.

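
To make the tree assumption concrete, here is a toy version of the leaf-first pass: children are summarized before their parent, and child summaries are passed upward. The `Node` type, `investigate`, and the `summarize` callback are hypothetical stand-ins, not luminos code.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Node:
    """Hypothetical container: a directory-like node with child nodes."""
    name: str
    children: list["Node"] = field(default_factory=list)


def investigate(node: Node,
                summarize: Callable[[str, list[str]], str],
                order: list[str]) -> str:
    # Post-order walk: every child is summarized before its parent, so the
    # parent's summary can be built from the child summaries.
    child_summaries = [investigate(c, summarize, order) for c in node.children]
    order.append(node.name)
    return summarize(node.name, child_summaries)


def toy_summarize(name: str, child_summaries: list[str]) -> str:
    # Stand-in for the LLM call: just nest the child summaries.
    return f"{name}({', '.join(child_summaries)})" if child_summaries else name


tree = Node("root", [Node("src", [Node("core"), Node("util")]), Node("docs")])
visit_order: list[str] = []
root_summary = investigate(tree, toy_summarize, visit_order)
```

The recursion only terminates cleanly because each node has exactly one parent and there are no cycles. A backend whose containers form a graph would need a visited set or a different ordering.
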
The distinction between **backend tools** (read_file, list_dir, parse_structure), which move to MCP, and **control tools** (write_cache, submit_report, submit_plan), which stay in luminos, is the key design invariant to preserve during Phase 3.5. If control tools leak into MCP servers, the investigation loop loses its ability to observe and record findings.

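
One way to keep that invariant visible in code is a dispatcher that routes by tool name. The tool names come from the notes above; `dispatch`, `call_backend`, `fake_backend`, and the `findings` list are hypothetical, and the backend call is a plain callable rather than a real MCP client.

```python
from typing import Any, Callable

# Tool names taken from the notes; the grouping mirrors the stated split.
BACKEND_TOOLS = {"read_file", "list_dir", "parse_structure"}
CONTROL_TOOLS = {"write_cache", "submit_report", "submit_plan"}

# Hypothetical record of what the loop has observed via control tools.
findings: list[tuple[str, dict[str, Any]]] = []


def dispatch(name: str,
             args: dict[str, Any],
             call_backend: Callable[[str, dict[str, Any]], Any]) -> Any:
    if name in CONTROL_TOOLS:
        # Handled in-process so the loop sees and records every finding.
        findings.append((name, args))
        return {"ok": True}
    if name in BACKEND_TOOLS:
        # Forwarded to whatever backend is plugged in (an MCP server later).
        return call_backend(name, args)
    raise ValueError(f"unknown tool: {name}")


def fake_backend(name: str, args: dict[str, Any]) -> Any:
    # Stand-in for a remote server; simply echoes the request.
    return {"tool": name, "args": args}


backend_result = dispatch("read_file", {"path": "a.py"}, fake_backend)
control_result = dispatch("write_cache", {"path": "a.py", "confidence": 0.6}, fake_backend)
```

Keeping the two name sets disjoint and checking control tools first means a backend can never shadow a control tool, even if it advertises one with the same name.
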
The confidence calibration concern from PLAN.md is real: LLMs tend to be overconfident. The categorical guidance in the prompt (high/medium/low with numeric anchors) is a reasonable first attempt, but it's worth watching whether the agent actually produces a meaningful distribution of confidence scores in practice or whether everything comes back 0.85.

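
A quick way to watch for that failure mode is to bucket recorded scores using the prompt's own anchors. The bucket boundaries come from the Phase 1 notes; the function names and the "more than 80% in one bucket" warning threshold are arbitrary assumptions for illustration.

```python
from collections import Counter


def bucket(confidence: float) -> str:
    # Anchors from the prompt guidance: high >= 0.8, medium 0.5-0.8, low < 0.5.
    if confidence >= 0.8:
        return "high"
    if confidence >= 0.5:
        return "medium"
    return "low"


def confidence_report(scores: list[float], max_share: float = 0.8) -> dict:
    counts = Counter(bucket(s) for s in scores)
    total = len(scores)
    # Flag the run if one bucket holds more than max_share of all scores.
    top_share = max(counts.values()) / total if total else 0.0
    return {
        "counts": dict(counts),
        "distinct_values": len(set(scores)),
        "suspicious": top_share > max_share,
    }


clustered = confidence_report([0.85] * 9 + [0.6])
spread = confidence_report([0.9, 0.85, 0.7, 0.6, 0.4, 0.3])
```
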

---

## What's Next

**Phase 2: Survey Pass.** Four issues are ready to implement, in order:
- #4 Add `_SURVEY_SYSTEM_PROMPT` to `prompts.py`
- #5 Implement `_run_survey()` and `submit_survey` tool in `ai.py`
- #6 Wire survey output into dir loop system prompt
- #7 Skip survey pass for targets below minimum size threshold

Start with #4: prompt work only, no AI logic, low risk.

@@ -4,6 +4,7 @@
|---|---|---|
| [Session 1](Session1) | 2026-04-06 | Project setup, scan improvements, Forgejo repo, wiki, development practices |
| [Session 2](Session2) | 2026-04-06 | Forgejo milestones, issues, project board (36 issues, 9 milestones), Gitea MCP setup |
| [Session 3](Session3) | 2026-04-06 | Phase 1 complete, MCP backend architecture design, issues #38–#40 opened |

---