wiki: scope change to AI-first, drop zero-dep + watch mode (#64)

Jeff Smith 2026-04-11 09:48:11 -06:00
parent d8d5842be2
commit ecfae7edba
6 changed files with 149 additions and 82 deletions

@ -6,11 +6,13 @@
## Overview
Luminos is a zero-dependency Python CLI at its base. The `--ai` flag layers an
agentic investigation on top using the Claude API. The two layers are strictly
separated — the base scan never requires pip packages.
Luminos is an agentic Claude investigation tool. Every invocation runs the
full pipeline: a base scan first to feed the agent its initial picture, then
a survey pass, then per-directory dir loops, then a final synthesis pass.
The base scan is not a standalone product; it is the agent's input.
**Entry point:** `luminos.py` — argument parsing, scan orchestration, output routing.
**Entry point:** `luminos.py` — argument parsing, scan orchestration, AI
pipeline kickoff, output routing.
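
The stage ordering described in the overview can be sketched roughly as follows. This is an illustrative outline only: `run_pipeline` and the four stage callables are hypothetical names, not the actual functions in `luminos.py` or `ai.py`, and the survey pass is merely assumed to return the directories to visit.

```python
from typing import Callable

# Hypothetical sketch of the stage ordering; none of these names are taken
# from the real luminos code.
def run_pipeline(
    target: str,
    base_scan: Callable[[str], dict],
    survey: Callable[[dict], list],
    dir_loop: Callable[[str, dict], dict],
    synthesize: Callable[[dict, list], dict],
) -> dict:
    # 1. Base scan: builds the report dict that is the agent's input.
    report = base_scan(target)
    # 2. Survey pass (assumed here to return the directories to visit).
    directories = survey(report)
    # 3. One dir loop per directory.
    findings = [dir_loop(d, report) for d in directories]
    # 4. Final synthesis pass over everything collected so far.
    return synthesize(report, findings)
```

Passing the stages in as callables is only a convenience for the sketch; it makes the ordering easy to exercise with stubs.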
---
@ -18,16 +20,14 @@ separated — the base scan never requires pip packages.
| Module | Purpose | External commands |
|---|---|---|
| `luminos.py` | Entry point — arg parsing, scan(), main() | None |
| `luminos.py` | Entry point — arg parsing, scan(), AI kickoff, main() | None |
| `luminos_lib/tree.py` | Recursive directory tree with file sizes | None (os) |
| `luminos_lib/filetypes.py` | Classifies files into 7 categories | `file --brief` |
| `luminos_lib/code.py` | Language detection, LOC counting, large file flagging | `wc -l` |
| `luminos_lib/recency.py` | Finds N most recently modified files | `find -printf` |
| `luminos_lib/disk.py` | Per-directory disk usage | `du -b` |
| `luminos_lib/report.py` | Formats report dict as terminal output | None |
| `luminos_lib/watch.py` | Continuous monitoring loop with snapshot diffing | None |
| `luminos_lib/capabilities.py` | Optional dependency detection, cache cleanup | None |
| `luminos_lib/cache.py` | AI investigation cache — read/write/clear/flush | None |
| `luminos_lib/cache.py` | Investigation cache — read/write/clear/flush | None |
| `luminos_lib/ast_parser.py` | tree-sitter code structure parsing | tree-sitter |
| `luminos_lib/prompts.py` | System prompt templates for AI loops | None |
| `luminos_lib/ai.py` | Multi-pass agentic analysis via Claude API | anthropic, python-magic |
@ -49,7 +49,7 @@ scan(target)
---
## AI Pipeline (--ai flag)
## AI Pipeline
```
analyze_directory(report, target)
@ -146,14 +146,17 @@ the entire cache root.
## Key Constraints
- **Base tool: no pip dependencies.** tree, filetypes, code, disk, recency,
report, watch use only stdlib and GNU coreutils.
- **AI deps are lazy.** `anthropic`, `tree-sitter`, `python-magic` imported
only when `--ai` is used. Missing packages produce a clear install error.
- **AI investigation is the product.** The base scan exists to feed the agent.
There is no `--ai` flag. AI runs unconditionally on every invocation.
- **Anthropic API key is required.** If `ANTHROPIC_API_KEY` is unset, luminos
exits cleanly (exit 0) with a one-line hint instead of running.
- **Dependencies installed via `requirements.txt`.** anthropic, tree-sitter +
grammars, and python-magic are normal pip dependencies. `setup_env.sh`
creates a venv and installs them.
- **Subprocess for OS tools.** LOC counting, file detection, disk usage, and
recency shell out to GNU coreutils. Do not reimplement in pure Python.
- **Graceful degradation everywhere.** Permission denied, subprocess timeouts,
missing API key — all handled without crashing.
individual dir-loop failures — all handled without crashing the run.
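
As a rough illustration of the "subprocess for OS tools" and "graceful degradation" constraints taken together, a helper that shells out to `wc -l` might look like the sketch below. The function name, timeout value, and `None`-on-failure convention are assumptions for this example, not the actual code in `luminos_lib/code.py`.

```python
import subprocess
from typing import Optional

def count_lines(path: str, timeout: float = 5.0) -> Optional[int]:
    """Return the line count reported by `wc -l`, or None on any failure."""
    try:
        result = subprocess.run(
            ["wc", "-l", "--", path],
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,
        )
        # `wc -l` prints "<count> <path>"; keep only the count.
        return int(result.stdout.split()[0])
    except (subprocess.SubprocessError, OSError, ValueError, IndexError):
        # Timeout, non-zero exit (e.g. permission denied), missing `wc`,
        # or unparseable output: report "unknown" instead of crashing.
        return None
```

Callers can then treat `None` as "unknown" and carry on with the rest of the run.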
---

@ -5,45 +5,50 @@
> loop, the cache, the survey pass, where to add a tool — read
> [Internals](Internals).
## Setup
```bash
# One-time setup: creates ~/luminos-env and installs requirements.txt
./setup_env.sh
# Activate the venv in subsequent sessions
source ~/luminos-env/bin/activate
# Set your Anthropic API key
export ANTHROPIC_API_KEY=your-key-here
```
Or do it by hand:
```bash
python3 -m venv ~/luminos-env
source ~/luminos-env/bin/activate
pip install -r requirements.txt
```
The base scan also shells out to GNU coreutils and related tools (`wc`, `file`,
`grep`, `head`, `tail`, `stat`, `du`, `find`), so they need to be on `$PATH`.
## Running Luminos
```bash
# Base scan
python3 luminos.py <target>
# With AI analysis (requires ANTHROPIC_API_KEY)
source ~/luminos-env/bin/activate
python3 luminos.py --ai <target>
# Common flags
python3 luminos.py --ai --fresh --clear-cache <target> # force clean run
python3 luminos.py -x .git -x node_modules <target> # exclude dirs
python3 luminos.py -d 8 -a <target> # depth 8, include hidden
python3 luminos.py --json -o report.json <target> # JSON output
# Watch mode
python3 luminos.py --watch <target>
# Check optional dep status
python3 luminos.py --install-extras
```
---
That is the whole interface. AI runs unconditionally on every invocation.
## Optional Dependencies Setup
### Common flags
```bash
# One-time setup
bash setup_env.sh
# Or manually
python3 -m venv ~/luminos-env
source ~/luminos-env/bin/activate
pip install anthropic tree-sitter tree-sitter-python \
tree-sitter-javascript tree-sitter-rust \
tree-sitter-go python-magic
python3 luminos.py --fresh <target> # ignore cached results
python3 luminos.py --clear-cache # wipe /tmp/luminos/
python3 luminos.py -x .git -x node_modules <target> # exclude dirs
python3 luminos.py -d 8 -a <target> # depth 8, include hidden
python3 luminos.py --json -o report.json <target> # JSON output
```
If `ANTHROPIC_API_KEY` is unset, luminos exits cleanly with a one-line hint.
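
For orientation, the flag set above could be declared with `argparse` roughly like this. The flag spellings come from the examples; the `dest` names, defaults, help strings, and the choice to make `target` optional (so that `--clear-cache` can run without one) are assumptions for this sketch, not the actual parser in `luminos.py`.

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    # Hypothetical parser: only the flag spellings are taken from the docs.
    p = argparse.ArgumentParser(prog="luminos.py")
    p.add_argument("target", nargs="?", help="directory to investigate")
    p.add_argument("--fresh", action="store_true", help="ignore cached results")
    p.add_argument("--clear-cache", action="store_true", help="wipe the cache root")
    p.add_argument("-x", dest="exclude", action="append", default=[],
                   help="directory name to exclude (repeatable)")
    p.add_argument("-d", dest="depth", type=int, default=None, help="maximum depth")
    p.add_argument("-a", dest="show_hidden", action="store_true",
                   help="include hidden entries")
    p.add_argument("--json", action="store_true", help="emit JSON output")
    p.add_argument("-o", dest="output", default=None, help="write output to this file")
    return p
```

With this shape, `build_parser().parse_args(["-x", ".git", "-x", "node_modules", "proj"])` collects both exclusions into one list.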
---
## Git Workflow
@ -84,13 +89,8 @@ One commit per logical unit of work, not one per file.
### Merge procedure
```bash
git checkout main
git merge --no-ff <branch> -m "merge: <description>"
git branch -d <branch>
```
`--no-ff` preserves branch history. Delete branch after merging.
PRs are merged via the Forgejo API or web UI, not local `git merge`. See
`~/.claude/CLAUDE.md` for the full branching discipline.
---
@ -102,7 +102,7 @@ git branch -d <branch>
python3 -m unittest discover -s tests/ -v
```
No dependencies needed — the test suite uses stdlib `unittest` only.
The test suite uses stdlib `unittest` only.
### Test coverage
@ -117,21 +117,19 @@ Tests live in `tests/`, one file per module:
| `test_recency.py` | `recency.py` — recent file detection, output parsing |
| `test_tree.py` | `tree.py` — tree building, rendering, hidden/exclude logic |
| `test_report.py` | `report.py` — `format_flags`, `format_report`, all sections |
| `test_capabilities.py` | `capabilities.py` — `_check_package` |
Modules **not covered** (exempt from unit testing):
| Module | Reason |
|---|---|
| `ai.py` | Requires live Anthropic API |
| `ast_parser.py` | Requires tree-sitter optional dep |
| `watch.py` | Stateful filesystem event loop |
| `ast_parser.py` | Imports tree-sitter grammars at module load |
| `prompts.py` | String templates with no logic |
### Test requirements
- Every change to a covered module must include or update its tests
- All 129 tests must pass before merging to main
- All tests must pass before merging to main
- Subprocess-heavy functions (`wc`, `du`, `find`, `file`) are tested via `unittest.mock` — no real filesystem calls needed
- Tests that require real filesystem interaction use `tempfile.mkdtemp()` and clean up automatically
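
A minimal sketch of the mocking approach from the requirements above, assuming a helper that shells out to `wc -l`. `line_count` is a hypothetical stand-in defined inline so the example is self-contained; the real tests target the functions in `luminos_lib`.

```python
import subprocess
import unittest
from unittest import mock

def line_count(path):
    """Hypothetical stand-in for a helper that shells out to `wc -l`."""
    out = subprocess.run(["wc", "-l", path], capture_output=True, text=True).stdout
    return int(out.split()[0])

class LineCountTest(unittest.TestCase):
    def test_parses_wc_output(self):
        fake = subprocess.CompletedProcess(
            args=[], returncode=0, stdout="42 example.py\n", stderr=""
        )
        # Patch subprocess.run so no real `wc` process is started.
        with mock.patch("subprocess.run", return_value=fake) as run:
            self.assertEqual(line_count("example.py"), 42)
        run.assert_called_once()
```

Because the subprocess call is replaced by a canned `CompletedProcess`, the test exercises only the output parsing and needs neither the file nor the `wc` binary.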
@ -153,7 +151,7 @@ Modules **not covered** (exempt from unit testing):
| Classes | PascalCase | `_TokenTracker`, `_CacheManager` |
| Constants | UPPER_SNAKE_CASE | `MAX_CONTEXT`, `CACHE_ROOT` |
| Module files | snake_case | `ast_parser.py`, `filetypes.py` |
| CLI flags | kebab-case | `--clear-cache`, `--install-extras` |
| CLI flags | kebab-case | `--clear-cache`, `--fresh` |
| Private functions | leading underscore | `_run_synthesis`, `_build_dir_context` |
---
@ -163,22 +161,21 @@ Modules **not covered** (exempt from unit testing):
```
luminos/
├── luminos.py entry point
├── requirements.txt anthropic, tree-sitter + grammars, python-magic
├── setup_env.sh venv + dependency setup script
├── luminos_lib/
│ ├── ai.py AI pipeline (heaviest module)
│ ├── ast_parser.py tree-sitter parsing
│ ├── cache.py investigation cache management
│ ├── capabilities.py optional dep detection
│ ├── cache.py investigation cache management (incl. clear_cache)
│ ├── code.py language + LOC detection
│ ├── disk.py disk usage
│ ├── filetypes.py file classification
│ ├── prompts.py AI system prompt templates
│ ├── recency.py recently modified files
│ ├── report.py terminal report formatter
│ ├── tree.py directory tree
│ └── watch.py watch mode
│ └── tree.py directory tree
├── tests/
│ ├── test_cache.py
│ ├── test_capabilities.py
│ ├── test_code.py
│ ├── test_disk.py
│ ├── test_filetypes.py
@ -186,7 +183,6 @@ luminos/
│ ├── test_report.py
│ └── test_tree.py
├── docs/wiki/ local clone of Forgejo wiki (gitignored)
├── setup_env.sh venv + AI dep setup script
├── CLAUDE.md Claude Code context (thin — points to wiki)
└── PLAN.md evolution plan and design notes
```

Home.md

@ -1,9 +1,10 @@
# Luminos
Luminos is a file system intelligence tool — a zero-dependency Python CLI that
scans a directory and produces a reconnaissance report. With `--ai` it runs a
multi-pass agentic investigation via the Claude API, producing a deep analysis
of what the directory contains and why.
Luminos is a file system intelligence tool. Point it at a directory and it
runs a multi-pass agentic investigation via the Claude API: a survey pass,
isolated dir-loop agents per directory, and a synthesis pass that produces
a project-level verdict with severity-ranked flags. A lightweight base scan
runs first to feed the agent its initial picture of the target.
---
@ -31,11 +32,11 @@ of what the directory contains and why.
## At a Glance
```bash
python3 luminos.py <target> # base scan
python3 luminos.py --ai <target> # AI analysis
python3 luminos.py --ai --refine <target> # AI + refinement pass (planned)
python3 luminos.py -x .git -x node_modules <target> # exclude dirs
python3 luminos.py --watch <target> # continuous monitoring
python3 luminos.py <target> # full investigation
python3 luminos.py --fresh <target> # ignore cached results
python3 luminos.py -x .git -x node_modules <target> # exclude dirs
python3 luminos.py --json -o report.json <target> # JSON output
python3 luminos.py --clear-cache # wipe /tmp/luminos/
```
---

@ -13,23 +13,24 @@ specific line number.
## 1. The two layers
Luminos has a hard internal split:
Luminos still has two internal layers, but they are no longer separated by
a CLI flag. AI runs unconditionally on every invocation.
| Layer | What it does | Imports |
|---|---|---|
| **Base scan** | Walks the directory, classifies files, counts lines, ranks recency, measures disk usage, prints a report. | stdlib only + GNU coreutils via subprocess. **No pip packages.** |
| **AI pipeline** (`--ai`) | Runs a multi-pass agent investigation via the Claude API on top of the base scan output. | `anthropic`, `tree-sitter`, `python-magic` — all imported lazily. |
| **Base scan** | Walks the directory, classifies files, counts lines, ranks recency, measures disk usage. Produces the `report` dict that feeds the agent. | stdlib + GNU coreutils via subprocess. |
| **AI pipeline** | Runs a multi-pass agent investigation via the Claude API on top of the base scan output. | `anthropic`, `tree-sitter`, `python-magic`. |
The split is enforced by lazy imports. `luminos.py:156` is the only place
that imports from `luminos_lib.ai`, and it sits inside `if args.ai:`. You
can grep the codebase to verify: nothing in the base scan modules imports
anything from `ai.py`, `ast_parser.py`, or `prompts.py`. This means
`python3 luminos.py /target` works on a stock Python 3 install with no
packages installed at all.
The base scan modules (`tree.py`, `filetypes.py`, `code.py`, `recency.py`,
`disk.py`, `report.py`) still don't depend on `ai.py`, `ast_parser.py`, or
`prompts.py`. This separation is convention rather than a hard constraint
now: it keeps the base scan fast and easy to test, but it is no longer
enforced by lazy imports. `luminos.py` imports from `luminos_lib.ai` at
the bottom of `main()`, after the base scan has produced its `report`.
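
Since the separation is now a convention, one way to keep an eye on it is a small import check. The sketch below is an assumption about how such a check could be written, not something in the repository: it parses a module's source with `ast` and reports any import that mentions `ai`, `ast_parser`, or `prompts`.

```python
import ast

# Module names from the paragraph above; the helper itself is hypothetical.
AI_MODULES = {"ai", "ast_parser", "prompts"}

def ai_imports(source: str) -> set:
    """Return the AI-layer module names that `source` imports."""
    found = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom):
            # Covers `from luminos_lib import ai` and `from luminos_lib.ai import x`.
            base = node.module or ""
            names = [base] + [f"{base}.{alias.name}" for alias in node.names]
        else:
            continue
        for name in names:
            found |= AI_MODULES & set(name.split("."))
    return found
```

A unit test could read each base-scan module's source and assert that `ai_imports(...)` returns an empty set. The match is by name component only, so an unrelated package that happens to contain a module called `prompts` would be flagged as well.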
When you change a base-scan module, the question to ask is: *does this
introduce a top-level import of anything outside stdlib?* If yes, you've
broken the constraint and the change must be rewritten.
If `ANTHROPIC_API_KEY` is unset, `luminos.py` exits cleanly (exit 0) with
a one-line hint *before* running the base scan, so the user isn't made to
wait for a scan they can't use.
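
The ordering described above can be sketched as follows. This is a hypothetical outline rather than the real `main()`: `scan` and `investigate` are placeholder callables, and the hint text and the exit code for an invalid target are assumptions. Only the exit-0-before-scan behaviour comes from the docs.

```python
import os
import sys

def main(argv, scan, investigate, environ=os.environ):
    """Hypothetical outline of main(); `scan` and `investigate` are placeholders."""
    target = argv[0]
    if not os.path.isdir(target):
        # Assumed behaviour: an invalid target is a real error.
        print(f"error: {target} is not a directory", file=sys.stderr)
        return 1
    # Check the key before the base scan so the user does not wait for a
    # scan whose result they cannot use.
    if not environ.get("ANTHROPIC_API_KEY"):
        print("hint: set ANTHROPIC_API_KEY to run luminos")
        return 0
    report = scan(target)
    investigate(report, target)
    return 0
```

Taking `environ` as a parameter is a choice made for the sketch so the missing-key path can be exercised without touching the real environment.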
---

Session9.md

@ -0,0 +1,65 @@
# Session 9
**Date:** 2026-04-11
**Focus:** Scope shift — AI investigation is the product, drop zero-dependency constraint, delete watch mode (#64)
**Duration estimate:** ~45 minutes
## What was done
A coordinated scope change. Two original design constraints were dropped and one feature was deleted:
1. **Zero-dependency Python CLI is no longer a goal.** Luminos installs from `requirements.txt` like a normal Python project. `anthropic`, `tree-sitter` + grammars, and `python-magic` are normal pip dependencies, not lazy imports gated by a CLI flag.
2. **AI investigation is the headline.** The base scan exists to feed the agent. There is no `--ai` flag and no `--no-ai` mode. AI runs unconditionally on every invocation.
3. **Watch mode deleted.** A non-AI filesystem-churn monitor conflicts with the new philosophy. If a live update mode comes back, it gets rebuilt as incremental AI re-investigation.
### Code
- Deleted `luminos_lib/watch.py` and the `--watch` flag.
- Deleted `luminos_lib/capabilities.py` and `tests/test_capabilities.py`. Moved `clear_cache()` into `cache.py`.
- `luminos.py`: removed `--watch`, `--ai`, `--install-extras`. Kept `--clear-cache`, `--fresh`, `-x`, `-d`, `-a`, `-o`, `--json`. AI runs unconditionally after the base scan. If `ANTHROPIC_API_KEY` is unset, exits 0 with a one-line hint *before* running the base scan.
- `ai.py`: dropped the `check_ai_dependencies()` call and the import.
- New `requirements.txt`. `setup_env.sh` installs from it.
### Docs
- `README.md`: rewritten to lead with AI investigation; drops the two-modes framing and the watch feature line.
- `CLAUDE.md` (project): rewrites Key Constraints, updates module map and Running Luminos commands.
- `PLAN.md`: strips zero-dep philosophy from the file map and reframes the watch+incremental note as a future live-mode feature.
- Wiki: `Architecture.md`, `DevelopmentGuide.md`, `Home.md`, `Internals.md` updated.
### Ship
- PR #65 merged via gitea MCP, branch deleted local + remote.
- Issue #64 closed manually (not relying on `Closes #N`).
- Issue #35 (incremental AI re-investigation in watch mode) closed as obsolete with a comment explaining why the framing no longer fits.
- Tests: 164 pass (down from 168 with the 4 removed capabilities tests).
## Discoveries and observations
- **The "lazy import" pattern was thinner than expected.** `ai.py` and `ast_parser.py` already did top-level imports of `anthropic`/`magic`/`tree_sitter`. The "lazy" behavior lived only in `luminos.py`'s `if args.ai:` gate. So the actual code change for "drop the zero-dep constraint" was much smaller than the conceptual shift suggested: just remove the gate and accept that `luminos.py`'s AI import always fires.
- **`capabilities.py` was almost dead weight.** Only `clear_cache()` was load-bearing, and even that only because it knew about `CACHE_ROOT` (which already lives in `cache.py`). Moving `clear_cache()` to `cache.py` was a one-import simplification — `capabilities.py` was a tax we'd been paying for the lazy-deps story.
- **The graceful exit needed to fire *before* the base scan, not after.** First draft put the `ANTHROPIC_API_KEY` check after the scan ran. That makes the user wait through a multi-second scan only to get told they can't actually use the result. Moved it to the top of `main()` after target validation but before `scan()`.
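
To make the `clear_cache()` move concrete: once the function sits next to `CACHE_ROOT`, it needs nothing beyond the standard library. The sketch below is an assumption about its shape (signature and return value are invented for illustration); only the `/tmp/luminos` location and the "wipe the whole root" behaviour come from the docs.

```python
import shutil
from pathlib import Path

CACHE_ROOT = Path("/tmp/luminos")  # cache root named in the wiki

def clear_cache(root: Path = CACHE_ROOT) -> bool:
    """Remove the whole cache root. Returns True if something was deleted."""
    if not root.exists():
        return False
    shutil.rmtree(root)
    return True
```

Accepting `root` as a parameter is a sketch-level choice that lets a test point the function at a temporary directory instead of the real cache.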
## Decisions made and why
- **Option (b), not (a).** The choice was between "AI-default with `--no-ai` escape hatch" (a) and "AI-only, base scan is internal" (b). Picked (b). Reasoning: keeping `--no-ai` would have meant maintaining two CLI surfaces and two documentation paths for what the philosophy says is one product. The base scan is still useful internally (it produces the `report` dict the agent reads), it just doesn't need to be exposed as a standalone CLI mode. Cleaner story, less drift.
- **Delete watch mode rather than park it.** It was ~110 lines, no tests, no users in our context. Parking it as a "scoped-down churn monitor" would have meant explaining in docs why one feature ignores the AI-first philosophy. Delete + a clear note in PLAN.md that watch comes back as incremental AI re-investigation if it comes back at all.
- **Delete `--ai` cleanly, no deprecation cycle.** Per global CLAUDE.md ("don't use feature flags or backwards-compatibility shims when you can just change the code"). It's a personal project with no external users to deprecate against.
- **Graceful exit on missing API key, exit 0 not exit 1.** "Missing API key" is a user-fixable configuration state, not an error condition. Exit 0 + hint reads as "here's what you need to do," not "something broke."
- **One commit, not three.** The code changes and the doc changes are tightly coupled — splitting them creates a half-broken state in commit 1 where the code says one thing and the docs say another. The whole scope shift is one logical change.
## Raw thinking
- The fact that `ai.py` already did top-level `import anthropic` is interesting in hindsight. The lazy-deps story was load-bearing in the docs and the prompts but not really in the code. We were one CLI gate removal away from the dependency story being trivial, and we'd been paying the conceptual cost the whole time. Lesson: when a constraint feels heavier in docs than in code, check if the code is actually enforcing it.
- Watch mode being deleted feels right but is also a small loss of optionality. The next time someone says "I want to monitor this directory for changes," the answer is "luminos doesn't do that anymore, use `inotifywait`." That's fine for now — luminos was never the best churn monitor — but worth remembering if a real use case surfaces.
- The session was unusually fast (~45 min for a coordinated scope change spanning 11 files plus 5 wiki pages) because the user had already done the conceptual work in conversation. By the time I started cutting code, every decision had been pre-confirmed: option (b), delete watch, delete `--ai` cleanly, graceful exit. The TaskCreate breakdown (10 tasks) helped keep it linear, but the real speedup was that nothing was ambiguous when execution started.
- Phase 3 prerequisites (#55, #56, #57) are now genuinely the only thing between us and Phase 3 proper. The scope change didn't touch any of them. Next session can pick one.
## What's next
In priority order:
1. **#57** — refactor `_run_dir_loop` before Phase 3 dynamic turn allocation lands. Prerequisite cleanup, small.
2. **#56** — dedupe `_TOOL_DISPATCH` / `_DIR_TOOLS` registration. Decision point: fix now or let it die in Phase 3.5 MCP migration.
3. **#55** — unit test coverage for ai.py pure helpers. Foundation for Phase 3 confidence work.
4. Phase 3 proper (#19–#29 cluster).

@ -10,6 +10,7 @@
| [Session 6](Session6) | 2026-04-07 | Extracted shared workflow/branching/protocols from project CLAUDE.md to global `~/.claude/CLAUDE.md`; moved externalize.md and wrap-up.md to `~/.claude/protocols/` |
| [Session 7](Session7) | 2026-04-07 | Phase 1 audit (#1 closed, only #54 remains); gitea MCP credential overhaul — dedicated `claude-code` Forgejo user with admin on luminos, write+delete verified |
| [Session 8](Session8) | 2026-04-07 | Closed #54 — added confidence/confidence_reason to write_cache tool schema description; Phase 1 milestone now 4/4 complete |
| [Session 9](Session9) | 2026-04-11 | Scope shift (#64): AI investigation is the product, zero-dep constraint dropped, watch mode + capabilities.py deleted, requirements.txt added, README/CLAUDE/PLAN/wiki rewritten |
---