Merge pull request 'feat: AI investigation is the product, drop zero-dep constraint (#64)' (#65) from feat/issue-64-ai-first-scope into main

2026-04-11 15:46:46 +00:00
commit 5c5c4dbb1a
parent 54713f09a6 c93c748ea3
11 changed files with 114 additions and 432 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -19,9 +19,11 @@
 ## Project Overview
-Luminos is a file system intelligence tool — a zero-dependency Python CLI that
+Luminos is a file system intelligence tool. Point it at a directory and it
-scans a directory and produces a reconnaissance report. With `--ai` it runs a
+runs a multi-pass agentic investigation via the Claude API: a survey pass,
-multi-pass agentic investigation via the Claude API.
+isolated dir-loop agents per directory, and a synthesis pass that produces a
+project-level verdict with severity-ranked flags. A lightweight base scan
+runs first to feed the agent its initial picture of the target.
 ---
@@ -32,8 +34,7 @@ multi-pass agentic investigation via the Claude API.
 | `luminos.py` | Entry point — arg parsing, scan(), main() |
 | `luminos_lib/ai.py` | Multi-pass agentic analysis via Claude API |
 | `luminos_lib/ast_parser.py` | tree-sitter code structure parsing |
-| `luminos_lib/cache.py` | Investigation cache management |
+| `luminos_lib/cache.py` | Investigation cache management (incl. clear_cache) |
-| `luminos_lib/capabilities.py` | Optional dep detection, cache cleanup |
 | `luminos_lib/code.py` | Language detection, LOC counting |
 | `luminos_lib/disk.py` | Per-directory disk usage |
 | `luminos_lib/filetypes.py` | File classification (7 categories) |
@@ -41,7 +42,6 @@ multi-pass agentic investigation via the Claude API.
 | `luminos_lib/recency.py` | Recently modified files |
 | `luminos_lib/report.py` | Terminal report formatter |
 | `luminos_lib/tree.py` | Directory tree visualization |
-| `luminos_lib/watch.py` | Watch mode with snapshot diffing |
 Details: wiki — [Architecture](https://forgejo.labbity.unbiasedgeek.com/archeious/luminos/wiki/Architecture) | [Development Guide](https://forgejo.labbity.unbiasedgeek.com/archeious/luminos/wiki/DevelopmentGuide)
@@ -49,32 +49,36 @@ Details: wiki — [Architecture](https://forgejo.labbity.unbiasedgeek.com/archei
 ## Key Constraints
-- **Base tool: no pip dependencies.** tree, filetypes, code, disk, recency,
+- **AI investigation is the product.** The base scan exists to feed the agent.
-  report, watch use only stdlib and GNU coreutils. Must always work on bare Python 3.
+  There is no `--ai` flag and no `--no-ai` mode. AI runs unconditionally on
-- **AI deps are lazy.** `anthropic`, `tree-sitter`, `python-magic` imported only
+  every invocation.
-  when `--ai` is used. Missing packages produce a clear install error.
+- **Anthropic API key is required.** If `ANTHROPIC_API_KEY` is unset, luminos
+  exits cleanly (exit 0) with a one-line hint instead of running.
+- **Dependencies installed via `requirements.txt`.** anthropic, tree-sitter +
+  grammars, and python-magic are normal pip dependencies, not lazy imports.
+  `setup_env.sh` creates a venv and installs them.
 - **Subprocess for OS tools.** LOC counting, file detection, disk usage, and
  recency shell out to GNU coreutils. Do not reimplement in pure Python.
 - **Graceful degradation everywhere.** Permission denied, subprocess timeouts,
-  missing API key — all handled without crashing.
+  individual dir-loop failures — all handled without crashing the run.
 ---
 ## Running Luminos
 ```bash
-# Base scan
+# Activate the venv (one-time setup: ./setup_env.sh)
-python3 luminos.py <target>
-# With AI analysis (requires ANTHROPIC_API_KEY)
 source ~/luminos-env/bin/activate
-python3 luminos.py --ai <target>
+export ANTHROPIC_API_KEY=your-key-here
+# Run an investigation
+python3 luminos.py <target>
 # Common flags
 python3 luminos.py -d 8 -a -x .git -x node_modules <target>
 python3 luminos.py --json -o report.json <target>
-python3 luminos.py --watch <target>
+python3 luminos.py --fresh <target>
-python3 luminos.py --install-extras
+python3 luminos.py --clear-cache
 ```
 ---
@@ -83,8 +87,7 @@ python3 luminos.py --install-extras
 Run tests with `python3 -m unittest discover -s tests/`. Modules exempt from
 unit testing: `ai.py` (requires live API), `ast_parser.py` (requires
-tree-sitter), `watch.py` (stateful events), `prompts.py` (string templates
+tree-sitter grammars at import time), `prompts.py` (string templates only).
-only).
 (Development workflow, branching discipline, and session protocols live in
 `~/.claude/CLAUDE.md`.)
@@ -99,7 +102,7 @@ only).
 | Classes | PascalCase | `_TokenTracker`, `_CacheManager` |
 | Constants | UPPER_SNAKE_CASE | `MAX_CONTEXT`, `CACHE_ROOT` |
 | Module files | snake_case | `ast_parser.py` |
-| CLI flags | kebab-case | `--clear-cache`, `--install-extras` |
+| CLI flags | kebab-case | `--clear-cache`, `--fresh` |
 | Private functions | leading underscore | `_run_synthesis` |
 ---
--- a/PLAN.md
+++ b/PLAN.md
@@ -687,7 +687,7 @@ fold into any session that touches these helpers.
  extension sub-section or similar. Low priority, not blocking.
 - **Revisit survey-skip thresholds (#46)** — `_SURVEY_MIN_FILES` and
  `_SURVEY_MIN_DIRS` shipped with values from #7's example, no
-  empirical basis. Once `--ai` has been run on a variety of real
+  empirical basis. Once luminos has been run on a variety of real
  targets, look at which runs skipped the survey vs ran it and decide
  whether the thresholds (or the gate logic itself) need to change.
@@ -706,7 +706,7 @@ fold into any session that touches these helpers.
 | `luminos_lib/search.py` | **new** — web_search, fetch_url, package_lookup implementations |
 No changes needed to: `tree.py`, `filetypes.py`, `code.py`, `recency.py`,
-`disk.py`, `capabilities.py`, `watch.py`, `ast_parser.py`
+`disk.py`, `ast_parser.py`
 ---
@@ -798,20 +798,20 @@ agent read, in what order, what it decided to skip). Storing the full message
 history per directory would allow replaying or auditing an investigation. Cost:
 storage. Benefit: debuggability, ability to resume investigations more faithfully.
-**Watch mode + incremental investigation**
+**Live re-investigation mode**
-Watch mode currently re-runs the full base scan on changes. For AI-augmented
+A "watch" replacement: detect which directories changed, re-investigate only
-watch mode: detect which directories changed, re-investigate only those, and
+those, and patch the cache entries. The synthesis would then re-run from the
-patch the cache entries. The synthesis would then re-run from the updated cache
+updated cache without re-investigating unchanged directories. The original
-without re-investigating unchanged directories.
+non-AI watch mode was deleted in the #64 scope change because it conflicted
+with the AI-first philosophy. If watch comes back, it comes back as this.
-**Optional PDF and Office document readers**
+**PDF and Office document readers**
 The data and documents domains would benefit from native content extraction:
 - `pdfminer` or `pypdf` for PDF text extraction
 - `openpyxl` for Excel schema and sheet enumeration
 - `python-docx` for Word document text
-These would be optional deps like the existing AI deps, gated behind
+These slot into `requirements.txt` like any other dependency. The agent
-`--install-extras`. The agent currently can only see filename and size for
+currently can only see filename and size for these formats.
-these formats.
 **Security-focused analysis mode**
 A `--security` flag could tune the investigation toward security-relevant
@@ -874,11 +874,6 @@ bad plan wastes turns on shallow directories and skips critical ones. The system
 needs quality signals — probably the confidence scores aggregated across the
 investigation — to detect when something went wrong and potentially retry.
-**Watch mode compatibility**
-Several of the planned features (survey pass, planning, external tools) are not
-designed for incremental re-use in watch mode. Adding AI capability to watch
-mode is a separate design problem that deserves its own thinking.
 **Turn budget contention**
 If the planning pass allocates turns and the agent borrows from its budget when
 it needs more, there's a risk of runaway investigation on unexpectedly complex
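Review note on the "Live re-investigation mode" entry above: PLAN.md says the mode would "detect which directories changed" but does not say how. A minimal sketch of one possible approach follows, assuming a per-directory fingerprint built from each direct file's name, size, and mtime. `dir_fingerprints` and `changed_dirs` are hypothetical names for illustration, not functions in luminos.

```python
import hashlib
import os


def dir_fingerprints(root):
    """Map each directory (relative to root) to a hash of its direct files.

    Hypothetical helper: the fingerprint covers file name, size, and mtime,
    so content edits that keep both size and mtime would go unnoticed.
    """
    prints = {}
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # deterministic traversal order
        digest = hashlib.sha256()
        for name in sorted(filenames):
            try:
                st = os.stat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            digest.update(os.fsencode(name))
            digest.update(f"\0{st.st_size}\0{st.st_mtime_ns}\n".encode("ascii"))
        prints[os.path.relpath(dirpath, root)] = digest.hexdigest()
    return prints


def changed_dirs(old, new):
    """Directories that were added, removed, or whose fingerprint differs."""
    return sorted(d for d in old.keys() | new.keys() if old.get(d) != new.get(d))
```

Only the directories returned by `changed_dirs` would need a fresh dir loop; under the plan as written, synthesis would then re-run from the patched cache.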
--- a/README.md
+++ b/README.md
@@ -1,55 +1,38 @@
 # Luminos
-A file system intelligence tool. Scans a directory and produces a reconnaissance report that tells you what the directory is, what's in it, and what might be worth your attention.
+A file system intelligence tool. Point it at a directory and it runs an agentic Claude investigation that figures out what the directory is, what's in it, and what might be worth your attention.
-Luminos has two modes. The **base mode** is a single Python file that uses only the standard library and GNU coreutils. No pip install, no virtual environment, no dependencies to audit. The **`--ai` mode** runs a multi-pass agentic investigation against the [Claude API](https://www.anthropic.com/api) to reason about what the project actually does and flag anything that looks off. AI mode is opt-in and is the only path that requires pip-installable packages.
+Luminos is built around a harder question than "what files are here?" It is built around "what is this, and should I be worried about any of it?" To answer that, it runs a multi-pass agentic investigation against the [Claude API](https://www.anthropic.com/api): a survey pass to orient on the target, an isolated dir-loop agent per directory with a small toolbelt (read files, run whitelisted coreutils commands, write cache entries), and a final synthesis pass that produces a project-level verdict with severity-ranked flags.
-## Why
+A lightweight base scan runs first to feed the agent its initial picture of the target. The base scan is not a standalone product, it is the first step of the investigation.
-Most "repo explorer" tools answer one question: "what files are here?" Luminos is built around a harder question: "what is this, and should I be worried about any of it?"
-The base scan gives you the mechanical answer: directory tree, file classification across seven categories, language breakdown with line counts, recently modified files, disk usage, and the largest files. That is usually enough for a quick "what is this" look.
-The AI mode goes further. It runs an isolated investigation per directory, leaves-first, with a small toolbelt (read files, run whitelisted coreutils commands, write cache entries) and a per-directory context budget. Each directory gets its own summary, and a final synthesis pass reads only the directory-level cache entries to produce a whole-project verdict. Findings are flagged with a severity level (`critical`, `concern`, or `info`) so the important stuff floats to the top.
 ## Features
- **Zero dependencies in base mode.** Runs on bare Python 3 plus the GNU coreutils you already have.
+- **Agentic AI investigation.** Multi-pass, leaves-first analysis via Claude. Survey then dir loops then synthesis.
- **Graceful degradation everywhere.** Permission denied, subprocess timeouts, missing API key, missing optional packages: all handled without crashing the scan.
+- **Investigation cache.** Per-file and per-directory summaries are cached under `/tmp/luminos/` so repeat runs on the same target are cheap.
 - **Directory tree.** Visual tree with configurable depth and exclude patterns.
 - **File classification.** Files bucketed into seven categories (code, config, docs, data, media, binary, other) via `file(1)` magic.
 - **Language detection and LOC counting.** Which languages are present, how many lines of code per language.
 - **Recently modified files.** Surface the files most likely to reflect current activity.
 - **Disk usage.** Per-directory disk usage with top offenders called out.
-- **Watch mode.** Re-scan every 30 seconds and show diffs.
 - **JSON output.** Pipe reports to other tools or save for comparison.
-- **AI investigation (opt-in).** Multi-pass, leaves-first agentic analysis via Claude, with an investigation cache so repeat runs are cheap.
 - **Severity-ranked flags.** Findings are sorted so `critical` items are the first thing you see.
 - **Context budget guard.** Per-turn `input_tokens` is watched against a budget so a rogue directory can't blow the context and silently degrade quality.
 - **Graceful degradation.** Permission denied, subprocess timeouts, missing API key: all handled without crashing.
 - **JSON output.** Pipe reports to other tools or save for comparison.
 ## Installation
-### Base mode
+Luminos is a normal Python project. Clone, create a venv, and install from `requirements.txt`. The repository ships a helper script that does this for you:
-No installation required. Clone and run.
 ```bash
 git clone https://github.com/archeious/luminos.git
 cd luminos
 python3 luminos.py <target>
 ```
-Works on any system with Python 3 and standard GNU coreutils (`wc`, `file`, `grep`, `head`, `tail`, `stat`, `du`, `find`).
-### AI mode
-AI mode needs a few pip-installable packages. The project ships a helper script that creates a dedicated virtual environment and installs them:
 ```bash
 ./setup_env.sh
 source ~/luminos-env/bin/activate
 ```
-The packages installed are `anthropic`, `tree-sitter`, a handful of tree-sitter language grammars, and `python-magic`.
+Or do it by hand:
+```bash
+python3 -m venv ~/luminos-env
+source ~/luminos-env/bin/activate
+pip install -r requirements.txt
+```
 You also need an Anthropic API key exported as an environment variable:
@@ -57,25 +40,15 @@ You also need an Anthropic API key exported as an environment variable:
 export ANTHROPIC_API_KEY=your-key-here
 ```
-Check which optional dependencies are present:
+The base scan shells out to a handful of GNU coreutils (`wc`, `file`, `grep`, `head`, `tail`, `stat`, `du`, `find`), so you also need those on `$PATH`. They are installed by default on every mainstream Linux distribution and on macOS via Homebrew.
-```bash
-python3 luminos.py --install-extras
-```
 ## Usage
-### Base scan
 ```bash
 python3 luminos.py /path/to/project
 ```
-### AI scan
+That is the whole interface. The investigation runs end to end and prints a report.
-```bash
-python3 luminos.py --ai /path/to/project
-```
 ### Common flags
@@ -86,28 +59,25 @@ python3 luminos.py -d 8 -a -x .git -x node_modules -x vendor /path/to/project
 # JSON output to a file
 python3 luminos.py --json -o report.json /path/to/project
-# Watch mode (re-scan every 30s, show diffs)
+# Force a fresh investigation, ignoring the cache
-python3 luminos.py --watch /path/to/project
+python3 luminos.py --fresh /path/to/project
-# Force a fresh AI investigation, ignoring the cache
+# Clear the investigation cache
-python3 luminos.py --ai --fresh /path/to/project
-# Clear the AI investigation cache
 python3 luminos.py --clear-cache
 ```
 Run `python3 luminos.py --help` for the full flag list.
-## How AI mode works
+## How the investigation works
-A short version of what happens when you pass `--ai`:
+A short version of what happens on every run:
-1. **Discover** every directory under the target.
+1. **Base scan.** Builds the directory tree, classifies files into seven categories, counts lines of code, finds large and recently modified files, computes per-directory disk usage. This is the agent's initial picture of the target.
-2. **Sort leaves-first** so the deepest directories are investigated before their parents.
+2. **Survey pass.** A short agent loop (max 3 turns) reads the base scan, describes the target in plain language, and decides which investigation tools are relevant. Tiny targets skip the survey.
-3. **Run an isolated agent loop per directory** with a max of 10 turns each. The agent has a small toolbelt: read files, run whitelisted coreutils commands (`wc`, `file`, `grep`, `head`, `tail`, `stat`, `du`, `find`), and write cache entries.
+3. **Dir loops.** Every directory gets its own isolated agent loop, leaves-first, with up to 14 turns. The agent has read-only access to the filesystem and a toolbelt of `read_file`, `list_directory`, `run_command`, `parse_structure`, `write_cache`, `think`, `checkpoint`, `flag`, and `submit_report`.
-4. **Cache everything.** Each file and directory summary is written to `/tmp/luminos/` so that subsequent runs on the same target don't burn tokens re-deriving things that haven't changed.
+4. **Cache.** Each file and directory summary is written to `/tmp/luminos/` so subsequent runs on the same target don't re-derive what hasn't changed.
-5. **Context budget guard.** Per-turn `input_tokens` is watched against a budget (currently 70% of the model's context window) so a rogue directory can't blow the context and silently degrade quality.
+5. **Context budget guard.** Per-turn `input_tokens` is watched against a budget (currently 70% of the model's context window) so a rogue directory can't blow the context window.
-6. **Final synthesis pass** reads only the directory-level cache entries (not the raw files) to produce a project-level summary and the severity-ranked flags.
+6. **Final synthesis.** A short agent loop reads the directory-level cache entries (not the raw files) and produces the project-level brief, the detailed analysis, and the severity-ranked flags.
 ## Development
@@ -117,11 +87,10 @@ Run the test suite:
 python3 -m unittest discover -s tests/
 ```
-Modules that are intentionally not unit tested and why:
+Modules that are intentionally not unit tested:
- `luminos_lib/ai.py`: requires a live Anthropic API, tested in practice
+- `luminos_lib/ai.py`: requires a live Anthropic API, exercised in practice
 - `luminos_lib/ast_parser.py`: requires tree-sitter grammars installed
-- `luminos_lib/watch.py`: stateful event loop, tested manually
 - `luminos_lib/prompts.py`: string templates only
 ## License
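Review note on the README's "Dir loops" step: "leaves-first" means the deepest directories are investigated before their parents, so a parent's agent can draw on its children's cache entries. A minimal sketch of that ordering follows, assuming a plain `os.walk` discovery. `leaves_first` is a hypothetical name, and the real implementation in `luminos_lib/ai.py` is not shown in this diff and may differ.

```python
import os


def leaves_first(root):
    """Return every directory under root, deepest first (children before parents)."""

    def depth(path):
        # Depth relative to root: root itself is 0, its children are 1, and so on.
        rel = os.path.relpath(path, root)
        return 0 if rel == os.curdir else rel.count(os.sep) + 1

    dirs = [dirpath for dirpath, _dirnames, _filenames in os.walk(root)]
    # Deepest first; ties broken alphabetically so the order is reproducible.
    return sorted(dirs, key=lambda p: (-depth(p), p))
```

For a tree containing `a/b/` and `c/`, this yields `a/b`, then `a` and `c`, then the root last.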
--- a/luminos.py
+++ b/luminos.py
@@ -16,16 +16,11 @@ from luminos_lib.filetypes import (
 from luminos_lib.code import detect_languages, find_large_files
 from luminos_lib.recency import find_recent_files
 from luminos_lib.disk import get_disk_usage, top_directories
-from luminos_lib.watch import watch_loop
 from luminos_lib.report import format_report
 def _progress(label):
-    """Return (on_file, finish) for in-place per-file progress on stderr.
+    """Return (on_file, finish) for in-place per-file progress on stderr."""
-    on_file(path) overwrites the current line with the label and truncated path.
-    finish() finalises the line with a newline.
-    """
    cols = shutil.get_terminal_size((80, 20)).columns
    prefix = f"  [scan] {label}... "
    available = max(cols - len(prefix), 10)
@@ -43,7 +38,7 @@ def _progress(label):
 def scan(target, depth=3, show_hidden=False, exclude=None):
-    """Run all analyses on the target directory and return a report dict."""
+    """Run the base scan and return the report dict consumed by the AI pass."""
    report = {}
    exclude = exclude or []
@@ -89,7 +84,8 @@ def main():
    parser = argparse.ArgumentParser(
        prog="luminos",
        description="Luminos — file system intelligence tool. "
-                    "Explores a directory and produces a reconnaissance report.",
+                    "Runs an agentic Claude investigation against a directory "
+                    "and produces a reconnaissance report.",
    )
    parser.add_argument("target", nargs="?", help="Target directory to analyze")
    parser.add_argument("-d", "--depth", type=int, default=3,
@@ -100,17 +96,10 @@ def main():
                        help="Output report as JSON")
    parser.add_argument("-o", "--output", metavar="FILE",
                        help="Write report to a file")
-    parser.add_argument("--ai", action="store_true",
-                        help="Use Claude AI to analyze directory purpose "
-                             "(requires ANTHROPIC_API_KEY)")
-    parser.add_argument("--watch", action="store_true",
-                        help="Re-scan every 30 seconds and show diffs")
    parser.add_argument("--clear-cache", action="store_true",
-                        help="Clear the AI investigation cache (/tmp/luminos/)")
+                        help="Clear the investigation cache (/tmp/luminos/)")
    parser.add_argument("--fresh", action="store_true",
-                        help="Force a new AI investigation (ignore cached results)")
+                        help="Force a new investigation (ignore cached results)")
-    parser.add_argument("--install-extras", action="store_true",
-                        help="Show status of optional AI dependencies")
    parser.add_argument("-x", "--exclude", metavar="DIR", action="append",
                        default=[],
                        help="Exclude a directory name from scan and analysis "
@@ -118,15 +107,8 @@ def main():
    args = parser.parse_args()
-    # --install-extras: show package status and exit
-    if args.install_extras:
-        from luminos_lib.capabilities import print_status
-        print_status()
-        return
    # --clear-cache: wipe /tmp/luminos/ (lazy import to avoid AI deps)
    if args.clear_cache:
-        from luminos_lib.capabilities import clear_cache
+        from luminos_lib.cache import clear_cache
        clear_cache()
        if not args.target:
            return
@@ -140,19 +122,18 @@ def main():
              file=sys.stderr)
        sys.exit(1)
+    if not os.environ.get("ANTHROPIC_API_KEY"):
+        print("luminos requires ANTHROPIC_API_KEY. "
+              "Set it with: export ANTHROPIC_API_KEY=your-key-here",
+              file=sys.stderr)
+        sys.exit(0)
    if args.exclude:
        print(f"  [scan] Excluding: {', '.join(args.exclude)}", file=sys.stderr)
-    if args.watch:
-        watch_loop(target, depth=args.depth, show_hidden=args.all,
-                   json_output=args.json_output)
-        return
    report = scan(target, depth=args.depth, show_hidden=args.all,
                  exclude=args.exclude)
    flags = []
-    if args.ai:
    from luminos_lib.ai import analyze_directory
    brief, detailed, flags = analyze_directory(
        report, target, fresh=args.fresh, exclude=args.exclude)
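Review note on the new API-key gate in `main()`: the hint goes to stderr and the process exits with status 0, as CLAUDE.md describes. The sketch below isolates that behaviour so it can be exercised without running a scan. `require_api_key` is a hypothetical wrapper introduced for illustration; in the diff the check is inline in `main()`.

```python
import sys


def require_api_key(env):
    """Exit with status 0 and a one-line stderr hint if the key is unset or empty.

    `env` is any mapping, such as os.environ; passing it in keeps the check testable.
    """
    if not env.get("ANTHROPIC_API_KEY"):
        print("luminos requires ANTHROPIC_API_KEY. "
              "Set it with: export ANTHROPIC_API_KEY=your-key-here",
              file=sys.stderr)
        sys.exit(0)
```

Because the status is 0, a calling script cannot tell a skipped run from a completed one by exit code alone; it would have to inspect stderr or check for an empty report.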
--- a/luminos_lib/ai.py
+++ b/luminos_lib/ai.py
@@ -21,7 +21,6 @@ import anthropic
 import magic
 from luminos_lib.ast_parser import parse_structure
 from luminos_lib.cache import _CacheManager, _get_investigation_id
-from luminos_lib.capabilities import check_ai_dependencies
 from luminos_lib.prompts import (
    _DIR_SYSTEM_PROMPT,
    _SURVEY_SYSTEM_PROMPT,
@@ -1414,11 +1413,8 @@ def analyze_directory(report, target, verbose_tools=False, fresh=False,
                      exclude=None):
    """Run AI analysis on the directory. Returns (brief, detailed, flags).
-    Returns ("", "", []) if the API key is missing or dependencies are not met.
+    Returns ("", "", []) if the API key is missing.
    """
-    if not check_ai_dependencies():
-        sys.exit(1)
    api_key = _get_api_key()
    if not api_key:
        return "", "", []
--- a/luminos_lib/cache.py
+++ b/luminos_lib/cache.py
@@ -3,6 +3,8 @@
 import hashlib
 import json
 import os
+import shutil
+import sys
 import uuid
 from datetime import datetime, timezone
@@ -10,6 +12,16 @@ CACHE_ROOT = "/tmp/luminos"
 INVESTIGATIONS_PATH = os.path.join(CACHE_ROOT, "investigations.json")
+def clear_cache():
+    """Remove all investigation caches under CACHE_ROOT."""
+    if os.path.isdir(CACHE_ROOT):
+        shutil.rmtree(CACHE_ROOT)
+        print(f"Cleared cache: {CACHE_ROOT}", file=sys.stderr)
+    else:
+        print(f"No cache to clear ({CACHE_ROOT} does not exist).",
+              file=sys.stderr)
 def _sha256_path(path):
    """Return a hex SHA-256 of a path string, used as cache key."""
    return hashlib.sha256(path.encode("utf-8")).hexdigest()
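Review note on the relocated `clear_cache`: this PR deletes `tests/test_capabilities.py`, and the diff shows no test covering the helper in its new home. A minimal sketch of how the two branches could be exercised follows, assuming the cache root is passed in rather than read from `CACHE_ROOT`. The `cache_root` parameter, the boolean return value, and `demo` are illustrative additions, not part of the committed function.

```python
import os
import shutil
import sys
import tempfile


def clear_cache(cache_root):
    """Same logic as the relocated helper, with the root passed in for testability.

    Returns True if a cache directory was removed, False if there was none
    (the return value is an addition for this sketch).
    """
    if os.path.isdir(cache_root):
        shutil.rmtree(cache_root)
        print(f"Cleared cache: {cache_root}", file=sys.stderr)
        return True
    print(f"No cache to clear ({cache_root} does not exist).", file=sys.stderr)
    return False


def demo():
    """Exercise both branches against a throwaway directory."""
    with tempfile.TemporaryDirectory() as tmp:
        root = os.path.join(tmp, "luminos")
        os.makedirs(os.path.join(root, "some-investigation"))
        first = clear_cache(root)   # directory exists and is removed
        second = clear_cache(root)  # already gone, so nothing to clear
        return first, second, os.path.exists(root)
```

Against the committed function, the same check could be written by patching `luminos_lib.cache.CACHE_ROOT` to a temporary path with `unittest.mock.patch`.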
--- a/luminos_lib/capabilities.py
+++ b/luminos_lib/capabilities.py
@@ -1,139 +0,0 @@
 """Capability detection and cache management for optional luminos dependencies.
 The base tool requires zero external packages. The --ai flag requires:
  - anthropic       (API transport)
  - tree-sitter     (AST parsing via parse_structure tool)
  - python-magic    (improved file classification)
 This module is the single place that knows about optional dependencies.
 """
 _PACKAGES = {
    "anthropic": {
        "import": "anthropic",
        "pip": "anthropic",
        "purpose": "Claude API client (streaming, retries, token counting)",
    },
    "tree-sitter": {
        "import": "tree_sitter",
        "pip": ("tree-sitter tree-sitter-python tree-sitter-javascript "
                "tree-sitter-rust tree-sitter-go"),
        "purpose": "AST parsing for parse_structure tool",
    },
    "python-magic": {
        "import": "magic",
        "pip": "python-magic",
        "purpose": "Improved file type detection via libmagic",
    },
 }
 def _check_package(import_name):
    """Return True if a package is importable."""
    try:
        __import__(import_name)
        return True
    except ImportError:
        return False
 ANTHROPIC_AVAILABLE = _check_package("anthropic")
 TREE_SITTER_AVAILABLE = _check_package("tree_sitter")
 MAGIC_AVAILABLE = _check_package("magic")
 def check_ai_dependencies():
    """Check that all --ai dependencies are installed.
    If any are missing, prints a clear error with the pip install command
    and returns False. Returns True if everything is available.
    """
    missing = []
    for name, info in _PACKAGES.items():
        if not _check_package(info["import"]):
            missing.append(name)
    if not missing:
        return True
    # Also check tree-sitter grammar packages
    grammar_missing = []
    if "tree-sitter" not in missing:
        for grammar in ["tree_sitter_python", "tree_sitter_javascript",
                        "tree_sitter_rust", "tree_sitter_go"]:
            if not _check_package(grammar):
                grammar_missing.append(grammar.replace("_", "-"))
    import sys
    print("\nluminos --ai requires missing packages:", file=sys.stderr)
    for name in missing:
        print(f"  \u2717 {name}", file=sys.stderr)
    for name in grammar_missing:
        print(f"  \u2717 {name}", file=sys.stderr)
    # Build pip install command
    pip_parts = []
    for name in missing:
        pip_parts.append(_PACKAGES[name]["pip"])
    for name in grammar_missing:
        pip_parts.append(name)
    pip_cmd = " \\\n                ".join(pip_parts)
    print(f"\n  Install with:\n    pip install {pip_cmd}\n", file=sys.stderr)
    return False
 def print_status():
    """Print the install status of all optional packages."""
    print("\nLuminos optional dependencies:\n")
    for name, info in _PACKAGES.items():
        available = _check_package(info["import"])
        mark = "\u2713" if available else "\u2717"
        status = "installed" if available else "missing"
        print(f"  {mark} {name:20s} {status:10s}  {info['purpose']}")
    # Grammar packages
    grammars = {
        "tree-sitter-python": "tree_sitter_python",
        "tree-sitter-javascript": "tree_sitter_javascript",
        "tree-sitter-rust": "tree_sitter_rust",
        "tree-sitter-go": "tree_sitter_go",
    }
    print()
    for name, imp in grammars.items():
        available = _check_package(imp)
        mark = "\u2713" if available else "\u2717"
        status = "installed" if available else "missing"
        print(f"  {mark} {name:20s} {status:10s}  Language grammar")
    # Full install command (deduplicated)
    all_pkgs = []
    seen = set()
    for info in _PACKAGES.values():
        for pkg in info["pip"].split():
            if pkg not in seen:
                all_pkgs.append(pkg)
                seen.add(pkg)
    for name in grammars:
        if name not in seen:
            all_pkgs.append(name)
            seen.add(name)
    print(f"\n  Install all with:\n    pip install {' '.join(all_pkgs)}\n")
 from luminos_lib.cache import CACHE_ROOT
 def clear_cache():
    """Remove all investigation caches under /tmp/luminos/."""
    import shutil
    import os
    import sys
    if os.path.isdir(CACHE_ROOT):
        shutil.rmtree(CACHE_ROOT)
        print(f"Cleared cache: {CACHE_ROOT}", file=sys.stderr)
    else:
        print(f"No cache to clear ({CACHE_ROOT} does not exist).",
              file=sys.stderr)
--- a/luminos_lib/watch.py
+++ b/luminos_lib/watch.py
@@ -1,108 +0,0 @@
 """Watch mode — re-scan and show diffs every 30 seconds."""
 import json
 import sys
 import time
 import os
 def _snapshot(classified_files):
    """Create a snapshot dict: path -> (size, category)."""
    return {f["path"]: (f["size"], f["category"]) for f in classified_files}
 def _diff_snapshots(old, new):
    """Compare two snapshots and return changes."""
    old_paths = set(old.keys())
    new_paths = set(new.keys())
    added = new_paths - old_paths
    removed = old_paths - new_paths
    common = old_paths & new_paths
    size_changes = []
    for p in common:
        old_size = old[p][0]
        new_size = new[p][0]
        if old_size != new_size:
            size_changes.append((p, old_size, new_size))
    return added, removed, size_changes
 def _human_size(nbytes):
    for unit in ("B", "KB", "MB", "GB"):
        if nbytes < 1024:
            if unit == "B":
                return f"{nbytes} {unit}"
            return f"{nbytes:.1f} {unit}"
        nbytes /= 1024
    return f"{nbytes:.1f} TB"
 def watch_loop(target, depth=3, show_hidden=False, json_output=False):
    """Run scan in a loop, printing diffs between runs."""
    # Import here to avoid circular import
    from luminos_lib.filetypes import classify_files
    print(f"[luminos] Watching {target} (Ctrl+C to stop)")
    print(f"[luminos] Scanning every 30 seconds...")
    print()
    prev_snapshot = None
    try:
        while True:
            classified = classify_files(target, show_hidden=show_hidden)
            current = _snapshot(classified)
            if prev_snapshot is not None:
                added, removed, size_changes = _diff_snapshots(
                    prev_snapshot, current
                )
                if not added and not removed and not size_changes:
                    ts = time.strftime("%H:%M:%S")
                    print(f"[{ts}] No changes detected.")
                else:
                    ts = time.strftime("%H:%M:%S")
                    print(f"[{ts}] Changes detected:")
                    if json_output:
                        diff = {
                            "timestamp": ts,
                            "added": sorted(added),
                            "removed": sorted(removed),
                            "size_changes": [
                                {"path": p, "old_size": o, "new_size": n}
                                for p, o, n in size_changes
                            ],
                        }
                        print(json.dumps(diff, indent=2))
                    else:
                        for p in sorted(added):
                            name = os.path.basename(p)
                            print(f"  + NEW  {name}")
                            print(f"         {p}")
                        for p in sorted(removed):
                            name = os.path.basename(p)
                            print(f"  - DEL  {name}")
                            print(f"         {p}")
                        for p, old_s, new_s in size_changes:
                            name = os.path.basename(p)
                            delta = new_s - old_s
                            sign = "+" if delta > 0 else ""
                            print(f"  ~ SIZE {name}  "
                                  f"{_human_size(old_s)} -> {_human_size(new_s)} "
                                  f"({sign}{_human_size(delta)})")
                    print()
            else:
                print(f"[{time.strftime('%H:%M:%S')}] "
                      f"Initial scan complete: {len(current)} files indexed.")
                print()
            prev_snapshot = current
            time.sleep(30)
    except KeyboardInterrupt:
        print("\n[luminos] Watch stopped.")
--- a/requirements.txt
+++ b/requirements.txt
@@ -0,0 +1,7 @@
+anthropic
+python-magic
+tree-sitter
+tree-sitter-python
+tree-sitter-javascript
+tree-sitter-rust
+tree-sitter-go
--- a/setup_env.sh
+++ b/setup_env.sh
@@ -2,6 +2,7 @@
 set -euo pipefail
 VENV_DIR="$HOME/luminos-env"
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
 if [ -d "$VENV_DIR" ]; then
    echo "venv already exists at $VENV_DIR"
@@ -13,17 +14,19 @@ fi
 echo "Activating venv..."
 source "$VENV_DIR/bin/activate"
-echo "Installing packages..."
+echo "Installing packages from requirements.txt..."
-pip install anthropic tree-sitter tree-sitter-python \
+pip install -r "$SCRIPT_DIR/requirements.txt"
-            tree-sitter-javascript tree-sitter-rust \
-            tree-sitter-go python-magic
 echo ""
 echo "Done. To activate the venv in future sessions:"
 echo ""
 echo "  source ~/luminos-env/bin/activate"
 echo ""
-echo "Then run luminos as usual:"
+echo "Set your Anthropic API key:"
 echo ""
-echo "  python3 luminos.py --ai <target>"
+echo "  export ANTHROPIC_API_KEY=your-key-here"
+echo ""
+echo "Then run luminos:"
+echo ""
+echo "  python3 luminos.py <target>"
 echo ""
--- a/tests/test_capabilities.py
+++ b/tests/test_capabilities.py
@@ -1,37 +0,0 @@
 """Tests for luminos_lib/capabilities.py"""
 import unittest
 from unittest.mock import patch
 from luminos_lib.capabilities import _check_package
 class TestCheckPackage(unittest.TestCase):
    def test_importable_package(self):
        # json is always available in stdlib
        self.assertTrue(_check_package("json"))
    def test_missing_package(self):
        self.assertFalse(_check_package("_luminos_nonexistent_package_xyz"))
    def test_importable_returns_true(self):
        with patch("builtins.__import__", return_value=None):
            # patch doesn't work cleanly here; use a real stdlib module
            pass
        self.assertTrue(_check_package("os"))
    def test_import_error_returns_false(self):
        import builtins
        original_import = builtins.__import__
        def fake_import(name, *args, **kwargs):
            if name == "_fake_missing_module":
                raise ImportError("No module named '_fake_missing_module'")
            return original_import(name, *args, **kwargs)
        with patch("builtins.__import__", side_effect=fake_import):
            self.assertFalse(_check_package("_fake_missing_module"))
 if __name__ == "__main__":
    unittest.main()