# Luminos

A file system intelligence tool. Scans a directory and produces a reconnaissance report that tells you what the directory is, what's in it, and what might be worth your attention.

Luminos has two modes. The **base mode** is a single Python file that uses only the standard library and GNU coreutils. No pip install, no virtual environment, no dependencies to audit. The **`--ai` mode** runs a multi-pass agentic investigation against the [Claude API](https://www.anthropic.com/api) to reason about what the project actually does and flag anything that looks off. AI mode is opt-in and is the only path that requires pip-installable packages.

## Why

Most "repo explorer" tools answer one question: "what files are here?" Luminos is built around a harder question: "what is this, and should I be worried about any of it?"

The base scan gives you the mechanical answer: directory tree, file classification across seven categories, language breakdown with line counts, recently modified files, disk usage, and the largest files. That is usually enough for a quick "what is this" look.

The AI mode goes further. It runs an isolated investigation per directory, leaves-first, with a small toolbelt (read files, run whitelisted coreutils commands, write cache entries) and a per-directory context budget. Each directory gets its own summary, and a final synthesis pass reads only the directory-level cache entries to produce a whole-project verdict. Findings are flagged with a severity level (`critical`, `concern`, or `info`) so the important stuff floats to the top.

## Features

- **Zero dependencies in base mode.** Runs on bare Python 3 plus the GNU coreutils you already have.
- **Graceful degradation everywhere.** Permission denied, subprocess timeouts, missing API key, missing optional packages: all handled without crashing the scan.
- **Directory tree.** Visual tree with configurable depth and exclude patterns.
- **File classification.** Files bucketed into seven categories (code, config, docs, data, media, binary, other) via `file(1)` magic.
- **Language detection and LOC counting.** Which languages are present, how many lines of code per language.
- **Recently modified files.** Surface the files most likely to reflect current activity.
- **Disk usage.** Per-directory disk usage with top offenders called out.
- **Watch mode.** Re-scan every 30 seconds and show diffs.
- **JSON output.** Pipe reports to other tools or save for comparison.
- **AI investigation (opt-in).** Multi-pass, leaves-first agentic analysis via Claude, with an investigation cache so repeat runs are cheap.
- **Severity-ranked flags.** Findings are sorted so `critical` items are the first thing you see.

## Installation

### Base mode

No installation required. Clone and run.

```bash
git clone https://github.com/archeious/luminos.git
cd luminos
python3 luminos.py
```

Works on any system with Python 3 and standard GNU coreutils (`wc`, `file`, `grep`, `head`, `tail`, `stat`, `du`, `find`).

### AI mode

AI mode needs a few pip-installable packages. The project ships a helper script that creates a dedicated virtual environment and installs them:

```bash
./setup_env.sh
source ~/luminos-env/bin/activate
```

The packages installed are `anthropic`, `tree-sitter`, a handful of tree-sitter language grammars, and `python-magic`.
You also need an Anthropic API key exported as an environment variable:

```bash
export ANTHROPIC_API_KEY=your-key-here
```

Check which optional dependencies are present:

```bash
python3 luminos.py --install-extras
```

## Usage

### Base scan

```bash
python3 luminos.py /path/to/project
```

### AI scan

```bash
python3 luminos.py --ai /path/to/project
```

### Common flags

```bash
# Deeper tree, include hidden files, exclude build and vendor dirs
python3 luminos.py -d 8 -a -x .git -x node_modules -x vendor /path/to/project

# JSON output to a file
python3 luminos.py --json -o report.json /path/to/project

# Watch mode (re-scan every 30s, show diffs)
python3 luminos.py --watch /path/to/project

# Force a fresh AI investigation, ignoring the cache
python3 luminos.py --ai --fresh /path/to/project

# Clear the AI investigation cache
python3 luminos.py --clear-cache
```

Run `python3 luminos.py --help` for the full flag list.

## How AI mode works

A short version of what happens when you pass `--ai`:

1. **Discover** every directory under the target.
2. **Sort leaves-first** so the deepest directories are investigated before their parents.
3. **Run an isolated agent loop per directory** with a max of 10 turns each. The agent has a small toolbelt: read files, run whitelisted coreutils commands (`wc`, `file`, `grep`, `head`, `tail`, `stat`, `du`, `find`), and write cache entries.
4. **Cache everything.** Each file and directory summary is written to `/tmp/luminos/` so that subsequent runs on the same target don't burn tokens re-deriving things that haven't changed.
5. **Context budget guard.** Per-turn `input_tokens` is watched against a budget (currently 70% of the model's context window) so a rogue directory can't blow the context and silently degrade quality.
6. **Final synthesis pass** reads only the directory-level cache entries (not the raw files) to produce a project-level summary and the severity-ranked flags.
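
Steps 1, 2, and 5 above can be sketched in a few lines of standard-library Python. This is a minimal illustration, not the code Luminos actually ships: the function names `discover_leaves_first` and `over_budget` are hypothetical, and the context window size is passed in by the caller because the real value depends on the model.

```python
import os
from pathlib import Path


def discover_leaves_first(root):
    """Return every directory under root (root included), deepest first.

    Hypothetical helper: illustrates the discover + leaves-first sort steps.
    """
    root = Path(root)
    dirs = [Path(dirpath) for dirpath, _dirnames, _filenames in os.walk(root)]
    # A deeper directory has more path components relative to root.
    # Sort by depth descending, then by path so the order is deterministic.
    return sorted(dirs, key=lambda p: (-len(p.relative_to(root).parts), str(p)))


def over_budget(input_tokens, context_window, fraction=0.70):
    """Return True once a turn's input_tokens exceeds the budget.

    Hypothetical helper: the budget is a fraction of the context window,
    defaulting to the 70% figure quoted above.
    """
    return input_tokens > context_window * fraction
```

With an assumed 200,000-token window, the budget is 140,000 tokens, so `over_budget(150_000, 200_000)` returns `True` and `over_budget(100_000, 200_000)` returns `False`. Sorting by depth guarantees that every child directory appears before its parent, which is what lets a parent's investigation rely on cached summaries of its children.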
## Development

Run the test suite:

```bash
python3 -m unittest discover -s tests/
```

Modules that are intentionally not unit tested and why:

- `luminos_lib/ai.py`: requires a live Anthropic API, tested in practice
- `luminos_lib/ast_parser.py`: requires tree-sitter grammars installed
- `luminos_lib/watch.py`: stateful event loop, tested manually
- `luminos_lib/prompts.py`: string templates only

## License

Apache License 2.0. See [`LICENSE`](LICENSE) for the full text.

## Source of truth

The canonical home for this project is the [Forgejo repository](https://forgejo.labbity.unbiasedgeek.com/archeious/luminos). The GitHub copy is a read-only mirror, pushed automatically from Forgejo. Issues, pull requests, and the project wiki live on Forgejo.