16 changed files with 988 additions and 2700 deletions
--- a/CLAUDE.md
+++ b/CLAUDE.md
@@ -10,21 +10,18 @@

 ## Current Project State

-- **Phase:** Active development — Phases 1, 2, 2.5, 2.6, 2.7, 2.8, 3 complete. Next: fix #78 (synthesis persistence), #79 (stale cache), then reassess Phase 4+ (#40).
-- **Last worked on:** 2026-04-12
-- **Last commit:** fix(ai): match target root by basename in _apply_plan() (#76)
+- **Phase:** Active development — Phase 2 (survey pass) and Phase 2.5 (context budget) complete; Phase 3 (investigation planning) ready to start
+- **Last worked on:** 2026-04-06
+- **Last commit:** merge: feat/issue-44-context-budget (#44)
 - **Blocking:** None
-- **Test count:** 262 passing

 ---

 ## Project Overview

-Luminos is a file system intelligence tool. Point it at a directory and it
-runs a multi-pass agentic investigation via the Claude API: a survey pass,
-isolated dir-loop agents per directory, and a synthesis pass that produces a
-project-level verdict with severity-ranked flags. A lightweight base scan
-runs first to feed the agent its initial picture of the target.
+Luminos is a file system intelligence tool — a zero-dependency Python CLI that
+scans a directory and produces a reconnaissance report. With `--ai` it runs a
+multi-pass agentic investigation via the Claude API.

 ---

@@ -35,7 +32,8 @@ runs first to feed the agent its initial picture of the target.
 | `luminos.py` | Entry point — arg parsing, scan(), main() |
 | `luminos_lib/ai.py` | Multi-pass agentic analysis via Claude API |
 | `luminos_lib/ast_parser.py` | tree-sitter code structure parsing |
-| `luminos_lib/cache.py` | Investigation cache management (incl. clear_cache) |
+| `luminos_lib/cache.py` | Investigation cache management |
+| `luminos_lib/capabilities.py` | Optional dep detection, cache cleanup |
 | `luminos_lib/code.py` | Language detection, LOC counting |
 | `luminos_lib/disk.py` | Per-directory disk usage |
 | `luminos_lib/filetypes.py` | File classification (7 categories) |
@@ -43,6 +41,7 @@ runs first to feed the agent its initial picture of the target.
 | `luminos_lib/recency.py` | Recently modified files |
 | `luminos_lib/report.py` | Terminal report formatter |
 | `luminos_lib/tree.py` | Directory tree visualization |
+| `luminos_lib/watch.py` | Watch mode with snapshot diffing |

 Details: wiki — [Architecture](https://forgejo.labbity.unbiasedgeek.com/archeious/luminos/wiki/Architecture) | [Development Guide](https://forgejo.labbity.unbiasedgeek.com/archeious/luminos/wiki/DevelopmentGuide)

@@ -50,52 +49,72 @@ Details: wiki — [Architecture](https://forgejo.labbity.unbiasedgeek.com/archei

 ## Key Constraints

-- **AI investigation is the product.** The base scan exists to feed the agent.
-  There is no `--ai` flag and no `--no-ai` mode. AI runs unconditionally on
-  every invocation.
-- **Anthropic API key is required.** If `ANTHROPIC_API_KEY` is unset, luminos
-  exits cleanly (exit 0) with a one-line hint instead of running.
-- **Dependencies installed via `requirements.txt`.** anthropic, tree-sitter +
-  grammars, and python-magic are normal pip dependencies, not lazy imports.
-  `setup_env.sh` creates a venv and installs them.
+- **Base tool: no pip dependencies.** tree, filetypes, code, disk, recency,
+  report, watch use only stdlib and GNU coreutils. Must always work on bare Python 3.
+- **AI deps are lazy.** `anthropic`, `tree-sitter`, `python-magic` imported only
+  when `--ai` is used. Missing packages produce a clear install error.
 - **Subprocess for OS tools.** LOC counting, file detection, disk usage, and
  recency shell out to GNU coreutils. Do not reimplement in pure Python.
 - **Graceful degradation everywhere.** Permission denied, subprocess timeouts,
-  individual dir-loop failures — all handled without crashing the run.
+  missing API key — all handled without crashing.
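
The lazy-dependency constraint in the added lines above could be sketched roughly as follows. This is an illustration only: the helper names `missing_packages` and `require_ai_deps` are hypothetical, and the diff does not show how `luminos_lib/capabilities.py` actually performs detection. The pip-name to module-name mapping is likewise an assumption.

```python
import importlib.util

# Hypothetical mapping: pip package name -> importable module name.
AI_PACKAGES = {
    "anthropic": "anthropic",
    "tree-sitter": "tree_sitter",
    "python-magic": "magic",
}


def missing_packages(packages=AI_PACKAGES):
    """Return pip names whose module cannot be located, without importing them."""
    return [
        pip_name
        for pip_name, module in packages.items()
        if importlib.util.find_spec(module) is None
    ]


def require_ai_deps(packages=AI_PACKAGES):
    """Raise a clear install error if any optional AI dependency is missing."""
    missing = missing_packages(packages)
    if missing:
        raise RuntimeError(
            "AI analysis needs extra packages. Install with: pip install "
            + " ".join(missing)
        )
```

Calling `require_ai_deps()` only on the `--ai` code path would keep the base scan importable on bare Python 3, since `find_spec` checks availability without importing anything.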

 ---

 ## Running Luminos

 ```bash
-# Activate the venv (one-time setup: ./setup_env.sh)
-source ~/luminos-env/bin/activate
-export ANTHROPIC_API_KEY=your-key-here
-
-# Run an investigation
+# Base scan
 python3 luminos.py <target>

+# With AI analysis (requires ANTHROPIC_API_KEY)
+source ~/luminos-env/bin/activate
+python3 luminos.py --ai <target>
+
 # Common flags
 python3 luminos.py -d 8 -a -x .git -x node_modules <target>
 python3 luminos.py --json -o report.json <target>
-python3 luminos.py --fresh <target>
-python3 luminos.py --clear-cache
+python3 luminos.py --watch <target>
+python3 luminos.py --install-extras
 ```

 ---

-## Project-Specific Test Notes
+## Development Workflow

-Run tests with `python3 -m unittest discover -s tests/`. Modules exempt from
-unit testing: `ast_parser.py` (requires tree-sitter grammars at import time)
-and `prompts.py` (string templates only). `ai.py` is partially covered:
-end-to-end loops require a live Anthropic API and stay exempt, but the pure
-helpers (`_filter_dir_tools`, `_format_survey_block`, `_path_is_safe`,
-`_should_skip_dir`, `_block_to_dict`, `_flush_partial_dir_entry`, etc.) are
-covered by `tests/test_ai_pure.py`.
+- **Issue-driven work** — all work must be tied to a Forgejo issue. If the
+  user names a specific issue, use it. If they describe work without an issue
+  number, search open issues for a match. If no issue exists, gather enough
+  context to create one before starting work. Branches and commits should
+  reference the issue number.
+- **Explain then build** — articulate the approach in a few bullets before
+  writing code. Surface assumptions early.
+- **Atomic commits** — each commit is one logical change.
+- **Test coverage required** — every change to a testable module must include
+  or update tests in `tests/`. Run with `python3 -m unittest discover -s tests/`.
+  All tests must pass before merging. Modules exempt from unit testing:
+  `ai.py` (requires live API), `ast_parser.py` (requires tree-sitter),
+  `watch.py` (stateful events), `prompts.py` (string templates only).
+- **Shiny object capture** — new ideas go to PLAN.md (Raw Thoughts) or a
+  Forgejo issue, not into current work.

-(Development workflow, branching discipline, and session protocols live in
-`~/.claude/CLAUDE.md`.)
+---
+
+## Branching Discipline
+
+- **Always branch** — no direct commits to main, ever
+- **Branch before first change** — create the branch before touching any files
+- **Naming:** `feat/`, `fix/`, `refactor/`, `chore/` + short description
+- **One branch, one concern** — don't mix unrelated changes
+- **Two-branch maximum** — never have more than 2 unmerged branches
+- **Merge with `--no-ff`** — preserves branch history in the log
+- **Delete after merge** — `git branch -d <branch>` immediately after merge
+- **Close the underlying issue manually** — after merging, `PATCH` the
+  referenced issue to `state: closed` via the Forgejo API. Do not rely
+  on `Closes #N` keyword auto-close — it has not worked reliably in
+  this Forgejo instance, leaving issues stale while their PRs are
+  merged. Manual close is one extra API call and is part of the merge
+  step, not optional.
+- **Push after commits** — keep Forgejo in sync after each commit or logical batch
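
The manual-close step described above might look roughly like this. It is a sketch, assuming the Forgejo/Gitea REST route `PATCH /api/v1/repos/{owner}/{repo}/issues/{index}` and token authentication; the helper name `build_close_request` and its parameters are hypothetical and do not appear in the diff. The function only builds the request, so nothing is sent until the caller chooses to.

```python
import json
import urllib.request


def build_close_request(base_url, owner, repo, issue_number, token):
    """Build (but do not send) a PATCH request that sets an issue to closed."""
    url = f"{base_url.rstrip('/')}/api/v1/repos/{owner}/{repo}/issues/{issue_number}"
    body = json.dumps({"state": "closed"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            # Assumed auth scheme: a personal access token.
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )
```

To actually close an issue after a merge, pass the result to `urllib.request.urlopen(...)` and check for a successful response.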

 ---

@@ -107,17 +126,71 @@ covered by `tests/test_ai_pure.py`.
 | Classes | PascalCase | `_TokenTracker`, `_CacheManager` |
 | Constants | UPPER_SNAKE_CASE | `MAX_CONTEXT`, `CACHE_ROOT` |
 | Module files | snake_case | `ast_parser.py` |
-| CLI flags | kebab-case | `--clear-cache`, `--fresh` |
+| CLI flags | kebab-case | `--clear-cache`, `--install-extras` |
 | Private functions | leading underscore | `_run_synthesis` |

 ---

+## Documentation Workflow
+
+- **Wiki location:** `docs/wiki/` — local git checkout of `luminos.wiki.git`
+- **Clone URL:** `ssh://git@forgejo-claude/archeious/luminos.wiki.git`
+- **Session startup:** clone if missing, `git -C docs/wiki pull` if present
+- **All reads and writes** happen on local files in `docs/wiki/`. Use Read,
+  Edit, Write, Grep, Glob — never the Forgejo web API for wiki content.
+- **Naming:** CamelCase slugs (`Architecture.md`, `DevelopmentGuide.md`).
+  Display name comes from the H1 heading inside the file.
+- **Commits:** direct to main branch. Batch logically — commit when finishing
+  a round of updates, not after every file.
+- **Push:** after each commit batch.
+
+---
+
+## ADHD Session Protocols
+
+> **MANDATORY — follow literally, every session, no exceptions.**
+
+1. **Session Start Ritual** — Ensure `docs/wiki/` is cloned and current.
+   Fetch open issues from Forgejo (`archeious/luminos`) and present them as
+   suggested tasks. Ask: *"What's the one thing we're shipping?"* Once the
+   user answers, match to an existing issue or create one before starting
+   work. Do NOT summarize project state, recap history, or do any other work
+   before asking this question.
+
+2. **Dopamine-Friendly Task Sizing** — break work into 5–15 minute tasks with
+   clear, visible outputs. Each task should have a moment of completion.
+
+3. **Focus Guard** — classify incoming requests as on-topic / adjacent /
+   off-topic. Name it out loud before acting. Adjacent work goes to a new
+   issue; off-topic work gets deferred.
+
+4. **Shiny Object Capture** — when a new idea surfaces mid-session, write it
+   to PLAN.md (Raw Thoughts) or open a Forgejo issue, then return to the
+   current task. Do not context-switch.
+
+5. **Breadcrumb Protocol** — after each completed task, output:
+   `Done: <what was completed>. Next: <what comes next>.`
+   This re-orients after any interruption.
+
+6. **Session End Protocol** — before closing, state the exact pickup point for
+   the next session: branch name, file, what was in progress, and the
+   recommended first action next time.
+
+---
+
+## Session Protocols
+
+- **"externalize"** → read and follow `docs/externalize.md`
+- **"wrap up" / "end session"** → read and follow `docs/wrap-up.md`
+
+---
+
 ## Session Log

 | # | Date | Summary |
 |---|---|---|
-| 8 | 2026-04-07 | Closed #54 — added confidence/confidence_reason to write_cache tool schema description; Phase 1 milestone now 4/4 complete |
-| 9 | 2026-04-11 | Scope shift (#64) + ALL Phase 3 prereqs: dir loop refactor (#57), tool registry consolidation (#56), pure-helper test coverage waves 1+2 (#55, #70), leaf-first contract docs (#72). 6 PRs, 70 net new tests (164→234), Phase 2.6/2.7/2.8 milestones complete |
-| 10 | 2026-04-12 | Phase 3 shipped: planning pass, dynamic turn allocation, quality instrumentation (#8, #9, #10, #11, #74). Fixed root-path matching bug (#76). Smoke tests on luminos + homelab IaC. Filed #78 (synthesis persistence), #79 (stale cache). 3 PRs, 28 new tests (234→262) |
+| 2 | 2026-04-06 | Forgejo milestones (9), issues (36), project board, Gitea MCP installed and configured globally |
+| 3 | 2026-04-06 | Phase 1 complete (#1–#3), MCP backend architecture design (Part 10, Phase 3.5), issues #38–#40 opened |
+| 4 | 2026-04-06 | Phase 2 + 2.5 complete (#4–#7, #42, #44), filetype classifier rebuild, context budget metric fix, 8 PRs merged, issues #46/#48/#49/#51 opened |

 Full log: wiki — [Session Retrospectives](https://forgejo.labbity.unbiasedgeek.com/archeious/luminos/wiki/SessionRetrospectives)
--- a/202
+++ b/202
@@ -1,202 +0,0 @@
-
-                                 Apache License
-                           Version 2.0, January 2004
-                        http://www.apache.org/licenses/
-
-   TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION
-
-   1. Definitions.
-
-      "License" shall mean the terms and conditions for use, reproduction,
-      and distribution as defined by Sections 1 through 9 of this document.
-
-      "Licensor" shall mean the copyright owner or entity authorized by
-      the copyright owner that is granting the License.
-
-      "Legal Entity" shall mean the union of the acting entity and all
-      other entities that control, are controlled by, or are under common
-      control with that entity. For the purposes of this definition,
-      "control" means (i) the power, direct or indirect, to cause the
-      direction or management of such entity, whether by contract or
-      otherwise, or (ii) ownership of fifty percent (50%) or more of the
-      outstanding shares, or (iii) beneficial ownership of such entity.
-
-      "You" (or "Your") shall mean an individual or Legal Entity
-      exercising permissions granted by this License.
-
-      "Source" form shall mean the preferred form for making modifications,
-      including but not limited to software source code, documentation
-      source, and configuration files.
-
-      "Object" form shall mean any form resulting from mechanical
-      transformation or translation of a Source form, including but
-      not limited to compiled object code, generated documentation,
-      and conversions to other media types.
-
-      "Work" shall mean the work of authorship, whether in Source or
-      Object form, made available under the License, as indicated by a
-      copyright notice that is included in or attached to the work
-      (an example is provided in the Appendix below).
-
-      "Derivative Works" shall mean any work, whether in Source or Object
-      form, that is based on (or derived from) the Work and for which the
-      editorial revisions, annotations, elaborations, or other modifications
-      represent, as a whole, an original work of authorship. For the purposes
-      of this License, Derivative Works shall not include works that remain
-      separable from, or merely link (or bind by name) to the interfaces of,
-      the Work and Derivative Works thereof.
-
-      "Contribution" shall mean any work of authorship, including
-      the original version of the Work and any modifications or additions
-      to that Work or Derivative Works thereof, that is intentionally
-      submitted to Licensor for inclusion in the Work by the copyright owner
-      or by an individual or Legal Entity authorized to submit on behalf of
-      the copyright owner. For the purposes of this definition, "submitted"
-      means any form of electronic, verbal, or written communication sent
-      to the Licensor or its representatives, including but not limited to
-      communication on electronic mailing lists, source code control systems,
-      and issue tracking systems that are managed by, or on behalf of, the
-      Licensor for the purpose of discussing and improving the Work, but
-      excluding communication that is conspicuously marked or otherwise
-      designated in writing by the copyright owner as "Not a Contribution."
-
-      "Contributor" shall mean Licensor and any individual or Legal Entity
-      on behalf of whom a Contribution has been received by Licensor and
-      subsequently incorporated within the Work.
-
-   2. Grant of Copyright License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      copyright license to reproduce, prepare Derivative Works of,
-      publicly display, publicly perform, sublicense, and distribute the
-      Work and such Derivative Works in Source or Object form.
-
-   3. Grant of Patent License. Subject to the terms and conditions of
-      this License, each Contributor hereby grants to You a perpetual,
-      worldwide, non-exclusive, no-charge, royalty-free, irrevocable
-      (except as stated in this section) patent license to make, have made,
-      use, offer to sell, sell, import, and otherwise transfer the Work,
-      where such license applies only to those patent claims licensable
-      by such Contributor that are necessarily infringed by their
-      Contribution(s) alone or by combination of their Contribution(s)
-      with the Work to which such Contribution(s) was submitted. If You
-      institute patent litigation against any entity (including a
-      cross-claim or counterclaim in a lawsuit) alleging that the Work
-      or a Contribution incorporated within the Work constitutes direct
-      or contributory patent infringement, then any patent licenses
-      granted to You under this License for that Work shall terminate
-      as of the date such litigation is filed.
-
-   4. Redistribution. You may reproduce and distribute copies of the
-      Work or Derivative Works thereof in any medium, with or without
-      modifications, and in Source or Object form, provided that You
-      meet the following conditions:
-
-      (a) You must give any other recipients of the Work or
-          Derivative Works a copy of this License; and
-
-      (b) You must cause any modified files to carry prominent notices
-          stating that You changed the files; and
-
-      (c) You must retain, in the Source form of any Derivative Works
-          that You distribute, all copyright, patent, trademark, and
-          attribution notices from the Source form of the Work,
-          excluding those notices that do not pertain to any part of
-          the Derivative Works; and
-
-      (d) If the Work includes a "NOTICE" text file as part of its
-          distribution, then any Derivative Works that You distribute must
-          include a readable copy of the attribution notices contained
-          within such NOTICE file, excluding those notices that do not
-          pertain to any part of the Derivative Works, in at least one
-          of the following places: within a NOTICE text file distributed
-          as part of the Derivative Works; within the Source form or
-          documentation, if provided along with the Derivative Works; or,
-          within a display generated by the Derivative Works, if and
-          wherever such third-party notices normally appear. The contents
-          of the NOTICE file are for informational purposes only and
-          do not modify the License. You may add Your own attribution
-          notices within Derivative Works that You distribute, alongside
-          or as an addendum to the NOTICE text from the Work, provided
-          that such additional attribution notices cannot be construed
-          as modifying the License.
-
-      You may add Your own copyright statement to Your modifications and
-      may provide additional or different license terms and conditions
-      for use, reproduction, or distribution of Your modifications, or
-      for any such Derivative Works as a whole, provided Your use,
-      reproduction, and distribution of the Work otherwise complies with
-      the conditions stated in this License.
-
-   5. Submission of Contributions. Unless You explicitly state otherwise,
-      any Contribution intentionally submitted for inclusion in the Work
-      by You to the Licensor shall be under the terms and conditions of
-      this License, without any additional terms or conditions.
-      Notwithstanding the above, nothing herein shall supersede or modify
-      the terms of any separate license agreement you may have executed
-      with Licensor regarding such Contributions.
-
-   6. Trademarks. This License does not grant permission to use the trade
-      names, trademarks, service marks, or product names of the Licensor,
-      except as required for reasonable and customary use in describing the
-      origin of the Work and reproducing the content of the NOTICE file.
-
-   7. Disclaimer of Warranty. Unless required by applicable law or
-      agreed to in writing, Licensor provides the Work (and each
-      Contributor provides its Contributions) on an "AS IS" BASIS,
-      WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
-      implied, including, without limitation, any warranties or conditions
-      of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
-      PARTICULAR PURPOSE. You are solely responsible for determining the
-      appropriateness of using or redistributing the Work and assume any
-      risks associated with Your exercise of permissions under this License.
-
-   8. Limitation of Liability. In no event and under no legal theory,
-      whether in tort (including negligence), contract, or otherwise,
-      unless required by applicable law (such as deliberate and grossly
-      negligent acts) or agreed to in writing, shall any Contributor be
-      liable to You for damages, including any direct, indirect, special,
-      incidental, or consequential damages of any character arising as a
-      result of this License or out of the use or inability to use the
-      Work (including but not limited to damages for loss of goodwill,
-      work stoppage, computer failure or malfunction, or any and all
-      other commercial damages or losses), even if such Contributor
-      has been advised of the possibility of such damages.
-
-   9. Accepting Warranty or Additional Liability. While redistributing
-      the Work or Derivative Works thereof, You may choose to offer,
-      and charge a fee for, acceptance of support, warranty, indemnity,
-      or other liability obligations and/or rights consistent with this
-      License. However, in accepting such obligations, You may act only
-      on Your own behalf and on Your sole responsibility, not on behalf
-      of any other Contributor, and only if You agree to indemnify,
-      defend, and hold each Contributor harmless for any liability
-      incurred by, or claims asserted against, such Contributor by reason
-      of your accepting any such warranty or additional liability.
-
-   END OF TERMS AND CONDITIONS
-
-   APPENDIX: How to apply the Apache License to your work.
-
-      To apply the Apache License to your work, attach the following
-      boilerplate notice, with the fields enclosed by brackets "[]"
-      replaced with your own identifying information. (Don't include
-      the brackets!)  The text should be enclosed in the appropriate
-      comment syntax for the file format. We also recommend that a
-      file or class name and description of purpose be included on the
-      same "printed page" as the copyright notice for easier
-      identification within third-party archives.
-
-   Copyright [yyyy] [name of copyright owner]
-
-   Licensed under the Apache License, Version 2.0 (the "License");
-   you may not use this file except in compliance with the License.
-   You may obtain a copy of the License at
-
-       http://www.apache.org/licenses/LICENSE-2.0
-
-   Unless required by applicable law or agreed to in writing, software
-   distributed under the License is distributed on an "AS IS" BASIS,
-   WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
-   See the License for the specific language governing permissions and
-   limitations under the License.
--- a/PLAN.md
+++ b/PLAN.md
@@ -576,46 +576,6 @@ without 9 phases of rework.
  output size, redundant reads) before picking a fix.
 - Add token-usage instrumentation so regressions are visible.

-### Phase 2.6 — Pre-Phase-3 cleanup (#54, #57) ✅ shipped
-Two debts surfaced during the Session 5 documentation deep dive that
-were paid before Phase 3 adds more state to the same code paths:
-
-- **#54** — Phase 1 confidence-write path was dormant. Cache schema
-  accepted `confidence` and `low_confidence_entries()` worked, but no
-  prompt instructed the agent to set the field. Wired in Session 8.
-- **#57** — `_run_dir_loop` was ~160 lines holding four conceptual
-  layers. Refactored in Session 9 into three focused helpers
-  (`_build_dir_loop_context`, `_flush_partial_dir_entry`,
-  `_handle_turn_response`) so Phase 3 dynamic turn allocation has a
-  thin coordinator to inject into.
-
-### Phase 2.7 — Tool registration cleanup (#56) ✅ shipped
-Adding a tool used to require updating `_TOOL_DISPATCH` and `_DIR_TOOLS`
-in two separate places. Forgetting one half was silent. Replaced both
-with a single `register_tool()` call per (tool, scope) in Session 9.
-Phase 3.5 MCP backend will eventually replace this with dynamic
-discovery, at which point `register_tool()` collapses to a one-line
-forward.
-
-### Phase 2.8 — Pre-Phase-3 test coverage (#55, #70)
-Safety nets for the helpers Phase 3 will pile state on top of. Two
-waves:
-
-- **#55** ✅ shipped — `tests/test_ai_pure.py` covers the easy
-  decision-logic helpers: `_filter_dir_tools`, `_format_survey_block`,
-  `_format_survey_signals`, `_default_survey`, `_should_skip_dir`,
-  `_path_is_safe`, `_block_to_dict`, plus `_flush_partial_dir_entry`
-  from #57. 45 tests added in Session 9.
-- **#70** — second wave covering the highest-impact remaining helpers
-  that escaped the first sweep:
-  - `_TokenTracker` — pins the load-bearing #44 fix
-    (`last_input` vs cumulative for budget decisions)
-  - `_synthesize_from_cache` — last-resort fallback that fires almost
-    never in normal runs and is therefore the kind of code that silently
-    rots
-  - `_discover_directories` — leaves-first walk and skip-dir filter,
-    foundation of the cache reuse story
-
 ### Phase 3 — Investigation planning
 - Planning pass after survey, before dir loops
 - `submit_plan` tool
@@ -692,7 +652,7 @@ architecture. The migration pain is intentional and instructive.
  extension sub-section or similar. Low priority, not blocking.
 - **Revisit survey-skip thresholds (#46)** — `_SURVEY_MIN_FILES` and
  `_SURVEY_MIN_DIRS` shipped with values from #7's example, no
-  empirical basis. Once luminos has been run on a variety of real
+  empirical basis. Once `--ai` has been run on a variety of real
  targets, look at which runs skipped the survey vs ran it and decide
  whether the thresholds (or the gate logic itself) need to change.

@@ -711,7 +671,7 @@ architecture. The migration pain is intentional and instructive.
 | `luminos_lib/search.py` | **new** — web_search, fetch_url, package_lookup implementations |

 No changes needed to: `tree.py`, `filetypes.py`, `code.py`, `recency.py`,
-`disk.py`, `ast_parser.py`
+`disk.py`, `capabilities.py`, `watch.py`, `ast_parser.py`

 ---

@@ -803,20 +763,20 @@ agent read, in what order, what it decided to skip). Storing the full message
 history per directory would allow replaying or auditing an investigation. Cost:
 storage. Benefit: debuggability, ability to resume investigations more faithfully.

-**Live re-investigation mode**
-A "watch" replacement: detect which directories changed, re-investigate only
-those, and patch the cache entries. The synthesis would then re-run from the
-updated cache without re-investigating unchanged directories. The original
-non-AI watch mode was deleted in the #64 scope change because it conflicted
-with the AI-first philosophy. If watch comes back, it comes back as this.
+**Watch mode + incremental investigation**
+Watch mode currently re-runs the full base scan on changes. For AI-augmented
+watch mode: detect which directories changed, re-investigate only those, and
+patch the cache entries. The synthesis would then re-run from the updated cache
+without re-investigating unchanged directories.
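
One way the "detect which directories changed" step might work, as a sketch only: take two snapshots that map each directory to a fingerprint of its immediate files, then diff them. The snapshot shape and the names `snapshot_dirs` and `changed_dirs` are assumptions for illustration and are not taken from `watch.py`.

```python
import os


def snapshot_dirs(root):
    """Map each directory under root to a tuple of (name, mtime_ns, size) per file."""
    snap = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        entries = []
        for name in sorted(filenames):
            try:
                st = os.stat(os.path.join(dirpath, name))
            except OSError:
                continue  # file vanished or is unreadable; skip it
            entries.append((name, st.st_mtime_ns, st.st_size))
        snap[dirpath] = tuple(entries)
    return snap


def changed_dirs(old, new):
    """Directories added, removed, or whose file fingerprints differ."""
    return sorted(d for d in old.keys() | new.keys() if old.get(d) != new.get(d))
```

Only the directories returned by `changed_dirs` would need re-investigation; cache entries for the rest could be reused as-is before synthesis re-runs.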

-**PDF and Office document readers**
+**Optional PDF and Office document readers**
 The data and documents domains would benefit from native content extraction:
 - `pdfminer` or `pypdf` for PDF text extraction
 - `openpyxl` for Excel schema and sheet enumeration
 - `python-docx` for Word document text
-These slot into `requirements.txt` like any other dependency. The agent
-currently can only see filename and size for these formats.
+These would be optional deps like the existing AI deps, gated behind
+`--install-extras`. The agent currently can only see filename and size for
+these formats.

 **Security-focused analysis mode**
 A `--security` flag could tune the investigation toward security-relevant
@@ -879,6 +839,11 @@ bad plan wastes turns on shallow directories and skips critical ones. The system
 needs quality signals — probably the confidence scores aggregated across the
 investigation — to detect when something went wrong and potentially retry.

+**Watch mode compatibility**
+Several of the planned features (survey pass, planning, external tools) are not
+designed for incremental re-use in watch mode. Adding AI capability to watch
+mode is a separate design problem that deserves its own thinking.
+
 **Turn budget contention**
 If the planning pass allocates turns and the agent borrows from its budget when
 it needs more, there's a risk of runaway investigation on unexpectedly complex
--- a/README.md
+++ b/README.md
@@ -1,103 +0,0 @@
-# Luminos
-
-A file system intelligence tool. Point it at a directory and it runs an agentic Claude investigation that figures out what the directory is, what's in it, and what might be worth your attention.
-
-Luminos is built around a harder question than "what files are here?" It is built around "what is this, and should I be worried about any of it?" To answer that, it runs a multi-pass agentic investigation against the [Claude API](https://www.anthropic.com/api): a survey pass to orient on the target, an isolated dir-loop agent per directory with a small toolbelt (read files, run whitelisted coreutils commands, write cache entries), and a final synthesis pass that produces a project-level verdict with severity-ranked flags.
-
-A lightweight base scan runs first to feed the agent its initial picture of the target. The base scan is not a standalone product, it is the first step of the investigation.
-
-## Features
-
-- **Agentic AI investigation.** Multi-pass, leaves-first analysis via Claude. Survey then dir loops then synthesis.
-- **Investigation cache.** Per-file and per-directory summaries are cached under `/tmp/luminos/` so repeat runs on the same target are cheap.
-- **Severity-ranked flags.** Findings are sorted so `critical` items are the first thing you see.
-- **Context budget guard.** Per-turn `input_tokens` is watched against a budget so a rogue directory can't blow the context and silently degrade quality.
-- **Graceful degradation.** Permission denied, subprocess timeouts, missing API key: all handled without crashing.
-- **JSON output.** Pipe reports to other tools or save for comparison.
-
-## Installation
-
-Luminos is a normal Python project. Clone, create a venv, and install from `requirements.txt`. The repository ships a helper script that does this for you:
-
-```bash
-git clone https://github.com/archeious/luminos.git
-cd luminos
-./setup_env.sh
-source ~/luminos-env/bin/activate
-```
-
-Or do it by hand:
-
-```bash
-python3 -m venv ~/luminos-env
-source ~/luminos-env/bin/activate
-pip install -r requirements.txt
-```
-
-You also need an Anthropic API key exported as an environment variable:
-
-```bash
-export ANTHROPIC_API_KEY=your-key-here
-```
-
-The base scan shells out to a handful of GNU coreutils (`wc`, `file`, `grep`, `head`, `tail`, `stat`, `du`, `find`), so you also need those on `$PATH`. They are installed by default on every mainstream Linux distribution and on macOS via Homebrew.
-
-## Usage
-
-```bash
-python3 luminos.py /path/to/project
-```
-
-That is the whole interface. The investigation runs end to end and prints a report.
-
-### Common flags
-
-```bash
-# Deeper tree, include hidden files, exclude build and vendor dirs
-python3 luminos.py -d 8 -a -x .git -x node_modules -x vendor /path/to/project
-
-# JSON output to a file
-python3 luminos.py --json -o report.json /path/to/project
-
-# Force a fresh investigation, ignoring the cache
-python3 luminos.py --fresh /path/to/project
-
-# Clear the investigation cache
-python3 luminos.py --clear-cache
-```
-
-Run `python3 luminos.py --help` for the full flag list.
-
-## How the investigation works
-
-A short version of what happens on every run:
-
-1. **Base scan.** Builds the directory tree, classifies files into seven categories, counts lines of code, finds large and recently modified files, computes per-directory disk usage. This is the agent's initial picture of the target.
-2. **Survey pass.** A short agent loop (max 3 turns) reads the base scan, describes the target in plain language, and decides which investigation tools are relevant. Tiny targets skip the survey.
-3. **Dir loops.** Every directory gets its own isolated agent loop, leaves-first, with up to 14 turns. The agent has read-only access to the filesystem and a toolbelt of `read_file`, `list_directory`, `run_command`, `parse_structure`, `write_cache`, `think`, `checkpoint`, `flag`, and `submit_report`.
-4. **Cache.** Each file and directory summary is written to `/tmp/luminos/` so subsequent runs on the same target don't re-derive what hasn't changed.
-5. **Context budget guard.** Per-turn `input_tokens` is watched against a budget (currently 70% of the model's context window) so a rogue directory can't blow the context window.
-6. **Final synthesis.** A short agent loop reads the directory-level cache entries (not the raw files) and produces the project-level brief, the detailed analysis, and the severity-ranked flags.
-
-## Development
-
-Run the test suite:
-
-```bash
-python3 -m unittest discover -s tests/
-```
-
-Modules that are intentionally not unit tested:
-
- `luminos_lib/ast_parser.py`: requires tree-sitter grammars installed
- `luminos_lib/prompts.py`: string templates only
-
-`luminos_lib/ai.py` is partially covered. End-to-end agent loops require a live Anthropic API and stay exempt, but pure helpers are tested in `tests/test_ai_pure.py`.
-
-## License
-
-Apache License 2.0. See [`LICENSE`](LICENSE) for the full text.
-
-## Source of truth
-
-The canonical home for this project is the [Forgejo repository](https://forgejo.labbity.unbiasedgeek.com/archeious/luminos). The GitHub copy is a read-only mirror, pushed automatically from Forgejo. Issues, pull requests, and the project wiki live on Forgejo.
--- a/docs/externalize.md
+++ b/docs/externalize.md
@ -0,0 +1,36 @@
+# Externalize Protocol
+
+> Triggered when the user says "externalize" or "externalize your thoughts."
+> This is a STANDALONE action. Do NOT wrap up unless separately asked.
+
+## Steps
+
+1. **Determine session number** — check the Session Log in CLAUDE.md for the
+   latest session number, increment by 1
+
+2. **Pull wiki** — ensure `docs/wiki/` is current:
+   ```bash
+   git -C docs/wiki pull   # or clone if missing
+   ```
+
+3. **Create session wiki page** — write `docs/wiki/Session{N}.md` with:
+   - Date, focus, duration estimate
+   - What was done (with detail — reference actual files and commits)
+   - Discoveries and observations
+   - Decisions made and why
+   - Raw Thinking — observations, concerns, trade-offs, and loose threads that
+     came up during the session but weren't part of the main deliverable.
+     Things you'd mention if pair programming: prerequisites noticed, corners
+     being painted into, intent mismatches, unresolved questions.
+   - What's next
+
+4. **Update SessionRetrospectives.md** — read the current index, add the new
+   session row, write it back
+
+5. **Commit and push wiki:**
+   ```bash
+   cd docs/wiki
+   git add -A
+   git commit -m "retro: Session {N} — <one-line summary>"
+   git push
+   ```
--- a/docs/wrap-up.md
+++ b/docs/wrap-up.md
@ -0,0 +1,31 @@
+# Session Wrap-Up Checklist
+
+> Triggered when the user says "wrap up", "end session", or similar.
+> Always externalize FIRST, then do the steps below.
+
+## Steps
+
+1. **Externalize** — run the `docs/externalize.md` protocol if not already
+   done this session
+
+2. **Reread CLAUDE.md** — ensure you have the latest context before editing
+
+3. **Update CLAUDE.md:**
+   - Update **Current Project State** — phase, last worked on (today's date),
+     last commit, blocking issues
+   - Update **Session Log** — add new entry, keep only last 3 sessions,
+     remove older ones (full history is in the wiki)
+
+4. **Commit and push main repo:**
+   ```bash
+   git add CLAUDE.md
+   git commit -m "chore: update CLAUDE.md for session {N}"
+   git push
+   ```
+
+5. **Verify nothing is unpushed** — both the main repo and docs/wiki should
+   have no pending commits
+
+6. **Recommend next session** — tell the user what the best next session
+   should tackle, in priority order based on PLAN.md and any open Forgejo
+   issues
--- a/luminos.py
+++ b/luminos.py
@ -16,11 +16,16 @@ from luminos_lib.filetypes import (
 from luminos_lib.code import detect_languages, find_large_files
 from luminos_lib.recency import find_recent_files
 from luminos_lib.disk import get_disk_usage, top_directories
+from luminos_lib.watch import watch_loop
 from luminos_lib.report import format_report


 def _progress(label):
-    """Return (on_file, finish) for in-place per-file progress on stderr."""
+    """Return (on_file, finish) for in-place per-file progress on stderr.
+
+    on_file(path) overwrites the current line with the label and truncated path.
+    finish() finalises the line with a newline.
+    """
    cols = shutil.get_terminal_size((80, 20)).columns
    prefix = f"  [scan] {label}... "
    available = max(cols - len(prefix), 10)
@ -38,7 +43,7 @@ def _progress(label):


 def scan(target, depth=3, show_hidden=False, exclude=None):
-    """Run the base scan and return the report dict consumed by the AI pass."""
+    """Run all analyses on the target directory and return a report dict."""
    report = {}

    exclude = exclude or []
@ -84,8 +89,7 @@ def main():
    parser = argparse.ArgumentParser(
        prog="luminos",
        description="Luminos — file system intelligence tool. "
-                    "Runs an agentic Claude investigation against a directory "
-                    "and produces a reconnaissance report.",
+                    "Explores a directory and produces a reconnaissance report.",
    )
    parser.add_argument("target", nargs="?", help="Target directory to analyze")
    parser.add_argument("-d", "--depth", type=int, default=3,
@ -96,10 +100,17 @@ def main():
                        help="Output report as JSON")
    parser.add_argument("-o", "--output", metavar="FILE",
                        help="Write report to a file")
+    parser.add_argument("--ai", action="store_true",
+                        help="Use Claude AI to analyze directory purpose "
+                             "(requires ANTHROPIC_API_KEY)")
+    parser.add_argument("--watch", action="store_true",
+                        help="Re-scan every 30 seconds and show diffs")
    parser.add_argument("--clear-cache", action="store_true",
-                        help="Clear the investigation cache (/tmp/luminos/)")
+                        help="Clear the AI investigation cache (/tmp/luminos/)")
    parser.add_argument("--fresh", action="store_true",
-                        help="Force a new investigation (ignore cached results)")
+                        help="Force a new AI investigation (ignore cached results)")
+    parser.add_argument("--install-extras", action="store_true",
+                        help="Show status of optional AI dependencies")
    parser.add_argument("-x", "--exclude", metavar="DIR", action="append",
                        default=[],
                        help="Exclude a directory name from scan and analysis "
@ -107,8 +118,15 @@ def main():

    args = parser.parse_args()

+    # --install-extras: show package status and exit
+    if args.install_extras:
+        from luminos_lib.capabilities import print_status
+        print_status()
+        return
+
+    # --clear-cache: wipe /tmp/luminos/ (lazy import to avoid AI deps)
    if args.clear_cache:
-        from luminos_lib.cache import clear_cache
+        from luminos_lib.capabilities import clear_cache
        clear_cache()
        if not args.target:
            return
@ -122,24 +140,25 @@ def main():
              file=sys.stderr)
        sys.exit(1)

-    if not os.environ.get("ANTHROPIC_API_KEY"):
-        print("luminos requires ANTHROPIC_API_KEY. "
-              "Set it with: export ANTHROPIC_API_KEY=your-key-here",
-              file=sys.stderr)
-        sys.exit(0)
-
    if args.exclude:
        print(f"  [scan] Excluding: {', '.join(args.exclude)}", file=sys.stderr)

+    if args.watch:
+        watch_loop(target, depth=args.depth, show_hidden=args.all,
+                   json_output=args.json_output)
+        return
+
    report = scan(target, depth=args.depth, show_hidden=args.all,
                  exclude=args.exclude)

-    from luminos_lib.ai import analyze_directory
-    brief, detailed, flags = analyze_directory(
-        report, target, fresh=args.fresh, exclude=args.exclude)
-    report["ai_brief"] = brief
-    report["ai_detailed"] = detailed
-    report["flags"] = flags
+    flags = []
+    if args.ai:
+        from luminos_lib.ai import analyze_directory
+        brief, detailed, flags = analyze_directory(
+            report, target, fresh=args.fresh, exclude=args.exclude)
+        report["ai_brief"] = brief
+        report["ai_detailed"] = detailed
+        report["flags"] = flags

    if args.json_output:
        output = json.dumps(report, indent=2, default=str)
--- a/luminos_lib/ai.py
+++ b/luminos_lib/ai.py
--- a/luminos_lib/cache.py
+++ b/luminos_lib/cache.py
@ -3,8 +3,6 @@
 import hashlib
 import json
 import os
-import shutil
-import sys
 import uuid
 from datetime import datetime, timezone

@ -12,16 +10,6 @@ CACHE_ROOT = "/tmp/luminos"
 INVESTIGATIONS_PATH = os.path.join(CACHE_ROOT, "investigations.json")


-def clear_cache():
-    """Remove all investigation caches under CACHE_ROOT."""
-    if os.path.isdir(CACHE_ROOT):
-        shutil.rmtree(CACHE_ROOT)
-        print(f"Cleared cache: {CACHE_ROOT}", file=sys.stderr)
-    else:
-        print(f"No cache to clear ({CACHE_ROOT} does not exist).",
-              file=sys.stderr)
-
-
 def _sha256_path(path):
    """Return a hex SHA-256 of a path string, used as cache key."""
    return hashlib.sha256(path.encode("utf-8")).hexdigest()
--- a/luminos_lib/capabilities.py
+++ b/luminos_lib/capabilities.py
@ -0,0 +1,139 @@
+"""Capability detection and cache management for optional luminos dependencies.
+
+The base tool requires zero external packages. The --ai flag requires:
+  - anthropic       (API transport)
+  - tree-sitter     (AST parsing via parse_structure tool)
+  - python-magic    (improved file classification)
+
+This module is the single place that knows about optional dependencies.
+"""
+
+_PACKAGES = {
+    "anthropic": {
+        "import": "anthropic",
+        "pip": "anthropic",
+        "purpose": "Claude API client (streaming, retries, token counting)",
+    },
+    "tree-sitter": {
+        "import": "tree_sitter",
+        "pip": ("tree-sitter tree-sitter-python tree-sitter-javascript "
+                "tree-sitter-rust tree-sitter-go"),
+        "purpose": "AST parsing for parse_structure tool",
+    },
+    "python-magic": {
+        "import": "magic",
+        "pip": "python-magic",
+        "purpose": "Improved file type detection via libmagic",
+    },
+}
+
+
+def _check_package(import_name):
+    """Return True if a package is importable."""
+    try:
+        __import__(import_name)
+        return True
+    except ImportError:
+        return False
+
+
+ANTHROPIC_AVAILABLE = _check_package("anthropic")
+TREE_SITTER_AVAILABLE = _check_package("tree_sitter")
+MAGIC_AVAILABLE = _check_package("magic")
+
+
+def check_ai_dependencies():
+    """Check that all --ai dependencies are installed.
+
+    If any are missing, prints a clear error with the pip install command
+    and returns False. Returns True if everything is available.
+    """
+    missing = []
+    for name, info in _PACKAGES.items():
+        if not _check_package(info["import"]):
+            missing.append(name)
+
+    # Check tree-sitter grammars before returning so they are reported too
+    grammar_missing = []
+    if "tree-sitter" not in missing:
+        for grammar in ["tree_sitter_python", "tree_sitter_javascript",
+                        "tree_sitter_rust", "tree_sitter_go"]:
+            if not _check_package(grammar):
+                grammar_missing.append(grammar.replace("_", "-"))
+
+    if not missing and not grammar_missing:
+        return True
+
+    import sys
+    print("\nluminos --ai requires missing packages:", file=sys.stderr)
+    for name in missing:
+        print(f"  \u2717 {name}", file=sys.stderr)
+    for name in grammar_missing:
+        print(f"  \u2717 {name}", file=sys.stderr)
+
+    # Build pip install command
+    pip_parts = []
+    for name in missing:
+        pip_parts.append(_PACKAGES[name]["pip"])
+    for name in grammar_missing:
+        pip_parts.append(name)
+    pip_cmd = " \\\n                ".join(pip_parts)
+
+    print(f"\n  Install with:\n    pip install {pip_cmd}\n", file=sys.stderr)
+    return False
+
+
+def print_status():
+    """Print the install status of all optional packages."""
+    print("\nLuminos optional dependencies:\n")
+
+    for name, info in _PACKAGES.items():
+        available = _check_package(info["import"])
+        mark = "\u2713" if available else "\u2717"
+        status = "installed" if available else "missing"
+        print(f"  {mark} {name:20s} {status:10s}  {info['purpose']}")
+
+    # Grammar packages
+    grammars = {
+        "tree-sitter-python": "tree_sitter_python",
+        "tree-sitter-javascript": "tree_sitter_javascript",
+        "tree-sitter-rust": "tree_sitter_rust",
+        "tree-sitter-go": "tree_sitter_go",
+    }
+    print()
+    for name, imp in grammars.items():
+        available = _check_package(imp)
+        mark = "\u2713" if available else "\u2717"
+        status = "installed" if available else "missing"
+        print(f"  {mark} {name:20s} {status:10s}  Language grammar")
+
+    # Full install command (deduplicated)
+    all_pkgs = []
+    seen = set()
+    for info in _PACKAGES.values():
+        for pkg in info["pip"].split():
+            if pkg not in seen:
+                all_pkgs.append(pkg)
+                seen.add(pkg)
+    for name in grammars:
+        if name not in seen:
+            all_pkgs.append(name)
+            seen.add(name)
+
+    print(f"\n  Install all with:\n    pip install {' '.join(all_pkgs)}\n")
+
+
+from luminos_lib.cache import CACHE_ROOT
+
+
+def clear_cache():
+    """Remove all investigation caches under /tmp/luminos/."""
+    import shutil
+    import os
+    import sys
+    if os.path.isdir(CACHE_ROOT):
+        shutil.rmtree(CACHE_ROOT)
+        print(f"Cleared cache: {CACHE_ROOT}", file=sys.stderr)
+    else:
+        print(f"No cache to clear ({CACHE_ROOT} does not exist).",
+              file=sys.stderr)
--- a/luminos_lib/prompts.py
+++ b/luminos_lib/prompts.py
@ -209,84 +209,3 @@ Call `submit_survey` exactly once with:
 You have at most 3 turns. In almost all cases you should call
 `submit_survey` on your first turn. Use a second turn only if you
 genuinely need to think before committing."""
-
-_PLANNING_SYSTEM_PROMPT = """\
-You are an investigation planner. Your job is to decide where to invest
-investigative depth across a directory tree, BEFORE the per-directory
-investigation begins. You allocate turns (agent reasoning steps) to
-directories based on their likely complexity and importance.
-
-## Your Task
-Create an investigation plan for the target: {target}
-
-## Inputs
-
-Survey assessment (from a prior reconnaissance pass):
-{survey_context}
-
-Full directory tree:
-{tree_text}
-
-File signals:
-{file_signals}
-
-Total directories to investigate: {dir_count}
-Directories already cached (will be skipped): {cached_dirs}
-
-## How to Allocate
-
-Classify each directory into one of three tiers:
-
-**priority** (15-20 turns): directories that are likely complex, central,
-or important. Signs: many source files, core application logic, complex
-configuration, entry points, schemas, migrations. These deserve deep
-investigation with multiple tool calls per file.
-
-**shallow** (5 turns): directories that are simple, peripheral, or
-predictable. Signs: few files, generated/vendored content, test fixtures,
-static assets, documentation-only dirs. A quick pass is sufficient.
-
-**skip** (0 turns): directories that should be skipped entirely. Signs:
-build output, dependency caches, vendored code, generated artifacts. The
-investigation would waste turns and produce noise.
-
-Directories you do not mention go into a default tier ({default_turns}
-turns). You do NOT need to list every directory. Focus on the ones where
-the default allocation would clearly be wrong (too many turns for a
-trivial dir, or too few for a complex one).
-
-## Investigation Order
-
-Choose one of these ordering strategies:
-
- **leaf-first**: deepest directories first, parents last. This is the
-  default and ensures parent directories always have child summaries
-  available. Best for most codebases.
-
- **priority-first**: priority directories before shallow ones, but
-  still leaf-first within each tier. Good when certain subtrees are
-  clearly more important and you want findings from them to inform
-  the rest of the investigation.
-
-Both strategies preserve the leaf-first invariant (children before
-parents) to ensure child summaries are available when investigating
-parent directories.
-
-## Budget
-
-The global turn budget is {global_budget} turns across all directories.
-Your allocations should roughly respect this budget, though small
-overages are fine. If you allocate significantly more than the budget,
-the orchestrator will cap individual directories.
-
-## Notes Field
-
-Use `notes` to communicate anything the per-directory agents should
-know that the survey did not capture. Cross-cutting concerns, suspected
-relationships between directories, or investigation priorities. Leave
-empty if you have nothing to add beyond the tier assignments.
-
-## Output
-Call `submit_plan` exactly once. You have at most 3 turns, but you
-should almost always submit on your first turn. Use additional turns
-only if you genuinely need to reason through a complex target layout."""
--- a/luminos_lib/watch.py
+++ b/luminos_lib/watch.py
@ -0,0 +1,108 @@
+"""Watch mode — re-scan and show diffs every 30 seconds."""
+
+import json
+import sys
+import time
+import os
+
+
+def _snapshot(classified_files):
+    """Create a snapshot dict: path -> (size, category)."""
+    return {f["path"]: (f["size"], f["category"]) for f in classified_files}
+
+
+def _diff_snapshots(old, new):
+    """Compare two snapshots and return changes."""
+    old_paths = set(old.keys())
+    new_paths = set(new.keys())
+
+    added = new_paths - old_paths
+    removed = old_paths - new_paths
+    common = old_paths & new_paths
+
+    size_changes = []
+    for p in common:
+        old_size = old[p][0]
+        new_size = new[p][0]
+        if old_size != new_size:
+            size_changes.append((p, old_size, new_size))
+
+    return added, removed, size_changes
+
+
+def _human_size(nbytes):
+    for unit in ("B", "KB", "MB", "GB"):
+        if nbytes < 1024:
+            if unit == "B":
+                return f"{nbytes} {unit}"
+            return f"{nbytes:.1f} {unit}"
+        nbytes /= 1024
+    return f"{nbytes:.1f} TB"
+
+
+def watch_loop(target, depth=3, show_hidden=False, json_output=False):
+    """Run scan in a loop, printing diffs between runs."""
+    # Import here to avoid circular import
+    from luminos_lib.filetypes import classify_files
+
+    print(f"[luminos] Watching {target} (Ctrl+C to stop)")
+    print("[luminos] Scanning every 30 seconds...")
+    print()
+
+    prev_snapshot = None
+
+    try:
+        while True:
+            classified = classify_files(target, show_hidden=show_hidden)
+            current = _snapshot(classified)
+
+            if prev_snapshot is not None:
+                added, removed, size_changes = _diff_snapshots(
+                    prev_snapshot, current
+                )
+
+                if not added and not removed and not size_changes:
+                    ts = time.strftime("%H:%M:%S")
+                    print(f"[{ts}] No changes detected.")
+                else:
+                    ts = time.strftime("%H:%M:%S")
+                    print(f"[{ts}] Changes detected:")
+
+                    if json_output:
+                        diff = {
+                            "timestamp": ts,
+                            "added": sorted(added),
+                            "removed": sorted(removed),
+                            "size_changes": [
+                                {"path": p, "old_size": o, "new_size": n}
+                                for p, o, n in size_changes
+                            ],
+                        }
+                        print(json.dumps(diff, indent=2))
+                    else:
+                        for p in sorted(added):
+                            name = os.path.basename(p)
+                            print(f"  + NEW  {name}")
+                            print(f"         {p}")
+                        for p in sorted(removed):
+                            name = os.path.basename(p)
+                            print(f"  - DEL  {name}")
+                            print(f"         {p}")
+                        for p, old_s, new_s in size_changes:
+                            name = os.path.basename(p)
+                            delta = new_s - old_s
+                            sign = "+" if delta > 0 else "-"
+                            print(f"  ~ SIZE {name}  "
+                                  f"{_human_size(old_s)} -> {_human_size(new_s)} "
+                                  f"({sign}{_human_size(abs(delta))})")
+                    print()
+            else:
+                print(f"[{time.strftime('%H:%M:%S')}] "
+                      f"Initial scan complete: {len(current)} files indexed.")
+                print()
+
+            prev_snapshot = current
+            time.sleep(30)
+
+    except KeyboardInterrupt:
+        print("\n[luminos] Watch stopped.")
--- a/requirements.txt
+++ b/requirements.txt
@ -1,7 +0,0 @@
-anthropic
-python-magic
-tree-sitter
-tree-sitter-python
-tree-sitter-javascript
-tree-sitter-rust
-tree-sitter-go
--- a/setup_env.sh
+++ b/setup_env.sh
@ -2,7 +2,6 @@
 set -euo pipefail

 VENV_DIR="$HOME/luminos-env"
-SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"

 if [ -d "$VENV_DIR" ]; then
    echo "venv already exists at $VENV_DIR"
@ -14,19 +13,17 @@ fi
 echo "Activating venv..."
 source "$VENV_DIR/bin/activate"

-echo "Installing packages from requirements.txt..."
-pip install -r "$SCRIPT_DIR/requirements.txt"
+echo "Installing packages..."
+pip install anthropic tree-sitter tree-sitter-python \
+            tree-sitter-javascript tree-sitter-rust \
+            tree-sitter-go python-magic

 echo ""
 echo "Done. To activate the venv in future sessions:"
 echo ""
 echo "  source ~/luminos-env/bin/activate"
 echo ""
-echo "Set your Anthropic API key:"
+echo "Then run luminos as usual:"
 echo ""
-echo "  export ANTHROPIC_API_KEY=your-key-here"
-echo ""
-echo "Then run luminos:"
-echo ""
-echo "  python3 luminos.py <target>"
+echo "  python3 luminos.py --ai <target>"
 echo ""
--- a/tests/test_ai_pure.py
+++ b/tests/test_ai_pure.py
--- a/tests/test_capabilities.py
+++ b/tests/test_capabilities.py
@ -0,0 +1,37 @@
+"""Tests for luminos_lib/capabilities.py"""
+
+import unittest
+from unittest.mock import patch
+
+from luminos_lib.capabilities import _check_package
+
+
+class TestCheckPackage(unittest.TestCase):
+    def test_importable_package(self):
+        # json is always available in stdlib
+        self.assertTrue(_check_package("json"))
+
+    def test_missing_package(self):
+        self.assertFalse(_check_package("_luminos_nonexistent_package_xyz"))
+
+    def test_importable_returns_true(self):
+        # Patching builtins.__import__ to force success is brittle (it would
+        # also intercept unrelated imports), so check a second real stdlib
+        # module instead.
+        self.assertTrue(_check_package("os"))
+
+    def test_import_error_returns_false(self):
+        import builtins
+        original_import = builtins.__import__
+
+        def fake_import(name, *args, **kwargs):
+            if name == "_fake_missing_module":
+                raise ImportError("No module named '_fake_missing_module'")
+            return original_import(name, *args, **kwargs)
+
+        with patch("builtins.__import__", side_effect=fake_import):
+            self.assertFalse(_check_package("_fake_missing_module"))
+
+
+if __name__ == "__main__":
+    unittest.main()