Compare commits
No commits in common. "main" and "docs/issue-53-onboarding-internals" have entirely different histories.
main
...
docs/issue
16 changed files with 988 additions and 2700 deletions
153
CLAUDE.md
@@ -10,21 +10,18 @@

## Current Project State

- **Phase:** Active development — Phases 1, 2, 2.5, 2.6, 2.7, 2.8, 3 complete. Next: fix #78 (synthesis persistence), #79 (stale cache), then reassess Phase 4+ (#40).
- **Last worked on:** 2026-04-12
- **Last commit:** fix(ai): match target root by basename in _apply_plan() (#76)
- **Phase:** Active development — Phase 2 (survey pass) and Phase 2.5 (context budget) complete; Phase 3 (investigation planning) ready to start
- **Last worked on:** 2026-04-06
- **Last commit:** merge: feat/issue-44-context-budget (#44)
- **Blocking:** None
- **Test count:** 262 passing

---

## Project Overview

Luminos is a file system intelligence tool. Point it at a directory and it
runs a multi-pass agentic investigation via the Claude API: a survey pass,
isolated dir-loop agents per directory, and a synthesis pass that produces a
project-level verdict with severity-ranked flags. A lightweight base scan
runs first to feed the agent its initial picture of the target.
Luminos is a file system intelligence tool — a zero-dependency Python CLI that
scans a directory and produces a reconnaissance report. With `--ai` it runs a
multi-pass agentic investigation via the Claude API.

---

@@ -35,7 +32,8 @@ runs first to feed the agent its initial picture of the target.
| `luminos.py` | Entry point — arg parsing, scan(), main() |
| `luminos_lib/ai.py` | Multi-pass agentic analysis via Claude API |
| `luminos_lib/ast_parser.py` | tree-sitter code structure parsing |
| `luminos_lib/cache.py` | Investigation cache management (incl. clear_cache) |
| `luminos_lib/cache.py` | Investigation cache management |
| `luminos_lib/capabilities.py` | Optional dep detection, cache cleanup |
| `luminos_lib/code.py` | Language detection, LOC counting |
| `luminos_lib/disk.py` | Per-directory disk usage |
| `luminos_lib/filetypes.py` | File classification (7 categories) |
@@ -43,6 +41,7 @@ runs first to feed the agent its initial picture of the target.
| `luminos_lib/recency.py` | Recently modified files |
| `luminos_lib/report.py` | Terminal report formatter |
| `luminos_lib/tree.py` | Directory tree visualization |
| `luminos_lib/watch.py` | Watch mode with snapshot diffing |

Details: wiki — [Architecture](https://forgejo.labbity.unbiasedgeek.com/archeious/luminos/wiki/Architecture) | [Development Guide](https://forgejo.labbity.unbiasedgeek.com/archeious/luminos/wiki/DevelopmentGuide)

@@ -50,52 +49,72 @@ Details: wiki — [Architecture](https://forgejo.labbity.unbiasedgeek.com/archei

## Key Constraints

- **AI investigation is the product.** The base scan exists to feed the agent.
There is no `--ai` flag and no `--no-ai` mode. AI runs unconditionally on
every invocation.
- **Anthropic API key is required.** If `ANTHROPIC_API_KEY` is unset, luminos
exits cleanly (exit 0) with a one-line hint instead of running.
- **Dependencies installed via `requirements.txt`.** anthropic, tree-sitter +
grammars, and python-magic are normal pip dependencies, not lazy imports.
`setup_env.sh` creates a venv and installs them.
- **Base tool: no pip dependencies.** tree, filetypes, code, disk, recency,
report, watch use only stdlib and GNU coreutils. Must always work on bare Python 3.
- **AI deps are lazy.** `anthropic`, `tree-sitter`, `python-magic` imported only
when `--ai` is used. Missing packages produce a clear install error.
- **Subprocess for OS tools.** LOC counting, file detection, disk usage, and
recency shell out to GNU coreutils. Do not reimplement in pure Python.
- **Graceful degradation everywhere.** Permission denied, subprocess timeouts,
individual dir-loop failures — all handled without crashing the run.
missing API key — all handled without crashing.
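
The "Subprocess for OS tools" and "Graceful degradation" constraints above could be combined roughly as follows. This is a minimal sketch under stated assumptions, not the project's code: `run_tool` and its default timeout are hypothetical names chosen for illustration.

```python
import subprocess
from typing import Optional, Sequence


def run_tool(cmd: Sequence[str], timeout: float = 10.0) -> Optional[str]:
    """Run an external tool and return its stdout, or None on any failure.

    Hypothetical helper: the name, signature, and timeout default are
    assumptions for illustration, not taken from the Luminos source.
    """
    try:
        result = subprocess.run(
            list(cmd),
            capture_output=True,
            text=True,
            timeout=timeout,
            check=True,  # non-zero exit raises CalledProcessError
        )
    except OSError:
        # Covers a missing binary (FileNotFoundError) and PermissionError.
        return None
    except subprocess.SubprocessError:
        # Covers TimeoutExpired and CalledProcessError.
        return None
    return result.stdout
```

A caller would treat `None` as "this data point is unavailable" and continue with the rest of the scan, for example `run_tool(["wc", "-l", path])`.
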

---

## Running Luminos

```bash
# Activate the venv (one-time setup: ./setup_env.sh)
source ~/luminos-env/bin/activate
export ANTHROPIC_API_KEY=your-key-here

# Run an investigation
# Base scan
python3 luminos.py <target>

# With AI analysis (requires ANTHROPIC_API_KEY)
source ~/luminos-env/bin/activate
python3 luminos.py --ai <target>

# Common flags
python3 luminos.py -d 8 -a -x .git -x node_modules <target>
python3 luminos.py --json -o report.json <target>
python3 luminos.py --fresh <target>
python3 luminos.py --clear-cache
python3 luminos.py --watch <target>
python3 luminos.py --install-extras
```

---

## Project-Specific Test Notes
## Development Workflow

Run tests with `python3 -m unittest discover -s tests/`. Modules exempt from
unit testing: `ast_parser.py` (requires tree-sitter grammars at import time)
and `prompts.py` (string templates only). `ai.py` is partially covered:
end-to-end loops require a live Anthropic API and stay exempt, but the pure
helpers (`_filter_dir_tools`, `_format_survey_block`, `_path_is_safe`,
`_should_skip_dir`, `_block_to_dict`, `_flush_partial_dir_entry`, etc.) are
covered by `tests/test_ai_pure.py`.
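
As an illustration of why pure helpers are testable without a live API, here is a minimal `unittest` sketch. The real `_path_is_safe` signature is not shown in this document, so `path_is_safe` below is a hypothetical stand-in written only for this example.

```python
import os
import tempfile
import unittest


def path_is_safe(root: str, candidate: str) -> bool:
    """Return True if candidate resolves to a location inside root.

    Hypothetical stand-in for illustration; not the real `_path_is_safe`.
    """
    root_real = os.path.realpath(root)
    cand_real = os.path.realpath(os.path.join(root_real, candidate))
    return os.path.commonpath([root_real, cand_real]) == root_real


class PathIsSafeTest(unittest.TestCase):
    def setUp(self):
        self._tmp = tempfile.TemporaryDirectory()
        self.root = self._tmp.name

    def tearDown(self):
        self._tmp.cleanup()

    def test_child_path_is_allowed(self):
        self.assertTrue(path_is_safe(self.root, "src/main.py"))

    def test_parent_traversal_is_rejected(self):
        self.assertFalse(path_is_safe(self.root, "../outside.txt"))

    def test_absolute_path_outside_root_is_rejected(self):
        self.assertFalse(path_is_safe(self.root, "/etc/passwd"))
```

Saved under `tests/`, a file like this would be picked up by `python3 -m unittest discover -s tests/` with no network access or API key.
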
- **Issue-driven work** — all work must be tied to a Forgejo issue. If the
user names a specific issue, use it. If they describe work without an issue
number, search open issues for a match. If no issue exists, gather enough
context to create one before starting work. Branches and commits should
reference the issue number.
- **Explain then build** — articulate the approach in a few bullets before
writing code. Surface assumptions early.
- **Atomic commits** — each commit is one logical change.
- **Test coverage required** — every change to a testable module must include
or update tests in `tests/`. Run with `python3 -m unittest discover -s tests/`.
All tests must pass before merging. Modules exempt from unit testing:
`ai.py` (requires live API), `ast_parser.py` (requires tree-sitter),
`watch.py` (stateful events), `prompts.py` (string templates only).
- **Shiny object capture** — new ideas go to PLAN.md (Raw Thoughts) or a
Forgejo issue, not into current work.

(Development workflow, branching discipline, and session protocols live in
`~/.claude/CLAUDE.md`.)
---

## Branching Discipline

- **Always branch** — no direct commits to main, ever
- **Branch before first change** — create the branch before touching any files
- **Naming:** `feat/`, `fix/`, `refactor/`, `chore/` + short description
- **One branch, one concern** — don't mix unrelated changes
- **Two-branch maximum** — never have more than 2 unmerged branches
- **Merge with `--no-ff`** — preserves branch history in the log
- **Delete after merge** — `git branch -d <branch>` immediately after merge
- **Close the underlying issue manually** — after merging, `PATCH` the
referenced issue to `state: closed` via the Forgejo API. Do not rely
on `Closes #N` keyword auto-close — it has not worked reliably in
this Forgejo instance, leaving issues stale while their PRs are
merged. Manual close is one extra API call and is part of the merge
step, not optional.
- **Push after commits** — keep Forgejo in sync after each commit or logical batch
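
The manual issue close described in the list above is a single `PATCH` call. The sketch below builds that request against the Forgejo (Gitea-compatible) `/api/v1/repos/{owner}/{repo}/issues/{index}` endpoint without sending it. The function name and the way the token is supplied are assumptions for illustration.

```python
import json
import urllib.request


def build_close_issue_request(base_url, owner, repo, index, token):
    """Build (but do not send) a PATCH request that closes one issue.

    Hypothetical helper: the name and parameters are illustrative only.
    """
    url = f"{base_url.rstrip('/')}/api/v1/repos/{owner}/{repo}/issues/{index}"
    body = json.dumps({"state": "closed"}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        method="PATCH",
        headers={
            "Authorization": f"token {token}",
            "Content-Type": "application/json",
        },
    )
```

Passing the returned object to `urllib.request.urlopen()` would perform the call, with the token read from wherever the session keeps its credentials.
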

---

@@ -107,17 +126,71 @@ covered by `tests/test_ai_pure.py`.
| Classes | PascalCase | `_TokenTracker`, `_CacheManager` |
| Constants | UPPER_SNAKE_CASE | `MAX_CONTEXT`, `CACHE_ROOT` |
| Module files | snake_case | `ast_parser.py` |
| CLI flags | kebab-case | `--clear-cache`, `--fresh` |
| CLI flags | kebab-case | `--clear-cache`, `--install-extras` |
| Private functions | leading underscore | `_run_synthesis` |

---

## Documentation Workflow

- **Wiki location:** `docs/wiki/` — local git checkout of `luminos.wiki.git`
- **Clone URL:** `ssh://git@forgejo-claude/archeious/luminos.wiki.git`
- **Session startup:** clone if missing, `git -C docs/wiki pull` if present
- **All reads and writes** happen on local files in `docs/wiki/`. Use Read,
Edit, Write, Grep, Glob — never the Forgejo web API for wiki content.
- **Naming:** CamelCase slugs (`Architecture.md`, `DevelopmentGuide.md`).
Display name comes from the H1 heading inside the file.
- **Commits:** direct to main branch. Batch logically — commit when finishing
a round of updates, not after every file.
- **Push:** after each commit batch.

---

## ADHD Session Protocols

> **MANDATORY — follow literally, every session, no exceptions.**

1. **Session Start Ritual** — Ensure `docs/wiki/` is cloned and current.
Fetch open issues from Forgejo (`archeious/luminos`) and present them as
suggested tasks. Ask: *"What's the one thing we're shipping?"* Once the
user answers, match to an existing issue or create one before starting
work. Do NOT summarize project state, recap history, or do any other work
before asking this question.

2. **Dopamine-Friendly Task Sizing** — break work into 5–15 minute tasks with
clear, visible outputs. Each task should have a moment of completion.

3. **Focus Guard** — classify incoming requests as on-topic / adjacent /
off-topic. Name it out loud before acting. Adjacent work goes to a new
issue; off-topic work gets deferred.

4. **Shiny Object Capture** — when a new idea surfaces mid-session, write it
to PLAN.md (Raw Thoughts) or open a Forgejo issue, then return to the
current task. Do not context-switch.

5. **Breadcrumb Protocol** — after each completed task, output:
`Done: <what was completed>. Next: <what comes next>.`
This re-orients after any interruption.

6. **Session End Protocol** — before closing, state the exact pickup point for
the next session: branch name, file, what was in progress, and the
recommended first action next time.

---

## Session Protocols

- **"externalize"** → read and follow `docs/externalize.md`
- **"wrap up" / "end session"** → read and follow `docs/wrap-up.md`

---

## Session Log

| # | Date | Summary |
|---|---|---|
| 8 | 2026-04-07 | Closed #54 — added confidence/confidence_reason to write_cache tool schema description; Phase 1 milestone now 4/4 complete |
| 9 | 2026-04-11 | Scope shift (#64) + ALL Phase 3 prereqs: dir loop refactor (#57), tool registry consolidation (#56), pure-helper test coverage waves 1+2 (#55, #70), leaf-first contract docs (#72). 6 PRs, 70 net new tests (164→234), Phase 2.6/2.7/2.8 milestones complete |
| 10 | 2026-04-12 | Phase 3 shipped: planning pass, dynamic turn allocation, quality instrumentation (#8, #9, #10, #11, #74). Fixed root-path matching bug (#76). Smoke tests on luminos + homelab IaC. Filed #78 (synthesis persistence), #79 (stale cache). 3 PRs, 28 new tests (234→262) |
| 2 | 2026-04-06 | Forgejo milestones (9), issues (36), project board, Gitea MCP installed and configured globally |
| 3 | 2026-04-06 | Phase 1 complete (#1–#3), MCP backend architecture design (Part 10, Phase 3.5), issues #38–#40 opened |
| 4 | 2026-04-06 | Phase 2 + 2.5 complete (#4–#7, #42, #44), filetype classifier rebuild, context budget metric fix, 8 PRs merged, issues #46/#48/#49/#51 opened |

Full log: wiki — [Session Retrospectives](https://forgejo.labbity.unbiasedgeek.com/archeious/luminos/wiki/SessionRetrospectives)

202
LICENSE
@@ -1,202 +0,0 @@

Apache License
Version 2.0, January 2004
http://www.apache.org/licenses/

TERMS AND CONDITIONS FOR USE, REPRODUCTION, AND DISTRIBUTION

1. Definitions.

"License" shall mean the terms and conditions for use, reproduction,
and distribution as defined by Sections 1 through 9 of this document.

"Licensor" shall mean the copyright owner or entity authorized by
the copyright owner that is granting the License.

"Legal Entity" shall mean the union of the acting entity and all
other entities that control, are controlled by, or are under common
control with that entity. For the purposes of this definition,
"control" means (i) the power, direct or indirect, to cause the
direction or management of such entity, whether by contract or
otherwise, or (ii) ownership of fifty percent (50%) or more of the
outstanding shares, or (iii) beneficial ownership of such entity.

"You" (or "Your") shall mean an individual or Legal Entity
exercising permissions granted by this License.

"Source" form shall mean the preferred form for making modifications,
including but not limited to software source code, documentation
source, and configuration files.

"Object" form shall mean any form resulting from mechanical
transformation or translation of a Source form, including but
not limited to compiled object code, generated documentation,
and conversions to other media types.

"Work" shall mean the work of authorship, whether in Source or
Object form, made available under the License, as indicated by a
copyright notice that is included in or attached to the work
(an example is provided in the Appendix below).

"Derivative Works" shall mean any work, whether in Source or Object
form, that is based on (or derived from) the Work and for which the
editorial revisions, annotations, elaborations, or other modifications
represent, as a whole, an original work of authorship. For the purposes
of this License, Derivative Works shall not include works that remain
separable from, or merely link (or bind by name) to the interfaces of,
the Work and Derivative Works thereof.

"Contribution" shall mean any work of authorship, including
the original version of the Work and any modifications or additions
to that Work or Derivative Works thereof, that is intentionally
submitted to Licensor for inclusion in the Work by the copyright owner
or by an individual or Legal Entity authorized to submit on behalf of
the copyright owner. For the purposes of this definition, "submitted"
means any form of electronic, verbal, or written communication sent
to the Licensor or its representatives, including but not limited to
communication on electronic mailing lists, source code control systems,
and issue tracking systems that are managed by, or on behalf of, the
Licensor for the purpose of discussing and improving the Work, but
excluding communication that is conspicuously marked or otherwise
designated in writing by the copyright owner as "Not a Contribution."

"Contributor" shall mean Licensor and any individual or Legal Entity
on behalf of whom a Contribution has been received by Licensor and
subsequently incorporated within the Work.

2. Grant of Copyright License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
copyright license to reproduce, prepare Derivative Works of,
publicly display, publicly perform, sublicense, and distribute the
Work and such Derivative Works in Source or Object form.

3. Grant of Patent License. Subject to the terms and conditions of
this License, each Contributor hereby grants to You a perpetual,
worldwide, non-exclusive, no-charge, royalty-free, irrevocable
(except as stated in this section) patent license to make, have made,
use, offer to sell, sell, import, and otherwise transfer the Work,
where such license applies only to those patent claims licensable
by such Contributor that are necessarily infringed by their
Contribution(s) alone or by combination of their Contribution(s)
with the Work to which such Contribution(s) was submitted. If You
institute patent litigation against any entity (including a
cross-claim or counterclaim in a lawsuit) alleging that the Work
or a Contribution incorporated within the Work constitutes direct
or contributory patent infringement, then any patent licenses
granted to You under this License for that Work shall terminate
as of the date such litigation is filed.

4. Redistribution. You may reproduce and distribute copies of the
Work or Derivative Works thereof in any medium, with or without
modifications, and in Source or Object form, provided that You
meet the following conditions:

(a) You must give any other recipients of the Work or
Derivative Works a copy of this License; and

(b) You must cause any modified files to carry prominent notices
stating that You changed the files; and

(c) You must retain, in the Source form of any Derivative Works
that You distribute, all copyright, patent, trademark, and
attribution notices from the Source form of the Work,
excluding those notices that do not pertain to any part of
the Derivative Works; and

(d) If the Work includes a "NOTICE" text file as part of its
distribution, then any Derivative Works that You distribute must
include a readable copy of the attribution notices contained
within such NOTICE file, excluding those notices that do not
pertain to any part of the Derivative Works, in at least one
of the following places: within a NOTICE text file distributed
as part of the Derivative Works; within the Source form or
documentation, if provided along with the Derivative Works; or,
within a display generated by the Derivative Works, if and
wherever such third-party notices normally appear. The contents
of the NOTICE file are for informational purposes only and
do not modify the License. You may add Your own attribution
notices within Derivative Works that You distribute, alongside
or as an addendum to the NOTICE text from the Work, provided
that such additional attribution notices cannot be construed
as modifying the License.

You may add Your own copyright statement to Your modifications and
may provide additional or different license terms and conditions
for use, reproduction, or distribution of Your modifications, or
for any such Derivative Works as a whole, provided Your use,
reproduction, and distribution of the Work otherwise complies with
the conditions stated in this License.

5. Submission of Contributions. Unless You explicitly state otherwise,
any Contribution intentionally submitted for inclusion in the Work
by You to the Licensor shall be under the terms and conditions of
this License, without any additional terms or conditions.
Notwithstanding the above, nothing herein shall supersede or modify
the terms of any separate license agreement you may have executed
with Licensor regarding such Contributions.

6. Trademarks. This License does not grant permission to use the trade
names, trademarks, service marks, or product names of the Licensor,
except as required for reasonable and customary use in describing the
origin of the Work and reproducing the content of the NOTICE file.

7. Disclaimer of Warranty. Unless required by applicable law or
agreed to in writing, Licensor provides the Work (and each
Contributor provides its Contributions) on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or
implied, including, without limitation, any warranties or conditions
of TITLE, NON-INFRINGEMENT, MERCHANTABILITY, or FITNESS FOR A
PARTICULAR PURPOSE. You are solely responsible for determining the
appropriateness of using or redistributing the Work and assume any
risks associated with Your exercise of permissions under this License.

8. Limitation of Liability. In no event and under no legal theory,
whether in tort (including negligence), contract, or otherwise,
unless required by applicable law (such as deliberate and grossly
negligent acts) or agreed to in writing, shall any Contributor be
liable to You for damages, including any direct, indirect, special,
incidental, or consequential damages of any character arising as a
result of this License or out of the use or inability to use the
Work (including but not limited to damages for loss of goodwill,
work stoppage, computer failure or malfunction, or any and all
other commercial damages or losses), even if such Contributor
has been advised of the possibility of such damages.

9. Accepting Warranty or Additional Liability. While redistributing
the Work or Derivative Works thereof, You may choose to offer,
and charge a fee for, acceptance of support, warranty, indemnity,
or other liability obligations and/or rights consistent with this
License. However, in accepting such obligations, You may act only
on Your own behalf and on Your sole responsibility, not on behalf
of any other Contributor, and only if You agree to indemnify,
defend, and hold each Contributor harmless for any liability
incurred by, or claims asserted against, such Contributor by reason
of your accepting any such warranty or additional liability.

END OF TERMS AND CONDITIONS

APPENDIX: How to apply the Apache License to your work.

To apply the Apache License to your work, attach the following
boilerplate notice, with the fields enclosed by brackets "[]"
replaced with your own identifying information. (Don't include
the brackets!) The text should be enclosed in the appropriate
comment syntax for the file format. We also recommend that a
file or class name and description of purpose be included on the
same "printed page" as the copyright notice for easier
identification within third-party archives.

Copyright [yyyy] [name of copyright owner]

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
67
PLAN.md
@@ -576,46 +576,6 @@ without 9 phases of rework.
output size, redundant reads) before picking a fix.
- Add token-usage instrumentation so regressions are visible.

### Phase 2.6 — Pre-Phase-3 cleanup (#54, #57) ✅ shipped
Two debts surfaced during the Session 5 documentation deep dive that
were paid before Phase 3 adds more state to the same code paths:

- **#54** — Phase 1 confidence-write path was dormant. Cache schema
accepted `confidence` and `low_confidence_entries()` worked, but no
prompt instructed the agent to set the field. Wired in Session 8.
- **#57** — `_run_dir_loop` was ~160 lines holding four conceptual
layers. Refactored in Session 9 into three focused helpers
(`_build_dir_loop_context`, `_flush_partial_dir_entry`,
`_handle_turn_response`) so Phase 3 dynamic turn allocation has a
thin coordinator to inject into.

### Phase 2.7 — Tool registration cleanup (#56) ✅ shipped
Adding a tool used to require updating `_TOOL_DISPATCH` and `_DIR_TOOLS`
in two separate places. Forgetting one half was silent. Replaced both
with a single `register_tool()` call per (tool, scope) in Session 9.
Phase 3.5 MCP backend will eventually replace this with dynamic
discovery, at which point `register_tool()` collapses to a one-line
forward.
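
The single-registration idea from Phase 2.7 can be sketched as one dictionary keyed by (tool, scope). Only the name `register_tool()` comes from the text above. Its parameters, the `_REGISTRY` dict, and the two lookup helpers are assumptions made for illustration.

```python
from typing import Callable, Dict, Tuple

# (tool name, scope) -> handler. Hypothetical structure for illustration.
_REGISTRY: Dict[Tuple[str, str], Callable[..., object]] = {}


def register_tool(name: str, scope: str, handler: Callable[..., object]) -> None:
    """Register one handler for one (tool, scope) pair, rejecting duplicates."""
    key = (name, scope)
    if key in _REGISTRY:
        raise ValueError(f"tool {name!r} already registered for scope {scope!r}")
    _REGISTRY[key] = handler


def tools_for_scope(scope: str) -> Dict[str, Callable[..., object]]:
    """Return every tool visible in a scope, derived from the same registry."""
    return {name: h for (name, s), h in _REGISTRY.items() if s == scope}


def dispatch(name: str, scope: str, **kwargs):
    """Look up and call a tool, failing loudly if it was never registered."""
    try:
        handler = _REGISTRY[(name, scope)]
    except KeyError:
        raise KeyError(f"no tool {name!r} registered for scope {scope!r}") from None
    return handler(**kwargs)
```

Because the dispatch table and the per-scope tool list are both read from one source, a tool cannot be present in one and silently missing from the other.
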

### Phase 2.8 — Pre-Phase-3 test coverage (#55, #70)
Safety nets for the helpers Phase 3 will pile state on top of. Two
waves:

- **#55** ✅ shipped — `tests/test_ai_pure.py` covers the easy
decision-logic helpers: `_filter_dir_tools`, `_format_survey_block`,
`_format_survey_signals`, `_default_survey`, `_should_skip_dir`,
`_path_is_safe`, `_block_to_dict`, plus `_flush_partial_dir_entry`
from #57. 45 tests added in Session 9.
- **#70** — second wave covering the highest-impact remaining helpers
that escaped the first sweep:
  - `_TokenTracker` — pins the load-bearing #44 fix
  (`last_input` vs cumulative for budget decisions)
  - `_synthesize_from_cache` — last-resort fallback that fires almost
  never in normal runs and is therefore the kind of code that silently
  rots
  - `_discover_directories` — leaves-first walk and skip-dir filter,
foundation of the cache reuse story
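
The `last_input` versus cumulative distinction mentioned for `_TokenTracker` can be illustrated with a small sketch. The class below is a hypothetical stand-in, not the real `_TokenTracker`. It assumes each turn resends the whole conversation, so the most recent turn's input size approximates the current context size.

```python
from dataclasses import dataclass


@dataclass
class TokenTracker:
    """Hypothetical sketch of a tracker; field names are assumptions."""

    budget: int
    last_input: int = 0
    total_input: int = 0
    total_output: int = 0

    def record(self, input_tokens: int, output_tokens: int) -> None:
        """Record the usage reported for one turn."""
        self.last_input = input_tokens
        self.total_input += input_tokens
        self.total_output += output_tokens

    def over_budget(self) -> bool:
        # Compare the latest turn's input, not the running sum: the sum
        # counts earlier turns again every time the history is resent.
        return self.last_input > self.budget
```

With a budget of 1000, two turns of 400 and 700 input tokens sum to 1100, yet the context itself is only 700 tokens, so the guard should not fire. A cumulative check would trip here while the per-turn check would not.
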

### Phase 3 — Investigation planning
- Planning pass after survey, before dir loops
- `submit_plan` tool
@@ -692,7 +652,7 @@ architecture. The migration pain is intentional and instructive.
extension sub-section or similar. Low priority, not blocking.
- **Revisit survey-skip thresholds (#46)** — `_SURVEY_MIN_FILES` and
`_SURVEY_MIN_DIRS` shipped with values from #7's example, no
empirical basis. Once luminos has been run on a variety of real
empirical basis. Once `--ai` has been run on a variety of real
targets, look at which runs skipped the survey vs ran it and decide
whether the thresholds (or the gate logic itself) need to change.

@@ -711,7 +671,7 @@ architecture. The migration pain is intentional and instructive.
| `luminos_lib/search.py` | **new** — web_search, fetch_url, package_lookup implementations |

No changes needed to: `tree.py`, `filetypes.py`, `code.py`, `recency.py`,
`disk.py`, `ast_parser.py`
`disk.py`, `capabilities.py`, `watch.py`, `ast_parser.py`

---

@@ -803,20 +763,20 @@ agent read, in what order, what it decided to skip). Storing the full message
history per directory would allow replaying or auditing an investigation. Cost:
storage. Benefit: debuggability, ability to resume investigations more faithfully.

**Live re-investigation mode**
A "watch" replacement: detect which directories changed, re-investigate only
those, and patch the cache entries. The synthesis would then re-run from the
updated cache without re-investigating unchanged directories. The original
non-AI watch mode was deleted in the #64 scope change because it conflicted
with the AI-first philosophy. If watch comes back, it comes back as this.
**Watch mode + incremental investigation**
Watch mode currently re-runs the full base scan on changes. For AI-augmented
watch mode: detect which directories changed, re-investigate only those, and
patch the cache entries. The synthesis would then re-run from the updated cache
without re-investigating unchanged directories.
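
One possible shape for "detect which directories changed" is a per-directory modification-time snapshot compared between runs. This is only a sketch under that assumption. Both function names are hypothetical, and the plan itself does not specify a detection method.

```python
import os
from typing import Dict, Set


def snapshot_dir_mtimes(root: str) -> Dict[str, float]:
    """Map each directory (relative to root) to its newest direct mtime."""
    snapshot: Dict[str, float] = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        mtimes = [os.stat(dirpath).st_mtime]
        for name in filenames:
            try:
                mtimes.append(os.stat(os.path.join(dirpath, name)).st_mtime)
            except OSError:
                continue  # file vanished or is unreadable; skip it
        snapshot[os.path.relpath(dirpath, root)] = max(mtimes)
    return snapshot


def changed_dirs(old: Dict[str, float], new: Dict[str, float]) -> Set[str]:
    """Directories that were added, removed, or modified between snapshots."""
    changed = {d for d, mtime in new.items() if old.get(d) != mtime}
    changed |= old.keys() - new.keys()  # directories that disappeared
    return changed
```

Only the directories returned by `changed_dirs` would need re-investigation. Cache entries for every other directory could be reused as they are.
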

**PDF and Office document readers**
**Optional PDF and Office document readers**
The data and documents domains would benefit from native content extraction:
- `pdfminer` or `pypdf` for PDF text extraction
- `openpyxl` for Excel schema and sheet enumeration
- `python-docx` for Word document text
These slot into `requirements.txt` like any other dependency. The agent
currently can only see filename and size for these formats.
These would be optional deps like the existing AI deps, gated behind
`--install-extras`. The agent currently can only see filename and size for
these formats.

**Security-focused analysis mode**
A `--security` flag could tune the investigation toward security-relevant
@@ -879,6 +839,11 @@ bad plan wastes turns on shallow directories and skips critical ones. The system
needs quality signals — probably the confidence scores aggregated across the
investigation — to detect when something went wrong and potentially retry.

**Watch mode compatibility**
Several of the planned features (survey pass, planning, external tools) are not
designed for incremental re-use in watch mode. Adding AI capability to watch
mode is a separate design problem that deserves its own thinking.

**Turn budget contention**
If the planning pass allocates turns and the agent borrows from its budget when
it needs more, there's a risk of runaway investigation on unexpectedly complex
103
README.md
@@ -1,103 +0,0 @@
# Luminos

A file system intelligence tool. Point it at a directory and it runs an agentic Claude investigation that figures out what the directory is, what's in it, and what might be worth your attention.

Luminos is built around a harder question than "what files are here?" It is built around "what is this, and should I be worried about any of it?" To answer that, it runs a multi-pass agentic investigation against the [Claude API](https://www.anthropic.com/api): a survey pass to orient on the target, an isolated dir-loop agent per directory with a small toolbelt (read files, run whitelisted coreutils commands, write cache entries), and a final synthesis pass that produces a project-level verdict with severity-ranked flags.

A lightweight base scan runs first to feed the agent its initial picture of the target. The base scan is not a standalone product, it is the first step of the investigation.

## Features

- **Agentic AI investigation.** Multi-pass, leaves-first analysis via Claude. Survey then dir loops then synthesis.
- **Investigation cache.** Per-file and per-directory summaries are cached under `/tmp/luminos/` so repeat runs on the same target are cheap.
- **Severity-ranked flags.** Findings are sorted so `critical` items are the first thing you see.
- **Context budget guard.** Per-turn `input_tokens` is watched against a budget so a rogue directory can't blow the context and silently degrade quality.
- **Graceful degradation.** Permission denied, subprocess timeouts, missing API key: all handled without crashing.
- **JSON output.** Pipe reports to other tools or save for comparison.
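
To make the context budget guard concrete, here is a minimal sketch of the kind of check it implies. It is not the code in `luminos_lib/ai.py`: `CONTEXT_WINDOW_TOKENS`, `BUDGET_PERCENT`, and `over_budget` are illustrative names, the 200,000-token window is an assumption, and the 70% figure is the one quoted under "How the investigation works".

```python
# Hypothetical sketch of a per-turn context budget check (not Luminos's code).
CONTEXT_WINDOW_TOKENS = 200_000  # assumed context window size
BUDGET_PERCENT = 70              # the README's "currently 70%" figure


def over_budget(input_tokens, window=CONTEXT_WINDOW_TOKENS, percent=BUDGET_PERCENT):
    """Return True once a turn's input_tokens exceeds the budget.

    Integer arithmetic keeps the threshold exact.
    """
    budget = window * percent // 100
    return input_tokens > budget
```

An agent loop built this way would check the reported `input_tokens` after each turn and wrap up the directory early once `over_budget` returns `True`.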

## Installation

Luminos is a normal Python project. Clone, create a venv, and install from `requirements.txt`. The repository ships a helper script that does this for you:

```bash
git clone https://github.com/archeious/luminos.git
cd luminos
./setup_env.sh
source ~/luminos-env/bin/activate
```

Or do it by hand:

```bash
python3 -m venv ~/luminos-env
source ~/luminos-env/bin/activate
pip install -r requirements.txt
```

You also need an Anthropic API key exported as an environment variable:

```bash
export ANTHROPIC_API_KEY=your-key-here
```

The base scan shells out to a handful of GNU command-line utilities (`wc`, `file`, `grep`, `head`, `tail`, `stat`, `du`, `find`), so you also need those on `$PATH`. They are installed by default on every mainstream Linux distribution; on macOS the GNU versions are available via Homebrew.

## Usage

```bash
python3 luminos.py /path/to/project
```

That is the whole interface. The investigation runs end to end and prints a report.

### Common flags

```bash
# Deeper tree, include hidden files, exclude build and vendor dirs
python3 luminos.py -d 8 -a -x .git -x node_modules -x vendor /path/to/project

# JSON output to a file
python3 luminos.py --json -o report.json /path/to/project

# Force a fresh investigation, ignoring the cache
python3 luminos.py --fresh /path/to/project

# Clear the investigation cache
python3 luminos.py --clear-cache
```

Run `python3 luminos.py --help` for the full flag list.

## How the investigation works

A short version of what happens on every run:

1. **Base scan.** Builds the directory tree, classifies files into seven categories, counts lines of code, finds large and recently modified files, computes per-directory disk usage. This is the agent's initial picture of the target.
2. **Survey pass.** A short agent loop (max 3 turns) reads the base scan, describes the target in plain language, and decides which investigation tools are relevant. Tiny targets skip the survey.
3. **Dir loops.** Every directory gets its own isolated agent loop, leaves-first, with up to 14 turns. The agent has read-only access to the filesystem and a toolbelt of `read_file`, `list_directory`, `run_command`, `parse_structure`, `write_cache`, `think`, `checkpoint`, `flag`, and `submit_report`.
4. **Cache.** Each file and directory summary is written to `/tmp/luminos/` so subsequent runs on the same target don't re-derive what hasn't changed.
5. **Context budget guard.** Per-turn `input_tokens` is watched against a budget (currently 70% of the model's context window) so a rogue directory can't blow the context window.
6. **Final synthesis.** A short agent loop reads the directory-level cache entries (not the raw files) and produces the project-level brief, the detailed analysis, and the severity-ranked flags.
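
As a rough illustration of the leaves-first ordering in step 3, the sketch below walks a tree and returns directories deepest first, so every child is visited before its parent. It is an assumption about how such an ordering could be computed, not the code in `luminos_lib/ai.py`; `leaves_first` and its `exclude` parameter are hypothetical names.

```python
import os


def leaves_first(root, exclude=()):
    """Return directories under root ordered deepest first.

    Hypothetical helper: children always come before their parents,
    and any directory whose name is in `exclude` is pruned entirely.
    """
    dirs = []
    for dirpath, dirnames, _filenames in os.walk(root):
        # Prune excluded names in place so os.walk does not descend into them.
        dirnames[:] = [d for d in dirnames if d not in exclude]
        dirs.append(dirpath)
    # A child path always has more separators than its parent,
    # so sorting by separator count (descending) puts children first.
    return sorted(dirs, key=lambda p: p.count(os.sep), reverse=True)
```

Ordering by depth is enough to guarantee that a parent's dir loop can read its children's cached summaries, which is the property the synthesis step relies on.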

## Development

Run the test suite:

```bash
python3 -m unittest discover -s tests/
```

Modules that are intentionally not unit tested:

- `luminos_lib/ast_parser.py`: requires tree-sitter grammars installed
- `luminos_lib/prompts.py`: string templates only

`luminos_lib/ai.py` is partially covered. End-to-end agent loops require a live Anthropic API and stay exempt, but pure helpers are tested in `tests/test_ai_pure.py`.

## License

Apache License 2.0. See [`LICENSE`](LICENSE) for the full text.

## Source of truth

The canonical home for this project is the [Forgejo repository](https://forgejo.labbity.unbiasedgeek.com/archeious/luminos). The GitHub copy is a read-only mirror, pushed automatically from Forgejo. Issues, pull requests, and the project wiki live on Forgejo.
36
docs/externalize.md
Normal file
|
|
@ -0,0 +1,36 @@
|
|||
# Externalize Protocol
|
||||
|
||||
> Triggered when the user says "externalize" or "externalize your thoughts."
|
||||
> This is a STANDALONE action. Do NOT wrap up unless separately asked.
|
||||
|
||||
## Steps
|
||||
|
||||
1. **Determine session number** — check the Session Log in CLAUDE.md for the
|
||||
latest session number, increment by 1
|
||||
|
||||
2. **Pull wiki** — ensure `docs/wiki/` is current:
|
||||
```bash
|
||||
git -C docs/wiki pull # or clone if missing
|
||||
```
|
||||
|
||||
3. **Create session wiki page** — write `docs/wiki/Session{N}.md` with:
|
||||
- Date, focus, duration estimate
|
||||
- What was done (with detail — reference actual files and commits)
|
||||
- Discoveries and observations
|
||||
- Decisions made and why
|
||||
- Raw Thinking — observations, concerns, trade-offs, and loose threads that
|
||||
came up during the session but weren't part of the main deliverable.
|
||||
Things you'd mention if pair programming: prerequisites noticed, corners
|
||||
being painted into, intent mismatches, unresolved questions.
|
||||
- What's next
|
||||
|
||||
4. **Update SessionRetrospectives.md** — read the current index, add the new
|
||||
session row, write it back
|
||||
|
||||
5. **Commit and push wiki:**
|
||||
```bash
|
||||
cd docs/wiki
|
||||
git add -A
|
||||
git commit -m "retro: Session {N} — <one-line summary>"
|
||||
git push
|
||||
```
|
||||
31
docs/wrap-up.md
Normal file
|
|
@ -0,0 +1,31 @@
|
|||
# Session Wrap-Up Checklist
|
||||
|
||||
> Triggered when the user says "wrap up", "end session", or similar.
|
||||
> Always externalize FIRST, then do the steps below.
|
||||
|
||||
## Steps
|
||||
|
||||
1. **Externalize** — run the `docs/externalize.md` protocol if not already
|
||||
done this session
|
||||
|
||||
2. **Reread CLAUDE.md** — ensure you have the latest context before editing
|
||||
|
||||
3. **Update CLAUDE.md:**
|
||||
- Update **Current Project State** — phase, last worked on (today's date),
|
||||
last commit, blocking issues
|
||||
- Update **Session Log** — add new entry, keep only last 3 sessions,
|
||||
remove older ones (full history is in the wiki)
|
||||
|
||||
4. **Commit and push main repo:**
|
||||
```bash
|
||||
git add CLAUDE.md
|
||||
git commit -m "chore: update CLAUDE.md for session {N}"
|
||||
git push
|
||||
```
|
||||
|
||||
5. **Verify nothing is unpushed** — both the main repo and docs/wiki should
|
||||
have no pending commits
|
||||
|
||||
6. **Recommend next session** — tell the user what the best next session
|
||||
should tackle, in priority order based on PLAN.md and any open Forgejo
|
||||
issues
|
||||
57
luminos.py
|
|
@ -16,11 +16,16 @@ from luminos_lib.filetypes import (
|
|||
from luminos_lib.code import detect_languages, find_large_files
|
||||
from luminos_lib.recency import find_recent_files
|
||||
from luminos_lib.disk import get_disk_usage, top_directories
|
||||
from luminos_lib.watch import watch_loop
|
||||
from luminos_lib.report import format_report
|
||||
|
||||
|
||||
def _progress(label):
|
||||
"""Return (on_file, finish) for in-place per-file progress on stderr."""
|
||||
"""Return (on_file, finish) for in-place per-file progress on stderr.
|
||||
|
||||
on_file(path) overwrites the current line with the label and truncated path.
|
||||
finish() finalises the line with a newline.
|
||||
"""
|
||||
cols = shutil.get_terminal_size((80, 20)).columns
|
||||
prefix = f" [scan] {label}... "
|
||||
available = max(cols - len(prefix), 10)
|
||||
|
|
@ -38,7 +43,7 @@ def _progress(label):
|
|||
|
||||
|
||||
def scan(target, depth=3, show_hidden=False, exclude=None):
|
||||
"""Run the base scan and return the report dict consumed by the AI pass."""
|
||||
"""Run all analyses on the target directory and return a report dict."""
|
||||
report = {}
|
||||
|
||||
exclude = exclude or []
|
||||
|
|
@ -84,8 +89,7 @@ def main():
|
|||
parser = argparse.ArgumentParser(
|
||||
prog="luminos",
|
||||
description="Luminos — file system intelligence tool. "
|
||||
"Runs an agentic Claude investigation against a directory "
|
||||
"and produces a reconnaissance report.",
|
||||
"Explores a directory and produces a reconnaissance report.",
|
||||
)
|
||||
parser.add_argument("target", nargs="?", help="Target directory to analyze")
|
||||
parser.add_argument("-d", "--depth", type=int, default=3,
|
||||
|
|
@ -96,10 +100,17 @@ def main():
|
|||
help="Output report as JSON")
|
||||
parser.add_argument("-o", "--output", metavar="FILE",
|
||||
help="Write report to a file")
|
||||
parser.add_argument("--ai", action="store_true",
|
||||
help="Use Claude AI to analyze directory purpose "
|
||||
"(requires ANTHROPIC_API_KEY)")
|
||||
parser.add_argument("--watch", action="store_true",
|
||||
help="Re-scan every 30 seconds and show diffs")
|
||||
parser.add_argument("--clear-cache", action="store_true",
|
||||
help="Clear the investigation cache (/tmp/luminos/)")
|
||||
help="Clear the AI investigation cache (/tmp/luminos/)")
|
||||
parser.add_argument("--fresh", action="store_true",
|
||||
help="Force a new investigation (ignore cached results)")
|
||||
help="Force a new AI investigation (ignore cached results)")
|
||||
parser.add_argument("--install-extras", action="store_true",
|
||||
help="Show status of optional AI dependencies")
|
||||
parser.add_argument("-x", "--exclude", metavar="DIR", action="append",
|
||||
default=[],
|
||||
help="Exclude a directory name from scan and analysis "
|
||||
|
|
@ -107,8 +118,15 @@ def main():
|
|||
|
||||
args = parser.parse_args()
|
||||
|
||||
# --install-extras: show package status and exit
|
||||
if args.install_extras:
|
||||
from luminos_lib.capabilities import print_status
|
||||
print_status()
|
||||
return
|
||||
|
||||
# --clear-cache: wipe /tmp/luminos/ (lazy import to avoid AI deps)
|
||||
if args.clear_cache:
|
||||
from luminos_lib.cache import clear_cache
|
||||
from luminos_lib.capabilities import clear_cache
|
||||
clear_cache()
|
||||
if not args.target:
|
||||
return
|
||||
|
|
@ -122,24 +140,25 @@ def main():
|
|||
file=sys.stderr)
|
||||
sys.exit(1)
|
||||
|
||||
if not os.environ.get("ANTHROPIC_API_KEY"):
|
||||
print("luminos requires ANTHROPIC_API_KEY. "
|
||||
"Set it with: export ANTHROPIC_API_KEY=your-key-here",
|
||||
file=sys.stderr)
|
||||
sys.exit(0)
|
||||
|
||||
if args.exclude:
|
||||
print(f" [scan] Excluding: {', '.join(args.exclude)}", file=sys.stderr)
|
||||
|
||||
if args.watch:
|
||||
watch_loop(target, depth=args.depth, show_hidden=args.all,
|
||||
json_output=args.json_output)
|
||||
return
|
||||
|
||||
report = scan(target, depth=args.depth, show_hidden=args.all,
|
||||
exclude=args.exclude)
|
||||
|
||||
from luminos_lib.ai import analyze_directory
|
||||
brief, detailed, flags = analyze_directory(
|
||||
report, target, fresh=args.fresh, exclude=args.exclude)
|
||||
report["ai_brief"] = brief
|
||||
report["ai_detailed"] = detailed
|
||||
report["flags"] = flags
|
||||
flags = []
|
||||
if args.ai:
|
||||
from luminos_lib.ai import analyze_directory
|
||||
brief, detailed, flags = analyze_directory(
|
||||
report, target, fresh=args.fresh, exclude=args.exclude)
|
||||
report["ai_brief"] = brief
|
||||
report["ai_detailed"] = detailed
|
||||
report["flags"] = flags
|
||||
|
||||
if args.json_output:
|
||||
output = json.dumps(report, indent=2, default=str)
|
||||
|
|
|
|||
1550
luminos_lib/ai.py
File diff suppressed because it is too large
|
|
@ -3,8 +3,6 @@
|
|||
import hashlib
|
||||
import json
|
||||
import os
|
||||
import shutil
|
||||
import sys
|
||||
import uuid
|
||||
from datetime import datetime, timezone
|
||||
|
||||
|
|
@ -12,16 +10,6 @@ CACHE_ROOT = "/tmp/luminos"
|
|||
INVESTIGATIONS_PATH = os.path.join(CACHE_ROOT, "investigations.json")
|
||||
|
||||
|
||||
def clear_cache():
|
||||
"""Remove all investigation caches under CACHE_ROOT."""
|
||||
if os.path.isdir(CACHE_ROOT):
|
||||
shutil.rmtree(CACHE_ROOT)
|
||||
print(f"Cleared cache: {CACHE_ROOT}", file=sys.stderr)
|
||||
else:
|
||||
print(f"No cache to clear ({CACHE_ROOT} does not exist).",
|
||||
file=sys.stderr)
|
||||
|
||||
|
||||
def _sha256_path(path):
|
||||
"""Return a hex SHA-256 of a path string, used as cache key."""
|
||||
return hashlib.sha256(path.encode("utf-8")).hexdigest()
|
||||
|
|
|
|||
139
luminos_lib/capabilities.py
Normal file
|
|
@ -0,0 +1,139 @@
|
|||
"""Capability detection and cache management for optional luminos dependencies.
|
||||
|
||||
The base tool requires zero external packages. The --ai flag requires:
|
||||
- anthropic (API transport)
|
||||
- tree-sitter (AST parsing via parse_structure tool)
|
||||
- python-magic (improved file classification)
|
||||
|
||||
This module is the single place that knows about optional dependencies.
|
||||
"""
|
||||
|
||||
_PACKAGES = {
|
||||
"anthropic": {
|
||||
"import": "anthropic",
|
||||
"pip": "anthropic",
|
||||
"purpose": "Claude API client (streaming, retries, token counting)",
|
||||
},
|
||||
"tree-sitter": {
|
||||
"import": "tree_sitter",
|
||||
"pip": ("tree-sitter tree-sitter-python tree-sitter-javascript "
|
||||
"tree-sitter-rust tree-sitter-go"),
|
||||
"purpose": "AST parsing for parse_structure tool",
|
||||
},
|
||||
"python-magic": {
|
||||
"import": "magic",
|
||||
"pip": "python-magic",
|
||||
"purpose": "Improved file type detection via libmagic",
|
||||
},
|
||||
}
|
||||
|
||||
|
||||
def _check_package(import_name):
|
||||
"""Return True if a package is importable."""
|
||||
try:
|
||||
__import__(import_name)
|
||||
return True
|
||||
except ImportError:
|
||||
return False
|
||||
|
||||
|
||||
ANTHROPIC_AVAILABLE = _check_package("anthropic")
|
||||
TREE_SITTER_AVAILABLE = _check_package("tree_sitter")
|
||||
MAGIC_AVAILABLE = _check_package("magic")
|
||||
|
||||
|
||||
def check_ai_dependencies():
|
||||
"""Check that all --ai dependencies are installed.
|
||||
|
||||
If any are missing, prints a clear error with the pip install command
|
||||
and returns False. Returns True if everything is available.
|
||||
"""
|
||||
missing = []
|
||||
for name, info in _PACKAGES.items():
|
||||
if not _check_package(info["import"]):
|
||||
missing.append(name)
|
||||
|
||||
if not missing:
|
||||
return True
|
||||
|
||||
# Also check tree-sitter grammar packages
|
||||
grammar_missing = []
|
||||
if "tree-sitter" not in missing:
|
||||
for grammar in ["tree_sitter_python", "tree_sitter_javascript",
|
||||
"tree_sitter_rust", "tree_sitter_go"]:
|
||||
if not _check_package(grammar):
|
||||
grammar_missing.append(grammar.replace("_", "-"))
|
||||
|
||||
import sys
|
||||
print("\nluminos --ai requires missing packages:", file=sys.stderr)
|
||||
for name in missing:
|
||||
print(f" \u2717 {name}", file=sys.stderr)
|
||||
for name in grammar_missing:
|
||||
print(f" \u2717 {name}", file=sys.stderr)
|
||||
|
||||
# Build pip install command
|
||||
pip_parts = []
|
||||
for name in missing:
|
||||
pip_parts.append(_PACKAGES[name]["pip"])
|
||||
for name in grammar_missing:
|
||||
pip_parts.append(name)
|
||||
pip_cmd = " \\\n ".join(pip_parts)
|
||||
|
||||
print(f"\n Install with:\n pip install {pip_cmd}\n", file=sys.stderr)
|
||||
return False
|
||||
|
||||
|
||||
def print_status():
|
||||
"""Print the install status of all optional packages."""
|
||||
print("\nLuminos optional dependencies:\n")
|
||||
|
||||
for name, info in _PACKAGES.items():
|
||||
available = _check_package(info["import"])
|
||||
mark = "\u2713" if available else "\u2717"
|
||||
status = "installed" if available else "missing"
|
||||
print(f" {mark} {name:20s} {status:10s} {info['purpose']}")
|
||||
|
||||
# Grammar packages
|
||||
grammars = {
|
||||
"tree-sitter-python": "tree_sitter_python",
|
||||
"tree-sitter-javascript": "tree_sitter_javascript",
|
||||
"tree-sitter-rust": "tree_sitter_rust",
|
||||
"tree-sitter-go": "tree_sitter_go",
|
||||
}
|
||||
print()
|
||||
for name, imp in grammars.items():
|
||||
available = _check_package(imp)
|
||||
mark = "\u2713" if available else "\u2717"
|
||||
status = "installed" if available else "missing"
|
||||
print(f" {mark} {name:20s} {status:10s} Language grammar")
|
||||
|
||||
# Full install command (deduplicated)
|
||||
all_pkgs = []
|
||||
seen = set()
|
||||
for info in _PACKAGES.values():
|
||||
for pkg in info["pip"].split():
|
||||
if pkg not in seen:
|
||||
all_pkgs.append(pkg)
|
||||
seen.add(pkg)
|
||||
for name in grammars:
|
||||
if name not in seen:
|
||||
all_pkgs.append(name)
|
||||
seen.add(name)
|
||||
|
||||
print(f"\n Install all with:\n pip install {' '.join(all_pkgs)}\n")
|
||||
|
||||
|
||||
from luminos_lib.cache import CACHE_ROOT
|
||||
|
||||
|
||||
def clear_cache():
|
||||
"""Remove all investigation caches under /tmp/luminos/."""
|
||||
import shutil
|
||||
import os
|
||||
import sys
|
||||
if os.path.isdir(CACHE_ROOT):
|
||||
shutil.rmtree(CACHE_ROOT)
|
||||
print(f"Cleared cache: {CACHE_ROOT}", file=sys.stderr)
|
||||
else:
|
||||
print(f"No cache to clear ({CACHE_ROOT} does not exist).",
|
||||
file=sys.stderr)
|
||||
|
|
@ -209,84 +209,3 @@ Call `submit_survey` exactly once with:
|
|||
You have at most 3 turns. In almost all cases you should call
|
||||
`submit_survey` on your first turn. Use a second turn only if you
|
||||
genuinely need to think before committing."""
|
||||
|
||||
_PLANNING_SYSTEM_PROMPT = """\
|
||||
You are an investigation planner. Your job is to decide where to invest
|
||||
investigative depth across a directory tree, BEFORE the per-directory
|
||||
investigation begins. You allocate turns (agent reasoning steps) to
|
||||
directories based on their likely complexity and importance.
|
||||
|
||||
## Your Task
|
||||
Create an investigation plan for the target: {target}
|
||||
|
||||
## Inputs
|
||||
|
||||
Survey assessment (from a prior reconnaissance pass):
|
||||
{survey_context}
|
||||
|
||||
Full directory tree:
|
||||
{tree_text}
|
||||
|
||||
File signals:
|
||||
{file_signals}
|
||||
|
||||
Total directories to investigate: {dir_count}
|
||||
Directories already cached (will be skipped): {cached_dirs}
|
||||
|
||||
## How to Allocate
|
||||
|
||||
Classify each directory into one of three tiers:
|
||||
|
||||
**priority** (15-20 turns): directories that are likely complex, central,
|
||||
or important. Signs: many source files, core application logic, complex
|
||||
configuration, entry points, schemas, migrations. These deserve deep
|
||||
investigation with multiple tool calls per file.
|
||||
|
||||
**shallow** (5 turns): directories that are simple, peripheral, or
|
||||
predictable. Signs: few files, generated/vendored content, test fixtures,
|
||||
static assets, documentation-only dirs. A quick pass is sufficient.
|
||||
|
||||
**skip** (0 turns): directories that should be skipped entirely. Signs:
|
||||
build output, dependency caches, vendored code, generated artifacts. The
|
||||
investigation would waste turns and produce noise.
|
||||
|
||||
Directories you do not mention go into a default tier ({default_turns}
|
||||
turns). You do NOT need to list every directory. Focus on the ones where
|
||||
the default allocation would clearly be wrong (too many turns for a
|
||||
trivial dir, or too few for a complex one).
|
||||
|
||||
## Investigation Order
|
||||
|
||||
Choose one of these ordering strategies:
|
||||
|
||||
- **leaf-first**: deepest directories first, parents last. This is the
|
||||
default and ensures parent directories always have child summaries
|
||||
available. Best for most codebases.
|
||||
|
||||
- **priority-first**: priority directories before shallow ones, but
|
||||
still leaf-first within each tier. Good when certain subtrees are
|
||||
clearly more important and you want findings from them to inform
|
||||
the rest of the investigation.
|
||||
|
||||
Both strategies preserve the leaf-first invariant (children before
|
||||
parents) to ensure child summaries are available when investigating
|
||||
parent directories.
|
||||
|
||||
## Budget
|
||||
|
||||
The global turn budget is {global_budget} turns across all directories.
|
||||
Your allocations should roughly respect this budget, though small
|
||||
overages are fine. If you allocate significantly more than the budget,
|
||||
the orchestrator will cap individual directories.
|
||||
|
||||
## Notes Field
|
||||
|
||||
Use `notes` to communicate anything the per-directory agents should
|
||||
know that the survey did not capture. Cross-cutting concerns, suspected
|
||||
relationships between directories, or investigation priorities. Leave
|
||||
empty if you have nothing to add beyond the tier assignments.
|
||||
|
||||
## Output
|
||||
Call `submit_plan` exactly once. You have at most 3 turns, but you
|
||||
should almost always submit on your first turn. Use additional turns
|
||||
only if you genuinely need to reason through a complex target layout."""
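
The Budget section of the prompt above says the orchestrator will cap individual directories when the plan overshoots the global turn budget. The hunk does not show how that capping is done, so the following is only one plausible sketch: `cap_allocations` is a hypothetical name, and proportional scaling is an assumed policy, not necessarily what `luminos_lib/ai.py` does.

```python
def cap_allocations(allocations, global_budget):
    """Scale per-directory turn allocations down to fit a global budget.

    Hypothetical sketch: `allocations` maps directory path -> planned turns.
    Plans that already fit are returned unchanged. Otherwise every directory
    is scaled by the same factor, and directories that had any turns keep at
    least one, so the result can exceed the budget slightly.
    """
    total = sum(allocations.values())
    if total <= global_budget:
        return dict(allocations)
    scale = global_budget / total
    return {
        path: (max(1, int(turns * scale)) if turns > 0 else 0)
        for path, turns in allocations.items()
    }
```

For example, two `priority` directories planned at 20 turns each against a 20-turn budget would each be cut to 10, while a `skip` directory stays at 0.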
|
||||
|
|
|
|||
108
luminos_lib/watch.py
Normal file
|
|
@ -0,0 +1,108 @@
|
|||
"""Watch mode — re-scan and show diffs every 30 seconds."""
|
||||
|
||||
import json
|
||||
import sys
|
||||
import time
|
||||
import os
|
||||
|
||||
|
||||
def _snapshot(classified_files):
|
||||
"""Create a snapshot dict: path -> (size, category)."""
|
||||
return {f["path"]: (f["size"], f["category"]) for f in classified_files}
|
||||
|
||||
|
||||
def _diff_snapshots(old, new):
|
||||
"""Compare two snapshots and return changes."""
|
||||
old_paths = set(old.keys())
|
||||
new_paths = set(new.keys())
|
||||
|
||||
added = new_paths - old_paths
|
||||
removed = old_paths - new_paths
|
||||
common = old_paths & new_paths
|
||||
|
||||
size_changes = []
|
||||
for p in common:
|
||||
old_size = old[p][0]
|
||||
new_size = new[p][0]
|
||||
if old_size != new_size:
|
||||
size_changes.append((p, old_size, new_size))
|
||||
|
||||
return added, removed, size_changes
|
||||
|
||||
|
||||
def _human_size(nbytes):
|
||||
for unit in ("B", "KB", "MB", "GB"):
|
||||
if abs(nbytes) < 1024:
|
||||
if unit == "B":
|
||||
return f"{nbytes} {unit}"
|
||||
return f"{nbytes:.1f} {unit}"
|
||||
nbytes /= 1024
|
||||
return f"{nbytes:.1f} TB"
|
||||
|
||||
|
||||
def watch_loop(target, depth=3, show_hidden=False, json_output=False):
|
||||
"""Run scan in a loop, printing diffs between runs."""
|
||||
# Import here to avoid circular import
|
||||
from luminos_lib.filetypes import classify_files
|
||||
|
||||
print(f"[luminos] Watching {target} (Ctrl+C to stop)")
|
||||
print("[luminos] Scanning every 30 seconds...")
|
||||
print()
|
||||
|
||||
prev_snapshot = None
|
||||
|
||||
try:
|
||||
while True:
|
||||
classified = classify_files(target, show_hidden=show_hidden)
|
||||
current = _snapshot(classified)
|
||||
|
||||
if prev_snapshot is not None:
|
||||
added, removed, size_changes = _diff_snapshots(
|
||||
prev_snapshot, current
|
||||
)
|
||||
|
||||
if not added and not removed and not size_changes:
|
||||
ts = time.strftime("%H:%M:%S")
|
||||
print(f"[{ts}] No changes detected.")
|
||||
else:
|
||||
ts = time.strftime("%H:%M:%S")
|
||||
print(f"[{ts}] Changes detected:")
|
||||
|
||||
if json_output:
|
||||
diff = {
|
||||
"timestamp": ts,
|
||||
"added": sorted(added),
|
||||
"removed": sorted(removed),
|
||||
"size_changes": [
|
||||
{"path": p, "old_size": o, "new_size": n}
|
||||
for p, o, n in size_changes
|
||||
],
|
||||
}
|
||||
print(json.dumps(diff, indent=2))
|
||||
else:
|
||||
for p in sorted(added):
|
||||
name = os.path.basename(p)
|
||||
print(f" + NEW {name}")
|
||||
print(f" {p}")
|
||||
for p in sorted(removed):
|
||||
name = os.path.basename(p)
|
||||
print(f" - DEL {name}")
|
||||
print(f" {p}")
|
||||
for p, old_s, new_s in size_changes:
|
||||
name = os.path.basename(p)
|
||||
delta = new_s - old_s
|
||||
sign = "+" if delta > 0 else ""
|
||||
print(f" ~ SIZE {name} "
|
||||
f"{_human_size(old_s)} -> {_human_size(new_s)} "
|
||||
f"({sign}{_human_size(delta)})")
|
||||
print()
|
||||
else:
|
||||
print(f"[{time.strftime('%H:%M:%S')}] "
|
||||
f"Initial scan complete: {len(current)} files indexed.")
|
||||
print()
|
||||
|
||||
prev_snapshot = current
|
||||
time.sleep(30)
|
||||
|
||||
except KeyboardInterrupt:
|
||||
print("\n[luminos] Watch stopped.")
|
||||
|
|
@ -1,7 +0,0 @@
|
|||
anthropic
|
||||
python-magic
|
||||
tree-sitter
|
||||
tree-sitter-python
|
||||
tree-sitter-javascript
|
||||
tree-sitter-rust
|
||||
tree-sitter-go
|
||||
15
setup_env.sh
|
|
@ -2,7 +2,6 @@
|
|||
set -euo pipefail
|
||||
|
||||
VENV_DIR="$HOME/luminos-env"
|
||||
SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
|
||||
|
||||
if [ -d "$VENV_DIR" ]; then
|
||||
echo "venv already exists at $VENV_DIR"
|
||||
|
|
@ -14,19 +13,17 @@ fi
|
|||
echo "Activating venv..."
|
||||
source "$VENV_DIR/bin/activate"
|
||||
|
||||
echo "Installing packages from requirements.txt..."
|
||||
pip install -r "$SCRIPT_DIR/requirements.txt"
|
||||
echo "Installing packages..."
|
||||
pip install anthropic tree-sitter tree-sitter-python \
|
||||
tree-sitter-javascript tree-sitter-rust \
|
||||
tree-sitter-go python-magic
|
||||
|
||||
echo ""
|
||||
echo "Done. To activate the venv in future sessions:"
|
||||
echo ""
|
||||
echo " source ~/luminos-env/bin/activate"
|
||||
echo ""
|
||||
echo "Set your Anthropic API key:"
|
||||
echo "Then run luminos as usual:"
|
||||
echo ""
|
||||
echo " export ANTHROPIC_API_KEY=your-key-here"
|
||||
echo ""
|
||||
echo "Then run luminos:"
|
||||
echo ""
|
||||
echo " python3 luminos.py <target>"
|
||||
echo " python3 luminos.py --ai <target>"
|
||||
echo ""
|
||||
|
|
|
|||
File diff suppressed because it is too large
37
tests/test_capabilities.py
Normal file
|
|
@ -0,0 +1,37 @@
|
|||
"""Tests for luminos_lib/capabilities.py"""
|
||||
|
||||
import unittest
|
||||
from unittest.mock import patch
|
||||
|
||||
from luminos_lib.capabilities import _check_package
|
||||
|
||||
|
||||
class TestCheckPackage(unittest.TestCase):
|
||||
def test_importable_package(self):
|
||||
# json is always available in stdlib
|
||||
self.assertTrue(_check_package("json"))
|
||||
|
||||
def test_missing_package(self):
|
||||
self.assertFalse(_check_package("_luminos_nonexistent_package_xyz"))
|
||||
|
||||
def test_importable_returns_true(self):
|
||||
with patch("builtins.__import__", return_value=None):
|
||||
# patch doesn't work cleanly here; use a real stdlib module
|
||||
pass
|
||||
self.assertTrue(_check_package("os"))
|
||||
|
||||
def test_import_error_returns_false(self):
|
||||
import builtins
|
||||
original_import = builtins.__import__
|
||||
|
||||
def fake_import(name, *args, **kwargs):
|
||||
if name == "_fake_missing_module":
|
||||
raise ImportError("No module named '_fake_missing_module'")
|
||||
return original_import(name, *args, **kwargs)
|
||||
|
||||
with patch("builtins.__import__", side_effect=fake_import):
|
||||
self.assertFalse(_check_package("_fake_missing_module"))
|
||||
|
||||
|
||||
if __name__ == "__main__":
|
||||
unittest.main()
|
||||