Returns all file and dir cache entries with confidence below a given
threshold (default 0.7). Entries missing a confidence field are
included as unrated/untrusted. Results are sorted ascending by
confidence, so the least-confident entries come first.
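A minimal sketch of the selection and ordering described above, assuming
entries are plain dicts. The name low_confidence_entries and the choice to
sort unrated entries ahead of rated ones are illustrative assumptions, not
the actual implementation:

```python
def low_confidence_entries(entries, threshold=0.7):
    """Return entries below `threshold`, least-confident first.

    Hypothetical helper: entries with no "confidence" value are treated
    as unrated and, by assumption, sorted ahead of every rated entry.
    """
    selected = [
        e for e in entries
        if e.get("confidence") is None or e["confidence"] < threshold
    ]

    def sort_key(entry):
        value = entry.get("confidence")
        # -1.0 sits below any valid confidence in the 0.0-1.0 range.
        return -1.0 if value is None else value

    return sorted(selected, key=sort_key)
```

Sorting unrated entries first keeps them visible at the top of a review
list; a real implementation may order them differently.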
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add confidence and confidence_reason to both cache schemas in the dir
loop prompt. Add a Confidence section with categorical guidance
(high ≥ 0.8, medium 0.5–0.8, low < 0.5) and the rule to include
confidence_reason when confidence is below 0.7.
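The bands and the reason rule can be sketched as follows. Both function
names are hypothetical, and treating exactly 0.8 as high (so medium is
0.5 up to but not including 0.8) is an assumption about the boundary:

```python
def confidence_category(confidence):
    """Map a 0.0-1.0 confidence to the categorical bands in the prompt."""
    if confidence >= 0.8:
        return "high"
    if confidence >= 0.5:
        return "medium"
    return "low"


def needs_confidence_reason(confidence):
    """A confidence_reason is expected when confidence is below 0.7."""
    return confidence < 0.7
```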
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Adds 129 tests across cache, filetypes, code, disk, recency, tree, report,
and capabilities. Uses stdlib unittest only, with no new dependencies.
Also updates the CLAUDE.md development workflow to require test coverage
for all future changes.
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Add optional confidence (float 0.0–1.0) and confidence_reason (str) fields
to both file and dir cache entries. Validation rejects out-of-range values
and wrong types. Fields are not yet required; this is pure schema
instrumentation for Phase 1.
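A sketch of what this validation could look like, assuming entries are
dicts and errors are collected as strings. The function name and error
messages are hypothetical, and accepting ints such as 0 or 1 alongside
floats is an assumption:

```python
def validate_confidence_fields(entry):
    """Return a list of error strings; an empty list means valid."""
    errors = []
    if "confidence" in entry:
        value = entry["confidence"]
        # bool is a subclass of int, so reject it explicitly.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            errors.append("confidence must be a number")
        elif not 0.0 <= value <= 1.0:
            errors.append("confidence must be between 0.0 and 1.0")
    reason = entry.get("confidence_reason")
    if "confidence_reason" in entry and not isinstance(reason, str):
        errors.append("confidence_reason must be a str")
    return errors
```

Both fields are optional here, so an entry with neither field passes.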
Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Moves _DIR_SYSTEM_PROMPT and _SYNTHESIS_SYSTEM_PROMPT from ai.py into
a dedicated prompts module. Both are pure template strings with .format()
placeholders — no runtime imports needed in prompts.py. Prompt content
is byte-for-byte identical to the original.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Moves all tree-sitter parsing logic from ai.py into a dedicated module.
Replaces the if/elif language chain with a _LANGUAGE_HANDLERS registry
mapping language names to handler functions.
Extracted: _tool_parse_structure body, _get_ts_parser, _child_by_type,
_text, and all per-language helpers (_py_func_sig, _py_class, etc.).
ai.py retains a thin wrapper for path validation.
Public API: parse_structure(path) -> JSON string
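The registry pattern, in outline. Only the _LANGUAGE_HANDLERS name comes
from this change; the handler bodies below are placeholders (the real
ones walk a tree-sitter syntax tree) and parse_source is a hypothetical
stand-in for the dispatch step:

```python
import json


def _parse_python(source):
    # Placeholder: a real handler would extract functions and classes.
    return {"language": "python", "lines": len(source.splitlines())}


def _parse_go(source):
    # Placeholder with the same shape as the Python handler.
    return {"language": "go", "lines": len(source.splitlines())}


# Maps a language name to the function that handles it, replacing an
# if/elif chain with a single dictionary lookup.
_LANGUAGE_HANDLERS = {
    "python": _parse_python,
    "go": _parse_go,
}


def parse_source(language, source):
    """Dispatch to the registered handler and return a JSON string."""
    handler = _LANGUAGE_HANDLERS.get(language)
    if handler is None:
        return json.dumps({"error": f"unsupported language: {language}"})
    return json.dumps(handler(source))
```

Adding a language then means registering one more entry rather than
extending a conditional.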
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- Delete unused clear_cache() from ai.py (luminos.py imports it from
  capabilities.py)
- Remove CACHE_ROOT import from ai.py (was only used by dead function)
- Replace local CACHE_ROOT constant in capabilities.py with import
  from cache.py (single source of truth)
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Moves investigation ID persistence and _CacheManager class from ai.py
into a dedicated cache module. No behavior changes.
Moved: _load_investigations, _save_investigations, _get_investigation_id,
_CacheManager (all methods), _sha256_path, CACHE_ROOT, INVESTIGATIONS_PATH.
Also added a local _now_iso() in cache.py to avoid a circular import
(ai.py imports from cache.py).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds think, checkpoint, and flag tools for agent reasoning visibility:
- think: records observation/hypothesis/next_action before investigation
- checkpoint: summarizes learned/unknown/next_phase after file clusters
- flag: marks notable findings to flags.jsonl with severity levels
Additional changes:
- Step numbering in investigation system prompt
- Text blocks from agent now printed to stderr (step labels visible)
- flag tool available in both investigation and synthesis passes
- analyze_directory() returns (brief, detailed, flags) three-tuple
- format_flags() in report.py renders flags sorted by severity
- Per-directory max_turns increased from 10 to 14
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
When the 70% context budget is hit mid-directory, the early exit now
writes a partial directory cache entry from whatever file summaries
the agent cached in prior turns, instead of discarding the work.
If file entries exist: concatenates their summaries into a directory
entry marked partial=true. If no files were cached: writes a minimal
entry noting the budget was reached before processing.
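A rough sketch of the fallback, assuming file entries are dicts with a
"summary" key. The function name, the entry fields other than partial,
and marking the minimal entry as partial too are all assumptions:

```python
def build_partial_dir_entry(dir_path, file_entries):
    """Build a directory cache entry from whatever files were cached."""
    summaries = [e["summary"] for e in file_entries if e.get("summary")]
    if summaries:
        # Concatenate the per-file summaries gathered in earlier turns.
        summary = " ".join(summaries)
    else:
        # Nothing was cached before the budget ran out.
        summary = "Context budget reached before any files were processed."
    return {"path": dir_path, "summary": summary, "partial": True}
```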
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds branch naming, commit message format, and merge procedure.
All future changes must start on a branch and merge to main with --no-ff.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- setup_env.sh creates ~/luminos-env venv and installs all AI packages
- CLAUDE.md updated to reflect the new dependency model: base tool is
  zero-dep, --ai requires packages installed via venv
- Documents the capabilities module and updated ai.py architecture
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
- --install-extras: prints status of all optional AI packages
- --clear-cache: wipes /tmp/luminos/ investigation cache
- --fresh: forces a new investigation ID, ignoring cached results
- AI import is now lazy (only when --ai is used) so the base tool
  never touches optional dependencies
- target argument is optional when using --install-extras
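One way the optional target could be expressed with argparse, as a
sketch only. The flag names come from the list above; the parse_args
helper, the prog name, and the error wording are assumptions, and
whether --clear-cache also works without a target is not covered here:

```python
import argparse


def parse_args(argv):
    """Parse CLI flags; target may be omitted with --install-extras."""
    parser = argparse.ArgumentParser(prog="luminos")
    # nargs="?" makes the positional optional at the parser level.
    parser.add_argument("target", nargs="?", help="directory to analyze")
    parser.add_argument("--ai", action="store_true")
    parser.add_argument("--install-extras", action="store_true")
    parser.add_argument("--clear-cache", action="store_true")
    parser.add_argument("--fresh", action="store_true")
    args = parser.parse_args(argv)
    # Enforce the requirement manually for every other mode.
    if args.target is None and not args.install_extras:
        parser.error("target is required unless --install-extras is given")
    return args
```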
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Rewrites ai.py from a single Claude API call into a multi-pass,
cache-driven agent architecture:
- Per-directory isolated agent loops (max 10 turns each) with context
  discarded between directories
- Leaves-first processing order so child summaries inform parents
- Disk cache (/tmp/luminos/{uuid}/) persists across runs for resumability
- Investigation ID persistence keyed by target realpath
- Separate synthesis pass reads only directory-level cache entries
- Replaces urllib with Anthropic SDK (streaming, automatic retries)
- Token counting with 70% context budget threshold for early exit
- parse_structure tool via tree-sitter (Python, JS, Rust, Go)
- python-magic integration for MIME-aware directory listings
- Cost tracking printed at end of investigation
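The investigation ID persistence in the list above might look roughly
like this. It is a sketch under assumptions: the function name, the JSON
index layout, and the index_path and fresh parameters are hypothetical
rather than taken from ai.py:

```python
import json
import os
import uuid


def get_investigation_id(target, index_path, fresh=False):
    """Return a stable ID for `target`, creating one when needed."""
    # realpath resolves symlinks and relative segments, so different
    # spellings of the same directory share one cache.
    key = os.path.realpath(target)
    try:
        with open(index_path) as f:
            index = json.load(f)
    except (FileNotFoundError, json.JSONDecodeError):
        index = {}
    if fresh or key not in index:
        index[key] = str(uuid.uuid4())
        with open(index_path, "w") as f:
            json.dump(index, f)
    return index[key]
```

Reusing the ID across runs is what lets a later run find the same cache
directory and resume instead of starting over.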
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Introduces luminos_lib/capabilities.py as the single source of truth for
optional package availability. Detects anthropic, tree-sitter, python-magic
and their grammar packages. Provides check_ai_dependencies() for gating
--ai mode and print_status() for --install-extras. Also hosts clear_cache()
to avoid pulling heavy AI imports for cache cleanup.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Adds --ai flag that sends the directory tree, file categories, and
sampled file contents to Claude for analysis. Produces a brief
summary at the top of the report and a detailed breakdown at the
end. Requires ANTHROPIC_API_KEY env var; degrades gracefully without it.
Uses only stdlib (urllib) to keep the zero-dependency constraint.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Re-scans every 30 seconds and shows new files, deleted files, and
size changes between scans.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Human-readable terminal report with clear sections, plus JSON output
mode and file output support.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Uses du to show per-directory disk usage and highlights the top 5
largest directories.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Finds the 10 most recently modified files using find with printf
and shows human-readable timestamps.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Detects programming languages, counts lines of code per language via
wc -l, and flags unusually large files (>1000 lines or >10MB).
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Classifies files by category (source, config, data, media, document,
archive, unknown) using extension mapping and the `file` command.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Renders a visual tree with file sizes, configurable depth, and
hidden file filtering.
Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>