Wire survey output into dir loop system prompt #6

Closed

opened 2026-04-06 16:27:09 -06:00 by archeious · 2 comments

archeious commented

2026-04-06 16:27:09 -06:00

Owner

After _run_survey() completes, inject its output into the dir loop system prompt so each dir loop agent knows what it's looking at before it starts.

The domain_notes, approach, and relevant_tools/skip_tools fields shape what the dir loop agent prioritizes.

archeious added this to the Phase 2: Survey Pass milestone 2026-04-06 16:27:09 -06:00

archeious added this to the Agentic Investigation Engine project 2026-04-06 16:33:53 -06:00

archeious referenced this issue from a commit

2026-04-06 21:50:01 -06:00

feat(ai): add _run_survey() and submit_survey tool (#5)

archeious referenced this issue

2026-04-06 21:50:06 -06:00

feat(ai): add _run_survey() and submit_survey tool (#5) #43

archeious commented

2026-04-06 21:58:49 -06:00

Author

Owner

Empirical finding from #5 smoke test

Ran python3 luminos.py --ai luminos_lib after #5 shipped (survey log-only, not yet wired into the dir loop).

The survey returned skip_tools: ["run_command"] for this Python library target — a sensible call. The dir loop nonetheless invoked run_command twice on its second turn:

[AI]   -> run_command(command="tail -n 50 ai.py")
[AI]   -> run_command(command="grep -n analyze_directory ai.py")

This means the prompt-injection approach alone is insufficient. The dir-loop agent reaches for any tool in its toolbox, even when the system prompt tells it not to. The model treats prompt instructions as soft preferences and tool availability as hard affordances.

Scope expansion

#6 should do BOTH:

1. Inject into the prompt (original scope): description, approach, domain_notes, and the relevant_tools/skip_tools lists go into the dir loop system prompt so the agent has the context.
2. Filter the tool schema (new scope): when building the tools= list passed to _call_api_streaming for the dir loop, remove any tool whose name is in survey["skip_tools"]. This is hard enforcement: the agent literally cannot call tools that are not in the schema.
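
A minimal sketch of the prompt-injection half, assuming the survey is a plain dict carrying the field names listed above. The helper names `format_survey_for_prompt` and `build_dir_loop_prompt`, and the "Survey context:" label, are hypothetical and only illustrate the shape; the actual dir loop prompt builder may look different.

```python
def format_survey_for_prompt(survey):
    """Render survey fields as a text block (hypothetical helper).

    Assumes `survey` is a dict with the fields named in this issue,
    or None when the survey is unavailable.
    """
    if survey is None:
        return ""
    lines = ["Survey context:"]
    # Free-text fields are copied through as-is when present.
    for key in ("description", "approach", "domain_notes"):
        value = survey.get(key)
        if value:
            lines.append(f"{key}: {value}")
    # Tool lists are advisory here; hard enforcement is a separate step.
    for key in ("relevant_tools", "skip_tools"):
        names = survey.get(key) or []
        if names:
            lines.append(f"{key}: {', '.join(names)}")
    return "\n".join(lines)


def build_dir_loop_prompt(base_prompt, survey):
    """Append the survey block to the base prompt, or leave it unchanged."""
    block = format_survey_for_prompt(survey)
    return f"{base_prompt}\n\n{block}" if block else base_prompt
```

With `survey=None` the base prompt comes back untouched, which matches the survey-unavailable case.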

Confidence guard

Skip-tool filtering should be gated on survey["confidence"] >= 0.5. If the survey is unsure (thin signals, generic target), do not let it strip tools — that risks breaking the dir loop on a wrong call. The prompt injection (item 1) can still happen at any confidence level since it is advisory.
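
A sketch of the gated filter, assuming each entry in the tool schema is a dict with a `"name"` key and that the survey dict exposes `confidence` and `skip_tools` as described. `filter_dir_loop_tools` and `CONFIDENCE_THRESHOLD` are hypothetical names, and `read_file` in the usage line is a made-up tool name.

```python
CONFIDENCE_THRESHOLD = 0.5  # gate value proposed in this issue


def filter_dir_loop_tools(tools, survey, threshold=CONFIDENCE_THRESHOLD):
    """Return the tool schema with skip_tools removed when the survey is confident.

    Assumes `tools` is a list of dicts that each carry a "name" key.
    """
    # Survey unavailable: leave the schema exactly as it was.
    if not survey:
        return list(tools)
    # Low-confidence survey: do not let it strip tools.
    if survey.get("confidence", 0.0) < threshold:
        return list(tools)
    skip = set(survey.get("skip_tools") or [])
    return [tool for tool in tools if tool.get("name") not in skip]


# Usage: run_command is dropped only when confidence >= 0.5.
tools = [{"name": "read_file"}, {"name": "run_command"}]
survey = {"skip_tools": ["run_command"], "confidence": 0.8}
filtered = filter_dir_loop_tools(tools, survey)
```

A missing `confidence` key is treated as 0.0 here, so filtering is skipped rather than applied on incomplete survey output; that default is a design assumption, not something this issue specifies.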

Acceptance update

- Dir loop system prompt contains survey description, approach, domain_notes
- Dir loop tool schema excludes skip_tools when confidence >= 0.5
- A re-run of the smoke test (--ai luminos_lib) shows zero run_command invocations in the dir loop, given the same survey output
- Survey-unavailable case (None return) leaves dir loop behavior unchanged
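
The zero-invocation criterion could be checked mechanically against captured output. This is a rough sketch that assumes every tool call is logged in the same `[AI]   -> name(...)` form as the two lines quoted above; `count_tool_calls` is a hypothetical helper, not part of the codebase.

```python
import re

# Matches lines such as: [AI]   -> run_command(command="...")
TOOL_CALL = re.compile(r"^\[AI\]\s+->\s+(\w+)\(")


def count_tool_calls(log_text, tool_name):
    """Count logged invocations of `tool_name` in captured output."""
    count = 0
    for line in log_text.splitlines():
        match = TOOL_CALL.match(line.strip())
        if match and match.group(1) == tool_name:
            count += 1
    return count


# The two lines from the #5 smoke test, used as sample input.
sample_log = (
    '[AI]   -> run_command(command="tail -n 50 ai.py")\n'
    '[AI]   -> run_command(command="grep -n analyze_directory ai.py")\n'
)
before_fix = count_tool_calls(sample_log, "run_command")
```

The acceptance check would then assert that the same count is 0 on the re-run's output.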

archeious referenced this issue

2026-04-06 21:58:50 -06:00

Dir loop exhausts context budget on small targets #44

archeious referenced this issue from a commit

2026-04-06 22:07:15 -06:00

feat(ai): wire survey output into dir loop (#6)

archeious referenced this issue from a pull request that will close it

2026-04-06 22:07:21 -06:00

feat(ai): wire survey output into dir loop (#6) #45

archeious referenced this issue from a commit

2026-04-06 22:07:24 -06:00

merge: feat/issue-6-wire-survey (#6)