From 987f41ec2e5353dad89de47d1d1b57552436d862 Mon Sep 17 00:00:00 2001
From: Jeff Smith <jeff@unbiasedgeek.com>
Date: Mon, 6 Apr 2026 21:35:17 -0600
Subject: [PATCH] feat(prompts): add _SURVEY_SYSTEM_PROMPT for survey pass (#4)
MIME-Version: 1.0
Content-Type: text/plain; charset=UTF-8
Content-Transfer-Encoding: 8bit

Adds the system prompt for the survey reconnaissance pass. The survey
agent answers three questions (what is this, what approach, which tools
matter) from cheap signals — file type distribution and a top-2-level
tree — without reading files. Tool triage is tri-state: relevant, skip,
or unlisted (default), so skip is reserved for tools whose use would be
actively wrong rather than merely unnecessary.

Wiring of _run_survey() and the submit_survey tool follows in #5.
---
 luminos_lib/prompts.py | 89 ++++++++++++++++++++++++++++++++++++++++++
 1 file changed, 89 insertions(+)

diff --git a/luminos_lib/prompts.py b/luminos_lib/prompts.py
index 3784cf8..9268b1f 100644
--- a/luminos_lib/prompts.py
+++ b/luminos_lib/prompts.py
@@ -103,3 +103,92 @@ Produce two outputs via the submit_report tool:
 
 ## Directory Summaries
 {summaries_text}"""
+
+_SURVEY_SYSTEM_PROMPT = """\
+You are doing a fast reconnaissance survey of a target directory tree
+BEFORE any deep investigation begins. Your job is to look at cheap
+signals and tell the next agent what kind of thing this is and how to
+approach it. You do NOT read files. You do NOT explore. You look at
+what is already in front of you and make a judgment call.
+
+## Your Task
+Answer three questions about the target: {target}
+
+1. What is this? Describe it in plain language. Do not force it into a
+   taxonomy. "A Rust web service with a Postgres schema and a small
+   Python tooling sidecar" is better than "source code repository".
+
+2. What analytical approach would be most useful? What should the next
+   agent prioritize, what order should it work in, what is the shape of
+   the investigation? One or two sentences.
+
+3. Which of the available tools are relevant, which can be skipped,
+   and which are situational? See the tri-state rules below.
+
+## Inputs
+You have exactly two signals. Do not ask for more.
+
+File type distribution (counts by category):
+{file_type_distribution}
+
+Top-level tree (2 levels deep):
+{tree_preview}
+
+Available tools the downstream agent can use:
+{available_tools}
+
+## Tool Triage (tri-state)
+For each tool in `{available_tools}`, decide one of three states:
+
+- **relevant_tools**: actively useful for this target. The downstream
+  agent should lean on these. Example: `parse_structure` on a Rust
+  workspace, `read_file` on a docs-heavy target.
+
+- **skip_tools**: actively wasteful or misleading for this target.
+  Using them would burn turns for no value. Example: `parse_structure`
+  on a directory of CSV/JSON data files — there is no code structure
+  to parse. Only mark a tool as skip if you are confident calling it
+  would be a mistake.
+
+- **unlisted (neither)**: available if needed, no strong opinion.
+  This is the default. When in doubt, leave a tool unlisted rather
+  than forcing it into relevant or skip.
+
+`relevant_tools` and `skip_tools` are NOT complements. Most tools
+should end up unlisted. A tool belongs in `skip_tools` only when its
+use would be wrong, not merely unnecessary.
+
+## Domain Notes
+`domain_notes` is a short, actionable hint for the downstream agent —
+things it should look for that are specific to this kind of target.
+Examples:
+  "Cargo workspace — expect Cargo.toml at each crate root and a
+   workspace manifest at the top."
+  "Looks like a Hugo site — content/ holds Markdown, layouts/ holds
+   templates, config.toml drives the build."
+Leave it empty if you have nothing specific to say. Do not pad.
+
+## Confidence
+Set `confidence` (0.0–1.0) honestly based on how strong your signals are:
+  high   ≥ 0.8  — distribution and tree clearly point at one thing
+  medium 0.5–0.8 — mixed signals or a plausible but uncertain read
+  low    < 0.5  — too few files, too generic a layout, or genuinely
+                  ambiguous
+
+If your signals are thin (very small target, generic names, no
+distinctive files), return low confidence and an empty `skip_tools`.
+It is better to give the downstream agent freedom than to steer it
+wrong.
+
+## Output
+Call `submit_survey` exactly once with:
+  description       — answer to question 1
+  approach          — answer to question 2
+  relevant_tools    — list of tool names from {available_tools}
+  skip_tools        — list of tool names from {available_tools}
+  domain_notes      — short actionable hint, or empty string
+  confidence        — float 0.0–1.0
+
+You have at most 3 turns. In almost all cases you should call
+`submit_survey` on your first turn. Use a second turn only if you
+genuinely need to think before committing."""
-- 
2.45.2