Add OpenQuestion to contract spec and examples
- New section documenting OpenQuestion (question, context, priority, source_locator)
- Direction table: gaps (backward), discovery_events (sideways), open_questions (forward)
- Updated all 3 examples to include open_questions
- Updated researcher rules (rule 5) and caller rules (rule 6)

Co-Authored-By: Claude Haiku 4.5 <noreply@anthropic.com>
parent
147dc5bb88
commit
1f81471ce8
1 changed file with 76 additions and 5 deletions
@@ -70,6 +70,7 @@ class ResearchResult:
    citations: List[Citation]               # Sources used, with raw evidence
    gaps: List[Gap]                         # What couldn't be resolved (categorized)
    discovery_events: List[DiscoveryEvent]  # Lateral findings for other researchers
    open_questions: List[OpenQuestion]      # Follow-up questions that emerged
    confidence: float                       # 0.0–1.0 overall confidence
    confidence_factors: ConfidenceFactors   # What fed the confidence score
    cost_metadata: CostMetadata             # Resource usage
@@ -255,6 +256,57 @@ Example:

---

### `open_questions` (list of OpenQuestion objects)

```python
@dataclass
class OpenQuestion:
    question: str                  # The follow-up question that emerged
    context: str                   # What evidence prompted this question
    priority: str                  # "high", "medium", "low"
    source_locator: Optional[str]  # Where this question arose from
```

Open questions capture **forward-looking follow-ups** that emerged from the research itself. They are distinct from gaps (what failed) and discovery events (what's lateral):

| Field | Direction | Meaning |
|:---|:---|:---|
| `gaps` | Backward | "I tried to find X but couldn't" |
| `discovery_events` | Sideways | "Another researcher should look at this" |
| `open_questions` | **Forward** | "Based on what I found, this needs deeper investigation" |

Example:

```python
[
    OpenQuestion(
        question="What is the optimal irrigation schedule for high-elevation potatoes?",
        context="Multiple sources mention irrigation is critical but none specify schedules.",
        priority="medium",
        source_locator="https://extension.usu.edu/gardening/utah-crops"
    ),
    OpenQuestion(
        question="How does Utah's soil salinity vary by county?",
        context="Two sources referenced salinity as a limiting factor but with conflicting data.",
        priority="high",
        source_locator=None
    ),
]
```

Priority levels:

- `"high"` — critical to answer quality; the PI should strongly consider dispatching follow-up research
- `"medium"` — would meaningfully improve the answer
- `"low"` — nice to know; not essential

The PI uses open questions to feed a **dynamic priority queue** — deciding whether to go deeper on the current topic or move on.

**Rules:**

- Each open question must be grounded in evidence encountered during research (not speculative)
- Questions should be specific and actionable (not vague like "learn more about Utah")
- The researcher should not attempt to answer its own open questions — that's the PI's job

---
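The dynamic priority queue described above can be sketched as a small helper the PI might run over a result's `open_questions`. The `OpenQuestion` fields follow the dataclass in this section; the ranking policy (high before medium before low, original order preserved within a level) is an assumption for illustration, not part of the contract.

```python
from dataclasses import dataclass
from typing import List, Optional
import heapq

@dataclass
class OpenQuestion:
    question: str
    context: str
    priority: str                  # "high", "medium", "low"
    source_locator: Optional[str]

# Lower rank pops first, so "high" questions surface before "medium" and "low".
_PRIORITY_RANK = {"high": 0, "medium": 1, "low": 2}

def triage(open_questions: List[OpenQuestion]) -> List[OpenQuestion]:
    """Order open questions for the PI's dynamic priority queue."""
    heap = [
        (_PRIORITY_RANK[q.priority], i, q)       # i breaks ties, keeping order stable
        for i, q in enumerate(open_questions)
    ]
    heapq.heapify(heap)
    return [heapq.heappop(heap)[2] for _ in range(len(heap))]
```

With the two example questions above, `triage` would return the high-priority salinity question first, then the medium-priority irrigation question.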
### `confidence` (float, 0.0–1.0)

Overall confidence in the answer. Accompanied by `confidence_factors` to prevent "vibe check" scoring.
@@ -381,11 +433,12 @@ The `content_hash` enables:
2. **Provide raw evidence.** Every citation must include a `raw_excerpt` copied verbatim from the source.
3. **Admit and categorize gaps.** If you can't find something, say so with the appropriate `GapCategory`.
4. **Report lateral discoveries.** If you encounter something relevant to another researcher's domain, emit a `DiscoveryEvent`.
5. **Surface open questions.** If the research raises follow-up questions that need deeper investigation, emit `OpenQuestion` objects with priority levels.
6. **Respect budgets.** Stop iterating if `max_iterations` or `token_budget` is reached. Reflect in `budget_exhausted`.
7. **Ground claims.** Every factual claim in `answer` must link to at least one citation.
8. **Explain confidence.** Populate `confidence_factors` honestly; do not inflate scores.
9. **Hash fetched content.** Every URL/source fetch in the trace must include a `content_hash`.
10. **Handle failures gracefully.** If Tavily is down or a URL is broken, note it in `gaps` with the appropriate category and continue with what you have.

### The Caller (PI/CLI) Must
@@ -394,6 +447,7 @@ The `content_hash` enables:
3. **Check raw_excerpts.** For important decisions, verify claims against `raw_excerpt` before acting.
4. **Process discovery_events.** Log them (V1) or dispatch additional researchers (V2+).
5. **Respect gap categories.** Use the category to decide the appropriate response (retry, re-dispatch, escalate, accept).
6. **Review open_questions.** Use priority levels to decide whether to dispatch deeper research or accept the current answer.

---
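Caller rule 6 can be sketched as a minimal decision helper. The policy shown (always follow up on `"high"`, take `"medium"` only while dispatch budget remains, accept `"low"` as-is) and the `budget_remaining` parameter are illustrative assumptions, not part of the contract.

```python
from typing import Dict, List

def review_open_questions(open_questions: List[Dict], budget_remaining: int) -> List[Dict]:
    """Return the open questions the caller should re-dispatch (V1 may just log the rest)."""
    to_dispatch = []
    for q in open_questions:
        # Assumed policy: "high" always warrants deeper research; "medium" only
        # if dispatch budget remains; "low" is accepted with the current answer.
        if q["priority"] == "high" or (q["priority"] == "medium" and budget_remaining > 0):
            to_dispatch.append(q)
    return to_dispatch
```

Run against the CRISPR example below, the high-priority Phase III question would be selected for follow-up even with no budget remaining.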
@@ -425,6 +479,7 @@ Response:
  ],
  "gaps": [],
  "discovery_events": [],
  "open_questions": [],
  "confidence": 0.99,
  "confidence_factors": {
    "num_corroborating_sources": 1,
@@ -490,6 +545,14 @@ Response:
      "source_locator": "https://www.crunchbase.com/hub/crispr-startups"
    }
  ],
  "open_questions": [
    {
      "question": "What are the current Phase III CRISPR trial success rates?",
      "context": "Multiple sources reference ongoing trials but none include outcome data.",
      "priority": "high",
      "source_locator": "https://www.crunchbase.com/hub/crispr-startups"
    }
  ],
  "confidence": 0.72,
  "confidence_factors": {
    "num_corroborating_sources": 3,
@@ -559,6 +622,14 @@ Response:
      "source_locator": null
    }
  ],
  "open_questions": [
    {
      "question": "What caused the second AI winter to end?",
      "context": "Found conflicting narratives about 1990s revival; needed more iterations to resolve.",
      "priority": "medium",
      "source_locator": null
    }
  ],
  "confidence": 0.55,
  "confidence_factors": {
    "num_corroborating_sources": 2,