Purpose: Validate that OpenAgent uses ContextScout effectively: at the right time, for the right reasons, with the right outcomes.
Created: 2026-01-07
Status: Ready to Run
Should we use ContextScout to help agents discover context, or is it adding unnecessary complexity?
These tests answer:
File: 01-known-context-direct-load.yaml
Scenario: "Write a fibonacci function"
Expected: Agent should DIRECTLY load code.md without using ContextScout
Why: For standard tasks (code/docs/tests), the context path is well-known. ContextScout adds overhead without value.
Success Criteria:
.opencode/context/core/standards/code.md
File: 02-unknown-domain-discovery.yaml
Scenario: "Explain how the eval framework works"
Expected: Agent should USE ContextScout to discover eval-specific context
Why: For domain-specific topics, agents don't know what context exists. ContextScout discovers relevant files.
Success Criteria:
.opencode/context/openagents-repo/core-concepts/evals.md
File: 03-accuracy-correct-files.yaml
Scenario: "What are the MVI principles?"
Expected: ContextScout finds the CORRECT file (mvi.md), not random files
Why: Validates ContextScout's search accuracy and relevance filtering.
Success Criteria:
.opencode/context/core/context-system/standards/mvi.md
cd evals/framework
npm run eval:sdk -- --agent=core/openagent --pattern="contextscout-integration/*.yaml"
# Test 1: Known context (should NOT use ContextScout)
npm run eval:sdk -- --agent=core/openagent --pattern="01-known-context-direct-load.yaml"
# Test 2: Unknown domain (should use ContextScout)
npm run eval:sdk -- --agent=core/openagent --pattern="02-unknown-domain-discovery.yaml"
# Test 3: Accuracy (ContextScout finds correct files)
npm run eval:sdk -- --agent=core/openagent --pattern="03-accuracy-correct-files.yaml"
npm run eval:sdk -- --agent=core/openagent --pattern="contextscout-integration/*.yaml" --no-evaluators
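The per-test commands above can be wrapped in a small helper that records one pass/fail line per test file. This is only a sketch: `run_eval` and `run_all` are hypothetical names, and it assumes the eval command exits non-zero when a test fails, which this README does not confirm.

```shell
# Hypothetical helper, not part of the eval framework.
# Runs a single test file with the same command shown above.
run_eval() {
  npm run eval:sdk -- --agent=core/openagent --pattern="$1"
}

# Runs every test file given as an argument and prints "<file> pass|fail".
# Assumption: a non-zero exit status from run_eval means the test failed.
run_all() {
  for test_file in "$@"; do
    if run_eval "$test_file" >/dev/null 2>&1; then
      echo "$test_file pass"
    else
      echo "$test_file fail"
    fi
  done
}

# Usage (from evals/framework):
#   run_all 01-known-context-direct-load.yaml \
#           02-unknown-domain-discovery.yaml \
#           03-accuracy-correct-files.yaml
```

Keeping `run_eval` as its own function makes it easy to swap in a different runner or a stub when checking the loop itself.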
If tests show:
Conclusion: ContextScout is valuable for discovery, and agents use it intelligently
Action: Keep ContextScout, refine when/how agents use it
If tests show:
Conclusion: ContextScout isn't helping, and agents aren't using it predictably
Action: Remove ContextScout OR fix agent decision-making
If tests show:
Conclusion: ContextScout has potential but needs tuning
Action: Refine agent prompts, improve ContextScout search, add decision criteria
| Test Result | Interpretation | Action |
|---|---|---|
| All 3 pass | ✅ ContextScout working as designed | Keep it, document best practices |
| Test 1 fails (uses ContextScout) | ⚠️ Agents using it when they shouldn't | Refine agent decision logic |
| Test 2 fails (no ContextScout) | ⚠️ Agents not using it when they should | Update agent prompts |
| Test 3 fails (wrong files) | ❌ ContextScout search accuracy poor | Improve ContextScout search logic |
| All 3 fail | ❌ ContextScout not ready | Disable or redesign |
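The table above can be read as a simple lookup from three pass/fail results to a recommended action. The function below is a minimal sketch of that lookup; `interpret_results` is a hypothetical name, and printing one action per failing test when exactly two tests fail is an assumption, since the table does not cover that case.

```shell
# Hypothetical helper: maps the three test outcomes to the actions in the table.
# Arguments: result of Test 1, Test 2, Test 3, each either "pass" or "fail".
interpret_results() {
  if [ "$1" = pass ] && [ "$2" = pass ] && [ "$3" = pass ]; then
    echo "Keep it, document best practices"
  elif [ "$1" = fail ] && [ "$2" = fail ] && [ "$3" = fail ]; then
    echo "Disable or redesign"
  else
    # Assumption: for partial failures, report the action for each failing test.
    if [ "$1" = fail ]; then echo "Refine agent decision logic"; fi
    if [ "$2" = fail ]; then echo "Update agent prompts"; fi
    if [ "$3" = fail ]; then echo "Improve ContextScout search logic"; fi
  fi
  return 0
}

# Usage:
#   interpret_results pass fail pass
```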
Key Insight: ContextScout should be a discovery tool for unknown domains, NOT a replacement for direct loading of well-known context paths. These tests check whether that hypothesis holds.