# ContextScout Test Plan

## Test Coverage Analysis

### Current Test Status

| Test | Type | Coverage | Status | Issues |
|------|------|----------|--------|--------|
| smoke-test.yaml | Positive | Basic operation | ✅ Passing | Too simple - doesn't validate output quality |
| 02-discovery-test.yaml | Positive | Structure discovery | ⚠️ Needs validation | No assertions on output format |
| 03-search-standards.yaml | Positive | File search | ⚠️ Needs validation | No verification of line ranges |
| 04-content-extraction.yaml | Positive | Content extraction | ⚠️ Needs validation | No verification of key findings |
| 05-no-context-handling.yaml | Negative | Empty directory | ⚠️ Needs validation | Doesn't verify honest reporting |

### Test Coverage Gaps

**Missing Positive Tests:**
- ✅ Verify file paths are exact and valid
- ✅ Verify line ranges are accurate
- ✅ Verify priority ratings are appropriate
- ✅ Verify MVI compliance detection
- ✅ Verify function-based folder detection
- ✅ Verify loading order recommendations

**Missing Negative Tests:**
- ❌ Invalid path handling (non-existent directories)
- ❌ Malformed context files (invalid YAML, broken markdown)
- ❌ Ambiguous search queries
- ❌ Context files without clear structure
- ❌ Circular dependencies in context
- ❌ False positive prevention (claiming files exist that don't)

---

## Test Categories

### 1. Positive Tests (Happy Path)

**Purpose**: Verify ContextScout works correctly with valid inputs

#### Test: Valid Context Discovery
**Input**: "Find context for code standards"
**Expected Output**:
- ✅ Returns `.opencode/context/core/standards/code.md`
- ✅ Includes line ranges (e.g., "lines 22-27")
- ✅ Priority: ⭐⭐⭐⭐⭐ (Critical)
- ✅ Function type: Guide
- ✅ Key findings: 3-5 specific points
- ✅ Loading order: "Load this file NOW"

**Assertions**:
```yaml
assertions:
  - type: output_contains
    value: ".opencode/context/core/standards/code.md"
  - type: output_contains
    value: "lines"
  - type: output_contains
    value: "⭐⭐⭐⭐⭐"
  - type: output_contains
    value: "Key Findings"
```

#### Test: MVI-Aware Prioritization
**Input**: "Find context files, prioritize by MVI compliance"
**Expected Output**:
- ✅ Files <200 lines ranked higher
- ✅ Files with clear sections ranked higher
- ✅ Files with navigation README ranked higher
- ✅ Priority ratings reflect MVI compliance

**Assertions**:
```yaml
assertions:
  - type: custom
    validator: "verify_mvi_prioritization"
    description: "Files <200 lines should have higher priority"
```

#### Test: Function-Based Discovery
**Input**: "Find examples of how to write tests"
**Expected Output**:
- ✅ Searches `examples/` folder first
- ✅ Returns example files, not just guides
- ✅ Identifies function type: "Example"
- ✅ Provides minimal working code

**Assertions**:
```yaml
assertions:
  - type: output_contains
    value: "examples/"
  - type: output_contains
    value: "Type: Example"
```

---

### 2. Negative Tests (Error Handling)

**Purpose**: Verify ContextScout handles invalid inputs gracefully

#### Test: Non-Existent Directory
**Input**: "Find context in /fake/directory/that/does/not/exist"
**Expected Output**:
- ✅ Reports directory doesn't exist
- ✅ Doesn't fabricate results
- ✅ Suggests checking path
- ✅ No false positives

**Assertions**:
```yaml
assertions:
  - type: output_not_contains
    value: "Found context files"
  - type: output_contains
    value: "not found"
  - type: tool_not_called
    tool: "read"
    reason: "Should not attempt to read non-existent files"
```

#### Test: Ambiguous Query
**Input**: "Find stuff"
**Expected Output**:
- ✅ Asks for clarification
- ✅ Suggests specific search terms
- ✅ Doesn't return random files
- ✅ Provides examples of valid queries

**Assertions**:
```yaml
assertions:
  - type: output_contains
    value: "clarify"
  - type: output_contains
    value: "specific"
```

#### Test: Malformed Context File
**Input**: "Find context in directory with broken YAML frontmatter"
**Expected Output**:
- ✅ Reports file has issues
- ✅ Attempts to extract what it can
- ✅ Warns about malformed content
- ✅ Doesn't crash or hang

**Assertions**:
```yaml
assertions:
  - type: output_contains
    value: "malformed"
  - type: no_errors
    description: "Should handle gracefully without crashing"
```

#### Test: False Positive Prevention
**Input**: "Find API documentation"
**Expected Output**:
- ✅ Only returns files that actually exist
- ✅ Verifies file paths before reporting
- ✅ Doesn't hallucinate file names
- ✅ Uses glob/list to verify existence

**Assertions**:
```yaml
assertions:
  - type: all_paths_exist
    description: "Every file path mentioned must actually exist"
  - type: tool_called
    tool: "glob"
    reason: "Must verify files exist before claiming they do"
```

---

### 3. Edge Case Tests

**Purpose**: Verify ContextScout handles boundary conditions

#### Test: Empty Context Directory
**Input**: "Find context in empty .tmp/test-fixtures/empty/"
**Expected Output**:
- ✅ Reports no context found
- ✅ Suggests creating context structure
- ✅ Provides template/example
- ✅ Honest about lack of results

**Status**: ✅ Already implemented (05-no-context-handling.yaml)

#### Test: Very Large Context File (>1000 lines)
**Input**: "Extract key findings from large context file"
**Expected Output**:
- ✅ Identifies file is not MVI compliant
- ✅ Suggests splitting into smaller files
- ✅ Still extracts key findings
- ✅ Provides line ranges for sections

**Assertions**:
```yaml
assertions:
  - type: output_contains
    value: "MVI"
  - type: output_contains
    value: "lines"
```

#### Test: Circular Context Dependencies
**Input**: "Find context for X which depends on Y which depends on X"
**Expected Output**:
- ✅ Detects circular dependency
- ✅ Reports the cycle
- ✅ Suggests breaking the cycle
- ✅ Doesn't infinite loop

**Assertions**:
```yaml
assertions:
  - type: output_contains
    value: "circular"
  - type: timeout_not_exceeded
    max_duration: 30000
```

---

## Recommended Test Additions

### High Priority (Add These First)

1. **test-06-exact-paths.yaml** - Verify file paths are exact and valid
2. **test-07-line-ranges.yaml** - Verify line ranges are accurate
3. **test-08-false-positive.yaml** - Prevent hallucinated file paths
4. **test-09-invalid-path.yaml** - Handle non-existent directories
5. **test-10-ambiguous-query.yaml** - Handle vague requests

### Medium Priority

6. **test-11-mvi-detection.yaml** - Verify MVI compliance detection
7. **test-12-function-folders.yaml** - Verify function-based discovery
8. **test-13-priority-ratings.yaml** - Verify priority ratings are appropriate
9. **test-14-large-files.yaml** - Handle files >200 lines
10. **test-15-malformed-content.yaml** - Handle broken YAML/markdown

### Low Priority

11. **test-16-circular-deps.yaml** - Detect circular dependencies
12. **test-17-performance.yaml** - Verify response time <30s
13. **test-18-multiple-matches.yaml** - Handle multiple matching files
14. **test-19-no-readme.yaml** - Handle directories without README
15. **test-20-integration.yaml** - Full workflow test

---

## Test Quality Checklist

For each test, verify:

- [ ] **Clear purpose** - What specific behavior is being tested?
- [ ] **Specific assertions** - What exact output is expected?
- [ ] **Positive AND negative** - Tests both success and failure cases
- [ ] **No false positives** - Test would fail if agent misbehaves
- [ ] **No false negatives** - Test wouldn't fail for correct behavior
- [ ] **Fast execution** - Completes in <30 seconds
- [ ] **Deterministic** - Same input always produces same result
- [ ] **Independent** - Doesn't depend on other tests

---

## Current Test Issues

### Issue 1: Smoke Test Too Simple
**Problem**: Current smoke test just checks if agent responds, doesn't validate output quality
**Fix**: Add assertions for expected output format

### Issue 2: No Output Validation
**Problem**: Tests don't verify the actual content of responses
**Fix**: Add `assertions` section to each test with specific checks

### Issue 3: No False Positive Prevention
**Problem**: Tests don't verify agent isn't hallucinating file paths
**Fix**: Add test that verifies all mentioned paths actually exist

### Issue 4: No Negative Tests
**Problem**: Only 1 negative test (empty directory), need more
**Fix**: Add tests for invalid paths, ambiguous queries, malformed files

### Issue 5: No Performance Tests
**Problem**: No verification that ContextScout responds quickly
**Fix**: Add timeout assertions and performance benchmarks

---

## Next Steps

1. **Review current tests** - Analyze what they actually validate
2. **Add assertions** - Add specific output validation to existing tests
3. **Create negative tests** - Add 5 new negative test cases
4. **Run full suite** - Verify all tests pass
5. **Document results** - Update README with test coverage

---

**Last Updated**: 2026-01-07  
**Status**: Test plan created, implementation pending