ContextScout Test Plan

Test Coverage Analysis

Current Test Status

Test	Type	Coverage	Status	Issues
smoke-test.yaml	Positive	Basic operation	✅ Passing	Too simple - doesn't validate output quality
02-discovery-test.yaml	Positive	Structure discovery	⚠️ Needs validation	No assertions on output format
03-search-standards.yaml	Positive	File search	⚠️ Needs validation	No verification of line ranges
04-content-extraction.yaml	Positive	Content extraction	⚠️ Needs validation	No verification of key findings
05-no-context-handling.yaml	Negative	Empty directory	⚠️ Needs validation	Doesn't verify honest reporting

Test Coverage Gaps

Missing Positive Tests:

✅ Verify file paths are exact and valid
✅ Verify line ranges are accurate
✅ Verify priority ratings are appropriate
✅ Verify MVI compliance detection
✅ Verify function-based folder detection
✅ Verify loading order recommendations

Missing Negative Tests:

❌ Invalid path handling (non-existent directories)
❌ Malformed context files (invalid YAML, broken markdown)
❌ Ambiguous search queries
❌ Context files without clear structure
❌ Circular dependencies in context
❌ False positive prevention (claiming files exist that don't)

Test Categories

1. Positive Tests (Happy Path)

Purpose: Verify ContextScout works correctly with valid inputs

Test: Valid Context Discovery

Input: "Find context for code standards"

Expected Output:

✅ Returns .opencode/context/core/standards/code.md
✅ Includes line ranges (e.g., "lines 22-27")
✅ Priority: ⭐⭐⭐⭐⭐ (Critical)
✅ Function type: Guide
✅ Key findings: 3-5 specific points
✅ Loading order: "Load this file NOW"

Assertions:

assertions:
  - type: output_contains
    value: ".opencode/context/core/standards/code.md"
  - type: output_contains
    value: "lines"
  - type: output_contains
    value: "⭐⭐⭐⭐⭐"
  - type: output_contains
    value: "Key Findings"
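The `output_contains: "lines"` assertion above passes on any output that happens to contain that word, so it does not address the "no verification of line ranges" issue noted in the status table. A stricter check could be sketched as follows. The `lines N-M` pattern is inferred from the example "lines 22-27", and the function names are hypothetical, not part of any existing harness.

```python
import re

# Assumed pattern for references such as "lines 22-27"; the real
# ContextScout output format may differ.
LINE_RANGE = re.compile(r"lines\s+(\d+)\s*-\s*(\d+)", re.IGNORECASE)


def find_line_ranges(output: str) -> list[tuple[int, int]]:
    """Return every (start, end) pair mentioned in the agent output."""
    return [(int(a), int(b)) for a, b in LINE_RANGE.findall(output)]


def ranges_are_valid(output: str, file_length: int) -> bool:
    """True if at least one range is present and every range fits in the file."""
    ranges = find_line_ranges(output)
    return bool(ranges) and all(
        1 <= start <= end <= file_length for start, end in ranges
    )
```

A test would pass the agent output together with the real length of the referenced file, so a range such as "lines 90-120" against a 100-line file fails.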

Test: MVI-Aware Prioritization

Input: "Find context files, prioritize by MVI compliance"

Expected Output:

✅ Files <200 lines ranked higher
✅ Files with clear sections ranked higher
✅ Files with navigation README ranked higher
✅ Priority ratings reflect MVI compliance

Assertions:

assertions:
  - type: custom
    validator: "verify_mvi_prioritization"
    description: "Files <200 lines should have higher priority"
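The plan names `verify_mvi_prioritization` but does not define it. One possible reading, as a minimal sketch: the validator receives the ranked files with their line counts and star ratings and checks that no oversized file outranks a compliant one. The `RankedFile` record and its fields are assumptions made for illustration; the 200-line threshold comes from the expected output above.

```python
from dataclasses import dataclass

MVI_LINE_LIMIT = 200  # threshold taken from the expected output above


@dataclass
class RankedFile:
    path: str
    line_count: int
    priority: int  # number of stars, 1-5 (assumed representation)


def verify_mvi_prioritization(files: list[RankedFile]) -> bool:
    """True if no file at or over the limit outranks a file under it."""
    compliant = [f.priority for f in files if f.line_count < MVI_LINE_LIMIT]
    oversized = [f.priority for f in files if f.line_count >= MVI_LINE_LIMIT]
    if not compliant or not oversized:
        return True  # nothing to compare
    return min(compliant) >= max(oversized)
```

This version allows ties between compliant and oversized files. If "higher priority" is meant strictly, replace `>=` with `>` in the final comparison.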

Test: Function-Based Discovery

Input: "Find examples of how to write tests"

Expected Output:

✅ Searches examples/ folder first
✅ Returns example files, not just guides
✅ Identifies function type: "Example"
✅ Provides minimal working code

Assertions:

assertions:
  - type: output_contains
    value: "examples/"
  - type: output_contains
    value: "Type: Example"

2. Negative Tests (Error Handling)

Purpose: Verify ContextScout handles invalid inputs gracefully

Test: Non-Existent Directory

Input: "Find context in /fake/directory/that/does/not/exist"

Expected Output:

✅ Reports directory doesn't exist
✅ Doesn't fabricate results
✅ Suggests checking path
✅ No false positives

Assertions:

assertions:
  - type: output_not_contains
    value: "Found context files"
  - type: output_contains
    value: "not found"
  - type: tool_not_called
    tool: "read"
    reason: "Should not attempt to read non-existent files"
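To make the intended semantics of the text assertions concrete, here is a small sketch of how `output_contains` and `output_not_contains` might be evaluated. Case-insensitive substring matching is an assumption, as is the `evaluate` function itself; the real harness may behave differently. Assertion types that need tool-call data, such as `tool_not_called`, are skipped here.

```python
def evaluate(assertions: list[dict], output: str) -> list[str]:
    """Return a failure message for each text assertion the output violates."""
    failures = []
    text = output.lower()
    for assertion in assertions:
        kind = assertion.get("type")
        if kind not in ("output_contains", "output_not_contains"):
            continue  # e.g. tool_not_called needs data from the harness
        value = assertion["value"]
        present = value.lower() in text  # assumed: case-insensitive match
        if kind == "output_contains" and not present:
            failures.append(f"expected to find {value!r}")
        elif kind == "output_not_contains" and present:
            failures.append(f"expected not to find {value!r}")
    return failures
```

An empty list means all text assertions passed. Returning messages rather than a single boolean makes it easier to see which assertion a fabricated result tripped.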

Test: Ambiguous Query

Input: "Find stuff"

Expected Output:

✅ Asks for clarification
✅ Suggests specific search terms
✅ Doesn't return random files
✅ Provides examples of valid queries

Assertions:

assertions:
  - type: output_contains
    value: "clarify"
  - type: output_contains
    value: "specific"

Test: Malformed Context File

Input: "Find context in directory with broken YAML frontmatter"

Expected Output:

✅ Reports file has issues
✅ Attempts to extract what it can
✅ Warns about malformed content
✅ Doesn't crash or hang

Assertions:

assertions:
  - type: output_contains
    value: "malformed"
  - type: no_errors
    description: "Should handle gracefully without crashing"

Test: False Positive Prevention

Input: "Find API documentation"

Expected Output:

✅ Only returns files that actually exist
✅ Verifies file paths before reporting
✅ Doesn't hallucinate file names
✅ Uses glob/list to verify existence

Assertions:

assertions:
  - type: all_paths_exist
    description: "Every file path mentioned must actually exist"
  - type: tool_called
    tool: "glob"
    reason: "Must verify files exist before claiming they do"
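The `all_paths_exist` assertion could be backed by a check along these lines. This is a sketch under stated assumptions: reported paths are relative, end in `.md`, and are resolved against a fixture root. `PATH_PATTERN` and `missing_paths` are hypothetical names.

```python
import re
from pathlib import Path

# Assumed shape of a reported path: a relative path ending in .md.
# Adjust the pattern to whatever ContextScout actually emits.
PATH_PATTERN = re.compile(r"[\w./-]+\.md")


def missing_paths(output: str, root: Path) -> list[str]:
    """Return every path mentioned in the output that is not a file under root."""
    mentioned = sorted(set(PATH_PATTERN.findall(output)))
    return [p for p in mentioned if not (root / p).is_file()]
```

The assertion passes when the returned list is empty. Any entry in the list is a path the agent claimed without it existing, which is exactly the hallucination this test is meant to catch.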

3. Edge Case Tests

Purpose: Verify ContextScout handles boundary conditions

Test: Empty Context Directory

Input: "Find context in empty .tmp/test-fixtures/empty/"

Expected Output:

✅ Reports no context found
✅ Suggests creating context structure
✅ Provides template/example
✅ Honest about lack of results

Status: ✅ Already implemented (05-no-context-handling.yaml)

Test: Very Large Context File (>1000 lines)

Input: "Extract key findings from large context file"

Expected Output:

✅ Identifies file is not MVI compliant
✅ Suggests splitting into smaller files
✅ Still extracts key findings
✅ Provides line ranges for sections

Assertions:

assertions:
  - type: output_contains
    value: "MVI"
  - type: output_contains
    value: "lines"

Test: Circular Context Dependencies

Input: "Find context for X which depends on Y which depends on X"

Expected Output:

✅ Detects circular dependency
✅ Reports the cycle
✅ Suggests breaking the cycle
✅ Doesn't infinite loop

Assertions:

assertions:
  - type: output_contains
    value: "circular"
  - type: timeout_not_exceeded
    max_duration: 30000
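To build a fixture for this test and to know which cycle the agent should report, the expected cycle can be computed independently. The sketch below uses a standard depth-first search with three node states. The dependency map format (name to list of names) is an assumption, since the plan does not say how context files declare dependencies.

```python
from typing import Optional


def find_cycle(deps: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one dependency cycle as a list of names, or None if acyclic."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on current path, finished
    color: dict[str, int] = {}
    stack: list[str] = []

    def visit(node: str) -> Optional[list[str]]:
        color[node] = GRAY
        stack.append(node)
        for nxt in deps.get(node, []):
            state = color.get(nxt, WHITE)
            if state == GRAY:  # back edge: nxt is already on the current path
                return stack[stack.index(nxt):] + [nxt]
            if state == WHITE:
                cycle = visit(nxt)
                if cycle:
                    return cycle
        stack.pop()
        color[node] = BLACK
        return None

    for node in deps:
        if color.get(node, WHITE) == WHITE:
            cycle = visit(node)
            if cycle:
                return cycle
    return None
```

For the input described above, `find_cycle({"X": ["Y"], "Y": ["X"]})` returns `["X", "Y", "X"]`, which the test can compare against the cycle named in the agent output.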

Recommended Test Additions

High Priority (Add These First)

test-06-exact-paths.yaml - Verify file paths are exact and valid
test-07-line-ranges.yaml - Verify line ranges are accurate
test-08-false-positive.yaml - Prevent hallucinated file paths
test-09-invalid-path.yaml - Handle non-existent directories
test-10-ambiguous-query.yaml - Handle vague requests

Medium Priority

test-11-mvi-detection.yaml - Verify MVI compliance detection
test-12-function-folders.yaml - Verify function-based discovery
test-13-priority-ratings.yaml - Verify priority ratings are appropriate
test-14-large-files.yaml - Handle files >200 lines
test-15-malformed-content.yaml - Handle broken YAML/markdown

Low Priority

test-16-circular-deps.yaml - Detect circular dependencies
test-17-performance.yaml - Verify response time <30s
test-18-multiple-matches.yaml - Handle multiple matching files
test-19-no-readme.yaml - Handle directories without README
test-20-integration.yaml - Full workflow test

Test Quality Checklist

For each test, verify:

Clear purpose - What specific behavior is being tested?
Specific assertions - What exact output is expected?
Positive AND negative - Tests both success and failure cases
No false positives - Test would fail if agent misbehaves
No false negatives - Test wouldn't fail for correct behavior
Fast execution - Completes in <30 seconds
Deterministic - Same input always produces same result
Independent - Doesn't depend on other tests

Current Test Issues

Issue 1: Smoke Test Too Simple

Problem: The current smoke test only checks that the agent responds; it doesn't validate output quality.

Fix: Add assertions for the expected output format.

Issue 2: No Output Validation

Problem: Tests don't verify the actual content of responses.

Fix: Add an assertions section to each test with specific checks.

Issue 3: No False Positive Prevention

Problem: Tests don't verify that the agent isn't hallucinating file paths.

Fix: Add a test that verifies all mentioned paths actually exist.

Issue 4: Too Few Negative Tests

Problem: Only one negative test exists (empty directory); more are needed.

Fix: Add tests for invalid paths, ambiguous queries, and malformed files.

Issue 5: No Performance Tests

Problem: Nothing verifies that ContextScout responds quickly.

Fix: Add timeout assertions and performance benchmarks.
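A timing check for the performance issue could start as small as the sketch below. `run_with_budget` is a hypothetical helper, and the 30,000 ms default mirrors the `max_duration: 30000` value used elsewhere in this plan. It measures a call and reports whether it stayed within budget. It does not interrupt a call that hangs, so a real harness would still need a hard timeout.

```python
import time
from typing import Callable


def run_with_budget(
    task: Callable[[], str], max_ms: int = 30_000
) -> tuple[str, float, bool]:
    """Run task and return (result, elapsed milliseconds, within budget)."""
    start = time.perf_counter()
    result = task()
    elapsed_ms = (time.perf_counter() - start) * 1000
    return result, elapsed_ms, elapsed_ms <= max_ms
```

Recording the elapsed time alongside the pass/fail flag gives the benchmark numbers mentioned in the fix without a separate measurement run.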

Next Steps

Review current tests - Analyze what they actually validate
Add assertions - Add specific output validation to existing tests
Create negative tests - Add 5 new negative test cases
Run full suite - Verify all tests pass
Document results - Update README with test coverage

Last Updated: 2026-01-07
Status: Test plan created, implementation pending
