# Core Test Suite - Minimum Viable Tests

**Purpose:** Minimum tests needed to validate OpenAgent's 4 critical rules

**Total:** 8 core tests (down from 49)

---

## Core Tests (8 tests)

### 1. Approval Gate (2 tests)

- ✅ `05-approval-before-execution-positive.yaml` - Standard approval workflow
- ❌ `02-missing-approval-negative.yaml` - Should fail without approval

### 2. Context Loading (3 tests)

- ✅ `01-code-task.yaml` - Code task loads code.md
- ✅ `02-docs-task.yaml` - Docs task loads docs.md
- ❌ `11-wrong-context-file-negative.yaml` - Should fail with wrong context

### 3. Stop on Failure (2 tests)

- ✅ `02-stop-and-report-positive.yaml` - Stops and reports
- ❌ `03-auto-fix-negative.yaml` - Should fail if auto-fixes

### 4. Report First (1 test)

- ✅ `01-correct-workflow-positive.yaml` - Report→Propose→Approve→Fix

---

## Why These 8 Tests?

**Approval Gate (2 tests):**

- Positive: Validates that approval BEFORE execution works
- Negative: Validates that missing approval is caught

**Context Loading (3 tests):**

- Code task: Most common use case
- Docs task: Second most common use case
- Wrong context: Validates that the evaluator catches the wrong file

**Stop on Failure (2 tests):**

- Positive: Validates that the agent stops on error
- Negative: Validates that auto-fix is caught

**Report First (1 test):**

- Validates the Report→Propose→Approve→Fix workflow

---

## What We're NOT Testing (Can Add Later)

- Conversational path (3 tests)
- Multi-turn context (2 tests)
- Delegation (2 tests)
- Edge cases (3 tests)
- Integration (6 tests)
- Behavior validation (4 tests)
- Tool usage (2 tests)

**Total skipped:** 22 tests

---

## Token Optimization

**Full Suite:** 49 tests × ~7,000 tokens = ~343,000 tokens

**Core Suite:** 8 tests × ~7,000 tokens = ~56,000 tokens

**Savings:** 84% reduction in tokens
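
The test counts and token figures above can be cross-checked with a short script. This is only a sketch: the `CORE_TESTS` dictionary and the helper functions `suite_tokens` and `savings_percent` are illustrative names, not part of OpenAgent. The file names are copied from the core test list without directory paths, and ~7,000 tokens per test is the rough estimate used in this document.

```python
# Core tests grouped by the rule they validate (file names from the list above).
# CORE_TESTS is a hypothetical structure for illustration only.
CORE_TESTS = {
    "approval-gate": [
        "05-approval-before-execution-positive.yaml",
        "02-missing-approval-negative.yaml",
    ],
    "context-loading": [
        "01-code-task.yaml",
        "02-docs-task.yaml",
        "11-wrong-context-file-negative.yaml",
    ],
    "stop-on-failure": [
        "02-stop-and-report-positive.yaml",
        "03-auto-fix-negative.yaml",
    ],
    "report-first": [
        "01-correct-workflow-positive.yaml",
    ],
}

FULL_SUITE_TESTS = 49     # size of the full suite, as stated above
TOKENS_PER_TEST = 7_000   # approximate per-test cost, as stated above


def suite_tokens(test_count, tokens_per_test=TOKENS_PER_TEST):
    """Estimate total tokens for a suite of the given size."""
    return test_count * tokens_per_test


def savings_percent(full_tokens, core_tokens):
    """Percentage reduction from the full suite to the core suite, rounded."""
    return round((1 - core_tokens / full_tokens) * 100)


core_count = sum(len(files) for files in CORE_TESTS.values())
full_tokens = suite_tokens(FULL_SUITE_TESTS)
core_tokens = suite_tokens(core_count)

print(f"Core tests: {core_count}")
print(f"Full suite: ~{full_tokens:,} tokens")
print(f"Core suite: ~{core_tokens:,} tokens")
print(f"Savings:    {savings_percent(full_tokens, core_tokens)}%")
```

Because the savings figure depends only on the ratio of test counts (8 of 49), it holds for any per-test token estimate as long as tests cost roughly the same.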