Shared Test Cases

Tests in this directory are agent-agnostic and can be used to test any agent that follows the same core rules.

Purpose

Shared tests validate universal behaviors that every agent is expected to exhibit:

  • Approval gate enforcement
  • Tool usage patterns
  • Basic workflow compliance
  • Error handling

Usage

Run Shared Tests for OpenAgent

npm run eval:sdk -- --pattern="shared/**/*.yaml" --agent=openagent

Run Shared Tests for OpenCoder

npm run eval:sdk -- --pattern="shared/**/*.yaml" --agent=opencoder
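
The two commands above differ only in the --agent value, so a small wrapper can run the shared suite against several agents in one pass. The following is a minimal sketch, not part of the repository: the function name run_shared_tests and the EVAL_CMD override are hypothetical helpers introduced here for illustration, and the sketch assumes the eval:sdk script accepts --pattern and --agent exactly as shown above.

```shell
# Sketch: run the shared suite once per agent name passed as an argument.
# EVAL_CMD is a hypothetical override (e.g. EVAL_CMD=echo for a dry run);
# when unset, the command from the sections above is used unchanged.
run_shared_tests() {
  local rc=0
  local agent
  for agent in "$@"; do
    echo "==> shared tests: ${agent}"
    # Only the --agent value changes between runs.
    ${EVAL_CMD:-npm run eval:sdk --} --pattern="shared/**/*.yaml" --agent="${agent}" || rc=1
  done
  # Non-zero if any agent's run failed, so CI can fail on the first regression.
  return "${rc}"
}
```

Usage: run_shared_tests openagent opencoder. Continuing after a failed run (instead of stopping at the first failure) is a design choice made here so that results for every agent are available for comparison.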

Override Agent in Test File

# In the YAML file
agent: openagent  # Change to opencoder, or any other agent

Test Categories

common/ - Universal Rules

Tests that apply to all agents:

  • approval-gate-basic.yaml - Basic approval enforcement
  • tool-usage-basic.yaml - Basic tool selection (planned)
  • error-handling-basic.yaml - Basic error handling (planned)

Adding New Shared Tests

  1. Create the test file in shared/tests/common/
  2. Use generic prompts that do not depend on any one agent
  3. Test universal behaviors only
  4. Tag the test with shared-test and agent-agnostic
  5. Document which agents the test applies to
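
Two of the steps above (file location and tags) can be checked mechanically before a test is committed. The sketch below is one possible pre-commit check, under stated assumptions: check_shared_test is a hypothetical helper that is not part of the repository, and it assumes tags are written as a block-style YAML list with one tag per line, as in the example in this README. Steps 2, 3, and 5 still need human review.

```shell
# Sketch: verify that a test file sits under shared/tests/common/ and
# carries both required tags. Uses grep rather than a YAML parser, so it
# only recognizes tags written one per line as "- tag-name".
check_shared_test() {
  local file="$1"
  local rc=0
  local tag
  for tag in shared-test agent-agnostic; do
    if ! grep -Eq "^[[:space:]]*-[[:space:]]*${tag}[[:space:]]*$" "$file"; then
      echo "missing tag: ${tag}" >&2
      rc=1
    fi
  done
  case "$file" in
    shared/tests/common/*|*/shared/tests/common/*) ;;
    *) echo "not under shared/tests/common/: ${file}" >&2; rc=1 ;;
  esac
  return "${rc}"
}
```

Usage: check_shared_test shared/tests/common/approval-gate-basic.yaml. The function returns non-zero and prints the reason to stderr when a check fails.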

Example

id: shared-example-001
name: Example Shared Test
category: edge-case
agent: openagent  # Default, can be overridden

prompt: "Generic prompt that works for any agent"

behavior:
  requiresApproval: true  # Universal rule

expectedViolations:
  - rule: approval-gate
    shouldViolate: false

tags:
  - shared-test
  - agent-agnostic

Benefits

  1. Reduce Duplication - Write once, test multiple agents
  2. Consistency - Running the same tests against every agent enforces consistent behavior
  3. Easy Comparison - Compare agent behaviors side-by-side
  4. Faster Onboarding - New agents inherit core test suite