Darren Hinde f669cac34c feat: repository review and MVI context system implementation (#85) 3 months ago

Agent Generator - Evaluation Tests

Overview

Agent: AgentGenerator
Parent Agent: system-builder
Description: Generates XML-optimized agent files following research-backed patterns

Test Structure

system-builder/agent-generator/
├── config/
│   └── config.yaml          # Test configuration
├── tests/
│   └── smoke-test.yaml      # Basic sanity check
├── prompts/                 # Prompt variants (future)
└── README.md                # This file
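
The layout above can be verified with a short shell function. This is a minimal sketch, not part of the repository: the function name check_layout is hypothetical, and it checks only the three files named in the tree (the prompts/ directory is skipped because it is marked as future).

```shell
# Hypothetical check: verify that the files listed in the tree exist under a directory.
# Usage: check_layout <path-to-agent-generator-dir>
check_layout() {
  root="$1"
  rc=0
  for rel in config/config.yaml tests/smoke-test.yaml README.md; do
    if [ ! -f "$root/$rel" ]; then
      # Report every missing file instead of stopping at the first one.
      echo "missing: $rel" >&2
      rc=1
    fi
  done
  return "$rc"
}
```

Calling check_layout system-builder/agent-generator returns 0 when all three files are present and 1 otherwise.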

Running Tests

Standalone Mode

Tests the subagent directly (forces mode: primary):

# Using npm
npm run eval:sdk -- --subagent=system-builder-agent-generator

# Using Makefile
make test-subagent SUBAGENT=system-builder-agent-generator

# Verbose output
npm run eval:sdk -- --subagent=system-builder-agent-generator --verbose

Delegation Mode

Tests the subagent through its parent agent, system-builder (real-world usage):

# Using npm
npm run eval:sdk -- --subagent=system-builder-agent-generator --delegate

# Using Makefile
make test-subagent-delegate SUBAGENT=system-builder-agent-generator
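
To run both modes back to back, the npm commands shown in this section can be wrapped in one function. This is a sketch under assumptions: run_evals and the DRY_RUN variable are hypothetical names, and the only commands it relies on are the two npm invocations documented here.

```shell
# Hypothetical helper: run a subagent's evals in standalone mode, then in delegation mode.
# Set DRY_RUN=1 to print the commands instead of executing them.
# Usage: run_evals <subagent-name>
run_evals() {
  subagent="$1"
  # First pass has no extra flag (standalone); second pass adds --delegate.
  for extra in "" "--delegate"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "npm run eval:sdk -- --subagent=${subagent}${extra:+ $extra}"
    else
      # $extra is intentionally unquoted so the empty value expands to nothing.
      npm run eval:sdk -- "--subagent=${subagent}" $extra || return 1
    fi
  done
}
```

With DRY_RUN=1 set, run_evals system-builder-agent-generator prints the standalone command followed by the delegation command without running either. Without it, the function stops at the first failing run.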

Test Suites

Smoke Tests

  • Purpose: Basic sanity checks
  • Coverage: Agent initialization, basic tool usage
  • Status: ✅ Implemented

Standalone Tests

  • Purpose: Test subagent in isolation
  • Coverage: Core functionality without parent delegation
  • Status: 🚧 TODO

Delegation Tests

  • Purpose: Test subagent via parent agent
  • Coverage: Real-world delegation scenarios
  • Status: 🚧 TODO

Adding Tests

  1. Create a test file in the tests/ directory
  2. Follow the YAML schema from evals/agents/shared/tests/golden/
  3. Add the appropriate tags: subagent, system-builder-agent-generator, and the suite name
  4. Update this README with a description of the new test
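
The tagging step can be checked mechanically before committing. The sketch below assumes the tags appear as literal strings somewhere in the test file; it does not parse YAML or validate against the golden schema, and the function name has_required_tags is hypothetical.

```shell
# Hypothetical check: succeed only if the file mentions every required tag.
# Usage: has_required_tags <test-file> <suite-name>
has_required_tags() {
  file="$1"
  suite="$2"
  for tag in subagent system-builder-agent-generator "$suite"; do
    # -F matches the tag as a fixed string; -q suppresses output.
    if ! grep -qF -- "$tag" "$file"; then
      echo "missing tag: $tag" >&2
      return 1
    fi
  done
}
```

For example, has_required_tags tests/smoke-test.yaml smoke returns 0 only if all three strings occur in the file. Because it is a plain text search, a tag that appears only in a comment would also pass.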

Prompt Variants

The prompts/ directory is reserved for model-specific prompt variants:

  • gpt.md - GPT-optimized prompts
  • gemini.md - Gemini-optimized prompts
  • llama.md - Llama-optimized prompts
  • Additional model families as needed

Status: 🚧 Not yet implemented
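
Since variant selection is not implemented yet, the following is only an illustration of how a model name might map to one of the files listed above. The function name prompt_variant_for and the prefix-matching rule are assumptions, not existing behavior of the eval framework.

```shell
# Hypothetical resolver: map a model name to a variant file under prompts/.
# Prints the path and returns 0 on a match; prints nothing and returns 1 otherwise.
prompt_variant_for() {
  case "$1" in
    gpt*)    echo "prompts/gpt.md" ;;
    gemini*) echo "prompts/gemini.md" ;;
    llama*)  echo "prompts/llama.md" ;;
    *)       return 1 ;;
  esac
}
```

A caller could fall back to the default prompt whenever the function returns a non-zero status.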

Related Documentation