
Agent Generator - Evaluation Tests

Overview

Agent: AgentGenerator
Parent Agent: system-builder
Description: Generates XML-optimized agent files following research-backed patterns

Test Structure

system-builder/agent-generator/
├── config/
│   └── config.yaml          # Test configuration
├── tests/
│   └── smoke-test.yaml      # Basic sanity check
├── prompts/                 # Prompt variants (future)
└── README.md                # This file
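
As a quick sanity check before running the suite, the layout above can be verified with a small shell function. This is only a sketch: `check_layout` is a hypothetical helper, not part of the repository, and it assumes you pass it the path to the agent-generator directory.

```shell
# Hypothetical helper (not part of the repo): report entries from the
# documented layout that are missing under the given directory.
# Usage: check_layout <path-to-agent-generator-dir>
check_layout() {
  local root="$1" missing=0 entry
  for entry in config/config.yaml tests/smoke-test.yaml prompts README.md; do
    if [ ! -e "$root/$entry" ]; then
      echo "missing: $entry"
      missing=1
    fi
  done
  # Exit status is 0 when everything exists, 1 otherwise.
  return "$missing"
}
```

Calling `check_layout system-builder/agent-generator` from the directory that contains `system-builder/` prints one line per missing entry and returns a non-zero status if anything is absent.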

Running Tests

Standalone Mode

Tests the subagent directly (forces mode: primary):

# Using npm
npm run eval:sdk -- --subagent=system-builder-agent-generator

# Using Makefile
make test-subagent SUBAGENT=system-builder-agent-generator

# Verbose output
npm run eval:sdk -- --subagent=system-builder-agent-generator --verbose

Delegation Mode

Tests via parent agent (real-world usage):

# Using npm
npm run eval:sdk -- --subagent=system-builder-agent-generator --delegate

# Using Makefile
make test-subagent-delegate SUBAGENT=system-builder-agent-generator
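
The two modes differ only by the `--delegate` flag, so a thin wrapper can produce either command. The sketch below assumes nothing beyond the npm commands shown above; `eval_cmd` is a hypothetical name, and it only prints the command rather than running it.

```shell
# Hypothetical wrapper (not part of the repo): print the npm command
# for a subagent in the requested mode (standalone by default).
eval_cmd() {
  local subagent="$1" mode="${2:-standalone}"
  case "$mode" in
    standalone) echo "npm run eval:sdk -- --subagent=$subagent" ;;
    delegate)   echo "npm run eval:sdk -- --subagent=$subagent --delegate" ;;
    *)          echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

# Print the commands for both modes of this subagent.
for mode in standalone delegate; do
  eval_cmd system-builder-agent-generator "$mode"
done
```

Printing instead of executing keeps the sketch safe to try; pipe the output to `sh` from the repository root to actually run the evaluations.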

Test Suites

Smoke Tests

Purpose: Basic sanity checks
Coverage: Agent initialization, basic tool usage
Status: ✅ Implemented

Standalone Tests

Purpose: Test subagent in isolation
Coverage: Core functionality without parent delegation
Status: 🚧 TODO

Delegation Tests

Purpose: Test subagent via parent agent
Coverage: Real-world delegation scenarios
Status: 🚧 TODO

Adding Tests

1. Create a test file in the tests/ directory.
2. Follow the YAML schema from evals/agents/shared/tests/golden/.
3. Add the appropriate tags: subagent, system-builder-agent-generator, and the suite name.
4. Update this README with a description of the test.
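
The tagging step is easy to forget, so a rough pre-commit check may help. The sketch below is an assumption-laden illustration: `check_tags` is a hypothetical helper, and it only does a plain substring search for each tag anywhere in the file; it does not parse the YAML or know the real schema.

```shell
# Hypothetical helper (not part of the repo): confirm that a test file
# mentions each required tag. This is a plain-text match, not a YAML parse.
# Usage: check_tags <test-file> <suite-name>
check_tags() {
  local file="$1" suite="$2" tag status=0
  for tag in subagent system-builder-agent-generator "$suite"; do
    if ! grep -qF -- "$tag" "$file"; then
      echo "missing tag: $tag"
      status=1
    fi
  done
  return "$status"
}
```

For example, `check_tags tests/smoke-test.yaml smoke` would print a line for each tag string it cannot find; the suite name `smoke` here is an assumed value, so substitute whatever name the suite actually uses.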

Prompt Variants

The prompts/ directory is reserved for model-specific prompt variants:

gpt.md - GPT-optimized prompts
gemini.md - Gemini-optimized prompts
llama.md - Llama-optimized prompts
etc.

Status: 🚧 Not yet implemented
