
Build Agent - Evaluation Tests

Overview

Agent: BuildAgent
Parent Agent: opencoder
Description: Type checking and build validation specialist

Test Structure

code/build-agent/
├── config/
│   └── config.yaml          # Test configuration
├── tests/
│   └── smoke-test.yaml      # Basic sanity check
├── prompts/                 # Prompt variants (future)
└── README.md                # This file

Running Tests

Standalone Mode

Tests the subagent directly (forces mode: primary):

# Using npm
npm run eval:sdk -- --subagent=code-build-agent

# Using Makefile
make test-subagent SUBAGENT=code-build-agent

# Verbose output
npm run eval:sdk -- --subagent=code-build-agent --verbose

Delegation Mode

Tests via parent agent (real-world usage):

# Using npm
npm run eval:sdk -- --subagent=code-build-agent --delegate

# Using Makefile
make test-subagent-delegate SUBAGENT=code-build-agent
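
To exercise both modes in one pass, the npm commands above can be wrapped in a small script. This is only a sketch: the `eval_cmd` and `run_both` helpers and the `DRY_RUN` variable are hypothetical names introduced here, not part of the repository. Only the `npm run eval:sdk` invocation and the `--subagent` and `--delegate` flags come from this README.

```shell
#!/bin/sh
# Hypothetical helper: print the eval command for a given mode.
# Only the npm script name and its flags are taken from this README.
eval_cmd() {
  mode="$1"
  subagent="${2:-code-build-agent}"
  case "$mode" in
    standalone) echo "npm run eval:sdk -- --subagent=$subagent" ;;
    delegate)   echo "npm run eval:sdk -- --subagent=$subagent --delegate" ;;
    *)          echo "unknown mode: $mode" >&2; return 1 ;;
  esac
}

# Run standalone first, then delegation; stop at the first failure.
# With DRY_RUN=1 (a hypothetical switch) the commands are printed, not run.
run_both() {
  for mode in standalone delegate; do
    cmd=$(eval_cmd "$mode" "$1") || return 1
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "$cmd"
    else
      sh -c "$cmd" || return 1
    fi
  done
}
```

After sourcing the file, `DRY_RUN=1 run_both` prints the two commands without invoking npm, which is a cheap way to confirm the flags before a real run.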

Test Suites

Smoke Tests

Purpose: Basic sanity checks
Coverage: Agent initialization, basic tool usage
Status: ✅ Implemented

Standalone Tests

Purpose: Test subagent in isolation
Coverage: Core functionality without parent delegation
Status: 🚧 TODO

Delegation Tests

Purpose: Test subagent via parent agent
Coverage: Real-world delegation scenarios
Status: 🚧 TODO

Adding Tests

1. Create a test file in the tests/ directory.
2. Follow the YAML schema used in evals/agents/shared/tests/golden/.
3. Add the appropriate tags: subagent, code-build-agent, and the suite name.
4. Update this README with a description of the test.
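
The tagging step can be checked mechanically before a test is committed. The sketch below assumes, without confirmation from the shared schema, that each tag appears as literal text somewhere in the YAML file. `check_tags` is a hypothetical helper, not an existing script in this repository.

```shell
#!/bin/sh
# Hypothetical helper: report test files that lack the required tags.
# Assumption: tags appear as literal text in each file; the real schema
# in evals/agents/shared/tests/golden/ may structure them differently.
check_tags() {
  dir="${1:-tests}"
  status=0
  for file in "$dir"/*.yaml; do
    [ -e "$file" ] || continue   # directory has no YAML files
    for tag in subagent code-build-agent; do
      if ! grep -q -F -- "$tag" "$file"; then
        echo "$file: missing tag '$tag'"
        status=1
      fi
    done
  done
  return "$status"
}
```

Calling `check_tags tests` from this directory returns non-zero if any file lacks one of the two fixed tags. The suite-name tag varies per file, so this sketch does not check it.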

Prompt Variants

The prompts/ directory is reserved for model-specific prompt variants:

gpt.md - GPT-optimized prompts
gemini.md - Gemini-optimized prompts
llama.md - Llama-optimized prompts
etc.

Status: 🚧 Not yet implemented
