Multi-model prompt variants with an integrated evaluation framework for testing, validation, and continuous improvement.
Last Updated: 2025-12-08
Status: ✅ Production Ready
The Prompt Library System enables model-specific prompt optimization with comprehensive testing and validation.
- ✅ Multi-Model Support - Variants for Claude, GPT-4, Gemini, Grok, Llama/OSS
- ✅ Integrated Testing - Test variants with eval framework
- ✅ Results Tracking - Per-variant and per-model results
- ✅ Easy Switching - Switch between variants with one command
- ✅ Validation - JSON Schema + TypeScript validation
- ✅ Dashboard - Visual results with variant filtering
# Test a variant
cd evals/framework
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --suite=smoke-test
# View results
open ../results/index.html
Completed Features:
Tested & Working:
See the comprehensive documentation files:
┌───────────────────────────────────────────────────────────────┐
│                     Prompt Library System                     │
├───────────────────────────────────────────────────────────────┤
│                                                               │
│  ┌──────────────┐      ┌────────────────┐      ┌───────────┐  │
│  │   Variants   │─────▶│ Eval Framework │─────▶│ Dashboard │  │
│  │ (.md files)  │      │ (Test Runner)  │      │ (Results) │  │
│  └──────────────┘      └────────────────┘      └───────────┘  │
│         │                      │                     │        │
│         │                      │                     │        │
│  ┌──────▼──────┐       ┌───────▼────────┐   ┌────────▼────┐   │
│  │  Metadata   │       │  Test Suites   │   │   Results   │   │
│  │(YAML Front) │       │  (JSON files)  │   │(JSON files) │   │
│  └─────────────┘       └────────────────┘   └─────────────┘   │
│                                                               │
└───────────────────────────────────────────────────────────────┘
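The diagram shows each variant `.md` file carrying its metadata as YAML front matter. As a rough illustration of that split, the sketch below separates a front matter block from the prompt body. It is not the framework's parser: `parseVariant` is a hypothetical helper, the field names `model_family` and `status` are made up for the example, and only flat `key: value` pairs are handled.

```typescript
// Hypothetical metadata shape: the field names used below are assumptions.
type FrontMatter = Record<string, string>;

interface ParsedVariant {
  metadata: FrontMatter;
  body: string;
}

// Splits a variant file into its front matter block and its prompt body.
// Only flat `key: value` lines are understood; nested YAML needs a real parser.
function parseVariant(source: string): ParsedVariant {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (!match) {
    // No front matter delimiters: treat the whole file as the prompt body.
    return { metadata: {}, body: source };
  }
  const metadata: FrontMatter = {};
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx === -1 || line.trim().charAt(0) === "#") continue; // skip comments and non-pairs
    const key = line.slice(0, idx).trim();
    // Strip one pair of surrounding quotes, if present.
    const value = line.slice(idx + 1).trim().replace(/^["']|["']$/g, "");
    if (key) metadata[key] = value;
  }
  return { metadata, body: match[2] ?? "" };
}

// Illustrative input only; real variant files may use different fields.
const sample = [
  "---",
  "model_family: llama",
  'status: "stable"',
  "---",
  "You are OpenAgent.",
].join("\n");

const parsed = parseVariant(sample);
console.log(parsed.metadata, parsed.body);
```

A file without the `---` delimiters comes back with empty metadata and the full text as the body, so callers can treat missing metadata as a validation error rather than a parse failure.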
Prompt Variants:
- .opencode/prompts/openagent/*.md - Variant files
- .opencode/prompts/openagent/results/*.json - Per-variant results
Test Suites:
- evals/agents/openagent/config/*.json - Suite definitions
- evals/agents/openagent/config/suite-schema.json - JSON Schema
Framework:
- evals/framework/src/sdk/prompt-manager.ts - Prompt switching
- evals/framework/src/sdk/suite-validator.ts - Suite validation
- evals/framework/src/sdk/run-sdk-tests.ts - Test runner
Results:
- evals/results/latest.json - Main results
- evals/results/index.html - Dashboard
# Quick smoke test (1 test, ~30s)
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --suite=smoke-test
# Core test suite (7 tests, ~5-8min)
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --suite=core-tests
# With specific model
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --model=ollama/llama3.2 --suite=core-tests
# Custom test pattern
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --pattern="01-critical-rules/**/*.yaml"
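When these commands are launched from a script rather than typed by hand, it can help to assemble the flags from a typed options object. The sketch below is one way to do that. `EvalOptions` and `buildEvalArgs` are hypothetical names, not part of the framework; the flags themselves are the ones used in the commands above.

```typescript
// Hypothetical options object mirroring the flags shown in the commands above.
interface EvalOptions {
  agent: string;
  promptVariant?: string;
  model?: string;
  suite?: string;
  pattern?: string;
}

// Builds the argument list for `npm`, one flag per array element.
function buildEvalArgs(opts: EvalOptions): string[] {
  const args = ["run", "eval:sdk", "--", `--agent=${opts.agent}`];
  if (opts.promptVariant) args.push(`--prompt-variant=${opts.promptVariant}`);
  if (opts.model) args.push(`--model=${opts.model}`);
  if (opts.suite) args.push(`--suite=${opts.suite}`);
  if (opts.pattern) args.push(`--pattern=${opts.pattern}`);
  return args;
}

const coreRun = buildEvalArgs({
  agent: "openagent",
  promptVariant: "llama",
  model: "ollama/llama3.2",
  suite: "core-tests",
});
console.log(["npm"].concat(coreRun).join(" "));
```

Keeping the arguments as an array means a glob such as `01-critical-rules/**/*.yaml` can be handed to a process launcher as a single element, without the shell quoting the command-line form needs.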
# 1. Copy template
cp .opencode/prompts/openagent/TEMPLATE.md .opencode/prompts/openagent/my-variant.md
# 2. Edit metadata and content
# 3. Test
npm run eval:sdk -- --agent=openagent --prompt-variant=my-variant --suite=smoke-test
# 4. Validate
cd evals/framework && npm run validate:suites openagent
# 1. Copy existing suite
cp evals/agents/openagent/config/smoke-test.json \
evals/agents/openagent/config/my-suite.json
# 2. Edit suite
# 3. Validate
cd evals/framework && npm run validate:suites openagent
# 4. Run
npm run eval:sdk -- --agent=openagent --suite=my-suite
class PromptManager {
  constructor(projectRoot: string);
  variantExists(agent: string, variant: string): boolean;
  listVariants(agent: string): string[];
  readMetadata(agent: string, variant: string): PromptMetadata;
  switchToVariant(agent: string, variant: string): SwitchResult;
  restoreDefault(agent: string): boolean;
}
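A common way to use a switch/restore pair like `switchToVariant` and `restoreDefault` is to wrap the work in `try`/`finally`, so the default prompt comes back even if a test throws. The sketch below illustrates that pattern against an in-memory stand-in. `FakePromptManager` and `withVariant` are hypothetical, and the `SwitchResult` fields (`success`, `error`) are assumed, since this document does not show the real shape.

```typescript
// Assumed result shape; the real SwitchResult fields are not shown in this document.
interface SwitchResult {
  success: boolean;
  error?: string;
}

// Subset of the PromptManager methods listed above.
interface PromptSwitcher {
  variantExists(agent: string, variant: string): boolean;
  switchToVariant(agent: string, variant: string): SwitchResult;
  restoreDefault(agent: string): boolean;
}

// In-memory stand-in so the sketch runs without touching the filesystem.
class FakePromptManager implements PromptSwitcher {
  active = "default";
  private variants: string[];

  constructor(variants: string[]) {
    this.variants = variants;
  }
  variantExists(_agent: string, variant: string): boolean {
    return this.variants.indexOf(variant) !== -1;
  }
  switchToVariant(agent: string, variant: string): SwitchResult {
    if (!this.variantExists(agent, variant)) {
      return { success: false, error: `unknown variant: ${variant}` };
    }
    this.active = variant;
    return { success: true };
  }
  restoreDefault(_agent: string): boolean {
    this.active = "default";
    return true;
  }
}

// Runs `task` with the variant active and always restores the default afterwards.
function withVariant<T>(pm: PromptSwitcher, agent: string, variant: string, task: () => T): T {
  const result = pm.switchToVariant(agent, variant);
  if (!result.success) {
    throw new Error(result.error ?? `could not switch to ${variant}`);
  }
  try {
    return task();
  } finally {
    pm.restoreDefault(agent);
  }
}

const pm = new FakePromptManager(["default", "llama"]);
const seen = withVariant(pm, "openagent", "llama", () => pm.active);
console.log(seen, pm.active);
```

The switch is checked before entering `try`, so a failed switch never triggers a restore of a prompt that was not changed.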
class SuiteValidator {
  constructor(agentsDir: string);
  loadSuite(agent: string, suiteName: string): TestSuite;
  validateSuite(agent: string, suite: TestSuite): ValidationResult;
  getTestPaths(agent: string, suite: TestSuite): string[];
}
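The three `SuiteValidator` methods suggest a load, validate, resolve sequence. The sketch below shows one possible way to chain them so an invalid suite fails before any test path is resolved. The `TestSuite` and `ValidationResult` shapes, the `resolveSuitePaths` helper, the stub, and the `tests/` directory are all assumptions made for illustration; the real types live in the framework.

```typescript
// Assumed shapes; the real TestSuite and ValidationResult are defined by the framework.
interface TestSuite {
  name: string;
  tests: string[];
}
interface ValidationResult {
  valid: boolean;
  errors: string[];
}

// The SuiteValidator methods listed above, expressed as an interface.
interface SuiteValidatorLike {
  loadSuite(agent: string, suiteName: string): TestSuite;
  validateSuite(agent: string, suite: TestSuite): ValidationResult;
  getTestPaths(agent: string, suite: TestSuite): string[];
}

// Loads a suite, refuses to continue if validation fails, then resolves test paths.
function resolveSuitePaths(v: SuiteValidatorLike, agent: string, suiteName: string): string[] {
  const suite = v.loadSuite(agent, suiteName);
  const result = v.validateSuite(agent, suite);
  if (!result.valid) {
    throw new Error(`Suite "${suiteName}" is invalid:\n- ${result.errors.join("\n- ")}`);
  }
  return v.getTestPaths(agent, suite);
}

// Stub validator with made-up data so the sketch runs without any files on disk.
const stubValidator: SuiteValidatorLike = {
  loadSuite: (_agent, suiteName) => ({
    name: suiteName,
    tests: suiteName === "smoke-test" ? ["01-critical-rules/example.yaml"] : [],
  }),
  validateSuite: (_agent, suite) =>
    suite.tests.length > 0
      ? { valid: true, errors: [] }
      : { valid: false, errors: ["suite lists no tests"] },
  // The tests/ directory here is a placeholder, not a documented location.
  getTestPaths: (agent, suite) => suite.tests.map((t) => `evals/agents/${agent}/tests/${t}`),
};

const smokePaths = resolveSuitePaths(stubValidator, "openagent", "smoke-test");
console.log(smokePaths);
```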
All variants were tested with the core test suite (7 tests):
| Variant | Pass Rate | Model Tested | Status |
|---|---|---|---|
| default | 7/7 (100%) | opencode/grok-code-fast | ✅ Stable |
| gpt | 7/7 (100%) | opencode/grok-code-fast | ✅ Stable |
| gemini | 7/7 (100%) | opencode/grok-code-fast | ✅ Stable |
| grok | 7/7 (100%) | opencode/grok-code-fast | ✅ Stable |
| llama | 7/7 (100%) | opencode/grok-code-fast | ✅ Stable |
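For readers who want to regenerate a table like this from raw outcomes, the sketch below tallies passes per variant and model and formats the pass rate the same way (`7/7 (100%)`). The `TestOutcome` record is a hypothetical shape, not the layout of `latest.json`, and the sample data is constructed in code rather than taken from a real run.

```typescript
// Hypothetical per-test record; the real results JSON layout is not shown here.
interface TestOutcome {
  variant: string;
  model: string;
  passed: boolean;
}

interface VariantSummary {
  variant: string;
  model: string;
  passed: number;
  total: number;
}

// Tallies outcomes per (variant, model) pair, keeping first-seen order.
function summarize(outcomes: TestOutcome[]): VariantSummary[] {
  const byKey: Record<string, VariantSummary> = {};
  const summaries: VariantSummary[] = [];
  for (const o of outcomes) {
    const key = `${o.variant}|${o.model}`;
    let entry = byKey[key];
    if (!entry) {
      entry = { variant: o.variant, model: o.model, passed: 0, total: 0 };
      byKey[key] = entry;
      summaries.push(entry);
    }
    entry.total += 1;
    if (o.passed) entry.passed += 1;
  }
  return summaries;
}

// Formats one summary as a table row: variant, pass rate, model.
function toRow(s: VariantSummary): string {
  const pct = s.total === 0 ? 0 : Math.round((s.passed / s.total) * 100);
  return `| ${s.variant} | ${s.passed}/${s.total} (${pct}%) | ${s.model} |`;
}

// Synthetic sample: seven passing outcomes for one variant.
const sampleOutcomes: TestOutcome[] = [];
for (let i = 0; i < 7; i++) {
  sampleOutcomes.push({ variant: "llama", model: "opencode/grok-code-fast", passed: true });
}

const rows = summarize(sampleOutcomes).map(toRow);
console.log(rows.join("\n"));
```

Keying on both variant and model keeps runs of the same variant against different models in separate rows.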
Questions? Open an issue or see the main README.