Multi-model prompt variants with an integrated evaluation framework for testing, validation, and continuous improvement.
Last Updated: 2025-12-08 | Status: Production Ready
The Prompt Library System enables model-specific prompt optimization with comprehensive testing and validation.
- Multi-Model Support - Variants for Claude, GPT-4, Gemini, Grok, Llama/OSS
- Integrated Testing - Test variants with eval framework
- Results Tracking - Per-variant and per-model results
- Easy Switching - Switch between variants with one command
- Validation - JSON Schema + TypeScript validation
- Dashboard - Visual results with variant filtering
# Test a variant
cd evals/framework
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --suite=smoke-test
# View results
open ../results/index.html
Completed Features:
Tested & Working:
See the comprehensive documentation files:
┌────────────────────────────────────────────────────────────────┐
│                     Prompt Library System                      │
├────────────────────────────────────────────────────────────────┤
│                                                                │
│  ┌──────────────┐      ┌──────────────┐      ┌──────────────┐  │
│  │   Variants   │─────▶│Eval Framework│─────▶│  Dashboard   │  │
│  │ (.md files)  │      │(Test Runner) │      │  (Results)   │  │
│  └──────┬───────┘      └──────┬───────┘      └──────┬───────┘  │
│         │                     │                     │          │
│         │                     │                     │          │
│  ┌──────▼───────┐      ┌──────▼───────┐      ┌──────▼───────┐  │
│  │   Metadata   │      │ Test Suites  │      │   Results    │  │
│  │ (YAML Front) │      │ (JSON files) │      │ (JSON files) │  │
│  └──────────────┘      └──────────────┘      └──────────────┘  │
│                                                                │
└────────────────────────────────────────────────────────────────┘
Prompt Variants:
- `.opencode/prompts/core/openagent/*.md` - Variant files
- `.opencode/prompts/core/openagent/results/*.json` - Per-variant results

Test Suites:
- `evals/agents/openagent/config/*.json` - Suite definitions
- `evals/agents/openagent/config/suite-schema.json` - JSON Schema

Framework:
- `evals/framework/src/sdk/prompt-manager.ts` - Prompt switching
- `evals/framework/src/sdk/suite-validator.ts` - Suite validation
- `evals/framework/src/sdk/run-sdk-tests.ts` - Test runner

Results:
- `evals/results/latest.json` - Main results
- `evals/results/index.html` - Dashboard

# Quick smoke test (1 test, ~30s)
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --suite=smoke-test
# Core test suite (7 tests, ~5-8min)
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --suite=core-tests
# With specific model
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --model=ollama/llama3.2 --suite=core-tests
# Custom test pattern
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --pattern="01-critical-rules/**/*.yaml"
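When these runs are scripted (for example, to loop over several variants), it can help to assemble the flags programmatically. The following is a minimal sketch that uses only the flags shown in the commands above; `EvalOptions`, `buildEvalArgs`, and `buildEvalCommand` are hypothetical helper names and are not part of the framework.

```typescript
// Options mirroring the flags used in the commands above.
interface EvalOptions {
  agent: string;
  promptVariant?: string;
  model?: string;
  suite?: string;
  pattern?: string;
}

// Hypothetical helper: builds the argument list that follows `npm run eval:sdk --`.
function buildEvalArgs(opts: EvalOptions): string[] {
  const args: string[] = [`--agent=${opts.agent}`];
  if (opts.promptVariant) args.push(`--prompt-variant=${opts.promptVariant}`);
  if (opts.model) args.push(`--model=${opts.model}`);
  if (opts.suite) args.push(`--suite=${opts.suite}`);
  if (opts.pattern) args.push(`--pattern=${opts.pattern}`);
  return args;
}

// Joins the arguments into a display string. Values are not shell-quoted here.
function buildEvalCommand(opts: EvalOptions): string {
  return ["npm", "run", "eval:sdk", "--"].concat(buildEvalArgs(opts)).join(" ");
}

const smokeCommand = buildEvalCommand({
  agent: "openagent",
  promptVariant: "llama",
  suite: "smoke-test",
});
```

Because the string form does not quote glob patterns, it is probably safer to pass the array from `buildEvalArgs` directly to a process spawner than to paste the joined string into a shell.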
# 1. Copy template
cp .opencode/prompts/core/openagent/TEMPLATE.md .opencode/prompts/core/openagent/my-variant.md
# 2. Edit metadata and content
# 3. Test
npm run eval:sdk -- --agent=openagent --prompt-variant=my-variant --suite=smoke-test
# 4. Validate
cd evals/framework && npm run validate:suites openagent
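Step 2 involves editing the YAML front matter at the top of the variant file. As a rough illustration of how that metadata can be separated from the prompt body, here is a minimal sketch that handles only flat `key: value` pairs between `---` delimiters. `parseVariant` is a hypothetical helper, and the `model` and `status` keys in the example are placeholders; the actual metadata fields are defined by `TEMPLATE.md`.

```typescript
interface ParsedVariant {
  metadata: { [key: string]: string };
  body: string;
}

// Hypothetical helper: splits a variant file into front-matter metadata and body.
// Only flat `key: value` lines are supported; nested YAML is not parsed.
function parseVariant(source: string): ParsedVariant {
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (!match) {
    // No front matter found: treat the whole file as the prompt body.
    return { metadata: {}, body: source };
  }
  const metadata: { [key: string]: string } = {};
  for (const line of (match[1] ?? "").split(/\r?\n/)) {
    const idx = line.indexOf(":");
    if (idx === -1) continue; // skip lines that are not key/value pairs
    const key = line.slice(0, idx).trim();
    if (key) metadata[key] = line.slice(idx + 1).trim();
  }
  return { metadata, body: match[2] ?? "" };
}

// Example with placeholder keys (assumptions, not the template's real fields).
const exampleVariant = parseVariant("---\nmodel: llama\nstatus: stable\n---\nPrompt text here.\n");
```

A real implementation would more likely use a full YAML parser; this sketch only shows where the metadata ends and the prompt content begins.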
# 1. Copy existing suite
cp evals/agents/openagent/config/smoke-test.json \
evals/agents/openagent/config/my-suite.json
# 2. Edit suite
# 3. Validate
cd evals/framework && npm run validate:suites openagent
# 4. Run
npm run eval:sdk -- --agent=openagent --suite=my-suite
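Before running `validate:suites`, a quick structural check on the edited JSON can catch obvious mistakes. The sketch below assumes a hypothetical minimal suite shape with a `name` string and a `tests` array of paths. That shape is an assumption for illustration only; the authoritative contract is `evals/agents/openagent/config/suite-schema.json`, and `checkSuiteShape` is not part of the framework.

```typescript
// Hypothetical shape check. Assumed fields: name (string), tests (string[]).
// Returns a list of problems; an empty list means the value matches the sketch.
function checkSuiteShape(value: unknown): string[] {
  if (typeof value !== "object" || value === null || Array.isArray(value)) {
    return ["suite must be a JSON object"];
  }
  const problems: string[] = [];
  const suite = value as { [key: string]: unknown };
  if (typeof suite["name"] !== "string" || suite["name"] === "") {
    problems.push("name must be a non-empty string");
  }
  const tests = suite["tests"];
  if (!Array.isArray(tests) || tests.length === 0) {
    problems.push("tests must be a non-empty array");
  } else {
    tests.forEach((t, i) => {
      if (typeof t !== "string") problems.push(`tests[${i}] must be a string`);
    });
  }
  return problems;
}

// The test path below is a placeholder, not a file known to exist in the repo.
const draftSuite: unknown = JSON.parse(
  '{"name": "my-suite", "tests": ["01-critical-rules/example.yaml"]}'
);
const draftProblems = checkSuiteShape(draftSuite);
```

This does not replace schema validation; it only illustrates the kind of check the validator performs against the real schema.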
// Signatures only; the implementation lives in evals/framework/src/sdk/prompt-manager.ts
declare class PromptManager {
  constructor(projectRoot: string);
  variantExists(agent: string, variant: string): boolean;
  listVariants(agent: string): string[];
  readMetadata(agent: string, variant: string): PromptMetadata;
  switchToVariant(agent: string, variant: string): SwitchResult;
  restoreDefault(agent: string): boolean;
}
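One way these methods might be combined is a switch, run, restore pattern that always returns the agent to its default prompt, even when the task throws. The sketch below is an assumption about usage, not the framework's own code: `PromptSwitcher` is a subset of the signatures listed above with `SwitchResult` treated as opaque, `withVariant` is a hypothetical helper, and `FakePromptManager` is an in-memory stand-in that exists only so the example runs on its own.

```typescript
// Subset of the PromptManager signatures; the SwitchResult fields are not
// documented here, so the return type is left as unknown.
interface PromptSwitcher {
  variantExists(agent: string, variant: string): boolean;
  switchToVariant(agent: string, variant: string): unknown;
  restoreDefault(agent: string): boolean;
}

// Hypothetical helper: runs a synchronous task with a variant active and
// restores the default prompt afterwards, whether or not the task throws.
function withVariant<T>(pm: PromptSwitcher, agent: string, variant: string, task: () => T): T {
  if (!pm.variantExists(agent, variant)) {
    throw new Error(`Unknown variant "${variant}" for agent "${agent}"`);
  }
  pm.switchToVariant(agent, variant);
  try {
    return task();
  } finally {
    pm.restoreDefault(agent);
  }
}

// In-memory stand-in used only to make this sketch runnable.
class FakePromptManager implements PromptSwitcher {
  active = "default";
  constructor(private readonly variants: string[]) {}
  variantExists(_agent: string, variant: string): boolean {
    return this.variants.indexOf(variant) !== -1;
  }
  switchToVariant(_agent: string, variant: string): unknown {
    this.active = variant;
    return undefined;
  }
  restoreDefault(_agent: string): boolean {
    this.active = "default";
    return true;
  }
}
```

Note that `withVariant` is synchronous: if the task returns a promise, the restore would run before the promise settles, so an async version would need to await the task inside the `try` block.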
// Signatures only; the implementation lives in evals/framework/src/sdk/suite-validator.ts
declare class SuiteValidator {
  constructor(agentsDir: string);
  loadSuite(agent: string, suiteName: string): TestSuite;
  validateSuite(agent: string, suite: TestSuite): ValidationResult;
  getTestPaths(agent: string, suite: TestSuite): string[];
}
All variants were tested with the core test suite (7 tests). Note that every run listed here used the same model, opencode/grok-code-fast:
| Variant | Pass Rate | Model Tested | Status |
|---|---|---|---|
| default | 7/7 (100%) | opencode/grok-code-fast | Stable |
| gpt | 7/7 (100%) | opencode/grok-code-fast | Stable |
| gemini | 7/7 (100%) | opencode/grok-code-fast | Stable |
| grok | 7/7 (100%) | opencode/grok-code-fast | Stable |
| llama | 7/7 (100%) | opencode/grok-code-fast | Stable |
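The Pass Rate column can be derived from per-test outcomes. The following is a minimal sketch of that aggregation; the `TestOutcome` record is a hypothetical shape chosen for illustration, and the actual fields in `evals/results/latest.json` may differ.

```typescript
// Hypothetical per-test record; real result files may use different fields.
interface TestOutcome {
  variant: string;
  passed: boolean;
}

// Formats a pass rate in the same style as the table, e.g. "7/7 (100%)".
function formatPassRate(passed: number, total: number): string {
  if (total === 0) return "0/0 (n/a)";
  return `${passed}/${total} (${Math.round((passed / total) * 100)}%)`;
}

// Groups outcomes by variant and returns one formatted pass rate per variant.
function summarizeByVariant(outcomes: TestOutcome[]): { [variant: string]: string } {
  const counts: { [variant: string]: { passed: number; total: number } } = {};
  for (const o of outcomes) {
    const c = counts[o.variant] || (counts[o.variant] = { passed: 0, total: 0 });
    c.total += 1;
    if (o.passed) c.passed += 1;
  }
  const summary: { [variant: string]: string } = {};
  for (const variant of Object.keys(counts)) {
    const c = counts[variant];
    if (c) summary[variant] = formatPassRate(c.passed, c.total);
  }
  return summary;
}
```

Percentages are rounded to the nearest integer, so 5 of 7 passing would display as `5/7 (71%)`.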
Questions? Open an issue or see the main README.