Prompt Library System

Multi-model prompt variants with an integrated evaluation framework for testing, validation, and continuous improvement.

Last Updated: 2025-12-08
Status: ✅ Production Ready

Overview

The Prompt Library System keeps model-specific variants of each agent prompt and runs them through the evaluation framework, so every variant can be tested and validated before it is used.

Key Features

✅ Multi-Model Support - Variants for Claude, GPT-4, Gemini, Grok, Llama/OSS
✅ Integrated Testing - Test variants with eval framework
✅ Results Tracking - Per-variant and per-model results
✅ Easy Switching - Switch between variants with one command
✅ Validation - JSON Schema + TypeScript validation
✅ Dashboard - Visual results with variant filtering

Quick Start

# Test a variant
cd evals/framework
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --suite=smoke-test

# View results
open ../results/index.html

System Status

Completed Features:

✅ Prompt variant management (PromptManager)
✅ Evaluation framework integration (--prompt-variant flag)
✅ Results tracking (dual save: main + per-variant)
✅ Dashboard filtering (variant badges and filters)
✅ Test suite validation (JSON Schema + Zod)
✅ CLI validation tool
✅ GitHub Actions workflow
✅ Comprehensive documentation

Tested & Working:

✅ All 5 variants (default, gpt, gemini, grok, llama)
✅ Smoke test suite (1 test)
✅ Core test suite (7 tests)
✅ Grok model integration
✅ Results dashboard
✅ Suite validation

Documentation

See the comprehensive documentation files:

Main Prompts README
- Quick start guide
- Creating variants
- Testing workflow
- Advanced usage
OpenAgent Variants README
- Capabilities matrix
- Variant details
- Test results
- Best practices
Eval Framework Guide
- How tests work
- Running tests
- Understanding results
Test Suite Validation
- Creating test suites
- Validation system
- JSON Schema reference
Validation Quick Reference
- Quick commands
- Common fixes
- Troubleshooting

Architecture

Components

┌─────────────────────────────────────────────────────────────┐
│                    Prompt Library System                     │
├─────────────────────────────────────────────────────────────┤
│                                                               │
│  ┌──────────────┐      ┌──────────────┐      ┌───────────┐ │
│  │   Variants   │─────▶│ Eval Framework│─────▶│ Dashboard │ │
│  │  (.md files) │      │  (Test Runner)│      │ (Results) │ │
│  └──────────────┘      └──────────────┘      └───────────┘ │
│         │                      │                     │       │
│         │                      │                     │       │
│  ┌──────▼──────┐      ┌───────▼────────┐   ┌────────▼────┐ │
│  │   Metadata  │      │  Test Suites   │   │   Results   │ │
│  │(YAML Front) │      │  (JSON files)  │   │(JSON files) │ │
│  └─────────────┘      └────────────────┘   └─────────────┘ │
│                                                               │
└─────────────────────────────────────────────────────────────┘
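The diagram shows each variant as a .md file that carries its metadata in YAML front matter. As a rough illustration of that layout, the sketch below splits such a file into metadata and prompt body. It only handles flat `key: value` lines rather than full YAML, and the function name and the field names `model_family` and `status` are hypothetical, not taken from the framework.

```typescript
// Sketch only: splits a variant .md file into front-matter metadata and prompt body.
// Handles flat `key: value` lines, not full YAML. Field names below are hypothetical.
interface ParsedVariant {
  metadata: Record<string, string>;
  body: string;
}

function parseVariantFile(source: string): ParsedVariant {
  // Front matter is the block between a leading `---` line and the next `---` line.
  const match = /^---\r?\n([\s\S]*?)\r?\n---\r?\n?([\s\S]*)$/.exec(source);
  if (!match) {
    // No front matter found: treat the whole file as the prompt body.
    return { metadata: {}, body: source };
  }
  const metadata: Record<string, string> = {};
  for (const line of match[1].split(/\r?\n/)) {
    const separator = line.indexOf(":");
    // Skip comment lines and lines without a `key: value` shape.
    if (separator === -1 || line.trim().startsWith("#")) continue;
    const key = line.slice(0, separator).trim();
    const value = line
      .slice(separator + 1)
      .trim()
      .replace(/^["']|["']$/g, ""); // drop one surrounding quote on each side
    if (key) metadata[key] = value;
  }
  return { metadata, body: match[2] };
}

// Hypothetical variant file content, for illustration.
const exampleVariant = parseVariantFile(
  "---\nmodel_family: llama\nstatus: stable\n---\nYou are OpenAgent.\n"
);
```

A real implementation would more likely hand the front-matter block to a YAML parser, since nested keys and lists are common in metadata.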

Key Files

Prompt Variants:

.opencode/prompts/core/openagent/*.md - Variant files
.opencode/prompts/core/openagent/results/*.json - Per-variant results
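Since variants are plain .md files in one directory, variant names can plausibly be derived from a directory listing. The sketch below shows one way to do that. Skipping `TEMPLATE.md` and `README.md` is an assumption, and this is not necessarily how `PromptManager.listVariants` works.

```typescript
// Sketch only: derive variant names from the file names in the variants directory.
// Assumption: TEMPLATE.md and README.md live alongside the variants and are not variants.
const RESERVED_FILES = ["TEMPLATE.md", "README.md"];

function variantNamesFrom(fileNames: string[]): string[] {
  return fileNames
    .filter((name) => name.endsWith(".md") && RESERVED_FILES.indexOf(name) === -1)
    .map((name) => name.slice(0, -".md".length)) // strip the extension
    .sort();
}

const listedVariants = variantNamesFrom([
  "llama.md",
  "TEMPLATE.md",
  "README.md",
  "default.md",
  "results", // the per-variant results directory has no .md extension
]);
```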

Test Suites:

evals/agents/openagent/config/*.json - Suite definitions
evals/agents/openagent/config/suite-schema.json - JSON Schema

Framework:

evals/framework/src/sdk/prompt-manager.ts - Prompt switching
evals/framework/src/sdk/suite-validator.ts - Suite validation
evals/framework/src/sdk/run-sdk-tests.ts - Test runner

Results:

evals/results/latest.json - Main results
evals/results/index.html - Dashboard
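Results are saved twice: once to the main results file and once per variant. The sketch below illustrates that dual-save layout using the paths listed above. The per-variant file name `<variant>.json` and the variant-name check are assumptions, because the source only gives the pattern `results/*.json`.

```typescript
// Sketch only: where one run's results would land under the dual-save layout.
const MAIN_RESULTS = "evals/results/latest.json";
const VARIANT_RESULTS_DIR = ".opencode/prompts/core/openagent/results";

function resultTargets(variant: string): string[] {
  // Reject names that could escape the results directory (assumed naming rule).
  if (!/^[a-z0-9][a-z0-9-]*$/.test(variant)) {
    throw new Error(`Unexpected variant name: ${variant}`);
  }
  // Assumption: one JSON file per variant, named after the variant.
  return [MAIN_RESULTS, `${VARIANT_RESULTS_DIR}/${variant}.json`];
}

const llamaTargets = resultTargets("llama");
```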

Usage Examples

Testing Variants

# Quick smoke test (1 test, ~30s)
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --suite=smoke-test

# Core test suite (7 tests, ~5-8min)
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --suite=core-tests

# With specific model
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --model=ollama/llama3.2 --suite=core-tests

# Custom test pattern
npm run eval:sdk -- --agent=openagent --prompt-variant=llama --pattern="01-critical-rules/**/*.yaml"
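The commands above all pass options in `--flag=value` form. As an illustration of how those flags map onto a typed options object, here is a minimal parser sketch. It is not the framework's actual argument handling, and the `EvalOptions` shape and function name are hypothetical. Only the flag names come from the commands above.

```typescript
// Sketch only: not the framework's actual argument parser.
interface EvalOptions {
  agent?: string;
  promptVariant?: string;
  suite?: string;
  model?: string;
  pattern?: string;
}

// Flag names as they appear on the command line, mapped to option keys.
const FLAG_TO_OPTION = new Map<string, keyof EvalOptions>([
  ["agent", "agent"],
  ["prompt-variant", "promptVariant"],
  ["suite", "suite"],
  ["model", "model"],
  ["pattern", "pattern"],
]);

function parseEvalArgs(argv: string[]): EvalOptions {
  const options: EvalOptions = {};
  for (const arg of argv) {
    const match = /^--([a-z-]+)=(.*)$/.exec(arg);
    if (!match) throw new Error(`Expected --flag=value, got: ${arg}`);
    const option = FLAG_TO_OPTION.get(match[1]);
    if (!option) throw new Error(`Unknown flag: --${match[1]}`);
    options[option] = match[2];
  }
  return options;
}

const smokeRun = parseEvalArgs([
  "--agent=openagent",
  "--prompt-variant=llama",
  "--suite=smoke-test",
]);
```

Note that the shell removes the quotes around `--pattern="..."` before the program sees the argument, so the parser receives the bare glob.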

Creating Variants

# 1. Copy template
cp .opencode/prompts/core/openagent/TEMPLATE.md .opencode/prompts/core/openagent/my-variant.md

# 2. Edit metadata and content
# 3. Test (run from evals/framework, as in Quick Start)
cd evals/framework
npm run eval:sdk -- --agent=openagent --prompt-variant=my-variant --suite=smoke-test

# 4. Validate
npm run validate:suites openagent

Creating Test Suites

# 1. Copy existing suite
cp evals/agents/openagent/config/smoke-test.json \
   evals/agents/openagent/config/my-suite.json

# 2. Edit suite
# 3. Validate
cd evals/framework && npm run validate:suites openagent

# 4. Run
npm run eval:sdk -- --agent=openagent --suite=my-suite

API Reference

PromptManager

class PromptManager {
  constructor(projectRoot: string);
  variantExists(agent: string, variant: string): boolean;
  listVariants(agent: string): string[];
  readMetadata(agent: string, variant: string): PromptMetadata;
  switchToVariant(agent: string, variant: string): SwitchResult;
  restoreDefault(agent: string): boolean;
}
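A typical use of this API is to switch to a variant, run something, and restore the default afterwards. The sketch below wraps that in a helper that restores the default even when the run fails. The method names come from the class above. The `withVariant` helper, the `PromptSwitcher` interface, the in-memory fake, and the decision to treat `SwitchResult` as opaque are all assumptions made for illustration.

```typescript
// Sketch only: method names match the PromptManager reference; everything else is hypothetical.
interface PromptSwitcher {
  variantExists(agent: string, variant: string): boolean;
  switchToVariant(agent: string, variant: string): unknown; // SwitchResult fields are not documented here
  restoreDefault(agent: string): boolean;
}

function withVariant<T>(
  manager: PromptSwitcher,
  agent: string,
  variant: string,
  run: () => T
): T {
  if (!manager.variantExists(agent, variant)) {
    throw new Error(`Unknown variant "${variant}" for agent "${agent}"`);
  }
  manager.switchToVariant(agent, variant);
  try {
    return run();
  } finally {
    // Always put the default prompt back, even if the run throws.
    manager.restoreDefault(agent);
  }
}

// In-memory stand-in so the sketch runs without touching the file system.
class FakePromptManager implements PromptSwitcher {
  active = "default";
  private variants = ["default", "gpt", "gemini", "grok", "llama"];

  variantExists(_agent: string, variant: string): boolean {
    return this.variants.indexOf(variant) !== -1;
  }
  switchToVariant(_agent: string, variant: string): unknown {
    this.active = variant;
    return undefined;
  }
  restoreDefault(_agent: string): boolean {
    this.active = "default";
    return true;
  }
}

const fakeManager = new FakePromptManager();
const activeDuringRun = withVariant(fakeManager, "openagent", "llama", () => fakeManager.active);
```

The helper is synchronous to keep the sketch short. An asynchronous test run would need an `async` version that awaits the run inside the `try` block so that the restore happens after the run finishes.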

SuiteValidator

class SuiteValidator {
  constructor(agentsDir: string);
  loadSuite(agent: string, suiteName: string): TestSuite;
  validateSuite(agent: string, suite: TestSuite): ValidationResult;
  getTestPaths(agent: string, suite: TestSuite): string[];
}
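The three methods suggest a load, validate, resolve flow. The sketch below strings them together and refuses to return test paths for an invalid suite. The method names come from the class above, but the fields of `TestSuite` and `ValidationResult`, the in-memory fake, and the `tests/` directory in the resolved paths are hypothetical, because the source does not show them.

```typescript
// Sketch only: the type fields below are hypothetical; the source names the types but not their shape.
interface TestSuite {
  name: string;
  tests: string[];
}
interface ValidationResult {
  valid: boolean;
  errors: string[];
}
interface SuiteValidatorLike {
  loadSuite(agent: string, suiteName: string): TestSuite;
  validateSuite(agent: string, suite: TestSuite): ValidationResult;
  getTestPaths(agent: string, suite: TestSuite): string[];
}

// Load a suite, fail loudly if it is invalid, otherwise return its test paths.
function resolveSuite(validator: SuiteValidatorLike, agent: string, suiteName: string): string[] {
  const suite = validator.loadSuite(agent, suiteName);
  const result = validator.validateSuite(agent, suite);
  if (!result.valid) {
    throw new Error(`Suite "${suiteName}" is invalid: ${result.errors.join("; ")}`);
  }
  return validator.getTestPaths(agent, suite);
}

// In-memory stand-in so the sketch runs without reading suite files.
class InMemorySuiteValidator implements SuiteValidatorLike {
  private suites: Record<string, TestSuite>;
  constructor(suites: Record<string, TestSuite>) {
    this.suites = suites;
  }
  loadSuite(_agent: string, suiteName: string): TestSuite {
    const suite = this.suites[suiteName];
    if (!suite) throw new Error(`Suite not found: ${suiteName}`);
    return suite;
  }
  validateSuite(_agent: string, suite: TestSuite): ValidationResult {
    const errors: string[] = [];
    if (suite.tests.length === 0) errors.push("suite lists no tests");
    return { valid: errors.length === 0, errors };
  }
  getTestPaths(agent: string, suite: TestSuite): string[] {
    // Hypothetical layout: test files under evals/agents/<agent>/tests/.
    return suite.tests.map((test) => `evals/agents/${agent}/tests/${test}`);
  }
}

const demoValidator = new InMemorySuiteValidator({
  "smoke-test": { name: "smoke-test", tests: ["01-critical-rules/example.yaml"] },
  "empty-suite": { name: "empty-suite", tests: [] },
});
const smokePaths = resolveSuite(demoValidator, "openagent", "smoke-test");
```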

Test Results

All five variants were run against the core test suite (7 tests), each on the same model:

Variant	Pass Rate	Model Tested	Status
default	7/7 (100%)	opencode/grok-code-fast	✅ Stable
gpt	7/7 (100%)	opencode/grok-code-fast	✅ Stable
gemini	7/7 (100%)	opencode/grok-code-fast	✅ Stable
grok	7/7 (100%)	opencode/grok-code-fast	✅ Stable
llama	7/7 (100%)	opencode/grok-code-fast	✅ Stable
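The Pass Rate column combines a passed/total count with a rounded percentage. A minimal sketch of that formatting, assuming the percentage is rounded to the nearest whole number (the function name is hypothetical):

```typescript
// Sketch only: formats a pass count in the same style as the table above.
function formatPassRate(passed: number, total: number): string {
  const countsAreValid =
    Number.isInteger(passed) &&
    Number.isInteger(total) &&
    total > 0 &&
    passed >= 0 &&
    passed <= total;
  if (!countsAreValid) {
    throw new RangeError(`Invalid counts: ${passed}/${total}`);
  }
  // Assumption: percentages are rounded to the nearest integer.
  const percent = Math.round((passed / total) * 100);
  return `${passed}/${total} (${percent}%)`;
}

const corePassRate = formatPassRate(7, 7); // "7/7 (100%)"
```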

Future Enhancements

Automated variant comparison reports
Performance benchmarking across variants
Variant recommendation based on model
Historical trend analysis
A/B testing framework
Automated regression detection
