# 🚀 OpenCode Agents - Quick Start

![Version](https://img.shields.io/badge/version-0.1.0--alpha.1-blue)

## 📋 Available Agents

- **openagent** - Full-featured development agent (22+ tests)
  - Developer tests: Code, docs, tests, delegation
  - Context loading tests: Standards, patterns, workflows
  - Business tests: Conversations, data analysis
  - Edge cases: Approval gates, negative tests

- **opencoder** - Specialized coding agent (4+ tests)
  - Developer tests: Bash execution, file operations
  - Multi-tool workflows

---

## 🧪 Running Tests

### Test All Agents

```bash
npm test                          # All agents, all tests (default)
npm run test:all                  # Explicit all agents
```

### Test Specific Agent

```bash
npm run test:openagent            # OpenAgent only
npm run test:opencoder            # OpenCoder only
```

### Test with Different Models

#### OpenAgent

```bash
npm run test:openagent:grok       # Grok (free tier, fast)
npm run test:openagent:claude     # Claude Sonnet 4.5 (best quality)
npm run test:openagent:gpt4       # GPT-4 Turbo (OpenAI)
```

#### OpenCoder

```bash
npm run test:opencoder:grok       # Grok (free tier, fast)
npm run test:opencoder:claude     # Claude Sonnet 4.5 (best quality)
npm run test:opencoder:gpt4       # GPT-4 Turbo (OpenAI)
```

#### All Agents

```bash
npm run test:all:grok             # All agents with Grok
npm run test:all:claude           # All agents with Claude
npm run test:all:gpt4             # All agents with GPT-4
```

---

## 🎯 Test Specific Categories

### OpenAgent Categories

```bash
npm run test:openagent:developer  # Developer tests (code, docs, tests)
npm run test:openagent:context    # Context loading tests
npm run test:openagent:business   # Business/conversation tests
```

### OpenCoder Categories

```bash
npm run test:opencoder:developer  # Developer tests
npm run test:opencoder:bash       # Bash execution tests
```

### Custom Patterns

```bash
npm run test:pattern -- "developer/*.yaml"          # All developer tests
npm run test:pattern -- "context-loading/*.yaml"    # Context tests
npm run test:pattern -- "edge-case/*.yaml"          # Edge cases
npm run test:openagent -- --pattern="developer/ctx-*"  # OpenAgent context tests
```

---

## 📊 View Results

### Dashboard (Recommended)

```bash
npm run dashboard                 # Launch interactive dashboard
npm run dashboard:open            # Launch and auto-open browser
```

The dashboard provides:

- ✅ Real-time test results visualization
- ✅ Filter by agent, category, status
- ✅ Detailed violation tracking
- ✅ CSV export functionality
- ✅ Historical results tracking

### Command Line

```bash
npm run results:openagent         # Recent OpenAgent results
npm run results:opencoder         # Recent OpenCoder results
npm run results:latest            # Latest test summary (JSON)
```

---

## 🐛 Debug Mode

```bash
npm run test:debug                # Run with debug output
npm run test:openagent -- --debug # Debug OpenAgent tests
npm run test:opencoder -- --debug # Debug OpenCoder tests
```

Debug mode shows:

- Detailed event logging
- Tool call details
- Session information
- Evaluation progress

---

## 🔧 Development

```bash
npm run dev:setup                 # Install dependencies
npm run dev:build                 # Build framework
npm run dev:test                  # Run unit tests
npm run dev:clean                 # Clean and reinstall
```

---

## 📈 Version Management

```bash
npm run version                   # Show current version
npm run version:bump alpha        # Bump alpha version
npm run version:bump beta         # Bump to beta
npm run version:bump rc           # Bump to release candidate
```

---

## 📁 Test Structure

```
evals/agents/
├── openagent/tests/
│   ├── developer/            # Code, docs, tests (12 tests)
│   │   ├── ctx-code-001.yaml
│   │   ├── ctx-docs-001.yaml
│   │   ├── ctx-tests-001.yaml
│   │   ├── ctx-delegation-001.yaml
│   │   └── ...
│   ├── context-loading/      # Context loading (5 tests)
│   │   ├── ctx-simple-coding-standards.yaml
│   │   ├── ctx-simple-documentation-format.yaml
│   │   └── ...
│   ├── business/             # Conversations (2 tests)
│   │   ├── conv-simple-001.yaml
│   │   └── data-analysis.yaml
│   └── edge-case/            # Edge cases (3 tests)
│       ├── just-do-it.yaml
│       ├── missing-approval-negative.yaml
│       └── no-approval-negative.yaml
│
└── opencoder/tests/
    └── developer/            # Bash, file ops (4 tests)
        ├── bash-execution-001.yaml
        ├── file-read-001.yaml
        ├── multi-tool-001.yaml
        └── simple-bash-test.yaml
```

---

## 💡 Common Workflows

### Quick Test (Free Tier)

```bash
npm run test:openagent:grok       # Fast, free
npm run test:opencoder:grok       # Fast, free
```

### Quality Test (Best Model)

```bash
npm run test:openagent:claude     # Best quality
npm run test:opencoder:claude     # Best quality
```

### Full Test Suite

```bash
npm run test:all:claude           # All agents, best model
```

### Continuous Development

```bash
# 1. Run tests in debug mode
npm run test:openagent:developer -- --debug

# 2. View results in dashboard
npm run dashboard:open

# 3. Iterate on agent prompts
# Edit .opencode/agent/core/openagent.md

# 4. Re-run tests
npm run test:openagent:developer
```

### CI/CD Smoke Tests

```bash
npm run test:ci                   # Fast smoke tests for both agents
npm run test:ci:openagent         # OpenAgent smoke test
npm run test:ci:opencoder         # OpenCoder smoke test
```

---

## 🎯 Test Results

After running tests, results are saved to:

- `evals/results/latest.json` - Latest test run
- `evals/results/history/YYYY-MM/DD-HHMMSS-{agent}.json` - Historical results

View in dashboard: `npm run dashboard:open`

---

## 🔍 Understanding Test Results

### Test Status

- ✅ **PASSED** - All checks passed, no violations
- ❌ **FAILED** - Test failed (execution error or violations)

### Evaluators

Tests are evaluated by multiple evaluators:

- **approval-gate** - Checks if agent requested approval when required
- **context-loading** - Validates context files were loaded before execution
- **delegation** - Checks if agent delegated to subagents appropriately
- **tool-usage** - Validates correct tool usage
- **behavior** - Checks if agent performed expected actions

### Violations

- **Error** - Critical issues that cause test failure
- **Warning** - Non-critical issues
- **Info** - Informational messages

---

## 📚 Additional Resources

- [README.md](README.md) - Project overview
- [CHANGELOG.md](CHANGELOG.md) - Version history

---

## 🆘 Troubleshooting

### Tests not running?

```bash
# Ensure dependencies are installed
npm run dev:setup

# Build the framework
npm run dev:build
```

### Dashboard not loading?

```bash
# Check if results exist
ls -la evals/results/

# Try launching manually
cd evals/results && ./serve.sh
```

### Version mismatch?

```bash
# Check current version
npm run version

# Sync VERSION file with package.json
npm run version > VERSION
```

---

## 🎉 Getting Help

- Review test examples in `evals/agents/*/tests/`
- Run tests in debug mode: `npm run test:debug`
- View results dashboard: `npm run dashboard:open`

---

**Current Version:** 0.1.0-alpha.1
**Last Updated:** 2025-11-26
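
The history path pattern listed under Test Results (`evals/results/history/YYYY-MM/DD-HHMMSS-{agent}.json`) sorts chronologically by name, so the newest run for one agent can be located from the command line. The following is a minimal sketch, assuming that layout holds and that the date and time parts are zero-padded; `latest_history_file` is a hypothetical helper, not a script shipped with the framework.

```bash
# Hypothetical helper (not part of the framework): print the newest history
# file for one agent, assuming the documented layout
#   <results_dir>/history/YYYY-MM/DD-HHMMSS-{agent}.json
latest_history_file() {
  local agent="$1"
  local results_dir="${2:-evals/results}"   # default matches the path above
  local newest=""
  local file

  # Glob results come back sorted, and zero-padded YYYY-MM/DD-HHMMSS names
  # sort in chronological order, so the last existing match is the newest.
  for file in "$results_dir"/history/*/*-"$agent".json; do
    [ -e "$file" ] && newest="$file"
  done

  # Print the path, or return a non-zero status when nothing matched.
  [ -n "$newest" ] && printf '%s\n' "$newest"
}
```

For example, `latest_history_file openagent` prints the newest OpenAgent history file under `evals/results`, and returns a non-zero status when no history exists for that agent.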