Interactive dashboard for visualizing OpenCode agent test results.
# Run tests
cd evals/framework && npm run eval:sdk -- --agent=opencoder
# View dashboard (auto-opens browser, auto-shuts down)
cd evals/results && ./serve.sh
That's it! 🎉
Run Tests:
cd evals/framework
npm run eval:sdk -- --agent=opencoder
npm run eval:sdk -- --agent=openagent
View Dashboard:
Option A: One-Command Solution (Easiest) ⭐
cd evals/results
./serve.sh
Custom timeout:
./serve.sh 8000 30 # Port 8000, 30 second timeout
Option B: Keep Server Running
cd evals/results
python3 -m http.server 8000
Press Ctrl+C to stop manually
Option C: Direct File Access
open evals/results/index.html
⚠️ Note: Some browsers block loading JSON from local files. If you see an error, use Option A or B.
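Options A and B both come down to serving the results directory over HTTP so the dashboard can fetch latest.json. The sketch below shows that idea in Python with an automatic shutdown after a timeout. It is an illustration only, not how serve.sh is implemented; `serve_for` and its parameters are hypothetical names, and the defaults simply mirror the port and timeout used in the examples above.

```python
import functools
import http.server
import threading


def serve_for(directory: str, port: int = 8000, timeout: float = 30.0):
    """Serve `directory` on localhost and stop serving after `timeout` seconds.

    Hypothetical helper: a sketch of "HTTP server with auto-shutdown",
    not the actual serve.sh logic.
    """
    # Bind the request handler to the directory that holds index.html and latest.json.
    handler = functools.partial(
        http.server.SimpleHTTPRequestHandler, directory=directory
    )
    server = http.server.ThreadingHTTPServer(("127.0.0.1", port), handler)

    # Handle requests in a background thread so the caller is not blocked.
    threading.Thread(target=server.serve_forever, daemon=True).start()

    # Stop the serve_forever loop once the timeout elapses.
    timer = threading.Timer(timeout, server.shutdown)
    timer.daemon = True
    timer.start()
    return server
```

A caller would run something like `serve_for("evals/results", 8000, 30)` and then keep the process alive (for example with `time.sleep`) for as long as the dashboard should stay reachable. Both threads are daemons, so they end when the main program exits.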
results/
├── index.html # Dashboard (open this)
├── serve.sh # Helper script to start HTTP server
├── latest.json # Most recent test run
├── history/
│   └── 2025-11/
│       ├── 26-115759-opencoder.json
│       └── 26-115850-openagent.json
├── .gitignore # Retention policy
└── README.md # This file
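The history file names above appear to encode the run's day and time followed by the agent name, inside a year-month directory. Assuming that pattern (`history/YYYY-MM/DD-HHMMSS-agent.json`, inferred only from the two examples shown), a small Python sketch can recover the timestamp and agent from a path. `parse_history_path` is a hypothetical helper, not part of the eval framework.

```python
from datetime import datetime
from pathlib import PurePosixPath


def parse_history_path(path: str) -> tuple[datetime, str]:
    """Split a history path into (run time, agent name).

    Assumes the layout history/YYYY-MM/DD-HHMMSS-agent.json, which is
    inferred from the example tree rather than from a documented spec.
    """
    p = PurePosixPath(path)
    # "26-115759-opencoder" -> day "26", clock "115759", agent "opencoder"
    day, clock, agent = p.stem.split("-", 2)
    # The parent directory name supplies the year and month, e.g. "2025-11".
    started = datetime.strptime(f"{p.parent.name}-{day} {clock}", "%Y-%m-%d %H%M%S")
    return started, agent
```

The returned datetime is naive; the file names do not say whether the time is local or UTC, so the sketch makes no claim about the time zone.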
Each result file contains:
{
  "meta": {
    "timestamp": "2025-11-26T11:59:36.365Z",
    "agent": "openagent",
    "model": "opencode/grok-code-fast",
    "framework_version": "0.1.0",
    "git_commit": "f872007"
  },
  "summary": {
    "total": 8,
    "passed": 6,
    "failed": 2,
    "duration_ms": 32450,
    "pass_rate": 0.75
  },
  "by_category": {
    "developer": { "passed": 5, "total": 6 },
    "business": { "passed": 1, "total": 1 },
    "edge-case": { "passed": 0, "total": 1 }
  },
  "tests": [
    {
      "id": "task-simple-001",
      "category": "developer",
      "passed": true,
      "duration_ms": 4200,
      "events": 23,
      "approvals": 2,
      "violations": {
        "total": 0,
        "errors": 0,
        "warnings": 0
      }
    }
  ]
}
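To work with a result file outside the dashboard, the structure above can be read with a few lines of Python. The sketch below checks that the summary numbers agree with each other and computes a pass rate per category. The consistency rules (passed + failed = total, pass_rate = passed / total, categories adding up to the summary) are assumptions read off the sample, not guarantees documented by the framework; `check_summary`, `category_pass_rates` and `SAMPLE` are hypothetical names.

```python
# Trimmed copy of the sample result shown above.
SAMPLE = {
    "summary": {"total": 8, "passed": 6, "failed": 2, "pass_rate": 0.75},
    "by_category": {
        "developer": {"passed": 5, "total": 6},
        "business": {"passed": 1, "total": 1},
        "edge-case": {"passed": 0, "total": 1},
    },
}


def check_summary(result: dict, tol: float = 1e-6) -> list[str]:
    """Return a list of inconsistencies found in one result's summary."""
    problems = []
    s = result["summary"]
    if s["passed"] + s["failed"] != s["total"]:
        problems.append("passed + failed != total")
    if s["total"] and abs(s["pass_rate"] - s["passed"] / s["total"]) > tol:
        problems.append("pass_rate != passed / total")
    # Assumption: every test belongs to exactly one category.
    cat_passed = sum(c["passed"] for c in result["by_category"].values())
    cat_total = sum(c["total"] for c in result["by_category"].values())
    if (cat_passed, cat_total) != (s["passed"], s["total"]):
        problems.append("by_category does not add up to summary")
    return problems


def category_pass_rates(result: dict) -> dict[str, float]:
    """Map each category name to passed / total (0.0 for empty categories)."""
    return {
        name: (c["passed"] / c["total"] if c["total"] else 0.0)
        for name, c in result["by_category"].items()
    }
```

For a real run, load the file first, for example `json.loads(Path("latest.json").read_text())`, and pass the resulting dict to either function.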
Results are automatically managed:
This keeps the repo size manageable while preserving recent history.
The fastest way to view results:
cd evals/results && ./serve.sh
Want to keep exploring? Press Ctrl+C during the countdown to keep the server running.
The serve.sh script:
Why does it still work after shutdown?
If you start the server manually:
# Find the process
lsof -ti:8000
# Kill it
kill $(lsof -ti:8000)
Or just press Ctrl+C in the terminal.
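Where lsof is not available, a quick way to see whether something is still listening on the port is to try connecting to it. The following is a minimal Python sketch of that check; `port_in_use` is a hypothetical helper, and it only reports that some process accepts connections on the port, not which one.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(0.5)
        # connect_ex returns 0 on success and an error code otherwise.
        return sock.connect_ex((host, port)) == 0
```

Calling `port_in_use(8000)` before starting a server shows whether a previous one is still running.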
npm run eval:sdk
latest.json exists
Potential improvements:
To improve the dashboard:
index.html (all code is in one file)
MIT - Same as OpenCode Agents project