Darren Hinde 4103805270 Add build validation system and OpenAgent evaluation framework (#26) 4 months ago
tests 2bc97007c1 refactor: simplify agent execution philosophy and reorganize context structure 4 months ago
README.md 4103805270 Add build validation system and OpenAgent evaluation framework (#26) 4 months ago
auto-detect-components.sh 79110ed3fb Add Production-Ready Eval Framework for OpenAgent (#25) 4 months ago
bump-version.sh aebd68e046 feat: add monorepo structure with versioning and CI/CD 4 months ago
cleanup-stale-sessions.sh 5f2cec1404 ✨ feat: introduce OpenAgent universal agent with session management 4 months ago
dashboard.sh aebd68e046 feat: add monorepo structure with versioning and CI/CD 4 months ago
register-component.sh 3ab70d24a9 fix: simplify registration script to preserve manual registry 4 months ago
test.sh 4103805270 Add build validation system and OpenAgent evaluation framework (#26) 4 months ago
uninstall.sh bc8a322550 feat: enhance installation system with universal path transformation 4 months ago
validate-component.sh 777b3fc702 feat: add interactive installer with component registry system 4 months ago
validate-context-refs.sh bc8a322550 feat: enhance installation system with universal path transformation 4 months ago
validate-registry.sh 79110ed3fb Add Production-Ready Eval Framework for OpenAgent (#25) 4 months ago

Scripts

This directory contains utility scripts for the OpenAgents system.

Available Scripts

Testing

  • test.sh - Main test runner with multi-agent support
    • Run all tests: ./scripts/test.sh openagent
    • Run core tests: ./scripts/test.sh openagent --core (7 tests, ~5-8 min)
    • Run with specific model: ./scripts/test.sh openagent opencode/grok-code-fast
    • Debug mode: ./scripts/test.sh openagent --core --debug

See the tests/ subdirectory for installer test scripts.

Component Management

  • register-component.sh - Register a new component in the registry
  • validate-component.sh - Validate component structure and metadata

Session Management

  • cleanup-stale-sessions.sh - Remove stale agent sessions older than 24 hours

Session Cleanup

Agent instances create temporary context files in .tmp/sessions/{session-id}/ for subagent delegation. These sessions are cleaned up automatically, but you can also remove stale ones manually:

# Clean up sessions older than 24 hours
./scripts/cleanup-stale-sessions.sh

# Or manually delete all sessions
rm -rf .tmp/sessions/

Sessions are safe to delete at any time: they contain only temporary context files for agent coordination.
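
For readers curious how age-based cleanup of this kind can work, here is a minimal sketch. It is not the contents of cleanup-stale-sessions.sh. The function names list_stale_sessions and remove_stale_sessions are hypothetical, and the sketch assumes each session is a directory directly under the sessions root and that "stale" means the directory's modification time is older than the cutoff.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, NOT the actual cleanup-stale-sessions.sh.
# Assumption: each session is a directory directly under the sessions root,
# and staleness is judged by the directory's modification time.

# list_stale_sessions ROOT [MAX_AGE_MINUTES]
# Prints session directories last modified more than MAX_AGE_MINUTES ago.
list_stale_sessions() {
  local root="$1"
  local max_age_min="${2:-1440}"   # 24 hours = 1440 minutes
  [ -d "$root" ] || return 0       # nothing to do if the root is missing
  find "$root" -mindepth 1 -maxdepth 1 -type d -mmin "+${max_age_min}"
}

# remove_stale_sessions ROOT [MAX_AGE_MINUTES]
# Deletes every directory reported by list_stale_sessions.
remove_stale_sessions() {
  local root="$1"
  local max_age_min="${2:-1440}"
  list_stale_sessions "$root" "$max_age_min" | while IFS= read -r dir; do
    rm -rf -- "$dir"
    echo "removed: $dir"
  done
}
```

With these functions loaded, remove_stale_sessions .tmp/sessions would apply the 24-hour default, and list_stale_sessions .tmp/sessions previews what would be removed.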

Usage Examples

Run Tests

# Run core test suite (fast, 7 tests, ~5-8 min)
./scripts/test.sh openagent --core

# Run all tests for OpenAgent
./scripts/test.sh openagent

# Run tests with specific model
./scripts/test.sh openagent anthropic/claude-sonnet-4-5

# Run core tests with debug mode
./scripts/test.sh openagent --core --debug
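
To compare several models, the documented test.sh openagent <model> form can be wrapped in a loop. The sketch below is illustrative only: run_suite_for_models is a hypothetical helper, and it assumes test.sh exits non-zero when a run fails, which this README does not state. Each call runs the full suite, so expect the full-suite duration per model.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the scripts directory.
# Assumption: the test runner exits non-zero when a run fails.

# run_suite_for_models RUNNER MODEL...
# Runs "RUNNER openagent MODEL" for each model and reports which ones failed.
run_suite_for_models() {
  local runner="$1"
  shift
  local failed=0 model
  for model in "$@"; do
    echo "== $model =="
    if ! "$runner" openagent "$model"; then
      echo "FAILED: $model" >&2
      failed=$((failed + 1))
    fi
  done
  echo "models failed: $failed"
  [ "$failed" -eq 0 ]
}
```

A possible invocation, using the two model identifiers that appear in this README: run_suite_for_models ./scripts/test.sh opencode/grok-code-fast anthropic/claude-sonnet-4-5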

Register a Component

./scripts/register-component.sh path/to/component

Validate a Component

./scripts/validate-component.sh path/to/component

Clean Stale Sessions

./scripts/cleanup-stale-sessions.sh

Core Test Suite

The core test suite is a subset of 7 selected tests that covers roughly 85% of critical OpenAgent functionality and runs in 5-8 minutes.

Why Use Core Tests?

  • Fast feedback - 5-8 minutes vs 40-80 minutes for the full suite
  • Prompt iteration - Quick validation when updating agent prompts
  • Development - Fast validation during development cycles
  • Pre-commit - Quick checks before committing changes
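
The pre-commit use case above could be wired up as a Git hook. What follows is a rough sketch, not something shipped in this directory: run_core_gate is a hypothetical function name, and the sketch assumes test.sh exits non-zero on test failure, which this README does not confirm.

```shell
#!/usr/bin/env bash
# Hypothetical pre-commit sketch (would live in .git/hooks/pre-commit).
# Assumption: the test runner exits non-zero when any core test fails.

# run_core_gate [RUNNER] [AGENT]
# Runs the core suite and returns 1 if it fails, so the caller can abort the commit.
run_core_gate() {
  local runner="${1:-./scripts/test.sh}"
  local agent="${2:-openagent}"
  if [ ! -x "$runner" ]; then
    # Do not block commits when the runner is unavailable.
    echo "pre-commit: $runner not found or not executable; skipping core tests" >&2
    return 0
  fi
  if "$runner" "$agent" --core; then
    echo "pre-commit: core tests passed"
  else
    echo "pre-commit: core tests failed; commit aborted" >&2
    return 1
  fi
}
```

In an actual hook file the last line would be run_core_gate || exit 1. Skipping when the runner is missing is a design choice for this sketch; a stricter hook could fail instead.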

What's Covered?

  1. Approval Gate - Critical safety rule
  2. Context Loading (Simple) - Most common use case
  3. Context Loading (Multi-Turn) - Complex scenarios
  4. Stop on Failure - Error handling
  5. Simple Task - No unnecessary delegation
  6. Subagent Delegation - Proper delegation when needed
  7. Tool Usage - Best practices

When to Use the Full Suite?

Use the full test suite (71 tests) for:

  • 🔬 Release validation
  • 🔬 Comprehensive testing
  • 🔬 Edge case coverage
  • 🔬 Regression testing

See evals/agents/openagent/CORE_TESTS.md for detailed documentation.