# Abilities Plugin: Architecture & System Overview ## Executive Summary The **Abilities Plugin** is an OpenCode plugin that enforces deterministic, step-by-step workflow execution for AI agents. It solves the core problem with traditional skills: **LLMs ignore them**. By using enforcement hooks and structured workflows, Abilities guarantee that agents follow prescribed steps in order, without deviation. ### Core Problem It Solves ``` Traditional Skills Abilities ───────────────────────────────── ────────────────────────── Agent sees skill definition → Ability enforces execution Agent can ignore it → Agent MUST follow steps Execution is non-deterministic → Execution is deterministic No validation between steps → Each step validated Multi-agent coordination fails → Agent-specific abilities work ``` --- ## System Architecture Overview ### High-Level Flow ``` User Request ↓ ┌─────────────────────────────────┐ │ OpenCode Chat Message Received │ ↓ ┌──────────────────────────────────────┐ │ AbilitiesPlugin Hooks (opencode-plugin.ts) │ ├─ chat.message: Detect ability trigger │ ├─ tool.execute.before: Block unauthorized tools │ └─ event: Manage execution state ↓ ┌──────────────────────────────────────┐ │ If ability triggered: │ ├─ Load ability definition (loader/) │ ├─ Validate inputs (validator/) │ └─ Execute steps (executor/) ↓ ┌──────────────────────────────────────┐ │ Step Execution (ExecutionManager) │ ├─ Sequential execution with dependencies │ ├─ Output context passing │ ├─ Step-level validation │ └─ Error handling & recovery ↓ ┌──────────────────────────────────────┐ │ Enforcement Applied │ ├─ Tool blocking (only allowed tools) │ ├─ Context injection (ability status) │ └─ Step-by-step control ↓ Agent sees ability context and executes as instructed (not as LLM desires) ``` --- ## Module Architecture ``` src/ ├── opencode-plugin.ts [ENTRY POINT] Main plugin implementation │ ├─ Hooks: event, chat.message, tool.execute.before │ ├─ Tools: ability.list, ability.run, ability.status, ability.cancel │ └─ Enforcement logic & context injection │ ├── loader/ │ └─ index.ts [DISCOVERY] Load ability YAML files │ ├─ loadAbilities(): Discover & parse all abilities │ ├─ loadAbility(): Get specific ability │ └─ listAbilities(): Format for display │ ├── validator/ │ └─ index.ts [VALIDATION] Ensure abilities are valid │ ├─ validateAbility(): Check structure, dependencies, step types │ ├─ validateInputs(): Type-check user inputs against schema │ └─ validateSteps(): Ensure no circular dependencies │ ├── executor/ │ ├─ index.ts [EXECUTION] Run ability steps │ │ ├─ executeAbility(): Main orchestrator │ │ ├─ executeScriptStep(): Run shell commands │ │ ├─ executeAgentStep(): Call agents │ │ ├─ executeSkillStep(): Load skills │ │ ├─ executeApprovalStep(): Request approval │ │ └─ executeWorkflowStep(): Run nested abilities │ │ │ └─ execution-manager.ts [STATE] Track active executions │ ├─ ExecutionManager: Lifecycle management │ ├─ execute(): Start new execution │ ├─ getActive(): Current execution status │ ├─ cancelActive(): Stop active ability │ └─ cleanup(): GC & resource management │ ├── types/ │ └─ index.ts [TYPES] TypeScript definitions │ ├─ Ability, Step types │ ├─ Execution state types │ └─ Input/output schemas │ └── index.ts [EXPORTS] Public API ``` --- ## Module Responsibilities ### 1. **opencode-plugin.ts** - Main Plugin & Enforcement **Responsibility**: Interface between OpenCode and the abilities system **Key Functions**: - `AbilitiesPlugin` - Main async factory function that returns hooks - `matchesTrigger()` - Detect if user text matches ability keywords/patterns - `detectAbility()` - Find matching ability from user message - `showToast()` - Display UI notifications - `createExecutorContext()` - Build execution environment - `buildAbilityContextInjection()` - Format ability status for agent **Hooks Implemented**: ```typescript { event() // Handle session lifecycle (create/delete) config() // Load plugin configuration 'chat.message'() // Intercept messages, detect abilities, inject context 'tool.execute.before()' // Block unauthorized tools during steps tool: { // Register custom tools 'ability.list', 'ability.run', 'ability.status', 'ability.cancel' } } ``` **Enforcement Strategy**: - **Message Interception**: When user types, check if it matches ability triggers - **Tool Blocking**: During ability execution, only allow tools for current step type - **Context Injection**: Add ability progress/instructions to every message - **State Tracking**: ExecutionManager tracks active executions per session --- ### 2. **loader/index.ts** - Ability Discovery **Responsibility**: Find and parse YAML ability definitions from filesystem **Key Functions**: ```typescript loadAbilities(options) // Discover all abilities in directories └─ discoverAbilities() // Glob for *.yaml files └─ loadAbilityFile() // Parse YAML → Ability object loadAbility(name) // Get specific ability by name listAbilities(map) // Format abilities for display ``` **Globbing Strategy** (Limited scope): ```typescript const ABILITY_PATTERNS = [ '*.yaml', // Single-level files '*/ability.yaml', // Dir with ability.yaml '*/*.yaml', // Dir with YAML files '*/*/ability.yaml' // Two-level nesting (max) ] ``` **Why Limited Patterns?** - Prevents scanning entire project (performance) - Stops accidental loading of unrelated YAML files - Encourages organized directory structure **Output**: ```typescript Map { 'deploy': { ability, filePath, source } 'deploy/staging': { ability, filePath, source } 'test-suite': { ability, filePath, source } } ``` --- ### 3. **validator/index.ts** - Structure & Input Validation **Responsibility**: Ensure abilities are well-formed and inputs are valid **Key Functions**: ```typescript validateAbility(ability) // Check structure └─ Validates: ├─ name field exists ├─ steps array non-empty ├─ no duplicate step IDs ├─ no circular dependencies ├─ all dependencies exist ├─ step types valid └─ nested abilities exist validateInputs(ability, inputs) // Type-check user inputs └─ For each input definition: ├─ required field check ├─ type validation (string/number/object) ├─ pattern regex validation ├─ enum value check ├─ min/max range check └─ default value handling ``` **Validation Output**: ```typescript { valid: boolean errors: Array<{ path: string // e.g., "inputs.version" message: string // "Must match pattern: ^v\d+\.\d+\.\d+$" }> } ``` --- ### 4. **executor/index.ts** - Step Execution Engine **Responsibility**: Execute ability steps sequentially with dependency management **Key Functions**: ```typescript executeAbility(ability, inputs, ctx, options) └─ buildExecutionOrder(steps) // Resolve dependencies └─ executeStep(step, execution, ctx) ├─ executeScriptStep() // Run: sh -c "command" ├─ executeAgentStep() // Call agent with context ├─ executeSkillStep() // Load skill ├─ executeApprovalStep() // Request user approval └─ executeWorkflowStep() // Run nested ability formatExecutionResult(execution) // Pretty-print results ``` **Step Types & Their Allowed Tools**: ```typescript ALLOWED_TOOLS_BY_STEP_TYPE = { script: [], // No tools (runs deterministically) agent: ['task', 'background_task'], // Only agent-calling tools skill: ['skill', 'slashcommand'], // Skill-related tools approval: ['ability.status'], // Read-only status tools workflow: ['ability.run', 'ability.status'] // Run nested ability } ``` **Variable Interpolation**: ```yaml steps: - run: "deploy {{inputs.version}} to {{inputs.env}}" - run: "echo {{steps.test.output}}" # From previous step output ``` **Dependency Resolution**: ```yaml steps: - id: test type: script run: npm test - id: build needs: [test] # Runs after test completes run: npm run build - id: deploy needs: [build] # Runs after build completes run: ./deploy.sh ``` --- ### 5. **executor/execution-manager.ts** - Lifecycle Management **Responsibility**: Track active executions, manage state, handle cleanup **Key Responsibilities**: ```typescript class ExecutionManager { // Lifecycle execute(ability, inputs, ctx) // Start new execution getActive() // Get current execution cancelActive() // Stop active ability // State Management updateStep(executionId, result) // Mark step complete cancel(executionId) // Cancel by ID get(id) // Retrieve execution history list() // All executions (for debugging) // Resource Management cleanup() // Clean up timers & state cleanupOldExecutions() // GC old executions (30 min TTL) trimToMaxSize() // Keep last 50 executions max } ``` **Cleanup Strategy**: ```typescript const EXECUTION_TTL = 30 * 60 * 1000 // Delete after 30 minutes const CLEANUP_INTERVAL = 5 * 60 * 1000 // Check every 5 minutes const MAX_STORED_EXECUTIONS = 50 // Keep last 50 ``` **Why This Matters**: - Lazy timer initialization (doesn't create timers until first execution) - Automatic GC prevents memory leaks from long-running sessions - State persists across messages in same session - Timer uses `unref()` so it doesn't prevent process exit --- ### 6. **types/index.ts** - Type Definitions **Responsibility**: Provide TypeScript types for all data structures **Key Types**: ```typescript // Ability Definition interface Ability { name: string description: string triggers?: { keywords?: string[] patterns?: string[] // Regex patterns } inputs?: Record steps: Step[] settings?: { enforcement?: 'strict' | 'normal' | 'loose' on_failure?: 'stop' | 'retry' | 'continue' } exclusive_agent?: string // Only this agent can run compatible_agents?: string[] // Whitelist of agents } // Step Types type Step = | ScriptStep | AgentStep | SkillStep | ApprovalStep | WorkflowStep interface ScriptStep { id: string type: 'script' description?: string run: string // Shell command cwd?: string // Working directory env?: Record // Environment variables timeout?: string // '5m', '30s' validation?: { exit_code?: number stdout_contains?: string stderr_contains?: string } on_failure?: 'stop' | 'retry' | 'continue' max_retries?: number when?: string // Conditional: "inputs.env == 'prod'" needs?: string[] // Dependencies } // Execution State interface AbilityExecution { id: string ability: Ability inputs: InputValues status: 'running' | 'completed' | 'failed' | 'cancelled' currentStep: Step | null currentStepIndex: number completedSteps: StepResult[] pendingSteps: Step[] startedAt: number completedAt?: number error?: string } interface StepResult { stepId: string status: 'completed' | 'failed' | 'skipped' output?: string error?: string startedAt: number completedAt: number duration: number } ``` --- ## Data Flow Example: Deploy Workflow ### 1. User Types "Deploy v1.2.3" ``` User Message: "Deploy v1.2.3" ↓ chat.message hook intercepts ↓ detectAbility("Deploy v1.2.3") ├─ Check: "deploy" keyword in message? ✓ ├─ Found: ability { name: "deploy", ... } ↓ Show ability suggestion: "## Ability Detected: deploy\n\n Deploy application..." ``` ### 2. User Runs: `/ability.run deploy version=v1.2.3` ``` ability.run tool executes: ├─ Load ability: "deploy" ├─ Validate inputs: │ └─ version matches pattern: ^v\d+\.\d+\.\d+$ ✓ ├─ executionManager.execute(ability, {version: "v1.2.3"}) │ └─ Start execution: ExecutionManager creates AbilityExecution { id: "exec_1704067200000_abc123" status: "running" currentStep: steps[0] } ``` ### 3. Step 1: Test (Script) ``` Step: "test" (script) ├─ Command: "npm test" ├─ Run in shell: │ ├─ stdout: "✓ 124 tests passed" │ ├─ exit code: 0 │ └─ validation: exit_code == 0 ✓ ├─ Record result: │ └─ StepResult { stepId: "test", status: "completed", output: "..." } ├─ Inject context in next message: │ "## Active Ability: deploy\nProgress: 1/3 steps\nCurrent Step: build..." └─ Continue ``` ### 4. Step 2: Build (Script, depends on test) ``` Step: "build" (script) ├─ Needs: ["test"] ✓ (completed) ├─ Command: "npm run build" ├─ Tool check (tool.execute.before): │ └─ Block: bash, write, edit (not allowed in script steps) ├─ Execute... ├─ Result: success └─ Continue ``` ### 5. Step 3: Deploy (Script) ``` Step: "deploy" (script) ├─ Needs: ["build"] ✓ (completed) ├─ Interpolate variables: │ └─ "Deploy {{inputs.version}}" → "Deploy v1.2.3" ├─ Run: "./deploy.sh v1.2.3" ├─ Result: success └─ Mark ability complete ``` ### 6. Execution Complete ``` Set status: "completed" Save results: { completedSteps: [...], duration: "42.3s" } Return: "✅ Ability 'deploy' completed successfully" Clear activeExecution for next ability ``` --- ## Enforcement Mechanisms ### 1. Tool Blocking (tool.execute.before hook) **Problem**: Agent might try to call `bash` during a `script` step (redundant & risky) **Solution**: ```typescript async 'tool.execute.before'(input, output) { if (!activeExecution) return // Not running ability, allow all const currentStep = activeExecution.currentStep const allowedTools = ALLOWED_TOOLS_BY_STEP_TYPE[currentStep.type] if (enforcement === 'strict' && !allowedTools.includes(input.tool)) { throw new Error(`Tool '${input.tool}' blocked in ${currentStep.type} step`) } } ``` **Effect**: Agent cannot deviate from prescribed tool usage for current step ### 2. Context Injection (chat.message hook) **Problem**: Agent might forget which step it's on or what to do next **Solution**: ```typescript async 'chat.message'(input, output) { if (activeExecution?.status === 'running') { // Inject ability context at start of every message output.parts.unshift({ type: 'text', text: `## Active Ability: ${ability.name}\nProgress: 2/3 steps\nCurrent Step: deploy\nAction: Run ./deploy.sh v1.2.3` }) } } ``` **Effect**: Agent always sees context reminder, reducing deviation ### 3. Ability Detection (chat.message keyword matching) **Problem**: User doesn't know they can run an ability **Solution**: ```typescript const detected = detectAbility(userText) // Check triggers if (detected) { output.parts.unshift({ type: 'text', text: `## Ability Detected: ${detected.name}\n\n${detected.description}...` }) } ``` **Effect**: Auto-discovery makes abilities more discoverable --- ## Configuration ### In `.opencode/opencode.json`: ```json { "plugin": [ "file://../packages/plugin-abilities/src/opencode-plugin.ts" ] } ``` ### Optional Config: ```json { "abilities": { "enabled": true, "auto_trigger": true, "enforcement": "strict", "directories": [ ".opencode/abilities", "~/.config/opencode/abilities" ] } } ``` --- ## Ability File Structure ### Basic Example ```yaml # .opencode/abilities/deploy/ability.yaml name: deploy description: Deploy application with safety checks triggers: keywords: - deploy - ship patterns: - 'deploy.*v\d+\.\d+\.\d+' inputs: version: type: string required: true pattern: '^v\d+\.\d+\.\d+$' environment: type: string enum: [dev, staging, prod] default: staging steps: - id: test type: script run: npm test validation: exit_code: 0 - id: build type: script needs: [test] run: npm run build validation: exit_code: 0 - id: approve type: approval needs: [build] prompt: "Deploy {{inputs.version}} to {{inputs.environment}}?" - id: deploy type: script needs: [approve] run: ./deploy.sh {{inputs.version}} {{inputs.environment}} validation: exit_code: 0 ``` --- ## Tools Available to Agents ### ability.list Lists all available abilities ``` ability.list → "- deploy: Deploy application...\n- test: Run tests..." ``` ### ability.run Execute an ability ``` ability.run { name: "deploy", inputs: { version: "v1.2.3" } } → { status: "completed", ability: "deploy", result: "..." } ``` ### ability.status Check active ability execution ``` ability.status → { status: "running", ability: "deploy", currentStep: "build", progress: "2/3" } ``` ### ability.cancel Cancel active ability ``` ability.cancel → { status: "cancelled", message: "Ability cancelled" } ``` --- ## Execution Flow Diagram ``` ┌──────────────────────────────────────────────────────────┐ │ User Message → chat.message hook │ └────────────────────┬─────────────────────────────────────┘ │ ┌───────────┴────────────┐ ▼ ▼ No ability match Ability detected │ │ │ ┌─────┴──────┐ │ ▼ ▼ │ Auto-detect Show suggestion │ (cool 10s) to user │ ├─────────────────────────────────────┐ │ │ Allow normal User runs /ability.run OpenCode flow │ ┌──────────┴───────────┐ ▼ ▼ Load ability Validate inputs │ │ └──────────┬───────────┘ ▼ ExecutionManager.execute() │ ┌───────────────┴────────────────┐ │ Build execution order (deps) │ ├───────────────────────────────┤ │ FOR each step: │ │ ├─ Evaluate 'when' condition │ │ ├─ Execute step type: │ │ │ ├─ script → shell cmd │ │ │ ├─ agent → agent call │ │ │ ├─ skill → load skill │ │ │ ├─ approval → ask user │ │ │ └─ workflow → nested run │ │ ├─ Validate output │ │ ├─ Pass context to next step │ │ └─ Record result │ │ │ ├─ On failure: │ │ ├─ Stop (default) │ │ ├─ Retry (with max retries) │ │ └─ Continue (ignore error) │ │ │ └───────────────┬────────────────┘ ▼ Return execution results │ ┌───────────────────┼───────────────────┐ ▼ ▼ ▼ Save to history Show toast result Clear active (50 max, 30m TTL) (success/error) execution ``` --- ## Performance & Resource Considerations ### Lazy Initialization - ExecutionManager timer only starts on first ability execution - Timer uses `unref()` so it doesn't block process exit - Prevents unnecessary resource usage for inactive plugins ### Memory Management - Keep only last 50 executions in memory - Automatically delete executions older than 30 minutes - No memory leaks from long-running sessions ### Search Scope - Glob patterns limited to 2 levels deep (`*/*/ability.yaml`) - Prevents scanning entire project (could be thousands of files) - Encourages organized `.opencode/abilities/` directory structure ### Debouncing - Ability detection limited to once per 10 seconds per ability - Prevents message spam from repeated ability suggestions - User can still manually run with `/ability.run` --- ## Extension Points ### Adding New Step Types 1. Add type definition to `types/index.ts` 2. Add executor function in `executor/index.ts` 3. Add to `ALLOWED_TOOLS_BY_STEP_TYPE` 4. Update validator ### Custom Validation 1. Extend `validateAbility()` in `validator/index.ts` 2. Add custom error messages 3. Return enhanced validation result ### Custom Tools 1. Add tool definition in `opencode-plugin.ts` 2. Implement execute function 3. Register in tool map ### Agent-Specific Abilities ```yaml exclusive_agent: deploy-agent # Only this agent can run compatible_agents: [deploy-agent, devops-agent] # Whitelist ``` --- ## Testing ### Test Coverage (87 tests) - **executor.test.ts** - Step execution, dependencies, validation - **validator.test.ts** - Ability validation, input validation - **enforcement.test.ts** - Hook enforcement, agent attachment - **integration.test.ts** - Full lifecycle, error handling - **trigger.test.ts** - Keyword/pattern detection - **context-passing.test.ts** - Output context passing - **sdk.test.ts** - Public API ### Running Tests ```bash cd packages/plugin-abilities bun test ``` --- ## Troubleshooting ### "Ability not found" - Check `.opencode/abilities/` directory exists - Check ability YAML file is valid - Run `ability.list` to see loaded abilities ### "Input validation failed" - Check inputs match schema (type, pattern, enum, range) - Use `ability.validate ` to check definition ### "Tool blocked during step" - Check enforcement mode (loose vs strict) - Tool not in `ALLOWED_TOOLS_BY_STEP_TYPE[stepType]` - Script steps block all tools (run deterministically) ### "Step failed but continued" - Check `on_failure: continue` in step definition - Check `max_retries` configured ### Plugin crashes OpenCode - Check hook signatures match SDK (`@opencode-ai/plugin`) - Ensure all hooks have try-catch blocks - Check console for error messages --- ## Design Philosophy ### Why Enforcement? > **Hypothesis**: Traditional skills fail because LLMs are optimization engines, not planning engines. They optimize for "completion" not for "following instructions." **Solution**: Make it impossible to deviate - Block tools, not suggest them - Inject context, not hope it's remembered - Execute steps sequentially, not in parallel ### Why Step-Based? > **Real World**: Complex tasks have dependencies and validation needs. Humans break them into steps for a reason. **Solution**: Explicit step dependencies - Test before build - Build before deploy - Approval before production ### Why Validation? > **Problem**: Without validation, agents "guess" at outputs and continue. This causes silent failures. **Solution**: Assert expectations after each step - Script validation (exit codes, output content) - Input validation (required, pattern, range) - Dependency validation (no circular loops) --- ## Summary The Abilities Plugin enforces deterministic, step-by-step workflow execution through: 1. **Discovery** (loader) - Find ability definitions from YAML 2. **Validation** (validator) - Ensure well-formed and valid inputs 3. **Execution** (executor) - Run steps sequentially with context passing 4. **Enforcement** (plugin hooks) - Block tools, inject context, track state 5. **State Management** (ExecutionManager) - Lifecycle, cleanup, memory management Result: Agents follow prescribed workflows reliably, reproducibly, and safely.