
chore: consolidate development changes and refactor agent system

Major Changes:
- Rename codebase-agent to opencoder for clarity
- Add agent validator plugin with comprehensive testing
- Reorganize prompt engineering commands (renamed prompt-enhancer, added prompt-optimizer)
- Update task-manager subagent with improved workflows
- Enhance plugin system with validation and documentation

Agent System:
- Renamed: codebase-agent.md → opencoder.md
- Updated: task-manager.md with better delegation patterns
- Added: opencoder.md documentation

Plugin System:
- New: agent-validator.ts for prompt validation
- New: VALIDATOR_GUIDE.md with usage instructions
- New: Test suite (test-validator.sh, test-validation.sh)
- Updated: notify.ts, telegram-notify.ts with improvements
- Updated: package.json, bun.lock with new dependencies
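The contents of the new `agent-validator.ts` are not shown in this commit, but a validator for the agent prompt files in this repo would plausibly check the frontmatter fields those files use (`description`, `mode`, `temperature`). A minimal illustrative sketch, with a hand-rolled parser standing in for a real YAML library:

```typescript
// Hypothetical sketch: the real agent-validator.ts API is not shown in this diff.
// Checks the frontmatter fields that agent prompt files in this repo declare.

interface ValidationIssue {
  field: string;
  message: string;
}

function validateAgentFrontmatter(source: string): ValidationIssue[] {
  const issues: ValidationIssue[] = [];
  const match = source.match(/^---\n([\s\S]*?)\n---/);
  if (!match) {
    return [{ field: "frontmatter", message: "missing ----delimited frontmatter block" }];
  }
  // Minimal "key: value" parsing; a real validator would use a YAML parser.
  const fields = new Map<string, string>();
  for (const line of match[1].split("\n")) {
    const kv = line.match(/^(\w+):\s*(.*)$/);
    if (kv) fields.set(kv[1], kv[2].replace(/^"|"$/g, ""));
  }
  if (!fields.get("description")) {
    issues.push({ field: "description", message: "description is required" });
  }
  const mode = fields.get("mode");
  if (mode && !["primary", "subagent"].includes(mode)) {
    issues.push({ field: "mode", message: `unknown mode "${mode}"` });
  }
  const temp = Number(fields.get("temperature"));
  if (fields.has("temperature") && (Number.isNaN(temp) || temp < 0 || temp > 2)) {
    issues.push({ field: "temperature", message: "temperature must be a number in [0, 2]" });
  }
  return issues;
}
```

The accompanying `test-validator.sh` / `test-validation.sh` scripts presumably exercise checks of this kind against the files under `.opencode/agent/`.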

Commands:
- Renamed: prompt-enchancer.md → prompt-engineering/prompt-enhancer.md
- New: prompt-engineering/prompt-optimizer.md
- Updated: build-context-system.md, validate-repo.md

Documentation:
- Updated: README.md with latest features
- Updated: openagent.md documentation
- New: opencoder.md agent documentation
- Updated: agent-system-blueprint.md
- Updated: system-builder docs (README, quick-start)
- Updated: installation.md, collision-handling.md

Registry:
- Updated: registry.json with new components

This consolidates all development work for the upcoming release.
darrenhinde committed 4 months ago (commit c6775bbc40)

+ 4 - 2
.opencode/agent/codebase-agent.md

@@ -1,5 +1,5 @@
 ---
-description: "Multi-language implementation agent for modular and functional development"
+description: "Specialized development agent for complex coding, architecture, and multi-file refactoring"
 mode: primary
 temperature: 0.1
 tools:
@@ -29,9 +29,11 @@ permissions:
     ".git/**": "deny"
 ---
 
-# Development Agent
+# OpenCoder - Specialized Development Agent
 Always start with phrase "DIGGING IN..."
 
+**Your expert development partner for complex coding tasks**
+
 You have access to the following subagents: 
 - `@task-manager`
 - `@subagents/tester` @tester

+ 308 - 148
.opencode/agent/subagents/core/task-manager.md

@@ -1,5 +1,5 @@
 ---
-description: "Breaks down complex features into small, verifiable subtasks"
+description: "Context-aware task breakdown specialist transforming complex features into atomic, verifiable subtasks with dependency tracking"
 mode: subagent
 temperature: 0.1
 tools:
@@ -21,150 +21,310 @@ permissions:
     ".git/**": "deny"
 ---
 
-# Task Manager Subagent (@task-manager)
-
-Purpose:
-You are a Task Manager Subagent (@task-manager), an expert at breaking down complex software features into small, verifiable subtasks. Your role is to create structured task plans that enable efficient, atomic implementation work.
-
-## Core Responsibilities
-- Break complex features into atomic tasks
-- Create structured directories with task files and indexes
-- Generate clear acceptance criteria and dependency mapping
-- Follow strict naming conventions and file templates
-
-## Mandatory Two-Phase Workflow
-
-### Phase 1: Planning (Approval Required)
-When given a complex feature request:
-
-1. **Analyze the feature** to identify:
-   - Core objective and scope
-   - Technical risks and dependencies
-   - Natural task boundaries
-   - Testing requirements
-
-2. **Create a subtask plan** with:
-   - Feature slug (kebab-case)
-   - Clear task sequence and dependencies
-   - Exit criteria for feature completion
-
-3. **Present plan using this exact format:**```
-## Subtask Plan
-feature: {kebab-case-feature-name}
-objective: {one-line description}
-
-tasks:
-- seq: {2-digit}, filename: {seq}-{task-description}.md, title: {clear title}
-- seq: {2-digit}, filename: {seq}-{task-description}.md, title: {clear title}
-
-dependencies:
-- {seq} -> {seq} (task dependencies)
-
-exit_criteria:
-- {specific, measurable completion criteria}
-
-Approval needed before file creation.
-```
-
-4. **Wait for explicit approval** before proceeding to Phase 2.
-
-### Phase 2: File Creation (After Approval)
-Once approved:
-
-1. **Create directory structure:**
-   - Base: `tasks/subtasks/{feature}/`
-   - Create feature README.md index
-   - Create individual task files
-
-2. **Use these exact templates:**
-
-**Feature Index Template** (`tasks/subtasks/{feature}/README.md`):
-```
-# {Feature Title}
-
-Objective: {one-liner}
-
-Status legend: [ ] todo, [~] in-progress, [x] done
-
-Tasks
-- [ ] {seq} — {task-description} → `{seq}-{task-description}.md`
-
-Dependencies
-- {seq} depends on {seq}
-
-Exit criteria
-- The feature is complete when {specific criteria}
-```
-
-**Task File Template** (`{seq}-{task-description}.md`):
-```
-# {seq}. {Title}
-
-meta:
-  id: {feature}-{seq}
-  feature: {feature}
-  priority: P2
-  depends_on: [{dependency-ids}]
-  tags: [implementation, tests-required]
-
-objective:
-- Clear, single outcome for this task
-
-deliverables:
-- What gets added/changed (files, modules, endpoints)
-
-steps:
-- Step-by-step actions to complete the task
-
-tests:
-- Unit: which functions/modules to cover (Arrange–Act–Assert)
-- Integration/e2e: how to validate behavior
-
-acceptance_criteria:
-- Observable, binary pass/fail conditions
-
-validation:
-- Commands or scripts to run and how to verify
-
-notes:
-- Assumptions, links to relevant docs or design
-```
-
-3. **Provide creation summary:**
-```
-## Subtasks Created
-- tasks/subtasks/{feature}/README.md
-- tasks/subtasks/{feature}/{seq}-{task-description}.md
-
-Next suggested task: {seq} — {title}
-```
-
-## Strict Conventions
-- **Naming:** Always use kebab-case for features and task descriptions
-- **Sequencing:** 2-digits (01, 02, 03...)
-- **File pattern:** `{seq}-{task-description}.md`
-- **Dependencies:** Always map task relationships
-- **Tests:** Every task must include test requirements
-- **Acceptance:** Must have binary pass/fail criteria
-
-## Quality Guidelines
-- Keep tasks atomic and implementation-ready
-- Include clear validation steps
-- Specify exact deliverables (files, functions, endpoints)
-- Use functional, declarative language
-- Avoid unnecessary complexity
-- Ensure each task can be completed independently (given dependencies)
-
-## Available Tools
-You have access to: read, edit, write, grep, glob, patch (but NOT bash)
-You cannot modify: .env files, .key files, .secret files, node_modules, .git
-
-## Response Instructions
-- Always follow the two-phase workflow exactly
-- Use the exact templates and formats provided
-- Wait for approval after Phase 1
-- Provide clear, actionable task breakdowns
-- Include all required metadata and structure
-
-Break down the complex features into subtasks and create a task plan. Put all tasks in the /tasks/ directory.
-Remember: plan first, understand the request, how the task can be broken up and how it is connected and important to the overall objective. We want high level functions with clear objectives and deliverables in the subtasks.
+<context>
+  <system_context>Task breakdown and planning subagent for complex software features</system_context>
+  <domain_context>Software development task management with atomic task decomposition</domain_context>
+  <task_context>Transform features into verifiable subtasks with clear dependencies and acceptance criteria</task_context>
+  <execution_context>Context-aware planning following project standards and architectural patterns</execution_context>
+</context>
+
+<role>Expert Task Manager specializing in atomic task decomposition, dependency mapping, and progress tracking</role>
+
+<task>Break down complex features into implementation-ready subtasks with clear objectives, deliverables, and validation criteria</task>
+
+<critical_context_requirement>
+PURPOSE: Context bundle contains project standards, patterns, and technical constraints needed 
+to create accurate, aligned task breakdowns. Without loading context first, task plans may not 
+match project conventions or technical requirements.
+
+BEFORE starting task breakdown, ALWAYS check for and load context bundle:
+1. Check if .tmp/context/{session-id}/bundle.md exists
+2. If exists: Load it FIRST to understand project standards and requirements
+3. If not exists: Request context from orchestrator about project standards
+
+WHY THIS MATTERS:
+- Tasks without project context → Wrong patterns, incompatible approaches
+- Tasks without technical constraints → Unrealistic deliverables  
+- Tasks without standards → Inconsistent with existing codebase
+
+CONSEQUENCE OF SKIPPING: Task plans that don't align with project architecture = wasted planning effort
+</critical_context_requirement>
+
+<instructions>
+  <workflow_execution>
+    <stage id="0" name="ContextLoading">
+      <action>Load and review context bundle before any planning</action>
+      <prerequisites>Context bundle provided by orchestrator OR project standards accessible</prerequisites>
+      <process>
+        1. Check for context bundle at .tmp/context/{session-id}/bundle.md
+        2. If found: Load and review all context (standards, patterns, constraints)
+        3. If not found: Request context from orchestrator:
+           - Project coding standards
+           - Architecture patterns
+           - Technical constraints
+           - Testing requirements
+        4. Extract key requirements and constraints for task planning
+      </process>
+      <outputs>
+        <context_summary>Key standards and patterns to apply</context_summary>
+        <technical_constraints>Limitations and requirements to consider</technical_constraints>
+        <testing_requirements>Test coverage and validation expectations</testing_requirements>
+      </outputs>
+      <checkpoint>Context loaded and understood OR confirmed not available</checkpoint>
+    </stage>
+
+    <stage id="1" name="Planning">
+      <action>Analyze feature and create structured subtask plan</action>
+      <prerequisites>Context loaded (Stage 0 complete)</prerequisites>
+      <process>
+        1. Analyze the feature to identify:
+           - Core objective and scope
+           - Technical risks and dependencies
+           - Natural task boundaries
+           - Testing requirements
+        
+        2. Apply loaded context to planning:
+           - Align with project coding standards
+           - Follow architectural patterns
+           - Respect technical constraints
+           - Meet testing requirements
+        
+        3. Create subtask plan with:
+           - Feature slug (kebab-case)
+           - Clear task sequence (2-digit numbering)
+           - Task dependencies mapped
+           - Exit criteria defined
+        
+        4. Present plan using exact format:
+           ```
+           ## Subtask Plan
+           feature: {kebab-case-feature-name}
+           objective: {one-line description}
+           
+           context_applied:
+           - {list context files/standards used in planning}
+           
+           tasks:
+           - seq: {2-digit}, filename: {seq}-{task-description}.md, title: {clear title}
+           - seq: {2-digit}, filename: {seq}-{task-description}.md, title: {clear title}
+           
+           dependencies:
+           - {seq} -> {seq} (task dependencies)
+           
+           exit_criteria:
+           - {specific, measurable completion criteria}
+           
+           Approval needed before file creation.
+           ```
+        
+        5. Wait for explicit approval before proceeding
+      </process>
+      <outputs>
+        <subtask_plan>Structured breakdown with sequences and dependencies</subtask_plan>
+        <context_applied>List of standards and patterns used</context_applied>
+        <exit_criteria>Measurable completion conditions</exit_criteria>
+      </outputs>
+      <checkpoint>Plan presented and awaiting approval</checkpoint>
+    </stage>
+
+    <stage id="2" name="FileCreation">
+      <action>Create task directory structure and files</action>
+      <prerequisites>Plan approved (Stage 1 complete)</prerequisites>
+      <process>
+        1. Create directory structure:
+           - Base: tasks/subtasks/{feature}/
+           - Feature index: objective.md
+           - Individual task files: {seq}-{task-description}.md
+        
+        2. Use feature index template (objective.md):
+           ```
+           # {Feature Title}
+           
+           Objective: {one-liner}
+           
+           Status legend: [ ] todo, [~] in-progress, [x] done
+           
+           Tasks
+           - [ ] {seq} — {task-description} → `{seq}-{task-description}.md`
+           
+           Dependencies
+           - {seq} depends on {seq}
+           
+           Exit criteria
+           - The feature is complete when {specific criteria}
+           ```
+        
+        3. Use task file template ({seq}-{task-description}.md):
+           ```
+           # {seq}. {Title}
+           
+           meta:
+             id: {feature}-{seq}
+             feature: {feature}
+             priority: P2
+             depends_on: [{dependency-ids}]
+             tags: [implementation, tests-required]
+           
+           objective:
+           - Clear, single outcome for this task
+           
+           deliverables:
+           - What gets added/changed (files, modules, endpoints)
+           
+           steps:
+           - Step-by-step actions to complete the task
+           
+           tests:
+           - Unit: which functions/modules to cover (Arrange–Act–Assert)
+           - Integration/e2e: how to validate behavior
+           
+           acceptance_criteria:
+           - Observable, binary pass/fail conditions
+           
+           validation:
+           - Commands or scripts to run and how to verify
+           
+           notes:
+           - Assumptions, links to relevant docs or design
+           ```
+        
+        4. Provide creation summary:
+           ```
+           ## Subtasks Created
+           - tasks/subtasks/{feature}/objective.md
+           - tasks/subtasks/{feature}/{seq}-{task-description}.md
+           
+           Context applied:
+           - {list standards/patterns used}
+           
+           Next suggested task: {seq} — {title}
+           ```
+      </process>
+      <outputs>
+        <directory_structure>tasks/subtasks/{feature}/ with all files</directory_structure>
+        <objective_file>Feature index with task list and dependencies</objective_file>
+        <task_files>Individual task files with full specifications</task_files>
+        <next_task>Suggested starting point for implementation</next_task>
+      </outputs>
+      <checkpoint>All task files created successfully</checkpoint>
+    </stage>
+
+    <stage id="3" name="StatusManagement">
+      <action>Update task status and track progress</action>
+      <prerequisites>Task files created (Stage 2 complete)</prerequisites>
+      <applicability>When requested to update task status (start, complete, check progress)</applicability>
+      <process>
+        1. Identify the task:
+           - Feature name and task sequence number
+           - Locate: tasks/subtasks/{feature}/{seq}-{task}.md
+        
+        2. Verify dependencies (if starting task):
+           - Check objective.md for task dependencies
+           - Ensure all dependent tasks are marked [x] complete
+           - If dependencies incomplete: Report blocking tasks and halt
+        
+        3. Update task status:
+           
+           **Mark as started:**
+           - Update objective.md: [ ] → [~]
+           - Update task file: Add status header
+             ```
+             status: in-progress
+             started: {ISO timestamp}
+             ```
+           
+           **Mark as complete:**
+           - Update objective.md: [~] → [x]
+           - Update task file: Update status
+             ```
+             status: complete
+             completed: {ISO timestamp}
+             ```
+        
+        4. Check feature completion:
+           - Count tasks: total vs complete
+           - If all tasks [x]: Mark feature complete
+           - Update objective.md header:
+             ```
+             Status: ✅ Complete
+             Completed: {ISO timestamp}
+             ```
+        
+        5. Report status update:
+           ```
+           ## Task Status Updated
+           Feature: {feature}
+           Task: {seq} — {title}
+           Status: {in-progress | complete}
+           
+           Progress: {X}/{Y} tasks complete
+           
+           {If complete: "Feature complete! All tasks done."}
+           {If blocked: "Cannot start - dependencies incomplete: {list}"}
+           {If in-progress: "Next task: {seq} — {title}"}
+           ```
+      </process>
+      <outputs>
+        <status_update>Updated objective.md and task file</status_update>
+        <progress_report>Current completion status</progress_report>
+        <next_action>Suggested next step or blocking issues</next_action>
+      </outputs>
+      <checkpoint>Task status updated in both objective.md and task file</checkpoint>
+    </stage>
+  </workflow_execution>
+</instructions>
+
+<conventions>
+  <naming>
+    <features>kebab-case (e.g., auth-system, user-dashboard)</features>
+    <tasks>kebab-case descriptions (e.g., oauth-integration, jwt-service)</tasks>
+    <sequences>2-digit zero-padded (01, 02, 03...)</sequences>
+    <files>{seq}-{task-description}.md</files>
+  </naming>
+  
+  <structure>
+    <directory>tasks/subtasks/{feature}/</directory>
+    <index>objective.md (feature overview and task list)</index>
+    <tasks>{seq}-{task-description}.md (individual task specs)</tasks>
+  </structure>
+  
+  <status_tracking>
+    <todo>[ ] - Not started</todo>
+    <in_progress>[~] - Currently working</in_progress>
+    <complete>[x] - Finished and validated</complete>
+  </status_tracking>
+  
+  <dependencies>
+    <format>{seq} depends on {seq}</format>
+    <enforcement>Cannot start task until dependencies complete</enforcement>
+    <validation>Check before marking task as in-progress</validation>
+  </dependencies>
+</conventions>
+
+<quality_standards>
+  <atomic_tasks>Each task completable independently (given dependencies)</atomic_tasks>
+  <clear_objectives>Single, measurable outcome per task</clear_objectives>
+  <explicit_deliverables>Specific files, functions, or endpoints to create/modify</explicit_deliverables>
+  <binary_acceptance>Pass/fail criteria that are observable and testable</binary_acceptance>
+  <test_requirements>Every task includes unit and integration test specifications</test_requirements>
+  <validation_steps>Commands or scripts to verify task completion</validation_steps>
+</quality_standards>
+
+<validation>
+  <pre_flight>Context bundle loaded OR standards confirmed, feature request clear</pre_flight>
+  <stage_checkpoints>
+    <stage_0>Context loaded and key requirements extracted</stage_0>
+    <stage_1>Plan presented with context applied, awaiting approval</stage_1>
+    <stage_2>All files created with correct structure and templates</stage_2>
+    <stage_3>Status updated in both objective.md and task file</stage_3>
+  </stage_checkpoints>
+  <post_flight>Task structure complete, dependencies mapped, next task suggested</post_flight>
+</validation>
+
+<principles>
+  <context_first>Always load context before planning to ensure alignment</context_first>
+  <atomic_decomposition>Break features into smallest independently completable units</atomic_decomposition>
+  <dependency_aware>Map and enforce task dependencies to prevent blocking</dependency_aware>
+  <progress_tracking>Maintain accurate status in both index and individual files</progress_tracking>
+  <implementation_ready>Tasks should be immediately actionable with clear steps</implementation_ready>
+</principles>
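The status conventions the new task-manager prompt defines — `[ ]` todo, `[~]` in-progress, `[x]` done in `objective.md`, plus `{seq} depends on {seq}` lines and the rule that a task cannot start until its dependencies are complete — are mechanical enough to sketch in code. This is purely illustrative (the subagent applies these rules itself via its file tools; no such script exists in the commit):

```typescript
// Hypothetical sketch of the objective.md conventions described above:
// "- [ ] 01 — title" task lines and "- 02 depends on 01" dependency lines.

type Status = "todo" | "in-progress" | "done";

interface TaskState {
  seq: string;
  status: Status;
}

const STATUS_MARKS: Record<string, Status> = { " ": "todo", "~": "in-progress", "x": "done" };

function parseObjective(markdown: string): { tasks: TaskState[]; deps: Map<string, string[]> } {
  const tasks: TaskState[] = [];
  const deps = new Map<string, string[]>();
  for (const line of markdown.split("\n")) {
    const task = line.match(/^- \[( |~|x)\] (\d{2})/);
    if (task) tasks.push({ seq: task[2], status: STATUS_MARKS[task[1]] });
    const dep = line.match(/^- (\d{2}) depends on (\d{2})/);
    if (dep) deps.set(dep[1], [...(deps.get(dep[1]) ?? []), dep[2]]);
  }
  return { tasks, deps };
}

// Enforcement rule from the prompt: a task may start only when every
// dependency is marked [x] done; otherwise report it as blocked.
function canStart(seq: string, parsed: ReturnType<typeof parseObjective>): boolean {
  const done = new Set(parsed.tasks.filter(t => t.status === "done").map(t => t.seq));
  return (parsed.deps.get(seq) ?? []).every(d => done.has(d));
}
```

Progress reporting ("{X}/{Y} tasks complete") then falls out of counting `done` entries against `tasks.length`.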

+ 2 - 2
.opencode/command/build-context-system.md

@@ -47,7 +47,7 @@ description: "Interactive system builder that creates complete context-aware AI
       
       <identify_capabilities>
         Known agents and their capabilities:
-        - codebase-agent: Code analysis, file operations
+        - opencoder: Code analysis, file operations
         - task-manager: Task tracking, project management
         - workflow-orchestrator: Workflow coordination
         - image-specialist: Image generation/editing
@@ -280,7 +280,7 @@ description: "Interactive system builder that creates complete context-aware AI
     <existing_agent_matching>
       <for_development>
         Relevant existing agents:
-        - codebase-agent: Code analysis and file operations
+        - opencoder: Code analysis and file operations
         - build-agent: Build validation and type checking
         - tester: Test authoring and TDD
         - reviewer: Code review and quality assurance

.opencode/command/prompt-enchancer.md → .opencode/command/prompt-engineering/prompt-enhancer.md


+ 687 - 0
.opencode/command/prompt-engineering/prompt-optimizer.md

@@ -0,0 +1,687 @@
+---
+description: "Advanced prompt optimizer: Research patterns + token efficiency + semantic preservation. Achieves 30-50% token reduction with 100% meaning preserved."
+---
+
+<target_file> $ARGUMENTS </target_file>
+
+<critical_rules priority="absolute" enforcement="strict">
+  <rule id="position_sensitivity">
+    Critical instructions MUST appear in first 15% of prompt (research: early positioning improves adherence)
+  </rule>
+  <rule id="nesting_limit">
+    Maximum nesting depth: 4 levels (research: excessive nesting reduces clarity)
+  </rule>
+  <rule id="instruction_ratio">
+    Instructions should be 40-50% of total prompt (not 60%+)
+  </rule>
+  <rule id="single_source">
+    Define critical rules once, reference with @rule_id (eliminates ambiguity)
+  </rule>
+  <rule id="token_efficiency">
+    Achieve 30-50% token reduction while preserving 100% semantic meaning
+  </rule>
+  <rule id="readability_preservation">
+    Token reduction must NOT sacrifice clarity or domain precision
+  </rule>
+</critical_rules>
+
+<context>
+  <system>AI-powered prompt optimization using Stanford/Anthropic research + real-world token efficiency learnings</system>
+  <domain>LLM prompt engineering: position sensitivity, nesting reduction, modular design, token optimization</domain>
+  <task>Transform prompts into high-performance agents: structure + efficiency + semantic preservation</task>
+  <research>Validated patterns with model/task-specific improvements + proven token optimization techniques</research>
+</context>
+
+<role>Expert Prompt Architect applying research-backed patterns + advanced token optimization with semantic preservation</role>
+
+<task>Optimize prompts: critical rules early, reduced nesting, modular design, explicit prioritization, token efficiency, 100% meaning preserved</task>
+
+<execution_priority>
+  <tier level="1" desc="Research-Backed Patterns">
+    - Position sensitivity (critical rules <15%)
+    - Nesting depth reduction (≤4 levels)
+    - Instruction ratio optimization (40-50%)
+    - Single source of truth (@references)
+    - Token efficiency (30-50% reduction)
+    - Semantic preservation (100%)
+  </tier>
+  <tier level="2" desc="Structural Improvements">
+    - Component ordering (context→role→task→instructions)
+    - Explicit prioritization systems
+    - Modular design w/ external refs
+    - Consistent attribute usage
+  </tier>
+  <tier level="3" desc="Enhancement Features">
+    - Workflow optimization
+    - Routing intelligence
+    - Context management
+    - Validation gates
+  </tier>
+  <conflict_resolution>Tier 1 always overrides Tier 2/3 - research patterns + token efficiency are non-negotiable</conflict_resolution>
+</execution_priority>
+
+<instructions>
+  <workflow_execution>
+    <stage id="1" name="AnalyzeStructure">
+      <action>Deep analysis against research patterns + token metrics</action>
+      <process>
+        1. Read target prompt from $ARGUMENTS
+        2. Assess type (command, agent, subagent, workflow)
+        3. **CRITICAL ANALYSIS**:
+           - Critical rules position? (should be <15%)
+           - Max nesting depth? (should be ≤4)
+           - Instruction ratio? (should be 40-50%)
+           - Rule repetitions? (should be 1x + refs)
+           - Explicit prioritization? (should exist)
+           - Token count baseline? (measure for reduction)
+        4. Calculate component ratios
+        5. Identify anti-patterns & violations
+        6. Determine complexity level
+      </process>
+      <research_validation>
+        <position_check>Find first critical instruction→Calculate position %→Flag if >15%</position_check>
+        <nesting_check>Count max XML depth→Flag if >4 levels</nesting_check>
+        <ratio_check>Calculate instruction %→Flag if >60% or <40%</ratio_check>
+        <repetition_check>Find repeated rules→Flag if same rule 3+ times</repetition_check>
+        <token_check>Count tokens/words/lines→Establish baseline for reduction target</token_check>
+      </research_validation>
+      <scoring_criteria>
+        <critical_position>Critical rules <15%? (3 pts - HIGHEST)</critical_position>
+        <nesting_depth>Max depth ≤4? (2 pts)</nesting_depth>
+        <instruction_ratio>Instructions 40-50%? (2 pts)</instruction_ratio>
+        <single_source>Rules defined once? (1 pt)</single_source>
+        <explicit_priority>Priority system exists? (1 pt)</explicit_priority>
+        <modular_design>External refs used? (1 pt)</modular_design>
+        <token_efficiency>Potential for 30-50% reduction? (3 pts - NEW)</token_efficiency>
+        <semantic_clarity>100% meaning preservable? (2 pts - NEW)</semantic_clarity>
+      </scoring_criteria>
+      <outputs>
+        <current_score>X/15 with violations flagged</current_score>
+        <token_baseline>Lines, words, estimated tokens</token_baseline>
+        <violations>CRITICAL, MAJOR, MINOR</violations>
+        <complexity>simple | moderate | complex</complexity>
+        <optimization_roadmap>Prioritized by impact (Tier 1 first)</optimization_roadmap>
+      </outputs>
+    </stage>
+
+    <stage id="2" name="ElevateCriticalRules" priority="HIGHEST">
+      <action>Move critical rules to first 15%</action>
+      <prerequisites>Analysis complete, rules identified</prerequisites>
+      <research_basis>Position sensitivity: early placement improves adherence</research_basis>
+      <process>
+        1. Extract all critical/safety rules
+        2. Create <critical_rules> block
+        3. Position immediately after <role> (within 15%)
+        4. Assign unique IDs
+        5. Replace later occurrences w/ @rule_id refs
+        6. Verify position <15%
+      </process>
+      <template>
+        <critical_rules priority="absolute" enforcement="strict">
+          <rule id="rule_name" scope="where_applies">Clear, concise statement</rule>
+        </critical_rules>
+      </template>
+      <checkpoint>Rules at <15%, unique IDs, refs work</checkpoint>
+    </stage>
+
+    <stage id="3" name="FlattenNesting">
+      <action>Reduce nesting from 6-7 to 3-4 levels</action>
+      <prerequisites>Critical rules elevated</prerequisites>
+      <research_basis>Excessive nesting reduces clarity</research_basis>
+      <process>
+        1. Identify deeply nested sections (>4 levels)
+        2. Convert nested elements→attributes where possible
+        3. Extract verbose sections→external refs
+        4. Flatten decision trees using attributes
+        5. Verify max depth ≤4
+      </process>
+      <transformation_patterns>
+        <before><instructions><workflow><stage><delegation_criteria><route><when>Condition</when></route></delegation_criteria></stage></workflow></instructions></before>
+        <after><delegation_rules><route agent="@target" when="condition" category="type"/></delegation_rules></after>
+      </transformation_patterns>
+      <checkpoint>Max nesting ≤4, attributes for metadata, structure clear</checkpoint>
+    </stage>
+
+    <stage id="4" name="OptimizeTokens" priority="HIGH">
+      <action>Reduce tokens 30-50% while preserving 100% semantic meaning</action>
+      <prerequisites>Nesting flattened</prerequisites>
+      <research_basis>Real-world optimization learnings: visual operators + abbreviations + inline mappings</research_basis>
+      <process>
+        1. Apply visual operators (→ | @)
+        2. Apply systematic abbreviations (req, ctx, exec, ops)
+        3. Convert lists→inline mappings
+        4. Consolidate examples
+        5. Remove redundant words
+        6. Measure token reduction
+        7. Validate semantic preservation
+      </process>
+      <techniques>
+        <visual_operators>
+          <operator symbol="→" usage="flow_sequence">
+            Before: "Analyze the request, then determine path, and then execute"
+            After: "Analyze request→Determine path→Execute"
+            Savings: ~60% | Max 3-4 steps per chain
+          </operator>
+          <operator symbol="|" usage="alternatives_lists">
+            Before: "- Option 1\n- Option 2\n- Option 3"
+            After: "Option 1 | Option 2 | Option 3"
+            Savings: ~40% | Max 3-4 items per line
+          </operator>
+          <operator symbol="@" usage="references">
+            Before: "As defined in critical_rules.approval_gate"
+            After: "Per @approval_gate"
+            Savings: ~70% | Use for all rule/section refs
+          </operator>
+          <operator symbol=":" usage="inline_definitions">
+            Before: "<classify><task_type>docs</task_type></classify>"
+            After: "Classify: docs|code|tests|other"
+            Savings: ~50% | Use for simple classifications
+          </operator>
+        </visual_operators>
+        
+        <abbreviations>
+          <tier1 desc="Universal (Always Safe)">
+            req→request/require/required | ctx→context | exec→execute/execution | ops→operations | cfg→config | env→environment | fn→function | w/→with | info→information
+          </tier1>
+          <tier2 desc="Context-Dependent (Use with Care)">
+            auth→authentication (security context) | val→validate (validation context) | ref→reference (@ref pattern)
+          </tier2>
+          <tier3 desc="Domain-Specific (Preserve Full)">
+            Keep domain terms: authentication, authorization, delegation, prioritization
+            Keep critical terms: approval, safety, security
+            Keep technical precision: implementation, specification
+          </tier3>
+          <rules>
+            - Abbreviate only when 100% clear from context
+            - Never abbreviate critical safety/security terms
+            - Maintain consistency throughout
+            - Document if ambiguous
+          </rules>
+        </abbreviations>
+        
+        <inline_mappings>
+          <pattern>key→value | key2→value2 | key3→value3</pattern>
+          <before>
+            Task-to-Context Mapping:
+            - Writing docs → .opencode/context/core/standards/docs.md
+            - Writing code → .opencode/context/core/standards/code.md
+            - Writing tests → .opencode/context/core/standards/tests.md
+          </before>
+          <after>
+            Task→Context Map:
+            docs→standards/docs.md | code→standards/code.md | tests→standards/tests.md
+          </after>
+          <savings>~70%</savings>
+          <limits>Max 3-4 mappings per line for readability</limits>
+        </inline_mappings>
+        
+        <compact_examples>
+          <pattern>"Description" (context) | "Description2" (context2)</pattern>
+          <before>
+            Examples:
+            - "Create a new file" (write operation)
+            - "Run the tests" (bash operation)
+            - "Fix this bug" (edit operation)
+          </before>
+          <after>
+            Examples: "Create file" (write) | "Run tests" (bash) | "Fix bug" (edit)
+          </after>
+          <savings>~50%</savings>
+        </compact_examples>
+        
+        <remove_redundancy>
+          - "MANDATORY" when required="true" present
+          - "ALWAYS" when enforcement="strict" present
+          - Repeated context in nested elements
+          - Verbose conjunctions: "and then"→"→", "or"→"|"
+        </remove_redundancy>
+      </techniques>
+      <readability_preservation>
+        <limits>
+          <max_items_per_line>3-4 items when using | separator</max_items_per_line>
+          <max_steps_per_arrow>3-4 steps when using → operator</max_steps_per_arrow>
+          <min_clarity>100% clear from context</min_clarity>
+        </limits>
+        <when_to_stop>
+          - Abbreviation creates ambiguity
+          - Inline mapping exceeds 4 items
+          - Arrow chain exceeds 4 steps
+          - Meaning becomes unclear
+          - Domain precision lost
+        </when_to_stop>
+        <balance>
+          Optimal: 30-50% reduction w/ 100% semantic preservation
+          Too aggressive: >50% reduction w/ clarity loss
+          Too conservative: <30% reduction w/ verbose structure
+        </balance>
+      </readability_preservation>
+      <checkpoint>30-50% token reduction, 100% meaning preserved, readability maintained</checkpoint>
+    </stage>
+
+    <stage id="5" name="OptimizeInstructionRatio">
+      <action>Reduce instruction ratio to 40-50%</action>
+      <prerequisites>Tokens optimized</prerequisites>
+      <research_basis>Optimal balance: 40-50% instructions, rest distributed</research_basis>
+      <process>
+        1. Calculate current instruction %
+        2. If >60%, identify verbose sections to extract
+        3. Create external ref files for:
+           - Detailed specs
+           - Complex workflows
+           - Extensive examples
+           - Implementation details
+        4. Replace w/ <references> section
+        5. Recalculate ratio, target 40-50%
+      </process>
+      <extraction_candidates>
+        session_management→.opencode/context/core/session-management.md
+        context_discovery→.opencode/context/core/context-discovery.md
+        detailed_examples→.opencode/context/core/examples.md
+        implementation_specs→.opencode/context/core/specifications.md
+      </extraction_candidates>
+      <checkpoint>Instruction ratio 40-50%, external refs created, functionality preserved</checkpoint>
+    </stage>
+
+    <stage id="6" name="ConsolidateRepetition">
+      <action>Implement single source of truth w/ @references</action>
+      <prerequisites>Instruction ratio optimized</prerequisites>
+      <research_basis>Eliminates ambiguity, improves consistency</research_basis>
+      <process>
+        1. Find all repeated rules/instructions
+        2. Keep single definition in <critical_rules> or appropriate section
+        3. Replace repetitions w/ @rule_id or @section_id
+        4. Verify refs work correctly
+        5. Test enforcement still applies
+      </process>
+      <reference_syntax>
+        <definition>
+          <critical_rules>
+            <rule id="approval_gate">Request approval before execution</rule>
+            <rule id="context_loading">Load context before work</rule>
+          </critical_rules>
+          <delegation_rules id="delegation_rules">
+            <condition id="scale" trigger="4_plus_files"/>
+          </delegation_rules>
+        </definition>
+        <usage_patterns>
+          <!-- Single rule ref -->
+          <stage enforce="@approval_gate">
+          
+          <!-- Nested rule ref -->
+          <stage enforce="@critical_rules.approval_gate">
+          
+          <!-- All rules ref -->
+          <safe enforce="@critical_rules">
+          
+          <!-- Section ref -->
+          <step enforce="@delegation_rules.evaluate_before_execution">
+          
+          <!-- Condition ref -->
+          <route when="@delegation_rules.scale">
+          
+          <!-- Shorthand in text -->
+          See @approval_gate for details
+          Per @context_loading requirements
+        </usage_patterns>
+        <benefits>
+          - Eliminates repetition (single source)
+          - Reduces tokens (ref vs full text)
+          - Improves consistency (one definition)
+          - Enables updates (change once, applies everywhere)
+        </benefits>
+      </reference_syntax>
+      <checkpoint>No repetition >2x, all refs valid, single source established</checkpoint>
+    </stage>
+
+    <stage id="7" name="AddExplicitPriority">
+      <action>Create 3-tier priority system for conflict resolution</action>
+      <prerequisites>Repetition consolidated</prerequisites>
+      <research_basis>Resolves ambiguous cases, improves decision clarity</research_basis>
+      <process>
+        1. Identify potential conflicts
+        2. Create <execution_priority> section
+        3. Define 3 tiers: Safety/Critical→Core Workflow→Optimization
+        4. Add conflict_resolution rules
+        5. Document edge cases w/ examples
+      </process>
+      <template>
+        <execution_priority>
+          <tier level="1" desc="Safety & Critical Rules">
+            - @critical_rules (all rules)
+            - Safety gates & approvals
+          </tier>
+          <tier level="2" desc="Core Workflow">
+            - Primary workflow stages
+            - Delegation decisions
+          </tier>
+          <tier level="3" desc="Optimization">
+            - Performance enhancements
+            - Context management
+          </tier>
+          <conflict_resolution>
+            Tier 1 always overrides Tier 2/3
+            
+            Edge cases:
+            - [Specific case]: [Resolution]
+          </conflict_resolution>
+        </execution_priority>
+      </template>
+      <checkpoint>3-tier system defined, conflicts resolved, edge cases documented</checkpoint>
+    </stage>
+
+    <stage id="8" name="StandardizeFormatting">
+      <action>Ensure consistent attribute usage & XML structure</action>
+      <prerequisites>Priority system added</prerequisites>
+      <process>
+        1. Review all XML elements
+        2. Convert metadata→attributes (id, name, when, required, etc.)
+        3. Keep content in nested elements
+        4. Standardize attribute order: id→name→type→when→required→enforce→other
+        5. Verify XML validity
+      </process>
+      <standards>
+        <attributes_for>id, name, type, when, required, enforce, priority, scope</attributes_for>
+        <elements_for>descriptions, processes, examples, detailed content</elements_for>
+        <attribute_order>id→name→type→when→required→enforce→other</attribute_order>
+      </standards>
+      <checkpoint>Consistent formatting, attributes for metadata, elements for content</checkpoint>
+    </stage>
+
+    <stage id="9" name="EnhanceWorkflow">
+      <action>Transform linear instructions→multi-stage executable workflow</action>
+      <prerequisites>Formatting standardized</prerequisites>
+      <routing_decision>
+        <if condition="simple_prompt">Basic step-by-step w/ validation checkpoints</if>
+        <if condition="moderate_prompt">Multi-step workflow w/ decision points</if>
+        <if condition="complex_prompt">Full stage-based workflow w/ routing intelligence</if>
+      </routing_decision>
+      <process>
+        <simple>Convert to numbered steps→Add validation→Define outputs</simple>
+        <moderate>Structure as multi-step→Add decision trees→Define prereqs/outputs per step</moderate>
+        <complex>Create multi-stage→Implement routing→Add complexity assessment→Define context allocation→Add validation gates</complex>
+      </process>
+      <checkpoint>Workflow enhanced appropriately for complexity level</checkpoint>
+    </stage>
+
+    <stage id="10" name="ValidateOptimization">
+      <action>Validate against all research patterns + calculate gains</action>
+      <prerequisites>All optimization stages complete</prerequisites>
+      <validation_checklist>
+        <critical_position>✓ Critical rules <15%</critical_position>
+        <nesting_depth>✓ Max depth ≤4 levels</nesting_depth>
+        <instruction_ratio>✓ Instructions 40-50%</instruction_ratio>
+        <single_source>✓ No rule repeated >2x</single_source>
+        <explicit_priority>✓ 3-tier priority system exists</explicit_priority>
+        <consistent_format>✓ Attributes used consistently</consistent_format>
+        <modular_design>✓ External refs for verbose sections</modular_design>
+        <token_efficiency>✓ 30-50% token reduction achieved</token_efficiency>
+        <semantic_preservation>✓ 100% meaning preserved</semantic_preservation>
+      </validation_checklist>
+      <pattern_compliance>
+        <position_sensitivity>Critical rules positioned early (improves adherence)</position_sensitivity>
+        <nesting_reduction>Flattened structure (improves clarity)</nesting_reduction>
+        <repetition_consolidation>Single source of truth (reduces ambiguity)</repetition_consolidation>
+        <explicit_priority>Conflict resolution system (improves decision clarity)</explicit_priority>
+        <modular_design>External refs (reduces cognitive load)</modular_design>
+        <token_optimization>Visual operators + abbreviations + inline mappings (reduces tokens)</token_optimization>
+        <readability_maintained>Clarity preserved despite reduction (maintains usability)</readability_maintained>
+        <effectiveness_note>Actual improvements are model/task-specific; recommend A/B testing</effectiveness_note>
+      </pattern_compliance>
+      <scoring>
+        <before>Original score X/15</before>
+        <after>Optimized score Y/15 (target: 12+)</after>
+        <improvement>+Z points</improvement>
+      </scoring>
+      <checkpoint>Score 12+/15, all patterns compliant, gains calculated</checkpoint>
+    </stage>
+
+    <stage id="11" name="DeliverOptimized">
+      <action>Present optimized prompt w/ detailed analysis</action>
+      <prerequisites>Validation passed w/ 12+/15 score</prerequisites>
+      <output_format>
+        ## Optimization Analysis
+        
+        ### Token Efficiency
+        | Metric | Before | After | Reduction |
+        |--------|--------|-------|-----------|
+        | Lines | X | Y | Z% |
+        | Words | X | Y | Z% |
+        | Est. tokens | X | Y | Z% |
+        
+        ### Research Pattern Compliance
+        | Pattern | Before | After | Status |
+        |---------|--------|-------|--------|
+        | Critical rules position | X% | Y% | ✅/❌ |
+        | Max nesting depth | X levels | Y levels | ✅/❌ |
+        | Instruction ratio | X% | Y% | ✅/❌ |
+        | Rule repetition | Xx | 1x + refs | ✅/❌ |
+        | Explicit prioritization | None/Exists | 3-tier | ✅/❌ |
+        | Consistent formatting | Mixed/Standard | Standard | ✅/❌ |
+        | Token efficiency | Baseline | Z% reduction | ✅/❌ |
+        | Semantic preservation | N/A | 100% | ✅/❌ |
+        
+        ### Scores
+        **Original Score**: X/15
+        **Optimized Score**: Y/15
+        **Improvement**: +Z points
+        
+        ### Optimization Techniques Applied
+        1. **Visual Operators**: → for flow, | for alternatives (Z% reduction)
+        2. **Abbreviations**: req, ctx, exec, ops (Z% reduction)
+        3. **Inline Mappings**: key→value format (Z% reduction)
+        4. **@References**: Single source of truth (Z% reduction)
+        5. **Compact Examples**: Inline w/ context (Z% reduction)
+        6. **Critical Rules Elevated**: Moved from X% to Y% position
+        7. **Nesting Flattened**: Reduced from X to Y levels
+        8. **Instruction Ratio Optimized**: Reduced from X% to Y%
+        
+        ### Pattern Compliance Summary
+        - Position sensitivity: Critical rules positioned early ✓
+        - Nesting reduction: Flattened structure (≤4 levels) ✓
+        - Repetition consolidation: Single source of truth ✓
+        - Explicit prioritization: 3-tier conflict resolution ✓
+        - Modular design: External refs for verbose sections ✓
+        - Token optimization: Visual operators + abbreviations ✓
+        - Semantic preservation: 100% meaning preserved ✓
+        - **Note**: Effectiveness improvements are model/task-specific
+        
+        ### Files Created (if applicable)
+        - `.opencode/context/core/[name].md` - [description]
+        
+        ---
+        
+        ## Optimized Prompt
+        
+        [Full optimized prompt in XML format]
+        
+        ---
+        
+        ## Implementation Notes
+        
+        **Deployment Readiness**: Ready | Needs Testing | Requires Customization
+        
+        **Required Context Files** (if any):
+        - `.opencode/context/core/[file].md`
+        
+        **Breaking Changes**: None | [List if any]
+        
+        **Testing Recommendations**:
+        1. Verify @references work correctly
+        2. Test edge cases in conflict_resolution
+        3. Validate external context files load properly
+        4. Validate semantic preservation (compare behavior)
+        5. A/B test old vs new prompt effectiveness
+        
+        **Next Steps**:
+        1. Deploy w/ monitoring
+        2. Track effectiveness metrics
+        3. Iterate based on real-world performance
+      </output_format>
+    </stage>
+  </workflow_execution>
+</instructions>
+
+<proven_patterns>
+  <position_sensitivity>
+    <research>Stanford/Anthropic: Early instruction placement improves adherence (effect varies by task/model)</research>
+    <application>Move critical rules immediately after role definition</application>
+    <measurement>Calculate position %, target <15%</measurement>
+  </position_sensitivity>
+  
+  <nesting_depth>
+    <research>Excessive nesting reduces clarity (magnitude is task-dependent)</research>
+    <application>Flatten using attributes, extract to refs</application>
+    <measurement>Count max depth, target ≤4 levels</measurement>
+  </nesting_depth>
+  
+  <instruction_ratio>
+    <research>Optimal balance: 40-50% instructions, rest distributed</research>
+    <application>Extract verbose sections to external refs</application>
+    <measurement>Calculate instruction %, target 40-50%</measurement>
+  </instruction_ratio>
+  
+  <single_source_truth>
+    <research>Repetition causes ambiguity, reduces consistency</research>
+    <application>Define once, reference w/ @rule_id</application>
+    <measurement>Count repetitions, target 1x + refs</measurement>
+  </single_source_truth>
+  
+  <explicit_prioritization>
+    <research>Conflict resolution improves decision clarity (effect varies by task/model)</research>
+    <application>3-tier priority system w/ edge cases</application>
+    <measurement>Verify conflicts resolved, edge cases documented</measurement>
+  </explicit_prioritization>
+  
+  <token_optimization>
+    <research>Real-world learnings: Visual operators + abbreviations + inline mappings achieve 30-50% reduction w/ 100% semantic preservation</research>
+    <application>→ for flow, | for alternatives, @ for refs, systematic abbreviations, inline mappings</application>
+    <measurement>Count tokens before/after, validate semantic preservation, target 30-50% reduction</measurement>
+  </token_optimization>
+  
+  <component_ratios>
+    <context>15-25% hierarchical information</context>
+    <role>5-10% clear identity</role>
+    <task>5-10% primary objective</task>
+    <instructions>40-50% detailed procedures</instructions>
+    <examples>10-20% when needed</examples>
+    <principles>5-10% core values</principles>
+  </component_ratios>
+  
+  <xml_advantages>
+    - Improved response quality w/ descriptive tags (magnitude varies by model/task)
+    - Reduced token overhead for complex prompts (effect is task-dependent)
+    - Universal compatibility across models
+    - Explicit boundaries prevent context bleeding
+  </xml_advantages>
+</proven_patterns>
+
+<proven_transformations>
+  <example id="1" category="visual_operators">
+    <before>
+      Execution Pattern:
+      - IF delegating: Include context file path in session context for subagent
+      - IF direct execution: Load context file BEFORE starting work
+    </before>
+    <after>
+      Exec Pattern:
+      IF delegate: Pass ctx path in session
+      IF direct: Load ctx BEFORE work
+    </after>
+    <token_reduction>65%</token_reduction>
+  </example>
+  
+  <example id="2" category="inline_mapping">
+    <before>
+      Task-to-Context Mapping:
+      - Writing docs → .opencode/context/core/standards/docs.md
+      - Writing code → .opencode/context/core/standards/code.md
+      - Writing tests → .opencode/context/core/standards/tests.md
+    </before>
+    <after>
+      Task→Context Map:
+      docs→standards/docs.md | code→standards/code.md | tests→standards/tests.md
+    </after>
+    <token_reduction>70%</token_reduction>
+  </example>
+  
+  <example id="3" category="reference_consolidation">
+    <before>
+      <stage enforce="@critical_rules.approval_gate">
+      ...
+      <path enforce="@critical_rules.approval_gate">
+      ...
+      <principles>
+        <safe>Safety first - approval gates, context loading, stop on failure</safe>
+      </principles>
+    </before>
+    <after>
+      <stage enforce="@approval_gate">
+      ...
+      <path enforce="@approval_gate">
+      ...
+      <principles>
+        <safe enforce="@critical_rules">Safety first - all rules</safe>
+      </principles>
+    </after>
+    <token_reduction>40%</token_reduction>
+  </example>
+  
+  <example id="4" category="compact_examples">
+    <before>
+      Examples:
+      - "What does this code do?" (read only operation)
+      - "How do I use git rebase?" (informational question)
+      - "Explain this error message" (analysis request)
+    </before>
+    <after>
+      Examples: "What does this code do?" (read) | "How use git rebase?" (info) | "Explain error" (analysis)
+    </after>
+    <token_reduction>55%</token_reduction>
+  </example>
+</proven_transformations>
+
+<quality_standards>
+  <research_based>Stanford multi-instruction study + Anthropic XML research + validated optimization patterns + real-world token efficiency learnings</research_based>
+  <effectiveness_approach>Model/task-specific improvements; recommend empirical testing & A/B validation</effectiveness_approach>
+  <pattern_compliance>All research patterns must pass validation</pattern_compliance>
+  <token_efficiency>30-50% reduction w/ 100% semantic preservation</token_efficiency>
+  <readability_maintained>Clarity preserved despite reduction</readability_maintained>
+  <immediate_usability>Ready for deployment w/ monitoring plan</immediate_usability>
+  <backward_compatible>No breaking changes unless explicitly noted</backward_compatible>
+</quality_standards>
+
+<validation>
+  <pre_flight>
+    - Target file exists & readable
+    - Prompt content is valid XML or convertible
+    - Complexity assessable
+    - Token baseline measurable
+  </pre_flight>
+  <post_flight>
+    - Score 12+/15 on research patterns + token efficiency
+    - All Tier 1 optimizations applied
+    - Pattern compliance validated
+    - Token reduction 30-50% achieved
+    - Semantic preservation 100% validated
+    - Testing recommendations provided
+  </post_flight>
+</validation>
+
+<principles>
+  <research_first>Every optimization grounded in Stanford/Anthropic research + real-world learnings</research_first>
+  <tier1_priority>Position sensitivity, nesting, ratio, token efficiency are non-negotiable</tier1_priority>
+  <pattern_validation>Validate compliance w/ research-backed patterns</pattern_validation>
+  <semantic_preservation>100% meaning preserved - zero loss tolerance</semantic_preservation>
+  <readability_balance>Token reduction must NOT sacrifice clarity</readability_balance>
+  <honest_assessment>Effectiveness improvements are model/task-specific; avoid universal % claims</honest_assessment>
+  <testing_required>Always recommend empirical validation & A/B testing for specific use cases</testing_required>
+</principles>
+
+<references>
+  <optimization_report ref=".opencode/context/core/prompt-optimization-report.md">
+    Detailed before/after metrics from OpenAgent optimization
+  </optimization_report>
+  <research_patterns ref="docs/agents/research-backed-prompt-design.md">
+    Validated patterns w/ model/task-specific effectiveness improvements
+  </research_patterns>
+</references>

+ 2 - 2
.opencode/command/validate-repo.md

@@ -252,7 +252,7 @@ Generated: 2025-11-19 14:30:00
    - Action: Create file or remove from registry
 
 2. **Broken Dependency**
-   - Component: `agent:codebase-agent`
+   - Component: `agent:opencoder`
    - Dependency: `subagent:pattern-matcher`
    - Issue: Dependency not found in registry
    - Action: Add missing subagent or fix dependency reference
@@ -296,7 +296,7 @@ Generated: 2025-11-19 14:30:00
 
 ### High Priority (Errors)
 1. Create missing file: `.opencode/context/core/advanced-patterns.md`
-2. Fix broken dependency in `codebase-agent`
+2. Fix broken dependency in `opencoder`
 
 ### Medium Priority (Warnings)
 1. Remove orphaned file or add to registry

+ 131 - 85
.opencode/plugin/README.md

@@ -1,117 +1,163 @@
-# OpenCode Telegram Plugin
+# Agent Validator Plugin
 
-Simple Telegram notifications for OpenCode sessions.
+Validates that OpenAgent follows its defined prompt rules and execution patterns.
 
-## Files
+## Features
 
-- **`telegram-notify.ts`** - OpenCode plugin for session events
-- **`notify.ts`** - Simple system notification plugin (uses `say`)
-- **`telegram-bot.ts`** - Telegram bot implementation
-- **`package.json`** - Dependencies and scripts
-- **`tsconfig.json`** - TypeScript configuration
+- ✅ Tracks tool usage in real time
+- ✅ Validates approval gate enforcement
+- ✅ Checks lazy context loading
+- ✅ Analyzes delegation decisions (4+ file rule)
+- ✅ Detects critical rule violations (auto-fix attempts)
 
-## Features
+## Available Tools
+
+### `validate_session`
+Validate the current agent session against defined rules.
 
-- 🕐 Session idle detection and notifications
-- 📱 Telegram messages for session events
-- 📝 Last message capture and forwarding
-- 🚀 Session start/end tracking
-- ✅ Task completion notifications
-- ❌ Error notifications
-- 🛡️ Automatic .env file loading
-- 💬 Commands: `/send-last`, `/send-to-phone`
-
-## Usage
-
-### As OpenCode Plugin
-```javascript
-// The plugin automatically responds to session events
-import { TelegramNotify } from "./telegram-notify.js"
+```bash
+validate_session
 ```
 
-**Commands you can use in OpenCode:**
-- `/send-last` - Send the last message to Telegram
-- `/send-to-phone` - Send the last message to your phone
-- `/last` - Same as `/send-last`
-- `/phone` - Same as `/send-to-phone`
+**Options:**
+- `include_details` (boolean, optional) - Include detailed evidence for each check
+
+**Returns:** Validation report with compliance score
+
+---
+
+### `check_approval_gates`
+Check if approval gates were properly enforced before execution operations.
 
-### Standalone Bot
 ```bash
-# Run the bot directly
-bun telegram-bot.ts
+check_approval_gates
+```
+
+**Returns:** Approval gate compliance status
 
-# Test the plugin
-bun telegram-notify.ts
+---
+
+### `export_validation_report`
+Export a comprehensive validation report to a markdown file.
+
+```bash
+export_validation_report
 ```
 
-### Setup
+**Options:**
+- `output_path` (string, optional) - Path to save the report (defaults to `.tmp/validation-{sessionID}.md`)
 
-1. **Create a Telegram Bot**
-   - Message @BotFather on Telegram
-   - Create a new bot with `/newbot`
-   - Save the bot token
+**Returns:** Path to exported report + summary
 
-2. **Get Your Chat ID**
-   - Start a chat with your bot
-   - Send a message to the bot
-   - Visit: `https://api.telegram.org/bot<YOUR_BOT_TOKEN>/getUpdates`
-   - Find your `chat_id` in the response
+---
 
-3. **Configure Environment Variables**
-   ```bash
-   export TELEGRAM_BOT_TOKEN="your_bot_token_here"
-   export TELEGRAM_CHAT_ID="your_chat_id_here"
-   ```
+### `analyze_delegation`
+Analyze whether delegation decisions followed the 4+ file rule.
 
-4. **Or Update Configuration**
-   Edit `.opencode/plugin/telegram-config.json`:
-   ```json
-   {
-     "telegramIdle": {
-       "enabled": true,
-       "botToken": "your_bot_token_here",
-       "chatId": "your_chat_id_here"
-     }
-   }
-   ```
+```bash
+analyze_delegation
+```
 
-### Usage
+**Returns:** Delegation analysis with file count statistics
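
The underlying check can be sketched as a pure function over recorded tool calls. The call shape below is an assumption for illustration; the real plugin derives this data from SDK events:

```typescript
// 4+ file rule: touching four or more distinct files without a `task`
// (delegation) call is flagged as non-compliant.
interface ToolCall {
  tool: string;
  file?: string; // target file for write/edit/read calls
}

function checkDelegation(calls: ToolCall[]) {
  const files = new Set(calls.flatMap((c) => (c.file ? [c.file] : [])));
  const delegated = calls.some((c) => c.tool === "task");
  return {
    fileCount: files.size,
    delegated,
    compliant: files.size < 4 || delegated,
  };
}
```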
 
-The plugin automatically initializes when OpenCode starts. It will:
+---
 
-- Monitor session activity
-- Send idle notifications after 5 minutes of inactivity
-- Send resume notifications when activity resumes
-- Clean up resources on session end
+## Validation Rules
 
-### Customization
+The plugin checks for:
 
-You can customize the plugin behavior by modifying the configuration:
+1. **approval_gate_enforcement** - Did agent request approval before bash/write/edit/task?
+2. **stop_on_failure** - Did agent stop on errors or try to auto-fix?
+3. **lazy_context_loading** - Did agent only load context files when needed?
+4. **delegation_appropriateness** - Did agent delegate when 4+ files involved?
+5. **tool_usage** - Track all tool calls for analysis
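
As a sketch, a rule like `approval_gate_enforcement` reduces to a pure function over the tracked event stream. The event shape and approval phrasing below are illustrative assumptions, not the plugin's actual internals:

```typescript
// Illustrative event shape -- the real plugin builds these from SDK hooks.
interface TrackedEvent {
  kind: "message" | "tool";
  name: string;   // tool name (e.g. "bash") or message role
  text?: string;  // message text, scanned for approval phrasing
}

const GATED_TOOLS = new Set(["bash", "write", "edit", "task"]);
const APPROVAL_PATTERN = /\bapprov(al|e)\b/i; // assumed phrasing, tune per agent

// Returns gated tool calls that were not preceded by an approval request.
function findApprovalViolations(events: TrackedEvent[]): TrackedEvent[] {
  let approvalRequested = false;
  const violations: TrackedEvent[] = [];
  for (const ev of events) {
    if (ev.kind === "message" && ev.text && APPROVAL_PATTERN.test(ev.text)) {
      approvalRequested = true;
    } else if (ev.kind === "tool" && GATED_TOOLS.has(ev.name) && !approvalRequested) {
      violations.push(ev);
    }
  }
  return violations;
}
```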
 
-- `idleTimeout`: Time in milliseconds before considering session idle
-- `checkInterval`: How often to check for idle state
-- `messages`: Customize notification messages
+## Usage Examples
 
-### Integration with OpenCode
+### Basic Validation
+```
+You: "Create a new API endpoint"
+[Agent works on task]
+You: "validate_session"
+```
 
-To integrate this plugin with OpenCode's event system, you would need to:
+### Check Approval Compliance
+```
+You: "Run the tests"
+Agent: "Approval needed before proceeding."
+You: "Approved. Also check_approval_gates"
+```
 
-1. Hook into OpenCode's activity tracking events
-2. Call `handleActivity()` when user interacts with OpenCode
-3. Call `init()` when OpenCode session starts
-4. Call `cleanup()` when OpenCode session ends
+### Export Report
+```
+You: "We just finished refactoring. Export validation report"
+Agent: [Exports to .tmp/validation-{sessionID}.md]
+```
 
-### Testing
+## Installation
 
-Test the plugin independently:
+The plugin auto-loads from `.opencode/plugin/` when OpenCode starts.
 
+**Install dependencies:**
 ```bash
-node .opencode/plugin/telegram-idle.js
+cd .opencode/plugin
+npm install
+# or
+bun install
 ```
 
-### Troubleshooting
+## How It Works
+
+1. **Event Tracking** - Hooks into OpenCode SDK events:
+   - `session.message.created`
+   - `tool.execute.before`
+   - `tool.execute.after`
+
+2. **Behavior Analysis** - Analyzes messages for:
+   - Tool invocations
+   - Approval language
+   - Context file reads
+   - Delegation patterns
+
+3. **Validation** - Compares actual behavior against OpenAgent rules
+
+4. **Reporting** - Generates compliance reports with scores and evidence
+
+## Compliance Scoring
+
+- **100%** - Perfect compliance
+- **90-99%** - Excellent (minor warnings)
+- **80-89%** - Good (some warnings)
+- **70-79%** - Fair (multiple warnings)
+- **<70%** - Needs improvement (errors or many warnings)
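
These bands translate directly into a small helper; the labels below paraphrase the list above and are not necessarily the plugin's exact strings:

```typescript
// Map a compliance percentage to the bands listed above.
function complianceBand(score: number): string {
  if (score >= 100) return "Perfect compliance";
  if (score >= 90) return "Excellent";
  if (score >= 80) return "Good";
  if (score >= 70) return "Fair";
  return "Needs improvement";
}
```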
+
+## Troubleshooting
+
+### "No execution operations tracked"
+- Plugin just loaded, no prior tracking
+- Run a task first, then validate
+
+### "Error fetching session"
+- Check OpenCode SDK connection
+- Verify session ID is valid
+
+### False positives on approval gates
+- Agent may use different approval phrasing
+- Check `approvalKeywords` in plugin code
+- Add custom patterns if needed
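
A custom pattern can be added alongside the defaults; `approvalKeywords` here is illustrative and may not match the exact variable name in `agent-validator.ts`:

```typescript
// Hypothetical pattern list mirroring the plugin's approval detection.
const approvalKeywords: RegExp[] = [
  /\bapproval (needed|required)\b/i,
  /\bwaiting for (your )?approval\b/i,
  /\bshall I proceed\b/i, // project-specific phrasing added here
];

function looksLikeApprovalRequest(text: string): boolean {
  return approvalKeywords.some((pattern) => pattern.test(text));
}
```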
+
+## Customization
+
+Edit `agent-validator.ts` to:
+- Add custom validation rules
+- Modify approval detection patterns
+- Adjust delegation thresholds
+- Change severity levels
+
+## Next Steps
 
-- **"Bot token not configured"**: Set `TELEGRAM_BOT_TOKEN` environment variable
-- **"Chat ID not configured"**: Set `TELEGRAM_CHAT_ID` environment variable
-- **"Failed to send message"**: Check bot token and chat ID are correct
-- **No notifications**: Ensure bot is started and chat is active
+1. Test with simple sessions
+2. Identify false positives/negatives
+3. Refine validation logic
+4. Add project-specific rules
+5. Integrate into OpenAgent workflow

File diff suppressed because it is too large
+ 1086 - 0
.opencode/plugin/agent-validator.ts


+ 11 - 13
.opencode/plugin/bun.lock

@@ -2,30 +2,28 @@
   "lockfileVersion": 1,
   "workspaces": {
     "": {
-      "name": "opencode-telegram-plugin",
+      "name": "opencode-plugins",
       "dependencies": {
-        "@opencode-ai/plugin": "^0.5.1",
+        "@opencode-ai/plugin": "latest",
+        "@opencode-ai/sdk": "latest",
       },
       "devDependencies": {
-        "@opencode-ai/plugin": "^0.5.1",
-        "@types/node": "^24.2.1",
-        "bun-types": "latest",
+        "@types/node": "^24.10.1",
+        "typescript": "^5.9.3",
       },
     },
   },
   "packages": {
-    "@opencode-ai/plugin": ["@opencode-ai/plugin@0.5.1", "", { "dependencies": { "@opencode-ai/sdk": "0.4.19" } }, "sha512-dhVybeWgn3ulakZC9lD/Ar4PNWSFTLgAXjtRQYGsUQ1NE7w7pHI9VCGSsg0ejzYWwf4JqALkmTRLnEAuFFj84g=="],
+    "@opencode-ai/plugin": ["@opencode-ai/plugin@1.0.78", "", { "dependencies": { "@opencode-ai/sdk": "1.0.78", "zod": "4.1.8" } }, "sha512-FxwtRdpgxJO6jinypkefC/qh4OCQF+10t53HJlM6hRIIOARvZfF4nPRdlcc8raG1OmzRiGCeohENWTYHvsOZ+g=="],
 
-    "@opencode-ai/sdk": ["@opencode-ai/sdk@0.4.19", "", {}, "sha512-7V+wDR1+m+TQZAraAh/bOSObiA/uysG1YIXZVe6gl1sQAXDtkG2FYCzs0gTZ/ORdkUKEnr3vyQIk895Mu0CC/w=="],
+    "@opencode-ai/sdk": ["@opencode-ai/sdk@1.0.78", "", {}, "sha512-oEsVmNw/GmlHsnckueATrdzKhzJUhp0mursyHKoXb8aY2oH/GbLoJFPU2n6DDFS6PhEHTNsbR39N1RGCS+yfnA=="],
 
-    "@types/node": ["@types/node@24.2.1", "", { "dependencies": { "undici-types": "~7.10.0" } }, "sha512-DRh5K+ka5eJic8CjH7td8QpYEV6Zo10gfRkjHCO3weqZHWDtAaSTFtl4+VMqOJ4N5jcuhZ9/l+yy8rVgw7BQeQ=="],
+    "@types/node": ["@types/node@24.10.1", "", { "dependencies": { "undici-types": "~7.16.0" } }, "sha512-GNWcUTRBgIRJD5zj+Tq0fKOJ5XZajIiBroOF0yvj2bSU1WvNdYS/dn9UxwsujGW4JX06dnHyjV2y9rRaybH0iQ=="],
 
-    "@types/react": ["@types/react@19.1.10", "", { "dependencies": { "csstype": "^3.0.2" } }, "sha512-EhBeSYX0Y6ye8pNebpKrwFJq7BoQ8J5SO6NlvNwwHjSj6adXJViPQrKlsyPw7hLBLvckEMO1yxeGdR82YBBlDg=="],
+    "typescript": ["typescript@5.9.3", "", { "bin": { "tsc": "bin/tsc", "tsserver": "bin/tsserver" } }, "sha512-jl1vZzPDinLr9eUt3J/t7V6FgNEw9QjvBPdysz9KfQDD41fQrC2Y4vKQdiaUpFT4bXlb1RHhLpp8wtm6M5TgSw=="],
 
-    "bun-types": ["bun-types@1.2.20", "", { "dependencies": { "@types/node": "*" }, "peerDependencies": { "@types/react": "^19" } }, "sha512-pxTnQYOrKvdOwyiyd/7sMt9yFOenN004Y6O4lCcCUoKVej48FS5cvTw9geRaEcB9TsDZaJKAxPTVvi8tFsVuXA=="],
+    "undici-types": ["undici-types@7.16.0", "", {}, "sha512-Zz+aZWSj8LE6zoxD+xrjh4VfkIG8Ya6LvYkZqtUQGJPZjYl53ypCaUwWqo7eI0x66KBGeRo+mlBEkMSeSZ38Nw=="],
 
-    "csstype": ["csstype@3.1.3", "", {}, "sha512-M1uQkMl8rQK/szD0LNhtqxIPLpimGm8sOBwU7lLnCpSbTyY3yeU1Vc7l4KT5zT4s/yOxHH5O7tIuuLOCnLADRw=="],
-
-    "undici-types": ["undici-types@7.10.0", "", {}, "sha512-t5Fy/nfn+14LuOc2KNYg75vZqClpAiqscVvMygNnlsHBFpSXdJaYtXMcdNLpl/Qvc3P2cB3s6lOV51nqsFq4ag=="],
+    "zod": ["zod@4.1.8", "", {}, "sha512-5R1P+WwQqmmMIEACyzSvo4JXHY5WiAFHRMg+zBZKgKS+Q1viRa0C1hmUKtHltoIFKtIdki3pRxkmpP74jnNYHQ=="],
   }
 }

+ 902 - 0
.opencode/plugin/docs/VALIDATOR_GUIDE.md

@@ -0,0 +1,902 @@
+# Agent Validator Plugin - Management Guide
+
+## Overview
+
+The Agent Validator Plugin is a real-time monitoring and validation system for OpenCode agents. It tracks agent behavior, validates compliance with defined rules, and provides detailed reports on how agents execute tasks.
+
+### What It Does
+
+- **Tracks agent activity** - Monitors which agents are active and what tools they use
+- **Validates approval gates** - Ensures agents request approval before executing operations
+- **Analyzes context loading** - Checks if agents load required context files before tasks
+- **Monitors delegation** - Validates delegation decisions follow the 4+ file rule
+- **Detects violations** - Identifies critical rule violations (auto-fix attempts, missing approvals)
+- **Generates reports** - Creates comprehensive validation reports with compliance scores
+
+### Why Use It
+
+- **Verify agent behavior** - Confirm agents follow their defined prompts
+- **Debug issues** - Understand what agents are doing and why
+- **Track compliance** - Ensure critical safety rules are enforced
+- **Improve prompts** - Identify patterns that need refinement
+- **Multi-agent tracking** - Monitor agent switches and delegation flows
+
+---
+
+## Quick Start
+
+### Installation
+
+The plugin auto-loads from `.opencode/plugin/` when OpenCode starts.
+
+**Install dependencies:**
+```bash
+cd ~/.opencode/plugin
+npm install
+# or
+bun install
+```
+
+**Verify installation:**
+```bash
+opencode --agent openagent
+> "analyze_agent_usage"
+```
+
+If you see agent tracking data, the plugin is working! ✅
+
+### Your First Validation
+
+1. **Start a session and do some work:**
+   ```bash
+   opencode --agent openagent
+   > "Run pwd command"
+   Agent: [requests approval]
+   > "proceed"
+   ```
+
+2. **Check what was tracked:**
+   ```bash
+   > "analyze_agent_usage"
+   ```
+
+3. **Validate compliance:**
+   ```bash
+   > "validate_session"
+   ```
+
+---
+
+## Available Tools
+
+The plugin provides 8 validation tools:
+
+### 1. `analyze_agent_usage`
+
+**Purpose:** Show which agents were active and what tools they used
+
+**Usage:**
+```bash
+analyze_agent_usage
+```
+
+**Example Output:**
+```
+## Agent Usage Report
+
+**Agents detected:** 2
+**Total events:** 7
+
+### openagent
+**Active duration:** 133s
+**Events:** 5
+**Tools used:**
+- bash: 2x
+- read: 1x
+- analyze_agent_usage: 2x
+
+### build
+**Active duration:** 0s
+**Events:** 2
+**Tools used:**
+- bash: 2x
+```
+
+**When to use:**
+- After agent switches to verify tracking
+- To see tool usage patterns
+- To debug which agent did what
+
+---
+
+### 2. `validate_session`
+
+**Purpose:** Comprehensive validation of agent behavior against defined rules
+
+**Usage:**
+```bash
+validate_session
+# or with details
+validate_session --include_details true
+```
+
+**Example Output:**
+```
+## Validation Report
+
+**Score:** 95%
+- ✅ Passed: 18
+- ⚠️  Warnings: 1
+- ❌ Failed: 0
+
+### ⚠️  Warnings
+- **delegation_appropriateness**: Delegated but only 2 files (< 4 threshold)
+```
+
+**What it checks:**
+- Approval gate enforcement
+- Tool usage patterns
+- Context loading behavior
+- Delegation appropriateness
+- Critical rule compliance
+
+**When to use:**
+- After completing a complex task
+- To verify agent followed its prompt
+- Before finalizing work
+- When debugging unexpected behavior
+
+---
+
+### 3. `check_approval_gates`
+
+**Purpose:** Verify approval gates were enforced before execution operations
+
+**Usage:**
+```bash
+check_approval_gates
+```
+
+**Example Output:**
+```
+✅ Approval gate compliance: PASSED
+
+All 3 execution operation(s) were properly approved.
+```
+
+**Or if violations found:**
+```
+⚠️ Approval gate compliance: FAILED
+
+Executed 2 operation(s) without approval:
+  - bash
+  - write
+
+Critical rule violated: approval_gate
+```
+
+**When to use:**
+- After bash/write/edit/task operations
+- To verify safety compliance
+- When auditing agent behavior
+
+---
+
+### 4. `analyze_context_reads`
+
+**Purpose:** Show all context files that were read during the session
+
+**Usage:**
+```bash
+analyze_context_reads
+```
+
+**Example Output:**
+```
+## Context Files Read
+
+**Total reads:** 3
+
+### Files Read:
+- **code.md** (2 reads)
+  `.opencode/context/core/standards/code.md`
+- **delegation.md** (1 read)
+  `.opencode/context/core/workflows/delegation.md`
+
+### Timeline:
+1. [10:23:45] code.md
+2. [10:24:12] delegation.md
+3. [10:25:01] code.md
+```
+
+**When to use:**
+- To verify agent loaded required context
+- To understand which standards were applied
+- To debug context loading issues
+
+---
+
+### 5. `check_context_compliance`
+
+**Purpose:** Verify required context files were read BEFORE executing tasks
+
+**Usage:**
+```bash
+check_context_compliance
+```
+
+**Example Output:**
+```
+## Context Loading Compliance
+
+**Score:** 100%
+- ✅ Compliant: 2
+- ⚠️  Non-compliant: 0
+
+### ✅ Compliant Actions:
+- ✅ Loaded standards/code.md before code writing
+- ✅ Loaded workflows/delegation.md before delegation
+
+### Context Loading Rules:
+According to OpenAgent prompt, the agent should:
+1. Detect task type from user request
+2. Read required context file FIRST
+3. Then execute task following those standards
+
+**Pattern:** "Fetch context BEFORE starting work, not during or after"
+```
+
+**Context loading rules:**
+- Writing code → should read `standards/code.md`
+- Writing docs → should read `standards/docs.md`
+- Writing tests → should read `standards/tests.md`
+- Code review → should read `workflows/review.md`
+- Delegating → should read `workflows/delegation.md`
+
+**When to use:**
+- To verify lazy loading is working
+- To ensure standards are being followed
+- To debug why agent isn't following patterns
+
+---
+
+### 6. `analyze_delegation`
+
+**Purpose:** Analyze delegation decisions against the 4+ file rule
+
+**Usage:**
+```bash
+analyze_delegation
+```
+
+**Example Output:**
+```
+## Delegation Analysis
+
+**Total delegations:** 3
+- ✅ Appropriate: 2
+- ⚠️  Questionable: 1
+
+**File count per delegation:**
+- Average: 4.3 files
+- Range: 2 - 6 files
+- Threshold: 4+ files
+```
+
+**When to use:**
+- After complex multi-file tasks
+- To verify delegation logic
+- To tune delegation thresholds
+
+---
+
+### 7. `debug_validator`
+
+**Purpose:** Inspect what the validator is tracking (for debugging)
+
+**Usage:**
+```bash
+debug_validator
+```
+
+**Example Output:**
+```
+## Debug Information
+
+{
+  "sessionID": "abc123...",
+  "behaviorLogEntries": 7,
+  "behaviorLogSampleFirst": [
+    {
+      "timestamp": 1700000000000,
+      "agent": "openagent",
+      "event": "tool_executed",
+      "data": { "tool": "bash" }
+    }
+  ],
+  "behaviorLogSampleLast": [...],
+  "messagesCount": 5,
+  "toolTracker": {
+    "approvalRequested": true,
+    "toolsExecuted": ["bash", "read"]
+  },
+  "allBehaviorLogs": 7
+}
+
+**Analysis:**
+- Behavior log entries for this session: 7
+- Total behavior log entries: 7
+- Messages in session: 5
+- Tool execution tracker: Active
+```
+
+**When to use:**
+- When validation tools aren't working as expected
+- To see raw tracking data
+- To debug plugin issues
+- To understand internal state
+
+---
+
+### 8. `export_validation_report`
+
+**Purpose:** Export comprehensive validation report to a markdown file
+
+**Usage:**
+```bash
+export_validation_report
+# or specify path
+export_validation_report --output_path ./reports/validation.md
+```
+
+**Example Output:**
+```
+✅ Validation report exported to: .tmp/validation-abc12345.md
+
+## Validation Report
+[... summary ...]
+```
+
+**Generated report includes:**
+- Full validation summary
+- Detailed checks with evidence
+- Tool usage timeline
+- Context loading analysis
+- Delegation decisions
+- Compliance scores
+
+**When to use:**
+- To save validation results for review
+- To share compliance reports
+- To track agent behavior over time
+- For auditing purposes
+
+---
+
+## Understanding Results
+
+### Compliance Scores
+
+- **100%** - Perfect compliance ✅
+- **90-99%** - Excellent (minor warnings) 🟢
+- **80-89%** - Good (some warnings) 🟡
+- **70-79%** - Fair (multiple warnings) 🟠
+- **<70%** - Needs improvement (errors) 🔴
+
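+As a rough sketch of how such a score falls out (the exact weighting lives in `agent-validator.ts`; the names here are illustrative, not the plugin's actual API), it is essentially the share of checks that passed:
+
+```typescript
+// Hypothetical scoring sketch - the real logic lives in agent-validator.ts
+function complianceScore(checks: { passed: boolean }[]): number {
+  if (checks.length === 0) return 0
+  const passed = checks.filter((c) => c.passed).length
+  return Math.round((passed / checks.length) * 100)
+}
+```
+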
+### Severity Levels
+
+- **✅ Info** - Informational, no issues
+- **⚠️  Warning** - Non-critical issue, should review
+- **❌ Error** - Critical rule violation, must fix
+
+### Common Validation Checks
+
+| Check | What It Validates | Pass Criteria |
+|-------|------------------|---------------|
+| `approval_gate_enforcement` | Approval requested before execution | Approval language found before bash/write/edit/task |
+| `stop_on_failure` | No auto-fix after errors | Agent stops and reports errors instead of fixing |
+| `lazy_context_loading` | Context loaded only when needed | Context files read match task requirements |
+| `delegation_appropriateness` | Delegation follows 4+ file rule | Delegated when 4+ files, or didn't delegate when <4 |
+| `context_loading_compliance` | Context loaded BEFORE execution | Required context file read before task execution |
+| `tool_usage` | Tool calls tracked | All tool invocations logged |
+
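+Each row in this table corresponds to a check record. Judging from the `checks.push` customization example later in this guide, a check can be modeled roughly as follows (illustrative shape only):
+
+```typescript
+// Illustrative shape - field names mirror the checks.push example in Advanced Usage
+interface ValidationCheck {
+  rule: string                              // e.g. "approval_gate_enforcement"
+  passed: boolean
+  severity: "info" | "warning" | "error"
+  details: string                           // human-readable evidence
+}
+```
+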
+---
+
+## Common Workflows
+
+### Workflow 1: Verify Agent Behavior After Task
+
+**Scenario:** You asked the agent to implement a feature and want to verify it followed its rules.
+
+```bash
+# 1. Complete your task
+> "Create a user authentication system"
+[Agent works...]
+
+# 2. Check what agents were involved
+> "analyze_agent_usage"
+
+# 3. Validate compliance
+> "validate_session"
+
+# 4. Check specific concerns
+> "check_approval_gates"
+> "check_context_compliance"
+
+# 5. Export report if needed
+> "export_validation_report"
+```
+
+---
+
+### Workflow 2: Debug Agent Switching
+
+**Scenario:** You want to verify the plugin tracks agent switches correctly.
+
+```bash
+# 1. Start with one agent
+opencode --agent openagent
+> "Run pwd"
+> "proceed"
+
+# 2. Switch to another agent (manually or via delegation)
+# [Switch happens]
+
+# 3. Check tracking
+> "analyze_agent_usage"
+
+# Expected: Shows both agents with their respective tools
+```
+
+---
+
+### Workflow 3: Audit Context Loading
+
+**Scenario:** You want to ensure the agent is loading the right context files.
+
+```bash
+# 1. Ask agent to do a task that requires context
+> "Write a new API endpoint following our standards"
+[Agent works...]
+
+# 2. Check what context was loaded
+> "analyze_context_reads"
+
+# 3. Verify compliance
+> "check_context_compliance"
+
+# Expected: Should show standards/code.md was read BEFORE writing
+```
+
+---
+
+### Workflow 4: Test Approval Gates
+
+**Scenario:** Verify the agent always requests approval before execution.
+
+```bash
+# 1. Ask for an execution operation
+> "Delete all .log files"
+
+# 2. Agent should request approval
+# Agent: "Approval needed before proceeding."
+
+# 3. Approve
+> "proceed"
+
+# 4. Verify compliance
+> "check_approval_gates"
+
+# Expected: ✅ Approval gate compliance: PASSED
+```
+
+---
+
+### Workflow 5: Monitor Delegation Decisions
+
+**Scenario:** Check if agent delegates appropriately for complex tasks.
+
+```bash
+# 1. Give a complex multi-file task
+> "Refactor the authentication module across 5 files"
+[Agent works...]
+
+# 2. Check delegation
+> "analyze_delegation"
+
+# Expected: Should show delegation was appropriate (5 files >= 4 threshold)
+```
+
+---
+
+## Troubleshooting
+
+### Issue: "No agent activity tracked yet in this session"
+
+**Cause:** Plugin just loaded, no tracking data yet
+
+**Solution:**
+1. Perform some actions (bash, read, write, etc.)
+2. Then run validation tools
+3. Plugin tracks from session start, so early checks may show no data
+
+---
+
+### Issue: "No execution operations tracked in this session"
+
+**Cause:** No bash/write/edit/task operations performed yet
+
+**Solution:**
+1. Run a command that requires execution (e.g., "run pwd")
+2. Then check approval gates
+3. Read-only operations (read, list) don't trigger approval gates
+
+---
+
+### Issue: False positive on approval gate violations
+
+**Cause:** Agent used different approval phrasing than expected
+
+**Solution:**
+1. Check the approval keywords in `agent-validator.ts` (lines 12-22)
+2. Add custom patterns if your agent uses different phrasing
+3. Current keywords: "approval", "approve", "proceed", "confirm", "permission", etc.
+
+**Example customization:**
+```typescript
+const approvalKeywords = [
+  "approval",
+  "approve",
+  "proceed",
+  "confirm",
+  "permission",
+  "before proceeding",
+  "should i",
+  "may i",
+  "can i proceed",
+  // Add your custom patterns:
+  "ready to execute",
+  "waiting for go-ahead",
+]
+```
+
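+With the keywords defined, detection is typically just a case-insensitive substring scan over the agent's message. A minimal sketch (the plugin's real matching logic may differ):
+
+```typescript
+// Hypothetical matcher sketch - not the plugin's exact implementation
+function requestedApproval(message: string, keywords: string[]): boolean {
+  const text = message.toLowerCase()
+  return keywords.some((keyword) => text.includes(keyword.toLowerCase()))
+}
+```
+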
+---
+
+### Issue: Context compliance shows warnings but files were read
+
+**Cause:** Timing issue - context read after task started
+
+**Solution:**
+1. Verify agent reads context BEFORE execution (not during/after)
+2. Check timeline in `analyze_context_reads`
+3. Agent should follow: Detect task → Read context → Execute
+
+---
+
+### Issue: Agent switches not tracked
+
+**Cause:** Agent name not properly captured
+
+**Solution:**
+1. Run `debug_validator` to see raw tracking data
+2. Check `sessionAgentTracker` in debug output
+3. Verify agent name is being passed in `chat.message` hook
+
+---
+
+### Issue: Validation report shows 0% score
+
+**Cause:** No validation checks were performed
+
+**Solution:**
+1. Ensure you've performed actions that trigger checks
+2. Run `debug_validator` to see what's tracked
+3. Try a simple task first (e.g., "run pwd")
+
+---
+
+## Advanced Usage
+
+### Customizing Validation Rules
+
+Edit `.opencode/plugin/agent-validator.ts` to customize:
+
+**1. Add custom approval keywords:**
+```typescript
+// Line 12-22
+const approvalKeywords = [
+  "approval",
+  "approve",
+  // Add yours:
+  "your custom phrase",
+]
+```
+
+**2. Adjust delegation threshold:**
+```typescript
+// Line 768
+const shouldDelegate = writeEditCount >= 4  // Change 4 to your threshold
+```
+
+**3. Add custom context loading rules:**
+```typescript
+// Line 824-851
+const contextRules = [
+  {
+    taskKeywords: ["your task type"],
+    requiredFile: "your/context/file.md",
+    taskType: "your task name"
+  },
+  // ... existing rules
+]
+```
+
+**4. Change severity levels:**
+```typescript
+// Line 719-726
+checks.push({
+  rule: "your_rule",
+  passed: condition,
+  severity: "error",  // Change to "warning" or "info"
+  details: "Your message",
+})
+```
+
+---
+
+### Integration with CI/CD
+
+Export validation reports in automated workflows:
+
+```bash
+#!/bin/bash
+# validate-agent-session.sh
+
+# Run OpenCode task
+opencode --agent openagent --input "Build the feature"
+
+# Export validation report
+opencode --agent openagent --input "export_validation_report --output_path ./reports/validation.md"
+
+# Check exit code (if validation fails)
+if grep -q "❌ Failed: [1-9]" ./reports/validation.md; then
+  echo "Validation failed!"
+  exit 1
+fi
+
+echo "Validation passed!"
+```
+
+---
+
+### Creating Custom Validation Tools
+
+Add new tools to the plugin:
+
+```typescript
+// In agent-validator.ts, add to tool object:
+your_custom_tool: tool({
+  description: "Your tool description",
+  args: {
+    your_arg: tool.schema.string().optional(),
+  },
+  async execute(args, context) {
+    const { sessionID } = context
+    
+    // Your validation logic here
+    const result = analyzeYourMetric(sessionID)
+    
+    return formatYourReport(result)
+  },
+}),
+```
+
+---
+
+### Tracking Custom Events
+
+Add custom event tracking:
+
+```typescript
+// In the event() hook:
+async event(input) {
+  const { event } = input
+  
+  // Track your custom event
+  if (event.type === "your.custom.event") {
+    behaviorLog.push({
+      timestamp: Date.now(),
+      sessionID: event.properties.sessionID,
+      agent: event.properties.agent || "unknown",
+      event: "your_custom_event",
+      data: {
+        // Your custom data
+      },
+    })
+  }
+}
+```
+
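+Once entries land in `behaviorLog`, a custom validation tool can filter them per session. A minimal helper sketch, assuming the entry shape used in the example above:
+
+```typescript
+// Hypothetical helper - assumes the behaviorLog entry shape from the example above
+function eventsForSession(
+  log: Array<{ sessionID: string; event: string }>,
+  sessionID: string,
+) {
+  return log.filter((entry) => entry.sessionID === sessionID)
+}
+```
+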
+---
+
+## Real-World Examples
+
+### Example 1: Testing Agent Tracking
+
+**Session:**
+```bash
+$ opencode --agent openagent
+
+> "Help me test this plugin, I am trying to verify if an agent keeps to its promises"
+
+Agent: Let me run some tests to generate tracking data.
+
+> "proceed"
+
+[Agent runs: pwd, reads README.md]
+
+> "analyze_agent_usage"
+```
+
+**Result:**
+```
+## Agent Usage Report
+
+**Agents detected:** 1
+**Total events:** 4
+
+### openagent
+**Active duration:** 133s
+**Events:** 4
+**Tools used:**
+- bash: 2x
+- read: 1x
+- analyze_agent_usage: 1x
+```
+
+**Verification:** ✅ Plugin successfully tracked agent name, tools, and events
+
+---
+
+### Example 2: Detecting Agent Switch
+
+**Session:**
+```bash
+$ opencode --agent build
+[Do some work with build agent]
+
+$ opencode --agent openagent
+[Switch to openagent]
+
+> "analyze_agent_usage"
+```
+
+**Result:**
+```
+## Agent Usage Report
+
+**Agents detected:** 2
+**Total events:** 7
+
+### build
+**Active duration:** 0s
+**Events:** 2
+**Tools used:**
+- bash: 2x
+
+### openagent
+**Active duration:** 133s
+**Events:** 5
+**Tools used:**
+- bash: 2x
+- read: 1x
+- analyze_agent_usage: 2x
+```
+
+**Verification:** ✅ Plugin tracked both agents and their respective activities
+
+---
+
+### Example 3: Approval Gate Validation
+
+**Session:**
+```bash
+> "Run npm install"
+
+Agent: ## Proposed Plan
+1. Run npm install
+
+**Approval needed before proceeding.**
+
+> "proceed"
+
+[Agent executes]
+
+> "check_approval_gates"
+```
+
+**Result:**
+```
+✅ Approval gate compliance: PASSED
+
+All 1 execution operation(s) were properly approved.
+```
+
+**Verification:** ✅ Agent requested approval before bash execution
+
+---
+
+## Best Practices
+
+### 1. Validate After Complex Tasks
+Always run validation after multi-step or complex tasks to ensure compliance.
+
+### 2. Export Reports for Auditing
+Use `export_validation_report` to keep records of agent behavior over time.
+
+### 3. Check Context Loading
+Verify agents are loading the right context files with `check_context_compliance`.
+
+### 4. Monitor Agent Switches
+Use `analyze_agent_usage` to track delegation and agent switching patterns.
+
+### 5. Debug Early
+If something seems off, run `debug_validator` immediately to see raw data.
+
+### 6. Customize for Your Needs
+Adjust validation rules, thresholds, and keywords to match your workflow.
+
+### 7. Integrate with Workflows
+Add validation checks to your development workflow or CI/CD pipeline.
+
+---
+
+## FAQ
+
+### Q: Does the plugin slow down OpenCode?
+**A:** No - tracking is lightweight and runs asynchronously, so the performance impact is minimal.
+
+### Q: Can I disable specific validation checks?
+**A:** Yes, edit `agent-validator.ts` and comment out checks you don't need.
+
+### Q: Does validation data persist across sessions?
+**A:** No, tracking is per-session. Each new OpenCode session starts fresh.
+
+### Q: Can I track custom metrics?
+**A:** Yes, add custom event tracking and validation tools (see Advanced Usage).
+
+### Q: What if I get false positives?
+**A:** Customize approval keywords and validation patterns in `agent-validator.ts`.
+
+### Q: Can I use this with other agents?
+**A:** Yes, the plugin tracks any agent running in OpenCode.
+
+### Q: How do I reset tracking data?
+**A:** Restart OpenCode - tracking resets on each session start.
+
+### Q: Can I export data in JSON format?
+**A:** Currently exports as Markdown. You can modify `generateDetailedReport()` for JSON.
+
+---
+
+## Next Steps
+
+1. **Test the plugin** - Run through the Quick Start workflow
+2. **Validate a real task** - Use it on an actual project task
+3. **Customize rules** - Adjust validation patterns for your needs
+4. **Integrate into workflow** - Add validation checks to your process
+5. **Share feedback** - Report issues or suggest improvements
+
+---
+
+## Support
+
+- **Issues:** Report bugs or request features in the repository
+- **Customization:** Edit `agent-validator.ts` for your needs
+- **Documentation:** This guide + inline code comments
+
+---
+
+**Happy validating! 🎯**

+ 11 - 5
.opencode/plugin/notify.ts

@@ -1,11 +1,17 @@
 import type { Plugin } from "@opencode-ai/plugin"
 
+// 🔧 CONFIGURATION: Set to true to enable this plugin
+const ENABLED = false
+
 export const Notify: Plugin = async ({ $ }) => {
+  // Plugin disabled - set ENABLED = true to activate
+  if (!ENABLED) return {}
+  
   return {
-    // async event(input) {
-    //   if (input.event.type === "session.idle") {
-    //     await $`say "Your code is done!"`
-    //   }
-    // },
+    async event(input) {
+      if (input.event.type === "session.idle") {
+        await $`say "Your code is done!"`
+      }
+    },
   }
 }

+ 7 - 10
.opencode/plugin/package.json

@@ -1,16 +1,13 @@
 {
-  "type": "module",
-  "name": "opencode-telegram-plugin",
+  "name": "opencode-plugins",
   "version": "1.0.0",
-  "description": "Telegram notifications for OpenCode sessions",
-  "main": "telegram-notify.ts",
-  "scripts": {
-    "start": "bun telegram-bot.ts",
-    "build": "bun build telegram-bot.ts --outdir dist"
+  "type": "module",
+  "dependencies": {
+    "@opencode-ai/plugin": "latest",
+    "@opencode-ai/sdk": "latest"
   },
   "devDependencies": {
-    "@types/node": "^24.2.1",
-    "@opencode-ai/plugin": "^0.5.1",
-    "bun-types": "latest"
+    "@types/node": "^24.10.1",
+    "typescript": "^5.9.3"
   }
 }

+ 88 - 109
.opencode/plugin/telegram-notify.ts

@@ -1,125 +1,104 @@
-// import type { Plugin } from "@opencode-ai/plugin"
-// import { SimpleTelegramBot } from "../lib/telegram-bot"
+import type { Plugin } from "@opencode-ai/plugin"
+import { SimpleTelegramBot } from "./lib/telegram-bot"
 
-// export const TelegramNotify: Plugin = async ({ $ }) => {
-//   // Initialize Telegram bot
-//   const bot = new SimpleTelegramBot()
-//   let lastMessage = ""
+// 🔧 CONFIGURATION: Set to true to enable this plugin
+const ENABLED = false
+
+export const TelegramNotify: Plugin = async ({ $ }) => {
+  // Plugin disabled - set ENABLED = true to activate
+  if (!ENABLED) return {}
+  
+  // Initialize Telegram bot
+  const bot = new SimpleTelegramBot()
+  let lastMessage = ""
   
-//   return {
-//     async event(input) {
-//       if (input.event.type === "session.idle") {
-//         // Send the last message content along with idle notification
-//         const message = lastMessage 
-//           ? `🟡 Session idle! Here's your last message:\n\n${lastMessage}`
-//           : "🟡 Hey! Your OpenCode session is idle - time to check your work!"
-//         bot.sendMessage(message)
-//       }
+  return {
+    async event(input) {
+      if (input.event.type === "session.idle") {
+        // Send the last message content along with idle notification
+        const message = lastMessage 
+          ? `🟡 Session idle! Here's your last message:\n\n${lastMessage}`
+          : "🟡 Hey! Your OpenCode session is idle - time to check your work!"
+        bot.sendMessage(message)
+      }
       
-//       if (input.event.type === "message.updated") {
-//         // Reset idle timer when user sends messages
-//         bot.resetActivity()
+      if (input.event.type === "message.updated") {
+        // Reset idle timer when user sends messages
+        bot.resetActivity()
         
-//         const messageContent = (input.event as any).message?.content || 
-//                               (input.event as any).content || ""
+        const messageContent = (input.event as any).message?.content || 
+                              (input.event as any).content || ""
         
-//         // Check if it's a command to send last message
-//         if (messageContent.includes("/send-last") || messageContent.includes("/last")) {
-//           if (lastMessage) {
-//             bot.sendMessage(`📱 Here's your last message:\n\n${lastMessage}`)
-//           } else {
-//             bot.sendMessage("📱 No previous message found.")
-//           }
-//           return
-//         }
+        // Check if it's a command to send last message
+        if (messageContent.includes("/send-last") || messageContent.includes("/last")) {
+          if (lastMessage) {
+            bot.sendMessage(`📱 Here's your last message:\n\n${lastMessage}`)
+          } else {
+            bot.sendMessage("📱 No previous message found.")
+          }
+          return
+        }
         
-//         // Check if it's a command to send to phone
-//         if (messageContent.includes("/send-to-phone") || messageContent.includes("/phone")) {
-//           if (lastMessage) {
-//             bot.sendMessage(`📱 Sending to your phone:\n\n${lastMessage}`)
-//           } else {
-//             bot.sendMessage("📱 No message to send to phone.")
-//           }
-//           return
-//         }
+        // Check if it's a command to send to phone
+        if (messageContent.includes("/send-to-phone") || messageContent.includes("/phone")) {
+          if (lastMessage) {
+            bot.sendMessage(`📱 Sending to your phone:\n\n${lastMessage}`)
+          } else {
+            bot.sendMessage("📱 No message to send to phone.")
+          }
+          return
+        }
         
-//         // Try to capture message content from the event
-//         try {
-//           // Access message content if available
-//           const messageContent = (input.event as any).message?.content || 
-//                                 (input.event as any).content ||
-//                                 "Message updated"
+        // Try to capture message content from the event
+        try {
+          // Access message content if available
+          const messageContent = (input.event as any).message?.content || 
+                                (input.event as any).content ||
+                                "Message updated"
           
-//           if (messageContent && messageContent !== "Message updated") {
-//             lastMessage = messageContent
+          if (messageContent && messageContent !== "Message updated") {
+            lastMessage = messageContent
             
-//             // Send a preview of the message to Telegram
-//             const preview = lastMessage.length > 200 
-//               ? lastMessage.substring(0, 200) + "..."
-//               : lastMessage
+            // Send a preview of the message to Telegram
+            const preview = lastMessage.length > 200 
+              ? lastMessage.substring(0, 200) + "..."
+              : lastMessage
             
-//             bot.sendMessage(`📱 Last message preview:\n\n${preview}`)
-//           }
-//         } catch (error) {
-//           // If we can't access the message content, just log it
-//           console.log("Message updated but couldn't capture content")
-//         }
-//       }
+            bot.sendMessage(`📱 Last message preview:\n\n${preview}`)
+          }
+        } catch (error) {
+          // If we can't access the message content, just log it
+          console.log("Message updated but couldn't capture content")
+        }
+      }
       
-//       if (input.event.type === "file.edited") {
-//         // Reset idle timer when user edits files
-//         bot.resetActivity()
-//       }
-      
-//       if (input.event.type === "message.updated") {
-//         // Reset idle timer when user sends messages
-//         bot.resetActivity()
-        
-//         // Try to capture message content from the event
-//         try {
-//           // Access message content if available
-//           const messageContent = (input.event as any).message?.content || 
-//                                 (input.event as any).content ||
-//                                 "Message updated"
-          
-//           if (messageContent && messageContent !== "Message updated") {
-//             lastMessage = messageContent
-            
-//             // Send a preview of the message to Telegram
-//             const preview = lastMessage.length > 200 
-//               ? lastMessage.substring(0, 200) + "..."
-//               : lastMessage
-            
-//             bot.sendMessage(`📱 Last message preview:\n\n${preview}`)
-//           }
-//         } catch (error) {
-//           // If we can't access the message content, just log it
-//           console.log("Message updated but couldn't capture content")
-//         }
-//       }
+      if (input.event.type === "file.edited") {
+        // Reset idle timer when user edits files
+        bot.resetActivity()
+      }
       
-//       // Also listen for message parts being updated
-//       if (input.event.type === "message.part.updated") {
-//         bot.resetActivity()
+      // Also listen for message parts being updated
+      if (input.event.type === "message.part.updated") {
+        bot.resetActivity()
         
-//         try {
-//           const partContent = (input.event as any).part?.content || 
-//                              (input.event as any).content ||
-//                              "Message part updated"
+        try {
+          const partContent = (input.event as any).part?.content || 
+                             (input.event as any).content ||
+                             "Message part updated"
           
-//           if (partContent && partContent !== "Message part updated") {
-//             lastMessage = partContent
+          if (partContent && partContent !== "Message part updated") {
+            lastMessage = partContent
             
-//             const preview = lastMessage.length > 200 
-//               ? lastMessage.substring(0, 200) + "..."
-//               : lastMessage
+            const preview = lastMessage.length > 200 
+              ? lastMessage.substring(0, 200) + "..."
+              : lastMessage
             
-//             bot.sendMessage(`📱 Message part preview:\n\n${preview}`)
-//           }
-//         } catch (error) {
-//           console.log("Message part updated but couldn't capture content")
-//         }
-//       }
-//     }
-//   }
-// }
+            bot.sendMessage(`📱 Message part preview:\n\n${preview}`)
+          }
+        } catch (error) {
+          console.log("Message part updated but couldn't capture content")
+        }
+      }
+    }
+  }
+}

+ 101 - 0
.opencode/plugin/tests/validator/interactive.md

@@ -0,0 +1,101 @@
+# Interactive Testing Guide for Agent Validator Plugin
+
+## Issue Discovered
+
+The plugin tracks behavior **within a single session**, but `opencode run` creates a new session for each command. This means:
+
+- ❌ `opencode run "task"` then `opencode run "validate_session"` = Different sessions
+- ✅ Interactive TUI session = Same session throughout
+
+## Testing Approaches
+
+### Option 1: Use OpenCode TUI (Recommended)
+
+```bash
+# Start OpenCode interactively
+opencode
+
+# Then in the TUI:
+1. "List files in current directory"
+2. Wait for response
+3. "validate_session"
+4. Review the validation report
+```
+
+### Option 2: Use Server Mode + Attach
+
+```bash
+# Terminal 1: Start server
+opencode serve --port 4096
+
+# Terminal 2: Run commands that attach to same session
+opencode run --attach http://localhost:4096 "List files"
+opencode run --attach http://localhost:4096 "validate_session"
+```
+
+### Option 3: Use --continue Flag
+
+```bash
+# Run first command
+opencode run "List files" --title "test-validation"
+
+# Continue the same session
+opencode run "validate_session" --continue
+```
+
+## What to Test
+
+### Test 1: Approval Gate Detection
+```
+You: "Create a new file called test.txt with 'hello world'"
+Expected: Agent should request approval before write
+Then: "validate_session"
+Expected: Should show approval gate check
+```
+
+### Test 2: Tool Tracking
+```
+You: "Read the README.md file"
+Then: "validate_session"
+Expected: Should show tool_usage check for 'read'
+```
+
+### Test 3: Delegation Analysis
+```
+You: "Refactor these 5 files: a.ts, b.ts, c.ts, d.ts, e.ts"
+Expected: Agent should delegate (4+ files)
+Then: "analyze_delegation"
+Expected: Should show appropriate delegation
+```
+
+### Test 4: Export Report
+```
+After any task:
+You: "export_validation_report"
+Expected: Creates .tmp/validation-{sessionID}.md
+```
+
+## Expected Behavior
+
+When working correctly, you should see:
+
+```markdown
+## Validation Report
+
+**Score:** 95%
+- ✅ Passed: 4
+- ⚠️  Warnings: 0
+- ❌ Failed: 0
+
+### ✅ Checks Passed
+- **tool_usage**: Used 2 tool(s): read, bash
+- **approval_gate_enforcement**: Properly requested approval before 1 execution op(s)
+- **lazy_context_loading**: Lazy-loaded 1 context file(s)
+```
+
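The **Score** line at the top of the report can be thought of as a pass ratio over the individual checks. A hypothetical sketch of such a scoring scheme (the real `agent-validator.ts` may weight checks differently — here a warning is assumed to count as half credit):

```typescript
// Hypothetical scoring sketch -- not the plugin's actual formula.
interface CheckResult {
  name: string;
  status: "pass" | "warn" | "fail";
}

function score(checks: CheckResult[]): number {
  if (checks.length === 0) return 0;
  // Full credit for a pass, half for a warning, none for a failure.
  const points = checks.reduce(
    (sum, c) => sum + (c.status === "pass" ? 1 : c.status === "warn" ? 0.5 : 0),
    0,
  );
  return Math.round((points / checks.length) * 100);
}
```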
+## Next Steps
+
+1. Test in TUI mode to verify plugin works in persistent session
+2. If it works: Document the limitation (only works in persistent sessions)
+3. If it doesn't work: Debug the event tracking logic
+4. Improve plugin to handle cross-session validation if needed

+ 34 - 0
.opencode/plugin/tests/validator/test-validation.sh

@@ -0,0 +1,34 @@
+#!/bin/bash
+
+echo "==================================="
+echo "Agent Validator Test Script"
+echo "==================================="
+echo ""
+
+echo "This script will help you test the agent validation plugin."
+echo ""
+echo "In your OpenCode CLI, run these commands one by one:"
+echo ""
+echo "1. debug_validator"
+echo "   └─ Check if 'agent' field shows real agent names"
+echo ""
+echo "2. analyze_agent_usage"
+echo "   └─ See which agents were active and their tool usage"
+echo ""
+echo "3. analyze_context_reads"
+echo "   └─ View context files that were loaded"
+echo ""
+echo "4. validate_session"
+echo "   └─ Full compliance and tracking report"
+echo ""
+echo "5. check_context_compliance"
+echo "   └─ Verify context loading rules were followed"
+echo ""
+echo "==================================="
+echo ""
+echo "What to look for:"
+echo "✓ Agent names should NOT be 'unknown'"
+echo "✓ Should see 'openagent' and potentially 'general' subagent"
+echo "✓ Context file 'standards/code.md' should be tracked"
+echo "✓ Tools should be attributed to correct agents"
+echo ""

+ 67 - 0
.opencode/plugin/tests/validator/test-validator.sh

@@ -0,0 +1,67 @@
+#!/bin/bash
+
+# Test script for Agent Validator Plugin
+# Tests basic plugin functionality and validation tools
+
+echo "🧪 Testing Agent Validator Plugin"
+echo "=================================="
+echo ""
+
+# Test 1: Simple task with approval gate
+echo "📝 Test 1: Simple task (should request approval)"
+echo "Running: 'List files in current directory'"
+echo ""
+
+opencode run "List the files in the current directory" --format json > /tmp/test1-output.json 2>&1
+
+echo "✅ Test 1 complete"
+echo ""
+
+# Test 2: Validate the session
+echo "📊 Test 2: Validate session behavior"
+echo "Running: validate_session"
+echo ""
+
+opencode run "validate_session" --format json > /tmp/test2-output.json 2>&1
+
+echo "✅ Test 2 complete"
+echo ""
+
+# Test 3: Check approval gates
+echo "🔒 Test 3: Check approval gates"
+echo "Running: check_approval_gates"
+echo ""
+
+opencode run "check_approval_gates" --format json > /tmp/test3-output.json 2>&1
+
+echo "✅ Test 3 complete"
+echo ""
+
+# Display results
+echo "=================================="
+echo "📋 Test Results"
+echo "=================================="
+echo ""
+
+echo "Test 1 Output (last 20 lines):"
+echo "---"
+tail -20 /tmp/test1-output.json
+echo ""
+
+echo "Test 2 Output (validation report):"
+echo "---"
+tail -30 /tmp/test2-output.json
+echo ""
+
+echo "Test 3 Output (approval gates):"
+echo "---"
+tail -20 /tmp/test3-output.json
+echo ""
+
+echo "=================================="
+echo "✅ All tests complete!"
+echo ""
+echo "Full outputs saved to:"
+echo "  - /tmp/test1-output.json"
+echo "  - /tmp/test2-output.json"
+echo "  - /tmp/test3-output.json"

+ 1 - 1
.opencode/plugin/tsconfig.json

@@ -14,6 +14,6 @@
     "noEmit": true,
     "types": ["node"]
   },
-  "include": ["*.ts"],
+  "include": ["**/*.ts"],
   "exclude": ["node_modules"]
 }

+ 59 - 27
README.md

@@ -184,14 +184,22 @@ opencode --agent openagent
 ```
 User Request
-openagent (universal coordinator)
+┌───────────────────────────────────────┐
+│  Main Agents (User-Facing)           │
+├───────────────────────────────────────┤
+│  openagent     │ General tasks        │
+│  opencoder     │ Complex coding       │
+│  system-builder│ AI system generation │
+└───────────────────────────────────────┘
-    ├─→ @task-manager (breaks down complex features)
-    ├─→ @tester (writes and runs tests)
-    ├─→ @reviewer (security and code review)
-    ├─→ @documentation (generates docs)
-    ├─→ @coder-agent (implementation tasks)
-    └─→ @build-agent (type checking and validation)
+┌───────────────────────────────────────┐
+│  Specialized Subagents                │
+├───────────────────────────────────────┤
+│  Core:         task-manager, docs     │
+│  Code:         coder, tester, reviewer│
+│  Utils:        image-specialist       │
+│  Meta:         domain-analyzer, etc.  │
+└───────────────────────────────────────┘
 ```
 
 **The workflow:**
@@ -209,20 +217,33 @@ openagent (universal coordinator)
 ## What's Included
 
 ### 🤖 Main Agents
-- **openagent** - Universal agent for questions and tasks (recommended default)
-- **codebase-agent** - Specialized development agent for code-focused workflows
-- **task-manager** - Breaks complex features into manageable subtasks
-- **workflow-orchestrator** - Routes requests to appropriate workflows
-- **image-specialist** - Generates images with Gemini AI
+- **openagent** - Universal coordinator for general tasks, questions, and workflows (recommended default)
+- **opencoder** - Specialized development agent for complex coding, architecture, and refactoring
+- **system-builder** - Meta-level generator for creating custom AI architectures
 
 ### 🔧 Specialized Subagents (Auto-delegated)
+
+**Core Coordination:**
+- **task-manager** - Task breakdown and planning
+- **documentation** - Documentation authoring
+
+**Code Specialists:**
+- **coder-agent** - Quick implementation tasks
 - **reviewer** - Code review and security analysis
 - **tester** - Test creation and validation
-- **coder-agent** - Quick implementation tasks
-- **documentation** - Documentation generation
 - **build-agent** - Build and type checking
 - **codebase-pattern-analyst** - Pattern discovery
 
+**Utilities:**
+- **image-specialist** - Image generation with Gemini AI
+
+**System Builder (Meta-Level):**
+- **domain-analyzer** - Domain analysis and agent recommendations
+- **agent-generator** - XML-optimized agent generation
+- **context-organizer** - Context file organization
+- **workflow-designer** - Workflow design
+- **command-creator** - Custom command creation
+
 ### ⚡ Commands
 - **/commit** - Smart git commits with conventional format
 - **/optimize** - Code optimization
@@ -374,7 +395,7 @@ cp env.example .env
 ## Common Questions
 
 **Q: What's the main way to use this?**  
-A: Use `opencode --agent openagent` as your default. It handles both questions and tasks, coordinating with specialists as needed.
+A: Use `opencode --agent openagent` as your default for general tasks and questions. For complex multi-file coding work, use `opencode --agent opencoder`. Both coordinate with specialists as needed.
 
 **Q: Does this work on Windows?**  
 A: Yes! Use Git Bash (recommended) or WSL. See [Platform Compatibility Guide](docs/getting-started/platform-compatibility.md) for details.
@@ -404,8 +425,8 @@ A: Yes! Use the installer's list feature to see all components:
 ```
 Or cherry-pick individual files with curl:
 ```bash
-curl -o ~/.opencode/agent/codebase-agent.md \
-  https://raw.githubusercontent.com/darrenhinde/OpenAgents/main/.opencode/agent/codebase-agent.md
+curl -o ~/.opencode/agent/opencoder.md \
+  https://raw.githubusercontent.com/darrenhinde/OpenAgents/main/.opencode/agent/opencoder.md
 ```
 
 ---
@@ -427,7 +448,7 @@ Minimal starter kit - universal agent with core subagents.
 ### 💼 Developer (Recommended - 30 components)
 Complete software development environment with code generation, testing, review, and build tools.
 - Everything in Essential, plus:
-- **Agents**: codebase-agent
+- **Agents**: opencoder
 - **Subagents**: coder-agent, reviewer, tester, build-agent, codebase-pattern-analyst
 - **Commands**: commit, test, optimize, validate-repo
 - **Context**: All standards and workflow files (code, patterns, tests, docs, analysis, delegation, sessions, task-breakdown, review, context-guide)
@@ -454,10 +475,9 @@ Everything included - all agents, subagents, tools, and plugins.
 ### 🚀 Advanced (43 components)
 Full installation plus **System Builder** for creating custom AI architectures.
 - Everything in Full, plus:
-- **System Builder**: Interactive AI system generator
-  - system-builder agent
-  - domain-analyzer, agent-generator, context-organizer, workflow-designer, command-creator subagents
-  - build-context-system command
+- **Agents**: system-builder
+- **System Builder Subagents**: domain-analyzer, agent-generator, context-organizer, workflow-designer, command-creator
+- **Commands**: build-context-system
 - **Best for**: Building custom AI systems, contributors, learning the architecture
 
 ## Updating Components
@@ -493,8 +513,9 @@ Read [Agent System Blueprint](docs/features/agent-system-blueprint.md) to learn:
 ```
 .opencode/
 ├── agent/              # AI agents
-│   ├── codebase-agent.md
-│   ├── task-manager.md
+│   ├── openagent.md
+│   ├── opencoder.md
+│   ├── system-builder.md
 │   └── subagents/      # Specialized helpers
 ├── command/            # Slash commands
 │   ├── commit.md
@@ -529,17 +550,28 @@ This project is licensed under the MIT License.
 
 ## Recommended for New Users
 
-**Start with `openagent`** - it's your universal assistant that handles everything from simple questions to complex multi-step workflows. It follows a systematic 6-stage workflow (Analyze → Approve → Execute → Validate → Summarize → Confirm) and automatically delegates to specialized subagents when needed.
+**Start with `openagent`** - your universal coordinator for general tasks, questions, and workflows. It follows a systematic 6-stage workflow (Analyze → Approve → Execute → Validate → Summarize → Confirm) and automatically delegates to specialized subagents when needed.
 
 ```bash
 opencode --agent openagent
 > "How do I implement authentication in Next.js?"  # Questions
-> "Create a user authentication system"            # Tasks
+> "Create a user authentication system"            # Simple tasks
+> "Create a README for this project"               # Documentation
 ```
 
 OpenAgent will guide you through with a plan-first, approval-based approach. For questions, you get direct answers. For tasks, you see the plan before execution.
 
-**Learn more:** See the [OpenAgent Guide](docs/agents/openagent.md) for detailed workflow diagrams and tips.
+**For complex coding work**, use `opencoder`:
+
+```bash
+opencode --agent opencoder
+> "Refactor this codebase to use dependency injection"  # Multi-file refactoring
+> "Analyze the architecture and suggest improvements"   # Architecture analysis
+```
+
+**Learn more:** 
+- [OpenAgent Guide](docs/agents/openagent.md) - General tasks and coordination
+- [OpenCoder Guide](docs/agents/opencoder.md) - Specialized development work
 
 ---
 ## Support This Work

+ 15 - 11
docs/agents/openagent.md

@@ -22,21 +22,23 @@ OpenAgent is the primary universal agent in OpenCode that handles everything fro
 
 ## What is OpenAgent?
 
-OpenAgent is your **universal assistant** that:
+OpenAgent is your **universal coordinator** that:
 
 - ✅ **Answers questions** - Get explanations, comparisons, and guidance
-- ✅ **Executes tasks** - Create files, update code, run commands
-- ✅ **Coordinates workflows** - Handles most tasks directly, delegates to specialists when needed
+- ✅ **Executes general tasks** - Create files, documentation, simple updates
+- ✅ **Coordinates workflows** - Handles most general tasks directly, delegates to specialists when needed
 - ✅ **Preserves context** - Remembers information across multiple steps
 - ✅ **Keeps you in control** - Always asks for approval before taking action
 
-Think of OpenAgent as a **smart project manager** who:
+Think of OpenAgent as a **smart project coordinator** who:
 - Understands what you need
 - Plans how to do it
 - Asks for your approval
 - Executes the plan (directly or via delegation)
 - Confirms everything is done right
 
+**Note:** For complex multi-file coding work, architecture analysis, or deep refactoring, use **opencoder** instead. OpenAgent is optimized for general tasks and coordination, while opencoder specializes in development work.
+
 ---
 
 ## How It Works
@@ -64,17 +66,19 @@ graph LR
 **For Questions**: You ask → You get an answer
 **For Tasks**: You ask → See plan → Approve → Watch it happen → Confirm done
 
-### Universal Agent Philosophy
+### Universal Coordinator Philosophy
+
+OpenAgent is a **universal coordinator** - it handles general tasks directly:
 
-OpenAgent is a **universal agent** - it handles most tasks directly:
+**Capabilities**: Answer questions, create documentation, simple code updates, workflow coordination, task planning, general research, file operations
 
-**Capabilities**: Write code, docs, tests, reviews, analysis, debugging, research, bash operations, file operations
+**Default**: Execute directly for general tasks, fetch context files as needed (lazy), keep it simple, don't over-delegate
 
-**Default**: Execute directly, fetch context files as needed (lazy), keep it simple, don't over-delegate
+**Delegate to opencoder when**: Complex multi-file coding, architecture analysis, deep refactoring, pattern implementation
 
-**Delegate only when**: 4+ files, specialized expertise needed, thorough multi-component review, complex dependencies, or user requests breakdown
+**Delegate to specialists when**: Testing needed (@tester), review needed (@reviewer), complex task breakdown (@task-manager), comprehensive documentation (@documentation)
 
-This means OpenAgent is your go-to agent for almost everything. It only calls in specialists when truly necessary.
+This means OpenAgent is your go-to coordinator for general tasks and questions. For deep coding work, use **opencoder**.
 
 ---
 
@@ -1175,7 +1179,7 @@ OpenAgent is your **intelligent universal agent** that:
 ✅ **Preserves context** - Remembers information across multiple steps
 ✅ **Executes directly** - Handles most tasks itself, delegates only when needed
 ✅ **Keeps you in control** - Always confirms before cleanup (Critical Rule)
-✅ **Handles complexity** - Capable of code, docs, tests, reviews, analysis, debugging
+✅ **Handles general tasks** - Questions, docs, coordination, simple updates (delegates complex coding to opencoder)
 ✅ **Reports before fixing** - Never auto-fixes issues without approval (Critical Rule)
 
 **Key Takeaways**:

+ 441 - 0
docs/agents/opencoder.md

@@ -0,0 +1,441 @@
+# OpenCoder - Specialized Development Agent
+
+**Your expert development partner for complex coding tasks**
+
+---
+
+## Table of Contents
+
+- [What is OpenCoder?](#what-is-opencoder)
+- [When to Use OpenCoder](#when-to-use-opencoder)
+- [When to Use OpenAgent Instead](#when-to-use-openagent-instead)
+- [Core Capabilities](#core-capabilities)
+- [Workflow](#workflow)
+- [Multi-Language Support](#multi-language-support)
+- [Code Standards](#code-standards)
+- [Subagent Delegation](#subagent-delegation)
+- [Examples](#examples)
+- [Tips for Best Results](#tips-for-best-results)
+
+---
+
+## What is OpenCoder?
+
+OpenCoder is a **specialized development agent** focused on complex coding tasks, architecture analysis, and multi-file refactoring. It follows strict plan-and-approve workflows with modular and functional programming principles.
+
+**Key Characteristics:**
+- 🎯 **Specialized** - Deep focus on code quality and architecture
+- 🔧 **Multi-language** - Adapts to TypeScript, Python, Go, Rust, and more
+- 📐 **Plan-first** - Always proposes plans before implementation
+- 🏗️ **Modular** - Emphasizes clean architecture and separation of concerns
+- ✅ **Quality-focused** - Includes testing, type checking, and validation
+
+---
+
+## When to Use OpenCoder
+
+✅ **Multi-file refactoring** (4+ files)
+- Refactoring an entire module
+- Implementing patterns across multiple components
+- Restructuring architecture
+
+✅ **Architecture analysis and improvements**
+- Analyzing codebase structure
+- Identifying architectural issues
+- Proposing design improvements
+
+✅ **Complex code implementations**
+- Features spanning multiple modules
+- Implementations requiring > 60 minutes
+- Features with complex dependencies
+
+✅ **Pattern discovery and application**
+- Finding existing patterns in codebase
+- Implementing consistent patterns
+- Refactoring to match established patterns
+
+✅ **Deep codebase analysis**
+- Understanding complex code flows
+- Documenting architecture
+- Identifying technical debt
+
+---
+
+## When to Use OpenAgent Instead
+
+Use **openagent** for:
+- ❓ Questions about code or concepts
+- 📝 Documentation tasks
+- 🔄 Simple 1-3 file changes
+- 🎯 General workflow coordination
+- 💬 Exploratory conversations
+
+**Rule of thumb:** 
+- **OpenAgent** = General coordinator (questions, docs, coordination, simple tasks)
+- **OpenCoder** = Development specialist (complex coding, architecture, refactoring)
+
+---
+
+## Core Capabilities
+
+### Code Implementation
+- Modular architecture design
+- Functional programming patterns
+- Type-safe implementations
+- SOLID principles adherence
+- Clean code practices
+- Proper separation of concerns
+
+### Analysis & Review
+- Architecture analysis
+- Pattern discovery
+- Code quality assessment
+- Technical debt identification
+- Performance analysis
+
+### Refactoring
+- Multi-file refactoring
+- Pattern implementation
+- Architecture improvements
+- Code modernization
+- Technical debt reduction
+
+### Quality Assurance
+- Type checking (TypeScript, Python, Go, Rust)
+- Linting (ESLint, Pylint, etc.)
+- Build validation
+- Test execution
+- Incremental validation
+
+---
+
+## Workflow
+
+### Phase 1: Planning (Required)
+
+OpenCoder **always** proposes a plan first:
+
+```
+1. Analyzes the request
+2. Creates step-by-step implementation plan
+3. Presents plan to user
+4. Waits for approval
+```
+
+**No implementation happens without approval.**
+
+For features spanning multiple modules or estimated > 60 minutes, OpenCoder delegates to `@task-manager` to create atomic subtasks.
+
+---
+
+### Phase 2: Implementation (After Approval)
+
+Implements **incrementally** - one step at a time:
+
+```
+For each step:
+1. Implement the code
+2. Run type checks (if applicable)
+3. Run linting (if configured)
+4. Run build checks
+5. Execute relevant tests
+6. Validate results
+7. Move to next step
+```
+
+**Validation happens continuously**, not just at the end.
+
+For simple subtasks, delegates to `@subagents/coder-agent` to save time.
+
+---
+
+### Phase 3: Completion
+
+When implementation is complete:
+
+```
+1. Final validation
+2. User approval
+3. Handoff recommendations for:
+   - @tester (if tests needed)
+   - @documentation (if docs needed)
+   - @reviewer (if security review needed)
+```
+
+---
+
+## Multi-Language Support
+
+OpenCoder adapts to the project's language automatically:
+
+### TypeScript/JavaScript
+- Runtime: `node`, `bun`, or `deno`
+- Type checking: `tsc`
+- Linting: `eslint`
+- Testing: `jest`, `vitest`, `mocha`
+
+### Python
+- Runtime: `python`
+- Type checking: `mypy`
+- Linting: `pylint`, `flake8`
+- Testing: `pytest`, `unittest`
+
+### Go
+- Build: `go build`
+- Linting: `golangci-lint`
+- Testing: `go test`
+
+### Rust
+- Build: `cargo check`, `cargo build`
+- Linting: `clippy`
+- Testing: `cargo test`
+
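The per-language toolchains above amount to a small lookup table. A hedged TypeScript sketch (tool names taken from the lists above; the agent's actual selection logic is not shown here, and the structure is hypothetical):

```typescript
// Sketch of a language -> validation-commands lookup, using the tools listed above.
const toolchains: Record<string, { check: string; lint: string; test: string }> = {
  typescript: { check: "tsc --noEmit", lint: "eslint .", test: "vitest run" },
  python: { check: "mypy .", lint: "pylint src", test: "pytest" },
  go: { check: "go build ./...", lint: "golangci-lint run", test: "go test ./..." },
  rust: { check: "cargo check", lint: "cargo clippy", test: "cargo test" },
};

// Returns the ordered validation steps for a language, or [] if unrecognized.
function validationPlan(lang: string): string[] {
  const t = toolchains[lang];
  return t ? [t.check, t.lint, t.test] : [];
}
```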
+---
+
+## Code Standards
+
+OpenCoder follows these principles:
+
+### Modular Architecture
+- Clear module boundaries
+- Single responsibility principle
+- Loose coupling, high cohesion
+- Dependency injection where appropriate
+
+### Functional Patterns
+- Pure functions where possible
+- Immutable data structures
+- Declarative over imperative
+- Function composition
+
+### Type Safety
+- Strong typing (when language supports)
+- Explicit types over inference (when clearer)
+- Type guards and validation
+- Null safety
+
+### Clean Code
+- Meaningful names
+- Small, focused functions
+- Minimal, high-signal comments
+- Avoid over-complication
+- Follow language conventions
+
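A minimal TypeScript illustration of these principles — pure functions, immutable data, and declarative composition (a hypothetical example, not taken from the repository):

```typescript
// Immutable domain type: fields cannot be reassigned after construction.
type Order = { readonly id: string; readonly total: number };

// Pure function: no mutation, no side effects -- returns a new array.
const applyDiscount = (orders: readonly Order[], pct: number): Order[] =>
  orders.map((o) => ({ ...o, total: o.total * (1 - pct) }));

// Declarative aggregation instead of an imperative accumulator loop.
const revenue = (orders: readonly Order[]): number =>
  orders.reduce((sum, o) => sum + o.total, 0);
```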
+---
+
+## Subagent Delegation
+
+OpenCoder coordinates with specialized subagents:
+
+### @task-manager
+**When:** Features spanning 4+ files or > 60 minutes
+**Purpose:** Break down into atomic subtasks
+**Output:** Task files under `tasks/subtasks/{feature}/`
+
+### @coder-agent
+**When:** Simple, focused implementation tasks
+**Purpose:** Quick code implementation
+**Output:** Implemented code following specifications
+
+### @tester
+**When:** Tests needed for implementation
+**Purpose:** Write comprehensive test suites
+**Output:** Unit, integration, and e2e tests
+
+### @reviewer
+**When:** Security or quality review needed
+**Purpose:** Code review, security analysis
+**Output:** Review report with recommendations
+
+### @build-agent
+**When:** Build validation needed
+**Purpose:** Type checking, build verification
+**Output:** Build status, error reports
+
+### @documentation
+**When:** Comprehensive documentation needed
+**Purpose:** Generate API docs, guides
+**Output:** Structured documentation
+
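The thresholds in the `@task-manager` entry above (4+ files, or more than 60 minutes) reduce to a one-line predicate. A hypothetical sketch of that heuristic:

```typescript
// Delegation heuristic using the documented thresholds (4+ files or > 60 minutes).
function shouldDelegateToTaskManager(fileCount: number, estimatedMinutes: number): boolean {
  return fileCount >= 4 || estimatedMinutes > 60;
}
```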
+---
+
+## Examples
+
+### Example 1: Multi-File Refactoring
+
+```bash
+opencode --agent opencoder
+> "Refactor the authentication module to use dependency injection across all 8 files"
+
+# OpenCoder will:
+# 1. Analyze current structure (8 files)
+# 2. Propose refactoring plan
+# 3. Wait for approval
+# 4. Delegate to @task-manager (8 files > 4 file threshold)
+# 5. Implement subtasks one at a time
+# 6. Validate incrementally
+# 7. Complete when all subtasks done
+```
+
+---
+
+### Example 2: Architecture Analysis
+
+```bash
+opencode --agent opencoder
+> "Analyze the architecture of this codebase and suggest improvements"
+
+# OpenCoder will:
+# 1. Scan codebase structure
+# 2. Identify patterns and anti-patterns
+# 3. Propose architectural improvements
+# 4. Wait for approval
+# 5. Implement approved changes
+# 6. Validate with build and tests
+```
+
+---
+
+### Example 3: Pattern Implementation
+
+```bash
+opencode --agent opencoder
+> "Implement the repository pattern for all database access across the data layer"
+
+# OpenCoder will:
+# 1. Identify all database access points
+# 2. Design repository interface
+# 3. Propose implementation plan
+# 4. Wait for approval
+# 5. Delegate to @task-manager (multi-file)
+# 6. Implement repositories incrementally
+# 7. Update all consumers
+# 8. Add tests via @tester
+# 9. Validate complete implementation
+```
+
+---
+
+### Example 4: Complex Feature
+
+```bash
+opencode --agent opencoder
+> "Implement user authentication with JWT, refresh tokens, and role-based access control"
+
+# OpenCoder will:
+# 1. Analyze requirements (complex, multi-file)
+# 2. Design authentication architecture
+# 3. Propose implementation plan (multiple phases)
+# 4. Wait for approval
+# 5. Delegate to @task-manager (create subtasks)
+# 6. Implement Phase 1: JWT infrastructure
+# 7. Implement Phase 2: Refresh token mechanism
+# 8. Implement Phase 3: RBAC system
+# 9. Coordinate with @tester for test coverage
+# 10. Coordinate with @reviewer for security review
+# 11. Validate end-to-end
+```
+
+---
+
+## Tips for Best Results
+
+### 1. Be Specific About Scope
+**Good:** "Refactor the API layer to use dependency injection in controllers and services"
+**Bad:** "Make the code better"
+
+### 2. Provide Context
+If refactoring existing code, mention:
+- Number of files involved
+- Current patterns being used
+- Desired end state
+- Any constraints (can't change X, must maintain Y)
+
+### 3. Review Plans Carefully
+OpenCoder will show you the plan before implementation. Take time to:
+- Verify the approach makes sense
+- Check that all files are included
+- Ensure edge cases are considered
+- Request changes if needed
+
+### 4. Let OpenCoder Delegate
+Don't manually delegate to subagents. OpenCoder knows when to:
+- Use @task-manager for breakdown
+- Call @tester for tests
+- Use @reviewer for security
+- Leverage @coder-agent for simple tasks
+
+### 5. Use Test-Driven Development
+If a `tests/` directory exists, OpenCoder will:
+- Write tests first (when appropriate)
+- Validate against tests continuously
+- Ensure comprehensive coverage
+
+### 6. Trust Incremental Implementation
+OpenCoder implements **one step at a time**, not all at once. This:
+- Catches errors early
+- Allows for course correction
+- Maintains working code between steps
+- Makes debugging easier
+
+### 7. Language-Specific Conventions
+OpenCoder adapts to your language:
+- For TypeScript: Functional, type-first approach
+- For Python: Pythonic patterns, type hints
+- For Go: Idiomatic Go, interfaces
+- For Rust: Ownership, traits, Result types
+
+---
+
+## Configuration
+
+OpenCoder is configured in `.opencode/agent/opencoder.md`. Default settings:
+
+```yaml
+temperature: 0.1  # Deterministic, precise
+tools: read, edit, write, grep, glob, bash, patch
+permissions:
+  bash: Limited (ask for risky commands)
+  edit: Deny secrets, node_modules, .git
+```
+
+---
+
+## Comparison: OpenAgent vs OpenCoder
+
+| Aspect | OpenAgent | OpenCoder |
+|--------|-----------|-----------|
+| **Primary Use** | General coordinator | Development specialist |
+| **Best For** | Questions, docs, coordination | Complex coding, architecture |
+| **Coding Tasks** | Simple (1-3 files) | Complex (4+ files) |
+| **Delegation** | Delegates coding to opencoder | Delegates testing, review |
+| **Expertise** | Broad, adaptive | Deep, technical |
+| **User Profile** | Everyone (default) | Developers |
+| **Plan Detail** | High-level | Implementation-level |
+| **Validation** | Basic | Comprehensive (type, lint, build, test) |
+
+---
+
+## Summary
+
+OpenCoder is your **specialized development partner** for:
+- ✅ Complex multi-file coding tasks
+- ✅ Architecture analysis and improvements
+- ✅ Pattern implementation and refactoring
+- ✅ Deep technical implementations
+
+**Use OpenAgent** for general tasks and coordination.
+**Use OpenCoder** when you need deep development expertise.
+
+**Start here:**
+```bash
+opencode --agent opencoder
+> "Your complex coding task..."
+```
+
+---
+
+**Learn more:**
+- [OpenAgent Guide](openagent.md) - General tasks and coordination
+- [Agent System Blueprint](../features/agent-system-blueprint.md) - Architecture patterns
+- [Research-Backed Design](research-backed-prompt-design.md) - Why it works

+ 28 - 16
docs/features/agent-system-blueprint.md

@@ -63,7 +63,7 @@ This blueprint explains the architecture patterns behind the OpenCode agent syst
 **When you see commands like `/workflow`, `/plan-task`, `/create-frontend-component`:**
 - These are pattern examples showing how you COULD structure commands
 - Most aren't implemented in the repository
-- The existing `openagent` and `codebase-agent` already handle these workflows
+- The existing `openagent` and `opencoder` already handle these workflows
 - Create them only if you have specific repeated patterns
 
 **When you see extensive context hierarchies:**
@@ -176,20 +176,33 @@ OpenCode processes `@` references only in command templates, NOT recursively in
 **What they do:** AI workers with specific capabilities and predictable behavior
 
 **Main agents in this repo:**
-- `openagent` - Universal agent for questions and tasks (recommended default)
-- `codebase-agent` - Specialized development partner
-- `task-manager` - Breaks down complex features
-- `workflow-orchestrator` - Routes requests
-- `image-specialist` - Image generation
+- `openagent` - Universal coordinator for general tasks, questions, and workflows (recommended default)
+- `opencoder` - Specialized development agent for complex coding and architecture
+- `system-builder` - Meta-level generator for creating custom AI architectures
 
 **Subagents (specialized helpers):**
-- `reviewer` - Code review and security
-- `tester` - Test creation
+
+Core Coordination:
+- `task-manager` - Task breakdown and planning
+- `documentation` - Documentation authoring
+
+Code Specialists:
 - `coder-agent` - Quick implementations
-- `documentation` - Docs generation
-- `build-agent` - Type checking
+- `reviewer` - Code review and security
+- `tester` - Test creation and validation
+- `build-agent` - Type checking and validation
 - `codebase-pattern-analyst` - Pattern discovery
 
+Utilities:
+- `image-specialist` - Image generation (Gemini AI)
+
+System Builder (Meta-Level):
+- `domain-analyzer` - Domain analysis
+- `agent-generator` - Agent generation
+- `context-organizer` - Context organization
+- `workflow-designer` - Workflow design
+- `command-creator` - Command creation
+
 **Agent structure:**
 - Frontmatter with metadata (description, mode, tools, permissions)
 - Clear instructions for behavior
@@ -339,7 +352,7 @@ Create custom agents when:
 - You have unique quality requirements
 
 Don't create custom agents when:
-- `codebase-agent` already handles it
+- `openagent` or `opencoder` already handles it
 - It's a one-time task
 - You're just starting out
 
@@ -452,10 +465,9 @@ The system improves naturally as you:
 ```
 .opencode/
 ├── agent/              # AI agents
-│   ├── codebase-agent.md
-│   ├── task-manager.md
-│   ├── workflow-orchestrator.md
-│   ├── image-specialist.md
+│   ├── openagent.md
+│   ├── opencoder.md
+│   ├── system-builder.md
 │   └── subagents/      # Specialized helpers
 │       ├── reviewer.md
 │       ├── tester.md
@@ -498,6 +510,6 @@ The system improves naturally as you:
 
 _Think of this system like a professional development team: each member has a specific role, they communicate clearly, they track their work systematically, and they validate quality at every step._
 
-_The `codebase-agent` is your senior developer who can handle most tasks. Add specialists only when needed._
+_The `openagent` is your universal coordinator and `opencoder` is your senior developer. Add specialists only when needed._
 
_When you need to build custom agents, use `/prompt-enhancer` to create well-structured, complex agents with proper workflows._

+ 4 - 4
docs/features/system-builder/README.md

@@ -150,7 +150,7 @@ Integrates with existing agents
 - test-validator (subagent)
 - Commands: /review-code, /scan-security
 
-**Integration**: Leverages existing openagent, codebase-agent, reviewer, tester
+**Integration**: Leverages existing openagent, opencoder, reviewer, tester
 
 ---
 
@@ -178,7 +178,7 @@ Integrates with existing agents
 
 ### Example 3: Extend Existing Project
 
-**Existing**: Dev tools (openagent, codebase-agent, build-agent, tester)
+**Existing**: Dev tools (openagent, opencoder, build-agent, tester)
 
 **Command**: `/build-context-system "Add documentation generation"`
 
@@ -186,7 +186,7 @@ Integrates with existing agents
 1. Detects existing project
 2. User chooses: "Extend existing"
 3. Domain type: Hybrid (dev + content)
-4. Reuses: openagent, codebase-agent, documentation
+4. Reuses: openagent, opencoder, documentation
 5. Adds: doc-orchestrator, api-doc-generator
 6. Result: Unified system with dev + docs
 
@@ -238,7 +238,7 @@ The system detects and integrates with existing agents:
 
 **Development Agents**:
 - `openagent` - Universal agent for questions and tasks
-- `codebase-agent` - Code analysis, file operations
+- `opencoder` - Code analysis, file operations
 - `build-agent` - Build validation, type checking
 - `tester` - Test authoring, TDD
 - `reviewer` - Code review, quality assurance

+ 1 - 1
docs/features/system-builder/quick-start.md

@@ -158,7 +158,7 @@ When you install **advanced** profile, you get:
 7. `build-context-system` (command) - Interactive interface
 
 **Plus all development tools:**
-- openagent, task-manager, codebase-agent
+- openagent, task-manager, opencoder
 - All core subagents (reviewer, tester, etc.)
 - All development commands
 - Tools and plugins

+ 1 - 1
docs/getting-started/collision-handling.md

@@ -130,7 +130,7 @@ Use when:
 
   Agents (2):
     .opencode/agent/task-manager.md
-    .opencode/agent/codebase-agent.md
+    .opencode/agent/opencoder.md
     
   Subagents (3):
     .opencode/agent/subagents/reviewer.md

+ 1 - 1
docs/getting-started/context-aware-system/QUICK_START_SYSTEM_BUILDER.md

@@ -341,7 +341,7 @@ When you install **advanced** profile, you get:
 7. `build-context-system` (command) - Interactive interface
 
 **Plus all development tools:**
-- openagent, task-manager, codebase-agent
+- openagent, task-manager, opencoder
 - All core subagents (reviewer, tester, etc.)
 - All development commands
 - Tools and plugins

+ 1 - 1
docs/getting-started/installation.md

@@ -304,7 +304,7 @@ Components:
 **Code-focused development tools**
 
 Components:
-- Development agents: openagent, codebase-agent, task-manager
+- Development agents: openagent, opencoder, task-manager
 - Code subagents: reviewer, tester, coder-agent, build-agent
 - Development commands: test, commit, context
 - Development tools and contexts

+ 10 - 9
registry.json

@@ -11,15 +11,16 @@
   "components": {
     "agents": [
       {
-        "id": "codebase-agent",
-        "name": "Codebase Agent",
+        "id": "opencoder",
+        "name": "OpenCoder",
         "type": "agent",
-        "path": ".opencode/agent/codebase-agent.md",
-        "description": "Analyzes codebase patterns and architecture",
+        "path": ".opencode/agent/opencoder.md",
+        "description": "Specialized development agent for complex coding, architecture, and multi-file refactoring",
         "tags": [
-          "analysis",
+          "development",
           "architecture",
-          "patterns"
+          "coding",
+          "refactoring"
         ],
         "dependencies": [
           "subagent:task-manager",
@@ -674,7 +675,7 @@
       "badge": "RECOMMENDED",
       "components": [
         "agent:openagent",
-        "agent:codebase-agent",
+        "agent:opencoder",
         "subagent:task-manager",
         "subagent:documentation",
         "subagent:coder-agent",
@@ -731,7 +732,7 @@
       "description": "Everything included - all agents, subagents, tools, and plugins for maximum functionality.",
       "components": [
         "agent:openagent",
-        "agent:codebase-agent",
+        "agent:opencoder",
         "subagent:task-manager",
         "subagent:documentation",
         "subagent:coder-agent",
@@ -773,7 +774,7 @@
       "description": "Full installation plus System Builder for creating custom AI architectures. For power users and contributors.",
       "components": [
         "agent:openagent",
-        "agent:codebase-agent",
+        "agent:opencoder",
         "agent:system-builder",
         "subagent:task-manager",
         "subagent:documentation",
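A rename like `codebase-agent` → `opencoder` touches both the component definition and every profile that references it, so a stale `agent:codebase-agent` string is easy to miss. The sketch below shows one way to catch that: collect the defined agent ids and flag any profile reference that does not resolve. The helper name and the inline sample data are illustrative, not part of the repo; the `agent:<id>` reference format and field names follow the `registry.json` hunks above.

```typescript
// Minimal consistency check for registry.json-style profiles (sketch).
// Assumes components declare an "id" and profiles reference them as "agent:<id>".
type Component = { id: string };

function unresolvedAgentRefs(
  agents: Component[],
  profileComponents: string[],
): string[] {
  const agentIds = new Set(agents.map((a) => a.id));
  return profileComponents
    .filter((ref) => ref.startsWith("agent:"))
    .filter((ref) => !agentIds.has(ref.slice("agent:".length)));
}

// Hypothetical data mirroring the diff: the component was renamed,
// but one profile still points at the old id.
const agents = [{ id: "openagent" }, { id: "opencoder" }];
const profile = ["agent:openagent", "agent:codebase-agent", "subagent:task-manager"];
console.log(unresolvedAgentRefs(agents, profile)); // the stale reference surfaces here
```

Running a check like this over every profile block before committing a rename would have flagged any reference the find-and-replace missed.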