Architecture & Design Insights

This document provides architectural observations and recommendations based on the codebase review.

Current Architecture

Strengths ✅

Clean Module Separation
- Well-organized into logical domains: agents/, tools/, config/, hooks/, utils/
- Clear separation of concerns
- Minimal coupling between modules
Strong Type Safety
- Comprehensive TypeScript usage with strict mode enabled
- Zod schemas for runtime validation
- Good use of discriminated unions and type guards
Excellent Test Coverage
- 269 tests across 23 test files
- Good coverage of critical paths
- Fast test execution (3.24s)
Plugin Architecture
- Clean plugin interface via @opencode-ai/plugin
- Proper event-driven design
- Good use of hooks for cross-cutting concerns
Configuration System
- Flexible config merging (user + project)
- Support for presets
- Good defaults with override capability

Areas for Improvement 🔧

Resource Management
- No automatic cleanup of long-lived resources
- Unbounded growth in several managers (BackgroundTaskManager, LSPServerManager)
- Missing health checks and monitoring
Initialization Performance
- Synchronous file I/O blocks startup
- No lazy loading of optional features
- Eager initialization of all subsystems
Error Handling
- Inconsistent error handling patterns
- Missing error boundaries in critical sections
- Silent failures in some areas
Observability
- No structured logging levels
- Missing performance metrics
- No telemetry for production debugging

Design Patterns Used

Factory Pattern

Location: src/agents/
Usage: Agent creation with configurable models and prompts
Benefits: Encapsulates creation logic, supports customization
Rating: ⭐⭐⭐⭐⭐ Excellent use

Singleton Pattern

Location: src/tools/lsp/client.ts (LSPServerManager)
Usage: Global LSP client pool
Benefits: Shared resource management
Concern: Could benefit from lifecycle management
Rating: ⭐⭐⭐⭐ Good, needs resource limits

Observer Pattern

Location: src/background/ (event handlers)
Usage: Session status monitoring, task completion
Benefits: Decoupled event handling
Rating: ⭐⭐⭐⭐⭐ Excellent use

Strategy Pattern

Location: src/agents/ (agent variants)
Usage: Different agent configurations per use case
Benefits: Flexible agent behavior
Rating: ⭐⭐⭐⭐ Good implementation

Builder Pattern

Location: src/config/ (config merging)
Usage: Layered configuration building
Benefits: Flexible, composable configs
Rating: ⭐⭐⭐⭐⭐ Excellent use

Architecture Recommendations

1. Implement Resource Lifecycle Management

Problem: Resources created but never cleaned up

Solution: Implement a resource manager

// src/utils/resource-manager.ts
export class ResourceManager {
  private resources = new Map<string, Disposable>();
  
  register(id: string, resource: Disposable): void {
    this.resources.set(id, resource);
  }
  
  async dispose(id?: string): Promise<void> {
    if (id) {
      const resource = this.resources.get(id);
      if (resource) {
        await resource.dispose();
        this.resources.delete(id);
      }
    } else {
      // Dispose all
      for (const [id, resource] of this.resources) {
        await resource.dispose();
        this.resources.delete(id);
      }
    }
  }
}

interface Disposable {
  dispose(): Promise<void> | void;
}

Usage:

// In src/index.ts
const resourceManager = new ResourceManager();

resourceManager.register('lsp', lspManager);
resourceManager.register('background', backgroundManager);

// On plugin shutdown
await resourceManager.dispose();

2. Add Circuit Breaker for External Services

Problem: No protection against cascading failures (LSP, MCP)

Solution: Implement circuit breaker pattern

// src/utils/circuit-breaker.ts
export class CircuitBreaker {
  private failures = 0;
  private lastFailureTime = 0;
  private state: 'closed' | 'open' | 'half-open' = 'closed';
  
  constructor(
    private readonly threshold = 5,
    private readonly timeout = 60000, // 1 minute
  ) {}
  
  async execute<T>(fn: () => Promise<T>): Promise<T> {
    if (this.state === 'open') {
      if (Date.now() - this.lastFailureTime > this.timeout) {
        this.state = 'half-open';
      } else {
        throw new Error('Circuit breaker is open');
      }
    }
    
    try {
      const result = await fn();
      this.onSuccess();
      return result;
    } catch (error) {
      this.onFailure();
      throw error;
    }
  }
  
  private onSuccess(): void {
    this.failures = 0;
    this.state = 'closed';
  }
  
  private onFailure(): void {
    this.failures++;
    this.lastFailureTime = Date.now();
    
    if (this.failures >= this.threshold) {
      this.state = 'open';
    }
  }
}

Usage:

// Wrap LSP calls
const lspBreaker = new CircuitBreaker();

async function safeLspCall<T>(fn: () => Promise<T>): Promise<T> {
  return lspBreaker.execute(fn);
}

3. Add Retry Logic with Backoff

Problem: No retry logic for transient failures

Solution: Exponential backoff utility

// src/utils/retry.ts
export async function retry<T>(
  fn: () => Promise<T>,
  options: {
    maxAttempts?: number;
    initialDelay?: number;
    maxDelay?: number;
    backoffFactor?: number;
  } = {},
): Promise<T> {
  const {
    maxAttempts = 3,
    initialDelay = 100,
    maxDelay = 5000,
    backoffFactor = 2,
  } = options;
  
  let lastError: Error | undefined;
  let delay = initialDelay;
  
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await fn();
    } catch (error) {
      lastError = error instanceof Error ? error : new Error(String(error));
      
      if (attempt < maxAttempts) {
        await new Promise(resolve => setTimeout(resolve, delay));
        delay = Math.min(delay * backoffFactor, maxDelay);
      }
    }
  }
  
  throw lastError;
}

4. Implement Event Bus for Cross-Module Communication

Problem: Tight coupling through direct imports

Solution: Event bus for decoupled communication

// src/utils/event-bus.ts
type EventHandler<T = unknown> = (data: T) => void | Promise<void>;

export class EventBus {
  private handlers = new Map<string, EventHandler[]>();
  
  on<T>(event: string, handler: EventHandler<T>): () => void {
    if (!this.handlers.has(event)) {
      this.handlers.set(event, []);
    }
    
    this.handlers.get(event)!.push(handler as EventHandler);
    
    // Return unsubscribe function
    return () => this.off(event, handler);
  }
  
  off<T>(event: string, handler: EventHandler<T>): void {
    const handlers = this.handlers.get(event);
    if (handlers) {
      const index = handlers.indexOf(handler as EventHandler);
      if (index > -1) {
        handlers.splice(index, 1);
      }
    }
  }
  
  async emit<T>(event: string, data: T): Promise<void> {
    const handlers = this.handlers.get(event);
    if (handlers) {
      await Promise.all(handlers.map(h => h(data)));
    }
  }
}

Usage:

// Global event bus
const eventBus = new EventBus();

// In background manager
eventBus.emit('task:completed', { taskId: task.id, result });

// In hooks
eventBus.on('task:completed', async ({ taskId, result }) => {
  // React to task completion
});

5. Add Health Check System

Problem: No way to monitor plugin health

Solution: Health check aggregator

// src/utils/health.ts
export interface HealthCheck {
  name: string;
  check(): Promise<HealthStatus>;
}

export interface HealthStatus {
  healthy: boolean;
  message?: string;
  details?: Record<string, unknown>;
}

export class HealthMonitor {
  private checks = new Map<string, HealthCheck>();
  
  register(check: HealthCheck): void {
    this.checks.set(check.name, check);
  }
  
  async checkAll(): Promise<Record<string, HealthStatus>> {
    const results: Record<string, HealthStatus> = {};
    
    for (const [name, check] of this.checks) {
      try {
        results[name] = await check.check();
      } catch (error) {
        results[name] = {
          healthy: false,
          message: error instanceof Error ? error.message : String(error),
        };
      }
    }
    
    return results;
  }
  
  async isHealthy(): Promise<boolean> {
    const results = await this.checkAll();
    return Object.values(results).every(r => r.healthy);
  }
}

Usage:

// Register health checks
healthMonitor.register({
  name: 'lsp',
  async check() {
    const clientCount = lspManager.getClientCount();
    return {
      healthy: clientCount < 10,
      details: { clientCount },
    };
  },
});

healthMonitor.register({
  name: 'background-tasks',
  async check() {
    const activeCount = backgroundManager.getActiveCount();
    return {
      healthy: activeCount < 50,
      details: { activeCount },
    };
  },
});

Performance Architecture Patterns

1. Lazy Initialization

Current: Eager initialization of all subsystems Recommended: Lazy initialization for optional features

class LazyService<T> {
  private instance?: T;
  
  constructor(private factory: () => T) {}
  
  get(): T {
    if (!this.instance) {
      this.instance = this.factory();
    }
    return this.instance;
  }
}

// Usage
const lspTools = new LazyService(() => import('./tools/lsp'));

2. Object Pool Pattern

Use Case: Reusable objects (LSP clients, background tasks)

class ObjectPool<T> {
  private available: T[] = [];
  private inUse = new Set<T>();
  
  constructor(
    private factory: () => T,
    private maxSize: number,
  ) {}
  
  acquire(): T {
    let obj = this.available.pop();
    
    if (!obj && this.inUse.size < this.maxSize) {
      obj = this.factory();
    }
    
    if (obj) {
      this.inUse.add(obj);
    }
    
    return obj!;
  }
  
  release(obj: T): void {
    this.inUse.delete(obj);
    this.available.push(obj);
  }
}

3. Cache with TTL

Use Case: Config files, agent prompts

class TTLCache<K, V> {
  private cache = new Map<K, { value: V; expiry: number }>();
  
  constructor(private ttl: number) {}
  
  set(key: K, value: V): void {
    this.cache.set(key, {
      value,
      expiry: Date.now() + this.ttl,
    });
  }
  
  get(key: K): V | undefined {
    const entry = this.cache.get(key);
    
    if (!entry) return undefined;
    
    if (Date.now() > entry.expiry) {
      this.cache.delete(key);
      return undefined;
    }
    
    return entry.value;
  }
  
  clear(): void {
    this.cache.clear();
  }
}

Monitoring & Observability

Recommended Metrics

Plugin Metrics
- Initialization time
- Memory usage
- Active sessions count
Agent Metrics
- Delegations per agent
- Response times
- Error rates
Tool Metrics
- Tool execution times
- Tool usage frequency
- Tool error rates
Background Task Metrics
- Queue depth
- Task completion times
- Task success/failure rates
LSP Metrics
- Active client count
- Request latencies
- Cache hit rates

Implementation

// src/utils/metrics.ts
export class MetricsCollector {
  private counters = new Map<string, number>();
  private gauges = new Map<string, number>();
  private histograms = new Map<string, number[]>();
  
  increment(name: string, value = 1): void {
    this.counters.set(name, (this.counters.get(name) ?? 0) + value);
  }
  
  gauge(name: string, value: number): void {
    this.gauges.set(name, value);
  }
  
  histogram(name: string, value: number): void {
    if (!this.histograms.has(name)) {
      this.histograms.set(name, []);
    }
    this.histograms.get(name)!.push(value);
  }
  
  getStats(): Record<string, unknown> {
    return {
      counters: Object.fromEntries(this.counters),
      gauges: Object.fromEntries(this.gauges),
      histograms: Object.fromEntries(
        Array.from(this.histograms.entries()).map(([name, values]) => [
          name,
          {
            count: values.length,
            avg: values.reduce((a, b) => a + b, 0) / values.length,
            min: Math.min(...values),
            max: Math.max(...values),
          },
        ]),
      ),
    };
  }
}

// Global metrics instance
export const metrics = new MetricsCollector();

Testing Recommendations

1. Add Performance Tests

// src/__tests__/performance.test.ts
import { describe, expect, test } from 'bun:test';

describe('Performance', () => {
  test('plugin initialization should complete within 100ms', async () => {
    const start = performance.now();
    await initPlugin();
    const duration = performance.now() - start;
    
    expect(duration).toBeLessThan(100);
  });
  
  test('background task launch should complete within 50ms', async () => {
    const start = performance.now();
    backgroundManager.launch({ /* ... */ });
    const duration = performance.now() - start;
    
    expect(duration).toBeLessThan(50);
  });
});

2. Add Integration Tests

// src/__tests__/integration.test.ts
test('end-to-end task flow', async () => {
  const task = backgroundManager.launch({
    agent: 'explorer',
    prompt: 'test',
    description: 'test task',
    parentSessionId: 'test',
  });
  
  expect(task.status).toBe('pending');
  
  const completed = await backgroundManager.waitForCompletion(task.id, 30000);
  
  expect(completed?.status).toBeOneOf(['completed', 'failed']);
});

3. Add Load Tests

test('handle 100 concurrent background tasks', async () => {
  const tasks = Array.from({ length: 100 }, (_, i) =>
    backgroundManager.launch({
      agent: 'explorer',
      prompt: `task ${i}`,
      description: `test task ${i}`,
      parentSessionId: 'test',
    })
  );
  
  expect(tasks).toHaveLength(100);
  expect(backgroundManager.getActiveCount()).toBeLessThanOrEqual(10);
});

Security Considerations

1. Input Validation

✅ Using Zod for runtime validation
⚠️ Should validate file paths to prevent directory traversal
⚠️ Should sanitize MCP names more thoroughly

2. Resource Limits

⚠️ No memory limits enforced
⚠️ No CPU limits
⚠️ No rate limiting on tool calls

3. Privilege Separation

✅ Good use of permission system
✅ Agent-specific tool access control
✅ MCP access control per agent

Scalability Considerations

Current Limits

No limit on concurrent background tasks (only start queue)
No limit on LSP clients (5-minute idle timeout only)
No limit on task history

Recommended Limits

const LIMITS = {
  MAX_BACKGROUND_TASKS: 100,
  MAX_LSP_CLIENTS: 10,
  MAX_TASK_HISTORY: 100,
  MAX_SESSION_MEMORY: 1000, // messages
  MAX_LOG_SIZE: 10_000_000, // 10MB
};

Summary

The codebase has a solid architectural foundation with good separation of concerns and strong type safety. The main areas for improvement are:

Resource Management: Add lifecycle management and cleanup
Performance: Lazy loading, caching, and async I/O
Observability: Structured logging, metrics, and health checks
Resilience: Circuit breakers, retries, and error boundaries
Scalability: Resource limits and pool management

Implementing these patterns will make the plugin more robust, performant, and production-ready.

ARCHITECTURE_NOTES.md 15 KB History Raw

Architecture & Design Insights

Current Architecture

Strengths ✅

Areas for Improvement 🔧

Design Patterns Used

Factory Pattern

Singleton Pattern

Observer Pattern

Strategy Pattern

Builder Pattern

Architecture Recommendations

1. Implement Resource Lifecycle Management

2. Add Circuit Breaker for External Services

3. Add Retry Logic with Backoff

4. Implement Event Bus for Cross-Module Communication

5. Add Health Check System

Performance Architecture Patterns

1. Lazy Initialization

2. Object Pool Pattern

3. Cache with TTL

Monitoring & Observability

Recommended Metrics

Implementation

Testing Recommendations

1. Add Performance Tests

2. Add Integration Tests

3. Add Load Tests

Security Considerations

1. Input Validation

2. Resource Limits

3. Privilege Separation

Scalability Considerations

Current Limits

Recommended Limits

Summary

ARCHITECTURE_NOTES.md 15 KB

History Raw