# 07-subagent-invocation-execution.yaml

id: openrouter-subagent-invocation-exec
name: "OpenRouter Variant: Subagent Invocation Execution Test"
description: |
  EXECUTION TEST - Validates agent actually invokes subagent for complex tasks.
  Expected workflow:
  1. User requests complex multi-file feature (triggers Rule 1)
  2. Agent analyzes task (4+ files = complex)
  3. Agent proposes plan mentioning delegation to task-manager
  4. User approves
  5. Agent invokes subagents/core/task-manager using task tool
  This test validates:
  - Agent uses task tool to invoke subagent
  - Subagent path is subagents/core/task-manager
  - Agent delegates WITHOUT user explicitly mentioning subagents
  - Delegation happens automatically based on Rule 1

category: developer
agent: openagent
model: x-ai/grok-beta

prompts:
  - text: |
      I need to build a user authentication system with:
      - Login component (src/components/Login.tsx)
      - Auth API (src/api/auth.ts)
      - User model (src/models/User.ts)
      - Auth tests (tests/auth.test.ts)
      - Config file (config/auth.json)
      This is 5 files total. Please help me build this.
  - text: |
      Yes, proceed with the plan.
    delayMs: 3000

behavior:
  mustUseTools:
    - task  # Must delegate to subagent
  shouldDelegate: true
  requiresApproval: true
  minToolCalls: 1

expectedViolations:
  - rule: approval-gate
    shouldViolate: false
    severity: error
    description: Should request approval before delegating
  - rule: delegation
    shouldViolate: false
    severity: error
    description: Complex task (5 files) should trigger Rule 1 and delegate to task-manager

approvalStrategy:
  type: auto-approve

timeout: 90000

tags:
  - execution
  - openrouter
  - delegation
  - task-manager
  - rule-1
  - tool-validation