# 06-context-loading-execution.yaml

id: openrouter-context-loading-exec
name: "OpenRouter Variant: Context Loading Execution Test"
description: |
  EXECUTION TEST - Validates agent actually loads context files before writing code.
  Expected workflow:
  1. User requests code to be written
  2. Agent proposes plan (approval gate)
  3. User approves
  4. Agent loads .opencode/context/core/standards/code.md (using read tool)
  5. Agent writes the file
  This test validates:
  - Agent uses read tool to load context file
  - Context file path matches .opencode/context/core/standards/code.md
  - Context is loaded BEFORE write tool is used
category: developer
agent: openagent
model: x-ai/grok-beta
prompts:
  - text: |
      Create a simple utility function in evals/test_tmp/openrouter-utils.js
      that exports a function called greet(name) which returns "Hello, {name}!".
      Keep it simple - just the function.
  - text: |
      Yes, proceed with the plan.
    delayMs: 3000
behavior:
  mustUseTools:
    - read  # Must load context
    - write  # Must write file
  requiresApproval: true
  minToolCalls: 2
expectedViolations:
  - rule: approval-gate
    shouldViolate: false
    severity: error
    description: Should request approval before writing
  - rule: context-loading
    shouldViolate: false
    severity: error
    description: Must load .opencode/context/core/standards/code.md before writing code
approvalStrategy:
  type: auto-approve
timeout: 90000
tags:
  - execution
  - openrouter
  - context-loading
  - code-standards
  - tool-validation