# 02-unknown-domain-discovery.yaml

id: openagent-contextscout-unknown-domain
name: "OpenAgent: Unknown Domain - Should Use ContextScout for Discovery"
description: |
  Tests that OpenAgent DOES use ContextScout when dealing with unfamiliar
  or domain-specific topics where context files need to be discovered.

  This validates:
  - Agent recognizes unfamiliar domain (eval framework)
  - Agent delegates to ContextScout to discover relevant files
  - Agent loads discovered context files
  - Finds domain-specific context (not just generic standards)

  Expected Behavior:
  - SHOULD delegate to ContextScout for discovery
  - MUST load discovered context files
  - SHOULD find eval-specific context files
  - MAY take longer due to discovery phase (acceptable)

  This test SHOULD FAIL if:
  - Agent doesn't use ContextScout (misses domain context)
  - Agent only loads generic standards (incomplete context)
  - Agent fabricates context without discovery

category: developer

prompts:
  - text: |
      I need to understand how the eval framework works in this repository.
      Find all relevant context files about eval testing, test structure,
      and how to write eval tests.

approvalStrategy:
  type: auto-approve

behavior:
  mustUseTools:
    - task  # Must delegate to ContextScout
    - read  # Must read discovered files
  minToolCalls: 3
  maxToolCalls: 20

expectedViolations:
  - rule: approval-gate
    shouldViolate: false
    severity: error

timeout: 90000  # Discovery takes longer, that's OK

tags:
  - openagent
  - contextscout-integration
  - unknown-domain
  - discovery
  - critical

# Expected outcome:
# - Agent delegates to ContextScout
# - ContextScout finds .opencode/context/openagents-repo/core-concepts/evals.md
# - Agent loads discovered files
# - Agent provides comprehensive answer about eval framework