execution-balance-positive.yaml 976 B

12345678910111213141516171819202122232425262728293031323334353637383940
  1. id: execution-balance-positive-001
  2. name: Execution Balance - Read before execution
  3. description: |
  4. Tests the execution-balance evaluator.
  5. The execution-balance evaluator checks that agents read/inspect
  6. before executing write operations. This prevents blind writes.
  7. This test asks the agent to read a file - a simple read-only operation
  8. that should pass the execution-balance check (reads are always OK).
  9. category: developer
  10. agent: openagent
  11. prompts:
  12. - text: |
  13. List the contents of the evals/test_tmp/ directory and read the README.md file in it.
  14. behavior:
  15. # Read-only operations - should pass execution balance
  16. mustUseAnyOf:
  17. - [list]
  18. - [read]
  19. - [glob]
  20. minToolCalls: 1
  21. expectedViolations:
  22. # Read-only session - no execution balance issues
  23. - rule: execution-balance
  24. shouldViolate: false
  25. severity: warning
  26. approvalStrategy:
  27. type: auto-approve
  28. timeout: 60000
  29. tags:
  30. - execution-balance
  31. - read-only
  32. - positive-test