darrenhinde f773b290ce chore(evals): comprehensive cleanup, documentation, and test infrastructure improvements 4 months ago
openagent f773b290ce chore(evals): comprehensive cleanup, documentation, and test infrastructure improvements 4 months ago
opencoder 8eb4b31ef4 feat(evals): add opencoder test suite and fix expected violation handling 4 months ago
shared 0d1718e551 fix(evals): use test_tmp directory for test artifacts and add cleanup 4 months ago
AGENT_TESTING_GUIDE.md cc96acc50e feat: add 5 essential workflow tests and reorganize with agents/ structure 4 months ago