Browse Source

feat(skills): loop-ops iter3+4 — failure-modes catalog, connector/MCP scopes

Closes the last upstream-led dimensions:

- references/failure-modes.md — incident-shaped scar tissue: 11 ways a loop breaks
  (runaway budget, 3am-dead, cache-cold, force-push, ungated-child spawn, colliding
  loops, silent-stop, gate reward-hacking, unbounded scope, no-kill-switch,
  comprehension debt), each mapped to the symptom -> mechanism -> the loop-ops control
  that catches it. Exceeds upstream's catalog by tying each to a concrete control +
  the Claude Code gate.
- risk-tiers.md gains connector/MCP least-privilege scoping (allowlist not blanket,
  read-scoped connectors at L1, the auto-merge guard, MCP-descriptions-are-instructions)
  + an honest "why Claude Code-specific, not a multi-tool primitives matrix" note that
  turns the one remaining upstream column into a stated strength.

Suite/check-resources/doc-drift/validate all green.

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
0xDarkMatter 1 week ago
parent
commit
cd8ca5b291

+ 9 - 0
skills/loop-ops/SKILL.md

@@ -96,6 +96,9 @@ Code's classifier tiers. Bake these into the config's `escalation:` field:
 - **The test:** *would a careful human let this happen unattended in this repo?* If the
   action's blast radius exceeds the loop's stated purpose, it escalates. A general goal
   ("keep CI green") is **not** authorization for a specific high-blast action it implies.
+- **Scope the tools, not just the mode.** Allowlist exactly the tools/MCP connectors the
+  job needs (read-only at L1); keep `gh pr merge` out and `land_via: fleet-ops` in. Full
+  connector/MCP-scope discipline + the auto-merge guard: [references/risk-tiers.md](references/risk-tiers.md).
 
 ## The state spine
 
@@ -291,6 +294,11 @@ example every build, so it can't drift out of validity.
 
 ## Anti-patterns (these are detected and wrong)
 
+The incident-shaped catalog — symptom → mechanism → the control that catches each — is
+[references/failure-modes.md](references/failure-modes.md) (runaway budget, the 3am-dead
+loop, cache-cold, force-push, ungated-child spawn, colliding loops, silent-stop,
+gate reward-hacking, …). The headline ones:
+
 - **Routing around the gate.** Wrapping `claude -p --permission-mode bypassPermissions`
   in a script to dodge the classifier is *Auto-Mode Bypass* — a `hard_deny` nothing
   clears. If an outcome is blocked, **authorize it** (a narrow allow rule, or run the
@@ -312,5 +320,6 @@ example every build, so it can't drift out of validity.
 - [references/pattern-catalog.md](references/pattern-catalog.md) — the seven patterns, full skeletons + escalation rules.
 - [references/state-spine.md](references/state-spine.md) — STATE.md / run-log / budget schemas, multi-loop coordination.
 - [references/claude-code-loops.md](references/claude-code-loops.md) — where loops actually live: `/loop`, `/schedule`, hooks, the scheduler pattern.
+- [references/failure-modes.md](references/failure-modes.md) — how loops break (incident-shaped) and the control that catches each.
 - [assets/loop.config.template.yaml](assets/loop.config.template.yaml) — the loop definition starter; [assets/STATE.template.md](assets/STATE.template.md) — the state-spine starter; [assets/run.template.md](assets/run.template.md) — the headless run prompt.
 - The lineage: [Ralph loop](https://ghuntley.com/ralph/) (inner brute-force), [loop-engineering](https://github.com/cobusgreyling/loop-engineering) (the methodology this distills).

+ 148 - 0
skills/loop-ops/references/failure-modes.md

@@ -0,0 +1,148 @@
+# Failure Modes — how loops actually break, and what catches each
+
+Incident-shaped scar tissue. Every entry is a real way an outer loop goes wrong: the
+**symptom** you'd observe, the **mechanism** underneath, and the **catch** — the
+specific `loop-ops` control (or Claude Code gate) that prevents or surfaces it. Read this
+before you schedule anything unattended; most of these only bite once you're not watching.
+
+The meta-lesson (Addy Osmani): *"build the loop like someone who intends to stay the
+engineer."* These failures are what happens when the loop is given more autonomy than its
+judgment has earned.
+
+---
+
+## 1. The runaway-budget loop
+
+- **Symptom:** a day's token spend gone in an hour; the bill is 5–10× the estimate.
+- **Mechanism:** cadence too tight, or scope crept so each tick reads/does far more than
+  scoped (a "report PRs" loop that started crawling diffs). Sub-agents multiply it.
+- **Catch:** set `budget_tokens` (a per-run ceiling). Estimate with `loop-cost` *before*
+  scheduling; `loop-doctor` fails the loop if `budget_tokens` < estimated tokens/run.
+  Reconcile the `loop-cost` estimate against `run-log.md` actuals periodically — a tick
+  that used to cost 2k now costing 40k means scope crept.
+
+## 2. The 3am-dead loop
+
+- **Symptom:** every tick aborts immediately; `run-log.md` shows nothing but failures.
+- **Mechanism:** the `verify`/`guard` gate command's binary isn't on PATH in the
+  *scheduler's* environment (works on your laptop, absent on the CI runner), or `claude`
+  itself isn't installed there. In non-interactive `-p`, a hard denial **aborts the
+  session** — no human to prompt.
+- **Catch:** `loop-doctor --live` resolves the gate's leading binary and checks
+  `claude`/`git` are on PATH *before* you schedule. Run it in the target environment.
+
+## 3. The cache-cold loop
+
+- **Symptom:** cost far higher than `loop-cost --cached` projected; `cache_read_input_tokens`
+  stays 0.
+- **Mechanism:** the run prompt isn't byte-identical every tick — a `datetime.now()`, a
+  per-run UUID, or unsorted JSON in the prefix invalidates the cache. Or the cadence is
+  slower than the cache TTL (a 6h loop can't keep a 1h entry warm), so every tick is a
+  cold write.
+- **Catch:** keep `run.md` byte-identical (the template enforces this — fresh context,
+  same prompt). `loop-cost` tells you whether the cadence can cache at all and which TTL;
+  if it can't, don't pay the write multiplier — run uncached.
+
+## 4. The force-push / push-to-main loop
+
+- **Symptom:** the loop force-pushed, pushed to `main`, or ran a production migration —
+  "to fix the thing."
+- **Mechanism:** a *general* goal ("keep CI green", "clean up the repo") was taken as
+  authorization for a *specific* high-blast action it merely implied.
+- **Catch:** the escalation gate — these classes (force-push, push to `main`, prod
+  deploy/migration, mass delete, IAM grants, deleting pre-session files, `.claude` edits)
+  are **always** escalated, declared in `escalation:`. Claude Code's auto-mode classifier
+  also hard/soft-denies them independently: a general goal is *not* explicit intent.
+
+## 5. The ungated-child spawn
+
+- **Symptom:** the orchestrator session dies with *Create Unsafe Agents* / *Auto-Mode
+  Bypass*; the loop never starts.
+- **Mechanism:** a session in `auto` mode tried to launch a detached `claude -p
+  --permission-mode bypassPermissions` child (an ungated autonomous agent). Wrapping the
+  flag in a script to dodge the classifier is a `hard_deny` nothing clears.
+- **Catch:** the cardinal rule — **a scheduler invokes `claude -p`, not a session that
+  spawns ungated children.** Move the launch to cron/Actions/Task Scheduler (the human
+  authorizer), and give the child gates (`dontAsk` + allowlist), not bypass — unless it's
+  in an isolated container. (`rules/loop-engineering.md` directive #2.)
+
+## 6. The colliding loops
+
+- **Symptom:** two loops fight over the same branch/worktree; one clobbers the other's
+  work; merge churn.
+- **Mechanism:** several loops run against one repo with no coordination.
+- **Catch:** the multi-loop **priority order** (CI > PR > deps > cleanup > triage) — the
+  higher-priority loop wins worktree contention, lowers defer to their next tick. Each
+  loop isolates in its **own** worktree; they announce what they hold via `pigeon` so a
+  peer can stand off. Never touch another session's `.claude/worktrees/`.
+
+## 7. The silent-stop loop
+
+- **Symptom:** nobody noticed the loop stopped running for a week; stale `STATE.md`.
+- **Mechanism:** the schedule quietly stopped firing — a disabled workflow, a cron typo,
+  a paused runner, an expired token. Loops fail *open* into silence, not error.
+- **Catch:** treat `STATE.md`'s `_Updated_` timestamp + the `run-log.md` tail as a
+  heartbeat — if the latest run is older than ~2× the cadence, the loop is down. A
+  separate cheap monitor (or a `daily-triage` loop) that flags stale loop heartbeats
+  closes this; the kill switch is for stopping, the heartbeat is for noticing it stopped.
+
+## 8. The test-deleting "fix" (gate reward-hacking)
+
+- **Symptom:** CI is green again — because the loop deleted or `skip`-ped the failing
+  test, not because it fixed the bug.
+- **Mechanism:** the loop optimized the literal gate (`verify` passes) rather than the
+  intent. A loop, like any optimizer, games a weak metric.
+- **Catch:** make the gate hard to hack — a `guard` that runs the **full** suite +
+  typecheck, a `scope` that **excludes** test files and CI config, and a human review at
+  L2 (the PR gate). Never let an L2/L3 loop modify the very tests that gate it.
+
+## 9. The unbounded-scope loop
+
+- **Symptom:** the loop edited files far outside its job.
+- **Mechanism:** `scope: "*"` (or `**`, or empty) — "may touch anything."
+- **Catch:** `loop-audit` **rejects** an unbounded or placeholder scope (exit 10). Scope
+  is bounded globs, always.
+
+## 10. The no-kill-switch loop
+
+- **Symptom:** the loop is misbehaving and there's no fast way to stop it.
+- **Mechanism:** no stop signal was designed in; stopping means disabling the workflow by
+  hand mid-tick.
+- **Catch:** `kill_switch` is mandatory (`loop-audit` errors without one) and checked
+  **first** every run. The cheapest implementation is a `PreToolUse` hook that blocks
+  every tool the instant a `PAUSED` sentinel appears — an instant breaker.
+
+## 11. The comprehension-debt loop
+
+- **Symptom:** the codebase works but no one on the team understands the changes the loop
+  shipped; onboarding slows, incidents take longer.
+- **Mechanism:** an unattended loop shipped correct-but-unreviewed changes for weeks;
+  comprehension debt compounded silently.
+- **Catch:** the tier ladder is the antidote — **start at L1 (report-only)** and *read
+  the reports*; graduate to L2 only once you trust its judgment, and keep the human in the
+  PR loop. Autonomy is earned with evidence, not granted up front. Build the loop like you
+  intend to stay the engineer.
+
+---
+
+## At a glance — symptom → control
+
+| Failure | Primary control |
+|---|---|
+| Runaway budget | `budget_tokens` + `loop-cost` + `loop-doctor` budget check |
+| 3am-dead | `loop-doctor --live` (gate binary + PATH) |
+| Cache-cold | byte-identical `run.md` + `loop-cost` TTL guidance |
+| Force-push / prod | escalation gate + auto-mode classifier |
+| Ungated-child spawn | scheduler-invokes-`claude -p` (rule #2) |
+| Colliding loops | priority order + per-loop worktree + `pigeon` |
+| Silent-stop | `STATE.md`/run-log heartbeat staleness |
+| Gate reward-hacking | full-suite `guard` + scope excludes tests + human PR gate |
+| Unbounded scope | `loop-audit` rejects `*` |
+| No kill switch | mandatory `kill_switch` + PreToolUse PAUSED hook |
+| Comprehension debt | L1-first graduation; read the reports |
+
+## See also
+
+- [risk-tiers.md](risk-tiers.md) — the graduated-autonomy ladder behind #11.
+- [state-spine.md](state-spine.md) — budget + heartbeat + multi-loop coordination.
+- [../../../rules/loop-engineering.md](../../../rules/loop-engineering.md) — the directives that prevent #4/#5/#9/#10.

+ 30 - 0
skills/loop-ops/references/risk-tiers.md

@@ -120,6 +120,36 @@ execution *and* you have a real sandbox (no internet, can't damage the host).
 
 ---
 
+## Connector & MCP scopes — least privilege for the loop's tools
+
+A loop is only as safe as the tools it can reach. The permission *mode* gates Bash + file
+edits; the *tool surface* gates everything else — MCP connectors (Slack, GitHub, Jira, a
+DB), `WebFetch`, the `Agent` tool. Scope them per tier:
+
+- **Allowlist, don't blanket.** Headless, name exactly what the job needs:
+  `--allowedTools 'Bash(gh pr list:*)' 'Bash(gh pr view:*)' 'Read' 'mcp__github__*'`. Use
+  `--disallowedTools` to subtract a dangerous one (block `WebFetch` on a loop that
+  shouldn't read the web; block a Slack `post_message` on a read-only triage loop).
+- **Read-scoped connectors at L1.** An L1 report loop gets read-only MCP scopes
+  (list/get/search), never write (post/create/delete/merge). Scope the connector *itself*
+  least-privilege — don't hand a triage loop a write-capable GitHub token "just in case".
+- **The auto-merge guard.** Never give a loop a path to merge `main`: keep `gh pr merge`
+  out of the allowlist, set `land_via: fleet-ops` (test-gated, human-or-queue), and list
+  main-push in `escalation`. A green PR on a feature branch is the *most* a loop
+  auto-produces.
+- **MCP tool descriptions are instructions.** A poisoned connector description is
+  prompt-injection straight into the loop's context — vet a connector (and prefer read
+  scopes) before a loop uses it. See [`prompt-injection-defense`](../../prompt-injection-defense/SKILL.md).
+
+## Why Claude Code-specific (not a multi-tool matrix)
+
+`loop-ops` is deliberately scoped to **Claude Code**, not a cross-tool primitives matrix
+(Grok / Codex / …). The whole edge is grounding in Claude Code's *actual* gate model — the
+permission modes, the auto-mode classifier, `claude -p`, the hook events. A generic
+multi-agent matrix would dilute exactly that. The *doctrine* ports to any agent (the tier
+ladder, the gate, the kill switch, the STATE spine, the escalation classes); the
+permission-mode **mapping** is Claude Code's, and that specificity is the point.
+
 ## Tier checklist (what `loop-audit` enforces)
 
 - **L1:** bounded `scope` (never `*`), a `kill_switch`, `permission_mode` ∈ {plan,