1 day ago · 328fd6ffcb
--- a/CHANGELOG.md
+++ b/CHANGELOG.md
@@ -28,6 +28,15 @@ feature releases live in the README "Recent Updates" section.
 
				 - **`docs/AUTO-MODE-CLASSIFIER.md`** - reference on Claude Code's auto-mode permission
			
 
				   classifier (the two-gate model, gating categories, legitimate-authorization decision tree),
			
 
				   cited by `loop-ops` as the authority for its risk-tier mapping.
			
 
				+- **loop-ops hardening (world-class pass)**: `loop-doctor.sh` - a live preflight
			
 
				+  (`--offline`/`--live`) that proves a loop will *run* (gate binary on PATH, budget fits a
			
 
				+  tick, permission mode achievable, L3 isolation present), complementing loop-audit's
			
 
				+  *well-formed* check; `loop-cost.py` is now **caching-aware** - it models the static
			
 
				+  run-prompt prefix as a cache entry and the TTL-vs-cadence rule (a loop slower than ~1h
			
 
				+  can't cache), the next-biggest cost lever after cadence; and a companion **`rules/loop-engineering.md`**
			
 
				+  carries the graduated-autonomy directive (L1→L2→L3, scheduler-not-session, escalation
			
 
				+  gate, kill switch + budget) into every session, not just when the skill is invoked.
			
 
				+  Suite now 81 assertions.
			
 
				 
			
 
				 ## [3.2.0] - 2026-06-22
			
 
				 
			
--- a/README.md
+++ b/README.md
@@ -18,7 +18,7 @@ Built on the [Agent Skills specification](https://agentskills.io/specification)
 
				 
			
 
				 From Python async patterns to Rust ownership models, from AWS Fargate deployments to Craft CMS development - claude-mods provides the specialized knowledge and tools that transform Claude from a general-purpose assistant into a domain expert who understands your stack, remembers your workflow, and ships production code.
			
 
				 
			
 
				-**3 agents. 95 skills. 13 styles. 11 hooks. 7 rules. One install.**
			
 
				+**3 agents. 95 skills. 13 styles. 11 hooks. 8 rules. One install.**
			
 
				 
			
 
				 ## Recent Updates
			
 
				 
			
@@ -395,6 +395,7 @@ See [skill-creator](skills/skill-creator/) for the complete guide.
 
				 | [skill-agent-updates.md](rules/skill-agent-updates.md) | Mandatory docs check before creating/updating skills or agents |
			
 
				 | [supply-chain.md](rules/supply-chain.md) | Behavioural-first dependency hygiene - scan before adding, day-zero cooldown, OIDC audit, persistence-hook awareness |
			
 
				 | [worktree-boundaries.md](rules/worktree-boundaries.md) | Never touch other sessions' worktrees - no rm -rf, no git add -A sweeping gitlinks |
			
 
				+| [loop-engineering.md](rules/loop-engineering.md) | Graduated-autonomy discipline for scheduled/autonomous agent loops - L1→L2→L3, scheduler-not-session, escalation gate, kill switch + budget; companion to loop-ops |
			
 
				 
			
 
				 ### Tools & Hooks
			
 
				 
			
--- a/docs/PLAN.md
+++ b/docs/PLAN.md
@@ -18,7 +18,7 @@
 
				 | Agents | 3 | Pure context-isolation/worker roles only: git-agent (background commits/PRs), firecrawl-expert (noisy scrapes), project-organizer (bulk restructure) |
			
 
				 | Skills | 95 | Operational skills, CLI tools, workflows, diagnostics, security |
			
 
				 | Commands | 2 | Session management (sync, save) |
			
 
				-| Rules | 7 | cli-tools, commit-style, naming-conventions, prompt-injection, skill-agent-updates, supply-chain, worktree-boundaries |
			
 
				+| Rules | 8 | cli-tools, commit-style, naming-conventions, prompt-injection, skill-agent-updates, supply-chain, worktree-boundaries, loop-engineering |
			
 
				 | Output Styles | 13 | Vesper, Spartan, Mentor, Executive, Pair, Atlas, Coach, Harbour, Meridian, Noir, Roast, Sage, Scout |
			
 
				 | Hooks | 11 | lint, format, safety, uv, install-scan, manifest-scan, pmail, unicode-scan ×2, config-change guard, worktree guard |
			
 
				 
			
--- a/rules/loop-engineering.md
+++ b/rules/loop-engineering.md
@@ -0,0 +1,83 @@
 
				+# Loop Engineering — graduated-autonomy discipline for agent loops
			
 
				+
			
 
				+Companion to the [`loop-ops`](../skills/loop-ops/SKILL.md) skill (the full playbook +
			
 
				+`loop-init`/`loop-audit`/`loop-doctor`/`loop-cost` scripts). This file is the *directive*
			
 
				+— what to do every time you design or run a **recurring / scheduled / autonomous** agent
			
 
				+loop, in any project: a `/loop`, a `/schedule` routine, a cron `claude -p`, an `iterate`
			
 
				+run, a `fleet-worker` fan-out.
			
 
				+
			
 
				+## The rule
			
 
				+
			
 
				+**A loop is a recurring process you grant standing authority to. Grant it the *least*
			
 
				+authority that does the job, earn each increase with evidence, and never let it act on a
			
 
				+blast radius bigger than its stated purpose.** Three non-negotiables:
			
 
				+
			
 
				+1. **Graduated autonomy — never start unattended.** `L1 report → L2 assisted → L3
			
 
				+   unattended`. A fresh loop runs read-only (L1) until its reports prove its judgment;
			
 
				+   only then does it earn write access (L2, human-gated merge), and only then autonomous
			
 
				+   landing (L3, inside an isolation boundary). Starting at L3 is how incidents and
			
 
				+   comprehension debt compound.
			
 
				+2. **A scheduler invokes `claude -p` — a session does not spawn ungated children.** The
			
 
				+   authorizer of an unattended loop is a human-configured cron / Task Scheduler / CI
			
 
				+   runner, *outside* any auto-mode session. An `auto`-mode session that launches a
			
 
				+   `--permission-mode bypassPermissions` child is hard-denied as *Create Unsafe Agents* —
			
 
				+   by design. Give the headless child *gates* (`dontAsk` + a narrow allowlist), not bypass
			
 
				+   — unless it runs in an isolated container.
			
 
				+3. **No gate, no kill switch, no budget → no loop.** Every loop has a `verify` gate (the
			
 
				+   check that decides land-vs-discard), a kill switch every run checks first, and a
			
 
				+   per-run token budget. A loop missing any of these doesn't get scheduled.
			
 
				+
			
 
				+## Why this matters
			
 
				+
			
 
				+Unattended loops amplify both good judgment and mistakes, and they do it on a schedule
			
 
				+while you're not watching. The failure modes are not hypothetical: a loop that force-pushes,
			
 
				+that burns a day's budget in an hour, that "fixes" CI by deleting the failing test, that
			
 
				+collides with another loop's worktree, or that silently stops triggering. The controls
			
 
				+above are what make a loop's authority *recoverable*: a kill switch stops it, a budget
			
 
				+bounds it, a gate keeps bad changes out, and the tier ladder means you only ever granted
			
 
				+the authority you'd already seen it use well.
			
 
				+
			
 
				+## Directives — apply whenever a loop is involved
			
 
				+
			
 
				+| Situation | Directive |
			
 
				+|---|---|
			
 
				+| Designing any scheduled/autonomous loop | Start at **L1 (read-only)**. Scaffold with `loop-init`; fill a bounded `scope` (never `*`), a `verify` gate, an `escalation` rule, a `kill_switch`, a `budget_tokens`. |
			
 
				+| Before scheduling a loop | Run **`loop-audit`** (config sane?) **then `loop-doctor --live`** (will it actually run — gate binary on PATH, budget fits a tick, permission mode achievable?). Don't schedule a loop that fails either. |
			
 
				+| Choosing the permission mode | Default to **`dontAsk` + a narrow allowlist** (runs anywhere, fully gated). Reserve `bypassPermissions` for an **isolated container** (the enumerate-vs-isolate fork). Never `default` (interactive) for a headless loop. |
			
 
				+| Wiring the cadence | A **scheduler** runs `claude -p` (the authorizer). Do **not** run an orchestrator session in auto mode whose job is spawning the loop. |
			
 
				+| Setting the cadence + cost | Cadence is the biggest cost lever; **caching** is the next (a loop re-sends the same prompt — cache the static prefix, and note a loop slower than ~1h can't cache). Estimate with `loop-cost` before committing. |
			
 
				+| Running several loops | Give them a **priority order** (CI > PR > deps > cleanup > triage) and a **shared kill switch**; coordinate via `pigeon` so they don't collide on a worktree. |
			
 
				+| Anything high-blast-radius | **Escalate, don't act** (see below). A general goal is *not* authorization for a specific destructive action it implies. |
			
 
				+
			
 
				+## The escalation gate — never auto-land
			
 
				+
			
 
				+Bake into every loop's `escalation:` field. These **always** go to a human, regardless of
			
 
				+the loop's goal: force-push · push to `main` · production deploy/migration · mass deletion ·
			
 
				+granting IAM/repo permissions · destroying files that predate the run · editing `.claude/`
			
 
				+or settings (self-modification) · `curl | bash`. Safe to auto-land at L2/L3 *when
			
 
				+allowlisted*: a green PR on a feature branch, a lockfile patch bump past the guard, a
			
 
				+generated draft, a label/triage classification, a comment.
			
 
				+
			
 
				+## Self-check before wiring a loop
			
 
				+
			
 
				+- Is it starting at L1? If you're reaching for L3 on a fresh loop, stop.
			
 
				+- Does `loop-audit` pass and `loop-doctor --live` say it will run?
			
 
				+- Is the child **gated** (`dontAsk`+allowlist) or genuinely **isolated** (container)? If
			
 
				+  you're using `bypassPermissions` on the host to avoid enumerating permissions, that's the
			
 
				+  exact pattern the auto-mode classifier blocks — authorize it properly or isolate it.
			
 
				+- Can you stop it (kill switch) and does it have a budget?
			
 
				+
			
 
				+## When the playbook is needed
			
 
				+
			
 
				+For the full operational workflow — the risk-tier ↔ permission-mode mapping, the STATE/
			
 
				+run-log/budget spine, the seven production patterns, multi-loop coordination, the
			
 
				+scheduler mechanics, and the `loop-init`/`loop-audit`/`loop-doctor`/`loop-cost` tools —
			
 
				+**invoke the [`loop-ops`](../skills/loop-ops/SKILL.md) skill.**
			
 
				+
			
 
				+## Cross-reference
			
 
				+
			
 
				+- `~/.claude/skills/loop-ops/SKILL.md` — full playbook + scripts.
			
 
				+- `~/.claude/skills/loop-ops/references/risk-tiers.md` — the L1/L2/L3 ↔ permission-mode mapping.
			
 
				+- `~/.claude/docs/AUTO-MODE-CLASSIFIER.md` — the two-gate model behind directive #2.
			
 
				+- `worktree-boundaries.md` — never let a loop touch another session's `.claude/worktrees/`.
			
 
				+- `iterate` / `fleet-worker` / `fleet-ops` — the inner-loop, spawn, and land layers a loop composes.
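
The third non-negotiable in `rules/loop-engineering.md` ("no gate, no kill switch, no budget → no loop") can be approximated as a pre-schedule check. The sketch below is an illustration only, not the `loop-audit` implementation. The field names come from the directive table (`scope`, `verify`, `escalation`, `kill_switch`, `budget_tokens`); the function `missing_controls` and every sample value are hypothetical.

```python
# Hypothetical pre-schedule check; not the real loop-audit logic.
REQUIRED_FIELDS = ("scope", "verify", "escalation", "kill_switch", "budget_tokens")


def missing_controls(config: dict) -> list[str]:
    """Return a list of problems; an empty list means the controls are present."""
    problems = [
        f"missing {name}"
        for name in REQUIRED_FIELDS
        if name not in config or config[name] in (None, "")
    ]
    # The rule asks for a bounded scope, never "*".
    if config.get("scope") == "*":
        problems.append("scope is unbounded ('*')")
    # A budget only bounds the loop if it is a positive integer.
    budget = config.get("budget_tokens")
    if budget is not None and (
        isinstance(budget, bool) or not isinstance(budget, int) or budget <= 0
    ):
        problems.append("budget_tokens must be a positive integer")
    return problems


if __name__ == "__main__":
    # Sample values are made up for illustration.
    good = {
        "scope": "src/",
        "verify": "pytest -q",
        "escalation": "open an issue",
        "kill_switch": ".loops/STOP",
        "budget_tokens": 50_000,
    }
    print(missing_controls(good))
    print(missing_controls({"scope": "*", "verify": "pytest -q"}))
```

A scheduler wrapper could refuse to register any loop for which this list is non-empty, which mirrors "a loop missing any of these doesn't get scheduled".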
			
--- a/skills/loop-ops/SKILL.md
+++ b/skills/loop-ops/SKILL.md
@@ -155,10 +155,12 @@ Running several loops? Two non-negotiables (detail in
 
				 
			
 
				 ## Tools
			
 
				 
			
 
				-Three scripts, all following the [Skill Resource Protocol](../../docs/SKILL-RESOURCE-PROTOCOL.md)
			
 
				-(stdout = data, semantic exit codes, `--help` with EXAMPLES, `--json` envelopes). They
			
 
				-are the legs of a stool: **init** scaffolds, **audit** scores readiness, **cost**
			
 
				-estimates spend before you commit to a cadence.
			
 
				+Five scripts, all following the [Skill Resource Protocol](../../docs/SKILL-RESOURCE-PROTOCOL.md)
			
 
				+(stdout = data, semantic exit codes, `--help` with EXAMPLES, `--json` envelopes): **init**
			
 
				+scaffolds the loop, **audit** scores whether the config is *well-formed*, **doctor**
			
 
				+preflights whether it will actually *run*, **cost** estimates spend (caching-aware), and
			
 
				+**check-pricing-sync** gates pricing drift in CI. The discipline before scheduling is
			
 
				+`init → fill → cost → audit → doctor --live`.
			
 
				 
			
 
				 ### `scripts/loop-init.sh` — scaffold a loop's state spine
			
 
				 
			
@@ -198,10 +200,33 @@ Exit **0** = ready (no errors, score ≥ `--min`), **10** = not ready (findings
 
				 `2` usage, `3` config not found, `4` config unparseable. `--strict` counts warnings
			
 
				 toward the not-ready signal.
			
 
				 
			
 
				-### `scripts/loop-cost.py` — token/$ estimate by pattern × cadence × model
			
 
				+### `scripts/loop-doctor.sh` — live preflight (will it actually run?)
			
 
				+
			
 
				+`loop-audit` proves the config is *well-formed*; `loop-doctor` proves the loop will
			
 
				+*execute* — catching the "blocked at 3am" failures audit can't see. `--offline` (CI-safe):
			
 
				+the budget fits a tick's estimated tokens, the permission mode is achievable (not
			
 
				+interactive), an L3 bypass declares an isolation boundary. `--live` adds runtime preflight:
			
 
				+the `verify`/`guard` gate's leading binary resolves on PATH, `claude`/`git` are present,
			
 
				+the kill-switch sentinel's parent dir exists.
			
 
				+
			
 
				+```bash
			
 
				+bash scripts/loop-doctor.sh --offline .loops/pr-babysitter/loop.config.yaml   # CI gate
			
 
				+bash scripts/loop-doctor.sh --live .loops/ci-sweeper/loop.config.yaml          # before scheduling
			
 
				+bash scripts/loop-doctor.sh --live --json .loops/dep-sweeper/loop.config.yaml | jq '.data[] | select(.state=="bad")'
			
 
				+```
			
 
				+
			
 
				+Exit **0** = will run, **10** = a check predicts a runtime failure (gate binary missing,
			
 
				+bypass on host without isolation, budget too small for a tick), `2` usage, `3` not found,
			
 
				+`4` unparseable, `5` missing core dep. Run it **after** `loop-audit` and before scheduling.
			
 
				+
			
 
				+### `scripts/loop-cost.py` — token/$ estimate by pattern × cadence × model (caching-aware)
			
 
				 
			
 
				 Estimate spend **before** committing to a cadence — the cost of an outer loop is
			
 
				-runs/day × tokens/run × price, and sub-agents multiply it. Pricing reads from
			
 
				+runs/day × tokens/run × price, and sub-agents multiply it. It also models **prompt
			
 
				+caching**: a loop re-sends the same `run.md`+system prefix every tick (the Ralph
			
 
				+property), so the prefix should be cache-written once then read (~0.1×) — *but only if the
			
 
				+tick interval fits the cache TTL*. A loop slower than ~1h can't cache (the entry expires
			
 
				+between ticks); the estimator says so and recommends the TTL. Pricing reads from
			
 
				 `assets/model-pricing.json` (date-stamped; [`claude-api-ops`](../claude-api-ops/SKILL.md)
			
 
				 is the source of truth — run its `check-model-table.py` if you suspect drift).
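
The TTL-vs-cadence rule described above can be sketched as a small function. This is a minimal illustration that assumes only the two TTLs named in this change (5 minutes and 1 hour); `interval_minutes` and `pick_cache_ttl` are hypothetical names, not part of the `loop-cost.py` command-line interface.

```python
from __future__ import annotations


def interval_minutes(runs_per_day: float) -> float:
    """Minutes between ticks for a loop that runs `runs_per_day` times a day."""
    return 1440.0 / runs_per_day


def pick_cache_ttl(interval_min: float) -> str | None:
    """Smallest TTL that keeps the cached prefix warm, or None if none can."""
    if interval_min <= 5:
        return "5m"
    if interval_min <= 60:
        return "1h"
    # Slower than the 1h maximum: the entry expires between ticks.
    return None


if __name__ == "__main__":
    print(pick_cache_ttl(interval_minutes(144)))  # 10-minute cadence
    print(pick_cache_ttl(interval_minutes(4)))    # 6-hour cadence
```

A 10-minute cadence fits the 1-hour TTL, while a 6-hour cadence returns `None`, which is the "too slow to cache" case.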
			
 
				 
			
@@ -240,9 +265,12 @@ python scripts/check-pricing-sync.py --offline   # exit 0 in sync, 10 drift, 3 a
 
				    sanity-check the monthly spend against the value.
			
 
				 5. **Audit it:** `bash scripts/loop-audit.sh .loops/<n>/loop.config.yaml` — fix every
			
 
				    error before scheduling. Don't schedule a loop that fails its own audit.
			
 
				-6. **Schedule** the L1 run with native `/loop` or `/schedule` (read-only — it just
			
 
				+6. **Doctor it:** `bash scripts/loop-doctor.sh --live .loops/<n>/loop.config.yaml` — prove
			
 
				+   it will actually *run* (gate binary on PATH, budget fits a tick). Audit = well-formed;
			
 
				+   doctor = will-run.
			
 
				+7. **Schedule** the L1 run with native `/loop` or `/schedule` (read-only — it just
			
 
				    writes `STATE.md` + a report).
			
 
				-7. **Read the reports.** Only after the loop's judgment is proven do you graduate it to
			
 
				+8. **Read the reports.** Only after the loop's judgment is proven do you graduate it to
			
 
				    **L2** (worktree + guard + `fleet-ops` landing) and re-audit at the higher tier.
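
The audit and doctor steps above gate scheduling on their exit codes. A wrapper might interpret them roughly as follows. The code-to-meaning mapping is taken from the exit-code descriptions in this document; the names `DOCTOR_EXIT`, `describe_doctor_exit`, and `may_schedule` are hypothetical.

```python
# Exit codes as documented for loop-doctor.sh in this change.
DOCTOR_EXIT = {
    0: "will run",
    2: "usage error",
    3: "config not found",
    4: "config unparseable",
    5: "missing core dependency",
    10: "a check predicts a runtime failure",
}


def describe_doctor_exit(code: int) -> str:
    """Human-readable meaning of a loop-doctor exit code."""
    return DOCTOR_EXIT.get(code, f"unexpected exit code {code}")


def may_schedule(audit_exit: int, doctor_exit: int) -> bool:
    """Schedule only when audit reports ready (0) and doctor reports will-run (0)."""
    return audit_exit == 0 and doctor_exit == 0


if __name__ == "__main__":
    print(describe_doctor_exit(10))
    print(may_schedule(audit_exit=0, doctor_exit=10))
```

In practice the two integers would come from running `loop-audit.sh` and `loop-doctor.sh --live` and reading their return codes.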
			
 
				 
			
 
				 ## Anti-patterns (these are detected and wrong)
			
--- a/skills/loop-ops/scripts/loop-cost.py
+++ b/skills/loop-ops/scripts/loop-cost.py
@@ -2,22 +2,28 @@
 
				 """Estimate the token/$ cost of an outer loop by pattern × cadence × model.
			
 
				 
			
 
				 A loop's cost is runs/day × tokens/run × price, and sub-agents multiply tokens/run.
			
 
				-This computes that before you commit to a cadence. Pricing reads from
			
 
				-assets/model-pricing.json (date-stamped; skills/claude-api-ops is the source of
			
 
				-truth — run its check-model-table.py if you suspect drift).
			
 
				+This computes that - and, crucially, models **prompt caching**: a loop re-sends the
			
 
				+SAME run.md + system prefix every tick (the Ralph property), which is the textbook
			
 
				+caching case. Whether caching helps depends on cadence vs cache TTL, so this picks the
			
 
				+TTL and reports the cached projection alongside the naive one.
			
 
				+
			
 
				+Pricing reads from assets/model-pricing.json (date-stamped; skills/claude-api-ops is
			
 
				+the source of truth - run its check-model-table.py if you suspect drift).
			
 
				 
			
 
				 Usage:   loop-cost.py --pattern P --cadence C --model M [OPTIONS]
			
 
				 Input:   argv flags only (no stdin).
			
 
				 Output:  stdout = the cost breakdown (plain rows, or --json envelope). Data only.
			
 
				-Stderr:  the assumptions note, errors.
			
 
				+Stderr:  the assumptions + caching note, errors.
			
 
				 Exit:    0 ok, 2 usage, 3 pricing file missing, 4 bad cadence/model/pattern
			
 
				 
			
 
				-Estimates, not guarantees — reconcile against the loop's run-log.md actuals. The
			
 
				-cheapest lever is cadence (halving frequency halves cost); the next is model.
			
 
				+Estimates, not guarantees - reconcile against the loop's run-log.md actuals. Levers in
			
 
				+order of impact: cadence (halving frequency halves cost), prompt caching (model below),
			
 
				+model tier.
			
 
				 
			
 
				 Examples:
			
 
				   loop-cost.py --pattern pr-babysitter --cadence 10m --model claude-haiku-4-5
			
 
				   loop-cost.py --pattern ci-sweeper --cadence 15m --model claude-sonnet-4-6 --days 30 --json
			
 
				+  loop-cost.py --pattern daily-triage --cadence 6h --model claude-opus-4-8   # too slow to cache
			
 
				   loop-cost.py --list-models
			
 
				 """
			
 
				 from __future__ import annotations
			
@@ -36,11 +42,26 @@ EX_VALIDATION = 4
 
				 
			
 
				 DEFAULT_PRICING = Path(__file__).resolve().parent.parent / "assets" / "model-pricing.json"
			
 
				 
			
 
				+# Prompt-caching multipliers vs base input price (claude-api-ops/references/caching-and-cost.md).
			
 
				+CACHE_WRITE_5M = 1.25   # write a 5-minute-TTL entry
			
 
				+CACHE_WRITE_1H = 2.0    # write a 1-hour-TTL entry
			
 
				+CACHE_READ = 0.1        # read any cached entry
			
 
				+
			
 
				+# Minimum cacheable prefix (tokens) - below this the cache_control marker is silently
			
 
				+# ignored (caching-and-cost.md). A loop whose static prefix is smaller can't cache.
			
 
				+MIN_PREFIX = {
			
 
				+    "claude-fable-5": 512,
			
 
				+    "claude-opus-4-8": 1024,
			
 
				+    "claude-sonnet-4-6": 1024,
			
 
				+    "claude-haiku-4-5": 4096,
			
 
				+}
			
 
				+DEFAULT_MIN_PREFIX = 1024
			
 
				+
			
 
				 
			
 
				 class Term:
			
 
				-    """Minimal ANSI helper (term.sh is bash-only; per TERMINAL-DESIGN.md §9 the
			
 
				-    Python port is inline). Honors FORCE_COLOR / NO_COLOR / TERM_ASCII and the
			
 
				-    bound stream's TTY + encoding, so piped data stays plain ASCII."""
			
 
				+    """Minimal ANSI helper (term.sh is bash-only; per TERMINAL-DESIGN.md §9 the Python
			
 
				+    port is inline). Honors FORCE_COLOR / NO_COLOR / TERM_ASCII and the bound stream's
			
 
				+    TTY + encoding, so piped data stays plain ASCII."""
			
 
				 
			
 
				     _C = {"green": "\033[32m", "cyan": "\033[36m", "dim": "\033[2m", "off": "\033[0m"}
			
 
				 
			
@@ -95,15 +116,63 @@ def runs_per_day(cadence: str, override: float | None) -> float:
 
				     if re.fullmatch(r"\d+ \* \* \* \*", s):
			
 
				         return 24.0
			
 
				     print(
			
 
				-        f"error: cannot derive runs/day from cadence '{cadence}' — "
			
 
				+        f"error: cannot derive runs/day from cadence '{cadence}' - "
			
 
				         "use Nm/Nh/Nd, `*/N * * * *`, or pass --runs-per-day",
			
 
				         file=sys.stderr,
			
 
				     )
			
 
				     raise SystemExit(EX_VALIDATION)
			
 
				 
			
 
				 
			
 
				+def caching_projection(in_tok, out_tok, sub, in_price, out_price, rpd, model,
			
 
				+                       prefix_frac, ttl_choice):
			
 
				+    """Model prompt-caching of the static run-prompt prefix across ticks.
			
 
				+
			
 
				+    Returns a dict: ttl, beneficial, reason, prefix_tokens, cost_per_day, cost_per_run.
			
 
				+    The cache stays warm only when the tick interval is <= the TTL (reads refresh it);
			
 
				+    a loop slower than the 1h max TTL writes a cold entry every tick - caching can't help.
			
 
				+    """
			
 
				+    interval_min = 1440.0 / rpd if rpd > 0 else 1e9
			
 
				+    prefix_tokens = int(round(in_tok * prefix_frac))
			
 
				+    variable_in = in_tok - prefix_tokens
			
 
				+    min_prefix = MIN_PREFIX.get(model, DEFAULT_MIN_PREFIX)
			
 
				+
			
 
				+    # Pick TTL: smallest that stays warm at this cadence.
			
 
				+    if ttl_choice == "5m":
			
 
				+        ttl, warm = "5m", interval_min <= 5
			
 
				+    elif ttl_choice == "1h":
			
 
				+        ttl, warm = "1h", interval_min <= 60
			
 
				+    else:  # auto
			
 
				+        if interval_min <= 5:
			
 
				+            ttl, warm = "5m", True
			
 
				+        elif interval_min <= 60:
			
 
				+            ttl, warm = "1h", True
			
 
				+        else:
			
 
				+            ttl, warm = None, False
			
 
				+
			
 
				+    out_cost_day = out_tok / 1e6 * out_price * rpd
			
 
				+
			
 
				+    if prefix_tokens < min_prefix:
			
 
				+        return {"ttl": ttl, "beneficial": False,
			
 
				+                "reason": f"static prefix ~{prefix_tokens} tok < {model} minimum {min_prefix} tok "
			
 
				+                          "- cache marker silently ignored; enlarge the run prompt/system or skip caching",
			
 
				+                "prefix_tokens": prefix_tokens, "cost_per_day": None, "cost_per_run": None}
			
 
				+    if not warm or ttl is None:
			
 
				+        return {"ttl": ttl, "beneficial": False,
			
 
				+                "reason": f"tick interval ~{interval_min:.0f} min exceeds the cache TTL "
			
 
				+                          "- the entry expires between ticks, so every tick is a cold write; caching won't help",
			
 
				+                "prefix_tokens": prefix_tokens, "cost_per_day": None, "cost_per_run": None}
			
 
				+
			
 
				+    write_mult = CACHE_WRITE_5M if ttl == "5m" else CACHE_WRITE_1H
			
 
				+    # Per day, warm: ~1 cache write of the prefix + (rpd-1) reads; variable input + output full price.
			
 
				+    prefix_day = prefix_tokens / 1e6 * in_price * (write_mult + max(rpd - 1, 0) * CACHE_READ)
			
 
				+    variable_day = variable_in / 1e6 * in_price * rpd
			
 
				+    cost_day = (prefix_day + variable_day + out_cost_day) * sub
			
 
				+    return {"ttl": ttl, "beneficial": True, "reason": "",
			
 
				+            "prefix_tokens": prefix_tokens, "write_mult": write_mult,
			
 
				+            "cost_per_day": cost_day, "cost_per_run": cost_day / rpd if rpd else cost_day}
			
 
				+
			
 
				+
			
 
				 def fmt_money(x: float) -> str:
			
 
				-    """Human dollar string: cents below $100, 4 decimals below $1 for tiny per-run costs."""
			
 
				     if x < 1:
			
 
				         return f"${x:.4f}"
			
 
				     return f"${x:,.2f}"
			
@@ -112,7 +181,7 @@ def fmt_money(x: float) -> str:
 
				 def main(argv: list[str]) -> int:
			
 
				     p = argparse.ArgumentParser(
			
 
				         prog="loop-cost.py",
			
 
				-        description="Estimate outer-loop cost by pattern × cadence × model.",
			
 
				+        description="Estimate outer-loop cost by pattern × cadence × model, with prompt caching.",
			
 
				     )
			
 
				     p.add_argument("--pattern", default="custom", help="catalog pattern key (default: custom)")
			
 
				     p.add_argument("--cadence", default="1h", help="10m | 1h | 6h | 1d, or a cron string (default: 1h)")
			
@@ -122,6 +191,11 @@ def main(argv: list[str]) -> int:
 
				     p.add_argument("--input-tokens", type=int, default=None, help="override per-run input tokens")
			
 
				     p.add_argument("--output-tokens", type=int, default=None, help="override per-run output tokens")
			
 
				     p.add_argument("--subagents", type=int, default=None, help="override the sub-agent fan-out multiplier")
			
 
				+    p.add_argument("--cache-prefix-frac", type=float, default=0.6,
			
 
				+                   help="fraction of input that is the static, cacheable run-prompt prefix (default: 0.6)")
			
 
				+    p.add_argument("--cache-ttl", choices=["auto", "5m", "1h"], default="auto",
			
 
				+                   help="cache TTL to model (default: auto - pick by cadence)")
			
 
				+    p.add_argument("--no-cache", action="store_true", help="report the uncached cost only")
			
 
				     p.add_argument("--pricing", default=str(DEFAULT_PRICING), help="path to model-pricing.json")
			
 
				     p.add_argument("--list-models", action="store_true", help="print the pricing table + as-of date, exit 0")
			
 
				     p.add_argument("--json", action="store_true", help="emit a JSON envelope")
			
@@ -135,7 +209,6 @@ def main(argv: list[str]) -> int:
 
				     as_of = pricing.get("_as_of", "unknown")
			
 
				     pattern_defaults = pricing.get("_pattern_defaults", {})
			
 
				 
			
 
				-    # ── --list-models ──
			
 
				     if args.list_models:
			
 
				         if args.json:
			
 
				             print(json.dumps({"data": models, "meta": {"as_of": as_of, "schema": "claude-mods.loop-ops.pricing/v1"}}, indent=2))
			
@@ -149,26 +222,27 @@ def main(argv: list[str]) -> int:
 
				     if args.days <= 0:
			
 
				         print("error: --days must be positive", file=sys.stderr)
			
 
				         return EX_VALIDATION
			
 
				+    if not (0.0 <= args.cache_prefix_frac <= 1.0):
			
 
				+        print("error: --cache-prefix-frac must be between 0 and 1", file=sys.stderr)
			
 
				+        return EX_VALIDATION
			
 
				 
			
 
				-    # ── model ──
			
 
				     if args.model not in models:
			
 
				-        print(f"error: unknown model '{args.model}' — known: {', '.join(models) or '(none)'}", file=sys.stderr)
			
 
				+        print(f"error: unknown model '{args.model}' - known: {', '.join(models) or '(none)'}", file=sys.stderr)
			
 
				         return EX_VALIDATION
			
 
				     in_price = float(models[args.model]["input_per_mtok"])
			
 
				     out_price = float(models[args.model]["output_per_mtok"])
			
 
				 
			
 
				-    # ── tokens/run: overrides win, else pattern defaults ──
			
 
				     if args.input_tokens is not None and args.output_tokens is not None:
			
 
				         in_tok, out_tok = args.input_tokens, args.output_tokens
			
 
				         sub = args.subagents if args.subagents is not None else 1
			
 
				-    elif args.pattern in pattern_defaults:
			
 
				+    elif args.pattern in pattern_defaults and not args.pattern.startswith("_"):
			
 
				         d = pattern_defaults[args.pattern]
			
 
				         in_tok = args.input_tokens if args.input_tokens is not None else int(d["input"])
			
 
				         out_tok = args.output_tokens if args.output_tokens is not None else int(d["output"])
			
 
				         sub = args.subagents if args.subagents is not None else int(d.get("subagents", 1))
			
 
				     else:
			
 
				         print(
			
 
				-            f"error: unknown pattern '{args.pattern}' — pass --input-tokens and "
			
 
				+            f"error: unknown pattern '{args.pattern}' - pass --input-tokens and "
			
 
				             f"--output-tokens, or use one of: {', '.join(k for k in pattern_defaults if not k.startswith('_'))}",
			
 
				             file=sys.stderr,
			
 
				         )
			
@@ -180,7 +254,7 @@ def main(argv: list[str]) -> int:
 
				 
			
 
				     rpd = runs_per_day(args.cadence, args.runs_per_day)
			
 
				 
			
 
				-    # ── cost math ──
			
 
				+    # ── uncached (naive) ──
			
 
				     cost_in = in_tok / 1_000_000 * in_price
			
 
				     cost_out = out_tok / 1_000_000 * out_price
			
 
				     cost_run = (cost_in + cost_out) * sub
			
@@ -188,25 +262,32 @@ def main(argv: list[str]) -> int:
 
				     cost_day = cost_run * rpd
			
 
				     cost_horizon = cost_day * args.days
			
 
				 
			
 
				+    # ── cached projection ──
			
 
				+    cache = None
			
 
				+    if not args.no_cache:
			
 
				+        cache = caching_projection(in_tok, out_tok, sub, in_price, out_price, rpd,
			
 
				+                                   args.model, args.cache_prefix_frac, args.cache_ttl)
			
 
				+
			
 
				     if args.json:
			
 
				-        envelope = {
			
 
				-            "data": {
			
 
				-                "pattern": args.pattern,
			
 
				-                "model": args.model,
			
 
				-                "cadence": args.cadence,
			
 
				-                "runs_per_day": round(rpd, 3),
			
 
				-                "tokens_per_run": tokens_run,
			
 
				-                "input_tokens": in_tok,
			
 
				-                "output_tokens": out_tok,
			
 
				-                "subagents": sub,
			
 
				-                "cost_per_run": round(cost_run, 6),
			
 
				-                "cost_per_day": round(cost_day, 4),
			
 
				-                "days": args.days,
			
 
				-                "cost_per_horizon": round(cost_horizon, 2),
			
 
				-            },
			
 
				-            "meta": {"as_of": as_of, "schema": "claude-mods.loop-ops.cost/v1"},
			
 
				+        data = {
			
 
				+            "pattern": args.pattern, "model": args.model, "cadence": args.cadence,
			
 
				+            "runs_per_day": round(rpd, 3), "tokens_per_run": tokens_run,
			
 
				+            "input_tokens": in_tok, "output_tokens": out_tok, "subagents": sub,
			
 
				+            "cost_per_run": round(cost_run, 6), "cost_per_day": round(cost_day, 4),
			
 
				+            "days": args.days, "cost_per_horizon": round(cost_horizon, 2),
			
 
				         }
			
 
				-        print(json.dumps(envelope, indent=2))
			
 
				+        if cache is not None:
			
 
				+            if cache["beneficial"]:
			
 
				+                cd = cache["cost_per_day"]
			
 
				+                data["caching"] = {
			
 
				+                    "beneficial": True, "ttl": cache["ttl"], "prefix_tokens": cache["prefix_tokens"],
			
 
				+                    "cost_per_day": round(cd, 4), "cost_per_horizon": round(cd * args.days, 2),
			
 
				+                    "savings_pct": round((cost_day - cd) / cost_day * 100, 1) if cost_day else 0.0,
			
 
				+                }
			
 
				+            else:
			
 
				+                data["caching"] = {"beneficial": False, "reason": cache["reason"],
			
 
				+                                   "prefix_tokens": cache["prefix_tokens"]}
			
 
				+        print(json.dumps({"data": data, "meta": {"as_of": as_of, "schema": "claude-mods.loop-ops.cost/v1"}}, indent=2))
			
 
				         return EX_OK
			
 
				 
			
 
				     t = Term(sys.stderr)
			
@@ -216,12 +297,21 @@ def main(argv: list[str]) -> int:
 
				     print(f"{'tokens/run:':<16}{tokens_run:,} ({in_tok:,} in + {out_tok:,} out) x {sub} subagent(s)")
			
 
				     print(f"{'cost/run:':<16}{fmt_money(cost_run)}")
			
 
				     print(f"{'cost/day:':<16}{fmt_money(cost_day)}")
			
 
				-    print(f"{'cost/'+str(args.days)+'d:':<16}{t.c('cyan', fmt_money(cost_horizon))}")
			
 
				-    print(
			
 
				-        f"estimate (as of {as_of} pricing) - reconcile against run-log.md actuals; "
			
 
				-        "cadence is the biggest lever",
			
 
				-        file=sys.stderr,
			
 
				-    )
			
 
				+    print(f"{'cost/'+str(args.days)+'d:':<16}{fmt_money(cost_horizon)}  (uncached)")
			
 
				+    if cache is not None:
			
 
				+        if cache["beneficial"]:
			
 
				+            cd, ch = cache["cost_per_day"], cache["cost_per_day"] * args.days
			
 
				+            save = (cost_day - cd) / cost_day * 100 if cost_day else 0.0
			
 
				+            print(f"{'cached/'+str(args.days)+'d:':<16}{t.c('cyan', fmt_money(ch))}  "
			
 
				+                  f"({t.c('green', f'-{save:.0f}%')}, TTL {cache['ttl']}, prefix ~{cache['prefix_tokens']:,} tok)")
			
 
				+            print(f"recommendation: cache the static run.md+system prefix at TTL {cache['ttl']} "
			
 
				+                  f"-> ~-{save:.0f}% over {args.days}d. Keep run.md BYTE-IDENTICAL every tick or the cache never hits.",
			
 
				+                  file=sys.stderr)
			
 
				+        else:
			
 
				+            print("caching: not beneficial here", file=sys.stderr)
			
 
				+            print(f"  why: {cache['reason']}", file=sys.stderr)
			
 
				+    print(f"estimate (as of {as_of} pricing) - reconcile against run-log.md actuals; "
			
 
				+          "cadence is the biggest lever, then caching, then model tier", file=sys.stderr)
			
 
				     return EX_OK
			
 
				 
			
 
				 
			
--- a/skills/loop-ops/scripts/loop-doctor.sh
+++ b/skills/loop-ops/scripts/loop-doctor.sh
@@ -0,0 +1,227 @@
 
				+#!/usr/bin/env bash
			
 
				+# Preflight a loop config - will this loop actually RUN, or die at 3am?
			
 
				+#
			
 
				+# loop-audit checks the config is well-formed; loop-doctor checks the loop will
			
 
				+# execute: the gate command's binary resolves, claude/git are on PATH, the budget
			
 
				+# can fit a tick, and the permission mode is achievable from where it launches.
			
 
				+# Modeled on fleet-worker/scripts/fleet-doctor.sh.
			
 
				+#
			
 
				+# Usage:   loop-doctor.sh [--offline|--live] [--json] [-q] <loop.config.yaml>
			
 
				+# Input:   argv flags + a config path (no stdin).
			
 
				+# Output:  stdout = check rows (space-aligned columns: state, check, detail), or a --json envelope.
			
 
				+# Stderr:  the preflight panel, notices, errors.
			
 
				+# Exit:    0 ok, 2 usage, 3 config not found, 4 unparseable, 5 missing core dep,
			
 
				+#          10 a check predicts a runtime failure (a gate binary missing, bypass on
			
 
				+#          host without isolation, budget too small for a tick)
			
 
				+#
			
 
				+#   --offline (default): no PATH/exec - config-shape + budget-vs-cost + permission/
			
 
				+#                        isolation coherence. Safe for PR CI.
			
 
				+#   --live:              adds runtime preflight - claude/git on PATH, the verify/guard
			
 
				+#                        leading binary resolvable, the kill-switch path's parent exists.
			
 
				+#
			
 
				+# Examples:
			
 
				+#   loop-doctor.sh --offline .loops/pr-babysitter/loop.config.yaml
			
 
				+#   loop-doctor.sh --live .loops/ci-sweeper/loop.config.yaml
			
 
				+#   loop-doctor.sh --live --json .loops/dep-sweeper/loop.config.yaml | jq '.data[] | select(.state=="bad")'
			
 
				+set -uo pipefail
			
 
				+
			
 
				+readonly EX_OK=0 EX_USAGE=2 EX_NOTFOUND=3 EX_UNPARSEABLE=4 EX_MISSING_DEP=5 EX_FINDINGS=10
			
 
				+
			
 
				+__lib="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../_lib" 2>/dev/null && pwd || true)"
			
 
				+if [ -n "${__lib:-}" ] && [ -f "$__lib/term.sh" ]; then . "$__lib/term.sh"; term_init 2
			
 
				+else
			
 
				+  term_panel_open() { :; }; term_panel_close() { :; }; term_panel_vert() { :; }
			
 
				+  term_status_row() { shift; printf '  - %s %s\n' "$1" "${2:-}"; }
			
 
				+  term_color() { shift; printf '%s' "$*"; }; TERM_DOT="|"
			
 
				+fi
			
 
				+
			
 
				+HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
			
 
				+PRICING="$HERE/../assets/model-pricing.json"
			
 
				+
			
 
				+CFG=""; MODE="offline"; JSON=0; QUIET=0
			
 
				+
			
 
				+usage() {
			
 
				+  cat <<'EOF'
			
 
				+loop-doctor.sh - preflight a loop config (will it actually run?).
			
 
				+
			
 
				+Usage:
			
 
				+  loop-doctor.sh [--offline|--live] [--json] [-q] <loop.config.yaml>
			
 
				+
			
 
				+Options:
			
 
				+  --offline      config-shape + budget-vs-cost + permission coherence (default; no PATH/exec).
			
 
				+  --live         adds runtime preflight: claude/git on PATH, verify/guard binary resolvable.
			
 
				+  --json         emit a JSON envelope.
			
 
				+  -q, --quiet    suppress the stderr panel.
			
 
				+  -h, --help     show this help and exit 0.
			
 
				+
			
 
				+Exit codes:
			
 
				+  0 ok   2 usage   3 not found   4 unparseable   5 missing dep   10 predicted runtime failure
			
 
				+
			
 
				+Examples:
			
 
				+  loop-doctor.sh --offline .loops/pr-babysitter/loop.config.yaml
			
 
				+  loop-doctor.sh --live .loops/ci-sweeper/loop.config.yaml
			
 
				+  loop-doctor.sh --live --json .loops/dep-sweeper/loop.config.yaml | jq '.data[] | select(.state=="bad")'
			
 
				+EOF
			
 
				+}
			
 
				+die_usage() { printf 'error: %s\n' "$1" >&2; echo >&2; usage >&2; exit "$EX_USAGE"; }
			
 
				+
			
 
				+while [[ $# -gt 0 ]]; do
			
 
				+  case "$1" in
			
 
				+    --offline) MODE="offline"; shift ;;
			
 
				+    --live)    MODE="live"; shift ;;
			
 
				+    --json)    JSON=1; shift ;;
			
 
				+    -q|--quiet) QUIET=1; shift ;;
			
 
				+    -h|--help) usage; exit "$EX_OK" ;;
			
 
				+    -*)        die_usage "unknown flag: $1" ;;
			
 
				+    *)         [[ -z "$CFG" ]] || die_usage "unexpected extra argument: $1"; CFG="$1"; shift ;;
			
 
				+  esac
			
 
				+done
			
 
				+
			
 
				+command -v awk  >/dev/null 2>&1 || { echo "loop-doctor: awk required" >&2; exit "$EX_MISSING_DEP"; }
			
 
				+command -v grep >/dev/null 2>&1 || { echo "loop-doctor: grep required" >&2; exit "$EX_MISSING_DEP"; }
			
 
				+
			
 
				+[[ -n "$CFG" ]] || die_usage "a loop.config.yaml path is required"
			
 
				+[[ -f "$CFG" ]] || { printf 'error: config not found: %s\n' "$CFG" >&2; exit "$EX_NOTFOUND"; }
			
 
				+grep -Eq '^[a-z_]+:' "$CFG" || { printf 'error: no parseable keys in %s\n' "$CFG" >&2; exit "$EX_UNPARSEABLE"; }
			
 
				+
			
 
				+# Pick a working python for the budget-vs-cost check (skipped gracefully if none).
			
 
				+PY=""
			
 
				+for c in python python3 py; do
			
 
				+  if command -v "$c" >/dev/null 2>&1 && "$c" -c "" >/dev/null 2>&1; then PY="$c"; break; fi
			
 
				+done
			
 
				+
			
 
				+# ── flat-YAML readers (no yq), same contract as loop-audit.sh ────────────────
			
 
				+cfg_scalar() {
			
 
				+  awk -v k="$1" -v q="'" '
			
 
				+    $0 ~ "^"k":" { sub("^"k":[ \t]*",""); sub(/[ \t]*#.*$/,""); gsub(/^[ \t]+|[ \t]+$/,"");
			
 
				+      gsub(/^"|"$/,""); gsub("^"q"|"q"$",""); print; exit }' "$CFG"
			
 
				+}
			
 
				+cfg_list_items() {
			
 
				+  awk -v k="$1" -v q="'" '
			
 
				+    $0 ~ "^"k":" { inlist=1; next }
			
 
				+    inlist==1 { if ($0 ~ /^[ \t]*-[ \t]+/) { line=$0; sub(/^[ \t]*-[ \t]+/,"",line); sub(/[ \t]*#.*$/,"",line);
			
 
				+        gsub(/^[ \t]+|[ \t]+$/,"",line); gsub(/^"|"$/,"",line); gsub("^"q"|"q"$","",line); if (line!="") print line }
			
 
				+      else if ($0 ~ /^[^ \t#]/) { inlist=0 } }' "$CFG"
			
 
				+}
			
 
				+
			
 
				+TIER="$(cfg_scalar tier)"; PMODE="$(cfg_scalar permission_mode)"; PATTERN="$(cfg_scalar pattern)"
			
 
				+VERIFY="$(cfg_scalar verify)"; GUARD="$(cfg_scalar guard)"; BUDGET="$(cfg_scalar budget_tokens)"
			
 
				+KILL="$(cfg_scalar kill_switch)"; ESCAL="$(cfg_scalar escalation)"
			
 
				+is_l2plus=0; [[ "$TIER" == "L2" || "$TIER" == "L3" ]] && is_l2plus=1
			
 
				+
			
 
				+# ── findings ─────────────────────────────────────────────────────────────
			
 
				+ROWS=()       # "state\tcheck\tdetail"
			
 
				+FINDING=0
			
 
				+row() { ROWS+=("$1"$'\t'"$2"$'\t'"$3"); if [[ "$1" == "bad" ]]; then FINDING=1; fi; }   # always returns 0, so 'cond && row ok || row bad' cannot fire both arms
			
 
				+
			
 
				+# leading binary of a command string (first whitespace token; strips a leading VAR= prefix)
			
 
				+lead_bin() { awk '{ for(i=1;i<=NF;i++){ if($i !~ /=/){print $i; exit} } }' <<<"$1"; }
			
 
				+
			
 
				+# ── OFFLINE checks ───────────────────────────────────────────────────────
			
 
				+# Permission mode achievability.
			
 
				+case "$PMODE" in
			
 
				+  default) row bad "permission_mode" "default is interactive - a headless 'claude -p' tick can't answer prompts; use dontAsk/auto/bypassPermissions" ;;
			
 
				+  "")      row bad "permission_mode" "missing" ;;
			
 
				+  *)       row ok  "permission_mode" "$PMODE" ;;
			
 
				+esac
			
 
				+# L3 bypass needs an isolation boundary.
			
 
				+if [[ "$TIER" == "L3" && "$PMODE" == "bypassPermissions" ]]; then
			
 
				+  if printf '%s %s' "$ESCAL" "$(cfg_list_items scope | tr '\n' ' ')" | grep -Eqi 'container|isolat|sandbox|devcontainer'; then
			
 
				+    row ok "isolation" "L3 bypass declares an isolation boundary"
			
 
				+  else
			
 
				+    row bad "isolation" "L3 + bypassPermissions with no container/sandbox note - only safe in an isolated VM/container"
			
 
				+  fi
			
 
				+fi
			
 
				+# Budget vs estimated tokens/run.
			
 
				+if [[ -n "$BUDGET" && "$BUDGET" =~ ^[0-9]+$ && -n "$PY" && -n "$PATTERN" && -f "$PRICING" ]]; then
			
 
				+  TPR="$(PR="$PRICING" PAT="$PATTERN" "$PY" -c "import json,os
			
 
				+try:
			
 
				+ d=json.load(open(os.environ['PR']))['_pattern_defaults'].get(os.environ['PAT'])
			
 
				+ print((int(d['input'])+int(d['output']))*int(d.get('subagents',1)) if d else '')
			
 
				+except Exception: print('')" 2>/dev/null)"
			
 
				+  if [[ -n "$TPR" && "$TPR" =~ ^[0-9]+$ ]]; then
			
 
				+    if [[ "$BUDGET" -lt "$TPR" ]]; then
			
 
				+      row bad "budget" "budget_tokens $BUDGET < ~$TPR est. tokens/run for $PATTERN - a tick can't complete"
			
 
				+    else
			
 
				+      row ok "budget" "budget_tokens $BUDGET >= ~$TPR est. tokens/run"
			
 
				+    fi
			
 
				+  fi
			
 
				+fi
			
 
				+
			
 
				+# ── LIVE checks ──────────────────────────────────────────────────────────
			
 
				+if [[ "$MODE" == "live" ]]; then
			
 
				+  if command -v claude >/dev/null 2>&1; then row ok "claude" "on PATH"; else row warn "claude" "not on PATH - the scheduler that runs 'claude -p' must have it"; fi
			
 
				+  if command -v git >/dev/null 2>&1; then
			
 
				+    row ok "git" "on PATH"
			
 
				+    if [[ "$is_l2plus" -eq 1 ]] && ! git worktree list >/dev/null 2>&1; then
			
 
				+      row warn "worktree" "'git worktree' unavailable here - L2+ isolates changes in a worktree"
			
 
				+    fi
			
 
				+  elif [[ "$is_l2plus" -eq 1 ]]; then
			
 
				+    row bad "git" "git not on PATH - L2+ needs it for worktree isolation + landing"
			
 
				+  else
			
 
				+    row warn "git" "git not on PATH"
			
 
				+  fi
			
 
				+  # verify / guard leading binary resolvable
			
 
				+  for pair in "verify:$VERIFY" "guard:$GUARD"; do
			
 
				+    label="${pair%%:*}"; cmd="${pair#*:}"
			
 
				+    [[ -z "$cmd" ]] && continue
			
 
				+    case "$cmd" in *"<"*">"*) continue ;; esac   # unfilled placeholder - audit's job
			
 
				+    bin="$(lead_bin "$cmd")"
			
 
				+    [[ -z "$bin" ]] && continue
			
 
				+    if [[ "$bin" == */* ]]; then
			
 
				+      if [[ -x "$bin" ]]; then row ok "$label" "$bin executable"; else row bad "$label" "$bin not executable - the gate can't run"; fi
			
 
				+    elif command -v "$bin" >/dev/null 2>&1; then
			
 
				+      row ok "$label" "$bin resolves"
			
 
				+    else
			
 
				+      row bad "$label" "'$bin' not on PATH - the gate command can't run at tick time"
			
 
				+    fi
			
 
				+  done
			
 
				+  # kill-switch path parent exists (only when it clearly names a path)
			
 
				+  ks_path="$(grep -oE '[^ "'"'"']*/[^ "'"'"']*' <<<"$KILL" | head -1)"
			
 
				+  if [[ -n "$ks_path" ]]; then
			
 
				+    parent="$(dirname "$ks_path")"
			
 
				+    if [[ -d "$parent" || "$parent" == "." ]]; then row ok "kill_switch" "sentinel path parent exists ($parent)"
			
 
				+    else row warn "kill_switch" "sentinel parent dir missing ($parent) - create it so the switch works"; fi
			
 
				+  fi
			
 
				+fi
			
 
				+
			
 
				+# ── output ───────────────────────────────────────────────────────────────
			
 
				+n_bad=0; n_warn=0; n_ok=0
			
 
				+for r in "${ROWS[@]:-}"; do
			
 
				+  case "${r%%$'\t'*}" in bad) n_bad=$((n_bad+1));; warn) n_warn=$((n_warn+1));; ok) n_ok=$((n_ok+1));; esac
			
 
				+done
			
 
				+
			
 
				+if [[ "$JSON" -eq 1 ]]; then
			
 
				+  printf '{\n  "data": [\n'
			
 
				+  if [[ ${#ROWS[@]} -gt 0 ]]; then
			
 
				+   for i in "${!ROWS[@]}"; do
			
 
				+    IFS=$'\t' read -r st ck dt <<<"${ROWS[$i]}"
			
 
				+    dt="${dt//\\/\\\\}"; dt="${dt//\"/\\\"}"
			
 
				+    sep=","; [[ "$i" -eq $(( ${#ROWS[@]} - 1 )) ]] && sep=""
			
 
				+    printf '    {"state": "%s", "check": "%s", "detail": "%s"}%s\n' "$st" "$ck" "$dt" "$sep"
			
 
				+   done
			
 
				+  fi
			
 
				+  printf '  ],\n  "meta": {"mode": "%s", "ok": %d, "warn": %d, "bad": %d, "will_run": %s, "tier": "%s", "schema": "claude-mods.loop-ops.doctor/v1"}\n}\n' \
			
 
				+    "$MODE" "$n_ok" "$n_warn" "$n_bad" "$([[ "$FINDING" -eq 0 ]] && echo true || echo false)" "${TIER:-unknown}"
			
 
				+else
			
 
				+  if [[ ${#ROWS[@]} -gt 0 ]]; then
			
 
				+    for r in "${ROWS[@]}"; do
			
 
				+      IFS=$'\t' read -r st ck dt <<<"$r"
			
 
				+      printf '%-5s %-14s %s\n' "$st" "$ck" "$dt"
			
 
				+    done
			
 
				+  fi
			
 
				+  if [[ "$QUIET" -eq 0 ]]; then
			
 
				+    verdict="$([[ "$FINDING" -eq 0 ]] && echo "WILL RUN" || echo "WILL FAIL")"
			
 
				+    vstate="$([[ "$FINDING" -eq 0 ]] && echo ok || echo bad)"
			
 
				+    {
			
 
				+      term_panel_open loop "loop ${TERM_DOT} doctor ($MODE)" "$(basename "$(dirname "$CFG")")"
			
 
				+      term_panel_vert
			
 
				+      term_status_row "$vstate" "$verdict" "$n_bad blocking ${TERM_DOT} $n_warn advisory ${TERM_DOT} $n_ok ok"
			
 
				+      [[ "$MODE" == "offline" ]] && term_status_row skip "run --live before scheduling" "checks gate binaries + PATH"
			
 
				+      term_panel_vert
			
 
				+      term_panel_close "audit = well-formed ${TERM_DOT} doctor = will-run" ""
			
 
				+    } >&2
			
 
				+  fi
			
 
				+fi
			
 
				+
			
 
				+[[ "$FINDING" -eq 0 ]] && exit "$EX_OK" || exit "$EX_FINDINGS"
			
--- a/skills/loop-ops/tests/run.sh
+++ b/skills/loop-ops/tests/run.sh
@@ -17,6 +17,7 @@ INIT="$SCRIPTS/loop-init.sh"
 
				 AUDIT="$SCRIPTS/loop-audit.sh"
			
 
				 COST="$SCRIPTS/loop-cost.py"
			
 
				 SYNC="$SCRIPTS/check-pricing-sync.py"
			
 
				+DOCTOR="$SCRIPTS/loop-doctor.sh"
			
 
				 
			
 
				 # Pick a python that actually executes — skips the Windows Store python3 stub.
			
 
				 PYTHON=""
			
@@ -187,6 +188,36 @@ expect_exit "cron cadence -> 0" 0 $?
 
				 out="$("$PYTHON" "$COST" --pattern custom --cadence weird --runs-per-day 5 --model claude-haiku-4-5 2>/dev/null)"; rc=$?
			
 
				 expect_exit "runs-per-day override -> 0" 0 "$rc"
			
 
				 expect_has  "uses the override" "5 runs/day" "$out"
			
 
				+# caching: a fast loop (10m -> 1h TTL) projects a cached saving
			
 
				+out="$("$PYTHON" "$COST" --pattern ci-sweeper --cadence 10m --model claude-sonnet-4-6 2>&1)"
			
 
				+expect_has "fast loop shows a cached projection" "cached/" "$out"
			
 
				+# caching: a slow loop (6h > 1h TTL) is not cache-beneficial
			
 
				+out="$("$PYTHON" "$COST" --pattern daily-triage --cadence 6h --model claude-opus-4-8 2>&1)"
			
 
				+expect_has "slow loop: caching not beneficial" "not beneficial" "$out"
			
 
				+# --no-cache suppresses the cached projection
			
 
				+out="$("$PYTHON" "$COST" --pattern ci-sweeper --cadence 10m --model claude-sonnet-4-6 --no-cache 2>&1)"
			
 
				+case "$out" in *"cached/"*) no "--no-cache still showed caching";; *) ok "--no-cache suppresses caching";; esac
			
 
				+# json caching block present for a cacheable loop
			
 
				+out="$("$PYTHON" "$COST" --pattern ci-sweeper --cadence 5m --model claude-sonnet-4-6 --json 2>/dev/null)"
			
 
				+expect_has "cost json carries caching block" '"caching"' "$out"
			
 
				+
			
 
				+# ── loop-doctor: preflight (offline budget, live binary), json ─────────────
			
 
				+echo "-- loop-doctor --"
			
 
				+bash "$DOCTOR" --help >/dev/null 2>&1; expect_exit "loop-doctor --help -> 0" 0 $?
			
 
				+bash "$DOCTOR" --offline "$SB/l1.yaml" >/dev/null 2>&1; expect_exit "doctor offline healthy L1 -> 0" 0 $?
			
 
				+bash "$DOCTOR" --live "$SB/l1.yaml" >/dev/null 2>&1; expect_exit "doctor live healthy L1 -> 0" 0 $?
			
 
				+# budget too small for the pattern -> bad -> 10
			
 
				+sed 's/^budget_tokens: 300000/budget_tokens: 100/' "$SB/l2.yaml" > "$SB/l2-poor.yaml"
			
 
				+out="$(bash "$DOCTOR" --offline "$SB/l2-poor.yaml" 2>/dev/null)"; rc=$?
			
 
				+expect_exit "doctor budget-too-small -> 10" 10 "$rc"
			
 
				+expect_has  "doctor names the budget gap" "tokens/run" "$out"
			
 
				+# live: a verify gate whose binary is missing -> bad -> 10
			
 
				+sed 's/^verify: "npm test"/verify: "totally-missing-binary-zzz run"/' "$SB/l2.yaml" > "$SB/l2-nobin.yaml"
			
 
				+bash "$DOCTOR" --live "$SB/l2-nobin.yaml" >/dev/null 2>&1; expect_exit "doctor missing gate binary -> 10" 10 $?
			
 
				+# missing config -> 3, json schema
			
 
				+bash "$DOCTOR" --offline "$SB/no-such.yaml" >/dev/null 2>&1; expect_exit "doctor missing config -> 3" 3 $?
			
 
				+out="$(bash "$DOCTOR" --offline --json "$SB/l1.yaml" 2>/dev/null)"
			
 
				+expect_has "doctor json schema" "claude-mods.loop-ops.doctor/v1" "$out"
			
 
				 
			
 
				 # ── loop-cost: validation errors ───────────────────────────────────────────
			
 
				 "$PYTHON" "$COST" --pattern pr-babysitter --cadence 10m --model claude-nope >/dev/null 2>&1; expect_exit "unknown model -> 4" 4 $?
			
@@ -208,7 +239,7 @@ expect_has "pricing-sync json in_sync" '"in_sync": true' "$out"
 
				 
			
 
				 # ── terminal design system ─────────────────────────────────────────────────
			
 
				 echo "-- terminal design system --"
			
 
				-for s in "$INIT" "$AUDIT"; do
			
 
				+for s in "$INIT" "$AUDIT" "$DOCTOR"; do
			
 
				   b="$(basename "$s")"
			
 
				   grep -q '_lib/term.sh' "$s" && ok "$b sources _lib/term.sh" || no "$b does not source _lib/term.sh"
			
 
				 done