Selaa lähdekoodia

feat(skills): Add claude-code-internals (merged) + claude-api-ops, playwright-ops, terraform-ops

claude-code-internals merges and refreshes claude-code-debug/-headless/
-hooks against current official docs: 30-event hook catalog with per-event
JSON contracts and all 5 hook types, current SKILL.md frontmatter spec,
headless/CLI flag reference, extension-debugging decision trees. Stale
$TOOL_INPUT-era content dropped. Old skills removed; install scripts
clean them up on upgrade.

Three new comprehensive skills, all verified against official docs:
- claude-api-ops: Messages API, tool use, prompt caching, structured
  outputs (output_config.format), batches, thinking, Agent SDK
- playwright-ops: selector hierarchy, fixtures/POM, network mocking,
  storageState auth, CI sharding, flake hunting + config template
- terraform-ops: state management, module patterns, OIDC plan/apply
  workflow template, drift detection, write-only secrets, terraform test

80 -> 81 skills. All gates pass (validate 100, doc-drift clean, suites 4/4).

Co-Authored-By: Claude Fable 5 <noreply@anthropic.com>
0xDarkMatter 2 viikkoa sitten
vanhempi
sitoutus
378ef84ec8
47 muutettua tiedostoa jossa 4831 lisäystä ja 2640 poistoa
  1. 1 1
      AGENTS.md
  2. 17 2
      CHANGELOG.md
  3. 7 6
      README.md
  4. 4 4
      docs/PLAN.md
  5. 3 0
      scripts/install.ps1
  6. 3 0
      scripts/install.sh
  7. 275 0
      skills/claude-api-ops/SKILL.md
  8. 0 0
      skills/claude-api-ops/assets/.gitkeep
  9. 254 0
      skills/claude-api-ops/references/agent-sdk.md
  10. 244 0
      skills/claude-api-ops/references/caching-and-cost.md
  11. 292 0
      skills/claude-api-ops/references/messages-api.md
  12. 190 0
      skills/claude-api-ops/references/structured-outputs.md
  13. 289 0
      skills/claude-api-ops/references/tool-use.md
  14. 0 0
      skills/claude-api-ops/scripts/.gitkeep
  15. 0 124
      skills/claude-code-debug/SKILL.md
  16. 0 208
      skills/claude-code-debug/references/common-issues.md
  17. 0 276
      skills/claude-code-debug/references/debug-commands.md
  18. 0 213
      skills/claude-code-debug/references/troubleshooting-flow.md
  19. 0 130
      skills/claude-code-headless/SKILL.md
  20. 0 207
      skills/claude-code-headless/references/cli-options.md
  21. 0 373
      skills/claude-code-headless/references/integration-patterns.md
  22. 0 202
      skills/claude-code-headless/references/output-formats.md
  23. 0 116
      skills/claude-code-hooks/SKILL.md
  24. 0 263
      skills/claude-code-hooks/references/configuration.md
  25. 0 251
      skills/claude-code-hooks/references/hook-events.md
  26. 0 264
      skills/claude-code-hooks/references/security-patterns.md
  27. 104 0
      skills/claude-code-internals/SKILL.md
  28. 0 0
      skills/claude-code-internals/assets/.gitkeep
  29. 129 0
      skills/claude-code-internals/references/debugging-reference.md
  30. 197 0
      skills/claude-code-internals/references/headless-reference.md
  31. 325 0
      skills/claude-code-internals/references/hooks-reference.md
  32. 147 0
      skills/claude-code-internals/references/skills-reference.md
  33. 0 0
      skills/claude-code-internals/scripts/.gitkeep
  34. 315 0
      skills/playwright-ops/SKILL.md
  35. 126 0
      skills/playwright-ops/assets/playwright.config.template.ts
  36. 193 0
      skills/playwright-ops/references/ci-patterns.md
  37. 191 0
      skills/playwright-ops/references/fixtures-and-pom.md
  38. 106 0
      skills/playwright-ops/references/flake-hunting.md
  39. 169 0
      skills/playwright-ops/references/network-and-api.md
  40. 0 0
      skills/playwright-ops/scripts/.gitkeep
  41. 252 0
      skills/terraform-ops/SKILL.md
  42. 211 0
      skills/terraform-ops/assets/github-actions-terraform.yml
  43. 182 0
      skills/terraform-ops/references/cicd-pipelines.md
  44. 223 0
      skills/terraform-ops/references/module-patterns.md
  45. 156 0
      skills/terraform-ops/references/security-and-secrets.md
  46. 226 0
      skills/terraform-ops/references/state-management.md
  47. 0 0
      skills/terraform-ops/scripts/.gitkeep

+ 1 - 1
AGENTS.md

@@ -5,7 +5,7 @@
 This is **claude-mods** - a collection of custom extensions for Claude Code:
 - **12 expert agents** for domains without a skill twin (Cloudflare, Cypress, CraftCMS, git-agent, web scraping, etc.) - language/framework experts were folded into their `-ops` skills (v3.0, skills-first)
 - **2 commands** for session management (/sync, /save)
-- **80 skills** for CLI tools, patterns, workflows, and development tasks (incl. `supply-chain-defense` for behavioural-first dependency security, `prompt-injection-defense` for instruction-integrity scanning, `net-ops` for network troubleshooting, `windows-ops` / `mac-ops` for workstation diagnostics)
+- **81 skills** for CLI tools, patterns, workflows, and development tasks (incl. `supply-chain-defense` for behavioural-first dependency security, `prompt-injection-defense` for instruction-integrity scanning, `net-ops` for network troubleshooting, `windows-ops` / `mac-ops` for workstation diagnostics)
 - **13 output styles** for response personality (Vesper, Spartan, Mentor, Executive, Pair, Atlas, Coach, Harbour, Meridian, Noir, Roast, Sage, Scout)
 - **9 hooks** for pre-commit linting, post-edit formatting, dangerous command warnings, uv enforcement, dependency-install + manifest-edit supply-chain advisories, hidden-Unicode scanning (session-start + pre-commit), and pmail notifications
 - **Pigeon** inter-session messaging (`pigeon send/read/reply`) - SQLite-backed pmail at `~/.claude/pmail.db`

+ 17 - 2
CHANGELOG.md

@@ -13,8 +13,23 @@ feature releases live in the README "Recent Updates" section.
   laravel, sql, postgres) - unique agent content folded into the skills;
   dispatching skills (review, testgen, explain, perf-ops, security-ops) now
   route `general-purpose` agents with skill preloading. 23 → 12 agents.
-
-### Added
+- `claude-code-debug`, `claude-code-headless`, `claude-code-hooks` skills -
+  merged into `claude-code-internals` (content was written against Claude Code
+  ~2.0; the stale `$TOOL_INPUT` hook contract is gone, stdin JSON is current)
+
+### Added
+- **`claude-api-ops` skill** - building ON Claude: Messages API, tool use,
+  prompt caching, structured outputs (`output_config.format`), batches,
+  extended thinking, model selection, Agent SDK (Python + TypeScript)
+- **`playwright-ops` skill** - e2e testing: selector hierarchy, fixtures/POM,
+  network mocking, auth storageState, CI sharding, flake hunting, config template
+- **`terraform-ops` skill** - Terraform/OpenTofu IaC: state management,
+  module patterns, OIDC CI/CD workflow template, drift detection, write-only
+  secrets, native `terraform test`
+- **`claude-code-internals` skill** - merges + refreshes claude-code-debug,
+  claude-code-headless, claude-code-hooks against current docs: 30-event hook
+  catalog with JSON contracts, current skill frontmatter spec, headless/CLI
+  reference, extension debugging decision trees
 - CI: doc-drift gate (`tests/doc-drift.sh`) - docs must match disk
 - CI: skill behavioural test suites (`tests/run-skill-tests.sh`)
 

+ 7 - 6
README.md

@@ -12,13 +12,13 @@
 
 > *A comprehensive extension toolkit that transforms Claude Code into a specialized development powerhouse.*
 
-**claude-mods** is a production-ready plugin that extends Claude Code with 80 specialized skills, 12 expert agents, 13 output styles, 9 hooks, and modern CLI tools designed for real-world development workflows. Whether you're debugging React hooks, optimizing PostgreSQL queries, or building production CLI applications, this toolkit equips Claude with the domain expertise and procedural knowledge to work at expert level across multiple technology stacks.
+**claude-mods** is a production-ready plugin that extends Claude Code with 81 specialized skills, 12 expert agents, 13 output styles, 9 hooks, and modern CLI tools designed for real-world development workflows. Whether you're debugging React hooks, optimizing PostgreSQL queries, or building production CLI applications, this toolkit equips Claude with the domain expertise and procedural knowledge to work at expert level across multiple technology stacks.
 
 Built on the [Agent Skills specification](https://agentskills.io/specification) (an open standard backed by Anthropic, Vercel, Google, Microsoft, and 40+ agent platforms), claude-mods fills critical gaps in Claude Code's capabilities: persistent session state that survives across machines, on-demand expert knowledge for specialized domains, token-efficient modern CLI tools (10-100x faster than traditional alternatives), and proven workflow patterns for TDD, code review, and feature development. The toolkit implements Anthropic's [recommended patterns for long-running agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents), ensuring your development context never vanishes when sessions end.
 
 From Python async patterns to Rust ownership models, from AWS Fargate deployments to Craft CMS development - claude-mods provides the specialized knowledge and tools that transform Claude from a general-purpose assistant into a domain expert who understands your stack, remembers your workflow, and ships production code.
 
-**12 agents. 80 skills. 13 styles. 9 hooks. 7 rules. One install.**
+**12 agents. 81 skills. 13 styles. 9 hooks. 7 rules. One install.**
 
 ## Recent Updates
 
@@ -161,7 +161,7 @@ claude-mods/
 ├── .claude-plugin/     # Plugin metadata
 ├── agents/             # Expert subagents (12)
 ├── commands/           # Slash commands (2)
-├── skills/             # Custom skills (80)
+├── skills/             # Custom skills (81)
 ├── output-styles/      # Response personalities
 ├── hooks/              # Hook examples & docs
 ├── rules/              # Claude Code rules
@@ -283,6 +283,7 @@ See [skill-creator](skills/skill-creator/) for the complete guide.
 | [sql-ops](skills/sql-ops/) | CTEs, window functions, JOIN patterns, indexing |
 | [postgres-ops](skills/postgres-ops/) | PostgreSQL operations, optimization, schema design, replication, monitoring |
 | [sqlite-ops](skills/sqlite-ops/) | SQLite schemas, Python sqlite3/aiosqlite patterns |
+| [claude-api-ops](skills/claude-api-ops/) | Build on Claude - Messages API, tool use, prompt caching, structured outputs, batches, Agent SDK |
 | [mcp-ops](skills/mcp-ops/) | MCP server development, FastMCP, transports, tool design, testing |
 
 #### Infrastructure Skills
@@ -296,6 +297,7 @@ See [skill-creator](skills/skill-creator/) for the complete guide.
 | [monitoring-ops](skills/monitoring-ops/) | Prometheus, Grafana, OpenTelemetry, structured logging, alerting |
 | [debug-ops](skills/debug-ops/) | Systematic debugging, language-specific debuggers, common scenarios |
 | [perf-ops](skills/perf-ops/) | Performance profiling - CPU, memory, bundle analysis, load testing, flamegraphs |
+| [terraform-ops](skills/terraform-ops/) | Terraform/OpenTofu IaC - state management, module patterns, OIDC CI/CD, drift detection, secrets |
 | [supply-chain-defense](skills/supply-chain-defense/) | Behavioural-first dependency security - Socket.dev (free CLI + depscore MCP), exposure-check (IOC match across npm/pnpm/yarn/bun/PyPI/Composer/Cargo/Go/RubyGems + extensions), integrity-audit (worm persistence), scan-extensions, install/manifest hooks |
 | [prompt-injection-defense](skills/prompt-injection-defense/) | Instruction-integrity defense - hidden Unicode scanning (bidi/Trojan Source, tag-block smuggling, zero-width), content sanitization, trust-boundary doctrine |
 | [security-ops](skills/security-ops/) | Reactive security auditing - 3 parallel agents (dependency CVEs, SAST patterns, auth/config review) consolidated into OWASP-mapped report |
@@ -355,9 +357,8 @@ See [skill-creator](skills/skill-creator/) for the complete guide.
 | [scaffold](skills/scaffold/) | Project scaffolding - generate boilerplate for APIs, web apps, CLIs, monorepos |
 | [iterate](skills/iterate/) | Autonomous improvement loop - modify, measure, keep or discard, repeat. Inspired by Karpathy's autoresearch. |
 | [testing-ops](skills/testing-ops/) | Test strategy patterns - mocking, CI testing, test data design |
-| [claude-code-debug](skills/claude-code-debug/) | Troubleshoot Claude Code extensions - skills not loading, hooks not firing |
-| [claude-code-headless](skills/claude-code-headless/) | Run Claude Code programmatically - headless mode, output formats, CI/CD scripting |
-| [claude-code-hooks](skills/claude-code-hooks/) | Claude Code hook system - events, configuration, security patterns |
+| [claude-code-internals](skills/claude-code-internals/) | Claude Code internals - full hook event catalog, skill frontmatter spec, headless/CLI reference, extension debugging |
+| [playwright-ops](skills/playwright-ops/) | Playwright e2e testing - selector hierarchy, fixtures, network mocking, CI sharding, flake hunting |
 
 ### Hooks
 

+ 4 - 4
docs/PLAN.md

@@ -16,7 +16,7 @@
 | Component | Count | Notes |
 |-----------|-------|-------|
 | Agents | 12 | Domain experts without skill twins + git-agent background worker |
-| Skills | 80 | Operational skills, CLI tools, workflows, diagnostics, security |
+| Skills | 81 | Operational skills, CLI tools, workflows, diagnostics, security |
 | Commands | 2 | Session management (sync, save) |
 | Rules | 7 | cli-tools, commit-style, naming-conventions, prompt-injection, skill-agent-updates, supply-chain, worktree-boundaries |
 | Output Styles | 13 | Vesper, Spartan, Mentor, Executive, Pair, Atlas, Coach, Harbour, Meridian, Noir, Roast, Sage, Scout |
@@ -43,10 +43,10 @@ Counts are enforced by the CI doc-drift gate (see roadmap) — if this table rot
       typescript, javascript, go, rust, react, vue, astro, laravel, sql,
       postgres). Unique content folded into twin skills; dispatching skills
       now route general-purpose agents with skill preloading. 23 → 12 agents.
-- [ ] **claude-code-internals**: merge + refresh claude-code-debug /
+- [x] **claude-code-internals**: merged + refreshed claude-code-debug /
       claude-code-headless / claude-code-hooks against current official docs
-      (new hook events, skill frontmatter fields, CLI flags).
-- [ ] **New skills**: claude-api-ops (Messages API, tool use, caching, Agent SDK),
+      (30-event hook catalog, current skill frontmatter, current CLI flags).
+- [x] **New skills**: claude-api-ops (Messages API, tool use, caching, Agent SDK),
       playwright-ops, terraform-ops.
 
 ### Phase 3 — Distribution & native-feature adoption

+ 3 - 0
scripts/install.ps1

@@ -43,6 +43,9 @@ $deprecated = @(
     "$claudeDir\skills\conclave",
     "$claudeDir\skills\claude-code-templates",  # Replaced by skill-creator
     "$claudeDir\skills\agentmail",              # Renamed to pigeon (v2.3.0)
+    "$claudeDir\skills\claude-code-debug",      # Merged into claude-code-internals (v3.0)
+    "$claudeDir\skills\claude-code-headless",   # Merged into claude-code-internals (v3.0)
+    "$claudeDir\skills\claude-code-hooks",      # Merged into claude-code-internals (v3.0)
 
     # Deprecated agents (v3.0): folded into their -ops skill twins
     "$claudeDir\agents\python-expert.md",

+ 3 - 0
scripts/install.sh

@@ -57,6 +57,9 @@ deprecated_items=(
     "$CLAUDE_DIR/skills/conclave"                # Deprecated
     "$CLAUDE_DIR/skills/claude-code-templates"   # Replaced by skill-creator
     "$CLAUDE_DIR/skills/agentmail"               # Renamed to pigeon (v2.3.0)
+    "$CLAUDE_DIR/skills/claude-code-debug"       # Merged into claude-code-internals (v3.0)
+    "$CLAUDE_DIR/skills/claude-code-headless"    # Merged into claude-code-internals (v3.0)
+    "$CLAUDE_DIR/skills/claude-code-hooks"       # Merged into claude-code-internals (v3.0)
 
     # Deprecated agents (v3.0): folded into their -ops skill twins
     "$CLAUDE_DIR/agents/python-expert.md"

+ 275 - 0
skills/claude-api-ops/SKILL.md

@@ -0,0 +1,275 @@
+---
+name: claude-api-ops
+description: "Building applications ON Claude - the Anthropic API and Claude Agent SDK. Use for: anthropic api, claude api, messages api, tool use, function calling, prompt caching, agent sdk, claude-agent-sdk, structured output, json schema output, batches api, extended thinking, adaptive thinking, model selection, claude pricing, build claude agent, anthropic sdk, stop_reason handling, streaming claude, token counting, cache_control, output_config, tool_choice, agentic loop, rate limits anthropic."
+license: MIT
+allowed-tools: "Read Write Bash WebFetch"
+metadata:
+  author: claude-mods
+  related-skills: mcp-ops
+---
+
+# Claude API Operations
+
+Building applications and agents on Anthropic's API: the Messages API, tool use,
+prompt caching, structured outputs, batches, thinking/effort, and the Claude
+Agent SDK. For developers writing apps *against* the API — not for using Claude
+Code itself.
+
+**API surfaces move fast.** Model IDs, parameters, and betas in this skill were
+verified against platform.claude.com (2026-06). When in doubt — especially for
+"latest model" or pricing questions — verify with WebFetch against
+`https://platform.claude.com/docs/en/about-claude/models/overview.md` or query
+the Models API (`client.models.list()`).
+
+## Current Models (verified 2026-06)
+
+| Model | ID (exact, no date suffix) | Context | Max Output | Input $/MTok | Output $/MTok |
+|---|---|---|---|---|---|
+| Claude Fable 5 | `claude-fable-5` | 1M | 128K | $10.00 | $50.00 |
+| Claude Opus 4.8 | `claude-opus-4-8` | 1M | 128K | $5.00 | $25.00 |
+| Claude Sonnet 4.6 | `claude-sonnet-4-6` | 1M | 64K | $3.00 | $15.00 |
+| Claude Haiku 4.5 | `claude-haiku-4-5` | 200K | 64K | $1.00 | $5.00 |
+
+Use these alias IDs verbatim. **Never append date suffixes** (`claude-sonnet-4-6-20251114`
+is wrong → 404). Older actives: `claude-opus-4-7`, `claude-opus-4-6`, `claude-opus-4-5`,
+`claude-sonnet-4-5`. Live capability lookup: `client.models.retrieve("claude-opus-4-8")`
+→ `.max_input_tokens`, `.max_tokens`, `.capabilities` dict.
+
+## Model Selection Decision Tree
+
+```
+What is the workload?
+│
+├─ Hardest problems, long-horizon agents, deep research, ceiling intelligence
+│  └─ claude-fable-5 (premium) or claude-opus-4-8 (default flagship)
+│
+├─ Agentic coding, tool-heavy workflows, production assistants
+│  └─ claude-opus-4-8 (quality) or claude-sonnet-4-6 (speed/cost balance)
+│
+├─ High-volume production: summarization, RAG answers, extraction
+│  └─ claude-sonnet-4-6
+│
+├─ Classification, routing, simple Q&A, latency-critical
+│  └─ claude-haiku-4-5
+│
+└─ Subagents inside a larger system
+   └─ One tier below the orchestrator (Opus loop → Sonnet/Haiku workers)
+```
+
+Tiering rule: route by task difficulty, not by uniform default. An Opus
+orchestrator dispatching Haiku classifiers is routinely 5-10x cheaper than
+Opus-everywhere with no quality loss on the simple legs.
+
+## Which Surface? (API vs Agent SDK vs Batches)
+
+| Need | Use | Why |
+|---|---|---|
+| One request → one response (classify, summarize, extract, Q&A) | **Messages API** | Simplest; full control |
+| Multi-step pipeline, your code controls the logic | **Messages API + tool use** | You own the loop |
+| Custom agent with your own tools, your infra | **Messages API + tool use** (manual loop or SDK tool runner) | Max flexibility |
+| Agent that reads/edits files, runs commands, searches — without building tools | **Claude Agent SDK** | Claude Code's tools + agent loop as a library |
+| CI/CD automation, coding agents, production agent apps | **Claude Agent SDK** | Built-in tools, hooks, sessions, MCP |
+| Large non-urgent workloads (eval runs, backfills, bulk extraction) | **Batches API** | 50% discount, ≤24h turnaround |
+| Hosted agent, Anthropic runs loop + sandbox | **Managed Agents** (beta) | No infra; see official docs |
+
+Rule of thumb: start at the simplest tier. Reach for an agent only when the
+task is genuinely open-ended (multi-step, hard to fully specify, errors
+recoverable, value justifies cost).
+
+## Messages API Quick Start
+
+Everything goes through `POST /v1/messages`. Headers: `x-api-key`,
+`anthropic-version: 2023-06-01`, `content-type: application/json`.
+
+```python
+# pip install anthropic
+import anthropic
+
+client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY
+
+response = client.messages.create(
+    model="claude-opus-4-8",
+    max_tokens=16000,
+    system="You are a concise technical assistant.",
+    messages=[{"role": "user", "content": "Explain CRDTs in one paragraph."}],
+)
+for block in response.content:        # content is a list of typed blocks
+    if block.type == "text":          # always check .type before .text
+        print(block.text)
+print(response.stop_reason, response.usage.input_tokens, response.usage.output_tokens)
+```
+
+```typescript
+// npm install @anthropic-ai/sdk
+import Anthropic from "@anthropic-ai/sdk";
+
+const client = new Anthropic();
+
+const response = await client.messages.create({
+  model: "claude-opus-4-8",
+  max_tokens: 16000,
+  messages: [{ role: "user", content: "Explain CRDTs in one paragraph." }],
+});
+for (const block of response.content) {
+  if (block.type === "text") console.log(block.text);  // narrow the union first
+}
+```
+
+Streaming (default to it for long outputs — non-streaming above ~16K
+`max_tokens` risks SDK HTTP timeouts):
+
+```python
+with client.messages.stream(model="claude-opus-4-8", max_tokens=64000,
+                            messages=[{"role": "user", "content": "Write a long report"}]) as stream:
+    for text in stream.text_stream:
+        print(text, end="", flush=True)
+    final = stream.get_final_message()   # full Message after streaming
+```
+
+Full params, response shape, stop reasons, errors, retries, rate limits:
+[references/messages-api.md](references/messages-api.md)
+
+## Thinking & Effort (quick reference)
+
+- **Claude 4.6+ (Fable 5, Opus 4.8/4.7/4.6, Sonnet 4.6):** use
+  `thinking: {"type": "adaptive"}`. The old fixed budget
+  `{"type": "enabled", "budget_tokens": N}` is **removed on Fable 5 / Opus 4.8 / 4.7
+  (400 error)** and deprecated on Opus 4.6 / Sonnet 4.6.
+- **Effort (GA):** `output_config: {"effort": "low" | "medium" | "high" | "xhigh" | "max"}`
+  — nested in `output_config`, not top-level. Default `high`. `xhigh` (Opus 4.7+)
+  is best for coding/agentic work; `max` is Opus-tier + Sonnet 4.6 only.
+- **Sampling params removed on Fable 5 / Opus 4.8 / 4.7:** `temperature`,
+  `top_p`, `top_k` all return 400. Steer with prompting + effort.
+- **Thinking + forced tool_choice is incompatible:** with thinking on, only
+  `tool_choice: {"type": "auto"}` (default) or `"none"` is allowed —
+  `{"type": "any"}` or `{"type": "tool", ...}` returns a 400.
+- Thinking text is **omitted by default** on Fable 5 / Opus 4.8 / 4.7 — opt in
+  with `thinking: {"type": "adaptive", "display": "summarized"}` if you surface
+  reasoning to users.
+
+Details and gotchas: [references/structured-outputs.md](references/structured-outputs.md)
+(thinking interplay) and [references/messages-api.md](references/messages-api.md).
+
+## Tool Use (quick reference)
+
+```python
+tools = [{
+    "name": "get_weather",
+    "description": "Get current weather. Call when the user asks about weather conditions.",
+    "input_schema": {
+        "type": "object",
+        "properties": {"location": {"type": "string", "description": "City, e.g. Paris"}},
+        "required": ["location"],
+    },
+}]
+response = client.messages.create(model="claude-opus-4-8", max_tokens=16000,
+                                  tools=tools, messages=messages)
+if response.stop_reason == "tool_use":
+    ...  # execute, send tool_result back, loop
+```
+
+`tool_choice`: `{"type": "auto"}` (default) | `{"type": "any"}` | `{"type":
+"tool", "name": "..."}` | `{"type": "none"}`. Add
+`"disable_parallel_tool_use": true` to force at most one call per response.
+
+The agentic loop, parallel tool results, `pause_turn`, `is_error`, server-side
+tools, and SDK tool runners: [references/tool-use.md](references/tool-use.md)
+
+## Cost Optimization Checklist
+
+Work top-down; each item is independent:
+
+- [ ] **Right-size the model.** Haiku for classification/routing, Sonnet for
+      volume work, Opus/Fable for the hard 10%. Largest single lever.
+- [ ] **Prompt caching** on stable prefixes (system prompt, tool defs, big docs):
+      `cache_control: {"type": "ephemeral"}`. Reads cost ~0.1x; up to 90% savings.
+      Verify with `usage.cache_read_input_tokens > 0` — zero means a silent
+      invalidator (timestamp in system prompt, unsorted JSON, varying tools).
+- [ ] **Batches API** for anything that can wait ≤24h: flat 50% off all tokens,
+      stacks with caching.
+- [ ] **Cap output**: set `max_tokens` to what you need (256 for classification);
+      stream + generous cap for long generation.
+- [ ] **Tune effort down** where quality allows: `medium` is often the sweet
+      spot; `low` for subagents and simple tasks.
+- [ ] **Count before sending**: `client.messages.count_tokens(...)` (never
+      tiktoken — it's OpenAI's tokenizer and undercounts Claude by 15-20%).
+- [ ] **Keep prefixes stable**: order requests `tools` → `system` → `messages`,
+      volatile content last; don't swap tool sets or models mid-conversation.
+
+Mechanics, breakpoints, TTLs, batch lifecycle, tiering math:
+[references/caching-and-cost.md](references/caching-and-cost.md)
+
+## Claude Agent SDK (quick reference)
+
+```python
+# pip install claude-agent-sdk   (Python >= 3.10)
+import asyncio
+from claude_agent_sdk import query, ClaudeAgentOptions
+
+async def main():
+    async for message in query(
+        prompt="Find and fix the bug in auth.py",
+        options=ClaudeAgentOptions(allowed_tools=["Read", "Edit", "Bash"]),
+    ):
+        if hasattr(message, "result"):
+            print(message.result)
+
+asyncio.run(main())
+```
+
+```typescript
+// npm install @anthropic-ai/claude-agent-sdk
+import { query } from "@anthropic-ai/claude-agent-sdk";
+
+for await (const message of query({
+  prompt: "Find and fix the bug in auth.ts",
+  options: { allowedTools: ["Read", "Edit", "Bash"] },
+})) {
+  if ("result" in message) console.log(message.result);
+}
+```
+
+Built-in tools (Read/Write/Edit/Bash/Glob/Grep/WebSearch/WebFetch/...), hooks
+(`PreToolUse`, `PostToolUse`, ...), subagents, MCP servers, sessions
+(resume/fork), permission modes, and the SDK-vs-raw-API decision:
+[references/agent-sdk.md](references/agent-sdk.md)
+
+## Common Pitfalls
+
+| Pitfall | Symptom | Fix |
+|---|---|---|
+| Date-suffixed or guessed model ID | 404 `not_found_error` | Use exact alias IDs from the table above |
+| `budget_tokens` on Fable 5 / Opus 4.8 / 4.7 | 400 | `thinking: {"type": "adaptive"}` |
+| `temperature`/`top_p`/`top_k` on Fable 5 / Opus 4.8 / 4.7 | 400 | Remove; steer via prompt + `effort` |
+| Thinking + `tool_choice: any/tool` | 400 | Only `auto`/`none` with thinking on |
+| Assistant-turn prefill on 4.6+ models | 400 | `output_config.format` or system-prompt instruction |
+| Cache marker on <minimum prefix | Silent no-cache (`cache_creation_input_tokens: 0`) | Min ~1024-4096 tokens depending on model (see caching ref) |
+| Not handling `stop_reason: "tool_use"` | Agent "stops" after first tool call | Loop: execute tools, append `tool_result`, re-request |
+| Missing `tool_result` for a `tool_use` id | 400 on follow-up | One `tool_result` per `tool_use` block, ids matching |
+| Non-streaming with `max_tokens` > ~16K | SDK timeout / `ValueError` | Stream + `get_final_message()` / `finalMessage()` |
+| `output_format` top-level param | Deprecated | `output_config: {"format": {...}}` |
+| tiktoken for Claude token counts | 15-20%+ undercount | `messages.count_tokens` endpoint |
+| String-matching error messages | Fragile retries | Typed exceptions: `anthropic.RateLimitError` etc. |
+| Raw string-matching tool `input` | Breaks on escaping changes | Always `json.loads()` / use parsed `block.input` |
+
+## Reference Files
+
+| File | Covers |
+|---|---|
+| [references/messages-api.md](references/messages-api.md) | Params, response shape, streaming events, stop reasons, error handling, retries, rate limits |
+| [references/tool-use.md](references/tool-use.md) | Tool definitions, tool_choice, parallel tools, agentic loop, tool results, server tools, tool runners |
+| [references/caching-and-cost.md](references/caching-and-cost.md) | Prompt caching mechanics, Batches API, token counting, model tiering economics |
+| [references/structured-outputs.md](references/structured-outputs.md) | output_config.format, schema rules/limits, strict tools, parse() helpers, thinking interplay |
+| [references/agent-sdk.md](references/agent-sdk.md) | Python + TS Agent SDK, ClaudeAgentOptions, hooks, MCP, sessions, SDK vs raw API |
+
+## Live Documentation
+
+When cached facts may be stale, WebFetch (append `.md` for clean markdown):
+
+- Models/pricing: `https://platform.claude.com/docs/en/about-claude/models/overview.md`
+- Messages API: `https://platform.claude.com/docs/en/api/messages`
+- Tool use: `https://platform.claude.com/docs/en/agents-and-tools/tool-use/overview.md`
+- Prompt caching: `https://platform.claude.com/docs/en/build-with-claude/prompt-caching.md`
+- Structured outputs: `https://platform.claude.com/docs/en/build-with-claude/structured-outputs.md`
+- Batches: `https://platform.claude.com/docs/en/build-with-claude/batch-processing.md`
+- Agent SDK: `https://code.claude.com/docs/en/agent-sdk/overview`

+ 0 - 0
skills/claude-code-debug/assets/.gitkeep → skills/claude-api-ops/assets/.gitkeep


+ 254 - 0
skills/claude-api-ops/references/agent-sdk.md

@@ -0,0 +1,254 @@
+# Claude Agent SDK Reference
+
+The Agent SDK is **Claude Code as a library**: the same agent loop, built-in
+tools (file ops, bash, search, web), context management, and permission system,
+programmable from Python or TypeScript. Verified against
+code.claude.com/docs/en/agent-sdk (2026-06).
+
+| | Python | TypeScript |
+|---|---|---|
+| Package | `claude-agent-sdk` (`pip install claude-agent-sdk`) | `@anthropic-ai/claude-agent-sdk` (`npm install @anthropic-ai/claude-agent-sdk`) |
+| Runtime | Python ≥ 3.10 | Node (bundles a native Claude Code binary — no separate install) |
+| Entry point | `query(prompt, options=ClaudeAgentOptions(...))` → async iterator | `query({ prompt, options })` → async iterator |
+| Option naming | `snake_case` (`allowed_tools`) | `camelCase` (`allowedTools`) |
+
+Auth: `ANTHROPIC_API_KEY`, or third-party providers via env flags
+(`CLAUDE_CODE_USE_BEDROCK=1`, `CLAUDE_CODE_USE_VERTEX=1`,
+`CLAUDE_CODE_USE_FOUNDRY=1`, `CLAUDE_CODE_USE_ANTHROPIC_AWS=1` +
+`ANTHROPIC_AWS_WORKSPACE_ID`). Note: from June 15, 2026, Agent SDK / `claude -p`
+usage on subscription plans draws from a separate monthly Agent SDK credit —
+production apps should use API keys.
+
+## When SDK vs raw API
+
+| Signal | Choice |
+|---|---|
+| Agent must read/edit files, run commands, search a codebase | **Agent SDK** — tools are built in, loop is handled |
+| CI/CD automation, "fix the failing test", repo-scale refactors | **Agent SDK** |
+| You want hooks, permission gating, session resume out of the box | **Agent SDK** |
+| Single call: classify/summarize/extract | **Messages API** — SDK is overkill |
+| You need exact control of every request (params, caching breakpoints, message shapes) | **Messages API + tool use** |
+| Your tools are pure in-process functions, no filesystem | **Messages API + tool runner** is lighter |
+| Hosted agent, no infra at all | **Managed Agents** (REST, Anthropic-run sandbox) |
+
+The difference in code:
+
+```python
+# Messages API: you implement the loop
+response = client.messages.create(...)
+while response.stop_reason == "tool_use":
+    result = your_tool_executor(...)
+    response = client.messages.create(...)
+
+# Agent SDK: the loop, tools, and context management are inside query()
+async for message in query(prompt="Fix the bug in auth.py"):
+    print(message)
+```
+
+## Minimal agents
+
+```python
+import asyncio
+from claude_agent_sdk import query, ClaudeAgentOptions
+
+async def main():
+    async for message in query(
+        prompt="Find all TODO comments and create a summary",
+        options=ClaudeAgentOptions(allowed_tools=["Read", "Glob", "Grep"]),
+    ):
+        if hasattr(message, "result"):       # final ResultMessage
+            print(message.result)
+
+asyncio.run(main())
+```
+
+```typescript
+import { query } from "@anthropic-ai/claude-agent-sdk";
+
+for await (const message of query({
+  prompt: "Find all TODO comments and create a summary",
+  options: { allowedTools: ["Read", "Glob", "Grep"] },
+})) {
+  if ("result" in message) console.log(message.result);
+}
+```
+
+The iterator yields typed messages: system messages (`subtype: "init"` carries
+`session_id`), assistant/tool activity, and a final result message
+(`message.result` / `ResultMessage`).
+
+## ClaudeAgentOptions (key fields)
+
+| Python | TypeScript | Purpose |
+|---|---|---|
+| `allowed_tools` | `allowedTools` | Pre-approved tools (no permission prompt). Include `"Agent"` to auto-approve subagent spawns |
+| `disallowed_tools` | `disallowedTools` | Hard-blocked tools |
+| `permission_mode` | `permissionMode` | e.g. `"default"`, `"acceptEdits"`, `"bypassPermissions"`, `"plan"` |
+| `system_prompt` | `systemPrompt` | Replace or extend the system prompt |
+| `mcp_servers` | `mcpServers` | MCP server map (see below) |
+| `hooks` | `hooks` | Lifecycle callbacks (see below) |
+| `agents` | `agents` | Named subagent definitions |
+| `resume` | `resume` | Session ID to continue with full context |
+| `cwd` | `cwd` | Working directory for the agent |
+| `model` | `model` | Override model |
+| `max_turns` | `maxTurns` | Cap agent-loop iterations |
+| `setting_sources` | `settingSources` | Restrict which filesystem config loads (`.claude/`, `~/.claude/`) |
+| `plugins` | `plugins` | Programmatic plugin loading |
+
+### Built-in tools
+
+`Read`, `Write`, `Edit`, `Bash`, `Glob`, `Grep`, `WebSearch`, `WebFetch`,
+`Monitor` (watch a background process), `AskUserQuestion` (clarifying
+questions with options), `Agent` (spawn subagents). A read-only agent is just
+`allowed_tools=["Read", "Glob", "Grep"]`.
+
+## Hooks
+
+Callbacks at lifecycle points: `PreToolUse`, `PostToolUse`, `Stop`,
+`SessionStart`, `SessionEnd`, `UserPromptSubmit`, and more. Use them to audit,
+block, or transform agent behavior.
+
+```python
+from claude_agent_sdk import query, ClaudeAgentOptions, HookMatcher
+from datetime import datetime
+
+async def log_file_change(input_data, tool_use_id, context):
+    path = input_data.get("tool_input", {}).get("file_path", "unknown")
+    with open("./audit.log", "a") as f:
+        f.write(f"{datetime.now()}: modified {path}\n")
+    return {}        # empty dict = allow; hooks can also block/modify
+
+options = ClaudeAgentOptions(
+    permission_mode="acceptEdits",
+    hooks={"PostToolUse": [HookMatcher(matcher="Edit|Write", hooks=[log_file_change])]},
+)
+```
+
+```typescript
+import { query, HookCallback } from "@anthropic-ai/claude-agent-sdk";
+import { appendFile } from "fs/promises";
+
+const logFileChange: HookCallback = async (input) => {
+  const filePath = (input as any).tool_input?.file_path ?? "unknown";
+  await appendFile("./audit.log", `${new Date().toISOString()}: modified ${filePath}\n`);
+  return {};
+};
+
+const options = {
+  permissionMode: "acceptEdits" as const,
+  hooks: { PostToolUse: [{ matcher: "Edit|Write", hooks: [logFileChange] }] },
+};
+```
+
+`matcher` is a regex over tool names. A `PreToolUse` hook returning a deny
+decision blocks the call — this is the programmatic equivalent of a human
+approval gate.
+
+## MCP servers
+
+Wire any MCP server (stdio command or remote) into the agent:
+
+```python
+options = ClaudeAgentOptions(
+    mcp_servers={
+        "playwright": {"command": "npx", "args": ["@playwright/mcp@latest"]},
+    },
+)
+```
+
+```typescript
+const options = {
+  mcpServers: {
+    playwright: { command: "npx", args: ["@playwright/mcp@latest"] },
+  },
+};
+```
+
+MCP tools surface as `mcp__<server>__<tool>` — add them to
+`allowed_tools` to pre-approve. This is how you give an agent databases,
+browsers, and third-party APIs without writing tool plumbing.
+
+## Subagents
+
+```python
+from claude_agent_sdk import query, ClaudeAgentOptions, AgentDefinition
+
+options = ClaudeAgentOptions(
+    allowed_tools=["Read", "Glob", "Grep", "Agent"],   # Agent tool approves spawns
+    agents={
+        "code-reviewer": AgentDefinition(
+            description="Expert code reviewer for quality and security reviews.",
+            prompt="Analyze code quality and suggest improvements.",
+            tools=["Read", "Glob", "Grep"],
+        ),
+    },
+)
+# prompt: "Use the code-reviewer agent to review this codebase"
+```
+
+Messages emitted inside a subagent carry `parent_tool_use_id` so you can
+attribute output to the spawning call. Use subagents to isolate context
+(reviewer doesn't pollute the main transcript) and to parallelize independent
+legs.
+
+## Sessions: resume and fork
+
+Session state is JSONL on your filesystem. Capture the session ID from the
+init message, resume later with full context:
+
+```python
+from claude_agent_sdk import query, ClaudeAgentOptions, SystemMessage, ResultMessage
+
+session_id = None
+async for message in query(prompt="Read the authentication module",
+                           options=ClaudeAgentOptions(allowed_tools=["Read", "Glob"])):
+    if isinstance(message, SystemMessage) and message.subtype == "init":
+        session_id = message.data["session_id"]
+
+async for message in query(prompt="Now find all places that call it",
+                           options=ClaudeAgentOptions(resume=session_id)):
+    if isinstance(message, ResultMessage):
+        print(message.result)
+```
+
+```typescript
+let sessionId: string | undefined;
+for await (const message of query({ prompt: "Read the authentication module",
+                                    options: { allowedTools: ["Read", "Glob"] } })) {
+  if (message.type === "system" && message.subtype === "init") {
+    sessionId = message.session_id;
+  }
+}
+for await (const message of query({ prompt: "Now find all places that call it",
+                                    options: { resume: sessionId } })) {
+  if ("result" in message) console.log(message.result);
+}
+```
+
+Sessions can also be forked to explore alternative approaches from the same
+context point.
+
+## Filesystem configuration
+
+With default options the SDK loads Claude Code's filesystem config from
+`.claude/` (project) and `~/.claude/` (user): skills
+(`.claude/skills/*/SKILL.md`), commands, `CLAUDE.md` memory, plugins. Restrict
+with `setting_sources` / `settingSources` when you want a hermetic agent (CI)
+that ignores developer-machine state.
+
+## Production patterns
+
+- **CI agent:** `permission_mode="bypassPermissions"` (or a tight
+  `allowed_tools` list) + `max_turns` cap + hooks for audit logging. Never
+  bypass permissions on a machine with credentials you don't want the agent
+  exercising.
+- **Approval gate:** `PreToolUse` hook on `Bash|Write|Edit` that checks the
+  input and returns deny for out-of-policy actions.
+- **Observability:** log every message from the iterator; hook
+  `PostToolUse` for tool-level metrics; final `ResultMessage` includes
+  cost/usage data.
+- **Prototype → production:** a common path is Agent SDK locally (works on
+  your filesystem), then Managed Agents for hosted production (Anthropic runs
+  the sandbox; custom tools become event round-trips).
+- Workflows translate 1:1 with the Claude Code CLI (`claude -p`) — anything
+  you can do interactively you can automate via the SDK.

+ 244 - 0
skills/claude-api-ops/references/caching-and-cost.md

@@ -0,0 +1,244 @@
+# Caching, Batches & Cost Reference
+
+The three big cost levers, in order of typical impact: model tiering, prompt
+caching, Batches API. They stack — a cached Haiku batch request can cost ~2-3%
+of an uncached Opus interactive request for the same tokens.
+
+## Prompt Caching
+
+### The one invariant
+
+**Caching is a prefix match.** The cache key is the exact bytes of the
+rendered prompt up to each `cache_control` breakpoint. One byte changed
+anywhere in the prefix invalidates everything after it. Render order is
+`tools` → `system` → `messages` — a breakpoint on the last system block caches
+tools + system together.
+
+### Syntax
+
+```python
+response = client.messages.create(
+    model="claude-opus-4-8",
+    max_tokens=16000,
+    system=[{
+        "type": "text",
+        "text": LARGE_STABLE_PROMPT,                      # 50KB of docs, instructions...
+        "cache_control": {"type": "ephemeral"},           # 5-minute TTL (default)
+        # "cache_control": {"type": "ephemeral", "ttl": "1h"}  # 1-hour TTL
+    }],
+    messages=[{"role": "user", "content": question}],
+)
+```
+
+```typescript
+const response = await client.messages.create({
+  model: "claude-opus-4-8",
+  max_tokens: 16000,
+  system: [{ type: "text", text: LARGE_STABLE_PROMPT,
+             cache_control: { type: "ephemeral" } }],
+  messages: [{ role: "user", content: question }],
+});
+```
+
+Simplest option — top-level auto-caching (caches the last cacheable block, no
+per-block markers):
+
+```python
+client.messages.create(model="claude-opus-4-8", max_tokens=16000,
+                       cache_control={"type": "ephemeral"},
+                       system=big_doc, messages=[...])
+```
+
+Rules:
+
+- Max **4** `cache_control` breakpoints per request.
+- Valid on system text blocks, tool definitions, and message content blocks
+  (`text`, `image`, `tool_use`, `tool_result`, `document`).
+- **Minimum cacheable prefix is model-dependent** — below it the marker is
+  silently ignored (no error, just `cache_creation_input_tokens: 0`):
+
+| Model | Minimum prefix tokens |
+|---|---:|
+| Opus 4.8 / 4.7 / 4.6 / 4.5, Haiku 4.5 | 4096 |
+| Fable 5, Sonnet 4.6 | 2048 |
+| Sonnet 4.5 and older Sonnets | 1024 |
+
+### Pricing & break-even
+
+| Operation | Cost vs base input |
+|---|---|
+| Cache write, 5-min TTL | 1.25x |
+| Cache write, 1-hour TTL | 2x |
+| Cache read | ~0.1x |
+
+Break-even: 5-min TTL pays off at the **2nd** request (1.25 + 0.1 = 1.35x vs
+2x); 1-hour TTL needs **3+** requests (2 + 0.2 = 2.2x vs 3x). Steady-state
+savings on a large cached prefix approach 90%.
+
+### Multi-turn / agent placement
+
+- **Multi-turn:** put the breakpoint on the last content block of the latest
+  turn; earlier breakpoints remain valid read points, so hits accrue as the
+  conversation grows. Top-level auto-caching does this for you.
+- **Shared prefix, varying question:** breakpoint at the end of the *shared*
+  part, not the end of the prompt — otherwise every request writes a distinct
+  entry and nothing is ever read.
+- **20-block lookback:** a breakpoint searches backward at most 20 content
+  blocks for a prior entry. Agent turns adding >20 blocks (many
+  tool_use/tool_result pairs) silently miss — add an intermediate breakpoint
+  every ~15 blocks in long turns.
+- **Concurrent fan-out:** an entry becomes readable only once the first
+  response starts streaming. Fire 1 request, await first token, then fire the
+  other N-1 so they read the fresh cache.
+
+### Silent invalidators (audit checklist)
+
+If `usage.cache_read_input_tokens` stays 0 across identical-prefix requests,
+grep the prompt-assembly path for:
+
+| Pattern | Why it kills the cache |
+|---|---|
+| `datetime.now()` / `Date.now()` in the system prompt | New prefix every request |
+| `uuid4()` / request IDs early in content | Same |
+| `json.dumps(d)` without `sort_keys=True` | Non-deterministic bytes |
+| Per-user IDs interpolated into the system prompt | No cross-user sharing |
+| Conditional system sections (`if flag: system += ...`) | Each flag combo = distinct prefix |
+| Tool set built per-user / unsorted | Tools render at position 0 — full invalidation |
+| Switching models mid-conversation | Caches are model-scoped |
+
+Fix: move volatile content **after** the last breakpoint (or into the latest
+user message), serialize deterministically, freeze the system prompt and tool
+list.
+
+### Verifying
+
+```python
+u = response.usage
+print(u.cache_creation_input_tokens)  # written this request (paid 1.25-2x)
+print(u.cache_read_input_tokens)      # served from cache (paid ~0.1x)
+print(u.input_tokens)                 # uncached remainder (full price)
+# total prompt = input + cache_creation + cache_read
+```
+
+### What does NOT invalidate
+
+Changing `tool_choice`, toggling thinking, or message content changes leave
+the tools+system cache intact. Only tool-definition changes and model switches
+force a full rebuild.
+
+## Batches API
+
+`POST /v1/messages/batches` — asynchronous Messages requests at a **flat 50%
+discount on all token usage** (stacks with prompt caching).
+
+| Fact | Value |
+|---|---|
+| Max batch size | 100,000 requests or 256 MB |
+| Turnaround | Usually <1 hour; max 24h |
+| Results retention | 29 days |
+| Feature support | All Messages features (tools, vision, caching, structured outputs, thinking) — no streaming |
+| Rate limits | Separate pool from interactive traffic |
+
+```python
+import anthropic, time
+from anthropic.types.message_create_params import MessageCreateParamsNonStreaming
+from anthropic.types.messages.batch_create_params import Request
+
+client = anthropic.Anthropic()
+
+# 1. Create
+batch = client.messages.batches.create(requests=[
+    Request(custom_id=f"item-{i}",
+            params=MessageCreateParamsNonStreaming(
+                model="claude-haiku-4-5", max_tokens=64,
+                messages=[{"role": "user",
+                           "content": f"Classify sentiment (one word): {text}"}]))
+    for i, text in enumerate(texts)
+])
+
+# 2. Poll
+while True:
+    batch = client.messages.batches.retrieve(batch.id)
+    if batch.processing_status == "ended":
+        break
+    time.sleep(60)
+print(batch.request_counts)  # succeeded / errored / canceled / expired
+
+# 3. Results (order not guaranteed — key on custom_id)
+for result in client.messages.batches.results(batch.id):
+    if result.result.type == "succeeded":
+        msg = result.result.message
+        text = next((b.text for b in msg.content if b.type == "text"), "")
+    elif result.result.type == "errored":
+        # error.type == "invalid_request" → fix and resubmit; otherwise safe to retry
+        ...
+```
+
+```typescript
+const batch = await client.messages.batches.create({
+  requests: [{
+    custom_id: "request-1",
+    params: { model: "claude-sonnet-4-6", max_tokens: 1024,
+              messages: [{ role: "user", content: "Summarize..." }] },
+  }],
+});
+// poll batches.retrieve(batch.id) until processing_status === "ended"
+for await (const result of await client.messages.batches.results(batch.id)) {
+  if (result.result.type === "succeeded") { /* result.result.message */ }
+}
+```
+
+Batch gotchas:
+
+- `custom_id` is your only join key — results stream in completion order, not
+  submission order.
+- Result types: `succeeded` | `errored` | `canceled` | `expired`. Resubmit
+  `expired`; inspect `errored` (validation vs server error).
+- Cancel is async: `batches.cancel(id)` → status `"canceling"`; some requests
+  may still complete.
+- Caching inside batches works, but hit rates are best-effort (requests run
+  concurrently) — put a shared cached `system` block on every request and use
+  the 1-hour TTL.
+- Per-request params are full `MessageCreateParams` minus `stream`.
+
+Use batches for: eval suites, backfills, bulk extraction/classification,
+nightly report generation, regenerating embeddings-adjacent metadata — any
+workload where minutes-to-hours latency is fine.
+
+## Model Tiering Economics
+
+Worked example, 10M input + 1M output tokens/day:
+
+| Strategy | Cost/day |
+|---|---|
+| Everything Opus 4.8 | 10×$5 + 1×$25 = **$75** |
+| Everything Sonnet 4.6 | 10×$3 + 1×$15 = **$45** |
+| Route: 80% Haiku, 15% Sonnet, 5% Opus | ≈ 8×$1 + 1.5×$3 + 0.5×$5 + (output pro-rata ≈ $6.5) = **~$21.5** |
+| Same + cached system prompts (70% of input cached) | **~$8-10** |
+| Same + batchable share moved to Batches | **lower still (50% off that share)** |
+
+Patterns:
+
+- **Router**: a Haiku call classifies difficulty, dispatches to the right model.
+- **Cascade**: try Haiku; escalate to Sonnet/Opus only when confidence is low
+  or validation fails (works well with structured outputs as the validator).
+- **Subagents**: keep the orchestrator on Opus, push parallel/simple legs to
+  Haiku/Sonnet. Spawn a separate request per subagent (don't switch models
+  mid-conversation — it kills the cache).
+- **Effort tuning**: on supported models, dropping `output_config.effort` from
+  `high` to `medium` often cuts output tokens substantially at minor quality
+  cost — cheaper than switching model tier for borderline workloads.
+
+## Token Counting for Cost Estimation
+
+```python
+count = client.messages.count_tokens(
+    model="claude-opus-4-8", system=system, tools=tools, messages=messages)
+est_input_cost = count.input_tokens * 5.00 / 1_000_000   # Opus 4.8 input rate
+```
+
+- Free endpoint; counts include tools and system.
+- Tool use adds a hidden system prompt (~290-800 tokens depending on model and
+  `tool_choice`).
+- Token counts are model-specific — re-baseline when migrating models; don't
+  apply blanket multipliers.

+ 292 - 0
skills/claude-api-ops/references/messages-api.md

@@ -0,0 +1,292 @@
+# Messages API Reference
+
+`POST https://api.anthropic.com/v1/messages` — the single endpoint everything
+runs through. Tools, structured outputs, thinking, and caching are all features
+of this endpoint, not separate APIs.
+
+## Required Headers
+
+| Header | Value |
+|---|---|
+| `x-api-key` | Your API key (`sk-ant-...`) |
+| `anthropic-version` | `2023-06-01` |
+| `content-type` | `application/json` |
+| `anthropic-beta` | Comma-separated beta IDs, only for beta features |
+
+OAuth bearer tokens go on `Authorization: Bearer <token>` instead of
+`x-api-key` (plus `anthropic-beta: oauth-2025-04-20`). Setting both
+`ANTHROPIC_API_KEY` and `ANTHROPIC_AUTH_TOKEN` makes the SDK send both headers
+and the API rejects the request.
+
+## Request Parameters
+
+| Param | Type | Required | Notes |
+|---|---|---|---|
+| `model` | string | yes | Exact alias ID, e.g. `claude-opus-4-8` — no date suffixes |
+| `max_tokens` | int | yes | Hard output cap. Default sensibly: ~16000 non-streaming, ~64000 streaming, ~256 classification |
+| `messages` | array | yes | Alternating `user`/`assistant` turns; first must be `user`. Consecutive same-role messages are merged |
+| `system` | string \| block[] | no | System prompt. Block-list form required for `cache_control` |
+| `tools` | array | no | Custom + server tool definitions (see tool-use.md) |
+| `tool_choice` | object | no | `auto` (default) / `any` / `tool` / `none` |
+| `thinking` | object | no | `{"type": "adaptive"}` on 4.6+; `{"type": "enabled", "budget_tokens": N}` legacy models only |
+| `output_config` | object | no | `{"effort": "...", "format": {...}, "task_budget": {...}}` |
+| `stop_sequences` | string[] | no | Custom stop strings |
+| `stream` | bool | no | SSE streaming |
+| `metadata` | object | no | `{"user_id": "..."}` — opaque end-user id for abuse detection |
+| `temperature` / `top_p` / `top_k` | number | no | **Removed on Fable 5 / Opus 4.8 / 4.7 (400).** On other 4.x: at most one of temperature/top_p |
+| `cache_control` | object | no | Top-level auto-caching: caches the last cacheable block |
+| `container` | string | no | Reuse a code-execution container id |
+| `mcp_servers` | array | no | Remote MCP connector (beta `mcp-client-2025-11-20`) |
+
+### Message content blocks
+
+`content` is either a plain string or an array of blocks:
+
+```json
+{"role": "user", "content": [
+  {"type": "text", "text": "What's in this image?"},
+  {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": "<b64>"}},
+  {"type": "image", "source": {"type": "url", "url": "https://example.com/img.png"}},
+  {"type": "document", "source": {"type": "base64", "media_type": "application/pdf", "data": "<b64>"}},
+  {"type": "tool_result", "tool_use_id": "toolu_...", "content": "..."}
+]}
+```
+
+## Response Shape
+
+```json
+{
+  "id": "msg_01...",
+  "type": "message",
+  "role": "assistant",
+  "model": "claude-opus-4-8",
+  "content": [
+    {"type": "thinking", "thinking": "...", "signature": "..."},
+    {"type": "text", "text": "Hello!"},
+    {"type": "tool_use", "id": "toolu_01...", "name": "get_weather", "input": {"location": "Paris"}}
+  ],
+  "stop_reason": "end_turn",
+  "stop_sequence": null,
+  "usage": {
+    "input_tokens": 1024,
+    "output_tokens": 256,
+    "cache_creation_input_tokens": 0,
+    "cache_read_input_tokens": 0
+  }
+}
+```
+
+`content` is a **list of typed blocks** — never index `content[0].text` blindly
+(a `thinking` block may come first). Filter by `.type`.
+
+## Stop Reasons
+
+| `stop_reason` | Meaning | What to do |
+|---|---|---|
+| `end_turn` | Finished naturally | Done |
+| `max_tokens` | Hit the `max_tokens` cap | Raise the cap or stream; output may be truncated mid-thought |
+| `stop_sequence` | Hit a custom stop string | `stop_sequence` field has which one |
+| `tool_use` | Claude wants tool(s) executed | Execute each `tool_use` block, send `tool_result`(s), re-request |
+| `pause_turn` | Server-side tool loop hit its iteration limit | Append the assistant turn and re-send unchanged — server resumes; do NOT add a "continue" user message |
+| `refusal` | Safety refusal | Check `stop_details` (`category`: "cyber"/"bio"/null, `explanation`); don't retry same prompt |
+| `model_context_window_exceeded` | Context window exhausted (distinct from max_tokens) | Compact, truncate, or split the conversation |
+
+```python
+if response.stop_reason == "refusal" and response.stop_details:
+    print(response.stop_details.category, response.stop_details.explanation)
+```
+
+## Multi-Turn Conversations
+
+The API is stateless — send the full history every request:
+
+```python
+messages = []
+def chat(user_msg: str) -> str:
+    messages.append({"role": "user", "content": user_msg})
+    r = client.messages.create(model="claude-opus-4-8", max_tokens=16000, messages=messages)
+    # Append the FULL content list (preserves tool_use/thinking/compaction blocks)
+    messages.append({"role": "assistant", "content": r.content})
+    return next(b.text for b in r.content if b.type == "text")
+```
+
+For conversations that may exceed context: server-side **compaction** (beta
+header `compact-2026-01-12`, `context_management: {"edits": [{"type":
+"compact_20260112"}]}` on `client.beta.messages.create`). Critical: append
+`response.content` back verbatim — compaction blocks must be preserved or
+state is silently lost.
+
+## Streaming
+
+```python
+with client.messages.stream(model="claude-opus-4-8", max_tokens=64000,
+                            messages=[...]) as stream:
+    for text in stream.text_stream:
+        print(text, end="", flush=True)
+    final = stream.get_final_message()
+print(final.usage.output_tokens)
+```
+
+```typescript
+const stream = client.messages.stream({ model: "claude-opus-4-8", max_tokens: 64000, messages });
+stream.on("text", (delta) => process.stdout.write(delta));
+const final = await stream.finalMessage();   // never wrap .on() in new Promise()
+```
+
+### SSE event sequence
+
+```
+message_start          → message metadata (id, model, usage so far)
+content_block_start    → index + block type (text / thinking / tool_use)
+content_block_delta    → text_delta | thinking_delta | input_json_delta
+content_block_stop     → block finished
+message_delta          → stop_reason + final usage
+message_stop           → stream done
+```
+
+Tool inputs stream as `input_json_delta` (partial JSON strings) — accumulate
+and parse at `content_block_stop`, or just use `get_final_message()` /
+`finalMessage()` which assembles parsed blocks for you.
+
+**Why stream:** non-streaming requests with large `max_tokens` exceed HTTP
+timeouts (the Python SDK raises `ValueError` for non-streaming requests it
+estimates will run >~10 min). Default to streaming for anything long.
+
+## Error Handling
+
+| HTTP | `error.type` | Retryable | Typical cause |
+|---|---|---|---|
+| 400 | `invalid_request_error` | no | Bad params: removed sampling params, `budget_tokens` on 4.7+, prefill on 4.6+, role ordering |
+| 401 | `authentication_error` | no | Missing/invalid key; both key + token set |
+| 403 | `permission_error` | no | Key lacks model/feature access |
+| 404 | `not_found_error` | no | Bad model ID (date-suffix mistake) or endpoint |
+| 413 | `request_too_large` | no | Body over size limit — shrink images/history |
+| 429 | `rate_limit_error` | yes | RPM/ITPM/OTPM exceeded — honor `retry-after` |
+| 500 | `api_error` | yes | Transient server issue |
+| 529 | `overloaded_error` | yes | Capacity — backoff; consider another model |
+
+Error envelope:
+
+```json
+{"type": "error",
+ "error": {"type": "rate_limit_error", "message": "..."},
+ "request_id": "req_011CSH..."}
+```
+
+Log `request_id` (also `response._request_id` on SDK success objects) when
+reporting issues to Anthropic.
+
+### Typed exceptions — never string-match messages
+
+```python
+import anthropic
+try:
+    r = client.messages.create(...)
+except anthropic.BadRequestError as e:      # 400
+    raise                                    # don't retry client errors
+except anthropic.RateLimitError as e:       # 429
+    wait = int(e.response.headers.get("retry-after", "60"))
+except anthropic.APIStatusError as e:        # catch-all with .status_code / .type
+    if e.status_code >= 500: ...             # retryable
+except anthropic.APIConnectionError:
+    ...                                      # network — retryable
+```
+
+```typescript
+try {
+  await client.messages.create({...});
+} catch (err) {
+  if (err instanceof Anthropic.RateLimitError) { /* backoff */ }
+  else if (err instanceof Anthropic.APIError) { console.error(err.status, err.message); }
+}
+```
+
+All subclasses expose `.type` (e.g. `"overloaded_error"`) for finer
+classification than the status code (e.g. `billing_error` vs
+`permission_error`, both 403).
+
+### Retries
+
+The official SDKs **auto-retry** connection errors, 408/409/429 and >=500 with
+exponential backoff — default `max_retries=2`. Configure per client
+(`anthropic.Anthropic(max_retries=5)`) or per call
+(`client.with_options(max_retries=5, timeout=20.0).messages.create(...)`).
+Only hand-roll retry logic when you need behavior beyond that (e.g. queue +
+jitter across many workers):
+
+```python
+import random, time
+
+def call_with_retry(client, max_retries=5, base=1.0, cap=60.0, **kwargs):
+    last = None
+    for attempt in range(max_retries):
+        try:
+            return client.messages.create(**kwargs)
+        except anthropic.RateLimitError as e:
+            last = e
+        except anthropic.APIStatusError as e:
+            if e.status_code < 500:
+                raise          # 4xx (except 429) is not retryable
+            last = e
+        time.sleep(min(base * 2 ** attempt + random.random(), cap))
+    raise last
+```
+
+Default request timeout is 10 minutes (`timeout=` on the client or
+`with_options`). On timeout: `anthropic.APITimeoutError`, retried per
+`max_retries`.
+
+## Rate Limits
+
+Limits are per-organization, per-model-class, measured three ways:
+
+- **RPM** — requests per minute
+- **ITPM** — input tokens per minute (cache reads often discounted/exempt — check headers)
+- **OTPM** — output tokens per minute
+
+Tiers scale with cumulative spend (Tier 1-4, then custom/scale). Check live
+limits in Console or response headers:
+
+| Header | Meaning |
+|---|---|
+| `retry-after` | Seconds to wait (on 429) |
+| `anthropic-ratelimit-requests-limit` / `-remaining` / `-reset` | RPM state |
+| `anthropic-ratelimit-input-tokens-*` / `-output-tokens-*` | ITPM / OTPM state |
+
+Practical guidance:
+
+- Treat 429 as backpressure: honor `retry-after`, add jitter, cap concurrency.
+- Long-running agent fleets: budget OTPM, not just RPM — output is usually the
+  binding constraint.
+- Batches API has separate, much higher throughput and doesn't draw from
+  interactive rate limits — move bulk traffic there.
+- 529 `overloaded_error` is capacity, not your quota — backoff and/or fail over
+  to a different model tier.
+
+## Token Counting
+
+`POST /v1/messages/count_tokens` — free, model-specific, counts a request
+without running it:
+
+```python
+n = client.messages.count_tokens(
+    model="claude-opus-4-8",
+    system=system, tools=tools,
+    messages=[{"role": "user", "content": text}],
+).input_tokens
+```
+
+Never estimate with `tiktoken` (OpenAI tokenizer; 15-20% undercount on prose,
+worse on code). Token counts differ **between Claude models** too — count
+against the model you'll run.
+
+## Vision & Documents
+
+- Images: `{"type": "image", "source": {...}}` blocks — base64, URL, or Files
+  API `{"type": "file", "file_id": ...}`. Opus 4.7+ supports high-res input
+  (up to 2576px long edge, pixel-accurate coordinates; up to ~3x image tokens).
+- PDFs: `{"type": "document", "source": {...}}` — base64, URL, plain text, or
+  file_id. Optional `citations: {"enabled": true}`.
+- Files API (beta `files-api-2025-04-14`): upload once
+  (`client.beta.files.upload(...)`), reference by `file_id` across requests.
+  500 MB/file, 100 GB/org.

+ 190 - 0
skills/claude-api-ops/references/structured-outputs.md

@@ -0,0 +1,190 @@
+# Structured Outputs Reference
+
+Two related features, same constrained-sampling mechanism:
+
+| Feature | Parameter | Constrains |
+|---|---|---|
+| **JSON outputs** | `output_config: {"format": {...}}` | Claude's response text (guaranteed valid JSON matching your schema) |
+| **Strict tool use** | `strict: true` on a tool definition | The `input` of tool calls |
+
+They can be combined in one request. Supported on Fable 5, Opus 4.8/4.7/4.6/4.5,
+Sonnet 4.6/4.5, Haiku 4.5.
+
+**Naming:** the canonical parameter is `output_config.format`. The older
+top-level `output_format` parameter (and the `structured-outputs-2025-11-13`
+beta header) is **deprecated** — still accepted during a transition window,
+and still used as a convenience kwarg by some SDK `parse()` methods, but write
+new code against `output_config`.
+
+## JSON Outputs — raw schema
+
+```python
+import json, anthropic
+
+client = anthropic.Anthropic()
+response = client.messages.create(
+    model="claude-opus-4-8",
+    max_tokens=16000,
+    messages=[{"role": "user",
+               "content": "Extract: John Smith (john@example.com) wants the Enterprise plan."}],
+    output_config={
+        "format": {
+            "type": "json_schema",
+            "schema": {
+                "type": "object",
+                "properties": {
+                    "name":  {"type": "string"},
+                    "email": {"type": "string", "format": "email"},
+                    "plan":  {"type": "string", "enum": ["Free", "Pro", "Enterprise"]},
+                },
+                "required": ["name", "email", "plan"],
+                "additionalProperties": False,
+            },
+        }
+    },
+)
+text = next(b.text for b in response.content if b.type == "text")
+data = json.loads(text)   # guaranteed valid against the schema (unless refusal/max_tokens)
+```
+
+cURL shape:
+
+```json
+{
+  "model": "claude-opus-4-8",
+  "max_tokens": 1024,
+  "output_config": {
+    "format": {"type": "json_schema", "schema": { ... }}
+  },
+  "messages": [{"role": "user", "content": "..."}]
+}
+```
+
+## SDK helpers — `parse()` (recommended)
+
+```python
+from pydantic import BaseModel
+
+class ContactInfo(BaseModel):
+    name: str
+    email: str
+    plan: str
+    demo_requested: bool
+
+response = client.messages.parse(
+    model="claude-opus-4-8",
+    max_tokens=16000,
+    messages=[{"role": "user", "content": "Extract: Jane Doe (jane@co.com), Enterprise, wants a demo."}],
+    output_format=ContactInfo,          # parse() convenience kwarg
+)
+contact = response.parsed_output        # validated ContactInfo instance
+```
+
+```typescript
+import { z } from "zod";
+import { zodOutputFormat } from "@anthropic-ai/sdk/helpers/zod";
+
+const ContactInfo = z.object({
+  name: z.string(),
+  email: z.string(),
+  plan: z.string(),
+  demo_requested: z.boolean(),
+});
+
+const response = await client.messages.parse({
+  model: "claude-opus-4-8",
+  max_tokens: 16000,
+  output_config: { format: zodOutputFormat(ContactInfo) },
+  messages: [{ role: "user", content: "Extract: ..." }],
+});
+console.log(response.parsed_output!.name);  // null if parsing failed — guard it
+```
+
+The SDKs strip unsupported schema constraints (e.g. `minLength`) before
+sending and validate them client-side instead.
+
+## Strict Tool Use
+
+```python
+tools = [{
+    "name": "book_flight",
+    "description": "Book a flight",
+    "strict": True,
+    "input_schema": {
+        "type": "object",
+        "properties": {
+            "destination": {"type": "string"},
+            "date":        {"type": "string", "format": "date"},
+            "passengers":  {"type": "integer", "enum": [1, 2, 3, 4, 5, 6, 7, 8]},
+        },
+        "required": ["destination", "date", "passengers"],
+        "additionalProperties": False,
+    },
+}]
+```
+
+- Per-tool opt-in; non-strict tools don't count toward complexity limits.
+- Max **20 strict tools** per request.
+- Guarantees the `tool_use.input` validates exactly — no missing required
+  fields, no type drift.
+
+## JSON Schema: supported vs not
+
+**Supported:** object/array/string/integer/number/boolean/null; `enum`
+(scalars only); `const`; `anyOf`/`allOf` (no `allOf` + `$ref` combo); internal
+`$ref`/`$defs`; `default`; `required`; `additionalProperties: false`
+(mandatory on every object); string `format` (`date-time`, `time`, `date`,
+`duration`, `email`, `hostname`, `uri`, `ipv4`, `ipv6`, `uuid`); array
+`minItems` 0 or 1 only; simple regex `pattern`.
+
+**Not supported:** recursive schemas; external `$ref`; numeric constraints
+(`minimum`/`maximum`/`multipleOf`); string length constraints
+(`minLength`/`maxLength`); array constraints beyond `minItems` 0/1;
+regex backreferences, lookahead/lookbehind, `\b`; `additionalProperties`
+anything but `false`.
+
+**Complexity limits:** 20 strict tools; 24 optional parameters total across
+all schemas; 16 union-typed (`anyOf`) parameters; grammar compilation timeout
+180s ("Schema is too complex"). Reduce by flattening nesting, making params
+required, splitting across requests.
+
+## Operational notes
+
+- **First-request latency:** new schemas compile a grammar on first use;
+  cached for 24h (keyed on schema + tool set; name/description changes don't
+  invalidate).
+- **Prompt cache interplay:** changing `output_config.format` invalidates the
+  prompt cache; the feature also injects an extra system prompt (more input
+  tokens).
+- **Failure modes:** `stop_reason: "refusal"` → output may not match the
+  schema; `stop_reason: "max_tokens"` → JSON may be truncated/incomplete —
+  raise `max_tokens` and check before parsing.
+- **Incompatible with:** citations (400) and assistant-message prefilling.
+  **Works with:** batches, streaming, token counting, extended/adaptive
+  thinking.
+- Don't put PHI/PII in schema property names, enum values, or patterns —
+  schemas are cached separately from ZDR-handled message content.
+
+## Structured outputs vs tool-use extraction
+
+Before structured outputs existed, the standard extraction trick was a forced
+tool call (`tool_choice: {"type": "tool", "name": "record_result"}`) with the
+target shape as `input_schema`. Decision now:
+
+| Want | Use |
+|---|---|
+| The *final answer* as guaranteed JSON | `output_config.format` |
+| Valid *parameters* for a real action/function | tool + `strict: true` |
+| Extraction **while thinking is enabled** | `output_config.format` — forced `tool_choice` is a 400 with thinking on |
+| Extraction mid-agentic-loop (model also has other tools) | A strict "report/record" tool keeps the loop uniform |
+| Legacy prefill (`{"name": "` assistant prefill) | Dead on 4.6+ models (400) — migrate to `output_config.format` |
+
+## Thinking interplay
+
+- `output_config.format` **works with adaptive/extended thinking** — the model
+  thinks, then the final text block conforms to the schema.
+- Forced tool extraction does **not** work with thinking
+  (`tool_choice: any/tool` + thinking = 400). This is the main reason to
+  prefer `output_config.format` for extraction on 4.6+ models.
+- Effort and format coexist in `output_config`:
+  `output_config={"effort": "medium", "format": {...}}`.

+ 289 - 0
skills/claude-api-ops/references/tool-use.md

@@ -0,0 +1,289 @@
+# Tool Use Reference
+
+Tool use is a feature of `POST /v1/messages` — you pass `tools`, Claude
+responds with `tool_use` content blocks, you execute and return `tool_result`
+blocks. **Client tools** run in your code; **server tools** (web_search,
+code_execution, web_fetch) run on Anthropic's infrastructure.
+
+## Tool Definition
+
+```json
+{
+  "name": "get_weather",
+  "description": "Get current weather for a location. Call this when the user asks about weather conditions, temperature, or forecasts.",
+  "input_schema": {
+    "type": "object",
+    "properties": {
+      "location": {"type": "string", "description": "City and state, e.g. San Francisco, CA"},
+      "unit": {"type": "string", "enum": ["celsius", "fahrenheit"], "description": "Temperature unit"}
+    },
+    "required": ["location"]
+  }
+}
+```
+
+Rules that actually move the needle:
+
+- **Descriptions are the routing signal.** Be prescriptive about *when* to
+  call, not just what it does ("Call this when the user asks about current
+  prices or recent events"). Recent Opus models reach for tools more
+  conservatively — trigger conditions in the description give measurable lift.
+- Describe every property; use `enum` for closed sets; only truly-required
+  params in `required`.
+- Specific names beat generic ones: `get_current_weather` > `weather`.
+- Keep the tool set focused — too many similar tools degrades selection. For
+  large tool libraries, use the server-side **tool search tool** (loads schemas
+  on demand, preserves the prompt cache by appending rather than swapping).
+- **Sample calls in definitions**: you can include example invocations
+  (`input_examples` on supported surfaces / examples embedded in the
+  description) to demonstrate parameter formats and cut parameter errors —
+  most useful for complex nested schemas.
+- `strict: true` on a tool guarantees the emitted `input` validates against
+  the schema exactly (see structured-outputs.md; max 20 strict tools/request).
+
+## tool_choice
+
+| Value | Behavior |
+|---|---|
+| `{"type": "auto"}` | Claude decides (default when tools are present) |
+| `{"type": "any"}` | Must call at least one tool |
+| `{"type": "tool", "name": "get_weather"}` | Must call that specific tool |
+| `{"type": "none"}` | Cannot call tools (definitions stay in context) |
+
+Any variant accepts `"disable_parallel_tool_use": true` to cap at one tool
+call per response.
+
+Gotchas:
+
+- **Thinking on (enabled or adaptive) + `any`/`tool` = 400.** Only `auto` and
+  `none` are compatible with thinking. To force a tool while thinking, prompt
+  for it instead, or disable thinking for that call.
+- `any`/`tool` add more tool-use system-prompt tokens than `auto`/`none` (e.g.
+  410 vs 290 on Opus 4.8).
+- Changing `tool_choice` between requests does **not** invalidate the
+  tools+system prompt cache (message cache only).
+
+## Parallel Tool Use
+
+By default Claude may emit **multiple `tool_use` blocks in one response**.
+Execute them all (concurrently if safe), then return **all results in a single
+user message** — one `tool_result` per `tool_use`, ids matching, results may
+be in any order but must all be present:
+
+```python
+tool_results = []
+for block in response.content:
+    if block.type == "tool_use":
+        result = execute_tool(block.name, block.input)   # block.input is parsed dict
+        tool_results.append({
+            "type": "tool_result",
+            "tool_use_id": block.id,
+            "content": result,
+        })
+messages.append({"role": "assistant", "content": response.content})
+messages.append({"role": "user", "content": tool_results})
+```
+
+A follow-up request missing a `tool_result` for any outstanding `tool_use` id
+is a 400.
+
+## The Agentic Loop (manual)
+
+Use the manual loop when you need approval gates, custom logging, or
+conditional execution:
+
+```python
+import anthropic
+
+client = anthropic.Anthropic()
+messages = [{"role": "user", "content": user_input}]
+
+while True:
+    response = client.messages.create(
+        model="claude-opus-4-8",
+        max_tokens=16000,
+        tools=tools,
+        messages=messages,
+    )
+
+    if response.stop_reason == "end_turn":
+        break
+
+    if response.stop_reason == "pause_turn":
+        # Server-side tool loop hit its iteration limit: append and re-send.
+        # Do NOT inject a "continue" user message — the API resumes automatically.
+        messages.append({"role": "assistant", "content": response.content})
+        continue
+
+    if response.stop_reason == "tool_use":
+        messages.append({"role": "assistant", "content": response.content})
+        results = []
+        for block in response.content:
+            if block.type == "tool_use":
+                try:
+                    out = execute_tool(block.name, block.input)
+                    results.append({"type": "tool_result",
+                                    "tool_use_id": block.id, "content": out})
+                except Exception as e:
+                    results.append({"type": "tool_result",
+                                    "tool_use_id": block.id,
+                                    "content": f"Error: {e}", "is_error": True})
+        messages.append({"role": "user", "content": results})
+        continue
+
+    break  # max_tokens / refusal / stop_sequence — handle per stop_reason
+
+final_text = next((b.text for b in response.content if b.type == "text"), "")
+```
+
+```typescript
+import Anthropic from "@anthropic-ai/sdk";
+
+const client = new Anthropic();
+const messages: Anthropic.MessageParam[] = [{ role: "user", content: userInput }];
+
+while (true) {
+  const response = await client.messages.create({
+    model: "claude-opus-4-8", max_tokens: 16000, tools, messages,
+  });
+
+  if (response.stop_reason === "end_turn") break;
+
+  if (response.stop_reason === "pause_turn") {
+    messages.push({ role: "assistant", content: response.content });
+    continue;
+  }
+
+  const toolUses = response.content.filter(
+    (b): b is Anthropic.ToolUseBlock => b.type === "tool_use",
+  );
+  messages.push({ role: "assistant", content: response.content });
+
+  const results: Anthropic.ToolResultBlockParam[] = [];
+  for (const t of toolUses) {
+    results.push({ type: "tool_result", tool_use_id: t.id,
+                   content: await executeTool(t.name, t.input) });
+  }
+  messages.push({ role: "user", content: results });
+}
+```
+
+Loop invariants:
+
+1. Append the **full** `response.content` as the assistant turn (preserves
+   `tool_use` + `thinking` blocks; thinking `signature` must round-trip
+   untouched).
+2. One `tool_result` per `tool_use`, matching `tool_use_id`.
+3. Tool results go in a **user** message.
+4. Add a max-iterations guard (e.g. 10-20) so a confused model can't loop forever.
+5. Parse `block.input` as structured data — never regex the serialized JSON
+   (escaping varies across models).
+
+## Tool Result Shapes
+
+```json
+{"type": "tool_result", "tool_use_id": "toolu_01...", "content": "plain string"}
+
+{"type": "tool_result", "tool_use_id": "toolu_01...",
+ "content": [{"type": "text", "text": "..."},
+             {"type": "image", "source": {"type": "base64", "media_type": "image/png", "data": "..."}}]}
+
+{"type": "tool_result", "tool_use_id": "toolu_01...",
+ "content": "Error: location 'xyz' not found. Provide a valid city.",
+ "is_error": true}
+```
+
+`is_error: true` tells Claude the execution failed — it will typically adjust
+approach or ask for clarification. Return **informative** error strings, not
+stack traces.
+
+## SDK Tool Runners (beta) — skip the manual loop
+
+The runners handle call → execute → feed-back → repeat automatically.
+
+```python
+from anthropic import beta_tool
+import anthropic
+
+client = anthropic.Anthropic()
+
+@beta_tool
+def get_weather(location: str, unit: str = "celsius") -> str:
+    """Get current weather for a location.
+
+    Args:
+        location: City and state, e.g. San Francisco, CA.
+        unit: "celsius" or "fahrenheit".
+    """
+    return f"22°C and sunny in {location}"
+
+runner = client.beta.messages.tool_runner(
+    model="claude-opus-4-8", max_tokens=16000,
+    tools=[get_weather],
+    messages=[{"role": "user", "content": "Weather in Paris?"}],
+)
+for message in runner:        # iterates messages until Claude stops calling tools
+    print(message)
+```
+
+```typescript
+import Anthropic from "@anthropic-ai/sdk";
+import { betaZodTool } from "@anthropic-ai/sdk/helpers/beta/zod";
+import { z } from "zod";
+
+const getWeather = betaZodTool({
+  name: "get_weather",
+  description: "Get current weather for a location",
+  inputSchema: z.object({
+    location: z.string().describe("City and state, e.g. San Francisco, CA"),
+  }),
+  run: async ({ location }) => `22°C and sunny in ${location}`,
+});
+
+const finalMessage = await client.beta.messages.toolRunner({
+  model: "claude-opus-4-8", max_tokens: 16000,
+  tools: [getWeather],
+  messages: [{ role: "user", content: "Weather in Paris?" }],
+});
+```
+
+Schemas are generated from the function signature/docstring (Python) or Zod
+schema (TS). The runner executes tools **automatically** — for destructive
+side effects (email, payments, deletes), validate inside the tool function or
+use the manual loop with a human-approval gate.
+
+## Server-Side Tools
+
+Declared in `tools`, executed by Anthropic — no client handling:
+
+| Tool | Type string | Notes |
+|---|---|---|
+| Web search | `web_search_20260209` | Per-search pricing; `allowed_domains`, `blocked_domains`, `max_uses` |
+| Web fetch | `web_fetch_20260209` | Fetch URL content with citations |
+| Code execution | `code_execution_20260120` | Sandboxed Python 3.11 (pandas/numpy/matplotlib preinstalled), no internet; container reusable via response `container.id` |
+| Tool search | `tool_search_tool_bm25_20251119` / `..._regex_20251119` | On-demand tool discovery for large libraries |
+| Memory | `memory_20250818` | Client-executed but Anthropic-defined schema |
+| Bash / text editor | `bash_20250124` / `text_editor_20250728` (name `str_replace_based_edit_tool`) | Anthropic-defined, **you** execute |
+
+Server tool runs may return `stop_reason: "pause_turn"` when the server-side
+loop hits its iteration limit (default 10) — append the assistant turn and
+re-request to resume (see loop above). Cap continuations (~5) to avoid
+infinite resumes.
+
+`web_search_20260209` / `web_fetch_20260209` include **dynamic filtering**
+(model filters results in a sandbox before they hit context) automatically —
+don't add a separate `code_execution` tool just for that; only declare
+code_execution when you need it independently.
+
+## Bash vs Dedicated Tools (design)
+
+A bash tool gives breadth but the harness sees only an opaque command string.
+Promote an action to a dedicated tool when you need to:
+
+- **Gate** it (send_email behind confirmation — easy as a tool, impossible as `bash -c "curl ..."`)
+- **Validate** invariants (an edit tool can reject writes to files changed since last read)
+- **Render** it specially (question-asking as a modal)
+- **Parallelize** safely (read-only tools marked parallel-safe; bash must serialize)
+
+Start with bash for breadth; promote when you need to gate, render, audit, or
+parallelize.

+ 0 - 0
skills/claude-code-debug/scripts/.gitkeep → skills/claude-api-ops/scripts/.gitkeep


+ 0 - 124
skills/claude-code-debug/SKILL.md

@@ -1,124 +0,0 @@
----
-name: claude-code-debug
-description: "Troubleshoot Claude Code extensions and behavior. Triggers on: debug, troubleshoot, not working, skill not loading, hook not running, agent not found."
-license: MIT
-compatibility: "Claude Code CLI"
-allowed-tools: "Bash Read"
-metadata:
-  author: claude-mods
-  related-skills: claude-code-hooks, claude-code-headless, claude-code-templates
----
-
-# Claude Code Debug
-
-Troubleshoot extensions, hooks, and unexpected behavior.
-
-## Quick Diagnostics
-
-```bash
-# Enable debug mode
-claude --debug
-
-# Check loaded extensions
-/hooks        # View registered hooks
-/agents       # View available agents
-/memory       # View loaded memory files
-/config       # View current configuration
-```
-
-## Common Issues
-
-| Symptom | Quick Check |
-|---------|-------------|
-| Skill not activating | Verify description has trigger keywords |
-| Hook not running | Check `chmod +x`, run `/hooks` |
-| Agent not delegating | Add "Use proactively" to description |
-| MCP connection fails | Test server manually with `npx` |
-| Permission denied | Check settings.json allow rules |
-
-## Debug Mode Output
-
-```bash
-claude --debug
-# Shows:
-# - Hook execution and errors
-# - Skill loading status
-# - Subagent invocations
-# - Tool permission decisions
-# - MCP server connections
-```
-
-## Quick Fixes
-
-### Skill Not Loading
-
-```bash
-# Check structure
-ls -la .claude/skills/my-skill/
-# Must have: SKILL.md
-
-# Verify YAML frontmatter
-head -10 .claude/skills/my-skill/SKILL.md
-# Must start/end with ---
-
-# Check name matches directory
-grep "^name:" .claude/skills/my-skill/SKILL.md
-```
-
-### Hook Not Executing
-
-```bash
-# Make executable
-chmod +x .claude/hooks/my-hook.sh
-
-# Test manually
-echo '{"tool_name":"Bash"}' | .claude/hooks/my-hook.sh
-echo $?  # Check exit code
-
-# Verify JSON syntax
-jq '.' ~/.claude/settings.json
-```
-
-### Agent Not Being Used
-
-```bash
-# Check file location
-ls ~/.claude/agents/
-ls .claude/agents/
-
-# Verify description includes "Use for:" or "Use proactively"
-grep -i "use" agents/my-agent.md | head -5
-
-# Explicitly request
-# "Use the my-agent agent to analyze this"
-```
-
-## Validation
-
-```bash
-# Run all validations
-just test
-
-# YAML validation only
-just validate-yaml
-
-# Name matching only
-just validate-names
-```
-
-## Official Documentation
-
-- https://code.claude.com/docs/en/hooks - Hooks reference
-- https://code.claude.com/docs/en/skills - Skills reference
-- https://code.claude.com/docs/en/sub-agents - Custom subagents
-- https://code.claude.com/docs/en/settings - Settings configuration
-
-## Additional Resources
-
-- `./references/common-issues.md` - Issue → Solution lookup table
-- `./references/debug-commands.md` - All inspection commands
-- `./references/troubleshooting-flow.md` - Decision tree
-
----
-
-**See Also:** `claude-code-hooks` for hook debugging, `claude-code-templates` for correct structure

+ 0 - 208
skills/claude-code-debug/references/common-issues.md

@@ -1,208 +0,0 @@
-# Common Issues Reference
-
-Issue → Cause → Solution lookup for Claude Code debugging.
-
-## Skills
-
-### Skill Never Activates
-
-| Cause | Solution |
-|-------|----------|
-| Description too vague | Add specific trigger keywords: "Triggers on: keyword1, keyword2" |
-| YAML syntax error | Check frontmatter opens/closes with `---` |
-| Wrong location | Must be `.claude/skills/name/SKILL.md` or `~/.claude/skills/name/SKILL.md` |
-| Name mismatch | Directory name must match `name:` field |
-| Missing SKILL.md | File must be named exactly `SKILL.md` (uppercase) |
-
-**Diagnosis:**
-```bash
-# Check structure
-ls -la .claude/skills/my-skill/
-
-# Verify frontmatter
-head -5 .claude/skills/my-skill/SKILL.md
-
-# Check name matches
-dirname=$(basename "$(pwd)")
-grep "^name: $dirname" SKILL.md
-```
-
-### Skill Loads But Doesn't Help
-
-| Cause | Solution |
-|-------|----------|
-| Content too generic | Add specific commands, examples, patterns |
-| Missing tool examples | Include `bash` blocks with real commands |
-| No "When to Use" | Add scenarios for activation |
-
-## Hooks
-
-### Hook Doesn't Execute
-
-| Cause | Solution |
-|-------|----------|
-| Not executable | `chmod +x hook-script.sh` |
-| Invalid JSON in settings | Validate with `jq '.' settings.json` |
-| Wrong matcher | Matchers are case-sensitive: `"Bash"` not `"bash"` |
-| Relative path fails | Use `$CLAUDE_PROJECT_DIR/path/to/script.sh` |
-| Script not found | Check path exists, use absolute paths |
-
-**Diagnosis:**
-```bash
-# Check executable
-ls -la .claude/hooks/
-
-# Test script manually
-echo '{"tool_name":"Test"}' | ./hook.sh
-echo "Exit code: $?"
-
-# Verify JSON
-jq '.' ~/.claude/settings.json
-
-# List registered hooks
-/hooks
-```
-
-### Hook Runs But Doesn't Block
-
-| Cause | Solution |
-|-------|----------|
-| Exit code not 2 | Use `exit 2` to block (not 1) |
-| Error on stdout | Put error messages on stderr: `echo "error" >&2` |
-| Blocking for wrong tool | Check matcher pattern matches tool name |
-
-### Hook Blocks Everything
-
-| Cause | Solution |
-|-------|----------|
-| Matcher too broad | Use specific matcher instead of `"*"` |
-| Logic error | Add debugging: `echo "DEBUG: $TOOL" >&2` |
-| Always exits 2 | Check conditional logic in script |
-
-## Agents
-
-### Custom Agent Not Used
-
-| Cause | Solution |
-|-------|----------|
-| Description doesn't match | Include "Use for: scenario1, scenario2" |
-| Wrong location | Must be `.claude/agents/name.md` or `~/.claude/agents/name.md` |
-| YAML invalid | Check frontmatter has `name:` and `description:` |
-| Name conflicts | Check for duplicate agent names |
-
-**Diagnosis:**
-```bash
-# List available agents
-/agents
-
-# Check file location
-ls ~/.claude/agents/
-ls .claude/agents/
-
-# Verify frontmatter
-head -10 .claude/agents/my-agent.md
-
-# Force usage
-# "Use the my-agent agent to help with this"
-```
-
-### Agent Doesn't Have Expected Tools
-
-| Cause | Solution |
-|-------|----------|
-| `tools` field restricts | Remove `tools` field to inherit all tools |
-| Permission denied | Check settings.json allow rules |
-| Tool not installed | Verify tool exists (e.g., `which jq`) |
-
-## MCP
-
-### MCP Server Won't Connect
-
-| Cause | Solution |
-|-------|----------|
-| Package not installed | `npm install -g @modelcontextprotocol/server-X` |
-| Missing env vars | Check `.mcp.json` for `${VAR}` references |
-| Server crashes | Test manually: `npx @modelcontextprotocol/server-X` |
-| Wrong transport | HTTP servers need `--transport http` |
-
-**Diagnosis:**
-```bash
-# Test server manually
-npx @modelcontextprotocol/server-filesystem
-
-# Check .mcp.json
-cat .mcp.json | jq '.'
-
-# Verify env vars exist
-echo $GITHUB_TOKEN
-```
-
-### MCP Tools Not Available
-
-| Cause | Solution |
-|-------|----------|
-| Server not in config | Add to `.mcp.json` or use `claude mcp add` |
-| Permission denied | Add `mcp__server__*` to allow rules |
-| Token limit | Reduce output size, check MAX_MCP_OUTPUT_TOKENS |
-
-## Memory & Rules
-
-### CLAUDE.md Not Loading
-
-| Cause | Solution |
-|-------|----------|
-| Wrong location | Must be `./CLAUDE.md` or `./.claude/CLAUDE.md` |
-| Import cycle | Max 5 hops for `@path` imports |
-| Syntax error | Check markdown syntax, especially code blocks |
-
-**Diagnosis:**
-```bash
-# Check what's loaded
-/memory
-
-# Verify file exists
-ls -la CLAUDE.md .claude/CLAUDE.md 2>/dev/null
-
-# Check imports
-grep "^@" CLAUDE.md
-```
-
-### Rules Not Applying
-
-| Cause | Solution |
-|-------|----------|
-| Wrong glob pattern | Test pattern: `ls .claude/rules/**/*.md` |
-| Path not matching | Check `paths:` field matches current file |
-| Lower priority | User rules load before project rules |
-
-## Permissions
-
-### Tool Blocked Unexpectedly
-
-| Cause | Solution |
-|-------|----------|
-| Not in allow list | Add to `permissions.allow` in settings.json |
-| In deny list | Remove from `permissions.deny` |
-| Hook blocking | Check PreToolUse hooks |
-
-**Diagnosis:**
-```bash
-# Check settings
-cat ~/.claude/settings.json | jq '.permissions'
-
-# Check project settings
-cat .claude/settings.local.json | jq '.permissions' 2>/dev/null
-
-# Debug mode shows permission decisions
-claude --debug
-```
-
-## General Debugging Steps
-
-1. **Enable debug mode**: `claude --debug`
-2. **Check file locations**: `ls -la .claude/` and `ls -la ~/.claude/`
-3. **Validate JSON**: `jq '.' settings.json`
-4. **Verify YAML**: Check frontmatter opens/closes with `---`
-5. **Test manually**: Run scripts directly, test MCP servers
-6. **Check permissions**: Review allow/deny rules
-7. **Use inspection commands**: `/hooks`, `/agents`, `/memory`, `/config`

+ 0 - 276
skills/claude-code-debug/references/debug-commands.md

@@ -1,276 +0,0 @@
-# Debug Commands Reference
-
-All inspection and debugging commands for Claude Code.
-
-## CLI Debug Mode
-
-```bash
-# Full debug output
-claude --debug
-
-# Verbose logging
-claude --verbose
-claude -v
-```
-
-### Debug Mode Shows
-
-| Category | Information |
-|----------|-------------|
-| Hooks | Loading, execution, errors, exit codes |
-| Skills | Discovery, activation, loading errors |
-| Agents | Invocation, tool access, context inheritance |
-| Permissions | Allow/deny decisions, rule matching |
-| MCP | Server connections, tool registration |
-| Context | Memory loading, rule application |
-
-## Slash Commands
-
-### /hooks
-
-List all registered hooks and their configuration.
-
-```
-/hooks
-
-Output:
-PreToolUse:
-  - Bash: ./hooks/validate-bash.sh (timeout: 5000ms)
-  - *: ./hooks/audit-log.sh (timeout: 3000ms)
-
-PostToolUse:
-  - Write: ./hooks/notify-write.sh
-```
-
-### /agents
-
-Manage and inspect subagents.
-
-```
-/agents
-
-Output:
-Available Agents:
-  Built-in:
-    - Explore (read-only codebase search)
-    - Plan (implementation planning)
-    - general-purpose (default)
-
-  Custom (user):
-    - python-expert
-    - react-expert
-
-  Custom (project):
-    - my-project-agent
-```
-
-Actions:
-- View agent details
-- Create new agent
-- Edit existing agent
-- Delete agent
-
-### /memory
-
-View and edit memory files (CLAUDE.md).
-
-```
-/memory
-
-Output:
-Loaded Memory Files:
-  1. ~/.claude/CLAUDE.md (user)
-  2. ./CLAUDE.md (project)
-  3. ./.claude/CLAUDE.md (project)
-
-Imports:
-  - @README.md
-  - @docs/api.md
-```
-
-Opens system editor for editing when invoked.
-
-### /config
-
-View current configuration state.
-
-```
-/config
-
-Output:
-Permission Mode: default
-Model: claude-sonnet-4-20250514
-
-Permissions:
-  Allow: Bash(git:*), Bash(npm:*), Read, Write
-  Deny: Bash(rm -rf:*)
-
-Active MCP Servers:
-  - filesystem: /Users/me/.npm/_npx/...
-  - github: /Users/me/.npm/_npx/...
-```
-
-### /plugin
-
-Browse and manage installed plugins.
-
-```
-/plugin
-
-Output:
-Installed Plugins:
-  - my-plugin (v1.0.0)
-    Commands: /my-command
-    Skills: my-skill
-
-Marketplaces:
-  - awesome-plugins (github:user/repo)
-```
-
-## File System Inspection
-
-### Check Extension Structure
-
-```bash
-# Skills
-ls -la .claude/skills/
-ls -la ~/.claude/skills/
-
-# Agents
-ls -la .claude/agents/
-ls -la ~/.claude/agents/
-
-# Commands
-ls -la .claude/commands/
-ls -la ~/.claude/commands/
-
-# Rules
-ls -la .claude/rules/
-ls -la ~/.claude/rules/
-
-# Hooks
-ls -la .claude/hooks/
-```
-
-### Verify Configuration Files
-
-```bash
-# Global settings
-cat ~/.claude/settings.json | jq '.'
-
-# Project settings
-cat .claude/settings.local.json | jq '.'
-
-# MCP configuration
-cat .mcp.json | jq '.'
-```
-
-### Check YAML Frontmatter
-
-```bash
-# View frontmatter
-head -20 path/to/extension.md
-
-# Extract name field
-grep "^name:" path/to/extension.md
-
-# Validate YAML structure
-head -20 path/to/extension.md | grep -E "^---|^name:|^description:"
-```
-
-## Process Debugging
-
-### Test Hook Scripts
-
-```bash
-# Test with sample input
-echo '{"tool_name":"Bash","tool_input":{"command":"ls"}}' | ./hook.sh
-
-# Check exit code
-echo $?
-
-# View stderr output
-echo '{"tool_name":"Bash"}' | ./hook.sh 2>&1
-```
-
-### Test MCP Servers
-
-```bash
-# Run server directly
-npx @modelcontextprotocol/server-filesystem
-
-# Check if package exists
-npm view @modelcontextprotocol/server-github
-
-# Verify env vars
-printenv | grep -i token
-```
-
-## Log Analysis
-
-### Hook Audit Logs
-
-```bash
-# View recent hook activity
-tail -50 .claude/audit.log
-
-# Search for errors
-grep -i error .claude/audit.log
-
-# Count by tool
-awk -F'|' '{print $2}' .claude/audit.log | sort | uniq -c | sort -rn
-```
-
-### Session Logs
-
-```bash
-# Find session files
-ls -la ~/.claude/sessions/
-
-# View recent session
-cat ~/.claude/sessions/$(ls -t ~/.claude/sessions/ | head -1)
-```
-
-## Environment Verification
-
-```bash
-# Claude Code version
-claude --version
-
-# Check API key is set
-echo ${ANTHROPIC_API_KEY:0:10}...
-
-# Project directory
-echo $CLAUDE_PROJECT_DIR
-
-# Current working directory
-pwd
-```
-
-## Validation Commands
-
-```bash
-# Run full validation suite (claude-mods)
-just test
-
-# YAML validation only
-just validate-yaml
-
-# Name matching validation
-just validate-names
-
-# Windows validation
-just test-win
-```
-
-## Quick Reference
-
-| Command | Purpose |
-|---------|---------|
-| `claude --debug` | Enable debug output |
-| `/hooks` | List registered hooks |
-| `/agents` | Manage subagents |
-| `/memory` | View/edit memory files |
-| `/config` | View configuration |
-| `/plugin` | Manage plugins |
-| `just test` | Run validations |

+ 0 - 213
skills/claude-code-debug/references/troubleshooting-flow.md

@@ -1,213 +0,0 @@
-# Troubleshooting Flow
-
-Decision trees for diagnosing Claude Code issues.
-
-## Extension Not Working
-
-```
-Extension not working?
-│
-├─ What type?
-│  │
-│  ├─ Skill ─────────────► Go to: Skill Debugging Flow
-│  ├─ Hook ──────────────► Go to: Hook Debugging Flow
-│  ├─ Agent ─────────────► Go to: Agent Debugging Flow
-│  ├─ Command ───────────► Go to: Command Debugging Flow
-│  └─ MCP ───────────────► Go to: MCP Debugging Flow
-```
-
-## Skill Debugging Flow
-
-```
-Skill not activating?
-│
-├─ Does directory exist?
-│  ├─ No ──► Create: mkdir -p .claude/skills/my-skill
-│  └─ Yes
-│      │
-│      ├─ Does SKILL.md exist (exact case)?
-│      │  ├─ No ──► Create SKILL.md (not skill.md)
-│      │  └─ Yes
-│      │      │
-│      │      ├─ Does frontmatter start with ---?
-│      │      │  ├─ No ──► Add --- at line 1
-│      │      │  └─ Yes
-│      │      │      │
-│      │      │      ├─ Does frontmatter end with ---?
-│      │      │      │  ├─ No ──► Add --- after last field
-│      │      │      │  └─ Yes
-│      │      │      │      │
-│      │      │      │      ├─ Does name: match directory?
-│      │      │      │      │  ├─ No ──► Fix name to match
-│      │      │      │      │  └─ Yes
-│      │      │      │      │      │
-│      │      │      │      │      ├─ Does description have triggers?
-│      │      │      │      │      │  ├─ No ──► Add "Triggers on: x, y, z"
-│      │      │      │      │      │  └─ Yes
-│      │      │      │      │      │      │
-│      │      │      │      │      │      └─ Try: claude --debug
-│      │      │      │      │      │         Look for skill loading errors
-```
-
-## Hook Debugging Flow
-
-```
-Hook not running?
-│
-├─ Is script executable?
-│  ├─ No ──► chmod +x script.sh
-│  └─ Yes
-│      │
-│      ├─ Is settings.json valid JSON?
-│      │  ├─ No ──► Fix JSON syntax (jq '.' to validate)
-│      │  └─ Yes
-│      │      │
-│      │      ├─ Is matcher correct? (case-sensitive!)
-│      │      │  ├─ "bash" ──► Change to "Bash"
-│      │      │  └─ Correct
-│      │      │      │
-│      │      │      ├─ Does path exist?
-│      │      │      │  ├─ No ──► Fix path, use $CLAUDE_PROJECT_DIR
-│      │      │      │  └─ Yes
-│      │      │      │      │
-│      │      │      │      ├─ Does script work manually?
-│      │      │      │      │  │  echo '{"tool_name":"X"}' | ./script.sh
-│      │      │      │      │  │
-│      │      │      │      │  ├─ Fails ──► Fix script errors
-│      │      │      │      │  └─ Works
-│      │      │      │      │      │
-│      │      │      │      │      └─ Run: /hooks
-│      │      │      │      │         Is hook listed?
-│      │      │      │      │         ├─ No ──► Check settings location
-│      │      │      │      │         └─ Yes ──► Try claude --debug
-```
-
-## Agent Debugging Flow
-
-```
-Agent not being used?
-│
-├─ Is file in correct location?
-│  ├─ ~/.claude/agents/name.md (user)
-│  ├─ .claude/agents/name.md (project)
-│  │
-│  ├─ Wrong location ──► Move file
-│  └─ Correct
-│      │
-│      ├─ Does filename match name: field?
-│      │  ├─ No ──► Rename file or fix name field
-│      │  └─ Yes
-│      │      │
-│      │      ├─ Does description include "Use for:"?
-│      │      │  ├─ No ──► Add: "Use for: scenario1, scenario2"
-│      │      │  └─ Yes
-│      │      │      │
-│      │      │      ├─ Run: /agents
-│      │      │      │  Is agent listed?
-│      │      │      │  │
-│      │      │      │  ├─ No ──► Check YAML frontmatter syntax
-│      │      │      │  └─ Yes
-│      │      │      │      │
-│      │      │      │      └─ Try explicit request:
-│      │      │      │         "Use the my-agent agent for this"
-```
-
-## Command Debugging Flow
-
-```
-Command not working?
-│
-├─ Is file in correct location?
-│  ├─ ~/.claude/commands/name.md (user)
-│  ├─ .claude/commands/name.md (project)
-│  │
-│  ├─ Wrong location ──► Move file
-│  └─ Correct
-│      │
-│      ├─ Does /command-name show in help?
-│      │  ├─ No ──► Check YAML frontmatter
-│      │  └─ Yes
-│      │      │
-│      │      └─ Command runs but fails?
-│      │         ├─ Check instructions in command file
-│      │         └─ Verify required tools are available
-```
-
-## MCP Debugging Flow
-
-```
-MCP server not connecting?
-│
-├─ Is server installed?
-│  │  npx @modelcontextprotocol/server-X
-│  │
-│  ├─ "not found" ──► npm install -g @modelcontextprotocol/server-X
-│  └─ Runs
-│      │
-│      ├─ Is server in .mcp.json?
-│      │  ├─ No ──► Add server config or use: claude mcp add
-│      │  └─ Yes
-│      │      │
-│      │      ├─ Are env vars set?
-│      │      │  │  Check ${VAR} references in .mcp.json
-│      │      │  │
-│      │      │  ├─ Missing ──► Set env vars or add to .env
-│      │      │  └─ Set
-│      │      │      │
-│      │      │      ├─ Is transport correct?
-│      │      │      │  │  HTTP servers need --transport http
-│      │      │      │  │
-│      │      │      │  ├─ Wrong ──► Fix transport config
-│      │      │      │  └─ Correct
-│      │      │      │      │
-│      │      │      │      └─ Try: claude --debug
-│      │      │      │         Look for MCP connection errors
-```
-
-## Permission Debugging Flow
-
-```
-Tool blocked unexpectedly?
-│
-├─ Check deny rules first
-│  │  jq '.permissions.deny' ~/.claude/settings.json
-│  │
-│  ├─ Tool in deny ──► Remove from deny list
-│  └─ Not in deny
-│      │
-│      ├─ Check allow rules
-│      │  │  jq '.permissions.allow' ~/.claude/settings.json
-│      │  │
-│      │  ├─ Tool not in allow ──► Add to allow list
-│      │  └─ In allow
-│      │      │
-│      │      ├─ Is pattern correct?
-│      │      │  │  "Bash(git:*)" allows only git commands
-│      │      │  │
-│      │      │  ├─ Pattern too narrow ──► Broaden pattern
-│      │      │  └─ Pattern correct
-│      │      │      │
-│      │      │      ├─ Check PreToolUse hooks
-│      │      │      │  │  /hooks
-│      │      │      │  │
-│      │      │      │  ├─ Hook blocking ──► Fix hook logic
-│      │      │      │  └─ No blocking hook
-│      │      │      │      │
-│      │      │      │      └─ Run: claude --debug
-│      │      │      │         Check permission decision logs
-```
-
-## General Debugging Checklist
-
-When all else fails:
-
-1. [ ] Run `claude --debug` and read output carefully
-2. [ ] Verify file locations and names
-3. [ ] Validate all JSON with `jq '.'`
-4. [ ] Check YAML frontmatter syntax
-5. [ ] Test components in isolation
-6. [ ] Check file permissions (`ls -la`)
-7. [ ] Verify environment variables
-8. [ ] Review recent changes to config
-9. [ ] Try with a fresh session
-10. [ ] Check Claude Code version (`claude --version`)

+ 0 - 130
skills/claude-code-headless/SKILL.md

@@ -1,130 +0,0 @@
----
-name: claude-code-headless
-description: "Run Claude Code programmatically without interactive UI. Triggers on: headless, CLI automation, --print, output-format, stream-json, CI/CD, scripting."
-license: MIT
-compatibility: "Claude Code CLI"
-allowed-tools: "Bash Read"
-metadata:
-  author: claude-mods
-  related-skills: claude-code-hooks, claude-code-debug
----
-
-# Claude Code Headless Mode
-
-Run Claude Code from scripts without interactive UI.
-
-## Quick Start
-
-```bash
-# Basic headless execution
-claude -p "Explain this code" --allowedTools "Read,Grep"
-
-# JSON output for parsing
-claude -p "List files" --output-format json
-
-# Continue conversation
-claude -p "Start analysis" --output-format json > result.json
-session=$(jq -r '.session_id' result.json)
-claude --resume "$session" "Now fix the issues"
-```
-
-## Essential CLI Options
-
-| Flag | Description |
-|------|-------------|
-| `-p`, `--print` | Non-interactive (headless) mode |
-| `--output-format` | text, json, stream-json |
-| `-r`, `--resume` | Resume by session ID |
-| `-c`, `--continue` | Continue most recent session |
-| `--allowedTools` | Comma-separated allowed tools |
-| `--disallowedTools` | Comma-separated denied tools |
-| `--mcp-config` | Path to MCP server config JSON |
-| `--verbose` | Enable verbose logging |
-| `--append-system-prompt` | Add to system prompt |
-
-## Permission Modes
-
-| Mode | Flag | Effect |
-|------|------|--------|
-| Default | (none) | Prompt for permissions |
-| Accept edits | `--permission-mode acceptEdits` | Auto-accept file changes |
-| Bypass | `--permission-mode bypassPermissions` | Skip all prompts |
-
-## Output Formats
-
-### Text (default)
-```bash
-claude -p "Hello"
-# Outputs: Human-readable response
-```
-
-### JSON
-```bash
-claude -p "Hello" --output-format json
-```
-```json
-{
-  "type": "result",
-  "subtype": "success",
-  "result": "Hello! How can I help?",
-  "session_id": "abc123",
-  "total_cost_usd": 0.001,
-  "duration_ms": 1234,
-  "num_turns": 1
-}
-```
-
-### Stream-JSON
-```bash
-claude -p "Hello" --output-format stream-json
-# Real-time JSONL output for each message
-```
-
-## Common Patterns
-
-### Script with tool restrictions
-```bash
-claude -p "Analyze the codebase" \
-  --allowedTools "Read,Grep,Glob" \
-  --disallowedTools "Write,Edit,Bash"
-```
-
-### CI/CD integration
-```bash
-claude -p "Review this PR diff" \
-  --permission-mode acceptEdits \
-  --output-format json \
-  --append-system-prompt "Focus on security issues"
-```
-
-### Multi-turn automation
-```bash
-session=$(claude -p "Start task" --output-format json | jq -r '.session_id')
-claude --resume "$session" "Continue with step 2"
-claude --resume "$session" "Finalize and report"
-```
-
-## Error Handling
-
-```bash
-result=$(claude -p "Task" --output-format json)
-if [[ $(echo "$result" | jq -r '.is_error') == "true" ]]; then
-    echo "Error: $(echo "$result" | jq -r '.result')" >&2
-    exit 1
-fi
-```
-
-## Official Documentation
-
-- https://code.claude.com/docs/en/headless - Headless mode reference
-- https://code.claude.com/docs/en/settings - Settings and permissions
-
-## Additional Resources
-
-- `./references/cli-options.md` - Complete CLI flag reference
-- `./references/output-formats.md` - Output format schemas
-- `./references/integration-patterns.md` - CI/CD and scripting examples
-
----
-
-**See Also:** `claude-code-hooks` for automation events, `claude-code-debug` for troubleshooting

+ 0 - 207
skills/claude-code-headless/references/cli-options.md

@@ -1,207 +0,0 @@
-# CLI Options Reference
-
-Complete reference for Claude Code CLI flags.
-
-## Core Options
-
-### Input/Output
-
-| Flag | Short | Description |
-|------|-------|-------------|
-| `--print` | `-p` | Non-interactive mode (required for headless) |
-| `--output-format` | | Output format: text, json, stream-json |
-| `--verbose` | `-v` | Enable verbose/debug logging |
-| `--quiet` | `-q` | Suppress non-essential output |
-
-### Session Management
-
-| Flag | Short | Description |
-|------|-------|-------------|
-| `--resume` | `-r` | Resume conversation by session ID |
-| `--continue` | `-c` | Continue most recent conversation |
-| `--session-id` | | Specify session ID for new session |
-
-### Prompt Configuration
-
-| Flag | Description |
-|------|-------------|
-| `--append-system-prompt` | Append text to system prompt |
-| `--prepend-system-prompt` | Prepend text to system prompt |
-| `--system-prompt` | Replace entire system prompt |
-
-### Tool Control
-
-| Flag | Description |
-|------|-------------|
-| `--allowedTools` | Comma-separated list of allowed tools |
-| `--disallowedTools` | Comma-separated list of denied tools |
-| `--mcp-config` | Path to MCP server configuration JSON |
-
-### Permission Control
-
-| Flag | Description |
-|------|-------------|
-| `--permission-mode` | Permission handling mode |
-
-Permission modes:
-- `default` - Ask for permission (blocks in headless)
-- `acceptEdits` - Auto-accept file modifications
-- `bypassPermissions` - Skip all permission prompts
-
-### Model Selection
-
-| Flag | Description |
-|------|-------------|
-| `--model` | Model to use: sonnet, opus, haiku |
-
-## Usage Examples
-
-### Basic Headless
-
-```bash
-# Simple query
-claude -p "What is 2+2?"
-
-# With file context
-cat file.py | claude -p "Explain this code"
-
-# From file
-claude -p "$(cat prompt.txt)"
-```
-
-### Tool Restrictions
-
-```bash
-# Read-only mode
-claude -p "Analyze codebase" \
-  --allowedTools "Read,Grep,Glob,WebFetch,WebSearch" \
-  --disallowedTools "Write,Edit,Bash,Task"
-
-# Specific tools only
-claude -p "Search for bugs" \
-  --allowedTools "Read,Grep"
-```
-
-### Session Continuation
-
-```bash
-# Start session, capture ID
-result=$(claude -p "Start analysis" --output-format json)
-session_id=$(echo "$result" | jq -r '.session_id')
-
-# Continue with context
-claude -r "$session_id" "What did you find?"
-
-# Continue most recent
-claude -c "Add more details"
-```
-
-### System Prompt Modification
-
-```bash
-# Add context
-claude -p "Review code" \
-  --append-system-prompt "Focus on security vulnerabilities. Output findings as markdown."
-
-# Full replacement
-claude -p "Hello" \
-  --system-prompt "You are a helpful assistant that only speaks in haiku."
-```
-
-### MCP Integration
-
-```bash
-# Use MCP servers from config
-claude -p "Query the database" \
-  --mcp-config ./mcp-servers.json
-
-# With specific tools
-claude -p "Fetch from GitHub" \
-  --mcp-config ./mcp-servers.json \
-  --allowedTools "mcp__github__*"
-```
-
-### Debugging
-
-```bash
-# Verbose output
-claude -p "Debug this" --verbose
-
-# Debug mode (shows internal operations)
-claude --debug -p "Analyze"
-```
-
-## Input Methods
-
-### Stdin Piping
-
-```bash
-# Pipe file content
-cat error.log | claude -p "Explain these errors"
-
-# Pipe command output
-git diff | claude -p "Review this diff"
-
-# Heredoc
-claude -p "$(cat <<EOF
-Analyze this data:
-- Item 1
-- Item 2
-EOF
-)"
-```
-
-### File Input
-
-```bash
-# Read prompt from file
-claude -p "$(cat prompt.md)"
-
-# With context files
-claude -p "Review: $(cat src/main.ts)"
-```
-
-## Exit Codes
-
-| Code | Meaning |
-|------|---------|
-| 0 | Success |
-| 1 | General error |
-| 2 | Invalid arguments |
-
-## Environment Variables
-
-| Variable | Description |
-|----------|-------------|
-| `ANTHROPIC_API_KEY` | API key for Claude |
-| `CLAUDE_PROJECT_DIR` | Override project directory |
-| `NO_COLOR` | Disable colored output |
-
-## Flag Combinations
-
-### CI/CD Pipeline
-
-```bash
-claude -p "Run tests and report" \
-  --permission-mode acceptEdits \
-  --output-format json \
-  --allowedTools "Bash,Read,Write"
-```
-
-### Security Audit
-
-```bash
-claude -p "Security review" \
-  --allowedTools "Read,Grep,WebSearch" \
-  --disallowedTools "Write,Edit,Bash" \
-  --append-system-prompt "Report vulnerabilities in JSON format"
-```
-
-### Documentation Generation
-
-```bash
-claude -p "Generate API docs" \
-  --permission-mode acceptEdits \
-  --allowedTools "Read,Write,Glob" \
-  --output-format json
-```

+ 0 - 373
skills/claude-code-headless/references/integration-patterns.md

@@ -1,373 +0,0 @@
-# Integration Patterns
-
-Patterns for integrating Claude Code into CI/CD, scripts, and automation.
-
-## CI/CD Pipelines
-
-### GitHub Actions
-
-```yaml
-name: Claude Code Review
-
-on:
-  pull_request:
-    types: [opened, synchronize]
-
-jobs:
-  review:
-    runs-on: ubuntu-latest
-    steps:
-      - uses: actions/checkout@v4
-
-      - name: Get PR diff
-        id: diff
-        run: |
-          gh pr diff ${{ github.event.pull_request.number }} > diff.txt
-        env:
-          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-
-      - name: Run Claude review
-        run: |
-          result=$(cat diff.txt | claude -p "Review this PR diff for:
-          - Security vulnerabilities
-          - Performance issues
-          - Code quality
-
-          Output as markdown." \
-            --output-format json \
-            --allowedTools "Read,Grep")
-
-          echo "$result" | jq -r '.result' > review.md
-        env:
-          ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
-
-      - name: Post review comment
-        run: |
-          gh pr comment ${{ github.event.pull_request.number }} \
-            --body-file review.md
-        env:
-          GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
-```
-
-### GitLab CI
-
-```yaml
-claude-review:
-  stage: review
-  script:
-    - git diff origin/main...HEAD > diff.txt
-    - |
-      cat diff.txt | claude -p "Security review" \
-        --output-format json \
-        --allowedTools "Read" \
-        > review.json
-    - cat review.json | jq -r '.result' > review.md
-  artifacts:
-    paths:
-      - review.md
-  only:
-    - merge_requests
-```
-
-### Jenkins
-
-```groovy
-pipeline {
-    agent any
-    environment {
-        ANTHROPIC_API_KEY = credentials('anthropic-api-key')
-    }
-    stages {
-        stage('Claude Analysis') {
-            steps {
-                script {
-                    def result = sh(
-                        script: '''
-                            claude -p "Analyze build issues" \
-                                --output-format json \
-                                --allowedTools "Read,Bash"
-                        ''',
-                        returnStdout: true
-                    )
-                    def json = readJSON text: result
-                    if (json.is_error) {
-                        error "Claude analysis failed: ${json.result}"
-                    }
-                }
-            }
-        }
-    }
-}
-```
-
-## Shell Scripts
-
-### PR Review Script
-
-```bash
-#!/bin/bash
-set -euo pipefail
-
-audit_pr() {
-    local pr_number="$1"
-
-    # Get PR diff
-    diff=$(gh pr diff "$pr_number")
-
-    # Run Claude analysis
-    result=$(echo "$diff" | claude -p \
-        --append-system-prompt "Security review. Output JSON: {severity, findings, recommendations}" \
-        --output-format json \
-        --allowedTools "Read,Grep,WebSearch")
-
-    # Check for errors
-    if [[ $(echo "$result" | jq -r '.is_error') == "true" ]]; then
-        echo "Error: $(echo "$result" | jq -r '.result')" >&2
-        return 1
-    fi
-
-    echo "$result" | jq -r '.result'
-}
-
-# Usage
-audit_pr 123
-```
-
-### Batch Processing
-
-```bash
-#!/bin/bash
-set -euo pipefail
-
-process_files() {
-    local pattern="$1"
-    local prompt="$2"
-
-    find . -name "$pattern" -print0 | while IFS= read -r -d '' file; do
-        echo "Processing: $file"
-
-        result=$(cat "$file" | claude -p "$prompt" \
-            --output-format json \
-            --allowedTools "Read")
-
-        if [[ $(echo "$result" | jq -r '.is_error') == "false" ]]; then
-            echo "$result" | jq -r '.result' > "${file}.analysis.md"
-        fi
-    done
-}
-
-# Usage
-process_files "*.py" "Analyze this Python file for issues"
-```
-
-### Multi-Turn Workflow
-
-```bash
-#!/bin/bash
-set -euo pipefail
-
-run_workflow() {
-    # Step 1: Initial analysis
-    result=$(claude -p "Analyze the codebase structure" \
-        --output-format json \
-        --allowedTools "Read,Glob,Grep")
-
-    session=$(echo "$result" | jq -r '.session_id')
-    echo "Session: $session"
-
-    # Step 2: Deep dive with context
-    result=$(claude --resume "$session" \
-        "Now examine the authentication module in detail" \
-        --output-format json)
-
-    # Step 3: Generate report
-    claude --resume "$session" \
-        "Generate a security report in markdown" \
-        --output-format json | jq -r '.result' > report.md
-
-    echo "Report saved to report.md"
-}
-
-run_workflow
-```
-
-## Pre-commit Hooks
-
-### Python Code Review
-
-```bash
-#!/bin/bash
-# .git/hooks/pre-commit
-
-staged_files=$(git diff --cached --name-only --diff-filter=ACM | grep '\.py$' || true)
-
-if [[ -n "$staged_files" ]]; then
-    echo "Running Claude review on staged Python files..."
-
-    for file in $staged_files; do
-        result=$(cat "$file" | claude -p \
-            "Quick code review. Report only critical issues. Be concise." \
-            --output-format json \
-            --allowedTools "Read" 2>/dev/null)
-
-        if [[ $(echo "$result" | jq -r '.is_error') == "false" ]]; then
-            review=$(echo "$result" | jq -r '.result')
-            if [[ "$review" != *"no issues"* ]] && [[ "$review" != *"looks good"* ]]; then
-                echo "Review for $file:"
-                echo "$review"
-                echo ""
-            fi
-        fi
-    done
-fi
-```
-
-## Scheduled Tasks
-
-### Daily Code Quality Report
-
-```bash
-#!/bin/bash
-# Run via cron: 0 8 * * * /path/to/daily-report.sh
-
-REPORT_DIR="/var/reports/claude"
-DATE=$(date +%Y-%m-%d)
-
-mkdir -p "$REPORT_DIR"
-
-cd /path/to/project
-
-result=$(claude -p "Generate a daily code quality report covering:
-1. Recent changes summary
-2. Potential issues
-3. Recommendations
-
-Use git log for recent changes." \
-    --output-format json \
-    --allowedTools "Bash,Read,Grep")
-
-echo "$result" | jq -r '.result' > "$REPORT_DIR/report-$DATE.md"
-
-# Email or Slack notification
-# curl -X POST "$SLACK_WEBHOOK" -d "{\"text\": \"Daily report ready\"}"
-```
-
-## Web Application Integration
-
-### Express.js Endpoint
-
-```javascript
-const express = require('express');
-const { spawn } = require('child_process');
-
-const app = express();
-app.use(express.json());
-
-app.post('/api/claude', async (req, res) => {
-    const { prompt, tools } = req.body;
-
-    const args = ['-p', prompt, '--output-format', 'json'];
-    if (tools) {
-        args.push('--allowedTools', tools.join(','));
-    }
-
-    const claude = spawn('claude', args);
-    let output = '';
-
-    claude.stdout.on('data', (data) => {
-        output += data.toString();
-    });
-
-    claude.on('close', (code) => {
-        try {
-            const result = JSON.parse(output);
-            res.json(result);
-        } catch (e) {
-            res.status(500).json({ error: 'Failed to parse response' });
-        }
-    });
-});
-
-app.listen(3000);
-```
-
-### Python FastAPI
-
-```python
-from fastapi import FastAPI, HTTPException
-from pydantic import BaseModel
-import subprocess
-import json
-
-app = FastAPI()
-
-class ClaudeRequest(BaseModel):
-    prompt: str
-    tools: list[str] | None = None
-
-@app.post("/api/claude")
-async def run_claude(request: ClaudeRequest):
-    args = ["claude", "-p", request.prompt, "--output-format", "json"]
-
-    if request.tools:
-        args.extend(["--allowedTools", ",".join(request.tools)])
-
-    proc = subprocess.run(args, capture_output=True, text=True)
-
-    try:
-        result = json.loads(proc.stdout)
-        return result
-    except json.JSONDecodeError:
-        raise HTTPException(status_code=500, detail="Failed to parse response")
-```
-
-## Error Handling Patterns
-
-### Retry with Backoff
-
-```bash
-#!/bin/bash
-
-run_with_retry() {
-    local max_attempts=3
-    local attempt=1
-    local delay=5
-
-    while [[ $attempt -le $max_attempts ]]; do
-        result=$(claude -p "$1" --output-format json 2>&1)
-
-        if [[ $(echo "$result" | jq -r '.is_error // true') == "false" ]]; then
-            echo "$result"
-            return 0
-        fi
-
-        echo "Attempt $attempt failed, retrying in ${delay}s..." >&2
-        sleep $delay
-        attempt=$((attempt + 1))
-        delay=$((delay * 2))
-    done
-
-    echo "All attempts failed" >&2
-    return 1
-}
-```
-
-### Graceful Degradation
-
-```bash
-#!/bin/bash
-
-analyze_with_fallback() {
-    # Try Claude first
-    result=$(claude -p "$1" --output-format json 2>/dev/null)
-
-    if [[ -z "$result" ]] || [[ $(echo "$result" | jq -r '.is_error') == "true" ]]; then
-        echo "Claude unavailable, using fallback analysis" >&2
-        # Fallback to simpler analysis
-        run_basic_linter "$2"
-        return
-    fi
-
-    echo "$result" | jq -r '.result'
-}
-```

+ 0 - 202
skills/claude-code-headless/references/output-formats.md

@@ -1,202 +0,0 @@
-# Output Formats Reference
-
-Detailed documentation for Claude Code output formats.
-
-## Text Format (Default)
-
-Human-readable output for terminal use.
-
-```bash
-claude -p "Hello"
-# Output: Hello! How can I help you today?
-```
-
-Characteristics:
-- Plain text response
-- May include ANSI colors (disable with `NO_COLOR=1`)
-- No metadata (session ID, cost, etc.)
-- Best for interactive scripts
-
-## JSON Format
-
-Structured output for programmatic parsing.
-
-```bash
-claude -p "Hello" --output-format json
-```
-
-### Success Response
-
-```json
-{
-  "type": "result",
-  "subtype": "success",
-  "result": "Hello! How can I help you today?",
-  "session_id": "session_abc123xyz",
-  "total_cost_usd": 0.00123,
-  "duration_ms": 1542,
-  "num_turns": 1,
-  "is_error": false
-}
-```
-
-### Error Response
-
-```json
-{
-  "type": "result",
-  "subtype": "error",
-  "result": "Error message here",
-  "session_id": "session_abc123xyz",
-  "total_cost_usd": 0.0005,
-  "duration_ms": 234,
-  "num_turns": 0,
-  "is_error": true
-}
-```
-
-### Field Reference
-
-| Field | Type | Description |
-|-------|------|-------------|
-| `type` | string | Always "result" |
-| `subtype` | string | "success" or "error" |
-| `result` | string | Response text or error message |
-| `session_id` | string | Session identifier for resumption |
-| `total_cost_usd` | number | Total API cost in USD |
-| `duration_ms` | number | Total execution time in milliseconds |
-| `num_turns` | number | Number of conversation turns |
-| `is_error` | boolean | Whether result is an error |
-
-### Parsing Examples
-
-```bash
-# Extract session ID
-session=$(claude -p "Start" --output-format json | jq -r '.session_id')
-
-# Check for errors
-result=$(claude -p "Task" --output-format json)
-if [[ $(echo "$result" | jq -r '.is_error') == "true" ]]; then
-    echo "Error: $(echo "$result" | jq -r '.result')"
-    exit 1
-fi
-
-# Get cost
-cost=$(echo "$result" | jq -r '.total_cost_usd')
-echo "Cost: \$${cost}"
-```
-
-## Stream-JSON Format
-
-Real-time JSONL (JSON Lines) output for streaming applications.
-
-```bash
-claude -p "Count to 5" --output-format stream-json
-```
-
-### Message Types
-
-#### Assistant Message
-
-```json
-{"type": "assistant", "content": "1", "timestamp": "2024-01-15T10:30:00Z"}
-{"type": "assistant", "content": "2", "timestamp": "2024-01-15T10:30:01Z"}
-```
-
-#### Tool Use
-
-```json
-{"type": "tool_use", "tool": "Read", "input": {"file_path": "/path/to/file"}, "timestamp": "..."}
-{"type": "tool_result", "tool": "Read", "output": "file contents...", "timestamp": "..."}
-```
-
-#### Final Result
-
-```json
-{"type": "result", "session_id": "...", "total_cost_usd": 0.01, "duration_ms": 5000}
-```
-
-### Stream Processing
-
-```bash
-# Process each line as it arrives
-claude -p "Long task" --output-format stream-json | while IFS= read -r line; do
-    type=$(echo "$line" | jq -r '.type')
-    case "$type" in
-        assistant)
-            echo "Claude: $(echo "$line" | jq -r '.content')"
-            ;;
-        tool_use)
-            echo "Using: $(echo "$line" | jq -r '.tool')"
-            ;;
-        result)
-            echo "Done! Cost: \$$(echo "$line" | jq -r '.total_cost_usd')"
-            ;;
-    esac
-done
-```
-
-### Node.js Stream Processing
-
-```javascript
-const { spawn } = require('child_process');
-const readline = require('readline');
-
-const claude = spawn('claude', ['-p', 'Task', '--output-format', 'stream-json']);
-
-const rl = readline.createInterface({ input: claude.stdout });
-
-rl.on('line', (line) => {
-    const event = JSON.parse(line);
-    switch (event.type) {
-        case 'assistant':
-            process.stdout.write(event.content);
-            break;
-        case 'result':
-            console.log(`\nCost: $${event.total_cost_usd}`);
-            break;
-    }
-});
-```
-
-### Python Stream Processing
-
-```python
-import subprocess
-import json
-
-proc = subprocess.Popen(
-    ['claude', '-p', 'Task', '--output-format', 'stream-json'],
-    stdout=subprocess.PIPE,
-    text=True
-)
-
-for line in proc.stdout:
-    event = json.loads(line)
-    if event['type'] == 'assistant':
-        print(event['content'], end='', flush=True)
-    elif event['type'] == 'result':
-        print(f"\nCost: ${event['total_cost_usd']}")
-```
-
-## Format Comparison
-
-| Feature | text | json | stream-json |
-|---------|------|------|-------------|
-| Real-time output | Yes | No | Yes |
-| Structured data | No | Yes | Yes |
-| Session ID | No | Yes | Yes |
-| Cost tracking | No | Yes | Yes |
-| Easy parsing | No | Yes | Yes |
-| Best for | Humans | Scripts | Real-time apps |
-
-## Format Selection Guide
-
-| Use Case | Format |
-|----------|--------|
-| Interactive terminal | text |
-| CI/CD pipelines | json |
-| Web applications | stream-json |
-| Cost tracking | json |
-| Session management | json |
-| Live progress display | stream-json |

+ 0 - 116
skills/claude-code-hooks/SKILL.md

@@ -1,116 +0,0 @@
----
-name: claude-code-hooks
-description: "Claude Code hook system for pre/post tool execution. Triggers on: hooks, PreToolUse, PostToolUse, hook script, tool validation, audit logging."
-license: MIT
-compatibility: "Claude Code CLI with settings.json support"
-allowed-tools: "Bash Read Write"
-metadata:
-  author: claude-mods
-  related-skills: claude-code-debug, claude-code-headless
----
-
-# Claude Code Hooks
-
-Execute custom scripts before/after Claude Code tool invocations.
-
-## Quick Reference
-
-| Event | When | Has Matcher |
-|-------|------|-------------|
-| `PreToolUse` | Before tool execution | Yes |
-| `PostToolUse` | After tool completes | Yes |
-| `PermissionRequest` | Permission dialog shown | Yes |
-| `Notification` | Notifications sent | Yes |
-| `UserPromptSubmit` | User submits prompt | No |
-| `Stop` | Agent finishes | No |
-| `SubagentStop` | Subagent finishes | No |
-| `PreCompact` | Before context compaction | No |
-| `SessionStart` | Session begins/resumes | No |
-| `SessionEnd` | Session ends | No |
-
-## Basic Configuration
-
-Add to `~/.claude/settings.json` or `.claude/settings.local.json`:
-
-```json
-{
-  "hooks": {
-    "PreToolUse": [{
-      "matcher": "Bash",
-      "hooks": [{
-        "type": "command",
-        "command": "$CLAUDE_PROJECT_DIR/hooks/validate.sh",
-        "timeout": 5000
-      }]
-    }]
-  }
-}
-```
-
-## Matcher Patterns
-
-| Pattern | Matches |
-|---------|---------|
-| `"Write"` | Only Write tool |
-| `"*"` or `""` | All tools |
-| `"mcp__*"` | All MCP tools |
-| `"Bash"` | Bash commands |
-
-## Hook Script Requirements
-
-```bash
-#!/bin/bash
-# Receives JSON via stdin: { "tool_name": "...", "tool_input": {...} }
-INPUT=$(cat)
-TOOL=$(echo "$INPUT" | jq -r '.tool_name')
-
-# Exit codes:
-# 0 = Success (continue)
-# 2 = Block with error (stderr shown to Claude)
-# Other = Non-blocking error
-```
-
-## Common Use Cases
-
-| Use Case | Event | Example |
-|----------|-------|---------|
-| Validate inputs | PreToolUse | Block dangerous commands |
-| Audit logging | PostToolUse | Log all tool usage |
-| Custom approval | PermissionRequest | Slack notification |
-| Session init | SessionStart | Load project context |
-
-## Security Checklist
-
-- [ ] Quote all variables: `"$VAR"` not `$VAR`
-- [ ] Validate paths (no `..` traversal)
-- [ ] Use `$CLAUDE_PROJECT_DIR` for paths
-- [ ] Set reasonable timeouts
-- [ ] Handle jq parsing errors
-
-## Troubleshooting
-
-```bash
-# Debug hook loading
-claude --debug
-
-# List registered hooks
-/hooks
-
-# Test script manually
-echo '{"tool_name":"Bash"}' | ./hooks/validate.sh
-```
-
-## Official Documentation
-
-- https://code.claude.com/docs/en/hooks - Hooks reference
-- https://code.claude.com/docs/en/settings - Settings configuration
-
-## Additional Resources
-
-- `./references/hook-events.md` - All events with input/output schemas
-- `./references/configuration.md` - Advanced config patterns
-- `./references/security-ops.md` - Production security
-
----
-
-**See Also:** `claude-code-debug` for troubleshooting, `claude-code-headless` for CLI automation

+ 0 - 263
skills/claude-code-hooks/references/configuration.md

@@ -1,263 +0,0 @@
-# Hook Configuration Patterns
-
-Advanced configuration for Claude Code hooks.
-
-## Configuration Locations
-
-| File | Scope | Priority |
-|------|-------|----------|
-| `~/.claude/settings.json` | Global (all projects) | Lower |
-| `.claude/settings.local.json` | Project-specific | Higher |
-
-Project settings are additive to global settings.
-
-## Full Configuration Schema
-
-```json
-{
-  "hooks": {
-    "EventName": [
-      {
-        "matcher": "ToolPattern",
-        "hooks": [
-          {
-            "type": "command",
-            "command": "path/to/script.sh",
-            "timeout": 5000
-          }
-        ]
-      }
-    ]
-  }
-}
-```
-
-## Multiple Hooks Per Event
-
-```json
-{
-  "hooks": {
-    "PreToolUse": [
-      {
-        "matcher": "Write",
-        "hooks": [
-          { "type": "command", "command": "validate-write.sh" }
-        ]
-      },
-      {
-        "matcher": "Bash",
-        "hooks": [
-          { "type": "command", "command": "validate-bash.sh" }
-        ]
-      },
-      {
-        "matcher": "*",
-        "hooks": [
-          { "type": "command", "command": "log-all-tools.sh" }
-        ]
-      }
-    ]
-  }
-}
-```
-
-## Matcher Patterns
-
-### Simple Matchers
-
-```json
-{"matcher": "Write"}      // Exact tool name
-{"matcher": "Bash"}       // Bash commands
-{"matcher": "Read"}       // File reads
-```
-
-### Wildcard Matchers
-
-```json
-{"matcher": "*"}          // All tools
-{"matcher": ""}           // All tools (empty = wildcard)
-{"matcher": "mcp__*"}     // All MCP tools
-```
-
-### MCP Tool Matchers
-
-```json
-{"matcher": "mcp__filesystem__*"}     // All filesystem MCP tools
-{"matcher": "mcp__github__create_*"}  // GitHub create operations
-```
-
-## Chaining Multiple Commands
-
-```json
-{
-  "hooks": {
-    "PreToolUse": [
-      {
-        "matcher": "Write",
-        "hooks": [
-          { "type": "command", "command": "lint-check.sh", "timeout": 3000 },
-          { "type": "command", "command": "security-scan.sh", "timeout": 10000 }
-        ]
-      }
-    ]
-  }
-}
-```
-
-Hooks execute sequentially. If any hook exits with code 2, execution stops.
-
-## Timeout Configuration
-
-```json
-{
-  "type": "command",
-  "command": "slow-check.sh",
-  "timeout": 30000  // 30 seconds (milliseconds)
-}
-```
-
-Default timeout: 5000ms (5 seconds)
-
-## Path Variables
-
-Use `$CLAUDE_PROJECT_DIR` for portable paths:
-
-```json
-{
-  "hooks": {
-    "PreToolUse": [{
-      "matcher": "*",
-      "hooks": [{
-        "type": "command",
-        "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/validate.sh"
-      }]
-    }]
-  }
-}
-```
-
-## Conditional Hooks
-
-Handle conditions in the script, not configuration:
-
-```bash
-#!/bin/bash
-INPUT=$(cat)
-TOOL=$(echo "$INPUT" | jq -r '.tool_name')
-
-# Only process Write and Edit
-case "$TOOL" in
-  Write|Edit)
-    # Validation logic
-    ;;
-  *)
-    exit 0  # Skip other tools
-    ;;
-esac
-```
-
-## Environment-Specific Hooks
-
-### Development vs Production
-
-```bash
-#!/bin/bash
-if [[ "${CLAUDE_ENV:-development}" == "production" ]]; then
-    # Stricter validation
-    strict_validate.sh
-else
-    # Lenient for development
-    exit 0
-fi
-```
-
-### Per-Project Override
-
-Project `.claude/settings.local.json` can add project-specific hooks:
-
-```json
-{
-  "hooks": {
-    "PreToolUse": [{
-      "matcher": "Bash",
-      "hooks": [{
-        "type": "command",
-        "command": "$CLAUDE_PROJECT_DIR/scripts/project-validate.sh"
-      }]
-    }]
-  }
-}
-```
-
-## Common Configuration Patterns
-
-### Audit Logging
-
-```json
-{
-  "hooks": {
-    "PostToolUse": [{
-      "matcher": "*",
-      "hooks": [{
-        "type": "command",
-        "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/audit-log.sh"
-      }]
-    }]
-  }
-}
-```
-
-### Block Dangerous Commands
-
-```json
-{
-  "hooks": {
-    "PreToolUse": [{
-      "matcher": "Bash",
-      "hooks": [{
-        "type": "command",
-        "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/block-dangerous.sh",
-        "timeout": 1000
-      }]
-    }]
-  }
-}
-```
-
-### Session Initialization
-
-```json
-{
-  "hooks": {
-    "SessionStart": [{
-      "hooks": [{
-        "type": "command",
-        "command": "$CLAUDE_PROJECT_DIR/.claude/hooks/init-session.sh"
-      }]
-    }]
-  }
-}
-```
-
-## Debugging Configuration
-
-1. **Check JSON validity:**
-   ```bash
-   jq '.' ~/.claude/settings.json
-   ```
-
-2. **Test hook script:**
-   ```bash
-   echo '{"tool_name":"Bash","tool_input":{}}' | ./hook.sh
-   echo $?  # Check exit code
-   ```
-
-3. **Enable debug mode:**
-   ```bash
-   claude --debug
-   ```
-
-4. **List registered hooks:**
-   ```
-   /hooks
-   ```

+ 0 - 251
skills/claude-code-hooks/references/hook-events.md

@@ -1,251 +0,0 @@
-# Hook Events Reference
-
-Comprehensive documentation for all Claude Code hook events.
-
-## Event Processing Order
-
-```
-PreToolUse Hook → Deny Rules → Allow Rules → Ask Rules → Permission Check → [Tool Execution] → PostToolUse Hook
-```
-
-## PreToolUse
-
-Fires before a tool is executed. Can block or modify the operation.
-
-**Input Schema:**
-```json
-{
-  "session_id": "abc123",
-  "tool_name": "Write",
-  "tool_input": {
-    "file_path": "/path/to/file.txt",
-    "content": "file contents..."
-  }
-}
-```
-
-**Use Cases:**
-- Block dangerous operations
-- Validate file paths
-- Enforce naming conventions
-- Rate limiting
-
-**Example:**
-```bash
-#!/bin/bash
-INPUT=$(cat)
-FILE=$(echo "$INPUT" | jq -r '.tool_input.file_path // empty')
-
-# Block writes to protected directories
-if [[ "$FILE" == /etc/* ]] || [[ "$FILE" == /usr/* ]]; then
-    echo "Cannot write to system directories" >&2
-    exit 2
-fi
-```
-
-## PostToolUse
-
-Fires after a tool completes. Cannot block but can log or notify.
-
-**Input Schema:**
-```json
-{
-  "session_id": "abc123",
-  "tool_name": "Bash",
-  "tool_input": {
-    "command": "npm test"
-  },
-  "tool_output": {
-    "stdout": "...",
-    "stderr": "...",
-    "exit_code": 0
-  }
-}
-```
-
-**Use Cases:**
-- Audit logging
-- Metrics collection
-- Notifications on completion
-- Output transformation
-
-**Example:**
-```bash
-#!/bin/bash
-INPUT=$(cat)
-LOG_FILE="$CLAUDE_PROJECT_DIR/.claude/audit.log"
-TOOL=$(echo "$INPUT" | jq -r '.tool_name')
-TIME=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
-
-echo "$TIME | $TOOL | $(echo "$INPUT" | jq -c '.')" >> "$LOG_FILE"
-```
-
-## PermissionRequest
-
-Fires when Claude Code shows a permission dialog.
-
-**Input Schema:**
-```json
-{
-  "session_id": "abc123",
-  "tool_name": "Bash",
-  "tool_input": {
-    "command": "rm -rf node_modules"
-  },
-  "permission_type": "tool_use"
-}
-```
-
-**Use Cases:**
-- Custom approval workflows
-- Slack/Teams notifications
-- External approval systems
-
-## Notification
-
-Fires when Claude Code sends a notification.
-
-**Input Schema:**
-```json
-{
-  "session_id": "abc123",
-  "notification_type": "task_complete",
-  "message": "Build completed successfully"
-}
-```
-
-**Use Cases:**
-- Forward to external services
-- Custom notification routing
-
-## UserPromptSubmit
-
-Fires when user submits a prompt. No matcher support.
-
-**Input Schema:**
-```json
-{
-  "session_id": "abc123",
-  "prompt": "User's message text"
-}
-```
-
-**Use Cases:**
-- Input logging
-- Prompt transformation
-- Usage analytics
-
-## Stop
-
-Fires when the main agent finishes. No matcher support.
-
-**Input Schema:**
-```json
-{
-  "session_id": "abc123",
-  "reason": "completed",
-  "final_message": "Task completed successfully"
-}
-```
-
-**Use Cases:**
-- Session cleanup
-- Final logging
-- Resource deallocation
-
-## SubagentStop
-
-Fires when a subagent finishes. No matcher support.
-
-**Input Schema:**
-```json
-{
-  "session_id": "abc123",
-  "subagent_id": "xyz789",
-  "subagent_type": "python-expert",
-  "result": "Analysis complete"
-}
-```
-
-**Use Cases:**
-- Subagent performance tracking
-- Result aggregation
-
-## PreCompact
-
-Fires before context window compaction. No matcher support.
-
-**Input Schema:**
-```json
-{
-  "session_id": "abc123",
-  "current_tokens": 150000,
-  "max_tokens": 200000
-}
-```
-
-**Use Cases:**
-- Save context state
-- Pre-compaction processing
-
-## SessionStart
-
-Fires when a session begins or resumes. No matcher support.
-
-**Input Schema:**
-```json
-{
-  "session_id": "abc123",
-  "is_resume": false,
-  "project_dir": "/path/to/project"
-}
-```
-
-**Use Cases:**
-- Project initialization
-- Load session state
-- Environment setup
-
-**Example:**
-```bash
-#!/bin/bash
-# Load project-specific environment
-source "$CLAUDE_PROJECT_DIR/.env.local" 2>/dev/null || true
-echo "Session initialized for $(basename "$CLAUDE_PROJECT_DIR")"
-```
-
-## SessionEnd
-
-Fires when a session ends. No matcher support.
-
-**Input Schema:**
-```json
-{
-  "session_id": "abc123",
-  "duration_ms": 3600000,
-  "total_cost_usd": 0.05
-}
-```
-
-**Use Cases:**
-- Session logging
-- Cost tracking
-- Cleanup tasks
-
-## Exit Code Reference
-
-| Code | Meaning | Effect |
-|------|---------|--------|
-| 0 | Success | Continue execution |
-| 2 | Blocking error | Stop, show stderr to Claude |
-| Other | Non-blocking error | Log warning, continue |
-
-## Environment Variables
-
-Available in all hook scripts:
-
-| Variable | Description |
-|----------|-------------|
-| `CLAUDE_PROJECT_DIR` | Current project directory |
-| `CLAUDE_SESSION_ID` | Current session identifier |
-| `CLAUDE_TOOL_NAME` | Tool being executed (PreToolUse/PostToolUse only) |

+ 0 - 264
skills/claude-code-hooks/references/security-patterns.md

@@ -1,264 +0,0 @@
-# Hook Security Patterns
-
-Security best practices for Claude Code hook scripts.
-
-## Input Validation
-
-### Always Parse JSON Safely
-
-```bash
-#!/bin/bash
-set -euo pipefail
-
-INPUT=$(cat)
-
-# Validate JSON structure
-if ! echo "$INPUT" | jq -e '.' > /dev/null 2>&1; then
-    echo "Invalid JSON input" >&2
-    exit 2
-fi
-
-TOOL=$(echo "$INPUT" | jq -r '.tool_name // empty')
-if [[ -z "$TOOL" ]]; then
-    echo "Missing tool_name" >&2
-    exit 2
-fi
-```
-
-### Quote All Variables
-
-```bash
-# GOOD - Variables are quoted
-file_path="$1"
-command="$CLAUDE_PROJECT_DIR/scripts/validate.sh"
-echo "Processing: $file_path"
-
-# BAD - Unquoted variables allow injection
-file_path=$1
-command=$CLAUDE_PROJECT_DIR/scripts/validate.sh
-```
-
-### Path Traversal Prevention
-
-```bash
-#!/bin/bash
-INPUT=$(cat)
-FILE=$(echo "$INPUT" | jq -r '.tool_input.file_path // empty')
-
-# Check for path traversal
-if [[ "$FILE" == *".."* ]]; then
-    echo "Path traversal attempt blocked: $FILE" >&2
-    exit 2
-fi
-
-# Ensure within project directory
-REAL_PATH=$(realpath -m "$FILE" 2>/dev/null || echo "$FILE")
-if [[ "$REAL_PATH" != "$CLAUDE_PROJECT_DIR"* ]]; then
-    echo "Path outside project directory: $FILE" >&2
-    exit 2
-fi
-```
-
-### Command Injection Prevention
-
-```bash
-#!/bin/bash
-INPUT=$(cat)
-CMD=$(echo "$INPUT" | jq -r '.tool_input.command // empty')
-
-# Block dangerous commands
-DANGEROUS_PATTERNS=(
-    "rm -rf /"
-    "rm -rf /*"
-    "> /dev/sda"
-    "mkfs."
-    "dd if="
-    ":(){:|:&};:"
-)
-
-for pattern in "${DANGEROUS_PATTERNS[@]}"; do
-    if [[ "$CMD" == *"$pattern"* ]]; then
-        echo "Blocked dangerous command: $pattern" >&2
-        exit 2
-    fi
-done
-```
-
-## Secrets Management
-
-### Never Log Secrets
-
-```bash
-#!/bin/bash
-INPUT=$(cat)
-
-# DON'T: Log full input (may contain secrets)
-# echo "$INPUT" >> /tmp/debug.log
-
-# DO: Log sanitized data
-TOOL=$(echo "$INPUT" | jq -r '.tool_name')
-echo "$(date) | $TOOL" >> "$CLAUDE_PROJECT_DIR/.claude/audit.log"
-```
-
-### Environment Variable Handling
-
-```bash
-#!/bin/bash
-# Load secrets from secure source
-if [[ -f "$HOME/.secrets/claude-hooks" ]]; then
-    source "$HOME/.secrets/claude-hooks"
-fi
-
-# Never echo secrets
-# echo "Using API key: $API_KEY"  # BAD
-
-# Use for operations without exposing
-curl -s -H "Authorization: Bearer $API_KEY" "$ENDPOINT" > /dev/null
-```
-
-## Rate Limiting
-
-```bash
-#!/bin/bash
-RATE_FILE="/tmp/claude-hook-rate"
-MAX_CALLS=100
-WINDOW=60  # seconds
-
-NOW=$(date +%s)
-CUTOFF=$((NOW - WINDOW))
-
-# Atomic file operations
-{
-    flock -x 200
-
-    # Clean old entries and count recent
-    if [[ -f "$RATE_FILE" ]]; then
-        RECENT=$(awk -v cutoff="$CUTOFF" '$1 > cutoff' "$RATE_FILE" | wc -l)
-    else
-        RECENT=0
-    fi
-
-    if [[ $RECENT -ge $MAX_CALLS ]]; then
-        echo "Rate limit exceeded: $RECENT calls in ${WINDOW}s" >&2
-        exit 2
-    fi
-
-    # Log this call
-    echo "$NOW" >> "$RATE_FILE"
-
-    # Cleanup old entries
-    awk -v cutoff="$CUTOFF" '$1 > cutoff' "$RATE_FILE" > "${RATE_FILE}.tmp"
-    mv "${RATE_FILE}.tmp" "$RATE_FILE"
-
-} 200>"${RATE_FILE}.lock"
-```
-
-## Timeout Handling
-
-```bash
-#!/bin/bash
-# Set script timeout
-TIMEOUT=10
-
-# Use timeout for external commands
-timeout "$TIMEOUT" some-slow-command || {
-    echo "Command timed out after ${TIMEOUT}s" >&2
-    exit 2
-}
-```
-
-## Error Handling
-
-```bash
-#!/bin/bash
-set -euo pipefail
-
-# Trap errors
-trap 'echo "Hook failed at line $LINENO" >&2; exit 1' ERR
-
-# Validate dependencies
-command -v jq >/dev/null 2>&1 || {
-    echo "jq is required but not installed" >&2
-    exit 1
-}
-
-# Main logic with explicit error handling
-INPUT=$(cat) || {
-    echo "Failed to read input" >&2
-    exit 1
-}
-```
-
-## File Permissions
-
-```bash
-# Hook scripts should be executable only by owner
-chmod 700 hook-script.sh
-
-# Sensitive config should be readable only by owner
-chmod 600 ~/.claude/settings.json
-
-# Audit logs should be append-only where possible
-chattr +a /var/log/claude-audit.log  # Linux only
-```
-
-## Audit Trail Pattern
-
-```bash
-#!/bin/bash
-INPUT=$(cat)
-LOG_DIR="$CLAUDE_PROJECT_DIR/.claude/audit"
-mkdir -p "$LOG_DIR"
-
-TIMESTAMP=$(date -u +"%Y-%m-%dT%H:%M:%SZ")
-SESSION=$(echo "$INPUT" | jq -r '.session_id // "unknown"')
-TOOL=$(echo "$INPUT" | jq -r '.tool_name // "unknown"')
-LOG_FILE="$LOG_DIR/${SESSION}.jsonl"
-
-# Append-only logging
-{
-    echo "{\"timestamp\":\"$TIMESTAMP\",\"tool\":\"$TOOL\",\"input\":$(echo "$INPUT" | jq -c '.tool_input')}"
-} >> "$LOG_FILE"
-```
-
-## Security Checklist
-
-### Before Deployment
-
-- [ ] All variables quoted
-- [ ] Path traversal checks implemented
-- [ ] Dangerous command patterns blocked
-- [ ] No secrets in logs
-- [ ] Proper file permissions set
-- [ ] Timeout configured
-- [ ] Error handling complete
-- [ ] Input JSON validated
-
-### Script Header Template
-
-```bash
-#!/bin/bash
-#
-# Claude Code Hook: [description]
-# Security considerations:
-#   - Validates all JSON input
-#   - Blocks path traversal
-#   - Quotes all variables
-#   - Logs sanitized data only
-#
-
-set -euo pipefail
-trap 'echo "Error at line $LINENO" >&2; exit 1' ERR
-
-# Dependencies check
-command -v jq >/dev/null 2>&1 || { echo "jq required" >&2; exit 1; }
-
-# Read and validate input
-INPUT=$(cat)
-if ! echo "$INPUT" | jq -e '.' > /dev/null 2>&1; then
-    echo "Invalid JSON" >&2
-    exit 2
-fi
-
-# Main logic here...
-```

+ 104 - 0
skills/claude-code-internals/SKILL.md

@@ -0,0 +1,104 @@
+---
+name: claude-code-internals
+description: "Claude Code internals - hooks, skills, subagents, headless mode, and debugging, current as of June 2026. Use for: hooks, hook events, hook not firing, PreToolUse, PostToolUse, SessionStart, Stop hook, hook script, stdin JSON contract, tool validation, audit logging, skill frontmatter, SKILL.md, skill not loading, skill not triggering, disable-model-invocation, context fork, dynamic context injection, skill description budget, headless, claude -p, CLI automation, --print, output-format, stream-json, json-schema structured output, CI/CD scripting, bare mode, background agents, debug, troubleshoot, not working, agent not found, plugin not loading, claude plugin validate, /doctor, --safe-mode, MCP server not connecting, settings precedence."
+license: MIT
+compatibility: "Claude Code CLI v2.1.x (June 2026 docs)"
+allowed-tools: "Bash Read Grep"
+metadata:
+  author: claude-mods
+  related-skills: "mcp-ops, setperms, dsp-launch"
+---
+
+# Claude Code Internals
+
+One skill for the machinery of Claude Code itself: the **hook system**, the **skill format**, **headless/programmatic use**, and **debugging your configuration**. Replaces the former claude-code-hooks / claude-code-headless / claude-code-debug skills, refreshed against the June 2026 docs.
+
+> Written against Claude Code ~v2.1.17x. These surfaces move fast — when precision matters, confirm against the live docs (links per section). Two contracts from older guides are **dead**: the `$TOOL_INPUT` env-var hook contract (hooks read **stdin JSON** now) and standalone command files as a separate system (commands merged into skills).
+
+## Route to the Right Reference
+
+| You're doing | Load |
+|---|---|
+| Writing/fixing a hook; event contracts; blocking tools; audit logging | [references/hooks-reference.md](references/hooks-reference.md) |
+| Authoring a skill; frontmatter fields; `$ARGUMENTS`; `` !`cmd` `` injection; triggering problems | [references/skills-reference.md](references/skills-reference.md) |
+| `claude -p`; CI scripts; output parsing; stream-json; structured output; background agents | [references/headless-reference.md](references/headless-reference.md) |
+| Anything configured isn't taking effect; plugin validation; /doctor | [references/debugging-reference.md](references/debugging-reference.md) |
+
+## Mental Model
+
+- **Skills** = prompt content loaded on demand (descriptions always in context, body on invoke). Guidance, not guarantees.
+- **Hooks** = deterministic shell/HTTP/MCP/model handlers on lifecycle events. Guarantees: use hooks (or permissions) for "must never happen", skills/CLAUDE.md for "we do it this way".
+- **Subagents** = isolated context windows with their own system prompt, tools, model, permissions.
+- **Headless** = the same agent loop driven by `claude -p` (Agent SDK under the CLI).
+- **Plugins** = a directory bundling all of the above (`skills/`, `agents/`, `hooks/hooks.json`, `.mcp.json`), manifest at `.claude-plugin/plugin.json`.
+
+## Hooks in 30 Seconds
+
+Configure under the `"hooks"` key in `~/.claude/settings.json`, `.claude/settings.json`, `.claude/settings.local.json`, plugin `hooks/hooks.json`, or **skill/agent frontmatter**:
+
+```json
+{
+  "hooks": {
+    "PreToolUse": [
+      { "matcher": "Bash",
+        "hooks": [{ "type": "command", "command": "${CLAUDE_PROJECT_DIR}/.claude/hooks/check.sh" }] }
+    ]
+  }
+}
+```
+
+Contract: JSON payload on **stdin** → respond with **exit code** (0 ok, 2 block + stderr feedback) and optional **stdout JSON** (`continue`, `systemMessage`, `hookSpecificOutput.permissionDecision/updatedInput/additionalContext/...`).
+
+Events (full catalog + per-event schemas in the reference): `SessionStart`, `SessionEnd`, `Setup`, `UserPromptSubmit`, `UserPromptExpansion`, `PreToolUse`, `PermissionRequest`, `PermissionDenied`, `PostToolUse`, `PostToolUseFailure`, `PostToolBatch`, `Stop`, `StopFailure`, `SubagentStart`, `SubagentStop`, `TaskCreated`, `TaskCompleted`, `TeammateIdle`, `Notification`, `MessageDisplay`, `ConfigChange`, `CwdChanged`, `FileChanged`, `PreCompact`, `PostCompact`, `InstructionsLoaded`, `WorktreeCreate`, `WorktreeRemove`, `Elicitation`, `ElicitationResult`.
+
+Hook types: `command` (sync/`async`/`asyncRewake`), `http`, `mcp_tool`, `prompt`, `agent`. Matchers are case-sensitive strings (`"Edit|Write"`, regex allowed); `if` filters add permission-rule conditions like `Bash(git *)`.
+
+## Skills in 30 Seconds
+
+`.claude/skills/<name>/SKILL.md` (project) or `~/.claude/skills/<name>/SKILL.md` (personal). Frontmatter fields beyond `name`/`description`: `when_to_use`, `argument-hint`, `arguments`, `disable-model-invocation`, `user-invocable`, `allowed-tools`, `disallowed-tools`, `model`, `effort`, `context: fork` + `agent`, `hooks`, `paths`, `shell`.
+
+- Substitutions: `$ARGUMENTS`, `$ARGUMENTS[N]`, `$N` (0-based!), `$name`, `${CLAUDE_SKILL_DIR}`, `${CLAUDE_SESSION_ID}`, `${CLAUDE_EFFORT}`.
+- `` !`command` `` runs at load time and inlines output (dynamic context injection).
+- Body persists all session; keep SKILL.md < 500 lines, details in supporting files.
+- Description + `when_to_use` capped at 1,536 chars in the listing; listing budget = 1% of context window — `/doctor` reports overflow.
+
+## Headless in 30 Seconds
+
+```bash
+claude --bare -p "query" --allowedTools "Read,Grep" --output-format json | jq -r '.result'
+```
+
+- `--bare` for reproducible CI runs (skips hooks/skills/plugins/MCP/CLAUDE.md; auth via `ANTHROPIC_API_KEY` or `claude setup-token`).
+- `--output-format json` → `.result`, `.session_id`, `.is_error`, `.total_cost_usd`, `.num_turns`; add `--json-schema '<schema>'` → `.structured_output`.
+- `stream-json` (+ `--verbose --include-partial-messages`) for token streaming; `system/init` event lists loaded `plugins`/`plugin_errors` (assert in CI).
+- Multi-turn: `--continue`, `--resume <id|name>`, `--fork-session`; caps: `--max-turns`, `--max-budget-usd`.
+- Fire-and-forget: `claude --bg "task"`, then `claude agents --json` / `logs` / `attach` / `stop`.
+
+## Debugging Decision Tree
+
+```
+Something configured isn't working
+├─ What loaded? /context → then /memory /skills /agents /hooks /mcp /permissions
+├─ Config valid? /doctor (invalid keys, schema errors, skill budget overflow)
+├─ Which scope won? /status  (managed > local > project > user; flags/env on top)
+├─ Watch it live: claude --debug hooks | --debug mcp | --debug "api,hooks"
+└─ Bisect: claude --safe-mode (customizations off)
+   └─ still broken → CLAUDE_CONFIG_DIR=/tmp/clean claude  (nothing loads)
+```
+
+Fast classics: hooks belong in `settings.json` not `~/.claude.json`; matcher is a case-sensitive **string**, not an array; skills need a folder + `SKILL.md`; `.mcp.json` at repo root; `settings.local.json` overrides `settings.json`; agent files on disk load at session start; `claude plugin validate --strict` in CI.
+
+## Subagents (quick facts)
+
+Markdown + YAML frontmatter in `.claude/agents/` / `~/.claude/agents/` / plugin `agents/` / `--agents '<json>'`. Required: `name` (the identity — filename doesn't matter), `description`. Optional: `tools`, `disallowedTools`, `model` (default `inherit`), `permissionMode`, `maxTurns`, `skills` (preloads **full** skill content), `mcpServers`, `hooks`, `memory` (`user|project|local`), `background`, `effort`, `isolation: worktree`, `color`, `initialPrompt`. Plugin agents ignore `hooks`/`mcpServers`/`permissionMode`. Built-in Explore/Plan skip CLAUDE.md. The Task tool was renamed **Agent** (v2.1.63; `Task(...)` rules still alias).
+
+## Official Docs (verify here when it matters)
+
+- https://code.claude.com/docs/en/hooks — hook events, contracts, types
+- https://code.claude.com/docs/en/skills — skill format and lifecycle
+- https://code.claude.com/docs/en/cli-reference — every flag (note: `--help` doesn't list them all)
+- https://code.claude.com/docs/en/headless — `claude -p` patterns
+- https://code.claude.com/docs/en/sub-agents — subagent frontmatter
+- https://code.claude.com/docs/en/plugins-reference — plugin schemas + `claude plugin` CLI
+- https://code.claude.com/docs/en/debug-your-config — diagnosis workflow
+- https://code.claude.com/docs/llms.txt — full docs index

+ 0 - 0
skills/claude-code-headless/assets/.gitkeep → skills/claude-code-internals/assets/.gitkeep


+ 129 - 0
skills/claude-code-internals/references/debugging-reference.md

@@ -0,0 +1,129 @@
+# Debugging Reference
+
+Diagnosing Claude Code configuration, extensions, and behavior. Verified against https://code.claude.com/docs/en/debug-your-config (June 2026).
+
+Core principle: when something you configured doesn't take effect, the cause is almost always one of three things — **it didn't load**, **it loaded from a different location than you expect**, or **another scope overrode it**. Inspect what actually loaded before editing anything.
+
+## Inspection Commands (run these first)
+
+| Command | Shows |
+|---|---|
+| `/context` | Everything in the context window by category: system prompt, memory, skills, MCP tools, messages |
+| `/memory` | Which CLAUDE.md and rules files loaded, plus auto-memory |
+| `/skills` | Available skills (project/user/plugin) with invocation badges; `Space` cycles `skillOverrides` states |
+| `/agents` | Configured subagents and settings |
+| `/hooks` | Every hook registered this session, grouped by event |
+| `/mcp` | MCP servers, connection status, approval state |
+| `/permissions` | Resolved allow/deny rules in effect |
+| `/doctor` | Diagnostics: invalid settings keys, schema errors, skill-description budget overflow, install health. Press `f` to send the report to Claude for guided fixes |
+| `/debug [issue]` | Enables debug logging and prompts Claude to self-diagnose from logs |
+| `/status` | Active settings sources, incl. whether managed settings apply |
+
+## CLI Diagnostic Flags
+
+| Flag | Use |
+|---|---|
+| `claude --debug` | Debug logging; filter by category: `--debug "api,hooks"`, negate: `--debug "!statsig,!file"` |
+| `claude --debug hooks` | Watch hook evaluation live: events fired, matchers checked, exit codes, output |
+| `claude --debug mcp` | MCP server stderr (e.g. connected-but-zero-tools) |
+| `claude --debug-file /tmp/claude.log` | Write debug logs to a file (implies debug mode; beats `CLAUDE_CODE_DEBUG_LOGS_DIR`) |
+| `claude --safe-mode` | (v2.1.169+) All customizations off: CLAUDE.md, skills, plugins, hooks, MCP, custom commands/agents, output styles, themes, keybindings, LSP, auto-memory. Auth/model/tools/permissions normal. Managed policy still partially applies |
+| `claude --bare` | Minimal startup for scripts (different goal than safe-mode: speed/reproducibility, not triage) |
+| `claude --verbose` | Full turn-by-turn output |
+| `claude doctor` / `claude --version` | Install health from the shell |
+
+### Bisection workflow
+
+1. `claude --safe-mode` — problem gone? A customization is the cause; use targeted `/` commands to find which.
+2. Still broken? Fully clean session: `cd /tmp && CLAUDE_CONFIG_DIR=/tmp/claude-clean claude` (no user/project config at all; managed settings and env vars still apply; you'll re-login on Linux/Windows).
+3. Reintroduce config one piece at a time.
+
+## Skill Not Working
+
+| Symptom | Cause | Fix |
+|---|---|---|
+| Not in `/skills` | File at `.claude/skills/name.md` instead of a folder | Must be `.claude/skills/name/SKILL.md` |
+| Not in `/skills` | New top-level skills dir created mid-session | Restart (existing dirs are live-watched; new top-level dirs are not) |
+| In `/skills`, Claude never invokes it | `disable-model-invocation: true` ("user-only" badge), or description doesn't match request phrasing | Check badge; add natural trigger keywords to `description`/`when_to_use` |
+| Used to trigger, stopped | Description budget overflow (1% of context window) | `/doctor` shows affected skills. Raise `skillListingBudgetFraction`, set noise skills to `"name-only"` in `skillOverrides`, front-load key triggers (1,536-char per-skill cap) |
+| Triggers too often | Description too broad | Tighten description or `disable-model-invocation: true` |
+| Stops influencing behavior mid-session | Content usually still present; model preferring other approaches — or dropped by compaction (5k/skill, 25k combined budget) | Strengthen description/instructions, enforce with hooks, or re-invoke after compaction |
+| Frontmatter ignored | YAML error (e.g. unquoted `:` in description) | Quote strings; validate YAML |
+
+## Hook Not Firing
+
+Decision tree:
+
+1. **Not listed in `/hooks`?** It isn't being read:
+   - Hooks belong under the `"hooks"` key in a **settings file** — there is no standalone hooks file for user/project config (only plugins use `hooks/hooks.json`).
+   - `~/.claude.json` is app state, NOT settings — `hooks`/`permissions`/`env` go in `~/.claude/settings.json`.
+   - `matcher` as a JSON **array** is a schema error: the entry is dropped, `/doctor` reports it.
+2. **Listed but never fires?** Matcher problems:
+   - Case-sensitive: `Bash` not `bash`; tool names are capitalized.
+   - Multiple tools = one string with `|`: `"Edit|Write"`.
+   - MCP tools need `mcp__server__tool` or regex `mcp__server__.*`.
+3. **Fires but misbehaves?** `claude --debug hooks`, then test the script standalone:
+   ```bash
+   echo '{"tool_name":"Bash","tool_input":{"command":"ls"}}' | ./hook.sh; echo "exit=$?"
+   ```
+   - Script must read **stdin JSON** (the `$TOOL_INPUT` env contract is stale/dead).
+   - Exit 2 + stderr = block; JSON decisions only parsed on exit 0 stdout.
+   - Not executable / wrong shebang / CRLF line endings on Unix.
+4. Settings edits apply after a brief file-stability delay — no restart needed; re-run `/hooks` to refresh.
+
+## Agent Not Found / Not Used
+
+- Locations: `.claude/agents/` (project, recursive scan), `~/.claude/agents/` (user), plugin `agents/`, `--agents` JSON, managed. Priority: managed > CLI > project > user > plugin.
+- Identity comes from the `name` **frontmatter field**, not the filename. Duplicate names in one scope: one silently wins.
+- Files added on disk load at **session start** — restart after manual edits (`/agents`-created ones apply immediately).
+- `name` and `description` are required frontmatter; delegation quality depends on `description` (add "Use proactively…" phrasing).
+- Plugin agents ignore `hooks`, `mcpServers`, `permissionMode` frontmatter (security restriction) — copy into `.claude/agents/` if needed.
+- Explore/Plan built-ins skip CLAUDE.md — restate critical instructions in the delegating prompt or agent body.
+
+## MCP Server Issues
+
+- `/mcp` shows status. Project `.mcp.json` servers need one-time approval — a dismissed prompt leaves them disabled until approved from `/mcp`.
+- `.mcp.json` lives at the **repo root**, not inside `.claude/`; `settings.json` has no `mcpServers` key (use `claude mcp add --scope user` for user scope).
+- Failed to start: usually relative paths in `command`/`args` (resolve against launch dir) — use absolute paths.
+- Connected but zero tools: Reconnect from `/mcp`; persists → `claude --debug mcp` for stderr.
+- Server missing env vars: settings `env` doesn't propagate to MCP child processes — set per-server `env` in `.mcp.json`.
+
+## Plugin Issues
+
+```bash
+claude plugin validate ./my-plugin            # schema check; warnings for unknown fields
+claude plugin validate ./my-plugin --strict   # warnings become errors — use in CI
+claude plugin list                            # what's installed/loaded
+```
+
+- Unrecognized plugin.json fields = warnings (plugin still loads); wrong **types** (e.g. `keywords` as string) = load errors.
+- `.claude-plugin/` holds only `plugin.json` (and marketplace.json); all component dirs (`skills/`, `agents/`, `hooks/`, `commands/`, `output-styles/`) sit at the **plugin root**.
+- Plugin `CLAUDE.md` is NOT loaded — ship context as a skill instead.
+- In headless: the stream-json `system/init` event lists `plugins` and `plugin_errors` — assert on it in CI.
+- `/plugin` manages enable/disable; `defaultEnabled: false` plugins install disabled (v2.1.154+).
+- Plugin component changes need `/reload-plugins` (skills text is the only live-reloaded piece).
+
+## Permission Surprises
+
+- `/permissions` shows resolved rules. Precedence: managed always wins; then local > project > user; flags/env override files.
+- `settings.local.json` silently overrides `settings.json` — the classic "my setting is ignored".
+- `Bash(rm *)` deny matches the **literal command string** — `/bin/rm`, `find -delete` sail past. Hard guarantees need a PreToolUse hook or sandboxing.
+- Prefix rules: trailing space matters — `Bash(git diff *)` vs `Bash(git diff*)` (also matches `git diff-index`).
+
+## Common Causes Cheat Sheet
+
+| Symptom | Likely cause |
+|---|---|
+| Global hooks/permissions/env ignored | Put in `~/.claude.json` instead of `~/.claude/settings.json` |
+| settings.json value ignored | Same key in `settings.local.json` |
+| Subdirectory CLAUDE.md ignored | Loads on demand when Claude Reads a file there, not at startup |
+| Cleanup never runs at session end | No `SessionEnd` hook configured |
+| Skill at `skills/name.md` invisible | Needs folder + `SKILL.md` |
+| Hook with array matcher dropped | Matcher must be a string |
+| MCP under `.claude/.mcp.json` never loads | Move to repo root |
+
+## Debug Logs
+
+- `--debug-file <path>` pins location; `CLAUDE_CODE_DEBUG_LOGS_DIR` sets the directory.
+- Session transcripts (JSONL) live under `~/.claude/projects/<encoded-cwd>/` — greppable for tool inputs/outputs and hook activity.
+- `claude project purge [path] --dry-run` previews clearing all local state for a project.

+ 197 - 0
skills/claude-code-internals/references/headless-reference.md

@@ -0,0 +1,197 @@
+# Headless Reference (`claude -p`)
+
+Verified against https://code.claude.com/docs/en/headless and https://code.claude.com/docs/en/cli-reference (June 2026).
+
+`claude -p` runs the same agent loop as interactive Claude Code via the Agent SDK. For full programmatic control (callbacks, message objects, structured outputs API), use the Python/TypeScript Agent SDK — this page covers the CLI surface.
+
+> Billing note: from June 15, 2026, `claude -p` / Agent SDK usage on subscription plans draws from a separate monthly Agent SDK credit, not interactive limits.
+
+## Core Flags
+
+| Flag | Purpose |
+|---|---|
+| `-p`, `--print` | Non-interactive mode; print result and exit |
+| `--output-format` | `text` (default) \| `json` \| `stream-json` |
+| `--input-format` | `text` (default) \| `stream-json` (print mode) |
+| `--json-schema '<schema>'` | Validated structured output (with `--output-format json`); result lands in `structured_output` |
+| `--include-partial-messages` | Token-level streaming events (needs `-p` + `stream-json`) |
+| `--include-hook-events` | Hook lifecycle events in the stream (needs `stream-json`) |
+| `--replay-user-messages` | Echo stdin user messages back on stdout (stream-json in/out) |
+| `-c`, `--continue` | Continue most recent conversation in this directory |
+| `-r`, `--resume <id\|name>` | Resume a specific session |
+| `--fork-session` | New session ID when resuming (don't mutate the original) |
+| `--session-id <uuid>` | Pin the session ID |
+| `--no-session-persistence` | Don't save the session to disk (print mode) |
+| `--allowedTools` / `--allowed-tools` | Auto-approve matching tools (permission-rule syntax) |
+| `--disallowedTools` | Deny rules; bare name removes the tool from context, `Bash(rm *)` denies matching calls only |
+| `--tools "Bash,Edit,Read"` | Restrict available built-in tools (`""` = none, `"default"` = all). MCP unaffected — pair with `--disallowedTools "mcp__*"` |
+| `--permission-mode` | `default` \| `acceptEdits` \| `plan` \| `auto` \| `dontAsk` \| `bypassPermissions` |
+| `--dangerously-skip-permissions` | = `--permission-mode bypassPermissions` |
+| `--permission-prompt-tool <mcp_tool>` | Delegate permission prompts to an MCP tool |
+| `--max-turns N` | Cap agentic turns (print mode); exits with error at the cap |
+| `--max-budget-usd N` | Spend cap (print mode) |
+| `--model` / `--fallback-model sonnet,haiku` | Model + ordered fallback chain |
+| `--effort low\|medium\|high\|xhigh\|max` | Effort level |
+| `--system-prompt(-file)` / `--append-system-prompt(-file)` | Replace / append system prompt (replace drops ALL default guidance) |
+| `--agents '<json>'` | Define session subagents inline (subagent frontmatter fields + `prompt`) |
+| `--agent <name>` | Run a specific agent as the main session |
+| `--settings <file\|json>` | Override settings for this invocation |
+| `--setting-sources user,project,local` | Limit which settings files load |
+| `--mcp-config <file\|json>` / `--strict-mcp-config` | Load MCP servers / ONLY those servers |
+| `--add-dir <paths>` | Extra working directories |
+| `--bare` | Minimal mode: skip hooks/skills/plugins/MCP/CLAUDE.md auto-discovery |
+| `--bg` | Run as a background agent, return immediately (prints session ID); `--bg --exec 'cmd'` runs a shell job |
+| `--verbose` | Turn-by-turn output (required for some stream-json modes) |
+| `--init` / `--init-only` / `--maintenance` | Run Setup hooks (print mode); `--init-only` exits after hooks |
+| `--exclude-dynamic-system-prompt-sections` | Move per-machine prompt sections to first user message (prompt-cache reuse across machines) |
+
+Stdin is capped at **10MB** (v2.1.128+); larger inputs go in a file referenced by path.
+
+## Bare Mode (recommended for CI/scripts)
+
+```bash
+claude --bare -p "Summarize this file" --allowedTools "Read"
+```
+
+Skips auto-discovery of hooks, skills, plugins, MCP servers, auto memory, and CLAUDE.md — same result on every machine; only explicit flags apply. Tools available: Bash, file read, file edit. Pass context back in explicitly: `--append-system-prompt(-file)`, `--settings`, `--mcp-config`, `--agents`, `--plugin-dir`/`--plugin-url`.
+
+Caveat: bare mode skips OAuth/keychain reads — auth must come from `ANTHROPIC_API_KEY` or an `apiKeyHelper` in `--settings`. Docs state `--bare` will become the default for `-p` in a future release.
+
+`--safe-mode` is the interactive cousin (troubleshooting, not speed): customizations off, auth/permissions normal. See debugging-reference.md.
+
+## Output Formats
+
+### `json`
+
+```json
+{
+  "type": "result",
+  "subtype": "success",
+  "result": "…final text…",
+  "structured_output": { },
+  "session_id": "abc-123",
+  "is_error": false,
+  "num_turns": 3,
+  "duration_ms": 12345,
+  "total_cost_usd": 0.0123,
+  "usage": { }
+}
+```
+
+```bash
+claude -p "Summarize this project" --output-format json | jq -r '.result'
+session_id=$(claude -p "Start a review" --output-format json | jq -r '.session_id')
+```
+
+`total_cost_usd` includes a per-model cost breakdown — track spend per invocation.
+
+### Structured output (`--json-schema`)
+
+```bash
+claude -p "Extract the main function names from auth.py" \
+  --output-format json \
+  --json-schema '{"type":"object","properties":{"functions":{"type":"array","items":{"type":"string"}}},"required":["functions"]}' \
+  | jq '.structured_output'
+```
+
+The agent completes its full workflow, then the response is validated against the schema; metadata (session ID, usage) wraps the `structured_output` field.
+
+### `stream-json`
+
+Newline-delimited JSON events. Token streaming needs `--verbose --include-partial-messages`:
+
+```bash
+claude -p "Write a poem" --output-format stream-json --verbose --include-partial-messages | \
+  jq -rj 'select(.type == "stream_event" and .event.delta.type? == "text_delta") | .event.delta.text'
+```
+
+Notable event types:
+
+- `system` / subtype `init` — first event; reports model, tools, MCP servers, `plugins` (loaded: `name`,`path`) and `plugin_errors` (`plugin`,`type`,`message`) — **fail CI on `plugin_errors`**.
+- `system` / subtype `api_retry` — retryable API failure: `attempt`, `max_retries`, `retry_delay_ms`, `error_status`, `error` (`rate_limit`, `overloaded`, `server_error`, …).
+- `system` / subtype `plugin_install` — with `CLAUDE_CODE_SYNC_PLUGIN_INSTALL` set: `status` `started|installed|failed|completed`.
+- `assistant` / `user` message events; `stream_event` partial deltas; final `result` event mirroring the `json` payload.
+
+### `stream-json` input
+
+`--input-format stream-json` accepts a stream of user messages on stdin for multi-turn programmatic driving; pair with `--output-format stream-json --verbose` (and `--replay-user-messages` for acks).
+
+## Patterns
+
+Pipe data:
+
+```bash
+cat build-error.txt | claude -p 'explain the root cause of this build error' > explanation.txt
+git diff main | claude -p "you are a typo linter. report filename:line + issue per typo. nothing else."
+```
+
+Multi-turn:
+
+```bash
+claude -p "Review this codebase for performance issues"
+claude -p "Now focus on the database queries" --continue
+session=$(claude -p "Start a review" --output-format json | jq -r '.session_id')
+claude -p "Continue that review" --resume "$session"
+```
+
+Scoped git permissions (note the space before `*` — `Bash(git diff *)` ≠ `Bash(git diff*)`, the latter also matches `git diff-index`):
+
+```bash
+claude -p "Look at my staged changes and create an appropriate commit" \
+  --allowedTools "Bash(git diff *),Bash(git log *),Bash(git status *),Bash(git commit *)"
+```
+
+Locked-down CI:
+
+```bash
+claude --bare -p "Apply the lint fixes" --permission-mode acceptEdits
+# dontAsk: deny anything not explicitly allowed — strictest non-interactive baseline
+claude --bare -p "Audit only, change nothing" --permission-mode dontAsk --allowedTools "Read,Grep,Glob"
+```
+
+GitHub Actions review job (modernized):
+
+```yaml
+- name: Claude review
+  run: |
+    gh pr diff "$PR" | claude --bare -p \
+      --append-system-prompt "You are a security engineer. Review for vulnerabilities." \
+      --output-format json --allowedTools "Read,Grep" > review.json
+    jq -e '.is_error == false' review.json
+    jq -r '.result' review.json > review.md
+  env:
+    ANTHROPIC_API_KEY: ${{ secrets.ANTHROPIC_API_KEY }}
+    GH_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+    PR: ${{ github.event.pull_request.number }}
+```
+
+Error handling:
+
+```bash
+result=$(claude -p "Task" --output-format json) || { echo "claude exited non-zero" >&2; exit 1; }
+if [[ $(jq -r '.is_error' <<<"$result") == "true" ]]; then
+  echo "Error: $(jq -r '.result' <<<"$result")" >&2; exit 1
+fi
+```
+
+Background agents:
+
+```bash
+claude --bg "investigate the flaky test"   # prints session ID
+claude agents --json                        # list background sessions
+claude logs <id>; claude attach <id>; claude stop <id>; claude respawn <id>
+```
+
+Skills in prompts: `claude -p "/my-skill arg1"` expands before running. Interactive built-ins (`/config`, `/login`) are unavailable in `-p`.
+
+Background Bash tasks started during a `-p` run (dev servers, watchers) are terminated ~5s after the final result (v2.1.163+).
+
+## CI Auth
+
+- API key: `ANTHROPIC_API_KEY` env var.
+- Subscription: `claude setup-token` generates a long-lived OAuth token for CI/scripts.
+- Bedrock/Vertex/Foundry: usual provider credentials.
+
+## Agent SDK
+
+For retries, hooks-as-callbacks, custom tools, native message objects, and structured outputs beyond `--json-schema`, use the Agent SDK packages: `@anthropic-ai/claude-agent-sdk` (TypeScript) / `claude-agent-sdk` (Python). Docs: https://code.claude.com/docs/en/agent-sdk/overview. The CLI flags above map 1:1 onto SDK options.

+ 325 - 0
skills/claude-code-internals/references/hooks-reference.md

@@ -0,0 +1,325 @@
+# Hooks Reference
+
+Complete hook system reference, verified against https://code.claude.com/docs/en/hooks (June 2026).
+
+> **Stale contract warning:** old guides describe a `$TOOL_INPUT` env-var contract. That is dead.
+> Hooks receive a **JSON payload on stdin** and respond via **exit code + stdout JSON**.
+
+## Event Catalog
+
+| Event | Fires | Matcher matches | Blocking (exit 2) |
+|---|---|---|---|
+| `SessionStart` | Session begins | `startup`, `resume`, `clear`, `compact` | No |
+| `SessionEnd` | Session ends | End reason: `clear`, `resume`, `logout`, `prompt_input_exit`, ... | No |
+| `Setup` | Only with `--init`/`--init-only`/`--maintenance` | `init`, `maintenance` | No |
+| `UserPromptSubmit` | Before prompt is processed (30s default timeout) | (none) | Yes — prompt rejected and erased |
+| `UserPromptExpansion` | When a `/command` expands | Command/skill name | Yes — blocks expansion |
+| `PreToolUse` | Before each tool call | Tool name | Yes — tool call prevented |
+| `PermissionRequest` | When a permission dialog would show | Tool name | Yes — permission denied |
+| `PermissionDenied` | After a tool call is denied | Tool name | No (`retry` output supported) |
+| `PostToolUse` | After tool succeeds | Tool name | No (stderr fed back; tool already ran) |
+| `PostToolUseFailure` | After tool fails | Tool name | No (tool already failed) |
+| `PostToolBatch` | After a parallel tool batch resolves | (none) | Yes — stops loop before next model call |
+| `Stop` | Main agent finishes a turn | (none) | Yes — prevents stop, conversation continues |
+| `StopFailure` | Turn ends in API error | Error type (`rate_limit`, `overloaded`, ...) | No (output ignored) |
+| `SubagentStart` / `SubagentStop` | Subagent spawn / finish | Agent type (e.g. `Explore`, custom names) | Start: No / Stop: Yes |
+| `TaskCreated` / `TaskCompleted` | Task list changes | (none) | Yes — rolls back / prevents completion |
+| `TeammateIdle` | Agent-team teammate goes idle | (none) | Yes — prevents idle |
+| `Notification` | Notification sent | `permission_prompt`, `idle_prompt`, `auth_success`, ... | No |
+| `MessageDisplay` | While a message streams (10s timeout) | (none) | No (`displayContent` rewrite, screen-only) |
+| `ConfigChange` | Settings file changed mid-session | `user_settings`, `project_settings`, `local_settings`, `policy_settings`, `skills` | Yes — blocks the change (except policy) |
+| `CwdChanged` | Working directory changes | (none) | No (`CLAUDE_ENV_FILE` available) |
+| `FileChanged` | Watched file changes | Literal filenames, e.g. `.envrc\|.env` | No (`CLAUDE_ENV_FILE` available) |
+| `PreCompact` / `PostCompact` | Before / after compaction | `manual`, `auto` | Pre: Yes — blocks compaction / Post: No |
+| `InstructionsLoaded` | CLAUDE.md / rules file loads | Load reason: `session_start`, `nested_traversal`, `path_glob_match`, `include`, `compact` | No (async, observability) |
+| `WorktreeCreate` / `WorktreeRemove` | Worktree lifecycle | (none) | Create: any non-zero fails creation |
+| `Elicitation` / `ElicitationResult` | MCP server requests user input / response | MCP server name | Yes — denies / blocks response |
+
+## Stdin JSON Contract
+
+Every hook gets JSON on stdin. Common fields:
+
+```json
+{
+  "session_id": "…",
+  "transcript_path": "/abs/path/to/transcript.jsonl",
+  "cwd": "/working/dir",
+  "hook_event_name": "PreToolUse",
+  "permission_mode": "default|plan|acceptEdits|auto|dontAsk|bypassPermissions"
+}
+```
+
+`agent_id` / `agent_type` appear in subagent context; `effort: {"level": "low|medium|high|xhigh|max"}` on tool-context events.
+
+Event-specific fields:
+
+| Event | Extra stdin fields |
+|---|---|
+| Tool events (`PreToolUse`, `PermissionRequest`, `PermissionDenied`) | `tool_name`, `tool_input` (tool-specific object) |
+| `PostToolUse` | + `tool_output` (string or object) |
+| `PostToolUseFailure` | + `error_message` |
+| `PermissionDenied` | + `denial_reason` |
+| `SessionStart` | `source` (`startup\|resume\|clear\|compact`), `model` |
+| `Setup` | `trigger` (`init\|maintenance`) |
+| `UserPromptSubmit` | `prompt` |
+| `UserPromptExpansion` | `command`, `expansion` |
+| `StopFailure` | `error_type`, `error_message` |
+| `SubagentStart`/`SubagentStop` | `agent_type`, `agent_id` |
+| `Notification` | `notification_type`, `message` |
+| `ConfigChange` | `source` |
+| `FileChanged` | `file_path`, `change_type` (`create\|modify\|delete`) |
+| `CwdChanged` | `directory` |
+| `TaskCreated`/`TaskCompleted` | `task_id`, `task_title` |
+| `InstructionsLoaded` | `file_path`, `memory_type`, `load_reason`, `globs?`, `trigger_file_path?`, `parent_file_path?` |
+| `Elicitation` | `server_name`, `form_fields[]` (`name`, `type`, `label`, `required`) |
+| `WorktreeCreate`/`WorktreeRemove` | `worktree_id`, `worktree_path` (remove) |
+
+Read it with jq:
+
+```bash
+#!/bin/bash
+INPUT=$(cat)
+TOOL=$(echo "$INPUT" | jq -r '.tool_name')
+CMD=$(echo "$INPUT" | jq -r '.tool_input.command // empty')
+```
+
+## Exit Codes
+
+| Code | Meaning |
+|---|---|
+| `0` | Success. stdout parsed as JSON if valid, else treated as plain text context |
+| `2` | Blocking error. stdout ignored; **stderr** is fed to Claude as feedback |
+| other | Non-blocking error. stderr logged, first line shown in transcript, execution continues |
+
+## Stdout JSON Contract (exit 0)
+
+Universal fields (any event):
+
+```json
+{
+  "continue": false,              // false stops Claude entirely
+  "stopReason": "why we stopped", // shown when continue:false
+  "suppressOutput": true,         // hide stdout from transcript
+  "systemMessage": "warning shown to user",
+  "terminalSequence": "]…"  // raw OSC sequence (v2.1.141+)
+}
+```
+
+Event-specific output goes inside `hookSpecificOutput` with a required `hookEventName`:
+
+| Event | `hookSpecificOutput` fields |
+|---|---|
+| `PreToolUse` | `permissionDecision: allow\|deny\|ask\|defer`, `permissionDecisionReason`, `updatedInput` (replace tool args), `additionalContext` |
+| `PermissionRequest` | `decision: { "behavior": "allow\|deny", "updatedInput": {…} }` |
+| `PermissionDenied` | `retry: true` (let the model retry) |
+| `PostToolUse` | `updatedToolOutput` (replace the result Claude sees), `additionalContext` |
+| `SessionStart` / `SubagentStart` | `additionalContext`, `watchPaths: [...]` (feeds `FileChanged`), `reloadSkills: true`; SessionStart only: `sessionTitle`, `initialUserMessage` |
+| `Stop` / `SubagentStop` | `additionalContext` (inject feedback and continue) |
+| `PostToolBatch` / `Setup` | `additionalContext` |
+| `MessageDisplay` | `displayContent` (screen-only rewrite, transcript untouched) |
+| `Elicitation` / `ElicitationResult` | `action: accept\|decline\|cancel`, `content: {field: value}` |
+| `WorktreeCreate` | `worktreePath` (or print path on stdout for command hooks) |
+
+Top-level `decision`/`reason` pattern (alternative to exit 2) for `UserPromptSubmit`, `UserPromptExpansion`, `PostToolUse`, `PostToolUseFailure`, `PostToolBatch`, `ConfigChange`, `PreCompact`, `Stop`, `SubagentStop`:
+
+```json
+{ "decision": "block", "reason": "explanation Claude sees" }
+```
+
+## Hook Types
+
+Five types. All accept `timeout` (seconds), `if` (permission-rule filter), `statusMessage`.
+
+### 1. `command` (default workhorse)
+
+```json
+{
+  "type": "command",
+  "command": "${CLAUDE_PROJECT_DIR}/.claude/hooks/check.sh",
+  "args": ["--fast"],
+  "async": false,
+  "asyncRewake": false,
+  "shell": "bash",
+  "timeout": 600
+}
+```
+
+- **Shell form** (no `args`): command string runs via shell (`sh -c`, Git Bash, or PowerShell with `shell: "powershell"`); pipes, `&&`, globs work.
+- **Exec form** (with `args`): resolved as executable on PATH, spawned directly, no shell.
+- `async: true`: background, never blocks, output discarded.
+- `asyncRewake: true`: background, but **wakes Claude on exit 2** with stderr as a system reminder. Implies async.
+
+### 2. `http`
+
+```json
+{ "type": "http", "url": "https://hooks.internal/check",
+  "headers": {"Authorization": "$HOOK_TOKEN"}, "allowedEnvVars": ["HOOK_TOKEN"] }
+```
+
+POST; 2xx = success (body parsed as JSON or plain text); non-2xx = non-blocking error. Env interpolation in headers requires `allowedEnvVars`.
+
+### 3. `mcp_tool`
+
+```json
+{ "type": "mcp_tool", "server": "my-server", "tool": "validate",
+  "input": {"cmd": "${tool_input.command}"} }
+```
+
+Calls a configured MCP server's tool; `${path.to.field}` interpolates from the stdin payload.
+
+### 4. `prompt` (default timeout 30s)
+
+```json
+{ "type": "prompt", "prompt": "Is this command destructive? $ARGUMENTS", "model": "haiku" }
+```
+
+One-shot yes/no judgment by a fast model; returns the decision as JSON.
+
+### 5. `agent` (default timeout 60s)
+
+```json
+{ "type": "agent", "prompt": "Verify the edited file still compiles. $ARGUMENTS" }
+```
+
+Spawns a subagent with tool access; returns a decision.
+
+## Matchers
+
+| Pattern | Interpreted as |
+|---|---|
+| `"*"`, `""`, omitted | Match everything |
+| Letters/digits/`_`/`\|` only | Exact name or `\|`-list: `Bash`, `Edit\|Write` |
+| Anything else | JavaScript regex: `^Notebook`, `mcp__memory__.*` |
+
+- Matching is **case-sensitive** (`bash` never matches `Bash`).
+- `matcher` must be a **string**, not an array — an array is a schema error and the hook is silently dropped (visible in `/doctor`).
+- MCP tools: `mcp__<server>__<tool>`; match a whole server with regex `mcp__memory__.*`.
+
+### `if` filters
+
+Narrow within a matched event using permission-rule syntax — `"if": "Bash(git *)"` runs only for git commands (subcommands inside `&&` chains and `$()` are checked; leading `FOO=bar` assignments stripped). `Edit(*.ts)` filters by file pattern. **Fails open** if the command can't be parsed — use the permission system for hard enforcement.
+
+## Where Hooks Live
+
+| Location | Scope |
+|---|---|
+| `~/.claude/settings.json` | All your projects |
+| `.claude/settings.json` | Project (committed) |
+| `.claude/settings.local.json` | Project (gitignored) |
+| Managed policy settings | Organization |
+| Plugin `hooks/hooks.json` (or inline in plugin.json) | While plugin enabled |
+| **Skill or agent frontmatter** | While that component is active |
+
+Settings-file shape:
+
+```json
+{
+  "hooks": {
+    "PreToolUse": [
+      { "matcher": "Bash",
+        "hooks": [ { "type": "command", "command": "${CLAUDE_PROJECT_DIR}/.claude/hooks/check.sh" } ] }
+    ]
+  }
+}
+```
+
+Skill/agent frontmatter shape (YAML):
+
+```yaml
+---
+name: secure-operations
+hooks:
+  PreToolUse:
+    - matcher: "Bash"
+      hooks:
+        - type: command
+          command: "./scripts/security-check.sh"
+---
+```
+
+For subagents, `Stop` hooks auto-convert to `SubagentStop`. Plugin **subagents** ignore `hooks` frontmatter (security restriction).
+
+Edits to `settings.json` hooks take effect in the running session after a brief delay — no restart. `disableAllHooks: true` turns everything off (managed hooks only by managed-level setting). Identical handlers are deduplicated. Browse live config with `/hooks`.
+
+## Environment Variables in Hooks
+
+| Variable | Available | Value |
+|---|---|---|
+| `CLAUDE_PROJECT_DIR` | All hooks | Project root |
+| `CLAUDE_PLUGIN_ROOT` | Plugin hooks | Plugin install dir (changes on update) |
+| `CLAUDE_PLUGIN_DATA` | Plugin hooks | Persistent data dir (survives updates) |
+| `CLAUDE_ENV_FILE` | `SessionStart`, `Setup`, `CwdChanged`, `FileChanged` | File to append `export VAR=…` lines; persists into later Bash calls |
+| `CLAUDE_EFFORT` | Tool-context events | `low`…`max` |
+| `CLAUDE_CODE_REMOTE` | All | `"true"` on web, unset locally |
+
+Path placeholders in hook config: `${CLAUDE_PROJECT_DIR}`, `${CLAUDE_PLUGIN_ROOT}`, `${CLAUDE_PLUGIN_DATA}`, and (plugins) `${user_config.*}`.
+
+## Recipes
+
+Block dangerous commands (PreToolUse on `Bash`):
+
+```bash
+#!/bin/bash
+CMD=$(jq -r '.tool_input.command // empty')
+if echo "$CMD" | grep -qE 'rm -rf /|git push --force.*(main|master)'; then
+  jq -n '{hookSpecificOutput: {hookEventName: "PreToolUse",
+          permissionDecision: "deny",
+          permissionDecisionReason: "Destructive command blocked by policy"}}'
+fi
+exit 0
+```
+
+Audit log (PostToolUse, matcher `*`):
+
+```bash
+#!/bin/bash
+cat | jq -c '{ts: now|todate, tool: .tool_name, input: .tool_input}' >> ~/.claude/audit.jsonl
+exit 0
+```
+
+Inject project context + reload skills at session start (SessionStart):
+
+```bash
+#!/bin/bash
+jq -n --arg ctx "$(git -C "$CLAUDE_PROJECT_DIR" status --short)" \
+  '{hookSpecificOutput: {hookEventName: "SessionStart", additionalContext: $ctx, reloadSkills: true}}'
+```
+
+Force a re-check before stopping (Stop):
+
+```bash
+#!/bin/bash
+if ! npm test --silent >/dev/null 2>&1; then
+  echo "Tests are failing — fix them before finishing." >&2
+  exit 2
+fi
+exit 0
+```
+
+Rewrite tool input (PreToolUse `updatedInput`):
+
+```bash
+#!/bin/bash
+INPUT=$(cat)
+SAFE=$(echo "$INPUT" | jq '.tool_input.command |= sub("^pip install"; "uv pip install")')
+jq -n --argjson ti "$(echo "$SAFE" | jq '.tool_input')" \
+  '{hookSpecificOutput: {hookEventName: "PreToolUse", updatedInput: $ti}}'
+```
+
+## Security Checklist
+
+- Quote all shell variables (`"$VAR"`); validate paths (reject `..` traversal)
+- Use `${CLAUDE_PROJECT_DIR}` instead of relative paths (hooks run from varying cwd)
+- Keep `UserPromptSubmit` hooks fast (30s default timeout); set explicit `timeout` elsewhere
+- `set -euo pipefail` plus jq fallbacks (`// empty`) so malformed payloads don't crash into exit-2 blocks
+- Remember hooks execute arbitrary code with your credentials — review hooks in any repo before trusting it
+
+## Debugging Hooks
+
+```bash
+claude --debug hooks      # live: events fired, matchers checked, exit codes, output
+/hooks                    # what's registered this session
+echo '{"tool_name":"Bash","tool_input":{"command":"ls"}}' | ./my-hook.sh; echo "exit=$?"
+```
+
+See [debugging-reference.md](debugging-reference.md) for the hook-not-firing decision tree.

+ 147 - 0
skills/claude-code-internals/references/skills-reference.md

@@ -0,0 +1,147 @@
+# Skills Reference
+
+Agent Skills spec as implemented by Claude Code, verified against https://code.claude.com/docs/en/skills (June 2026).
+
+Key model change vs older guides: **custom commands have been merged into skills.** `.claude/commands/deploy.md` and `.claude/skills/deploy/SKILL.md` both create `/deploy`. Skills follow the [Agent Skills](https://agentskills.io) open standard; Claude Code extends it with invocation control, subagent execution, and dynamic context injection.
+
+## Where Skills Live
+
+| Location | Path | Scope |
+|---|---|---|
+| Enterprise | managed settings dir | Organization |
+| Personal | `~/.claude/skills/<name>/SKILL.md` | All your projects |
+| Project | `.claude/skills/<name>/SKILL.md` | This project |
+| Plugin | `<plugin>/skills/<name>/SKILL.md` | Where enabled (namespaced `plugin:skill`) |
+
+- Same-name conflicts: enterprise > personal > project. Plugin skills are namespaced so they never conflict. A skill beats a same-named `.claude/commands/` file.
+- Project skills also load from `.claude/skills/` in **parent** directories up to the repo root, and on demand from **nested** `.claude/skills/` when working in subdirectories (monorepos).
+- `--add-dir` / `/add-dir` directories DO load their `.claude/skills/` (exception to the file-access-only rule); the `permissions.additionalDirectories` setting does NOT.
+- **Live change detection:** edits to SKILL.md under watched skill directories take effect within the session, no restart. A brand-new top-level skills directory needs a restart. For skill-folders-as-plugins, `hooks/`, `.mcp.json`, `agents/` changes need `/reload-plugins`.
+- Add `.claude-plugin/plugin.json` inside a skill folder and it loads as a plugin named `<name>@skills-dir` (can then bundle agents, hooks, MCP servers).
+
+## Frontmatter Reference (full, current)
+
+All fields optional; `description` strongly recommended.
+
+| Field | Meaning |
+|---|---|
+| `name` | Display name in listings. Defaults to directory name. Does NOT change the `/command` you type (except plugin-root SKILL.md, where it does). |
+| `description` | What it does + when to use. Claude's trigger signal. If omitted, first paragraph of body is used. |
+| `when_to_use` | Extra trigger context (phrases, example requests). Appended to description in the listing. Combined description+when_to_use truncated at **1,536 chars** in the listing. |
+| `argument-hint` | Autocomplete hint, e.g. `[issue-number]` or `[filename] [format]`. |
+| `arguments` | Named positional args for `$name` substitution. Space-separated string or YAML list; names map to positions in order. |
+| `disable-model-invocation` | `true` = only the user can invoke (`/name`). Removes description from Claude's context entirely; also prevents preloading into subagents. Use for side-effect workflows (`/deploy`, `/commit`). Default `false`. |
+| `user-invocable` | `false` = hidden from the `/` menu; only Claude can invoke. For background knowledge. Default `true`. Menu visibility only — does not block the Skill tool. |
+| `allowed-tools` | Tools usable **without permission prompts** while the skill is active (grant, not restriction). Space/comma-separated string or YAML list. Permission-rule syntax works: `Bash(git add *)`. For project skills, requires workspace trust. |
+| `disallowed-tools` | Tools **removed from the pool** while active. Restriction clears on your next message. |
+| `model` | Model while active (same values as `/model`, or `inherit`). Applies for the rest of the turn; session model resumes next prompt. |
+| `effort` | `low`/`medium`/`high`/`xhigh`/`max` while active; overrides session effort. |
+| `context` | `fork` = run in a forked subagent context; the skill body becomes the subagent's prompt (no conversation history). |
+| `agent` | Subagent type when `context: fork` (`Explore`, `Plan`, `general-purpose`, or custom). Default `general-purpose`. |
+| `hooks` | Hooks scoped to the skill's lifecycle (same YAML shape as settings hooks; see hooks-reference.md). |
+| `paths` | Glob patterns gating auto-activation: skill loads only when working with matching files. Comma-separated string or YAML list. |
+| `shell` | `bash` (default) or `powershell` for `` !`cmd` `` injection. `powershell` requires `CLAUDE_CODE_USE_POWERSHELL_TOOL=1`. |
+| `license` | License identifier (Agent Skills spec field). |
+| `compatibility` | Environment requirements (Agent Skills spec field). |
+| `metadata` | Arbitrary key/value strings (author, etc.). |
+
+### Invocation matrix
+
+| Frontmatter | You invoke | Claude invokes | Context cost |
+|---|---|---|---|
+| (default) | Yes | Yes | Description always in context; body loads on invoke |
+| `disable-model-invocation: true` | Yes | No | Nothing in context until you invoke |
+| `user-invocable: false` | No | Yes | Description always in context |
+
+## String Substitutions
+
+| Token | Expands to |
+|---|---|
+| `$ARGUMENTS` | Full argument string as typed. If absent from body, args are appended as `ARGUMENTS: <value>` |
+| `$ARGUMENTS[N]` | N-th argument, 0-based, shell-style quoting (`"hello world"` = one arg) |
+| `$N` (`$0`, `$1`, …) | Shorthand for `$ARGUMENTS[N]` — **note: `$0` is the FIRST argument** |
+| `$name` | Named arg declared in `arguments:` frontmatter |
+| `${CLAUDE_SESSION_ID}` | Current session ID |
+| `${CLAUDE_EFFORT}` | Current effort level (`low`…`max`; ultracode reports `xhigh`) |
+| `${CLAUDE_SKILL_DIR}` | Directory containing this SKILL.md — use for bundled script paths |
+
+Escape a literal dollar before a digit/`ARGUMENTS`/declared name with a single backslash: `\$1.00`.
+
+## Dynamic Context Injection
+
+`` !`command` `` runs **before** Claude sees the skill content; output replaces the placeholder. This is preprocessing — Claude never executes it.
+
+```yaml
+---
+name: pr-summary
+description: Summarize changes in a pull request
+context: fork
+agent: Explore
+allowed-tools: Bash(gh *)
+---
+
+- PR diff: !`gh pr diff`
+- Changed files: !`gh pr diff --name-only`
+```
+
+Rules:
+- `!` must be at line start or after whitespace (`` KEY=!`cmd` `` stays literal).
+- Single pass: emitted output is not re-scanned for placeholders.
+- Multi-line commands: fenced block opened with ```` ```! ````.
+- `"disableSkillShellExecution": true` in settings replaces each command with `[shell command execution disabled by policy]` (bundled/managed skills unaffected).
+- Include the word `ultrathink` anywhere in the body to request deeper reasoning when the skill runs.
+
+## Skill Content Lifecycle (token economics)
+
+- Invoked skill content enters as a single message and **stays for the whole session**; not re-read on later turns. Write standing instructions, not one-time steps.
+- Auto-compaction re-attaches each invoked skill's most recent invocation: first **5,000 tokens** per skill, **25,000-token combined budget**, most-recent first — older skills can drop entirely.
+- Keep SKILL.md under **500 lines**; push detail to supporting files referenced from SKILL.md (progressive disclosure).
+
+## Description Budget (why skills stop triggering)
+
+- All skill names always listed; descriptions fit a budget of **1% of the model context window** (raise via `skillListingBudgetFraction` setting, e.g. `0.02`, or `SLASH_COMMAND_TOOL_CHAR_BUDGET` env var for a fixed char count).
+- Per-skill cap: combined `description` + `when_to_use` truncated at 1,536 chars (configurable via `maxSkillDescriptionChars`). **Put key trigger phrases first.**
+- On overflow, least-invoked skills lose their descriptions first. `/doctor` shows whether the budget is overflowing and which skills are affected.
+- `skillOverrides` in settings (or `/skills` + `Space`): `"on"` / `"name-only"` / `"user-invocable-only"` / `"off"` per skill. Plugin skills are managed via `/plugin` instead.
+
+## Permission Control over Skills
+
+```text
+Skill                 # deny rule: disable Claude's skill invocation entirely
+Skill(commit)         # exact match
+Skill(review-pr *)    # prefix match with any args
+```
+
+## Skills x Subagents (two directions)
+
+| Approach | System prompt | Task | Also loads |
+|---|---|---|---|
+| Skill with `context: fork` | From `agent` type | SKILL.md content | CLAUDE.md, except Explore/Plan agents |
+| Subagent with `skills:` frontmatter | Agent's markdown body | Claude's delegation message | Full preloaded skill content + CLAUDE.md |
+
+Subagent `skills:` preloading injects **full skill content** at startup (not just descriptions). Skills with `disable-model-invocation: true` cannot be preloaded.
+
+## Skills in Headless Mode
+
+`/skill-name args` works inside a `-p` prompt string — Claude Code expands it before running:
+
+```bash
+claude -p "/summarize-changes" --output-format json
+```
+
+## Minimal Working Example
+
+```yaml
+---
+name: summarize-changes
+description: Summarizes uncommitted changes and flags risks. Use when the user asks what changed or wants a commit message.
+---
+
+## Current changes
+
+!`git diff HEAD`
+
+## Instructions
+
+Summarize the changes above in two or three bullets, then list risks.
+```

+ 0 - 0
skills/claude-code-headless/scripts/.gitkeep → skills/claude-code-internals/scripts/.gitkeep


+ 315 - 0
skills/playwright-ops/SKILL.md

@@ -0,0 +1,315 @@
+---
+name: playwright-ops
+description: "Playwright end-to-end testing operations - selectors, fixtures, network mocking, auth, parallelism, CI, visual regression, flake hunting. Use for: playwright, e2e test, end-to-end testing, browser test, getByRole, page object, storageState, trace viewer, flaky test, test sharding, visual regression, toHaveScreenshot, playwright config, codegen."
+license: MIT
+allowed-tools: "Read Write Bash"
+metadata:
+  author: claude-mods
+  related-skills: testing-ops, ci-cd-ops
+---
+
+# Playwright Operations
+
+End-to-end testing with Playwright Test (`@playwright/test`, TS/JS). A Python flavor
+(`pytest-playwright`) exists with the same browser API but pytest-style fixtures — patterns here
+translate directly; runner config does not.
+
+## Quick Start
+
+```bash
+npm init playwright@latest          # scaffold config + example test + GH Actions workflow
+npx playwright test                 # run all tests, all projects
+npx playwright test --project=chromium --grep "@smoke"
+npx playwright test --ui            # interactive UI mode (watch, time-travel)
+npx playwright codegen https://app.local   # record actions -> generated locators
+npx playwright show-report          # open last HTML report
+npx playwright show-trace trace.zip # inspect a trace
+```
+
+## Selector Strategy
+
+**Hierarchy — always prefer the highest tier that uniquely matches:**
+
+| Tier | Locator | When |
+|------|---------|------|
+| 1 | `page.getByRole('button', { name: 'Submit' })` | Anything with an ARIA role — buttons, links, headings, textboxes. Tests a11y for free |
+| 2 | `page.getByLabel('Password')` | Form fields with labels |
+| 3 | `page.getByPlaceholder('name@example.com')` | Inputs without labels (fix the label instead, when you can) |
+| 4 | `page.getByText('Welcome back')` | Non-interactive text content |
+| 5 | `page.getByTestId('cart-total')` | Stable hook when semantics don't disambiguate. Configure attribute via `testIdAttribute` |
+| 6 | `page.locator('css=...')` / `xpath=` | **Last resort.** Coupled to DOM structure; breaks on refactor |
+
+Why: tiers 1–4 locate the way a user perceives the page — resilient to markup changes, and
+`getByRole` fails loudly when accessibility regresses. CSS/XPath encode implementation detail.
+
+**Narrowing without CSS:**
+
+```ts
+page.getByRole('listitem')
+    .filter({ hasText: 'Product 2' })
+    .getByRole('button', { name: 'Add to cart' });
+
+page.getByRole('row').filter({ has: page.getByRole('cell', { name: 'Alice' }) });
+```
+
+### Web-First Assertions (no manual waits, ever)
+
+```ts
+// BAD — checks once, races the render; sleeps are flake factories
+expect(await page.getByText('welcome').isVisible()).toBe(true);
+await page.waitForTimeout(2000);
+
+// GOOD — auto-retries until pass or timeout
+await expect(page.getByText('welcome')).toBeVisible();
+await expect(page.getByRole('list')).toHaveCount(3);
+await expect(page).toHaveURL(/\/dashboard/);
+await expect.soft(page.getByTestId('status')).toHaveText('Active'); // don't stop test on failure
+```
+
+Actions (`click`, `fill`) auto-wait for actionability (visible, stable, enabled). If you feel the
+need for `waitForTimeout`, you're missing an assertion or an `await expect(...)` on a state change.
+For async non-DOM conditions use `expect.poll(() => fn())` or `expect(async () => {...}).toPass()`.
+
+Lint guard: enable `@typescript-eslint/no-floating-promises` — a missing `await` on an assertion is
+the most common silent-pass bug.
+
+## Config Skeleton
+
+Full production template with comments: [assets/playwright.config.template.ts](assets/playwright.config.template.ts)
+
+```ts
+import { defineConfig, devices } from '@playwright/test';
+
+export default defineConfig({
+  testDir: './tests',
+  fullyParallel: true,
+  forbidOnly: !!process.env.CI,
+  retries: process.env.CI ? 2 : 0,
+  workers: process.env.CI ? 1 : undefined,
+  reporter: process.env.CI ? 'blob' : 'html',
+  use: {
+    baseURL: process.env.BASE_URL ?? 'http://localhost:3000',
+    trace: 'on-first-retry',
+    testIdAttribute: 'data-testid',
+  },
+  projects: [
+    { name: 'setup', testMatch: /.*\.setup\.ts/ },
+    {
+      name: 'chromium',
+      use: { ...devices['Desktop Chrome'], storageState: 'playwright/.auth/user.json' },
+      dependencies: ['setup'],
+    },
+  ],
+  webServer: {
+    command: 'npm run dev',
+    url: 'http://localhost:3000',
+    reuseExistingServer: !process.env.CI,
+  },
+});
+```
+
+## Fixtures Decision Tree
+
+```
+What do I need to share/setup?
+│
+├─ Per-test object (page object, seeded record)
+│  └─ test.extend() test-scoped fixture — setup, await use(x), teardown
+│
+├─ Expensive, safe-to-share resource (DB pool, test account)
+│  └─ Worker-scoped: [fn, { scope: 'worker' }] — once per worker process
+│
+├─ Side effect every test needs (log capture, network stub)
+│  └─ Automatic: [fn, { auto: true }] — runs without being referenced
+│
+├─ Config-tunable value (locale, default item)
+│  └─ Option: ['default', { option: true }] — override in projects[].use
+│
+├─ Fixtures from several modules
+│  └─ mergeTests(testA, testB)
+│
+└─ Auth state per test file/role
+   └─ test.use({ storageState: 'playwright/.auth/admin.json' })
+```
+
+**POM-as-fixture (modern recommendation)** — page objects are fine; *instantiating them by hand in
+every test* is not. Inject via fixture:
+
+```ts
+// fixtures.ts
+import { test as base } from '@playwright/test';
+import { TodoPage } from './pages/todo-page';
+
+export const test = base.extend<{ todoPage: TodoPage }>({
+  todoPage: async ({ page }, use) => {
+    const todoPage = new TodoPage(page);
+    await todoPage.goto();
+    await use(todoPage);          // test body runs here
+  },
+});
+export { expect } from '@playwright/test';
+```
+
+Page objects should expose **locators and actions**, not assertions wrapped in try/catch, and never
+store element handles. Details: [references/fixtures-and-pom.md](references/fixtures-and-pom.md)
+
+## Network & API
+
+```
+Network need?
+│
+├─ Stub a third-party API           → page.route('**/api/**', r => r.fulfill({ json }))
+├─ Tweak a real response            → const res = await route.fetch(); route.fulfill({ response: res, json })
+├─ Simulate failure / offline       → route.abort() / route.fulfill({ status: 500 })
+├─ Many endpoints, real shapes      → HAR record + replay (page.routeFromHAR, update: true to record)
+├─ Pure API test (no browser)       → request fixture / APIRequestContext
+├─ Seed data fast, assert via UI    → hybrid: create via request, verify via page
+└─ WebSocket traffic                → page.routeWebSocket(url, ws => ws.onMessage(...))
+```
+
+**Hybrid seed-via-API, assert-via-UI** — the single biggest speed win in most suites:
+
+```ts
+test('shows new project', async ({ request, page }) => {
+  const res = await request.post('/api/projects', { data: { name: 'Apollo' } });
+  expect(res.ok()).toBeTruthy();
+  await page.goto('/projects');
+  await expect(page.getByRole('link', { name: 'Apollo' })).toBeVisible();
+});
+```
+
+Rule of thumb: **mock third-party dependencies you don't own; exercise your own backend for real**
+(or mock it deliberately in a separate "frontend-isolated" project).
+Details: [references/network-and-api.md](references/network-and-api.md)
+
+## Authentication
+
+Standard pattern — login once in a setup project, reuse `storageState` everywhere:
+
+```ts
+// tests/auth.setup.ts
+import { test as setup, expect } from '@playwright/test';
+
+setup('authenticate', async ({ page }) => {
+  await page.goto('/login');
+  await page.getByLabel('Username').fill(process.env.E2E_USER!);
+  await page.getByLabel('Password').fill(process.env.E2E_PASS!);
+  await page.getByRole('button', { name: 'Sign in' }).click();
+  await expect(page.getByTestId('user-menu')).toBeVisible();   // wait for auth to settle!
+  await page.context().storageState({ path: 'playwright/.auth/user.json' });
+});
+```
+
+| Pattern | Use when |
+|---------|----------|
+| One setup project + `storageState` in `use` | One shared account, tests don't mutate server-side user state |
+| Per-role files (`admin.json`, `user.json`) + `test.use({ storageState })` | Role-based behavior under test |
+| Worker-scoped account fixture (`testInfo.parallelIndex`) | Parallel tests mutate user state — one account per worker |
+| API login (`request.post` + `request.storageState`) | Login endpoint exists; 10x faster than UI login |
+
+Gotchas: add `playwright/.auth/` to `.gitignore`. `storageState` captures cookies +
+localStorage — **not sessionStorage** (persist that manually via `page.evaluate` + init script).
+Always assert a logged-in signal before saving state, or you save a half-logged-in race.
+
+## Parallelism, Retries, Isolation
+
+| Knob | Setting | Notes |
+|------|---------|-------|
+| Workers | `workers: process.env.CI ? 1 : undefined` | Local: one per ~CPU core. CI runners are small — shard machines instead of oversubscribing |
+| File-level parallel | `fullyParallel: true` | Also makes sharding split per-test, not per-file |
+| Sharding | `npx playwright test --shard=1/4` | One shard per CI machine; merge blob reports after |
+| Retries | `retries: process.env.CI ? 2 : 0` | Pair with `trace: 'on-first-retry'`; treat "flaky" status as a bug queue, not a fix |
+| Serial | `test.describe.configure({ mode: 'serial' })` | Smell — usually means hidden inter-test coupling |
+
+**Isolation discipline:** every test gets a fresh `context`/`page` (cookies, storage) — keep it
+that way. No test reads state written by another test; shared server-side state is reset via API in
+`beforeEach` or scoped per worker (`test.info().parallelIndex` in usernames/tenant IDs). A suite
+that only passes single-worker is broken, not "sensitive".
+
+**Flake diagnosis:** `trace: 'on-first-retry'` → `npx playwright show-trace` (DOM snapshots,
+network, console per action). Local: `npx playwright test --ui` or `PWDEBUG=1` / `page.pause()`.
+Repro: `--repeat-each=20 --workers=4`. Playbook: [references/flake-hunting.md](references/flake-hunting.md)
+
+## CI (GitHub Actions)
+
+```yaml
+- uses: actions/checkout@v5
+- uses: actions/setup-node@v5
+  with: { node-version: lts/* }
+- run: npm ci
+- run: npx playwright install --with-deps chromium   # only browsers you test
+- run: npx playwright test
+- uses: actions/upload-artifact@v4
+  if: ${{ !cancelled() }}
+  with: { name: playwright-report, path: playwright-report/, retention-days: 30 }
+```
+
+| Decision | Guidance |
+|----------|----------|
+| Container vs install-deps | `mcr.microsoft.com/playwright:vX.Y.Z-jammy` image pins browser+OS (best for visual tests); `install --with-deps` is simpler and fine otherwise. **Pin image tag to your `@playwright/test` version** |
+| Browser caching | Cache `~/.cache/ms-playwright` keyed on Playwright version; skip when using the container |
+| Sharded reports | `reporter: 'blob'` on shards → upload `blob-report/` → merge job: `npx playwright merge-reports --reporter html ./all-blob-reports` |
+| Fail-fast vs full suite | PRs: `fail-fast: false` + `--max-failures=10` per shard — see *all* failures in one round-trip. Smoke gates: fail fast |
+
+Full workflows (sharding matrix, merge job, caching): [references/ci-patterns.md](references/ci-patterns.md)
+
+## Visual Testing
+
+```ts
+await expect(page).toHaveScreenshot('landing.png', {
+  maxDiffPixels: 100,                       // or maxDiffPixelRatio / threshold
+  mask: [page.getByTestId('ad-banner')],    // black-box dynamic regions
+  fullPage: true,
+});
+```
+
+- First run generates the baseline (test fails); update with `npx playwright test --update-snapshots`
+- Snapshots are named per browser **and platform** (`landing-chromium-darwin.png`) — baselines
+  generated on macOS will not match Linux CI. Fix: generate baselines inside the same Docker image
+  CI uses, or run visual tests only in the container
+- Disable animations: `toHaveScreenshot` defaults `animations: 'disabled'`; hide dynamic bits with
+  `mask` or `stylePath` (CSS applied at capture time)
+- Global defaults: `expect: { toHaveScreenshot: { maxDiffPixels: 100 } }` in config
+- `toMatchSnapshot()` for non-image data (text/buffers)
+
+## Component Testing & When to Prefer Cypress
+
+`@playwright/experimental-ct-react` (also vue/svelte) mounts components in a real browser —
+**still experimental**; for component-level work, Vitest browser mode or Testing Library are the
+safer default, with Playwright covering E2E.
+
+| Factor | Playwright | Cypress |
+|--------|-----------|---------|
+| Browsers | Chromium, Firefox, WebKit (real Safari engine) | Chrome-family, Firefox; WebKit experimental |
+| Parallelism | Free, built-in, shardable | Paid Cloud for parallel orchestration |
+| Multi-tab / multi-origin / iframes | Native | Historically constrained |
+| API testing | Built-in `request` context | Via `cy.request`, less ergonomic |
+| Component testing | Experimental | Mature, first-class |
+| In-browser interactive DX | UI mode (excellent) | The original benchmark; some teams still prefer it |
+
+Reach for Cypress when component testing maturity or an existing Cypress investment dominates;
+otherwise Playwright is the default for new E2E suites. (Repo also has a `cypress-expert` agent.)
+
+## Debugging & Codegen
+
+| Tool | Command | Use |
+|------|---------|-----|
+| UI mode | `npx playwright test --ui` | Watch mode, time-travel, pick locators |
+| Inspector | `PWDEBUG=1 npx playwright test` or `page.pause()` | Step through actions live |
+| Codegen | `npx playwright codegen <url>` | Records actions, emits role-based locators — treat output as a draft, refactor into POMs/fixtures |
+| Trace viewer | `npx playwright show-trace trace.zip` | Post-mortem: snapshots, network, console |
+| Headed + slow | `--headed --debug` | Eyeball a single test |
+| VS Code extension | — | Run/debug tests, pick locators in-editor |
+
+An official Playwright MCP server (`@playwright/mcp`) also exists for agent-driven browser
+automation — distinct from the test runner; don't conflate browsing automation with the test suite.
+
+## References
+
+| File | Contents |
+|------|----------|
+| [references/fixtures-and-pom.md](references/fixtures-and-pom.md) | Fixture scopes/options/merging, POM-as-fixture architecture, anti-patterns |
+| [references/network-and-api.md](references/network-and-api.md) | route/fulfill/abort, HAR replay, API testing, hybrid seeding, WebSocket |
+| [references/ci-patterns.md](references/ci-patterns.md) | Full GH Actions workflows: basic, sharded+merge, container, caching, reporters |
+| [references/flake-hunting.md](references/flake-hunting.md) | Systematic flake diagnosis: traces, repro loops, common causes + fixes |
+| [assets/playwright.config.template.ts](assets/playwright.config.template.ts) | Commented production config template |

+ 126 - 0
skills/playwright-ops/assets/playwright.config.template.ts

@@ -0,0 +1,126 @@
+/**
+ * Production Playwright config template.
+ *
+ * Copy to playwright.config.ts and adjust the marked sections.
+ * Conventions baked in:
+ *   - blob reporter on CI (shard-mergeable), html locally
+ *   - trace on first retry (flake forensics at near-zero cost)
+ *   - auth via a `setup` project + storageState (login once, reuse everywhere)
+ *   - webServer boots the app and waits for it — no sleeps in CI scripts
+ */
+import { defineConfig, devices } from '@playwright/test';
+
+export default defineConfig({
+  testDir: './tests',
+
+  // Run tests within files in parallel too (also gives per-test shard balancing).
+  // Set false only if tests within a file are intentionally ordered.
+  fullyParallel: true,
+
+  // A stray `test.only` committed to CI silently skips the suite — make it a build failure.
+  forbidOnly: !!process.env.CI,
+
+  // Retries are flake telemetry, not a fix: retried-then-passed tests show as
+  // "flaky" in the report. Keep 0 locally so you feel flakes immediately.
+  retries: process.env.CI ? 2 : 0,
+
+  // Small CI runners (2-core GitHub hosted) thrash with parallel browser workers.
+  // Scale horizontally with --shard instead. Locally, default = ~half the cores.
+  workers: process.env.CI ? 1 : undefined,
+
+  // blob -> uploaded per shard, merged with `npx playwright merge-reports`.
+  reporter: process.env.CI
+    ? [['blob'], ['github']]
+    : [['html', { open: 'on-failure' }]],
+
+  // Per-action timeout defaults are usually fine; raise the global test timeout
+  // only for genuinely long flows (or per-test via test.slow()).
+  timeout: 30_000,
+  expect: {
+    timeout: 5_000,
+    // Global visual-comparison tolerances; override per assertion when needed.
+    toHaveScreenshot: {
+      maxDiffPixels: 100,
+      // animations: 'disabled' is already the default for screenshots
+    },
+  },
+
+  use: {
+    // All page.goto('/relative') and request.get('/api/...') resolve against this.
+    baseURL: process.env.BASE_URL ?? 'http://localhost:3000',
+
+    // Trace on first retry: the failing run gets full DOM snapshots + network log.
+    // Use 'retain-on-failure' instead if you run with retries: 0.
+    trace: 'on-first-retry',
+    screenshot: 'only-on-failure',
+    // video is mostly redundant with traces; enable only if your team skims videos.
+    // video: 'retain-on-failure',
+
+    // Determinism: pin what the OS would otherwise decide for you.
+    locale: 'en-US',
+    timezoneId: 'UTC',
+    // viewport comes from the device preset per project below.
+
+    // Attribute used by page.getByTestId(); align with your frontend convention.
+    testIdAttribute: 'data-testid',
+
+    // Uncomment if a service worker swallows your route() mocks:
+    // serviceWorkers: 'block',
+  },
+
+  projects: [
+    // --- Auth setup: runs first, saves storage state for the browser projects ---
+    // tests/auth.setup.ts logs in and calls
+    //   page.context().storageState({ path: 'playwright/.auth/user.json' })
+    // Keep playwright/.auth/ in .gitignore.
+    { name: 'setup', testMatch: /.*\.setup\.ts/ },
+
+    {
+      name: 'chromium',
+      use: {
+        ...devices['Desktop Chrome'],
+        storageState: 'playwright/.auth/user.json',
+      },
+      dependencies: ['setup'],
+    },
+
+    // Enable per browser-support matrix. Remember: each enabled project must be
+    // installed in CI (npx playwright install --with-deps firefox webkit).
+    // {
+    //   name: 'firefox',
+    //   use: { ...devices['Desktop Firefox'], storageState: 'playwright/.auth/user.json' },
+    //   dependencies: ['setup'],
+    // },
+    // {
+    //   name: 'webkit',
+    //   use: { ...devices['Desktop Safari'], storageState: 'playwright/.auth/user.json' },
+    //   dependencies: ['setup'],
+    // },
+
+    // Mobile viewport smoke pass.
+    // {
+    //   name: 'mobile-chrome',
+    //   use: { ...devices['Pixel 7'], storageState: 'playwright/.auth/user.json' },
+    //   dependencies: ['setup'],
+    //   grep: /@smoke/,
+    // },
+
+    // Unauthenticated flows (login page itself, public pages) — no storageState.
+    // {
+    //   name: 'chromium-no-auth',
+    //   use: { ...devices['Desktop Chrome'] },
+    //   testMatch: /.*\.public\.spec\.ts/,
+    // },
+  ],
+
+  // Playwright boots your app and polls `url` until it responds — replaces
+  // "npm start & sleep 15" hacks in CI scripts.
+  webServer: {
+    command: process.env.CI ? 'npm run build && npm run start' : 'npm run dev',
+    url: 'http://localhost:3000',
+    reuseExistingServer: !process.env.CI,
+    timeout: 120_000,
+    stdout: 'pipe', // surface app logs in CI output when boot fails
+  },
+  // Multiple servers? webServer also accepts an array: [{ api }, { frontend }].
+});

+ 193 - 0
skills/playwright-ops/references/ci-patterns.md

@@ -0,0 +1,193 @@
+# CI Patterns (GitHub Actions)
+
+Runnable workflows for Playwright in CI, from single-job to sharded fleets. Adapt paths/commands
+for other CI providers — the shape is identical.
+
+## Baseline Workflow
+
+```yaml
+# .github/workflows/playwright.yml
+name: Playwright Tests
+on:
+  push:
+    branches: [main]
+  pull_request:
+    branches: [main]
+jobs:
+  test:
+    timeout-minutes: 60
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v5
+      - uses: actions/setup-node@v5
+        with:
+          node-version: lts/*
+      - name: Install dependencies
+        run: npm ci
+      - name: Install Playwright browsers
+        run: npx playwright install --with-deps chromium
+      - name: Run Playwright tests
+        run: npx playwright test
+      - uses: actions/upload-artifact@v4
+        if: ${{ !cancelled() }}          # upload report on failure too
+        with:
+          name: playwright-report
+          path: playwright-report/
+          retention-days: 30
+```
+
+Notes:
+
+- Install only the browsers your projects use (`chromium` above) — saves minutes per run.
+- `if: ${{ !cancelled() }}` keeps the report when tests fail; that's when you need it.
+- Secrets via `env:` on the test step (`E2E_USER: ${{ secrets.E2E_USER }}`), never committed.
+
+## Container vs install-deps
+
+| Approach | Pros | Cons |
+|----------|------|------|
+| `npx playwright install --with-deps` on the runner | Simple; matches local dev | OS-level rendering drifts with runner image updates — visual baselines can churn |
+| `container: mcr.microsoft.com/playwright:v1.52.0-jammy` | Pinned browser + OS rendering; reproducible visual tests; no install step | Slightly slower job start; must bump tag with `@playwright/test` |
+
+**Always pin the container tag to your exact `@playwright/test` version** — a mismatch produces
+"Executable doesn't exist" or subtle behavior skew.
+
+```yaml
+jobs:
+  test:
+    runs-on: ubuntu-latest
+    container:
+      image: mcr.microsoft.com/playwright:v1.52.0-jammy
+    steps:
+      - uses: actions/checkout@v5
+      - run: npm ci
+      - run: npx playwright test
+        env:
+          HOME: /root      # workaround for firefox in containers
+```
+
+## Caching Browsers (non-container path)
+
+```yaml
+- name: Get Playwright version
+  id: pw-version
+  run: echo "version=$(node -p "require('@playwright/test/package.json').version")" >> "$GITHUB_OUTPUT"
+- uses: actions/cache@v4
+  id: pw-cache
+  with:
+    path: ~/.cache/ms-playwright
+    key: playwright-${{ runner.os }}-${{ steps.pw-version.outputs.version }}
+- run: npx playwright install --with-deps chromium
+  if: steps.pw-cache.outputs.cache-hit != 'true'
+- run: npx playwright install-deps chromium     # OS deps aren't cached
+  if: steps.pw-cache.outputs.cache-hit == 'true'
+```
+
+## Sharding with Blob Reports + Merge
+
+Config side — blob on CI shards, html locally:
+
+```ts
+// playwright.config.ts
+reporter: process.env.CI ? 'blob' : 'html',
+fullyParallel: true,   // shards split per-test instead of per-file -> better balance
+```
+
+```yaml
+jobs:
+  playwright-tests:
+    runs-on: ubuntu-latest
+    strategy:
+      fail-fast: false                  # let every shard finish; see ALL failures
+      matrix:
+        shardIndex: [1, 2, 3, 4]
+        shardTotal: [4]
+    steps:
+      - uses: actions/checkout@v5
+      - uses: actions/setup-node@v5
+        with: { node-version: lts/* }
+      - run: npm ci
+      - run: npx playwright install --with-deps chromium
+      - run: npx playwright test --shard=${{ matrix.shardIndex }}/${{ matrix.shardTotal }}
+      - uses: actions/upload-artifact@v4
+        if: ${{ !cancelled() }}
+        with:
+          name: blob-report-${{ matrix.shardIndex }}
+          path: blob-report
+          retention-days: 1
+
+  merge-reports:
+    if: ${{ !cancelled() }}
+    needs: [playwright-tests]
+    runs-on: ubuntu-latest
+    steps:
+      - uses: actions/checkout@v5
+      - uses: actions/setup-node@v5
+        with: { node-version: lts/* }
+      - run: npm ci
+      - uses: actions/download-artifact@v4
+        with:
+          path: all-blob-reports
+          pattern: blob-report-*
+          merge-multiple: true
+      - run: npx playwright merge-reports --reporter html ./all-blob-reports
+      - uses: actions/upload-artifact@v4
+        with:
+          name: html-report--attempt-${{ github.run_attempt }}
+          path: playwright-report
+          retention-days: 14
+```
+
+`merge-reports` accepts multiple reporters: `--reporter html,github` annotates the PR while also
+producing the browsable report.
+
+## Fail-Fast vs Full-Suite
+
+| Context | Strategy |
+|---------|----------|
+| PR validation | `fail-fast: false` on the matrix + `maxFailures: 10` (or `--max-failures`) per shard. Developers fix everything in one round-trip instead of whack-a-mole |
+| Smoke gate before deploy | Fail fast — `--grep @smoke`, no retries, abort pipeline on first failure |
+| Nightly full regression | Full suite, retries on, no fail-fast; route the merged report to the team channel |
+
+## Reporters
+
+| Reporter | Use |
+|----------|-----|
+| `html` | Local + merged CI artifact — the daily driver |
+| `blob` | Shard intermediate; only input for `merge-reports` |
+| `junit` | Test-management ingestion (Jenkins, Azure DevOps, TestRail): `['junit', { outputFile: 'results.xml' }]` |
+| `github` | Inline PR annotations on failures |
+| `list` / `dot` / `line` | Console verbosity choices |
+
+Multiple at once:
+
+```ts
+reporter: process.env.CI
+  ? [['blob'], ['github']]
+  : [['html', { open: 'on-failure' }]],
+```
+
+## webServer in CI
+
+```ts
+webServer: {
+  command: 'npm run build && npm run start',
+  url: 'http://localhost:3000',
+  reuseExistingServer: !process.env.CI,   // CI always boots fresh
+  timeout: 120_000,
+  stdout: 'pipe',                          // surface server logs in CI output
+},
+```
+
+Playwright waits for `url` to respond before running tests — no `sleep 10` hacks. Multiple
+servers (API + frontend) can be given as an array.
+
+## CI Hardening Checklist
+
+- [ ] `forbidOnly: !!process.env.CI` — a stray `test.only` fails the build instead of silently skipping the suite
+- [ ] `retries: 2` on CI + `trace: 'on-first-retry'`
+- [ ] `workers: 1` per shard on small runners (2-core GitHub runners thrash above that); scale via shards
+- [ ] Report artifacts uploaded with `if: ${{ !cancelled() }}`
+- [ ] Browser install scoped to actual projects
+- [ ] Container tag or browser cache keyed to the Playwright version
+- [ ] Visual-test baselines generated in the same environment CI runs (see SKILL.md Visual Testing)

+ 191 - 0
skills/playwright-ops/references/fixtures-and-pom.md

@@ -0,0 +1,191 @@
+# Fixtures and Page Object Architecture
+
+How to structure Playwright Test suites with fixtures as the composition mechanism and page
+objects as thin locator/action wrappers.
+
+## Built-in Fixtures
+
+| Fixture | Type | Scope | Notes |
+|---------|------|-------|-------|
+| `page` | `Page` | test | Fresh isolated page per test |
+| `context` | `BrowserContext` | test | Fresh context per test — cookies/storage isolated |
+| `browser` | `Browser` | worker | Shared across tests in a worker |
+| `browserName` | `string` | worker | `'chromium' \| 'firefox' \| 'webkit'` |
+| `request` | `APIRequestContext` | test | HTTP client honoring `baseURL` / `extraHTTPHeaders` |
+
+## Custom Fixtures: the Full Shape
+
+```ts
+import { test as base } from '@playwright/test';
+
+type TestFixtures = {
+  todoPage: TodoPage;        // test-scoped
+  defaultItem: string;       // option
+};
+type WorkerFixtures = {
+  account: { username: string; password: string };  // worker-scoped
+};
+
+export const test = base.extend<TestFixtures, WorkerFixtures>({
+  // Option — overridable per project via projects[].use
+  defaultItem: ['Something nice', { option: true }],
+
+  // Test-scoped fixture with setup + teardown
+  todoPage: async ({ page, defaultItem }, use) => {
+    const todoPage = new TodoPage(page);
+    await todoPage.goto();
+    await todoPage.addToDo(defaultItem);
+    await use(todoPage);              // <-- test body executes here
+    await todoPage.removeAll();       // teardown runs even if test fails
+  },
+
+  // Worker-scoped — once per worker process; second generic param
+  account: [async ({ browser }, use, workerInfo) => {
+    const username = 'user-' + workerInfo.workerIndex;
+    const password = await createAccount(username);   // expensive, do once
+    await use({ username, password });
+    await deleteAccount(username);
+  }, { scope: 'worker' }],
+});
+
+export { expect } from '@playwright/test';
+```
+
+Key mechanics:
+
+- **Lazy**: a fixture only runs if the test (or another fixture) references it.
+- **Composable**: fixtures depend on other fixtures by destructuring them.
+- **Teardown order**: reverse of setup, runs even on failure — replaces brittle `afterEach` chains.
+- The two generic params of `extend<TestFixtures, WorkerFixtures>` map to test scope and worker
+  scope respectively. Worker fixtures cannot depend on test fixtures.
+
+## Fixture Options Reference
+
+| Option | Effect |
+|--------|--------|
+| `{ scope: 'worker' }` | One instance per worker process |
+| `{ auto: true }` | Runs for every test without being referenced — global hooks |
+| `{ option: true }` | Value is a project-configurable option |
+| `{ timeout: 60_000 }` | Separate timeout for slow fixture setup |
+| `{ box: true }` | Hide fixture from report/errors (or `box: 'self'` to hide just its step) |
+| `{ title: 'my fixture' }` | Custom name in reports |
+
+### Automatic fixtures as global hooks
+
+```ts
+export const test = base.extend<{ forEachTest: void }, { forEachWorker: void }>({
+  // beforeEach/afterEach equivalent, but reusable across files
+  forEachTest: [async ({ page }, use) => {
+    await page.goto('/');         // before each test
+    await use();
+    // after each test
+  }, { auto: true }],
+
+  // once per worker
+  forEachWorker: [async ({}, use) => {
+    console.log(`Worker ${test.info().workerIndex} starting`);
+    await use();
+  }, { scope: 'worker', auto: true }],
+});
+```
+
+### Overriding built-ins
+
+```ts
+export const test = base.extend({
+  page: async ({ page }, use) => {
+    await page.goto('/dashboard');   // every test starts on dashboard
+    await use(page);
+  },
+  // Override storageState to come from a worker fixture (per-worker auth)
+  storageState: ({ workerStorageState }, use) => use(workerStorageState),
+});
+```
+
+## mergeTests: Composing Fixture Modules
+
+Keep fixture concerns in separate modules and merge at the edge:
+
+```ts
+// fixtures/db.ts        -> export const test = base.extend<{ db: Db }>({...})
+// fixtures/a11y.ts      -> export const test = base.extend<{ axe: Axe }>({...})
+
+// fixtures/index.ts
+import { mergeTests, mergeExpects } from '@playwright/test';
+import { test as dbTest } from './db';
+import { test as a11yTest } from './a11y';
+
+export const test = mergeTests(dbTest, a11yTest);
+export { expect } from '@playwright/test';
+```
+
+Tests import `test`/`expect` from your fixtures module, never from `@playwright/test` directly —
+one import path to rule them all.
+
+## Page Objects: Modern Recommendation
+
+### POM-as-fixture (preferred)
+
+The page object is a plain class; the **fixture** owns construction and navigation:
+
+```ts
+// pages/checkout-page.ts
+import { type Page, type Locator, expect } from '@playwright/test';
+
+export class CheckoutPage {
+  readonly page: Page;
+  readonly cardNumber: Locator;
+  readonly payButton: Locator;
+
+  constructor(page: Page) {
+    this.page = page;
+    this.cardNumber = page.getByLabel('Card number');
+    this.payButton = page.getByRole('button', { name: 'Pay now' });
+  }
+
+  async goto() {
+    await this.page.goto('/checkout');
+  }
+
+  async pay(card: string) {
+    await this.cardNumber.fill(card);
+    await this.payButton.click();
+  }
+}
+```
+
+```ts
+// a test
+test('pays with valid card', async ({ checkoutPage, page }) => {
+  await checkoutPage.pay('4242 4242 4242 4242');
+  await expect(page.getByRole('heading', { name: 'Order confirmed' })).toBeVisible();
+});
+```
+
+### Rules for healthy page objects
+
+| Rule | Why |
+|------|-----|
+| Store `Locator`s, never element handles | Locators are lazy + auto-retrying; handles go stale |
+| Expose actions + locators; keep assertions in tests (or custom `expect` matchers) | Tests stay readable as specs; POMs stay reusable |
+| No `waitForTimeout` / try-catch flow control inside POMs | Hides flake; actions already auto-wait |
+| Constructor takes `Page` (or a `Locator` root for component objects) only | Keeps them trivially fixture-injectable |
+| Prefer small per-screen objects over one God object | Cheap to compose via fixtures |
+
+### When to skip POMs entirely
+
+Small suites (< ~20 tests) over stable UIs often read better with raw `getByRole` calls inline.
+POMs earn their keep when the same screen appears in many tests or locators churn. Don't build the
+abstraction before the duplication exists.
+
+## Hooks vs Fixtures
+
+| Need | Use |
+|------|-----|
+| Shared setup local to one file | `test.beforeEach` is fine |
+| Shared setup across files | Fixture (auto or named) |
+| Expensive once-per-run setup | Project dependencies (setup project) — not `globalSetup`, which skips fixtures/tracing |
+| Once-per-worker setup | Worker-scoped fixture |
+
+`test.beforeAll` runs **once per worker**, not once per run — a classic surprise. For true
+once-per-run work, use a setup project with `dependencies`.

+ 106 - 0
skills/playwright-ops/references/flake-hunting.md

@@ -0,0 +1,106 @@
+# Flake Hunting
+
+Systematic diagnosis of flaky Playwright tests. A flaky test is a bug — in the test, the app, or
+the environment. Retries buy time to fix it; they are not the fix.
+
+## Triage Workflow
+
+```
+Flaky test reported
+│
+1. Get the evidence
+│  └─ trace: 'on-first-retry' in config → download trace from CI artifact
+│     └─ npx playwright show-trace path/to/trace.zip
+│        (per-action DOM snapshots, network, console, timing)
+│
+2. Reproduce locally
+│  └─ npx playwright test failing.spec.ts --repeat-each=20 --workers=4
+│     ├─ Fails alone, repeated        → timing/race within the test
+│     ├─ Fails only with --workers>1  → cross-test state leakage
+│     └─ Fails only in CI             → environment delta (speed, viewport, headless, locale, TZ)
+│
+3. Classify against the table below, fix the CAUSE
+│
+4. Prove the fix
+   └─ --repeat-each=50 clean, then watch the "flaky" count in CI reports trend to zero
+```
+
+CI flake visibility: HTML report marks retried-then-passed tests as **flaky** — review that list
+weekly; it's your queue.
+
+## Common Causes and Fixes
+
+| Symptom | Root cause | Fix |
+|---------|-----------|-----|
+| Click "worked" but nothing happened | Element re-rendered between locate and click (hydration, list re-sort) | Assert the settled state first: `await expect(row).toBeVisible()` then act; prefer role/text locators that target the final element |
+| Assertion passes locally, times out in CI | CI is slower; manual check raced the render | Replace any non-retrying check with web-first `await expect(...)`; raise `expect.timeout` only if the app is legitimately slow |
+| `waitForTimeout` sprinkled around | Sleeping instead of waiting for a condition | Delete; wait on the observable effect: `expect(locator)`, `page.waitForURL()`, `page.waitForResponse()` |
+| Fails only with multiple workers | Tests share an account/record; one mutates what another reads | Per-worker data: suffix usernames/tenants with `test.info().parallelIndex`; or worker-scoped account fixture |
+| First test after auth flaky | storageState saved before login finished | In auth.setup, assert a logged-in signal (`await expect(page.getByTestId('user-menu')).toBeVisible()`) before `storageState({ path })` |
+| Animation mid-flight in screenshots/clicks | CSS transitions | `toHaveScreenshot` disables animations by default; for actions, assert post-animation state or set `reducedMotion: 'reduce'` in `use` |
+| Time-dependent failures (midnight, month-end, TZ) | Real clock | `await page.clock.setFixedTime(new Date('2026-01-15T10:00:00'))`; pin `timezoneId` and `locale` in `use` |
+| Random data collisions | Shared fixtures with hardcoded names | Unique-per-test names: `` `proj-${test.info().testId}` `` |
+| Network nondeterminism from third parties | Live external calls | Mock them (`route.fulfill` / HAR replay) — see network-and-api.md |
+| Passes in `--headed`, fails headless | Viewport/focus/rendering differences | Pin `viewport` in config; debug headless with traces, not by switching to headed |
+| Fails only on retry / second run | Leftover server-side state from first attempt | Make setup idempotent (upsert, not create); clean up in fixture teardown, which runs on failure too |
+
+## Tools Reference
+
+| Tool | Invocation | What it gives you |
+|------|-----------|-------------------|
+| Trace viewer | `trace: 'on-first-retry'` → `npx playwright show-trace trace.zip` | Time-travel DOM snapshots, network log, console, action timeline — the primary CI forensic tool |
+| UI mode | `npx playwright test --ui` | Watch mode + live trace while iterating on a fix |
+| Inspector | `PWDEBUG=1 npx playwright test foo.spec.ts` or `await page.pause()` | Step through actions, try locators live |
+| Repeat | `--repeat-each=20` | Statistical reproduction |
+| Stress | `--workers=4` (or more than usual) | Surfaces isolation bugs |
+| Single worker | `--workers=1` | If this "fixes" it, you have cross-test coupling — that's the bug |
+| Verbose API log | `DEBUG=pw:api npx playwright test` | Every Playwright call with timing |
+| Video | `video: 'retain-on-failure'` | Cheaper than trace to skim; less data |
+
+`trace: 'on'` everywhere is expensive — `'on-first-retry'` is the right default; use
+`'retain-on-failure'` if you run without retries.
+
+## Retrying Non-DOM Conditions
+
+Web-first assertions only retry on locators/page. For everything else:
+
+```ts
+// Poll an arbitrary async value
+await expect.poll(async () => {
+  const res = await request.get(`/api/jobs/${id}`);
+  return (await res.json()).status;
+}, { timeout: 30_000, intervals: [1_000] }).toBe('done');
+
+// Retry a block of assertions/actions together
+await expect(async () => {
+  const res = await request.get('/health');
+  expect(res.status()).toBe(200);
+}).toPass({ timeout: 60_000 });
+```
+
+Use these for eventual consistency (queues, search indexing, emails) instead of sleep loops.
+
+## Isolation Discipline Checklist
+
+- [ ] No test depends on another test having run (`test.describe.configure({ mode: 'serial' })` is a red flag, not a tool of first resort)
+- [ ] Server-side state is created per test (API seeding) or per worker (`parallelIndex`-scoped accounts)
+- [ ] Teardown lives in fixtures (runs on failure), not at the end of test bodies
+- [ ] `storageState` files saved only after asserting login completed
+- [ ] `forbidOnly` on CI; `--repeat-each` smoke before merging new specs
+- [ ] Suite passes with `--workers=8 --repeat-each=3` locally before you blame CI
+
+## Quarantine Pattern
+
+While a flake is being fixed, tag it instead of deleting or `.skip`-ing silently:
+
+```ts
+test('checkout under load @quarantine', async ({ page }) => { ... });
+```
+
+```bash
+npx playwright test --grep-invert @quarantine        # main gate
+npx playwright test --grep @quarantine               # nightly, non-blocking
+```
+
+Track quarantined tests with an issue each; a quarantine list that only grows is a suite dying in
+slow motion.

+ 169 - 0
skills/playwright-ops/references/network-and-api.md

@@ -0,0 +1,169 @@
+# Network Mocking and API Testing
+
+Intercepting browser traffic, replaying HAR recordings, testing APIs directly, and the hybrid
+seed-via-API / assert-via-UI pattern.
+
+## Route Interception: page.route()
+
+```ts
+// Stub an endpoint entirely
+await page.route('*/**/api/v1/fruits', async route => {
+  await route.fulfill({ json: [{ name: 'Strawberry', id: 21 }] });
+});
+
+// Must be registered BEFORE the navigation/action that triggers the request
+await page.goto('/');
+```
+
+| Method | Effect |
+|--------|--------|
+| `route.fulfill({ json, status, headers, body, path })` | Respond without hitting the network |
+| `route.fetch()` | Execute the real request, get the response for modification |
+| `route.continue({ headers, postData, url })` | Pass through, optionally modified |
+| `route.abort('failed')` | Simulate network failure |
+| `route.fallback()` | Defer to the next matching handler (handlers run last-registered-first) |
+
+### Modify a real response
+
+```ts
+await page.route('*/**/api/v1/fruits', async route => {
+  const response = await route.fetch();
+  const json = await response.json();
+  json.push({ name: 'Loquat', id: 100 });
+  await route.fulfill({ response, json });   // real status/headers, patched body
+});
+```
+
+### Failure-mode tests
+
+```ts
+await page.route('**/api/orders', route => route.fulfill({ status: 500 }));
+await page.route('**/*.{png,jpg,jpeg}', route => route.abort());   // block heavy assets
+await context.setOffline(true);                                     // whole-context offline
+```
+
+### Scope and ordering gotchas
+
+- `page.route` applies to that page; `context.route` to every page in the context (use in a
+  fixture for suite-wide stubs).
+- Patterns: glob (`**/api/**`), RegExp, or predicate function. Glob matches the **full URL**.
+- `await page.unroute(pattern)` removes handlers; `page.unrouteAll()` clears them.
+- Service workers can bypass routing — set `serviceWorkers: 'block'` in `use` if your app
+  registers one and mocks mysteriously don't fire.
+
+## HAR Record and Replay
+
+Best for "many endpoints, realistic payloads" — record once against the real backend, replay
+hermetically.
+
+```ts
+// Record (update: true hits the real network and refreshes the file)
+await page.routeFromHAR('./hars/fruits.har', {
+  url: '*/**/api/v1/**',
+  update: true,
+});
+
+// Replay (default update: false serves from the file; unmatched requests are aborted)
+await page.routeFromHAR('./hars/fruits.har', { url: '*/**/api/v1/**' });
+```
+
+CLI recording:
+
+```bash
+npx playwright open --save-har=example.har --save-har-glob="**/api/**" https://example.com
+```
+
+Workflow: re-run recording tests with `update: true` whenever the API contract changes, commit the
+HAR + extracted bodies (`.txt`/`.json` sidecars are editable by hand for edge cases).
+
+## API Testing: request / APIRequestContext
+
+The `request` fixture is an HTTP client honoring config `baseURL` and `extraHTTPHeaders` — no
+browser involved, so it's fast.
+
+```ts
+// playwright.config.ts
+use: {
+  baseURL: 'https://api.github.com',
+  extraHTTPHeaders: {
+    'Accept': 'application/vnd.github.v3+json',
+    'Authorization': `token ${process.env.API_TOKEN}`,
+  },
+},
+```
+
+```ts
+test('creates a bug report', async ({ request }) => {
+  const newIssue = await request.post(`/repos/${USER}/${REPO}/issues`, {
+    data: { title: '[Bug] report 1', body: 'Bug description' },
+  });
+  expect(newIssue.ok()).toBeTruthy();
+
+  const issues = await request.get(`/repos/${USER}/${REPO}/issues`);
+  expect(await issues.json()).toContainEqual(
+    expect.objectContaining({ title: '[Bug] report 1' }),
+  );
+});
+```
+
+Standalone context (different base URL, custom auth, use in setup scripts):
+
+```ts
+import { request } from '@playwright/test';
+
+const api = await request.newContext({ baseURL: 'https://api.example.com' });
+await api.post('/seed', { data: {...} });
+await api.dispose();   // always dispose manually created contexts
+```
+
+`request.post` options: `data` (JSON), `form` (urlencoded), `multipart` (file upload),
+`params` (query), `headers`, `failOnStatusCode`.
+
+## Hybrid: Seed via API, Assert via UI
+
+UI-driven setup is the slowest, flakiest part of most suites. Replace it:
+
+```ts
+test('renders the new project card', async ({ request, page }) => {
+  // Arrange — fast, deterministic, server-side
+  const res = await request.post('/api/projects', { data: { name: 'Apollo' } });
+  expect(res.ok()).toBeTruthy();
+  const { id } = await res.json();
+
+  // Act + Assert — the only part that needs a browser
+  await page.goto(`/projects/${id}`);
+  await expect(page.getByRole('heading', { name: 'Apollo' })).toBeVisible();
+});
+```
+
+Notes:
+
+- The `request` fixture shares `storageState` with the browser context, so an authenticated UI
+  session usually authenticates API calls too. For a different principal, create a separate
+  `request.newContext({ storageState: 'playwright/.auth/admin.json' })`.
+- Postcondition checks invert it: act in the UI, verify via `request.get` that the server really
+  persisted the thing.
+- Cleanup belongs in fixtures (teardown after `use()`) or `afterAll` API calls — not in the test
+  body where a failure skips it.
+
+## What to Mock vs Exercise
+
+| Dependency | Default |
+|------------|---------|
+| Third-party SaaS (payments, analytics, maps) | **Mock** (`route.fulfill` / HAR). You can't control their data or uptime, and you don't want test purchases |
+| Your own backend | **Real** — that's the integration you're paying E2E tests to verify |
+| Your backend, in a frontend-only project | Mock deliberately and label the project (`name: 'ui-isolated'`) so coverage claims stay honest |
+| Time / randomness | `page.clock.install()` / `page.clock.setFixedTime(...)` for time-dependent UI |
+
+## WebSocket Mocking
+
+```ts
+await page.routeWebSocket('wss://example.com/ws', ws => {
+  ws.onMessage(message => {
+    if (message === 'request') ws.send('response');
+  });
+});
+```
+
+By default the intercepted socket never reaches the server; call `ws.connectToServer()` inside the
+handler to proxy with selective message rewriting.

+ 0 - 0
skills/claude-code-hooks/assets/.gitkeep → skills/playwright-ops/scripts/.gitkeep


+ 252 - 0
skills/terraform-ops/SKILL.md

@@ -0,0 +1,252 @@
+---
+name: terraform-ops
+description: "Terraform and OpenTofu infrastructure-as-code operations - project layout, state management, module design, plan/apply safety, CI/CD pipelines, and secrets. Use for: terraform, opentofu, infrastructure as code, IaC, tfstate, terraform state, terraform module, remote backend, terraform plan, terraform apply, for_each, moved block, terraform import, drift detection, tflint, checkov, HCL."
+license: MIT
+allowed-tools: "Read Write Bash"
+metadata:
+  author: claude-mods
+  related-skills: ci-cd-ops, docker-ops, container-orchestration
+---
+
+# Terraform Operations
+
+Terraform / OpenTofu infrastructure-as-code: layout, state, modules, safety, CI/CD, secrets.
+
+**Version context (verified 2026-06):** Terraform **1.15.x** (BUSL-1.1 licence since 1.6) · OpenTofu **1.12.x** (MPL-2.0 fork of Terraform 1.5.x). Commands below are interchangeable (`terraform` ↔ `tofu`) unless flagged. See [Terraform vs OpenTofu](#terraform-vs-opentofu) for the decision note.
+
+## Reference Files
+
+| File | Covers |
+|------|--------|
+| [references/state-management.md](references/state-management.md) | Remote backends, locking, moved/import/removed blocks, state surgery, drift detection |
+| [references/module-patterns.md](references/module-patterns.md) | Module composition, variable validation, optional/nullable, output contracts, versioning |
+| [references/cicd-pipelines.md](references/cicd-pipelines.md) | GitHub Actions plan/apply, OIDC auth, policy gates (tflint/trivy/checkov/OPA), Atlantis/HCP |
+| [references/security-and-secrets.md](references/security-and-secrets.md) | Secrets in state, ephemeral resources, write-only arguments, SOPS/Vault, sensitive limits |
+| [assets/github-actions-terraform.yml](assets/github-actions-terraform.yml) | Ready-to-adapt PR-plan + OIDC-apply workflow |
+
+## Project Layout Decision Tree
+
+```
+How many environments / accounts?
+│
+├─ One environment, one team
+│  └─ Single root module + tfvars. Don't over-engineer.
+│
+├─ Multiple environments (dev/staging/prod)
+│  ├─ Need different backend/account/region per env? (usually YES for prod isolation)
+│  │  └─ DIRECTORY PER ENVIRONMENT (recommended default)
+│  │     environments/{dev,staging,prod}/ each a thin root calling shared modules
+│  │
+│  └─ Environments truly identical except a few variables, same backend account?
+│     └─ Workspaces are *acceptable* — but see the workspace caveats below
+│
+└─ Many teams / many state files / platform engineering
+   └─ Directory-per-env + per-component state split (network / data / app)
+      Consider Terragrunt, Terraform Stacks (HCP), or OpenTofu + CI orchestration
+```
+
+### Canonical multi-env layout
+
+```
+infra/
+├── modules/                  # Reusable child modules (no provider/backend blocks)
+│   ├── network/
+│   │   ├── main.tf
+│   │   ├── variables.tf
+│   │   ├── outputs.tf
+│   │   └── versions.tf       # required_providers ONLY (no provider config)
+│   └── app-service/
+├── environments/             # Root modules — one state file each
+│   ├── dev/
+│   │   ├── main.tf           # module "network" { source = "../../modules/network" ... }
+│   │   ├── backend.tf        # remote backend, env-specific key
+│   │   ├── providers.tf      # provider config lives in ROOT only
+│   │   ├── terraform.tfvars  # committed, non-secret env values
+│   │   └── versions.tf       # required_version + required_providers pins
+│   └── prod/
+└── .tflint.hcl
+```
+
+### Why directories usually beat workspaces
+
+| Concern | Directories | Workspaces |
+|---------|-------------|------------|
+| Separate backend/account per env | Yes — each root has its own `backend.tf` | No — one backend, envs differ only by state key |
+| Blast radius of wrong-env apply | Low — you're physically in `prod/` | High — invisible `terraform workspace select` state |
+| Env-specific config divergence | Natural (different main.tf if needed) | `terraform.workspace` conditionals creep everywhere |
+| Prod IAM isolation | Per-dir CI role | Same credentials see all envs |
+| Visibility in code review | Diff shows which env changed | Workspace is runtime state, not in the diff |
+
+Workspaces fit short-lived ephemeral copies (PR preview envs) — not the dev/prod boundary. HashiCorp's own docs say workspaces are "not suitable for strong separation."
+
+### tfvars conventions
+
+```bash
+terraform.tfvars            # auto-loaded — per-root committed defaults (non-secret)
+*.auto.tfvars               # auto-loaded — generated/local overrides
+prod.tfvars                 # explicit only: terraform plan -var-file=prod.tfvars
+TF_VAR_db_password=...      # env var injection — secrets in CI, never in files
+```
+
+Gotcha: `-var-file` + directories-per-env is belt-and-braces; with workspaces it's load-bearing and one forgotten flag applies dev values to prod.
+
+## State Quick Reference
+
+Full detail: [references/state-management.md](references/state-management.md).
+
+| Task | Command / block | Notes |
+|------|-----------------|-------|
+| Remote backend (AWS) | `backend "s3" { bucket, key, region, use_lockfile = true }` | S3-native locking (TF ≥1.10) — DynamoDB table no longer required |
+| Rename resource in code | `moved { from = aws_x.a, to = aws_x.b }` | Declarative, reviewable, no CLI surgery |
+| Adopt existing infra | `import { to = aws_x.a, id = "i-123" }` + `plan -generate-config-out=gen.tf` | Config-driven import (TF ≥1.5) beats `terraform import` CLI |
+| Forget without destroy | `removed { from = aws_x.a, lifecycle { destroy = false } }` | TF ≥1.7; OpenTofu 1.12 also has `lifecycle { destroy = false }` on resources |
+| Drift detection | `terraform plan -detailed-exitcode` | Exit 0 = clean, 1 = error, **2 = drift** — cron it |
+| Inspect state | `terraform state list` / `state show ADDR` | Read-only, always safe |
+| Move state (last resort) | `terraform state mv SRC DST` | Prefer `moved` blocks — see "when NOT to" below |
+| Pull/push (emergency) | `terraform state pull > backup.tfstate` | ALWAYS pull a backup before any surgery |
+
+**State surgery — when NOT to:** if a `moved`/`removed`/`import` block can express it, use the block. CLI `state mv`/`rm` is immediate, unreviewed, unversioned, and a typo orphans real infrastructure. Legit uses: splitting state between roots, unwedging a failed migration. Always `state pull` a backup first.
+
+## Module Quick Reference
+
+Full detail: [references/module-patterns.md](references/module-patterns.md).
+
+```hcl
+module "network" {
+  source  = "terraform-aws-modules/vpc/aws"
+  version = "~> 6.0"          # pin minor-float for registry modules; exact pin in prod roots
+  # ...
+}
+```
+
+| Rule | Why |
+|------|-----|
+| Composition over inheritance | Roots compose flat modules; never module-wraps-module-wraps-module |
+| No provider blocks in child modules | Providers configured in root only; child declares `required_providers` |
+| `validation` blocks on variables | Fail at plan with a real message, not mid-apply |
+| `optional(type, default)` in object attrs | Callers omit fields; `nullable = false` rejects explicit null |
+| Outputs are the contract | Output IDs/ARNs consumers need; document with `description` |
+| **Anti-pattern: thin wrappers** | A module that just renames variables of another module adds a version-lag layer and zero value — call the upstream module directly |
+
+## Safety Checklist (before every apply)
+
+```
+□ plan output READ, not skimmed — every destroy/replace explained
+□ "Plan: X to add, Y to change, Z to destroy" — does Z surprise you?
+□ -/+ (replace) lines: check the "forces replacement" attribute
+□ Applying the SAME saved plan that was reviewed: plan -out=tfplan → apply tfplan
+□ prevent_destroy on stateful resources (db, state bucket, KMS keys)
+□ Cloud-side deletion protection too (RDS deletion_protection, S3 versioning+MFA-delete)
+□ No -target unless this is a declared emergency (see below)
+□ for_each (stable keys), not count, for any collection that can reorder
+```
+
+### Footguns
+
+| Footgun | Detail | Fix |
+|---------|--------|-----|
+| `count` index shift | Removing item 0 of a `count` list re-addresses every later item → destroy/recreate cascade | `for_each` with stable string keys |
+| `-target` habit | Skips dependency graph; state diverges from config; hides drift | Emergency-only (broken dependency cycle, partial outage). Follow with a full clean plan |
+| `prevent_destroy` false comfort | Doesn't survive the block being deleted, and doesn't stop `state rm` + console delete | Pair with cloud-native deletion protection |
+| Dynamic blocks everywhere | `dynamic` for 2 static blocks is obfuscation | Use `dynamic` only over genuinely variable collections |
+| Unpinned providers | `aws = ">= 5.0"` in prod pulls a breaking major the day it ships | `~> 6.12` + commit `.terraform.lock.hcl` |
+| Apply ≠ reviewed plan | Plan on PR, apply on merge re-plans — drift in between applies unreviewed changes | Save the plan artifact, or accept + re-review the merge plan |
+
+```hcl
+resource "aws_db_instance" "main" {
+  deletion_protection = true            # cloud-side
+  lifecycle {
+    prevent_destroy = true              # terraform-side
+    ignore_changes  = [password]        # if rotated outside TF
+  }
+}
+```
+
+## CI/CD Quick Reference
+
+Full detail: [references/cicd-pipelines.md](references/cicd-pipelines.md) · template: [assets/github-actions-terraform.yml](assets/github-actions-terraform.yml).
+
+```
+PR opened   → fmt -check → validate → tflint → trivy/checkov → plan → plan posted as PR comment
+PR merged   → plan (fresh) → apply, authenticated via OIDC — no long-lived cloud keys
+Nightly     → plan -detailed-exitcode → exit 2 ⇒ drift alert
+```
+
+- **OIDC everywhere** — `aws-actions/configure-aws-credentials` with `role-to-assume`, never `AWS_ACCESS_KEY_ID` secrets. Same supply-chain doctrine as this repo's rules: short-lived tokens, no standing credentials.
+- **Pin action SHAs** in workflows (`uses: actions/checkout@<sha>`), not floating tags.
+- Policy gates: `tflint` (provider-aware lint), `trivy config` / `checkov` (misconfig scan), OPA/`conftest` for org policy ("no public buckets").
+
+| Orchestrator | Fit |
+|---|---|
+| Plain GitHub Actions | Default — full control, free, template in assets/ |
+| Atlantis | Self-hosted PR automation, `atlantis plan/apply` comments, locking per dir |
+| HCP Terraform / Terraform Cloud | Managed runs, Sentinel policy, state hosting; free ≤500 resources |
+| Spacelift / env0 / Digger / Scalr | Commercial Atlantis-likes; Digger runs inside your Actions |
+
+## Testing Quick Reference
+
+```hcl
+# tests/network.tftest.hcl  — native test framework (TF ≥1.6 / OpenTofu ≥1.6)
+variables { cidr = "10.0.0.0/16" }
+
+run "valid_cidr_plan" {
+  command = plan                          # plan = fast unit-ish; apply = real integration
+  assert {
+    condition     = aws_vpc.main.cidr_block == "10.0.0.0/16"
+    error_message = "VPC CIDR did not match input"
+  }
+}
+
+run "rejects_tiny_cidr" {
+  command = plan
+  variables { cidr = "10.0.0.0/30" }
+  expect_failures = [var.cidr]            # asserts the validation block fires
+}
+```
+
+`terraform test` runs every `*.tftest.hcl` under `tests/`; `command = apply` runs create real (then auto-destroyed) infra — use a sandbox account. Mock providers (`mock_provider` blocks, TF ≥1.7) fake apply without credentials. For multi-tool/Go-level orchestration (retry, real HTTP probes), Terratest is the heavyweight alternative — native `terraform test` covers most module CI needs first.
+
+## Secrets Quick Reference
+
+Full detail: [references/security-and-secrets.md](references/security-and-secrets.md).
+
+| Mechanism | Version | What it does |
+|---|---|---|
+| `sensitive = true` | all | Redacts from CLI output **only** — value still plaintext in state |
+| Ephemeral resources (`ephemeral "..."`) | TF ≥1.10 / OpenTofu ≥1.11 | Fetch secret at run time; never persisted to state or plan |
+| Write-only arguments (`password_wo`) | TF ≥1.11 / OpenTofu ≥1.11 | Send secret to provider; never stored in state; rotate via `_wo_version` |
+| SOPS-encrypted tfvars | tool | Secrets encrypted at rest in git; decrypted at plan time |
+| Vault / cloud secret manager | tool | Reference by ID; resource reads secret at boot, TF never sees it |
+| OpenTofu state encryption | OpenTofu ≥1.7 | Client-side AES-GCM encryption of state/plan — **no Terraform equivalent** |
+
+**Rule zero: treat state as secret regardless.** Encrypt the backend (SSE-KMS), restrict IAM on the bucket, never commit `*.tfstate` (gitignore it).
+
+## Terraform vs OpenTofu
+
+| | Terraform | OpenTofu |
+|---|---|---|
+| Licence | **BUSL-1.1** since 1.6 (no production use *competing with HashiCorp*; fine for normal internal use) | **MPL-2.0** — genuinely open source, Linux Foundation |
+| Current | 1.15.x | 1.12.x |
+| Exclusive features | Stacks (HCP-tied), Terraform Cloud agents, `terraform query` | State/plan **encryption**, provider `for_each` iteration, `-exclude` flag, early variable eval in backend/module blocks, OCI registry distribution, `.tofu` file extension |
+| Registry | registry.terraform.io | registry.opentofu.org (mirrors most providers) |
+| Compatibility | — | Forked at 1.5.x; HCL/state compatible for mainstream use, diverging feature-by-feature since |
+
+**Decision:** vendors and anyone redistributing IaC tooling commercially → OpenTofu (licence risk). Teams on HCP Terraform/Sentinel → Terraform. Everyone else: either works; OpenTofu's state encryption is the single biggest technical differentiator. Migration `terraform → tofu` is `tofu init` + state-compatible up to ~1.8-era features; the gap widens each release — migrate early or commit.
+
+## Command Quick Reference
+
+```bash
+terraform init -upgrade               # init / upgrade providers within constraints
+terraform fmt -recursive -check       # CI: fail on unformatted
+terraform validate                    # syntax + internal consistency (no creds needed after init)
+terraform plan -out=tfplan            # save plan for exact-apply
+terraform show -json tfplan | jq      # machine-readable plan (policy tools eat this)
+terraform apply tfplan                # apply EXACTLY the reviewed plan
+terraform plan -detailed-exitcode     # 0 clean / 2 drift — for cron drift checks
+terraform plan -refresh-only          # show drift without proposing config changes
+terraform apply -replace=aws_x.a      # force recreate one resource (replaces old taint)
+terraform state pull > backup.json    # ALWAYS before surgery
+terraform output -json                # consume outputs in scripts
+terraform graph | dot -Tsvg > g.svg   # dependency graph
+tofu init                             # OpenTofu: same verbs throughout
+```

+ 211 - 0
skills/terraform-ops/assets/github-actions-terraform.yml

@@ -0,0 +1,211 @@
+# =============================================================================
+# Terraform CI/CD — PR plan + OIDC apply (GitHub Actions template)
+# =============================================================================
+# What this gives you:
+#   * PRs:   fmt-check -> validate -> tflint -> trivy -> plan -> plan posted as
+#            a PR comment (updated in place, not stacked)
+#   * Merge: fresh plan + apply on main, gated by a GitHub "environment"
+#            (add required reviewers on the environment for a manual approval)
+#   * Auth:  AWS via OIDC — NO long-lived access keys anywhere.
+#
+# Adapt before use:
+#   1. Set WORKING_DIR to your root module (e.g. environments/prod).
+#   2. Replace the two role ARNs (plan = read-only, apply = write).
+#   3. Pin every `uses:` to a commit SHA for your security posture —
+#      tags are shown here for readability; SHA-pin in production:
+#        uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0
+#   4. OpenTofu: swap hashicorp/setup-terraform for opentofu/setup-opentofu
+#      (input `tofu_version`) and `terraform` for `tofu` in run steps.
+#   5. Multi-env monorepo: turn `WORKING_DIR` into a job matrix.
+# =============================================================================
+
+name: terraform
+
+on:
+  pull_request:
+    branches: [main]
+    paths: ["environments/prod/**", "modules/**"]   # plan only when infra changes
+  push:
+    branches: [main]
+    paths: ["environments/prod/**", "modules/**"]
+
+# Least privilege by default; jobs widen only what they need.
+permissions:
+  contents: read
+
+env:
+  TF_VERSION: "1.15.5"            # match required_version in versions.tf
+  WORKING_DIR: environments/prod
+  AWS_REGION: ap-southeast-2
+  TF_IN_AUTOMATION: "true"        # suppresses interactive-use hints in output
+
+# One run per state file at a time. Plans queue; applies never overlap.
+concurrency:
+  group: terraform-prod
+  cancel-in-progress: false       # NEVER cancel a running apply
+
+jobs:
+  # ---------------------------------------------------------------------------
+  # PR: static checks + plan + comment
+  # ---------------------------------------------------------------------------
+  plan:
+    if: github.event_name == 'pull_request'
+    runs-on: ubuntu-latest
+    permissions:
+      contents: read
+      id-token: write             # OIDC token for AWS
+      pull-requests: write        # post the plan comment
+    defaults:
+      run:
+        working-directory: ${{ env.WORKING_DIR }}
+    steps:
+      - uses: actions/checkout@v5
+
+      - uses: hashicorp/setup-terraform@v3
+        with:
+          terraform_version: ${{ env.TF_VERSION }}
+
+      # --- cheap static gates first: fail fast before touching the cloud ---
+      - name: fmt
+        run: terraform fmt -check -recursive -diff
+
+      - name: tflint
+        uses: terraform-linters/setup-tflint@v6
+      - run: tflint --init && tflint --recursive
+        working-directory: ${{ env.WORKING_DIR }}
+
+      - name: trivy misconfig scan
+        uses: aquasecurity/trivy-action@0.33.1
+        with:
+          scan-type: config
+          scan-ref: ${{ env.WORKING_DIR }}
+          exit-code: "1"          # hard gate; set "0" while triaging a baseline
+          severity: HIGH,CRITICAL
+
+      # --- read-only cloud credentials, scoped to PR plans ---
+      # This role's trust policy allows any branch of THIS repo, but the role
+      # itself carries ReadOnlyAccess + state-bucket read. PR code cannot mutate.
+      - name: Configure AWS credentials (plan role, read-only)
+        uses: aws-actions/configure-aws-credentials@v5
+        with:
+          role-to-assume: arn:aws:iam::123456789012:role/github-terraform-plan
+          aws-region: ${{ env.AWS_REGION }}
+
+      - name: init
+        run: terraform init -input=false -lock-timeout=2m
+
+      - name: validate
+        run: terraform validate -no-color
+
+      - name: plan
+        id: plan
+        # -lock=false: a PR plan must never block (or be blocked by) an apply
+        run: |
+          set -o pipefail
+          terraform plan -input=false -no-color -lock=false -out=tfplan 2>&1 \
+            | tee plan.txt
+        continue-on-error: true   # we still want the comment when plan fails
+
+      # Optional org-policy gate against the machine-readable plan:
+      # - run: terraform show -json tfplan > tfplan.json
+      # - run: conftest test --policy ../../policy tfplan.json
+
+      - name: Comment plan on PR
+        uses: actions/github-script@v8
+        env:
+          PLAN_OUTCOME: ${{ steps.plan.outcome }}
+        with:
+          script: |
+            const fs = require('fs');
+            const dir = process.env.WORKING_DIR;
+            const marker = `### Terraform plan — \`${dir}\``;
+            let plan = fs.readFileSync(`${dir}/plan.txt`, 'utf8');
+            // GitHub comment hard cap is 65,536 chars — keep the tail (the summary)
+            if (plan.length > 60000) plan = '... (truncated, see job log)\n' + plan.slice(-60000);
+            const status = process.env.PLAN_OUTCOME === 'success' ? '' : '\n> **PLAN FAILED** — see details below.';
+            const body = `${marker}${status}\n<details><summary>Show plan</summary>\n\n\`\`\`hcl\n${plan}\n\`\`\`\n</details>`;
+            const { data: comments } = await github.rest.issues.listComments({
+              ...context.repo, issue_number: context.issue.number });
+            const prev = comments.find(c => c.body.startsWith(marker));
+            if (prev) {
+              await github.rest.issues.updateComment({ ...context.repo, comment_id: prev.id, body });
+            } else {
+              await github.rest.issues.createComment({ ...context.repo, issue_number: context.issue.number, body });
+            }
+
+      - name: Fail job if plan failed
+        if: steps.plan.outcome == 'failure'
+        run: exit 1
+
+  # ---------------------------------------------------------------------------
+  # Merge to main: fresh plan + apply
+  # ---------------------------------------------------------------------------
+  apply:
+    if: github.event_name == 'push' && github.ref == 'refs/heads/main'
+    runs-on: ubuntu-latest
+    # The "production" environment is where the human gate lives:
+    # repo Settings -> Environments -> production -> required reviewers.
+    environment: production
+    permissions:
+      contents: read
+      id-token: write
+    defaults:
+      run:
+        working-directory: ${{ env.WORKING_DIR }}
+    steps:
+      - uses: actions/checkout@v5
+
+      - uses: hashicorp/setup-terraform@v3
+        with:
+          terraform_version: ${{ env.TF_VERSION }}
+
+      # Apply role: write permissions, trust policy locked to
+      #   repo:myorg/infra:environment:production
+      # so ONLY this gated environment on main can assume it.
+      - name: Configure AWS credentials (apply role)
+        uses: aws-actions/configure-aws-credentials@v5
+        with:
+          role-to-assume: arn:aws:iam::123456789012:role/github-terraform-apply
+          aws-region: ${{ env.AWS_REGION }}
+
+      - name: init
+        run: terraform init -input=false -lock-timeout=5m
+
+      # Fresh plan at merge time. If the world moved since PR review, this plan
+      # differs from the reviewed one — the saved-plan apply below makes that
+      # explicit rather than silently applying something new.
+      - name: plan
+        run: terraform plan -input=false -no-color -lock-timeout=5m -out=tfplan
+
+      - name: apply
+        # Applying the saved plan (not `apply -auto-approve` on config) means
+        # we apply EXACTLY what the step above planned — no second refresh gap.
+        run: terraform apply -input=false -no-color tfplan
+
+  # ---------------------------------------------------------------------------
+  # Nightly drift detection
+  # ---------------------------------------------------------------------------
+  # Move to its own file with `on: schedule` if you prefer; shown here for
+  # completeness. Exit code 2 from -detailed-exitcode means drift.
+  # drift:
+  #   if: github.event_name == 'schedule'
+  #   runs-on: ubuntu-latest
+  #   permissions: { contents: read, id-token: write }
+  #   steps:
+  #     - uses: actions/checkout@v5
+  #     - uses: hashicorp/setup-terraform@v3
+  #       with: { terraform_version: "1.15.5" }
+  #     - uses: aws-actions/configure-aws-credentials@v5
+  #       with:
+  #         role-to-assume: arn:aws:iam::123456789012:role/github-terraform-plan
+  #         aws-region: ap-southeast-2
+  #     - run: terraform init -input=false
+  #       working-directory: environments/prod
+  #     - name: drift check
+  #       working-directory: environments/prod
+  #       run: |
+  #         set +e
+  #         terraform plan -detailed-exitcode -lock=false -input=false -no-color
+  #         code=$?
+  #         [ "$code" -eq 2 ] && { echo "::error::Drift detected in prod"; exit 1; }
+  #         exit $code

+ 182 - 0
skills/terraform-ops/references/cicd-pipelines.md

@@ -0,0 +1,182 @@
+# CI/CD Pipelines
+
+The contract: **every change is planned on the PR, the plan is visible to the reviewer, and apply happens from CI with short-lived credentials.** Humans never run `apply` against shared environments from laptops.
+
+Full workflow template: [../assets/github-actions-terraform.yml](../assets/github-actions-terraform.yml).
+
+## Pipeline Shape
+
+```
+              ┌─ fmt -check ─┐
+PR opened ──> ├─ validate    ├──> tflint ──> trivy/checkov ──> plan ──> plan as PR comment
+              └─ (parallel)  ┘                                              │
+                                                                   reviewer reads plan
+PR merged ──> fresh plan ──> apply (OIDC role, environment gate)
+Nightly   ──> plan -detailed-exitcode ──> exit 2 ⇒ drift alert
+```
+
+Key decisions baked into that shape:
+
+1. **Plan on PR, apply on merge.** The merge re-plans rather than applying the stale PR plan artifact — simpler, and the `concurrency` group serializes applies. If you need apply-exactly-what-was-reviewed, upload `tfplan` as an artifact on the PR and apply that artifact on merge; accept the trade-off that the world may have moved (the apply will fail if so, which is the safe failure).
+2. **One job per root module** (matrix or separate workflows). A monorepo with `environments/{dev,prod}` plans both on PR, applies dev on merge, applies prod behind a GitHub *environment* with required reviewers.
+3. **`concurrency` group per state file** so two merges can't apply concurrently (backend locking would catch it, but failing fast in CI is cleaner).
+
+## OIDC Cloud Auth — no long-lived keys
+
+This is non-negotiable and matches the repo's supply-chain doctrine (short-lived tokens over standing credentials; a leaked workflow can't exfiltrate what doesn't exist). GitHub mints a signed JWT per job; the cloud trusts GitHub's issuer for *specific repos/branches* and returns temporary credentials.
+
+### AWS
+
+```yaml
+permissions:
+  id-token: write      # REQUIRED for OIDC
+  contents: read
+
+steps:
+  - uses: aws-actions/configure-aws-credentials@v5
+    with:
+      role-to-assume: arn:aws:iam::123456789012:role/github-terraform-plan
+      aws-region: ap-southeast-2
+```
+
+Trust policy on the role — scope it tight:
+
+```json
+{
+  "Effect": "Allow",
+  "Principal": { "Federated": "arn:aws:iam::123456789012:oidc-provider/token.actions.githubusercontent.com" },
+  "Action": "sts:AssumeRoleWithWebIdentity",
+  "Condition": {
+    "StringEquals": { "token.actions.githubusercontent.com:aud": "sts.amazonaws.com" },
+    "StringLike":   { "token.actions.githubusercontent.com:sub": "repo:myorg/infra:ref:refs/heads/main" }
+  }
+}
+```
+
+- **Two roles**: a read-only `plan` role (assumable from any branch/PR) and a write `apply` role (assumable only from `ref:refs/heads/main` or `environment:prod`). PR plans from forks then physically cannot mutate anything.
+- Audit the trust federation periodically — stale OIDC subjects (deleted repos, renamed branches) with live trust are exactly the entry point the 2026 supply-chain worms abused. `zizmor` catches `pull_request_target` + OIDC misconfigs statically.
+
+### Azure / GCP equivalents
+
+```yaml
+# Azure — federated credential on an app registration
+- uses: azure/login@v2
+  with:
+    client-id: ${{ vars.AZURE_CLIENT_ID }}
+    tenant-id: ${{ vars.AZURE_TENANT_ID }}
+    subscription-id: ${{ vars.AZURE_SUBSCRIPTION_ID }}
+
+# GCP — workload identity federation
+- uses: google-github-actions/auth@v3
+  with:
+    workload_identity_provider: projects/123/locations/global/workloadIdentityPools/github/providers/myorg
+    service_account: terraform@myproj.iam.gserviceaccount.com
+```
+
+## Plan as PR Comment
+
+The reviewer must see the plan without leaving the PR. Minimal recipe (full version in the asset):
+
+```yaml
+- name: Plan
+  id: plan
+  run: terraform plan -no-color -input=false -out=tfplan 2>&1 | tee plan.txt
+
+- name: Comment plan on PR
+  uses: actions/github-script@v8
+  with:
+    script: |
+      const fs = require('fs');
+      const plan = fs.readFileSync('plan.txt', 'utf8').slice(0, 60000); // comment size cap
+      const body = `### Terraform plan — \`prod\`\n<details><summary>Show plan</summary>\n\n\`\`\`hcl\n${plan}\n\`\`\`\n</details>`;
+      // find-and-update existing comment instead of stacking new ones
+      const { data: comments } = await github.rest.issues.listComments({ ...context.repo, issue_number: context.issue.number });
+      const prev = comments.find(c => c.body.startsWith('### Terraform plan — `prod`'));
+      if (prev) await github.rest.issues.updateComment({ ...context.repo, comment_id: prev.id, body });
+      else await github.rest.issues.createComment({ ...context.repo, issue_number: context.issue.number, body });
+```
+
+Notes:
+
+- **Update-in-place** (find previous comment) or every push spams the PR.
+- Truncate: GitHub comments cap at 65,536 chars. For huge plans link to the job log and post only the resource-change summary (`terraform show -json tfplan | jq -r '.resource_changes[] | "\(.change.actions | join(",")) \(.address)"'`).
+- Plans can leak values — `sensitive = true` redacts in plan output, but data sources and resource attributes are not all marked. Treat the PR comment as visible to everyone with repo read.
+
+## Policy Gates
+
+| Tool | Layer | What it catches | Invocation |
+|---|---|---|---|
+| `terraform fmt -check -recursive` | style | Unformatted code | exit ≠ 0 fails CI |
+| `terraform validate` | syntax | Type errors, bad references | needs `init` (use `-backend=false` for speed) |
+| `tflint` | lint | Provider-aware errors: invalid instance types, deprecated syntax, unused declarations | `tflint --init && tflint --recursive` |
+| `trivy config .` | security | Misconfig: public buckets, open SGs, unencrypted disks (absorbed tfsec's rule set) | exit codes; SARIF upload for code-scanning UI |
+| `checkov -d .` | security | Same space as trivy; bigger policy library, more noise — pick ONE of trivy/checkov | `--soft-fail` while triaging |
+| `conftest test tfplan.json` | org policy | YOUR rules in Rego/OPA: "no resources without tags", "only approved regions", "no IAM * actions" | run against `terraform show -json tfplan` |
+
+```hcl
+# .tflint.hcl
+plugin "aws" {
+  enabled = true
+  version = "0.40.0"
+  source  = "github.com/terraform-linters/tflint-ruleset-aws"
+}
+rule "terraform_required_version" { enabled = true }
+rule "terraform_naming_convention" { enabled = true }
+```
+
+```rego
+# policy/tags.rego — conftest example against plan JSON
+package main
+deny[msg] {
+  rc := input.resource_changes[_]
+  rc.change.actions[_] == "create"
+  not rc.change.after.tags.Environment
+  msg := sprintf("%s: missing required tag 'Environment'", [rc.address])
+}
+```
+
+Layering guidance: fmt/validate/tflint are table stakes on every PR. trivy *or* checkov as the misconfig gate (start `--soft-fail`, ratchet to hard once the baseline is clean). conftest/OPA only when you have genuinely org-specific rules the scanners can't express — it's the highest-maintenance layer.
+
+## Workflow Hardening
+
+Same doctrine as the rest of the repo's supply-chain rules:
+
+- **Pin actions to commit SHAs** — `uses: actions/checkout@08c6903cd8c0fde910a37f88322edcfb5dd907a8 # v5.0.0`, not `@v5`. A hijacked tag is a hijacked pipeline with cloud credentials.
+- `permissions:` block at workflow top, least privilege (`contents: read` default; `id-token: write` only on jobs that auth to cloud; `pull-requests: write` only on the comment job).
+- **Never `pull_request_target` with checkout of PR head** for terraform repos — that's RCE-with-secrets for any fork.
+- Plan jobs from forks: read-only role or no cloud auth at all (validate/lint only).
+- `step-security/harden-runner` for egress allow-listing on apply jobs if you want runtime control.
+- Pin the terraform version: `hashicorp/setup-terraform@<sha>` with `terraform_version: 1.15.5` (or `opentofu/setup-opentofu` with `tofu_version`), matching `required_version`.
+
+## Drift Detection Job
+
+```yaml
+on:
+  schedule: [{ cron: "17 18 * * *" }]   # nightly; odd minute avoids the top-of-hour stampede
+jobs:
+  drift:
+    permissions: { id-token: write, contents: read }
+    steps:
+      # ... checkout, setup, OIDC auth (read-only role), init ...
+      - name: Detect drift
+        run: |
+          set +e
+          terraform plan -detailed-exitcode -lock=false -input=false -no-color
+          code=$?
+          if [ "$code" -eq 2 ]; then echo "::error::Drift detected"; exit 1; fi
+          exit $code
+```
+
+Wire the failure to Slack/issue creation. Exit 2 means *either* console drift or merged-but-unapplied config — both are findings.
+
+## Orchestrator Alternatives
+
+| | Model | Locking | Policy | Cost | Pick when |
+|---|---|---|---|---|---|
+| **GitHub Actions (DIY)** | Workflows you own | `concurrency` groups | tflint/trivy/conftest steps | Free-ish | Default. Full control, template in assets/ |
+| **Atlantis** | Self-hosted server; `atlantis plan` / `atlantis apply` PR comments | Per-directory PR locks (best-in-class) | Custom workflows + conftest | Server you run | Many roots, many PRs, comment-driven culture |
+| **HCP Terraform** | Managed runs + state + UI | Workspace runs serialize | Sentinel / OPA | Free ≤ 500 resources, then $$ | Want managed everything; Sentinel policy; private registry |
+| **Spacelift / env0 / Scalr** | Commercial SaaS orchestrators | Built-in | OPA-based | $$ | Enterprise multi-IaC (Pulumi/CFN too), RBAC needs |
+| **Digger** | Runs inside *your* GitHub Actions | Orchestrated via PR | Pluggable | OSS core | Atlantis UX without hosting a server |
+
+OpenTofu note: Atlantis, Spacelift, env0, Digger all support `tofu` as the binary. HCP Terraform does not run OpenTofu.

+ 223 - 0
skills/terraform-ops/references/module-patterns.md

@@ -0,0 +1,223 @@
+# Module Patterns
+
+Modules are Terraform's unit of reuse. Most module pain comes from treating them like classes (inheritance, deep nesting, wrapping) instead of functions (flat composition, explicit inputs/outputs).
+
+## Root vs Child Modules
+
+| | Root module | Child module |
+|---|---|---|
+| What | The directory you run `terraform` in | Anything called via `module` block |
+| Backend block | Yes — exactly one | **Never** |
+| Provider config (`provider "aws" {}`) | Yes | **Never** — declare `required_providers` only |
+| tfvars | Yes | No (inputs come from the caller) |
+| State | Owns one state file | Lives inside the caller's state |
+
+```hcl
+# modules/network/versions.tf — child module declares NEEDS, not config
+terraform {
+  required_version = ">= 1.10"
+  required_providers {
+    aws = {
+      source  = "hashicorp/aws"
+      version = ">= 6.0, < 8.0"     # modules use RANGES; roots pin tighter
+    }
+  }
+}
+```
+
+A child module with a `provider` block can't be used with `for_each`/`count`/`depends_on` and can't be removed cleanly. Pass aliased providers explicitly when needed:
+
+```hcl
+module "dns" {
+  source    = "../../modules/dns"
+  providers = { aws = aws.us_east_1 }   # ACM certs for CloudFront, etc.
+}
+```
+
+## Composition Over Inheritance
+
+Build small single-purpose modules; compose them in the root by wiring outputs to inputs.
+
+```hcl
+# Root composes — dependencies are explicit data flow
+module "network" {
+  source = "../../modules/network"
+  cidr   = "10.0.0.0/16"
+}
+
+module "database" {
+  source     = "../../modules/database"
+  subnet_ids = module.network.private_subnet_ids   # output -> input wiring
+  vpc_id     = module.network.vpc_id
+}
+
+module "app" {
+  source            = "../../modules/app-service"
+  subnet_ids        = module.network.private_subnet_ids
+  db_connection_arn = module.database.connection_secret_arn
+}
+```
+
+Rules of thumb:
+
+- **Nesting depth ≤ 2.** Root → module → (occasionally) submodule. Deeper means you're rebuilding inheritance and debugging through four layers of variable plumbing.
+- A module should manage a *cohesive* set of resources with a clear lifecycle (a VPC and its subnets/routes — yes; "everything for the app" — no).
+- If a module takes 40 variables and most callers set 3, split it.
+- Don't create a module for a single resource unless it encodes real policy (e.g. an S3 module that enforces encryption + public-access-block on every bucket — that's policy, not wrapping).
+
+### Anti-pattern: thin wrapper modules
+
+```hcl
+# modules/our-vpc/main.tf — adds NOTHING
+module "vpc" {
+  source  = "terraform-aws-modules/vpc/aws"
+  version = "~> 6.0"
+  name    = var.vpc_name        # renamed from `name`. why.
+  cidr    = var.network_cidr    # renamed from `cidr`. why.
+}
+```
+
+Costs: a second version to bump (wrapper lags upstream), a second docs surface, every upstream feature needs a wrapper variable added before anyone can use it, and `moved`-block refactors in upstream don't propagate. **Call the upstream module directly from the root.** A wrapper earns its existence only when it enforces organizational policy (mandatory tags, forced encryption, restricted instance types) — and then it should *say so* in its README.
+
+## Variable Design
+
+### Validation — fail at plan, not mid-apply
+
+```hcl
+variable "environment" {
+  type        = string
+  description = "Deployment environment."
+  validation {
+    condition     = contains(["dev", "staging", "prod"], var.environment)
+    error_message = "environment must be one of: dev, staging, prod."
+  }
+}
+
+variable "cidr" {
+  type = string
+  validation {
+    condition     = can(cidrhost(var.cidr, 0)) && tonumber(split("/", var.cidr)[1]) <= 24
+    error_message = "cidr must be a valid IPv4 CIDR no smaller than /24."
+  }
+}
+```
+
+Since TF 1.9, `condition` can reference *other* variables and data — cross-field validation lives on the variable, not buried in a `precondition`.
+
+### optional() + nullable
+
+```hcl
+variable "logging" {
+  description = "Access-logging config. Omit fields for defaults."
+  type = object({
+    enabled       = optional(bool, true)
+    bucket        = optional(string)          # null when omitted
+    prefix        = optional(string, "logs/")
+    sample_rate   = optional(number, 1.0)
+  })
+  default  = {}
+  nullable = false      # caller may omit the variable, but may NOT pass logging = null
+}
+```
+
+- `optional(type, default)` lets callers pass partial objects — the killer feature for config-object variables. Without defaults you'd force every caller to spell out every field.
+- `nullable = false` means an explicit `null` is rejected and the variable's own `default` is used instead. Use it on almost every variable: it converts "caller passed null, module exploded on `var.x.enabled`" into a plan-time error.
+- Gotcha: `optional(string)` with no default yields `null` — guard with `coalesce(...)` or `try(...)` before interpolating.
+
+### Variable hygiene
+
+| Rule | Why |
+|---|---|
+| Every variable has `description` | It's the module's API doc (`terraform-docs` renders it) |
+| `sensitive = true` on secret inputs | Keeps values out of CLI output (NOT out of state — see security-and-secrets.md) |
+| Prefer typed objects over `map(any)` | `any` defers errors to deep inside the module |
+| No "pass-through everything" variables | A `extra_settings = any` variable is an API you can never change |
+| Defaults = safe choice, not common choice | Default to encrypted/private/protected; make callers opt *out* loudly |
+
+## Output Contracts
+
+Outputs are the module's public API. Consumers wire them into other modules and remote-state reads — changing one is a breaking change.
+
+```hcl
+output "vpc_id" {
+  description = "ID of the created VPC."
+  value       = aws_vpc.main.id
+}
+
+output "private_subnet_ids" {
+  description = "Private subnet IDs, keyed by AZ."
+  value       = { for k, s in aws_subnet.private : k => s.id }
+}
+
+output "db_endpoint" {
+  description = "Writer endpoint. SENSITIVE: contains hostname only, no creds."
+  value       = aws_rds_cluster.main.endpoint
+}
+```
+
+- Output the **identifiers consumers need** (IDs, ARNs, endpoints, security-group IDs) — not whole resource objects (`value = aws_vpc.main` couples consumers to the provider schema and bloats state).
+- `sensitive = true` propagates: an output derived from a sensitive value must itself be marked sensitive or plan errors.
+- Treat output removal/rename like an API break: semver-major the module, or keep the old output as an alias for one cycle.
+
+## Versioning and Pinning
+
+```hcl
+# Registry module — minor-float
+module "vpc" {
+  source  = "terraform-aws-modules/vpc/aws"
+  version = "~> 6.0"        # >= 6.0.0, < 7.0.0
+}
+
+# Git module — pin a TAG (never a branch)
+module "internal" {
+  source = "git::https://github.com/myorg/tf-modules.git//network?ref=v2.3.1"
+}
+```
+
+| Constraint | Meaning | Use for |
+|---|---|---|
+| `~> 6.0` | ≥ 6.0, < 7.0 | Registry modules/providers in shared modules |
+| `~> 6.12.0` | ≥ 6.12.0, < 6.13.0 | Conservative prod roots |
+| `= 6.12.1` / `?ref=v2.3.1` | Exact | Prod roots wanting byte-identical builds |
+| `>= 6.0` (open-ended) | Anything newer | **Never in prod** — a breaking major auto-arrives |
+
+- **Commit `.terraform.lock.hcl`.** It pins exact provider versions + hashes; `terraform init -upgrade` is the deliberate act of moving within constraints. This is the same supply-chain posture as any other lockfile: a pin only protects you if it pre-dates a compromise and you don't run unconstrained upgrades in CI.
+- Run `terraform providers lock -platform=linux_amd64 -platform=darwin_arm64 -platform=windows_amd64` so the lockfile carries hashes for every platform your team + CI uses (OpenTofu 1.12 does this automatically at init).
+- Internal module registries: HCP Terraform private registry, or plain git tags + a `modules/` monorepo. Git tags are fine; the registry's value is the version-constraint syntax and docs rendering.
+
+## Module Repo Layout
+
+```
+terraform-aws-network/            # one module per repo (registry-publishable), or modules/ monorepo
+├── main.tf
+├── variables.tf
+├── outputs.tf
+├── versions.tf
+├── README.md                     # terraform-docs generated
+├── examples/
+│   └── complete/                 # a runnable root that exercises the module — doubles as docs + test fixture
+│       └── main.tf
+└── tests/
+    └── network.tftest.hcl        # native tests (see SKILL.md Testing section)
+```
+
+`terraform-docs markdown table . > README.md` keeps docs honest — wire it as a pre-commit hook or CI check.
+
+## count vs for_each (the index-shift footgun)
+
+```hcl
+# BAD: count over a list
+resource "aws_subnet" "private" {
+  count      = length(var.subnet_cidrs)        # remove element 0 ->
+  cidr_block = var.subnet_cidrs[count.index]   # every subnet re-addresses -> destroy cascade
+}
+
+# GOOD: for_each over a map with stable keys
+resource "aws_subnet" "private" {
+  for_each   = var.subnets                      # { "ap-southeast-2a" = "10.0.1.0/24", ... }
+  cidr_block = each.value
+  availability_zone = each.key
+}
+```
+
+`count` is fine for "0 or 1 of this" conditionals (`count = var.enabled ? 1 : 0`) — though even there, `for_each = var.enabled ? { main = true } : {}` keeps addresses stable if it might ever become "n of this". Migrating existing `count` resources to `for_each`: write `moved` blocks for each index→key pair (see state-management.md) so nothing is destroyed.

+ 156 - 0
skills/terraform-ops/references/security-and-secrets.md

@@ -0,0 +1,156 @@
+# Security and Secrets
+
+The uncomfortable truth first: **Terraform state stores resource attributes in plaintext JSON** — including any password, key, token, or connection string a provider ever returned. `sensitive = true` changes what's *printed*, not what's *stored*. Every secrets strategy below is a variation on "make sure the secret never enters state, and harden state anyway."
+
+## Threat Model
+
+| Surface | Exposure | Mitigation |
+|---|---|---|
+| State file | Plaintext attributes of every resource | Encrypted backend + IAM; OpenTofu state encryption; keep secrets out entirely (below) |
+| Plan files (`tfplan`, JSON) | Contain proposed values, incl. some sensitive ones | Treat plan artifacts as secrets; short retention on CI artifacts |
+| CLI / CI logs | Values interpolated into output | `sensitive = true`, `-no-color` log review, masked CI vars |
+| PR plan comments | Anyone with repo read | Summarize rather than full-dump for sensitive roots |
+| `.tfvars` in git | Whatever you put there | Never commit secret tfvars; SOPS-encrypt or env-var inject |
+| Provider credentials | Long-lived keys in CI secrets | OIDC short-lived tokens (see cicd-pipelines.md) |
+
+## What `sensitive = true` Actually Does (and doesn't)
+
+```hcl
+variable "db_password" {
+  type      = string
+  sensitive = true        # plan/apply output prints (sensitive value)
+}
+
+output "endpoint" {
+  value     = "${aws_db_instance.main.address}:${var.db_password}"  # ERROR unless...
+  sensitive = true        # ...the output is marked too (sensitivity propagates)
+}
+```
+
+Does: redact from `plan`/`apply`/`output` human output; propagate taint through expressions; force derived outputs to be marked.
+Does **not**: encrypt anything; remove the value from state (`terraform state pull | jq` shows it plaintext); redact from `terraform output -json` (explicitly prints sensitive values); stop a provider logging it at TRACE.
+
+`ephemeral = true` on variables (TF ≥ 1.10) goes further: the value may only flow into ephemeral contexts (write-only args, provider config, locals marked ephemeral) and is never written to state or plan.
+
+## Ephemeral Resources (TF ≥ 1.10, OpenTofu ≥ 1.11)
+
+Ephemeral resources open/fetch a value during the run and are **never persisted to state or plan**. The first-class pattern for "read a secret from a manager at apply time."
+
+```hcl
+# Fetch the secret ephemerally — exists only for the duration of the run
+ephemeral "aws_secretsmanager_secret_version" "db" {
+  secret_id = aws_secretsmanager_secret.db.id
+}
+
+# Use it via a write-only argument so it never lands in state either
+resource "aws_db_instance" "main" {
+  # ...
+  password_wo         = ephemeral.aws_secretsmanager_secret_version.db.secret_string
+  password_wo_version = 1
+}
+```
+
+Available ephemeral resource types include `aws_secretsmanager_secret_version`, `aws_ssm_parameter`, `azurerm_key_vault_secret`, `google_secret_manager_secret_version`, Vault's `vault_kv_secret_v2`, and `random_password` (ephemeral variant). Ephemeral *values* can feed: write-only arguments, provider configuration, `terraform_data` triggers — anything that itself persists will error if you try.
+
+Contrast with the classic `data "aws_secretsmanager_secret_version"` — a data source's result **is stored in state**, which quietly copied your secret into the state file. Migrate those reads to `ephemeral` blocks.
+
+## Write-Only Arguments (TF ≥ 1.11, OpenTofu ≥ 1.11)
+
+Provider-defined `*_wo` arguments accept a value, hand it to the API, and store **nothing** in state. Because nothing is stored, Terraform can't diff them — that's what the paired `*_wo_version` integer is for:
+
+```hcl
+resource "aws_db_instance" "main" {
+  password_wo         = var.db_password          # never in state
+  password_wo_version = 2                        # bump to push a new password
+}
+```
+
+- Rotate by incrementing `_wo_version` (or wiring it to a rotation timestamp/secret version).
+- Write-only args accept ephemeral values — the combination above (ephemeral read → write-only write) is the zero-secrets-in-state gold standard.
+- Caveat: not every provider/resource has `_wo` variants yet; check the provider docs. Where absent, fall back to the secret-manager-reference pattern below.
+
+## Pattern Ladder (best to worst)
+
+```
+1. Secret never touches Terraform at all
+   App reads from Secrets Manager/Vault at BOOT using its IAM role;
+   Terraform only creates the (empty or rotated-out-of-band) secret container + IAM.
+
+2. Ephemeral read -> write-only write          (TF >= 1.11)
+   Secret transits the run in memory only. Nothing in state or plan.
+
+3. Secret manager reference via data source    (any version)
+   data "aws_secretsmanager_secret_version" -- secret IS in state,
+   but at least it's centrally rotated + audited. Encrypt state, restrict IAM.
+
+4. TF_VAR_ env injection from CI secret store  (any version)
+   Keeps secrets out of git; still lands in state if assigned to a resource attribute.
+
+5. SOPS-encrypted tfvars in git
+   Good at-rest story for git; same state caveat as 4.
+
+6. Plaintext in tfvars/locals committed to git
+   Never. Rotating means rewriting git history.
+```
+
+### Vault / cloud secret manager integration
+
+```hcl
+# Terraform creates the container + access policy; VALUE is set out-of-band or by rotation lambda
+resource "aws_secretsmanager_secret" "db" {
+  name       = "prod/db/password"
+  kms_key_id = aws_kms_key.secrets.arn
+}
+
+resource "aws_iam_role_policy" "app_reads_secret" {
+  role   = aws_iam_role.app.id
+  policy = jsonencode({
+    Version = "2012-10-17"
+    Statement = [{ Effect = "Allow", Action = "secretsmanager:GetSecretValue",
+                   Resource = aws_secretsmanager_secret.db.arn }]
+  })
+}
+# App fetches the secret at startup. Terraform never knows the value.
+```
+
+Vault: prefer short-TTL dynamic secrets (`vault_database_secret_backend_role`) so even a leaked credential expires in minutes. The Vault provider's classic data sources persist to state — use the ephemeral variants on TF ≥ 1.10.
+
+### SOPS pattern
+
+```bash
+# Encrypt env tfvars with a KMS key; ciphertext is committable
+sops --encrypt --kms arn:aws:kms:...:key/... prod.tfvars.json > prod.sops.tfvars.json
+```
+
+```hcl
+# Via carlpett/sops provider
+data "sops_file" "secrets" { source_file = "prod.sops.tfvars.json" }
+locals { db_password = data.sops_file.secrets.data["db_password"] }
+```
+
+Honest accounting: decrypted values still flow into plan/state unless they terminate in write-only arguments. SOPS solves *git at-rest*, not *state at-rest*.
+
+## Hardening State Itself
+
+Do all of this regardless of which pattern above you use:
+
+- Backend encryption: S3 SSE-KMS with a dedicated CMK; bucket policy denying un-encrypted puts; `azurerm`/`gcs` with CMK.
+- IAM: state bucket readable only by the CI roles + break-glass group. State read access ≈ secret read access — treat the grant accordingly.
+- Versioning on (recovery) + access logging on the bucket (audit).
+- `.gitignore`: `*.tfstate`, `*.tfstate.*`, `*.tfplan`, `.terraform/`, and crash logs (`crash.log` can embed values).
+- **OpenTofu state encryption** (≥ 1.7) — client-side AES-GCM over state *and* plan files, key from PBKDF2 passphrase, AWS/GCP KMS, Azure Key Vault, or external program. The strongest state story available; Terraform has no equivalent (config sample in state-management.md). Plan key rotation: `encryption` supports a `fallback` method so old state remains readable during rotation.
+
+## Provider Credential Hygiene
+
+| Don't | Do |
+|---|---|
+| `provider "aws" { access_key = "..." }` hardcoded | Ambient auth: OIDC in CI, SSO/instance profiles locally |
+| Long-lived `AWS_ACCESS_KEY_ID` in CI secrets | OIDC `role-to-assume` (see cicd-pipelines.md) |
+| One god-role for plan and apply | Read-only plan role; write apply role gated to main/environment |
+| Shared human credentials for break-glass | Named identities + audited assume-role |
+
+## Scanning and Gates
+
+- `trivy config .` / `checkov -d .` catch *misconfigurations* (public buckets, `0.0.0.0/0` ingress, unencrypted volumes) — wire into PR CI (see cicd-pipelines.md).
+- `gitleaks` / push-gates catch secrets *in the repo* — tfvars are a classic leak vector.
+- `terraform providers` + lockfile review on provider bumps: providers execute arbitrary code on your CI runner with cloud credentials. A provider is a dependency — the repo's supply-chain rules (cooldown, behavioural scan before adopting unfamiliar providers from the registry) apply in full.

+ 226 - 0
skills/terraform-ops/references/state-management.md

@@ -0,0 +1,226 @@
+# State Management
+
+Terraform/OpenTofu state is the mapping between config addresses and real infrastructure. It is the single most dangerous file in the project: lose it and Terraform forgets your infra; corrupt it and applies destroy the wrong things; leak it and every secret a provider ever returned is exposed.
+
+## Remote Backends
+
+Never keep state local for anything shared or production. Remote backends give durability, locking, and team access.
+
+### S3 (AWS) — current recommended shape
+
+```hcl
+terraform {
+  backend "s3" {
+    bucket       = "myorg-tfstate"
+    key          = "prod/network/terraform.tfstate"   # one key per root module
+    region       = "ap-southeast-2"
+    encrypt      = true                # SSE; pair with bucket-default SSE-KMS
+    use_lockfile = true               # S3-native locking (TF >= 1.10)
+  }
+}
+```
+
+- **`use_lockfile = true` replaces the DynamoDB lock table** (Terraform ≥ 1.10, OpenTofu ≥ 1.10). It uses S3 conditional writes to create a `.tflock` object next to the state key. The old `dynamodb_table` argument still works and was the standard for a decade — you'll see it everywhere — but new setups don't need the extra table. During migration you can set both; remove `dynamodb_table` once all collaborators are ≥ 1.10.
+- Bucket hygiene: versioning **on** (state history = your undo), default SSE-KMS, block public access, lifecycle rule to expire old noncurrent versions after ~90 days, bucket policy restricting to the CI role + break-glass humans.
+- One bucket per org/account is fine; isolation comes from `key` prefixes + IAM conditions on the prefix.
+
+### azurerm
+
+```hcl
+terraform {
+  backend "azurerm" {
+    resource_group_name  = "rg-tfstate"
+    storage_account_name = "myorgtfstate"
+    container_name       = "tfstate"
+    key                  = "prod.network.tfstate"
+    use_azuread_auth     = true        # RBAC instead of storage keys
+  }
+}
+```
+
+Locking is native via blob leases — nothing extra to configure. Prefer `use_azuread_auth = true` so CI uses OIDC-federated identity, not storage account keys.
+
+### gcs
+
+```hcl
+terraform {
+  backend "gcs" {
+    bucket = "myorg-tfstate"
+    prefix = "prod/network"
+  }
+}
+```
+
+Locking native via Cloud Storage generation preconditions. Enable object versioning on the bucket. Use workload identity federation in CI.
+
+### HCP Terraform / Terraform Cloud
+
+```hcl
+terraform {
+  cloud {
+    organization = "myorg"
+    workspaces { name = "prod-network" }
+  }
+}
+```
+
+State hosted, encrypted, versioned, locked by the platform; runs can execute remotely. Free tier covers ≤ 500 managed resources. Note: OpenTofu cannot use HCP Terraform as a backend (it can use the generic `remote` backend against compatible APIs).
+
+### Backend selection
+
+| Backend | Locking | Encryption at rest | Best when |
+|---|---|---|---|
+| `s3` + `use_lockfile` | S3 conditional writes | SSE-KMS | AWS shops (default) |
+| `s3` + `dynamodb_table` | DynamoDB | SSE-KMS | Legacy / mixed TF < 1.10 teams |
+| `azurerm` | Blob lease (built-in) | Platform + CMK | Azure shops |
+| `gcs` | Generation precondition (built-in) | Platform + CMEK | GCP shops |
+| HCP Terraform | Platform | Platform | Want managed runs/policies too |
+| `pg` / `consul` / `kubernetes` | Yes | Varies | Niche; self-hosted constraints |
+
+### Migrating backends
+
+```bash
+# 1. Add/replace the backend block, then:
+terraform init -migrate-state          # copies state old -> new, prompts
+# 2. Verify: terraform state list shows everything
+# 3. Delete the old state only after a clean plan
+```
+
+## State Locking
+
+Locking prevents two concurrent applies corrupting state. It is **not** optional for teams.
+
+- `terraform apply` acquires the lock automatically; a crash can leave it stuck.
+- `terraform force-unlock <LOCK_ID>` — only after confirming no run is actually live (check CI). The lock ID is printed in the error.
+- `-lock-timeout=5m` in CI lets queued runs wait instead of failing instantly.
+- Locking protects against concurrent *writes*; it does not serialize plans — two PRs can both plan green and conflict at apply. Solve at the orchestration layer (Atlantis dir-locks, Actions concurrency groups — see cicd-pipelines.md).
+
+## Declarative State Changes: moved / import / removed
+
+These blocks are the modern, code-reviewable replacements for CLI state surgery. They live in config, show up in diffs, and execute as part of a normal plan/apply.
+
+### `moved` — refactor without destroy
+
+```hcl
+# Renamed a resource
+moved {
+  from = aws_instance.web
+  to   = aws_instance.frontend
+}
+
+# Moved a resource into a module
+moved {
+  from = aws_vpc.main
+  to   = module.network.aws_vpc.main
+}
+
+# count -> for_each migration
+moved {
+  from = aws_subnet.private[0]
+  to   = aws_subnet.private["ap-southeast-2a"]
+}
+```
+
+Plan shows `# aws_instance.web has moved to aws_instance.frontend` instead of destroy+create. Keep `moved` blocks around for at least one release cycle of a shared module so downstream consumers also get the move; then prune.
+
+### `import` — adopt existing infrastructure (TF ≥ 1.5)
+
+```hcl
+import {
+  to = aws_s3_bucket.legacy
+  id = "myorg-legacy-bucket"
+}
+```
+
+```bash
+# Generate matching config for resources you haven't written yet:
+terraform plan -generate-config-out=generated.tf
+# Review generated.tf, clean it up (it's verbose), move into proper files, apply.
+```
+
+Why blocks beat `terraform import` CLI: the import is planned (you see exactly what will be adopted and whether config matches reality) and reviewed in the PR. The CLI command mutates state immediately with no plan. Import blocks also support `for_each` (TF ≥ 1.7) for bulk adoption.
+
+After a successful apply, delete the `import` blocks — they're one-shot.
+
+### `removed` — forget without destroying (TF ≥ 1.7, OpenTofu ≥ 1.8)
+
+```hcl
+removed {
+  from = aws_db_instance.legacy
+  lifecycle {
+    destroy = false      # remove from state, leave the real DB alone
+  }
+}
+```
+
+Use when handing a resource to another team/state, or un-managing something Terraform should no longer own. The declarative version of `terraform state rm`. OpenTofu 1.12 additionally allows `lifecycle { destroy = false }` directly on a resource being deleted from config.
+
+## State Surgery (CLI) — and when NOT to
+
+```bash
+terraform state list                      # enumerate addresses (safe)
+terraform state show aws_vpc.main         # inspect one resource (safe)
+terraform state pull > backup.tfstate     # ALWAYS do this first
+terraform state mv aws_x.a aws_x.b        # immediate, unreviewed rename
+terraform state mv aws_x.a 'module.m.aws_x.a'
+terraform state rm aws_x.a                # forget (does NOT destroy)
+terraform state push fixed.tfstate        # overwrite remote state (extreme)
+```
+
+**Decision rule:** if a `moved`, `removed`, or `import` block can express the change — use the block. Reasons:
+
+1. Blocks are planned and reviewed; CLI mutations are instant and invisible to reviewers.
+2. Blocks are idempotent across the team; a CLI command run by one person leaves everyone else's mental model stale.
+3. A typo in `state mv` orphans a real resource: Terraform now plans to *create* a duplicate while the original drifts unmanaged.
+
+**Legitimate CLI surgery:**
+
+- Splitting one state into two roots (`state mv -state-out=...` or pull/edit/push between backends).
+- Recovering from a half-failed migration or a provider bug that wedged an address.
+- Anything on Terraform < 1.5 (no blocks available).
+
+**Protocol for any surgery:** `state pull` a timestamped backup → make the change → `terraform plan` must come back clean (or exactly the expected diff) → only then walk away.
+
+## Drift Detection
+
+Infrastructure changes outside Terraform (console edits, autoscaling, other tooling). Detect it before it bites an apply.
+
+```bash
+terraform plan -detailed-exitcode -lock=false
+# exit 0 -> no changes (clean)
+# exit 1 -> error
+# exit 2 -> changes pending (drift OR un-applied config)
+```
+
+- `-lock=false` so a read-only drift check never blocks a real apply.
+- `terraform plan -refresh-only` shows only *state vs reality* differences without proposing config-driven changes — cleaner signal for "who touched the console".
+- Cron it: nightly scheduled CI job, alert on exit 2 (recipe in cicd-pipelines.md).
+- Chronic drift on specific attributes → either codify the external process or `lifecycle { ignore_changes = [...] }` deliberately (document why).
+
+## State File Hygiene
+
+| Rule | Why |
+|---|---|
+| `*.tfstate*` in `.gitignore` | Local state in git = secrets in git history forever |
+| One state per blast-radius unit | Network / data / app split — a bad apply can't take everything |
+| Keep states small (< ~100 resources guideline) | Plan time, lock contention, blast radius all scale with state size |
+| Versioned backend bucket | `state push` mistakes become a revert, not a rebuild |
+| Treat state as secret | Provider attributes (DB passwords, certs) sit in plaintext JSON — see security-and-secrets.md |
+| OpenTofu: consider state encryption | Client-side AES-GCM, keys via PBKDF2/KMS — defence even if the bucket leaks |
+
+```hcl
+# OpenTofu >= 1.7 only — state + plan encryption
+terraform {
+  encryption {
+    key_provider "aws_kms" "main" {
+      kms_key_id = "arn:aws:kms:...:key/..."
+      key_spec   = "AES_256"
+    }
+    method "aes_gcm" "main" {
+      keys = key_provider.aws_kms.main
+    }
+    state { method = method.aes_gcm.main }
+    plan  { method = method.aes_gcm.main }
+  }
+}
+```

+ 0 - 0
skills/claude-code-hooks/scripts/.gitkeep → skills/terraform-ops/scripts/.gitkeep