Browse Source

feat(skills): add fleet-worker for orchestrated cheap-worker delegation

Delegate tool-using, multi-step tasks to a cheaper headless Claude Code worker
on any Anthropic-compatible endpoint (GLM/z.ai default), isolated per task in a
git worktree + CLAUDE_CONFIG_DIR (the load-bearing auth-isolation fix). Pairs
with fleet-ops: fleet-worker spawns and gates (fleet-collect), fleet-ops lands.

Ships bash + PowerShell launchers, fleet-collect (result gate), fleet-doctor
(offline structural + live endpoint staleness verifier, with the oauth-trap
preflight), a sanitized design spec, the fleet-ops handoff recipes, and a
34-assertion offline self-test. Provider-agnostic framing with a know-your-terms
note. Wires into the doc-drift gate (skills 93->94), CHANGELOG, check-resources.

Battle-tested across 6 autonomous builds (NASA + PokeAPI; GLM/Opus/Sonnet).

Co-Authored-By: Claude Opus 4.8 <noreply@anthropic.com>
0xDarkMatter 1 week ago
parent
commit
d69455a88f

+ 1 - 1
AGENTS.md

@@ -5,7 +5,7 @@
 This is **claude-mods** - a collection of custom extensions for Claude Code:
 - **3 expert agents** for pure context-isolation/worker roles (git-agent, firecrawl-expert, project-organizer) - every domain-knowledge agent became an `-ops` skill (v3.0, skills-first)
 - **2 commands** for session management (/sync, /save)
-- **93 skills** for CLI tools, patterns, workflows, and development tasks (incl. `ffmpeg-ops` for probe-first media processing and EDL-driven editing, `supply-chain-defense` for behavioural-first dependency security, `prompt-injection-defense` for instruction-integrity scanning, `pypi-ops` for OIDC Trusted Publishing to PyPI, `net-ops` for network troubleshooting, `windows-ops` / `mac-ops` for workstation diagnostics)
+- **94 skills** for CLI tools, patterns, workflows, and development tasks (incl. `ffmpeg-ops` for probe-first media processing and EDL-driven editing, `supply-chain-defense` for behavioural-first dependency security, `prompt-injection-defense` for instruction-integrity scanning, `pypi-ops` for OIDC Trusted Publishing to PyPI, `net-ops` for network troubleshooting, `windows-ops` / `mac-ops` for workstation diagnostics)
 - **13 output styles** for response personality (Vesper, Spartan, Mentor, Executive, Pair, Atlas, Coach, Harbour, Meridian, Noir, Roast, Sage, Scout)
 - **11 hooks** for pre-commit linting, post-edit formatting, dangerous command warnings, uv enforcement, dependency-install + manifest-edit supply-chain advisories, hidden-Unicode scanning (session-start + pre-commit), live config-change + worktree guards, and pmail notifications - security set auto-wired via plugin hooks.json
 - **Pigeon** inter-session messaging (`pigeon send/read/reply`) - SQLite-backed pmail at `~/.claude/pmail.db`

+ 17 - 0
CHANGELOG.md

@@ -7,6 +7,23 @@ feature releases live in the README "Recent Updates" section.
 
 ## [Unreleased]
 
+### Added
+- **`fleet-worker` skill** - delegate tool-using, multi-step agent tasks to a cheaper
+  headless Claude Code worker on a non-Anthropic model (GLM via z.ai by default; any
+  Anthropic-compatible endpoint via `ANTHROPIC_BASE_URL`). Each worker is a real
+  `claude -p` carrying Claude Code's full tool harness but a "grunt" brain, isolated
+  in its own git worktree + `CLAUDE_CONFIG_DIR` (the load-bearing auth-isolation
+  finding - without it a host subscription token leaks to the endpoint and 401s). An
+  Opus orchestrator fans workers out in parallel, gates raw results with
+  `fleet-collect.sh` (the `is_error`-not-`subtype` footgun, encoded), and hands the
+  winning branches to `fleet-ops` for test-gated landing - fleet-worker is the spawn
+  layer fleet-ops disowns. Ships bash + PowerShell launchers, `fleet-doctor.sh`
+  (offline structural / `--live` endpoint staleness verifier + the oauth-trap
+  preflight), a sanitized design spec, the fleet-ops handoff recipes, and a
+  34-assertion offline self-test. Provider-agnostic framing; carries a "know your
+  terms" note (custom endpoints are documented Claude Code config; keep the
+  orchestrator interactive or on an API key per Anthropic's automated-access terms).
+
 ## [3.1.0] - 2026-06-17
 
 ### Added

+ 6 - 5
README.md

@@ -12,13 +12,13 @@
 
 > *A comprehensive extension toolkit that transforms Claude Code into a specialized development powerhouse.*
 
-**claude-mods** is a production-ready plugin that extends Claude Code with 91 specialized skills, 3 expert agents, 13 output styles, 11 hooks, and modern CLI tools designed for real-world development workflows. Whether you're debugging React hooks, optimizing PostgreSQL queries, or building production CLI applications, this toolkit equips Claude with the domain expertise and procedural knowledge to work at expert level across multiple technology stacks.
+**claude-mods** is a production-ready plugin that extends Claude Code with 94 specialized skills, 3 expert agents, 13 output styles, 11 hooks, and modern CLI tools designed for real-world development workflows. Whether you're debugging React hooks, optimizing PostgreSQL queries, or building production CLI applications, this toolkit equips Claude with the domain expertise and procedural knowledge to work at expert level across multiple technology stacks.
 
 Built on the [Agent Skills specification](https://agentskills.io/specification) (an open standard backed by Anthropic, Vercel, Google, Microsoft, and 40+ agent platforms), claude-mods fills critical gaps in Claude Code's capabilities: persistent session state that survives across machines, on-demand expert knowledge for specialized domains, token-efficient modern CLI tools (10-100x faster than traditional alternatives), and proven workflow patterns for TDD, code review, and feature development. The toolkit implements Anthropic's [recommended patterns for long-running agents](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents), ensuring your development context never vanishes when sessions end.
 
 From Python async patterns to Rust ownership models, from AWS Fargate deployments to Craft CMS development - claude-mods provides the specialized knowledge and tools that transform Claude from a general-purpose assistant into a domain expert who understands your stack, remembers your workflow, and ships production code.
 
-**3 agents. 93 skills. 13 styles. 11 hooks. 7 rules. One install.**
+**3 agents. 94 skills. 13 styles. 11 hooks. 7 rules. One install.**
 
 ## Recent Updates
 
@@ -73,7 +73,7 @@ Claude Code is powerful out of the box, but it has gaps. This toolkit fills them
 
 - **Session continuity** — Tasks vanish when sessions end. We fix that with `/save` and `/sync`, implementing Anthropic's [recommended pattern](https://www.anthropic.com/engineering/effective-harnesses-for-long-running-agents) for long-running agents.
 
-- **Expert-level knowledge on demand** — 91 on-demand skills covering React, TypeScript, Python, Go, Rust, PostgreSQL, and more, plus 3 specialized agents reserved for genuine context-isolation/worker roles (git operations, web scraping, project reorganization). Skills-first: knowledge loads when relevant instead of living in heavyweight agent prompts.
+- **Expert-level knowledge on demand** — 94 on-demand skills covering React, TypeScript, Python, Go, Rust, PostgreSQL, and more, plus 3 specialized agents reserved for genuine context-isolation/worker roles (git operations, web scraping, project reorganization). Skills-first: knowledge loads when relevant instead of living in heavyweight agent prompts.
 
 - **Modern CLI tools** — Stop using `grep`, `find`, and `cat`. Our rules automatically prefer `ripgrep`, `fd`, `eza`, and `bat` — 10-100x faster and token-efficient.
 
@@ -98,7 +98,7 @@ claude-mods/
 ├── .claude-plugin/     # Plugin metadata
 ├── agents/             # Expert subagents (3)
 ├── commands/           # Slash commands (2)
-├── skills/             # Custom skills (91)
+├── skills/             # Custom skills (94)
 ├── output-styles/      # Response personalities
 ├── hooks/              # Hook examples & docs
 ├── rules/              # Claude Code rules
@@ -278,6 +278,7 @@ See [skill-creator](skills/skill-creator/) for the complete guide.
 | [github-ops](skills/github-ops/) | GitHub remote operations - repo creation, releases, metadata, README Recent Updates convention |
 | [push-gate](skills/push-gate/) | Pre-push safety gate - gitleaks + regex secret scan, forbidden-file check, no bypass |
 | [fleet-ops](skills/fleet-ops/) | Manage a fleet of concurrent Claude sessions - landing queue with test gate, pre-land scrub (experimental) |
+| [fleet-worker](skills/fleet-worker/) | Delegate tasks to cheap headless GLM (or any Anthropic-compatible) workers - per-task git worktree + isolated config, result gating, fan-out that hands winning branches to fleet-ops landing |
 | [summon](skills/summon/) | Transfer Claude Desktop Code-tab sessions between accounts - push/pull with picker |
 | [doc-scanner](skills/doc-scanner/) | Scan and synthesize project documentation |
 | [adr-ops](skills/adr-ops/) | Architecture Decision Records - when-to-write, canonical format, supersession lifecycle, scaffold/index/lint tools |
@@ -558,7 +559,7 @@ When using multiple MCP servers (Chrome DevTools, Vibe Kanban, etc.), their tool
 
 ### Skill Description Budget
 
-With 80+ skills installed (this plugin alone ships 81), skill descriptions can overflow the listing budget. All skill names are always listed, but descriptions share a budget of **1% of the model context window** — on overflow, least-invoked skills lose their descriptions first and **silently stop auto-triggering** (explicit `/name` invocation still works). Each skill's combined `description` + `when_to_use` is also truncated at **1,536 chars**, so trigger phrases belong at the front.
+With 90+ skills installed (this plugin alone ships 94), skill descriptions can overflow the listing budget. All skill names are always listed, but descriptions share a budget of **1% of the model context window** — on overflow, least-invoked skills lose their descriptions first and **silently stop auto-triggering** (explicit `/name` invocation still works). Each skill's combined `description` + `when_to_use` is also truncated at **1,536 chars**, so trigger phrases belong at the front.
 
 - **Check:** run `/doctor` — it shows whether the budget is overflowing and which skills are affected.
 - **Fix:** demote or disable skills you don't use via `skillOverrides` in settings (`"on"` / `"name-only"` / `"user-invocable-only"` / `"off"` per skill, or `/skills` + `Space`). Plugin skills are managed via `/plugin` instead.

+ 1 - 1
docs/PLAN.md

@@ -16,7 +16,7 @@
 | Component | Count | Notes |
 |-----------|-------|-------|
 | Agents | 3 | Pure context-isolation/worker roles only: git-agent (background commits/PRs), firecrawl-expert (noisy scrapes), project-organizer (bulk restructure) |
-| Skills | 93 | Operational skills, CLI tools, workflows, diagnostics, security |
+| Skills | 94 | Operational skills, CLI tools, workflows, diagnostics, security |
 | Commands | 2 | Session management (sync, save) |
 | Rules | 7 | cli-tools, commit-style, naming-conventions, prompt-injection, skill-agent-updates, supply-chain, worktree-boundaries |
 | Output Styles | 13 | Vesper, Spartan, Mentor, Executive, Pair, Atlas, Coach, Harbour, Meridian, Noir, Roast, Sage, Scout |

+ 208 - 0
skills/fleet-worker/SKILL.md

@@ -0,0 +1,208 @@
+---
+name: fleet-worker
+description: "Delegate tool-using, multi-step agent tasks to a cheaper headless Claude Code worker on a non-Anthropic model (GLM via z.ai by default) — a 'grunt worker' an Opus orchestrator fans out in parallel and verifies. Each worker is a real `claude -p` carrying Claude Code's full tool harness (Read/Write/Edit/Bash/Glob/Grep/Task) but a cheaper brain, isolated in its own git worktree + CLAUDE_CONFIG_DIR. Pairs with fleet-ops for test-gated landing. Triggers on: fleet-worker, delegate to GLM, spin up a GLM worker, cheap parallel agent, grunt worker, offload to glm, headless GLM, z.ai worker, GLM-5.2 worker, cheap coding agent, fan out workers, non-Anthropic model in Claude Code, ANTHROPIC_BASE_URL worker."
+when_to_use: "Use when you have independent, well-scoped, tool-using subtasks (refactors, test-writing, doc edits, mechanical multi-file changes) that don't need Opus-level judgment, and you want them done cheaply in parallel while the orchestrator reviews and gates the results before they land. Not for tasks needing the orchestrator's conversation context or expensive-if-wrong unreviewed changes."
+license: MIT
+allowed-tools: "Read Bash Glob Grep AskUserQuestion"
+metadata:
+  author: claude-mods
+  status: beta
+  related-skills: fleet-ops, git-ops, push-gate, claude-code-ops
+---
+
+# fleet-worker
+
+Run a **cheap headless Claude Code worker on a non-Anthropic model** and let an
+Opus orchestrator (this session) fan workers out in parallel, then verify and
+land their work. The worker keeps Claude Code's *entire tool harness*
+(Read/Write/Edit/Bash/Glob/Grep/Task/MCP/hooks) — only the **brain** is swapped
+to a cheaper model via env. GLM-5.2 on z.ai is the default worked example; the
+mechanism is provider-agnostic (any Anthropic-compatible endpoint).
+
+**This is the spawning layer. [`fleet-ops`](../fleet-ops/) is the landing layer.**
+fleet-worker produces branches cheaply; fleet-ops lands them through a test gate
+with your review. See [references/fleet-ops-handoff.md](references/fleet-ops-handoff.md).
+
+## The architecture crux: per-agent model = process isolation
+
+`ANTHROPIC_BASE_URL` and the `ANTHROPIC_DEFAULT_*_MODEL` mapping vars are
+**process-global** — read once per `claude` process, applied to *every* model
+call it makes (including in-process Task subagents). There is no per-agent
+override. So you **cannot** keep one Opus session and have its subagents secretly
+run on GLM. The only way to pair a GLM-brained agent with an Opus orchestrator is
+a **separate OS process** with its own env block. That process is `fleet-worker`.
+
+## The load-bearing rule: auth isolation (do not skip)
+
+On any machine also logged into a Claude.ai/Anthropic subscription, the naïve
+"just set `ANTHROPIC_AUTH_TOKEN`" launcher **fails with `401 token expired or
+incorrect`** — the host's stored subscription OAuth token (`~/.claude.json`
+`oauthAccount` + `forceLoginMethod`) takes precedence and gets sent to the
+non-Anthropic endpoint, which rejects it. `--settings` overrides do **not** fix
+it. The fix is a dedicated, empty config dir:
+
+```bash
+export CLAUDE_CONFIG_DIR="$HOME/.fleet-worker/cfg"   # no inherited OAuth/hooks
+```
+
+The launcher sets this automatically. It also gives each worker a clean
+hook/permission/MCP profile so it can't trip the host's hooks. Full analysis in
+[references/fleet-worker-spec.md](references/fleet-worker-spec.md) §4.
+
+## Setup
+
+1. **Install** — these scripts ship with the skill. After `scripts/install.sh`
+   they live at `~/.claude/skills/fleet-worker/scripts/`. Either call them by that
+   path, or symlink onto PATH for convenience:
+   ```bash
+   ln -s ~/.claude/skills/fleet-worker/scripts/fleet-worker ~/.local/bin/fleet-worker
+   ln -s ~/.claude/skills/fleet-worker/scripts/fleet-collect.sh ~/.local/bin/fleet-collect.sh
+   ```
+2. **Provide the key** (the launcher never prints it; resolution order):
+   - `export ANTHROPIC_AUTH_TOKEN=<key>`, or
+   - `export FLEET_WORKER_KEYRING_SERVICE=<svc> FLEET_WORKER_KEYRING_KEY=<name>` (uses `keyring get`), or
+   - `export ZHIPU_API_KEY=<key>` (or `GLM_API_KEY`).
+3. **Preflight** — `bash scripts/fleet-doctor.sh --offline` (structural) or
+   `--live` (pings the endpoint; warns about the §4 oauth trap).
+
+### Config knobs (env, all optional)
+
+| Var | Default | Purpose |
+|---|---|---|
+| `FLEET_WORKER_BASE_URL` | `https://api.z.ai/api/anthropic` | Anthropic-compatible endpoint |
+| `FLEET_WORKER_MODEL` | `GLM-5.2` | main model (opus+sonnet mapping) |
+| `FLEET_WORKER_SMALL_MODEL` | `GLM-4.5-Air` | background/cheap model (haiku mapping) |
+| `FLEET_WORKER_CONFIG_DIR` | `~/.fleet-worker/cfg` | isolated config dir — **one per parallel worker** |
+| `FLEET_WORKER_EFFORT` | `high` | seeded `effortLevel` in the worker's settings |
+
+Point `FLEET_WORKER_BASE_URL`/`FLEET_WORKER_MODEL` at any other Anthropic-compatible
+gateway (this is the documented Claude Code custom-endpoint mechanism) to drive a
+different cheap model.
+
+## When to delegate (and when not)
+
+| Delegate to a worker | Keep on the orchestrator |
+|---|---|
+| Independent, well-scoped, tool-using subtasks | Tasks needing this conversation's context |
+| Refactors, test-writing, doc edits, mechanical multi-file changes | Judgment calls, architecture, ambiguous specs |
+| Work where Opus-quality isn't required and a wrong edit is cheap to discard | Anything expensive-if-wrong and unreviewed |
+
+The safety comes from the **cage, not the model**: isolated worktree (blast
+radius), isolated config dir (no host creds/hooks), and the orchestrator's
+merge gate (nothing lands without review).
+
+## Single-worker recipe
+
+```bash
+cd <target-worktree>
+fleet-worker --output-format json "Refactor src/parser.py to use the visitor pattern" \
+  > result.json
+fleet-collect.sh result.json && echo "succeeded — review the diff"
+```
+
+`fleet-collect.sh` gates on `is_error` (the real success signal — `subtype` lies)
+and prints the worker's final text. Exit `0` = success, `10` = worker failed.
+
+## Fan-out recipe (parallel workers)
+
+Each task gets its **own git worktree + branch** *and* its **own config dir** so
+N workers never clobber each other. Spawn from the orchestrator's Bash tool with
+`run_in_background: true`, then collect by output file.
+
+```bash
+delegate() {                     # $1 = task-id, $2 = prompt
+  local id="$1" prompt="$2" wt=".fleet-work/$1"
+  git worktree add -q -b "fleet/$id" "$wt" HEAD
+  ( cd "$wt"
+    FLEET_WORKER_CONFIG_DIR="$HOME/.fleet-worker/cfg-$id" \
+      fleet-worker --output-format json "$prompt" > "../$id.result.json" 2> "../$id.err"
+  )
+}
+delegate task-a "Add tests for the auth module"      &
+delegate task-b "Update the README install section"  &
+delegate task-c "Refactor utils.py duplications"     &
+wait                                                  # barrier
+
+for id in task-a task-b task-c; do
+  if fleet-collect.sh ".fleet-work/$id.result.json" >/dev/null; then echo "fleet/$id OK"; fi
+done
+```
+
+Keep concurrency modest (≤ 4–6) — the binding constraint is endpoint quota, not
+local CPU. `.gitignore` the scratch dirs (`.fleet-work/`, `.fleet-worker/`).
+
+## Hand off to fleet-ops (test-gated landing)
+
+The winning branches are ordinary git branches — land them with the sibling skill
+instead of merging by hand:
+
+```bash
+fleet track fleet/task-a fleet/task-b fleet/task-c   # register as lanes
+fleet land  fleet/task-a                          # sequential, test-gated, you review each diff
+```
+
+Full walkthrough + recovery in [references/fleet-ops-handoff.md](references/fleet-ops-handoff.md).
+
+## Permission posture
+
+Headless `-p` can't answer a permission prompt — it would stall. The launcher
+bakes in `--permission-mode bypassPermissions`; safety comes from the cage
+(worktree + isolated config + merge gate), not the prompt. Optionally constrain
+with `--disallowedTools` (e.g. block `WebFetch`) or `--add-dir` scoping.
+
+> **Worktree-under-`.claude/` gotcha:** Claude Code's sensitive-file guard runs
+> *before* `bypassPermissions` for anything under `.claude/`. Keep manual worker
+> worktrees at the repo top (e.g. `.fleet-work/`), not under `.claude/`.
+
+## Reliability & limits
+
+- **Overload (429/529)** is the real-world risk, worst during the model's
+  launch-window peak hours. Retry with jittered backoff, cap attempts, prefer
+  off-peak, and consider routing overflow to `FLEET_WORKER_SMALL_MODEL`.
+- **Bound the loop:** set `--max-turns N` and an orchestrator-side wall-clock
+  timeout per worker. Collect via background + notification; never block.
+- **Cost figures are notional:** `total_cost_usd` is Claude Code's internal
+  pricing table applied to a model it doesn't know — ignore it; account by
+  `usage.*_tokens` and your provider's plan.
+- Re-dispatch is clean (the worktree makes retries idempotent).
+
+## Security
+
+Key pulled at spawn time into a process-local env var, never written to the
+script, args (`ps`-safe), or logs. Isolated config dir keeps worker creds/session
+separate from the host — and the worker can't read the host's subscription
+credentials. Avoid `--debug` in shared logs (may print headers).
+
+## Know your terms (read before publishing or automating)
+
+Using Claude Code with a custom `ANTHROPIC_BASE_URL` is a **documented** feature,
+and the worker's inference never touches Anthropic's API/subscription. But terms
+change and vary by plan — verify both your **Anthropic** terms and your **model
+provider's** terms for your own use. Two specifics worth knowing:
+
+- **Automated subscription access:** Anthropic's Consumer Terms restrict driving a
+  Claude.ai/Pro/Max **subscription** by "automated or non-human means … except
+  when accessing via an Anthropic API Key." Keep the orchestrator **interactive**,
+  or run it on an **API key** if you automate it. (Workers are non-Anthropic, so
+  this clause doesn't reach them.)
+- This skill is a tool, not legal advice. When in doubt, ask your provider.
+
+## Scripts
+
+- `scripts/fleet-worker` / `scripts/fleet-worker.ps1` — the launcher (bash + PowerShell).
+  `fleet-worker --help` for the full env/flag contract.
+- `scripts/fleet-collect.sh` — gate a `--output-format json` result; exit 0 success /
+  10 worker-failed; prints the final text. `fleet-collect.sh --help`.
+- `scripts/fleet-doctor.sh` — `--offline` structural preflight + doc-consistency
+  (CI-safe); `--live` pings the endpoint to confirm the model still resolves and
+  flags the §4 oauth trap. `fleet-doctor.sh --help`.
+
+## References & assets
+
+- [references/fleet-worker-spec.md](references/fleet-worker-spec.md) — full design spec:
+  the architecture, the §4 auth-isolation finding, output-format schema, effort
+  control, the reliability evidence, and the phased-rollout stance.
+- [references/fleet-ops-handoff.md](references/fleet-ops-handoff.md) — fan-out →
+  collect → `fleet track` → `fleet land` walkthrough and recovery.
+- [assets/worker-settings.json](assets/worker-settings.json) — the seed
+  `settings.json` the launcher drops into a fresh config dir (`effortLevel: high`).

+ 4 - 0
skills/fleet-worker/assets/worker-settings.json

@@ -0,0 +1,4 @@
+{
+  "hooks": {},
+  "effortLevel": "high"
+}

+ 120 - 0
skills/fleet-worker/references/fleet-ops-handoff.md

@@ -0,0 +1,120 @@
+# fleet-worker → fleet-ops handoff
+
+The two skills are the two halves of one cheap-labour pipeline:
+
+```
+Opus orchestrator (this session)
+   │  fan out — one fleet-worker per task, each in its own worktree + config dir
+   ▼
+fleet-worker × N    ← cheap "grunt" workers, full Claude Code tool harness
+   │  each leaves a branch with commits
+   ▼
+fleet-collect.sh    ← keep only the workers that truly succeeded (is_error:false)
+   ▼
+fleet track / fleet land   ← sequential, test-gated landing; you review each diff
+   ▼
+main
+```
+
+`fleet-worker` is the **spawn layer** that [`fleet-ops`](../../fleet-ops/) explicitly
+disowns ("anything before *committed on a branch* is the spawning layer's
+problem"). `fleet-ops` is the **landing layer**. Neither overlaps; together they
+cover spawn → verify → land. No new code is needed in fleet-ops — the workers
+produce ordinary branches, and fleet-ops lands branches.
+
+## End-to-end walkthrough
+
+### 1. Fan out (fleet-worker)
+
+Each task gets its own worktree/branch and its own isolated config dir:
+
+```bash
+delegate() {                     # $1 = task-id, $2 = prompt
+  local id="$1" prompt="$2" wt=".fleet-work/$1"
+  git worktree add -q -b "fleet/$id" "$wt" HEAD
+  ( cd "$wt"
+    FLEET_WORKER_CONFIG_DIR="$HOME/.fleet-worker/cfg-$id" \
+      fleet-worker --output-format json "$prompt" > "../$id.result.json" 2> "../$id.err"
+    # the worker edits files; commit so the branch carries the work
+    git add -A && git commit -q -m "fleet/$id: $prompt" || true
+  )
+}
+
+delegate fix-lint    "Fix all eslint errors under src/, no behaviour changes"
+delegate add-tests   "Add unit tests for src/auth/ to cover the happy + error paths"
+delegate doc-sync    "Update README install section to match scripts/install.sh"
+```
+
+Run these in the background from the orchestrator's Bash tool
+(`run_in_background: true`) and collect on completion. Keep concurrency ≤ 4–6.
+
+> The worker commits inside its worktree so the branch has commits for fleet-ops
+> to land. If you prefer, let the worker leave the tree dirty and commit from the
+> orchestrator after reviewing — but `fleet land` needs an immutable commit.
+
+### 2. Gate (fleet-collect.sh)
+
+Decide which branches are even worth landing — `fleet-collect.sh` exits `0` only on
+a true success (`is_error:false`), `10` on a failed/overloaded worker:
+
+```bash
+winners=()
+for id in fix-lint add-tests doc-sync; do
+  if fleet-collect.sh ".fleet-work/$id.result.json" >/dev/null; then
+    winners+=("fleet/$id"); echo "fleet/$id  OK"
+  else
+    echo "fleet/$id  FAILED — discard or re-dispatch"
+  fi
+done
+```
+
+Re-dispatch failures idempotently (the worktree makes retries clean) or drop them.
+
+### 3. Land (fleet-ops)
+
+Register the winning branches as lanes and land them — sequential, test-gated,
+with auto-rebase of the remaining lanes and your review of each diff:
+
+```bash
+fleet track "${winners[@]}"        # register existing branches as lanes
+fleet status                        # see all lanes
+fleet land fleet/fix-lint             # scrub → clean-base → merge → test gate → rebase others
+fleet land fleet/add-tests
+fleet land fleet/doc-sync
+```
+
+If a landing breaks the build, `fleet revert fleet/<id>` backs it out in one
+command. fleet-ops' pre-land scrub also refuses diffs with forbidden patterns
+(debug leftovers, `TODO_SCRUB`) — a useful backstop for grunt-worker output.
+
+## Why the orchestrator stays in the loop
+
+The worker is cheap and *unreviewed* by design. The value of the pairing is that
+**Opus verifies before anything lands**:
+
+- `fleet-collect.sh` filters out workers that errored or got overloaded.
+- `fleet land` runs the **test gate** — a worker that wrote plausible-but-wrong
+  code fails the tests and is hard-reset, never reaching `main`.
+- You (Opus) **review each diff** at land time — the merge gate is the quality
+  control the cheap model doesn't provide.
+
+## Recovery scenarios
+
+| Situation | Move |
+|---|---|
+| Worker returned `is_error:true` (529/overload) | Re-dispatch the same task (idempotent worktree); or route to `FLEET_WORKER_SMALL_MODEL` |
+| Worker succeeded but diff is wrong on review | Don't `fleet track` it; delete the branch + worktree |
+| Lane fails the test gate after merge | `fleet revert fleet/<id>`, then re-dispatch with a tighter prompt |
+| Two workers touched the same files | fleet-ops lands sequentially and rebases; a rebase conflict marks the lane `CONFLICT` — resolve in its worktree |
+| Scratch dirs cluttering the tree | `git worktree remove .fleet-work/<id>`; `.gitignore` `.fleet-work/` and `.fleet-worker/` |
+
+## Boundaries (what each side owns)
+
+| fleet-worker owns | fleet-ops owns |
+|---|---|
+| Spawning workers, model/endpoint/auth, isolated config dirs | Landing branches: scrub, test gate, sequential merge, rebase, revert |
+| Producing a branch with commits per task | Ordering and integration against an up-to-date `main` |
+| Gating raw worker results (`fleet-collect.sh`) | Fleet status across lanes/worktrees |
+
+Don't try to make fleet-ops spawn workers, and don't make fleet-worker merge — the
+seam is the branch.

+ 293 - 0
skills/fleet-worker/references/fleet-worker-spec.md

@@ -0,0 +1,293 @@
+# fleet-worker — Design Specification
+
+**Status:** smoke-tested on live infrastructure; pattern proven.
+**Verdict:** ✅ Feasible. A non-Anthropic (GLM) brain reliably drives Claude
+Code's tool harness in headless mode — *provided* the worker is given an isolated
+config directory (the load-bearing finding, §4).
+
+Contents: [1 Purpose](#1-purpose--scope) · [2 Facts](#2-established-facts) ·
+[3 Architecture](#3-architecture) · [4 Auth isolation](#4-the-load-bearing-finding-auth-isolation) ·
+[5 Launcher](#5-the-launcher) · [6 Invocation](#6-invocation-contract) ·
+[7 Output](#7-output-formats) · [8 Parallel isolation](#8-parallel-worker-isolation) ·
+[9 Permissions](#9-permission-modes) · [10 Effort](#10-effort-control) ·
+[11 Limits](#11-error-handling--limitations) · [12 Security](#12-security-model) ·
+[13 Packaging](#13-packaging) · [14 Rollout](#14-phased-rollout) ·
+[Appendix](#appendix-live-smoke-test-evidence)
+
+---
+
+## 1. Purpose & scope
+
+`fleet-worker` is a thin launcher around the `claude` binary (Claude Code CLI). It
+injects environment variables that point Claude Code at an **Anthropic-compatible
+endpoint** (default: z.ai, model **GLM-5.2**), then runs `claude -p` (headless /
+print mode). The result is a headless Claude Code agent whose *brain* is the
+cheaper model but which retains Claude Code's full tool harness — Read, Write,
+Edit, Bash, Glob, Grep, in-process subagents (Task), MCP, hooks.
+
+The purpose: let an **orchestrator** (a normal Claude Code session on Opus)
+delegate tool-using, multi-step agent tasks to cheaper workers by spawning them as
+subprocesses — one per task, fanned out in parallel.
+
+**What it IS:** a thin env-injecting wrapper that `exec`s `claude -p`; a way to get
+per-agent model selection through process isolation; the unit of delegation an
+orchestrator spawns and collects from.
+
+**What it is NOT:** a new application or agent reimplementation (all capability
+comes from `claude` itself); a one-shot "ask three models" query tool (that asks a
+model a *question*; this hands a model a *task* and a *toolbox*); an in-process
+feature (Claude Code's native Task subagents cannot be repointed per-agent — §3).
+
+## 2. Established facts
+
+- The endpoint speaks the **Anthropic Messages protocol**, so Claude Code can be
+  pointed at it via `ANTHROPIC_BASE_URL` + `ANTHROPIC_AUTH_TOKEN` + the
+  `ANTHROPIC_DEFAULT_{OPUS,SONNET,HAIKU}_MODEL` mapping vars. This is Claude
+  Code's documented custom-endpoint mechanism (the same one used for Bedrock /
+  Vertex / LLM gateways).
+- Default models: **GLM-5.2** (flagship reasoning model, ~3–15 s/call, large
+  context, effort levels) and **GLM-4.5-Air** (smaller, faster, used for the
+  haiku-mapped background calls).
+- Verified against Claude Code 2.x. The key is supplied at spawn time from the
+  environment or an OS keyring (never embedded in the script).
+
+## 3. Architecture
+
+```
+ORCHESTRATOR — Claude Code on Opus (real api.anthropic.com)
+  Bash tool ──spawns──► fleet-worker (subprocess #1) ──┐
+  Bash tool ──spawns──► fleet-worker (subprocess #2) ──┤  each subprocess has
+  Bash tool ──spawns──► fleet-worker (subprocess #N) ──┤  its OWN env block
+  collects JSON results ◄────────────────────────────┘
+                         │
+                         ▼
+WORKER — claude -p (headless)
+  CLAUDE_CONFIG_DIR = isolated   (no host OAuth!)
+  ANTHROPIC_BASE_URL = <endpoint>      model: GLM-5.2 (via sonnet/opus mapping)
+  Brain: GLM   Tools: Claude Code's own harness (Read/Write/Edit/Bash/Task/MCP)
+```
+
+### Why workers MUST be separate processes
+
+`ANTHROPIC_BASE_URL` and the model-mapping vars are **process-global** — read once
+per `claude` process and applied to *every* model call it makes, including
+in-process Task subagents. There is no per-agent override. Therefore the only way
+to run a GLM-brained agent alongside an Opus orchestrator is a **separate OS
+process** with its own env block. Per-agent model selection = process isolation,
+full stop. That is the reason the launcher exists.
+
+## 4. The load-bearing finding: auth isolation
+
+> **The single most important result. The naïve launcher (just set
+> `ANTHROPIC_AUTH_TOKEN`) does NOT work on a machine logged into a Claude.ai /
+> Anthropic subscription. It fails with `401 token expired or incorrect`.**
+
+**What goes wrong.** When the host is logged into a subscription, `~/.claude.json`
+holds an `oauthAccount` and `~/.claude/settings.json` may set
+`"forceLoginMethod": "claudeai"`. With `claude -p` pointed at a non-Anthropic
+endpoint but inheriting that host config, the **stored subscription OAuth token
+takes precedence over `ANTHROPIC_AUTH_TOKEN`** and is sent to the endpoint, which
+rejects it (`401`). Claude Code then retries with backoff for minutes before
+surfacing the failure. (Proven independently: the *same* key sent via raw `curl`
+to `…/v1/messages` returns HTTP 200 — the key and endpoint are fine; Claude Code
+was simply sending a *different* credential.)
+
+**What does NOT fix it:** setting `ANTHROPIC_AUTH_TOKEN` alone; also setting
+`ANTHROPIC_API_KEY`; passing `--settings '{"forceLoginMethod":"console"}'`. The
+stored `oauthAccount` wins regardless.
+
+**What DOES fix it — `CLAUDE_CONFIG_DIR` isolation:**
+
+```bash
+export CLAUDE_CONFIG_DIR="$HOME/.fleet-worker/cfg"   # fresh, empty dir
+```
+
+A clean config dir inherits **no `oauthAccount`** and **no `forceLoginMethod`**, so
+`ANTHROPIC_AUTH_TOKEN` becomes the only credential and reaches the endpoint.
+(Verified: the error flipped from `401` (rejected) to `529` (accepted, server
+overloaded) — i.e. the request now reached the model — and a subsequent run
+completed a full tool-driving loop end-to-end; see Appendix.)
+
+> **Design rule:** the launcher MUST set `CLAUDE_CONFIG_DIR` to a dedicated
+> directory. Non-negotiable on any machine also logged into a subscription. Happy
+> side effect: the worker gets a clean hook/permission/MCP profile and can't trip
+> the host's hooks.
+
+## 5. The launcher
+
+The shipped `scripts/fleet-worker` (bash) and `scripts/fleet-worker.ps1` (PowerShell)
+implement, in order: isolate `CLAUDE_CONFIG_DIR` and seed its `settings.json`
+(§4, §10); resolve the key from `ANTHROPIC_AUTH_TOKEN` → keyring
+(`FLEET_WORKER_KEYRING_SERVICE`/`_KEY`) → `ZHIPU_API_KEY`/`GLM_API_KEY` (never
+echoed); set `ANTHROPIC_BASE_URL` + the model mapping; `exec claude -p --model
+sonnet --permission-mode bypassPermissions "$@" </dev/null`.
+
+Notes:
+- **`--model sonnet`** maps to `FLEET_WORKER_MODEL` (default GLM-5.2). Mapping
+  opus+sonnet → main model means whatever tier the harness requests internally,
+  you get the main model; background/cheap calls hit the haiku-mapped small model.
+- **`</dev/null`** avoids the ~3 s "no stdin data received" stall when the prompt
+  is an argument.
+- **Windows / PowerShell:** `.Trim()` the keyring output (it can carry a trailing
+  CRLF). If a `401` persists despite isolation, check the key for CR contamination.
+
+## 6. Invocation contract
+
+```
+fleet-worker [claude-flags…] "PROMPT"
+fleet-worker [claude-flags…] < prompt.txt
+```
+
+| Aspect | Value |
+|---|---|
+| Prompt | final positional arg, or piped on stdin (arg form recommended) |
+| Baked-in flags | `-p`, `--model sonnet`, `--permission-mode bypassPermissions` |
+| Common extra flags | `--output-format {text,json,stream-json}`, `--add-dir`, `--max-turns N`, `--append-system-prompt`, `--allowedTools`/`--disallowedTools` |
+| Env knobs | `FLEET_WORKER_*` (see SKILL.md table) — give each parallel worker its own `FLEET_WORKER_CONFIG_DIR` |
+| CWD | the worker operates in the process CWD — spawn it `cd`'d into the target worktree |
+
+### Exit codes & failure signals
+| Source | Success | Failure |
+|---|---|---|
+| Process exit code | `0` | `1` (auth, API errors, overload) |
+| `--output-format json` | `is_error: false` | `is_error: true` + `api_error_status: <code>` |
+
+**Caveat:** with `--output-format json`, a 529/overload still produces a
+well-formed result with `"subtype":"success"` but `"is_error":true` and
+`"api_error_status":529`. **Don't trust `subtype` — gate on `is_error`** (and the
+process exit code). `fleet-collect.sh` encodes this.
+
+## 7. Output formats
+
+The final **result object** (from `--output-format json`, or the last line of
+`stream-json`) carries: `is_error` (← primary gate), `api_error_status` (e.g. 529
+/ 401), `duration_ms`, `num_turns`, `result` (← deliverable text), `stop_reason`,
+`session_id` (resumable with `claude -r`), `usage.{input_tokens,
+cache_read_input_tokens, output_tokens}`, `modelUsage.<model>.{…}`,
+`permission_denials`, `terminal_reason`, `uuid`, and `total_cost_usd`.
+
+```bash
+RES=$(fleet-worker --output-format json "…task…")
+echo "$RES" | jq -r '.result'                       # final text
+echo "$RES" | jq -r '.is_error'                     # success gate (NOT subtype)
+echo "$RES" | jq -r '.api_error_status // "none"'   # failure code
+```
+
+> ⚠ **`total_cost_usd` is notional.** Claude Code computes it from its internal
+> pricing table applied to a model name it doesn't recognise, so it falls back to
+> a placeholder rate — it does **not** reflect what the provider charged. Account
+> by `usage.*_tokens` and your provider's plan; ignore the dollar figure.
+
+`stream-json` emits newline-delimited events in real time (init → assistant/user
+with `tool_use`/`tool_result` blocks → final result), so the orchestrator can
+watch tool calls live. Use plain `json` for fire-and-collect; `text` returns only
+`.result`.
+
+## 8. Parallel worker isolation
+
+Each delegated subtask gets its **own git worktree + branch** *and* its **own
+config dir**, so N workers never clobber each other's files, branches, or session
+state. See SKILL.md "Fan-out recipe". Isolation matrix:
+
+| Resource | Mechanism |
+|---|---|
+| Working files / branch | `git worktree add -b fleet/<id>` — one per task |
+| Claude session/config | `FLEET_WORKER_CONFIG_DIR=…/cfg-<id>` — one per task |
+| Result/error capture | per-task `…result.json` / `…err` |
+| Concurrency cap | shell job control / `xargs -P` — ≤ 4–6 (endpoint quota, not CPU, is the limit) |
+
+The orchestrator spawns workers via its Bash tool with `run_in_background: true`,
+tracks them by output file, and collects on completion: gate on `is_error`, merge
+the winners (hand to `fleet-ops`), discard/retry failures.
+
+## 9. Permission modes
+
+| Mode | Edits | Bash | Prompts | For a delegated worker |
+|---|---|---|---|---|
+| `default` | ask | ask | yes | ❌ hangs headless |
+| `acceptEdits` | auto | ask | partial | ⚠ Bash still prompts → can hang |
+| `bypassPermissions` | auto | auto | no | ✅ recommended |
+
+**Safety comes from the cage, not the prompt:** dedicated worktree (blast radius),
+isolated `CLAUDE_CONFIG_DIR` (no host hooks/MCP/creds), optional `--disallowedTools`
+/ `--add-dir` scoping, and the orchestrator's **merge gate** — nothing the worker
+writes reaches `main` without review. Standard headless-CI posture.
+
+## 10. Effort control
+
+Headless `-p` has no interactive `/effort`. Seed it via the isolated config dir's
+`settings.json` — Claude Code persists effort as `"effortLevel"`. The launcher
+writes `{"hooks":{}, "effortLevel":"<FLEET_WORKER_EFFORT|high>"}` into a fresh config
+dir (see `assets/worker-settings.json`). Recommended default for coding workers:
+`high`. Per-task override: `--settings '{"effortLevel":"high"}'`. Confirm the
+provider's effort mapping against its docs at integration time.
+
+## 11. Error handling & limitations
+
+| Failure | Symptom | Mitigation |
+|---|---|---|
+| Auth | `401`; exit 1 after minutes of retries | Ensure `CLAUDE_CONFIG_DIR` isolation (§4); verify the key with raw `curl`; check CRLF on Windows |
+| Overload | `is_error:true`, `api_error_status:529`; exit 1 | Retry with jittered backoff; cap attempts; prefer off-peak; route overflow to the small model |
+| Quota | 429 once a plan cap is hit | Throttle fan-out; schedule heavy batches off-peak |
+| Latency | multi-turn reasoning runs into minutes | `--max-turns N`; orchestrator-side wall-clock timeout; collect via background, never block |
+| Partial output | `result` empty; `stop_reason` ≠ `end_turn` | Check `stop_reason`/`num_turns`; re-dispatch (worktree makes retries clean) |
+| Cost figures wrong | implausible `total_cost_usd` | Ignore — notional (§7) |
+
+**Reliability note (from the investigation, during the GLM-5.2 launch window):**
+the flagship was frequently **529-overloaded** for large Claude-Code-shaped
+requests at peak, while the smaller **GLM-4.5-Air succeeded cleanly**. The
+*pattern* is proven; flagship capacity during launch peaks is a real availability
+risk → build in retry/backoff, an off-peak schedule, and a small-model fallback.
+
+## 12. Security model
+
+1. **Key never at rest in the script** — pulled from env/keyring at spawn time.
+2. **Never in process args** — goes in `ANTHROPIC_AUTH_TOKEN` (env), so it can't
+   leak via `ps` / `/proc/*/cmdline` / shell history.
+3. **Never logged** — the launcher doesn't echo it; `--output-format json` carries
+   no credentials. Avoid `--debug` in shared logs.
+4. **Isolated config dir** — worker creds/session live under `FLEET_WORKER_CONFIG_DIR`,
+   separate from the host; the worker can't read the host's subscription creds.
+5. **Worktree blast-radius + merge gate** bound what an over-eager or
+   prompt-injected worker can do.
+6. **`.gitignore`** the scratch dirs (`.fleet-work/`, `.fleet-worker/`).
+
+## 13. Packaging
+
+Ship the **script on PATH** (the executable the orchestrator spawns) **plus this
+Claude Code skill** (how the orchestrator *knows* it can delegate, with the
+fan-out/collect/isolation recipes). A shell alias is optional sugar.
+
+## 14. Phased rollout
+
+- **Phase 1 (this skill):** the thin launcher + `fleet-collect`/`fleet-doctor` + the
+  recipes. Orchestrator-driven fan-out via Bash + git worktrees, landed by
+  `fleet-ops`. Start here.
+- **Phase 2 (later, on concrete pain):** wire into standing fleet/queue
+  infrastructure for persistent job tracking. Only when manual fan-out is the
+  bottleneck.
+- **Phase 3 (conditional):** a worker-router MCP — only if a standing shared
+  fleet, an async job API, or cost/quota auto-routing becomes a real need.
+
+> **Start thin, graduate only on concrete pain.** The launcher is ~12 lines of
+> load-bearing logic and proven. Don't pre-build the router.
+
+## Appendix: live smoke-test evidence
+
+Observed on Windows 11, Claude Code 2.x, against an Anthropic-compatible GLM
+endpoint, during the GLM-5.2 launch window (peak hours). Key redacted throughout.
+
+- **Endpoint/key validity (raw curl, bypasses Claude Code):** Anthropic-protocol
+  endpoint returned HTTP 200 with both `x-api-key` and `Authorization: Bearer`.
+  Key valid, endpoint correct.
+- **The auth saga:** naïve launcher (host config inherited) → `401`. Adding
+  `--settings '{"forceLoginMethod":"console"}'` → still `401`. Adding
+  `CLAUDE_CONFIG_DIR=isolated` → `529` (**auth accepted**, server overloaded). Fixed
+  *only* by config-dir isolation (§4).
+- **Does a GLM brain drive the tools?** GLM-5.2 was 529-blocked on every attempt
+  at peak; routing the *same* task to **GLM-4.5-Air** completed a **3-turn
+  Write+Read tool loop in ~23 s**, `is_error:false`, file actually written to
+  disk, `modelUsage` showing Claude Code's full system prompt + tool schemas as
+  input. **Verdict: ✅ the pattern works** — a non-Anthropic brain reliably drives
+  Claude Code's harness headless. The only material risk surfaced was flagship
+  endpoint capacity during the launch peak (transient infra, not a design flaw).

+ 68 - 0
skills/fleet-worker/scripts/fleet-collect.sh

@@ -0,0 +1,68 @@
+#!/usr/bin/env bash
+# fleet-collect.sh - gate one fleet-worker JSON result; print its text, set exit code.
+#
+# Reads a `claude -p --output-format json` result object (file arg or stdin),
+# prints the worker's final assistant text to stdout, and exits 0 only when the
+# worker truly succeeded. Encodes the spec footgun: the `subtype` field reads
+# "success" even on an API error - the real gate is is_error==false (corroborated
+# by the process exit code and api_error_status). Use this to decide which fanned-
+# out worker branches are worth landing.
+#
+# Usage:   fleet-collect.sh [--quiet] [RESULT_JSON]
+#          fleet-worker --output-format json "task" | fleet-collect.sh
+# Input:   result JSON as a file arg, or on stdin
+# Output:  stdout = the worker's final text (.result), only on success
+# Stderr:  one human status line (OK / FAILED + api_error_status)
+# Exit:    0 success; 10 worker failed (is_error / api_error); 3 file not found;
+#          4 malformed / not a result object; 2 usage; 5 missing jq
+#
+# Examples:
+#   fleet-collect.sh task-a.result.json && echo "branch fleet/task-a is landable"
+#   fleet-worker --output-format json "fix the failing test" | fleet-collect.sh -q
+set -uo pipefail
+
+EXIT_OK=0; EXIT_USAGE=2; EXIT_NOT_FOUND=3; EXIT_VALIDATION=4; EXIT_MISSING_DEP=5; EXIT_FAIL=10
+
+QUIET=0; SRC=""
+while [ $# -gt 0 ]; do
+  case "$1" in
+    -h|--help)  awk 'NR==1{next} /^#/{sub(/^# ?/,""); print; next} {exit}' "$0"; exit "$EXIT_OK" ;;
+    -q|--quiet) QUIET=1 ;;
+    -*)         echo "fleet-collect.sh: unknown flag: $1 (try --help)" >&2; exit "$EXIT_USAGE" ;;
+    *)          if [ -z "$SRC" ]; then SRC="$1"; else echo "fleet-collect.sh: too many arguments" >&2; exit "$EXIT_USAGE"; fi ;;
+  esac
+  shift
+done
+
+command -v jq >/dev/null 2>&1 || { echo "fleet-collect.sh: jq is required" >&2; exit "$EXIT_MISSING_DEP"; }
+emit() { [ "$QUIET" -eq 1 ] || printf '%s\n' "$1" >&2; }
+
+if [ -n "$SRC" ]; then
+  [ -f "$SRC" ] || { echo "fleet-collect.sh: file not found: $SRC" >&2; exit "$EXIT_NOT_FOUND"; }
+  DATA="$(cat "$SRC")"
+else
+  DATA="$(cat)"
+fi
+
+printf '%s' "$DATA" | jq -e . >/dev/null 2>&1 || {
+  echo "fleet-collect.sh: input is not valid JSON" >&2; exit "$EXIT_VALIDATION"; }
+
+# Note: `.is_error // empty` is WRONG - jq's `//` treats boolean false like null,
+# so a genuine is_error:false would read as empty. Gate on has()/tostring instead.
+is_error="$(printf '%s' "$DATA"  | jq -r 'if has("is_error") then (.is_error|tostring) else "" end')"
+api_status="$(printf '%s' "$DATA" | jq -r '.api_error_status // empty')"
+result="$(printf '%s' "$DATA"    | jq -r '.result // ""')"
+
+if [ -z "$is_error" ]; then
+  echo "fleet-collect.sh: not a result object (no .is_error field)" >&2
+  exit "$EXIT_VALIDATION"
+fi
+
+if [ "$is_error" = "false" ]; then
+  printf '%s\n' "$result"
+  emit "OK"
+  exit "$EXIT_OK"
+fi
+
+emit "FAILED (is_error=$is_error api_error_status=${api_status:-none})"
+exit "$EXIT_FAIL"

+ 220 - 0
skills/fleet-worker/scripts/fleet-doctor.sh

@@ -0,0 +1,220 @@
+#!/usr/bin/env bash
+# fleet-doctor.sh - preflight + staleness verifier for the fleet-worker skill.
+#
+# Two modes (SKILL-RESOURCE-PROTOCOL sec 7):
+#   --offline (default): structural / internal-consistency only, NO network.
+#                        Asserts the launcher parses, and that the model + endpoint
+#                        baked into the launcher are also documented in SKILL.md
+#                        (a drift tripwire - bump the launcher, forget the doc -> fail).
+#                        Safe for PR CI.
+#   --live:              pings the configured Anthropic-compatible endpoint with the
+#                        configured model. 200 = model still resolves. No key /
+#                        unreachable / rate-limited = UNAVAILABLE (advisory, never a
+#                        build failure). A 404 = DRIFT (endpoint/model path gone).
+#
+# Both modes also print non-fatal PREFLIGHT advisories: whether `claude` is on
+# PATH, and whether the host ~/.claude.json carries an oauthAccount (the cause of
+# the 401 trap that CLAUDE_CONFIG_DIR isolation fixes - see spec sec 4).
+#
+# Usage:   fleet-doctor.sh [--offline|--live] [--json] [-q]
+# Output:  stdout = data only (TSV rows, or a --json envelope)
+# Stderr:  panel framing / human status (term.sh)
+# Exit:    0 ok; 2 usage; 4 launcher malformed; 5 missing dep / launcher absent;
+#          7 unavailable (live: no key / endpoint unreachable);
+#          10 drift (offline: doc/launcher mismatch | live: model 404)
+#
+# Examples:
+#   fleet-doctor.sh --offline
+#   FLEET_WORKER_KEYRING_SERVICE=mysvc FLEET_WORKER_KEYRING_KEY=glm fleet-doctor.sh --live
+#   fleet-doctor.sh --offline --json | jq '.data[] | select(.status!="ok")'
+set -uo pipefail
+
+EXIT_OK=0; EXIT_USAGE=2; EXIT_MALFORMED=4; EXIT_MISSING_DEP=5
+EXIT_UNAVAILABLE=7; EXIT_DRIFT=10
+
+# Terminal design system (skills/_lib/term.sh). Framing rides stderr (term_init 2);
+# data rows / --json stay plain on stdout. Degrade gracefully if the lib is gone.
+__lib="$(cd "$(dirname "${BASH_SOURCE[0]}")/../../_lib" 2>/dev/null && pwd || true)"
+if [ -n "${__lib:-}" ] && [ -f "$__lib/term.sh" ]; then . "$__lib/term.sh"; term_init 2; __HAVE_TERM=1
+else __HAVE_TERM=0; TERM_DOT="|"; fi
+
+SCRIPT_DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+LAUNCHER="$SCRIPT_DIR/fleet-worker"
+SKILL_MD="$SCRIPT_DIR/../SKILL.md"
+ASSET_SETTINGS="$SCRIPT_DIR/../assets/worker-settings.json"
+
+MODE="offline"; JSON=0; QUIET=0
+while [ $# -gt 0 ]; do
+  case "$1" in
+    --offline) MODE="offline" ;;
+    --live)    MODE="live" ;;
+    --json)    JSON=1 ;;
+    -q|--quiet) QUIET=1 ;;
+    -h|--help) awk 'NR==1{next} /^#/{sub(/^# ?/,""); print; next} {exit}' "$0"; exit "$EXIT_OK" ;;
+    -*) echo "fleet-doctor.sh: unknown flag: $1 (try --help)" >&2; exit "$EXIT_USAGE" ;;
+    *)  echo "fleet-doctor.sh: unexpected argument: $1 (try --help)" >&2; exit "$EXIT_USAGE" ;;
+  esac
+  shift
+done
+
+command -v grep >/dev/null 2>&1 || { echo "fleet-doctor.sh: grep required" >&2; exit "$EXIT_MISSING_DEP"; }
+HAS_JQ=0; command -v jq >/dev/null 2>&1 && HAS_JQ=1
+[ "$JSON" -eq 1 ] && [ "$HAS_JQ" -eq 0 ] && {
+  echo '{"error":{"code":"PRECONDITION","message":"jq required for --json"}}'
+  echo "fleet-doctor.sh: jq required for --json" >&2; exit "$EXIT_MISSING_DEP"; }
+
+# Panel framing on the human stream (TTY or FORCE_COLOR); piped/quiet -> tagged lines.
+PANEL=0
+if [ "$__HAVE_TERM" -eq 1 ] && [ "$QUIET" -eq 0 ] && { [ -t 2 ] || [ -n "${FORCE_COLOR:-}" ]; }; then PANEL=1; fi
+__PANEL_OPEN=0
+popen() {
+  [ "$PANEL" -eq 1 ] && [ "$__PANEL_OPEN" -eq 0 ] || return 0
+  { term_panel_open claude "fleet-worker doctor ${TERM_DOT} ${MODE}"; term_panel_vert; } >&2
+  __PANEL_OPEN=1
+}
+emit() { [ "$QUIET" -eq 1 ] && return; printf '%s\n' "$1" >&2; }
+# prow <mark> <legacy-prefix> <label> - panel status row, or the legacy tagged line.
+prow() {
+  if [ "$PANEL" -eq 1 ]; then popen; term_status_row "$1" "$3" >&2
+  else emit "  $2 $3"; fi
+}
+
+declare -a JSON_OBJS=()
+declare -a TEXT_ROWS=()
+add_row() { # check status detail
+  TEXT_ROWS+=("$1	$2	$3")
+  [ "$HAS_JQ" -eq 1 ] && JSON_OBJS+=("$(jq -cn --arg c "$1" --arg s "$2" --arg d "$3" \
+    '{check:$c, status:$s, detail:$d}')")
+}
+
+[ "$PANEL" -eq 1 ] && popen || emit "=== fleet-doctor (${MODE}) ==="
+
+drift=0; malformed=0; missing=0; unavailable=0
+
+# -- Preflight advisories (never change exit code) --------------------------
+if command -v claude >/dev/null 2>&1; then
+  prow ok "[ok]" "claude on PATH"; add_row claude-on-path ok "found"
+else
+  prow warn "[advisory]" "claude (Claude Code) not on PATH - workers cannot run here"
+  add_row claude-on-path advisory "not found"
+fi
+__hostcfg="${HOME:-}/.claude.json"
+if [ -f "$__hostcfg" ] && grep -q '"oauthAccount"' "$__hostcfg" 2>/dev/null; then
+  prow warn "[advisory]" "host ~/.claude.json has oauthAccount - workers MUST use an isolated CLAUDE_CONFIG_DIR (spec sec 4)"
+  add_row host-oauth-trap advisory "oauthAccount present"
+else
+  prow ok "[ok]" "no host oauthAccount trap detected"; add_row host-oauth-trap ok "clean"
+fi
+
+# -- Launcher presence + syntax ---------------------------------------------
+if [ ! -f "$LAUNCHER" ]; then
+  prow bad "[MISSING]" "launcher not found: $LAUNCHER"; add_row launcher-present missing "$LAUNCHER"
+  missing=1
+elif ! bash -n "$LAUNCHER" 2>/dev/null; then
+  prow bad "[MALFORMED]" "launcher has a bash syntax error"; add_row launcher-syntax malformed "bash -n failed"
+  malformed=1
+else
+  prow ok "[ok]" "launcher present and parses"; add_row launcher-syntax ok "bash -n clean"
+fi
+
+# -- Asset settings is valid JSON -------------------------------------------
+if [ -f "$ASSET_SETTINGS" ]; then
+  if [ "$HAS_JQ" -eq 1 ] && ! jq -e . "$ASSET_SETTINGS" >/dev/null 2>&1; then
+    prow bad "[MALFORMED]" "worker-settings.json is not valid JSON"; add_row asset-settings malformed "invalid"
+    malformed=1
+  else
+    prow ok "[ok]" "worker-settings.json present"; add_row asset-settings ok "valid"
+  fi
+fi
+
+# -- Extract launcher defaults (model / small model / endpoint) -------------
+def_model=""; def_small=""; def_url=""
+if [ -f "$LAUNCHER" ]; then
+  def_model="$(grep -oE 'FLEET_WORKER_MODEL:-[A-Za-z0-9._-]+' "$LAUNCHER" | head -1 | sed 's/^FLEET_WORKER_MODEL:-//')"
+  def_small="$(grep -oE 'FLEET_WORKER_SMALL_MODEL:-[A-Za-z0-9._-]+' "$LAUNCHER" | head -1 | sed 's/^FLEET_WORKER_SMALL_MODEL:-//')"
+  def_url="$(grep -oE 'FLEET_WORKER_BASE_URL:-[^}\"]+' "$LAUNCHER" | head -1 | sed 's/^FLEET_WORKER_BASE_URL:-//')"
+fi
+
+# -- Offline: launcher defaults must be documented in SKILL.md (drift tripwire)
+if [ "$MODE" = "offline" ]; then
+  if [ -f "$SKILL_MD" ]; then
+    for pair in "model:$def_model" "small-model:$def_small" "endpoint:$def_url"; do
+      label="${pair%%:*}"; val="${pair#*:}"
+      [ -z "$val" ] && continue
+      if grep -qF -- "$val" "$SKILL_MD"; then
+        prow ok "[ok]" "SKILL.md documents $label ($val)"; add_row "doc-$label" ok "$val"
+      else
+        prow bad "[DRIFT]" "launcher $label '$val' is NOT documented in SKILL.md"
+        add_row "doc-$label" drift "$val"; drift=1
+      fi
+    done
+  else
+    prow warn "[advisory]" "SKILL.md not found beside scripts/ - skipping doc-consistency"
+    add_row skill-md advisory "not found"
+  fi
+fi
+
+# -- Live: does the configured model still resolve at the endpoint? ---------
+if [ "$MODE" = "live" ]; then
+  command -v curl >/dev/null 2>&1 || { echo "fleet-doctor.sh: curl required for --live" >&2; exit "$EXIT_MISSING_DEP"; }
+  url="${FLEET_WORKER_BASE_URL:-${def_url:-https://api.z.ai/api/anthropic}}"
+  model="${FLEET_WORKER_MODEL:-${def_model:-GLM-5.2}}"
+  # Resolve key without echoing it.
+  key=""
+  if [ -n "${ANTHROPIC_AUTH_TOKEN:-}" ]; then key="$ANTHROPIC_AUTH_TOKEN"
+  elif [ -n "${FLEET_WORKER_KEYRING_SERVICE:-}" ] && [ -n "${FLEET_WORKER_KEYRING_KEY:-}" ] && command -v keyring >/dev/null 2>&1; then
+    key="$(keyring get "$FLEET_WORKER_KEYRING_SERVICE" "$FLEET_WORKER_KEYRING_KEY" 2>/dev/null | tr -d '\r\n')"
+  elif [ -n "${ZHIPU_API_KEY:-}" ]; then key="$ZHIPU_API_KEY"
+  elif [ -n "${GLM_API_KEY:-}" ]; then key="$GLM_API_KEY"
+  fi
+  if [ -z "$key" ]; then
+    prow warn "[unavailable]" "no API key resolved - cannot run --live (advisory)"
+    add_row live-ping unavailable "no key"; unavailable=1
+  else
+    body="$(printf '{"model":"%s","max_tokens":1,"messages":[{"role":"user","content":"ping"}]}' "$model")"
+    code="$(curl -sS -o /dev/null -m 25 -w '%{http_code}' -X POST "${url%/}/v1/messages" \
+      -H "content-type: application/json" -H "anthropic-version: 2023-06-01" \
+      -H "x-api-key: ${key}" --data "$body" 2>/dev/null || echo 000)"
+    case "$code" in
+      200)
+        prow ok "[ok]" "model $model resolves at ${url} (HTTP 200)"; add_row live-ping ok "$model" ;;
+      404)
+        prow bad "[DRIFT 404]" "model/endpoint not found: $model @ ${url}"; add_row live-ping drift "$model"; drift=1 ;;
+      000|401|403|408|429|5??)
+        prow warn "[unavailable]" "endpoint unreachable / not authorized / overloaded (HTTP $code)"
+        add_row live-ping unavailable "HTTP $code"; unavailable=1 ;;
+      *)
+        prow warn "[unavailable]" "unexpected response (HTTP $code) - treating as advisory"
+        add_row live-ping unavailable "HTTP $code"; unavailable=1 ;;
+    esac
+  fi
+fi
+
+# -- Panel footer -----------------------------------------------------------
+if [ "$PANEL" -eq 1 ] && [ "$__PANEL_OPEN" -eq 1 ]; then
+  ph_state="healthy"; ph_text="all checks pass"
+  if [ "$missing" -eq 1 ] || [ "$malformed" -eq 1 ]; then ph_state="critical"; ph_text="skill broken"
+  elif [ "$drift" -eq 1 ]; then ph_state="critical"; ph_text="drift detected"
+  elif [ "$unavailable" -eq 1 ]; then ph_state="warning"; ph_text="endpoint advisory"; fi
+  { term_panel_vert
+    term_panel_close "--live to ping endpoint ${TERM_DOT} --json for data" "$(term_health "$ph_state" "$ph_text")"
+  } >&2
+fi
+
+# -- Output (data on stdout) ------------------------------------------------
+if [ "$JSON" -eq 1 ]; then
+  printf '%s\n' "${JSON_OBJS[@]:-}" | jq -s \
+    --arg mode "$MODE" \
+    '{data: map(select(length>0)),
+      meta: {mode:$mode, count:(map(select(length>0))|length),
+             schema:"claude-mods.fleet-worker.doctor/v1"}}'
+else
+  for row in "${TEXT_ROWS[@]:-}"; do [ -n "$row" ] && printf '%s\n' "$row"; done
+fi
+
+# -- Exit (precedence: broken > drift > unavailable) ------------------------
+[ "$missing" -eq 1 ]   && exit "$EXIT_MISSING_DEP"
+[ "$malformed" -eq 1 ] && exit "$EXIT_MALFORMED"
+[ "$drift" -eq 1 ]     && exit "$EXIT_DRIFT"
+[ "$unavailable" -eq 1 ] && exit "$EXIT_UNAVAILABLE"
+exit "$EXIT_OK"

+ 91 - 0
skills/fleet-worker/scripts/fleet-worker

@@ -0,0 +1,91 @@
+#!/usr/bin/env bash
+# fleet-worker - run a cheap headless Claude Code worker on a non-Anthropic model.
+#
+# Thin launcher: points Claude Code (`claude -p`) at any Anthropic-compatible
+# endpoint (default: z.ai / GLM) via env, inside an ISOLATED config dir, then
+# execs. The result is a headless agent with Claude Code's full tool harness
+# (Read/Write/Edit/Bash/Glob/Grep/Task) but a cheaper "grunt" brain - fanned
+# out and verified by an Opus orchestrator. See ../SKILL.md.
+#
+# Usage:   fleet-worker [--help] [claude-flags...] "PROMPT"
+#          fleet-worker [claude-flags...] < prompt.txt
+# Input:   prompt as the final positional arg, or piped on stdin
+# Output:  whatever `claude -p` emits (text, or --output-format json/stream-json)
+# Stderr:  claude's own diagnostics; this launcher is silent on success
+# Exit:    0 ok; 1 worker/API error; 2 usage; 5 missing dep / no key resolved
+#
+# Config (env, all optional - defaults target the z.ai GLM Coding Plan):
+#   FLEET_WORKER_CONFIG_DIR   isolated CLAUDE_CONFIG_DIR (default ~/.fleet-worker/cfg)
+#   FLEET_WORKER_BASE_URL     Anthropic-compatible endpoint (default z.ai)
+#   FLEET_WORKER_MODEL        main model      (default GLM-5.2)
+#   FLEET_WORKER_SMALL_MODEL  background model (default GLM-4.5-Air)
+#   FLEET_WORKER_EFFORT       seeded effortLevel (default high)
+# Key resolution order (the key is never printed):
+#   1. ANTHROPIC_AUTH_TOKEN (already exported)            -> used as-is
+#   2. FLEET_WORKER_KEYRING_SERVICE + FLEET_WORKER_KEYRING_KEY -> `keyring get svc key`
+#   3. ZHIPU_API_KEY / GLM_API_KEY                         -> used as-is
+#
+# Examples:
+#   fleet-worker "List the TODOs under src/ and summarize them"
+#   fleet-worker --output-format json "Refactor utils.py" | fleet-collect.sh
+#   FLEET_WORKER_CONFIG_DIR=~/.fleet-worker/cfg-a fleet-worker --output-format json "task a"
+set -uo pipefail
+
+EXIT_OK=0; EXIT_USAGE=2; EXIT_MISSING_DEP=5
+
+case "${1:-}" in
+  -h|--help) awk 'NR==1{next} /^#/{sub(/^# ?/,""); print; next} {exit}' "$0"; exit "$EXIT_OK" ;;
+esac
+
+# --- Auth isolation (LOAD-BEARING; see references/fleet-worker-spec.md sec 4) ------
+# A dedicated config dir means the worker inherits NO host Claude.ai OAuth
+# account or forceLoginMethod, so our token is the only credential and actually
+# reaches the endpoint - otherwise a host subscription token wins and the
+# endpoint rejects it with 401.
+GLM_CFG="${FLEET_WORKER_CONFIG_DIR:-$HOME/.fleet-worker/cfg}"
+if ! mkdir -p "$GLM_CFG" 2>/dev/null; then
+  echo "fleet-worker: cannot create config dir: $GLM_CFG" >&2
+  exit "$EXIT_MISSING_DEP"
+fi
+if [ ! -f "$GLM_CFG/settings.json" ]; then
+  printf '{ "hooks": {}, "effortLevel": "%s" }\n' "${FLEET_WORKER_EFFORT:-high}" > "$GLM_CFG/settings.json"
+fi
+export CLAUDE_CONFIG_DIR="$GLM_CFG"
+
+# --- Resolve the API key (never echoed) --------------------------------------
+resolve_key() {
+  if [ -n "${ANTHROPIC_AUTH_TOKEN:-}" ]; then printf '%s' "$ANTHROPIC_AUTH_TOKEN"; return 0; fi
+  if [ -n "${FLEET_WORKER_KEYRING_SERVICE:-}" ] && [ -n "${FLEET_WORKER_KEYRING_KEY:-}" ] \
+     && command -v keyring >/dev/null 2>&1; then
+    local k
+    k="$(keyring get "$FLEET_WORKER_KEYRING_SERVICE" "$FLEET_WORKER_KEYRING_KEY" 2>/dev/null | tr -d '\r\n')"
+    if [ -n "$k" ]; then printf '%s' "$k"; return 0; fi
+  fi
+  if [ -n "${ZHIPU_API_KEY:-}" ]; then printf '%s' "$ZHIPU_API_KEY"; return 0; fi
+  if [ -n "${GLM_API_KEY:-}" ];   then printf '%s' "$GLM_API_KEY";   return 0; fi
+  return 1
+}
+if ! __key="$(resolve_key)"; then
+  cat >&2 <<'MSG'
+fleet-worker: no API key resolved. Provide one of:
+  - export ANTHROPIC_AUTH_TOKEN=<key>
+  - export FLEET_WORKER_KEYRING_SERVICE=<svc> FLEET_WORKER_KEYRING_KEY=<name>   (uses `keyring get`)
+  - export ZHIPU_API_KEY=<key>    (or GLM_API_KEY)
+MSG
+  exit "$EXIT_MISSING_DEP"
+fi
+
+command -v claude >/dev/null 2>&1 || {
+  echo "fleet-worker: 'claude' (Claude Code) not found on PATH" >&2; exit "$EXIT_MISSING_DEP"; }
+
+# --- Endpoint + model mapping ------------------------------------------------
+export ANTHROPIC_BASE_URL="${FLEET_WORKER_BASE_URL:-https://api.z.ai/api/anthropic}"
+export ANTHROPIC_AUTH_TOKEN="$__key"
+export ANTHROPIC_DEFAULT_OPUS_MODEL="${FLEET_WORKER_MODEL:-GLM-5.2}"
+export ANTHROPIC_DEFAULT_SONNET_MODEL="${FLEET_WORKER_MODEL:-GLM-5.2}"
+export ANTHROPIC_DEFAULT_HAIKU_MODEL="${FLEET_WORKER_SMALL_MODEL:-GLM-4.5-Air}"
+
+# `--model sonnet` resolves to $FLEET_WORKER_MODEL via the mapping above.
+# `</dev/null` avoids the ~3s "no stdin data received" wait when the prompt is
+# passed as an argument.
+exec claude -p --model sonnet --permission-mode bypassPermissions "$@" </dev/null

+ 80 - 0
skills/fleet-worker/scripts/fleet-worker.ps1

@@ -0,0 +1,80 @@
+#!/usr/bin/env pwsh
+# fleet-worker.ps1 - run a cheap headless Claude Code worker on a non-Anthropic model (PowerShell).
+#
+# Thin launcher: points Claude Code (`claude -p`) at any Anthropic-compatible
+# endpoint (default: z.ai / GLM) via env, inside an ISOLATED config dir, then
+# runs it. The result is a headless agent with Claude Code's full tool harness
+# but a cheaper "grunt" brain. See ../SKILL.md.
+#
+# Usage:   fleet-worker.ps1 [-Help] [claude-flags...] "PROMPT"
+# Output:  whatever `claude -p` emits (text, or --output-format json/stream-json)
+# Exit:    0 ok; 1 worker/API error; 5 missing dep / no key resolved
+#
+# Config (env, all optional - defaults target the z.ai GLM Coding Plan):
+#   FLEET_WORKER_CONFIG_DIR   isolated CLAUDE_CONFIG_DIR (default ~/.fleet-worker/cfg)
+#   FLEET_WORKER_BASE_URL     Anthropic-compatible endpoint (default z.ai)
+#   FLEET_WORKER_MODEL        main model      (default GLM-5.2)
+#   FLEET_WORKER_SMALL_MODEL  background model (default GLM-4.5-Air)
+#   FLEET_WORKER_EFFORT       seeded effortLevel (default high)
+# Key resolution order (the key is never printed):
+#   1. ANTHROPIC_AUTH_TOKEN (already set)                  -> used as-is
+#   2. FLEET_WORKER_KEYRING_SERVICE + FLEET_WORKER_KEYRING_KEY -> `keyring get svc key`
+#   3. ZHIPU_API_KEY / GLM_API_KEY                         -> used as-is
+#
+# Examples:
+#   ./fleet-worker.ps1 "List the TODOs under src/ and summarize them"
+#   ./fleet-worker.ps1 --output-format json "Refactor utils.py"
+$ErrorActionPreference = 'Stop'
+
+if ($args.Count -ge 1 -and ($args[0] -eq '-Help' -or $args[0] -eq '--help' -or $args[0] -eq '-h')) {
+  Get-Content $PSCommandPath | Select-Object -Skip 1 |
+    ForEach-Object { if ($_ -match '^#') { $_ -replace '^# ?', '' } else { return } }
+  exit 0
+}
+
+# Auth isolation (LOAD-BEARING; see references/fleet-worker-spec.md sec 4): a dedicated
+# config dir => the worker inherits no host Claude.ai OAuth account, so our token
+# is the only credential and actually reaches the endpoint (else 401).
+$cfg = if ($env:FLEET_WORKER_CONFIG_DIR) { $env:FLEET_WORKER_CONFIG_DIR } `
+       else { Join-Path $HOME '.fleet-worker/cfg' }
+New-Item -ItemType Directory -Force -Path $cfg | Out-Null
+$settings = Join-Path $cfg 'settings.json'
+if (-not (Test-Path $settings)) {
+  $effort = if ($env:FLEET_WORKER_EFFORT) { $env:FLEET_WORKER_EFFORT } else { 'high' }
+  "{ ""hooks"": {}, ""effortLevel"": ""$effort"" }" | Set-Content -Path $settings -Encoding utf8
+}
+$env:CLAUDE_CONFIG_DIR = $cfg
+
+# Resolve the API key (never printed). .Trim() strips trailing CRLF from `keyring get`.
+function Resolve-GlmKey {
+  if ($env:ANTHROPIC_AUTH_TOKEN) { return $env:ANTHROPIC_AUTH_TOKEN }
+  if ($env:FLEET_WORKER_KEYRING_SERVICE -and $env:FLEET_WORKER_KEYRING_KEY -and (Get-Command keyring -ErrorAction SilentlyContinue)) {
+    $k = (keyring get $env:FLEET_WORKER_KEYRING_SERVICE $env:FLEET_WORKER_KEYRING_KEY 2>$null)
+    if ($k) { return ($k | Out-String).Trim() }
+  }
+  if ($env:ZHIPU_API_KEY) { return $env:ZHIPU_API_KEY }
+  if ($env:GLM_API_KEY)   { return $env:GLM_API_KEY }
+  return $null
+}
+$key = Resolve-GlmKey
+if (-not $key) {
+  Write-Error @'
+fleet-worker: no API key resolved. Provide one of:
+  - $env:ANTHROPIC_AUTH_TOKEN = '<key>'
+  - $env:FLEET_WORKER_KEYRING_SERVICE / FLEET_WORKER_KEYRING_KEY  (uses `keyring get`)
+  - $env:ZHIPU_API_KEY = '<key>'   (or GLM_API_KEY)
+'@
+  exit 5
+}
+if (-not (Get-Command claude -ErrorAction SilentlyContinue)) {
+  Write-Error "fleet-worker: 'claude' (Claude Code) not found on PATH"; exit 5
+}
+
+$env:ANTHROPIC_BASE_URL          = if ($env:FLEET_WORKER_BASE_URL) { $env:FLEET_WORKER_BASE_URL } else { 'https://api.z.ai/api/anthropic' }
+$env:ANTHROPIC_AUTH_TOKEN        = $key
+$env:ANTHROPIC_DEFAULT_OPUS_MODEL   = if ($env:FLEET_WORKER_MODEL) { $env:FLEET_WORKER_MODEL } else { 'GLM-5.2' }
+$env:ANTHROPIC_DEFAULT_SONNET_MODEL = $env:ANTHROPIC_DEFAULT_OPUS_MODEL
+$env:ANTHROPIC_DEFAULT_HAIKU_MODEL  = if ($env:FLEET_WORKER_SMALL_MODEL) { $env:FLEET_WORKER_SMALL_MODEL } else { 'GLM-4.5-Air' }
+
+claude -p --model sonnet --permission-mode bypassPermissions @args
+exit $LASTEXITCODE

+ 155 - 0
skills/fleet-worker/tests/run.sh

@@ -0,0 +1,155 @@
+#!/usr/bin/env bash
+# Self-test for the fleet-worker skill scripts.
+#
+# Offline + deterministic (no network). Mocks `claude` and `keyring` on a
+# controlled PATH, then asserts the launcher's auth isolation, key-resolution
+# chain, no-key-leak, and arg forwarding; fleet-collect's success/failure gating;
+# and fleet-doctor's --offline / --help / ASCII purity. Resolves paths relative to
+# itself so it runs both in the repo and once installed to ~/.claude/skills/.
+#
+# Usage:   bash tests/run.sh
+# Exit:    0 all pass, 1 one or more failures (SKIP+exit 0 if jq is unavailable)
+set -uo pipefail
+
+HERE="$(cd "$(dirname "${BASH_SOURCE[0]}")" && pwd)"
+SKILL="$(dirname "$HERE")"
+SCRIPTS="$SKILL/scripts"
+WORKER="$SCRIPTS/fleet-worker"
+COLLECT="$SCRIPTS/fleet-collect.sh"
+DOCTOR="$SCRIPTS/fleet-doctor.sh"
+
+command -v jq >/dev/null 2>&1 || { echo "SKIP: jq not available"; exit 0; }
+
+SB="$(mktemp -d)"; trap 'rm -rf "$SB"' EXIT
+PASS=0; FAIL=0
+ok() { PASS=$((PASS+1)); printf '  PASS  %s\n' "$1"; }
+no() { FAIL=$((FAIL+1)); printf '  FAIL  %s\n' "$1"; }
+ee() { [ "$2" = "$3" ] && ok "$1 (exit $3)" || no "$1 (want $2 got $3)"; }
+eh() { case "$3" in *"$2"*) ok "$1";; *) no "$1 (missing '$2')";; esac; }
+
+echo "=== fleet-worker self-test ==="
+
+# ── Mock bin: a fake `claude` that records env+args to a probe file and prints
+#    only a marker, plus a fake `keyring`. A controlled PATH keeps the real
+#    claude out so launcher behaviour is deterministic. ──────────────────────
+MB="$SB/bin"; mkdir -p "$MB"
+PROBE="$SB/probe.txt"
+cat > "$MB/claude" <<EOF
+#!/usr/bin/env bash
+{
+  echo "CONFIG=\$CLAUDE_CONFIG_DIR"
+  echo "BASE=\$ANTHROPIC_BASE_URL"
+  echo "OPUS=\$ANTHROPIC_DEFAULT_OPUS_MODEL"
+  echo "SONNET=\$ANTHROPIC_DEFAULT_SONNET_MODEL"
+  echo "HAIKU=\$ANTHROPIC_DEFAULT_HAIKU_MODEL"
+  echo "TOKEN=\$ANTHROPIC_AUTH_TOKEN"
+  echo "ARGS=\$*"
+} > "$PROBE"
+echo "MOCK_CLAUDE_RAN"
+EOF
+chmod +x "$MB/claude"
+cat > "$MB/keyring" <<'EOF'
+#!/usr/bin/env bash
+# keyring get <service> <key>
+[ "${1:-}" = "get" ] && echo "KEYRING-TOKEN-${3:-}"
+EOF
+chmod +x "$MB/keyring"
+PC="$MB:/usr/bin:/bin"   # controlled PATH: mock first, no real claude
+
+echo "-- launcher --"
+"$WORKER" --help >/dev/null 2>&1; ee "worker --help" 0 $?
+
+# Success: env isolation + model mapping + arg forwarding + key never printed
+CFG="$SB/cfg-a"
+out="$(PATH="$PC" FLEET_WORKER_CONFIG_DIR="$CFG" \
+       ANTHROPIC_AUTH_TOKEN="" ZHIPU_API_KEY="SEKRET-AAA" GLM_API_KEY="" \
+       "$WORKER" --output-format json "do a thing" 2>&1)"; rc=$?
+ee "worker runs with key" 0 "$rc"
+eh "mock claude executed" "MOCK_CLAUDE_RAN" "$out"
+case "$out" in *SEKRET-AAA*) no "worker leaked key to its own output";; *) ok "worker never prints the key";; esac
+P="$(cat "$PROBE")"
+eh "isolated CLAUDE_CONFIG_DIR set" "CONFIG=$CFG" "$P"
+eh "z.ai base url set"              "BASE=https://api.z.ai/api/anthropic" "$P"
+eh "sonnet maps to GLM-5.2"         "SONNET=GLM-5.2" "$P"
+eh "haiku maps to GLM-4.5-Air"      "HAIKU=GLM-4.5-Air" "$P"
+eh "key reached claude via env"     "TOKEN=SEKRET-AAA" "$P"
+eh "bakes flags + forwards args"    "ARGS=-p --model sonnet --permission-mode bypassPermissions --output-format json do a thing" "$P"
+[ -f "$CFG/settings.json" ] && ok "seeds settings.json" || no "settings.json not seeded"
+eh "settings carries effortLevel"   "effortLevel" "$(cat "$CFG/settings.json" 2>/dev/null)"
+
+# Custom endpoint/model override
+: > "$PROBE"
+PATH="$PC" FLEET_WORKER_CONFIG_DIR="$SB/cfg-x" \
+  ANTHROPIC_AUTH_TOKEN="T" FLEET_WORKER_BASE_URL="https://example.test/anthropic" \
+  FLEET_WORKER_MODEL="GLM-9" FLEET_WORKER_SMALL_MODEL="GLM-9-mini" \
+  "$WORKER" "hi" >/dev/null 2>&1
+P="$(cat "$PROBE")"
+eh "custom base url honoured" "BASE=https://example.test/anthropic" "$P"
+eh "custom model honoured"    "SONNET=GLM-9" "$P"
+
+# Key chain: keyring
+: > "$PROBE"
+PATH="$PC" FLEET_WORKER_CONFIG_DIR="$SB/cfg-b" \
+  ANTHROPIC_AUTH_TOKEN="" ZHIPU_API_KEY="" GLM_API_KEY="" \
+  FLEET_WORKER_KEYRING_SERVICE="svc" FLEET_WORKER_KEYRING_KEY="glm" \
+  "$WORKER" "hi" >/dev/null 2>&1; ee "worker via keyring" 0 $?
+eh "keyring token reached claude" "TOKEN=KEYRING-TOKEN-glm" "$(cat "$PROBE")"
+
+# Key chain: GLM_API_KEY
+: > "$PROBE"
+PATH="$PC" FLEET_WORKER_CONFIG_DIR="$SB/cfg-c" \
+  ANTHROPIC_AUTH_TOKEN="" ZHIPU_API_KEY="" GLM_API_KEY="GAK-1" \
+  "$WORKER" "hi" >/dev/null 2>&1
+eh "GLM_API_KEY reached claude" "TOKEN=GAK-1" "$(cat "$PROBE")"
+
+# No key resolved -> exit 5
+PATH="$PC" FLEET_WORKER_CONFIG_DIR="$SB/cfg-d" \
+  ANTHROPIC_AUTH_TOKEN="" ZHIPU_API_KEY="" GLM_API_KEY="" \
+  "$WORKER" "hi" >/dev/null 2>&1; ee "no key -> 5" 5 $?
+
+# claude missing -> exit 5 (only when real claude isn't in the base PATH)
+if PATH="/usr/bin:/bin" command -v claude >/dev/null 2>&1; then
+  echo "  SKIP  claude-missing (claude resolves via base PATH)"
+else
+  MB2="$SB/bin2"; mkdir -p "$MB2"
+  PATH="$MB2:/usr/bin:/bin" FLEET_WORKER_CONFIG_DIR="$SB/cfg-e" ZHIPU_API_KEY="X" \
+    ANTHROPIC_AUTH_TOKEN="" GLM_API_KEY="" \
+    "$WORKER" "hi" >/dev/null 2>&1; ee "claude missing -> 5" 5 $?
+fi
+
+echo "-- fleet-collect.sh --"
+"$COLLECT" --help >/dev/null 2>&1; ee "collect --help" 0 $?
+out="$(printf '{"is_error":false,"result":"DELIVERABLE"}' | "$COLLECT" -q)"; rc=$?
+ee "collect success -> 0" 0 "$rc"
+eh "collect prints .result" "DELIVERABLE" "$out"
+printf '{"is_error":true,"api_error_status":529,"result":""}' | "$COLLECT" -q >/dev/null 2>&1
+ee "collect worker-failed -> 10" 10 $?
+printf 'not json' | "$COLLECT" -q >/dev/null 2>&1; ee "collect bad json -> 4" 4 $?
+printf '{"x":1}'  | "$COLLECT" -q >/dev/null 2>&1; ee "collect no is_error -> 4" 4 $?
+"$COLLECT" "$SB/nope.json" >/dev/null 2>&1;        ee "collect missing file -> 3" 3 $?
+"$COLLECT" --bogus >/dev/null 2>&1;                ee "collect bad flag -> 2" 2 $?
+
+echo "-- fleet-doctor.sh --"
+"$DOCTOR" --help >/dev/null 2>&1;        ee "doctor --help" 0 $?
+"$DOCTOR" --offline -q >/dev/null 2>&1;  ee "doctor --offline consistent -> 0" 0 $?
+"$DOCTOR" --bogus >/dev/null 2>&1;       ee "doctor bad flag -> 2" 2 $?
+out="$("$DOCTOR" --offline --json -q 2>/dev/null)"
+eh "doctor --json schema" "claude-mods.fleet-worker.doctor/v1" "$out"
+e="$(TERM_ASCII=1 FORCE_COLOR=1 "$DOCTOR" --offline 2>&1 1>/dev/null)"
+if printf '%s' "$e" | LC_ALL=C grep -q '[^[:print:][:cntrl:]]'; then
+  no "doctor framing pure ASCII under TERM_ASCII=1"
+else ok "doctor framing pure ASCII under TERM_ASCII=1"; fi
+grep -q '_lib/term.sh' "$DOCTOR" && ok "doctor sources term.sh" || no "doctor missing term.sh"
+
+# Drift tripwire: copy the skill into a temp dir with a SKILL.md that documents
+# no model name; the doctor run from there must report drift -> exit 10.
+DSB="$SB/skillcopy"; mkdir -p "$DSB/scripts" "$DSB/assets"
+cp "$WORKER" "$DSB/scripts/fleet-worker"
+cp "$DOCTOR" "$DSB/scripts/fleet-doctor.sh"
+printf '# fleet-worker\nThis doc deliberately mentions no model name or endpoint.\n' > "$DSB/SKILL.md"
+printf '{ "hooks": {}, "effortLevel": "high" }\n' > "$DSB/assets/worker-settings.json"
+bash "$DSB/scripts/fleet-doctor.sh" --offline -q >/dev/null 2>&1
+ee "drift (undocumented model) -> 10" 10 $?
+
+echo "=== $PASS passed, $FAIL failed ==="
+[ "$FAIL" -eq 0 ] || exit 1

+ 9 - 0
tests/check-resources.sh

@@ -57,6 +57,10 @@ echo "== mapbox-ops: fact/staleness verifier"
 run "mapbox-ops --offline consistent" 0 "$PY" skills/mapbox-ops/scripts/check-mapbox-facts.py --offline
 run "mapbox-ops --help"               0 "$PY" skills/mapbox-ops/scripts/check-mapbox-facts.py --help
 
+echo "== fleet-worker: doctor (preflight + staleness) verifier"
+run "fleet-doctor --offline consistent" 0 bash skills/fleet-worker/scripts/fleet-doctor.sh --offline
+run "fleet-doctor --help"               0 bash skills/fleet-worker/scripts/fleet-doctor.sh --help
+
 echo "== protocol: every new verifier is executable + compiles"
 for s in skills/claude-api-ops/scripts/check-model-table.py \
          skills/claude-code-ops/scripts/validate-hooks-json.py \
@@ -70,6 +74,8 @@ bash -n skills/ffmpeg-ops/scripts/verify-commands.sh 2>/dev/null \
     && pass "bash -n verify-commands.sh" || bad "bash -n verify-commands.sh"
 bash -n skills/ytdlp-ops/scripts/check-ytdlp-version.sh 2>/dev/null \
     && pass "bash -n check-ytdlp-version.sh" || bad "bash -n check-ytdlp-version.sh"
+bash -n skills/fleet-worker/scripts/fleet-doctor.sh 2>/dev/null \
+    && pass "bash -n fleet-doctor.sh" || bad "bash -n fleet-doctor.sh"
 
 echo "== terminal design: verifier framing adopts term.sh and is ASCII-pure"
 # Each verifier renders its human framing on stderr; under TERM_ASCII=1 every
@@ -88,8 +94,11 @@ purity "hooks-lint"  "$PY" skills/claude-code-ops/scripts/validate-hooks-json.py
 __tf="$(mktemp)"; printf '{"suites":[]}' > "$__tf"
 purity "flake-triage" "$PY" skills/playwright-ops/scripts/triage-flakes.py "$__tf"
 rm -f "$__tf"
+purity "fleet-doctor"  bash skills/fleet-worker/scripts/fleet-doctor.sh --offline
 grep -q '_lib/term.sh' skills/terraform-ops/scripts/check-action-refs.sh \
     && pass "check-action-refs sources term.sh" || bad "check-action-refs missing term.sh"
+grep -q '_lib/term.sh' skills/fleet-worker/scripts/fleet-doctor.sh \
+    && pass "fleet-doctor sources term.sh" || bad "fleet-doctor missing term.sh"
 for s in skills/claude-api-ops/scripts/check-model-table.py \
          skills/claude-code-ops/scripts/validate-hooks-json.py \
          skills/playwright-ops/scripts/triage-flakes.py; do