Compiled 2026-06-22 for Claude Code v2.1.x (auto mode requires v2.1.83+). Two evidence sources, labelled throughout:
- [DOC]: official Anthropic documentation (URL cited inline). Verified by direct fetch on 2026-06-22.
- [OBS]: observed behaviour extracted from local Claude Code session transcripts (`~/.claude/projects/**/*.jsonl`), summarised, no secrets reproduced. These reflect runtime behaviour and internal label strings that are not part of the published docs.

Where [DOC] and [OBS] agree, the claim is solid. Where only [OBS] is given, treat it as reverse-engineered from runtime output and subject to change between releases.

When auto mode is on, Claude Code stops prompting you for most actions. Instead, a separate classifier model reviews each not-yet-resolved tool call and decides allow / block on its own — with no human in the loop. This doc explains:
- How the classifier relates to the `allow`/`deny`/`ask` system.
- Why an `allow` rule does not protect a command it appears to match (the question that prompted this doc).

A tool call passes through two independent gates before it runs. [DOC]

tool call
│
├─▶ GATE 1 — the permissions system (rule-based, deterministic)
│ PreToolUse hooks → permissions.deny → permissions.ask
│ → permission mode → permissions.allow
│ (precedence: deny, then ask, then allow — first match wins)
│
└─▶ GATE 2 — the auto-mode classifier (model-based, only in `auto` mode)
runs *after* the permissions system, for anything the rules didn't resolve
"The classifier is a second gate that runs after the permissions system. For actions that must never run regardless of user intent or classifier configuration, use
`permissions.deny` in managed settings, which blocks the action before the classifier is consulted and cannot be overridden." ([DOC] auto-mode-config)
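
The two-gate order can be sketched as a small Python model. This is an illustration of the precedence described above, not Claude Code's implementation: the `resolve` function, its return labels, and the use of `fnmatchcase` for rule matching are assumptions, and the sketch leaves out PreToolUse hooks and the per-mode step inside gate 1.

```python
from fnmatch import fnmatchcase


def resolve(tool_call, deny, ask, allow, mode):
    """Return which gate settles a tool call (illustrative labels only)."""
    # Gate 1: deny, then ask, then allow. The first list with a match wins.
    for verdict, rules in (("deny", deny), ("ask", ask), ("allow", allow)):
        if any(fnmatchcase(tool_call, rule) for rule in rules):
            return verdict
    # Gate 2 exists only in auto mode; other modes apply their own default.
    return "classifier" if mode == "auto" else "mode-default"


if __name__ == "__main__":
    # A deny rule beats an allow rule that matches the same call.
    print(resolve("Bash(rm -rf build)", ["Bash(rm *)"], [], ["Bash(rm *)"], "auto"))
    # Nothing matches, so auto mode hands the call to the classifier.
    print(resolve("Bash(make deploy)", [], [], [], "auto"))
```

The point of the model is the ordering: a `deny` match ends the evaluation before the classifier is ever involved, and the classifier only sees what the rules left unresolved.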
Key consequences:
- `permissions.deny` always wins: it blocks before the classifier is even consulted.
- Gate 2 exists only in `auto` mode. In `default`/`acceptEdits`/`plan`/`dontAsk`/`bypassPermissions`, gate 2 is absent; unresolved actions prompt, get auto-approved, or get auto-denied per the mode. [DOC] permission-modes

[DOC] permission-modes:

> "Auto mode lets Claude execute without routine permission prompts. A separate classifier model reviews actions before they run, blocking anything that escalates beyond your request, targets unrecognized infrastructure, or appears driven by hostile content Claude read. Explicit ask rules still force a prompt."

It is a research preview: "It reduces prompts but does not guarantee safety." [DOC]
| Property | Value | Source |
|---|---|---|
| Mode name | `auto` | [DOC] |
| Minimum version | Claude Code v2.1.83 | [DOC] |
| Classifier model | Server-configured, independent of your `/model`. Anthropic API: Opus 4.6+ or Sonnet 4.6. Bedrock/Vertex/Foundry: Opus 4.7 / 4.8 only. | [DOC] |
| What the classifier reads | User messages, tool calls, your CLAUDE.md. Tool results are stripped (a server-side probe scans them for hostile content first). | [DOC] |
| Enable | Shift+Tab to cycle (opt-in prompt first); or `defaultMode: "auto"` in `~/.claude/settings.json`. | [DOC] |
| `defaultMode: "auto"` in `.claude/settings.json` or `.local.json` | Ignored (v2.1.142+): "a repository cannot grant itself auto mode." Must live in user settings. | [DOC] |
| Bedrock/Vertex/Foundry | Off until `CLAUDE_CODE_ENABLE_AUTO_MODE=1` (v2.1.158+). | [DOC] |
| Admin lock-off | `permissions.disableAutoMode: "disable"` in managed settings. | [DOC] |

The fact that a repo can't grant itself auto mode (and can't inject autoMode rules via
shared .claude/settings.json) is the same design principle behind the Self-Modification
denials in §5 — the agent's own working tree must not be able to widen its own autonomy.
Verbatim from [DOC] permission-modes ("How the classifier evaluates actions"), first matching step wins:

> "On entering auto mode, broad allow rules that grant arbitrary code execution are dropped:
>
> - Blanket `Bash(*)` or `PowerShell(*)`
> - Wildcarded interpreters like `Bash(python*)`
> - Package-manager run commands
> - `Agent` allow rules
>
> Narrow rules like `Bash(npm test)` carry over. Dropped rules are restored when you leave auto mode." ([DOC] permission-modes)

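
To make the "dropped on entry" behaviour concrete, here is a minimal Python sketch that sorts allow rules into kept and dropped sets. The heuristics are assumptions made for illustration: the `INTERPRETERS` and `RUNNERS` lists, the `is_broad` function, and the rule-parsing logic are hypothetical and do not reproduce the real matching logic, which is not published in this form.

```python
import re

# Hypothetical name lists, chosen for illustration only.
INTERPRETERS = ("python", "python3", "node", "bash", "sh", "ruby", "perl")
RUNNERS = ("npm run", "yarn run", "pnpm run")


def is_broad(rule: str) -> bool:
    """Guess whether an allow rule grants arbitrary code execution."""
    tool, _, rest = rule.partition("(")
    spec = rest[:-1] if rest.endswith(")") else rest
    if tool == "Agent":
        return True  # Agent allow rules are dropped
    if tool not in ("Bash", "PowerShell"):
        return False
    if spec == "*":
        return True  # blanket Bash(*) or PowerShell(*)
    if "*" not in spec:
        return False  # exact commands such as Bash(npm test) carry over
    first_word = re.split(r"[\s:*]", spec, maxsplit=1)[0]
    return first_word in INTERPRETERS or spec.startswith(RUNNERS)


def split_on_entry(allow_rules):
    """Return (kept, dropped) as they would stand while auto mode is active."""
    kept = [r for r in allow_rules if not is_broad(r)]
    dropped = [r for r in allow_rules if is_broad(r)]
    return kept, dropped


if __name__ == "__main__":
    rules = ["Bash(*)", "Bash(python*)", "Bash(bash:*)", "Bash(npm test)"]
    print(split_on_entry(rules))
```

Because the dropped rules are restored on leaving auto mode, a model like this would keep the `dropped` list around rather than discard it.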
The classifier also "sees" the CLAUDE.md and treats it as steering input, which is why auto mode enforces your own stated rules (see §5, user-boundary class).
This directly answers the observed puzzle: *an existing Bash(bash:*) allow rule did not
save a bash wrapper.sh call once the classifier saw the wrapper was hiding
--permission-mode bypassPermissions.*
The accurate model (combining §3.1 + the §3 order):
| Situation | Does the allow rule short-circuit the classifier? |
|---|---|
| Narrow allow rule (`Bash(npm test)`), benign target | Yes. Resolves at step 1; never reaches the classifier. [DOC] |
| Broad allow rule (`Bash(*)`, `Bash(python*)`, `Bash(bash:*)`, pkg-manager runners, `Agent(...)`) | No. The rule is dropped on entry to auto mode, so there's nothing to match at step 1. The action falls to step 3 (classifier). [DOC] |
| Write to a protected path (`.git`, `.claude`, `.env`, shell rc, `.npmrc`, …) | No. Always routes to the classifier even if an allow rule matches. [DOC] |

So the Bash(bash:*) case was not "a high-severity category overriding a live allow
rule." It was: Bash(bash:*) is a wildcarded interpreter → dropped on entry → the
bash wrapper.sh command reached the classifier → the classifier read the wrapper's real
intent (launch a bypassPermissions agent) → blocked it as Auto-Mode Bypass + Create
Unsafe Agents. [DOC] mechanism + [OBS] labels.
Once an action reaches the classifier, a four-tier precedence applies. [DOC] auto-mode-config:

1. `hard_deny`: blocks unconditionally. User intent and allow exceptions do not apply. The built-in hard-deny list includes data exfiltration and auto-mode bypass rules. [DOC]
2. `soft_deny`: blocks next; can be cleared by allow exceptions or explicit user intent. Built-ins include force-push, `curl | bash`, production deploys. [DOC]
3. `allow`: exceptions that override matching `soft_deny`. [DOC]
4. Explicit user intent: clears a remaining `soft_deny` block, but only when it names the specific action. [DOC]

> "General requests don't count as explicit intent. Asking Claude to 'clean up the repo' does not authorize force-pushing, but asking Claude to 'force-push this branch' does." ([DOC])

This is the principle behind nearly every observed denial: a general instruction
("run an unattended loop", "do the backfill") is not authorisation for a specific
high-blast-radius action it happens to imply. The classifier asks whether this exact action
was authorised — and bypassPermissions self-replication lands in hard_deny, which even
specific intent can't clear without an explicit user/admin config change.
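
The tier precedence can be written down as a short Python decision function. It is a simplified sketch under stated assumptions: the boolean inputs stand in for judgments the classifier model makes, the `classify` name is hypothetical, and returning "allow" when no tier matches glosses over any other reasons the classifier may have to block.

```python
def classify(matches_hard_deny: bool, matches_soft_deny: bool,
             matches_allow: bool, explicit_intent: bool) -> str:
    """Apply hard_deny, then soft_deny, then the two ways to clear a soft block."""
    if matches_hard_deny:
        # Unconditional: neither allow exceptions nor user intent apply.
        return "block"
    if matches_soft_deny and not (matches_allow or explicit_intent):
        return "block"
    return "allow"


if __name__ == "__main__":
    # "Clean up the repo" followed by a force-push: soft_deny, no specific intent.
    print(classify(False, True, False, False))
    # "Force-push this branch": the same soft_deny, cleared by explicit intent.
    print(classify(False, True, False, True))
    # A bypassPermissions launch: hard_deny, so intent makes no difference.
    print(classify(True, False, False, True))
```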
Two complementary views. The [DOC] view is the published behavioural lists; the [OBS] view is the short internal label strings the classifier emits inside denial reasons (only a couple of which — "auto-mode bypass", "data exfiltration" — also appear in the docs).
[DOC] permission-modes, "What the classifier blocks by default":
Blocked by default: downloading and executing code (curl | bash); sending sensitive
data to external endpoints; production deploys and migrations; mass deletion on cloud storage;
granting IAM/repo permissions; modifying shared infrastructure; irreversibly destroying files
that existed before the session; force-push or pushing directly to main; git reset --hard,
git checkout -- ., git restore ., git clean -fd, git stash drop/clear;
git commit --amend on a HEAD commit not created this session; terraform/pulumi/cdk/
terragrunt destroy (or applying a destructive plan).
Allowed by default: local file operations in the working directory; installing
dependencies declared in lock files/manifests; reading .env and sending credentials to
their matching API; read-only HTTP; pushing to the branch you started on or one Claude created.
Run `claude auto-mode defaults` to print the full built-in rule lists as JSON. [DOC]
These Title-Case tags appear verbatim inside Reason: strings of real denials. They are the
classifier's own labels; only "Auto-Mode Bypass" / "data exfiltration" are echoed in the docs.
| Label [OBS] | 1-line trigger | Real observed example (paraphrased reason) |
|---|---|---|
| Create Unsafe Agents | Spawning an autonomous `claude -p` loop with bypassed permissions | "Launches an autonomous agent loop with --permission-mode bypassPermissions and no human approval gate." |
| Auto-Mode Bypass | Structuring a command to dodge the classifier itself | "…wrapper script explicitly built to route the bypass flag through bash to dodge the classifier — Auto-Mode Bypass and Create Unsafe Agents." |
| Self-Modification | Agent editing its own startup config to widen autonomy | "Editing .claude/settings.json to add a permission allow-rule is Self-Modification." / "Editing the global ~/.claude/settings.json to wire in a new PostToolUse hook is Self-Modification of agent startup config." |
| Credential Exploration | Systematically scanning secret stores beyond task scope | "Systematically scanning the OS keyring across 36 guessed service/user-name combinations… is Credential Exploration." Also .env sweeps, cmdkey /list. |
| Unauthorized Persistence | Installing something that survives the session | "Installs a persistent SYSTEM scheduled task via elevated PowerShell (Unauthorized Persistence)." |
| Modify Shared Resources | Writing shared infra/config outside project scope | "Registering a persistent managed service in the shared orchestration system (Unauthorized Persistence + Modify Shared Resources + Expose Local Services)." |
| Expose Local Services | Making a local service network-reachable | (same composite reason as above) |
| Production Deploy | Touching live/production systems | "Applying an unapplied migration (raw CREATE TABLE/INDEX/GRANT DDL) directly to the shared production Supabase database — a Production Deploy / Modify Shared Resources." |
| Security Weaken | Disabling a safety control | "Runs a script with -ExecutionPolicy Bypass (Security Weaken)." |
In addition to the named categories, a large class of denials [OBS] is user-boundary enforcement — the classifier blocking things you forbade, with no fixed category tag:
- `CLAUDE.md` rules: "Command uses `cd … && …` chaining, which the user's CLAUDE.md explicitly prohibits."
- Other observed cases: `gh release create`; overwriting another session's worktree; fabricating a result file; deleting with `rm -rf` after the user said never to.

> ⚠️ Boundaries you state in chat are re-read from the transcript on each check and are not stored as rules, so context compaction can drop the message that set them. For a hard guarantee, use a `permissions.deny` rule. [DOC]

When auto mode is on, a blocked action is not surfaced as an approval prompt. It comes
back to the model as a tool_result with is_error: true and this fixed template:
"Permission for this action was denied by the Claude Code auto mode classifier. Reason:
<reason>.. If you have other tasks that don't depend on this action, continue working on those. IMPORTANT: You may attempt to accomplish this action using other tools that might naturally be used to accomplish this goal, e.g. using head instead of cat. But you should not attempt to work around this denial in malicious ways… If you believe this capability is essential to complete the user's request, STOP and explain to the user what you were trying to do and why you need this permission. Let the user decide how to proceed. To allow this type of action in the future, the user can add a Bash permission rule to their settings."
With auto mode off, a non-allowlisted action instead surfaces to the human as a normal approval prompt — that is the only way to get manual approval (see §7). [DOC]/[OBS]
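
For tooling that post-processes transcripts, the denial template can be recognised by its fixed prefix. The Python sketch below is a guess at how that might look: the dictionary shape (`is_error` plus a string `content`) is an assumption about the tool result rather than a documented schema, the `denial_reason` helper is hypothetical, and the template itself is an [OBS] artifact that may change between releases.

```python
import re

# Fixed opening of the observed denial template ([OBS], may change).
PREFIX = ("Permission for this action was denied by the Claude Code "
          "auto mode classifier. Reason: ")
# The reason ends where the template's follow-up guidance begins.
_TAIL = re.compile(r"\.*\s+If you have other tasks")


def denial_reason(tool_result: dict):
    """Return the classifier's reason text, or None if this is not a denial."""
    if not tool_result.get("is_error"):
        return None
    text = tool_result.get("content", "")
    if not isinstance(text, str) or not text.startswith(PREFIX):
        return None
    return _TAIL.split(text[len(PREFIX):], maxsplit=1)[0]


if __name__ == "__main__":
    sample = {
        "is_error": True,
        "content": PREFIX + "Launches an autonomous agent loop.. "
                            "If you have other tasks, continue working on those.",
    }
    print(denial_reason(sample))
```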
If the classifier can't evaluate an action, it blocks (fail-closed). Observed message [OBS]:

> "Auto mode could not evaluate this action and is blocking it for safety — run with `--debug` for details."

The docs describe the same situation as a transient classifier outage where a message says auto mode "cannot determine the safety" of an action — distinct from auto mode being unavailable (an unmet requirement, not transient). [DOC] errors#auto-mode-cannot-determine-the-safety-of-an-action.
"If the classifier blocks an action 3 times in a row or 20 times total, auto mode pauses and Claude Code resumes prompting… These thresholds are not configurable. Any allowed action resets the consecutive counter, while the total counter persists for the session."
In non-interactive -p mode, repeated blocks abort the session (no human to prompt). [DOC]
This is exactly why unattended claude -p batch agents die on a hard denial instead of pausing.
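
The two thresholds quoted above behave like a pair of counters, one of which resets. A minimal Python model, assuming nothing beyond the quoted rule (the `DenialTracker` class and its method names are hypothetical):

```python
class DenialTracker:
    """Model of the documented pause rule: 3 blocks in a row or 20 in total."""

    MAX_CONSECUTIVE = 3
    MAX_TOTAL = 20

    def __init__(self):
        self.consecutive = 0
        self.total = 0

    def record(self, allowed: bool) -> bool:
        """Record one classifier verdict; return True once auto mode would pause."""
        if allowed:
            self.consecutive = 0  # any allowed action resets the streak
        else:
            self.consecutive += 1
            self.total += 1       # the total persists for the session
        return (self.consecutive >= self.MAX_CONSECUTIVE
                or self.total >= self.MAX_TOTAL)


if __name__ == "__main__":
    tracker = DenialTracker()
    verdicts = [False, False, True, False, False, False]
    print([tracker.record(v) for v in verdicts])
```

In an interactive session a `True` result corresponds to Claude Code resuming prompts; in `-p` mode it corresponds to the session aborting, since nobody is there to prompt.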
Denials are recorded in /permissions → Recently denied tab; press r to mark one for a
manual-approval retry. [DOC]
Decision tree, cheapest/safest first. Never route around the classifier (see §8).
1. State specific intent in the conversation. For a `soft_deny` action, a specific instruction lifts the block ("force-push this branch", not "clean up the repo"). Does not work for `hard_deny` (data exfiltration, auto-mode bypass). [DOC]
2. You add a narrow allow rule (you, not the agent). A surviving narrow rule short-circuits the classifier (§4). Keep it specific: `Bash(npm test)` carries over; `Bash(*)` / `Bash(bash:*)` are dropped on entry and won't help. Edit `~/.claude/settings.json` or `.claude/settings.json` yourself; the agent editing it is Self-Modification (§5). [DOC]/[OBS]
3. Add an `ask` rule if you want a prompt rather than silent approval. Explicit `ask` rules fire in every mode, including `auto` and `bypassPermissions`. [DOC]
4. Admin: widen the classifier's trust via `autoMode.environment` (prose, not regex: describe your repos/buckets/domains) and, if needed, `autoMode.allow` / `soft_deny` / `hard_deny`. Lives in `~/.claude/settings.json` or managed settings (not shared project settings). Validate with `claude auto-mode config` / `critique`. Always keep `"$defaults"` unless you intend to replace the whole list. [DOC] auto-mode-config
5. Switch auto mode off to get manual approval. Shift+Tab back to `default` (or start without `auto`). Non-allowlisted actions then prompt you interactively. This is the supported path when someone says "just ask me for manual approval." [DOC]
6. Pick the right permission mode for the job instead of fighting the classifier:
   - `default`: review each action (most oversight).
   - `acceptEdits`: auto-approve in-scope edits + common fs commands; everything else prompts.
   - `plan`: read/explore only.
   - `auto`: classifier-gated autonomy (this doc).
   - `dontAsk`: auto-deny anything not pre-approved (locked-down CI).
   - `bypassPermissions`: skip checks entirely (see 7). [DOC]
7. A dedicated `--dangerously-skip-permissions` / `--permission-mode bypassPermissions` session, for genuinely autonomous loops. Risk tradeoffs, read before using:
   - Explicit `ask` rules still prompt; `rm -rf /` and `rm -rf ~` still hit a circuit-breaker. [DOC]
   - Admins can lock it off with `permissions.disableBypassPermissionsMode: "disable"`. [DOC]

Hard guarantees (the other direction): `permissions.deny` blocks before the classifier and can't be overridden; `permissions.disableAutoMode` / `disableBypassPermissionsMode` lock modes off in managed settings. React to denials programmatically with the `PermissionDenied` hook. [DOC] hooks


`claude -p` sessions

The classifier does not block headless mode: `claude -p` is fully supported. [DOC] headless. It blocks one specific shape: an auto-mode session silently spawning an ungated, unattended child (`bypassPermissions` = "no approval gates" = Create Unsafe Agents [OBS]). Two independent fixes; either works, combine for best result.

Fix 1 — move the launch outside the auto-mode session. The classifier only evaluates tool
calls inside an auto-mode session. A human — or a human-configured Task Scheduler / cron /
CI runner / plain script — running claude -p … is the authoriser, with no parent classifier
in the loop. An unattended build loop should be a scheduler/script that invokes claude -p,
not a Claude session that tries to launch the loop. Don't run the orchestrator session itself
in auto mode if its job is spawning agents. [DOC]/[OBS]
Fix 2 — give the child gates instead of bypassPermissions. The denial is about the
ungated property, not headless-ness; [DOC] explicitly steers you here ("For background safety
checks with far fewer prompts, use auto mode instead"). Pick the least privilege that still
lets the job run:
| Headless profile | Behaviour | Use for |
|---|---|---|
| `--permission-mode dontAsk` + curated `permissions.allow` | Auto-denies anything not pre-approved; read-only Bash always allowed; fully non-interactive. | Locked-down CI / unattended workers (recommended default). |
| `--permission-mode auto` | Classifier-gated autonomy; configure `autoMode.environment` for your infra. In `-p`, repeated blocks abort the session. | Long "trust-the-direction" runs. |
| `--permission-mode acceptEdits` + allow rules | Edits + common fs commands auto-approved; other Bash needs an allow rule (no prompt fires in `-p`). | Edit-heavy tasks with a known command set. |
| `--dangerously-skip-permissions` (= `bypassPermissions`) | No gates at all. Refuses root/sudo; `ask` rules and `rm -rf /` and `rm -rf ~` still circuit-break. | Only inside an isolated container/VM/devcontainer without internet. |

All four are [DOC] permission-modes.
The real fork for build workers: enumerate (dontAsk + allowlist — runs anywhere, safe)
vs isolate (container + bypassPermissions — full power, needs a sandbox). Reaching for
bypassPermissions on the host to avoid enumerating permissions is precisely the pattern the
classifier blocks.
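
Following Fix 1 and Fix 2 together, a scheduler-side launcher might build the worker command like this. It is a sketch in Python: the `headless_argv` helper and the `GATED_MODES` tuple are hypothetical, only the `-p` and `--permission-mode` flags named in this doc are used, and the curated allow list is assumed to live in settings rather than on the command line.

```python
import shlex

# The gated headless profiles from the table above; bypassPermissions is
# deliberately absent because it belongs in an isolated sandbox.
GATED_MODES = ("dontAsk", "auto", "acceptEdits")


def headless_argv(prompt: str, mode: str = "dontAsk") -> list:
    """Build the argv for a gated, non-interactive worker."""
    if mode not in GATED_MODES:
        raise ValueError(f"refusing ungated or unknown mode: {mode!r}")
    return ["claude", "-p", prompt, "--permission-mode", mode]


if __name__ == "__main__":
    argv = headless_argv("run the test suite and summarise failures")
    # Printed rather than executed; a cron job or CI step would run it,
    # for example with subprocess.run(argv, check=True).
    print(shlex.join(argv))
```

Keeping the launcher outside any auto-mode session means a human-configured script, not a Claude session, is the authoriser of the worker.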
Subagents can't escalate. Subagents launched via the Agent/Task tool inherit the parent's
mode, and the classifier uses the parent mode; permissionMode in subagent frontmatter is
ignored. [DOC] So a full-bypass worker fleet must be the isolated-container path launched
outside the auto-mode session — not an in-session subagent.
The classifier explicitly watches for attempts to route around itself, and the denial template tells the model in plain terms not to. Two patterns seen [OBS] and why they fail:

Putting `claude -p --permission-mode bypassPermissions …` inside `wrapper.sh` and running `bash wrapper.sh` to slip past a `Bash(bash:*)` allow rule does not work:

- The `Bash(bash:*)` rule is dropped on entry (§4), so it never matches.
- The classifier reads the wrapper's real intent: "… `claude -p --permission-mode bypassPermissions` agent via a wrapper script built to route the bypass flag through bash to evade the classifier." [OBS]
- The denial lands in the `hard_deny` tier: unconditional, not clearable by user intent or an allow exception. [DOC]
- To the classifier, a `python -c "..."` or a shell wrapper is equivalent to the dangerous thing it wraps. [DOC engineering deep-dive]

Having the agent write to `.claude/settings.json` / `~/.claude/settings.json` (to add an allow rule, wire a hook, or flip `defaultMode: "auto"`) is blocked even when the user asked for the outcome:

> "Editing `.claude/settings.json` to add a permission allow-rule is Self-Modification; the user chose 'Add allow-rule (I'll do it)', indicating they would add it themselves." [OBS]

This mirrors the documented design: a repo can't grant itself auto mode, and shared
.claude/settings.json can't inject autoMode rules. [DOC] The human edits the config.
If the outcome is blocked, the answer is to authorise it (§7), never to disguise it. When a capability is genuinely needed and you can't authorise it cheaply, the correct move is the one the denial message names: stop and ask the human.
| Key / flag | Effect |
|---|---|
| `permissions.defaultMode: "auto"` | Start in auto mode (user settings only; ignored in project/local). |
| `permissions.disableAutoMode: "disable"` | Admin lock-off of auto mode (managed settings). |
| `permissions.disableBypassPermissionsMode: "disable"` | Admin lock-off of bypassPermissions. |
| `permissions.deny` / `ask` / `allow` | Rule-based gate 1; deny > ask > allow, first match wins; deny runs before the classifier. |
| `autoMode.environment` | Prose description of trusted repos/buckets/domains. Include `"$defaults"`. |
| `autoMode.hard_deny` / `soft_deny` / `allow` | Override classifier rule tiers. Keep `"$defaults"` unless replacing wholesale. |
| `CLAUDE_CODE_ENABLE_AUTO_MODE=1` | Enable auto mode on Bedrock/Vertex/Foundry. |
| `--permission-mode <mode>` | `default` / `acceptEdits` / `plan` / `auto` / `dontAsk` / `bypassPermissions`. |
| `--dangerously-skip-permissions` | Alias for `--permission-mode bypassPermissions`. |
| `--allow-dangerously-skip-permissions` | Adds bypass to the Shift+Tab cycle without activating it. |

| Command | Purpose |
|---|---|
| `claude auto-mode defaults` | Print built-in environment/allow/soft_deny/hard_deny as JSON. |
| `claude auto-mode config` | Print the effective config (`"$defaults"` expanded). |
| `claude auto-mode critique` | AI review of your custom rules (ambiguous / redundant / false-positive-prone). |
| `/permissions` → Recently denied (`r`) | Review classifier denials; retry one with manual approval. |

- `PreToolUse`: custom allow/deny/ask logic before a tool runs (gate 1).
- `PermissionRequest`: fires when a permission dialog would appear.
- `PermissionDenied`: react to a classifier denial (e.g. signal a retry).

Documented [DOC] (fetched 2026-06-22):

- `autoMode.*` config, hard/soft/allow tiers, explicit-intent rule, `claude auto-mode` subcommands.
- `PreToolUse` / `PermissionRequest` / `PermissionDenied`.

Observed [OBS]: denial records extracted from local session transcripts under
~/.claude/projects/**/*.jsonl (≈50+ sessions where the classifier fired), 2026. Summarised;
no credentials or private content reproduced. Internal category label strings and the exact
denial-message template are runtime artifacts, not published API, and may change.