feat(auto-skill): surface suggestions via pending.log + harness whitelist

Auto-skill's Stop hook emitted systemMessage JSON that only Claude saw,
so ~80 suggestions over the past week vanished unnoticed. Three fixes:

- evaluate.sh now also appends to ~/.claude/auto-skill/pending.log
  (ISO8601 | session | cwd | writes | unique | total | histogram),
  so suggestions survive past end-of-turn.
- /sync reads the log at session start and surfaces entries from the
  last 72h under a "Skill Suggestions" section - the one place the
  user reliably sees them.
- track-tools.sh tags Skill calls as Skill:<name> so Gate 1 can
  whitelist harness skills (sync, save, introspect, auto-skill,
  setperms, tool-discovery). Previously, running /sync at session
  start disabled the hook for the rest of that session.

SKILL.md documents the new pending/clear subcommands, the Pending
Log format, and the harness whitelist.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0xDarkMatter 1 month ago
parent
commit
bd377e4462

+ 34 - 1
commands/sync.md

@@ -171,7 +171,29 @@ Instead, check your system prompt for the memory content you already have, and s
 
 This costs zero extra tokens while confirming the safety net is working.
 
-### Step 7: Output
+### Step 7: Check Pending Skill Suggestions
+
+The `auto-skill` Stop hook appends to `~/.claude/auto-skill/pending.log` whenever
+it detects a skill-worthy session. The same suggestion also goes to Claude via
+`systemMessage`, which often goes unmentioned, so `/sync` is where the user sees it.
+
+```bash
+LOG="$HOME/.claude/auto-skill/pending.log"
+[ -f "$LOG" ] || exit 0
+
+# Show entries from the last 72 hours
+CUTOFF=$(date -d '72 hours ago' -Iseconds 2>/dev/null || \
+         date -v-72H '+%Y-%m-%dT%H:%M:%S%z' 2>/dev/null | sed 's/\(..\)$/:\1/')
+
+awk -F'|' -v cutoff="$CUTOFF" '$1 >= cutoff' "$LOG" 2>/dev/null | tail -10
+```
+
+- If the log doesn't exist, or has no entries from the last 72h, skip silently
+- If entries exist, show a "Skill Suggestions" section with each row
+- Format per row: timestamp (local), writes/unique, cwd, top tools
+- Offer: run `/auto-skill` to capture, or `auto-skill clear` to dismiss
+
+### Step 8: Output
 
 Format and display unified status.
 
@@ -248,6 +270,17 @@ Run `pigeon read` to read.
 
 [If no unread messages or pigeon not installed: omit this section entirely]
 
+## Skill Suggestions
+
+[If ~/.claude/auto-skill/pending.log has entries from the last 72 hours:]
+2 skill-worthy sessions detected (you missed the in-turn prompts):
+  2026-04-24 19:28  |  12w/5t  |  X:/Forge/Axiom           |  Write(4) Edit(3) Bash(3)
+  2026-04-24 14:47  |  9w/4t   |  X:/Forge/claude-mods     |  Edit(4) Bash(3) Write(2)
+
+Run `/auto-skill` to capture a workflow, or `auto-skill clear` to dismiss.
+
+[If no pending entries in last 72h, or log doesn't exist: omit this section entirely]
+
 ## Quick Reference
 
 | Category | Items |

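As a minimal sketch of the Step 7 filter in `commands/sync.md`, the function below wraps the same `awk` comparison so it can be tried against a sample log. `recent_entries` is a hypothetical name, not part of the commit. The comparison is plain string ordering on field 1, so it assumes every row and the cutoff carry the same UTC offset.

```shell
# Hypothetical helper (not in the commit): print up to the last 10 log rows
# whose first pipe-delimited field sorts at or after the cutoff string.
recent_entries() {
  local log="$1" cutoff="$2"
  [ -f "$log" ] || return 0        # no log yet: nothing to show
  [ -n "$cutoff" ] || return 0     # no cutoff computed: skip rather than show everything
  # ISO8601 timestamps are not numeric, so awk compares them as strings.
  awk -F'|' -v cutoff="$cutoff" '$1 >= cutoff' "$log" | tail -10
}
```

Unlike the committed snippet, this sketch returns nothing when the cutoff is empty. That is a deliberate assumption for the example, not behaviour taken from the commit.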
+ 36 - 1
skills/auto-skill/SKILL.md

@@ -31,6 +31,8 @@ Parse arguments after `auto-skill` (or `/auto-skill`):
 | `auto-skill off --project` | Disable for this project: `mkdir -p .claude && touch .claude/auto-skill.disable` |
 | `auto-skill on --project` | Enable for this project: `rm -f .claude/auto-skill.disable` |
 | `auto-skill status` | Show current state (see Status section below) |
+| `auto-skill pending` | Show all entries in `~/.claude/auto-skill/pending.log` (past suggestions the user may have missed) |
+| `auto-skill clear` | Truncate `~/.claude/auto-skill/pending.log` after confirming with user |
 
 ### Status
 
@@ -179,6 +181,39 @@ After creating, verify the skill:
 3. Confirm the procedure section exists and has steps
 4. Tell the user the skill is ready and how to invoke it
 
+## Pending Log
+
+Because `systemMessage` output from the Stop hook is delivered to Claude (not
+directly to the user), suggestions often die silently when the user's next
+prompt doesn't invite them to be mentioned. To solve this, the hook also
+appends a line to `~/.claude/auto-skill/pending.log` each time it fires:
+
+```
+2026-04-24T19:28:03+10:00|9dc8576c|/x/forge/axiom|12|5|28|Write(4) Edit(3) Bash(3)
+```
+
+Fields (pipe-delimited):
+
+| # | Field | Example |
+|---|-------|---------|
+| 1 | ISO8601 timestamp | `2026-04-24T19:28:03+10:00` |
+| 2 | Short session ID | `9dc8576c` |
+| 3 | CWD when suggestion fired | `/x/forge/axiom` |
+| 4 | Mutating op count | `12` |
+| 5 | Unique tool type count | `5` |
+| 6 | Total tool calls | `28` |
+| 7 | Top-6 tool histogram | `Write(4) Edit(3) Bash(3)` |
+
+`/sync` reads this log at session start and surfaces any entries from the
+last 72 hours under a **"Skill Suggestions"** section — the one place the
+user will reliably see them.
+
+### Viewing and clearing
+
+- `auto-skill pending` — `cat ~/.claude/auto-skill/pending.log` (or show
+  "no pending suggestions" if absent/empty)
+- `auto-skill clear` — truncate after confirming with the user
+
 ## Per-Project Disable
 
 ```bash
@@ -234,7 +269,7 @@ The Stop hook only suggests skill creation when ALL of these pass:
 |------|-----------|-----------|
 | **Mutating ops** | 8+ | High bar reduces noise from routine edits |
 | **Tool diversity** | 4+ distinct types | Write+Edit+Bash+Agent = workflow; Write*20 = repetitive |
-| **No skill loaded** | `Skill` tool absent | If following a skill, work isn't novel |
+| **No non-harness skill loaded** | Skill tool absent OR only harness skills | If following a domain skill, work isn't novel. Harness skills (sync, save, introspect, auto-skill, setperms, tool-discovery) are whitelisted — they're bootstrap/meta, not recipes. |
 | **Per-session** | Once per session | Never nags on resume/continue |
 | **Not disabled** | No `.disable` file | Global or per-project toggle |
 

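To illustrate the seven-field Pending Log format documented in `SKILL.md`, here is a rough sketch that splits one log line and prints it in the row layout shown under "Skill Suggestions". `format_pending_row` is a hypothetical name, and the sketch prints the timestamp as logged rather than converting it to the reader's time zone.

```shell
# Hypothetical helper (not in the commit): turn one pending.log line into
# a display row of the form "date time  |  Nw/Mt  |  cwd  |  histogram".
format_pending_row() {
  local ts sid cwd writes unique total hist
  # Split on '|' only; the last variable keeps the histogram with its spaces.
  IFS='|' read -r ts sid cwd writes unique total hist <<< "$1"
  # Characters 0-9 are the date, 11-15 are HH:MM (position 10 is the 'T').
  printf '%s %s  |  %sw/%st  |  %s  |  %s\n' \
    "${ts:0:10}" "${ts:11:5}" "$writes" "$unique" "$cwd" "$hist"
}

format_pending_row '2026-04-24T19:28:03+10:00|9dc8576c|/x/forge/axiom|12|5|28|Write(4) Edit(3) Bash(3)'
# prints: 2026-04-24 19:28  |  12w/5t  |  /x/forge/axiom  |  Write(4) Edit(3) Bash(3)
```

The session ID and total call count are parsed but not displayed, matching the sample output in `commands/sync.md`.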
+ 43 - 7
skills/auto-skill/scripts/evaluate.sh

@@ -4,9 +4,18 @@
 # Suggests skill creation only when a session shows genuine workflow complexity:
 #   - 8+ mutating tool calls (high threshold, reduces noise)
 #   - 4+ distinct mutating tool types (diversity = workflow, not repetitive edits)
-#   - No skill was loaded (novel work, not following existing procedure)
+#   - No non-harness skill was loaded (novel work, not following a recipe).
+#     Harness skills (sync, save, introspect, auto-skill, setperms, tool-discovery)
+#     are whitelisted — they're meta/bootstrap, not domain-specific, so loading
+#     them shouldn't disqualify an otherwise novel session.
 #   - Per-session cooldown file prevents re-fire on resume
 #
+# Output channels (when a suggestion fires):
+#   1. systemMessage JSON on stdout - visible to Claude on next turn
+#   2. Appended line to ~/.claude/auto-skill/pending.log - visible to user
+#      at next /sync (since Claude's systemMessage often dies silently if
+#      the user's next prompt doesn't invite it to be mentioned).
+#
 # Toggle: touch ~/.claude/auto-skill.disable   (global off)
 #         touch .claude/auto-skill.disable      (project off)
 #         rm either file to re-enable
@@ -40,6 +49,8 @@
 
   # --- Classify tools ---
   READ_ONLY_LIST=" Read Glob Grep LS NotebookRead TaskList TaskGet TaskCreate TaskUpdate TaskOutput TaskStop "
+  # Harness skills: loading these should NOT disqualify a session
+  HARNESS_SKILLS=" sync save introspect auto-skill setperms tool-discovery "
   SKILL_LOADED=false
   TOTAL=0
   WRITES=0
@@ -49,17 +60,29 @@
     [ -z "$tool" ] && continue
     TOTAL=$((TOTAL + 1))
 
-    # Check if a skill was loaded
-    [ "$tool" = "Skill" ] && SKILL_LOADED=true
+    # Handle Skill tool (tagged as "Skill:<name>" by track-tools.sh, or bare
+    # "Skill" from pre-whitelist versions)
+    case "$tool" in
+      Skill:*)
+        skill_name="${tool#Skill:}"
+        # Is it a harness skill? If so, ignore entirely.
+        case "$HARNESS_SKILLS" in
+          *" ${skill_name} "*) continue ;;
+          *) SKILL_LOADED=true; continue ;;
+        esac
+        ;;
+      Skill)
+        # Legacy format (pre-whitelist): conservatively disqualify
+        SKILL_LOADED=true
+        continue
+        ;;
+    esac
 
     # Check if read-only (space-padded list for exact word match)
     case "$READ_ONLY_LIST" in
       *" ${tool} "*) continue ;;
     esac
 
-    # Skip Skill tool itself
-    [ "$tool" = "Skill" ] && continue
-
     WRITES=$((WRITES + 1))
 
     # Track unique mutating tool types
@@ -81,7 +104,7 @@
   # Clean up tracking file
   rm -f "$TRACK_FILE"
 
-  # --- Gate 1: Skill was loaded = not novel work ---
+  # --- Gate 1: Non-harness skill was loaded = following a recipe, not novel ---
   [ "$SKILL_LOADED" = true ] && exit 0
 
   # --- Gate 2: Minimum 8 mutating operations ---
@@ -95,6 +118,19 @@
   # Mark this session as suggested (prevents repeat on resume)
   touch "$SUGGESTED_FILE" 2>/dev/null
 
+  # Append to persistent log so the human can see suggestions at next /sync.
+  # systemMessage goes to Claude; this log goes to the user.
+  # Format: ISO8601 | session_id | cwd | writes | unique | total | summary
+  LOG_DIR="$HOME/.claude/auto-skill"
+  LOG_FILE="$LOG_DIR/pending.log"
+  mkdir -p "$LOG_DIR" 2>/dev/null
+  TS=$(date -Iseconds 2>/dev/null || date '+%Y-%m-%dT%H:%M:%S%z' | sed 's/\(..\)$/:\1/')
+  CWD=$(pwd 2>/dev/null || echo "unknown")
+  CLEAN_SUMMARY=$(printf '%s' "$TOOL_SUMMARY" | tr '|' '/' | tr -s ' ')
+  printf '%s|%s|%s|%d|%d|%d|%s\n' \
+    "$TS" "$SHORT_ID" "$CWD" "$WRITES" "$UNIQUE_COUNT" "$TOTAL" "$CLEAN_SUMMARY" \
+    >> "$LOG_FILE" 2>/dev/null
+
   MSG="Skill-worthy session: ${WRITES} mutating ops across ${UNIQUE_COUNT} tool types (${TOTAL} total): ${TOOL_SUMMARY}- run /auto-skill to capture this workflow."
 
   ESCAPED=$(printf '%s' "$MSG" | sed 's/"/\\"/g' | tr '\n' ' ')

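The Gate 1 whitelist logic in `evaluate.sh` can be sketched in isolation as a small classifier. `classify_tool` and its four labels are hypothetical names introduced for illustration. The space-padded list and the `case` patterns mirror the diff above.

```shell
# Same space-padded list as evaluate.sh, so " name " matches whole words only.
HARNESS_SKILLS=" sync save introspect auto-skill setperms tool-discovery "

# Hypothetical helper (not in the commit): label a tracked tool entry.
#   harness -> whitelisted skill, ignored by Gate 1
#   domain  -> non-harness skill, disqualifies the session
#   legacy  -> bare "Skill" from pre-whitelist tracking, also disqualifies
#   other   -> any non-Skill tool
classify_tool() {
  local name
  case "$1" in
    Skill:*)
      name="${1#Skill:}"
      case "$HARNESS_SKILLS" in
        *" ${name} "*) echo harness ;;
        *)             echo domain ;;
      esac
      ;;
    Skill) echo legacy ;;
    *)     echo other ;;
  esac
}
```

Because the name is wrapped in spaces before matching, a partial name such as `syn` does not match `sync`, and an empty name falls through to `domain`, which is the conservative outcome.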
+ 12 - 0
skills/auto-skill/scripts/track-tools.sh

@@ -3,6 +3,10 @@
 # Appends tool name to a session-specific temp file.
 # Designed to be fast (<5ms) - no SQLite, no network, just a file append.
 #
+# Special case: when the Skill tool is invoked, we record `Skill:<name>`
+# instead of bare `Skill`. evaluate.sh uses the name to whitelist harness
+# skills (sync, save, etc) from Gate 1 disqualification.
+#
 # CRITICAL: This hook must NEVER fail visibly. All errors suppressed.
 
 {
@@ -16,6 +20,14 @@
   SHORT_ID="${SESSION_ID:0:8}"
   TRACK_FILE="/tmp/claude_autoskill_${SHORT_ID}"
 
+  # Tag Skill tool calls with the skill name so Gate 1 can whitelist
+  if [ "$TOOL_NAME" = "Skill" ]; then
+    SKILL_NAME=$(printf '%s' "$INPUT" | jq -r '.tool_input.skill // "unknown"' 2>/dev/null)
+    # Sanitise: map ':' and ' ' to '_' so the tag stays a single token
+    SKILL_NAME=$(printf '%s' "$SKILL_NAME" | tr ': ' '__')
+    TOOL_NAME="Skill:${SKILL_NAME}"
+  fi
+
   # Append tool name (one per line). Cap at 500 lines to prevent runaway.
   if [ ! -f "$TRACK_FILE" ] || [ "$(wc -l < "$TRACK_FILE" 2>/dev/null)" -lt 500 ]; then
     echo "$TOOL_NAME" >> "$TRACK_FILE"
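
The tagging step in `track-tools.sh` can be sketched without the `jq` dependency by passing the skill name in directly. `tag_skill` is a hypothetical name, and the sketch assumes the name has already been extracted from the hook input. It shows only the sanitising and prefixing.

```shell
# Hypothetical helper (not in the commit): build the "Skill:<name>" tag
# from an already-extracted skill name.
tag_skill() {
  local name="${1:-unknown}"       # empty or missing name falls back to "unknown"
  # Map both ':' and ' ' to '_'. The second set is spelled out in full
  # because a shorter second set is handled differently across tr versions.
  name=$(printf '%s' "$name" | tr ': ' '__')
  printf 'Skill:%s\n' "$name"
}
```

With this mapping, a namespaced name such as `plugin:my skill` becomes `Skill:plugin_my_skill`, so the only `:` left in the tracked line is the one separating `Skill` from the name.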