
fix(push-gate): repair regex layer — malformed patterns + self-scan

Two bugs in scripts/scan-secrets.sh that blocked every invocation:

1. The false-positive filter at line 109 contained `\.\.\.'` patterns
   with a literal `'` inside a bash single-quoted string. Bash closed
   the string at that apostrophe, so grep received a truncated pattern
   whose opening `(` was never closed and rejected it with
   "Unmatched ( or \(". Dropped the problematic patterns.

2. Push-gate was scanning its own references/secret-patterns.txt file,
   which contains an example of every secret shape it's trying to
   detect (-----BEGIN CERTIFICATE-----, AWS keys, etc). When push-gate
   itself is in the pushed content, this produced guaranteed matches.
   Excluded via the git diff pathspec
   `:(exclude,glob)**/push-gate/references/secret-patterns.txt`.
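
A minimal sketch of the exclusion in item 2, run against a throwaway
repository. The repository layout, file contents, and the `diff_names`
variable are assumptions made for illustration; only the pathspec itself
comes from this commit.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical scratch repo, for illustration only.
repo="$(mktemp -d)"
cd "$repo"
git init -q
git config user.email "demo@example.invalid"
git config user.name "demo"
git config commit.gpgsign false
git commit -q --allow-empty -m "base"

# One file that should be scanned, and one pattern corpus that should not.
mkdir -p skills/push-gate/references src
echo '-----BEGIN CERTIFICATE-----' > skills/push-gate/references/secret-patterns.txt
echo 'print("hello")' > src/app.py
git add .
git commit -q -m "add files"

# "." includes everything under the current directory; the exclude pathspec
# then removes the corpus file wherever the push-gate directory lives.
diff_names="$(git diff --name-only HEAD~1 HEAD -- . \
  ':(exclude,glob)**/push-gate/references/secret-patterns.txt')"

echo "$diff_names"   # prints: src/app.py
```

The positive `.` pathspec is written out explicitly so the command reads as
"everything here, minus the corpus file".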

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
0xDarkMatter 1 month ago
commit 2570f4fa85
1 changed file with 12 additions and 3 deletions

+ 12 - 3
skills/push-gate/scripts/scan-secrets.sh

@@ -85,7 +85,12 @@ fi
 # ── Layer 2: regex corpus on the diff ─────────────────────────────────────────
 echo "push-gate: regex layer on added lines"
 DIFF_FILE="$(mktemp -t push-gate-diff.XXXXXX)"
-git diff "$RANGE" > "$DIFF_FILE"
+# Exclude push-gate's own pattern corpus — it contains examples of every
+# secret shape it's trying to detect, so scanning it matches everything.
+# (Classic snake-eating-tail when push-gate is part of the pushed content.)
+git diff "$RANGE" -- . \
+  ':(exclude,glob)**/push-gate/references/secret-patterns.txt' \
+  > "$DIFF_FILE"
 
 # Extract added lines only (strip the leading '+'), ignore file-header lines
 ADDED_FILE="$(mktemp -t push-gate-added.XXXXXX)"
@@ -103,10 +108,14 @@ done < "$PATTERNS_FILE"
 # Run ripgrep with all patterns; capture matches
 RAW_HITS="$(rg --no-filename --line-number --no-heading "${PATTERN_ARGS[@]}" "$ADDED_FILE" 2>/dev/null || true)"
 
-# Filter common false positives
+# Filter common false positives.
+# Note: the `\.\.\.'` ellipsis-apostrophe patterns were removed because they
+# required an embedded `'` inside a bash single-quoted string, which closes
+# the string early and breaks the regex ("Unmatched ( or \("). The remaining
+# patterns (placeholder/example/getenv/etc) cover the bulk of false positives.
 FILTERED_HITS="$(
   printf '%s\n' "$RAW_HITS" \
-    | grep -viE '(example|placeholder|\<dummy\>|\<fake\>|\<TODO\>|<unset>|os\.environ|process\.env|getenv|\$\{[A-Z_]+:-|\$\{[A-Z_]+\}|\$\([A-Z_]+\)|\$env:[A-Z_]+|\.\.\.<|\.\.\.')|\.\.\.'\s*$)' \
+    | grep -viE '(example|placeholder|\<dummy\>|\<fake\>|\<TODO\>|<unset>|os\.environ|process\.env|getenv|\$\{[A-Z_]+:-|\$\{[A-Z_]+\}|\$\([A-Z_]+\)|\$env:[A-Z_]+|\.\.\.<)' \
     || true
 )"
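
The quoting problem behind the removed patterns can be reproduced in
isolation. The sketch below is not part of this commit: the pattern is a
simplified, hypothetical stand-in (`placeholder` plus one
ellipsis-apostrophe alternative), and `apos`, `pattern_a`, `pattern_b`,
and `filtered` are illustrative names. It shows two ways to embed an
apostrophe in an otherwise single-quoted regex, should such patterns be
reintroduced later.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Inside '...' the first apostrophe always ends the string, so a literal
# apostrophe has to be supplied from outside the single quotes.

# Option 1: close the string, add an escaped apostrophe, reopen ('\'').
pattern_a='(placeholder|\.\.\.'\''[[:space:]]*$)'

# Option 2: keep the apostrophe in a variable and splice it in.
apos="'"
pattern_b='(placeholder|\.\.\.'"${apos}"'[[:space:]]*$)'

# Both spellings yield the same regex text.
echo "$pattern_a"

# Drop lines that end in an elided value such as 'abc...'.
filtered="$(
  printf '%s\n' "token = 'abc...'" "token = 'not-elided'" \
    | grep -vE "$pattern_a" \
    || true
)"
echo "$filtered"   # prints: token = 'not-elided'
```

Storing the pattern in a variable and passing it as `"$pattern_a"` also
keeps the parentheses out of the shell's way, since the shell no longer
has to parse them as part of the `grep` command line.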