fix(windows-ops): Diagnostic scripts always exit 0 on successful run

Previously, health-audit, disk-health, and drive-dependencies
returned EXIT_VALIDATION (4) when they found critical issues
(failing drive, system-critical dependencies, etc.). The intent
was to let cron/CI branch on $LASTEXITCODE.

Problem: UIs that treat any non-zero exit as 'task failed' show
a 'Background task failed' badge even though the script ran
perfectly and produced its full diagnostic output. The findings
ARE the success — surfacing them as a failure is misleading.

Fix: exit 0 whenever the script ran to completion regardless of
findings. The findings live where they belong — in the panel
output (FAILING section, ▲ alerts, ⬤ busted footer indicator)
and in the JSON output's verdict + indicator fields.

Automation consuming -Json output should branch on:
  $audit = .\health-audit.ps1 -Json | ConvertFrom-Json
  $failing = $audit | Where-Object level -eq 'fail'
  if ($failing) { Send-Alert ... }

…NOT on $LASTEXITCODE. The .NOTES sections in each script's
comment-based help are updated to document this explicitly.

Exit codes are now reserved for what they actually signal:
  0 — script ran (any verdict surfaced via output)
  1 — generic runtime failure
  2 — usage / bad arguments
  3 — not found (e.g. -DriveLetter X with no such drive)
  5 — missing precondition (missing dependency, perms)
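
A minimal sketch of how a scheduler wrapper might separate "did the script run" from "what did it find" under this scheme. Get-RunOutcome is a hypothetical helper, not part of this commit, and the JSON shape (a top-level 'verdict' string, as in disk-health) is an assumption for the other scripts:

```powershell
# Hypothetical helper: classify one diagnostic run from its exit code and
# the JSON text it emitted. Not part of the windows-ops scripts.
function Get-RunOutcome {
    param(
        [int]$ExitCode,
        [string]$JsonText
    )

    # Non-zero now means only that the script could not run to completion.
    if ($ExitCode -ne 0) {
        $reason = switch ($ExitCode) {
            1 { 'runtime failure' }
            2 { 'usage error' }
            3 { 'not found' }
            5 { 'missing precondition' }
            default { 'unknown' }
        }
        return [pscustomobject]@{ Ran = $false; Alert = $false; Detail = $reason }
    }

    # Exit 0: the verdict lives in the output.
    # Assumed shape: { "verdict": "HEALTHY" | "WATCHLIST" | "FAILING" }
    $result = $JsonText | ConvertFrom-Json
    [pscustomobject]@{
        Ran    = $true
        Alert  = ($result.verdict -eq 'FAILING')
        Detail = $result.verdict
    }
}

# A failing drive is a successful run that should raise an alert.
Get-RunOutcome -ExitCode 0 -JsonText '{"verdict":"FAILING"}'
# A missing drive is a run that never produced a verdict.
Get-RunOutcome -ExitCode 3 -JsonText ''
```

In real use the two arguments would come from $LASTEXITCODE and the captured -Json output of the script being wrapped.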

recover-clone.ps1 is unchanged: its exit codes map robocopy's
bitmask, which IS the operation's success/failure, not a
diagnostic verdict.
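
For context, a sketch of how robocopy's exit bitmask is conventionally decoded. Get-RobocopyStatus is a hypothetical helper and is not claimed to match recover-clone.ps1's actual mapping; the bit meanings follow robocopy's documented return codes, where any value of 8 or above indicates at least one failure:

```powershell
# Hypothetical helper: decode a robocopy exit code into named flags.
function Get-RobocopyStatus {
    param([int]$Code)

    $flags = @()
    if ($Code -band 1)  { $flags += 'FilesCopied' }    # one or more files copied
    if ($Code -band 2)  { $flags += 'ExtraFiles' }     # extra files or dirs at destination
    if ($Code -band 4)  { $flags += 'Mismatched' }     # mismatched files or dirs
    if ($Code -band 8)  { $flags += 'CopyFailures' }   # some files could not be copied
    if ($Code -band 16) { $flags += 'FatalError' }     # serious error, nothing copied

    [pscustomobject]@{
        Code    = $Code
        Flags   = $flags
        Success = ($Code -lt 8)   # bits 8 and 16 are the failure bits
    }
}

Get-RobocopyStatus -Code 3    # files copied plus extras: still a success
Get-RobocopyStatus -Code 16   # fatal error: a failure
```

Here a non-zero exit code describes the outcome of the copy itself, which is why it stays meaningful to callers.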

Verified all three scripts exit 0 against this machine's
failing-drive state while still rendering the FAILING / CRITICAL
panels with full data.

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
0xDarkMatter 2 weeks ago
parent
commit
55114eb8b2

+ 6 - 4
skills/windows-ops/scripts/disk-health.ps1

@@ -45,10 +45,12 @@
     Find the HGST drive and dump its error counts as JSON.
 
 .NOTES
-    Exit codes:
-      0 success — drive looks healthy
+    Exit codes (reflect whether the diagnostic RAN, not what it found):
+      0 success — diagnostic completed (verdict reported via panel + JSON)
       3 not found — no matching disk
-      4 validation — drive shows failure indicators
+
+    The drive's health verdict (HEALTHY / WATCHLIST / FAILING) is in
+    the panel output and JSON; check the verdict field, not $LASTEXITCODE.
 #>
 
 [CmdletBinding(DefaultParameterSetName='Number')]
@@ -318,5 +320,5 @@ if ($Json) {
     Write-TermLine (New-TermPanelClose -Hotkeys $hk -Healths $health)
 }
 
-if ($result.verdict -eq 'FAILING') { exit $script:EXIT_VALIDATION }
+# Verdict is in the panel and JSON output; exit 0 means the diagnostic ran.
 exit $script:EXIT_OK

+ 6 - 3
skills/windows-ops/scripts/drive-dependencies.ps1

@@ -12,11 +12,14 @@
 
     Default output is a human-readable table. -Json emits structured.
 
-    Exit codes:
-      0 success
+    Exit codes (reflect whether the audit RAN, not what it found):
+      0 success — audit completed (verdict reported via panel + JSON)
       2 usage
       3 not found (no such drive)
 
+    The verdict (SAFE TO DISCONNECT / WARNINGS / DO NOT DISCONNECT) is
+    in the panel output and JSON 'verdict' field, not $LASTEXITCODE.
+
 .PARAMETER DriveLetter
     Single drive letter (e.g. 'Y'). Case-insensitive.
 
@@ -338,5 +341,5 @@ if ($Json) {
     Write-TermLine (New-TermPanelClose -Hotkeys $hk -Healths $health)
 }
 
-if ($criticalCount -gt 0) { exit $script:EXIT_VALIDATION }
+# Verdict is in the panel and JSON output; exit 0 means the audit ran.
 exit $script:EXIT_OK

+ 10 - 6
skills/windows-ops/scripts/health-audit.ps1

@@ -37,12 +37,15 @@
     Save audit findings as NDJSON for later processing.
 
 .NOTES
-    Exit codes:
-      0 success — audit completed, no critical findings
-      1 general error during audit
+    Exit codes (reflect whether the audit RAN, not what it found):
+      0 success — audit completed (findings reported via panel + JSON)
+      1 general error during audit (e.g. WinRM unreachable)
       2 usage error (bad arguments)
-      4 critical finding (failing drive, recent unexplained crashes)
       5 missing precondition (PowerShell version, required module)
+
+    Findings are in the output, not the exit code. Automation
+    consuming -Json output should branch on verdict + finding levels,
+    not $LASTEXITCODE.
 #>
 
 [CmdletBinding()]
@@ -486,6 +489,7 @@ if (-not $Json) {
     Write-TermLine (New-TermPanelClose -Hotkeys $hk -Healths $hl)
 }
 
-# Exit code semantics
-if ($failCount -gt 0) { exit $script:EXIT_VALIDATION }
+# Exit code: success means the audit RAN OK, regardless of findings.
+# Findings live in the panel output (stdout/stderr) and JSON. Automation
+# parsing the JSON should branch on verdict counts, not exit codes.
 exit $script:EXIT_OK