
feat(skills): Add postgres-ops skill, rename -patterns to -ops

- Add comprehensive postgres-ops skill (4,468 lines) with 6 reference
  files covering schema design, indexing, query tuning, operations,
  replication, and config tuning
- Rename all 14 -patterns skills to -ops (cli, mcp, python-async,
  python-cli, python-database, python-fastapi, python-observability,
  python-pytest, python-typing, rest, sql, security, tailwind, testing)
- Update all cross-references across agents, docs, tests, and catalogs
- Migrate PostgreSQL-specific content from sql-ops/indexing-strategies
  to postgres-ops/indexing, trim sql-ops to vendor-neutral
- Update naming-conventions.md: -ops is the standard for new skills

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
0xDarkMatter · 1 month ago
commit d3fc69a973
100 changed files with 4585 additions and 89 deletions
  1. AGENTS.md (+2 -2)
  2. README.md (+9 -8)
  3. agents/claude-architect.md (+1 -1)
  4. agents/python-expert.md (+10 -10)
  5. agents/sql-expert.md (+1 -1)
  6. commands/sync.md (+25 -1)
  7. docs/ARCHITECTURE.md (+2 -2)
  8. docs/PLAN.md (+1 -1)
  9. rules/naming-conventions.md (+9 -6)
  10. skills/claude-code-hooks/SKILL.md (+1 -1)
  11. skills/cli-patterns/SKILL.md (+2 -2)
  12. skills/cli-ops/assets/.gitkeep (+0 -0)
  13. skills/cli-ops/references/implementation.md (+0 -0)
  14. skills/cli-ops/references/json-schemas.md (+0 -0)
  15. skills/cli-ops/scripts/.gitkeep (+0 -0)
  16. skills/mcp-patterns/SKILL.md (+4 -4)
  17. skills/mcp-ops/assets/.gitkeep (+0 -0)
  18. skills/mcp-ops/references/auth-patterns.md (+0 -0)
  19. skills/mcp-ops/references/resource-patterns.md (+0 -0)
  20. skills/mcp-ops/references/state-patterns.md (+0 -0)
  21. skills/mcp-ops/references/testing-patterns.md (+0 -0)
  22. skills/mcp-ops/references/tool-patterns.md (+0 -0)
  23. skills/mcp-ops/scripts/.gitkeep (+0 -0)
  24. skills/postgres-ops/SKILL.md (+271 -0)
  25. skills/postgres-ops/assets/.gitkeep (+0 -0)
  26. skills/postgres-ops/references/config-tuning.md (+746 -0)
  27. skills/postgres-ops/references/indexing.md (+746 -0)
  28. skills/postgres-ops/references/operations.md (+714 -0)
  29. skills/postgres-ops/references/query-tuning.md (+632 -0)
  30. skills/postgres-ops/references/replication.md (+628 -0)
  31. skills/postgres-ops/references/schema-design.md (+731 -0)
  32. skills/postgres-ops/scripts/.gitkeep (+0 -0)
  33. skills/python-async-patterns/SKILL.md (+7 -7)
  34. skills/python-async-ops/assets/async-project-template.py (+0 -0)
  35. skills/python-async-ops/references/aiohttp-patterns.md (+0 -0)
  36. skills/python-async-ops/references/concurrency-patterns.md (+0 -0)
  37. skills/python-async-ops/references/debugging-async.md (+0 -0)
  38. skills/python-async-ops/references/error-handling.md (+0 -0)
  39. skills/python-async-ops/references/mixing-sync-async.md (+0 -0)
  40. skills/python-async-ops/references/performance.md (+0 -0)
  41. skills/python-async-ops/references/production-patterns.md (+0 -0)
  42. skills/python-async-ops/scripts/find-blocking-calls.sh (+0 -0)
  43. skills/python-cli-patterns/SKILL.md (+4 -4)
  44. skills/python-cli-ops/assets/cli-template.py (+0 -0)
  45. skills/python-cli-ops/references/configuration.md (+0 -0)
  46. skills/python-cli-ops/references/rich-output.md (+0 -0)
  47. skills/python-cli-ops/references/typer-patterns.md (+0 -0)
  48. skills/python-cli-ops/scripts/.gitkeep (+0 -0)
  49. skills/python-database-patterns/SKILL.md (+7 -7)
  50. skills/python-database-ops/assets/alembic.ini.template (+0 -0)
  51. skills/python-database-ops/references/connection-pooling.md (+0 -0)
  52. skills/python-database-ops/references/migrations.md (+0 -0)
  53. skills/python-database-ops/references/sqlalchemy-async.md (+0 -0)
  54. skills/python-database-ops/references/transactions.md (+0 -0)
  55. skills/python-database-ops/scripts/.gitkeep (+0 -0)
  56. skills/python-env/SKILL.md (+3 -3)
  57. skills/python-fastapi-patterns/SKILL.md (+8 -8)
  58. skills/python-fastapi-ops/assets/fastapi-template.py (+0 -0)
  59. skills/python-fastapi-ops/references/background-tasks.md (+0 -0)
  60. skills/python-fastapi-ops/references/dependency-injection.md (+0 -0)
  61. skills/python-fastapi-ops/references/middleware-patterns.md (+0 -0)
  62. skills/python-fastapi-ops/references/validation-serialization.md (+0 -0)
  63. skills/python-fastapi-ops/scripts/scaffold-api.sh (+0 -0)
  64. skills/python-observability-patterns/SKILL.md (+7 -7)
  65. skills/python-observability-ops/assets/logging-config.py (+0 -0)
  66. skills/python-observability-ops/references/metrics.md (+0 -0)
  67. skills/python-observability-ops/references/structured-logging.md (+0 -0)
  68. skills/python-observability-ops/references/tracing.md (+0 -0)
  69. skills/python-observability-ops/scripts/.gitkeep (+0 -0)
  70. skills/python-pytest-patterns/SKILL.md (+6 -6)
  71. skills/python-pytest-ops/assets/conftest.py.template (+0 -0)
  72. skills/python-pytest-ops/assets/pytest.ini.template (+0 -0)
  73. skills/python-pytest-ops/references/async-testing.md (+0 -0)
  74. skills/python-pytest-ops/references/coverage-strategies.md (+0 -0)
  75. skills/python-pytest-ops/references/fixtures-advanced.md (+0 -0)
  76. skills/python-pytest-ops/references/integration-testing.md (+0 -0)
  77. skills/python-pytest-ops/references/mocking-patterns.md (+0 -0)
  78. skills/python-pytest-ops/references/property-testing.md (+0 -0)
  79. skills/python-pytest-ops/references/test-architecture.md (+0 -0)
  80. skills/python-pytest-ops/scripts/generate-conftest.sh (+0 -0)
  81. skills/python-pytest-ops/scripts/run-tests.sh (+0 -0)
  82. skills/python-typing-patterns/SKILL.md (+6 -6)
  83. skills/python-typing-ops/assets/pyproject-typing.toml (+0 -0)
  84. skills/python-typing-ops/references/generics-advanced.md (+0 -0)
  85. skills/python-typing-ops/references/mypy-config.md (+0 -0)
  86. skills/python-typing-ops/references/overloads.md (+0 -0)
  87. skills/python-typing-ops/references/protocols-patterns.md (+0 -0)
  88. skills/python-typing-ops/references/runtime-validation.md (+0 -0)
  89. skills/python-typing-ops/references/type-narrowing.md (+0 -0)
  90. skills/python-typing-ops/scripts/check-types.sh (+0 -0)
  91. skills/rest-patterns/SKILL.md (+1 -1)
  92. skills/rest-ops/assets/.gitkeep (+0 -0)
  93. skills/rest-ops/references/caching-patterns.md (+0 -0)
  94. skills/rest-ops/references/rate-limiting.md (+0 -0)
  95. skills/rest-ops/references/response-formats.md (+0 -0)
  96. skills/rest-ops/references/status-codes.md (+0 -0)
  97. skills/rest-ops/scripts/.gitkeep (+0 -0)
  98. skills/security-patterns/SKILL.md (+1 -1)
  99. skills/security-ops/assets/.gitkeep (+0 -0)
  100. skills/security-patterns/references/auth-patterns.md (+0 -0)

+ 2 - 2
AGENTS.md

@@ -5,7 +5,7 @@
 This is **claude-mods** - a collection of custom extensions for Claude Code:
 - **22 expert agents** for specialized domains (React, Python, Go, Rust, AWS, etc.)
 - **3 commands** for session management (/sync, /save) and experimental features (/canvas)
-- **43 skills** for CLI tools, patterns, workflows, and development tasks
+- **44 skills** for CLI tools, patterns, workflows, and development tasks
 - **Custom output styles** for response personality (e.g., Vesper)
 
 ## Installation
@@ -50,7 +50,7 @@ On "INIT:" message at session start:
 |----------|-------------|
 | `rules/cli-tools.md` | Modern CLI tool preferences (rg, fd, eza, bat) |
 | `rules/thinking.md` | Extended thinking triggers (think → ultrathink) |
-| `skills/cli-patterns/` | Production CLI patterns - agentic workflows, OS keyring auth, stream separation |
+| `skills/cli-ops/` | Production CLI patterns - agentic workflows, OS keyring auth, stream separation |
 | `docs/WORKFLOWS.md` | 10 workflow patterns from Anthropic best practices |
 | `skills/tool-discovery/` | Find the right library for any task |
 | `hooks/README.md` | Pre/post execution hook examples |

File diff suppressed because it is too large
+ 9 - 8
README.md


+ 1 - 1
agents/claude-architect.md

@@ -73,7 +73,7 @@ Route to these skills for detailed patterns:
 | CLI automation | `claude-code-headless` | Flags, output formats, CI/CD |
 | Extension templates | `claude-code-templates` | Agent, skill, command scaffolds |
 | Troubleshooting | `claude-code-debug` | Common issues, debug commands |
-| MCP servers | `mcp-patterns` | Tool handlers, resources, Claude Desktop |
+| MCP servers | `mcp-ops` | Tool handlers, resources, Claude Desktop |
 | Find right tool | `tool-discovery` | Agent vs skill selection flowchart |
 
 Each skill includes:

+ 10 - 10
agents/python-expert.md

@@ -26,7 +26,7 @@ You are a Python expert specializing in decision guidance, performance optimizat
 2. Is it I/O-bound with high concurrency? → Async
 3. Is it simple I/O with few connections? → Sync is fine
 
-→ **Load `python-async-patterns`** for asyncio, TaskGroup, concurrency patterns
+→ **Load `python-async-ops`** for asyncio, TaskGroup, concurrency patterns
 
 ---
 
@@ -91,7 +91,7 @@ class Shape(ABC):
         return f"Area: {self.area()}"
 ```
 
-→ **Load `python-typing-patterns`** for generics, TypeVar, overloads
+→ **Load `python-typing-ops`** for generics, TypeVar, overloads
 
 ---
 
@@ -126,13 +126,13 @@ Route to these skills for detailed patterns:
 
 | Task | Skill | Key Topics |
 |------|-------|------------|
-| FastAPI development | `python-fastapi-patterns` | Dependency injection, middleware, Pydantic v2 |
-| Database/ORM | `python-database-patterns` | SQLAlchemy 2.0, async DB, Alembic |
-| Async patterns | `python-async-patterns` | asyncio, TaskGroup, semaphores, queues |
-| Testing | `python-pytest-patterns` | Fixtures, mocking, parametrize, coverage |
-| Type hints | `python-typing-patterns` | TypeVar, Protocol, generics, overloads |
-| CLI tools | `python-cli-patterns` | Typer, Rich, configuration, subcommands |
-| Logging/metrics | `python-observability-patterns` | structlog, Prometheus, OpenTelemetry |
+| FastAPI development | `python-fastapi-ops` | Dependency injection, middleware, Pydantic v2 |
+| Database/ORM | `python-database-ops` | SQLAlchemy 2.0, async DB, Alembic |
+| Async patterns | `python-async-ops` | asyncio, TaskGroup, semaphores, queues |
+| Testing | `python-pytest-ops` | Fixtures, mocking, parametrize, coverage |
+| Type hints | `python-typing-ops` | TypeVar, Protocol, generics, overloads |
+| CLI tools | `python-cli-ops` | Typer, Rich, configuration, subcommands |
+| Logging/metrics | `python-observability-ops` | structlog, Prometheus, OpenTelemetry |
 | Environment setup | `python-env` | uv, pyproject.toml, publishing |
 
 Each skill includes:
@@ -310,7 +310,7 @@ def setup_logging(level: int = logging.INFO, log_dir: Path = Path("logs")):
     return logger
 ```
 
-→ **Load `python-observability-patterns`** for structlog, metrics, tracing
+→ **Load `python-observability-ops`** for structlog, metrics, tracing
 
 ---
 

+ 1 - 1
agents/sql-expert.md

@@ -72,4 +72,4 @@ All deliverables must meet:
 - Connection pooling tuning
 
 ## Related Skill
-For pattern reference (CTEs, window functions, JOINs), use **sql-patterns** skill.
+For pattern reference (CTEs, window functions, JOINs), use **sql-ops** skill.

+ 25 - 1
commands/sync.md

@@ -33,6 +33,7 @@ $ARGUMENTS
     |      +- Restore tasks via TaskCreate
     |      +- Resolve plan path (Step 0)
     |      +- Read plan (<plan-path>)
+    |      +- Acknowledge memory context (already auto-loaded)
     |      +- Show unified status
     |      +- Suggest next action
     |
@@ -147,7 +148,17 @@ git log -1 --format="%h %s" 2>/dev/null
 - `wc -l` in Git Bash = count lines (CORRECT)
 - Git Bash understands `2>/dev/null` but NOT `2>nul`
 
-### Step 5: Output
+### Step 5: Acknowledge Memory
+
+MEMORY.md is auto-loaded into the system prompt by Claude Code - do NOT re-read the file.
+Instead, check your system prompt for the memory content you already have, and surface it:
+
+- If MEMORY.md has content (non-empty), summarise what it contains (especially any `## Last Session` section written by `/save`)
+- If MEMORY.md is empty, note "Memory: Empty (no notes from previous sessions)"
+
+This costs zero extra tokens while confirming the safety net is working.
+
+### Step 6: Output
 
 Format and display unified status.
 
@@ -204,6 +215,14 @@ Progress: 40% (2/5)
 
 Note: PR row only shown when pr_number/pr_url are present in saved state.
 
+## Memory
+
+[If MEMORY.md has content, summarise key points - especially any `## Last Session` section]
+[If MEMORY.md is empty: "No memory notes from previous sessions."]
+
+Note: MEMORY.md is auto-loaded into the system prompt. This section surfaces
+what's already in context - no file read needed.
+
 ## Quick Reference
 
 | Category | Items |
@@ -242,6 +261,7 @@ Project Synced: [project-name]
 | **Agents** | [count] available |
 | **Plan** | No active plan |
 | **Saved State** | None |
+| **Memory** | [summary of MEMORY.md content, or "Empty"] |
 | **Git** | [branch], [N] uncommitted |
 
 ## Next Steps
@@ -397,6 +417,10 @@ Status
 | In Progress | 1 |
 | Pending | 1 |
 
+## Memory
+
+[Summary of MEMORY.md content, or "Empty"]
+
 ## Git
 
 | Field | Value |

+ 2 - 2
docs/ARCHITECTURE.md

@@ -251,10 +251,10 @@ skills/
 
 ### Example
 
-**`skills/testing-patterns/SKILL.md`**:
+**`skills/testing-ops/SKILL.md`**:
 ```yaml
 ---
-name: testing-patterns
+name: testing-ops
 description: Test architecture, mocking strategies, and coverage patterns. Triggers on: write tests, test strategy, mocking, fixtures, coverage.
 ---
 

+ 1 - 1
docs/PLAN.md

@@ -13,7 +13,7 @@
 | Component | Count | Notes |
 |-----------|-------|-------|
 | Agents | 22 | Domain experts (Python, Go, Rust, React, etc.) |
-| Skills | 43 | Pattern libraries, CLI tools, workflows, dev tasks |
+| Skills | 44 | Operational skills, CLI tools, workflows, dev tasks |
 | Commands | 3 | Session management (sync, save) + experimental (canvas) |
 | Rules | 5 | CLI tools, thinking, commit style, naming, skill-agent-updates |
 | Output Styles | 1 | Vesper personality |

+ 9 - 6
rules/naming-conventions.md

@@ -49,16 +49,19 @@ All skills follow the official Anthropic pattern with bundled resources:
 
 | Pattern | Example | Notes |
 |---------|---------|-------|
-| Language patterns | `python-async-patterns/` | Domain + "patterns" |
-| Tool knowledge | `sqlite-ops/` | Tool + operation |
+| Operational expertise | `postgres-ops/` | Comprehensive domain knowledge (preferred) |
+| Domain-specific | `python-async-ops/` | Domain + "-ops" |
+| Tool knowledge | `sqlite-ops/` | Tool + "-ops" |
 | Workflow | `git-workflow/` | Activity-focused |
-| Framework | `tailwind-patterns/` | Framework + "patterns" |
+| Framework | `tailwind-ops/` | Framework + "-ops" |
+
+**Naming guidance:** Use `-ops` for all skills providing domain knowledge. The `-ops` suffix signals comprehensive operational expertise - design, implementation, and operations.
 
 **Frontmatter:**
 
 ```yaml
 ---
-name: python-async-patterns  # Match directory name
+name: python-async-ops  # Match directory name
 description: "<trigger phrases>"
 compatibility: "<version requirements>"
 allowed-tools: "<tool list>"
@@ -184,7 +187,7 @@ BAD:  pythonExpert.md        - camelCase
 GOOD: python-expert.md       - kebab-case
 
 BAD:  skills/PythonPatterns/ - PascalCase directory
-GOOD: skills/python-patterns/
+GOOD: skills/python-ops/
 
 BAD:  commands/TestGen.md    - PascalCase
 GOOD: commands/testgen.md    - Concatenated lowercase
@@ -198,7 +201,7 @@ GOOD: vesper.md              - lowercase
 | Component | Pattern | Example |
 |-----------|---------|---------|
 | Agent | `{domain}-expert.md` | `docker-expert.md` |
-| Skill | `{topic}-patterns/skill.md` | `python-async-patterns/skill.md` |
+| Skill | `{topic}-ops/SKILL.md` | `postgres-ops/SKILL.md` |
 | Command | `{action}.md` | `review.md` |
 | Rule | `{topic}.md` | `commit-style.md` |
 | Output Style | `{personality}.md` | `vesper.md` |

+ 1 - 1
skills/claude-code-hooks/SKILL.md

@@ -107,7 +107,7 @@ echo '{"tool_name":"Bash"}' | ./hooks/validate.sh
 
 - `./references/hook-events.md` - All events with input/output schemas
 - `./references/configuration.md` - Advanced config patterns
-- `./references/security-patterns.md` - Production security
+- `./references/security-ops.md` - Production security
 
 ---
 

+ 2 - 2
skills/cli-patterns/SKILL.md

@@ -1,10 +1,10 @@
 ---
-name: cli-patterns
+name: cli-ops
 description: "Patterns for building production-quality CLI tools with predictable behavior, parseable output, and agentic workflows. Triggers: cli tool, command line tool, build cli, cli patterns, agentic cli, cli design, typer cli, click cli."
 compatibility: "Python 3.11+, Typer, Click"
 allowed-tools: "Read, Write, Edit"
 depends-on: []
-related-skills: [python-cli-patterns, python-async-patterns]
+related-skills: [python-cli-ops, python-async-ops]
 ---
 
 # CLI Patterns for Agentic Workflows

skills/cli-patterns/assets/.gitkeep → skills/cli-ops/assets/.gitkeep


skills/cli-patterns/references/implementation.md → skills/cli-ops/references/implementation.md


skills/cli-patterns/references/json-schemas.md → skills/cli-ops/references/json-schemas.md


skills/cli-patterns/scripts/.gitkeep → skills/cli-ops/scripts/.gitkeep


+ 4 - 4
skills/mcp-patterns/SKILL.md

@@ -1,5 +1,5 @@
 ---
-name: mcp-patterns
+name: mcp-ops
 description: "Model Context Protocol (MCP) server patterns for building integrations with Claude Code. Triggers on: mcp server, model context protocol, tool handler, mcp resource, mcp tool."
 compatibility: "Requires Python 3.10+ or Node.js 18+ for MCP server development."
 allowed-tools: "Read Write Bash"
@@ -112,8 +112,8 @@ my-mcp-server/
 | OAuth tokens | Token refresh with TTL | `./references/auth-patterns.md` |
 | SQLite cache | Persistent state storage | `./references/state-patterns.md` |
 | In-memory cache | TTL-based caching | `./references/state-patterns.md` |
-| Manual testing | Quick validation script | `./references/testing-patterns.md` |
-| pytest async | Unit tests for tools | `./references/testing-patterns.md` |
+| Manual testing | Quick validation script | `./references/testing-ops.md` |
+| pytest async | Unit tests for tools | `./references/testing-ops.md` |
 
 ## Common Issues
 
@@ -141,4 +141,4 @@ For detailed patterns, load:
 - `./references/resource-patterns.md` - Static and dynamic resource exposure
 - `./references/auth-patterns.md` - Environment variables, OAuth token refresh
 - `./references/state-patterns.md` - SQLite persistence, in-memory caching
-- `./references/testing-patterns.md` - Manual test scripts, pytest async patterns
+- `./references/testing-ops.md` - Manual test scripts, pytest async patterns

skills/mcp-patterns/assets/.gitkeep → skills/mcp-ops/assets/.gitkeep


skills/mcp-patterns/references/auth-patterns.md → skills/mcp-ops/references/auth-patterns.md


skills/mcp-patterns/references/resource-patterns.md → skills/mcp-ops/references/resource-patterns.md


skills/mcp-patterns/references/state-patterns.md → skills/mcp-ops/references/state-patterns.md


skills/mcp-patterns/references/testing-patterns.md → skills/mcp-ops/references/testing-patterns.md


skills/mcp-patterns/references/tool-patterns.md → skills/mcp-ops/references/tool-patterns.md


skills/mcp-patterns/scripts/.gitkeep → skills/mcp-ops/scripts/.gitkeep


File diff suppressed because it is too large
+ 271 - 0
skills/postgres-ops/SKILL.md


skills/python-cli-patterns/scripts/.gitkeep → skills/postgres-ops/assets/.gitkeep


+ 746 - 0
skills/postgres-ops/references/config-tuning.md

@@ -0,0 +1,746 @@
+# PostgreSQL Configuration & Tuning Reference
+
+## Table of Contents
+
+1. [Memory Settings](#memory-settings)
+   - shared_buffers
+   - work_mem
+   - maintenance_work_mem
+   - effective_cache_size
+   - huge_pages
+2. [WAL & Checkpoint Settings](#wal--checkpoint-settings)
+   - wal_level
+   - wal_buffers
+   - checkpoint_completion_target
+   - max_wal_size and min_wal_size
+   - full_page_writes
+3. [Query Planner Settings](#query-planner-settings)
+   - random_page_cost and seq_page_cost
+   - effective_io_concurrency
+   - JIT compilation
+4. [Parallelism Settings](#parallelism-settings)
+5. [Connection Settings](#connection-settings)
+6. [Logging](#logging)
+7. [OLTP vs OLAP Profiles](#oltp-vs-olap-profiles)
+8. [Extensions](#extensions)
+   - pg_stat_statements
+   - pg_trgm
+   - PostGIS
+   - timescaledb
+   - pgcrypto
+   - auto_explain
+
+---
+
+## Memory Settings
+
+### shared_buffers
+
+The PostgreSQL buffer cache: how much memory the server reserves for caching data pages.
+
+```ini
+shared_buffers = 8GB   # Recommended: 25% of total RAM
+```
+
+Rules of thumb:
+- Start at 25% of RAM. Going above 40% rarely helps and can hurt because the OS page cache also buffers the same pages.
+- On dedicated database servers, 25% is conservative but safe. Profile with `pg_buffercache` to measure actual cache hit rates.
+- Requires a server restart to take effect.
+
+Check cache hit ratio:
+
+```sql
+SELECT
+    sum(heap_blks_hit)  AS heap_hit,
+    sum(heap_blks_read) AS heap_read,
+    round(
+        sum(heap_blks_hit)::numeric /
+        nullif(sum(heap_blks_hit) + sum(heap_blks_read), 0) * 100, 2
+    ) AS hit_ratio_pct
+FROM pg_statio_user_tables;
+-- Target: > 99% for OLTP, > 95% for OLAP
+```
+
+Identify which tables consume the most buffer space (requires `pg_buffercache`):
+
+```sql
+CREATE EXTENSION pg_buffercache;
+
+SELECT
+    relname,
+    count(*) * 8192 / 1024 / 1024 AS cached_mb,
+    round(count(*) * 100.0 / (SELECT count(*) FROM pg_buffercache), 2) AS pct_of_cache
+FROM pg_buffercache bc
+JOIN pg_class c ON bc.relfilenode = c.relfilenode
+WHERE c.relkind = 'r'
+GROUP BY relname
+ORDER BY cached_mb DESC
+LIMIT 20;
+```
+
+### work_mem
+
+Memory granted per sort, hash, or merge operation. Each query node (sort, hash join, hash aggregate) can use up to `work_mem` individually.
+
+```ini
+work_mem = 64MB   # Default 4MB is usually too low
+```
+
+Critical nuance: if a query has 5 sort nodes and 20 parallel workers, it can consume `5 * 20 * work_mem` = 100x `work_mem`. For a 32GB server running 100 connections, setting `work_mem = 320MB` is catastrophic.
+
+Sizing strategy:
+1. Estimate concurrent queries: `max_connections * avg_active_fraction`
+2. Reserve memory for OS + shared_buffers + maintenance_work_mem
+3. Divide remainder: `work_mem = remaining / (active_connections * avg_nodes_per_query)`
+
+For most OLTP systems: 16-64MB. For analytics: 256MB-1GB with fewer connections.
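+
+A worked sizing sketch (hypothetical 64GB dedicated server; all numbers illustrative):
+
+```ini
+# 64GB RAM total
+#   shared_buffers        16GB  (25% of RAM)
+#   maintenance_work_mem   2GB
+#   OS + page cache      ~14GB  reserved
+# Remaining for query memory: ~32GB
+# Expected active connections: 100, ~3 memory-hungry nodes per query
+# 32GB / (100 * 3) ~= 109MB -> round down for safety margin
+work_mem = 64MB
+```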
+
+Override per session for specific heavy queries:
+
+```sql
+SET work_mem = '512MB';
+SELECT ... FROM large_table ORDER BY ...;
+RESET work_mem;
+```
+
+Monitor actual temporary file creation to detect under-allocation:
+
+```ini
+log_temp_files = 0   # Log all temp files (0 = log everything, N = only files of at least N kB)
+```
+
+```sql
+-- Check existing temp file usage stats
+SELECT query, temp_blks_written
+FROM pg_stat_statements
+WHERE temp_blks_written > 0
+ORDER BY temp_blks_written DESC
+LIMIT 10;
+```
+
+### maintenance_work_mem
+
+Memory for maintenance operations: VACUUM, ANALYZE, CREATE INDEX, ALTER TABLE ADD FOREIGN KEY, CLUSTER.
+
+```ini
+maintenance_work_mem = 2GB   # Recommended: up to 10% RAM or 1-4GB
+```
+
+Larger values dramatically speed up `CREATE INDEX` and VACUUM on large tables. Unlike `work_mem`, maintenance operations rarely run many at a time, so you can set this aggressively. Note that each autovacuum worker can also claim up to this amount unless `autovacuum_work_mem` is set lower.
+
+Override per session before a large index build:
+
+```sql
+SET maintenance_work_mem = '4GB';
+CREATE INDEX CONCURRENTLY idx_events_created_at ON events (created_at);
+RESET maintenance_work_mem;
+```
+
+### effective_cache_size
+
+A hint to the query planner about total memory available for caching (RAM + OS page cache). It does not allocate memory; it only influences cost estimates.
+
+```ini
+effective_cache_size = 24GB   # Recommended: 75% of total RAM
+```
+
+Higher values make the planner prefer index scans (which benefit from caching) over sequential scans. Too low a value causes the planner to choose sequential scans even when an index scan would be faster.
+
+### huge_pages
+
+Huge pages (2MB pages on Linux instead of 4KB) reduce TLB pressure and can improve throughput on large `shared_buffers` values (above 8GB).
+
+```ini
+huge_pages = try    # 'try' falls back gracefully; use 'on' to enforce
+```
+
+Linux OS setup (must be done before starting PostgreSQL):
+
+```bash
+# Calculate pages needed: shared_buffers / 2MB
+# For shared_buffers = 16GB: 16384 MB / 2 MB = 8192 huge pages, add 10% buffer
+echo 9000 > /proc/sys/vm/nr_hugepages
+
+# Persist across reboots
+echo "vm.nr_hugepages = 9000" >> /etc/sysctl.conf
+sysctl -p
+
+# Verify allocation
+grep HugePages /proc/meminfo
+```
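+
+To verify from inside PostgreSQL (the page-count parameter is exposed in PG15+):
+
+```sql
+SHOW huge_pages;                        -- on / off / try
+SHOW shared_memory_size_in_huge_pages;  -- PG15+: pages needed at current settings
+```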
+
+---
+
+## WAL & Checkpoint Settings
+
+### wal_level
+
+Controls how much information is written to WAL.
+
+```ini
+wal_level = replica    # Minimum for streaming replication
+wal_level = logical    # Required for logical replication (writes more)
+```
+
+`wal_level = minimal` disables replication and reduces WAL volume slightly. Use only for standalone servers where you never need PITR.
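+
+`wal_level` requires a restart; check the live value and whether a change is still pending:
+
+```sql
+SELECT name, setting, pending_restart
+FROM pg_settings
+WHERE name = 'wal_level';
+```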
+
+### wal_buffers
+
+Memory for WAL writes before flushing to disk. PostgreSQL auto-tunes this to 1/32 of `shared_buffers`, capped at 16MB.
+
+```ini
+wal_buffers = 64MB   # Manual override; auto value is usually fine
+```
+
+Rarely needs manual tuning. Increase only if you see `WALBufMapping` wait events in `pg_stat_activity`.
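+
+One way to spot WAL buffer contention (wait-event names vary by version; `WALBufMapping` on recent releases):
+
+```sql
+SELECT wait_event_type, wait_event, count(*)
+FROM pg_stat_activity
+WHERE wait_event IS NOT NULL
+GROUP BY 1, 2
+ORDER BY count(*) DESC;
+```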
+
+### checkpoint_completion_target
+
+Fraction of the checkpoint interval over which to spread checkpoint I/O. Reduces I/O spikes at checkpoint time.
+
+```ini
+checkpoint_completion_target = 0.9   # Recommended (default is 0.9 in PG14+)
+```
+
+With `max_wal_size = 4GB` and `checkpoint_completion_target = 0.9`, PostgreSQL spreads writes over 90% of the checkpoint interval instead of flushing all at once.
+
+### max_wal_size and min_wal_size
+
+Control WAL retention between checkpoints. Larger values reduce checkpoint frequency (less I/O) at the cost of more WAL on disk and longer crash recovery time.
+
+```ini
+min_wal_size = 1GB     # Minimum WAL to retain (default 80MB)
+max_wal_size = 8GB     # Triggers checkpoint when exceeded (default 1GB)
+```
+
+For write-heavy workloads, increase `max_wal_size` to reduce checkpoint frequency. Monitor checkpoint frequency:
+
+```sql
+SELECT checkpoints_timed, checkpoints_req, checkpoint_write_time, checkpoint_sync_time
+FROM pg_stat_bgwriter;
+-- checkpoints_req >> checkpoints_timed means max_wal_size is too small
+```
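+
+Checkpoint behaviour can also be tracked from the server log rather than polled:
+
+```ini
+log_checkpoints = on   # One log line per checkpoint with write/sync timings (default on in PG15+)
+```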
+
+### full_page_writes
+
+After a checkpoint, PostgreSQL writes the full page image of a modified page the first time it is touched. This protects against torn page writes when the OS crashes mid-write.
+
+```ini
+full_page_writes = on   # NEVER disable this
+```
+
+Disabling `full_page_writes` can cause unrecoverable data corruption after an OS crash. It is only safe on a filesystem or storage layer that guarantees atomic page writes (ZFS, some SAN configurations), and only if you fully understand the implications.
+
+---
+
+## Query Planner Settings
+
+### random_page_cost and seq_page_cost
+
+Control the planner's cost model for I/O. Lower values make the planner favor the corresponding access method.
+
+```ini
+# For NVMe/SSD storage:
+random_page_cost = 1.1
+seq_page_cost = 1.0
+
+# For traditional HDD:
+random_page_cost = 4.0
+seq_page_cost = 1.0
+```
+
+The default `random_page_cost = 4.0` is calibrated for spinning disk. On SSD, it causes the planner to undervalue index scans, leading to unnecessary sequential scans. Always set `random_page_cost = 1.1` on SSD-based servers.
+
+Override per session to diagnose planner choices:
+
+```sql
+SET random_page_cost = 1.1;
+EXPLAIN ANALYZE SELECT ...;
+```
+
+### effective_io_concurrency
+
+Number of concurrent I/O operations the planner assumes the storage can handle. Affects bitmap index scan prefetching.
+
+```ini
+effective_io_concurrency = 200   # NVMe SSD (high parallelism)
+effective_io_concurrency = 2     # Traditional HDD (low parallelism)
+effective_io_concurrency = 1     # NFS/SAN (conservative)
+```
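+
+The setting can also be applied per tablespace when fast and slow storage are mixed (`fast_ssd` is a hypothetical tablespace name):
+
+```sql
+ALTER TABLESPACE fast_ssd SET (effective_io_concurrency = 200);
+```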
+
+### JIT Compilation
+
+JIT (Just-In-Time compilation via LLVM) can speed up CPU-intensive queries (complex aggregations, many expressions) but adds compilation overhead that hurts short OLTP queries.
+
+```ini
+jit = on                  # Enable JIT globally (default on since PG12)
+jit_above_cost = 100000   # Only JIT-compile queries above this cost
+jit_optimize_above_cost = 500000  # Apply expensive optimizations above this cost
+jit_inline_above_cost = 500000    # Inline functions above this cost
+```
+
+For OLTP workloads where queries are fast and simple:
+
+```ini
+jit = off   # Disable entirely to avoid overhead
+```
+
+Check if JIT was used in a query:
+
+```sql
+EXPLAIN (ANALYZE, VERBOSE, FORMAT TEXT)
+SELECT sum(total) FROM orders WHERE created_at > now() - interval '1 year';
+-- Look for "JIT:" section in output
+```
+
+---
+
+## Parallelism Settings
+
+PostgreSQL can parallelize sequential scans, aggregations, joins, and index scans.
+
+```ini
+# Total background workers available to the instance
+max_worker_processes = 16           # Default 8; should be >= CPU cores
+
+# Maximum parallel workers available for queries at any time
+max_parallel_workers = 8            # Default 8; cap at physical CPU cores
+
+# Workers per individual query node
+max_parallel_workers_per_gather = 4 # Default 2; practical limit 4-8
+
+# Minimum table size before considering parallel scan
+min_parallel_table_scan_size = 8MB  # Default; lower to enable on smaller tables
+min_parallel_index_scan_size = 512kB
+
+# Include leader process in parallel work (default on)
+parallel_leader_participation = on
+```
+
+Force parallelism for testing (dangerous in production):
+
+```sql
+SET max_parallel_workers_per_gather = 8;
+SET parallel_setup_cost = 0;
+SET parallel_tuple_cost = 0;
+EXPLAIN ANALYZE SELECT count(*) FROM large_table;
+```
+
+Disable parallelism for a session (useful when debugging):
+
+```sql
+SET max_parallel_workers_per_gather = 0;
+```
+
+---
+
+## Connection Settings
+
+### max_connections
+
+PostgreSQL creates one process per connection. High connection counts waste memory and cause lock contention.
+
+```ini
+max_connections = 200   # Keep below 300; use pgBouncer for more
+```
+
+Each idle connection consumes roughly 5MB of RAM in process overhead alone. With `work_mem = 64MB`, a single connection running a sort-heavy query can briefly use N × 64MB, one allocation per sort node.
+
+Use PgBouncer in transaction mode for OLTP:
+
+```ini
+# pgbouncer.ini
+pool_mode = transaction
+max_client_conn = 2000
+default_pool_size = 20   # Connections to PostgreSQL per database/user pair
+```
+
+```ini
+# Reserve connections for superusers (DBA access during emergencies)
+superuser_reserved_connections = 5
+```
+
+### TCP Keepalives
+
+Detect dead connections (e.g., after network partition) without relying on the application:
+
+```ini
+tcp_keepalives_idle = 60      # Start keepalives after 60s idle
+tcp_keepalives_interval = 10  # Retry every 10s
+tcp_keepalives_count = 6      # Drop connection after 6 failed probes (1 minute)
+```
+
+Monitor current connections and their state:
+
+```sql
+SELECT
+    state,
+    count(*),
+    max(now() - state_change) AS longest_in_state
+FROM pg_stat_activity
+WHERE datname = current_database()
+GROUP BY state
+ORDER BY count DESC;
+
+-- Find idle connections older than 10 minutes
+SELECT pid, usename, application_name, state, state_change, query
+FROM pg_stat_activity
+WHERE state = 'idle'
+  AND state_change < now() - interval '10 minutes';
+```
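+
+To reap abandoned sessions automatically rather than just observing them (timeout values are illustrative):
+
+```ini
+idle_in_transaction_session_timeout = 10min   # Kill sessions idle inside an open transaction
+idle_session_timeout = 1h                     # PG14+: kill fully idle sessions
+```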
+
+---
+
+## Logging
+
+### Slow Query Logging
+
+```ini
+log_min_duration_statement = 1000   # Log queries taking > 1 second (ms)
+                                     # Set to 0 to log all; -1 to disable
+```
+
+### Statement-Level Logging
+
+```ini
+log_statement = 'ddl'   # Recommended for most production servers
+# Options: none | ddl | mod | all
+# 'ddl'  = CREATE, DROP, ALTER, TRUNCATE
+# 'mod'  = ddl + INSERT, UPDATE, DELETE, COPY
+# 'all'  = everything (very verbose, for debugging only)
+```
+
+### Lock Logging
+
+```ini
+log_lock_waits = on       # Log if a query waits for a lock
+deadlock_timeout = 1s     # Time before checking for deadlock (and logging wait)
+```
+
+Deadlocks are always logged as errors (the amount of detail is controlled by `log_error_verbosity`). Lock waits that do not escalate to deadlocks are logged only when `log_lock_waits = on`:
+
+```ini
+# Also useful for identifying lock contention:
+log_min_duration_statement = 500    # Catch queries slow due to lock waits
+```
+
+Query current lock waits:
+
+```sql
+SELECT
+    blocked.pid                   AS blocked_pid,
+    blocked.query                 AS blocked_query,
+    blocking.pid                  AS blocking_pid,
+    blocking.query                AS blocking_query,
+    now() - blocked.query_start   AS wait_duration
+FROM pg_stat_activity blocked
+JOIN pg_stat_activity blocking
+    ON blocking.pid = ANY(pg_blocking_pids(blocked.pid))
+WHERE cardinality(pg_blocking_pids(blocked.pid)) > 0;
+```
+
+### auto_explain
+
+Automatically log EXPLAIN ANALYZE for slow queries without modifying application code:
+
+```ini
+# Load as a shared library (requires restart)
+shared_preload_libraries = 'pg_stat_statements, auto_explain'
+
+# auto_explain settings (in postgresql.conf or per session)
+auto_explain.log_min_duration = 5000    # Log plans for queries > 5 seconds
+auto_explain.log_analyze = on           # Include ANALYZE (actual vs estimated rows)
+auto_explain.log_buffers = on           # Include buffer usage
+auto_explain.log_format = text          # text | json | yaml | xml
+auto_explain.log_verbose = off          # Include column-level output (very noisy)
+auto_explain.log_nested_statements = off # Exclude PL/pgSQL internal queries
+auto_explain.sample_rate = 1.0          # Sample 100% of queries; set lower under load
+```
+
+Enable per session without restart:
+
+```sql
+LOAD 'auto_explain';
+SET auto_explain.log_min_duration = '1s';
+SET auto_explain.log_analyze = true;
+```
+
+---
+
+## OLTP vs OLAP Profiles
+
+Two complete configuration profiles showing key differences.
+
+### OLTP Profile (32GB RAM, NVMe SSD, 200 connections)
+
+```ini
+# Memory
+shared_buffers = 8GB
+work_mem = 32MB
+maintenance_work_mem = 1GB
+effective_cache_size = 24GB
+huge_pages = try
+
+# WAL & Checkpoints
+wal_level = replica
+wal_buffers = 64MB
+checkpoint_completion_target = 0.9
+max_wal_size = 4GB
+min_wal_size = 1GB
+full_page_writes = on
+
+# Planner
+random_page_cost = 1.1
+seq_page_cost = 1.0
+effective_io_concurrency = 200
+jit = off                        # Short queries don't benefit; avoid overhead
+
+# Parallelism - conservative for OLTP
+max_worker_processes = 16
+max_parallel_workers = 4
+max_parallel_workers_per_gather = 2
+
+# Connections
+max_connections = 200
+superuser_reserved_connections = 5
+tcp_keepalives_idle = 60
+tcp_keepalives_interval = 10
+tcp_keepalives_count = 6
+
+# Logging
+log_min_duration_statement = 500
+log_statement = 'ddl'
+log_lock_waits = on
+deadlock_timeout = 1s
+
+# Extensions
+shared_preload_libraries = 'pg_stat_statements, auto_explain'
+pg_stat_statements.track = all
+auto_explain.log_min_duration = 2000
+auto_explain.log_analyze = on
+auto_explain.log_buffers = on
+```
+
+### OLAP Profile (128GB RAM, NVMe SSD, 20 connections, analytics workload)
+
+```ini
+# Memory - larger allocations per query
+shared_buffers = 32GB
+work_mem = 2GB                   # Large sorts and hash joins for analytics
+maintenance_work_mem = 4GB
+effective_cache_size = 96GB
+huge_pages = on
+
+# WAL & Checkpoints - less frequent, larger checkpoints
+wal_level = replica
+wal_buffers = 64MB
+checkpoint_completion_target = 0.9
+max_wal_size = 16GB              # Fewer checkpoints for write-heavy ETL
+min_wal_size = 4GB
+full_page_writes = on
+
+# Planner - favor parallel plans and large scans
+random_page_cost = 1.1
+seq_page_cost = 1.0
+effective_io_concurrency = 200
+jit = on                         # CPU-heavy aggregations benefit from JIT
+jit_above_cost = 50000           # Lower threshold to engage JIT sooner
+
+# Parallelism - aggressive for analytics
+max_worker_processes = 32
+max_parallel_workers = 24
+max_parallel_workers_per_gather = 12
+min_parallel_table_scan_size = 1MB
+min_parallel_index_scan_size = 128kB
+parallel_leader_participation = on
+
+# Connections - low count, use pooling at application layer
+max_connections = 50
+superuser_reserved_connections = 5
+tcp_keepalives_idle = 60
+tcp_keepalives_interval = 10
+tcp_keepalives_count = 6
+
+# Logging
+log_min_duration_statement = 5000   # Only log very slow queries
+log_statement = 'ddl'
+log_lock_waits = on
+deadlock_timeout = 5s
+
+# Extensions
+shared_preload_libraries = 'pg_stat_statements, auto_explain'
+pg_stat_statements.track = all
+auto_explain.log_min_duration = 10000
+auto_explain.log_analyze = on
+auto_explain.log_buffers = on
+auto_explain.log_verbose = on        # Column-level detail useful for analytics
+```
+
+---
+
+## Extensions
+
+### pg_stat_statements
+
+Tracks cumulative execution statistics for all SQL statements. Essential for identifying slow queries.
+
+```ini
+# postgresql.conf
+shared_preload_libraries = 'pg_stat_statements'
+pg_stat_statements.track = all          # top | all (includes nested statements)
+pg_stat_statements.max = 10000          # Max distinct statements tracked
+pg_stat_statements.track_utility = on   # Track VACUUM, CREATE, etc.
+```
+
+```sql
+CREATE EXTENSION pg_stat_statements;
+
+-- Top 10 queries by total execution time
+SELECT
+    round(total_exec_time::numeric, 2) AS total_ms,
+    calls,
+    round(mean_exec_time::numeric, 2) AS mean_ms,
+    round(stddev_exec_time::numeric, 2) AS stddev_ms,
+    rows,
+    round(100.0 * total_exec_time / sum(total_exec_time) OVER (), 2) AS pct,
+    left(query, 120) AS query
+FROM pg_stat_statements
+ORDER BY total_exec_time DESC
+LIMIT 10;
+
+-- Queries with high I/O (temp file usage)
+SELECT query, calls, total_exec_time, temp_blks_written
+FROM pg_stat_statements
+WHERE temp_blks_written > 0
+ORDER BY temp_blks_written DESC
+LIMIT 10;
+
+-- Reset statistics
+SELECT pg_stat_statements_reset();
+```
+
+### pg_trgm
+
+Trigram similarity enables fast fuzzy text search and LIKE/ILIKE acceleration with GIN or GiST indexes.
+
+```sql
+CREATE EXTENSION pg_trgm;
+
+-- Similarity search (0 to 1 score)
+SELECT name, similarity(name, 'PostgreSQL') AS sim
+FROM products
+WHERE similarity(name, 'PostgreSQL') > 0.3
+ORDER BY sim DESC;
+
+-- Accelerate LIKE/ILIKE with GIN index
+CREATE INDEX idx_products_name_trgm ON products USING gin (name gin_trgm_ops);
+
+-- Now this query uses the index:
+EXPLAIN ANALYZE SELECT * FROM products WHERE name ILIKE '%ostgre%';
+
+-- Word similarity (better for phrase matching)
+SELECT word_similarity('PostgreSQL', 'Postgres SQL tutorial');
+```
+
+### PostGIS
+
+Spatial and geographic data types, indexing, and functions. Use GiST indexes for geometry columns.
+
+```sql
+CREATE EXTENSION postgis;
+
+-- Spatial columns
+CREATE TABLE locations (
+    id      bigserial PRIMARY KEY,
+    name    text,
+    geom    geometry(Point, 4326)   -- WGS84; note PostGIS point order is (lng, lat)
+);
+
+CREATE INDEX idx_locations_geom ON locations USING gist (geom);
+
+-- Find points within 10km of a given point
+SELECT name, ST_Distance(geom::geography, ST_MakePoint(-73.9857, 40.7484)::geography) AS dist_m
+FROM locations
+WHERE ST_DWithin(geom::geography, ST_MakePoint(-73.9857, 40.7484)::geography, 10000)
+ORDER BY dist_m;
+```
+
+### timescaledb
+
+Automatically partitions time-series data into chunks, enables continuous aggregates, and provides compression.
+
+```sql
+CREATE EXTENSION timescaledb;
+
+-- Convert a regular table to a hypertable (partitioned by time)
+CREATE TABLE metrics (
+    time        timestamptz NOT NULL,
+    device_id   int,
+    temperature double precision
+);
+SELECT create_hypertable('metrics', 'time', chunk_time_interval => interval '1 day');
+
+-- Automatic compression for old chunks
+ALTER TABLE metrics SET (
+    timescaledb.compress,
+    timescaledb.compress_orderby = 'time DESC',
+    timescaledb.compress_segmentby = 'device_id'
+);
+SELECT add_compression_policy('metrics', interval '7 days');
+
+-- Continuous aggregate (materialized, auto-refreshed)
+CREATE MATERIALIZED VIEW metrics_hourly
+WITH (timescaledb.continuous) AS
+SELECT time_bucket('1 hour', time) AS bucket, device_id, avg(temperature) AS avg_temp
+FROM metrics
+GROUP BY bucket, device_id;
+```
+
+### pgcrypto
+
+Cryptographic functions for hashing, encryption, and key generation.
+
+```sql
+CREATE EXTENSION pgcrypto;
+
+-- Password hashing (bcrypt)
+INSERT INTO users (email, password_hash)
+VALUES ('user@example.com', crypt('user_password', gen_salt('bf', 12)));
+
+-- Verify password
+SELECT id FROM users
+WHERE email = 'user@example.com'
+  AND password_hash = crypt('supplied_password', password_hash);
+
+-- Symmetric encryption (AES via pgp_sym_encrypt)
+SELECT pgp_sym_encrypt('sensitive data', 'encryption_key');
+SELECT pgp_sym_decrypt(encrypted_col, 'encryption_key') FROM secrets;
+
+-- Generate random UUID
+SELECT gen_random_uuid();
+
+-- Generate cryptographically secure random bytes
+SELECT encode(gen_random_bytes(32), 'hex');
+```
+
+### auto_explain
+
+Logs query execution plans automatically for slow queries. Configured as a shared library (see [Logging](#logging) section). No SQL setup required beyond loading the library.
+
+Load temporarily in a session for debugging without a server restart:
+
+```sql
+LOAD 'auto_explain';
+SET auto_explain.log_min_duration = 0;    -- Log everything in this session
+SET auto_explain.log_analyze = true;
+SET auto_explain.log_buffers = true;
+
+-- Run your query; check PostgreSQL logs for the plan
+SELECT * FROM orders WHERE customer_id = 12345 ORDER BY created_at DESC LIMIT 100;
+```
+
+Sample only a fraction of queries under high load to reduce log volume:
+
+```ini
+auto_explain.sample_rate = 0.01   # Log plans for ~1% of qualifying queries
+```

+ 746 - 0
skills/postgres-ops/references/indexing.md

@@ -0,0 +1,746 @@
+# PostgreSQL Indexing Reference
+
+## Table of Contents
+
+1. [Index Types Overview](#index-types-overview)
+2. [B-tree Indexes](#b-tree-indexes)
+3. [Hash Indexes](#hash-indexes)
+4. [GIN Indexes](#gin-indexes)
+5. [GiST Indexes](#gist-indexes)
+6. [BRIN Indexes](#brin-indexes)
+7. [Composite Indexes](#composite-indexes)
+8. [Partial Indexes](#partial-indexes)
+9. [Expression Indexes](#expression-indexes)
+10. [Covering Indexes (INCLUDE)](#covering-indexes-include)
+11. [GIN Specifics](#gin-specifics)
+12. [GiST Specifics](#gist-specifics)
+13. [BRIN Specifics](#brin-specifics)
+14. [Index Maintenance](#index-maintenance)
+15. [Anti-Patterns](#anti-patterns)
+
+---
+
+## Index Types Overview
+
+| Type | Best For | Operators Supported | Notes |
+|------|----------|---------------------|-------|
+| B-tree | Equality, range, sorting | `=`, `<`, `>`, `<=`, `>=`, `BETWEEN`, `LIKE 'foo%'` | Default; works for most cases |
+| Hash | Equality only | `=` | Smaller than B-tree for pure equality |
+| GIN | Multi-valued columns | `@>`, `<@`, `&&`, `?`, `@@` | JSONB, arrays, FTS, tsvector |
+| GiST | Geometric, range, custom | `&&`, `@>`, `<@`, `<->` | Ranges, PostGIS, exclusion constraints |
+| BRIN | Append-only correlated data | `=`, `<`, `>` range | Tiny size, ideal for time-series |
+| SP-GiST | Partitioned/hierarchical data | Varies | IP addresses, phone trees, quadtrees |
+
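+As a quick sanity check before adding new indexes, you can group a database's existing indexes by access method via the system catalogs (`pg_class` joined to `pg_am`):
+
+```sql
+-- Count existing indexes by access method (btree, hash, gin, gist, brin, spgist)
+SELECT am.amname, count(*) AS index_count
+FROM pg_class c
+JOIN pg_am am ON am.oid = c.relam
+WHERE c.relkind = 'i'
+GROUP BY am.amname
+ORDER BY index_count DESC;
+```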
+---
+
+## B-tree Indexes
+
+The default index type. Keeps values in sorted order, enabling equality lookups, range scans, and ORDER BY satisfaction without a sort step.
+
+```sql
+-- Basic B-tree (implicit)
+CREATE INDEX idx_orders_customer_id ON orders(customer_id);
+
+-- Explicit declaration
+CREATE INDEX idx_orders_customer_id ON orders USING btree(customer_id);
+
+-- Descending sort order (useful when ORDER BY col DESC is common)
+CREATE INDEX idx_events_created_desc ON events(created_at DESC);
+
+-- NULLS FIRST / NULLS LAST (match your ORDER BY for index-only scan benefit)
+CREATE INDEX idx_tasks_due_date ON tasks(due_date ASC NULLS LAST);
+```
+
+B-tree supports prefix matching on text columns with `LIKE 'prefix%'` (but not `LIKE '%suffix'`). With a non-C locale, this requires the `text_pattern_ops` operator class:
+
+```sql
+CREATE INDEX idx_users_name_prefix ON users(name text_pattern_ops);
+-- Now: WHERE name LIKE 'Joh%'  uses the index
+```
+
+---
+
+## Hash Indexes
+
+Hash indexes store a hash of the indexed value. They are smaller than B-tree and marginally faster for pure equality lookups, but cannot satisfy range queries or sorting.
+
+```sql
+CREATE INDEX idx_sessions_token ON sessions USING hash(token);
+
+-- Only useful for:
+SELECT * FROM sessions WHERE token = 'abc123';
+
+-- Useless for:
+SELECT * FROM sessions WHERE token > 'abc123';   -- cannot use hash index
+SELECT * FROM sessions ORDER BY token;            -- cannot use hash index
+```
+
+Hash indexes are WAL-logged since PG10 and safe for production use. Choose hash only when you are certain the column will never participate in range queries or ORDER BY.
+
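+To confirm the size advantage on your own data, a quick comparison sketch (the B-tree index here is created only for the measurement, then dropped):
+
+```sql
+-- Build both index types on the same column and compare on-disk size
+CREATE INDEX idx_sessions_token_hash  ON sessions USING hash(token);
+CREATE INDEX idx_sessions_token_btree ON sessions USING btree(token);
+
+SELECT
+    pg_size_pretty(pg_relation_size('idx_sessions_token_hash'))  AS hash_size,
+    pg_size_pretty(pg_relation_size('idx_sessions_token_btree')) AS btree_size;
+
+DROP INDEX idx_sessions_token_btree;  -- keep only the one you need
+```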
+---
+
+## GIN Indexes
+
+Generalized Inverted Index. Designed for columns that contain multiple values (arrays, JSONB, tsvector). GIN maps each element value to the set of rows containing it.
+
+```sql
+-- Array column
+CREATE INDEX idx_articles_tags ON articles USING gin(tags);
+
+-- JSONB column (default operator class)
+CREATE INDEX idx_products_attrs ON products USING gin(attributes);
+
+-- Full-text search
+CREATE INDEX idx_posts_fts ON posts USING gin(to_tsvector('english', body));
+
+-- Pre-computed tsvector column (faster updates)
+ALTER TABLE posts ADD COLUMN search_vector tsvector
+    GENERATED ALWAYS AS (to_tsvector('english', coalesce(title,'') || ' ' || coalesce(body,''))) STORED;
+CREATE INDEX idx_posts_search ON posts USING gin(search_vector);
+```
+
+GIN indexes have high build cost and write overhead (each element is indexed separately) but excellent read performance for containment queries.
+
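+Because GIN builds are expensive, it can help to raise `maintenance_work_mem` for the session doing the build and use `CONCURRENTLY` to avoid blocking writes — a sketch, assuming a large `articles` table:
+
+```sql
+-- Give the build more memory for this session only
+SET maintenance_work_mem = '1GB';
+CREATE INDEX CONCURRENTLY idx_articles_tags_gin ON articles USING gin(tags);
+RESET maintenance_work_mem;
+```
+
+Note that `CREATE INDEX CONCURRENTLY` cannot run inside a transaction block.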
+---
+
+## GiST Indexes
+
+Generalized Search Tree. A framework supporting custom data types with custom operators. Suitable for geometric data, range types, and exclusion constraints.
+
+```sql
+-- Range type
+CREATE INDEX idx_bookings_during ON bookings USING gist(during);
+
+-- PostGIS geometry
+CREATE INDEX idx_locations_geom ON locations USING gist(geom);
+
+-- Exclusion constraint (requires GiST index internally)
+CREATE EXTENSION btree_gist;
+ALTER TABLE bookings ADD CONSTRAINT no_overlap
+    EXCLUDE USING gist (room_id WITH =, during WITH &&);
+```
+
+GiST is lossy (may return false positives that are then rechecked), making it slightly less precise than GIN but more flexible for custom types.
+
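+The recheck is visible in query plans: in a bitmap scan, lossy matches discarded after the index lookup appear as `Rows Removed by Index Recheck`. A sketch against the `bookings` table above (assuming `during` is a `tsrange`):
+
+```sql
+EXPLAIN (ANALYZE, BUFFERS)
+SELECT * FROM bookings
+WHERE during && '[2024-03-01 08:00, 2024-03-01 16:00)'::tsrange;
+-- Look for "Recheck Cond" and "Rows Removed by Index Recheck" in the output
+```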
+---
+
+## BRIN Indexes
+
+Block Range INdex. Stores min/max values per block range rather than per row. Extremely small (often 1000x smaller than B-tree) but only useful when physical row order correlates with query values.
+
+```sql
+-- Time-series table where rows are appended in timestamp order
+CREATE INDEX idx_events_created_brin ON events USING brin(created_at);
+
+-- Adjust pages_per_range (default 128): smaller = more precise, larger index
+CREATE INDEX idx_events_created_brin ON events USING brin(created_at)
+WITH (pages_per_range = 32);
+```
+
+---
+
+## Composite Indexes
+
+A composite (multi-column) index covers multiple columns. Column ordering is critical.
+
+### Ordering Rules
+
+1. **Equality conditions first** - columns used with `=` should come before range columns
+2. **Most selective first** - among equality columns, put highest cardinality first
+3. **Leftmost prefix rule** - an index on `(a, b, c)` can also serve queries on `(a)` and `(a, b)` but NOT `(b)` or `(c)` alone
+
+```sql
+-- Query pattern: WHERE status = 'active' AND created_at > '2024-01-01'
+-- Equality (status) before range (created_at)
+CREATE INDEX idx_orders_status_created ON orders(status, created_at);
+
+-- Query pattern: WHERE tenant_id = 1 AND user_id = 42 AND created_at > '2024-01-01'
+CREATE INDEX idx_events_tenant_user_created ON events(tenant_id, user_id, created_at);
+
+-- This index CANNOT be used for: WHERE user_id = 42 (skips leftmost column)
+-- This index CAN be used for: WHERE tenant_id = 1 (leftmost prefix only)
+-- This index CAN be used for: WHERE tenant_id = 1 AND user_id = 42 ORDER BY created_at
+
+-- Verify index is being used
+EXPLAIN (ANALYZE, BUFFERS)
+SELECT * FROM events
+WHERE tenant_id = 1 AND user_id = 42 AND created_at > now() - interval '7 days';
+```
+
+### Selectivity Check
+
+```sql
+-- Estimate selectivity per column before deciding order
+SELECT
+    count(DISTINCT status)::float / count(*) AS status_selectivity,
+    count(DISTINCT customer_id)::float / count(*) AS customer_selectivity
+FROM orders;
+-- Higher ratio = more selective = put earlier in composite index
+```
+
+---
+
+## Partial Indexes
+
+A partial index indexes only the rows satisfying a WHERE predicate. Results in a smaller, faster index.
+
+```sql
+-- Index only active users (WHERE status = 'active' is common)
+CREATE INDEX idx_users_active_email ON users(email) WHERE status = 'active';
+
+-- Index only unprocessed jobs (queue pattern)
+CREATE INDEX idx_jobs_pending ON jobs(created_at) WHERE processed_at IS NULL;
+
+-- Soft-delete pattern: exclude deleted rows from index
+CREATE INDEX idx_products_name ON products(name) WHERE deleted_at IS NULL;
+
+-- Partial unique: only one active record per external_id
+CREATE UNIQUE INDEX idx_subscriptions_active
+ON subscriptions(external_id) WHERE cancelled_at IS NULL;
+```
+
+For the planner to use a partial index, the query WHERE clause must be **semantically implied** by the index predicate:
+
+```sql
+-- Index: WHERE status = 'active'
+-- Query must include: WHERE status = 'active' (explicitly, not implied by a join)
+SELECT * FROM users WHERE status = 'active' AND email = 'foo@example.com';
+-- Planner can use idx_users_active_email above
+
+-- This query CANNOT use it (predicate not present):
+SELECT * FROM users WHERE email = 'foo@example.com';
+```
+
+### Size Savings Example
+
+```sql
+-- Measure savings
+SELECT
+    pg_size_pretty(pg_relation_size('idx_jobs_all')) AS full_index,
+    pg_size_pretty(pg_relation_size('idx_jobs_pending')) AS partial_index;
+
+-- Often 10-100x smaller when condition filters 90%+ of rows
+```
+
+---
+
+## Expression Indexes
+
+Index on the result of an expression or function rather than a raw column value. The expression must be **immutable** (same input always produces same output).
+
+```sql
+-- Case-insensitive email lookup
+CREATE INDEX idx_users_email_lower ON users(lower(email));
+-- Query must use the same expression:
+SELECT * FROM users WHERE lower(email) = lower('User@Example.com');
+
+-- Date extraction (find all orders on a given day)
+-- Works only if created_at is timestamp: date_trunc('day', timestamptz) is STABLE.
+-- For timestamptz, pin the zone: date_trunc('day', created_at AT TIME ZONE 'UTC')
+CREATE INDEX idx_orders_date ON orders(date_trunc('day', created_at));
+SELECT * FROM orders WHERE date_trunc('day', created_at) = '2024-03-01';
+
+-- JSONB field extraction (use when you query a specific key frequently)
+CREATE INDEX idx_users_plan ON users((data ->> 'subscription_plan'));
+SELECT * FROM users WHERE data ->> 'subscription_plan' = 'enterprise';
+
+-- Numeric cast from JSONB text field
+CREATE INDEX idx_orders_amount ON orders(((data ->> 'amount')::numeric));
+SELECT * FROM orders WHERE (data ->> 'amount')::numeric > 1000;
+
+-- Partial expression index: only index non-null computed values
+CREATE INDEX idx_products_lower_name ON products(lower(name))
+WHERE name IS NOT NULL;
+```
+
+### Immutability Requirement
+
+Functions used in expression indexes must be declared `IMMUTABLE`. PostgreSQL will reject `STABLE` or `VOLATILE` functions.
+
+```sql
+-- This fails: now() is STABLE, not IMMUTABLE
+CREATE INDEX bad ON events(date_trunc('day', now()));  -- ERROR
+
+-- Custom function must be explicitly IMMUTABLE
+CREATE FUNCTION clean_phone(text) RETURNS text
+LANGUAGE sql IMMUTABLE STRICT AS $$
+    SELECT regexp_replace($1, '[^0-9]', '', 'g')
+$$;
+
+CREATE INDEX idx_contacts_phone ON contacts(clean_phone(phone_raw));
+```
+
+---
+
+## Covering Indexes (INCLUDE)
+
+The `INCLUDE` clause adds non-key columns to the index leaf pages. These columns are not searchable but allow index-only scans, avoiding heap fetches entirely.
+
+```sql
+-- Without INCLUDE: planner must fetch heap to get email
+CREATE INDEX idx_users_name ON users(name);
+
+-- With INCLUDE: index-only scan possible
+CREATE INDEX idx_users_name_covering ON users(name) INCLUDE (email, status);
+SELECT email, status FROM users WHERE name = 'Alice';  -- no heap access
+```
+
+### When to Use INCLUDE vs Composite
+
+| Scenario | Use |
+|----------|-----|
+| Column needed in SELECT but not WHERE/ORDER BY | `INCLUDE` |
+| Column used in WHERE or ORDER BY | Add as key column |
+| Column has high write churn | Either way, writes update the index (INCLUDE does not avoid churn) |
+| Need to cover a few extra cheap columns | `INCLUDE` |
+| Covering a large text column | Avoid; inflates index; use composite carefully |
+
+```sql
+-- Index-only scan verification in EXPLAIN output
+EXPLAIN (ANALYZE, BUFFERS)
+SELECT name, email FROM users WHERE name LIKE 'A%';
+-- Look for "Index Only Scan" and "Heap Fetches: 0" (or low count if visibility map not up to date)
+
+-- Force visibility map update to enable index-only scans
+VACUUM users;
+```
+
+---
+
+## GIN Specifics
+
+### Operator Classes
+
+```sql
+-- Default operator class: supports @>, <@, ?, ?|, ?& on jsonb
+-- Indexes all key-value pairs; larger index; supports more operators
+CREATE INDEX idx_data_gin ON records USING gin(data);
+
+-- jsonb_path_ops: supports ONLY @> (containment)
+-- Indexes only values (not keys); ~30% smaller; faster for containment queries
+CREATE INDEX idx_data_gin_path ON records USING gin(data jsonb_path_ops);
+
+-- Choose jsonb_path_ops when:
+-- - You only query with @> (containment)
+-- - Index size is a concern
+-- - Write throughput needs improvement
+
+-- Choose default when:
+-- - You use ?, ?|, ?& (key existence checks)
+-- - You need to query nested structures with multiple operators
+```
+
+### Trigram Search (pg_trgm)
+
+```sql
+CREATE EXTENSION pg_trgm;
+
+-- GIN trigram index for LIKE, ILIKE, and regex
+CREATE INDEX idx_products_name_trgm ON products USING gin(name gin_trgm_ops);
+
+-- Now these use the index (unlike standard B-tree):
+SELECT * FROM products WHERE name ILIKE '%widget%';
+SELECT * FROM products WHERE name ~ 'wid.*et';
+
+-- Similarity search
+SELECT name, similarity(name, 'wiget') AS sim
+FROM products
+WHERE name % 'wiget'   -- % operator: similarity > threshold (default 0.3)
+ORDER BY sim DESC;
+
+-- GiST alternative (smaller index, slightly slower queries)
+CREATE INDEX idx_products_name_trgm_gist ON products USING gist(name gist_trgm_ops);
+```
+
+### Array Operators with GIN
+
+```sql
+CREATE INDEX idx_articles_tags ON articles USING gin(tags);
+
+-- Supported operators with this index:
+SELECT * FROM articles WHERE tags @> ARRAY['postgresql'];   -- contains
+SELECT * FROM articles WHERE tags <@ ARRAY['a','b','c'];   -- is contained by
+SELECT * FROM articles WHERE tags && ARRAY['postgresql'];  -- overlap
+SELECT * FROM articles WHERE 'postgresql' = ANY(tags);    -- same result as @>, but does NOT use the GIN index
+```
+
+### Full-Text Search
+
+```sql
+-- Index a computed tsvector
+CREATE INDEX idx_posts_fts ON posts USING gin(to_tsvector('english', title || ' ' || body));
+
+-- Or index a stored tsvector column (faster updates, more storage)
+ALTER TABLE posts ADD COLUMN fts tsvector
+    GENERATED ALWAYS AS (
+        setweight(to_tsvector('english', coalesce(title, '')), 'A') ||
+        setweight(to_tsvector('english', coalesce(body, '')), 'B')
+    ) STORED;
+
+CREATE INDEX idx_posts_fts ON posts USING gin(fts);
+
+-- Query
+SELECT title, ts_rank(fts, query) AS rank
+FROM posts, to_tsquery('english', 'postgresql & index') query
+WHERE fts @@ query
+ORDER BY rank DESC;
+```
+
+### GIN Tuning
+
+```sql
+-- GIN defers writes to a fast-update pending list (fastupdate, on by default).
+-- A larger pending list means less write overhead, but reads must scan it too.
+-- gin_pending_list_limit (default 4MB) caps its size, settable per index in kB:
+ALTER INDEX idx_posts_fts SET (fastupdate = on);
+ALTER INDEX idx_posts_fts SET (gin_pending_list_limit = 8192);
+
+-- Force pending list flush (useful before a read-heavy period)
+SELECT gin_clean_pending_list('idx_posts_fts');
+```
+
+---
+
+## GiST Specifics
+
+### Range Type Indexing
+
+```sql
+CREATE EXTENSION btree_gist;  -- required for scalar types in EXCLUDE
+
+CREATE TABLE schedules (
+    id       serial PRIMARY KEY,
+    staff_id integer,
+    shift    tsrange
+);
+
+CREATE INDEX idx_schedules_shift ON schedules USING gist(shift);
+
+-- Supported operators:
+-- && overlap, @> contains, <@ is contained by, = equal, << strictly left, >> strictly right
+SELECT * FROM schedules WHERE shift && '[2024-03-01 08:00, 2024-03-01 16:00)';
+SELECT * FROM schedules WHERE shift @> '2024-03-01 10:00'::timestamp;
+
+-- Exclusion constraint: no staff member double-booked
+ALTER TABLE schedules ADD CONSTRAINT no_double_shift
+    EXCLUDE USING gist (staff_id WITH =, shift WITH &&);
+```
+
+### PostGIS with GiST
+
+```sql
+-- Bounding-box spatial index (default, fast)
+CREATE INDEX idx_locations_geom ON locations USING gist(geom);
+
+-- KNN search: find 5 nearest stores to a point (geometry on both sides so the
+-- GiST index can serve the ORDER BY; distance is in SRID units)
+SELECT name, geom <-> ST_SetSRID(ST_MakePoint(-87.6298, 41.8781), 4326) AS distance
+FROM stores
+ORDER BY distance
+LIMIT 5;
+-- Cast both sides to geography for meters (needs an index on geom::geography)
+
+-- Bounding-box overlap (fast, approximate)
+SELECT * FROM polygons WHERE geom && ST_MakeEnvelope(-88, 41, -87, 42, 4326);
+
+-- Exact intersection (uses index for bbox pre-filter, then rechecks)
+SELECT * FROM polygons WHERE ST_Intersects(geom, ST_MakeEnvelope(-88, 41, -87, 42, 4326));
+```
+
+### GiST vs GIN Trade-offs
+
+| Property | GiST | GIN |
+|----------|------|-----|
+| Build time | Faster | Slower |
+| Index size | Larger | Smaller (for same data) |
+| Query speed | Slightly slower (lossy, recheck) | Faster for exact lookups |
+| Concurrent writes | Better | GIN pending list helps |
+| Use for exclusion constraints | Yes | No |
+
+---
+
+## BRIN Specifics
+
+### How BRIN Works
+
+BRIN stores the minimum and maximum values for each block range (group of consecutive pages). Effective when the physical storage order of rows correlates with the query predicate.
+
+```sql
+-- Ideal: append-only log table; rows inserted in timestamp order
+CREATE TABLE application_logs (
+    id          bigserial,
+    recorded_at timestamptz NOT NULL DEFAULT now(),
+    level       text,
+    message     text
+);
+
+-- BRIN is tiny: 1 page per 128 pages of heap (default)
+CREATE INDEX idx_logs_recorded_brin ON application_logs USING brin(recorded_at);
+
+-- Dramatically smaller than B-tree for the same column:
+SELECT
+    pg_size_pretty(pg_relation_size('idx_logs_recorded_btree')) AS btree_size,
+    pg_size_pretty(pg_relation_size('idx_logs_recorded_brin'))  AS brin_size;
+-- Typical ratio: 1000:1 in favor of BRIN for correlated data
+```
+
+### Tuning pages_per_range
+
+```sql
+-- Default pages_per_range = 128 (coarse, very small index)
+-- Smaller value = more precise (fewer false positives), larger index
+-- Larger value = less precise, smaller index
+
+-- For high-precision time ranges on a large table
+CREATE INDEX idx_logs_brin_precise ON application_logs USING brin(recorded_at)
+WITH (pages_per_range = 16);
+
+-- Query still requires a sequential scan of matching block ranges
+-- followed by heap fetch and recheck; BRIN shines when most blocks are skipped
+```
+
+### Ideal BRIN Workloads
+
+- Time-series and IoT data inserted in timestamp order
+- Append-only audit tables
+- Log tables where records are never updated out of order
+- Data warehouse fact tables loaded in date sequence
+
+### When BRIN Is NOT Appropriate
+
+- Tables with random INSERT patterns (poor correlation)
+- Frequently updated rows that change index key values
+- Small tables (B-tree overhead is trivial; BRIN gains are minimal)
+- When precise, low-latency lookups are required (BRIN may still scan many pages)
+
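+Before choosing BRIN, check the planner's correlation statistic for the column in `pg_stats` — values near 1.0 (or -1.0) mean physical row order tracks the value, which is exactly the case BRIN needs:
+
+```sql
+-- correlation ranges from -1 to 1; near ±1.0 = good BRIN candidate
+SELECT attname, correlation
+FROM pg_stats
+WHERE tablename = 'application_logs'
+  AND attname = 'recorded_at';
+```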
+---
+
+## Index Maintenance
+
+### Finding Unused Indexes
+
+```sql
+-- Indexes with zero or low scans since last statistics reset
+SELECT
+    schemaname,
+    relname,
+    indexrelname,
+    pg_size_pretty(pg_relation_size(indexrelid)) AS index_size,
+    idx_scan,
+    idx_tup_read,
+    idx_tup_fetch
+FROM pg_stat_user_indexes
+WHERE idx_scan = 0
+  AND pg_relation_size(indexrelid) > 1024 * 1024  -- larger than 1MB
+ORDER BY pg_relation_size(indexrelid) DESC;
+
+-- When were statistics last reset?
+SELECT stats_reset FROM pg_stat_database WHERE datname = current_database();
+```
+
+### Detecting Index Bloat
+
+```sql
+-- Approximate bloat using pgstattuple extension
+CREATE EXTENSION pgstattuple;
+
+SELECT * FROM pgstatindex('idx_orders_customer_id');
+-- Look at: avg_leaf_density (below ~70% means bloat)
+
+-- Or approximate bloat across all B-tree indexes (pgstatindex is B-tree only)
+SELECT
+    s.schemaname,
+    s.relname,
+    s.indexrelname,
+    pg_size_pretty(pg_relation_size(s.indexrelid)) AS index_size,
+    round((100 * (1 - p.avg_leaf_density / 90.0))::numeric, 1) AS bloat_pct
+FROM pg_stat_user_indexes s
+JOIN pg_index i ON i.indexrelid = s.indexrelid
+JOIN pg_class c ON c.oid = s.indexrelid
+JOIN pg_am am ON am.oid = c.relam AND am.amname = 'btree'
+CROSS JOIN LATERAL pgstatindex(s.indexrelid::regclass) p
+WHERE NOT i.indisprimary
+ORDER BY bloat_pct DESC;
+```
+
+### Rebuilding Indexes
+
+```sql
+-- Rebuild without locking reads/writes (PG12+)
+REINDEX INDEX CONCURRENTLY idx_orders_customer_id;
+
+-- Rebuild all indexes on a table concurrently
+REINDEX TABLE CONCURRENTLY orders;
+
+-- Classic REINDEX (takes ShareLock, blocks writes):
+REINDEX INDEX idx_orders_customer_id;
+
+-- Rebuild as new index, then swap (manual CONCURRENTLY approach, pre-PG12)
+CREATE INDEX CONCURRENTLY idx_orders_customer_id_new ON orders(customer_id);
+DROP INDEX idx_orders_customer_id;
+ALTER INDEX idx_orders_customer_id_new RENAME TO idx_orders_customer_id;
+```
+
+### Monitoring Index Size and Growth
+
+```sql
+-- All index sizes for a table, sorted descending
+SELECT
+    s.indexrelname AS indexname,
+    pg_size_pretty(pg_relation_size(s.indexrelid)) AS size,
+    i.indexdef
+FROM pg_stat_user_indexes s
+JOIN pg_indexes i
+  ON i.schemaname = s.schemaname
+ AND i.tablename  = s.relname
+ AND i.indexname  = s.indexrelname
+WHERE s.relname = 'orders'
+ORDER BY pg_relation_size(s.indexrelid) DESC;
+
+-- Total index overhead vs table size
+SELECT
+    relname AS table_name,
+    pg_size_pretty(pg_relation_size(oid)) AS table_size,
+    pg_size_pretty(pg_indexes_size(oid)) AS indexes_size,
+    round(100.0 * pg_indexes_size(oid) / nullif(pg_relation_size(oid), 0), 1) AS index_ratio_pct
+FROM pg_class
+WHERE relkind = 'r'
+  AND relnamespace = 'public'::regnamespace
+ORDER BY pg_indexes_size(oid) DESC;
+```
+
+### Monitoring Index Usage in Queries
+
+```sql
+-- Enable pg_stat_statements for query-level stats
+CREATE EXTENSION pg_stat_statements;
+
+-- Find slow queries that do sequential scans on large tables
+SELECT
+    query,
+    calls,
+    round(mean_exec_time::numeric, 2) AS avg_ms,
+    round(total_exec_time::numeric, 2) AS total_ms
+FROM pg_stat_statements
+ORDER BY mean_exec_time DESC
+LIMIT 20;
+
+-- Check for sequential scans on a specific table
+SELECT
+    seq_scan,
+    seq_tup_read,
+    idx_scan,
+    idx_tup_fetch,
+    n_live_tup
+FROM pg_stat_user_tables
+WHERE relname = 'orders';
+
+-- High seq_scan with high n_live_tup = missing index candidate
+```
+
+---
+
+## Anti-Patterns
+
+### Over-Indexing
+
+Every index adds overhead to INSERT, UPDATE, and DELETE operations. Index only columns that appear in WHERE, JOIN, or ORDER BY clauses of frequent or critical queries.
+
+```sql
+-- Bad: indexing every column "just in case"
+CREATE INDEX ON orders(id);           -- already the PK
+CREATE INDEX ON orders(created_at);   -- only used in one monthly report
+CREATE INDEX ON orders(notes);        -- free-text, rarely filtered
+CREATE INDEX ON orders(updated_at);   -- only used in batch maintenance jobs
+
+-- Measure write amplification
+SELECT
+    relname,
+    n_tup_ins,
+    n_tup_upd,
+    n_tup_del,
+    (SELECT count(*) FROM pg_indexes WHERE tablename = relname) AS index_count
+FROM pg_stat_user_tables
+WHERE relname = 'orders';
+```
+
+### Wrong Index Type Selection
+
+```sql
+-- Bad: B-tree on a column used only with @> (JSONB containment)
+CREATE INDEX idx_bad ON products USING btree(attributes);
+-- attributes @> '{"color": "red"}' will NOT use this index
+
+-- Good: GIN for containment queries
+CREATE INDEX idx_good ON products USING gin(attributes jsonb_path_ops);
+
+-- Bad: GIN on a column used only for equality
+CREATE INDEX idx_bad2 ON sessions USING gin(token);
+-- token is text, not multi-valued; GIN has no benefit here
+
+-- Good: B-tree or Hash for equality on scalar
+CREATE INDEX idx_good2 ON sessions USING hash(token);
+```
+
+### Indexing Low-Cardinality Columns Without Partial
+
+```sql
+-- Bad: B-tree index on a boolean column (only 2 distinct values)
+-- Planner will likely choose a seq scan anyway for common value
+CREATE INDEX idx_orders_is_paid ON orders(is_paid);
+
+-- Bad: B-tree on status with 3-4 values and one dominant
+CREATE INDEX idx_orders_status ON orders(status);
+-- If 95% of rows have status = 'completed', this index is useless for that value
+
+-- Good: Partial index targeting the rare, actionable value
+CREATE INDEX idx_orders_unpaid ON orders(created_at) WHERE is_paid = false;
+CREATE INDEX idx_orders_pending ON orders(created_at) WHERE status = 'pending';
+```
+
+### Redundant Indexes
+
+```sql
+-- Bad: (a) is made redundant by (a, b) for queries filtering on a alone
+CREATE INDEX idx_a   ON t(a);
+CREATE INDEX idx_a_b ON t(a, b);
+
+-- Check for prefix-redundant indexes
+-- (heuristic: compares column lists only, ignores opclasses and partial predicates)
+SELECT
+    i1.indexrelid::regclass AS redundant,
+    i2.indexrelid::regclass AS superseded_by
+FROM pg_index i1
+JOIN pg_index i2
+    ON i1.indrelid = i2.indrelid
+   AND i1.indexrelid <> i2.indexrelid
+   AND (i2.indkey::text = i1.indkey::text
+        OR i2.indkey::text LIKE i1.indkey::text || ' %')
+WHERE i1.indrelid = 'orders'::regclass
+  AND NOT i1.indisunique;  -- never auto-drop unique/PK indexes
+```
+
+### Missing Indexes on Foreign Keys
+
+Unindexed foreign keys cause sequential scans during CASCADE deletes and parent-table updates.
+
+```sql
+-- Find foreign key columns without an index
+SELECT
+    tc.table_name,
+    kcu.column_name,
+    ccu.table_name AS referenced_table
+FROM information_schema.table_constraints AS tc
+JOIN information_schema.key_column_usage AS kcu
+    ON tc.constraint_name = kcu.constraint_name
+JOIN information_schema.constraint_column_usage AS ccu
+    ON ccu.constraint_name = tc.constraint_name
+WHERE tc.constraint_type = 'FOREIGN KEY'
+  AND NOT EXISTS (
+      SELECT 1 FROM pg_index pi
+      JOIN pg_attribute pa ON pa.attrelid = pi.indrelid
+          AND pa.attnum = pi.indkey[0]   -- FK column must be the leading index column
+      WHERE pi.indrelid = (tc.table_name)::regclass
+        AND pa.attname = kcu.column_name
+  );
+```
+
+### Forgetting to Run ANALYZE After Bulk Load
+
+```sql
+-- After COPY or bulk INSERT, statistics are stale; planner makes bad choices
+COPY orders FROM '/tmp/orders.csv' CSV HEADER;
+ANALYZE orders;  -- always run this after bulk loads
+
+-- Or with trigger-based FK enforcement disabled during load (superuser only;
+-- this skips FK validation entirely, so the input must already be consistent):
+SET session_replication_role = replica;
+COPY orders FROM '/tmp/orders.csv' CSV HEADER;
+SET session_replication_role = DEFAULT;
+ANALYZE orders;
+```

+ 714 - 0
skills/postgres-ops/references/operations.md

@@ -0,0 +1,714 @@
+# PostgreSQL Operations Reference
+
+## Table of Contents
+
+1. [Backup Strategies](#backup-strategies)
+2. [Vacuum Deep Dive](#vacuum-deep-dive)
+3. [Monitoring](#monitoring)
+4. [Connection Pooling](#connection-pooling)
+
+---
+
+## Backup Strategies
+
+### pg_dump
+
+Logical backup of a single database. Consistent snapshot via a single transaction.
+Does not back up roles, tablespaces, or server-level configuration.
+
+```bash
+# Custom format (-Fc): compressed, parallel-restorable, most versatile
+pg_dump -h localhost -U postgres -d mydb -Fc -f mydb.dump
+
+# Plain SQL format (-Fp): human-readable, pipe-friendly, not parallel-restorable
+pg_dump -h localhost -U postgres -d mydb -Fp -f mydb.sql
+
+# Directory format (-Fd): one file per table, supports parallel dump and restore
+pg_dump -h localhost -U postgres -d mydb -Fd -f mydb_dir/
+
+# Parallel dump (directory format required, -j = number of workers)
+pg_dump -h localhost -U postgres -d mydb -Fd -j 4 -f mydb_dir/
+
+# Compressed with explicit compression level (PG16+ supports --compress=lz4)
+pg_dump -h localhost -U postgres -d mydb -Fc --compress=9 -f mydb.dump
+
+# Dump only specific tables
+pg_dump -h localhost -U postgres -d mydb -Fc -t orders -t customers -f subset.dump
+
+# Dump only schema (no data)
+pg_dump -h localhost -U postgres -d mydb -Fc --schema-only -f schema.dump
+
+# Dump only data (no DDL)
+pg_dump -h localhost -U postgres -d mydb -Fc --data-only -f data.dump
+
+# Exclude specific tables (e.g., large log tables)
+pg_dump -h localhost -U postgres -d mydb -Fc -T audit_logs -T event_stream -f mydb.dump
+```
+
+#### pg_restore
+
+```bash
+# Restore custom/directory format
+pg_restore -h localhost -U postgres -d mydb_restore -Fc mydb.dump
+
+# Parallel restore (directory format)
+pg_restore -h localhost -U postgres -d mydb_restore -Fd -j 4 mydb_dir/
+
+# Restore single table from full dump
+pg_restore -h localhost -U postgres -d mydb -t orders mydb.dump
+
+# Restore schema only, then data (useful for pre-creating indexes)
+pg_restore -h localhost -U postgres -d mydb --schema-only mydb.dump
+pg_restore -h localhost -U postgres -d mydb --data-only mydb.dump
+
+# --no-owner / --no-privileges: skip ownership and ACL statements
+pg_restore -h localhost -U postgres -d mydb --no-owner --no-privileges mydb.dump
+```
+
+### pg_dumpall
+
+Backs up all databases plus server-level objects (roles, tablespaces).
+Output is always plain SQL (no custom/directory format support).
+
+```bash
+# Full cluster backup
+pg_dumpall -h localhost -U postgres -f cluster_backup.sql
+
+# Globals only (roles and tablespaces, no database data)
+pg_dumpall -h localhost -U postgres --globals-only -f globals.sql
+
+# Restore
+psql -h localhost -U postgres -f cluster_backup.sql
+```
+
+### pg_basebackup
+
+Physical backup of the entire cluster. Required for PITR and streaming replication setup.
+Much faster than pg_dump for large databases since it copies raw files.
+
+```bash
+# Basic base backup (plain format, WAL streamed during backup)
+pg_basebackup -h localhost -U replicator -D /backup/base -P
+
+# Include WAL files in backup (-Xs = stream WAL during backup)
+pg_basebackup -h localhost -U replicator -D /backup/base -Xs -P
+
+# Tar format with gzip compression (one .tar.gz per tablespace)
+pg_basebackup -h localhost -U replicator -D /backup/base -Ft -z -P
+
+# Tar format with LZ4 (PG15+, faster than gzip)
+pg_basebackup -h localhost -U replicator -D /backup/base -Ft --compress=lz4 -P
+
+# Checkpoint mode: fast = force immediate checkpoint, spread = rate-limited I/O
+pg_basebackup -h localhost -U replicator -D /backup/base -Xs --checkpoint=fast -P
+
+# Required postgresql.conf settings for pg_basebackup:
+# wal_level = replica          (minimum)
+# max_wal_senders = 3          (at least 1 available sender)
+# archive_mode = on            (for PITR)
+```
+
+### PITR: Point-in-Time Recovery
+
+PITR combines a base backup with WAL archive segments to restore to any point in time.
+
+#### WAL Archiving Setup
+
+```bash
+# postgresql.conf settings
+wal_level = replica
+archive_mode = on
+archive_command = 'test ! -f /wal_archive/%f && cp %p /wal_archive/%f'
+# %p = full path to WAL file, %f = filename only
+
+# With AWS S3 (using WAL-E or pgBackRest in production)
+archive_command = 'aws s3 cp %p s3://my-bucket/wal-archive/%f'
+
+# Verify archive is working
+SELECT pg_switch_wal();  -- force WAL segment switch to test archive_command
+-- Check /wal_archive for new .wal files
+```
+
+#### Recovery Configuration
+
+Create `recovery.signal` file in PGDATA to trigger recovery mode (PG12+).
+Recovery parameters go in `postgresql.conf` (PG12+) or `recovery.conf` (pre-PG12).
+
+```bash
+# postgresql.conf additions for recovery
+restore_command = 'cp /wal_archive/%f %p'
+# or from S3:
+restore_command = 'aws s3 cp s3://my-bucket/wal-archive/%f %p'
+
+# Recovery target options (pick one):
+recovery_target_time = '2024-03-15 14:30:00'        # time-based
+recovery_target_xid = '1234567'                       # transaction ID
+recovery_target_lsn = '0/15D5A50'                    # LSN
+recovery_target_name = 'before_migration'             # named restore point
+recovery_target = 'immediate'                         # as soon as consistent
+
+# After reaching target:
+recovery_target_action = 'promote'   # promote to primary (default)
+recovery_target_action = 'pause'     # pause, inspect, then pg_wal_replay_resume()
+recovery_target_action = 'shutdown'  # stop after recovery
+
+# Named restore points (create before risky operations)
+SELECT pg_create_restore_point('before_bulk_delete');
+```
+
+```bash
+# Full PITR procedure:
+# 1. Stop PostgreSQL
+# 2. Move PGDATA aside: mv /var/lib/postgresql/14/main /var/lib/postgresql/14/main.bak
+# 3. Restore base backup: pg_basebackup ... or extract tar
+# 4. Add recovery settings to postgresql.conf
+# 5. Touch recovery.signal: touch $PGDATA/recovery.signal
+# 6. Start PostgreSQL -- it will replay WAL until target, then promote
+```
+
+### Backup Verification
+
+```bash
+# pg_verifybackup (PG13+): verify base backup integrity
+pg_verifybackup /backup/base
+
+# Skip the (slow) data file checksum verification
+pg_verifybackup --skip-checksums /backup/base
+
+# Test restore (do this regularly in staging)
+pg_restore --list mydb.dump | head -20   # check contents without restoring
+
+# Verify dump readability
+pg_restore -l mydb.dump > /dev/null && echo "Dump is readable"
+```
+
+---
+
+## Vacuum Deep Dive
+
+### Regular VACUUM vs VACUUM FULL vs pg_repack
+
+**Regular VACUUM**: marks dead tuples as reusable space. Does not shrink table on disk.
+Non-blocking (shares table with readers and writers). Run this routinely.
+
+**VACUUM FULL**: rewrites the entire table (and its indexes) to new files, returning
+reclaimed space to the OS. Requires an exclusive lock that blocks all access.
+Rarely needed if autovacuum is tuned correctly.
+
+**pg_repack**: rewrites table without long exclusive lock (builds new table in background,
+swaps at end with brief lock). Preferred over VACUUM FULL for large production tables.
+
+```sql
+-- Regular VACUUM (non-blocking)
+VACUUM orders;
+
+-- VACUUM with ANALYZE (update statistics too)
+VACUUM ANALYZE orders;
+
+-- VERBOSE output to understand what was cleaned
+VACUUM VERBOSE orders;
+
+-- VACUUM FULL (requires AccessExclusiveLock -- schedule maintenance window)
+VACUUM FULL orders;
+
+-- Find tables with the most dead tuples (vacuum candidates)
+SELECT relname, n_dead_tup, n_live_tup,
+       round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 2) AS dead_pct,
+       last_vacuum, last_autovacuum
+FROM pg_stat_user_tables
+ORDER BY n_dead_tup DESC;
+```
+
+```bash
+# pg_repack (must install extension)
+pg_repack -h localhost -U postgres -d mydb -t orders
+
+# Repack entire database
+pg_repack -h localhost -U postgres -d mydb
+
+# Repack only indexes (faster, lower risk)
+pg_repack -h localhost -U postgres -d mydb -t orders --only-indexes
+```
+
+### Autovacuum Tuning
+
+Autovacuum triggers when: `n_dead_tup > autovacuum_vacuum_threshold + autovacuum_vacuum_scale_factor * n_live_tup`
+
+```bash
+# postgresql.conf global settings
+autovacuum = on                              # never disable
+autovacuum_max_workers = 5                   # default 3; increase for many tables
+autovacuum_vacuum_threshold = 50            # min dead tuples before trigger
+autovacuum_vacuum_scale_factor = 0.02       # 2% of table (default 0.2 = 20%)
+autovacuum_analyze_threshold = 50
+autovacuum_analyze_scale_factor = 0.01      # 1% (default 0.1 = 10%)
+autovacuum_vacuum_cost_delay = 2ms          # throttle I/O (default 2ms in PG13+)
+autovacuum_vacuum_cost_limit = 200          # I/O budget per delay cycle
+autovacuum_naptime = 30s                    # min delay between runs on any one database (default 1min)
+```
+
+```sql
+-- Per-table autovacuum override (large tables need lower scale factor)
+-- For a 100M row table, 20% = 20M dead tuples before vacuum -- too late
+ALTER TABLE orders SET (
+    autovacuum_vacuum_scale_factor = 0.01,   -- 1% instead of 20%
+    autovacuum_vacuum_threshold = 1000,
+    autovacuum_analyze_scale_factor = 0.005,
+    autovacuum_vacuum_cost_delay = 10        -- ms; slow down to reduce I/O impact
+);
+
+-- High-churn tables (logs, queues): more aggressive
+ALTER TABLE job_queue SET (
+    autovacuum_vacuum_scale_factor = 0.001,
+    autovacuum_vacuum_threshold = 100
+);
+```
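+
+The trigger condition above is easy to sanity-check numerically; a minimal sketch (values are illustrative):
+
+```python
+# Sketch of the autovacuum trigger condition:
+# fires when n_dead_tup > threshold + scale_factor * n_live_tup
+def autovacuum_triggers(n_dead_tup, n_live_tup, threshold=50, scale_factor=0.2):
+    return n_dead_tup > threshold + scale_factor * n_live_tup
+
+# 2M dead tuples on a 100M-row table:
+print(autovacuum_triggers(2_000_000, 100_000_000))                      # False at the 20% default
+print(autovacuum_triggers(2_000_000, 100_000_000, scale_factor=0.01))   # True once tuned to 1%
+```
+
+This is exactly why large tables need a per-table scale factor override: at defaults, 20M rows must die before autovacuum touches a 100M-row table.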
+
+### Transaction ID Wraparound
+
+PostgreSQL uses 32-bit transaction IDs (XIDs). After ~2 billion transactions, XIDs wrap around.
+When a table's age reaches `autovacuum_freeze_max_age` (default 200M), an aggressive
+anti-wraparound autovacuum is forced; if age ever nears the ~2 billion limit, the server
+stops accepting writes until a freeze completes.
+
+```sql
+-- Monitor XID age across all databases (run as superuser)
+SELECT datname,
+       age(datfrozenxid) AS xid_age,
+       2000000000 - age(datfrozenxid) AS remaining_xids,
+       round(100.0 * age(datfrozenxid) / 2000000000, 2) AS pct_used
+FROM pg_database
+ORDER BY age(datfrozenxid) DESC;
+
+-- Monitor per-table XID age (find tables that need freezing)
+SELECT relname,
+       age(relfrozenxid) AS xid_age,
+       pg_size_pretty(pg_total_relation_size(oid)) AS size
+FROM pg_class
+WHERE relkind = 'r'
+ORDER BY age(relfrozenxid) DESC
+LIMIT 20;
+
+-- Emergency response when approaching wraparound:
+-- 1. Check if autovacuum is running: SELECT * FROM pg_stat_activity WHERE query LIKE 'autovacuum%';
+-- 2. Manual aggressive freeze:
+VACUUM FREEZE orders;        -- force freeze all tuples in table
+-- 3. For cluster-wide freeze:
+-- vacuumdb -a -F -j 4       -- freeze all databases, 4 parallel workers
+
+-- Relevant postgresql.conf settings:
+-- vacuum_freeze_min_age = 50000000        -- freeze tuples older than this (50M XIDs)
+-- vacuum_freeze_table_age = 150000000     -- force aggressive (whole-table) vacuum at this age
+-- autovacuum_freeze_max_age = 200000000   -- emergency autovacuum triggered here
+```
+
+### Dead Tuple Accumulation
+
+Long-running transactions and `idle in transaction` sessions prevent VACUUM from removing dead tuples
+because those old snapshots may still need to see pre-update versions.
+
+```sql
+-- Find sessions holding old snapshots (preventing dead tuple cleanup)
+SELECT pid, usename, application_name, state,
+       now() - xact_start AS xact_age,
+       now() - query_start AS query_age,
+       left(query, 80) AS current_query
+FROM pg_stat_activity
+WHERE state != 'idle'
+  AND xact_start < now() - interval '5 minutes'
+ORDER BY xact_start;
+
+-- Find the oldest active transaction (this limits vacuum)
+SELECT min(xact_start), max(now() - xact_start) AS max_age
+FROM pg_stat_activity
+WHERE xact_start IS NOT NULL;
+
+-- Check if replication slots are holding back WAL (another source of bloat)
+SELECT slot_name, active, pg_size_pretty(
+    pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)
+) AS lag
+FROM pg_replication_slots;
+
+-- Kill long-running idle-in-transaction sessions (use carefully)
+SELECT pg_terminate_backend(pid)
+FROM pg_stat_activity
+WHERE state = 'idle in transaction'
+  AND now() - xact_start > interval '1 hour';
+
+-- Prevent accumulation: set statement/transaction timeouts
+-- postgresql.conf or ALTER ROLE:
+-- idle_in_transaction_session_timeout = '10min'
+-- statement_timeout = '30s'
+```
+
+---
+
+## Monitoring
+
+### pg_stat_activity
+
+```sql
+-- Connection overview by state
+SELECT state, count(*), max(now() - state_change) AS max_time_in_state
+FROM pg_stat_activity
+GROUP BY state
+ORDER BY count DESC;
+
+-- Long-running queries (over 30 seconds)
+SELECT pid, usename, application_name, client_addr, state,
+       now() - query_start AS duration,
+       wait_event_type, wait_event,
+       left(query, 120) AS query
+FROM pg_stat_activity
+WHERE state != 'idle'
+  AND query_start < now() - interval '30 seconds'
+ORDER BY query_start;
+
+-- Wait events (what are connections waiting for)
+SELECT wait_event_type, wait_event, count(*)
+FROM pg_stat_activity
+WHERE state != 'idle'
+GROUP BY wait_event_type, wait_event
+ORDER BY count DESC;
+-- Common wait events:
+-- Lock/relation = waiting for table lock
+-- Client/ClientRead = waiting for client to send data
+-- IO/DataFileRead = reading from disk
+-- IPC/BgWorkerShutdown = parallel query coordination
+```
+
+### pg_stat_user_tables
+
+```sql
+-- Tables with high sequential scan rates (missing indexes?)
+SELECT relname,
+       seq_scan,
+       idx_scan,
+       round(100.0 * idx_scan / nullif(seq_scan + idx_scan, 0), 2) AS idx_pct,
+       n_live_tup,
+       n_dead_tup,
+       last_vacuum::date,
+       last_autovacuum::date,
+       last_analyze::date
+FROM pg_stat_user_tables
+WHERE seq_scan > 100
+ORDER BY seq_scan DESC;
+
+-- Tables most in need of VACUUM
+SELECT relname,
+       n_dead_tup,
+       n_live_tup,
+       round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 2) AS dead_pct,
+       last_autovacuum,
+       last_vacuum
+FROM pg_stat_user_tables
+ORDER BY n_dead_tup DESC
+LIMIT 20;
+```
+
+### Unused Index Detection
+
+```sql
+-- Indexes that are never used (candidates for removal)
+SELECT schemaname, relname AS table, indexrelname AS index,
+       pg_size_pretty(pg_relation_size(i.indexrelid)) AS index_size,
+       idx_scan AS scans
+FROM pg_stat_user_indexes ui
+JOIN pg_index i ON i.indexrelid = ui.indexrelid
+WHERE idx_scan = 0
+  AND NOT indisunique           -- keep unique constraints
+  AND NOT indisprimary          -- keep primary keys
+ORDER BY pg_relation_size(i.indexrelid) DESC;
+
+-- Indexes with low usage relative to writes (more overhead than benefit)
+SELECT relname AS table,
+       indexrelname AS index,
+       idx_scan AS reads,
+       pg_stat_get_tuples_inserted(relid) + pg_stat_get_tuples_updated(relid)
+           + pg_stat_get_tuples_deleted(relid) AS writes,
+       pg_size_pretty(pg_relation_size(indexrelid)) AS size
+FROM pg_stat_user_indexes
+WHERE idx_scan < 100
+ORDER BY pg_relation_size(indexrelid) DESC;
+
+-- Note: reset stats after index creation or major data loads
+-- SELECT pg_stat_reset();  -- resets ALL stats for this database
+```
+
+### pg_stat_bgwriter
+
+```sql
+-- Checkpoint health and buffer writer activity
+SELECT checkpoints_timed,
+       checkpoints_req,                          -- forced (bad: I/O spike risk)
+       round(100.0 * checkpoints_req /
+           nullif(checkpoints_timed + checkpoints_req, 0), 2) AS forced_pct,
+       buffers_checkpoint,                        -- written at checkpoint
+       buffers_clean,                             -- written by bgwriter
+       maxwritten_clean,                          -- bgwriter hit write limit (increase bgwriter_lru_maxpages)
+       buffers_backend,                           -- written by backend directly (bad)
+       buffers_backend_fsync,                     -- backend had to fsync (very bad)
+       buffers_alloc,                             -- new buffers allocated
+       stats_reset::date
+FROM pg_stat_bgwriter;
+
+-- High checkpoints_req: increase max_wal_size (forced checkpoints fire when WAL fills)
+-- High buffers_backend: shared_buffers too small or bgwriter too passive
+-- Ideal: checkpoints_req / total < 10%, buffers_backend / total < 5%
+-- Note: in PG17 the checkpoint columns move to the pg_stat_checkpointer view
+
+-- postgresql.conf tuning:
+-- checkpoint_completion_target = 0.9   -- spread checkpoint I/O over 90% of interval
+-- max_wal_size = 4GB                   -- larger = fewer forced checkpoints
+-- checkpoint_timeout = 10min           -- default
+```
+
+### Lock Contention and Deadlocks
+
+```sql
+-- Active locks and what is blocking what
+SELECT blocked.pid AS blocked_pid,
+       blocked.usename AS blocked_user,
+       blocking.pid AS blocking_pid,
+       blocking.usename AS blocking_user,
+       blocked_activity.wait_event,
+       blocked_activity.query AS blocked_query,
+       blocking_activity.query AS blocking_query
+FROM pg_locks blocked
+JOIN pg_stat_activity blocked_activity ON blocked_activity.pid = blocked.pid
+JOIN pg_locks blocking ON blocking.locktype = blocked.locktype
+    AND blocking.database IS NOT DISTINCT FROM blocked.database
+    AND blocking.relation IS NOT DISTINCT FROM blocked.relation
+    AND blocking.page IS NOT DISTINCT FROM blocked.page
+    AND blocking.tuple IS NOT DISTINCT FROM blocked.tuple
+    AND blocking.classid IS NOT DISTINCT FROM blocked.classid
+    AND blocking.objid IS NOT DISTINCT FROM blocked.objid
+    AND blocking.objsubid IS NOT DISTINCT FROM blocked.objsubid
+    AND blocking.pid != blocked.pid
+    AND blocking.granted
+JOIN pg_stat_activity blocking_activity ON blocking_activity.pid = blocking.pid
+WHERE NOT blocked.granted;
+
+-- Lock types by table (what mode of locks are held)
+SELECT relname, mode, count(*)
+FROM pg_locks l
+JOIN pg_class c ON c.oid = l.relation
+WHERE l.granted
+GROUP BY relname, mode
+ORDER BY relname, mode;
+
+-- Deadlock investigation: enable logging in postgresql.conf
+-- log_lock_waits = on          -- log waits over deadlock_timeout
+-- deadlock_timeout = 1s        -- time before deadlock check runs
+-- log_min_duration_statement = 5000  -- log queries taking over 5s
+```
+
+### Cache Hit Ratio
+
+```sql
+-- Database-level buffer cache hit ratio (target: > 99% for OLTP)
+SELECT datname,
+       blks_hit,
+       blks_read,
+       round(100.0 * blks_hit / nullif(blks_hit + blks_read, 0), 2) AS hit_ratio
+FROM pg_stat_database
+WHERE datname = current_database();
+
+-- Table-level cache hit ratio
+SELECT relname,
+       heap_blks_hit,
+       heap_blks_read,
+       round(100.0 * heap_blks_hit / nullif(heap_blks_hit + heap_blks_read, 0), 2) AS hit_ratio
+FROM pg_statio_user_tables
+ORDER BY heap_blks_read DESC;
+
+-- Index cache hit ratio
+SELECT relname, indexrelname,
+       idx_blks_hit,
+       idx_blks_read,
+       round(100.0 * idx_blks_hit / nullif(idx_blks_hit + idx_blks_read, 0), 2) AS hit_ratio
+FROM pg_statio_user_indexes
+ORDER BY idx_blks_read DESC;
+```
+
+### Table and Index Bloat Estimation
+
+```sql
+-- Table bloat estimate (uses pgstattuple extension if available, else heuristic)
+-- Heuristic approach (no extension required):
+SELECT
+    schemaname,
+    relname AS table,
+    pg_size_pretty(pg_total_relation_size(relid)) AS total_size,
+    pg_size_pretty(pg_relation_size(relid)) AS table_size,
+    round(100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0), 2) AS dead_tup_pct
+FROM pg_stat_user_tables
+ORDER BY pg_total_relation_size(relid) DESC;
+
+-- Using pgstattuple for precise bloat (requires extension, scans full table)
+CREATE EXTENSION IF NOT EXISTS pgstattuple;
+SELECT * FROM pgstattuple('orders');
+-- dead_tuple_percent > 20% warrants VACUUM or pg_repack
+
+-- Index bloat via pgstatindex
+SELECT * FROM pgstatindex('orders_pkey');
+-- avg_leaf_density < 70% suggests bloat; REINDEX or pg_repack --only-indexes
+
+-- Quick bloat estimate without extension (dead-tuple heuristic):
+SELECT
+    relname,
+    pg_size_pretty(real_size) AS real_size,
+    pg_size_pretty(bloat_size::bigint) AS bloat_size,
+    round(bloat_ratio::numeric, 2) AS bloat_ratio
+FROM (
+    SELECT relname,
+           pg_total_relation_size(relid) AS real_size,
+           pg_relation_size(relid) * (n_dead_tup::float / nullif(n_live_tup + n_dead_tup, 0)) AS bloat_size,
+           100.0 * n_dead_tup / nullif(n_live_tup + n_dead_tup, 0) AS bloat_ratio
+    FROM pg_stat_user_tables
+) t
+WHERE bloat_ratio > 10
+ORDER BY bloat_size DESC;
+```
+
+---
+
+## Connection Pooling
+
+### pgBouncer Modes
+
+**Session mode**: client holds server connection for entire session duration.
+Same behavior as direct connection. Use for: apps using session-level features
+(temp tables, prepared statements with protocol-level binding, SET LOCAL, advisory locks).
+
+**Transaction mode**: server connection returned to pool after each transaction.
+Much higher multiplexing. Use for: most web applications using short transactions.
+Limitations: session-level SET, LISTEN/NOTIFY, temp tables, advisory locks, and
+protocol-level prepared statements (pgBouncer 1.21+ can track these via max_prepared_statements).
+
+**Statement mode**: connection returned after each statement. Rarely used.
+Limitation: no multi-statement transactions. Useful only for simple read-only workloads.
+
+```ini
+; pgbouncer.ini
+[databases]
+mydb = host=localhost port=5432 dbname=mydb
+
+[pgbouncer]
+listen_port = 6432
+listen_addr = *
+auth_type = scram-sha-256
+auth_file = /etc/pgbouncer/userlist.txt
+
+pool_mode = transaction          ; session|transaction|statement
+max_client_conn = 1000           ; max connections from clients
+default_pool_size = 25           ; server connections per database/user pair
+min_pool_size = 5                ; keep this many open even when idle
+reserve_pool_size = 5            ; extra connections used when the pool is exhausted
+reserve_pool_timeout = 5         ; seconds to wait before using reserve pool
+
+server_reset_query = DISCARD ALL ; run after each session return (session mode)
+server_check_query = SELECT 1    ; health check query
+server_idle_timeout = 600        ; close idle server connections after 10min
+client_idle_timeout = 0          ; 0 = never close idle clients (set in app instead)
+
+; Logging
+log_connections = 0              ; reduce noise in production
+log_disconnections = 0
+log_pooler_errors = 1
+stats_period = 60                ; log stats every 60 seconds
+```
+
+```bash
+# pgBouncer monitoring via admin console
+psql -h localhost -p 6432 -U pgbouncer pgbouncer
+
+SHOW POOLS;      -- pool stats: cl_active, cl_waiting, sv_active, sv_idle
+SHOW CLIENTS;    -- connected clients
+SHOW SERVERS;    -- server connections
+SHOW STATS;      -- request rates, query times
+SHOW CONFIG;     -- current configuration
+
+PAUSE mydb;      -- pause pool (for maintenance)
+RESUME mydb;     -- resume pool
+RELOAD;          -- reload config without restart
+```
+
+### Application-Level Pooling (SQLAlchemy)
+
+```python
+from sqlalchemy import create_engine
+
+engine = create_engine(
+    "postgresql+psycopg2://user:pass@localhost:5432/mydb",
+    pool_size=10,           # persistent connections in pool
+    max_overflow=20,        # extra connections beyond pool_size (temporary)
+    pool_timeout=30,        # seconds to wait for available connection
+    pool_recycle=1800,      # recycle connections after 30min (avoid stale connections)
+    pool_pre_ping=True,     # test connection before using from pool
+)
+
+# With pgBouncer in transaction mode, use NullPool or StaticPool
+# (pgBouncer handles pooling, app should not pool on top of pooler)
+from sqlalchemy.pool import NullPool
+engine = create_engine(
+    "postgresql+psycopg2://user:pass@pgbouncer:6432/mydb",
+    poolclass=NullPool       # no application-level pooling
+)
+```
+
+### Connection Sizing Guidelines
+
+The classic formula: `connections = (core_count * 2) + effective_spindle_count`
+
+For SSD storage (spindles = 1), a 16-core server: `(16 * 2) + 1 = 33` PostgreSQL connections.
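+
+The formula above as a one-liner, reproducing the 16-core example:
+
+```python
+# Sketch of the classic sizing formula: connections = cores * 2 + effective spindles.
+# SSDs behave roughly like a single spindle for this estimate.
+def pg_connection_budget(core_count: int, effective_spindle_count: int = 1) -> int:
+    return core_count * 2 + effective_spindle_count
+
+print(pg_connection_budget(16))      # 33, matching the example above
+print(pg_connection_budget(8, 2))    # 18 for an 8-core box with 2 spindles
+```
+
+Treat the result as the PostgreSQL-side `max_connections` target; a pooler in front serves many more clients.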
+
+```sql
+-- Check current connection usage
+SELECT count(*) AS total,
+       count(*) FILTER (WHERE state = 'active') AS active,
+       count(*) FILTER (WHERE state = 'idle') AS idle,
+       count(*) FILTER (WHERE state = 'idle in transaction') AS idle_txn,
+       max_conn
+FROM pg_stat_activity
+CROSS JOIN (SELECT setting::int AS max_conn FROM pg_settings WHERE name = 'max_connections') s
+GROUP BY max_conn;
+
+-- Connections by application
+SELECT application_name, count(*), max(now() - state_change) AS longest_idle
+FROM pg_stat_activity
+GROUP BY application_name
+ORDER BY count DESC;
+```
+
+```ini
+; postgresql.conf connection settings
+max_connections = 100            ; total connections (including superuser)
+superuser_reserved_connections = 3  ; reserved for superuser access
+
+; Memory implication: each connection uses ~5-10MB of RAM
+; At max_connections = 200: budget 1-2GB RAM for connection overhead alone
+; Use pgBouncer to keep max_connections low (50-100) and serve thousands of clients
+
+; Recommended approach for most web apps:
+; App -> pgBouncer (transaction mode, max_client_conn=1000, pool_size=25)
+;     -> PostgreSQL (max_connections=50)
+```
+
+### Monitoring Pool Health
+
+```sql
+-- Alert conditions to monitor:
+-- 1. Connections near max_connections
+SELECT count(*) * 100.0 / current_setting('max_connections')::int AS pct_used
+FROM pg_stat_activity;
+
+-- 2. Idle-in-transaction accumulating (connection leak or slow clients)
+SELECT count(*)
+FROM pg_stat_activity
+WHERE state = 'idle in transaction'
+  AND now() - state_change > interval '5 minutes';
+
+-- 3. Connection wait (pgBouncer cl_waiting > 0 sustained = under-provisioned pool)
+
+-- Set timeouts to prevent connection leaks:
+-- ALTER ROLE myapp SET idle_in_transaction_session_timeout = '5min';
+-- ALTER ROLE myapp SET statement_timeout = '30s';
+```

+ 632 - 0
skills/postgres-ops/references/query-tuning.md

@@ -0,0 +1,632 @@
+# PostgreSQL Query Tuning Reference
+
+## Table of Contents
+
+1. [EXPLAIN Output Reference](#explain-output-reference)
+2. [Plan Node Reference](#plan-node-reference)
+3. [pg_stat_statements](#pg_stat_statements)
+4. [Common Optimization Patterns](#common-optimization-patterns)
+5. [Parallel Query](#parallel-query)
+6. [Statistics and Planner](#statistics-and-planner)
+
+---
+
+## EXPLAIN Output Reference
+
+### Format Options
+
+```sql
+-- Default text format (human readable)
+EXPLAIN SELECT * FROM orders WHERE customer_id = 42;
+
+-- With actual execution stats (runs the query)
+EXPLAIN (ANALYZE, BUFFERS) SELECT * FROM orders WHERE customer_id = 42;
+
+-- Full verbose output with all options
+EXPLAIN (ANALYZE, BUFFERS, VERBOSE, FORMAT TEXT) SELECT * FROM orders WHERE customer_id = 42;
+
+-- JSON format for programmatic parsing
+EXPLAIN (ANALYZE, BUFFERS, FORMAT JSON) SELECT * FROM orders WHERE customer_id = 42;
+
+-- YAML format
+EXPLAIN (ANALYZE, BUFFERS, FORMAT YAML) SELECT * FROM orders WHERE customer_id = 42;
+```
+
+### Key Fields Decoded
+
+```
+Seq Scan on orders  (cost=0.00..4821.00 rows=1000 width=64)
+                          ^      ^       ^         ^
+                          |      |       |         estimated avg row width (bytes)
+                          |      |       estimated output rows
+                          |      total cost (return last row)
+                          startup cost (return first row)
+
+(actual time=0.042..18.340 rows=987 loops=1)
+              ^       ^     ^        ^
+              |       |     |        number of times node executed
+              |       |     actual rows returned
+              |       actual time to return last row (ms)
+              actual time to return first row (ms)
+```
+
+**Startup cost vs total cost**: A bitmap scan has a high startup cost (the bitmap is built
+before any row is returned) but a low per-row cost after that. Nested loops favor low startup cost.
+A sort node's startup cost equals the full sort cost because no rows are returned until sorted.
+
+**Loops**: When a node has `loops=N`, the `actual time` is per-loop average and `actual rows`
+is per-loop average. Multiply by loops to get totals. This matters for nested loop inners.
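+
+The per-loop arithmetic can be sketched directly; rounding mirrors the millisecond precision EXPLAIN prints:
+
+```python
+# Sketch: totals for a plan node reported with loops=N, where EXPLAIN ANALYZE
+# shows per-loop averages for "actual time" and "rows".
+def node_totals(actual_time_ms: float, rows: float, loops: int):
+    return {
+        "total_time_ms": round(actual_time_ms * loops, 3),
+        "total_rows": round(rows * loops),
+    }
+
+# Inner side of a nested loop: 0.05 ms and 1 row per loop, over 10,000 loops
+print(node_totals(0.05, 1, 10_000))   # {'total_time_ms': 500.0, 'total_rows': 10000}
+```
+
+A node that looks cheap per loop can dominate runtime once multiplied out, which is why nested loop inners deserve scrutiny.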
+
+```sql
+-- Identify row estimate errors (poor estimates = bad plans)
+-- Look for large divergence between "rows=X" and "actual rows=Y"
+-- A 10x+ difference warrants investigation via ANALYZE or statistics adjustments
+```
+
+### Buffer Information
+
+```
+Buffers: shared hit=1024 read=256 dirtied=10 written=5
+          ^              ^         ^           ^
+          |              |         |           pages written to disk
+          |              |         pages modified during query
+          |              pages read from disk (cache miss)
+          pages served from shared_buffers (cache hit)
+```
+
+Cache hit ratio for a single query:
+- `hit / (hit + read)` -- aim for > 0.99 in OLTP workloads
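+
+The same ratio can be checked per table across the database; `pg_statio_user_tables` is a core statistics view:
+
+```sql
+SELECT relname,
+       round(100.0 * heap_blks_hit / nullif(heap_blks_hit + heap_blks_read, 0), 2) AS hit_pct
+FROM pg_statio_user_tables
+ORDER BY heap_blks_hit + heap_blks_read DESC
+LIMIT 10;
+```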
+
+### Reading Execution Time
+
+```sql
+-- Planning time vs execution time appear at bottom of EXPLAIN ANALYZE output
+-- Planning Time: 1.234 ms
+-- Execution Time: 45.678 ms
+-- High planning time relative to execution suggests the query is a candidate for
+-- prepared statements (plan caching), or is very complex (many joins/partitions)
+```
+
+---
+
+## Plan Node Reference
+
+### Scan Types
+
+**Sequential Scan** -- reads entire table from disk in order.
+Chosen when: selectivity is high (returning large fraction of rows), no suitable index,
+small table fits in a few pages, or planner estimates index overhead exceeds benefit.
+
+```sql
+-- Force/prevent seq scan for testing (session level)
+SET enable_seqscan = off;   -- discourages seq scan
+SET enable_seqscan = on;    -- restore default
+```
+
+**Index Scan** -- follows index B-tree to find heap row pointers, fetches each heap page.
+Chosen when: high selectivity (few rows), index covers filter column, ORDER BY matches index.
+Drawback: random I/O on heap. Can be slower than seq scan on spinning disk for > ~5% of table.
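+
+A minimal sketch, assuming the usual primary key index on `orders.id`:
+
+```sql
+-- Equality on a unique, indexed column is the classic index scan case
+EXPLAIN SELECT * FROM orders WHERE id = 42;
+-- Expect something like: Index Scan using orders_pkey on orders
+```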
+
+**Index Only Scan** -- satisfies query entirely from index, no heap fetch (if visibility map allows).
+Requires: all SELECT and WHERE columns in the index. Needs up-to-date visibility map (regular VACUUM).
+
+```sql
+-- Check if index only scan is blocked by visibility map
+SELECT relname, n_dead_tup, last_vacuum, last_autovacuum
+FROM pg_stat_user_tables
+WHERE relname = 'orders';
+
+-- Create covering index to enable index only scan
+CREATE INDEX idx_orders_covering ON orders (customer_id) INCLUDE (total, status, created_at);
+```
+
+**Bitmap Index Scan + Bitmap Heap Scan** -- builds bitmap of matching pages in memory,
+then fetches those pages in order (reduces random I/O vs plain Index Scan for moderate selectivity).
+Two-phase: BitmapIndexScan builds the bitmap, BitmapHeapScan fetches heap pages.
+
+```sql
+-- Bitmap scans combine multiple indexes via BitmapAnd / BitmapOr
+-- Useful when query has multiple filter conditions each with their own index
+EXPLAIN SELECT * FROM orders WHERE status = 'pending' AND region = 'EU';
+-- May show: BitmapAnd -> BitmapIndexScan on idx_status + BitmapIndexScan on idx_region
+```
+
+### Join Types
+
+**Nested Loop** -- for each outer row, scan inner relation.
+Cost: O(outer_rows * inner_scan_cost). Best when outer is small and inner lookup is fast (indexed).
+Chosen when: outer result set is small, inner has index on join column.
+
+```sql
+-- Nested loop is ideal for:
+-- SELECT * FROM orders o JOIN customers c ON c.id = o.customer_id WHERE o.id = 99;
+-- (single order -> single customer lookup via PK)
+
+SET enable_nestloop = off;  -- force alternative join type for testing
+```
+
+**Hash Join** -- build hash table from smaller relation, probe with larger.
+Cost: O(build + probe). Best for large unsorted relations with no useful index on join key.
+Chosen when: joining large tables, no index on join columns, equality join only.
+Memory: controlled by `work_mem`. If hash table exceeds work_mem, spills to disk (batch mode).
+
+```sql
+-- Check for hash join disk spills in EXPLAIN ANALYZE
+-- Hash Batches: 4 means the hash table spilled to disk in 4 batches -- increase work_mem to fix
+-- Hash Batches: 1 is ideal (everything in memory)
+SET work_mem = '256MB';  -- session level for large analytical queries
+```
+
+**Merge Join** -- sort both relations on join key, merge in order.
+Cost: O(N log N + M log M) for sorting. Best when inputs are already sorted (index).
+Chosen when: both sides are large, inputs already sorted, range or equality join.
+
+```sql
+SET enable_hashjoin = off;
+SET enable_mergejoin = off;
+-- Use sparingly in production; better to fix the cause (add index, fix statistics)
+```
+
+### Aggregation
+
+**HashAggregate** -- builds hash table of group keys, accumulates aggregates.
+Chosen for: unsorted input, many distinct groups. Memory: bounded by `work_mem`.
+When it exceeds work_mem, spills to disk (check `Disk: XkB` in EXPLAIN ANALYZE output).
+
+**GroupAggregate** -- streams sorted input, emits group when key changes.
+Chosen when: input already sorted on GROUP BY columns (index), or few distinct groups.
+Zero memory overhead but requires sorted input.
+
+```sql
+-- Force sorted approach by ensuring index on GROUP BY columns
+CREATE INDEX idx_orders_customer ON orders (customer_id, created_at);
+-- Now GROUP BY customer_id may use GroupAggregate instead of HashAggregate
+```
+
+### Sort Operations
+
+```sql
+EXPLAIN ANALYZE SELECT * FROM orders ORDER BY created_at DESC LIMIT 100;
+
+-- In-memory sort: Sort Method: quicksort  Memory: 2048kB
+-- Disk sort:      Sort Method: external merge  Disk: 512000kB  -- bad, increase work_mem
+
+-- Top-N Heapsort: Sort Method: top-N heapsort  Memory: 64kB  -- efficient for LIMIT
+-- Top-N heapsort is optimal for ORDER BY ... LIMIT N patterns
+```
+
+---
+
+## pg_stat_statements
+
+### Setup
+
+```sql
+-- postgresql.conf (requires restart)
+shared_preload_libraries = 'pg_stat_statements'
+pg_stat_statements.max = 10000          -- number of query fingerprints tracked
+pg_stat_statements.track = all          -- top|all|none (all includes nested queries)
+pg_stat_statements.track_utility = on   -- track COPY, CREATE TABLE, etc.
+
+-- After restart, create extension in each database you want to monitor
+CREATE EXTENSION pg_stat_statements;
+```
+
+### Key Columns (PostgreSQL 14+)
+
+```sql
+SELECT
+    queryid,                          -- internal hash identifier
+    query,                            -- normalized query text (params replaced with $1, $2)
+    calls,                            -- number of times executed
+    total_exec_time,                  -- total execution time (ms)
+    mean_exec_time,                   -- avg execution time (ms)
+    stddev_exec_time,                 -- std deviation (high = inconsistent)
+    min_exec_time,
+    max_exec_time,
+    rows,                             -- total rows returned/affected
+    shared_blks_hit,                  -- buffer cache hits
+    shared_blks_read,                 -- disk reads
+    shared_blks_dirtied,
+    shared_blks_written,
+    temp_blks_read,                   -- temp file reads (work_mem overflow)
+    temp_blks_written,
+    wal_bytes,                        -- WAL generated (high = write-heavy)
+    toplevel                          -- true if called at top level (PG14+)
+FROM pg_stat_statements;
+```
+
+### Finding Problem Queries
+
+```sql
+-- Top 10 queries by total time (cumulative load on server)
+SELECT
+    round(total_exec_time::numeric, 2) AS total_ms,
+    calls,
+    round(mean_exec_time::numeric, 2) AS mean_ms,
+    round((100 * total_exec_time / sum(total_exec_time) OVER ())::numeric, 2) AS pct_total,
+    left(query, 80) AS query_snippet
+FROM pg_stat_statements
+ORDER BY total_exec_time DESC
+LIMIT 10;
+
+-- Top 10 by mean execution time (slowest individual queries)
+SELECT
+    calls,
+    round(mean_exec_time::numeric, 2) AS mean_ms,
+    round(stddev_exec_time::numeric, 2) AS stddev_ms,
+    round(max_exec_time::numeric, 2) AS max_ms,
+    left(query, 80) AS query_snippet
+FROM pg_stat_statements
+WHERE calls > 10                   -- ignore rarely-run queries
+ORDER BY mean_exec_time DESC
+LIMIT 10;
+
+-- Queries with worst cache hit ratio (I/O bound candidates)
+SELECT
+    calls,
+    round(mean_exec_time::numeric, 2) AS mean_ms,
+    shared_blks_hit + shared_blks_read AS total_blks,
+    round(
+        100.0 * shared_blks_hit / nullif(shared_blks_hit + shared_blks_read, 0),
+        2
+    ) AS hit_pct,
+    left(query, 80) AS query_snippet
+FROM pg_stat_statements
+WHERE shared_blks_hit + shared_blks_read > 1000
+ORDER BY hit_pct ASC
+LIMIT 10;
+
+-- Queries generating most temp files (work_mem too low or bad query)
+SELECT
+    calls,
+    temp_blks_written,
+    round(mean_exec_time::numeric, 2) AS mean_ms,
+    left(query, 80) AS query_snippet
+FROM pg_stat_statements
+WHERE temp_blks_written > 0
+ORDER BY temp_blks_written DESC
+LIMIT 10;
+```
+
+### Reset Strategy
+
+```sql
+-- Reset stats for all queries (do after tuning to get fresh baseline)
+SELECT pg_stat_statements_reset();
+
+-- Reset stats for specific query (PG12+ by queryid)
+SELECT pg_stat_statements_reset(userid, dbid, queryid)
+FROM pg_stat_statements
+WHERE query LIKE '%orders%'
+LIMIT 1;
+```
+
+---
+
+## Common Optimization Patterns
+
+### CTE Materialization (PostgreSQL 12+)
+
+```sql
+-- Pre-PG12: CTEs were always materialized (optimization fence)
+-- PG12+: planner decides, but you can force behavior
+
+-- MATERIALIZED: always execute CTE once and cache result
+-- Use when: CTE is expensive but referenced multiple times
+WITH expensive_agg AS MATERIALIZED (
+    SELECT customer_id, sum(total) AS lifetime_value
+    FROM orders
+    GROUP BY customer_id
+)
+SELECT c.name, e.lifetime_value
+FROM customers c
+JOIN expensive_agg e ON e.customer_id = c.id;
+
+-- NOT MATERIALIZED: inline the CTE (allow planner to push predicates in)
+-- Use when: CTE is referenced once, or predicate pushdown is important
+WITH recent_orders AS NOT MATERIALIZED (
+    SELECT * FROM orders WHERE status = 'complete'
+)
+SELECT * FROM recent_orders WHERE customer_id = 42;
+-- Planner can now push "customer_id = 42" into the subquery and use an index
+```
+
+### EXISTS vs IN vs JOIN
+
+```sql
+-- EXISTS: short-circuits on first match, good for correlated checks
+-- Best when: checking existence only, inner side can be large
+SELECT c.id, c.name
+FROM customers c
+WHERE EXISTS (
+    SELECT 1 FROM orders o WHERE o.customer_id = c.id AND o.status = 'pending'
+);
+
+-- IN with subquery: similar to EXISTS in modern PG (planner converts to semi-join)
+-- The NULL hazard applies to NOT IN, not IN -- see below
+SELECT c.id, c.name
+FROM customers c
+WHERE c.id IN (SELECT customer_id FROM orders WHERE status = 'pending');
+
+-- JOIN (semi-join via DISTINCT): explicit, predictable
+-- Needed when: you want columns from both sides, or deduplication matters
+SELECT DISTINCT c.id, c.name
+FROM customers c
+JOIN orders o ON o.customer_id = c.id AND o.status = 'pending';
+
+-- NOT IN danger with NULLs: returns zero rows if subquery has any NULL
+-- Always use NOT EXISTS for negation checks
+SELECT * FROM customers WHERE id NOT IN (SELECT customer_id FROM orders);
+-- If ANY customer_id in orders is NULL, returns no rows!
+-- Use instead:
+SELECT * FROM customers c
+WHERE NOT EXISTS (SELECT 1 FROM orders o WHERE o.customer_id = c.id);
+```
+
+### Lateral Join vs Subquery
+
+```sql
+-- LATERAL: allows subquery to reference columns from preceding FROM items
+-- Useful for: top-N per group, correlated row-limited subqueries
+
+-- Top 3 orders per customer (lateral is clean and indexed)
+SELECT c.name, o.id, o.total
+FROM customers c
+CROSS JOIN LATERAL (
+    SELECT id, total
+    FROM orders
+    WHERE customer_id = c.id
+    ORDER BY total DESC
+    LIMIT 3
+) o;
+
+-- Equivalent window function approach (often similar performance)
+SELECT name, order_id, total
+FROM (
+    SELECT c.name, o.id AS order_id, o.total,
+           row_number() OVER (PARTITION BY c.id ORDER BY o.total DESC) AS rn
+    FROM customers c
+    JOIN orders o ON o.customer_id = c.id
+) ranked
+WHERE rn <= 3;
+```
+
+### Pagination: OFFSET vs Keyset
+
+```sql
+-- OFFSET pagination: simple but degrades at high page numbers
+-- At page 1000 with LIMIT 20, PostgreSQL fetches 20020 rows and discards 20000
+SELECT id, name, created_at FROM orders ORDER BY created_at DESC LIMIT 20 OFFSET 20000;
+
+-- Keyset (cursor) pagination: O(1) regardless of page depth
+-- Requires: sorting by unique+indexed column(s), no arbitrary page jumping
+-- After receiving last row of previous page with (created_at='2024-01-15', id=9876):
+SELECT id, name, created_at
+FROM orders
+WHERE (created_at, id) < ('2024-01-15', 9876)  -- uses row comparison
+ORDER BY created_at DESC, id DESC
+LIMIT 20;
+
+-- Index to support keyset:
+CREATE INDEX idx_orders_keyset ON orders (created_at DESC, id DESC);
+```
+
+### DISTINCT ON vs Window Function Deduplication
+
+```sql
+-- DISTINCT ON: PostgreSQL extension, returns first row per group (by ORDER BY)
+-- Fast, single pass, leverages index on distinct columns
+SELECT DISTINCT ON (customer_id)
+    customer_id, id AS order_id, total, created_at
+FROM orders
+ORDER BY customer_id, created_at DESC;   -- gets most recent order per customer
+
+-- Create index to support: (customer_id, created_at DESC)
+CREATE INDEX idx_orders_latest ON orders (customer_id, created_at DESC);
+
+-- Window function equivalent (more portable, more flexible)
+SELECT customer_id, order_id, total, created_at
+FROM (
+    SELECT customer_id, id AS order_id, total, created_at,
+           row_number() OVER (PARTITION BY customer_id ORDER BY created_at DESC) AS rn
+    FROM orders
+) t
+WHERE rn = 1;
+```
+
+### Bulk Operations
+
+```sql
+-- COPY is fastest for bulk insert (bypasses most overhead)
+-- From file:
+COPY orders (customer_id, total, status) FROM '/tmp/orders.csv' WITH (FORMAT csv, HEADER);
+
+-- From stdin (psql):
+\COPY orders (customer_id, total, status) FROM 'orders.csv' CSV HEADER
+
+-- unnest trick for bulk insert from application (avoids N round trips)
+-- Send arrays of values, unnest server-side
+INSERT INTO orders (customer_id, total, status)
+SELECT * FROM unnest(
+    ARRAY[1, 2, 3],              -- customer_ids
+    ARRAY[100.00, 200.00, 50.00], -- totals
+    ARRAY['pending', 'complete', 'pending']::text[]
+) AS t(customer_id, total, status);
+
+-- For very large bulk loads, consider disabling triggers during the load
+-- (note: DISABLE TRIGGER ALL also disables FK enforcement -- validate afterwards)
+ALTER TABLE orders DISABLE TRIGGER ALL;
+-- ... COPY ...
+ALTER TABLE orders ENABLE TRIGGER ALL;
+-- For indexes: drop, load, recreate (a single index build is faster than
+-- incremental index updates during the load)
+
+-- Batch INSERT with ON CONFLICT (UPSERT)
+INSERT INTO order_status_log (order_id, status, updated_at)
+VALUES (1, 'shipped', now()), (2, 'delivered', now())
+ON CONFLICT (order_id) DO UPDATE
+    SET status = EXCLUDED.status,
+        updated_at = EXCLUDED.updated_at;
+```
+
+---
+
+## Parallel Query
+
+### Configuration Settings
+
+```sql
+-- Key settings (postgresql.conf or ALTER SYSTEM)
+max_parallel_workers_per_gather = 4      -- max workers per Gather node (default: 2)
+max_parallel_workers = 8                  -- total parallel workers across all queries
+max_worker_processes = 16                 -- total background workers (includes parallel)
+min_parallel_table_scan_size = '8MB'     -- table must be > this for parallel seq scan
+min_parallel_index_scan_size = '512kB'   -- index must be > this for parallel index scan
+parallel_tuple_cost = 0.1                -- cost of passing tuple between workers
+parallel_setup_cost = 1000               -- overhead of launching workers
+```
+
+### When Parallel Query Engages
+
+```sql
+-- Parallel plans are considered for large, read-only scans when workers are available
+-- Check if parallel is being used:
+EXPLAIN SELECT count(*), avg(total) FROM orders;
+-- Should show: Gather -> Partial Aggregate -> Parallel Seq Scan on orders
+
+-- Force parallel for testing (lower thresholds):
+SET min_parallel_table_scan_size = 0;
+SET parallel_setup_cost = 0;
+SET max_parallel_workers_per_gather = 4;
+```
+
+### When Parallel Does NOT Kick In
+
+- Queries that write (INSERT, UPDATE, DELETE, MERGE)
+- Queries inside functions marked `PARALLEL UNSAFE` (default for user functions)
+- Queries using cursors (`DECLARE ... CURSOR FOR`)
+- Queries called from another parallel worker
+- When `max_parallel_workers_per_gather = 0`
+- When `LIMIT` is small relative to table size (planner avoids parallel startup cost)
+
+```sql
+-- Mark functions parallel safe to allow parallel plans that call them
+CREATE OR REPLACE FUNCTION calculate_discount(total numeric) RETURNS numeric
+LANGUAGE sql
+PARALLEL SAFE    -- only if function has no side effects and is truly safe
+AS $$
+    SELECT total * 0.9;
+$$;
+```
+
+---
+
+## Statistics and Planner
+
+### Column Statistics
+
+```sql
+-- Default statistics target is 100 (samples ~30000 rows per column)
+-- Increase for columns with many distinct values or skewed distributions
+
+-- Check current statistics targets
+SELECT attname, attstattarget
+FROM pg_attribute
+WHERE attrelid = 'orders'::regclass AND attnum > 0;
+
+-- Increase statistics for a specific column
+ALTER TABLE orders ALTER COLUMN status SET STATISTICS 500;
+ANALYZE orders;  -- must re-run ANALYZE to collect new statistics
+
+-- Check what the planner knows about a column
+SELECT * FROM pg_stats
+WHERE tablename = 'orders' AND attname = 'status';
+-- Key fields: n_distinct, most_common_vals, most_common_freqs, histogram_bounds
+```
+
+### Extended Statistics
+
+```sql
+-- When two columns are correlated, single-column stats mislead the planner
+-- Example: city and zip_code are correlated; planner underestimates after filtering both
+
+-- Create extended statistics to capture column correlations
+CREATE STATISTICS orders_region_status_stats (dependencies, ndistinct)
+    ON region, status FROM orders;
+
+ANALYZE orders;
+
+-- Check extended statistics
+SELECT * FROM pg_statistic_ext;
+SELECT * FROM pg_statistic_ext_data;
+
+-- MCV (most common values) extended statistics
+CREATE STATISTICS orders_mcv (mcv) ON region, status FROM orders;
+ANALYZE orders;
+```
+
+### n_distinct Overrides
+
+```sql
+-- When planner guesses wrong number of distinct values
+-- Positive value = exact count, negative = fraction of total rows
+
+-- Tell planner there are exactly 50 distinct statuses
+ALTER TABLE orders ALTER COLUMN status SET (n_distinct = 50);
+
+-- Tell planner distinct count is 10% of table rows
+ALTER TABLE orders ALTER COLUMN customer_id SET (n_distinct = -0.1);
+
+ANALYZE orders;  -- re-analyze to apply
+```
+
+### pg_hint_plan (Last Resort)
+
+```sql
+-- Install pg_hint_plan extension (not in core, must compile or use package)
+-- Use only when statistics fixes and index changes are insufficient
+
+-- Hints are embedded in comments before the query
+/*+ SeqScan(orders) */ SELECT * FROM orders WHERE status = 'pending';
+
+/*+ IndexScan(orders idx_orders_status) */ SELECT * FROM orders WHERE status = 'pending';
+
+/*+ HashJoin(orders customers) Leading(orders customers) */
+SELECT * FROM orders o JOIN customers c ON c.id = o.customer_id;
+
+-- Available hint types:
+-- Scan: SeqScan, IndexScan, IndexOnlyScan, BitmapScan, NoSeqScan, NoIndexScan
+-- Join: NestLoop, HashJoin, MergeJoin, NoNestLoop, NoHashJoin, NoMergeJoin
+-- Join order: Leading(table1 table2 table3)
+-- Parallel: Parallel(table N)  -- N = number of workers
+
+-- Always document WHY a hint is needed and create a ticket to fix root cause
+-- Hints become stale as data grows and can cause regressions after schema changes
+```
+
+### Diagnosing Estimate vs Actual Divergence
+
+```sql
+-- Large divergence between estimated and actual rows is the #1 cause of bad plans
+-- Use this query pattern to identify problem queries via pg_stat_statements + EXPLAIN
+
+-- Step 1: find high-variance queries in pg_stat_statements
+-- Step 2: run EXPLAIN ANALYZE and look for nodes where rows estimate is off by 10x+
+-- Step 3: check pg_stats for the filtered columns
+
+-- Example: orders table filtered on two correlated columns
+EXPLAIN (ANALYZE, FORMAT JSON)
+SELECT * FROM orders WHERE region = 'US' AND status = 'pending';
+
+-- If estimated rows = 10 but actual rows = 50000, investigate:
+SELECT n_distinct, most_common_vals, most_common_freqs
+FROM pg_stats
+WHERE tablename = 'orders' AND attname IN ('region', 'status');
+
+-- Fix options in priority order:
+-- 1. ANALYZE (if stats are stale)
+-- 2. Increase statistics target: ALTER TABLE ... ALTER COLUMN ... SET STATISTICS 500
+-- 3. Create extended statistics for correlated columns
+-- 4. Rewrite query to give planner better information
+-- 5. pg_hint_plan as absolute last resort
+```

+ 628 - 0
skills/postgres-ops/references/replication.md

@@ -0,0 +1,628 @@
+# PostgreSQL Replication, Partitioning & FDW Reference
+
+## Table of Contents
+
+1. [Streaming Replication](#streaming-replication)
+   - Primary Configuration
+   - Replica Configuration
+   - Synchronous vs Asynchronous
+   - Monitoring Replication
+   - Replication Slots
+2. [Logical Replication](#logical-replication)
+   - Publications
+   - Subscriptions
+   - Row Filters and Column Lists (PG15+)
+   - Use Cases and Limitations
+3. [Failover](#failover)
+   - Promoting a Standby
+   - Timeline Switches
+   - Connection Routing
+4. [Table Partitioning](#table-partitioning)
+   - RANGE Partitioning
+   - LIST Partitioning
+   - HASH Partitioning
+   - Sub-partitioning
+   - Partition Maintenance
+   - When to Partition
+5. [Foreign Data Wrappers](#foreign-data-wrappers)
+   - postgres_fdw Setup
+   - IMPORT FOREIGN SCHEMA
+   - Performance and Pushdown
+
+---
+
+## Streaming Replication
+
+### Primary Configuration
+
+Edit `postgresql.conf` on the primary:
+
+```ini
+# Minimum required for streaming replication
+wal_level = replica          # or 'logical' if you also need logical replication
+max_wal_senders = 10         # number of concurrent standby connections
+wal_keep_size = 1GB          # retain WAL to prevent standby falling behind
+                             # prefer replication slots over this setting
+
+# Optional but recommended
+hot_standby_feedback = on    # prevents primary from vacuuming rows standby needs
+```
+
+Create a replication role on the primary:
+
+```sql
+CREATE ROLE replicator WITH REPLICATION LOGIN PASSWORD 'secret';
+```
+
+Allow the standby in `pg_hba.conf` on the primary:
+
+```
+# TYPE  DATABASE        USER         ADDRESS          METHOD
+host    replication     replicator   192.168.1.0/24   scram-sha-256
+```
+
+Reload after editing `pg_hba.conf`:
+
+```sql
+SELECT pg_reload_conf();
+```
+
+### Replica Configuration
+
+Take a base backup from the primary (run on standby host):
+
+```bash
+pg_basebackup \
+  --host=primary-host \
+  --username=replicator \
+  --pgdata=/var/lib/postgresql/data \
+  --wal-method=stream \
+  --checkpoint=fast \
+  --progress
+```
+
+Create `postgresql.conf` overrides or `postgresql.auto.conf` on the replica:
+
+```ini
+primary_conninfo = 'host=primary-host port=5432 user=replicator password=secret'
+primary_slot_name = 'replica1_slot'   # if using replication slots
+hot_standby = on                       # allow read queries on replica
+recovery_min_apply_delay = 0           # set to e.g. '30min' for delayed replica
+```
+
+Create the standby signal file (PG12+):
+
+```bash
+touch /var/lib/postgresql/data/standby.signal
+```
+
+### Synchronous vs Asynchronous Replication
+
+**Asynchronous** (default): primary commits without waiting for standby. Risk of data loss on primary failure equal to replication lag.
+
+**Synchronous**: primary waits for at least one standby to confirm WAL receipt before returning to client.
+
+```ini
+# On primary postgresql.conf
+synchronous_standby_names = 'replica1'
+# or for ANY 1 of multiple standbys:
+synchronous_standby_names = 'ANY 1 (replica1, replica2, replica3)'
+# or wait for the first 2 in priority order:
+synchronous_standby_names = 'FIRST 2 (replica1, replica2, replica3)'
+```
+
+Standby names come from the `application_name` in `primary_conninfo`:
+
+```ini
+primary_conninfo = 'host=primary port=5432 user=replicator application_name=replica1'
+```
+
+Trade-offs:
+
+| Mode | Durability | Write Latency | Throughput |
+|------|-----------|---------------|------------|
+| Async | Data loss possible | Low | Highest |
+| Sync (remote_write) | WAL received, not flushed | Medium | High |
+| Sync (on) | WAL flushed to disk | Higher | Lower |
+| Sync (remote_apply) | Changes applied | Highest | Lowest |
+
+```ini
+# Control sync level (default is 'on' = flush to standby disk)
+synchronous_commit = remote_write   # faster, slight durability trade-off
+```
+
+### Monitoring Replication
+
+On the primary, query `pg_stat_replication`:
+
+```sql
+SELECT
+    application_name,
+    client_addr,
+    state,                          -- startup, catchup, streaming
+    sync_state,                     -- async, sync, potential
+    sent_lsn,
+    write_lsn,
+    flush_lsn,
+    replay_lsn,
+    -- Replication lag in bytes (pg_wal_lsn_diff works on all supported versions)
+    pg_wal_lsn_diff(sent_lsn, replay_lsn) AS replay_lag_bytes,
+    -- Replication lag in time (PG10+)
+    write_lag,
+    flush_lag,
+    replay_lag
+FROM pg_stat_replication;
+```
+
+On the replica, check if it is in recovery and its LSN position:
+
+```sql
+SELECT
+    pg_is_in_recovery(),
+    pg_last_wal_receive_lsn(),
+    pg_last_wal_replay_lsn(),
+    pg_last_xact_replay_timestamp(),
+    -- Time lag (approximate)
+    now() - pg_last_xact_replay_timestamp() AS replication_delay;
+```
+
+Alert when lag exceeds threshold:
+
+```sql
+-- Alert if replay lag > 30 seconds
+SELECT application_name, replay_lag
+FROM pg_stat_replication
+WHERE replay_lag > interval '30 seconds';
+```
+
+### Replication Slots
+
+Replication slots prevent the primary from removing WAL segments needed by a standby, eliminating the need for `wal_keep_size` tuning. The risk is unbounded WAL accumulation if a slot is abandoned.
+
+Create a physical slot on the primary:
+
+```sql
+SELECT pg_create_physical_replication_slot('replica1_slot');
+```
+
+List all slots and check for lag:
+
+```sql
+SELECT
+    slot_name,
+    slot_type,
+    active,
+    restart_lsn,
+    confirmed_flush_lsn,
+    -- WAL retained by this slot in bytes
+    pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn) AS retained_bytes,
+    pg_size_pretty(pg_wal_lsn_diff(pg_current_wal_lsn(), restart_lsn)) AS retained_size
+FROM pg_replication_slots;
+```
+
+Drop an abandoned slot to reclaim disk:
+
+```sql
+SELECT pg_drop_replication_slot('replica1_slot');
+```
+
+Set a safety limit to prevent disk exhaustion (PG13+):
+
+```ini
+max_slot_wal_keep_size = 10GB   # slot is invalidated (lost) once retained WAL exceeds this
+```
+
+---
+
+## Logical Replication
+
+Logical replication decodes WAL into row-level change streams. It allows selective table sync and works across major versions.
+
+### Publications
+
+A publication defines what changes to export:
+
+```sql
+-- All tables, all operations
+CREATE PUBLICATION pub_all FOR ALL TABLES;
+
+-- Specific tables
+CREATE PUBLICATION pub_orders FOR TABLE orders, order_items;
+
+-- Specific operations only
+CREATE PUBLICATION pub_inserts FOR TABLE events WITH (publish = 'insert');
+
+-- With row filter (PG15+): only published rows matching WHERE
+CREATE PUBLICATION pub_active_orders FOR TABLE orders
+    WHERE (status != 'cancelled');
+
+-- With column list (PG15+): only publish selected columns
+CREATE PUBLICATION pub_orders_summary FOR TABLE orders (id, status, total, created_at);
+```
+
+Manage publications:
+
+```sql
+ALTER PUBLICATION pub_orders ADD TABLE shipments;
+ALTER PUBLICATION pub_orders DROP TABLE order_items;
+DROP PUBLICATION pub_orders;
+
+-- Inspect
+SELECT * FROM pg_publication;
+SELECT * FROM pg_publication_tables;
+```
+
+The publisher must have `wal_level = logical`:
+
+```ini
+wal_level = logical
+max_replication_slots = 10
+max_wal_senders = 10
+```
+
+### Subscriptions
+
+On the subscriber database:
+
+```sql
+CREATE SUBSCRIPTION sub_orders
+    CONNECTION 'host=primary-host dbname=mydb user=replicator password=secret'
+    PUBLICATION pub_orders;
+```
+
+The subscriber creates a replication slot on the publisher automatically. The target tables must already exist with compatible schemas.
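+
+One way to copy the schema across before creating the subscription (hostnames are placeholders):
+
+```bash
+pg_dump --schema-only --table=orders --table=order_items mydb \
+  | psql --host=subscriber-host mydb
+```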
+
+```sql
+-- Disable/re-enable a subscription
+ALTER SUBSCRIPTION sub_orders DISABLE;
+ALTER SUBSCRIPTION sub_orders ENABLE;
+
+-- Refresh after publisher adds tables
+ALTER SUBSCRIPTION sub_orders REFRESH PUBLICATION;
+
+-- Skip copying initial data (for ongoing sync only)
+CREATE SUBSCRIPTION sub_orders
+    CONNECTION '...'
+    PUBLICATION pub_orders
+    WITH (copy_data = false);
+
+-- Drop subscription (also drops remote slot)
+DROP SUBSCRIPTION sub_orders;
+```
+
+Monitor subscriptions:
+
+```sql
+-- On subscriber
+SELECT * FROM pg_stat_subscription;
+
+-- On publisher - logical slots
+SELECT slot_name, active, confirmed_flush_lsn
+FROM pg_replication_slots
+WHERE slot_type = 'logical';
+```
+
+### Limitations of Logical Replication
+
+- DDL changes are not replicated. Schema changes must be applied manually to subscribers before altering the publisher.
+- Sequences are not replicated. After failover, reset sequences on the new primary.
+- Large objects (`pg_largeobject`) are not replicated.
+- Conflict resolution is basic: by default the apply worker stops with an error on unique constraint conflicts. Use `ALTER SUBSCRIPTION ... SKIP` (PG15+) to advance past the conflicting transaction's LSN.
+- Requires `REPLICA IDENTITY` on tables without primary keys:
+
+```sql
+-- Full row image (slow, safe for tables without PK)
+ALTER TABLE events REPLICA IDENTITY FULL;
+
+-- Use a unique index as identity
+ALTER TABLE events REPLICA IDENTITY USING INDEX events_uuid_idx;
+```
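+
+Skipping a conflicting transaction (PG15+; the LSN below is a placeholder -- take the real value from the conflict error in the subscriber's log):
+
+```sql
+ALTER SUBSCRIPTION sub_orders SKIP (lsn = '0/12345678');
+```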
+
+---
+
+## Failover
+
+### Promoting a Standby
+
+Trigger promotion using `pg_promote()` (PG12+, no file touch needed):
+
+```sql
+-- Connect to the standby and run:
+SELECT pg_promote();
+```
+
+Or use `pg_ctl`:
+
+```bash
+pg_ctl promote -D /var/lib/postgresql/data
+```
+
+After promotion, the former standby becomes a normal read-write primary. Update `primary_conninfo` on the remaining standbys to point at the new primary, then reload (PG13+) or restart them.
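+
+Repointing a remaining standby, as a sketch (hostname is a placeholder; on PG13+ `primary_conninfo` is reloadable, so no restart is needed):
+
+```sql
+ALTER SYSTEM SET primary_conninfo =
+    'host=new-primary port=5432 user=replicator application_name=replica2';
+SELECT pg_reload_conf();
+```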
+
+### Timeline Switches
+
+Every promotion increments the timeline ID. PostgreSQL uses timelines to track branching histories, allowing standbys to follow the correct WAL history.
+
+```sql
+-- Check current timeline on any server
+SELECT timeline_id FROM pg_control_checkpoint();
+
+-- View WAL segment filenames: first 8 hex chars = timeline
+-- 000000020000000000000001 = timeline 2, segment 1
+```
+
+When a former primary comes back, configure it as a new standby using `recovery_target_timeline = 'latest'` (the default), which lets it follow the new timeline.
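+
+A sketch of the returning node's standby configuration (host is a placeholder):
+
+```ini
+# postgresql.auto.conf on the former primary, now rejoining as a standby
+primary_conninfo = 'host=new-primary port=5432 user=replicator'
+recovery_target_timeline = 'latest'   # default: follow the new timeline
+```
+
+Also create `standby.signal` in the data directory. If the old primary accepted writes after the split, resynchronize with `pg_rewind` or a fresh `pg_basebackup` first.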
+
+### Connection Routing
+
+**HAProxy** (layer 4, health-check based):
+
+```
+frontend postgres_write
+    bind *:5432
+    default_backend postgres_primary
+
+backend postgres_primary
+    option httpchk GET /primary  # Patroni health endpoint
+    server pg1 192.168.1.1:5432 check port 8008
+    server pg2 192.168.1.2:5432 check port 8008
+
+backend postgres_replica
+    option httpchk GET /replica
+    server pg1 192.168.1.1:5432 check port 8008
+    server pg2 192.168.1.2:5432 check port 8008
+```
+
+**PgBouncer** target switch: update `[databases]` section and reload:
+
+```ini
+[databases]
+mydb = host=new-primary-ip port=5432 dbname=mydb
+```
+
+```bash
+psql -p 6432 pgbouncer -c "RELOAD"
+```
+
+**DNS-based**: Update the DNS record for `pg-primary.internal` to point to the new primary's IP. Works well with short TTLs (30s) and application-level retry logic.
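+
+libpq-based clients can also fail over on their own via a multi-host connection string (PG10+; hostnames are placeholders):
+
+```
+postgresql://app@pg1.internal:5432,pg2.internal:5432/mydb?target_session_attrs=read-write
+```
+
+The client tries hosts in order and keeps only the connection that accepts writes.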
+
+---
+
+## Table Partitioning
+
+Declarative partitioning (PG10+) uses `PARTITION BY` on the parent table. The parent table itself holds no rows.
+
+### RANGE Partitioning
+
+Most common for time-series and log data:
+
+```sql
+CREATE TABLE orders (
+    id          bigserial,
+    created_at  timestamptz NOT NULL,
+    customer_id bigint,
+    total       numeric(12,2)
+) PARTITION BY RANGE (created_at);
+
+-- Create partitions for each month
+CREATE TABLE orders_2024_01
+    PARTITION OF orders
+    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01');
+
+CREATE TABLE orders_2024_02
+    PARTITION OF orders
+    FOR VALUES FROM ('2024-02-01') TO ('2024-03-01');
+
+-- Catch-all default partition
+CREATE TABLE orders_default
+    PARTITION OF orders DEFAULT;
+```
+
+### LIST Partitioning
+
+Useful for discrete categorical values:
+
+```sql
+CREATE TABLE products (
+    id     bigserial,
+    region text NOT NULL,
+    name   text
+) PARTITION BY LIST (region);
+
+CREATE TABLE products_us   PARTITION OF products FOR VALUES IN ('us', 'ca');
+CREATE TABLE products_eu   PARTITION OF products FOR VALUES IN ('de', 'fr', 'uk');
+CREATE TABLE products_apac PARTITION OF products FOR VALUES IN ('au', 'jp', 'sg');
+CREATE TABLE products_other PARTITION OF products DEFAULT;
+```
+
+### HASH Partitioning
+
+Distributes rows evenly when there is no natural range or list split:
+
+```sql
+CREATE TABLE sessions (
+    id      uuid NOT NULL,
+    user_id bigint,
+    data    jsonb
+) PARTITION BY HASH (id);
+
+-- 8 partitions, modulus = total count, remainder = partition number
+CREATE TABLE sessions_0 PARTITION OF sessions FOR VALUES WITH (modulus 8, remainder 0);
+CREATE TABLE sessions_1 PARTITION OF sessions FOR VALUES WITH (modulus 8, remainder 1);
+-- ... through remainder 7
+```
+
+### Sub-partitioning
+
+Combine strategies: partition by month, then by region within each month:
+
+```sql
+CREATE TABLE events (
+    id         bigserial,
+    created_at timestamptz NOT NULL,
+    region     text NOT NULL
+) PARTITION BY RANGE (created_at);
+
+CREATE TABLE events_2024_01
+    PARTITION OF events
+    FOR VALUES FROM ('2024-01-01') TO ('2024-02-01')
+    PARTITION BY LIST (region);
+
+CREATE TABLE events_2024_01_us
+    PARTITION OF events_2024_01
+    FOR VALUES IN ('us');
+```
+
+### Partition Pruning
+
+The planner eliminates irrelevant partitions at plan time (static) or execution time (dynamic):
+
+```sql
+-- Enable/disable for debugging
+SET enable_partition_pruning = on;  -- default on
+
+EXPLAIN SELECT * FROM orders WHERE created_at >= '2024-06-01' AND created_at < '2024-07-01';
+-- Should show only orders_2024_06 in the plan, not all partitions
+```
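+
+Plan-time pruning requires the partition key to be compared against constants. When the value arrives as a parameter, pruning can still happen at execution time. A sketch using a prepared statement against the `orders` table above:
+
+```sql
+PREPARE recent_orders(timestamptz, timestamptz) AS
+SELECT * FROM orders WHERE created_at >= $1 AND created_at < $2;
+
+-- Once the generic plan is chosen, look for "Subplans Removed: N"
+-- in the output, which indicates execution-time pruning
+EXPLAIN (ANALYZE) EXECUTE recent_orders('2024-06-01', '2024-07-01');
+```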
+
+Each partition needs its own indexes. Since PG11, creating an index on the partitioned parent automatically creates a matching index on every existing partition and on any partition added later:
+
+```sql
+-- Create index on all existing partitions at once (PG11+ creates on parent + all children)
+CREATE INDEX ON orders (customer_id);
+```
+
+### Partition Maintenance
+
+```sql
+-- Add a new partition (no locking on existing data)
+CREATE TABLE orders_2025_01
+    PARTITION OF orders
+    FOR VALUES FROM ('2025-01-01') TO ('2025-02-01');
+
+-- Detach a partition (it becomes a standalone table, no data movement)
+ALTER TABLE orders DETACH PARTITION orders_2023_01;
+-- PG14+ alternative: non-blocking detach (must run outside a transaction block)
+ALTER TABLE orders DETACH PARTITION orders_2023_01 CONCURRENTLY;
+
+-- Drop old data instantly (no vacuum needed)
+DROP TABLE orders_2023_01;
+
+-- Attach an existing table as a partition (verify constraint first)
+ALTER TABLE orders_old ADD CONSTRAINT orders_old_check
+    CHECK (created_at >= '2022-01-01' AND created_at < '2023-01-01');
+ALTER TABLE orders ATTACH PARTITION orders_old
+    FOR VALUES FROM ('2022-01-01') TO ('2023-01-01');
+```
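+
+New partitions are usually created ahead of time by a scheduled job (cron, pg_cron, or a tool like pg_partman). A minimal sketch for the `orders` table above; the DO block and naming scheme are illustrative:
+
+```sql
+-- Create next month's partition if it does not exist yet
+DO $$
+DECLARE
+    part_start date := date_trunc('month', now() + interval '1 month');
+    part_stop  date := part_start + interval '1 month';
+    part_name  text := format('orders_%s', to_char(part_start, 'YYYY_MM'));
+BEGIN
+    EXECUTE format(
+        'CREATE TABLE IF NOT EXISTS %I PARTITION OF orders FOR VALUES FROM (%L) TO (%L)',
+        part_name, part_start, part_stop);
+END $$;
+```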
+
+### When to Partition
+
+Partition when:
+- Table exceeds ~100M rows or 100GB and queries frequently filter on the partition key
+- You need instant bulk deletes (drop a partition vs DELETE + VACUUM)
+- You want to spread data across tablespaces on different disks
+- Autovacuum cannot keep up with a single large table
+
+Do not partition just because a table is large. Partitioning adds overhead for queries that scan all partitions (no partition key filter). A well-indexed single table often outperforms a partitioned one for OLTP workloads.
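+
+When weighing these thresholds, `pg_partition_tree()` (PG12+) lists every partition so you can check where the data actually sits:
+
+```sql
+-- Leaf partitions of orders with their on-disk size
+SELECT relid AS partition,
+       pg_size_pretty(pg_total_relation_size(relid)) AS size
+FROM pg_partition_tree('orders')
+WHERE isleaf;
+```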
+
+---
+
+## Foreign Data Wrappers
+
+FDWs allow PostgreSQL to query external data sources as if they were local tables.
+
+### postgres_fdw Setup
+
+```sql
+-- 1. Install extension
+CREATE EXTENSION postgres_fdw;
+
+-- 2. Define the remote server
+CREATE SERVER remote_analytics
+    FOREIGN DATA WRAPPER postgres_fdw
+    OPTIONS (
+        host 'analytics-db.internal',
+        port '5432',
+        dbname 'analytics'
+    );
+
+-- 3. Map local user to remote credentials
+CREATE USER MAPPING FOR current_user
+    SERVER remote_analytics
+    OPTIONS (user 'readonly_user', password 'secret');
+
+-- 4. Create individual foreign tables
+CREATE FOREIGN TABLE remote_events (
+    id         bigint,
+    event_type text,
+    created_at timestamptz,
+    payload    jsonb
+)
+SERVER remote_analytics
+OPTIONS (schema_name 'public', table_name 'events');
+```
+
+### IMPORT FOREIGN SCHEMA
+
+Import all (or selected) tables from a remote schema at once:
+
+```sql
+-- Import entire remote schema
+IMPORT FOREIGN SCHEMA public
+    FROM SERVER remote_analytics
+    INTO local_remote_schema;
+
+-- Import only specific tables
+IMPORT FOREIGN SCHEMA public
+    LIMIT TO (events, pageviews, sessions)
+    FROM SERVER remote_analytics
+    INTO local_remote_schema;
+
+-- Exclude specific tables
+IMPORT FOREIGN SCHEMA public
+    EXCEPT (internal_audit_log)
+    FROM SERVER remote_analytics
+    INTO local_remote_schema;
+```
+
+### Performance and Pushdown
+
+postgres_fdw pushes WHERE clauses, ORDER BY, LIMIT, and aggregates to the remote server when possible, reducing data transfer.
+
+```sql
+-- Check what gets pushed down with EXPLAIN VERBOSE
+EXPLAIN (VERBOSE, ANALYZE)
+SELECT event_type, count(*)
+FROM remote_events
+WHERE created_at > now() - interval '7 days'
+GROUP BY event_type;
+-- Look for "Remote SQL:" in the output
+```
+
+Join pushdown (PG9.6+): joins between two foreign tables on the same server (and with the same user mapping) are pushed down as a single remote query:
+
+```sql
+-- Both tables on same server -> single remote query
+SELECT e.event_type, s.user_id
+FROM remote_events e
+JOIN remote_sessions s ON e.session_id = s.id
+WHERE e.created_at > now() - interval '1 day';
+```
+
+Control pushdown behavior per server:
+
+```sql
+ALTER SERVER remote_analytics OPTIONS (
+    use_remote_estimate 'true',   -- fetch remote row estimates for better plans
+    fetch_size '10000'             -- rows fetched per round-trip (default 100)
+);
+```
+
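+On PG14+, remote inserts can also be batched instead of sent one row per round-trip, via the `batch_size` option:
+
+```sql
+ALTER SERVER remote_analytics OPTIONS (ADD batch_size '100');
+-- Or per foreign table:
+ALTER FOREIGN TABLE remote_events OPTIONS (ADD batch_size '1000');
+```
+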
+Inspect all configured FDW objects:
+
+```sql
+SELECT srvname, srvfdw, srvoptions FROM pg_foreign_server;
+SELECT * FROM pg_user_mappings;
+SELECT foreign_table_schema, foreign_table_name, foreign_server_name
+FROM information_schema.foreign_tables;
+```

+ 731 - 0
skills/postgres-ops/references/schema-design.md

@@ -0,0 +1,731 @@
+# PostgreSQL Schema Design Reference
+
+## Table of Contents
+
+1. [Normalization Quick Guide](#normalization-quick-guide)
+2. [Data Types Deep Dive](#data-types-deep-dive)
+   - [JSONB](#jsonb)
+   - [Arrays](#arrays)
+   - [Range Types](#range-types)
+   - [Composite Types](#composite-types)
+   - [Domain Types](#domain-types)
+3. [Constraints](#constraints)
+4. [Generated Columns](#generated-columns)
+5. [Table Inheritance and Partitioning](#table-inheritance-and-partitioning)
+6. [Row-Level Security](#row-level-security)
+
+---
+
+## Normalization Quick Guide
+
+### 1NF - First Normal Form
+Each column holds atomic values; no repeating groups; each row uniquely identified.
+
+```sql
+-- Violates 1NF: phone_numbers is a comma-separated list
+CREATE TABLE contacts_bad (
+    id      integer PRIMARY KEY,
+    name    text,
+    phones  text   -- "555-1234, 555-5678"
+);
+
+-- 1NF compliant: one phone per row
+CREATE TABLE contacts (
+    id   integer PRIMARY KEY,
+    name text NOT NULL
+);
+
+CREATE TABLE contact_phones (
+    contact_id integer REFERENCES contacts(id),
+    phone      text NOT NULL,
+    PRIMARY KEY (contact_id, phone)
+);
+```
+
+### 2NF - Second Normal Form
+Must be 1NF. Every non-key column depends on the *entire* primary key (eliminates partial dependencies in composite-key tables).
+
+```sql
+-- Violates 2NF: product_name depends only on product_id, not the full key
+CREATE TABLE order_items_bad (
+    order_id     integer,
+    product_id   integer,
+    product_name text,    -- partial dependency
+    quantity     integer,
+    PRIMARY KEY (order_id, product_id)
+);
+
+-- 2NF compliant: move product_name to products table
+CREATE TABLE products (
+    id   integer PRIMARY KEY,
+    name text NOT NULL
+);
+
+CREATE TABLE order_items (
+    order_id   integer,
+    product_id integer REFERENCES products(id),
+    quantity   integer NOT NULL,
+    PRIMARY KEY (order_id, product_id)
+);
+```
+
+### 3NF - Third Normal Form
+Must be 2NF. No transitive dependencies (non-key columns depending on other non-key columns).
+
+```sql
+-- Violates 3NF: zip_code -> city, zip_code -> state (transitive)
+CREATE TABLE employees_bad (
+    id        integer PRIMARY KEY,
+    name      text,
+    zip_code  text,
+    city      text,   -- depends on zip_code, not id
+    state     text    -- depends on zip_code, not id
+);
+
+-- 3NF compliant
+CREATE TABLE zip_codes (
+    zip   text PRIMARY KEY,
+    city  text NOT NULL,
+    state text NOT NULL
+);
+
+CREATE TABLE employees (
+    id       integer PRIMARY KEY,
+    name     text NOT NULL,
+    zip_code text REFERENCES zip_codes(zip)
+);
+```
+
+### When to Denormalize
+
+Denormalization trades write complexity for read performance. Justify it with EXPLAIN ANALYZE evidence, not intuition.
+
+| Scenario | Denormalization Approach |
+|----------|--------------------------|
+| Frequent aggregate reads | Materialized view or stored summary column |
+| Immutable reference data | Embed directly (e.g., country name at order time) |
+| Hot join path with no writes | Redundant column with trigger to keep in sync |
+| Reporting / OLAP workload | Star schema, wide fact tables |
+
+```sql
+-- Example: store calculated total on order to avoid summing line items every read
+ALTER TABLE orders ADD COLUMN total_cents integer NOT NULL DEFAULT 0;
+
+-- Keep in sync via trigger
+CREATE FUNCTION recalc_order_total() RETURNS trigger LANGUAGE plpgsql AS $$
+BEGIN
+    UPDATE orders
+    SET total_cents = (
+        SELECT COALESCE(SUM(unit_price_cents * quantity), 0)
+        FROM order_items
+        WHERE order_id = COALESCE(NEW.order_id, OLD.order_id)
+    )
+    WHERE id = COALESCE(NEW.order_id, OLD.order_id);
+    RETURN NEW;
+END;
+$$;
+
+CREATE TRIGGER trg_order_items_total
+AFTER INSERT OR UPDATE OR DELETE ON order_items
+FOR EACH ROW EXECUTE FUNCTION recalc_order_total();
+```
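+
+When slightly stale totals are acceptable, a materialized view refreshed on a schedule is a lower-maintenance alternative to the trigger above:
+
+```sql
+CREATE MATERIALIZED VIEW order_totals AS
+SELECT order_id, COALESCE(SUM(unit_price_cents * quantity), 0) AS total_cents
+FROM order_items
+GROUP BY order_id;
+
+-- A unique index is required for CONCURRENTLY (readers are not blocked)
+CREATE UNIQUE INDEX ON order_totals (order_id);
+REFRESH MATERIALIZED VIEW CONCURRENTLY order_totals;
+```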
+
+---
+
+## Data Types Deep Dive
+
+### JSONB
+
+JSONB stores JSON as a binary decomposed format. Supports indexing; operators work directly on the stored value. Use `jsonb` over `json` unless you need to preserve key order or duplicate keys.
+
+#### Operators
+
+```sql
+-- @>  containment: does left contain right?
+SELECT * FROM products WHERE attributes @> '{"color": "red"}';
+
+-- ->  extract field as jsonb
+SELECT data -> 'address' FROM users;
+
+-- ->> extract field as text
+SELECT data ->> 'email' FROM users;
+
+-- #>  extract at path as jsonb
+SELECT data #> '{address, city}' FROM users;
+
+-- #>> extract at path as text
+SELECT data #>> '{address, city}' FROM users;
+
+-- jsonb_path_query (SQL/JSON path, PG12+)
+SELECT jsonb_path_query(data, '$.orders[*].amount ? (@ > 100)') FROM users;
+
+-- jsonb_path_exists
+SELECT * FROM users WHERE jsonb_path_exists(data, '$.tags[*] ? (@ == "premium")');
+
+-- Modifying JSONB
+UPDATE users SET data = data || '{"verified": true}';         -- merge/overwrite key
+UPDATE users SET data = data - 'temp_field';                   -- remove key
+UPDATE users SET data = jsonb_set(data, '{address,zip}', '"90210"');
+```
+
+#### Indexing JSONB
+
+```sql
+-- GIN default: supports @>, ?, ?|, ?& on all keys and values
+CREATE INDEX idx_products_attrs ON products USING gin(attributes);
+
+-- GIN jsonb_path_ops: supports only containment (@>, plus the @?/@@ path
+-- operators) but uses less space and is faster for those lookups
+CREATE INDEX idx_products_attrs_path ON products USING gin(attributes jsonb_path_ops);
+
+-- B-tree on extracted scalar: for equality/range on a known field
+CREATE INDEX idx_users_email ON users ((data ->> 'email'));
+
+-- B-tree on cast extracted value
+CREATE INDEX idx_orders_amount ON orders ((data ->> 'amount')::numeric);
+```
+
+#### When to Use JSONB vs Relational Columns
+
+| Use JSONB When | Use Relational Columns When |
+|----------------|----------------------------|
+| Schema varies per row (EAV alternative) | Column is queried in WHERE, JOIN, or ORDER BY frequently |
+| Optional metadata with sparse keys | Column participates in foreign key |
+| Storing external API payloads as-is | Strong type enforcement required |
+| Prototyping before schema stabilizes | Aggregate functions (SUM, AVG) on the field |
+
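+A middle ground, assuming a `users.data` jsonb column as in the examples above: keep the raw payload but promote a hot field into a typed, indexable generated column (PG12+):
+
+```sql
+ALTER TABLE users
+    ADD COLUMN email text GENERATED ALWAYS AS (data ->> 'email') STORED;
+
+CREATE UNIQUE INDEX idx_users_email_col ON users (email);
+```
+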
+---
+
+### Arrays
+
+PostgreSQL native arrays allow storing multiple values of the same type in a single column.
+
+```sql
+CREATE TABLE articles (
+    id   integer PRIMARY KEY,
+    tags text[]
+);
+
+INSERT INTO articles (id, tags) VALUES (1, ARRAY['postgres', 'sql', 'performance']);
+INSERT INTO articles (id, tags) VALUES (2, '{"nosql","databases"}');  -- literal syntax
+```
+
+#### Operators
+
+```sql
+-- ANY: value matches any element
+SELECT * FROM articles WHERE 'postgres' = ANY(tags);
+
+-- ALL: condition holds for every element
+SELECT * FROM articles WHERE 5 > ALL(ARRAY[1,2,3,4]);
+
+-- @>  contains (left contains right)
+SELECT * FROM articles WHERE tags @> ARRAY['sql', 'postgres'];
+
+-- <@  is contained by
+SELECT * FROM articles WHERE ARRAY['sql'] <@ tags;
+
+-- &&  overlap (share at least one element)
+SELECT * FROM articles WHERE tags && ARRAY['postgres', 'mysql'];
+
+-- Appending / removing
+UPDATE articles SET tags = tags || ARRAY['new-tag'] WHERE id = 1;
+UPDATE articles SET tags = array_remove(tags, 'old-tag') WHERE id = 1;
+
+-- Array length and access
+SELECT array_length(tags, 1), tags[1] FROM articles;  -- 1-indexed
+```
+
+#### Indexing Arrays
+
+```sql
+-- GIN index for @>, <@, &&, ANY equality
+CREATE INDEX idx_articles_tags ON articles USING gin(tags);
+```
+
+#### Arrays vs Junction Tables
+
+| Use Arrays When | Use Junction Tables When |
+|-----------------|--------------------------|
+| List is small and bounded | Elements have their own attributes |
+| No referential integrity needed | Many-to-many with query filters on the joined entity |
+| Queries use containment/overlap operators | Need to query "all articles for a tag" efficiently |
+| Ordering within the list matters | Cardinality is high or unbounded |
+
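+To aggregate across elements (e.g. tag counts over the `articles` table above), `unnest()` expands the array into rows:
+
+```sql
+SELECT t.tag, count(*) AS articles
+FROM articles, unnest(tags) AS t(tag)
+GROUP BY t.tag
+ORDER BY count(*) DESC;
+```
+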
+---
+
+### Range Types
+
+Range types represent a range of values of a base type. Built-in range types: `int4range`, `int8range`, `numrange`, `tsrange`, `tstzrange`, `daterange`.
+
+```sql
+CREATE TABLE room_bookings (
+    id          serial PRIMARY KEY,
+    room_id     integer NOT NULL,
+    booked_at   tsrange NOT NULL
+);
+
+INSERT INTO room_bookings (room_id, booked_at) VALUES
+    (1, '[2024-03-01 09:00, 2024-03-01 11:00)'),  -- inclusive start, exclusive end
+    (1, '[2024-03-01 14:00, 2024-03-01 16:00)');
+```
+
+#### Operators
+
+```sql
+-- && overlap
+SELECT * FROM room_bookings WHERE booked_at && '[2024-03-01 10:00, 2024-03-01 12:00)';
+
+-- @> contains a point (tsrange holds timestamp, so cast to timestamp)
+SELECT * FROM room_bookings WHERE booked_at @> '2024-03-01 10:30'::timestamp;
+
+-- <@ is contained by
+SELECT * FROM room_bookings WHERE booked_at <@ '[2024-03-01 00:00, 2024-03-02 00:00)';
+
+-- Boundary extraction
+SELECT lower(booked_at), upper(booked_at) FROM room_bookings;
+
+-- Adjacency
+SELECT * FROM schedules WHERE period1 -|- period2;  -- ranges are adjacent
+
+-- daterange example
+SELECT * FROM subscriptions
+WHERE validity @> CURRENT_DATE;
+```
+
+#### Exclusion Constraints (prevent overlaps)
+
+```sql
+-- Requires btree_gist extension for non-geometric types
+CREATE EXTENSION IF NOT EXISTS btree_gist;
+
+ALTER TABLE room_bookings
+ADD CONSTRAINT no_double_booking
+EXCLUDE USING gist (room_id WITH =, booked_at WITH &&);
+
+-- Multi-column exclusion with additional equality condition
+ALTER TABLE room_bookings
+ADD CONSTRAINT no_double_booking_per_tenant
+EXCLUDE USING gist (tenant_id WITH =, room_id WITH =, booked_at WITH &&);
+```
+
+#### Custom Range Types
+
+```sql
+CREATE TYPE floatrange AS RANGE (subtype = float8, subtype_diff = float8mi);
+
+SELECT '[1.5, 2.5]'::floatrange @> 2.0;  -- true
+```
+
+---
+
+### Composite Types
+
+Composite types group multiple fields into a single reusable type.
+
+```sql
+-- Define a composite type
+CREATE TYPE address AS (
+    street  text,
+    city    text,
+    state   text,
+    zip     text
+);
+
+-- Use in a table
+CREATE TABLE customers (
+    id              serial PRIMARY KEY,
+    name            text NOT NULL,
+    billing_address address,
+    shipping_address address
+);
+
+-- Insert and access
+INSERT INTO customers (name, billing_address)
+VALUES ('Acme Corp', ROW('123 Main St', 'Springfield', 'IL', '62701'));
+
+SELECT (billing_address).city FROM customers;
+SELECT * FROM customers WHERE (billing_address).state = 'IL';
+
+-- Update a field within composite
+UPDATE customers
+SET billing_address.zip = '62702'
+WHERE id = 1;
+```
+
+Composite types are also implicitly created for every table and are used as the row type in PL/pgSQL functions.
+
+---
+
+### Domain Types
+
+Domains are named data types with optional constraints, providing centralized validation logic.
+
+```sql
+-- Email domain with CHECK constraint
+CREATE DOMAIN email_address AS text
+CHECK (VALUE ~ '^[a-zA-Z0-9._%+\-]+@[a-zA-Z0-9.\-]+\.[a-zA-Z]{2,}$');
+
+-- Non-negative money (in cents)
+CREATE DOMAIN positive_cents AS integer
+CHECK (VALUE > 0);
+
+-- Non-empty text
+CREATE DOMAIN nonempty_text AS text
+NOT NULL
+CHECK (VALUE <> '');
+
+-- Use domains in tables
+CREATE TABLE invoices (
+    id            serial PRIMARY KEY,
+    customer_email email_address NOT NULL,
+    amount_cents   positive_cents NOT NULL,
+    description    nonempty_text
+);
+
+-- Domain constraints can be changed in one place, without touching the tables
+-- that use the domain. Drop the old check first, or both still apply:
+ALTER DOMAIN positive_cents DROP CONSTRAINT positive_cents_check;  -- auto-generated name
+ALTER DOMAIN positive_cents ADD CONSTRAINT allow_zero CHECK (VALUE >= 0);
+```
+
+---
+
+## Constraints
+
+### CHECK Constraints
+
+```sql
+-- Column-level
+CREATE TABLE products (
+    id         serial PRIMARY KEY,
+    price      numeric CHECK (price >= 0),
+    status     text CHECK (status IN ('active', 'inactive', 'archived'))
+);
+
+-- Table-level (can reference multiple columns)
+CREATE TABLE discounts (
+    id              serial PRIMARY KEY,
+    discount_pct    numeric,
+    discount_flat   numeric,
+    CONSTRAINT one_discount_type CHECK (
+        (discount_pct IS NULL) != (discount_flat IS NULL)
+    )
+);
+
+-- Named constraint for clearer error messages
+ALTER TABLE orders ADD CONSTRAINT chk_positive_total
+CHECK (total_cents > 0);
+```
+
+### UNIQUE Constraints
+
+```sql
+-- Single column
+CREATE TABLE users (
+    id    serial PRIMARY KEY,
+    email text UNIQUE NOT NULL
+);
+
+-- Composite unique
+CREATE TABLE team_members (
+    team_id integer,
+    user_id integer,
+    UNIQUE (team_id, user_id)
+);
+
+-- Partial unique (unique only within a condition)
+CREATE UNIQUE INDEX idx_users_active_email
+ON users (email) WHERE deleted_at IS NULL;
+```
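+
+By default, a UNIQUE constraint treats NULLs as distinct, so duplicate rows that contain NULL slip through. PG15 can close that gap; an illustrative table:
+
+```sql
+-- PG15+: two rows with (1, NULL) now conflict
+CREATE TABLE api_tokens (
+    user_id integer NOT NULL,
+    label   text,
+    UNIQUE NULLS NOT DISTINCT (user_id, label)
+);
+```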
+
+### EXCLUDE Constraints
+
+Exclusion constraints generalize UNIQUE by allowing any operator, not just equality. They are backed by an index, typically GiST.
+
+```sql
+-- No two bookings for the same room may overlap
+CREATE EXTENSION btree_gist;
+
+CREATE TABLE bookings (
+    id      serial PRIMARY KEY,
+    room    text,
+    during  tsrange,
+    EXCLUDE USING gist (room WITH =, during WITH &&)
+);
+```
+
+### Foreign Key Options
+
+```sql
+CREATE TABLE orders (
+    id          serial PRIMARY KEY,
+    customer_id integer,
+
+    -- ON DELETE options:
+    -- CASCADE     - delete order when customer deleted
+    -- SET NULL    - set customer_id to NULL
+    -- SET DEFAULT - set to column default
+    -- RESTRICT    - error immediately if customer has orders
+    -- NO ACTION   - like RESTRICT but can be deferred (the default)
+
+    CONSTRAINT fk_orders_customer
+        FOREIGN KEY (customer_id)
+        REFERENCES customers(id)
+        ON DELETE RESTRICT
+        ON UPDATE CASCADE
+);
+```
+
+### Deferrable Constraints
+
+Deferrable constraints are checked at transaction commit instead of statement time, enabling circular references and bulk data loading.
+
+```sql
+-- Define as deferrable
+ALTER TABLE employees ADD CONSTRAINT fk_manager
+FOREIGN KEY (manager_id) REFERENCES employees(id)
+DEFERRABLE INITIALLY DEFERRED;
+
+-- Or defer within a transaction
+BEGIN;
+SET CONSTRAINTS fk_manager DEFERRED;
+-- Insert records that temporarily violate the constraint
+INSERT INTO employees (id, manager_id, name) VALUES (1, 2, 'Alice');
+INSERT INTO employees (id, manager_id, name) VALUES (2, 1, 'Bob');
+COMMIT;  -- constraint checked here, both records now exist
+```
+
+---
+
+## Generated Columns
+
+Generated columns compute their value automatically from other columns. PG12+ supports STORED (persisted to disk); VIRTUAL columns (computed on read, not stored) arrive in PG18.
+
+```sql
+-- STORED generated column
+CREATE TABLE measurements (
+    id            serial PRIMARY KEY,
+    value_celsius numeric NOT NULL,
+    -- Automatically computed and stored
+    value_fahrenheit numeric GENERATED ALWAYS AS (value_celsius * 9/5 + 32) STORED
+);
+
+INSERT INTO measurements (value_celsius) VALUES (100);
+SELECT value_celsius, value_fahrenheit FROM measurements;
+-- Returns: 100, 212
+
+-- Full name from parts
+CREATE TABLE persons (
+    id         serial PRIMARY KEY,
+    first_name text NOT NULL,
+    last_name  text NOT NULL,
+    full_name  text GENERATED ALWAYS AS (first_name || ' ' || last_name) STORED
+);
+
+-- Searchable slug from title
+CREATE TABLE posts (
+    id    serial PRIMARY KEY,
+    title text NOT NULL,
+    slug  text GENERATED ALWAYS AS (
+        lower(regexp_replace(trim(title), '[^a-zA-Z0-9]+', '-', 'g'))
+    ) STORED
+);
+
+CREATE INDEX idx_posts_slug ON posts(slug);
+```
+
+Restrictions: generation expression cannot reference other generated columns, user-defined functions must be IMMUTABLE, cannot have a DEFAULT, cannot be written to directly.
+
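+A common application is a stored `tsvector` for full-text search, here on the `posts` table above (the explicit 'english' configuration keeps the expression immutable, as required):
+
+```sql
+ALTER TABLE posts ADD COLUMN search_vec tsvector
+    GENERATED ALWAYS AS (to_tsvector('english', coalesce(title, ''))) STORED;
+
+CREATE INDEX idx_posts_search ON posts USING gin(search_vec);
+```
+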
+---
+
+## Table Inheritance and Partitioning
+
+### Traditional Inheritance (pre-PG10)
+
+```sql
+CREATE TABLE events (
+    id         bigserial PRIMARY KEY,
+    occurred_at timestamptz NOT NULL,
+    payload    jsonb
+);
+
+CREATE TABLE click_events (
+    element_id text NOT NULL
+) INHERITS (events);
+
+-- Queries on parent include child rows
+SELECT count(*) FROM events;  -- includes click_events rows
+SELECT count(*) FROM ONLY events;  -- excludes child tables
+```
+
+Traditional inheritance is largely superseded by declarative partitioning; it remains useful mainly when child tables need extra columns, which declarative partitions do not allow.
+
+### Declarative Partitioning (PG10+)
+
+#### Range Partitioning
+
+```sql
+CREATE TABLE events (
+    id          bigint NOT NULL,
+    occurred_at timestamptz NOT NULL,
+    payload     jsonb
+) PARTITION BY RANGE (occurred_at);
+
+CREATE TABLE events_2024_q1 PARTITION OF events
+FOR VALUES FROM ('2024-01-01') TO ('2024-04-01');
+
+CREATE TABLE events_2024_q2 PARTITION OF events
+FOR VALUES FROM ('2024-04-01') TO ('2024-07-01');
+
+-- Default partition catches unmatched rows
+CREATE TABLE events_default PARTITION OF events DEFAULT;
+
+-- Index on partition key (propagates to all partitions)
+CREATE INDEX ON events (occurred_at);
+```
+
+#### List Partitioning
+
+```sql
+CREATE TABLE orders (
+    id      bigint NOT NULL,
+    region  text NOT NULL,
+    total   numeric
+) PARTITION BY LIST (region);
+
+CREATE TABLE orders_us PARTITION OF orders FOR VALUES IN ('US', 'CA');
+CREATE TABLE orders_eu PARTITION OF orders FOR VALUES IN ('DE', 'FR', 'GB');
+CREATE TABLE orders_other PARTITION OF orders DEFAULT;
+```
+
+#### Hash Partitioning
+
+```sql
+CREATE TABLE user_events (
+    user_id bigint NOT NULL,
+    event   text
+) PARTITION BY HASH (user_id);
+
+CREATE TABLE user_events_0 PARTITION OF user_events FOR VALUES WITH (MODULUS 4, REMAINDER 0);
+CREATE TABLE user_events_1 PARTITION OF user_events FOR VALUES WITH (MODULUS 4, REMAINDER 1);
+CREATE TABLE user_events_2 PARTITION OF user_events FOR VALUES WITH (MODULUS 4, REMAINDER 2);
+CREATE TABLE user_events_3 PARTITION OF user_events FOR VALUES WITH (MODULUS 4, REMAINDER 3);
+```
+
+#### Sub-partitioning
+
+```sql
+CREATE TABLE metrics (
+    tenant_id integer NOT NULL,
+    recorded_at date NOT NULL,
+    value numeric
+) PARTITION BY LIST (tenant_id);
+
+CREATE TABLE metrics_tenant1 PARTITION OF metrics
+FOR VALUES IN (1) PARTITION BY RANGE (recorded_at);
+
+CREATE TABLE metrics_tenant1_2024 PARTITION OF metrics_tenant1
+FOR VALUES FROM ('2024-01-01') TO ('2025-01-01');
+```
+
+---
+
+## Row-Level Security
+
+RLS restricts which rows a user can see or modify. Enabled per table; policies define the filter predicate.
+
+### Enabling RLS
+
+```sql
+ALTER TABLE documents ENABLE ROW LEVEL SECURITY;
+
+-- Without this, the table owner bypasses all policies!
+ALTER TABLE documents FORCE ROW LEVEL SECURITY;
+```
+
+### Policy Types
+
+```sql
+-- PERMISSIVE (default): policies are OR'd together; user sees rows matching ANY policy
+-- RESTRICTIVE: policies are AND'd; user must match ALL restrictive policies
+
+-- Allow users to see only their own rows
+CREATE POLICY user_isolation ON documents
+AS PERMISSIVE
+FOR ALL
+TO application_role
+USING (owner_id = current_setting('app.user_id')::integer);
+
+-- Separate read and write policies
+CREATE POLICY documents_select ON documents
+FOR SELECT
+TO application_role
+USING (owner_id = current_setting('app.user_id')::integer OR is_public = true);
+
+CREATE POLICY documents_insert ON documents
+FOR INSERT
+TO application_role
+WITH CHECK (owner_id = current_setting('app.user_id')::integer);
+
+CREATE POLICY documents_update ON documents
+FOR UPDATE
+TO application_role
+USING (owner_id = current_setting('app.user_id')::integer)
+WITH CHECK (owner_id = current_setting('app.user_id')::integer);
+
+CREATE POLICY documents_delete ON documents
+FOR DELETE
+TO application_role
+USING (owner_id = current_setting('app.user_id')::integer);
+```
+
+### Multi-Tenant Pattern
+
+```sql
+-- Set tenant context at session start (via connection pooler or app middleware)
+SET app.tenant_id = '42';
+
+-- RLS policy using session variable
+CREATE POLICY tenant_isolation ON orders
+USING (tenant_id = current_setting('app.tenant_id')::integer);
+
+-- Superuser bypass: use a dedicated non-superuser role for the app
+CREATE ROLE app_user NOLOGIN;
+GRANT SELECT, INSERT, UPDATE, DELETE ON orders TO app_user;
+
+-- Service role that bypasses RLS (for admin tasks)
+CREATE ROLE service_role BYPASSRLS LOGIN;
+```
+
+### RESTRICTIVE Policies
+
+```sql
+-- Combine PERMISSIVE (what user owns) AND RESTRICTIVE (not deleted)
+CREATE POLICY only_active ON documents
+AS RESTRICTIVE
+FOR ALL
+USING (deleted_at IS NULL);
+
+CREATE POLICY owner_access ON documents
+AS PERMISSIVE
+FOR ALL
+USING (owner_id = current_setting('app.user_id')::integer);
+
+-- Result: user sees rows where deleted_at IS NULL AND owner_id matches
+```
+
+### Common Pitfalls
+
+| Pitfall | Fix |
+|---------|-----|
+| Table owner bypasses RLS silently | Add `FORCE ROW LEVEL SECURITY` to the table |
+| No policy defined means no rows visible | Always define at least one PERMISSIVE policy per operation |
+| Superuser always bypasses RLS | Use a non-superuser application role |
+| `current_user` vs session variable | Use `current_setting()` for app-set context; `current_user` reflects DB login role |
+| Performance: predicate not pushed down | Create index on the tenant/owner column used in policy USING clause |
+
+```sql
+-- Verify your policies are working
+SET ROLE app_user;
+SET app.user_id = '1';
+SELECT count(*) FROM documents;  -- should only return user 1's documents
+RESET ROLE;
+```
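+
+To audit what is actually in force, the `pg_policies` view lists every policy with its predicate:
+
+```sql
+SELECT schemaname, tablename, policyname, permissive, roles, cmd, qual, with_check
+FROM pg_policies
+ORDER BY tablename, policyname;
+```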

skills/python-database-patterns/scripts/.gitkeep → skills/postgres-ops/scripts/.gitkeep


+ 7 - 7
skills/python-async-patterns/SKILL.md

@@ -1,10 +1,10 @@
 ---
-name: python-async-patterns
+name: python-async-ops
 description: "Python asyncio patterns for concurrent programming. Triggers on: asyncio, async, await, coroutine, gather, semaphore, TaskGroup, event loop, aiohttp, concurrent."
 compatibility: "Python 3.10+ recommended. Some patterns require 3.11+ (TaskGroup, timeout)."
 allowed-tools: "Read Write"
-depends-on: [python-typing-patterns]
-related-skills: [python-fastapi-patterns, python-observability-patterns]
+depends-on: [python-typing-ops]
+related-skills: [python-fastapi-ops, python-observability-ops]
 ---
 
 # Python Async Patterns
@@ -152,9 +152,9 @@ For detailed patterns, load:
 ## See Also
 
 **Prerequisites:**
-- `python-typing-patterns` - Type hints for async functions
+- `python-typing-ops` - Type hints for async functions
 
 **Related Skills:**
-- `python-fastapi-patterns` - Async web APIs
-- `python-observability-patterns` - Async logging and tracing
-- `python-database-patterns` - Async database access
+- `python-fastapi-ops` - Async web APIs
+- `python-observability-ops` - Async logging and tracing
+- `python-database-ops` - Async database access

skills/python-async-patterns/assets/async-project-template.py → skills/python-async-ops/assets/async-project-template.py


skills/python-async-patterns/references/aiohttp-patterns.md → skills/python-async-ops/references/aiohttp-patterns.md


skills/python-async-patterns/references/concurrency-patterns.md → skills/python-async-ops/references/concurrency-patterns.md


skills/python-async-patterns/references/debugging-async.md → skills/python-async-ops/references/debugging-async.md


skills/python-async-patterns/references/error-handling.md → skills/python-async-ops/references/error-handling.md


skills/python-async-patterns/references/mixing-sync-async.md → skills/python-async-ops/references/mixing-sync-async.md


skills/python-async-patterns/references/performance.md → skills/python-async-ops/references/performance.md


skills/python-async-patterns/references/production-patterns.md → skills/python-async-ops/references/production-patterns.md


skills/python-async-patterns/scripts/find-blocking-calls.sh → skills/python-async-ops/scripts/find-blocking-calls.sh


+ 4 - 4
skills/python-cli-patterns/SKILL.md

@@ -1,10 +1,10 @@
 ---
-name: python-cli-patterns
+name: python-cli-ops
 description: "CLI application patterns for Python. Triggers on: cli, command line, typer, click, argparse, terminal, rich, console, terminal ui."
 compatibility: "Python 3.10+. Requires typer and rich for modern CLI development."
 allowed-tools: "Read Write Bash"
 depends-on: []
-related-skills: [python-typing-patterns, python-observability-patterns]
+related-skills: [python-typing-ops, python-observability-ops]
 ---
 
 # Python CLI Patterns
@@ -164,8 +164,8 @@ def process(file: str):
 ## See Also
 
 **Related Skills:**
-- `python-typing-patterns` - Type hints for CLI arguments
-- `python-observability-patterns` - Logging for CLI applications
+- `python-typing-ops` - Type hints for CLI arguments
+- `python-observability-ops` - Logging for CLI applications
 
 **Complementary Skills:**
 - `python-env` - Package CLI for distribution

skills/python-cli-patterns/assets/cli-template.py → skills/python-cli-ops/assets/cli-template.py


skills/python-cli-patterns/references/configuration.md → skills/python-cli-ops/references/configuration.md


skills/python-cli-patterns/references/rich-output.md → skills/python-cli-ops/references/rich-output.md


skills/python-cli-patterns/references/typer-patterns.md → skills/python-cli-ops/references/typer-patterns.md


skills/python-observability-patterns/scripts/.gitkeep → skills/python-cli-ops/scripts/.gitkeep


+ 7 - 7
skills/python-database-patterns/SKILL.md

@@ -1,10 +1,10 @@
 ---
-name: python-database-patterns
+name: python-database-ops
 description: "SQLAlchemy and database patterns for Python. Triggers on: sqlalchemy, database, orm, migration, alembic, async database, connection pool, repository pattern, unit of work."
 compatibility: "SQLAlchemy 2.0+, Python 3.10+. Async requires asyncpg (PostgreSQL) or aiosqlite."
 allowed-tools: "Read Write Bash"
-depends-on: [python-typing-patterns, python-async-patterns]
-related-skills: [python-fastapi-patterns]
+depends-on: [python-typing-ops, python-async-ops]
+related-skills: [python-fastapi-ops, postgres-ops]
 ---
 
 # Python Database Patterns
@@ -176,9 +176,9 @@ async def get_user(user_id: int, db: DB):
 ## See Also
 
 **Prerequisites:**
-- `python-typing-patterns` - Mapped types and annotations
-- `python-async-patterns` - Async database sessions
+- `python-typing-ops` - Mapped types and annotations
+- `python-async-ops` - Async database sessions
 
 **Related Skills:**
-- `python-fastapi-patterns` - Dependency injection for DB sessions
-- `python-pytest-patterns` - Database fixtures and testing
+- `python-fastapi-ops` - Dependency injection for DB sessions
+- `python-pytest-ops` - Database fixtures and testing

skills/python-database-patterns/assets/alembic.ini.template → skills/python-database-ops/assets/alembic.ini.template


skills/python-database-patterns/references/connection-pooling.md → skills/python-database-ops/references/connection-pooling.md


skills/python-database-patterns/references/migrations.md → skills/python-database-ops/references/migrations.md


skills/python-database-patterns/references/sqlalchemy-async.md → skills/python-database-ops/references/sqlalchemy-async.md


skills/python-database-patterns/references/transactions.md → skills/python-database-ops/references/transactions.md


skills/rest-patterns/assets/.gitkeep → skills/python-database-ops/scripts/.gitkeep


+ 3 - 3
skills/python-env/SKILL.md

@@ -114,6 +114,6 @@ For detailed patterns, load:
 This is a **foundation skill** with no prerequisites.
 
 **Build on this skill:**
-- `python-typing-patterns` - Type hints for projects
-- `python-pytest-patterns` - Testing infrastructure
-- `python-fastapi-patterns` - Web API development
+- `python-typing-ops` - Type hints for projects
+- `python-pytest-ops` - Testing infrastructure
+- `python-fastapi-ops` - Web API development

+ 8 - 8
skills/python-fastapi-patterns/SKILL.md

@@ -1,10 +1,10 @@
 ---
-name: python-fastapi-patterns
+name: python-fastapi-ops
 description: "FastAPI web framework patterns. Triggers on: fastapi, api endpoint, dependency injection, pydantic model, openapi, swagger, starlette, async api, rest api, uvicorn."
 compatibility: "FastAPI 0.100+, Pydantic v2, Python 3.10+. Requires uvicorn for production."
 allowed-tools: "Read Write Bash"
-depends-on: [python-typing-patterns, python-async-patterns]
-related-skills: [python-database-patterns, python-observability-patterns, python-pytest-patterns]
+depends-on: [python-typing-ops, python-async-ops]
+related-skills: [python-database-ops, python-observability-ops, python-pytest-ops]
 ---
 
 # FastAPI Patterns
@@ -197,10 +197,10 @@ app.include_router(items.router, prefix="/api/v1")
 ## See Also
 
 **Prerequisites:**
-- `python-typing-patterns` - Pydantic models and type hints
-- `python-async-patterns` - Async endpoint patterns
+- `python-typing-ops` - Pydantic models and type hints
+- `python-async-ops` - Async endpoint patterns
 
 **Related Skills:**
-- `python-database-patterns` - SQLAlchemy integration
-- `python-observability-patterns` - Logging, metrics, tracing middleware
-- `python-pytest-patterns` - API testing with TestClient
+- `python-database-ops` - SQLAlchemy integration
+- `python-observability-ops` - Logging, metrics, tracing middleware
+- `python-pytest-ops` - API testing with TestClient

skills/python-fastapi-patterns/assets/fastapi-template.py → skills/python-fastapi-ops/assets/fastapi-template.py


skills/python-fastapi-patterns/references/background-tasks.md → skills/python-fastapi-ops/references/background-tasks.md


skills/python-fastapi-patterns/references/dependency-injection.md → skills/python-fastapi-ops/references/dependency-injection.md


skills/python-fastapi-patterns/references/middleware-patterns.md → skills/python-fastapi-ops/references/middleware-patterns.md


skills/python-fastapi-patterns/references/validation-serialization.md → skills/python-fastapi-ops/references/validation-serialization.md


skills/python-fastapi-patterns/scripts/scaffold-api.sh → skills/python-fastapi-ops/scripts/scaffold-api.sh


+ 7 - 7
skills/python-observability-patterns/SKILL.md

@@ -1,10 +1,10 @@
 ---
-name: python-observability-patterns
+name: python-observability-ops
 description: "Observability patterns for Python applications. Triggers on: logging, metrics, tracing, opentelemetry, prometheus, observability, monitoring, structlog, correlation id."
 compatibility: "Python 3.10+. Requires structlog, opentelemetry-api, prometheus-client."
 allowed-tools: "Read Write"
-depends-on: [python-async-patterns]
-related-skills: [python-fastapi-patterns, python-cli-patterns]
+depends-on: [python-async-ops]
+related-skills: [python-fastapi-ops, python-cli-ops]
 ---
 
 # Python Observability Patterns
@@ -176,11 +176,11 @@ async def process_order(order_id: int):
 ## See Also
 
 **Prerequisites:**
-- `python-async-patterns` - Async context propagation
+- `python-async-ops` - Async context propagation
 
 **Related Skills:**
-- `python-fastapi-patterns` - API middleware for metrics/tracing
-- `python-cli-patterns` - CLI logging patterns
+- `python-fastapi-ops` - API middleware for metrics/tracing
+- `python-cli-ops` - CLI logging patterns
 
 **Integration Skills:**
-- `python-database-patterns` - Database query tracing
+- `python-database-ops` - Database query tracing

skills/python-observability-patterns/assets/logging-config.py → skills/python-observability-ops/assets/logging-config.py


skills/python-observability-patterns/references/metrics.md → skills/python-observability-ops/references/metrics.md


skills/python-observability-patterns/references/structured-logging.md → skills/python-observability-ops/references/structured-logging.md


skills/python-observability-patterns/references/tracing.md → skills/python-observability-ops/references/tracing.md


skills/rest-patterns/scripts/.gitkeep → skills/python-observability-ops/scripts/.gitkeep


+ 6 - 6
skills/python-pytest-patterns/SKILL.md

@@ -1,10 +1,10 @@
 ---
-name: python-pytest-patterns
+name: python-pytest-ops
 description: "pytest testing patterns for Python. Triggers on: pytest, fixture, mark, parametrize, mock, conftest, test coverage, unit test, integration test, pytest.raises."
 compatibility: "pytest 7.0+, Python 3.9+. Some features require pytest-asyncio, pytest-mock, pytest-cov."
 allowed-tools: "Read Write Bash"
 depends-on: []
-related-skills: [python-typing-patterns, python-async-patterns]
+related-skills: [python-typing-ops, python-async-ops]
 ---
 
 # Python pytest Patterns
@@ -193,9 +193,9 @@ def client(app):
 ## See Also
 
 **Related Skills:**
-- `python-typing-patterns` - Type-safe test code
-- `python-async-patterns` - Async test patterns (pytest-asyncio)
+- `python-typing-ops` - Type-safe test code
+- `python-async-ops` - Async test patterns (pytest-asyncio)
 
 **Testing specific frameworks:**
-- `python-fastapi-patterns` - TestClient, API testing
-- `python-database-patterns` - Database fixtures, transactions
+- `python-fastapi-ops` - TestClient, API testing
+- `python-database-ops` - Database fixtures, transactions

skills/python-pytest-patterns/assets/conftest.py.template → skills/python-pytest-ops/assets/conftest.py.template


skills/python-pytest-patterns/assets/pytest.ini.template → skills/python-pytest-ops/assets/pytest.ini.template


skills/python-pytest-patterns/references/async-testing.md → skills/python-pytest-ops/references/async-testing.md


skills/python-pytest-patterns/references/coverage-strategies.md → skills/python-pytest-ops/references/coverage-strategies.md


skills/python-pytest-patterns/references/fixtures-advanced.md → skills/python-pytest-ops/references/fixtures-advanced.md


skills/python-pytest-patterns/references/integration-testing.md → skills/python-pytest-ops/references/integration-testing.md


skills/python-pytest-patterns/references/mocking-patterns.md → skills/python-pytest-ops/references/mocking-patterns.md


skills/python-pytest-patterns/references/property-testing.md → skills/python-pytest-ops/references/property-testing.md


skills/python-pytest-patterns/references/test-architecture.md → skills/python-pytest-ops/references/test-architecture.md


skills/python-pytest-patterns/scripts/generate-conftest.sh → skills/python-pytest-ops/scripts/generate-conftest.sh


skills/python-pytest-patterns/scripts/run-tests.sh → skills/python-pytest-ops/scripts/run-tests.sh


+ 6 - 6
skills/python-typing-patterns/SKILL.md

@@ -1,10 +1,10 @@
 ---
-name: python-typing-patterns
+name: python-typing-ops
 description: "Python type hints and type safety patterns. Triggers on: type hints, typing, TypeVar, Generic, Protocol, mypy, pyright, type annotation, overload, TypedDict."
 compatibility: "Python 3.10+ (uses union syntax X | Y). Some patterns require 3.11+ (Self, TypeVarTuple)."
 allowed-tools: "Read Write"
 depends-on: []
-related-skills: [python-pytest-patterns]
+related-skills: [python-pytest-ops]
 ---
 
 # Python Typing Patterns
@@ -224,9 +224,9 @@ python_version = "3.11"
 This is a **foundation skill** with no prerequisites.
 
 **Related Skills:**
-- `python-pytest-patterns` - Type-safe fixtures and mocking
+- `python-pytest-ops` - Type-safe fixtures and mocking
 
 **Build on this skill:**
-- `python-async-patterns` - Async type annotations
-- `python-fastapi-patterns` - Pydantic models and validation
-- `python-database-patterns` - SQLAlchemy type annotations
+- `python-async-ops` - Async type annotations
+- `python-fastapi-ops` - Pydantic models and validation
+- `python-database-ops` - SQLAlchemy type annotations

skills/python-typing-patterns/assets/pyproject-typing.toml → skills/python-typing-ops/assets/pyproject-typing.toml


skills/python-typing-patterns/references/generics-advanced.md → skills/python-typing-ops/references/generics-advanced.md


skills/python-typing-patterns/references/mypy-config.md → skills/python-typing-ops/references/mypy-config.md


skills/python-typing-patterns/references/overloads.md → skills/python-typing-ops/references/overloads.md


skills/python-typing-patterns/references/protocols-patterns.md → skills/python-typing-ops/references/protocols-patterns.md


skills/python-typing-patterns/references/runtime-validation.md → skills/python-typing-ops/references/runtime-validation.md


skills/python-typing-patterns/references/type-narrowing.md → skills/python-typing-ops/references/type-narrowing.md


skills/python-typing-patterns/scripts/check-types.sh → skills/python-typing-ops/scripts/check-types.sh


+ 1 - 1
skills/rest-patterns/SKILL.md

@@ -1,5 +1,5 @@
 ---
-name: rest-patterns
+name: rest-ops
 description: "Quick reference for RESTful API design patterns, HTTP semantics, caching, and rate limiting. Triggers on: rest api, http methods, status codes, api design, endpoint design, api versioning, rate limiting, caching headers."
 allowed-tools: "Read Write"
 ---

skills/security-patterns/assets/.gitkeep → skills/rest-ops/assets/.gitkeep


skills/rest-patterns/references/caching-patterns.md → skills/rest-ops/references/caching-patterns.md


skills/rest-patterns/references/rate-limiting.md → skills/rest-ops/references/rate-limiting.md


skills/rest-patterns/references/response-formats.md → skills/rest-ops/references/response-formats.md


skills/rest-patterns/references/status-codes.md → skills/rest-ops/references/status-codes.md


skills/sql-patterns/assets/.gitkeep → skills/rest-ops/scripts/.gitkeep


+ 1 - 1
skills/security-patterns/SKILL.md

@@ -1,5 +1,5 @@
 ---
-name: security-patterns
+name: security-ops
 description: "Security patterns and OWASP guidelines. Triggers on: security review, OWASP, XSS, SQL injection, CSRF, authentication, authorization, secrets management, input validation, secure coding."
 compatibility: "Language-agnostic patterns with framework-specific examples in references."
 allowed-tools: "Read Write Bash Grep"

skills/sql-patterns/scripts/.gitkeep → skills/security-ops/assets/.gitkeep


+ 0 - 0
skills/security-patterns/references/auth-patterns.md


Some files were not shown because too many files changed in this diff