Context Harvest Operation

Purpose: Extract knowledge from AI summaries → permanent context, then clean workspace

Last Updated: 2026-01-06

Core Problem

AI agents create summary files (OVERVIEW.md, SESSION-*.md, SUMMARY.md) that contain valuable knowledge but clutter the workspace. Left unmanaged, these files pile up across the codebase.

Solution: Harvest the knowledge → permanent context, then delete the summaries.

Auto-Detection Patterns

Harvest automatically detects these patterns:

Filename patterns:

*OVERVIEW.md
*SUMMARY.md
SESSION-*.md
CONTEXT-*.md
*NOTES.md

Location patterns:

Files in .tmp/ directory
Files with "Summary", "Overview", or "Session" in the title
Files >2KB in root directory (likely summaries)
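The patterns above can be sketched as a single predicate. This is a minimal illustration, not the actual Harvest implementation: the name `is_candidate` is hypothetical, the 2 KB rule is assumed to apply only to `.md` files, and the title check is left out because it requires reading file contents.

```python
from fnmatch import fnmatchcase
from pathlib import PurePosixPath

# Filename globs copied from the list above.
FILENAME_PATTERNS = [
    "*OVERVIEW.md",
    "*SUMMARY.md",
    "SESSION-*.md",
    "CONTEXT-*.md",
    "*NOTES.md",
]
ROOT_SIZE_THRESHOLD = 2 * 1024  # "Files >2KB in root directory"


def is_candidate(rel_path: str, size_bytes: int) -> bool:
    """Return True if a workspace-relative path looks like a summary file."""
    path = PurePosixPath(rel_path)
    # Filename patterns: case-sensitive match on the base name.
    if any(fnmatchcase(path.name, pattern) for pattern in FILENAME_PATTERNS):
        return True
    # Location pattern: anything under a .tmp/ directory.
    if ".tmp" in path.parts[:-1]:
        return True
    # Location pattern: large files directly in the root.
    # Assumption: only Markdown files count, to avoid flagging lockfiles etc.
    is_root_file = len(path.parts) == 1
    if is_root_file and path.suffix == ".md" and size_bytes > ROOT_SIZE_THRESHOLD:
        return True
    return False
```

For example, `is_candidate(".tmp/IMPLEMENTATION-NOTES.md", 800)` is true because of both the `*NOTES.md` glob and the `.tmp/` location.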

6-Stage Workflow

Stage 1: Scan

Action: Find all summary files in workspace

Process:

Search for auto-detection patterns
Check .tmp/ directory
List files with sizes
Sort by modification date (newest first)

Output: List of candidate files

Example:

Found 3 summary documents:
1. CONTEXT-SYSTEM-OVERVIEW.md (4.2 KB, modified 1 hour ago)
2. SESSION-auth-work.md (1.8 KB, modified today)
3. .tmp/IMPLEMENTATION-NOTES.md (800 bytes, modified today)
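The scan step might look roughly like the sketch below. The function name `scan` and the tuple layout are assumptions made for illustration; only the filename globs, the `.tmp/` rule, and the newest-first ordering come from this document.

```python
import os
from fnmatch import fnmatchcase

SUMMARY_PATTERNS = [
    "*OVERVIEW.md",
    "*SUMMARY.md",
    "SESSION-*.md",
    "CONTEXT-*.md",
    "*NOTES.md",
]


def scan(root):
    """Return (relative_path, size_bytes, mtime) tuples, newest first."""
    candidates = []
    for dirpath, _dirnames, filenames in os.walk(root):
        rel_dir = os.path.relpath(dirpath, root)
        # Every file below a .tmp/ directory is treated as a candidate.
        in_tmp = ".tmp" in rel_dir.split(os.sep)
        for name in filenames:
            matches = any(fnmatchcase(name, p) for p in SUMMARY_PATTERNS)
            if in_tmp or matches:
                full = os.path.join(dirpath, name)
                info = os.stat(full)
                rel = os.path.relpath(full, root).replace(os.sep, "/")
                candidates.append((rel, info.st_size, info.st_mtime))
    # Sort by modification time, newest first.
    candidates.sort(key=lambda item: item[2], reverse=True)
    return candidates
```

Note that this sketch would also pick up files already moved to `.tmp/archive/`; a real scan would probably need to exclude that directory.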

Stage 2: Analyze

Action: Categorize content by function

Mapping Rules:

| Content Type | Target Folder | How to Identify |
|--------------|---------------|-----------------|
| Design decisions | concepts/ | "We decided to...", "Architecture", "Pattern" |
| Solutions/patterns | examples/ | Code snippets, "Here's how we..." |
| Workflows | guides/ | Numbered steps, "How to...", "Setup" |
| Errors encountered | errors/ | Error messages, "Fixed issue", "Gotcha" |
| Reference data | lookup/ | Tables, lists, paths, commands |

Process:

Read each file
Identify valuable sections (skip planning/conversation)
Categorize by function
Determine target file path
Generate preview (first 60 chars)

Output: Categorized items with letter IDs
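One way to picture the Analyze stage is a first-match keyword classifier over the mapping rules. The sketch below is an assumption-heavy illustration: the names `categorize` and `label_items`, the rule order, and the exact cue strings (for example `"error:"` standing in for "error messages") are hypothetical, and a real analyzer would likely need more than substring matching.

```python
import re
import string
from typing import Optional

NUMBERED_STEP = re.compile(r"^\s*\d+\.\s", re.MULTILINE)
TABLE_ROW = re.compile(r"^\s*\|.*\|\s*$", re.MULTILINE)
CODE_FENCE = "`" * 3  # marker for a fenced code snippet

# (target folder, predicate on lower-cased text); first match wins.
RULES = [
    ("concepts/", lambda t: any(c in t for c in ("we decided to", "architecture", "pattern"))),
    ("examples/", lambda t: "here's how we" in t or CODE_FENCE in t),
    ("guides/", lambda t: "how to" in t or "setup" in t or bool(NUMBERED_STEP.search(t))),
    ("errors/", lambda t: any(c in t for c in ("fixed issue", "gotcha", "error:"))),
    ("lookup/", lambda t: bool(TABLE_ROW.search(t))),
]


def categorize(section: str) -> Optional[str]:
    """Return the target folder for a section, or None if nothing matches."""
    text = section.lower()
    for folder, matches in RULES:
        if matches(text):
            return folder
    return None


def label_items(sections):
    """Assign letter IDs (A, B, ...) and a 60-character preview to each section."""
    items = []
    # zip() stops at 26 sections; this sketch does not handle more.
    for letter, section in zip(string.ascii_uppercase, sections):
        preview = " ".join(section.split())[:60]
        items.append({"id": letter, "target": categorize(section), "preview": preview})
    return items
```

Sections that return `None` correspond to the "Skip" entries shown in the approval UI.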

Stage 3: Approve (CRITICAL)

Action: Present approval UI with letter-based selection

ALWAYS show approval UI before extracting/deleting. NEVER auto-harvest without user confirmation.

Format:

### CONTEXT-SYSTEM-OVERVIEW.md (4.2 KB)

✓ [A] Design: Function-based context organization
    → Would add to: core/concepts/context-organization.md
    Preview: "Organize by function (concepts/, examples/...)..."

✓ [B] Pattern: Minimal Viable Information
    → Would add to: core/concepts/mvi-principle.md
    Preview: "Extract core only (1-3 sentences), 3-5 key points..."

✓ [C] Workflow: Harvesting summary documents
    → Would create: core/guides/harvesting.md
    Preview: "Scan for summaries → Extract → Approve → Delete"

✗ [D] Skip: Planning discussion notes (temporary knowledge)

---

### SESSION-auth-work.md (1.8 KB)

✓ [E] Error: JWT token expiration not handled
    → Would add to: development/errors/auth-errors.md
    Preview: "Symptom: 401 after 1 hour. Cause: No refresh flow..."

✓ [F] Example: JWT refresh token implementation
    → Would create: development/examples/jwt-refresh.md
    Preview: "Store refresh token → Check expiry → Request new..."

---

### .tmp/IMPLEMENTATION-NOTES.md (800 bytes)

✗ [G] Skip: Duplicate info (already in development/concepts/api-design.md)

---

**Quick options**:
- Type 'A B C E F' - Approve specific items
- Type 'all' - Approve all ✓ items (A B C E F)
- Type 'none' - Skip harvesting, delete files anyway
- Type 'cancel' - Keep files, don't harvest

Validation:

MUST wait for user input
MUST NOT proceed without approval
If user types 'cancel', stop immediately

Output: List of approved items
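The quick options can be handled by a small parser such as the sketch below. The function name `parse_selection` and the `(action, ids)` return shape are hypothetical; accepting lower-case letters and commas, and allowing unticked items to be selected explicitly, are added assumptions. Empty input raises an error so that nothing proceeds without explicit approval.

```python
def parse_selection(raw, recommended, all_ids):
    """Turn the user's reply into (action, ids).

    action is "harvest", "skip" (the 'none' option), or "cancel".
    """
    text = raw.strip().lower()
    if not text:
        # No default: the workflow must wait for an explicit answer.
        raise ValueError("no input: approval is required before proceeding")
    if text == "cancel":
        return ("cancel", [])
    if text == "none":
        return ("skip", [])
    if text == "all":
        return ("harvest", list(recommended))

    chosen = []
    for token in text.replace(",", " ").split():
        letter = token.upper()
        if letter not in all_ids:
            raise ValueError(f"unknown item: {token!r}")
        if letter not in chosen:  # ignore repeated letters
            chosen.append(letter)
    return ("harvest", chosen)
```

With the example above, `parse_selection("all", ["A", "B", "C", "E", "F"], list("ABCDEFG"))` selects the five ticked items and leaves D and G out.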

Stage 4: Extract

Action: Extract and minimize approved items

Apply MVI to all extracted content:

Core concept: 1-3 sentences
Key points: 3-5 bullets
Minimal example: <10 lines
Reference link: to original source
Files: <200 lines each
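The MVI limits above lend themselves to a mechanical check before anything is written. The sketch below is illustrative only: `mvi_violations` is a hypothetical name, and the sentence counter is a naive split on terminal punctuation that will miscount abbreviations and ellipses.

```python
import re


def mvi_violations(core, key_points, example, total_lines):
    """Return a list of human-readable MVI violations (empty if compliant)."""
    problems = []

    # Core concept: 1-3 sentences (naive split after ., ! or ?).
    sentences = [s for s in re.split(r"(?<=[.!?])\s+", core.strip()) if s]
    if not 1 <= len(sentences) <= 3:
        problems.append(f"core concept has {len(sentences)} sentences (want 1-3)")

    # Key points: 3-5 bullets.
    if not 3 <= len(key_points) <= 5:
        problems.append(f"{len(key_points)} key points (want 3-5)")

    # Minimal example: fewer than 10 lines.
    example_lines = len(example.splitlines())
    if example_lines >= 10:
        problems.append(f"example is {example_lines} lines (want <10)")

    # Whole file: fewer than 200 lines.
    if total_lines >= 200:
        problems.append(f"file is {total_lines} lines (want <200)")

    return problems
```

An empty result means the extracted item fits the limits; anything else could be shown to the user in the extraction preview.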

Process:

For each approved item:
- Extract core content
- Apply MVI minimization (see compact.md)
- Generate preview of final content
Show extraction preview (APPROVAL REQUIRED):

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Extraction Preview
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

[A] → core/concepts/context-organization.md (CREATE, 45 lines)
┌─────────────────────────────────────────────────────────┐
│ # Concept: Context Organization                         │
│                                                         │
│ **Purpose**: Function-based knowledge organization      │
│                                                         │
│ ## Core Concept                                         │
│ Organize context by function: concepts/, examples/...   │
│ ...                                                     │
└─────────────────────────────────────────────────────────┘

[E] → development/errors/auth-errors.md (ADD to existing, 98 → 112 lines)
┌─────────────────────────────────────────────────────────┐
│ + ## Error: JWT Token Expiration Not Handled             │
│ +                                                       │
│ + **Symptom**: 401 after 1 hour                         │
│ + **Cause**: No refresh token flow                      │
│ + ...                                                   │
└─────────────────────────────────────────────────────────┘

... ({remaining_count} more items)

Show all? [y/n] | Approve extraction? [y/n/edit]: _

On approval:
- Write files to disk
- Add cross-references
- Update navigation.md maps

Output: List of created/updated files

Stage 5: Cleanup (APPROVAL REQUIRED)

Action: Archive or delete source summary files

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Cleanup: Source Files
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

Successfully harvested from:
  CONTEXT-SYSTEM-OVERVIEW.md (4.2 KB)
  SESSION-auth-work.md (1.8 KB)

Skipped (no valuable content):
  .tmp/IMPLEMENTATION-NOTES.md (800 bytes)

How should we handle these source files?

  1. Archive (safe) — move to .tmp/archive/harvested/{date}/
     → Can restore later if needed

  2. Delete — permanently remove harvested files
     → Frees disk space, no undo

  3. Keep — leave source files in place
     → No cleanup, files remain where they are

Choose [1/2/3] (default: 1): _

ONLY clean up files whose content was successfully harvested. If extraction failed, keep the original file.

Output: Cleanup report
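Option 1 (archive) could be implemented along the lines of the sketch below. The function name `archive_sources` and its parameters are hypothetical; the destination layout `.tmp/archive/harvested/{date}/` is taken from the prompt above, with the date assumed to be in ISO format. The caller is expected to pass only files whose content was successfully harvested.

```python
import shutil
from datetime import date
from pathlib import Path


def archive_sources(root, harvested, today=None):
    """Move harvested source files into .tmp/archive/harvested/<date>/.

    Returns the workspace-relative paths of the archived copies.
    """
    root_path = Path(root)
    stamp = (today or date.today()).isoformat()
    dest_dir = root_path / ".tmp" / "archive" / "harvested" / stamp
    dest_dir.mkdir(parents=True, exist_ok=True)

    moved = []
    for rel in harvested:
        src = root_path / rel
        if not src.is_file():
            continue  # nothing to archive
        dest = dest_dir / src.name
        if dest.exists():
            continue  # never overwrite an earlier archive; leave source in place
        shutil.move(str(src), str(dest))
        moved.append(dest.relative_to(root_path).as_posix())
    return moved
```

Because files are moved rather than deleted, restoring one is a matter of moving it back from the dated archive folder.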

Stage 6: Report

Action: Show comprehensive results summary

Format:

✅ Harvested 5 items into permanent context:
   - Added to core/concepts/context-organization.md
   - Added to core/concepts/mvi-principle.md
   - Created core/guides/harvesting.md
   - Added to development/errors/auth-errors.md
   - Created development/examples/jwt-refresh.md

🗑️ Cleaned up workspace:
   - Archived: CONTEXT-SYSTEM-OVERVIEW.md → .tmp/archive/harvested/2026-01-06/
   - Archived: SESSION-auth-work.md → .tmp/archive/harvested/2026-01-06/
   - Deleted: .tmp/IMPLEMENTATION-NOTES.md (no valuable content)

📊 Updated navigation maps:
   - .opencode/context/core/navigation.md
   - .opencode/context/development/navigation.md

💾 Disk space freed: 6.8 KB

Usage Examples

Scan entire workspace

/context harvest

Scan specific directory

/context harvest .tmp/
/context harvest docs/sessions/

Harvest specific file

/context harvest OVERVIEW.md
/context harvest SESSION-2026-01-06.md

Smart Content Detection

✅ Extract (Valuable Knowledge)

Design decisions ("We chose X because...")
Patterns that worked ("This pattern solved...")
Errors encountered + solutions
API changes ("Updated from v1 to v2...")
Performance findings ("Optimization reduced...")
Core concepts explained

❌ Skip (Temporary/Noise)

Planning discussion ("Should we...?", "Maybe try...")
Conversational notes ("I think...", "We talked about...")
Duplicate info (already in context)
TODO lists (move to task system instead)
Timestamps and session metadata
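The skip list can be approximated with a few phrase cues. This is a rough sketch: `is_noise` and the cue tuple are hypothetical, only the quoted phrases and TODO markers are covered, and duplicate detection and timestamp stripping would need separate logic.

```python
# Phrase cues taken from the skip list above, lower-cased.
SKIP_CUES = ("should we", "maybe try", "i think", "we talked about", "todo")


def is_noise(text: str) -> bool:
    """Return True if a passage looks like planning or conversational noise."""
    lowered = text.lower()
    return any(cue in lowered for cue in SKIP_CUES)
```

A passage such as "We chose X because it is faster." passes through, while "Should we use Redis?" is flagged for skipping.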

Safety Features

Approval gate - Never auto-delete without confirmation
Archive by default - Move to .tmp/archive/, not permanent delete
Validation - Check file sizes, structure before committing
Rollback - Can restore from archive if needed
Dry run - Show what would happen before doing it

Success Criteria

After harvest operation:

Valuable knowledge extracted to permanent context?
All extracted files <200 lines?
Files in correct function folders?
navigation.md maps updated?
Summary files archived/deleted?
Workspace cleaner than before?
No knowledge lost?

Related

compact.md - How to minimize extracted content
mvi-principle.md - What to extract
structure.md - Where files go
creation.md - File creation rules
