# Modern CLI Toolkit

Token-efficient CLI tools that replace verbose legacy commands. These tools are optimized for AI coding assistants by producing cleaner, more concise output.

## Why These Tools?

| Benefit | Impact |
|---------|--------|
| **Respects .gitignore** | 60-99% fewer irrelevant results |
| **Cleaner output** | 50-80% fewer tokens consumed |
| **Faster execution** | 2-100x speed improvements |
| **Better defaults** | Less flags needed |

## Quick Install

**Windows (PowerShell as Admin):**
```powershell
.\tools\install-windows.ps1
```

**Linux/macOS:**
```bash
./tools/install-unix.sh
```

## Tool Categories

### File Search & Navigation

| Legacy | Modern | Improvement |
|--------|--------|-------------|
| `find` | `fd` | 5x faster, simpler syntax, .gitignore aware |
| `grep` | `rg` (ripgrep) | 10x faster, .gitignore aware, better output |
| `ls` | `eza` | Git status, icons, tree view built-in |
| `cat` | `bat` | Syntax highlighting, line numbers |
| `cd` | `zoxide` | Smart directory jumping |
| `tree` | `broot` | Interactive, filterable tree |

### Data Processing

| Legacy | Modern | Improvement |
|--------|--------|-------------|
| `sed` | `sd` | Simpler regex syntax, no escaping pain |
| JSON manual | `jq` | Structured queries and transforms |
| YAML manual | `yq` | Same as jq for YAML/TOML |

### Document Conversion

| Legacy | Modern | Improvement |
|--------|--------|-------------|
| PyMuPDF/pdfplumber | `markitdown` | One CLI for all document types |
| python-docx | `markitdown` | Consistent markdown output |
| Tesseract (OCR) | `markitdown` | Built-in image text extraction |

**markitdown** (Microsoft) - Convert documents to markdown:
```bash
pip install markitdown

# Usage
markitdown document.pdf       # PDF
markitdown report.docx        # Word
markitdown data.xlsx          # Excel (tables)
markitdown slides.pptx        # PowerPoint
markitdown image.png          # OCR
```
Supports: PDF, DOCX, XLSX, PPTX, images (OCR), HTML, audio (speech-to-text), CSV, JSON, XML

### Git Operations

| Legacy | Modern | Improvement |
|--------|--------|-------------|
| `git diff` | `delta` | Syntax highlighting, side-by-side |
| `git diff` | `difft` | Semantic AST-aware diffs |
| `git *` | `lazygit` | Full TUI, faster workflow |
| GitHub web | `gh` | CLI for PRs, issues, actions |

### System Monitoring

| Legacy | Modern | Improvement |
|--------|--------|-------------|
| `du -h` | `dust` | Visual tree sorted by size |
| `top` | `btm` (bottom) | Graphs, cleaner UI |
| `ps aux` | `procs` | Structured, colored output |

### Code Analysis

| Task | Tool |
|------|------|
| Line counts | `tokei` |
| AST search | `ast-grep` / `sg` |
| Benchmarks | `hyperfine` |

### Interactive Selection

| Task | Tool |
|------|------|
| Fuzzy file find | `fzf` + `fd` |
| Interactive grep | `fzf` + `rg` |
| History search | `Ctrl+R` (fzf) |

### Documentation

| Legacy | Modern | Improvement |
|--------|--------|-------------|
| `man` | `tldr` | 98% smaller, practical examples |

### Python

| Legacy | Modern | Improvement |
|--------|--------|-------------|
| `pip` | `uv` | 10-100x faster installs |
| `python -m venv` | `uv venv` | Faster venv creation |

### Task Running

| Legacy | Modern | Improvement |
|--------|--------|-------------|
| `make` | `just` | Simpler syntax, better errors |

### AI Provider CLIs

Custom CLI wrappers included in this toolkit for multi-LLM delegation:

| Provider | CLI | Strength |
|----------|-----|----------|
| Gemini | `gemini` | 1M context, code analysis (install separately) |
| OpenAI | `codex` | Deep reasoning (install separately) |
| **Perplexity** | `perplexity` | **Web search + citations** (included) |

**Perplexity CLI** (included - runs via `perplexity.py`):
```bash
# Direct question with web-grounded answer
perplexity "What's new in TypeScript 5.7?"

# Use reasoning model for complex analysis
perplexity -m sonar-reasoning "Explain microservices vs monolith tradeoffs"

# Pipe content for analysis
cat code.py | perplexity "Review this code for security issues"

# Filter by recency (day, week, month, year)
perplexity --recency day "Latest AI news"

# Restrict search to specific domains
perplexity --domains "github.com,docs.python.org" "Python asyncio patterns"

# JSON output for programmatic use
perplexity --json "query" > output.json

# List available models
perplexity --list-models
```

**Models:**
| Model | Use Case |
|-------|----------|
| `sonar` | Fast, cost-effective for quick facts |
| `sonar-pro` | Complex queries, more citations (default) |
| `sonar-reasoning` | Multi-step problem solving |
| `sonar-reasoning-pro` | Deep reasoning (DeepSeek-R1) |
| `sonar-deep-research` | Comprehensive agentic research |

**Setup:**
```bash
# Set API key (get from https://www.perplexity.ai/settings/api)
export PERPLEXITY_API_KEY="your-key-here"

# Add to shell profile for persistence:
# echo 'export PERPLEXITY_API_KEY="your-key-here"' >> ~/.bashrc
```

---

### Web Fetching (URL Retrieval Hierarchy)

Benchmarked performance (10 URLs, varying complexity):

| Tool | Avg Speed | Success | Best For |
|------|-----------|---------|----------|
| **WebFetch** | Instant | Varies | First attempt - built-in |
| **Jina Reader** | **0.5s** | 10/10 | Default fallback - 5-10x faster |
| **Firecrawl** | 4-5s | 10/10 | Anti-bot bypass, Cloudflare |
| **markitdown** | 2-3s | 9/10 | Local files + simple pages |

**Jina Reader** (free tier: 10M tokens) - **Recommended default**:
```bash
# Simple - just prefix any URL
curl https://r.jina.ai/https://example.com

# Search + fetch in one call
curl https://s.jina.ai/your%20search%20query
```

**Firecrawl** (requires API key) - **Anti-bot specialist**:
```bash
# When Jina fails due to anti-bot
firecrawl https://blocked-site.com

# Save to file
firecrawl https://example.com -o output.md

# With JSON metadata
firecrawl https://example.com --json
```
- Handles Cloudflare, Datadome, and other anti-bot systems
- Supports interactive scraping (click, scroll, fill forms)
- AI-powered structured data extraction

**markitdown** - **Local files + URLs**:
```bash
# URLs (slower than Jina, but works offline)
markitdown https://example.com

# Local files (unique capability)
markitdown document.pdf
markitdown report.docx
markitdown data.xlsx
```

**Decision Tree:**
1. Try `WebFetch` first (instant, free)
2. If blocked → Try Jina `r.jina.ai/URL` (fastest, best success rate)
3. If anti-bot/Cloudflare → Try `firecrawl <url>` (designed for bypass)
4. For local files (PDF, Word, Excel) → Use `markitdown`

## Token Efficiency Benchmarks

Tested on a typical Node.js project with `node_modules`:

| Operation | Legacy | Modern | Token Savings |
|-----------|--------|--------|---------------|
| Find all files | `find`: 307 results | `fd`: 69 results | **78%** |
| Search 'function' | `grep`: 6,193 bytes | `rg`: 1,244 bytes | **80%** |
| Directory listing | `ls -laR`: 3,666 bytes | `eza --tree`: 670 bytes | **82%** |
| Disk usage | `du -h`: ~500 tokens | `dust`: ~100 tokens | **80%** |
| Man page | `man git`: ~5000 tokens | `tldr git`: ~100 tokens | **98%** |

## Verification

After installation, verify all tools:

```bash
# Check all tools are available
which fd rg eza bat zoxide delta difft jq yq sd lazygit gh tokei uv just ast-grep fzf dust btm procs tldr

# Check custom CLI wrappers
perplexity --list-models
```

## Experimental / Future

### Nushell - Structured Data Shell

[Nushell](https://www.nushell.sh/) is a modern shell that treats everything as structured data (tables, records, lists) instead of text streams. It could potentially replace jq + yq + awk + sed with a unified syntax.

**Status:** Experimental (v0.108.x) - not recommended for production scripts yet.

**When to consider:**
- Heavy data pipeline work (parsing APIs, configs)
- Frustrated with jq syntax
- Want unified commands across JSON/YAML/CSV/TOML

**Example comparison:**

```bash
# Traditional (jq)
curl -s api.example.com/users | jq '.data[] | select(.active) | .name'

# Nushell
http get api.example.com/users | where active | get name
```

```bash
# Traditional (multiple tools)
ps aux | grep node | awk '{print $2, $4}' | sort -k2 -nr

# Nushell
ps | where name == "node" | select pid mem | sort-by mem --reverse
```

**Why we're waiting:**
- Still 0.x (breaking changes possible)
- Learning curve for team environments
- Current jq + yq stack handles 95% of cases
- CI/CD scripts need POSIX bash compatibility

**Install (when ready to experiment):**
```bash
# Windows
winget install Nushell.Nushell

# macOS
brew install nushell

# Linux
cargo install nu
```

**Resources:**
- [Nushell Book](https://www.nushell.sh/book/)
- [Nushell GitHub](https://github.com/nushell/nushell)
- [Nushell for SREs](https://medium.com/@nonickedgr/nushell-for-sres-modern-shell-scripting-for-internal-tools-7b5dca51dc66)

---

## Sources

- [It's FOSS - Rust CLI Tools](https://itsfoss.com/rust-cli-tools/)
- [Zaiste - Shell Commands in Rust](https://zaiste.net/posts/shell-commands-rust/)
- [GitHub - Rust CLI Tools List](https://gist.github.com/sts10/daadbc2f403bdffad1b6d33aff016c0a)
- [DEV.to - CLI Tools You Can't Live Without](https://dev.to/lissy93/cli-tools-you-cant-live-without-57f6)