# Rate Limiting Patterns

Strategies and headers for API rate limiting.

## Standard Headers

### Response Headers

```http
X-RateLimit-Limit: 1000        # Max requests per window
X-RateLimit-Remaining: 847     # Requests remaining
X-RateLimit-Reset: 1698415200  # Unix timestamp when limit resets
Retry-After: 60                # Seconds to wait (on 429)
```

### Rate Limit Response (429)

```json
{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Too many requests",
    "retry_after": 60,
    "limit": 1000,
    "remaining": 0,
    "reset_at": "2024-10-27T12:00:00Z"
  }
}
```

## Rate Limiting Strategies

### Fixed Window

Simple: count requests in fixed time periods.

```
Window: 1 minute (00:00-00:59, 01:00-01:59, ...)
Limit: 100 requests

Pros: Simple to implement
Cons: Bursts at window edges (up to 200 requests in 2 seconds across a boundary)
```

### Sliding Window

Smoother: weight the previous window's count by how much it still overlaps the sliding window.

```
Current window: 50% elapsed
Previous window count: 60 requests
Current window count: 40 requests

Weighted count = (60 × 0.5) + 40 = 70
Remaining = 100 - 70 = 30

Pros: Smoother limits
Cons: More complex; needs previous window data
```

### Token Bucket

Allows bursts with a steady refill.

```
Bucket capacity: 100 tokens
Refill rate: 10 tokens/second

- Start with 100 tokens
- Each request costs 1 token
- Tokens refill at 10/sec
- Bursts allowed up to 100, then a steady 10/sec

Pros: Allows bursts, intuitive
Cons: More state to track
```

### Leaky Bucket

Fixed output rate; excess requests are queued.

```
Processing rate: 10 requests/second
Queue size: 50

- Requests queue up
- Processed at a constant rate
- Queue overflow = 429

Pros: Smooth output, protects backend
Cons: Adds latency
```

## Rate Limit Tiers

### By User/Plan

```http
# Free tier
X-RateLimit-Limit: 100
X-RateLimit-Window: 3600   # per hour

# Pro tier
X-RateLimit-Limit: 10000
X-RateLimit-Window: 3600
```

### By Endpoint

```http
# Search (expensive)
X-RateLimit-Limit: 10
X-RateLimit-Window: 60

# Read (cheap)
X-RateLimit-Limit: 1000
X-RateLimit-Window: 60
```

### By Operation Type

```http
# Writes
POST/PUT/DELETE: 100/minute

# Reads
GET: 1000/minute
```

## Implementation Headers

### GitHub Style

```http
X-RateLimit-Limit: 5000
X-RateLimit-Remaining: 4999
X-RateLimit-Reset: 1372700873
X-RateLimit-Used: 1
X-RateLimit-Resource: core
```

### IETF Draft (RateLimit Headers)

```http
RateLimit-Limit: 100
RateLimit-Remaining: 50
RateLimit-Reset: 60
```

## Client Handling

### Retry Logic

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchWithRetry(url, options, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const response = await fetch(url, options);

    if (response.status === 429) {
      // Retry-After is in seconds; fall back to 60s if the header is missing
      const retryAfter = parseInt(response.headers.get('Retry-After') ?? '60', 10);
      await sleep(retryAfter * 1000);
      continue;
    }

    return response;
  }
  throw new Error('Rate limit exceeded after retries');
}
```

### Proactive Backoff

```javascript
function checkRateLimit(response) {
  const remaining = parseInt(response.headers.get('X-RateLimit-Remaining'), 10);
  const reset = parseInt(response.headers.get('X-RateLimit-Reset'), 10); // Unix seconds

  if (remaining < 10) {
    // Spread the remaining requests evenly over the time left in the window
    const waitMs = (reset * 1000 - Date.now()) / Math.max(remaining, 1);
    return waitMs; // caller waits this long between requests
  }
  return 0;
}
```

## Best Practices

### Server Side

1. Include rate limit headers in all responses
2. Return 429 with a clear error message
3. Always include `Retry-After`
4. Consider different limits per endpoint
5. Log rate limit hits for monitoring

### Client Side

1. Respect the `Retry-After` header
2. Implement exponential backoff
3. Monitor remaining quota
4. Cache responses to reduce requests
5. Batch operations when possible
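
## Implementation Sketches

Minimal, in-memory sketches of the strategies described above, written for a single process with no shared storage. Names such as `TokenBucket` and `allowRequest` are illustrative, not taken from any particular library.

### Token Bucket

A sketch of the token bucket strategy: a fixed capacity, a steady refill rate, and one token spent per request.

```javascript
// In-memory token bucket (illustrative; single process, not production-ready)
class TokenBucket {
  constructor(capacity = 100, refillPerSecond = 10) {
    this.capacity = capacity;               // maximum burst size
    this.refillPerSecond = refillPerSecond; // steady refill rate
    this.tokens = capacity;                 // start with a full bucket
    this.lastRefill = Date.now();
  }

  // Add tokens for the time elapsed since the last refill, capped at capacity
  refill() {
    const now = Date.now();
    const elapsedSec = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSec * this.refillPerSecond);
    this.lastRefill = now;
  }

  // true = let the request through, false = respond with 429
  allowRequest(cost = 1) {
    this.refill();
    if (this.tokens >= cost) {
      this.tokens -= cost;
      return true;
    }
    return false;
  }
}

// Usage: keep one bucket per client key (API key, user ID, or IP)
const tokenBucket = new TokenBucket(100, 10);
if (!tokenBucket.allowRequest()) {
  // respond with 429 and a Retry-After header
}
```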
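
### Sliding Window Counter

A sketch of the sliding window calculation shown above: the previous window's count is weighted by how much of it still overlaps the sliding window, so only two counters are kept per client.

```javascript
// Sliding window counter (illustrative). Windows are numbered by Math.floor(now / windowMs).
function slidingWindowAllow(state, limit = 100, windowMs = 60_000, now = Date.now()) {
  const currentWindow = Math.floor(now / windowMs);

  // Roll the counters forward when time moves into a new window
  if (state.window !== currentWindow) {
    state.previousCount = state.window === currentWindow - 1 ? state.currentCount : 0;
    state.currentCount = 0;
    state.window = currentWindow;
  }

  // Fraction of the previous window that still falls inside the sliding window
  const elapsedFraction = (now % windowMs) / windowMs;
  const weighted = state.previousCount * (1 - elapsedFraction) + state.currentCount;

  if (weighted >= limit) return false; // over the limit -> 429
  state.currentCount += 1;
  return true;
}

// Usage: keep one state object per client key
const state = { window: 0, previousCount: 0, currentCount: 0 };
slidingWindowAllow(state); // true until the weighted count reaches the limit
```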
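
### Leaky Bucket

A sketch of the leaky bucket strategy: requests go into a bounded queue and are drained at a constant rate; a full queue means 429.

```javascript
// Leaky bucket as a bounded queue drained at a constant rate (illustrative)
class LeakyBucket {
  constructor(ratePerSecond = 10, queueSize = 50) {
    this.queue = [];
    this.queueSize = queueSize;
    // Drain one queued request every 1/rate seconds
    this.timer = setInterval(() => {
      const job = this.queue.shift();
      if (job) job(); // process the request
    }, 1000 / ratePerSecond);
  }

  // true = request queued, false = queue full, respond with 429
  enqueue(handler) {
    if (this.queue.length >= this.queueSize) return false;
    this.queue.push(handler);
    return true;
  }

  stop() {
    clearInterval(this.timer);
  }
}

// Usage
const leaky = new LeakyBucket(10, 50);
if (!leaky.enqueue(() => console.log('processed'))) {
  // respond with 429
}
```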
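
### Client Retry with Exponential Backoff

The retry example under Client Handling waits exactly `Retry-After` seconds. Here is a sketch that combines that with exponential backoff and jitter (client-side best practices #1 and #2); `fetchWithBackoff` is an illustrative name and assumes a `fetch`-capable environment.

```javascript
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function fetchWithBackoff(url, options, maxRetries = 5) {
  for (let attempt = 0; attempt < maxRetries; attempt++) {
    const response = await fetch(url, options);
    if (response.status !== 429) return response;

    // Prefer the server's Retry-After; otherwise back off exponentially (1s, 2s, 4s, ...)
    const retryAfter = response.headers.get('Retry-After');
    const waitMs = retryAfter ? parseInt(retryAfter, 10) * 1000 : 2 ** attempt * 1000;

    // Add jitter so many clients don't retry in lockstep
    await sleep(waitMs + Math.random() * 250);
  }
  throw new Error('Rate limit exceeded after retries');
}
```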