Rate Limiting Cheat Sheet
Quick reference for API rate limiting: algorithms, HTTP headers, status codes, and implementation best practices.
Rate Limiting Algorithms
| Algorithm | How It Works | Pros | Cons |
|---|---|---|---|
| Token Bucket | Tokens added at fixed rate; requests consume tokens | Allows bursting, smooth average rate | Complex to implement correctly |
| Sliding Window Log | Track timestamp of each request in window | Most accurate, no boundary issues | High memory usage |
| Sliding Window Counter | Weighted average of current + previous window | Low memory, smooth transitions | Approximation, not exact |
| Leaky Bucket | Requests queue and drain at fixed rate | Smooth output rate, prevents bursts | No bursting allowed |
| Fixed Window | Count requests per fixed time window (e.g., 100/min) | Simple, easy to implement | Boundary burst problem |
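As a rough illustration of the Token Bucket row, here is a minimal in-memory sketch. The class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions made for this example, not part of any library. It is single-process and not thread-safe.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate                # tokens added per second
        self.capacity = capacity        # maximum burst size
        self.clock = clock              # injectable so the sketch can be tested
        self.tokens = float(capacity)   # start full so an initial burst is allowed
        self.updated = clock()

    def allow(self, cost=1):
        """Return True and consume `cost` tokens if enough are available."""
        now = self.clock()
        # Refill lazily from the elapsed time instead of running a timer.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `rate=2` and `capacity=3`, a client can send 3 requests at once and then about 2 per second after that. A shared deployment would need to keep `tokens` and `updated` in a store such as Redis and update them atomically.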
HTTP Rate Limit Headers
| Header | Description | Example |
|---|---|---|
| X-RateLimit-Limit | Maximum requests allowed in window | 1000 |
| X-RateLimit-Remaining | Requests remaining in current window | 42 |
| X-RateLimit-Reset | Unix timestamp when window resets | 1709726400 |
| Retry-After | Seconds to wait before retrying (on 429) | 60 |
| RateLimit-Limit | IETF draft: limit (req/window) | 100;w=60 |
| RateLimit-Remaining | IETF draft: remaining requests | 10 |
| RateLimit-Reset | IETF draft: seconds until reset | 35 |
# Example response headers
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1709726400
# (Retry-After applies to 429 and 503 responses, so it is omitted here)
# IETF draft headers (not yet a finalized standard)
RateLimit-Limit: 100;w=60
RateLimit-Remaining: 10
RateLimit-Reset: 35
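Because the two header families report the reset differently (Unix timestamp versus seconds remaining), a client may want to normalize them. The sketch below is one way to do that. The function names `parse_rate_limit` and `_leading_int` and the returned dict shape are assumptions for illustration, and real APIs may deviate from either convention.

```python
import time


def _leading_int(value):
    """Parse the integer at the start of values like '100' or '100;w=60'."""
    if value is None:
        return None
    return int(value.split(",")[0].split(";")[0].strip())


def parse_rate_limit(headers, now=None):
    """Normalize X-RateLimit-* and draft RateLimit-* headers.

    Returns a dict with `limit`, `remaining`, and `reset_in` (seconds until
    the window resets), or None if neither header family is present.
    """
    # HTTP header names are case-insensitive, so compare in lowercase.
    h = {k.lower(): v for k, v in headers.items()}
    now = time.time() if now is None else now

    if "ratelimit-remaining" in h:
        # Draft style: the reset value is already a number of seconds.
        return {
            "limit": _leading_int(h.get("ratelimit-limit")),
            "remaining": _leading_int(h["ratelimit-remaining"]),
            "reset_in": _leading_int(h.get("ratelimit-reset")),
        }
    if "x-ratelimit-remaining" in h:
        # X- style: the reset value is assumed to be a Unix timestamp.
        reset_at = _leading_int(h.get("x-ratelimit-reset"))
        reset_in = None if reset_at is None else max(0, int(reset_at - now))
        return {
            "limit": _leading_int(h.get("x-ratelimit-limit")),
            "remaining": _leading_int(h["x-ratelimit-remaining"]),
            "reset_in": reset_in,
        }
    return None
```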
HTTP Status Codes
| Status Code | Meaning |
|---|---|
| 429 Too Many Requests | Rate limit exceeded. Include Retry-After header. |
| 503 Service Unavailable | Server overloaded (may include Retry-After). |
| 403 Forbidden | Quota exceeded (some APIs use this). |
# 429 Response Example
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 60
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0

{
  "error": "rate_limit_exceeded",
  "message": "You have exceeded your rate limit",
  "retry_after": 60
}
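On the server side, a response like the one above could be assembled by a small helper. This is a framework-agnostic sketch. The function name `build_429` and its `(status, headers, body)` return shape are assumptions for illustration, and the `X-RateLimit-Reset` header is added here in addition to the headers in the example.

```python
import json
import math
import time


def build_429(limit, reset_at, now=None):
    """Return (status, headers, body) for a rate-limited request.

    `reset_at` is the Unix timestamp at which the client's window resets.
    """
    now = time.time() if now is None else now
    # Round up so the client never retries a fraction of a second too early.
    retry_after = max(0, math.ceil(reset_at - now))
    headers = {
        "Content-Type": "application/json",
        "Retry-After": str(retry_after),
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": "0",
        "X-RateLimit-Reset": str(int(reset_at)),
    }
    body = json.dumps({
        "error": "rate_limit_exceeded",
        "message": "You have exceeded your rate limit",
        "retry_after": retry_after,
    })
    return 429, headers, body
```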
Rate Limiting Strategies
| Strategy | Use Case | Example |
|---|---|---|
| Per-User | Authenticated APIs | 100 req/min per API key |
| Per-IP | Public endpoints, auth endpoints | 10 req/min per IP |
| Per-Endpoint | Expensive operations | 10 search req/min, 100 others |
| Global | System-wide protection | 10000 req/min total |
| Tiered | Freemium APIs | Free: 100/hr, Pro: 1000/hr |
| Burst + Sustained | Allow short bursts, limit average | Burst: 50, Sustained: 200/min |
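Most of the strategies above differ mainly in which key the counter is tracked under (an API key, an IP address, an endpoint name). The sketch below pairs that idea with a Sliding Window Counter. The class name, the `"user:..."` and `"ip:..."` key formats, and the in-memory dict are assumptions for illustration. A production setup would typically keep the counts in a shared store.

```python
import time
from collections import defaultdict


class SlidingWindowCounter:
    """Approximate sliding window: weights the previous window's count."""

    def __init__(self, limit, window, clock=time.time):
        self.limit = limit      # max requests per window
        self.window = window    # window length in seconds
        self.clock = clock      # injectable so the sketch can be tested
        # key -> [window index, count in that window, count in previous window]
        self.state = defaultdict(lambda: [0, 0, 0])

    def allow(self, key):
        now = self.clock()
        index = int(now // self.window)
        s = self.state[key]
        if index != s[0]:
            # Entered a new window: the old count becomes "previous" only if
            # the windows are adjacent; otherwise it has fully expired.
            s[2] = s[1] if index == s[0] + 1 else 0
            s[0], s[1] = index, 0
        # Fraction of the previous window still covered by the sliding window.
        overlap = 1.0 - (now % self.window) / self.window
        estimated = s[2] * overlap + s[1]
        if estimated < self.limit:
            s[1] += 1
            return True
        return False
```

For example, `limiter.allow("user:" + api_key)` gives a per-user limit and `limiter.allow("ip:" + client_ip)` gives a per-IP limit. Tiered limits can be modeled as one limiter instance per plan.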
Client-Side Rate Limit Handling
// JavaScript: Exponential backoff that honors Retry-After
async function fetchWithRateLimit(url, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const res = await fetch(url);
    if (res.status === 429) {
      // Retry-After is in seconds; setTimeout expects milliseconds.
      const retryAfter = Number.parseInt(res.headers.get('Retry-After'), 10);
      // Fall back to exponential backoff (1s, 2s, 4s, ...) if the header
      // is missing or is an HTTP-date rather than a number of seconds.
      const waitTime = Number.isNaN(retryAfter)
        ? Math.pow(2, i) * 1000
        : retryAfter * 1000;
      console.log(`Rate limited. Waiting ${waitTime}ms...`);
      await new Promise(r => setTimeout(r, waitTime));
      continue;
    }
    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return res.json();
  }
  throw new Error('Max retries exceeded');
}
# Python: Rate limit aware HTTP client
import time
import requests

def fetch_with_backoff(url, max_retries=3):
    for i in range(max_retries):
        response = requests.get(url, timeout=10)
        if response.status_code == 429:
            # Use Retry-After when it is a number of seconds; otherwise fall
            # back to exponential backoff (1s, 2s, 4s, ...).
            header = response.headers.get('Retry-After', '')
            retry_after = int(header) if header.isdigit() else 2 ** i
            print(f"Rate limited. Waiting {retry_after}s...")
            time.sleep(retry_after)
            continue
        # Raises requests.exceptions.HTTPError for other 4xx/5xx responses.
        response.raise_for_status()
        return response.json()
    raise RuntimeError("Max retries exceeded")
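The clients above handle Retry-After as a number of seconds, but HTTP also allows the header to carry an HTTP-date. A helper along these lines could cover both forms. The name `retry_after_seconds` and its `default` fallback are assumptions for this sketch.

```python
import time
from email.utils import parsedate_to_datetime


def retry_after_seconds(value, default=1.0, now=None):
    """Convert a Retry-After header value to a non-negative delay in seconds."""
    if value is None:
        return default
    value = value.strip()
    if value.isdigit():
        # delay-seconds form, e.g. "60"
        return float(value)
    try:
        # HTTP-date form, e.g. "Wed, 06 Mar 2024 12:00:00 GMT"
        when = parsedate_to_datetime(value)
    except (TypeError, ValueError):
        # Unparseable value: fall back to the caller's default delay.
        return default
    now = time.time() if now is None else now
    # A date in the past means the client may retry immediately.
    return max(0.0, when.timestamp() - now)
```

In `fetch_with_backoff`, the result could replace the `int(...)` conversion, with `default=2 ** i` to keep the exponential fallback.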
# cURL: Check rate limit headers
curl -i https://api.example.com/data
# Response headers:
# X-RateLimit-Limit: 1000
# X-RateLimit-Remaining: 42
# X-RateLimit-Reset: 1709726400
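# Sleep until reset (bash)
# Fetch the headers once; strip the trailing \r that HTTP header lines carry
headers=$(curl -sI https://api.example.com/data | tr -d '\r')
remaining=$(echo "$headers" | grep -i '^x-ratelimit-remaining:' | cut -d' ' -f2)
if [ -n "$remaining" ] && [ "$remaining" -lt 10 ]; then
  reset=$(echo "$headers" | grep -i '^x-ratelimit-reset:' | cut -d' ' -f2)
  sleep_time=$((reset - $(date +%s)))
  # Only sleep if the reset time is still in the future
  if [ "$sleep_time" -gt 0 ]; then
    echo "Rate limit low. Sleeping for $sleep_time seconds"
    sleep "$sleep_time"
  fi
fi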
Rate Limiting Tools & Middleware
| Tool | Stack | Algorithm |
|---|---|---|
| express-rate-limit | Node.js/Express | Fixed Window |
| rate-limit-redis | Node.js + Redis (store for express-rate-limit) | Fixed Window |
| django-ratelimit | Python/Django | Fixed Window |
| flask-limiter | Python/Flask | Multiple |
| Redis + Lua | Any (custom) | Token Bucket / Sliding |
| nginx limit_req | Nginx | Leaky Bucket |
| Cloudflare Rate Limiting | Edge/CDN | Sliding Window |
Best Practices
✓ Always include rate limit headers in responses (even successful ones)
✓ Return 429 with Retry-After header, not 503 or 403
✓ Document rate limits clearly in API docs
✓ Use tiered limits for different user plans (free vs paid)
✓ Consider higher limits for read-only vs write operations
✓ Implement graceful degradation, not hard cutoffs
✗ Don't hide rate limits; transparency builds trust
✗ Avoid per-endpoint limits that are too complex to understand