Rate Limiting Cheat Sheet

Quick reference for API rate limiting: algorithms, HTTP headers, status codes, and implementation best practices.


Rate Limiting Algorithms

Algorithm | How It Works | Pros | Cons
Token Bucket | Tokens added at fixed rate; requests consume tokens | Allows bursting, smooth average rate | Complex to implement correctly
Sliding Window Log | Track timestamp of each request in window | Most accurate, no boundary issues | High memory usage
Sliding Window Counter | Weighted average of current + previous window | Low memory, smooth transitions | Approximation, not exact
Leaky Bucket | Requests queue and drain at fixed rate | Smooth output rate, prevents bursts | No bursting allowed
Fixed Window | Count requests per fixed time window (e.g., 100/min) | Simple, easy to implement | Boundary burst problem
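
The token bucket row can be made concrete with a short sketch. This is a minimal, single-process illustration and not a production implementation: the class name `TokenBucket`, its parameters, and the injectable `clock` are assumptions made for the example, and a shared deployment would need locking or an external store.

```python
import time


class TokenBucket:
    """Token bucket: refills at `rate` tokens/second up to `capacity`."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate            # tokens added per second
        self.capacity = capacity    # maximum burst size
        self.clock = clock          # injectable clock (assumption, eases testing)
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self, cost=1):
        """Return True and consume `cost` tokens if enough are available."""
        now = self.clock()
        elapsed = now - self.updated
        self.updated = now
        # Refill lazily from elapsed time, capped at capacity.
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        if self.tokens >= cost:
            self.tokens -= cost
            return True
        return False
```

With `TokenBucket(rate=1, capacity=2)`, two back-to-back requests pass (the burst), a third is rejected, and one more request is admitted for each second that elapses.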

HTTP Rate Limit Headers

Header | Description | Example
X-RateLimit-Limit | Maximum requests allowed in window | 1000
X-RateLimit-Remaining | Requests remaining in current window | 42
X-RateLimit-Reset | Unix timestamp when window resets (some APIs send seconds until reset instead) | 1709726400
Retry-After | Seconds to wait, or an HTTP-date, before retrying (sent with 429 or 503) | 60
RateLimit-Limit | IETF draft: limit (req/window) | 100;w=60
RateLimit-Remaining | IETF draft: remaining requests | 10
RateLimit-Reset | IETF draft: seconds until reset | 35

# Example response headers
HTTP/1.1 200 OK
X-RateLimit-Limit: 1000
X-RateLimit-Remaining: 42
X-RateLimit-Reset: 1709726400

# IETF draft headers (not yet a published standard)
RateLimit-Limit: 100;w=60
RateLimit-Remaining: 10
RateLimit-Reset: 35
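
A client usually wants these headers as one uniform answer: how many requests are left, and how long until the window resets. The sketch below is one way to read both header families. The function name `parse_rate_limit` is hypothetical, and it assumes `X-RateLimit-Reset` is a Unix timestamp and the draft `RateLimit-Reset` is a number of seconds, as in the table above. Check the target API's documentation, since conventions vary.

```python
import time


def parse_rate_limit(headers, now=None):
    """Return (limit, remaining, seconds_until_reset) from response headers.

    Each value is None when its header is missing or not an integer.
    """
    now = time.time() if now is None else now
    # Header names are case-insensitive, so normalize before lookup.
    h = {k.lower(): v for k, v in headers.items()}

    def first_int(name):
        # Draft values may carry parameters ("100;w=60"); keep the leading number.
        raw = h.get(name)
        if raw is None:
            return None
        try:
            return int(raw.split(";")[0].split(",")[0].strip())
        except ValueError:
            return None

    if "ratelimit-limit" in h:
        # Draft style: reset is already a delta in seconds.
        return (first_int("ratelimit-limit"),
                first_int("ratelimit-remaining"),
                first_int("ratelimit-reset"))

    # X- style: reset is assumed to be a Unix timestamp.
    reset_at = first_int("x-ratelimit-reset")
    wait = None if reset_at is None else max(0, reset_at - int(now))
    return (first_int("x-ratelimit-limit"),
            first_int("x-ratelimit-remaining"),
            wait)
```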

HTTP Status Codes

Status Code | When to Use
429 Too Many Requests | Rate limit exceeded. Include Retry-After header.
503 Service Unavailable | Server overloaded (may include Retry-After).
403 Forbidden | Quota exceeded (some APIs use this).

# 429 Response Example
HTTP/1.1 429 Too Many Requests
Content-Type: application/json
Retry-After: 60
X-RateLimit-Limit: 100
X-RateLimit-Remaining: 0

{
  "error": "rate_limit_exceeded",
  "message": "You have exceeded your rate limit",
  "retry_after": 60
}
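
On the server side, the response above could be assembled by a small framework-neutral helper. This is only a sketch: the function name `too_many_requests` and its `(status, headers, body)` return shape are assumptions for illustration, and a real application would adapt it to its web framework's response object.

```python
import json
import math


def too_many_requests(limit, retry_after_seconds):
    """Build (status, headers, body) for a rate-limited request."""
    # Retry-After takes whole seconds; round up so clients never retry early.
    retry_after = max(1, math.ceil(retry_after_seconds))
    headers = {
        "Content-Type": "application/json",
        "Retry-After": str(retry_after),
        "X-RateLimit-Limit": str(limit),
        "X-RateLimit-Remaining": "0",
    }
    body = json.dumps({
        "error": "rate_limit_exceeded",
        "message": "You have exceeded your rate limit",
        "retry_after": retry_after,
    })
    return 429, headers, body
```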

Rate Limiting Strategies

Strategy | Use Case | Example
Per-User | Authenticated APIs | 100 req/min per API key
Per-IP | Public endpoints, auth endpoints | 10 req/min per IP
Per-Endpoint | Expensive operations | 10 search req/min, 100 others
Global | System-wide protection | 10000 req/min total
Tiered | Freemium APIs | Free: 100/hr, Pro: 1000/hr
Burst + Sustained | Allow short bursts, limit average | Burst: 50, Sustained: 200/min
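
The per-user and tiered rows combine naturally: key a counter on the API key and look the limit up from the caller's plan. The sketch below uses the sliding window counter algorithm from the first table. `TIER_LIMITS`, `SlidingWindowCounter`, and the in-memory dictionary are assumptions for illustration; a multi-instance service would typically keep this state in a shared store such as Redis.

```python
import time
from collections import defaultdict

# Hypothetical plan table mirroring the "Tiered" row: requests per hour.
TIER_LIMITS = {"free": 100, "pro": 1000}
WINDOW = 3600  # seconds


class SlidingWindowCounter:
    """Per-key sliding window counter (weighted current + previous window)."""

    def __init__(self, window=WINDOW, clock=time.time):
        self.window = window
        self.clock = clock
        # key -> [window_index, current_count, previous_count]
        self.state = defaultdict(lambda: [0, 0, 0])

    def allow(self, key, limit):
        now = self.clock()
        index = int(now // self.window)
        entry = self.state[key]
        if index != entry[0]:
            # Roll over: the old current window becomes "previous" only if it
            # is the immediately preceding one; otherwise it has expired.
            entry[2] = entry[1] if index == entry[0] + 1 else 0
            entry[0], entry[1] = index, 0
        # Fraction of the previous window still inside the sliding window.
        overlap = 1.0 - (now % self.window) / self.window
        estimated = entry[1] + entry[2] * overlap
        if estimated < limit:
            entry[1] += 1
            return True
        return False
```

A request handler would then call something like `limiter.allow(api_key, TIER_LIMITS[plan])` and return 429 when the result is False.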

Client-Side Rate Limit Handling

// JavaScript: exponential backoff that honors Retry-After
async function fetchWithRateLimit(url, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    const res = await fetch(url);

    if (res.status === 429) {
      // Retry-After is in seconds; setTimeout expects milliseconds.
      const retryAfter = Number.parseInt(res.headers.get('Retry-After'), 10);
      // Fall back to 1s, 2s, 4s... when the header is missing or not a number.
      const waitTime = (Number.isNaN(retryAfter) ? 2 ** i : retryAfter) * 1000;
      console.log(`Rate limited. Waiting ${waitTime}ms...`);
      await new Promise(r => setTimeout(r, waitTime));
      continue;
    }

    if (!res.ok) throw new Error(`HTTP ${res.status}`);
    return res.json();
  }
  throw new Error('Max retries exceeded');
}

# Python: Rate limit aware HTTP client
import time
import requests

def fetch_with_backoff(url, max_retries=3):
    for i in range(max_retries):
        response = requests.get(url, timeout=10)

        if response.status_code == 429:
            try:
                # Retry-After is usually delta-seconds; it may also be an HTTP-date.
                retry_after = int(response.headers.get('Retry-After', ''))
            except ValueError:
                retry_after = 2 ** i  # fall back to exponential backoff
            print(f"Rate limited. Waiting {retry_after}s...")
            time.sleep(retry_after)
            continue

        response.raise_for_status()
        return response.json()

    raise RuntimeError("Max retries exceeded")

# cURL: Check rate limit headers
curl -i https://api.example.com/data

# Response headers:
# X-RateLimit-Limit: 1000
# X-RateLimit-Remaining: 42
# X-RateLimit-Reset: 1709726400

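# Sleep until reset (bash)
# Fetch headers once; strip the trailing CR from each header line and match case-insensitively
headers=$(curl -sI https://api.example.com/data | tr -d '\r')
remaining=$(echo "$headers" | grep -i '^X-RateLimit-Remaining:' | cut -d' ' -f2)
if [ -n "$remaining" ] && [ "$remaining" -lt 10 ]; then
  reset=$(echo "$headers" | grep -i '^X-RateLimit-Reset:' | cut -d' ' -f2)
  sleep_time=$((reset - $(date +%s)))
  echo "Rate limit low. Sleeping for $sleep_time seconds"
  [ "$sleep_time" -gt 0 ] && sleep "$sleep_time"
fi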
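
The clients above treat Retry-After as a plain number of seconds, but the header may also carry an HTTP-date. Retrying on a fixed exponential schedule can also synchronize many clients. The helper below is a hedged sketch covering both cases with the standard library only. The name `retry_delay`, the 60-second `cap`, and the choice of "full jitter" are assumptions, not part of any API shown here.

```python
import random
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime


def retry_delay(retry_after, attempt, now=None, cap=60.0):
    """Seconds to wait before the next attempt.

    Prefers the server's Retry-After value (delta-seconds or HTTP-date);
    otherwise falls back to exponential backoff with full jitter.
    """
    if retry_after:
        value = retry_after.strip()
        if value.isdigit():
            return float(value)
        try:
            when = parsedate_to_datetime(value)
        except (TypeError, ValueError):
            when = None  # unparseable header: use the backoff below
        if when is not None:
            if when.tzinfo is None:
                when = when.replace(tzinfo=timezone.utc)
            now = now or datetime.now(timezone.utc)
            return max(0.0, (when - now).total_seconds())
    # Full jitter: uniform in [0, min(cap, 2**attempt)] seconds.
    return random.uniform(0, min(cap, 2 ** attempt))
```

In the Python client above, `time.sleep(retry_delay(response.headers.get('Retry-After'), i))` would replace the manual `int(...)` parsing.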

Rate Limiting Tools & Middleware

Tool | Stack | Algorithm
express-rate-limit | Node.js/Express | Fixed Window
rate-limit-redis | Node.js + Redis (store for express-rate-limit) | Fixed Window, shared across instances
django-ratelimit | Python/Django | Fixed Window
flask-limiter | Python/Flask | Multiple
Redis + Lua | Any (custom) | Token Bucket / Sliding
nginx limit_req | Nginx | Leaky Bucket
Cloudflare Rate Limiting | Edge/CDN | Sliding Window

Best Practices

Always include rate limit headers in responses (even successful ones)

Return 429 with Retry-After header, not 503 or 403

Document rate limits clearly in API docs

Use tiered limits for different user plans (free vs paid)

Consider higher limits for read-only vs write operations

Degrade gracefully where possible (for example, slow or queue requests as clients approach the limit) rather than cutting them off abruptly

Don't hide rate limits; transparency builds trust

Avoid per-endpoint limits that are too complex to understand
