TL;DR
The most important rate-limiting practice is to apply different limits to different endpoints: strict limits for authentication and expensive operations, more lenient limits for public reads. Also use sliding windows, identify users by authenticated ID (not just IP), return proper 429 responses with Retry-After headers, and use a distributed store such as Redis for multi-server deployments.
"Rate limiting is your API's immune system. Without it, a single bad actor can bring down your entire service and bankrupt your cloud budget."
Why Rate Limiting Matters
Rate limiting protects against:
- Brute force attacks: Login attempts, password resets
- DDoS: Overwhelming your servers
- Scraping: Automated data extraction
- API abuse: Excessive usage beyond plan limits
- Cost attacks: Running up your cloud bills
Best Practice 1: Different Limits for Different Endpoints (5 min)
Not all endpoints need the same limits:
import express from 'express';
import rateLimit from 'express-rate-limit';

const app = express();

// General API: 100 requests per 15 minutes
const apiLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  standardHeaders: true, // send the RateLimit-* headers
  legacyHeaders: false, // omit the older X-RateLimit-* headers
});

// Login: 5 attempts per 15 minutes
const loginLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 5,
  message: { error: 'Too many login attempts' },
  skipSuccessfulRequests: true, // only failed attempts count toward the limit
});

// Password reset: 3 per hour
const resetLimiter = rateLimit({
  windowMs: 60 * 60 * 1000,
  max: 3,
});

// Expensive operations: 10 per hour
const expensiveLimiter = rateLimit({
  windowMs: 60 * 60 * 1000,
  max: 10,
});

// Apply limits (register these before the route handlers themselves)
app.use('/api/', apiLimiter);
app.post('/api/auth/login', loginLimiter);
app.post('/api/auth/reset-password', resetLimiter);
app.post('/api/generate', expensiveLimiter);
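Best Practice 2: Identify Users Correctly (3 min)
IP-based limiting alone is not enough for authenticated APIs: many users can share one IP address (offices, mobile carriers), and a single account can spread its requests across many IPs. Key the limit on the user ID when one is available: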
const userLimiter = rateLimit({
  windowMs: 15 * 60 * 1000,
  max: 100,
  keyGenerator: (req) => {
    // Use user ID for authenticated requests
    if (req.user?.id) {
      return `user:${req.user.id}`;
    }
    // Fall back to IP for unauthenticated
    return `ip:${req.ip}`;
  },
  skip: (req) => {
    // Skip rate limiting for admins
    return req.user?.role === 'admin';
  },
});
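Best Practice 3: Use Redis for Distributed Systems (5 min)
In-memory rate limiting does not work across multiple servers: each instance keeps its own counters, so a client whose requests are spread across N instances by a load balancer can get up to N times the intended limit. Keep the counters in a shared store such as Redis: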
import rateLimit from 'express-rate-limit';
import RedisStore from 'rate-limit-redis';
import { createClient } from 'redis';

const redisClient = createClient({
  url: process.env.REDIS_URL,
});
await redisClient.connect();

const limiter = rateLimit({
  store: new RedisStore({
    sendCommand: (...args) => redisClient.sendCommand(args),
  }),
  windowMs: 15 * 60 * 1000,
  max: 100,
});
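To see why the store has to be shared, here is a minimal, dependency-free sketch. It is not how express-rate-limit or rate-limit-redis are implemented; `createLimiter` and `countAllowed` are hypothetical helpers, and a plain `Map` stands in for Redis. Two simulated servers handle 300 requests from one client, first with separate counters and then with a shared one.

```javascript
// Minimal fixed-window counter. `store` is any Map-like object; in production
// this role would be played by a shared store such as Redis.
function createLimiter(store, { windowMs, max }) {
  return function isAllowed(key, now) {
    const entry = store.get(key);
    if (!entry || now >= entry.resetAt) {
      // First hit in a new window: start counting from 1.
      store.set(key, { count: 1, resetAt: now + windowMs });
      return true;
    }
    entry.count += 1;
    return entry.count <= max;
  };
}

// Send `total` requests for one client, alternating between servers
// (round-robin, as a load balancer might), and count how many are allowed.
function countAllowed(servers, total) {
  let allowed = 0;
  for (let i = 0; i < total; i += 1) {
    const isAllowed = servers[i % servers.length];
    if (isAllowed('user:42', 1000)) allowed += 1;
  }
  return allowed;
}

const policy = { windowMs: 15 * 60 * 1000, max: 100 };

// Case 1: each server keeps its own in-memory counters.
const isolated = [
  createLimiter(new Map(), policy),
  createLimiter(new Map(), policy),
];

// Case 2: both servers read and write the same store.
const sharedStore = new Map();
const shared = [
  createLimiter(sharedStore, policy),
  createLimiter(sharedStore, policy),
];

const isolatedAllowed = countAllowed(isolated, 300);
const sharedAllowed = countAllowed(shared, 300);
console.log({ isolatedAllowed, sharedAllowed });
```

With separate counters the client gets twice the intended limit; with a shared store the limit holds. A real shared store also has to increment atomically across processes, which Redis commands such as INCR provide.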
Best Practice 4: Return Proper Headers (2 min)
Help clients understand rate limits:
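// Response headers to include:
RateLimit-Limit: 100 // Max requests allowed in the window
RateLimit-Remaining: 42 // Requests remaining in the window
// With standardHeaders: true, the reset value is a number of seconds;
// the legacy X-RateLimit-Reset header carries a Unix timestamp instead.
RateLimit-Reset: 120 // Seconds until the window resets
Retry-After: 120 // Seconds until the client can retry (sent with a 429)

// Example 429 response body
{
  "error": "Too many requests",
  "retryAfter": 120
}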
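As an illustration, the headers and the 429 body can be derived from a limiter's state roughly as follows. This sketch assumes the draft-style `RateLimit-Reset` expressed in seconds until reset. `buildRateLimitResponse` and the `{ limit, used, resetAtMs }` state shape are hypothetical, not part of express-rate-limit, which sets these headers for you.

```javascript
// Build rate-limit headers (and a 429 payload when over the limit).
// `limit`: max requests per window; `used`: requests counted so far,
// including the current one; `resetAtMs`: when the window ends (epoch ms).
function buildRateLimitResponse({ limit, used, resetAtMs }, nowMs) {
  // Whole seconds until the window resets, never negative.
  const secondsToReset = Math.max(0, Math.ceil((resetAtMs - nowMs) / 1000));

  const headers = {
    'RateLimit-Limit': String(limit),
    'RateLimit-Remaining': String(Math.max(0, limit - used)),
    'RateLimit-Reset': String(secondsToReset),
  };

  if (used <= limit) {
    // Under the limit: attach the informational headers only.
    return { limited: false, headers };
  }

  // Over the limit: tell the client how long to wait.
  headers['Retry-After'] = String(secondsToReset);
  return {
    limited: true,
    status: 429,
    headers,
    body: { error: 'Too many requests', retryAfter: secondsToReset },
  };
}

const now = 1000000;
const windowEnd = now + 120000; // window ends in 120 seconds

const ok = buildRateLimitResponse({ limit: 100, used: 58, resetAtMs: windowEnd }, now);
const blocked = buildRateLimitResponse({ limit: 100, used: 101, resetAtMs: windowEnd }, now);

console.log(ok.headers);
console.log(blocked.status, blocked.headers, blocked.body);
```

Sending `Retry-After` only on the 429 keeps normal responses small while still giving blocked clients an exact wait time.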
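Best Practice 5: Sliding Window Algorithm (3 min)
Sliding windows are smoother than fixed windows. With a fixed window, a client can spend a full quota at the end of one window and another at the start of the next, sending up to twice the limit in a short burst: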
| Algorithm | Pros | Cons |
|---|---|---|
| Fixed Window | Simple, low memory | Burst at window boundary |
| Sliding Window | Smooth, no burst | More complex |
| Token Bucket | Allows controlled bursts | More complex |
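The three algorithms in the table can be compared with a small, dependency-free sketch. These are simplified single-client versions written for illustration (the function names are hypothetical, and production limiters track one state per key). Each limiter allows 5 requests per 60 seconds and receives 5 requests just before a window boundary and 5 just after it.

```javascript
// Fixed window: one counter per clock-aligned window.
function createFixedWindow({ windowMs, max }) {
  let currentWindow = -1;
  let count = 0;
  return function isAllowed(now) {
    const windowIndex = Math.floor(now / windowMs);
    if (windowIndex !== currentWindow) {
      // A new window has started: reset the counter.
      currentWindow = windowIndex;
      count = 0;
    }
    count += 1;
    return count <= max;
  };
}

// Sliding window log: remember the timestamps of accepted requests and
// count only those inside the trailing window.
function createSlidingWindowLog({ windowMs, max }) {
  let timestamps = [];
  return function isAllowed(now) {
    // Drop entries that have aged out of the trailing window.
    timestamps = timestamps.filter((t) => t > now - windowMs);
    if (timestamps.length >= max) return false;
    timestamps.push(now);
    return true;
  };
}

// Token bucket: tokens refill continuously; each request spends one token.
function createTokenBucket({ capacity, refillPerMs }) {
  let tokens = capacity;
  let last = null;
  return function isAllowed(now) {
    if (last !== null) {
      tokens = Math.min(capacity, tokens + (now - last) * refillPerMs);
    }
    last = now;
    if (tokens < 1) return false;
    tokens -= 1;
    return true;
  };
}

// 5 requests at t = 59s (end of window 0), then 5 at t = 60s (start of window 1).
function burstAtBoundary(isAllowed) {
  let allowed = 0;
  for (let i = 0; i < 5; i += 1) if (isAllowed(59000)) allowed += 1;
  for (let i = 0; i < 5; i += 1) if (isAllowed(60000)) allowed += 1;
  return allowed;
}

const fixedAllowed = burstAtBoundary(createFixedWindow({ windowMs: 60000, max: 5 }));
const slidingAllowed = burstAtBoundary(createSlidingWindowLog({ windowMs: 60000, max: 5 }));
const bucketAllowed = burstAtBoundary(createTokenBucket({ capacity: 5, refillPerMs: 5 / 60000 }));

console.log({ fixedAllowed, slidingAllowed, bucketAllowed });
```

The fixed window lets all 10 requests through within one second, while the sliding log and the token bucket each allow only 5. The sliding log pays for this by storing one timestamp per accepted request, which is the extra complexity noted in the table.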
Recommended Limits
| Endpoint Type | Recommended Limit |
|---|---|
| General API | 100-1000/hour |
| Login | 5-10/15 minutes |
| Password reset | 3-5/hour |
| Email sending | 10/hour |
| AI/expensive | 10-50/hour |
| Public read | 1000+/hour |
Official Resources: For comprehensive rate limiting guidance, see OWASP Denial of Service Cheat Sheet, Google Cloud Rate Limiting Strategies, and express-rate-limit documentation.
Should I rate limit by IP or user ID?
Both. Use user ID for authenticated requests (prevents abuse from one account) and IP for unauthenticated requests (prevents brute force). Some attacks come from single IPs with multiple accounts.
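A rough sketch of checking both keys on every request is shown below. `createDualLimiter` is a hypothetical helper with illustrative limits, using a simple in-memory fixed window; it is not an express-rate-limit feature. The per-IP limit is set higher than the per-user limit because several legitimate users may share an address.

```javascript
// Count each request against its IP and, when authenticated, its user ID.
// The request is allowed only if every applicable counter is within its limit.
function createDualLimiter({ windowMs, maxPerUser, maxPerIp }) {
  const counters = new Map(); // key -> { count, resetAt }

  function increment(key, now) {
    const entry = counters.get(key);
    if (!entry || now >= entry.resetAt) {
      counters.set(key, { count: 1, resetAt: now + windowMs });
      return 1;
    }
    entry.count += 1;
    return entry.count;
  }

  return function isAllowed(req, now) {
    const ipOk = increment(`ip:${req.ip}`, now) <= maxPerIp;
    // Unauthenticated requests are only checked against the IP limit.
    const userOk = req.user?.id
      ? increment(`user:${req.user.id}`, now) <= maxPerUser
      : true;
    return ipOk && userOk;
  };
}

const limits = { windowMs: 60000, maxPerUser: 5, maxPerIp: 10 };

// Scenario 1: one IP cycling through 3 accounts, 4 requests each (12 total).
// Every account stays under its own limit, but the IP limit stops the run.
const limiterA = createDualLimiter(limits);
let manyAccountsAllowed = 0;
for (let i = 0; i < 12; i += 1) {
  const req = { ip: '203.0.113.7', user: { id: `acct-${i % 3}` } };
  if (limiterA(req, 0)) manyAccountsAllowed += 1;
}

// Scenario 2: one account sending 8 requests, each from a different IP.
// Every IP is seen once, but the per-user limit stops the run.
const limiterB = createDualLimiter(limits);
let manyIpsAllowed = 0;
for (let i = 0; i < 8; i += 1) {
  const req = { ip: `198.51.100.${i}`, user: { id: 'acct-0' } };
  if (limiterB(req, 0)) manyIpsAllowed += 1;
}

console.log({ manyAccountsAllowed, manyIpsAllowed });
```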
How do I handle rate limiting behind a proxy?
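Configure your app to trust the proxy so that it reads the real client IP from the X-Forwarded-For header. In Express: app.set('trust proxy', 1), where the number is how many proxy hops sit in front of the app. Be careful not to trust arbitrary headers: clients can send their own X-Forwarded-For value, so trusting more hops than you actually have lets them spoof their IP and dodge IP-based limits.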
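The idea behind hop-based trust can be sketched as follows. This is a simplified model for illustration, not Express's actual implementation, and `clientIp` is a hypothetical helper. Each proxy appends the address it received the request from, so only the rightmost entries, added by your own proxies, can be believed.

```javascript
// Resolve the client IP given the socket address, the X-Forwarded-For
// header value, and the number of proxy hops we operate and trust.
function clientIp(remoteAddress, forwardedFor, trustedHops) {
  // Addresses ordered from closest to our server to furthest away.
  const chain = [remoteAddress];
  if (forwardedFor) {
    const forwarded = forwardedFor
      .split(',')
      .map((part) => part.trim())
      .filter(Boolean);
    chain.push(...forwarded.reverse());
  }
  // Skip the trusted hops; the first untrusted address is the client.
  const index = Math.min(trustedHops, chain.length - 1);
  return chain[index];
}

// The attacker sent "X-Forwarded-For: 6.6.6.6"; our proxy (10.0.0.1)
// appended the address it actually saw, 203.0.113.7.
const header = '6.6.6.6, 203.0.113.7';

const withOneHop = clientIp('10.0.0.1', header, 1);
const withNoTrust = clientIp('10.0.0.1', header, 0);
const overTrusted = clientIp('10.0.0.1', header, 2);

console.log({ withOneHop, withNoTrust, overTrusted });
```

With one trusted hop the limiter keys on the address the proxy observed. With no trust, every client appears to be the proxy itself. Trusting two hops when only one exists hands control of the key to the attacker's spoofed value.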
Should I tell users when they are rate limited?
Yes, return a 429 status with a Retry-After header and a clear error message. This helps legitimate users and automated clients back off appropriately.
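On the client side, backing off means reading that header before retrying. The sketch below is one way to do it; `retryDelayMs` and its 1-second fallback are hypothetical choices. It handles both forms HTTP allows for Retry-After: a number of seconds or an HTTP date.

```javascript
// Convert a Retry-After header value into a wait time in milliseconds.
function retryDelayMs(retryAfter, nowMs, fallbackMs = 1000) {
  if (retryAfter == null || retryAfter === '') return fallbackMs;
  const value = String(retryAfter).trim();

  // Form 1: a non-negative integer number of seconds.
  if (/^\d+$/.test(value)) return Number(value) * 1000;

  // Form 2: an HTTP date; wait until that moment (never a negative delay).
  const dateMs = Date.parse(value);
  if (!Number.isNaN(dateMs)) return Math.max(0, dateMs - nowMs);

  // Unparseable value: fall back to a default delay.
  return fallbackMs;
}

const fromSeconds = retryDelayMs('120', Date.now());
const fromDate = retryDelayMs(
  'Wed, 21 Oct 2015 07:28:00 GMT',
  Date.UTC(2015, 9, 21, 7, 27, 30), // 30 seconds before the date above
);
const fromMissing = retryDelayMs(undefined, Date.now());

console.log({ fromSeconds, fromDate, fromMissing });
```

A client would typically sleep for the returned delay before retrying, and cap the number of retries so a persistent 429 does not loop forever.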