Add Rate Limiting to API with AI Prompts

TL;DR

These prompts help you implement rate limiting to prevent API abuse. They cover token bucket and sliding window algorithms, per-user and global limits, and proper response headers. Rate limiting helps mitigate denial-of-service attacks, curbs abuse, and ensures fair usage across clients.

Basic Rate Limiting

Use this prompt to generate a complete rate limiting middleware with per-IP and per-user limits. Your AI will create reusable middleware, Redis or in-memory storage setup, proper 429 responses, and standard rate limit headers (X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset).

AI Prompt

Basic Rate Limiting Setup

Add rate limiting to my API endpoints.

Requirements:

  1. Limit requests per IP address (for unauthenticated requests)
  2. Limit requests per user/API key (for authenticated requests)
  3. Return proper 429 status when limit exceeded
  4. Include rate limit headers in responses
  5. Store rate limit state (Redis or in-memory)

Default limits:

  • Unauthenticated: 60 requests per minute
  • Authenticated: 1000 requests per minute
  • Specific endpoints may have custom limits

Response headers to include:

  • X-RateLimit-Limit: total allowed
  • X-RateLimit-Remaining: remaining requests
  • X-RateLimit-Reset: when limit resets (Unix timestamp)
  • Retry-After: seconds to wait (when limited)

Create reusable middleware that can be configured per route.
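To make the requirements above concrete, here is a minimal sketch of what the generated code might look like: an in-memory token bucket limiter that produces the values for the standard headers. It is illustrative only; a real deployment would swap the Map for Redis, and all names are assumptions, not a specific library's API.

```typescript
// Minimal in-memory token bucket limiter (a sketch, not production code).
// Each key (an IP or API key) gets a bucket that refills at `limit` tokens per `windowMs`.

interface Bucket {
  tokens: number;
  lastRefill: number; // ms timestamp of the last refill
}

interface RateLimitResult {
  allowed: boolean;
  limit: number;
  remaining: number;
  resetAt: number; // Unix seconds when the bucket is full again
}

class TokenBucketLimiter {
  private buckets = new Map<string, Bucket>();

  constructor(private limit: number, private windowMs: number) {}

  consume(key: string, now = Date.now()): RateLimitResult {
    const bucket = this.buckets.get(key) ?? { tokens: this.limit, lastRefill: now };
    // Refill proportionally to the time elapsed since the last refill.
    const refill = ((now - bucket.lastRefill) / this.windowMs) * this.limit;
    bucket.tokens = Math.min(this.limit, bucket.tokens + refill);
    bucket.lastRefill = now;

    const allowed = bucket.tokens >= 1;
    if (allowed) bucket.tokens -= 1;
    this.buckets.set(key, bucket);

    const msUntilFull = ((this.limit - bucket.tokens) / this.limit) * this.windowMs;
    return {
      allowed,
      limit: this.limit,
      remaining: Math.floor(bucket.tokens),
      resetAt: Math.ceil((now + msUntilFull) / 1000),
    };
  }
}

// 60 requests per minute per IP, matching the prompt's unauthenticated default.
const limiter = new TokenBucketLimiter(60, 60_000);
const result = limiter.consume("203.0.113.7");
// Middleware would map this onto headers on every response:
// X-RateLimit-Limit: result.limit, X-RateLimit-Remaining: result.remaining,
// X-RateLimit-Reset: result.resetAt; plus Retry-After when !result.allowed.
```

Because the bucket refills continuously, clients recover capacity gradually instead of all at once at a window boundary.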

Framework-Specific Implementation

Next.js Rate Limiting

Copy this prompt to generate rate limiting utilities for both Next.js App Router and Pages Router. You'll get a higher-order function wrapper, Upstash Redis or in-memory storage, and edge-compatible helpers that handle serverless cold starts.

AI Prompt

Next.js Rate Limiting

Add rate limiting to my Next.js API routes.

For both App Router and Pages Router:

  1. Create a rate limiting utility using Upstash Redis or in-memory
  2. Support edge runtime (for middleware) and Node runtime
  3. Create higher-order function for Pages Router
  4. Create helper for App Router route handlers
  5. Support Vercel's x-real-ip header for IP detection

Usage should be simple:

  • Pages Router: export default withRateLimit(handler, { limit: 10 })
  • App Router: await rateLimit(request, { limit: 10 })

Handle serverless cold starts (in-memory won't persist between invocations).
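For orientation, here is one plausible shape for the App Router helper the prompt asks for, assuming a fixed-window counter keyed by client IP. Every name here is illustrative; on Vercel the module-level Map resets on each cold start, which is exactly why the prompt suggests Upstash Redis for persistent state.

```typescript
// Sketch of an App Router rate limit helper (names are illustrative).
// In-memory state does not survive serverless cold starts; back this
// with Upstash Redis in production on Vercel.

type RateLimitOptions = { limit: number; windowMs?: number };

const hits = new Map<string, { count: number; windowStart: number }>();

function rateLimit(
  identifier: string, // e.g. request.headers.get("x-real-ip") ?? "anonymous"
  { limit, windowMs = 60_000 }: RateLimitOptions,
  now = Date.now(),
): { success: boolean; remaining: number; reset: number } {
  const entry = hits.get(identifier);
  if (!entry || now - entry.windowStart >= windowMs) {
    // Start a fresh window for this identifier.
    hits.set(identifier, { count: 1, windowStart: now });
    return { success: true, remaining: limit - 1, reset: Math.ceil((now + windowMs) / 1000) };
  }
  entry.count += 1;
  return {
    success: entry.count <= limit,
    remaining: Math.max(0, limit - entry.count),
    reset: Math.ceil((entry.windowStart + windowMs) / 1000),
  };
}

// In a route handler (e.g. app/api/hello/route.ts), usage might look like:
//   const ip = request.headers.get("x-real-ip") ?? "anonymous";
//   const { success } = rateLimit(ip, { limit: 10 });
//   if (!success) return new Response("Too Many Requests", { status: 429 });
```

The Pages Router higher-order function would wrap a handler and call the same helper before delegating.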

Express.js Rate Limiting

Use this prompt to set up Express rate limiting with Redis-backed storage, IP whitelisting, and tiered limits by user plan. Your AI will configure global limits, stricter auth endpoint limits, and separate dev/production settings.

AI Prompt

Express Rate Limiting

Add rate limiting to my Express.js API.

Options:

  1. Use express-rate-limit with Redis store
  2. Implement custom rate limiter with sliding window

Requirements:

  • Global rate limit for all routes
  • Stricter limits for auth endpoints (login, register)
  • Skip rate limiting for health checks
  • Whitelist certain IPs (internal services)
  • Different limits by user tier (free vs paid)

Configuration:

  • Development: relaxed limits for testing
  • Production: strict limits

Also add request throttling for expensive operations.
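Option 2 above (a custom sliding-window limiter) can be sketched as a sliding window log: store a timestamp per request and count only the ones inside the window. This is an assumption-laden sketch, not express-rate-limit's internals.

```typescript
// Sliding-window-log limiter: counts requests in the trailing window,
// so there is no burst at window boundaries (unlike fixed windows).

class SlidingWindowLimiter {
  private log = new Map<string, number[]>();

  constructor(private limit: number, private windowMs: number) {}

  isAllowed(key: string, now = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.log.get(key) ?? []).filter((t) => t > cutoff);
    if (recent.length >= this.limit) {
      this.log.set(key, recent);
      return false;
    }
    recent.push(now);
    this.log.set(key, recent);
    return true;
  }
}

// As Express middleware (Express itself not shown; req/res shapes assumed):
// const loginLimiter = new SlidingWindowLimiter(5, 60_000); // 5/min for auth routes
// app.post("/login", (req, res, next) => {
//   if (!loginLimiter.isAllowed(req.ip ?? "unknown")) return res.status(429).end();
//   next();
// });
```

The trade-off versus token buckets is memory: the log stores one timestamp per request per key, which is why Redis sorted sets are the usual distributed backing store for this algorithm.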

Advanced Rate Limiting

This prompt asks your AI to build subscription-aware rate limiting with Free, Pro, and Enterprise tiers. You'll get tier lookup logic, burst allowance handling, usage tracking for billing, and per-endpoint limit configuration.

AI Prompt

Tiered Rate Limits

Implement tiered rate limiting based on user subscription.

Tiers:

  • Free: 100 requests/hour
  • Pro: 1000 requests/hour
  • Enterprise: 10000 requests/hour

Requirements:

  1. Look up user tier from database/cache
  2. Apply appropriate limit based on tier
  3. Return tier info in response headers
  4. Track usage for billing/analytics
  5. Grace period for temporary overages

Also implement:

  • Burst allowance (temporary spike above limit)
  • Separate limits for different endpoint types
  • Usage dashboard data collection

Copy this prompt to configure per-endpoint rate limits based on operation type (auth, read, write, search). Your AI will generate a configurable decorator or middleware with violation logging and abuse alerting.

AI Prompt

Endpoint-Specific Limits

Add different rate limits for different types of endpoints.

Categories:

  1. Auth endpoints (login, register): Very strict (5/minute per IP)
  2. Read endpoints: Moderate (100/minute)
  3. Write endpoints: Stricter (30/minute)
  4. Search/expensive: Very strict (10/minute)
  5. Public/cached: Lenient (1000/minute)

Implement:

  • Rate limit decorator/middleware that accepts config
  • Configuration file for all endpoints
  • Easy override for specific routes
  • Logging of rate limit violations
  • Alerting for sustained abuse

Don't forget distributed systems: If you have multiple server instances, use Redis or another shared store for rate limit state. In-memory rate limiting won't work correctly when requests hit different servers.

Handling Rate Limits

Use this prompt to improve how your API communicates rate limits to clients. You'll get a standardized 429 error response format with Retry-After headers, plus client-side code examples demonstrating exponential backoff.

AI Prompt

Rate Limit Response Handling

Improve how my API handles and communicates rate limits.

Server-side:

  1. Return 429 status code when limited
  2. Include helpful error message with limit details
  3. Add Retry-After header with wait time
  4. Log rate limit events for monitoring

Client-side guidance:

  1. Explain exponential backoff strategy
  2. Document rate limit headers
  3. Provide code examples for handling 429

Response format:

  {
    "error": "rate_limit_exceeded",
    "message": "Too many requests. Please wait 30 seconds.",
    "retryAfter": 30,
    "limit": 100,
    "remaining": 0,
    "reset": 1706745600
  }
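The client-side guidance in this prompt boils down to one decision: how long to wait before retrying a 429. A minimal sketch, assuming the server sends retryAfter as in the format above (function name and defaults are illustrative; jitter is omitted for brevity):

```typescript
// Pick the wait before retrying a 429: honor the server's retryAfter
// when present, else fall back to exponential backoff capped at maxMs.

function retryDelayMs(
  attempt: number,            // 0-based retry count
  retryAfterSeconds?: number, // from the 429 body or the Retry-After header
  baseMs = 1_000,
  maxMs = 30_000,
): number {
  if (retryAfterSeconds !== undefined) return retryAfterSeconds * 1000;
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// A retry loop would sleep retryDelayMs(attempt, body.retryAfter)
// after each 429 and give up after a fixed number of attempts.
```

In production clients you would normally add random jitter to the fallback delay so many clients do not retry in lockstep.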

Pro tip: Consider using a sliding window algorithm instead of fixed windows. Fixed windows can allow burst traffic right before and after the window resets. Sliding windows provide smoother rate limiting.

What rate limits should I start with?

Start conservative and increase based on legitimate usage. A good starting point is 60/minute for unauthenticated and 1000/minute for authenticated users. Monitor and adjust based on actual usage patterns.

Should I rate limit by IP or by user?

Both. Use IP limiting for unauthenticated requests and user/API key limiting for authenticated requests. This protects against both anonymous abuse and authenticated abuse.

What's the difference between rate limiting and throttling?

Rate limiting rejects requests over the limit with a 429. Throttling slows requests down (adds delay) instead of rejecting them. Rate limiting is more common for public APIs; throttling is typically used to prevent system overload.

Test Your Rate Limiting

Check if your API has proper rate limiting and abuse protection.
