Case Study

Production API Rate Limiting

Fixed-window rate limiting at the Next.js edge layer using Upstash Redis — per-IP enforcement with a deliberate fail-open design that keeps the site available when Redis is unreachable.

100 max requests per minute per IP
0 ms added overhead when Redis is not configured (fail-open)
9 unit tests, including boundary and proxy cases
Edge runtime, avoiding serverless cold starts

The Problem

The /api/v1/* routes on damilola.tech include a Claude-powered job-scoring endpoint and a resume-tailoring endpoint — both hit the Anthropic API and incur non-trivial cost per call. Without a rate limit, a single client could exhaust the monthly token budget in minutes and take down the AI chat features for all users.

The constraint: protection must work at the edge (before any serverless function cold-starts), be distributed across Vercel's global edge nodes, and degrade gracefully — blocking all traffic because a Redis instance is unavailable would be worse than the abuse it prevents.

Architecture

Fixed-Window Counter with Time-Bucketed Keys

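The implementation uses a fixed-window counter with time-bucketed keys. It is not a sliding window: a client can send up to 2× the limit in a short burst that straddles a window boundary, which is an accepted trade-off for a one-command counter. Each window is 60 seconds wide, and the key includes both the IP and the current window ID, so a new window starts a fresh counter with no reset step: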

const windowId = Math.floor(Date.now() / 1000 / WINDOW_SEC);
const key = `ratelimit:api:${ip}:${windowId}`;

Keys carry a TTL of 2× the window duration (120 s), refreshed on every request, so Redis discards counters from past windows without an explicit delete. A single Upstash pipeline call sends INCR and EXPIRE in one round trip.
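
The key derivation can be tried in isolation. The sketch below is illustrative only: windowKey mirrors the two lines shown earlier, while secondsUntilReset is a hypothetical helper (not part of the middleware) showing how the time remaining in the current window could be computed.

```typescript
const WINDOW_SEC = 60;

// Build the counter key for an IP at a given timestamp (ms since epoch).
function windowKey(ip: string, nowMs: number): string {
  const windowId = Math.floor(nowMs / 1000 / WINDOW_SEC);
  return `ratelimit:api:${ip}:${windowId}`;
}

// Hypothetical helper: whole seconds until the current window rolls over.
function secondsUntilReset(nowMs: number): number {
  const elapsed = Math.floor(nowMs / 1000) % WINDOW_SEC;
  return WINDOW_SEC - elapsed;
}

// Two requests one millisecond apart can land in different windows.
const before = windowKey('203.0.113.1', 59_999); // window 0
const after = windowKey('203.0.113.1', 60_000);  // window 1
console.log(before !== after);
```

Because the window ID is part of the key, no code ever resets a counter; the old key simply stops being written to and expires on its own.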

Fail-Open Design

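There are three explicit fail-open paths, all of which return NextResponse.next() rather than blocking: missing Redis credentials (UPSTASH_REDIS_REST_URL or UPSTASH_REDIS_REST_TOKEN unset), a non-OK response from the Redis REST call, and a network error thrown by fetch.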
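
The fail-open rule can be expressed as a pure decision function, which makes each path easy to check without a network. This is a sketch under assumptions: decide, Decision, and PipelineEntry are hypothetical names, and the extra guard for a non-numeric result is an addition that the middleware shown later does not perform explicitly.

```typescript
type Decision = 'allow' | 'block';

// One entry of a pipeline response: either a result or an error.
interface PipelineEntry { result?: unknown; error?: string }

function decide(
  configured: boolean,         // are the Redis URL and token both set?
  responseOk: boolean | null,  // null models a thrown network error
  body: PipelineEntry[] | null,
  limit: number,
): Decision {
  if (!configured) return 'allow';         // path 1: credentials missing
  if (responseOk === null) return 'allow'; // path 2: network error
  if (!responseOk) return 'allow';         // path 3: non-OK HTTP status
  const count = body?.[0]?.result;
  // Assumed extra guard: a malformed or error entry also fails open.
  if (typeof count !== 'number') return 'allow';
  return count > limit ? 'block' : 'allow';
}

console.log(decide(true, true, [{ result: 101 }], 100)); // only this shape blocks
```

The only way to reach 'block' is a well-formed counter value above the limit; every uncertainty resolves to 'allow'.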

IP Extraction

Next.js 15 removed NextRequest.ip. The middleware reads the real client IP from x-forwarded-for (Vercel's canonical header), taking the first entry to handle multi-proxy chains ("203.0.113.1, 10.0.0.1, 172.16.0.1" → "203.0.113.1"), with x-real-ip as a fallback and 'unknown' as a shared bucket when both are absent.
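
The extraction order described above can be sketched as a standalone function. The names clientIp, HeaderReader, and fromRecord are hypothetical, and this version makes one assumption beyond the description: an empty header value is treated the same as a missing one.

```typescript
// Minimal subset of the Headers interface needed here.
interface HeaderReader { get(name: string): string | null }

function clientIp(headers: HeaderReader): string {
  // First entry of x-forwarded-for is the original client.
  const forwarded = headers.get('x-forwarded-for')?.split(',')[0]?.trim();
  if (forwarded) return forwarded;
  const realIp = headers.get('x-real-ip')?.trim();
  if (realIp) return realIp;
  return 'unknown'; // shared bucket when no header is usable
}

// Test helper: build a HeaderReader from a plain object with lowercase keys.
function fromRecord(record: Record<string, string>): HeaderReader {
  return { get: (name) => record[name.toLowerCase()] ?? null };
}

const chain = fromRecord({ 'x-forwarded-for': '203.0.113.1, 10.0.0.1, 172.16.0.1' });
console.log(clientIp(chain));
```

Note that every client without a usable header shares the 'unknown' bucket, so those requests are limited collectively rather than individually.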

Implementation Highlights

The full middleware from middleware.ts:

import { NextRequest, NextResponse } from 'next/server';

export const runtime = 'edge';
export const RATE_LIMIT = 100;   // max requests per window per IP
export const WINDOW_SEC = 60;    // fixed window size in seconds

export async function middleware(request: NextRequest): Promise<NextResponse> {
  const redisUrl = process.env.UPSTASH_REDIS_REST_URL;
  const redisToken = process.env.UPSTASH_REDIS_REST_TOKEN;

  // Fail-open: Redis credentials missing → no rate limiting
  if (!redisUrl || !redisToken) return NextResponse.next();

  const ip =
    request.headers.get('x-forwarded-for')?.split(',')[0].trim() ??
    request.headers.get('x-real-ip') ??
    'unknown';

  const windowId = Math.floor(Date.now() / 1000 / WINDOW_SEC);
  const key = `ratelimit:api:${ip}:${windowId}`;

  let count: number;
  try {
    const res = await fetch(`${redisUrl}/pipeline`, {
      method: 'POST',
      headers: { Authorization: `Bearer ${redisToken}`, 'Content-Type': 'application/json' },
      body: JSON.stringify([
        ['INCR', key],
        ['EXPIRE', key, WINDOW_SEC * 2],   // 2-window TTL ensures cleanup
      ]),
    });
    if (!res.ok) return NextResponse.next();   // fail open on Redis error
    const [{ result: incr }] = await res.json();
    count = incr;
  } catch {
    return NextResponse.next();   // fail open on network error
  }

  if (count > RATE_LIMIT) {
    return new NextResponse(
      JSON.stringify({ error: 'Too many requests. Please try again later.' }),
      { status: 429, headers: { 'Content-Type': 'application/json',
                                'Retry-After': String(WINDOW_SEC) } },
    );
  }
  return NextResponse.next();
}

export const config = { matcher: '/api/v1/:path*' };
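
The counting logic can be exercised without Redis by swapping in an in-memory counter. The following is a minimal sketch, not the project's test suite: FakeCounterStore, isAllowed, and countAllowed are hypothetical names, and the fake store ignores TTLs entirely.

```typescript
const RATE_LIMIT = 100;
const WINDOW_SEC = 60;

// In-memory stand-in for Redis INCR (no expiry handling).
class FakeCounterStore {
  private counts = new Map<string, number>();
  incr(key: string): number {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }
}

// Same check as the middleware: increment first, then compare to the limit.
function isAllowed(store: FakeCounterStore, ip: string, nowMs: number): boolean {
  const windowId = Math.floor(nowMs / 1000 / WINDOW_SEC);
  const count = store.incr(`ratelimit:api:${ip}:${windowId}`);
  return count <= RATE_LIMIT;
}

// Send n requests at the same instant and count how many pass.
function countAllowed(store: FakeCounterStore, ip: string, nowMs: number, n: number): number {
  let allowed = 0;
  for (let i = 0; i < n; i++) {
    if (isAllowed(store, ip, nowMs)) allowed++;
  }
  return allowed;
}

const store = new FakeCounterStore();
const lateInWindow = countAllowed(store, '203.0.113.1', 59_000, 150);  // window 0
const earlyInNext = countAllowed(store, '203.0.113.1', 60_000, 150);   // window 1
console.log(lateInWindow, earlyInNext);
```

Each window admits exactly RATE_LIMIT requests, and the two calls one second apart show the boundary burst inherent to fixed windows: both windows fill independently.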

Key Engineering Decisions

Test Coverage

Nine Vitest unit tests cover the middleware's behavior, including limit-boundary and proxy-header cases.

Tech Stack

Next.js 15 Middleware
TypeScript
Upstash Redis (REST)
Vercel Edge Runtime
vitest (unit tests)