Case Study

Production API Rate Limiting

Fixed-window rate limiting at the Next.js edge layer using Upstash Redis — per-IP enforcement with a deliberate fail-open design that keeps the site available when Redis is unreachable.

100

Max req/min per IP

0ms

Overhead when Redis is not configured (fail-open)

9

Unit tests including boundary + proxy cases

Edge

Runtime — no cold-start latency

The Problem

The /api/v1/* routes on damilola.tech include a Claude-powered job-scoring endpoint and a resume-tailoring endpoint — both hit the Anthropic API and incur non-trivial cost per call. Without a rate limit, a single client could exhaust the monthly token budget in minutes and take down the AI chat features for all users.

The constraint: protection must work at the edge (before any serverless function cold-starts), be distributed across Vercel's global edge nodes, and degrade gracefully — blocking all traffic because a Redis instance is unavailable would be worse than the abuse it prevents.

Architecture

Fixed-Window Counter with Time-Bucketed Keys

The implementation is a fixed-window counter. Time is divided into consecutive 60-second windows, and the key includes both the client IP and the current window ID, so each IP starts a fresh counter whenever a new window begins:

const windowId = Math.floor(Date.now() / 1000 / WINDOW_SEC);
const key = `ratelimit:api:${ip}:${windowId}`;

Every request sets the key's TTL to 2× the window duration (120 s), so Redis removes a window's counter on its own shortly after the window ends, with no explicit delete. INCR and EXPIRE are sent in one Upstash pipeline call, which costs a single round trip.
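To make the bucketing concrete, here is a minimal sketch of the key derivation as pure functions. The helper names (windowIdAt, rateLimitKey, secondsUntilReset) are hypothetical and do not appear in the original middleware. secondsUntilReset is an optional extension: the original sends a fixed Retry-After of WINDOW_SEC, which is an upper bound on the real wait.

```typescript
// Constant mirrors the value used in the case study.
const WINDOW_SEC = 60;

// ID of the fixed window that contains the given timestamp (ms since epoch).
function windowIdAt(nowMs: number): number {
  return Math.floor(nowMs / 1000 / WINDOW_SEC);
}

// Counter key for one IP in one window, in the same format as the middleware.
function rateLimitKey(ip: string, nowMs: number): string {
  return `ratelimit:api:${ip}:${windowIdAt(nowMs)}`;
}

// Whole seconds until the current window rolls over (always 1..WINDOW_SEC).
// Hypothetical extension: could be used for a tighter Retry-After value.
function secondsUntilReset(nowMs: number): number {
  const elapsed = Math.floor(nowMs / 1000) % WINDOW_SEC;
  return WINDOW_SEC - elapsed;
}

// Two timestamps 1 ms apart that fall on either side of a window boundary.
console.log(rateLimitKey('203.0.113.1', 119999)); // ratelimit:api:203.0.113.1:1
console.log(rateLimitKey('203.0.113.1', 120000)); // ratelimit:api:203.0.113.1:2
```

Because the window ID is part of the key, no reset logic is needed: the first request in a new window simply increments a key that does not exist yet.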

Fail-Open Design

There are three explicit fail-open paths — all return NextResponse.next() rather than blocking:

›Missing env vars: if UPSTASH_REDIS_REST_URL or UPSTASH_REDIS_REST_TOKEN is absent (local dev, misconfigured env), requests pass through without any Redis call.
›Non-OK HTTP response: if Upstash returns any non-2xx status (for example a 5xx), the check is skipped rather than failing the request.
›Network error: the fetch call is wrapped in try/catch, so a thrown network exception lets the request through instead of surfacing a 500.
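The three paths above can be sketched as one helper that either returns the pipeline body or signals "fail open". This is an illustration under stated assumptions, not the author's code: the names FetchLike, PipelineResult and pipelineOrFailOpen are hypothetical, and the timeout (with its 250 ms default) is an addition the original middleware does not have. Without a timeout, a slow or hanging Redis call delays the request until fetch itself gives up.

```typescript
// Minimal shape of the parts of fetch this sketch relies on (hypothetical type).
type FetchLike = (
  url: string,
  init: { method: string; headers: Record<string, string>; body: string; signal: AbortSignal },
) => Promise<{ ok: boolean; json(): Promise<unknown> }>;

type PipelineResult = { failOpen: true } | { failOpen: false; body: unknown };

async function pipelineOrFailOpen(
  doFetch: FetchLike,
  redisUrl: string | undefined,
  redisToken: string | undefined,
  commands: (string | number)[][],
  timeoutMs = 250, // assumed latency budget, not from the original
): Promise<PipelineResult> {
  // Path 1: missing env vars, no Redis call at all.
  if (!redisUrl || !redisToken) return { failOpen: true };

  const controller = new AbortController();
  const timer = setTimeout(() => controller.abort(), timeoutMs);
  try {
    const res = await doFetch(`${redisUrl}/pipeline`, {
      method: 'POST',
      headers: { Authorization: `Bearer ${redisToken}`, 'Content-Type': 'application/json' },
      body: JSON.stringify(commands),
      signal: controller.signal,
    });
    // Path 2: non-OK HTTP response.
    if (!res.ok) return { failOpen: true };
    return { failOpen: false, body: await res.json() };
  } catch {
    // Path 3: network error, or an abort triggered by the timeout.
    return { failOpen: true };
  } finally {
    clearTimeout(timer);
  }
}
```

Injecting doFetch keeps the helper testable without stubbing globals. In middleware, the global fetch should be usable as the first argument.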

IP Extraction

Next.js 15 removed NextRequest.ip, so the middleware reads the client IP from x-forwarded-for (Vercel's canonical header). It takes the first entry to handle multi-proxy chains ("203.0.113.1, 10.0.0.1, 172.16.0.1" → "203.0.113.1"). If that header is absent it falls back to x-real-ip, and if both are absent all such requests share a single 'unknown' bucket.
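The extraction order can be isolated in a small pure function, sketched below. The names HeaderReader and clientIp are hypothetical. One deliberate difference from the middleware shown later: this sketch uses || instead of ??, so an empty header value also falls through to the next source instead of producing an empty-string IP.

```typescript
// Anything with a Headers-style get(); the Fetch API Headers class satisfies it.
interface HeaderReader {
  get(name: string): string | null;
}

// First x-forwarded-for entry, then x-real-ip, then the shared 'unknown' bucket.
function clientIp(headers: HeaderReader): string {
  const forwarded = headers.get('x-forwarded-for')?.split(',')[0]?.trim();
  const realIp = headers.get('x-real-ip')?.trim();
  // || rather than ?? so that empty strings are skipped as well as null/undefined.
  return forwarded || realIp || 'unknown';
}
```

Taking the first entry is only trustworthy when the platform sets or overwrites x-forwarded-for itself, which is the assumption the case study makes about Vercel.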

Implementation Highlights

The full middleware from middleware.ts:

import { NextRequest, NextResponse } from 'next/server';

export const runtime = 'edge';
export const RATE_LIMIT = 100;   // max requests per window per IP
export const WINDOW_SEC = 60;    // fixed window size in seconds

export async function middleware(request: NextRequest): Promise<NextResponse> {
  const redisUrl = process.env.UPSTASH_REDIS_REST_URL;
  const redisToken = process.env.UPSTASH_REDIS_REST_TOKEN;

  // Fail-open: Redis not configured → no rate limiting
  if (!redisUrl || !redisToken) return NextResponse.next();

  const ip =
    request.headers.get('x-forwarded-for')?.split(',')[0].trim() ??
    request.headers.get('x-real-ip') ??
    'unknown';

  const windowId = Math.floor(Date.now() / 1000 / WINDOW_SEC);
  const key = `ratelimit:api:${ip}:${windowId}`;

  let count: number;
  try {
    const res = await fetch(`${redisUrl}/pipeline`, {
      method: 'POST',
      headers: { Authorization: `Bearer ${redisToken}`, 'Content-Type': 'application/json' },
      body: JSON.stringify([
        ['INCR', key],
        ['EXPIRE', key, WINDOW_SEC * 2],   // 2-window TTL ensures cleanup
      ]),
    });
    if (!res.ok) return NextResponse.next();   // fail open on Redis error
    const [{ result: incr }] = await res.json();
    count = incr;
  } catch {
    return NextResponse.next();   // fail open on network error
  }

  if (count > RATE_LIMIT) {
    return new NextResponse(
      JSON.stringify({ error: 'Too many requests. Please try again later.' }),
      { status: 429, headers: { 'Content-Type': 'application/json',
                                'Retry-After': String(WINDOW_SEC) } },
    );
  }
  return NextResponse.next();
}

export const config = { matcher: '/api/v1/:path*' };
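One detail the middleware handles only implicitly is a pipeline response that arrives with HTTP 200 but carries a per-command error. Upstash's REST pipeline returns an array with one entry per command, each holding either a result or an error field. In the code above an error entry leaves count undefined, and undefined > RATE_LIMIT is false, so the request passes. A small parser makes that fail-open behaviour explicit. The names incrResult and isOverLimit are hypothetical.

```typescript
// Returns the INCR result from a pipeline body, or null if the shape is unexpected.
function incrResult(body: unknown): number | null {
  if (!Array.isArray(body) || body.length === 0) return null;
  const first: unknown = body[0];
  if (typeof first !== 'object' || first === null) return null;
  const result = (first as { result?: unknown }).result;
  return typeof result === 'number' && Number.isFinite(result) ? result : null;
}

// True only when a valid count exceeds the limit; anything unparseable fails open.
function isOverLimit(body: unknown, limit: number): boolean {
  const count = incrResult(body);
  return count !== null && count > limit;
}

console.log(isOverLimit([{ result: 101 }, { result: 1 }], 100));   // true
console.log(isOverLimit([{ error: 'ERR example' }], 100));         // false
```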

Key Engineering Decisions

›Availability over strictness
The fail-open choice is intentional: a portfolio site that becomes inaccessible because of a Redis outage is a worse outcome than a brief window of unthrottled traffic. Strictness would be the right call for a financial API; for a career portfolio, uptime is the priority.
›Edge runtime, not serverless functions
Middleware runs before any serverless function cold-starts. Moving rate limiting to the edge means abuse is stopped before the expensive Anthropic API call is ever initiated — cheaper per blocked request and faster to respond (no function spin-up).
›Pipeline for single-round-trip INCR + EXPIRE
Upstash's /pipeline endpoint sends the counter increment and the TTL refresh in one HTTP request. With two separate REST calls, INCR could succeed and EXPIRE then fail, leaving a counter that never expires. A pipeline is not a transaction, so the two commands are not atomic, but both reach Redis in the same request, which removes the client-side gap between them.
›Fixed window is sufficient here
A true sliding-window algorithm needs a per-request log or a sorted set, which adds latency and Upstash command cost. A fixed window is adequate for abuse prevention on a portfolio site. Its known weakness is the boundary burst: a client can send RATE_LIMIT requests at the end of one window and RATE_LIMIT more at the start of the next, so up to 200 requests can pass within a few seconds. That is acceptable here.
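The boundary burst mentioned in the last decision is easy to reproduce with an in-memory stand-in for the Redis counter. The sketch below is an illustration only: FixedWindowCounter and allowedAt are hypothetical names, and a Map replaces Redis INCR.

```typescript
const RATE_LIMIT = 100;
const WINDOW_SEC = 60;

// In-memory stand-in for Redis INCR on time-bucketed keys.
class FixedWindowCounter {
  private counts = new Map<string, number>();

  // Records one request and returns true when it is allowed.
  hit(ip: string, nowMs: number): boolean {
    const windowId = Math.floor(nowMs / 1000 / WINDOW_SEC);
    const key = `${ip}:${windowId}`;
    const count = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, count);
    return count <= RATE_LIMIT;
  }
}

// Sends n requests at the same instant and counts how many are allowed.
function allowedAt(limiter: FixedWindowCounter, ip: string, nowMs: number, n: number): number {
  let allowed = 0;
  for (let i = 0; i < n; i++) {
    if (limiter.hit(ip, nowMs)) allowed++;
  }
  return allowed;
}

const limiter = new FixedWindowCounter();
const lateInWindow = allowedAt(limiter, '203.0.113.1', 59900, 150); // window 0
const earlyInNext = allowedAt(limiter, '203.0.113.1', 60100, 150);  // window 1
console.log(lateInWindow, earlyInNext); // 100 100
```

Both bursts are capped at 100, yet 200 requests get through in 200 ms because they land in different windows. A sliding-window design would close that gap at the cost described above.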

Test Coverage

9 vitest unit tests cover the full behavior surface:

›Requests under the limit pass through (200)
›Requests exceeding RATE_LIMIT return 429 with JSON body and Retry-After header
›Exactly RATE_LIMIT requests pass (boundary: count === limit → 200)
›RATE_LIMIT + 1 triggers the 429 (boundary: count just over → blocked)
›Missing env vars → fail open (no Redis call)
›Redis non-OK response → fail open
›Network error (fetch throws) → fail open
›Multi-proxy x-forwarded-for extracts first (client) IP
›x-real-ip fallback used when x-forwarded-for is absent

Tech Stack

›Next.js 15 Middleware

›TypeScript

›Upstash Redis (REST)

›Vercel Edge Runtime

›vitest (unit tests)
