Case Study
Production API Rate Limiting
Fixed-window rate limiting at the Next.js edge layer using Upstash Redis — per-IP enforcement with a deliberate fail-open design that keeps the site available when Redis is unreachable.
The Problem
The /api/v1/* routes on damilola.tech include a Claude-powered job-scoring endpoint and a resume-tailoring endpoint — both hit the Anthropic API and incur non-trivial cost per call. Without a rate limit, a single client could exhaust the monthly token budget in minutes and take down the AI chat features for all users.
The constraint: protection must work at the edge (before any serverless function cold-starts), be distributed across Vercel's global edge nodes, and degrade gracefully — blocking all traffic because a Redis instance is unavailable would be worse than the abuse it prevents.
Architecture
Fixed-Window Counter with Time-Bucketed Keys
The implementation is a fixed-window counter whose window is encoded in the key name, so no reset job is needed. Each window is 60 seconds wide; the key includes both the IP and the current window ID:
    const windowId = Math.floor(Date.now() / 1000 / WINDOW_SEC);
    const key = `ratelimit:api:${ip}:${windowId}`;
Keys expire after 2× the window duration (a 120 s TTL). Because the key changes whenever the window rolls over, an old counter is never read again; the TTL simply lets Redis reclaim it without an explicit delete. One Upstash pipeline call performs INCR + EXPIRE in a single round trip.
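As a rough illustration of the key scheme, the sketch below isolates the window arithmetic in two helpers. The names `windowIdAt` and `rateLimitKey` are hypothetical and introduced only for this example; the middleware itself computes the key inline. The timestamps are arbitrary values chosen to sit on a window boundary.

```typescript
const WINDOW_SEC = 60; // same window size the middleware uses

// Hypothetical helper: which fixed window does a millisecond timestamp fall in?
function windowIdAt(nowMs: number, windowSec: number = WINDOW_SEC): number {
  return Math.floor(nowMs / 1000 / windowSec);
}

// Hypothetical helper: build the per-IP, per-window counter key.
function rateLimitKey(ip: string, nowMs: number): string {
  return `ratelimit:api:${ip}:${windowIdAt(nowMs)}`;
}

const windowStartMs = 1700000040000; // divisible by 60 000, so it starts a window

// Requests at the first and last millisecond of a window share one counter.
const first = rateLimitKey('203.0.113.1', windowStartMs);
const last = rateLimitKey('203.0.113.1', windowStartMs + 59999);
console.log(first === last); // true

// One millisecond later the window ID increments and a fresh key is used.
const next = rateLimitKey('203.0.113.1', windowStartMs + 60000);
console.log(next); // ratelimit:api:203.0.113.1:28333335
```

Since a new window always means a new key, the counter never has to be reset; the old key is simply abandoned until its TTL removes it.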
Fail-Open Design
There are three explicit fail-open paths — all return NextResponse.next() rather than blocking:
- ›Missing env vars: if UPSTASH_REDIS_REST_URL or UPSTASH_REDIS_REST_TOKEN is absent (local dev, misconfigured env), requests pass through without any Redis call.
- ›Non-OK HTTP response: if Upstash returns any non-2xx status (a 5xx, for example), the check is skipped rather than failing the request.
- ›Network error: the fetch call is wrapped in try/catch; a thrown network exception passes through instead of surfacing a 500.
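The three paths above can be summarized as a single decision rule. The sketch below is a simplified model, not the middleware's actual structure: the `RedisOutcome` type and the `decide` function are hypothetical names used here to make the rule explicit and easy to test in isolation.

```typescript
// Hypothetical model of what can happen when the limiter talks to Redis.
type RedisOutcome =
  | { kind: 'not-configured' }              // env vars missing
  | { kind: 'http-error'; status: number }  // Upstash answered with a non-2xx status
  | { kind: 'network-error' }               // fetch threw
  | { kind: 'counted'; count: number };     // INCR succeeded and returned a count

type Decision = 'allow' | 'block';

// Every failure mode allows the request; only a real count can block it.
function decide(outcome: RedisOutcome, limit: number): Decision {
  switch (outcome.kind) {
    case 'not-configured':
    case 'http-error':
    case 'network-error':
      return 'allow'; // fail open
    case 'counted':
      return outcome.count > limit ? 'block' : 'allow';
  }
}

console.log(decide({ kind: 'http-error', status: 503 }, 100)); // allow
console.log(decide({ kind: 'counted', count: 101 }, 100));     // block
```

Modeling the outcomes as a closed union means any newly added failure mode has to be handled explicitly, which makes an accidental fail-closed path harder to introduce.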
IP Extraction
Next.js 15 removed NextRequest.ip. The middleware reads the real client IP from x-forwarded-for (Vercel's canonical header), taking the first entry to handle multi-proxy chains ("203.0.113.1, 10.0.0.1, 172.16.0.1" → "203.0.113.1"), with x-real-ip as a fallback and 'unknown' as a shared bucket when both are absent.
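A minimal sketch of that lookup order, written so it runs without Next.js. `HeaderReader`, `clientIp`, and `fakeHeaders` are hypothetical names for this example. One deliberate difference from the middleware: the sketch uses `||` rather than `??`, so an empty header value also falls through to the next source.

```typescript
// Minimal structural type; any object with a compatible get() method fits.
interface HeaderReader {
  get(name: string): string | null;
}

// First x-forwarded-for entry, then x-real-ip, then a shared 'unknown' bucket.
function clientIp(headers: HeaderReader): string {
  const forwarded = headers.get('x-forwarded-for')?.split(',')[0].trim();
  return forwarded || headers.get('x-real-ip') || 'unknown';
}

// Tiny stand-in for a request's headers, used only for the examples below.
function fakeHeaders(values: Record<string, string>): HeaderReader {
  return { get: (name) => values[name.toLowerCase()] ?? null };
}

const viaProxies = clientIp(
  fakeHeaders({ 'x-forwarded-for': '203.0.113.1, 10.0.0.1, 172.16.0.1' }),
);
const viaRealIp = clientIp(fakeHeaders({ 'x-real-ip': '198.51.100.7' }));
const noHeaders = clientIp(fakeHeaders({}));

console.log(viaProxies); // 203.0.113.1
console.log(viaRealIp);  // 198.51.100.7
console.log(noHeaders);  // unknown
```

The 'unknown' bucket means all clients without either header share one counter, which is stricter for them but avoids skipping the limit entirely.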
Implementation Highlights
The full middleware from middleware.ts:
    import { NextRequest, NextResponse } from 'next/server';

    export const runtime = 'edge';
    export const RATE_LIMIT = 100; // max requests per window per IP
    export const WINDOW_SEC = 60; // fixed window size in seconds

    export async function middleware(request: NextRequest): Promise<NextResponse> {
      const redisUrl = process.env.UPSTASH_REDIS_REST_URL;
      const redisToken = process.env.UPSTASH_REDIS_REST_TOKEN;
      // Fail open: Redis not configured → no rate limiting
      if (!redisUrl || !redisToken) return NextResponse.next();

      // The first x-forwarded-for entry is the client; later entries are proxies
      const ip =
        request.headers.get('x-forwarded-for')?.split(',')[0].trim() ??
        request.headers.get('x-real-ip') ??
        'unknown';

      const windowId = Math.floor(Date.now() / 1000 / WINDOW_SEC);
      const key = `ratelimit:api:${ip}:${windowId}`;

      let count: number;
      try {
        const res = await fetch(`${redisUrl}/pipeline`, {
          method: 'POST',
          headers: { Authorization: `Bearer ${redisToken}`, 'Content-Type': 'application/json' },
          body: JSON.stringify([
            ['INCR', key],
            ['EXPIRE', key, WINDOW_SEC * 2], // 2-window TTL ensures cleanup
          ]),
        });
        if (!res.ok) return NextResponse.next(); // fail open on Redis error
        // The first pipeline result is the INCR reply: the new counter value
        const [{ result: incr }] = await res.json();
        count = incr;
      } catch {
        return NextResponse.next(); // fail open on network error
      }

      if (count > RATE_LIMIT) {
        return new NextResponse(
          JSON.stringify({ error: 'Too many requests. Please try again later.' }),
          {
            status: 429,
            headers: { 'Content-Type': 'application/json', 'Retry-After': String(WINDOW_SEC) },
          },
        );
      }
      return NextResponse.next();
    }

    export const config = { matcher: '/api/v1/:path*' };
Key Engineering Decisions
- ›Availability over strictness
The fail-open choice is intentional: a portfolio site that becomes inaccessible because of a Redis outage is a worse outcome than a brief window of unthrottled traffic. Strictness would be the right call for a financial API; for a career portfolio, uptime is the priority.
- ›Edge runtime, not serverless functions
Middleware runs before any serverless function cold-starts. Moving rate limiting to the edge means abuse is stopped before the expensive Anthropic API call is ever initiated — cheaper per blocked request and faster to respond (no function spin-up).
- ›Pipeline for single-round-trip INCR + EXPIRE
Using Upstash's /pipeline endpoint keeps the counter increment and TTL refresh in a single round trip. A two-step INCR → EXPIRE sequence over REST would risk a leaked key (INCR succeeds, EXPIRE fails, counter never expires) in the presence of partial failures.
- ›Fixed window is sufficient here
A pure sliding-window algorithm requires per-request log storage or a sorted set, adding latency and Upstash command cost. The fixed-window approximation is adequate for abuse prevention on a portfolio site: the 2× burst at a window boundary (up to 200 requests from one IP across two adjacent windows) is acceptable and well understood.
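To make the boundary burst concrete, here is a small in-memory simulation. It is a sketch under simplifying assumptions: `FixedWindowCounter` and `allowedAt` are hypothetical names, a `Map` stands in for Redis INCR, and a single client is assumed so the IP is left out of the key.

```typescript
const RATE_LIMIT = 100;
const WINDOW_SEC = 60;

// In-memory stand-in for the Redis counter, keyed by window ID.
class FixedWindowCounter {
  private counts = new Map<number, number>();

  // Returns true when the request at `nowMs` is within the limit.
  hit(nowMs: number): boolean {
    const windowId = Math.floor(nowMs / 1000 / WINDOW_SEC);
    const count = (this.counts.get(windowId) ?? 0) + 1;
    this.counts.set(windowId, count);
    return count <= RATE_LIMIT;
  }
}

// Count how many of `n` requests sent at the same instant are allowed.
function allowedAt(counter: FixedWindowCounter, nowMs: number, n: number): number {
  let allowed = 0;
  for (let i = 0; i < n; i++) {
    if (counter.hit(nowMs)) allowed++;
  }
  return allowed;
}

const counter = new FixedWindowCounter();
const boundaryMs = 120000; // the instant one window ends and the next begins

// 150 requests one second before the boundary, then 150 right on it.
const beforeBoundary = allowedAt(counter, boundaryMs - 1000, 150);
const afterBoundary = allowedAt(counter, boundaryMs, 150);

console.log(beforeBoundary, afterBoundary); // 100 100
```

In this run 200 requests get through within about one second, which is the worst case the fixed-window design accepts in exchange for one counter per IP per window.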
Test Coverage
9 vitest unit tests cover the full behavior surface:
- ›Requests under the limit pass through (200)
- ›Requests exceeding RATE_LIMIT return 429 with JSON body and Retry-After header
- ›Exactly RATE_LIMIT requests pass (boundary: count === limit → 200)
- ›RATE_LIMIT + 1 triggers the 429 (boundary: count just over → blocked)
- ›Missing env vars → fail open (no Redis call)
- ›Redis non-OK response → fail open
- ›Network error (fetch throws) → fail open
- ›Multi-proxy x-forwarded-for extracts first (client) IP
- ›x-real-ip fallback used when x-forwarded-for is absent