
Rate Limiting Isn't Optional - Here's How to Actually Implement It in Node.js

No rate limiting means any client can hit your API as many times as it wants. This guide walks through the right way to implement it in Node.js - from express-rate-limit basics to Redis-backed sliding windows and layered per-route limits that work in production.


Sandeep Bansod

May 1, 2026 · 10 min read

If your API has no rate limiting, any client can send as many requests as it wants. A broken retry loop, a scraper, or a user who refreshes too fast: all of it hits your server with no limit.

This guide shows you how to add rate limiting to a Node.js API properly, from the basic setup to Redis-backed distributed limiting that works in production.

Why You Need Rate Limiting

Without rate limiting, your API is fully exposed to:

  • Retry loops that go infinite - a client bug keeps sending requests non-stop
  • Credential stuffing - bots trying thousands of username/password combinations
  • Web scrapers - pulling all your data in minutes
  • One user burning your third-party API quota - costing you money
  • Heavy users slowing things down for everyone else

Rate limiting puts a ceiling on how many requests a client can make in a given time window. Once they hit the limit, they get a 429 Too Many Requests response.

The Wrong Way: In-Memory Counters

The first thing most people try looks like this:

JavaScript
const requestCounts = {};

app.use((req, res, next) => {
  const ip = req.ip;
  requestCounts[ip] = (requestCounts[ip] || 0) + 1;

  if (requestCounts[ip] > 100) {
    return res.status(429).json({ error: 'Too many requests' });
  }

  next();
});

This works on one server. But the moment you have two instances running behind a load balancer, each instance has its own counter. A client that's blocked on instance A just keeps hitting instance B. Your limit is effectively multiplied by the number of servers.

Also, the counters in this snippet never reset on their own; the only thing that clears them is a server restart, which drops everyone back to zero at once.

Use in-memory for local development only. For production, you need a shared store - more on that below.


Starting With express-rate-limit

express-rate-limit is the standard package for rate limiting in Express apps.

Bash
npm install express-rate-limit

Basic setup:

JavaScript
import rateLimit from 'express-rate-limit';

const limiter = rateLimit({
  windowMs: 15 * 60 * 1000, // 15-minute window
  max: 100,                  // max requests per window, per IP
  standardHeaders: true,     // sends RateLimit-* headers to the client
  legacyHeaders: false,
  message: {
    error: 'Too many requests. Please try again later.',
  },
});

app.use('/api', limiter);
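
With standardHeaders: true, the limiter adds draft-standard RateLimit-* headers to every response, so a successful call looks roughly like this (exact values depend on your window and limit; this is an illustration, not output copied from the library):

TEXT
HTTP/1.1 200 OK
RateLimit-Limit: 100
RateLimit-Remaining: 97
RateLimit-Reset: 523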

This is a solid start. But the default store is still in-memory, and there are two common mistakes that silently break it in production.

Fix 1: Set trust proxy

If your app runs behind nginx, a cloud load balancer, or Cloudflare, then req.ip will show the proxy's internal IP address, not the actual client IP.

That means every request looks like it's coming from the same address. Your rate limiter treats all users as one person.
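
Behind a proxy, the real client IP usually arrives in the X-Forwarded-For header while the TCP connection itself comes from the proxy. Roughly (addresses here are made up for illustration):

TEXT
GET /api/search HTTP/1.1
Host: api.example.com
X-Forwarded-For: 203.0.113.50    <- the actual client
(connection made from 10.0.0.2, the proxy)

Setting trust proxy tells Express to derive req.ip from that header instead of the socket address.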

Fix it with one line in Express:

JavaScript
app.set('trust proxy', 1); // trust the first proxy in the chain

// Check that it's working:
app.get('/debug/ip', (req, res) => {
  res.json({ ip: req.ip });
});

If you see 127.0.0.1 on your production server, the setting isn't working yet.

Fix 2: Use user ID on authenticated routes

Limiting by IP address causes problems when many users share the same IP - like a team working from one office network.

For routes where users are logged in, use their user ID instead:

JavaScript
const userLimiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 60,
  keyGenerator: (req) => {
    return req.user?.id ?? req.ip; // use user ID if available, fall back to IP
  },
});

app.use('/api/dashboard', authenticate, userLimiter);

This way, one user's heavy usage doesn't block everyone else on their network.

The Three Rate Limiting Algorithms

Before adding Redis, it helps to understand the three main approaches. They all do the same thing but behave differently at the edges.

Fixed Window

Time is split into fixed chunks, say every 60 seconds. Each client gets 100 requests per chunk.

The problem: a client can use 100 requests at second 59, and another 100 at second 61. That's 200 requests in 2 seconds, double the limit, because the window reset right in between.

Sliding Window

Instead of resetting at fixed intervals, the window moves with each request. The check is always "how many requests in the last 60 seconds?"

This avoids the burst problem. There's no boundary to exploit. It's more accurate, but requires tracking timestamps for each request, not just a count.
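
A minimal in-memory sketch of the idea (illustration only; the Redis setup later handles this in a shared store):

JavaScript
// Sliding window log: keep a timestamp per request, count only the recent ones
const WINDOW_MS = 60 * 1000;
const LIMIT = 100;
const requestLog = new Map(); // key -> array of request timestamps

function isAllowed(key) {
  const now = Date.now();
  // Drop timestamps that have fallen out of the last 60 seconds
  const recent = (requestLog.get(key) ?? []).filter((t) => now - t < WINDOW_MS);

  if (recent.length >= LIMIT) {
    requestLog.set(key, recent);
    return false; // over the limit for the trailing window
  }

  recent.push(now);
  requestLog.set(key, recent);
  return true;
}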

Token Bucket

Each client has a bucket that holds tokens. Each request uses one token. Tokens refill at a steady rate (for example, 2 per second).

If a client hasn't made requests in a while, their tokens build up. This allows short bursts - a user who's been idle can fire off a few quick requests - while still keeping the long-term rate under control.
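
A minimal sketch of the same idea, with made-up numbers (capacity of 10, refilling 2 tokens per second):

JavaScript
// Token bucket: tokens refill over time, each request spends one
const CAPACITY = 10;        // maximum burst size
const REFILL_PER_SEC = 2;   // steady long-term rate
const buckets = new Map();  // key -> { tokens, lastRefill }

function tryConsume(key) {
  const now = Date.now();
  const bucket = buckets.get(key) ?? { tokens: CAPACITY, lastRefill: now };

  // Credit tokens earned since the last request, capped at capacity
  const elapsedSec = (now - bucket.lastRefill) / 1000;
  bucket.tokens = Math.min(CAPACITY, bucket.tokens + elapsedSec * REFILL_PER_SEC);
  bucket.lastRefill = now;

  if (bucket.tokens < 1) {
    buckets.set(key, bucket);
    return false; // bucket is empty, reject the request
  }

  bucket.tokens -= 1;
  buckets.set(key, bucket);
  return true;
}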

Most production APIs use token bucket or sliding window. Fixed window is simpler to implement but easier to game.

Switching to Redis (Production Setup)

For a multi-server setup, you need a central store that all instances can share. Redis is the standard choice.

Bash
npm install rate-limiter-flexible ioredis

rate-limiter-flexible gives you full control over the algorithm and works with Redis out of the box.

Here's a sliding window rate limiter backed by Redis:

JavaScript
import { RateLimiterRedis } from 'rate-limiter-flexible';
import Redis from 'ioredis';

const redisClient = new Redis({
  host: process.env.REDIS_HOST,
  port: Number(process.env.REDIS_PORT),
  enableOfflineQueue: false,
});

const rateLimiter = new RateLimiterRedis({
  storeClient: redisClient,
  keyPrefix: 'rl_api',
  points: 60,        // max requests
  duration: 60,      // per 60 seconds
  blockDuration: 60, // block the client for 60s after limit is hit
});

export async function rateLimitMiddleware(req, res, next) {
  const key = req.user?.id ?? req.ip;

  try {
    const result = await rateLimiter.consume(key);

    res.setHeader('X-RateLimit-Limit', 60);
    res.setHeader('X-RateLimit-Remaining', result.remainingPoints);
    res.setHeader('X-RateLimit-Reset', new Date(Date.now() + result.msBeforeNext).toISOString());

    next();
  } catch (rejRes) {
    if (rejRes instanceof Error) {
      // Redis is unreachable — let the request through rather than block everyone
      console.error('Rate limiter error:', rejRes.message);
      return next();
    }

    res.setHeader('Retry-After', Math.ceil(rejRes.msBeforeNext / 1000));
    res.status(429).json({
      error: 'Too many requests',
      retryAfter: Math.ceil(rejRes.msBeforeNext / 1000),
    });
  }
}

One decision you need to make: what happens when Redis is down? In the example above, the request is let through (fail open). That's fine for most APIs. For login or payment endpoints, you might prefer to block all traffic (fail closed) until Redis comes back.
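
If you'd rather fail closed on sensitive routes, the error branch flips. A minimal variant, reusing the same rateLimiter instance from above (the 503 response and the function name are my choices, not the only way to do it):

JavaScript
// Fail-closed variant for sensitive routes: if Redis is unreachable, reject
export async function strictRateLimitMiddleware(req, res, next) {
  const key = req.user?.id ?? req.ip;

  try {
    await rateLimiter.consume(key);
    next();
  } catch (rejRes) {
    if (rejRes instanceof Error) {
      // Store is down: refuse the request rather than allow unlimited attempts
      console.error('Rate limiter error:', rejRes.message);
      return res.status(503).json({ error: 'Service temporarily unavailable' });
    }

    res.setHeader('Retry-After', Math.ceil(rejRes.msBeforeNext / 1000));
    res.status(429).json({ error: 'Too many requests' });
  }
}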

Set Different Limits for Different Routes

Not every route deserves the same limit. A search endpoint that runs an expensive database query should be tighter than a simple status check.

Here's a practical three-layer setup:

JavaScript
// Global: catches runaway clients before they reach any route
const globalLimiter = new RateLimiterRedis({
  storeClient: redisClient,
  keyPrefix: 'rl_global',
  points: 300,
  duration: 60,
});

// Per route: tighter limits on heavy endpoints
const searchLimiter = new RateLimiterRedis({
  storeClient: redisClient,
  keyPrefix: 'rl_search',
  points: 10,
  duration: 60,
});

// Auth: very tight — prevents brute force login attacks
const authLimiter = new RateLimiterRedis({
  storeClient: redisClient,
  keyPrefix: 'rl_auth',
  points: 5,
  duration: 300,       // 5 attempts per 5 minutes
  blockDuration: 900,  // blocked for 15 minutes after that
});

// Apply them: register the global limiter first so it runs before any route handler
app.use('/api', makeMiddleware(globalLimiter));
app.post('/api/auth/login', makeMiddleware(authLimiter), loginHandler);
app.get('/api/search', makeMiddleware(searchLimiter), searchHandler);
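
The makeMiddleware helper isn't part of rate-limiter-flexible; it's just a small factory that wraps a limiter the same way the middleware earlier does. A minimal version could look like this:

JavaScript
// Hypothetical helper: turn any RateLimiterRedis instance into Express middleware
function makeMiddleware(limiter) {
  return async (req, res, next) => {
    const key = req.user?.id ?? req.ip;

    try {
      await limiter.consume(key);
      next();
    } catch (rejRes) {
      if (rejRes instanceof Error) {
        // Store unreachable: fail open, same trade-off as discussed earlier
        return next();
      }

      res.setHeader('Retry-After', Math.ceil(rejRes.msBeforeNext / 1000));
      res.status(429).json({ error: 'Too many requests' });
    }
  };
}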

The auth limiter matters the most here. Five login attempts per five minutes stops credential stuffing without locking out someone who mistyped their password once.

What to Send in the 429 Response

A 429 with no explanation leaves developers guessing. Give them what they need to handle it:

JavaScript
// msBeforeNext comes from the limiter's rejection result, as in the middleware earlier
res.status(429).json({
  error: 'rate_limit_exceeded',
  message: 'You have sent too many requests. Please wait before trying again.',
  limit: 60,
  remaining: 0,
  resetAt: new Date(Date.now() + msBeforeNext).toISOString(),
  retryAfter: Math.ceil(msBeforeNext / 1000), // seconds to wait
});

Also set the response headers:

TEXT
HTTP/1.1 429 Too Many Requests
Retry-After: 47
X-RateLimit-Limit: 60
X-RateLimit-Remaining: 0
X-RateLimit-Reset: 2026-05-01T04:23:00.000Z

Any client that reads Retry-After will wait the correct amount of time before retrying. That one header stops most retry hammering on its own.
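
On the client side, honoring it can be a one-line wait. A rough sketch of a fetch wrapper (the names here are my own, not from any library):

JavaScript
// Client-side sketch: on a 429, wait the server-specified time and retry once
async function fetchWithRetry(url, options = {}) {
  const res = await fetch(url, options);

  if (res.status === 429) {
    const retryAfterSec = Number(res.headers.get('Retry-After') ?? 1);
    await new Promise((resolve) => setTimeout(resolve, retryAfterSec * 1000));
    return fetch(url, options); // single retry after waiting
  }

  return res;
}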

Testing That It Actually Works

Don't ship rate limiting without testing it. Here's a quick test with Supertest:

JavaScript
// test/rate-limit.test.js
import request from 'supertest';
import app from '../src/app.js';

describe('Rate limiting', () => {
  it('allows requests within the limit', async () => {
    for (let i = 0; i < 10; i++) {
      const res = await request(app).get('/api/search?q=test');
      expect(res.status).not.toBe(429);
    }
  });

  it('blocks requests that go over the limit', async () => {
    const requests = Array.from({ length: 15 }, () =>
      request(app).get('/api/search?q=test')
    );
    const responses = await Promise.all(requests);
    const blocked = responses.filter((r) => r.status === 429);
    expect(blocked.length).toBeGreaterThan(0);
  });

  it('returns a Retry-After header when blocked', async () => {
    const requests = Array.from({ length: 15 }, () =>
      request(app).get('/api/search?q=test')
    );
    const responses = await Promise.all(requests);
    const blocked = responses.find((r) => r.status === 429);
    expect(blocked?.headers['retry-after']).toBeDefined();
  });
});

For load testing, use autocannon:

Bash
npx autocannon -c 50 -d 10 http://localhost:3000/api/search

Run it and check how many 429 responses come back. If you see zero, your limit is set too high.

The Short Version

  • In-memory rate limiting breaks the moment you have more than one server
  • Set trust proxy correctly - otherwise you're limiting the wrong IP
  • Use user ID as the rate limit key for authenticated routes
  • For production, use Redis as the shared store
  • Apply different limits to different routes - auth tighter, general looser
  • Always send Retry-After in your 429 response
  • Test it under load before you deploy


Sandeep Bansod

I'm a Front-End Developer located in India focused on making websites look great, work fast and perform well with a seamless user experience. Over the years I've worked across different areas of digital design, web development, email design, app UI/UX and development.
