Introduction
This downloadable resource is a complete, production-ready example of implementing rate limiting and throttling in ColdFusion (CFML). It includes a reusable RateLimiter component, a lightweight in-memory storage engine with TTL, ready-to-drop Application.cfc hooks, and working API endpoint samples. Whether you’re protecting login routes from brute-force attacks, enforcing per-IP quotas on public APIs, or smoothing out burst traffic, this example provides a clean, extensible foundation you can deploy in minutes.
At its core, the solution implements a token bucket algorithm with optional hard blocking for abusive clients, returns standard 429 Too Many Requests responses, and emits X-RateLimit headers. It works on Adobe ColdFusion and Lucee without external dependencies, while also documenting how to swap in Redis or a database for clustered environments.
What You’ll Get
- A ZIP package named: cf-rate-limit-example.zip
- Source code:
  - /lib/RateLimiter.cfc — core rate limiting engine (token bucket + optional blocking)
  - /lib/storage/MemoryStore.cfc — thread-safe in-memory TTL store using cflock
  - /Application.cfc — prewired middleware-style integration via onRequestStart
  - /api/hello.cfm — sample endpoint returning headers and JSON
  - /api/login.cfm — sample route with stricter limits
- Configuration examples:
  - /config/rate-limits.json — sample policy map for per-route rules
  - /env/.env.example — optional environment variables template
- Documentation and tools:
  - README.pdf (6–8 pages) with architecture diagrams and tuning guidance
  - Postman collection: tests/postman/cf-rate-limit.postman_collection.json
  - Optional Docker Compose for Redis (if you want distributed tokens)
  - Unit-style smoke tests: /tests/simple/*.cfm for sanity checks
Note: If you don’t see a Download link in your interface, you can copy the code blocks in this article directly into your project. The structure above is still recommended.
Overview
The example demonstrates how to implement:
- Request throttling and API rate limiting using a token bucket.
- Per-route policies with different limits, windows, and burst settings.
- Client identification using user IDs or IP addresses.
- Standards-based responses: 429, Retry-After, and X-RateLimit headers.
- Pluggable storage: start with MemoryStore; later switch to Redis or DB.
The design emphasizes simplicity, low overhead, and readability, while offering hooks for advanced deployments (multi-node clusters, distributed caches, or custom quota strategies).
Supported Environments
| Runtime / Storage | Supported Versions | Notes |
|---|---|---|
| Adobe ColdFusion | 2018 / 2021 / 2023 | Works out-of-the-box |
| Lucee | 5.x / 6.x | Works out-of-the-box |
| Storage | In-memory (default) | Redis/DB optional for clustering |
No third-party libraries are required to get started.
Benefits
- Improves API resilience by smoothing bursts and preventing abuse.
- Protects sensitive endpoints (e.g., /api/login) from credential stuffing.
- Reduces infrastructure costs by avoiding unnecessary downstream load.
- Provides consistent limits across multiple routes and clients.
- Easy to maintain: clear CFScript, modular components, and documented policies.
File Layout and Contents
- /lib/RateLimiter.cfc
  - Implements token bucket logic, per-route rules, and optional hard blocking.
- /lib/storage/MemoryStore.cfc
  - Thread-safe storage with TTL support via cflock and in-struct expiration.
- /Application.cfc
  - Wires the limiter into onRequestStart; sets HTTP headers or returns 429.
- /api/*
  - Demonstrates how endpoints behave under different limits.
- /config/rate-limits.json
  - Example configuration to keep policies out of code.
- /tests/simple/*
  - Quick checks to validate that the limiter works.
Installation
- Download and unzip cf-rate-limit-example.zip.
- Copy the folders /lib, /api, and /config into your ColdFusion webroot or module.
- Merge or replace your Application.cfc with the one in the package:
  - If you already have an Application.cfc, copy the relevant parts (onApplicationStart and onRequestStart) and adjust paths.
- Restart your CFML application (touch Application.cfc, or restart the server) so onApplicationStart executes.
- Open /api/hello.cfm in a browser and refresh repeatedly to see X-RateLimit headers and limiting behavior.
Configuration
Rate Limit Policies
You can define policies per route. Example (CFML struct or JSON):
- Global fallback rule for all routes:
  - windowSec: 60
  - maxRequests: 120
  - burst: 60
- Strict rule for /api/login:
  - windowSec: 300
  - maxRequests: 15
  - burst: 10
  - blockSec: 600 (optional hard block after exceeding capacity)
In Application.cfc (simplified):

    application.rateRules = {
        "*" = { windowSec=60, maxRequests=120, burst=60 },
        "/api/login" = { windowSec=300, maxRequests=15, burst=10, blockSec=600 },
        "/api/search" = { windowSec=60, maxRequests=30, burst=20 }
    };
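If you prefer to keep policies out of code, the same rules can live in /config/rate-limits.json. One possible shape (the field names mirror the struct above; loading it with deserializeJSON(fileRead(...)) in onApplicationStart is an assumption, not part of the shipped code):

```json
{
  "*":           { "windowSec": 60,  "maxRequests": 120, "burst": 60 },
  "/api/login":  { "windowSec": 300, "maxRequests": 15,  "burst": 10, "blockSec": 600 },
  "/api/search": { "windowSec": 60,  "maxRequests": 30,  "burst": 20 }
}
```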
Client Identity
- Default: session.userId if available, else cgi.remote_addr (IP-based).
- You can replace this logic with API keys, OAuth client IDs, or JWT claims.
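As a sketch of swapping in API-key identity (the X-Api-Key header name is an assumption; use whatever your clients actually send):

```cfml
// Hypothetical: identify clients by an API key header instead of session/IP.
var headers = getHTTPRequestData().headers;
var apiKey  = headers["X-Api-Key"] ?: "";
// Hash the key so raw credentials never appear in store keys or logs.
var clientId = len(apiKey) ? "key:" & hash(apiKey) : "ip:" & cgi.remote_addr;
```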
Headers
- Sends:
  - X-RateLimit-Limit
  - X-RateLimit-Remaining
  - X-RateLimit-Policy
  - Retry-After (when throttled)
- Returns a 429 status on violation with a JSON error body.
Token Bucket vs Fixed Window
- Token Bucket (default): Allows short bursts while keeping average rate capped; smoother user experience.
- Fixed Window: Simpler, but can allow spikes at window boundaries.
- Sliding Window: More precise; higher implementation complexity.
The provided example uses a token bucket as a balance of simplicity and performance.
Choosing a Storage Provider
- MemoryStore (default): Easiest to start; single-node only.
- Redis: Best for clusters; centralized counters with TTL; high performance.
- Database: Works if Redis isn’t available; use indexed tables and short TTLs.
How to Use
Step-by-Step
- Set policies in application.rateRules or config/rate-limits.json.
- Choose the identity key: IP-based for anonymous traffic or user-based for authenticated routes.
- Initialize the RateLimiter in onApplicationStart:
  - application.storage = new lib.storage.MemoryStore();
  - application.rateLimiter = new lib.RateLimiter(rules=application.rateRules, store=application.storage);
- Enforce in onRequestStart:
  - Determine the route and key (user or IP).
  - Call application.rateLimiter.check(key=…, route=…).
  - If not allowed, respond with 429 and Retry-After; otherwise, continue.
- Verify:
  - Hit /api/hello.cfm rapidly; watch headers and JSON responses.
  - Adjust limits until they match your traffic profile.
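One quick way to exercise the limiter from a terminal (the host and port are assumptions; adjust to your local server):

```shell
# Fire 10 requests and show only the status line and rate-limit headers.
for i in $(seq 1 10); do
  curl -s -o /dev/null -D - "http://localhost:8500/api/hello.cfm" \
    | grep -Ei '^(HTTP|X-RateLimit|Retry-After)'
done
```

Once X-RateLimit-Remaining reaches 0, subsequent responses should switch to 429 with a Retry-After header.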
Emitting Useful Headers
- Always include X-RateLimit-Limit and X-RateLimit-Remaining.
- Include Retry-After (in seconds) for throttled responses.
- Optionally add X-RateLimit-Policy for transparency.
Example Code (Copy-Paste Ready)
/lib/RateLimiter.cfc
component output=false {

    variables.rules = {};
    variables.store = 0;

    public any function init(struct rules={}, any store) {
        variables.rules = arguments.rules ?: {};
        variables.store = arguments.store;
        return this;
    }

    // Human-readable policy summary for the X-RateLimit-Policy header.
    public string function policyFor(required string route) {
        var r = variables.rules[route] ?: variables.rules["*"];
        if (!isStruct(r)) return "";
        return "max=" & r.maxRequests & "; window=" & r.windowSec & "s; burst=" & (r.burst ?: r.maxRequests);
    }

    public struct function check(required string key, required string route) {
        var r = variables.rules[route] ?: variables.rules["*"];
        if (!isStruct(r)) return { allowed=true, remaining=999999, limit=999999, retryAfter=0 };

        var capacity   = val(r.burst ?: r.maxRequests);
        var windowSec  = val(r.windowSec);
        var ratePerSec = (capacity > 0 && windowSec > 0) ? (r.maxRequests / windowSec) : 0;
        var bucketKey  = "rl|" & route & "|" & key;
        var nowMs      = getTickCount();
        var state      = variables.store.get(bucketKey) ?: { tokens=capacity, ts=nowMs, blockedUntil=0 };

        // Honor an active hard block before touching the bucket.
        if (structKeyExists(r, "blockSec") && state.blockedUntil > nowMs) {
            var retry = ceiling((state.blockedUntil - nowMs) / 1000);
            return { allowed=false, remaining=0, limit=capacity, retryAfter=retry };
        }

        // Refill tokens based on elapsed time, capped at bucket capacity.
        var elapsedSec = (nowMs - state.ts) / 1000.0;
        var newTokens  = min(capacity, state.tokens + elapsedSec * ratePerSec);
        var allowed    = newTokens >= 1;
        if (allowed) {
            newTokens -= 1;
            state.tokens = newTokens;
            state.ts = nowMs;
        } else if (structKeyExists(r, "blockSec") && r.blockSec > 0) {
            state.blockedUntil = nowMs + (r.blockSec * 1000);
        }

        variables.store.set(bucketKey, state, max(windowSec, 60));
        var remaining = int(max(0, newTokens));
        // Guard against division by zero when no refill rate is configured.
        var retryAfter = allowed ? 0 : (ratePerSec > 0 ? ceiling((1 - newTokens) / ratePerSec) : windowSec);
        return { allowed=allowed, remaining=remaining, limit=capacity, retryAfter=retryAfter };
    }
}
/lib/storage/MemoryStore.cfc
component output=false {

    variables.map = {};

    public any function init() {
        return this;
    }

    public any function get(required string key) {
        // Exclusive lock: an expired entry is deleted during the read.
        lock name="rl-lock-#hash(arguments.key)#" type="exclusive" timeout="2" {
            if (structKeyExists(variables.map, arguments.key)) {
                var itm = variables.map[arguments.key];
                if (itm.expireAt GT now()) {
                    return itm.value;
                }
                structDelete(variables.map, arguments.key);
            }
        }
        return; // missing or expired: caller falls back via the ?: operator
    }

    public void function set(required string key, required any value, numeric ttl=300) {
        var expireAt = dateAdd("s", arguments.ttl, now());
        lock name="rl-lock-#hash(arguments.key)#" type="exclusive" timeout="2" {
            variables.map[arguments.key] = { value=arguments.value, expireAt=expireAt };
        }
    }
}
/Application.cfc (relevant parts)
component output=false {

    this.name = "cfRateLimitDemo";
    this.sessionManagement = true;

    public boolean function onApplicationStart() {
        application.rateRules = {
            "*" = { windowSec=60, maxRequests=120, burst=60 },
            "/api/login" = { windowSec=300, maxRequests=15, burst=10, blockSec=600 },
            "/api/search" = { windowSec=60, maxRequests=30, burst=20 }
        };
        application.storage = new lib.storage.MemoryStore();
        application.rateLimiter = new lib.RateLimiter(rules=application.rateRules, store=application.storage);
        return true;
    }

    public void function onRequestStart(string targetPage) {
        // Strip any query string and normalize the route.
        var route = lcase(reReplace(arguments.targetPage, "\?.*$", ""));
        var clientId = isDefined("session.userId") ? "u:" & session.userId : "ip:" & cgi.remote_addr;
        var result = application.rateLimiter.check(key=clientId, route=route);
        if (!result.allowed) {
            cfheader(name="Retry-After", value=result.retryAfter);
            cfheader(statusCode=429, statusText="Too Many Requests");
            cfcontent(type="application/json", reset=true, variable=toBinary(toBase64(serializeJSON({
                error="too_many_requests",
                message="Rate limit exceeded. Try again in " & result.retryAfter & "s.",
                route=route,
                limit=result.limit,
                remaining=result.remaining
            }))));
            abort;
        } else {
            cfheader(name="X-RateLimit-Limit", value=result.limit);
            cfheader(name="X-RateLimit-Remaining", value=result.remaining);
            cfheader(name="X-RateLimit-Policy", value=application.rateLimiter.policyFor(route));
        }
    }
}
Best Practices
Policy Design
- Apply stricter limits on sensitive endpoints (login, password reset).
- Use different thresholds for authenticated vs anonymous users.
- Group routes by cost; heavier endpoints should have tighter quotas.
Client Identification
- Prefer stable identifiers (user ID, API key) over IP when possible.
- For public endpoints, combine IP + User-Agent to reduce NAT collisions.
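A minimal sketch of that combined key (hashing the User-Agent keeps the key short; header contents are client-controlled, so treat this as a mitigation, not an identity):

```cfml
// Combine IP with a short User-Agent hash to soften NAT collisions.
var ua = cgi.http_user_agent ?: "";
var clientId = "ip:" & cgi.remote_addr & "|ua:" & left(hash(ua), 8);
```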
Headers and UX
- Always include Retry-After so clients can back off appropriately.
- Expose X-RateLimit-Remaining to help well-behaved clients self-throttle.
Concurrency and Locks
- Use cflock for in-memory stores to maintain atomicity.
- In clusters, use a centralized store (e.g., Redis) to maintain global fairness.
Monitoring
- Log throttling events with key, route, and retryAfter.
- Watch for sudden spikes; adjust policies or add bot detection where necessary.
Security Considerations
- Allow-list trusted internal services and health checks.
- Never leak sensitive identifiers in headers or error messages.
- Consider CAPTCHA or additional verification after repeated throttling on login.
Benefits and Use Cases
- API Gateways and Microservices: enforce per-client quotas without an external gateway.
- Public APIs: prevent abuse while allowing limited free access with burst tolerance.
- Authentication flows: slow down brute-force attempts using user/IP-based limits.
- Search endpoints: avoid expensive, repeated queries from hammering your database.
- Background jobs exposed as HTTP endpoints: protect downstream services and databases.
These patterns save engineering time, improve performance under load, and deliver predictable behavior for consumers, all while being easy to configure and extend.
Switching to Redis or a Database (Optional)
- Redis:
  - Replace MemoryStore with a Redis-backed store.
  - Store JSON blobs for state and use native TTL (EXPIRE/SETEX).
  - For strict atomicity, implement the token update as a Lua script or use Redis transactions.
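For illustration, such an atomic update could look roughly like the following Lua script (an untested sketch; the key layout and argument order are assumptions, not part of the shipped package):

```lua
-- KEYS[1] = bucket key; ARGV = capacity, refill rate/sec, now (ms), ttl (sec)
local state  = redis.call('HMGET', KEYS[1], 'tokens', 'ts')
local tokens = tonumber(state[1]) or tonumber(ARGV[1])
local ts     = tonumber(state[2]) or tonumber(ARGV[3])
-- Refill based on elapsed time, capped at capacity (mirrors RateLimiter.cfc).
local elapsed = (tonumber(ARGV[3]) - ts) / 1000
tokens = math.min(tonumber(ARGV[1]), tokens + elapsed * tonumber(ARGV[2]))
local allowed = 0
if tokens >= 1 then tokens = tokens - 1; allowed = 1 end
redis.call('HMSET', KEYS[1], 'tokens', tokens, 'ts', ARGV[3])
redis.call('EXPIRE', KEYS[1], ARGV[4])
return { allowed, tostring(tokens) }
```

Because Redis runs each script atomically, two nodes can never interleave the read-refill-write sequence for the same bucket.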
- Database:
  - Create a table (key, tokens, ts, blockedUntil, expiresAt).
  - Use an UPSERT pattern and an indexed key column.
  - Run a scheduled cleanup job to remove expired rows.
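One possible table shape (names are illustrative and mirror the in-memory state struct; the exact types and upsert syntax vary by database vendor):

```sql
-- Illustrative schema for a database-backed bucket store.
CREATE TABLE rate_limit_state (
    bucket_key    VARCHAR(255)  NOT NULL PRIMARY KEY,
    tokens        DECIMAL(10,4) NOT NULL,
    ts            BIGINT        NOT NULL,            -- last refill, epoch millis
    blocked_until BIGINT        NOT NULL DEFAULT 0,
    expires_at    TIMESTAMP     NOT NULL             -- scanned by the cleanup job
);
CREATE INDEX ix_rate_limit_expires ON rate_limit_state (expires_at);
```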
This allows true distributed rate limiting across multiple ColdFusion nodes.
Key Takeaways
- The provided RateLimiter.cfc offers a robust token bucket implementation with optional hard blocking.
- A thread-safe MemoryStore gives you zero-dependency throttling on a single node.
- Middleware-style integration via onRequestStart keeps your endpoints clean.
- Standards-based responses (429 + Retry-After + X-RateLimit-*) help client-side backoff.
- Easily extendable: swap storage, adjust policies, and tailor identity keys.
FAQ
How does this differ from a fixed-window rate limiter?
The token bucket algorithm allows controlled bursts while maintaining an average request rate over time. Fixed-window counters can allow spikes at window boundaries: with a 100-requests-per-minute window, a client could send 100 requests at 11:59:59 and 100 more at 12:00:00, totaling 200 requests in two seconds. Token buckets produce smoother traffic and a better user experience.
Does this work on Lucee as well as Adobe ColdFusion?
Yes. The example uses portable CFScript and core language features. It runs on Adobe ColdFusion 2018/2021/2023 and Lucee 5.x/6.x without changes.
How can I use this in a clustered environment?
Replace the in-memory store with a shared backend like Redis. Persist the bucket state in Redis with TTL, and update tokens atomically (Lua script or transactions). All nodes then enforce the same global limits.
Can I exempt certain routes or internal IPs?
Yes. Add allow-list checks in onRequestStart before invoking the limiter, or set a very high policy for trusted routes. You can also skip enforcement for server-to-server requests with known headers or API keys.
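A sketch of such a check at the top of onRequestStart (the IP values are placeholders for your own trusted hosts):

```cfml
// Skip rate limiting for trusted internal callers (illustrative values).
var trustedIps = [ "127.0.0.1", "10.0.0.5" ];
if (arrayFind(trustedIps, cgi.remote_addr)) {
    return; // bypass the limiter entirely for this request
}
```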
What’s the difference between rate limiting and throttling?
They’re closely related. Rate limiting enforces a maximum number of requests in a given period (quotas), while throttling often refers to actively slowing or delaying requests to reduce system load. This example focuses on rejecting excess requests with 429 and guiding clients via headers, but you could extend it to delay requests instead.
