# Rate limiter

Source: `src/security/rate-limiter.plugin.ts`

Three layers of rate limiting, all enforced inside Fastify.
| Layer | Bucket | Limit | Window | Backed by |
|---|---|---|---|---|
| Global | per-IP | 100 req | 60s | @fastify/rate-limit |
| Tenant | per-tenant | 300 req | 60s | rate-limiter-flexible (memory) |
| Auth (Phase 2) | per-IP, scoped to /auth/* | 5 req | 15 min | @fastify/rate-limit |
## Why three layers?

- Layer 1 protects against unauthenticated abuse (anyone flooding `/mcp` without a key).
- Layer 2 protects against an authenticated tenant misusing their own key (and keeps one tenant from DoS-ing another via shared upstream capacity).
- Layer 3 protects against credential stuffing and authorization-code interception (the limit is very low because no legitimate flow needs more than ~1 attempt per minute).
Each layer writes a `rate_limit.exceeded` audit row when it rejects. The audit write is fire-and-forget, so it never adds latency to the 429 response.
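The fire-and-forget shape can be sketched as below. `writeAuditRow` and `rejectWithAudit` are hypothetical stand-ins for the real audit writer and rejection path, not names from the repo:

```typescript
// Hypothetical stand-in for the real audit writer (a DB insert in practice).
async function writeAuditRow(event: string, meta: Record<string, string>): Promise<void> {
  await new Promise((resolve) => setTimeout(resolve, 50)); // simulate a slow insert
}

// Called when a bucket rejects: send the 429 immediately. The audit insert runs
// in the background, and any failure is swallowed so auditing can never turn a
// rate-limit rejection into a 500 or delay it.
export function rejectWithAudit(send429: () => void, scope: string): void {
  void writeAuditRow('rate_limit.exceeded', { scope }).catch(() => {
    // Intentionally ignored: auditing must never block or break the response.
  });
  send429(); // responds before the audit write settles
}
```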
## State location

Both Layer 1 and Layer 2 hold counters in process memory:

- `@fastify/rate-limit` uses an in-process LRU.
- `rate-limiter-flexible`'s `RateLimiterMemory` uses a `Map`.
This is acceptable because production deploys a single Node process per host (see docs/setup-ubuntu.md — systemd unit, not PM2 cluster). If we move to multi-process / multi-host later, both stores must move to Redis or a shared DB table.
## Order of registration

The plugin file exports two plugins so they can be registered around the auth plugin in `src/index.ts`:
helmet
→ globalRateLimiterPlugin (Layer 1: per-IP, fires for every request)
→ tenantAuthPlugin (sets req.tenantId)
→ tenantRateLimiterPlugin (Layer 2: per-tenant, uses req.tenantId)

Why split? Fastify runs preHandler hooks in registration order. If both layers were registered inside one plugin before the auth plugin, the per-tenant hook would fire while req.tenantId is still undefined, skip every time, and the per-tenant cap would be a no-op. Putting tenantRateLimiterPlugin after tenantAuthPlugin guarantees the tenant ID is populated by the time we consume the bucket.
## Plugin scope: must be fastify-plugin-wrapped

Both globalRateLimiterPlugin and tenantRateLimiterPlugin are exported through `fp(...)`. Without that wrapper, Fastify encapsulates each `app.register(...)`-mounted plugin: its hooks and sub-registrations stay confined to the plugin's own scope and never reach the /mcp route, which is mounted on the parent. In that state the tenantRateLimiterPlugin preHandler hook would never fire, and bucket consumption would be silently skipped. (The 2026-05-07 Ubuntu smoke test hit exactly this for tenantAuthPlugin; both plugins were fixed at the same time. See docs/phase-1-foundation.md §14 #6.)
## Test-only helper

`_resetTenantRateLimiter()` rebuilds the `RateLimiterMemory` instance without recreating the export binding, so test files can start from a clean bucket between cases.
## Tests

Phase 1 §E5 specifies a Fastify `app.inject()`-based test that exercises:

- 100 global requests → the 101st returns 429 plus an audit row (scope: `global`)
- 300 tenant requests → the 301st returns 429 plus an audit row (scope: `tenant`)
- 11 auth failures → the IP block engages even on a valid key (audit row: `auth.blocked_ip`)
These tests require the full app to be bootable in a unit-test context, which depends on the secrets dir and a Postgres instance, so they live in the same place CI provisions both: see .github/workflows/ci.yml.