Phase 1 — Secure Foundation
Detailed execution doc for Phase 1 of the Deneva MCP Tool Architecture Plan. The architecture doc is the source of truth for what is being built and why; this doc is the source of truth for how and in what order.
Estimated effort: 1–2 weeks for one engineer.
Phase 2 follow-up: docs/phase-2-google-ads.md.
Goal
Stand up a minimal but complete Fastify + Drizzle + Postgres stack with the entire security middleware chain wired end-to-end, validated by two stub MCP tools. Every later phase plugs into this foundation without changing it: Phase 2 adds adapters and real tokens, Phase 3 adds more adapters, Phase 4 adds background sync, Phase 5 adds production hardening.
If Phase 1 is done correctly, an attacker hitting an unauthenticated request, sending a malformed payload, or exceeding the rate limit gets the same correct response on day one as they will on day 365.
Definition of Done (high-level — full checklist in §11)
- A request to `/mcp` with a valid `X-Api-Key` reaches the `ping` tool, gets a 200, and produces an `api_key.auth_success` + `mcp.tool_called` audit row.
- The same request without the header gets a 401 and an `api_key.auth_failure` audit row, in constant time vs an invalid-but-well-formed key.
- Two separate tenants cannot read each other's `metric_cache` rows even if a query forgets a `WHERE tenant_id = ?` clause (RLS catches it).
- `DELETE FROM audit_log` as the `mcp_app` role fails with permission denied.
- CI fails the build on a high-severity `npm audit` advisory.
Workstream order & dependency graph
A. Project bootstrap ──┬──▶ C. Secrets loader ──────┬──▶ D. API key auth ──┬──▶ G. MCP server + stubs
                       │                            │                      │
                       └──▶ B. Database & schema ──▶ F. Audit log ─────────┤
                                                │                          │
                                                └──▶ E. Rate limiting ─────┤
                                                                           │
H. OAuth scaffolding ──────────────────────────────────────────────────────┤
                                                                           │
I. CI & dep hygiene ───────────────────────────────────────────────────────┘   (runs alongside everything)

A → B → (C, F, E) → D → G is the critical path. H and I are parallelisable.
Workstream A — Project bootstrap
A0. Dependencies (one-time npm install)
Pin exact versions in package-lock.json (Phase 1 §A1’s package.json declares ranges; lockfile is what actually ships). The MCP SDK shape may evolve — the version below is the contract Phase 1 is written against; bump deliberately, not opportunistically.
# Runtime
npm install \
fastify@^5.0.0 \
@fastify/helmet@^13.0.0 \
@fastify/rate-limit@^10.0.0 \
rate-limiter-flexible@^7.0.0 \
drizzle-orm@^0.36.0 \
pg@^8.13.0 \
zod@^3.23.0 \
@modelcontextprotocol/sdk@^1.0.0
# ^^^ Phase 1 was authored against SDK 1.x. The McpServer / StreamableHTTPServerTransport
# API shape in §G1 may differ on newer minor versions — adapt the registry, keep the contract.
# Dev / build / test
npm install --save-dev \
typescript@^5.6.0 \
tsx@^4.19.0 \
@types/node@^22.0.0 \
@types/pg@^8.11.0 \
drizzle-kit@^0.28.0 \
vitest@^2.1.0 \
eslint@^9.0.0 \
typescript-eslint@^8.0.0

Acceptance: `npm install` completes with zero high/critical advisories (`npm audit --audit-level=high` exits 0).
A1. Initialise project files
Files:
`package.json`, `tsconfig.json`, `.gitignore`, `.editorconfig`, `.nvmrc`
package.json (relevant fields):
{
"name": "deneva-mcp",
"private": true,
"type": "module",
"engines": { "node": ">=22.0.0 <23" },
"scripts": {
"dev": "tsx watch src/index.ts",
"build": "tsc --project tsconfig.json",
"start": "node dist/index.js",
"typecheck": "tsc --noEmit",
"lint": "eslint . --max-warnings=0",
"test": "vitest run",
"test:watch":"vitest",
"audit": "npm audit --audit-level=high",
"db:migrate":"drizzle-kit migrate",
"db:studio": "drizzle-kit studio"
}
}

tsconfig.json essentials:
{
"compilerOptions": {
"target": "ES2023",
"module": "ESNext",
"moduleResolution": "Bundler",
"strict": true,
"noUncheckedIndexedAccess": true,
"exactOptionalPropertyTypes": true,
"noImplicitOverride": true,
"isolatedModules": true,
"resolveJsonModule": true,
"outDir": "dist",
"rootDir": "src"
},
"include": ["src/**/*"]
}

.gitignore (must include):
node_modules
dist
.env
.env.*
secrets/
coverage

.nvmrc: `22`.
Acceptance: npm install && npm run typecheck exits 0 on an empty src/index.ts.
A2. ESLint with the SQL-injection guard
File: eslint.config.js
The non-negotiable rule: ban template-string interpolation inside db.execute(sql`…`) and similar. Drizzle's parameterized helpers (eq, and, sql`…${param}`) remain allowed; raw template construction does not.
// eslint.config.js
import tseslint from 'typescript-eslint';
export default tseslint.config(
...tseslint.configs.recommendedTypeChecked,
{
languageOptions: { parserOptions: { project: './tsconfig.json' } },
rules: {
'no-restricted-syntax': [
'error',
{
// db.execute(sql`...${var}...`) where the sql tag is given a raw template literal
selector:
"CallExpression[callee.property.name='execute'] > TaggedTemplateExpression[tag.name='sql'][quasi.expressions.length>0]",
message:
'Raw template interpolation in db.execute(sql`...`) is forbidden. Use parameterized helpers (eq, and, sql.placeholder) instead.',
},
],
},
},
);

Acceptance: `npm run lint` flags a deliberately-injected db.execute(sql`SELECT * FROM x WHERE id = ${userId}`) test fixture and exits non-zero.
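A minimal fixture that should trip the rule could look like the following (the file path and the userId variable are illustrative, not part of the repo layout above; the fixture has to live somewhere the type-checked ESLint project config covers, and it is never imported at runtime):

// tests/fixtures/lint/raw-sql-interpolation.ts (expected to FAIL npm run lint)
import { sql } from 'drizzle-orm';
import { db } from '../../../src/db/index.js';

const userId = 'not-a-real-id';
// Matches the no-restricted-syntax selector: a `sql` tagged template with interpolated
// expressions passed directly to db.execute().
export const bad = db.execute(sql`SELECT * FROM api_keys WHERE id = ${userId}`);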
Workstream B — Database & schema
B1. Local Postgres via docker-compose
File: docker-compose.yml
services:
postgres:
image: postgres:16-alpine
restart: unless-stopped
environment:
POSTGRES_DB: deneva_mcp
POSTGRES_USER: mcp_admin
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-dev_only_password}
ports:
- "127.0.0.1:5432:5432" # bound to localhost only
volumes:
- mcp_pg_data:/var/lib/postgresql/data
volumes:
mcp_pg_data:

No TLS in dev. The postgres:16-alpine image does not ship the Debian ssl-cert package, so the previous draft's snake-oil cert paths would crash the container on boot. Production uses a CA-signed cert mounted via the systemd unit (Phase 5 §K). The ssl: { rejectUnauthorized: false } in src/db/index.ts reflects this — connections succeed regardless of TLS posture in dev.
Acceptance: docker compose up -d postgres && docker compose exec postgres psql -U mcp_admin -d deneva_mcp -c "select 1" returns 1.
B2. Drizzle schema + initial migration
Files:
drizzle.config.ts, src/db/schema.ts, src/db/index.ts, src/db/migrations/0000_init.sql (generated by drizzle-kit generate)
src/db/schema.ts — paste the seven pgTable definitions from the architecture doc (tenants, apiKeys, platformCredentials, oauthStates, metricCache, auditLog, syncLog). Prepend the imports the architecture doc omits:
import { pgTable, uuid, text, timestamp, jsonb, integer } from 'drizzle-orm/pg-core';

drizzle.config.ts — drizzle-kit runs as a CLI, as mcp_admin (privileged), separate from the runtime pool which connects as mcp_app. Read the admin password synchronously from the dev secrets dir:
// drizzle.config.ts
import { defineConfig } from 'drizzle-kit';
import { readFileSync } from 'node:fs';
import { join } from 'node:path';
const password = readFileSync(
process.env.NODE_ENV === 'production'
? join('/run/credentials', process.env.SYSTEMD_UNIT ?? 'deneva-mcp.service', 'DB_ADMIN_PASSWORD')
: join(process.cwd(), 'secrets', 'DB_ADMIN_PASSWORD'),
'utf8',
).trim();
export default defineConfig({
schema: './src/db/schema.ts',
out: './src/db/migrations',
dialect: 'postgresql',
dbCredentials: {
host: process.env.DB_HOST ?? '127.0.0.1',
port: 5432,
user: 'mcp_admin',
password,
database: 'deneva_mcp',
ssl: false, // dev; prod uses CA-signed cert (Phase 5)
},
strict: true,
});

Why the split:

- mcp_app (runtime) lacks DDL grants — it cannot CREATE TABLE.
- mcp_admin (migrations) has full privileges and bypasses RLS, which is correct for migrations and seed scripts but must not be used at runtime. Production uses a separate encrypted DB_ADMIN_PASSWORD credential that lives only on the deploy host, not in the running app's credential set.
src/db/index.ts:
import { drizzle } from 'drizzle-orm/node-postgres';
import { Pool } from 'pg';
import * as schema from './schema.js';
import { loadSecret } from '../security/secrets.loader.js';
const pool = new Pool({
host: process.env.DB_HOST ?? '127.0.0.1',
port: 5432,
user: 'mcp_app',
password: (await loadSecret('DB_PASSWORD')).toString('utf8'),
database: 'deneva_mcp',
ssl: { rejectUnauthorized: false }, // dev; production uses CA-signed cert
max: 20, // see §B6; sized for Phase 4 sync fan-out
});
export const db = drizzle(pool, { schema });

Acceptance: `npm run db:migrate` creates all seven tables. `\d+ audit_log` in psql shows the columns from the architecture doc.
B3. Row-level security policies
File: src/db/rls.sql (run as mcp_admin after migrations)
ALTER TABLE metric_cache ENABLE ROW LEVEL SECURITY;
ALTER TABLE platform_credentials ENABLE ROW LEVEL SECURITY;
ALTER TABLE sync_log ENABLE ROW LEVEL SECURITY;
ALTER TABLE api_keys ENABLE ROW LEVEL SECURITY;
-- Tenant isolation: every tenant-scoped query must SET app.current_tenant_id first.
CREATE POLICY tenant_isolation_metric_cache ON metric_cache
USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
CREATE POLICY tenant_isolation_creds ON platform_credentials
USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
CREATE POLICY tenant_isolation_sync_log ON sync_log
USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
CREATE POLICY tenant_isolation_api_keys ON api_keys
USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
-- audit_log is intentionally NOT RLS-isolated: cross-tenant security review needs full visibility.
-- Access control on audit_log is via role grants in roles.sql.

Setting the tenant context must happen on every request, immediately after auth:
// inside tenant.middleware.ts, after tenantId is resolved:
await db.execute(sql`SELECT set_config('app.current_tenant_id', ${tenantId}, true)`);

(true = transaction-local; the setting clears at end of transaction. Pair with a per-request transaction or a connection-pool reset hook.)
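A minimal sketch of that pairing, assuming a new helper module (withTenantContext is illustrative and not part of the file list above):

// src/db/tenant-context.ts (illustrative sketch)
import { sql } from 'drizzle-orm';
import { db } from './index.js';

// Runs `fn` inside a single transaction with app.current_tenant_id set. Because the
// third set_config argument is `true`, the setting is scoped to this transaction and
// clears automatically when it commits or rolls back.
export async function withTenantContext<T>(
  tenantId: string,
  fn: (tx: Parameters<Parameters<typeof db.transaction>[0]>[0]) => Promise<T>,
): Promise<T> {
  return db.transaction(async (tx) => {
    await tx.execute(sql`SELECT set_config('app.current_tenant_id', ${tenantId}, true)`);
    return fn(tx);
  });
}

Tool handlers that touch tenant-scoped tables would then wrap their queries in withTenantContext(ctx.tenantId, ...) so RLS sees the context on every statement.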
B4. Database role separation
File: src/db/roles.sql (run once, as mcp_admin, after migrations have created the tables)
mcp_admin already exists — Postgres creates it on first boot from the POSTGRES_USER env var in §B1’s compose file. This script only creates mcp_app and applies grants on the tables that drizzle-kit migrate produced. Apply with:
psql "postgresql://mcp_admin:$ADMIN_PW@127.0.0.1:5432/deneva_mcp" \
-v app_password="$APP_PW" -f src/db/roles.sql

The -v flag is mandatory — without it, the :'app_password' placeholder is passed through unsubstituted and CREATE ROLE fails. On the Ubuntu host, where mcp_admin is not a superuser and lacks CREATEROLE, run the script as the postgres superuser instead (see §13 step 4 and §14 #1).
CREATE ROLE mcp_app LOGIN PASSWORD :'app_password' NOINHERIT;
GRANT CONNECT ON DATABASE deneva_mcp TO mcp_app;
GRANT USAGE ON SCHEMA public TO mcp_app;
GRANT SELECT, INSERT, UPDATE, DELETE
ON ALL TABLES IN SCHEMA public TO mcp_app;
GRANT USAGE
ON ALL SEQUENCES IN SCHEMA public TO mcp_app;
-- audit_log is INSERT-only for the application
REVOKE UPDATE, DELETE ON audit_log FROM mcp_app;
-- mcp_app must NOT bypass RLS
ALTER ROLE mcp_app NOBYPASSRLS;
-- mcp_admin already has full privileges (created by Postgres at boot).
-- Re-asserting NOBYPASSRLS for mcp_app is the only contract we enforce here.

Acceptance: as mcp_app, INSERT INTO audit_log ... succeeds; UPDATE audit_log SET outcome='success' fails with permission denied.
B5. RLS verification integration test
File: tests/integration/rls.test.ts
import { describe, it, expect } from 'vitest';
import { db } from '../../src/db/index.js';
import { metricCache, tenants } from '../../src/db/schema.js';
import { sql } from 'drizzle-orm';
describe('row-level security', () => {
it('blocks tenant A from reading tenant B rows', async () => {
const [a] = await db.insert(tenants).values({ name: 'A' }).returning();
const [b] = await db.insert(tenants).values({ name: 'B' }).returning();
// Insert a row owned by tenant B, with B's context set.
await db.execute(sql`SELECT set_config('app.current_tenant_id', ${b.id}, false)`);
await db.insert(metricCache).values({
tenantId: b.id, platform: 'google', reportType: 'health',
dateRangeKey: 'last_7', data: {}, expiresAt: new Date(Date.now() + 60_000),
});
// Switch to tenant A's context.
await db.execute(sql`SELECT set_config('app.current_tenant_id', ${a.id}, false)`);
const visible = await db.select().from(metricCache);
expect(visible).toHaveLength(0); // RLS hides B's row from A
});
});

Acceptance: test passes against a DB connection authenticated as mcp_app (NOT mcp_admin — admin bypasses RLS).
B6. Migration rollback strategy
Drizzle Kit does not auto-generate down migrations. Phase 3 drops & recreates indexes; Phase 4 adds non-nullable columns. Both can fail mid-deploy on prod data the dev DB never saw. The startup-grade rollback strategy is restore from a pre-migration dump — not maintaining a parallel set of down.sql files (cheap to write, expensive to keep correct).
The deploy procedure (used in CI/CD on the Ubuntu host once Phase 5 §K is in place):
# Take a snapshot dump immediately before applying migrations.
TS=$(date -u +%Y%m%dT%H%M%SZ)
pg_dump --format=custom --no-owner --no-privileges deneva_mcp \
> /var/backups/deneva-mcp/pre-migrate-${TS}.dump
# Apply migrations.
npm run db:migrate || {
echo "Migration failed; restore with:"
echo " dropdb deneva_mcp && createdb deneva_mcp"
echo " pg_restore --dbname=deneva_mcp /var/backups/deneva-mcp/pre-migrate-${TS}.dump"
exit 1
}Local dev rollback: docker compose down -v && docker compose up -d postgres && npm run db:migrate wipes and reapplies — usually faster than reasoning about a partial failure.
For destructive migrations (drop/recreate index, NOT NULL on existing column), commit a one-liner *.rollback.sql next to the generated migration with the inverse SQL — used by an operator following docs/compliance/runbooks/database-restore.md (Phase 5) when a restore from dump would lose too much intervening data.
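A sketch of what such a rollback file could contain; the file name and index names are illustrative, since the real migration names come out of drizzle-kit generate:

-- src/db/migrations/0003_tighten_metric_cache_index.rollback.sql (illustrative)
-- Inverse of a forward migration that replaced an index on metric_cache.
DROP INDEX IF EXISTS metric_cache_tenant_platform_range_idx;
CREATE INDEX metric_cache_tenant_platform_idx ON metric_cache (tenant_id, platform);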
Acceptance: the deploy script aborts on a failed migration; the runbook mentioned above (added in Phase 5 §K6) references this pre-migrate dump path.
Workstream C — Secrets loader
C1. Two-backend loader
File: src/security/secrets.loader.ts
import { readFile } from 'node:fs/promises';
import { join } from 'node:path';
const REQUIRED_SECRETS = [
'CREDENTIAL_KEK',
'API_KEY_HMAC_SECRET',
'DB_PASSWORD',
'INNGEST_SIGNING_KEY',
] as const;
type SecretName = (typeof REQUIRED_SECRETS)[number];
const cache = new Map<SecretName, Buffer>();
export async function loadSecret(name: SecretName): Promise<Buffer> {
const hit = cache.get(name);
if (hit) return hit;
const path = process.env.NODE_ENV === 'production'
? join('/run/credentials', process.env.SYSTEMD_UNIT ?? 'deneva-mcp.service', name)
: join(process.cwd(), 'secrets', name);
const value = await readFile(path);
cache.set(name, value);
return value;
}
export async function verifyAllSecretsLoadable(): Promise<void> {
// Fail fast at startup — better than discovering a missing secret at first request.
for (const name of REQUIRED_SECRETS) await loadSecret(name);
}

C2. Startup verification
File: src/index.ts (entry point)
import { verifyAllSecretsLoadable } from './security/secrets.loader.js';
await verifyAllSecretsLoadable(); // throws if any required secret is missing
// ... continue Fastify bootstrap

C3. Dev secrets bootstrap script
File: scripts/dev-secrets.sh
#!/usr/bin/env bash
set -euo pipefail
mkdir -p secrets
chmod 700 secrets
gen() { head -c 32 /dev/urandom | base64 > "secrets/$1" && chmod 600 "secrets/$1"; }
[[ -f secrets/CREDENTIAL_KEK ]] || gen CREDENTIAL_KEK
[[ -f secrets/API_KEY_HMAC_SECRET ]] || gen API_KEY_HMAC_SECRET
[[ -f secrets/DB_PASSWORD ]] || echo -n 'dev_only_password' > secrets/DB_PASSWORD
[[ -f secrets/DB_ADMIN_PASSWORD ]] || echo -n 'dev_only_password' > secrets/DB_ADMIN_PASSWORD
[[ -f secrets/INNGEST_SIGNING_KEY ]] || gen INNGEST_SIGNING_KEY
chmod 600 secrets/DB_PASSWORD secrets/DB_ADMIN_PASSWORD
echo "Dev secrets in ./secrets/ (gitignored)."
DB_ADMIN_PASSWORD is not in REQUIRED_SECRETS. The runtime app connects as mcp_app and never needs the admin password. It is consumed only by drizzle.config.ts (migrations) and scripts/seed-tenant.mjs. In dev it matches the compose POSTGRES_PASSWORD default; in prod it lives on the deploy host's encrypted credential store, separate from the app's runtime credentials.
C4. systemd unit reference (production — copy-paste, do not commit a real one in Phase 1)
# /etc/systemd/system/deneva-mcp.service
[Service]
Type=simple
User=deneva-mcp
WorkingDirectory=/opt/deneva-mcp
ExecStart=/usr/bin/node dist/index.js
Environment=NODE_ENV=production
Environment=SYSTEMD_UNIT=deneva-mcp.service
LoadCredentialEncrypted=CREDENTIAL_KEK:/etc/deneva-mcp/creds/CREDENTIAL_KEK.cred
LoadCredentialEncrypted=API_KEY_HMAC_SECRET:/etc/deneva-mcp/creds/API_KEY_HMAC_SECRET.cred
LoadCredentialEncrypted=DB_PASSWORD:/etc/deneva-mcp/creds/DB_PASSWORD.cred
LoadCredentialEncrypted=INNGEST_SIGNING_KEY:/etc/deneva-mcp/creds/INNGEST_SIGNING_KEY.cred
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true

(Full hardening flag set lands in Phase 5.)
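For reference, the .cred files that LoadCredentialEncrypted= points at are produced on the host with systemd-creds; the destination paths below simply mirror the unit above, and the shell variable is illustrative:

# On the production host, as root, once per secret.
# --name must match the credential name used in LoadCredentialEncrypted=.
mkdir -p /etc/deneva-mcp/creds
printf '%s' "$DB_PASSWORD_VALUE" | \
  systemd-creds encrypt --name=DB_PASSWORD - /etc/deneva-mcp/creds/DB_PASSWORD.cred
# Repeat for CREDENTIAL_KEK, API_KEY_HMAC_SECRET, INNGEST_SIGNING_KEY.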
Acceptance: starting the server with one of the four secret files removed produces a clear error and exit code != 0 before any port is opened.
Workstream D — API key auth
D1. Key service
File: src/security/api-key.service.ts
import { createHmac, randomBytes, timingSafeEqual } from 'node:crypto';
import { loadSecret } from './secrets.loader.js';
const HMAC_SECRET = await loadSecret('API_KEY_HMAC_SECRET');
export function generateApiKey(): string {
// 32 bytes = 256 bits, base64url-encoded; shown to user once, never stored
return randomBytes(32).toString('base64url');
}
export function hashApiKey(rawKey: string): string {
return createHmac('sha256', HMAC_SECRET).update(rawKey).digest('hex');
}
export function verifyApiKey(rawKey: string, storedHash: string): boolean {
const candidate = Buffer.from(hashApiKey(rawKey));
const stored = Buffer.from(storedHash);
if (candidate.length !== stored.length) return false;
return timingSafeEqual(candidate, stored);
}

D2. Tenant middleware
File: src/security/tenant.middleware.ts
import type { FastifyPluginAsync } from 'fastify';
import { and, eq, gt, isNull } from 'drizzle-orm';
import { db } from '../db/index.js';
import { apiKeys } from '../db/schema.js';
import { hashApiKey } from './api-key.service.js';
import { writeAuditEvent } from './audit-log.service.js';
declare module 'fastify' {
interface FastifyRequest { tenantId?: string; apiKeyId?: string }
}
export const tenantAuthPlugin: FastifyPluginAsync = async (fastify) => {
fastify.addHook('preHandler', async (req, reply) => {
if (!req.url.startsWith('/mcp')) return; // auth only on /mcp routes
const raw = req.headers['x-api-key'];
if (typeof raw !== 'string' || raw.length === 0) {
await writeAuditEvent('api_key.auth_failure', 'failure', { ip: req.ip, reason: 'missing' });
return reply.code(401).send({ error: 'unauthorized' });
}
const hash = hashApiKey(raw);
const now = new Date();
const [row] = await db.select().from(apiKeys).where(
and(
eq(apiKeys.keyHash, hash),
isNull(apiKeys.revokedAt),
gt(apiKeys.expiresAt, now),
),
);
if (!row) {
await writeAuditEvent('api_key.auth_failure', 'failure', { ip: req.ip, reason: 'invalid' });
return reply.code(401).send({ error: 'unauthorized' });
}
req.tenantId = row.tenantId;
req.apiKeyId = row.id;
// Async fire-and-forget: lastUsedAt update should not block the request.
void db.update(apiKeys).set({ lastUsedAt: now }).where(eq(apiKeys.id, row.id));
await writeAuditEvent('api_key.auth_success', 'success', { tenantId: row.tenantId });
});
};

Note on constant-time behaviour: doing the DB lookup after hashing means the invalid-key and valid-key paths take similar time. The cheap missing-header path returns earlier, but it doesn't leak whether a specific key is valid — only whether the header was present.
D3. Rotation endpoint (pulled forward from Phase 5)
File: src/auth/admin-routes.ts
import type { FastifyPluginAsync } from 'fastify';
import { eq } from 'drizzle-orm';
import { z } from 'zod';
import { timingSafeEqual } from 'node:crypto';
import { db } from '../db/index.js';
import { apiKeys } from '../db/schema.js';
import { generateApiKey, hashApiKey } from '../security/api-key.service.js';
import { writeAuditEvent } from '../security/audit-log.service.js';
import { loadSecret } from '../security/secrets.loader.js';
const ADMIN_HEADER_NAME = 'x-admin-token';
const adminToken = Buffer.from((await loadSecret('API_KEY_HMAC_SECRET')).toString('hex')); // Phase 1 placeholder; Phase 5 introduces a separate ADMIN_TOKEN secret
function adminTokenMatches(presented: unknown): boolean {
if (typeof presented !== 'string') return false;
const cand = Buffer.from(presented);
if (cand.length !== adminToken.length) return false;
return timingSafeEqual(cand, adminToken);
}
const RotateBody = z.object({ tenantId: z.string().uuid(), description: z.string().min(1).max(120) });
export const adminRoutes: FastifyPluginAsync = async (fastify) => {
fastify.post('/admin/api-keys/rotate', async (req, reply) => {
if (!adminTokenMatches(req.headers[ADMIN_HEADER_NAME])) return reply.code(401).send();
const body = RotateBody.parse(req.body);
const newKey = generateApiKey();
const newHash = hashApiKey(newKey);
const graceUntil = new Date(Date.now() + 24 * 60 * 60 * 1000);
await db.transaction(async (tx) => {
// Mark previous active keys for this tenant with a 24h grace expiry — do NOT revoke immediately.
await tx.update(apiKeys)
.set({ expiresAt: graceUntil })
.where(eq(apiKeys.tenantId, body.tenantId));
await tx.insert(apiKeys).values({
tenantId: body.tenantId,
keyHash: newHash,
description: body.description,
expiresAt: new Date(Date.now() + 365 * 24 * 60 * 60 * 1000),
});
});
await writeAuditEvent('api_key.rotated', 'success', { tenantId: body.tenantId });
return { apiKey: newKey, graceUntil }; // shown once, never retrievable again
});
};

Phase 5 follow-up: replace the placeholder admin token with a dedicated ADMIN_TOKEN secret, move the route behind nginx with an IP allow-list, and add a separate revoke-immediately variant.
D4. Acceptance tests
File: tests/integration/auth.test.ts
Required cases:
1. POST /mcp without X-Api-Key → 401, audit row api_key.auth_failure with reason: 'missing'. (A hedged sketch of this case follows the list.)
2. POST /mcp with a random key → 401, audit row with reason: 'invalid'. Timing within 20% of case 3 (no constant-time guarantee in tests, but a sanity check).
3. POST /mcp with a valid key → 200, audit row api_key.auth_success, lastUsedAt updated.
4. After rotation: the old key works for ≤24h, the new key works immediately. After 25h (advance fake clock), the old key returns 401.
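A sketch of case 1, assuming a buildApp() test helper that registers all plugins and routes without calling listen() (no such helper exists in the file list above):

// tests/integration/auth.test.ts (sketch of case 1 only)
import { describe, it, expect } from 'vitest';
import { buildApp } from '../helpers/build-app.js'; // assumed helper, not yet in the repo

describe('API key auth', () => {
  it('returns 401 and audits a missing-key failure', async () => {
    const app = await buildApp();
    const res = await app.inject({
      method: 'POST',
      url: '/mcp',
      headers: { 'content-type': 'application/json' },
      payload: { jsonrpc: '2.0', id: 1, method: 'initialize', params: {} },
    });
    expect(res.statusCode).toBe(401);
    // Then assert the newest audit_log row is api_key.auth_failure with reason 'missing',
    // querying through the same test DB connection the app uses.
    await app.close();
  });
});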
Workstream E — Rate limiting
E1. Global per-IP limit
File: src/security/rate-limiter.plugin.ts
import type { FastifyPluginAsync } from 'fastify';
import rateLimit from '@fastify/rate-limit';
export const rateLimiterPlugin: FastifyPluginAsync = async (fastify) => {
await fastify.register(rateLimit, {
global: true,
max: 100,
timeWindow: 60_000,
keyGenerator: (req) => req.ip,
errorResponseBuilder: (req, _ctx) => {
// The plugin invokes this on every blocked request — do the audit write here.
// Lazy import keeps audit-log.service out of the plugin's import cycle.
void import('./audit-log.service.js').then(({ writeAuditEvent }) =>
writeAuditEvent('rate_limit.exceeded', 'failure', { ip: req.ip, scope: 'global' }),
);
return { error: 'rate_limit_exceeded' };
},
});
};

Why errorResponseBuilder instead of an onExceeded hook: @fastify/rate-limit v10 does not expose a per-route onExceeded callback — the response builder is the single point that fires on every 429. The audit write is fire-and-forget (void) so it doesn't add latency to the response.
E2. Per-tenant limit
// inside the same plugin file
import { RateLimiterMemory } from 'rate-limiter-flexible';
const tenantLimiter = new RateLimiterMemory({ points: 300, duration: 60 });
fastify.addHook('preHandler', async (req, reply) => {
if (!req.tenantId) return;
try { await tenantLimiter.consume(req.tenantId, 1); }
catch {
const { writeAuditEvent } = await import('./audit-log.service.js');
await writeAuditEvent('rate_limit.exceeded', 'failure', { tenantId: req.tenantId, scope: 'tenant' });
return reply.code(429).send({ error: 'rate_limit_exceeded' });
}
});

E3. Strict /auth/* limit (placeholder routes — real ones in Phase 2)
fastify.register(async (inst) => {
await inst.register(rateLimit, { max: 5, timeWindow: 15 * 60_000, keyGenerator: (req) => req.ip });
inst.get('/auth/_phase1_placeholder', async () => ({ ok: true }));
}, { prefix: '/' });

E4. Auth-failure IP block

Single-process assumption. The IP-block map below and the per-tenant rate-limit bucket in §E2 both live in process memory. This is fine for the systemd-direct deployment Phase 5 §B2 specifies (one Node process per host). The architecture doc's ecosystem.config.js example shows PM2 cluster mode with instances: 2; we have deliberately diverged from that — clustering would require moving these maps to Redis or a blocked_ips DB table. If you choose to scale horizontally later, those two structures are the migration targets.
After 10 api_key.auth_failure events from the same IP within an hour:
- Insert a row in an in-memory blockedIps map (Phase 1 — Redis in Phase 5) with expiresAt = now + 1h.
- Subsequent requests from that IP short-circuit at the first preHandler with 401 + audit event auth.blocked_ip.
- A simple setInterval(cleanup, 60_000) removes expired entries.
// security/ip-block.service.ts — sketch
const blocks = new Map<string, number>();      // ip -> block expiry (ms epoch)
const failures = new Map<string, number[]>();  // ip -> failure timestamps in the sliding 1h window
export function isBlocked(ip: string): boolean {
  const exp = blocks.get(ip);
  if (!exp) return false;
  if (exp < Date.now()) { blocks.delete(ip); return false; }
  return true;
}
// Sliding-window count; returns true when the 10th failure within an hour trips the block.
export function recordFailure(ip: string): boolean {
  const now = Date.now();
  const recent = [...(failures.get(ip) ?? []).filter((t) => now - t < 3_600_000), now];
  failures.set(ip, recent);
  if (recent.length < 10) return false;
  blocks.set(ip, now + 3_600_000);
  return true;
}
File: tests/integration/rate-limit.test.ts
Use Fastify’s built-in app.inject() (no separate package — it’s exposed on every Fastify instance) + a fake clock (vi.useFakeTimers()):
- 100 requests from one IP succeed; the 101st returns 429 + audit row with scope: 'global'. (A hedged sketch of this case follows the list.)
- 300 requests from one tenant succeed; the 301st returns 429 + audit row with scope: 'tenant'.
- 11 auth-failure attempts from one IP → IP block engages; a valid key from that IP also gets 401 with an auth.blocked_ip audit row until the clock advances 1h.
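A sketch of the global-limit case, assuming the same buildApp() helper as in the D4 sketch; it hits the unauthenticated /health route purely to exercise the global limiter without MCP session setup, and uses app.inject's remoteAddress option to fix the client IP the limiter keys on:

// tests/integration/rate-limit.test.ts (sketch of the global per-IP case)
import { describe, it, expect } from 'vitest';
import { buildApp } from '../helpers/build-app.js'; // assumed helper, not yet in the repo

describe('global per-IP rate limit', () => {
  it('returns 429 on the 101st request in a one-minute window', async () => {
    const app = await buildApp();
    const fire = () =>
      app.inject({ method: 'GET', url: '/health', remoteAddress: '203.0.113.9' });
    for (let i = 0; i < 100; i++) expect((await fire()).statusCode).toBe(200);
    expect((await fire()).statusCode).toBe(429); // the 101st request trips the limiter
    await app.close();
  });
});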
Workstream F — Audit log
F1. Service
File: src/security/audit-log.service.ts
import { db } from '../db/index.js';
import { auditLog } from '../db/schema.js';
export type AuditEventType =
| 'api_key.auth_success' | 'api_key.auth_failure'
| 'api_key.created' | 'api_key.rotated' | 'api_key.revoked'
| 'oauth.flow_started' | 'oauth.flow_completed' | 'oauth.flow_failed'
| 'oauth.token_refreshed' | 'oauth.token_revoked'
| 'mcp.tool_called' | 'mcp.tool_failed'
| 'tenant.created' | 'tenant.deleted'
| 'rate_limit.exceeded' | 'auth.blocked_ip' | 'sync.exhausted';
interface Ctx {
tenantId?: string; ip?: string; requestId?: string;
reason?: string; scope?: string; tool?: string;
[k: string]: unknown;
}
const PII_KEYS = new Set(['email', 'name', 'firstName', 'lastName', 'phone', 'address']);
function stripPii(meta: Ctx): Ctx {
const out: Ctx = {};
for (const [k, v] of Object.entries(meta)) if (!PII_KEYS.has(k)) out[k] = v;
return out;
}
export async function writeAuditEvent(
eventType: AuditEventType,
outcome: 'success' | 'failure',
ctx: Ctx = {},
): Promise<void> {
const { tenantId, ip, requestId, ...rest } = ctx;
await db.insert(auditLog).values({
tenantId: tenantId ?? null,
eventType,
actorIp: ip ?? null,
requestId: requestId ?? null,
outcome,
metadata: stripPii(rest as Ctx),
});
}

F2. Wire-up checklist
- tenantAuthPlugin calls writeAuditEvent('api_key.auth_*', ...) on every result.
- rateLimiterPlugin calls writeAuditEvent('rate_limit.exceeded', ...) on every 429.
- IP-block service calls writeAuditEvent('auth.blocked_ip', ...) on engagement.
- Stub tools call writeAuditEvent('mcp.tool_*', ...) on entry/exit.
- Rotation endpoint calls writeAuditEvent('api_key.rotated', ...).
F3. Acceptance
File: tests/integration/audit.test.ts
- Each test in §D4 / §E5 asserts the corresponding audit row appears.
- A negative test connects as mcp_app and runs UPDATE audit_log SET outcome='success' WHERE id = ... and DELETE FROM audit_log WHERE ... — both must throw permission denied.
- The metadata column for an api_key.auth_failure row never contains an email key even if the test injects one. (A hedged sketch of this case follows the list.)
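A sketch of the PII case, exercising writeAuditEvent directly; the ordering column (createdAt) is an assumption about the audit_log schema from the architecture doc:

// tests/integration/audit.test.ts (sketch of the PII-stripping case)
import { describe, it, expect } from 'vitest';
import { desc } from 'drizzle-orm';
import { db } from '../../src/db/index.js';
import { auditLog } from '../../src/db/schema.js';
import { writeAuditEvent } from '../../src/security/audit-log.service.js';

describe('audit metadata PII stripping', () => {
  it('drops email even when a caller injects it', async () => {
    await writeAuditEvent('api_key.auth_failure', 'failure', {
      ip: '203.0.113.9', reason: 'invalid', email: 'leak@example.com',
    });
    // createdAt is assumed; order by whatever timestamp column the schema defines.
    const [row] = await db.select().from(auditLog).orderBy(desc(auditLog.createdAt)).limit(1);
    expect(row?.metadata).not.toHaveProperty('email');
    expect(row?.metadata).toMatchObject({ reason: 'invalid' });
  });
});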
Workstream G — MCP server + stub tools
G1. Tool registry
File: src/mcp/server.ts
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js';
import type { FastifyInstance } from 'fastify';
import { pingTool } from './tools/ping.js';
import { accountHealthTool } from './tools/account-health.js';
import { writeAuditEvent } from '../security/audit-log.service.js';
export interface ToolDef<I, O> {
name: string;
description: string;
inputSchema: import('zod').ZodType<I>;
handler: (input: I, ctx: { tenantId: string; requestId: string }) => Promise<O>;
}
const TOOLS: ToolDef<unknown, unknown>[] = [pingTool, accountHealthTool];
export function buildMcpServer(): McpServer {
const server = new McpServer({ name: 'deneva-mcp', version: '0.1.0' });
for (const t of TOOLS) {
server.tool(t.name, t.description, t.inputSchema, async (input, extra) => {
// tenantId + requestId are propagated via the transport's request-context (set in mountMcp)
const ctx = extra.context as { tenantId: string; requestId: string };
try {
return await t.handler(input as never, ctx as never);
} catch (err) {
// Single producer for mcp.tool_failed in Phase 1 — every later phase reuses this path.
// Phase 2's typed-error path (AdapterError) flows through here too.
await writeAuditEvent('mcp.tool_failed', 'failure', {
tenantId: ctx.tenantId,
requestId: ctx.requestId,
tool: t.name,
reason: err instanceof Error ? err.message : 'unknown',
});
throw err; // let the MCP transport translate to a JSON-RPC error
}
});
}
return server;
}
export async function mountMcp(fastify: FastifyInstance): Promise<void> {
fastify.post('/mcp', async (req, reply) => {
const transport = new StreamableHTTPServerTransport({
sessionIdGenerator: () => crypto.randomUUID(),
});
const server = buildMcpServer();
await server.connect(transport);
await transport.handleRequest(req.raw, reply.raw, { tenantId: req.tenantId!, requestId: req.id });
});
}

The exact MCP SDK call shape may differ slightly between SDK versions; the registry pattern above is the contract Phase 2 must preserve.
G2. ping tool
File: src/mcp/tools/ping.ts
import { z } from 'zod';
import type { ToolDef } from '../server.js';
import { writeAuditEvent } from '../../security/audit-log.service.js';
const Input = z.object({});
export const pingTool: ToolDef<z.infer<typeof Input>, { ok: true; tenantId: string; requestId: string }> = {
name: 'ping',
description: 'Health-check tool — verifies the entire middleware stack.',
inputSchema: Input,
async handler(_input, ctx) {
await writeAuditEvent('mcp.tool_called', 'success', { tenantId: ctx.tenantId, requestId: ctx.requestId, tool: 'ping' });
return { ok: true, tenantId: ctx.tenantId, requestId: ctx.requestId };
},
};

G3. get_account_health stub
File: src/mcp/tools/account-health.ts
import { z } from 'zod';
import type { ToolDef } from '../server.js';
import { writeAuditEvent } from '../../security/audit-log.service.js';
const Input = z.object({
platform: z.enum(['google', 'meta', 'tiktok']),
dateRange: z.enum(['last_7_days', 'last_30_days', 'last_90_days']),
});
interface Output {
platform: 'google' | 'meta' | 'tiktok';
dateRange: string;
metrics: { spend: number; roas: number; cpa: number; ctr: number };
_stub: true; // removed in Phase 2 when real data lands
}
export const accountHealthTool: ToolDef<z.infer<typeof Input>, Output> = {
name: 'get_account_health',
description: 'Returns spend, ROAS, CPA, CTR for the given platform and date range.',
inputSchema: Input,
async handler(input, ctx) {
await writeAuditEvent('mcp.tool_called', 'success', { tenantId: ctx.tenantId, requestId: ctx.requestId, tool: 'get_account_health' });
return {
platform: input.platform,
dateRange: input.dateRange,
metrics: { spend: 0, roas: 0, cpa: 0, ctr: 0 },
_stub: true,
};
},
};

G4. Server bootstrap
File: src/index.ts
import Fastify from 'fastify';
import helmet from '@fastify/helmet';
import { verifyAllSecretsLoadable } from './security/secrets.loader.js';
import { rateLimiterPlugin } from './security/rate-limiter.plugin.js';
import { tenantAuthPlugin } from './security/tenant.middleware.js';
import { adminRoutes } from './auth/admin-routes.js';
import { mountMcp } from './mcp/server.js';
import { startOAuthStateCleanup } from './auth/oauth-state.service.js';
await verifyAllSecretsLoadable();
// Pino logs structured JSON to stdout. In production the systemd unit captures stdout into
// journald — no log-shipping infrastructure is wired in Phase 1. Retention (size + time)
// is configured in Phase 5 §L; for dev, `journalctl -u deneva-mcp -f` tails the live log.
//
// Redaction has two layers:
// 1. Pino's built-in `redact` strips known request-header paths (api-key, admin-token).
// 2. A custom `err` serializer scrubs token-shaped strings and token-named fields from
// arbitrary log objects — adapter SDKs sometimes echo `access_token` into error
// messages, and we don't want those in journald. Phase 5 §L2's grep evidence check
// then becomes a regression-detector, not the primary defence.
const TOKEN_KEY_RE = /access_token|refresh_token|client_secret|authorization|api[_-]?key|x-admin-token/i;
function scrubTokens(value: unknown, depth = 0): unknown {
if (depth > 6 || value == null) return value;
if (typeof value === 'string') {
return value
.replace(/Bearer\s+\S+/gi, 'Bearer [REDACTED]')
.replace(/ya29\.[A-Za-z0-9_-]+/g, '[REDACTED]')
.replace(/EAA[A-Za-z0-9]+/g, '[REDACTED]');
}
if (Array.isArray(value)) return value.map((v) => scrubTokens(v, depth + 1));
if (typeof value === 'object') {
const out: Record<string, unknown> = {};
for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
out[k] = TOKEN_KEY_RE.test(k) ? '[REDACTED]' : scrubTokens(v, depth + 1);
}
return out;
}
return value;
}
const app = Fastify({
logger: {
redact: ['req.headers["x-api-key"]', 'req.headers["x-admin-token"]'],
serializers: {
err: (e: Error) => scrubTokens({ type: e.name, message: e.message, stack: e.stack }),
},
},
genReqId: () => crypto.randomUUID(),
});
await app.register(helmet, { global: true });
await app.register(rateLimiterPlugin);
await app.register(tenantAuthPlugin);
await app.register(adminRoutes);
await mountMcp(app);
// Process-liveness probe for nginx / systemd / external uptime monitor.
// Unauthenticated by design — a 200 here means the Node process is alive and listening,
// not that downstream dependencies are healthy. Phase 4 §G3 owns the dependency-health
// endpoint (/admin/health/inngest); Phase 5 §G1 wires this to an external monitor.
const startedAt = Date.now();
app.get('/health', async () => ({
ok: true,
version: process.env.npm_package_version ?? '0.1.0',
uptimeSec: Math.floor((Date.now() - startedAt) / 1000),
}));
startOAuthStateCleanup();
await app.listen({ host: '127.0.0.1', port: 3001 });

G5. Acceptance
File: tests/integration/mcp-e2e.test.ts
- Drive both tools through /mcp with a real API key. Assert: 200, response shape matches the Zod output, mcp.tool_called audit row exists.
- Logging an Error whose message contains Bearer ya29.foo emits a JSON log line with the token segment redacted (assert on the parsed line — the literal ya29.foo and Bearer ya29 substrings must not appear). Same assertion for an error with a property named access_token.
- Manual curl examples documented:
# ping
curl -X POST https://localhost/mcp \
-H "X-Api-Key: $KEY" -H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"ping","arguments":{}}}'
# get_account_health
curl -X POST https://localhost/mcp \
-H "X-Api-Key: $KEY" -H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"get_account_health","arguments":{"platform":"google","dateRange":"last_7_days"}}}'Workstream H — OAuth state scaffolding
H1. Service skeleton
File: src/auth/oauth-state.service.ts
import { createHash, randomBytes } from 'node:crypto';
import { and, eq, gt, lt } from 'drizzle-orm';
import { db } from '../db/index.js';
import { oauthStates } from '../db/schema.js';
export async function createOAuthState(tenantId: string, platform: string) {
const state = randomBytes(32).toString('base64url');
const codeVerifier = randomBytes(32).toString('base64url');
const codeChallenge = createHash('sha256').update(codeVerifier).digest('base64url');
await db.insert(oauthStates).values({
state, codeVerifier, tenantId, platform,
expiresAt: new Date(Date.now() + 10 * 60 * 1000),
});
return { state, codeVerifier, codeChallenge };
}
export async function consumeOAuthState(state: string) {
const [row] = await db.delete(oauthStates)
.where(and(eq(oauthStates.state, state), gt(oauthStates.expiresAt, new Date())))
.returning();
if (!row) throw new Error('Invalid or expired OAuth state');
return row; // single-use: deleted on read
}
export function startOAuthStateCleanup(): void {
// Plain in-process timer is fine for Phase 1; replaced by Inngest cron in Phase 4.
setInterval(() => {
void db.delete(oauthStates).where(lt(oauthStates.expiresAt, new Date()));
}, 5 * 60_000).unref();
}

No HTTP routes here — the /auth/:platform/start and /callback endpoints land in Phase 2 with the Google adapter. Phase 1 only proves the storage and cleanup work.
H2. Acceptance
- Unit test: createOAuthState writes a row; consumeOAuthState returns and deletes it; a second call with the same state throws. (A hedged sketch follows the list.)
- Integration test: insert a row with expiresAt = now - 1ms, run startOAuthStateCleanup once (extracted as a callable for test), assert the row is gone.
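A sketch of the unit test; the fixed tenant UUID is illustrative, since oauth_states rows reference a real tenants row, so a fixture insert is assumed:

// tests/integration/oauth-state.test.ts (sketch of the single-use property)
import { describe, it, expect } from 'vitest';
import { createOAuthState, consumeOAuthState } from '../../src/auth/oauth-state.service.js';

describe('oauth state', () => {
  it('is single-use', async () => {
    // Assumes a tenant with this id was inserted by a test fixture beforehand.
    const tenantId = '00000000-0000-0000-0000-000000000000';
    const { state, codeVerifier } = await createOAuthState(tenantId, 'google');
    const row = await consumeOAuthState(state);
    expect(row.codeVerifier).toBe(codeVerifier);
    await expect(consumeOAuthState(state)).rejects.toThrow(/Invalid or expired/);
  });
});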
H3. Seed script — first tenant + API key
File: scripts/seed-tenant.mjs
A one-off bootstrap tool. Connects as mcp_admin (bypasses RLS — no tenant context to set), creates a tenant, mints a fresh API key, and prints the raw key once — there is no way to retrieve it later. Use the printed key as $KEY in the §13 smoke test.
// scripts/seed-tenant.mjs
//
// Usage:
// node scripts/seed-tenant.mjs "Acme Corp"
//
// Reads DB_ADMIN_PASSWORD and API_KEY_HMAC_SECRET from ./secrets/.
// Prints "API key: <raw>" to stdout. The raw key is shown ONCE — store it now.
import { readFileSync } from 'node:fs';
import { createHmac, randomBytes } from 'node:crypto';
import pg from 'pg';
const tenantName = process.argv[2] ?? 'dev-tenant';
const adminPw = readFileSync('secrets/DB_ADMIN_PASSWORD', 'utf8').trim();
const hmacKey = readFileSync('secrets/API_KEY_HMAC_SECRET');
const client = new pg.Client({
host: '127.0.0.1', port: 5432, user: 'mcp_admin',
password: adminPw, database: 'deneva_mcp',
});
await client.connect();
const rawKey = randomBytes(32).toString('base64url');
const keyHash = createHmac('sha256', hmacKey).update(rawKey).digest('hex');
const expires = new Date(Date.now() + 365 * 24 * 60 * 60 * 1000);
try {
await client.query('BEGIN');
const { rows: [tenant] } = await client.query(
'INSERT INTO tenants (name) VALUES ($1) RETURNING id',
[tenantName],
);
await client.query(
`INSERT INTO api_keys (tenant_id, key_hash, description, expires_at)
VALUES ($1, $2, $3, $4)`,
[tenant.id, keyHash, 'seed-tenant.mjs', expires],
);
await client.query('COMMIT');
console.log(`Tenant: ${tenant.id} (${tenantName})`);
console.log(`API key: ${rawKey}`);
console.log('Store this key now — it cannot be retrieved later.');
} catch (err) {
await client.query('ROLLBACK');
throw err;
} finally {
await client.end();
}

Acceptance: node scripts/seed-tenant.mjs "Acme Corp" prints a tenant UUID and a base64url key. psql ... -c "SELECT count(*) FROM api_keys WHERE tenant_id = '<that uuid>';" returns 1. curl -H "X-Api-Key: <that key>" ... to /mcp succeeds.
Workstream I — CI & dependency hygiene
I1. GitHub Actions workflow
File: .github/workflows/ci.yml
name: CI
on:
pull_request:
push: { branches: [main] }
jobs:
build:
runs-on: ubuntu-latest
services:
postgres:
image: postgres:16-alpine
env: { POSTGRES_PASSWORD: ci, POSTGRES_DB: deneva_mcp }
ports: ['5432:5432']
options: >-
--health-cmd "pg_isready -U postgres" --health-interval 5s
--health-timeout 5s --health-retries 10
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with: { node-version: '22', cache: 'npm' }
- run: npm ci
- run: bash scripts/dev-secrets.sh
- run: npm run typecheck
- run: npm run lint
- run: npm run audit # fails on high/critical advisories
- run: npm run db:migrate
- run: npm test

I2. Dependabot config
File: .github/dependabot.yml
version: 2
updates:
- package-ecosystem: "npm"
directory: "/"
schedule: { interval: "daily" }
open-pull-requests-limit: 10
groups:
minor-and-patch:
update-types: ["minor", "patch"]
- package-ecosystem: "github-actions"
directory: "/"
schedule: { interval: "weekly" }I3. Branch protection (informational — manual setup)
In the GitHub repo settings:
- Require status check build before merge.
- Require at least one review.
- Require linear history.
- Disallow force pushes to main.
I4. Acceptance
- A PR that introduces a dependency with a known high-severity advisory fails the build at the npm run audit step.
- A PR that breaks the RLS test fails at npm test.
- Dependabot creates at least one PR within 24h of merging the config.
§11 — Definition of Done (full checklist)
Status legend (as of 2026-05-07, post-Ubuntu-smoke-test)
- `[x]` — implementation complete in the repo, verifiable by reading the source. End-to-end-verified items also note the run that proved it.
- `[~]` — code shipped but not yet executed end-to-end (needs npm install + Ubuntu host run, see docs/setup-ubuntu.md).
- `[ ]` — not yet implemented.

The 2026-05-07 Ubuntu smoke-test surfaced eight bugs (now fixed) — see §14 for the catalogue.
A. Project bootstrap
- [~] npm run typecheck passes on a clean checkout. (Code in place; not yet run because node_modules are not installed in this checkout.)
- [x] npm run lint flags forbidden raw sql interpolation. (Rule wired in eslint.config.js.)
- [x] Node 22 enforced via engines and .nvmrc. (package.json engines + .nvmrc.)
B. Database & schema
- [x] All seven tables defined in src/db/schema.ts. (Migration generation deferred to first npm install run — npm run db:generate.)
- [x] RLS enabled on metric_cache, platform_credentials, sync_log. (src/db/rls.sql. api_keys is intentionally NOT under RLS — see §14 item #4 and the policy comment.)
- [x] mcp_app cannot UPDATE/DELETE audit_log; NOBYPASSRLS set. (src/db/roles.sql. Verified end-to-end on the 2026-05-07 Ubuntu smoke-test — mcp_app could see api_keys after RLS was lifted, audit_log inserts succeeded, deletes were blocked.)
- [x] RLS verification test written. (tests/integration/rls.test.ts — runs in CI against a real Postgres.)
- [x] Pool max set explicitly (20). (src/db/index.ts.)
- [x] Pre-migration pg_dump step documented in deploy procedure. (docs/setup-ubuntu.md "Update the application" section.)
C. Secrets loader
- [x] All four required secrets must load at startup or the process exits ≠ 0. (verifyAllSecretsLoadable() in src/security/secrets.loader.ts, called first in src/index.ts.)
- [x] No secret value is read from process.env. (Loader reads only from /run/credentials/... or ./secrets/.)
- [x] secrets/ directory is gitignored. (.gitignore.)
D. API key auth
- [x] HMAC-SHA256 (not plain SHA-256) hashes used. (src/security/api-key.service.ts. Verified 2026-05-07: hash computed live from the running service's HMAC secret matched the DB-stored hash byte-for-byte.)
- [x] timingSafeEqual used for comparison. (Same file.)
- [x] Rotation endpoint creates a new key with 24h grace on the old. (src/auth/admin-routes.ts.)
- [~] All four D4 acceptance tests pass. (tests/integration/api-key.test.ts covers the unit-level cases. The full E2E flow (missing/invalid/valid key + rotation + 25h-clock-advance) requires app.inject() against a booted Fastify instance — add in a follow-up before Phase 1 sign-off. Manual smoke-test on 2026-05-07 verified the success path: valid key → initialize → 200.)
- [x] Auth middleware actually runs on /mcp requests. (Plugin wrapped with fastify-plugin in src/security/tenant.middleware.ts — without it, Fastify's encapsulation hid the preHandler hook from the /mcp route. See §14 #6.)
E. Rate limiting
- [~] Global per-IP, per-tenant, and per-route limits enforced separately. (Global + per-tenant shipped — see src/security/rate-limiter.plugin.ts, split into globalRateLimiterPlugin + tenantRateLimiterPlugin so they bracket auth correctly. Per-route /auth/* strict limit deferred to Phase 2 when the real OAuth routes are added.)
- [x] IP block engages after 10 auth failures within 1h. (src/security/ip-block.service.ts, wired in src/security/tenant.middleware.ts.)
- [~] All E5 acceptance tests pass. (tests/integration/ip-block.test.ts covers the IP-block service. The full app.inject()-driven cases (101st global request, 301st tenant request, blocked-IP-with-valid-key) follow alongside the D4 E2E suite.)
F. Audit log
- [x] Every event in the §F2 wire-up checklist produces a row. (api_key.auth_* in tenant middleware; rate_limit.exceeded in rate limiter; auth.blocked_ip in tenant middleware on threshold trip; mcp.tool_called in each tool; mcp.tool_failed in src/mcp/server.ts — single producer; api_key.rotated in admin routes.)
- [x] PII keys are stripped from metadata. (src/security/audit-log.service.ts stripPii. Test: tests/integration/audit.test.ts.)
- [x] mcp_app cannot mutate or delete rows. (REVOKE in src/db/roles.sql; test in tests/integration/audit.test.ts.)
G. MCP server
- [x] initialize reachable through /mcp with a valid key. (2026-05-07 smoke-test: POST /mcp with initialize returns 200 with serverInfo: { name: "deneva-mcp", version: "0.1.0" } and an mcp-session-id header.)
- [x] Both ping and get_account_health registered. (src/mcp/server.ts registers both via the registry; uses non-deprecated server.registerTool().)
- [x] Tool registry pattern is set up so Phase 2 only adds files under mcp/tools/ + adapters/. (See docs/components/mcp-tools.md "Adding a new tool".)
- [x] Server binds to 127.0.0.1:3001 only. (src/index.ts app.listen({ host: '127.0.0.1', port: 3001 }).)
- [x] A thrown error inside any tool produces an mcp.tool_failed audit row (single producer). (Wrapper around every handler in src/mcp/server.ts.)
- [x] GET /health returns 200 with { ok, version, uptimeSec } and is unauthenticated. (In src/index.ts; auth middleware skips non-/mcp routes.)
- [x] Tool context (tenantId, requestId) flows to handlers. (Closure-based: buildMcpServer({ tenantId, requestId }) in src/mcp/server.ts builds a fresh server per request with context baked into each handler. The MCP SDK's transport.handleRequest(req, res, parsedBody) has no application-context slot — the third arg is the JSON-RPC body. See §14 #8.)
- [~] End-to-end MCP test (tests/integration/mcp-e2e.test.ts). (Pending — requires booting the full app inside vitest. Stateful tools/call over curl needs the initialize → notifications/initialized → tools/call session dance, which a real MCP client handles automatically. Same follow-up as D4/E5 E2E.)
H. OAuth state
- [x] oauth_states table populated and consumed in unit tests. (tests/integration/oauth-state.test.ts.)
- [x] Cleanup cron deletes expired rows. (startOAuthStateCleanup in src/auth/oauth-state.service.ts + _cleanupOAuthStatesNow test.)
I. CI
- [x] CI runs typecheck, lint, audit, migrations, tests against a real Postgres. (.github/workflows/ci.yml.)
- [x] Dependabot config merged. (.github/dependabot.yml.)
- [x] CI fails on a high-severity advisory. (npm audit --audit-level=high step in the workflow.)
Outstanding before Phase 1 sign-off
- Run npm install once (locally or on the CI runner) so package-lock.json is generated and committed. This unblocks the typecheck / lint / migration generation steps.
- Generate the initial migration: npm run db:generate — produces src/db/migrations/0000_*.sql from the schema. Commit it.
- Add the three E2E suites flagged [~] above (auth.test.ts, rate-limit.test.ts, mcp-e2e.test.ts) — they need a booted Fastify instance via app.inject() and Postgres up. The CI workflow already provisions Postgres, so they slot in without infra changes.
- Smoke-test on the Ubuntu host — done 2026-05-07 following docs/setup-ubuntu.md. Authenticated POST /mcp initialize returns 200 with the expected server info. Eight bugs found and fixed during the smoke test; see §14 below.
§14 — Errata: bugs found during the first Ubuntu smoke-test
The first end-to-end run on a fresh Ubuntu 24.04 host (2026-05-07) hit eight issues. All eight are now fixed in the source and the setup guide; this section is the record of what was wrong and where the fix landed, so the next person walking this guide doesn’t re-debug them.
| # | Symptom | Root cause | Fix |
|---|---|---|---|
| 1 | psql:src/db/roles.sql: ERROR: permission denied to create role when running Step 8 as mcp_admin. | The seed-time mcp_admin role doesn’t carry CREATEROLE. The earlier draft of src/db/roles.sql and Step 8 told the operator to run the script as mcp_admin. | Run roles.sql via sudo -u postgres psql -d deneva_mcp -v app_password="$APP_PW" -f src/db/roles.sql. Updated roles.sql header comment + setup-ubuntu.md Step 8. |
| 2 | seed-tenant.mjs crashed with ENOENT: no such file or directory, open 'secrets/API_KEY_HMAC_SECRET'. | Step 8 wrote DB_ADMIN_PASSWORD and DB_PASSWORD into secrets/ but never the HMAC secret, even though seed-tenant.mjs reads it from there. The API_KEY_HMAC_SECRET value also wasn’t generated as a shell variable until Step 7’s interactive prompt — too late to reuse. | Step 4a now mints API_KEY_HMAC_SECRET="$(openssl rand -base64 32)" alongside the DB passwords; Step 8 writes all three into secrets/. Files in setup-ubuntu.md. |
| 3 | Service crash-looped with Check failed: 12 == errno (V8 fatal ENOMEM from mprotect). | Systemd unit shipped with MemoryDenyWriteExecute=true, which blocks the mprotect(PROT_EXEC) calls V8’s JIT baseline compiler issues. Incompatible with any V8 runtime. | Removed MemoryDenyWriteExecute=true from the systemd unit in setup-ubuntu.md Step 10. The other hardening flags (NoNewPrivileges, ProtectSystem=strict, RestrictNamespaces, etc.) stay. |
| 4 | Authenticated POST /mcp returned 401 even with a valid key; audit_log was empty. | RLS policy on api_keys (tenant_id = current_setting('app.current_tenant_id')::uuid) requires the tenant context to already be set — but the auth middleware reads api_keys to discover the tenant. Chicken-and-egg: mcp_app saw zero rows, so the lookup always failed. | Removed RLS from api_keys in src/db/rls.sql. api_keys is auth infrastructure — access is controlled by hash-unguessability, not tenant scope. |
| 5 | Auth still failing with empty audit_log and 401 after fix #4. | db/index.ts set ssl: { rejectUnauthorized: true } in NODE_ENV=production. The Phase-1 server uses the snake-oil cert, which Node’s bundled CA store does not trust. Every pool connection threw self-signed certificate. The error path turned into a 401 without an audit row (see fix #6). | src/db/index.ts now uses ssl: false. App and Postgres share a host through Phase 5, so loopback traffic never benefits from SSL — revisit only if Postgres moves off-host. |
| 6 | After fix #5 the DB connection worked from a one-off Node script, yet the running service still returned 401 with 1.98ms response time and zero audit rows. | tenantAuthPlugin (and tenantRateLimiterPlugin) were registered via app.register(...) without fastify-plugin wrapping. Fastify’s plugin encapsulation meant the addHook('preHandler', ...) calls inside applied only to routes registered inside the plugin scope. The MCP route is mounted directly on the parent via mountMcp(app), so the auth hook never ran for POST /mcp. The 401 was actually coming from the route handler’s defence-in-depth if (!tenantId) check. | Both plugins are now fp(...)-wrapped in src/security/tenant.middleware.ts and src/security/rate-limiter.plugin.ts. fastify-plugin is already a transitive dep via @fastify/rate-limit. |
| 7 | POST /mcp returned 406 Not Acceptable: Client must accept both application/json and text/event-stream. | StreamableHTTPServerTransport enforces content-negotiation. The smoke-test curl only sent Content-Type, not Accept. | Added -H "Accept: application/json, text/event-stream" to the smoke-test curl in Step 11 of setup-ubuntu.md. |
| 8 | After fix #7, POST /mcp returned 400 Parse error: Invalid JSON-RPC message for any body. | The third argument of transport.handleRequest(req, res, parsedBody?) is the JSON-RPC payload, not application context. The original code passed { tenantId, requestId } there; the SDK then validated that as a JSON-RPC message and rejected it. Compounding factor: Fastify’s body parser had already consumed req.raw, so falling back to the stream was empty too. | src/mcp/server.ts now passes req.body as parsedBody. Tool context flows via closure on buildMcpServer({ tenantId, requestId }) — context captured per-request when the server is built, no SDK-context plumbing required. |
After these eight fixes, the §13 smoke test passes end-to-end: initialize returns 200 with serverInfo: { name: "deneva-mcp", version: "0.1.0" } and an mcp-session-id header. Authenticated tool calls require the same session ID + the notifications/initialized step that real MCP clients (Claude Desktop, etc.) maintain across requests; curl-based testing of tools/call is therefore not part of the Phase 1 smoke test.
§12 — Out of scope (deferred)
| Item | Phase |
|---|---|
| Real OAuth /auth/:platform/start and /callback HTTP routes | 2 |
| Envelope encryption (credentials.service.ts) — needs real tokens | 2 |
| Google / Meta / TikTok platform adapters | 2 / 3 |
| Inngest sync functions, signed-webhook verification | 4 |
| Cache TTL config and cache.service.ts | 2 (per-platform) |
| nginx config, UFW rules, full systemd hardening flag set | 5 |
| GDPR erasure endpoint (deleteTenant) | 5 |
| Penetration test of public endpoints | 5 |
If a task you’re about to do is on this list, stop — it belongs in a later phase.
§13 — Manual smoke test (run end-to-end before declaring Phase 1 done)
# 1. Bring up Postgres
docker compose up -d postgres
# 2. Generate dev secrets
bash scripts/dev-secrets.sh
# 3. Run migrations
npm run db:migrate
# 4. Create the mcp_app role + apply RLS (must be done BEFORE the app connects).
# mcp_admin lacks CREATEROLE, so roles.sql runs as the postgres superuser.
APP_PW="$(cat secrets/DB_PASSWORD)"
ADMIN_PW="$(cat secrets/DB_ADMIN_PASSWORD)"
sudo -u postgres psql -d deneva_mcp -v app_password="$APP_PW" -f src/db/roles.sql
psql "postgresql://mcp_admin:${ADMIN_PW}@127.0.0.1:5432/deneva_mcp" -f src/db/rls.sql
# 5. Seed a tenant + API key
node scripts/seed-tenant.mjs "dev-tenant" # prints "API key: XYZ123..." ONCE — copy now
export KEY=XYZ123...
# 6. Start the server (now that mcp_app role exists)
npm run dev
# 7. initialize with valid key — expect 200 + audit row.
# The Accept header is REQUIRED by Streamable HTTP; without it the SDK 406s.
# `initialize` is the right smoke-test target because `tools/call` requires a
# stateful session (initialize → notifications/initialized → tools/call) that
# a real MCP client handles, but curl on its own does not.
curl -s -o /dev/null -w "%{http_code}\n" -X POST http://127.0.0.1:3001/mcp \
-H "X-Api-Key: $KEY" \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"1.0"}}}'
# → 200
# 8. /mcp without key — expect 401 + audit row
curl -s -o /dev/null -w "%{http_code}\n" -X POST http://127.0.0.1:3001/mcp \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"1.0"}}}'
# → 401
# 9. Hammer with 101 requests in a minute — expect at least one 429
for i in $(seq 1 101); do
curl -s -o /dev/null -w "%{http_code} " -X POST http://127.0.0.1:3001/mcp \
-H "X-Api-Key: $KEY" \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"1.0"}}}'
done; echo
# → mostly 200, then 429s
# 10. Confirm mcp_app cannot tamper with audit_log
psql "postgresql://mcp_app@127.0.0.1:5432/deneva_mcp" -c "DELETE FROM audit_log;"
# → ERROR: permission denied for table audit_log
# 11. Inspect the trail
psql "postgresql://mcp_admin:${ADMIN_PW}@127.0.0.1:5432/deneva_mcp" \
-c "SELECT event_type, outcome, count(*) FROM audit_log GROUP BY 1,2 ORDER BY 1,2;"
# → expect rows for api_key.auth_success/failure, mcp.tool_called, rate_limit.exceeded

If every step above produces the expected outcome, Phase 1 is shipped. Move on to Phase 2.