Deneva MCP Plan 1 — Foundation

Phase 1 — Secure Foundation

Detailed execution doc for Phase 1 of the Deneva MCP Tool Architecture Plan. The architecture doc is the source of truth for what is being built and why; this doc is the source of truth for how and in what order.

Estimated effort: 1–2 weeks for one engineer. Phase 2 follow-up: docs/phase-2-google-ads.md.


Goal

Stand up a minimal but complete Fastify + Drizzle + Postgres stack with the entire security middleware chain wired end-to-end, validated by two stub MCP tools. Every later phase plugs into this foundation without changing it: Phase 2 adds adapters and real tokens, Phase 3 adds more adapters, Phase 4 adds background sync, Phase 5 adds production hardening.

If Phase 1 is done correctly, an attacker hitting an unauthenticated request, sending a malformed payload, or exceeding the rate limit gets the same correct response on day one as they will on day 365.

Definition of Done (high-level — full checklist in §11)

  • A request to /mcp with a valid X-Api-Key reaches the ping tool, gets a 200, and produces an api_key.auth_success + mcp.tool_called audit row.
  • The same request without the header gets a 401 and an api_key.auth_failure audit row; an invalid-but-well-formed key gets the same 401 through the constant-time hash-and-compare path, so timing never reveals whether a specific key is valid (see the note in §D2).
  • Two separate tenants cannot read each other’s metric_cache rows even if a query forgets a WHERE tenant_id = ? clause (RLS catches it).
  • DELETE FROM audit_log as the mcp_app role fails with permission denied.
  • CI fails the build on a high-severity npm audit advisory.

Workstream order & dependency graph

A. Project bootstrap ──┬──▶ C. Secrets loader ────┬──▶ D. API key auth ──┬──▶ G. MCP server + stubs
                       │                          │                      │
                       └──▶ B. Database & schema ─▶ F. Audit log ────────┤
                                                  │                      │
                                                  └─▶ E. Rate limiting ──┤
                       H. OAuth scaffolding ─────────────────────────────┤
                       I. CI & dep hygiene ──────────────────────────────┘
                       (runs alongside everything)

A → B → (C, F, E) → D → G is the critical path. H and I are parallelisable.


Workstream A — Project bootstrap

A0. Dependencies (one-time npm install)

Pin exact versions in package-lock.json (the package.json in §A1 declares ranges; the lockfile is what actually ships). The MCP SDK shape may evolve — the version below is the contract Phase 1 is written against; bump deliberately, not opportunistically.

# Runtime
npm install \
  fastify@^5.0.0 \
  @fastify/helmet@^13.0.0 \
  @fastify/rate-limit@^10.0.0 \
  rate-limiter-flexible@^7.0.0 \
  drizzle-orm@^0.36.0 \
  pg@^8.13.0 \
  zod@^3.23.0 \
  @modelcontextprotocol/sdk@^1.0.0
# ^^^ Phase 1 was authored against SDK 1.x. The McpServer / StreamableHTTPServerTransport
# API shape in §G1 may differ on newer minor versions — adapt the registry, keep the contract.

# Dev / build / test
npm install --save-dev \
  typescript@^5.6.0 \
  tsx@^4.19.0 \
  @types/node@^22.0.0 \
  @types/pg@^8.11.0 \
  drizzle-kit@^0.28.0 \
  vitest@^2.1.0 \
  eslint@^9.0.0 \
  typescript-eslint@^8.0.0

Acceptance: npm install completes with zero high/critical advisories (npm audit --audit-level=high exits 0).

A1. Initialise project files

Files:

  • package.json
  • tsconfig.json
  • .gitignore
  • .editorconfig
  • .nvmrc

package.json (relevant fields):

{ "name": "deneva-mcp", "private": true, "type": "module", "engines": { "node": ">=22.0.0 <23" }, "scripts": { "dev": "tsx watch src/index.ts", "build": "tsc --project tsconfig.json", "start": "node dist/index.js", "typecheck": "tsc --noEmit", "lint": "eslint . --max-warnings=0", "test": "vitest run", "test:watch":"vitest", "audit": "npm audit --audit-level=high", "db:migrate":"drizzle-kit migrate", "db:studio": "drizzle-kit studio" } }

tsconfig.json essentials:

{ "compilerOptions": { "target": "ES2023", "module": "ESNext", "moduleResolution": "Bundler", "strict": true, "noUncheckedIndexedAccess": true, "exactOptionalPropertyTypes": true, "noImplicitOverride": true, "isolatedModules": true, "resolveJsonModule": true, "outDir": "dist", "rootDir": "src" }, "include": ["src/**/*"] }

.gitignore (must include):

node_modules
dist
.env
.env.*
secrets/
coverage

.nvmrc: 22.

Acceptance: npm install && npm run typecheck exits 0 on an empty src/index.ts.

A2. ESLint with the SQL-injection guard

File: eslint.config.js

The non-negotiable rule: ban template-string interpolation inside db.execute(sql`...`) and similar. Drizzle's parameterized helpers (eq, and, sql`...${param}` in query-builder positions, sql.placeholder) remain allowed; raw template construction passed to db.execute does not.

// eslint.config.js
import tseslint from 'typescript-eslint';

export default tseslint.config(
  ...tseslint.configs.recommendedTypeChecked,
  {
    languageOptions: { parserOptions: { project: './tsconfig.json' } },
    rules: {
      'no-restricted-syntax': [
        'error',
        {
          // db.execute(sql`...${var}...`) where the sql tag is given a raw template literal
          selector:
            "CallExpression[callee.property.name='execute'] > TaggedTemplateExpression[tag.name='sql'][quasi.expressions.length>0]",
          message:
            'Raw template interpolation in db.execute(sql`...`) is forbidden. Use parameterized helpers (eq, and, sql.placeholder) instead.',
        },
      ],
    },
  },
);

Acceptance: npm run lint flags a deliberately-injected db.execute(sql`SELECT * FROM x WHERE id = ${userId}`) test fixture and exits non-zero.


Workstream B — Database & schema

B1. Local Postgres via docker-compose

File: docker-compose.yml

services:
  postgres:
    image: postgres:16-alpine
    restart: unless-stopped
    environment:
      POSTGRES_DB: deneva_mcp
      POSTGRES_USER: mcp_admin
      POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-dev_only_password}
    ports:
      - "127.0.0.1:5432:5432"   # bound to localhost only
    volumes:
      - mcp_pg_data:/var/lib/postgresql/data

volumes:
  mcp_pg_data:

No TLS in dev. The postgres:16-alpine image does not ship the Debian ssl-cert package, so the previous draft’s snake-oil cert paths would crash the container on boot. App and Postgres share a host through Phase 5, so loopback traffic gains nothing from TLS either way; src/db/index.ts therefore uses ssl: false (see §14 #5). If Postgres ever moves off-host, a CA-signed cert lands with the Phase 5 §K hardening work.

Acceptance: docker compose up -d postgres && docker compose exec postgres psql -U mcp_admin -d deneva_mcp -c "select 1" returns 1.

B2. Drizzle schema + initial migration

Files:

  • drizzle.config.ts
  • src/db/schema.ts
  • src/db/index.ts
  • src/db/migrations/0000_init.sql (generated by drizzle-kit generate)

src/db/schema.ts — paste the seven pgTable definitions from the architecture doc (tenants, apiKeys, platformCredentials, oauthStates, metricCache, auditLog, syncLog). Prepend the imports the architecture doc omits:

import { pgTable, uuid, text, timestamp, jsonb, integer } from 'drizzle-orm/pg-core';
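For orientation, here is what one of those definitions looks like — an illustrative excerpt of audit_log only, reusing the import line above, with the column set matching what §F1's writeAuditEvent inserts; the architecture doc remains the source of truth for all seven tables:

// src/db/schema.ts — illustrative excerpt (audit_log only; see the architecture doc for the rest)
export const auditLog = pgTable('audit_log', {
  id: uuid('id').primaryKey().defaultRandom(),
  tenantId: uuid('tenant_id'),            // nullable: pre-auth failures have no tenant yet
  eventType: text('event_type').notNull(),
  actorIp: text('actor_ip'),
  requestId: text('request_id'),
  outcome: text('outcome').notNull(),     // 'success' | 'failure'
  metadata: jsonb('metadata').notNull().default({}),
  createdAt: timestamp('created_at', { withTimezone: true }).notNull().defaultNow(),
});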

drizzle.config.ts — drizzle-kit runs as a CLI, as mcp_admin (privileged), separate from the runtime pool which connects as mcp_app. Read the admin password synchronously from the dev secrets dir:

// drizzle.config.ts
import { defineConfig } from 'drizzle-kit';
import { readFileSync } from 'node:fs';
import { join } from 'node:path';

const password = readFileSync(
  process.env.NODE_ENV === 'production'
    ? join('/run/credentials', process.env.SYSTEMD_UNIT ?? 'deneva-mcp.service', 'DB_ADMIN_PASSWORD')
    : join(process.cwd(), 'secrets', 'DB_ADMIN_PASSWORD'),
  'utf8',
).trim();

export default defineConfig({
  schema: './src/db/schema.ts',
  out: './src/db/migrations',
  dialect: 'postgresql',
  dbCredentials: {
    host: process.env.DB_HOST ?? '127.0.0.1',
    port: 5432,
    user: 'mcp_admin',
    password,
    database: 'deneva_mcp',
    ssl: false, // loopback connection; see the §B1 TLS note
  },
  strict: true,
});

Why the split: mcp_app (runtime) lacks DDL grants — it cannot CREATE TABLE. mcp_admin (migrations) has full privileges and bypasses RLS, which is correct for migrations and seed scripts but must not be used at runtime. Production uses a separate encrypted DB_ADMIN_PASSWORD credential that lives only on the deploy host, not in the running app’s credential set.
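A quick way to check that the split actually holds on a given database — plain catalog queries, no assumptions beyond the two roles this phase creates:

-- Run as any superuser. In the compose dev DB, mcp_admin is the bootstrap superuser
-- (rolsuper = t); superusers and table owners bypass RLS regardless of rolbypassrls.
SELECT rolname, rolsuper, rolcreaterole, rolbypassrls
FROM pg_roles
WHERE rolname IN ('mcp_admin', 'mcp_app');
-- Expect for mcp_app: rolsuper = f, rolcreaterole = f, rolbypassrls = f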

src/db/index.ts:

import { drizzle } from 'drizzle-orm/node-postgres';
import { Pool } from 'pg';
import * as schema from './schema.js';
import { loadSecret } from '../security/secrets.loader.js';

const pool = new Pool({
  host: process.env.DB_HOST ?? '127.0.0.1',
  port: 5432,
  user: 'mcp_app',
  password: (await loadSecret('DB_PASSWORD')).toString('utf8'),
  database: 'deneva_mcp',
  ssl: false, // loopback; see the §B1 TLS note and §14 #5
  max: 20,    // see §B6; sized for Phase 4 sync fan-out
});

export const db = drizzle(pool, { schema });

Acceptance: npm run db:migrate creates all seven tables. \d+ audit_log in psql shows the columns from the architecture doc.

B3. Row-level security policies

File: src/db/rls.sql (run as mcp_admin after migrations)

ALTER TABLE metric_cache ENABLE ROW LEVEL SECURITY;
ALTER TABLE platform_credentials ENABLE ROW LEVEL SECURITY;
ALTER TABLE sync_log ENABLE ROW LEVEL SECURITY;

-- Tenant isolation: every tenant-scoped query must SET app.current_tenant_id first.
CREATE POLICY tenant_isolation_metric_cache ON metric_cache
  USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
CREATE POLICY tenant_isolation_creds ON platform_credentials
  USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
CREATE POLICY tenant_isolation_sync_log ON sync_log
  USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid);

-- api_keys is intentionally NOT under RLS: the auth middleware reads it to DISCOVER the
-- tenant, so no tenant context exists yet at lookup time (see §14 #4). Access is controlled
-- by hash-unguessability and the role grants in roles.sql instead.

-- audit_log is intentionally NOT RLS-isolated: cross-tenant security review needs full visibility.
-- Access control on audit_log is via role grants in roles.sql.

Setting the tenant context must happen on every request, immediately after auth:

// inside tenant.middleware.ts, after tenantId is resolved:
await db.execute(sql`SELECT set_config('app.current_tenant_id', ${tenantId}, true)`);

(true = transaction-local; the setting clears at end of transaction. Pair with a per-request transaction or a connection-pool reset hook.)
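A sketch of that pairing — because set_config(..., true) is transaction-local, the setting cannot leak to the next request that checks out the same pooled connection; the tenantId stand-in below comes from req.tenantId in the real middleware:

// Sketch: transaction-scoped tenant context (the pattern, not the shipped middleware)
import { sql } from 'drizzle-orm';
import { db } from '../db/index.js';
import { metricCache } from '../db/schema.js';

const tenantId = '00000000-0000-0000-0000-000000000000'; // stand-in; req.tenantId in real code

const rows = await db.transaction(async (tx) => {
  await tx.execute(sql`SELECT set_config('app.current_tenant_id', ${tenantId}, true)`);
  return tx.select().from(metricCache); // RLS now scopes this query to the current tenant
});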

B4. Database role separation

File: src/db/roles.sql (run once, after migrations have created the tables, as a role with CREATEROLE — see below)

mcp_admin already exists — Postgres creates it on first boot from the POSTGRES_USER env var in §B1’s compose file, where it is the bootstrap superuser. On a host-installed Postgres (the Ubuntu box), mcp_admin lacks CREATEROLE, so run the script as the postgres superuser instead (see §14 #1). This script only creates mcp_app and applies grants on the tables that drizzle-kit migrate produced. Apply with:

psql "postgresql://mcp_admin:$ADMIN_PW@127.0.0.1:5432/deneva_mcp" \ -v app_password="$APP_PW" -f src/db/roles.sql

The -v flag is mandatory — without it, the :'…' placeholder is passed through unsubstituted and CREATE ROLE fails.

CREATE ROLE mcp_app LOGIN PASSWORD :'app_password' NOINHERIT;
GRANT CONNECT ON DATABASE deneva_mcp TO mcp_app;
GRANT USAGE ON SCHEMA public TO mcp_app;
GRANT SELECT, INSERT, UPDATE, DELETE ON ALL TABLES IN SCHEMA public TO mcp_app;
GRANT USAGE ON ALL SEQUENCES IN SCHEMA public TO mcp_app;

-- audit_log is INSERT-only for the application
REVOKE UPDATE, DELETE ON audit_log FROM mcp_app;

-- mcp_app must NOT bypass RLS
ALTER ROLE mcp_app NOBYPASSRLS;

-- mcp_admin already has full privileges (created by Postgres at boot).
-- Re-asserting NOBYPASSRLS for mcp_app is the only contract we enforce here.

Acceptance: as mcp_app, INSERT INTO audit_log ... succeeds; UPDATE audit_log SET outcome='success' fails with permission denied.

B5. RLS verification integration test

File: tests/integration/rls.test.ts

import { describe, it, expect } from 'vitest';
import { sql } from 'drizzle-orm';
import { db } from '../../src/db/index.js';
import { metricCache, tenants } from '../../src/db/schema.js';

describe('row-level security', () => {
  it('blocks tenant A from reading tenant B rows', async () => {
    const [a] = await db.insert(tenants).values({ name: 'A' }).returning();
    const [b] = await db.insert(tenants).values({ name: 'B' }).returning();
    if (!a || !b) throw new Error('tenant insert returned no rows');

    // Insert a row owned by tenant B, with B's context set.
    await db.execute(sql`SELECT set_config('app.current_tenant_id', ${b.id}, false)`);
    await db.insert(metricCache).values({
      tenantId: b.id,
      platform: 'google',
      reportType: 'health',
      dateRangeKey: 'last_7',
      data: {},
      expiresAt: new Date(Date.now() + 60_000),
    });

    // Switch to tenant A's context.
    await db.execute(sql`SELECT set_config('app.current_tenant_id', ${a.id}, false)`);
    const visible = await db.select().from(metricCache);
    expect(visible).toHaveLength(0); // RLS hides B's row from A
  });
});

Acceptance: test passes against a DB connection authenticated as mcp_app (NOT mcp_admin — admin bypasses RLS).

B6. Migration rollback strategy

Drizzle Kit does not auto-generate down migrations. Phase 3 drops & recreates indexes; Phase 4 adds non-nullable columns. Both can fail mid-deploy on prod data the dev DB never saw. The startup-grade rollback strategy is restore from a pre-migration dump — not maintaining a parallel set of down.sql files (cheap to write, expensive to keep correct).

The deploy procedure (used in CI/CD on the Ubuntu host once Phase 5 §K is in place):

# Take a snapshot dump immediately before applying migrations.
TS=$(date -u +%Y%m%dT%H%M%SZ)
pg_dump --format=custom --no-owner --no-privileges deneva_mcp \
  > /var/backups/deneva-mcp/pre-migrate-${TS}.dump

# Apply migrations.
npm run db:migrate || {
  echo "Migration failed; restore with:"
  echo "  dropdb deneva_mcp && createdb deneva_mcp"
  echo "  pg_restore --dbname=deneva_mcp /var/backups/deneva-mcp/pre-migrate-${TS}.dump"
  exit 1
}

Local dev rollback: docker compose down -v && docker compose up -d postgres && npm run db:migrate wipes and reapplies — usually faster than reasoning about a partial failure.

For destructive migrations (drop/recreate index, NOT NULL on existing column), commit a one-liner *.rollback.sql next to the generated migration with the inverse SQL — used by an operator following docs/compliance/runbooks/database-restore.md (Phase 5) when a restore from dump would lose too much intervening data.
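For instance, if a Phase 4 migration were to add a NOT NULL column to sync_log, the committed rollback one-liner would carry the inverse — a hypothetical example purely to illustrate the convention (neither the migration nor the column exists yet):

-- 00XX_add_sync_cursor.rollback.sql (hypothetical)
-- Inverse of: ALTER TABLE sync_log ADD COLUMN cursor text NOT NULL DEFAULT '';
ALTER TABLE sync_log DROP COLUMN cursor;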

Acceptance: the deploy script aborts on a failed migration; the runbook mentioned above (added in Phase 5 §K6) references this pre-migrate dump path.


Workstream C — Secrets loader

C1. Two-backend loader

File: src/security/secrets.loader.ts

import { readFile } from 'node:fs/promises';
import { join } from 'node:path';

const REQUIRED_SECRETS = [
  'CREDENTIAL_KEK',
  'API_KEY_HMAC_SECRET',
  'DB_PASSWORD',
  'INNGEST_SIGNING_KEY',
] as const;

type SecretName = (typeof REQUIRED_SECRETS)[number];

const cache = new Map<SecretName, Buffer>();

export async function loadSecret(name: SecretName): Promise<Buffer> {
  const hit = cache.get(name);
  if (hit) return hit;
  const path =
    process.env.NODE_ENV === 'production'
      ? join('/run/credentials', process.env.SYSTEMD_UNIT ?? 'deneva-mcp.service', name)
      : join(process.cwd(), 'secrets', name);
  const value = await readFile(path);
  cache.set(name, value);
  return value;
}

export async function verifyAllSecretsLoadable(): Promise<void> {
  // Fail fast at startup — better than discovering a missing secret at first request.
  for (const name of REQUIRED_SECRETS) await loadSecret(name);
}

C2. Startup verification

File: src/index.ts (entry point)

import { verifyAllSecretsLoadable } from './security/secrets.loader.js';

await verifyAllSecretsLoadable(); // throws if any required secret is missing
// ... continue Fastify bootstrap

C3. Dev secrets bootstrap script

File: scripts/dev-secrets.sh

#!/usr/bin/env bash
set -euo pipefail

mkdir -p secrets
chmod 700 secrets

gen() { head -c 32 /dev/urandom | base64 > "secrets/$1" && chmod 600 "secrets/$1"; }

[[ -f secrets/CREDENTIAL_KEK      ]] || gen CREDENTIAL_KEK
[[ -f secrets/API_KEY_HMAC_SECRET ]] || gen API_KEY_HMAC_SECRET
[[ -f secrets/DB_PASSWORD         ]] || echo -n 'dev_only_password' > secrets/DB_PASSWORD
[[ -f secrets/DB_ADMIN_PASSWORD   ]] || echo -n 'dev_only_password' > secrets/DB_ADMIN_PASSWORD
[[ -f secrets/INNGEST_SIGNING_KEY ]] || gen INNGEST_SIGNING_KEY
chmod 600 secrets/DB_PASSWORD secrets/DB_ADMIN_PASSWORD

echo "Dev secrets in ./secrets/ (gitignored)."

DB_ADMIN_PASSWORD is not in REQUIRED_SECRETS. The runtime app connects as mcp_app and never needs the admin password. It is consumed only by drizzle.config.ts (migrations) and scripts/seed-tenant.mjs. In dev it matches the compose POSTGRES_PASSWORD default; in prod it lives on the deploy host’s encrypted credential store, separate from the app’s runtime credentials.

C4. systemd unit reference (production — copy-paste, do not commit a real one in Phase 1)

# /etc/systemd/system/deneva-mcp.service
[Service]
Type=simple
User=deneva-mcp
WorkingDirectory=/opt/deneva-mcp
ExecStart=/usr/bin/node dist/index.js
Environment=NODE_ENV=production
Environment=SYSTEMD_UNIT=deneva-mcp.service
LoadCredentialEncrypted=CREDENTIAL_KEK:/etc/deneva-mcp/creds/CREDENTIAL_KEK.cred
LoadCredentialEncrypted=API_KEY_HMAC_SECRET:/etc/deneva-mcp/creds/API_KEY_HMAC_SECRET.cred
LoadCredentialEncrypted=DB_PASSWORD:/etc/deneva-mcp/creds/DB_PASSWORD.cred
LoadCredentialEncrypted=INNGEST_SIGNING_KEY:/etc/deneva-mcp/creds/INNGEST_SIGNING_KEY.cred
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true

(Full hardening flag set lands in Phase 5.)

Acceptance: starting the server with one of the four secret files removed produces a clear error and exit code != 0 before any port is opened.


Workstream D — API key auth

D1. Key service

File: src/security/api-key.service.ts

import { createHmac, randomBytes, timingSafeEqual } from 'node:crypto';
import { loadSecret } from './secrets.loader.js';

const HMAC_SECRET = await loadSecret('API_KEY_HMAC_SECRET');

export function generateApiKey(): string {
  // 32 bytes = 256 bits, base64url-encoded; shown to user once, never stored
  return randomBytes(32).toString('base64url');
}

export function hashApiKey(rawKey: string): string {
  return createHmac('sha256', HMAC_SECRET).update(rawKey).digest('hex');
}

export function verifyApiKey(rawKey: string, storedHash: string): boolean {
  const candidate = Buffer.from(hashApiKey(rawKey));
  const stored = Buffer.from(storedHash);
  if (candidate.length !== stored.length) return false;
  return timingSafeEqual(candidate, stored);
}

D2. Tenant middleware

File: src/security/tenant.middleware.ts

import fp from 'fastify-plugin';
import type { FastifyPluginAsync } from 'fastify';
import { and, eq, gt, isNull } from 'drizzle-orm';
import { db } from '../db/index.js';
import { apiKeys } from '../db/schema.js';
import { hashApiKey } from './api-key.service.js';
import { writeAuditEvent } from './audit-log.service.js';

declare module 'fastify' {
  interface FastifyRequest { tenantId?: string; apiKeyId?: string }
}

const plugin: FastifyPluginAsync = async (fastify) => {
  fastify.addHook('preHandler', async (req, reply) => {
    if (!req.url.startsWith('/mcp')) return; // auth only on /mcp routes

    const raw = req.headers['x-api-key'];
    if (typeof raw !== 'string' || raw.length === 0) {
      await writeAuditEvent('api_key.auth_failure', 'failure', { ip: req.ip, reason: 'missing' });
      return reply.code(401).send({ error: 'unauthorized' });
    }

    const hash = hashApiKey(raw);
    const now = new Date();
    const [row] = await db.select().from(apiKeys).where(
      and(
        eq(apiKeys.keyHash, hash),
        isNull(apiKeys.revokedAt),
        gt(apiKeys.expiresAt, now),
      ),
    );

    if (!row) {
      await writeAuditEvent('api_key.auth_failure', 'failure', { ip: req.ip, reason: 'invalid' });
      return reply.code(401).send({ error: 'unauthorized' });
    }

    req.tenantId = row.tenantId;
    req.apiKeyId = row.id;

    // Async fire-and-forget: lastUsedAt update should not block the request.
    void db.update(apiKeys).set({ lastUsedAt: now }).where(eq(apiKeys.id, row.id));

    await writeAuditEvent('api_key.auth_success', 'success', { tenantId: row.tenantId });
  });
};

// fp() wrapping is load-bearing: without it, Fastify's encapsulation hides this preHandler
// from the /mcp route mounted on the parent instance (see §14 #6).
export const tenantAuthPlugin = fp(plugin);

Note on constant-time behaviour: doing the DB lookup after hashing means missing-key and invalid-key paths take similar time. The cheap missing-header path returns earlier, but it doesn’t leak whether a specific key is valid — only whether the header was present.

D3. Rotation endpoint (pulled forward from Phase 5)

File: src/auth/admin-routes.ts

import type { FastifyPluginAsync } from 'fastify';
import { eq } from 'drizzle-orm';
import { z } from 'zod';
import { timingSafeEqual } from 'node:crypto';
import { db } from '../db/index.js';
import { apiKeys } from '../db/schema.js';
import { generateApiKey, hashApiKey } from '../security/api-key.service.js';
import { writeAuditEvent } from '../security/audit-log.service.js';
import { loadSecret } from '../security/secrets.loader.js';

const ADMIN_HEADER_NAME = 'x-admin-token';
const adminToken = Buffer.from((await loadSecret('API_KEY_HMAC_SECRET')).toString('hex'));
// Phase 1 placeholder; Phase 5 introduces a separate ADMIN_TOKEN secret

function adminTokenMatches(presented: unknown): boolean {
  if (typeof presented !== 'string') return false;
  const cand = Buffer.from(presented);
  if (cand.length !== adminToken.length) return false;
  return timingSafeEqual(cand, adminToken);
}

const RotateBody = z.object({
  tenantId: z.string().uuid(),
  description: z.string().min(1).max(120),
});

export const adminRoutes: FastifyPluginAsync = async (fastify) => {
  fastify.post('/admin/api-keys/rotate', async (req, reply) => {
    if (!adminTokenMatches(req.headers[ADMIN_HEADER_NAME])) return reply.code(401).send();

    const body = RotateBody.parse(req.body);
    const newKey = generateApiKey();
    const newHash = hashApiKey(newKey);
    const graceUntil = new Date(Date.now() + 24 * 60 * 60 * 1000);

    await db.transaction(async (tx) => {
      // Mark previous active keys for this tenant with a 24h grace expiry — do NOT revoke immediately.
      await tx.update(apiKeys)
        .set({ expiresAt: graceUntil })
        .where(eq(apiKeys.tenantId, body.tenantId));
      await tx.insert(apiKeys).values({
        tenantId: body.tenantId,
        keyHash: newHash,
        description: body.description,
        expiresAt: new Date(Date.now() + 365 * 24 * 60 * 60 * 1000),
      });
    });

    await writeAuditEvent('api_key.rotated', 'success', { tenantId: body.tenantId });
    return { apiKey: newKey, graceUntil }; // shown once, never retrievable again
  });
};

Phase 5 follow-up: replace the placeholder admin token with a dedicated ADMIN_TOKEN secret, move the route behind nginx with an IP allow-list, and add a separate revoke-immediately variant.

D4. Acceptance tests

File: tests/integration/auth.test.ts

Required cases:

  1. POST /mcp without X-Api-Key → 401, audit row api_key.auth_failure with reason: 'missing'.
  2. POST /mcp with random key → 401, audit row with reason: 'invalid'. Timing within 20% of case 3 (no constant-time guarantee in tests, but a sanity check — sketched after this list).
  3. POST /mcp with valid key → 200, audit row api_key.auth_success, lastUsedAt updated.
  4. After rotation: old key works for ≤24h, new key works immediately. After 25h (advance fake clock), old key returns 401.
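A sketch of the case-2 timing sanity check, assuming a hypothetical buildApp() test helper that returns the configured Fastify instance and a TEST_API_KEY seeded by the test setup:

import { describe, it, expect } from 'vitest';
import { buildApp } from '../helpers/build-app.js'; // hypothetical helper

const median = (xs: number[]) => [...xs].sort((a, b) => a - b)[Math.floor(xs.length / 2)]!;

describe('auth timing sanity check (§D4 case 2)', () => {
  it('invalid-but-well-formed key times within 20% of a valid key', async () => {
    const app = await buildApp();
    const timeOnce = async (key: string) => {
      const t0 = process.hrtime.bigint();
      await app.inject({ method: 'POST', url: '/mcp', headers: { 'x-api-key': key }, payload: {} });
      return Number(process.hrtime.bigint() - t0);
    };
    const sample = async (key: string) => {
      const runs: number[] = [];
      for (let i = 0; i < 15; i++) runs.push(await timeOnce(key)); // medians damp scheduler jitter
      return median(runs);
    };
    const invalid = await sample('x'.repeat(43));          // base64url shape, unknown key
    const valid = await sample(process.env.TEST_API_KEY!); // seeded valid key
    expect(Math.abs(invalid - valid) / valid).toBeLessThan(0.2);
  });
});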

Workstream E — Rate limiting

E1. Global per-IP limit

File: src/security/rate-limiter.plugin.ts

import fp from 'fastify-plugin';
import type { FastifyPluginAsync } from 'fastify';
import rateLimit from '@fastify/rate-limit';

// In the repo this file is split into a global plugin (below) and a tenant plugin (§E2)
// so the tenant limiter runs after auth — see §11 E.
const plugin: FastifyPluginAsync = async (fastify) => {
  await fastify.register(rateLimit, {
    global: true,
    max: 100,
    timeWindow: 60_000,
    keyGenerator: (req) => req.ip,
    errorResponseBuilder: (req, _ctx) => {
      // The plugin invokes this on every blocked request — do the audit write here.
      // Lazy import keeps audit-log.service out of the plugin's import cycle.
      void import('./audit-log.service.js').then(({ writeAuditEvent }) =>
        writeAuditEvent('rate_limit.exceeded', 'failure', { ip: req.ip, scope: 'global' }),
      );
      return { error: 'rate_limit_exceeded' };
    },
  });
};

// fp() wrapping for the same encapsulation reason as tenantAuthPlugin — see §14 #6.
export const rateLimiterPlugin = fp(plugin);

Why errorResponseBuilder instead of an onExceeded hook: @fastify/rate-limit v10 does not expose a per-route onExceeded callback — the response builder is the single point that fires on every 429. The audit write is fire-and-forget (void) so it doesn’t add latency to the response.

E2. Per-tenant limit

// inside the same plugin file
import { RateLimiterMemory } from 'rate-limiter-flexible';

const tenantLimiter = new RateLimiterMemory({ points: 300, duration: 60 });

fastify.addHook('preHandler', async (req, reply) => {
  if (!req.tenantId) return;
  try {
    await tenantLimiter.consume(req.tenantId, 1);
  } catch {
    const { writeAuditEvent } = await import('./audit-log.service.js');
    await writeAuditEvent('rate_limit.exceeded', 'failure', { tenantId: req.tenantId, scope: 'tenant' });
    return reply.code(429).send({ error: 'rate_limit_exceeded' });
  }
});

E3. Strict /auth/* limit (placeholder routes — real ones in Phase 2)

fastify.register(async (inst) => {
  await inst.register(rateLimit, { max: 5, timeWindow: 15 * 60_000, keyGenerator: (req) => req.ip });
  inst.get('/auth/_phase1_placeholder', async () => ({ ok: true }));
}, { prefix: '/' });

E4. Auth-failure IP block

Single-process assumption. The IP-block map (sketched below) and the per-tenant rate-limit bucket (§E2) both live in process memory. This is fine for the systemd-direct deployment Phase 5 §B2 specifies (one Node process per host). The architecture doc’s ecosystem.config.js example shows PM2 cluster mode with instances: 2; we have deliberately diverged from that — clustering would require moving these maps to Redis or a blocked_ips DB table. If you choose to scale horizontally later, those two structures are the migration targets.

After 10 api_key.auth_failure events from the same IP within an hour:

  • Insert an entry into an in-memory blockedIps map (Phase 1 — Redis in Phase 5) with expiresAt = now + 1h.
  • Subsequent requests from that IP short-circuit at the first preHandler with 401 + audit event auth.blocked_ip.
  • A simple setInterval(cleanup, 60_000) removes expired entries.

// security/ip-block.service.ts — sketch
const blocks = new Map<string, number>();      // ip → block expiry (epoch ms)
const failures = new Map<string, number[]>();  // ip → recent auth-failure timestamps

export function isBlocked(ip: string): boolean {
  const exp = blocks.get(ip);
  if (!exp) return false;
  if (exp < Date.now()) { blocks.delete(ip); return false; }
  return true;
}

export function recordFailure(ip: string): boolean {
  // Sliding-window count; returns true when the 10-failures-per-hour threshold trips.
  const now = Date.now();
  const recent = (failures.get(ip) ?? []).filter((t) => now - t < 3_600_000);
  recent.push(now);
  failures.set(ip, recent);
  if (recent.length < 10) return false;
  blocks.set(ip, now + 3_600_000);
  failures.delete(ip);
  return true;
}

E5. Acceptance tests

File: tests/integration/rate-limit.test.ts

Use Fastify’s built-in app.inject() (no separate package — it’s exposed on every Fastify instance) plus a fake clock (vi.useFakeTimers()); a sketch of the first case follows the list:

  • 100 requests from one IP succeed; 101st returns 429 + audit row with scope: 'global'.
  • 300 requests from one tenant succeed; 301st returns 429 + audit row with scope: 'tenant'.
  • 11 auth-failure attempts from one IP → IP block engages; a valid key from that IP also gets 401 with auth.blocked_ip audit row until clock advances 1h.
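A sketch of the first case under the same assumptions as the §D4 sketch (hypothetical buildApp() helper, seeded TEST_API_KEY):

import { describe, it, expect } from 'vitest';
import { buildApp } from '../helpers/build-app.js'; // hypothetical helper, as in §D4's sketch

describe('global per-IP rate limit', () => {
  it('429s on the 101st request from one IP within a minute', async () => {
    const app = await buildApp();
    const codes: number[] = [];
    for (let i = 0; i < 101; i++) {
      const res = await app.inject({
        method: 'POST',
        url: '/mcp',
        headers: { 'x-api-key': process.env.TEST_API_KEY! },
        payload: {},
      });
      codes.push(res.statusCode);
    }
    expect(codes.slice(0, 100)).not.toContain(429); // first 100 pass the limiter
    expect(codes[100]).toBe(429);                   // the 101st is blocked
  });
});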

Workstream F — Audit log

F1. Service

File: src/security/audit-log.service.ts

import { db } from '../db/index.js';
import { auditLog } from '../db/schema.js';

export type AuditEventType =
  | 'api_key.auth_success' | 'api_key.auth_failure'
  | 'api_key.created' | 'api_key.rotated' | 'api_key.revoked'
  | 'oauth.flow_started' | 'oauth.flow_completed' | 'oauth.flow_failed'
  | 'oauth.token_refreshed' | 'oauth.token_revoked'
  | 'mcp.tool_called' | 'mcp.tool_failed'
  | 'tenant.created' | 'tenant.deleted'
  | 'rate_limit.exceeded' | 'auth.blocked_ip'
  | 'sync.exhausted';

interface Ctx {
  tenantId?: string;
  ip?: string;
  requestId?: string;
  reason?: string;
  scope?: string;
  tool?: string;
  [k: string]: unknown;
}

const PII_KEYS = new Set(['email', 'name', 'firstName', 'lastName', 'phone', 'address']);

function stripPii(meta: Ctx): Ctx {
  const out: Ctx = {};
  for (const [k, v] of Object.entries(meta)) if (!PII_KEYS.has(k)) out[k] = v;
  return out;
}

export async function writeAuditEvent(
  eventType: AuditEventType,
  outcome: 'success' | 'failure',
  ctx: Ctx = {},
): Promise<void> {
  const { tenantId, ip, requestId, ...rest } = ctx;
  await db.insert(auditLog).values({
    tenantId: tenantId ?? null,
    eventType,
    actorIp: ip ?? null,
    requestId: requestId ?? null,
    outcome,
    metadata: stripPii(rest as Ctx),
  });
}

F2. Wire-up checklist

  • tenantAuthPlugin calls writeAuditEvent('api_key.auth_*', ...) on every result.
  • rateLimiterPlugin calls writeAuditEvent('rate_limit.exceeded', ...) on every 429.
  • IP-block service calls writeAuditEvent('auth.blocked_ip', ...) on engagement.
  • Stub tools call writeAuditEvent('mcp.tool_*', ...) on entry/exit.
  • Rotation endpoint calls writeAuditEvent('api_key.rotated', ...).

F3. Acceptance

File: tests/integration/audit.test.ts

  • Each test in §D4 / §E5 asserts the corresponding audit row appears.
  • A negative test connects as mcp_app and runs UPDATE audit_log SET outcome='success' WHERE id = ... and DELETE FROM audit_log WHERE ... — both must throw permission denied.
  • The metadata column for an api_key.auth_failure row never contains an email key even if the test injects one (sketch below).
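A sketch of that PII assertion, assuming the audit_log schema carries a created_at column (as in the §B2 excerpt) so the test can read back the newest row:

import { describe, it, expect } from 'vitest';
import { desc } from 'drizzle-orm';
import { db } from '../../src/db/index.js';
import { auditLog } from '../../src/db/schema.js';
import { writeAuditEvent } from '../../src/security/audit-log.service.js';

describe('audit metadata PII stripping', () => {
  it('drops an injected email key from metadata', async () => {
    await writeAuditEvent('api_key.auth_failure', 'failure', {
      ip: '203.0.113.7', reason: 'invalid', email: 'leak@example.com',
    });
    const [row] = await db.select().from(auditLog)
      .orderBy(desc(auditLog.createdAt)).limit(1);
    expect(row?.metadata).not.toHaveProperty('email');
    expect(row?.metadata).toMatchObject({ reason: 'invalid' });
  });
});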

Workstream G — MCP server + stub tools

G1. Tool registry

File: src/mcp/server.ts

import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js';
import type { FastifyInstance } from 'fastify';
import { pingTool } from './tools/ping.js';
import { accountHealthTool } from './tools/account-health.js';
import { writeAuditEvent } from '../security/audit-log.service.js';

export interface ToolDef<I, O> {
  name: string;
  description: string;
  inputSchema: import('zod').ZodType<I>;
  handler: (input: I, ctx: { tenantId: string; requestId: string }) => Promise<O>;
}

const TOOLS: ToolDef<unknown, unknown>[] = [pingTool, accountHealthTool];

// Context is closure-based: a fresh server is built per request with tenantId/requestId
// baked into each handler. The SDK's transport.handleRequest has no application-context
// slot — its third argument is the parsed JSON-RPC body (see §14 #8).
export function buildMcpServer(ctx: { tenantId: string; requestId: string }): McpServer {
  const server = new McpServer({ name: 'deneva-mcp', version: '0.1.0' });
  for (const t of TOOLS) {
    server.tool(t.name, t.description, t.inputSchema, async (input) => {
      try {
        return await t.handler(input as never, ctx as never);
      } catch (err) {
        // Single producer for mcp.tool_failed in Phase 1 — every later phase reuses this path.
        // Phase 2's typed-error path (AdapterError) flows through here too.
        await writeAuditEvent('mcp.tool_failed', 'failure', {
          tenantId: ctx.tenantId,
          requestId: ctx.requestId,
          tool: t.name,
          reason: err instanceof Error ? err.message : 'unknown',
        });
        throw err; // let the MCP transport translate to a JSON-RPC error
      }
    });
  }
  return server;
}

export async function mountMcp(fastify: FastifyInstance): Promise<void> {
  fastify.post('/mcp', async (req, reply) => {
    const transport = new StreamableHTTPServerTransport({
      sessionIdGenerator: () => crypto.randomUUID(),
    });
    const server = buildMcpServer({ tenantId: req.tenantId!, requestId: req.id });
    await server.connect(transport);
    await transport.handleRequest(req.raw, reply.raw, req.body); // third arg = JSON-RPC body
  });
}

The exact MCP SDK call shape may differ slightly between SDK versions; the registry pattern above is the contract Phase 2 must preserve.
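To make that contract concrete, here is what a hypothetical Phase 2 tool file would look like — nothing below exists yet; it only demonstrates that adding a tool means one new file under mcp/tools/ plus one entry in TOOLS:

import { z } from 'zod';
import type { ToolDef } from '../server.js';

// Hypothetical Phase 2 tool — illustrative only.
const Input = z.object({ platform: z.enum(['google', 'meta', 'tiktok']) });

export const listCampaignsTool: ToolDef<z.infer<typeof Input>, { campaigns: string[] }> = {
  name: 'list_campaigns',
  description: 'Lists campaign names for the given platform.',
  inputSchema: Input,
  async handler(input, ctx) {
    // Phase 2: delegate to the platform adapter, scoped by ctx.tenantId.
    void input; void ctx;
    return { campaigns: [] };
  },
};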

G2. ping tool

File: src/mcp/tools/ping.ts

import { z } from 'zod';
import type { ToolDef } from '../server.js';
import { writeAuditEvent } from '../../security/audit-log.service.js';

const Input = z.object({});

export const pingTool: ToolDef<z.infer<typeof Input>, { ok: true; tenantId: string; requestId: string }> = {
  name: 'ping',
  description: 'Health-check tool — verifies the entire middleware stack.',
  inputSchema: Input,
  async handler(_input, ctx) {
    await writeAuditEvent('mcp.tool_called', 'success', {
      tenantId: ctx.tenantId, requestId: ctx.requestId, tool: 'ping',
    });
    return { ok: true, tenantId: ctx.tenantId, requestId: ctx.requestId };
  },
};

G3. get_account_health stub

File: src/mcp/tools/account-health.ts

import { z } from 'zod';
import type { ToolDef } from '../server.js';
import { writeAuditEvent } from '../../security/audit-log.service.js';

const Input = z.object({
  platform: z.enum(['google', 'meta', 'tiktok']),
  dateRange: z.enum(['last_7_days', 'last_30_days', 'last_90_days']),
});

interface Output {
  platform: 'google' | 'meta' | 'tiktok';
  dateRange: string;
  metrics: { spend: number; roas: number; cpa: number; ctr: number };
  _stub: true; // removed in Phase 2 when real data lands
}

export const accountHealthTool: ToolDef<z.infer<typeof Input>, Output> = {
  name: 'get_account_health',
  description: 'Returns spend, ROAS, CPA, CTR for the given platform and date range.',
  inputSchema: Input,
  async handler(input, ctx) {
    await writeAuditEvent('mcp.tool_called', 'success', {
      tenantId: ctx.tenantId, requestId: ctx.requestId, tool: 'get_account_health',
    });
    return {
      platform: input.platform,
      dateRange: input.dateRange,
      metrics: { spend: 0, roas: 0, cpa: 0, ctr: 0 },
      _stub: true,
    };
  },
};

G4. Server bootstrap

File: src/index.ts

import Fastify from 'fastify';
import helmet from '@fastify/helmet';
import { verifyAllSecretsLoadable } from './security/secrets.loader.js';
import { rateLimiterPlugin } from './security/rate-limiter.plugin.js';
import { tenantAuthPlugin } from './security/tenant.middleware.js';
import { adminRoutes } from './auth/admin-routes.js';
import { mountMcp } from './mcp/server.js';
import { startOAuthStateCleanup } from './auth/oauth-state.service.js';

await verifyAllSecretsLoadable();

// Pino logs structured JSON to stdout. In production the systemd unit captures stdout into
// journald — no log-shipping infrastructure is wired in Phase 1. Retention (size + time)
// is configured in Phase 5 §L; for dev, `journalctl -u deneva-mcp -f` tails the live log.
//
// Redaction has two layers:
//   1. Pino's built-in `redact` strips known request-header paths (api-key, admin-token).
//   2. A custom `err` serializer scrubs token-shaped strings and token-named fields from
//      arbitrary log objects — adapter SDKs sometimes echo `access_token` into error
//      messages, and we don't want those in journald. Phase 5 §L2's grep evidence check
//      then becomes a regression-detector, not the primary defence.
const TOKEN_KEY_RE = /access_token|refresh_token|client_secret|authorization|api[_-]?key|x-admin-token/i;

function scrubTokens(value: unknown, depth = 0): unknown {
  if (depth > 6 || value == null) return value;
  if (typeof value === 'string') {
    return value
      .replace(/Bearer\s+\S+/gi, 'Bearer [REDACTED]')
      .replace(/ya29\.[A-Za-z0-9_-]+/g, '[REDACTED]')
      .replace(/EAA[A-Za-z0-9]+/g, '[REDACTED]');
  }
  if (Array.isArray(value)) return value.map((v) => scrubTokens(v, depth + 1));
  if (typeof value === 'object') {
    const out: Record<string, unknown> = {};
    for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
      out[k] = TOKEN_KEY_RE.test(k) ? '[REDACTED]' : scrubTokens(v, depth + 1);
    }
    return out;
  }
  return value;
}

const app = Fastify({
  logger: {
    redact: ['req.headers["x-api-key"]', 'req.headers["x-admin-token"]'],
    serializers: {
      err: (e: Error) => scrubTokens({ type: e.name, message: e.message, stack: e.stack }),
    },
  },
  genReqId: () => crypto.randomUUID(),
});

await app.register(helmet, { global: true });
await app.register(rateLimiterPlugin);
await app.register(tenantAuthPlugin);
await app.register(adminRoutes);
await mountMcp(app);

// Process-liveness probe for nginx / systemd / external uptime monitor.
// Unauthenticated by design — a 200 here means the Node process is alive and listening,
// not that downstream dependencies are healthy. Phase 4 §G3 owns the dependency-health
// endpoint (/admin/health/inngest); Phase 5 §G1 wires this to an external monitor.
const startedAt = Date.now();
app.get('/health', async () => ({
  ok: true,
  version: process.env.npm_package_version ?? '0.1.0',
  uptimeSec: Math.floor((Date.now() - startedAt) / 1000),
}));

startOAuthStateCleanup();
await app.listen({ host: '127.0.0.1', port: 3001 });

G5. Acceptance

File: tests/integration/mcp-e2e.test.ts

  • Drive both tools through /mcp with a real API key. Assert: 200, response shape matches the Zod output, mcp.tool_called audit row exists.
  • Logging an Error whose message contains Bearer ya29.foo emits a JSON log line with the token segment redacted (assert on the parsed line — the literal ya29.foo and Bearer ya29 substrings must not appear). Same assertion for an error with a property named access_token.
  • Manual curl examples documented:
# NOTE: tools/call requires an initialized MCP session (see §13/§14); bodies shown for reference.
# ping
curl -X POST http://127.0.0.1:3001/mcp \
  -H "X-Api-Key: $KEY" -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"ping","arguments":{}}}'

# get_account_health
curl -X POST http://127.0.0.1:3001/mcp \
  -H "X-Api-Key: $KEY" -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"get_account_health","arguments":{"platform":"google","dateRange":"last_7_days"}}}'

Workstream H — OAuth state scaffolding

H1. Service skeleton

File: src/auth/oauth-state.service.ts

import { createHash, randomBytes } from 'node:crypto';
import { and, eq, gt, lt } from 'drizzle-orm';
import { db } from '../db/index.js';
import { oauthStates } from '../db/schema.js';

export async function createOAuthState(tenantId: string, platform: string) {
  const state = randomBytes(32).toString('base64url');
  const codeVerifier = randomBytes(32).toString('base64url');
  const codeChallenge = createHash('sha256').update(codeVerifier).digest('base64url');
  await db.insert(oauthStates).values({
    state,
    codeVerifier,
    tenantId,
    platform,
    expiresAt: new Date(Date.now() + 10 * 60 * 1000),
  });
  return { state, codeVerifier, codeChallenge };
}

export async function consumeOAuthState(state: string) {
  const [row] = await db.delete(oauthStates)
    .where(and(eq(oauthStates.state, state), gt(oauthStates.expiresAt, new Date())))
    .returning();
  if (!row) throw new Error('Invalid or expired OAuth state');
  return row; // single-use: deleted on read
}

export function startOAuthStateCleanup(): void {
  // Plain in-process timer is fine for Phase 1; replaced by Inngest cron in Phase 4.
  setInterval(() => {
    void db.delete(oauthStates).where(lt(oauthStates.expiresAt, new Date()));
  }, 5 * 60_000).unref();
}

No HTTP routes here — the /auth/:platform/start and /callback endpoints land in Phase 2 with the Google adapter. Phase 1 only proves the storage and cleanup work.

H2. Acceptance

  • Unit test: createOAuthState writes a row; consumeOAuthState returns and deletes it; a second call with the same state throws (sketch after this list).
  • Integration test: insert a row with expiresAt = now - 1ms, run startOAuthStateCleanup once (extracted as a callable for test), assert row is gone.
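A sketch of the single-use unit test, assuming a seeded tenant id is exposed to the test environment:

import { describe, it, expect } from 'vitest';
import { createOAuthState, consumeOAuthState } from '../../src/auth/oauth-state.service.js';

describe('oauth state single-use', () => {
  it('consumes a state exactly once', async () => {
    const tenantId = process.env.TEST_TENANT_ID!; // seeded by the test setup
    const { state, codeChallenge } = await createOAuthState(tenantId, 'google');
    expect(codeChallenge).toHaveLength(43); // base64url of a SHA-256 digest is 43 chars

    const row = await consumeOAuthState(state);
    expect(row.state).toBe(state);
    await expect(consumeOAuthState(state)).rejects.toThrow(/Invalid or expired/);
  });
});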

H3. Seed script — first tenant + API key

File: scripts/seed-tenant.mjs

A one-off bootstrap tool. Connects as mcp_admin (bypasses RLS — no tenant context to set), creates a tenant, mints a fresh API key, and prints the raw key once — there is no way to retrieve it later. Use the printed key as $KEY in the §13 smoke test.

// scripts/seed-tenant.mjs
//
// Usage:
//   node scripts/seed-tenant.mjs "Acme Corp"
//
// Reads DB_ADMIN_PASSWORD and API_KEY_HMAC_SECRET from ./secrets/.
// Prints "API key: <raw>" to stdout. The raw key is shown ONCE — store it now.

import { readFileSync } from 'node:fs';
import { createHmac, randomBytes } from 'node:crypto';
import pg from 'pg';

const tenantName = process.argv[2] ?? 'dev-tenant';
const adminPw = readFileSync('secrets/DB_ADMIN_PASSWORD', 'utf8').trim();
const hmacKey = readFileSync('secrets/API_KEY_HMAC_SECRET');

const client = new pg.Client({
  host: '127.0.0.1',
  port: 5432,
  user: 'mcp_admin',
  password: adminPw,
  database: 'deneva_mcp',
});
await client.connect();

const rawKey = randomBytes(32).toString('base64url');
const keyHash = createHmac('sha256', hmacKey).update(rawKey).digest('hex');
const expires = new Date(Date.now() + 365 * 24 * 60 * 60 * 1000);

try {
  await client.query('BEGIN');
  const { rows: [tenant] } = await client.query(
    'INSERT INTO tenants (name) VALUES ($1) RETURNING id',
    [tenantName],
  );
  await client.query(
    `INSERT INTO api_keys (tenant_id, key_hash, description, expires_at)
     VALUES ($1, $2, $3, $4)`,
    [tenant.id, keyHash, 'seed-tenant.mjs', expires],
  );
  await client.query('COMMIT');
  console.log(`Tenant: ${tenant.id} (${tenantName})`);
  console.log(`API key: ${rawKey}`);
  console.log('Store this key now — it cannot be retrieved later.');
} catch (err) {
  await client.query('ROLLBACK');
  throw err;
} finally {
  await client.end();
}

Acceptance: node scripts/seed-tenant.mjs "Acme Corp" prints a tenant UUID and a base64url key. psql ... -c "SELECT count(*) FROM api_keys WHERE tenant_id = '<that uuid>';" returns 1. curl -H "X-Api-Key: <that key>" ... to /mcp succeeds.


Workstream I — CI & dependency hygiene

I1. GitHub Actions workflow

File: .github/workflows/ci.yml

name: CI

on:
  pull_request:
  push: { branches: [main] }

jobs:
  build:
    runs-on: ubuntu-latest
    services:
      postgres:
        image: postgres:16-alpine
        env: { POSTGRES_PASSWORD: ci, POSTGRES_DB: deneva_mcp }
        ports: ['5432:5432']
        options: >-
          --health-cmd "pg_isready -U postgres"
          --health-interval 5s
          --health-timeout 5s
          --health-retries 10
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with: { node-version: '22', cache: 'npm' }
      - run: npm ci
      - run: bash scripts/dev-secrets.sh
      - run: npm run typecheck
      - run: npm run lint
      - run: npm run audit        # fails on high/critical advisories
      - run: npm run db:migrate
      - run: npm test

I2. Dependabot config

File: .github/dependabot.yml

version: 2
updates:
  - package-ecosystem: "npm"
    directory: "/"
    schedule: { interval: "daily" }
    open-pull-requests-limit: 10
    groups:
      minor-and-patch:
        update-types: ["minor", "patch"]
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule: { interval: "weekly" }

I3. Branch protection (informational — manual setup)

In the GitHub repo settings:

  • Require status check build before merge.
  • Require at least one review.
  • Require linear history.
  • Disallow force pushes to main.

I4. Acceptance

  • A PR that introduces a dependency with a known high-severity advisory fails the build at the npm run audit step.
  • A PR that breaks the RLS test fails at npm test.
  • Dependabot creates at least one PR within 24h of merging the config.

§11 — Definition of Done (full checklist)

Status legend (as of 2026-05-07, post-Ubuntu-smoke-test)

  • [x] — implementation complete in the repo, verifiable by reading the source. End-to-end-verified items also note the run that proved it.
  • [~] — code shipped but not yet executed end-to-end (needs npm install + Ubuntu host run, see docs/setup-ubuntu.md).
  • [ ] — not yet implemented.

The 2026-05-07 Ubuntu smoke-test surfaced eight bugs (now fixed) — see §14 for the catalogue.

A. Project bootstrap

  • [~] npm run typecheck passes on a clean checkout. (Code in place; not yet run because node_modules are not installed in this checkout.)
  • [x] npm run lint flags forbidden raw sql interpolation. (Rule wired in eslint.config.js.)
  • [x] Node 22 enforced via engines and .nvmrc. (package.json engines + .nvmrc.)

B. Database & schema

  • [x] All seven tables defined in src/db/schema.ts. (Migration generation deferred to first npm install run — npm run db:generate.)
  • [x] RLS enabled on metric_cache, platform_credentials, sync_log. (src/db/rls.sql. api_keys is intentionally NOT under RLS — see §14 item #4 and the policy comment.)
  • [x] mcp_app cannot UPDATE/DELETE audit_log; NOBYPASSRLS set. (src/db/roles.sql. Verified end-to-end on the 2026-05-07 Ubuntu smoke-test — mcp_app could see api_keys after RLS was lifted, audit_log inserts succeeded, deletes were blocked.)
  • [x] RLS verification test written. (tests/integration/rls.test.ts — runs in CI against a real Postgres.)
  • [x] Pool max set explicitly (20). (src/db/index.ts.)
  • [x] Pre-migration pg_dump step documented in deploy procedure. (docs/setup-ubuntu.md “Update the application” section.)

C. Secrets loader

  • [x] All four required secrets must load at startup or the process exits ≠ 0. (verifyAllSecretsLoadable() in src/security/secrets.loader.ts, called first in src/index.ts.)
  • [x] No secret value is read from process.env. (Loader reads only from /run/credentials/... or ./secrets/.)
  • [x] secrets/ directory is gitignored. (.gitignore.)

D. API key auth

  • [x] HMAC-SHA256 (not plain SHA-256) hashes used. (src/security/api-key.service.ts. Verified 2026-05-07: hash computed live from the running service’s HMAC secret matched the DB-stored hash byte-for-byte.)
  • [x] timingSafeEqual used for comparison. (Same file.)
  • [x] Rotation endpoint creates a new key with 24h grace on the old. (src/auth/admin-routes.ts.)
  • [~] All four D4 acceptance tests pass. (tests/integration/api-key.test.ts covers the unit-level cases. The full E2E flow (missing/invalid/valid key + rotation + 25h-clock-advance) requires app.inject() against a booted Fastify instance — add in a follow-up before Phase 1 sign-off. Manual smoke-test on 2026-05-07 verified the success path: valid key → initialize → 200.)
  • [x] Auth middleware actually runs on /mcp requests. (Plugin wrapped with fastify-plugin in src/security/tenant.middleware.ts — without it, Fastify’s encapsulation hid the preHandler hook from the /mcp route. See §14 #6.)

E. Rate limiting

  • [~] Global per-IP, per-tenant, and per-route limits enforced separately. (Global + per-tenant shipped — see src/security/rate-limiter.plugin.ts, split into globalRateLimiterPlugin + tenantRateLimiterPlugin so they bracket auth correctly. Per-route /auth/* strict limit deferred to Phase 2 when the real OAuth routes are added.)
  • [x] IP block engages after 10 auth failures within 1h. (src/security/ip-block.service.ts, wired in src/security/tenant.middleware.ts.)
  • [~] All E5 acceptance tests pass. (tests/integration/ip-block.test.ts covers the IP-block service. The full app.inject()-driven cases (101st global request, 301st tenant request, blocked-IP-with-valid-key) follow alongside the D4 E2E suite.)

F. Audit log

G. MCP server

  • [x] initialize reachable through /mcp with a valid key. (2026-05-07 smoke-test: POST /mcp with initialize returns 200 with serverInfo: { name: "deneva-mcp", version: "0.1.0" } and an mcp-session-id header.)
  • [x] Both ping and get_account_health registered. (src/mcp/server.ts registers both via the registry; uses non-deprecated server.registerTool().)
  • [x] Tool registry pattern is set up so Phase 2 only adds files under mcp/tools/ + adapters/. (See docs/components/mcp-tools.md “Adding a new tool”.)
  • [x] Server binds to 127.0.0.1:3001 only. (src/index.ts app.listen({ host: '127.0.0.1', port: 3001 }).)
  • [x] A thrown error inside any tool produces an mcp.tool_failed audit row (single producer). (Wrapper around every handler in src/mcp/server.ts.)
  • [x] GET /health returns 200 with { ok, version, uptimeSec } and is unauthenticated. (In src/index.ts; auth middleware skips non-/mcp routes.)
  • [x] Tool context (tenantId, requestId) flows to handlers. (Closure-based: buildMcpServer({ tenantId, requestId }) in src/mcp/server.ts builds a fresh server per request with context baked into each handler. The MCP SDK’s transport.handleRequest(req, res, parsedBody) has no application-context slot — the third arg is the JSON-RPC body. See §14 #8.)
  • [~] End-to-end MCP test (tests/integration/mcp-e2e.test.ts). (Pending — requires booting the full app inside vitest. Stateful tools/call over curl needs the initialize → notifications/initialized → tools/call session dance, which a real MCP client handles automatically. Same follow-up as D4/E5 E2E.)

H. OAuth state

I. CI

  • [x] CI runs typecheck, lint, audit, migrations, tests against a real Postgres. (.github/workflows/ci.yml.)
  • [x] Dependabot config merged. (.github/dependabot.yml.)
  • [x] CI fails on a high-severity advisory. (npm audit --audit-level=high step in the workflow.)

Outstanding before Phase 1 sign-off

  1. Run npm install once (locally or on the CI runner) so package-lock.json is generated and committed. This unblocks the typecheck / lint / migration generation steps.
  2. Generate the initial migration: npm run db:generate — produces src/db/migrations/0000_*.sql from the schema. Commit it.
  3. Add the three E2E suites flagged [~] above (auth.test.ts, rate-limit.test.ts, mcp-e2e.test.ts) — they need a booted Fastify instance via app.inject() and Postgres up. The CI workflow already provisions Postgres, so they slot in without infra changes.
  4. Smoke-test on the Ubuntu host — done 2026-05-07 following docs/setup-ubuntu.md. Authenticated POST /mcp initialize returns 200 with the expected server info. Eight bugs found and fixed during the smoke test; see §14 below.

§14 — Errata: bugs found during the first Ubuntu smoke-test

The first end-to-end run on a fresh Ubuntu 24.04 host (2026-05-07) hit eight issues. All eight are now fixed in the source and the setup guide; this section is the record of what was wrong and where the fix landed, so the next person walking this guide doesn’t re-debug them.

#1
Symptom: psql:src/db/roles.sql: ERROR: permission denied to create role when running Step 8 as mcp_admin.
Root cause: The seed-time mcp_admin role doesn’t carry CREATEROLE. The earlier draft of src/db/roles.sql and Step 8 told the operator to run the script as mcp_admin.
Fix: Run roles.sql via sudo -u postgres psql -d deneva_mcp -v app_password="$APP_PW" -f src/db/roles.sql. Updated roles.sql header comment + setup-ubuntu.md Step 8.

#2
Symptom: seed-tenant.mjs crashed with ENOENT: no such file or directory, open 'secrets/API_KEY_HMAC_SECRET'.
Root cause: Step 8 wrote DB_ADMIN_PASSWORD and DB_PASSWORD into secrets/ but never the HMAC secret, even though seed-tenant.mjs reads it from there. The API_KEY_HMAC_SECRET value also wasn’t generated as a shell variable until Step 7’s interactive prompt — too late to reuse.
Fix: Step 4a now mints API_KEY_HMAC_SECRET="$(openssl rand -base64 32)" alongside the DB passwords; Step 8 writes all three into secrets/. (setup-ubuntu.md.)

#3
Symptom: Service crash-looped with Check failed: 12 == errno (V8 fatal ENOMEM from mprotect).
Root cause: Systemd unit shipped with MemoryDenyWriteExecute=true, which blocks the mprotect(PROT_EXEC) calls V8’s JIT baseline compiler issues. Incompatible with any V8 runtime.
Fix: Removed MemoryDenyWriteExecute=true from the systemd unit in setup-ubuntu.md Step 10. The other hardening flags (NoNewPrivileges, ProtectSystem=strict, RestrictNamespaces, etc.) stay.

#4
Symptom: Authenticated POST /mcp returned 401 even with a valid key; audit_log was empty.
Root cause: RLS policy on api_keys (tenant_id = current_setting('app.current_tenant_id')::uuid) requires the tenant context to already be set — but the auth middleware reads api_keys to discover the tenant. Chicken-and-egg: mcp_app saw zero rows, so the lookup always failed.
Fix: Removed RLS from api_keys in src/db/rls.sql. api_keys is auth infrastructure — access is controlled by hash-unguessability, not tenant scope.

#5
Symptom: Auth still failing with empty audit_log and 401 after fix #4.
Root cause: db/index.ts set ssl: { rejectUnauthorized: true } in NODE_ENV=production. The Phase-1 server uses the snake-oil cert, which Node’s bundled CA store does not trust. Every pool connection threw self-signed certificate. The error path turned into a 401 without an audit row (see fix #6).
Fix: src/db/index.ts now uses ssl: false. App and Postgres share a host through Phase 5, so loopback traffic never benefits from SSL — revisit only if Postgres moves off-host.

#6
Symptom: After fix #5 the DB connection worked from a one-off Node script, yet the running service still returned 401 with 1.98 ms response time and zero audit rows.
Root cause: tenantAuthPlugin (and tenantRateLimiterPlugin) were registered via app.register(...) without fastify-plugin wrapping. Fastify’s plugin encapsulation meant the addHook('preHandler', ...) calls inside applied only to routes registered inside the plugin scope. The MCP route is mounted directly on the parent via mountMcp(app), so the auth hook never ran for POST /mcp. The 401 was actually coming from the route handler’s defence-in-depth if (!tenantId) check.
Fix: Both plugins are now fp(...)-wrapped in src/security/tenant.middleware.ts and src/security/rate-limiter.plugin.ts. fastify-plugin is already a transitive dep via @fastify/rate-limit.

#7
Symptom: POST /mcp returned 406 Not Acceptable: Client must accept both application/json and text/event-stream.
Root cause: StreamableHTTPServerTransport enforces content-negotiation. The smoke-test curl only sent Content-Type, not Accept.
Fix: Added -H "Accept: application/json, text/event-stream" to the smoke-test curl in Step 11 of setup-ubuntu.md.

#8
Symptom: After fix #7, POST /mcp returned 400 Parse error: Invalid JSON-RPC message for any body.
Root cause: The third argument of transport.handleRequest(req, res, parsedBody?) is the JSON-RPC payload, not application context. The original code passed { tenantId, requestId } there; the SDK then validated that as a JSON-RPC message and rejected it. Compounding factor: Fastify’s body parser had already consumed req.raw, so falling back to the stream was empty too.
Fix: src/mcp/server.ts now passes req.body as parsedBody. Tool context flows via closure on buildMcpServer({ tenantId, requestId }) — context captured per-request when the server is built, no SDK-context plumbing required.

After these eight fixes, the §13 smoke test passes end-to-end: initialize returns 200 with serverInfo: { name: "deneva-mcp", version: "0.1.0" } and an mcp-session-id header. Authenticated tool calls require the same session ID + the notifications/initialized step that real MCP clients (Claude Desktop, etc.) maintain across requests; curl-based testing of tools/call is therefore not part of the Phase 1 smoke test.
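For reference, the wire-level sequence a real client performs looks roughly like this — a sketch only; against the Phase 1 server, which builds a fresh transport per request, step 3 may still be rejected, which is exactly why tools/call stays out of the curl smoke test:

# 1. initialize — capture the session id from the response headers
SID=$(curl -si -X POST http://127.0.0.1:3001/mcp \
  -H "X-Api-Key: $KEY" -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"1.0"}}}' \
  | grep -i '^mcp-session-id:' | tr -d '\r' | cut -d' ' -f2)

# 2. notifications/initialized — same session id, no JSON-RPC id (it's a notification)
curl -s -X POST http://127.0.0.1:3001/mcp \
  -H "X-Api-Key: $KEY" -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" -H "mcp-session-id: $SID" \
  -d '{"jsonrpc":"2.0","method":"notifications/initialized"}'

# 3. tools/call — only valid inside the initialized session
curl -s -X POST http://127.0.0.1:3001/mcp \
  -H "X-Api-Key: $KEY" -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" -H "mcp-session-id: $SID" \
  -d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"ping","arguments":{}}}'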


§12 — Out of scope (deferred)

Item — Phase

  • Real OAuth /auth/:platform/start and /callback HTTP routes — Phase 2
  • Envelope encryption (credentials.service.ts) — needs real tokens — Phase 2
  • Google / Meta / TikTok platform adapters — Phases 2 / 3
  • Inngest sync functions, signed-webhook verification — Phase 4
  • Cache TTL config and cache.service.ts — Phase 2 (per-platform)
  • nginx config, UFW rules, full systemd hardening flag set — Phase 5
  • GDPR erasure endpoint (deleteTenant) — Phase 5
  • Penetration test of public endpoints — Phase 5

If a task you’re about to do is on this list, stop — it belongs in a later phase.


§13 — Manual smoke test (run end-to-end before declaring Phase 1 done)

# 1. Bring up Postgres
docker compose up -d postgres

# 2. Generate dev secrets
bash scripts/dev-secrets.sh

# 3. Run migrations
npm run db:migrate

# 4. Create the mcp_app role + apply RLS (must be done BEFORE the app connects).
#    mcp_admin lacks CREATEROLE, so roles.sql runs as the postgres superuser.
APP_PW="$(cat secrets/DB_PASSWORD)"
ADMIN_PW="$(cat secrets/DB_ADMIN_PASSWORD)"
sudo -u postgres psql -d deneva_mcp -v app_password="$APP_PW" -f src/db/roles.sql
psql "postgresql://mcp_admin:${ADMIN_PW}@127.0.0.1:5432/deneva_mcp" -f src/db/rls.sql

# 5. Seed a tenant + API key
node scripts/seed-tenant.mjs "dev-tenant"
# prints "API key: XYZ123..." ONCE — copy now
export KEY=XYZ123...

# 6. Start the server (now that mcp_app role exists)
npm run dev

# 7. initialize with valid key — expect 200 + audit row.
#    The Accept header is REQUIRED by Streamable HTTP; without it the SDK 406s.
#    `initialize` is the right smoke-test target because `tools/call` requires a
#    stateful session (initialize → notifications/initialized → tools/call) that
#    a real MCP client handles, but curl on its own does not.
curl -s -o /dev/null -w "%{http_code}\n" -X POST http://127.0.0.1:3001/mcp \
  -H "X-Api-Key: $KEY" \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"1.0"}}}'
# → 200

# 8. /mcp without key — expect 401 + audit row
curl -s -o /dev/null -w "%{http_code}\n" -X POST http://127.0.0.1:3001/mcp \
  -H "Content-Type: application/json" \
  -H "Accept: application/json, text/event-stream" \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"1.0"}}}'
# → 401

# 9. Hammer with 101 requests in a minute — expect at least one 429
for i in $(seq 1 101); do
  curl -s -o /dev/null -w "%{http_code} " -X POST http://127.0.0.1:3001/mcp \
    -H "X-Api-Key: $KEY" \
    -H "Content-Type: application/json" \
    -H "Accept: application/json, text/event-stream" \
    -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"1.0"}}}'
done; echo
# → mostly 200, then 429s

# 10. Confirm mcp_app cannot tamper with audit_log
psql "postgresql://mcp_app@127.0.0.1:5432/deneva_mcp" -c "DELETE FROM audit_log;"
# → ERROR: permission denied for table audit_log

# 11. Inspect the trail
psql "postgresql://mcp_admin:${ADMIN_PW}@127.0.0.1:5432/deneva_mcp" \
  -c "SELECT event_type, outcome, count(*) FROM audit_log GROUP BY 1,2 ORDER BY 1,2;"
# → expect rows for api_key.auth_success/failure, mcp.tool_called, rate_limit.exceeded

If every step above produces the expected outcome, Phase 1 is shipped. Move on to Phase 2.