Phase 1 — Secure Foundation
Detailed execution doc for Phase 1 of the Deneva MCP Tool Architecture Plan. The architecture doc is the source of truth for what is being built and why; this doc is the source of truth for how and in what order.
Estimated effort: 1–2 weeks for one engineer.
Phase 2 follow-up: docs/phase-2-google-ads.md.
Goal
Stand up a minimal but complete Fastify + Drizzle + Postgres stack with the entire security middleware chain wired end-to-end, validated by two stub MCP tools. Every later phase plugs into this foundation without changing it: Phase 2 adds adapters and real tokens, Phase 3 adds more adapters, Phase 4 adds background sync, Phase 5 adds production hardening.
If Phase 1 is done correctly, an attacker hitting an unauthenticated request, sending a malformed payload, or exceeding the rate limit gets the same correct response on day one as they will on day 365.
Definition of Done (high-level — full checklist in §11)
- A request to `/mcp` with a valid `X-Api-Key` reaches the `ping` tool, gets a 200, and produces an `api_key.auth_success` + `mcp.tool_called` audit row.
- The same request without the header gets a 401 and an `api_key.auth_failure` audit row, in constant time vs an invalid-but-well-formed key.
- Two separate tenants cannot read each other's `metric_cache` rows even if a query forgets a `WHERE tenant_id = ?` clause (RLS catches it).
- `DELETE FROM audit_log` as the `mcp_app` role fails with permission denied.
- CI fails the build on a high-severity `npm audit` advisory.
Workstream order & dependency graph
A. Project bootstrap ──┬──▶ C. Secrets loader ──────┬──▶ D. API key auth ──┬──▶ G. MCP server + stubs
                       │                            │                      │
                       └──▶ B. Database & schema ──▶ F. Audit log ─────────┤
                                                │                          │
                                                └──▶ E. Rate limiting ─────┤
                                                                           │
H. OAuth scaffolding ──────────────────────────────────────────────────────┤
                                                                           │
I. CI & dep hygiene ───────────────────────────────────────────────────────┘   (runs alongside everything)

A → B → (C, F, E) → D → G is the critical path. H and I are parallelisable.
Workstream A — Project bootstrap
A0. Dependencies (one-time npm install)
Pin exact versions in package-lock.json (Phase 1 §A1’s package.json declares ranges; lockfile is what actually ships). The MCP SDK shape may evolve — the version below is the contract Phase 1 is written against; bump deliberately, not opportunistically.
# Runtime
npm install \
fastify@^5.0.0 \
@fastify/helmet@^13.0.0 \
@fastify/rate-limit@^10.0.0 \
rate-limiter-flexible@^7.0.0 \
drizzle-orm@^0.36.0 \
pg@^8.13.0 \
zod@^3.23.0 \
@modelcontextprotocol/sdk@^1.0.0
# ^^^ Phase 1 was authored against SDK 1.x. The McpServer / StreamableHTTPServerTransport
# API shape in §G1 may differ on newer minor versions — adapt the registry, keep the contract.
# Dev / build / test
npm install --save-dev \
typescript@^5.6.0 \
tsx@^4.19.0 \
@types/node@^22.0.0 \
@types/pg@^8.11.0 \
drizzle-kit@^0.28.0 \
vitest@^2.1.0 \
eslint@^9.0.0 \
typescript-eslint@^8.0.0

Acceptance: `npm install` completes with zero high/critical advisories (`npm audit --audit-level=high` exits 0).
A1. Initialise project files
Files:
`package.json`, `tsconfig.json`, `.gitignore`, `.editorconfig`, `.nvmrc`
package.json (relevant fields):
{
"name": "deneva-mcp",
"private": true,
"type": "module",
"engines": { "node": ">=22.0.0 <23" },
"scripts": {
"dev": "tsx watch src/index.ts",
"build": "tsc --project tsconfig.json",
"start": "node dist/index.js",
"typecheck": "tsc --noEmit",
"lint": "eslint . --max-warnings=0",
"test": "vitest run",
"test:watch":"vitest",
"audit": "npm audit --audit-level=high",
"db:migrate":"drizzle-kit migrate",
"db:studio": "drizzle-kit studio"
}
}

tsconfig.json essentials:
{
"compilerOptions": {
"target": "ES2023",
"module": "ESNext",
"moduleResolution": "Bundler",
"strict": true,
"noUncheckedIndexedAccess": true,
"exactOptionalPropertyTypes": true,
"noImplicitOverride": true,
"isolatedModules": true,
"resolveJsonModule": true,
"outDir": "dist",
"rootDir": "src"
},
"include": ["src/**/*"]
}

.gitignore (must include):
node_modules
dist
.env
.env.*
secrets/
coverage

.nvmrc: `22`.
Acceptance: npm install && npm run typecheck exits 0 on an empty src/index.ts.
A2. ESLint with the SQL-injection guard
File: eslint.config.js
The non-negotiable rule: ban template-string interpolation inside db.execute(sql`…`) and similar. Drizzle's parameterized helpers (eq, and, sql`…${param}`) remain allowed; raw template construction does not.
// eslint.config.js
import tseslint from 'typescript-eslint';
export default tseslint.config(
...tseslint.configs.recommendedTypeChecked,
{
languageOptions: { parserOptions: { project: './tsconfig.json' } },
rules: {
'no-restricted-syntax': [
'error',
{
// db.execute(sql`...${var}...`) where the sql tag is given a raw template literal
selector:
"CallExpression[callee.property.name='execute'] > TaggedTemplateExpression[tag.name='sql'][quasi.expressions.length>0]",
message:
'Raw template interpolation in db.execute(sql`...`) is forbidden. Use parameterized helpers (eq, and, sql.placeholder) instead.',
},
],
},
},
);

Acceptance: `npm run lint` flags a deliberately-injected db.execute(sql`SELECT * FROM x WHERE id = ${userId}`) test fixture and exits non-zero.
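A minimal fixture that should trip the rule could look like the following (the file path and the userId variable are illustrative, not part of the repo layout above; the fixture has to live somewhere the type-checked ESLint project config covers, and it is never imported at runtime):

// tests/fixtures/lint/raw-sql-interpolation.ts (expected to FAIL npm run lint)
import { sql } from 'drizzle-orm';
import { db } from '../../../src/db/index.js';

const userId = 'not-a-real-id';
// Matches the no-restricted-syntax selector: a `sql` tagged template with interpolated
// expressions passed directly to db.execute().
export const bad = db.execute(sql`SELECT * FROM api_keys WHERE id = ${userId}`);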
Workstream B — Database & schema
B1. Local Postgres via docker-compose
File: docker-compose.yml
services:
postgres:
image: postgres:16-alpine
restart: unless-stopped
environment:
POSTGRES_DB: deneva_mcp
POSTGRES_USER: mcp_admin
POSTGRES_PASSWORD: ${POSTGRES_PASSWORD:-dev_only_password}
ports:
- "127.0.0.1:5432:5432" # bound to localhost only
volumes:
- mcp_pg_data:/var/lib/postgresql/data
volumes:
mcp_pg_data:

No TLS in dev. The postgres:16-alpine image does not ship the Debian ssl-cert package, so the previous draft's snake-oil cert paths would crash the container on boot. Production uses a CA-signed cert mounted via the systemd unit (Phase 5 §K). The ssl: { rejectUnauthorized: false } in src/db/index.ts reflects this — connections succeed regardless of TLS posture in dev.
Acceptance: docker compose up -d postgres && docker compose exec postgres psql -U mcp_admin -d deneva_mcp -c "select 1" returns 1.
B2. Drizzle schema + initial migration
Files:
drizzle.config.ts, src/db/schema.ts, src/db/index.ts, src/db/migrations/0000_init.sql (generated by drizzle-kit generate)
src/db/schema.ts — paste the seven pgTable definitions from the architecture doc (tenants, apiKeys, platformCredentials, oauthStates, metricCache, auditLog, syncLog). Prepend the imports the architecture doc omits:
import { pgTable, uuid, text, timestamp, jsonb, integer } from 'drizzle-orm/pg-core';

drizzle.config.ts — drizzle-kit runs as a CLI, as mcp_admin (privileged), separate from the runtime pool which connects as mcp_app. Read the admin password synchronously from the dev secrets dir:
// drizzle.config.ts
import { defineConfig } from 'drizzle-kit';
import { readFileSync } from 'node:fs';
import { join } from 'node:path';
const password = readFileSync(
process.env.NODE_ENV === 'production'
? join('/run/credentials', process.env.SYSTEMD_UNIT ?? 'deneva-mcp.service', 'DB_ADMIN_PASSWORD')
: join(process.cwd(), 'secrets', 'DB_ADMIN_PASSWORD'),
'utf8',
).trim();
export default defineConfig({
schema: './src/db/schema.ts',
out: './src/db/migrations',
dialect: 'postgresql',
dbCredentials: {
host: process.env.DB_HOST ?? '127.0.0.1',
port: 5432,
user: 'mcp_admin',
password,
database: 'deneva_mcp',
ssl: false, // dev; prod uses CA-signed cert (Phase 5)
},
strict: true,
});

Why the split:

- mcp_app (runtime) lacks DDL grants — it cannot CREATE TABLE.
- mcp_admin (migrations) has full privileges and bypasses RLS, which is correct for migrations and seed scripts but must not be used at runtime. Production uses a separate encrypted DB_ADMIN_PASSWORD credential that lives only on the deploy host, not in the running app's credential set.
src/db/index.ts:
import { drizzle } from 'drizzle-orm/node-postgres';
import { Pool } from 'pg';
import * as schema from './schema.js';
import { loadSecret } from '../security/secrets.loader.js';
const pool = new Pool({
host: process.env.DB_HOST ?? '127.0.0.1',
port: 5432,
user: 'mcp_app',
password: (await loadSecret('DB_PASSWORD')).toString('utf8'),
database: 'deneva_mcp',
ssl: { rejectUnauthorized: false }, // dev; production uses CA-signed cert
max: 20, // see §B6; sized for Phase 4 sync fan-out
});
export const db = drizzle(pool, { schema });

Acceptance: `npm run db:migrate` creates all seven tables. `\d+ audit_log` in psql shows the columns from the architecture doc.
B3. Row-level security policies
File: src/db/rls.sql (run as mcp_admin after migrations)
ALTER TABLE metric_cache ENABLE ROW LEVEL SECURITY;
ALTER TABLE platform_credentials ENABLE ROW LEVEL SECURITY;
ALTER TABLE sync_log ENABLE ROW LEVEL SECURITY;
ALTER TABLE api_keys ENABLE ROW LEVEL SECURITY;
-- Tenant isolation: every tenant-scoped query must SET app.current_tenant_id first.
CREATE POLICY tenant_isolation_metric_cache ON metric_cache
USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
CREATE POLICY tenant_isolation_creds ON platform_credentials
USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
CREATE POLICY tenant_isolation_sync_log ON sync_log
USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
CREATE POLICY tenant_isolation_api_keys ON api_keys
USING (tenant_id = current_setting('app.current_tenant_id', true)::uuid);
-- audit_log is intentionally NOT RLS-isolated: cross-tenant security review needs full visibility.
-- Access control on audit_log is via role grants in roles.sql.

Setting the tenant context must happen on every request, immediately after auth:
// inside tenant.middleware.ts, after tenantId is resolved:
await db.execute(sql`SELECT set_config('app.current_tenant_id', ${tenantId}, true)`);

(true = transaction-local; the setting clears at end of transaction. Pair with a per-request transaction or a connection-pool reset hook.)
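A minimal sketch of that pairing, assuming a new helper module (withTenantContext is illustrative and not part of the file list above):

// src/db/tenant-context.ts (illustrative sketch)
import { sql } from 'drizzle-orm';
import { db } from './index.js';

// Runs `fn` inside a single transaction with app.current_tenant_id set. Because the
// third set_config argument is `true`, the setting is scoped to this transaction and
// clears automatically when it commits or rolls back.
export async function withTenantContext<T>(
  tenantId: string,
  fn: (tx: Parameters<Parameters<typeof db.transaction>[0]>[0]) => Promise<T>,
): Promise<T> {
  return db.transaction(async (tx) => {
    await tx.execute(sql`SELECT set_config('app.current_tenant_id', ${tenantId}, true)`);
    return fn(tx);
  });
}

Tool handlers that touch tenant-scoped tables would then wrap their queries in withTenantContext(ctx.tenantId, ...) so RLS sees the context on every statement.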
B4. Database role separation
File: src/db/roles.sql (run once, as mcp_admin, after migrations have created the tables)
mcp_admin already exists — Postgres creates it on first boot from the POSTGRES_USER env var in §B1’s compose file. This script only creates mcp_app and applies grants on the tables that drizzle-kit migrate produced. Apply with:
psql "postgresql://mcp_admin:$ADMIN_PW@127.0.0.1:5432/deneva_mcp" \
-v app_password="$APP_PW" -f src/db/roles.sql

The -v flag is mandatory — without it, the :'app_password' placeholder is passed through unsubstituted and CREATE ROLE fails. On the Ubuntu host, where mcp_admin is not a superuser and lacks CREATEROLE, run the script as the postgres superuser instead (see §13 step 4 and §14 #1).
CREATE ROLE mcp_app LOGIN PASSWORD :'app_password' NOINHERIT;
GRANT CONNECT ON DATABASE deneva_mcp TO mcp_app;
GRANT USAGE ON SCHEMA public TO mcp_app;
GRANT SELECT, INSERT, UPDATE, DELETE
ON ALL TABLES IN SCHEMA public TO mcp_app;
GRANT USAGE
ON ALL SEQUENCES IN SCHEMA public TO mcp_app;
-- audit_log is INSERT-only for the application
REVOKE UPDATE, DELETE ON audit_log FROM mcp_app;
-- mcp_app must NOT bypass RLS
ALTER ROLE mcp_app NOBYPASSRLS;
-- mcp_admin already has full privileges (created by Postgres at boot).
-- Re-asserting NOBYPASSRLS for mcp_app is the only contract we enforce here.

Acceptance: as mcp_app, INSERT INTO audit_log ... succeeds; UPDATE audit_log SET outcome='success' fails with permission denied.
B5. RLS verification integration test
File: tests/integration/rls.test.ts
import { describe, it, expect } from 'vitest';
import { db } from '../../src/db/index.js';
import { metricCache, tenants } from '../../src/db/schema.js';
import { sql } from 'drizzle-orm';
describe('row-level security', () => {
it('blocks tenant A from reading tenant B rows', async () => {
const [a] = await db.insert(tenants).values({ name: 'A' }).returning();
const [b] = await db.insert(tenants).values({ name: 'B' }).returning();
// Insert a row owned by tenant B, with B's context set.
await db.execute(sql`SELECT set_config('app.current_tenant_id', ${b.id}, false)`);
await db.insert(metricCache).values({
tenantId: b.id, platform: 'google', reportType: 'health',
dateRangeKey: 'last_7', data: {}, expiresAt: new Date(Date.now() + 60_000),
});
// Switch to tenant A's context.
await db.execute(sql`SELECT set_config('app.current_tenant_id', ${a.id}, false)`);
const visible = await db.select().from(metricCache);
expect(visible).toHaveLength(0); // RLS hides B's row from A
});
});

Acceptance: test passes against a DB connection authenticated as mcp_app (NOT mcp_admin — admin bypasses RLS).
B6. Migration rollback strategy
Drizzle Kit does not auto-generate down migrations. Phase 3 drops & recreates indexes; Phase 4 adds non-nullable columns. Both can fail mid-deploy on prod data the dev DB never saw. The startup-grade rollback strategy is restore from a pre-migration dump — not maintaining a parallel set of down.sql files (cheap to write, expensive to keep correct).
The deploy procedure (used in CI/CD on the Ubuntu host once Phase 5 §K is in place):
# Take a snapshot dump immediately before applying migrations.
TS=$(date -u +%Y%m%dT%H%M%SZ)
pg_dump --format=custom --no-owner --no-privileges deneva_mcp \
> /var/backups/deneva-mcp/pre-migrate-${TS}.dump
# Apply migrations.
npm run db:migrate || {
echo "Migration failed; restore with:"
echo " dropdb deneva_mcp && createdb deneva_mcp"
echo " pg_restore --dbname=deneva_mcp /var/backups/deneva-mcp/pre-migrate-${TS}.dump"
exit 1
}Local dev rollback: docker compose down -v && docker compose up -d postgres && npm run db:migrate wipes and reapplies — usually faster than reasoning about a partial failure.
For destructive migrations (drop/recreate index, NOT NULL on existing column), commit a one-liner *.rollback.sql next to the generated migration with the inverse SQL — used by an operator following docs/compliance/runbooks/database-restore.md (Phase 5) when a restore from dump would lose too much intervening data.
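A sketch of what such a rollback file could contain; the file name and index names are illustrative, since the real migration names come out of drizzle-kit generate:

-- src/db/migrations/0003_tighten_metric_cache_index.rollback.sql (illustrative)
-- Inverse of a forward migration that replaced an index on metric_cache.
DROP INDEX IF EXISTS metric_cache_tenant_platform_range_idx;
CREATE INDEX metric_cache_tenant_platform_idx ON metric_cache (tenant_id, platform);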
Acceptance: the deploy script aborts on a failed migration; the runbook mentioned above (added in Phase 5 §K6) references this pre-migrate dump path.
Workstream C — Secrets loader
C1. Two-backend loader
File: src/security/secrets.loader.ts
import { readFile } from 'node:fs/promises';
import { join } from 'node:path';
const REQUIRED_SECRETS = [
'CREDENTIAL_KEK',
'API_KEY_HMAC_SECRET',
'DB_PASSWORD',
'INNGEST_SIGNING_KEY',
] as const;
type SecretName = (typeof REQUIRED_SECRETS)[number];
const cache = new Map<SecretName, Buffer>();
export async function loadSecret(name: SecretName): Promise<Buffer> {
const hit = cache.get(name);
if (hit) return hit;
const path = process.env.NODE_ENV === 'production'
? join('/run/credentials', process.env.SYSTEMD_UNIT ?? 'deneva-mcp.service', name)
: join(process.cwd(), 'secrets', name);
const value = await readFile(path);
cache.set(name, value);
return value;
}
export async function verifyAllSecretsLoadable(): Promise<void> {
// Fail fast at startup — better than discovering a missing secret at first request.
for (const name of REQUIRED_SECRETS) await loadSecret(name);
}

C2. Startup verification
File: src/index.ts (entry point)
import { verifyAllSecretsLoadable } from './security/secrets.loader.js';
await verifyAllSecretsLoadable(); // throws if any required secret is missing
// ... continue Fastify bootstrap

C3. Dev secrets bootstrap script
File: scripts/dev-secrets.sh
#!/usr/bin/env bash
set -euo pipefail
mkdir -p secrets
chmod 700 secrets
gen() { head -c 32 /dev/urandom | base64 > "secrets/$1" && chmod 600 "secrets/$1"; }
[[ -f secrets/CREDENTIAL_KEK ]] || gen CREDENTIAL_KEK
[[ -f secrets/API_KEY_HMAC_SECRET ]] || gen API_KEY_HMAC_SECRET
[[ -f secrets/DB_PASSWORD ]] || echo -n 'dev_only_password' > secrets/DB_PASSWORD
[[ -f secrets/DB_ADMIN_PASSWORD ]] || echo -n 'dev_only_password' > secrets/DB_ADMIN_PASSWORD
[[ -f secrets/INNGEST_SIGNING_KEY ]] || gen INNGEST_SIGNING_KEY
chmod 600 secrets/DB_PASSWORD secrets/DB_ADMIN_PASSWORD
echo "Dev secrets in ./secrets/ (gitignored)."
DB_ADMIN_PASSWORD is not in REQUIRED_SECRETS. The runtime app connects as mcp_app and never needs the admin password. It is consumed only by drizzle.config.ts (migrations) and scripts/seed-tenant.mjs. In dev it matches the compose POSTGRES_PASSWORD default; in prod it lives on the deploy host's encrypted credential store, separate from the app's runtime credentials.
C4. systemd unit reference (production — copy-paste, do not commit a real one in Phase 1)
# /etc/systemd/system/deneva-mcp.service
[Service]
Type=simple
User=deneva-mcp
WorkingDirectory=/opt/deneva-mcp
ExecStart=/usr/bin/node dist/index.js
Environment=NODE_ENV=production
Environment=SYSTEMD_UNIT=deneva-mcp.service
LoadCredentialEncrypted=CREDENTIAL_KEK:/etc/deneva-mcp/creds/CREDENTIAL_KEK.cred
LoadCredentialEncrypted=API_KEY_HMAC_SECRET:/etc/deneva-mcp/creds/API_KEY_HMAC_SECRET.cred
LoadCredentialEncrypted=DB_PASSWORD:/etc/deneva-mcp/creds/DB_PASSWORD.cred
LoadCredentialEncrypted=INNGEST_SIGNING_KEY:/etc/deneva-mcp/creds/INNGEST_SIGNING_KEY.cred
NoNewPrivileges=true
ProtectSystem=strict
ProtectHome=true
PrivateTmp=true

(Full hardening flag set lands in Phase 5.)
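For reference, the .cred files that LoadCredentialEncrypted= points at are produced on the host with systemd-creds; the destination paths below simply mirror the unit above, and the shell variable is illustrative:

# On the production host, as root, once per secret.
# --name must match the credential name used in LoadCredentialEncrypted=.
mkdir -p /etc/deneva-mcp/creds
printf '%s' "$DB_PASSWORD_VALUE" | \
  systemd-creds encrypt --name=DB_PASSWORD - /etc/deneva-mcp/creds/DB_PASSWORD.cred
# Repeat for CREDENTIAL_KEK, API_KEY_HMAC_SECRET, INNGEST_SIGNING_KEY.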
Acceptance: starting the server with one of the four secret files removed produces a clear error and exit code != 0 before any port is opened.
Workstream D — API key auth
D1. Key service
File: src/security/api-key.service.ts
import { createHmac, randomBytes, timingSafeEqual } from 'node:crypto';
import { loadSecret } from './secrets.loader.js';
const HMAC_SECRET = await loadSecret('API_KEY_HMAC_SECRET');
export function generateApiKey(): string {
// 32 bytes = 256 bits, base64url-encoded; shown to user once, never stored
return randomBytes(32).toString('base64url');
}
export function hashApiKey(rawKey: string): string {
return createHmac('sha256', HMAC_SECRET).update(rawKey).digest('hex');
}
export function verifyApiKey(rawKey: string, storedHash: string): boolean {
const candidate = Buffer.from(hashApiKey(rawKey));
const stored = Buffer.from(storedHash);
if (candidate.length !== stored.length) return false;
return timingSafeEqual(candidate, stored);
}

D2. Tenant middleware
File: src/security/tenant.middleware.ts
import type { FastifyPluginAsync } from 'fastify';
import { and, eq, gt, isNull } from 'drizzle-orm';
import { db } from '../db/index.js';
import { apiKeys } from '../db/schema.js';
import { hashApiKey } from './api-key.service.js';
import { writeAuditEvent } from './audit-log.service.js';
declare module 'fastify' {
interface FastifyRequest { tenantId?: string; apiKeyId?: string }
}
export const tenantAuthPlugin: FastifyPluginAsync = async (fastify) => {
fastify.addHook('preHandler', async (req, reply) => {
if (!req.url.startsWith('/mcp')) return; // auth only on /mcp routes
const raw = req.headers['x-api-key'];
if (typeof raw !== 'string' || raw.length === 0) {
await writeAuditEvent('api_key.auth_failure', 'failure', { ip: req.ip, reason: 'missing' });
return reply.code(401).send({ error: 'unauthorized' });
}
const hash = hashApiKey(raw);
const now = new Date();
const [row] = await db.select().from(apiKeys).where(
and(
eq(apiKeys.keyHash, hash),
isNull(apiKeys.revokedAt),
gt(apiKeys.expiresAt, now),
),
);
if (!row) {
await writeAuditEvent('api_key.auth_failure', 'failure', { ip: req.ip, reason: 'invalid' });
return reply.code(401).send({ error: 'unauthorized' });
}
req.tenantId = row.tenantId;
req.apiKeyId = row.id;
// Async fire-and-forget: lastUsedAt update should not block the request.
void db.update(apiKeys).set({ lastUsedAt: now }).where(eq(apiKeys.id, row.id));
await writeAuditEvent('api_key.auth_success', 'success', { tenantId: row.tenantId });
});
};

Note on constant-time behaviour: doing the DB lookup after hashing means the invalid-key and valid-key paths take similar time. The cheap missing-header path returns earlier, but it doesn't leak whether a specific key is valid — only whether the header was present.
D3. Rotation endpoint (pulled forward from Phase 5)
File: src/auth/admin-routes.ts
import type { FastifyPluginAsync } from 'fastify';
import { eq } from 'drizzle-orm';
import { z } from 'zod';
import { timingSafeEqual } from 'node:crypto';
import { db } from '../db/index.js';
import { apiKeys } from '../db/schema.js';
import { generateApiKey, hashApiKey } from '../security/api-key.service.js';
import { writeAuditEvent } from '../security/audit-log.service.js';
import { loadSecret } from '../security/secrets.loader.js';
const ADMIN_HEADER_NAME = 'x-admin-token';
const adminToken = Buffer.from((await loadSecret('API_KEY_HMAC_SECRET')).toString('hex')); // Phase 1 placeholder; Phase 5 introduces a separate ADMIN_TOKEN secret
function adminTokenMatches(presented: unknown): boolean {
if (typeof presented !== 'string') return false;
const cand = Buffer.from(presented);
if (cand.length !== adminToken.length) return false;
return timingSafeEqual(cand, adminToken);
}
const RotateBody = z.object({ tenantId: z.string().uuid(), description: z.string().min(1).max(120) });
export const adminRoutes: FastifyPluginAsync = async (fastify) => {
fastify.post('/admin/api-keys/rotate', async (req, reply) => {
if (!adminTokenMatches(req.headers[ADMIN_HEADER_NAME])) return reply.code(401).send();
const body = RotateBody.parse(req.body);
const newKey = generateApiKey();
const newHash = hashApiKey(newKey);
const graceUntil = new Date(Date.now() + 24 * 60 * 60 * 1000);
await db.transaction(async (tx) => {
// Mark previous active keys for this tenant with a 24h grace expiry — do NOT revoke immediately.
await tx.update(apiKeys)
.set({ expiresAt: graceUntil })
.where(eq(apiKeys.tenantId, body.tenantId));
await tx.insert(apiKeys).values({
tenantId: body.tenantId,
keyHash: newHash,
description: body.description,
expiresAt: new Date(Date.now() + 365 * 24 * 60 * 60 * 1000),
});
});
await writeAuditEvent('api_key.rotated', 'success', { tenantId: body.tenantId });
return { apiKey: newKey, graceUntil }; // shown once, never retrievable again
});
};

Phase 5 follow-up: replace the placeholder admin token with a dedicated ADMIN_TOKEN secret, move the route behind nginx with an IP allow-list, and add a separate revoke-immediately variant.
D4. Acceptance tests
File: tests/integration/auth.test.ts
Required cases:
1. POST /mcp without X-Api-Key → 401, audit row api_key.auth_failure with reason: 'missing'. (A hedged sketch of this case follows the list.)
2. POST /mcp with a random key → 401, audit row with reason: 'invalid'. Timing within 20% of case 3 (no constant-time guarantee in tests, but a sanity check).
3. POST /mcp with a valid key → 200, audit row api_key.auth_success, lastUsedAt updated.
4. After rotation: the old key works for ≤24h, the new key works immediately. After 25h (advance fake clock), the old key returns 401.
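A sketch of case 1, assuming a buildApp() test helper that registers all plugins and routes without calling listen() (no such helper exists in the file list above):

// tests/integration/auth.test.ts (sketch of case 1 only)
import { describe, it, expect } from 'vitest';
import { buildApp } from '../helpers/build-app.js'; // assumed helper, not yet in the repo

describe('API key auth', () => {
  it('returns 401 and audits a missing-key failure', async () => {
    const app = await buildApp();
    const res = await app.inject({
      method: 'POST',
      url: '/mcp',
      headers: { 'content-type': 'application/json' },
      payload: { jsonrpc: '2.0', id: 1, method: 'initialize', params: {} },
    });
    expect(res.statusCode).toBe(401);
    // Then assert the newest audit_log row is api_key.auth_failure with reason 'missing',
    // querying through the same test DB connection the app uses.
    await app.close();
  });
});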
Workstream E — Rate limiting
E1. Global per-IP limit
File: src/security/rate-limiter.plugin.ts
import type { FastifyPluginAsync } from 'fastify';
import rateLimit from '@fastify/rate-limit';
export const rateLimiterPlugin: FastifyPluginAsync = async (fastify) => {
await fastify.register(rateLimit, {
global: true,
max: 100,
timeWindow: 60_000,
keyGenerator: (req) => req.ip,
errorResponseBuilder: (req, _ctx) => {
// The plugin invokes this on every blocked request — do the audit write here.
// Lazy import keeps audit-log.service out of the plugin's import cycle.
void import('./audit-log.service.js').then(({ writeAuditEvent }) =>
writeAuditEvent('rate_limit.exceeded', 'failure', { ip: req.ip, scope: 'global' }),
);
return { error: 'rate_limit_exceeded' };
},
});
};

Why errorResponseBuilder instead of an onExceeded hook: @fastify/rate-limit v10 does not expose a per-route onExceeded callback — the response builder is the single point that fires on every 429. The audit write is fire-and-forget (void) so it doesn't add latency to the response.
E2. Per-tenant limit
// inside the same plugin file
import { RateLimiterMemory } from 'rate-limiter-flexible';
const tenantLimiter = new RateLimiterMemory({ points: 300, duration: 60 });
fastify.addHook('preHandler', async (req, reply) => {
if (!req.tenantId) return;
try { await tenantLimiter.consume(req.tenantId, 1); }
catch {
const { writeAuditEvent } = await import('./audit-log.service.js');
await writeAuditEvent('rate_limit.exceeded', 'failure', { tenantId: req.tenantId, scope: 'tenant' });
return reply.code(429).send({ error: 'rate_limit_exceeded' });
}
});

E3. Strict /auth/* limit (placeholder routes — real ones in Phase 2)
fastify.register(async (inst) => {
await inst.register(rateLimit, { max: 5, timeWindow: 15 * 60_000, keyGenerator: (req) => req.ip });
inst.get('/auth/_phase1_placeholder', async () => ({ ok: true }));
}, { prefix: '/' });

E4. Auth-failure IP block

Single-process assumption. The IP-block map below and the per-tenant rate-limit bucket in §E2 both live in process memory. This is fine for the systemd-direct deployment Phase 5 §B2 specifies (one Node process per host). The architecture doc's ecosystem.config.js example shows PM2 cluster mode with instances: 2; we have deliberately diverged from that — clustering would require moving these maps to Redis or a blocked_ips DB table. If you choose to scale horizontally later, those two structures are the migration targets.
After 10 api_key.auth_failure events from the same IP within an hour:
- Insert a row in an in-memory blockedIps map (Phase 1 — Redis in Phase 5) with expiresAt = now + 1h.
- Subsequent requests from that IP short-circuit at the first preHandler with 401 + audit event auth.blocked_ip.
- A simple setInterval(cleanup, 60_000) removes expired entries.
// security/ip-block.service.ts — sketch
const blocks = new Map<string, number>();      // ip -> block expiry (ms epoch)
const failures = new Map<string, number[]>();  // ip -> failure timestamps in the sliding 1h window
export function isBlocked(ip: string): boolean {
  const exp = blocks.get(ip);
  if (!exp) return false;
  if (exp < Date.now()) { blocks.delete(ip); return false; }
  return true;
}
// Sliding-window count; returns true when the 10th failure within an hour trips the block.
export function recordFailure(ip: string): boolean {
  const now = Date.now();
  const recent = [...(failures.get(ip) ?? []).filter((t) => now - t < 3_600_000), now];
  failures.set(ip, recent);
  if (recent.length < 10) return false;
  blocks.set(ip, now + 3_600_000);
  return true;
}
File: tests/integration/rate-limit.test.ts
Use Fastify’s built-in app.inject() (no separate package — it’s exposed on every Fastify instance) + a fake clock (vi.useFakeTimers()):
- 100 requests from one IP succeed; the 101st returns 429 + audit row with scope: 'global'. (A hedged sketch of this case follows the list.)
- 300 requests from one tenant succeed; the 301st returns 429 + audit row with scope: 'tenant'.
- 11 auth-failure attempts from one IP → IP block engages; a valid key from that IP also gets 401 with an auth.blocked_ip audit row until the clock advances 1h.
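A sketch of the global-limit case, assuming the same buildApp() helper as in the D4 sketch; it hits the unauthenticated /health route purely to exercise the global limiter without MCP session setup, and uses app.inject's remoteAddress option to fix the client IP the limiter keys on:

// tests/integration/rate-limit.test.ts (sketch of the global per-IP case)
import { describe, it, expect } from 'vitest';
import { buildApp } from '../helpers/build-app.js'; // assumed helper, not yet in the repo

describe('global per-IP rate limit', () => {
  it('returns 429 on the 101st request in a one-minute window', async () => {
    const app = await buildApp();
    const fire = () =>
      app.inject({ method: 'GET', url: '/health', remoteAddress: '203.0.113.9' });
    for (let i = 0; i < 100; i++) expect((await fire()).statusCode).toBe(200);
    expect((await fire()).statusCode).toBe(429); // the 101st request trips the limiter
    await app.close();
  });
});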
Workstream F — Audit log
F1. Service
File: src/security/audit-log.service.ts
import { db } from '../db/index.js';
import { auditLog } from '../db/schema.js';
export type AuditEventType =
| 'api_key.auth_success' | 'api_key.auth_failure'
| 'api_key.created' | 'api_key.rotated' | 'api_key.revoked'
| 'oauth.flow_started' | 'oauth.flow_completed' | 'oauth.flow_failed'
| 'oauth.token_refreshed' | 'oauth.token_revoked'
| 'mcp.tool_called' | 'mcp.tool_failed'
| 'tenant.created' | 'tenant.deleted'
| 'rate_limit.exceeded' | 'auth.blocked_ip' | 'sync.exhausted';
interface Ctx {
tenantId?: string; ip?: string; requestId?: string;
reason?: string; scope?: string; tool?: string;
[k: string]: unknown;
}
const PII_KEYS = new Set(['email', 'name', 'firstName', 'lastName', 'phone', 'address']);
function stripPii(meta: Ctx): Ctx {
const out: Ctx = {};
for (const [k, v] of Object.entries(meta)) if (!PII_KEYS.has(k)) out[k] = v;
return out;
}
export async function writeAuditEvent(
eventType: AuditEventType,
outcome: 'success' | 'failure',
ctx: Ctx = {},
): Promise<void> {
const { tenantId, ip, requestId, ...rest } = ctx;
await db.insert(auditLog).values({
tenantId: tenantId ?? null,
eventType,
actorIp: ip ?? null,
requestId: requestId ?? null,
outcome,
metadata: stripPii(rest as Ctx),
});
}

F2. Wire-up checklist
- tenantAuthPlugin calls writeAuditEvent('api_key.auth_*', ...) on every result.
- rateLimiterPlugin calls writeAuditEvent('rate_limit.exceeded', ...) on every 429.
- IP-block service calls writeAuditEvent('auth.blocked_ip', ...) on engagement.
- Stub tools call writeAuditEvent('mcp.tool_*', ...) on entry/exit.
- Rotation endpoint calls writeAuditEvent('api_key.rotated', ...).
F3. Acceptance
File: tests/integration/audit.test.ts
- Each test in §D4 / §E5 asserts the corresponding audit row appears.
- A negative test connects as mcp_app and runs UPDATE audit_log SET outcome='success' WHERE id = ... and DELETE FROM audit_log WHERE ... — both must throw permission denied.
- The metadata column for an api_key.auth_failure row never contains an email key even if the test injects one. (A hedged sketch of this case follows the list.)
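A sketch of the PII case, exercising writeAuditEvent directly; the ordering column (createdAt) is an assumption about the audit_log schema from the architecture doc:

// tests/integration/audit.test.ts (sketch of the PII-stripping case)
import { describe, it, expect } from 'vitest';
import { desc } from 'drizzle-orm';
import { db } from '../../src/db/index.js';
import { auditLog } from '../../src/db/schema.js';
import { writeAuditEvent } from '../../src/security/audit-log.service.js';

describe('audit metadata PII stripping', () => {
  it('drops email even when a caller injects it', async () => {
    await writeAuditEvent('api_key.auth_failure', 'failure', {
      ip: '203.0.113.9', reason: 'invalid', email: 'leak@example.com',
    });
    // createdAt is assumed; order by whatever timestamp column the schema defines.
    const [row] = await db.select().from(auditLog).orderBy(desc(auditLog.createdAt)).limit(1);
    expect(row?.metadata).not.toHaveProperty('email');
    expect(row?.metadata).toMatchObject({ reason: 'invalid' });
  });
});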
Workstream G — MCP server + stub tools
G1. Tool registry
File: src/mcp/server.ts
import { McpServer } from '@modelcontextprotocol/sdk/server/mcp.js';
import { StreamableHTTPServerTransport } from '@modelcontextprotocol/sdk/server/streamableHttp.js';
import type { FastifyInstance } from 'fastify';
import { pingTool } from './tools/ping.js';
import { accountHealthTool } from './tools/account-health.js';
import { writeAuditEvent } from '../security/audit-log.service.js';
export interface ToolDef<I, O> {
name: string;
description: string;
inputSchema: import('zod').ZodType<I>;
handler: (input: I, ctx: { tenantId: string; requestId: string }) => Promise<O>;
}
const TOOLS: ToolDef<unknown, unknown>[] = [pingTool, accountHealthTool];
export function buildMcpServer(): McpServer {
const server = new McpServer({ name: 'deneva-mcp', version: '0.1.0' });
for (const t of TOOLS) {
server.tool(t.name, t.description, t.inputSchema, async (input, extra) => {
// tenantId + requestId are propagated via the transport's request-context (set in mountMcp)
const ctx = extra.context as { tenantId: string; requestId: string };
try {
return await t.handler(input as never, ctx as never);
} catch (err) {
// Single producer for mcp.tool_failed in Phase 1 — every later phase reuses this path.
// Phase 2's typed-error path (AdapterError) flows through here too.
await writeAuditEvent('mcp.tool_failed', 'failure', {
tenantId: ctx.tenantId,
requestId: ctx.requestId,
tool: t.name,
reason: err instanceof Error ? err.message : 'unknown',
});
throw err; // let the MCP transport translate to a JSON-RPC error
}
});
}
return server;
}
export async function mountMcp(fastify: FastifyInstance): Promise<void> {
fastify.post('/mcp', async (req, reply) => {
const transport = new StreamableHTTPServerTransport({
sessionIdGenerator: () => crypto.randomUUID(),
});
const server = buildMcpServer();
await server.connect(transport);
await transport.handleRequest(req.raw, reply.raw, { tenantId: req.tenantId!, requestId: req.id });
});
}

The exact MCP SDK call shape may differ slightly between SDK versions; the registry pattern above is the contract Phase 2 must preserve.
G2. ping tool
File: src/mcp/tools/ping.ts
import { z } from 'zod';
import type { ToolDef } from '../server.js';
import { writeAuditEvent } from '../../security/audit-log.service.js';
const Input = z.object({});
export const pingTool: ToolDef<z.infer<typeof Input>, { ok: true; tenantId: string; requestId: string }> = {
name: 'ping',
description: 'Health-check tool — verifies the entire middleware stack.',
inputSchema: Input,
async handler(_input, ctx) {
await writeAuditEvent('mcp.tool_called', 'success', { tenantId: ctx.tenantId, requestId: ctx.requestId, tool: 'ping' });
return { ok: true, tenantId: ctx.tenantId, requestId: ctx.requestId };
},
};

G3. get_account_health stub
File: src/mcp/tools/account-health.ts
import { z } from 'zod';
import type { ToolDef } from '../server.js';
import { writeAuditEvent } from '../../security/audit-log.service.js';
const Input = z.object({
platform: z.enum(['google', 'meta', 'tiktok']),
dateRange: z.enum(['last_7_days', 'last_30_days', 'last_90_days']),
});
interface Output {
platform: 'google' | 'meta' | 'tiktok';
dateRange: string;
metrics: { spend: number; roas: number; cpa: number; ctr: number };
_stub: true; // removed in Phase 2 when real data lands
}
export const accountHealthTool: ToolDef<z.infer<typeof Input>, Output> = {
name: 'get_account_health',
description: 'Returns spend, ROAS, CPA, CTR for the given platform and date range.',
inputSchema: Input,
async handler(input, ctx) {
await writeAuditEvent('mcp.tool_called', 'success', { tenantId: ctx.tenantId, requestId: ctx.requestId, tool: 'get_account_health' });
return {
platform: input.platform,
dateRange: input.dateRange,
metrics: { spend: 0, roas: 0, cpa: 0, ctr: 0 },
_stub: true,
};
},
};

G4. Server bootstrap
File: src/index.ts
import Fastify from 'fastify';
import helmet from '@fastify/helmet';
import { verifyAllSecretsLoadable } from './security/secrets.loader.js';
import { rateLimiterPlugin } from './security/rate-limiter.plugin.js';
import { tenantAuthPlugin } from './security/tenant.middleware.js';
import { adminRoutes } from './auth/admin-routes.js';
import { mountMcp } from './mcp/server.js';
import { startOAuthStateCleanup } from './auth/oauth-state.service.js';
await verifyAllSecretsLoadable();
// Pino logs structured JSON to stdout. In production the systemd unit captures stdout into
// journald — no log-shipping infrastructure is wired in Phase 1. Retention (size + time)
// is configured in Phase 5 §L; for dev, `journalctl -u deneva-mcp -f` tails the live log.
//
// Redaction has two layers:
// 1. Pino's built-in `redact` strips known request-header paths (api-key, admin-token).
// 2. A custom `err` serializer scrubs token-shaped strings and token-named fields from
// arbitrary log objects — adapter SDKs sometimes echo `access_token` into error
// messages, and we don't want those in journald. Phase 5 §L2's grep evidence check
// then becomes a regression-detector, not the primary defence.
const TOKEN_KEY_RE = /access_token|refresh_token|client_secret|authorization|api[_-]?key|x-admin-token/i;
function scrubTokens(value: unknown, depth = 0): unknown {
if (depth > 6 || value == null) return value;
if (typeof value === 'string') {
return value
.replace(/Bearer\s+\S+/gi, 'Bearer [REDACTED]')
.replace(/ya29\.[A-Za-z0-9_-]+/g, '[REDACTED]')
.replace(/EAA[A-Za-z0-9]+/g, '[REDACTED]');
}
if (Array.isArray(value)) return value.map((v) => scrubTokens(v, depth + 1));
if (typeof value === 'object') {
const out: Record<string, unknown> = {};
for (const [k, v] of Object.entries(value as Record<string, unknown>)) {
out[k] = TOKEN_KEY_RE.test(k) ? '[REDACTED]' : scrubTokens(v, depth + 1);
}
return out;
}
return value;
}
const app = Fastify({
logger: {
redact: ['req.headers["x-api-key"]', 'req.headers["x-admin-token"]'],
serializers: {
err: (e: Error) => scrubTokens({ type: e.name, message: e.message, stack: e.stack }),
},
},
genReqId: () => crypto.randomUUID(),
});
await app.register(helmet, { global: true });
await app.register(rateLimiterPlugin);
await app.register(tenantAuthPlugin);
await app.register(adminRoutes);
await mountMcp(app);
// Process-liveness probe for nginx / systemd / external uptime monitor.
// Unauthenticated by design — a 200 here means the Node process is alive and listening,
// not that downstream dependencies are healthy. Phase 4 §G3 owns the dependency-health
// endpoint (/admin/health/inngest); Phase 5 §G1 wires this to an external monitor.
const startedAt = Date.now();
app.get('/health', async () => ({
ok: true,
version: process.env.npm_package_version ?? '0.1.0',
uptimeSec: Math.floor((Date.now() - startedAt) / 1000),
}));
startOAuthStateCleanup();
await app.listen({ host: '127.0.0.1', port: 3001 });

G5. Acceptance
File: tests/integration/mcp-e2e.test.ts
- Drive both tools through /mcp with a real API key. Assert: 200, response shape matches the Zod output, mcp.tool_called audit row exists.
- Logging an Error whose message contains Bearer ya29.foo emits a JSON log line with the token segment redacted (assert on the parsed line — the literal ya29.foo and Bearer ya29 substrings must not appear). Same assertion for an error with a property named access_token.
- Manual curl examples documented:
# ping
curl -X POST https://localhost/mcp \
-H "X-Api-Key: $KEY" -H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":1,"method":"tools/call","params":{"name":"ping","arguments":{}}}'
# get_account_health
curl -X POST https://localhost/mcp \
-H "X-Api-Key: $KEY" -H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"get_account_health","arguments":{"platform":"google","dateRange":"last_7_days"}}}'Workstream H — OAuth state scaffolding
H1. Service skeleton
File: src/auth/oauth-state.service.ts
import { createHash, randomBytes } from 'node:crypto';
import { and, eq, gt, lt } from 'drizzle-orm';
import { db } from '../db/index.js';
import { oauthStates } from '../db/schema.js';
export async function createOAuthState(tenantId: string, platform: string) {
const state = randomBytes(32).toString('base64url');
const codeVerifier = randomBytes(32).toString('base64url');
const codeChallenge = createHash('sha256').update(codeVerifier).digest('base64url');
await db.insert(oauthStates).values({
state, codeVerifier, tenantId, platform,
expiresAt: new Date(Date.now() + 10 * 60 * 1000),
});
return { state, codeVerifier, codeChallenge };
}
export async function consumeOAuthState(state: string) {
const [row] = await db.delete(oauthStates)
.where(and(eq(oauthStates.state, state), gt(oauthStates.expiresAt, new Date())))
.returning();
if (!row) throw new Error('Invalid or expired OAuth state');
return row; // single-use: deleted on read
}
export function startOAuthStateCleanup(): void {
// Plain in-process timer is fine for Phase 1; replaced by Inngest cron in Phase 4.
setInterval(() => {
void db.delete(oauthStates).where(lt(oauthStates.expiresAt, new Date()));
}, 5 * 60_000).unref();
}

No HTTP routes here — the /auth/:platform/start and /callback endpoints land in Phase 2 with the Google adapter. Phase 1 only proves the storage and cleanup work.
H2. Acceptance
- Unit test: createOAuthState writes a row; consumeOAuthState returns and deletes it; a second call with the same state throws. (A hedged sketch follows the list.)
- Integration test: insert a row with expiresAt = now - 1ms, run startOAuthStateCleanup once (extracted as a callable for test), assert the row is gone.
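A sketch of the unit test; the fixed tenant UUID is illustrative, since oauth_states rows reference a real tenants row, so a fixture insert is assumed:

// tests/integration/oauth-state.test.ts (sketch of the single-use property)
import { describe, it, expect } from 'vitest';
import { createOAuthState, consumeOAuthState } from '../../src/auth/oauth-state.service.js';

describe('oauth state', () => {
  it('is single-use', async () => {
    // Assumes a tenant with this id was inserted by a test fixture beforehand.
    const tenantId = '00000000-0000-0000-0000-000000000000';
    const { state, codeVerifier } = await createOAuthState(tenantId, 'google');
    const row = await consumeOAuthState(state);
    expect(row.codeVerifier).toBe(codeVerifier);
    await expect(consumeOAuthState(state)).rejects.toThrow(/Invalid or expired/);
  });
});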
H3. Seed script — first tenant + API key
File: scripts/seed-tenant.mjs
A one-off bootstrap tool. Connects as mcp_admin (bypasses RLS — no tenant context to set), creates a tenant, mints a fresh API key, and prints the raw key once — there is no way to retrieve it later. Use the printed key as $KEY in the §13 smoke test.
// scripts/seed-tenant.mjs
//
// Usage:
// node scripts/seed-tenant.mjs "Acme Corp"
//
// Reads DB_ADMIN_PASSWORD and API_KEY_HMAC_SECRET from ./secrets/.
// Prints "API key: <raw>" to stdout. The raw key is shown ONCE — store it now.
import { readFileSync } from 'node:fs';
import { createHmac, randomBytes } from 'node:crypto';
import pg from 'pg';
const tenantName = process.argv[2] ?? 'dev-tenant';
const adminPw = readFileSync('secrets/DB_ADMIN_PASSWORD', 'utf8').trim();
const hmacKey = readFileSync('secrets/API_KEY_HMAC_SECRET');
const client = new pg.Client({
host: '127.0.0.1', port: 5432, user: 'mcp_admin',
password: adminPw, database: 'deneva_mcp',
});
await client.connect();
const rawKey = randomBytes(32).toString('base64url');
const keyHash = createHmac('sha256', hmacKey).update(rawKey).digest('hex');
const expires = new Date(Date.now() + 365 * 24 * 60 * 60 * 1000);
try {
await client.query('BEGIN');
const { rows: [tenant] } = await client.query(
'INSERT INTO tenants (name) VALUES ($1) RETURNING id',
[tenantName],
);
await client.query(
`INSERT INTO api_keys (tenant_id, key_hash, description, expires_at)
VALUES ($1, $2, $3, $4)`,
[tenant.id, keyHash, 'seed-tenant.mjs', expires],
);
await client.query('COMMIT');
console.log(`Tenant: ${tenant.id} (${tenantName})`);
console.log(`API key: ${rawKey}`);
console.log('Store this key now — it cannot be retrieved later.');
} catch (err) {
await client.query('ROLLBACK');
throw err;
} finally {
await client.end();
}

Acceptance: node scripts/seed-tenant.mjs "Acme Corp" prints a tenant UUID and a base64url key. psql ... -c "SELECT count(*) FROM api_keys WHERE tenant_id = '<that uuid>';" returns 1. curl -H "X-Api-Key: <that key>" ... to /mcp succeeds.
Workstream I — CI & dependency hygiene
I1. GitHub Actions workflow
File: .github/workflows/ci.yml
name: CI
on:
pull_request:
push: { branches: [main] }
jobs:
build:
runs-on: ubuntu-latest
services:
postgres:
image: postgres:16-alpine
env: { POSTGRES_PASSWORD: ci, POSTGRES_DB: deneva_mcp }
ports: ['5432:5432']
options: >-
--health-cmd "pg_isready -U postgres" --health-interval 5s
--health-timeout 5s --health-retries 10
steps:
- uses: actions/checkout@v4
- uses: actions/setup-node@v4
with: { node-version: '22', cache: 'npm' }
- run: npm ci
- run: bash scripts/dev-secrets.sh
- run: npm run typecheck
- run: npm run lint
- run: npm run audit # fails on high/critical advisories
- run: npm run db:migrate
- run: npm test

I2. Dependabot config
File: .github/dependabot.yml
version: 2
updates:
- package-ecosystem: "npm"
directory: "/"
schedule: { interval: "daily" }
open-pull-requests-limit: 10
groups:
minor-and-patch:
update-types: ["minor", "patch"]
- package-ecosystem: "github-actions"
directory: "/"
schedule: { interval: "weekly" }I3. Branch protection (informational — manual setup)
In the GitHub repo settings:
- Require status check build before merge.
- Require at least one review.
- Require linear history.
- Disallow force pushes to main.
I4. Acceptance
- A PR that introduces a dependency with a known high-severity advisory fails the build at the npm run audit step.
- A PR that breaks the RLS test fails at npm test.
- Dependabot creates at least one PR within 24h of merging the config.
§11 — Definition of Done (full checklist)
Status legend (as of 2026-05-07, post-Ubuntu-smoke-test)
- `[x]` — implementation complete in the repo, verifiable by reading the source. End-to-end-verified items also note the run that proved it.
- `[~]` — code shipped but not yet executed end-to-end (needs npm install + Ubuntu host run, see docs/setup-ubuntu.md).
- `[ ]` — not yet implemented.

The 2026-05-07 Ubuntu smoke-test surfaced eight bugs (now fixed) — see §14 for the catalogue.
A. Project bootstrap
- [~] npm run typecheck passes on a clean checkout. (Code in place; not yet run because node_modules are not installed in this checkout.)
- [x] npm run lint flags forbidden raw sql interpolation. (Rule wired in eslint.config.js.)
- [x] Node 22 enforced via engines and .nvmrc. (package.json engines + .nvmrc.)
B. Database & schema
- [x] All seven tables defined in src/db/schema.ts. (Migration generation deferred to first npm install run — npm run db:generate.)
- [x] RLS enabled on metric_cache, platform_credentials, sync_log. (src/db/rls.sql. api_keys is intentionally NOT under RLS — see §14 item #4 and the policy comment.)
- [x] mcp_app cannot UPDATE/DELETE audit_log; NOBYPASSRLS set. (src/db/roles.sql. Verified end-to-end on the 2026-05-07 Ubuntu smoke-test — mcp_app could see api_keys after RLS was lifted, audit_log inserts succeeded, deletes were blocked.)
- [x] RLS verification test written. (tests/integration/rls.test.ts — runs in CI against a real Postgres.)
- [x] Pool max set explicitly (20). (src/db/index.ts.)
- [x] Pre-migration pg_dump step documented in deploy procedure. (docs/setup-ubuntu.md "Update the application" section.)
C. Secrets loader
- [x] All four required secrets must load at startup or the process exits ≠ 0. (verifyAllSecretsLoadable() in src/security/secrets.loader.ts, called first in src/index.ts.)
- [x] No secret value is read from process.env. (Loader reads only from /run/credentials/... or ./secrets/.)
- [x] secrets/ directory is gitignored. (.gitignore.)
D. API key auth
- [x] HMAC-SHA256 (not plain SHA-256) hashes used. (src/security/api-key.service.ts. Verified 2026-05-07: hash computed live from the running service's HMAC secret matched the DB-stored hash byte-for-byte.)
- [x] timingSafeEqual used for comparison. (Same file.)
- [x] Rotation endpoint creates a new key with 24h grace on the old. (src/auth/admin-routes.ts.)
- [~] All four D4 acceptance tests pass. (tests/integration/api-key.test.ts covers the unit-level cases. The full E2E flow (missing/invalid/valid key + rotation + 25h-clock-advance) requires app.inject() against a booted Fastify instance — add in a follow-up before Phase 1 sign-off. Manual smoke-test on 2026-05-07 verified the success path: valid key → initialize → 200.)
- [x] Auth middleware actually runs on /mcp requests. (Plugin wrapped with fastify-plugin in src/security/tenant.middleware.ts — without it, Fastify's encapsulation hid the preHandler hook from the /mcp route. See §14 #6.)
E. Rate limiting
- [~] Global per-IP, per-tenant, and per-route limits enforced separately. (Global + per-tenant shipped — see src/security/rate-limiter.plugin.ts, split into globalRateLimiterPlugin + tenantRateLimiterPlugin so they bracket auth correctly. Per-route /auth/* strict limit deferred to Phase 2 when the real OAuth routes are added.)
- [x] IP block engages after 10 auth failures within 1h. (src/security/ip-block.service.ts, wired in src/security/tenant.middleware.ts.)
- [~] All E5 acceptance tests pass. (tests/integration/ip-block.test.ts covers the IP-block service. The full app.inject()-driven cases (101st global request, 301st tenant request, blocked-IP-with-valid-key) follow alongside the D4 E2E suite.)
F. Audit log
- [x] Every event in the §F2 wire-up checklist produces a row. (api_key.auth_* in tenant middleware; rate_limit.exceeded in rate limiter; auth.blocked_ip in tenant middleware on threshold trip; mcp.tool_called in each tool; mcp.tool_failed in src/mcp/server.ts — single producer; api_key.rotated in admin routes.)
- [x] PII keys are stripped from metadata. (src/security/audit-log.service.ts stripPii. Test: tests/integration/audit.test.ts.)
- [x] mcp_app cannot mutate or delete rows. (REVOKE in src/db/roles.sql; test in tests/integration/audit.test.ts.)
G. MCP server
- [x] initialize reachable through /mcp with a valid key. (2026-05-07 smoke-test: POST /mcp with initialize returns 200 with serverInfo: { name: "deneva-mcp", version: "0.1.0" } and an mcp-session-id header.)
- [x] Both ping and get_account_health registered. (src/mcp/server.ts registers both via the registry; uses non-deprecated server.registerTool().)
- [x] Tool registry pattern is set up so Phase 2 only adds files under mcp/tools/ + adapters/. (See docs/components/mcp-tools.md "Adding a new tool".)
- [x] Server binds to 127.0.0.1:3001 only. (src/index.ts app.listen({ host: '127.0.0.1', port: 3001 }).)
- [x] A thrown error inside any tool produces an mcp.tool_failed audit row (single producer). (Wrapper around every handler in src/mcp/server.ts.)
- [x] GET /health returns 200 with { ok, version, uptimeSec } and is unauthenticated. (In src/index.ts; auth middleware skips non-/mcp routes.)
- [x] Tool context (tenantId, requestId) flows to handlers. (Closure-based: buildMcpServer({ tenantId, requestId }) in src/mcp/server.ts builds a fresh server per request with context baked into each handler. The MCP SDK's transport.handleRequest(req, res, parsedBody) has no application-context slot — the third arg is the JSON-RPC body. See §14 #8.)
- [~] End-to-end MCP test (tests/integration/mcp-e2e.test.ts). (Pending — requires booting the full app inside vitest. Stateful tools/call over curl needs the initialize → notifications/initialized → tools/call session dance, which a real MCP client handles automatically. Same follow-up as D4/E5 E2E.)
H. OAuth state
- [x] oauth_states table populated and consumed in unit tests. (tests/integration/oauth-state.test.ts.)
- [x] Cleanup cron deletes expired rows. (startOAuthStateCleanup in src/auth/oauth-state.service.ts + _cleanupOAuthStatesNow test.)
I. CI
- [x] CI runs typecheck, lint, audit, migrations, tests against a real Postgres. (.github/workflows/ci.yml.)
- [x] Dependabot config merged. (.github/dependabot.yml.)
- [x] CI fails on a high-severity advisory. (npm audit --audit-level=high step in the workflow.)
Outstanding before Phase 1 sign-off
- Run npm install once (locally or on the CI runner) so package-lock.json is generated and committed. This unblocks the typecheck / lint / migration generation steps.
- Generate the initial migration: npm run db:generate — produces src/db/migrations/0000_*.sql from the schema. Commit it.
- Add the three E2E suites flagged [~] above (auth.test.ts, rate-limit.test.ts, mcp-e2e.test.ts) — they need a booted Fastify instance via app.inject() and Postgres up. The CI workflow already provisions Postgres, so they slot in without infra changes.
- Smoke-test on the Ubuntu host — done 2026-05-07 following docs/setup-ubuntu.md. Authenticated POST /mcp initialize returns 200 with the expected server info. Eight bugs found and fixed during the smoke test; see §14 below.
§14 — Errata: bugs found during the first Ubuntu smoke-test
The first end-to-end run on a fresh Ubuntu 24.04 host (2026-05-07) hit eight issues. All eight are now fixed in the source and the setup guide; this section is the record of what was wrong and where the fix landed, so the next person walking this guide doesn’t re-debug them.
| # | Symptom | Root cause | Fix |
|---|---|---|---|
| 1 | psql:src/db/roles.sql: ERROR: permission denied to create role when running Step 8 as mcp_admin. | The seed-time mcp_admin role doesn’t carry CREATEROLE. The earlier draft of src/db/roles.sql and Step 8 told the operator to run the script as mcp_admin. | Run roles.sql via sudo -u postgres psql -d deneva_mcp -v app_password="$APP_PW" -f src/db/roles.sql. Updated roles.sql header comment + setup-ubuntu.md Step 8. |
| 2 | seed-tenant.mjs crashed with ENOENT: no such file or directory, open 'secrets/API_KEY_HMAC_SECRET'. | Step 8 wrote DB_ADMIN_PASSWORD and DB_PASSWORD into secrets/ but never the HMAC secret, even though seed-tenant.mjs reads it from there. The API_KEY_HMAC_SECRET value also wasn’t generated as a shell variable until Step 7’s interactive prompt — too late to reuse. | Step 4a now mints API_KEY_HMAC_SECRET="$(openssl rand -base64 32)" alongside the DB passwords; Step 8 writes all three into secrets/. Files in setup-ubuntu.md. |
| 3 | Service crash-looped with Check failed: 12 == errno (V8 fatal ENOMEM from mprotect). | Systemd unit shipped with MemoryDenyWriteExecute=true, which blocks the mprotect(PROT_EXEC) calls V8’s JIT baseline compiler issues. Incompatible with any V8 runtime. | Removed MemoryDenyWriteExecute=true from the systemd unit in setup-ubuntu.md Step 10. The other hardening flags (NoNewPrivileges, ProtectSystem=strict, RestrictNamespaces, etc.) stay. |
| 4 | Authenticated POST /mcp returned 401 even with a valid key; audit_log was empty. | RLS policy on api_keys (tenant_id = current_setting('app.current_tenant_id')::uuid) requires the tenant context to already be set — but the auth middleware reads api_keys to discover the tenant. Chicken-and-egg: mcp_app saw zero rows, so the lookup always failed. | Removed RLS from api_keys in src/db/rls.sql. api_keys is auth infrastructure — access is controlled by hash-unguessability, not tenant scope. |
| 5 | Auth still failing with empty audit_log and 401 after fix #4. | db/index.ts set ssl: { rejectUnauthorized: true } in NODE_ENV=production. The Phase-1 server uses the snake-oil cert, which Node’s bundled CA store does not trust. Every pool connection threw self-signed certificate. The error path turned into a 401 without an audit row (see fix #6). | src/db/index.ts now uses ssl: false. App and Postgres share a host through Phase 5, so loopback traffic never benefits from SSL — revisit only if Postgres moves off-host. |
| 6 | After fix #5 the DB connection worked from a one-off Node script, yet the running service still returned 401 with 1.98ms response time and zero audit rows. | tenantAuthPlugin (and tenantRateLimiterPlugin) were registered via app.register(...) without fastify-plugin wrapping. Fastify’s plugin encapsulation meant the addHook('preHandler', ...) calls inside applied only to routes registered inside the plugin scope. The MCP route is mounted directly on the parent via mountMcp(app), so the auth hook never ran for POST /mcp. The 401 was actually coming from the route handler’s defence-in-depth if (!tenantId) check. | Both plugins are now fp(...)-wrapped in src/security/tenant.middleware.ts and src/security/rate-limiter.plugin.ts. fastify-plugin is already a transitive dep via @fastify/rate-limit. |
| 7 | POST /mcp returned 406 Not Acceptable: Client must accept both application/json and text/event-stream. | StreamableHTTPServerTransport enforces content-negotiation. The smoke-test curl only sent Content-Type, not Accept. | Added -H "Accept: application/json, text/event-stream" to the smoke-test curl in Step 11 of setup-ubuntu.md. |
| 8 | After fix #7, POST /mcp returned 400 Parse error: Invalid JSON-RPC message for any body. | The third argument of transport.handleRequest(req, res, parsedBody?) is the JSON-RPC payload, not application context. The original code passed { tenantId, requestId } there; the SDK then validated that as a JSON-RPC message and rejected it. Compounding factor: Fastify’s body parser had already consumed req.raw, so falling back to the stream was empty too. | src/mcp/server.ts now passes req.body as parsedBody. Tool context flows via closure on buildMcpServer({ tenantId, requestId }) — context captured per-request when the server is built, no SDK-context plumbing required. |
After these eight fixes, the §13 smoke test passes end-to-end: initialize returns 200 with serverInfo: { name: "deneva-mcp", version: "0.1.0" } and an mcp-session-id header. Authenticated tool calls require the same session ID + the notifications/initialized step that real MCP clients (Claude Desktop, etc.) maintain across requests; curl-based testing of tools/call is therefore not part of the Phase 1 smoke test.
§12 — Out of scope (deferred)
| Item | Phase |
|---|---|
| Real OAuth /auth/:platform/start and /callback HTTP routes | 2 |
| Envelope encryption (credentials.service.ts) — needs real tokens | 2 |
| Google / Meta / TikTok platform adapters | 2 / 3 |
| Inngest sync functions, signed-webhook verification | 4 |
| Cache TTL config and cache.service.ts | 2 (per-platform) |
| nginx config, UFW rules, full systemd hardening flag set | 5 |
| GDPR erasure endpoint (deleteTenant) | 5 |
| Penetration test of public endpoints | 5 |
If a task you’re about to do is on this list, stop — it belongs in a later phase.
§13 — Manual smoke test (run end-to-end before declaring Phase 1 done)
# 1. Bring up Postgres
docker compose up -d postgres
# 2. Generate dev secrets
bash scripts/dev-secrets.sh
# 3. Run migrations
npm run db:migrate
# 4. Create the mcp_app role + apply RLS (must be done BEFORE the app connects).
# mcp_admin lacks CREATEROLE, so roles.sql runs as the postgres superuser.
APP_PW="$(cat secrets/DB_PASSWORD)"
ADMIN_PW="$(cat secrets/DB_ADMIN_PASSWORD)"
sudo -u postgres psql -d deneva_mcp -v app_password="$APP_PW" -f src/db/roles.sql
psql "postgresql://mcp_admin:${ADMIN_PW}@127.0.0.1:5432/deneva_mcp" -f src/db/rls.sql
# 5. Seed a tenant + API key
node scripts/seed-tenant.mjs "dev-tenant" # prints "API key: XYZ123..." ONCE — copy now
export KEY=XYZ123...
# 6. Start the server (now that mcp_app role exists)
npm run dev
# 7. initialize with valid key — expect 200 + audit row.
# The Accept header is REQUIRED by Streamable HTTP; without it the SDK 406s.
# `initialize` is the right smoke-test target because `tools/call` requires a
# stateful session (initialize → notifications/initialized → tools/call) that
# a real MCP client handles, but curl on its own does not.
curl -s -o /dev/null -w "%{http_code}\n" -X POST http://127.0.0.1:3001/mcp \
-H "X-Api-Key: $KEY" \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"1.0"}}}'
# → 200
# 8. /mcp without key — expect 401 + audit row
curl -s -o /dev/null -w "%{http_code}\n" -X POST http://127.0.0.1:3001/mcp \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"1.0"}}}'
# → 401
# 9. Hammer with 101 requests in a minute — expect at least one 429
for i in $(seq 1 101); do
curl -s -o /dev/null -w "%{http_code} " -X POST http://127.0.0.1:3001/mcp \
-H "X-Api-Key: $KEY" \
-H "Content-Type: application/json" \
-H "Accept: application/json, text/event-stream" \
-d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"smoke","version":"1.0"}}}'
done; echo
# → mostly 200, then 429s
# 10. Confirm mcp_app cannot tamper with audit_log
psql "postgresql://mcp_app@127.0.0.1:5432/deneva_mcp" -c "DELETE FROM audit_log;"
# → ERROR: permission denied for table audit_log
# 11. Inspect the trail
psql "postgresql://mcp_admin:${ADMIN_PW}@127.0.0.1:5432/deneva_mcp" \
-c "SELECT event_type, outcome, count(*) FROM audit_log GROUP BY 1,2 ORDER BY 1,2;"
# → expect rows for api_key.auth_success/failure, mcp.tool_called, rate_limit.exceeded

If every step above produces the expected outcome, Phase 1 is shipped. Move on to Phase 2.