# Cache service

Source: `src/cache/` — `cache.service.ts`, `ttl-config.ts`, `metrics.ts`.

A TTL cache for adapter results, with `pg_advisory_xact_lock`-based thundering-herd protection and per-transaction RLS context.
## Public API

```ts
readOrFetch<T>(
  key: CacheKey,
  fetcher: () => Promise<T>,
  opts?: { force?: boolean },
): Promise<{ data: T; cache: 'hit' | 'miss' }>
```

`CacheKey` is `{ tenantId, platform, reportType, dateRangeKey }`. On a cache hit, the fetcher is never called. On a miss, exactly one fetcher invocation runs to completion; concurrent callers for the same key wait on the advisory lock and read the freshly written row once the lock is released.

`force: true` skips the read path entirely — the eager-sync worker (Phase 4) uses it to refresh a still-fresh row on schedule.
## Why open a transaction on every call

`metric_cache` is RLS-isolated; the runtime connects as `mcp_app` (`NOBYPASSRLS`). Every read or write needs `app.current_tenant_id` set first:

```sql
SELECT set_config('app.current_tenant_id', $1, true)
```

The `true` makes the setting transaction-local — pooled connections never leak tenant context across requests. Wrapping the whole `readOrFetch` in a transaction costs one extra `BEGIN`/`COMMIT` on cache hits (sub-millisecond locally); the alternative was to open the transaction only on a miss, but then the hit-path read would happen outside any tenant context and RLS would hide every row.
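The per-call flow can be sketched as a transaction wrapper with the database client injected as a plain query function, so the statement order is visible. `withTenantTx` and the `Query` type are hypothetical names; the `set_config` call and its transaction-local `true` flag are from the text above.

```typescript
type Query = (sql: string, params?: unknown[]) => Promise<unknown>;

export async function withTenantTx<T>(
  query: Query,
  tenantId: string,
  body: () => Promise<T>,
): Promise<T> {
  await query('BEGIN');
  try {
    // `true` => transaction-local setting: a pooled connection cannot
    // leak this tenant's context into the next request.
    await query(`SELECT set_config('app.current_tenant_id', $1, true)`, [tenantId]);
    const result = await body();
    await query('COMMIT'); // advisory locks taken inside release here
    return result;
  } catch (err) {
    await query('ROLLBACK'); // ...and here, so tenant context and locks never outlive the call
    throw err;
  }
}
```

The cache read, advisory lock, double-checked re-read, and upsert all run inside `body`, so every statement sees the tenant context and the lock is gone by the time the wrapper returns.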
## Thundering-herd protection

Inside the transaction, after the optional first read, the service acquires `pg_advisory_xact_lock(hashCacheKey(key))`. The hash is a 64-bit FNV-1a of `${tenantId}|${platform}|${reportType}|${dateRangeKey}`, reinterpreted into the signed `int8` range Postgres expects. Concurrent callers serialise on this lock; only one runs the fetcher. After acquiring the lock, the service re-reads the row (double-checked locking) — if a peer has already populated it, the current call exits as a cache hit.

Advisory locks auto-release at `COMMIT`/`ROLLBACK`, so they cannot leak.
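A sketch of the key hash, assuming the standard 64-bit FNV-1a constants; the function name `hashCacheKey` comes from the text, its internals are an illustration and may differ from `cache.service.ts`.

```typescript
const FNV_OFFSET = 0xcbf29ce484222325n; // standard 64-bit FNV-1a offset basis
const FNV_PRIME = 0x100000001b3n; //       standard 64-bit FNV-1a prime
const MASK_64 = 0xffffffffffffffffn;

export function hashCacheKey(key: {
  tenantId: string;
  platform: string;
  reportType: string;
  dateRangeKey: string;
}): bigint {
  const input = `${key.tenantId}|${key.platform}|${key.reportType}|${key.dateRangeKey}`;
  let h = FNV_OFFSET;
  for (const byte of Buffer.from(input, 'utf8')) {
    h ^= BigInt(byte);
    h = (h * FNV_PRIME) & MASK_64; // keep the running hash at 64 bits
  }
  // pg_advisory_xact_lock takes a signed bigint (int8), so reinterpret
  // the unsigned 64-bit value into the signed range.
  return BigInt.asIntN(64, h);
}
```

The result is passed as the single `bigint` argument to `pg_advisory_xact_lock`; because it is derived from the full key, two different cache keys only contend when their hashes collide.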
## TTL configuration

`ttl-config.ts` declares per-platform, per-report-type TTLs. The Google entries cover all seven Phase 2 reports; Meta and TikTok each have a single placeholder that will be filled in during Phase 3.

```ts
export const TTL_SECONDS = {
  google: {
    account_health: 3600,
    search_term_waste: 7200,
    quality_score: 7200,
    auction_insights: 14_400,
    pmax_breakdown: 7200,
    budget_optimizer: 3600,
    weekly_anomaly: 7200,
  },
  meta: { account_health: 3600 },
  tiktok: { account_health: 7200 },
} as const;
```

`ttlFor(platform, reportType)` throws on an unknown combination — a cheap fail-fast against typos.
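A minimal sketch of a fail-fast lookup like the `ttlFor` described above (table entries abridged here); the real implementation lives in `ttl-config.ts` and may differ.

```typescript
// Abridged copy of the TTL table for a self-contained example.
const TTL_SECONDS: Record<string, Record<string, number>> = {
  google: { account_health: 3600, search_term_waste: 7200 },
  meta: { account_health: 3600 },
  tiktok: { account_health: 7200 },
};

export function ttlFor(platform: string, reportType: string): number {
  const ttl = TTL_SECONDS[platform]?.[reportType];
  if (ttl === undefined) {
    // Throwing on an unknown combination surfaces typos at call time
    // instead of silently caching with a bogus TTL.
    throw new Error(`No TTL configured for ${platform}/${reportType}`);
  }
  return ttl;
}
```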
## Metrics

`metrics.ts` keeps a process-local `Map<"platform/reportType", { hit, miss }>`, populated by `recordCacheEvent` on every hit or miss. `snapshotCacheMetrics()` derives `hitRate = hit / (hit + miss)` per key.

Phase 2 PR-7 exposes this through `GET /admin/metrics/cache`. Phase 5 replaces the Map with a Prometheus / OTel exporter — until then, the counters are local to the process and reset on restart.
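The counter map can be sketched as follows. The function names match the text above; the field layout and the absence of divide-by-zero handling are assumptions about `metrics.ts`, not its actual code.

```typescript
type Counts = { hit: number; miss: number };

// Process-local: resets on restart, not shared across instances.
const counters = new Map<string, Counts>();

export function recordCacheEvent(
  platform: string,
  reportType: string,
  event: 'hit' | 'miss',
): void {
  const key = `${platform}/${reportType}`;
  const c = counters.get(key) ?? { hit: 0, miss: 0 };
  c[event] += 1;
  counters.set(key, c);
}

export function snapshotCacheMetrics(): Record<string, Counts & { hitRate: number }> {
  const out: Record<string, Counts & { hitRate: number }> = {};
  for (const [key, { hit, miss }] of counters) {
    out[key] = { hit, miss, hitRate: hit / (hit + miss) };
  }
  return out;
}
```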
## Required schema

The cache upsert needs a unique index on `(tenant_id, platform, report_type, date_range_key)`. Migration `0001_modern_sharon_carter.sql` adds `metric_cache_key_idx`. If you change the cache key shape, regenerate the migration.
## Tests

`tests/integration/cache.test.ts` covers:

- Miss → hit progression within TTL (fetcher invoked once).
- Expired row triggers a re-fetch and bumps `fetched_at`.
- Thundering herd: 4 concurrent misses produce exactly 1 fetcher call; 1 miss + 3 hits.
- Hit/miss counters appear in `snapshotCacheMetrics()` with the expected ratio.
- `force: true` re-runs the fetcher and overwrites the row even when fresh.
## Cross-references

- `database.md` — `metric_cache` RLS + unique index.
- `mcp-tools.md` — the seven tools (PR-6) drive `readOrFetch`.