Phase 8 — Future (sketches only)

Effort: deferred — explicitly out of scope for v1. Documented so v1 design doesn’t preclude any of it.

1. Approved templates

Goal: Send WhatsApp template messages so we can re-engage users outside the 24h customer service window.

Sketch

  • New tool send_template with input { to, templateName, languageCode, components?: TemplateComponent[], phoneNumberId? }.
  • New table templates: id, phone_number_id, name, language_code, category (marketing/utility/auth), status (pending/approved/rejected), body (jsonb), synced_at. Inngest cron cron/templates.sync polls Meta’s GET /{waba_id}/message_templates daily and upserts.
  • send_message extended: when the 24h window is closed, return a typed error pointing at available templates ({ availableTemplates: [{ name, languageCode }] }). Do not auto-pick a template — a template choice is a business decision.
  • Scopes: tools:send_template. Daily cap counts templates toward the same per-client / per-number budget.
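
A minimal TypeScript sketch of the shapes above — the tool input and the typed window-closed error. The interface and error-code names are assumptions for illustration; only the field names come from the sketch:

```typescript
// Hypothetical input shape for the send_template tool (field names from the sketch).
interface TemplateComponent {
  type: "header" | "body" | "button";
  parameters: Array<{ type: "text"; text: string }>;
}

interface SendTemplateInput {
  to: string;
  templateName: string;
  languageCode: string;        // e.g. "en_GB"
  components?: TemplateComponent[];
  phoneNumberId?: string;      // defaults to the client's primary number
}

// Typed error returned by send_message when the 24h window is closed.
// It lists the approved templates but never picks one — that stays a
// business decision for the caller.
interface WindowClosedError {
  code: "customer_window_closed"; // error code is an assumption
  availableTemplates: Array<{ name: string; languageCode: string }>;
}

function isWindowClosed(e: { code?: string }): e is WindowClosedError {
  return e.code === "customer_window_closed";
}
```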

Risks

  • Meta’s template approval is async and may take days; the cron must be resilient to long pending periods.
  • Template syntax ({{1}} parameters) is fiddly — full validation in the tool helps the LLM construct correct payloads.
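
The positional-parameter check could start as small as this — a sketch assuming Meta's positional `{{1}}`..`{{n}}` syntax (not named parameters), where placeholders must be numbered contiguously from 1:

```typescript
// Count the positional {{n}} placeholders in a template body and reject
// bodies that skip an index (Meta numbers them 1..n with no gaps).
function placeholderCount(body: string): number {
  const indices = Array.from(body.matchAll(/\{\{(\d+)\}\}/g)).map((m) => Number(m[1]));
  const max = indices.length ? Math.max(...indices) : 0;
  const expected = new Set(Array.from({ length: max }, (_, i) => i + 1));
  for (const i of indices) expected.delete(i);
  if (expected.size > 0) {
    throw new Error(`template skips placeholder(s): ${[...expected].join(", ")}`);
  }
  return max; // how many parameters the caller must supply
}
```

The tool would compare this count against the supplied component parameters before hitting Meta, so the LLM gets an immediate, specific error instead of an opaque API rejection.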

2. Multi-number / multi-WABA operations

Goal: Onboard additional phone numbers with their own access tokens; allow one client to use multiple numbers and one number to serve multiple clients.

Sketch — the schema already supports multiple numbers, so this is mostly operational:

  • Per-number token rotation procedure documented; tokens stored individually under phone_numbers.access_token_secret_ref.
  • New CLI: admin numbers add --waba <id> --phone-id <id> --display-number "+44..." --token-ref secrets://wa_access_token_<id>.
  • New CLI: admin numbers rotate-token --phone-id <id> — generates a new secret file under /run/secrets/ and restarts the app so it reloads the token.
  • Per-number Meta webhook verify tokens (phone_numbers.webhook_verify_token_ref) so leaking one doesn’t compromise others.
  • Per-number monthly conversation accounting against Meta’s pricing tiers — new table usage_monthly with rollups for billing visibility.
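
The usage_monthly rollup could be driven by an aggregation like this sketch — it buckets conversation-opened events by (phone number, month, category), mirroring Meta's per-category conversation billing. Type and field names are assumptions:

```typescript
// Meta bills per conversation, by category. Aggregate conversation-opened
// events into (phoneNumberId, month, category) buckets — the same grain as
// the proposed usage_monthly table.
type ConversationCategory = "marketing" | "utility" | "authentication" | "service";

interface ConversationOpened {
  phoneNumberId: string;
  category: ConversationCategory;
  openedAt: Date;
}

function rollupMonthly(events: ConversationOpened[]): Map<string, number> {
  const buckets = new Map<string, number>();
  for (const e of events) {
    const month = e.openedAt.toISOString().slice(0, 7); // "YYYY-MM" (UTC)
    const key = `${e.phoneNumberId}:${month}:${e.category}`;
    buckets.set(key, (buckets.get(key) ?? 0) + 1);
  }
  return buckets;
}
```

In practice this would be a GROUP BY over the conversations table inside an Inngest cron, but the bucketing logic is the same.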

3. Client self-service portal

Goal: A small admin UI for the owner (and later, scoped read for clients).

Sketch

  • Next.js admin at https://admin.wa.<yourdomain>.
  • Owner-only initially: CRUD on clients, keys, grants; per-client message stats; audit log viewer.
  • Later: scoped read for clients (their own keys, their own messages, their own audit subset).
  • Auth via the same wamcp_live_... keys with admin:* scope, owner-only.

4. Outbound retries with per-error-code policy

Goal: Tier the retry/backoff per Meta error class.

Sketch

  • A lookup table in meta_error_classification.ts mapping Meta error codes to one of: retry_5xx_only (default), retry_with_backoff, drop (e.g. 131047 re-engagement — caller’s job), fail_fast (e.g. 132xxx invalid recipient).
  • Inngest function policy reads from this table; centralised so we can adjust without code changes.
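
The lookup could look like this sketch — the two codes shown are the examples from the text; everything else here (helper name, default handling, treating the whole 132xxx range as fail_fast) is an assumption:

```typescript
// Per-error-code retry policy. Centralised so tuning it never touches the
// Inngest function bodies.
type RetryPolicy = "retry_5xx_only" | "retry_with_backoff" | "drop" | "fail_fast";

const POLICY_BY_CODE: ReadonlyMap<number, RetryPolicy> = new Map<number, RetryPolicy>([
  [131047, "drop"], // re-engagement required — resolving it is the caller's job, not a retry
]);

function policyFor(code: number): RetryPolicy {
  const explicit = POLICY_BY_CODE.get(code);
  if (explicit) return explicit;
  if (code >= 132000 && code < 133000) return "fail_fast"; // 132xxx invalid recipient
  return "retry_5xx_only"; // conservative default for unknown codes
}
```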

5. Multi-instance scale-out

Goal: Run more than one app instance behind a load balancer.

Sketch — what changes

  • In-process Map<clientId, Set<McpSession>> → Postgres LISTEN/NOTIFY (or Redis pub/sub). When mcp/client.notify fires, the function publishes to a channel; each app instance listens and pushes to its local sessions.
  • rate_limit_buckets already in Postgres — works as-is, though Redis would be faster at higher RPS.
  • mcp_sessions becomes a real table for session resumption across instances.
  • Sticky-less load balancing: any instance can serve any request, so the load balancer needs no session affinity.
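
The fan-out core could be transport-agnostic, as in this sketch: each instance keeps its local session map and subscribes to a shared channel; the subscription handler delivers only to sessions that instance owns. The class and payload shape are assumptions — the pub/sub transport (Postgres LISTEN/NOTIFY or Redis) plugs in around it:

```typescript
// Per-instance session registry plus the message handler that the shared
// pub/sub subscription invokes on every instance.
interface McpSession {
  clientId: string;
  push(event: unknown): void;
}

class NotifyFanout {
  private local = new Map<string, Set<McpSession>>();

  register(s: McpSession): void {
    let set = this.local.get(s.clientId);
    if (!set) {
      set = new Set();
      this.local.set(s.clientId, set);
    }
    set.add(s);
  }

  // Called with the raw channel payload; delivers to local sessions only.
  onMessage(raw: string): number {
    const { clientId, event } = JSON.parse(raw) as { clientId: string; event: unknown };
    const sessions = this.local.get(clientId) ?? new Set<McpSession>();
    for (const s of sessions) s.push(event);
    return sessions.size; // how many local sessions received it
  }
}
```

With Postgres this would be wired to LISTEN/NOTIFY (mcp/client.notify does the NOTIFY; each instance's LISTEN callback calls onMessage); note NOTIFY payloads are capped at ~8 KB, so large events would need to carry a row id instead of the full body.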

6. OpenTelemetry

Goal: Distributed traces tying together MCP tool calls, Inngest function runs, Meta API calls, and DB queries.

Sketch

  • Wire @opentelemetry/api + auto-instrumentation for Express, fetch, postgres.
  • Export to Grafana Cloud Tempo (or similar).
  • Tie span IDs into audit_log.metadata.trace_id so an investigator can pivot from an audit row to the full trace.
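
A wiring sketch for the above — the package names are real OpenTelemetry packages, but the service name and the audit-row helper are assumptions; treat this as configuration, not a finished implementation:

```typescript
// Bootstrap sketch: start the Node SDK with auto-instrumentation (covers
// Express, fetch/http, and pg among others). Export destination is set via
// OTEL_EXPORTER_OTLP_ENDPOINT (pointed at Tempo or similar).
import { NodeSDK } from "@opentelemetry/sdk-node";
import { getNodeAutoInstrumentations } from "@opentelemetry/auto-instrumentations-node";
import { trace } from "@opentelemetry/api";

const sdk = new NodeSDK({
  serviceName: "wa-mcp", // assumed service name
  instrumentations: [getNodeAutoInstrumentations()],
});
sdk.start();

// When writing an audit row, capture the active trace id so an investigator
// can pivot from audit_log.metadata.trace_id to the full trace.
function currentTraceId(): string | undefined {
  return trace.getActiveSpan()?.spanContext().traceId;
}
```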

7. (Maybe) Inngest self-hosted

Goal: Run Inngest itself on-prem for data residency / no external dependency.

Sketch

  • Inngest binary as a Docker service in the same compose file.
  • Events no longer leave the Ubuntu host.
  • Trade-off: more ops burden (Inngest needs its own Postgres, requires monitoring), but stronger privacy stance for sensitive clients.
  • Worth doing if and only if a specific client mandates it.

None of the items above is started in v1. v1 is Phases 0–7 only.


Definition of Done

This phase has no v1 DoD — it is explicitly deferred. Each item below becomes its own phase / sub-plan when prioritised.

Triggers for promoting an item out of Phase 8

  • Templates — first time a use case needs to message a user outside the 24h window.
  • Multi-number ops — second phone number is added to the WABA.
  • Self-service portal — when manual admin CLI work for the owner exceeds ~30 min/week.
  • Per-error-code retry policy — when retry tuning becomes a recurring incident-runbook entry.
  • Multi-instance scale-out — when single-host throughput or availability becomes the binding constraint.
  • OpenTelemetry — when a single audit-log query no longer answers “what happened in this request?”.
  • Self-hosted Inngest — when a specific client mandates no events leave on-prem infra.

When any trigger above fires, open a new phase file (phase-9-<topic>.md, phase-10-..., etc.) with its own DoD, and update the status table in README.md.