Phase 6 — Media + Interactive Messages
Effort: L
Goal
Send and receive image / document / audio media, and send interactive (button / list) messages and handle their replies. Media is stored on local disk with a signed-URL access path; Nginx never serves media directly.
Deliverables
Inbound media flow
- src/inngest/functions/process-message.ts: when message.type ∈ {image, document, audio, video, sticker}, emit wa/media.download.requested with { clientId, phoneNumberId, mediaId, wamid }.
- src/inngest/functions/download-media.ts:
  - GET https://graph.facebook.com/v23.0/{mediaId} → temporary signed URL (Meta URL expires in ~5 min).
  - Stream the download into MEDIA_ROOT/<phone_number_id>/<yyyy>/<mm>/<uuid>.<ext> with mode 0640. The file UUID is generated locally; never use Meta’s id as a filename.
  - Compute SHA-256 while streaming; record mime_type, size_bytes, sha256, storage_path (relative).
  - Insert media_objects row.
  - Update the messages row’s media_object_id.
  - Idempotent on (wamid, wa_media_id): re-runs are no-ops.
  - Mime sniff: validate the first N bytes against the Meta-reported mime_type (e.g. via the file-type package). On mismatch, log and store with mime_type = sniffed, and set a metadata.mime_mismatch = true flag for audit.
  - Size guards: enforce Meta’s documented inbound limits at the file-stream level; abort the download if size exceeds the type’s cap (image 5 MB, document 100 MB, audio 16 MB, video 16 MB, sticker 500 KB).
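The hash-while-streaming and size-guard steps above can be sketched as a small accumulator that the download loop feeds chunk by chunk. This is a minimal sketch, not the planned implementation: HashingSizeGuard, INBOUND_CAPS, and SizeCapExceededError are hypothetical names, and treating 1 MB as 1024 × 1024 bytes is an assumption about how the caps are counted.

```typescript
import { createHash } from "node:crypto";

// Inbound caps in bytes, copied from the size-guard bullet.
// Assumption: 1 MB = 1024 * 1024 bytes.
const MB = 1024 * 1024;
const INBOUND_CAPS = {
  image: 5 * MB,
  document: 100 * MB,
  audio: 16 * MB,
  video: 16 * MB,
  sticker: 500 * 1024,
} as const;

type InboundKind = keyof typeof INBOUND_CAPS;

class SizeCapExceededError extends Error {}

// Hypothetical helper: call update() for every chunk read from the HTTP body.
class HashingSizeGuard {
  private readonly hash = createHash("sha256");
  private readonly cap: number;
  private bytes = 0;

  constructor(kind: InboundKind) {
    this.cap = INBOUND_CAPS[kind];
  }

  // Throws as soon as the running total passes the cap, so the caller can
  // abort the download and remove the partial file.
  update(chunk: Uint8Array): void {
    this.bytes += chunk.byteLength;
    if (this.bytes > this.cap) {
      throw new SizeCapExceededError(`download exceeded ${this.cap} bytes`);
    }
    this.hash.update(chunk);
  }

  // Values destined for media_objects.size_bytes and media_objects.sha256.
  finish(): { sizeBytes: number; sha256: string } {
    return { sizeBytes: this.bytes, sha256: this.hash.digest("hex") };
  }
}
```

Checking the cap per chunk, rather than trusting a Content-Length header, means an oversized body is cut off mid-stream even when the reported size is wrong or missing.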
Outbound media flow
- src/meta/upload-media.ts: POST /{phone_number_id}/media multipart upload. Returns Meta’s media_id.
- src/meta/send-media.ts: POST /{phone_number_id}/messages with type ∈ {image, document, audio, video} and either { id: <meta_media_id> } (after upload) or { link: <url> }.
- src/inngest/functions/upload-media.ts: Phase 3 stub fleshed out. Streams from localPath to Meta, returns media_id. Used by send_media when the source is a local file.
- src/tools/send-media.ts:
  - Input zod:
      {
        to: string,
        kind: 'image' | 'document' | 'audio' | 'video',
        source: { url: string } | { localPath: string } | { mediaId: string },
        caption?: string,        // image/document/video only
        filename?: string,       // document only, max 240 chars
        phoneNumberId?: string
      }
  - Validates type-specific size + mime constraints before upload.
  - For localPath: streams via the upload-media function first; for url: passes Meta a link (Meta does the fetch).
  - Persists out row + media_objects row.
  - Requires scopes tools:send_media + media:write.
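As an illustration of how the send_media input could map onto the body posted to /{phone_number_id}/messages, here is a minimal sketch. buildMediaMessage and the ResolvedSource type are hypothetical; the sketch assumes a localPath source has already been uploaded and replaced by its Meta media id, and it uses plain TypeScript checks rather than the zod schema itself.

```typescript
type OutboundKind = "image" | "document" | "audio" | "video";

// Assumption: localPath sources were already uploaded and resolved to a mediaId.
type ResolvedSource = { mediaId: string } | { url: string };

interface SendMediaInput {
  to: string;
  kind: OutboundKind;
  source: ResolvedSource;
  caption?: string; // image/document/video only
  filename?: string; // document only, max 240 chars
}

// Hypothetical helper: builds the JSON body for POST /{phone_number_id}/messages.
function buildMediaMessage(input: SendMediaInput): Record<string, unknown> {
  const { to, kind, source, caption, filename } = input;

  if (caption !== undefined && kind === "audio") {
    throw new Error("caption is only allowed for image, document, and video");
  }
  if (filename !== undefined && kind !== "document") {
    throw new Error("filename is only allowed for document");
  }
  if (filename !== undefined && filename.length > 240) {
    throw new Error("filename exceeds 240 chars");
  }

  // Either reference an uploaded media id or let Meta fetch the link.
  const media: Record<string, string> =
    "mediaId" in source ? { id: source.mediaId } : { link: source.url };
  if (caption !== undefined) media.caption = caption;
  if (filename !== undefined) media.filename = filename;

  return { messaging_product: "whatsapp", to, type: kind, [kind]: media };
}
```

For example, a document sent by media id with filename "report.pdf" yields a body whose document object carries id and filename, while a url source yields a link field and triggers no upload.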
Signed-URL access
- src/media/signed-url.ts:
  - sign(mediaId) → https://wa.<yourdomain>/media/<mediaId>?exp=<unix>&sig=<hex> with sig = HMAC-SHA256(MEDIA_SIGNING_SECRET, mediaId|exp).
  - verify(url params) → boolean + structured error.
  - TTL from MEDIA_URL_TTL_SECONDS (default 300).
- src/transport/http-media.ts, serving GET /media/:id:
  - Verify HMAC signature.
  - Check exp not in the past.
  - Resolve media_objects row.
  - Tenancy check: the caller’s auth context’s clientId must match the media’s owner (look up messages.client_id via the link). Even with a valid signature, cross-tenant access is refused.
  - Stream the file from MEDIA_ROOT/<storage_path> with the right Content-Type, Content-Length, and Content-Disposition.
- src/tools/get-media-url.ts: input { mediaId: string }. Verifies the caller’s grant on the media’s phone_number_id. Returns the signed URL.
- Nginx (Phase 7) never has a location /media block. All access flows through Express → the auth + signature pipeline.
- Signed URLs are NOT bearer-shareable. Even with a valid sig and exp, GET /media/:id still requires a Bearer API key in the Authorization header AND verifies that the caller’s clientId matches the media’s owning tenancy. A client cannot hand a signed URL to a teammate who has no key. The signature only proves “this URL was minted by us recently”; the auth header proves “this caller is allowed”. Document this in docs/api/mcp-tools.md so clients don’t expect link-shareability.
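The HMAC contract above (sig over mediaId|exp, hex-encoded, checked before expiry) might look roughly like this. It is a sketch under assumptions: the function signatures, the VerifyResult shape, and the explicit nowMs parameter are illustrative choices, and the secret and base URL would come from MEDIA_SIGNING_SECRET and deployment config rather than arguments.

```typescript
import { createHmac, timingSafeEqual } from "node:crypto";

type VerifyResult =
  | { ok: true }
  | { ok: false; reason: "malformed" | "bad_signature" | "expired" };

// sig = HMAC-SHA256(secret, mediaId|exp), hex-encoded.
function computeSig(secret: string, mediaId: string, exp: number): string {
  return createHmac("sha256", secret).update(`${mediaId}|${exp}`).digest("hex");
}

function sign(
  baseUrl: string,
  secret: string,
  mediaId: string,
  ttlSeconds: number,
  nowMs: number = Date.now(),
): string {
  const exp = Math.floor(nowMs / 1000) + ttlSeconds;
  const sig = computeSig(secret, mediaId, exp);
  return `${baseUrl}/media/${encodeURIComponent(mediaId)}?exp=${exp}&sig=${sig}`;
}

function verify(
  secret: string,
  mediaId: string,
  expParam: string,
  sigParam: string,
  nowMs: number = Date.now(),
): VerifyResult {
  // Reject anything that is not a decimal timestamp plus a 64-char hex digest.
  if (!/^\d+$/.test(expParam) || !/^[0-9a-f]{64}$/.test(sigParam)) {
    return { ok: false, reason: "malformed" };
  }
  const exp = Number(expParam);
  const expected = Buffer.from(computeSig(secret, mediaId, exp), "hex");
  const given = Buffer.from(sigParam, "hex");
  // Both buffers are 32 bytes here, so timingSafeEqual will not throw.
  if (!timingSafeEqual(expected, given)) {
    return { ok: false, reason: "bad_signature" };
  }
  if (exp < Math.floor(nowMs / 1000)) {
    return { ok: false, reason: "expired" };
  }
  return { ok: true };
}
```

Because exp is part of the signed string, changing the expiry in the query string invalidates the signature. A passing verify still says nothing about who is calling, which is why the route also runs the Bearer and tenancy checks.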
Interactive messages
- src/meta/send-interactive.ts: POSTs type: interactive with button or list shape per Meta spec.
- src/tools/send-interactive-buttons.ts:
  - Input zod: { to, header?: string, body: string, footer?: string, buttons: Array<{ id: string, title: string }> (max 3), phoneNumberId? }.
  - Validates: ≤ 3 buttons; titles ≤ 20 chars; ids ≤ 256 chars and unique within the message.
- src/tools/send-interactive-list.ts:
  - Input zod: { to, header?, body, footer?, buttonText: string, sections: Array<{ title?: string, rows: Array<{ id, title, description? }> }>, phoneNumberId? }.
  - Validates Meta’s structural limits (≤ 10 sections, ≤ 10 rows total).
- src/webhook/normalise.ts: extended to handle interactive replies (button_reply, list_reply). Stores the selected id in messages.payload.selectedId and a friendly preview in messages.body ("User picked: <title>").
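The reply normalisation described in the last bullet can be sketched as a pure function over the interactive object of an inbound webhook message. normaliseInteractiveReply and the NormalisedReply type are hypothetical names; the input union covers only the two reply kinds this phase handles.

```typescript
// The `interactive` object on an inbound reply message.
type InteractiveReply =
  | { type: "button_reply"; button_reply: { id: string; title: string } }
  | {
      type: "list_reply";
      list_reply: { id: string; title: string; description?: string };
    };

// Hypothetical output: the values for messages.payload.selectedId and messages.body.
interface NormalisedReply {
  selectedId: string;
  body: string;
}

function normaliseInteractiveReply(interactive: InteractiveReply): NormalisedReply {
  // Both reply kinds carry an id and a title; only the wrapper key differs.
  const picked =
    interactive.type === "button_reply"
      ? interactive.button_reply
      : interactive.list_reply;
  return { selectedId: picked.id, body: `User picked: ${picked.title}` };
}
```

Keeping the id in the payload and only a preview in the body lets downstream consumers branch on the stable id while humans reading the message log still see which option was chosen.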
Retention
- src/inngest/functions/prune-media.ts: daily cron. For media_objects rows older than 90 days (configurable):
  - Delete the file from disk.
  - Nullify storage_path, set metadata.pruned_at.
  - Row is kept so audit references still resolve.
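The selection and row-update half of the prune job could be expressed as two pure functions, leaving the file deletion itself to the cron handler. This is only a sketch: the MediaObjectRow shape, the use of a createdAt field to measure age, and the ISO-string format for pruned_at are assumptions, not the schema in drizzle/0005_media.sql.

```typescript
// Hypothetical in-memory view of a media_objects row.
interface MediaObjectRow {
  id: string;
  storagePath: string | null;
  createdAt: Date; // assumption: age is measured from row creation
  metadata: Record<string, unknown>;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Rows that still have a file on disk and are older than the retention window.
function selectPrunable(
  rows: MediaObjectRow[],
  now: Date,
  retentionDays: number = 90,
): MediaObjectRow[] {
  const cutoff = now.getTime() - retentionDays * DAY_MS;
  return rows.filter(
    (row) => row.storagePath !== null && row.createdAt.getTime() < cutoff,
  );
}

// The row is kept; only the path is nulled and the prune time recorded.
function markPruned(row: MediaObjectRow, now: Date): MediaObjectRow {
  return {
    ...row,
    storagePath: null,
    metadata: { ...row.metadata, pruned_at: now.toISOString() },
  };
}
```

Filtering on a non-null storagePath makes a second cron run skip rows that were already pruned.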
Docs (extended)
- docs/architecture/media.md: full design covering inbound download flow, outbound upload, signed-URL contract, on-disk layout, retention, mime sniffing, and why Nginx never serves media.
- docs/api/mcp-tools.md: extended with send_media, send_interactive_buttons, send_interactive_list, get_media_url, each with full input schemas and examples.
- docs/api/webhook-payloads.md: extended with media and interactive payload shapes.
Critical files
- src/meta/{upload-media,download-media,send-media,send-interactive}.ts
- src/inngest/functions/{download-media,upload-media,prune-media}.ts
- src/media/{storage,signed-url}.ts
- src/transport/http-media.ts
- src/tools/{send-media,send-interactive-buttons,send-interactive-list,get-media-url}.ts
- src/webhook/normalise.ts — extended
- drizzle/0005_media.sql
Tests
Unit
- tests/unit/media/signed-url.test.ts:
  - sign + verify round-trip.
  - Expired URL fails.
  - Tampered signature fails (constant-time).
  - Wrong exp fails.
  - 100% coverage required.
- tests/unit/media/storage.test.ts:
  - Path construction MEDIA_ROOT/<phone>/<yyyy>/<mm>/<uuid>.<ext> is correct.
  - Path traversal: a malicious storage_path like ../../etc/passwd is rejected.
- tests/unit/tools/interactive-validation.test.ts:
  - Button count > 3 rejected.
  - Title > 20 chars rejected.
  - List section count > 10 rejected.
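The limits exercised by interactive-validation.test.ts could be enforced by validators along these lines. This is a dependency-free sketch rather than the zod schemas the tools will use: validateButtons and validateListSections are hypothetical names, and measuring length with String.length (UTF-16 code units) is an assumption about how Meta counts characters.

```typescript
interface ReplyButton {
  id: string;
  title: string;
}

interface ListSection {
  title?: string;
  rows: Array<{ id: string; title: string; description?: string }>;
}

// Returns a list of human-readable problems; an empty array means valid.
function validateButtons(buttons: ReplyButton[]): string[] {
  const errors: string[] = [];
  if (buttons.length > 3) errors.push("at most 3 buttons are allowed");
  const seen = new Set<string>();
  for (const button of buttons) {
    if (button.title.length > 20) errors.push(`title too long: ${button.title}`);
    if (button.id.length > 256) errors.push("id longer than 256 chars");
    if (seen.has(button.id)) errors.push(`duplicate id: ${button.id}`);
    seen.add(button.id);
  }
  return errors;
}

function validateListSections(sections: ListSection[]): string[] {
  const errors: string[] = [];
  if (sections.length > 10) errors.push("at most 10 sections are allowed");
  // The row limit applies to the whole message, not to each section.
  const totalRows = sections.reduce((sum, section) => sum + section.rows.length, 0);
  if (totalRows > 10) errors.push("at most 10 rows in total are allowed");
  return errors;
}
```

Returning all violations at once, instead of throwing on the first, gives the tool caller a complete error list in one round trip.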
Integration (testcontainers Postgres + memfs for media)
- tests/integration/media/download.test.ts:
  - Webhook inbound with image → wa/media.download.requested emitted → file lands at the expected path → row in media_objects → messages.media_object_id set.
  - Re-run with same wamid → no duplicate file, no duplicate row.
  - Meta-reported mime mismatching sniffed mime → row stored with metadata.mime_mismatch = true.
- tests/integration/media/upload.test.ts:
  - send_media with source.localPath → multipart upload to Meta (mocked), media row inserted with both Meta media_id and local path (the local copy is kept for audit).
  - send_media with source.url → Meta link mode; no upload, no local copy.
- tests/integration/media/signed-url.test.ts:
  - get_media_url returns URL; GET /media/:id?... → 200 with correct content.
  - URL after MEDIA_URL_TTL_SECONDS → 403.
  - Tampered sig → 403.
  - Cross-tenant: client B with valid signature for client A’s media → 403 (tenancy check).
- tests/integration/interactive/send-buttons.test.ts:
  - Send 3-button → Meta receives correct payload; row inserted.
  - Reply with button_reply → row inserted with payload.selectedId = the button id.
- tests/integration/interactive/send-list.test.ts:
  - Similar to buttons; list_reply handling.
- tests/integration/retention/prune-media.test.ts:
  - Seed 100d-old media → cron runs → file deleted from disk; row kept with nulled storage_path.
Coverage
- src/media/signed-url.ts at 100%.
- Phase total ≥ 80%.
Code documentation
- TSDoc with @remarks on:
  - download-media.ts (idempotency, size guards, mime sniff, file naming policy).
  - signed-url.ts (HMAC contract, TTL, why tenancy check is still required).
  - http-media.ts (auth + signature + tenancy ordering, why no Nginx direct-serve).
  - send-media.ts, send-interactive-*.ts (Meta payload constraints, validation rules).
  - normalise.ts (interactive reply normalisation rules).
- docs/architecture/media.md complete.
- docs/api/mcp-tools.md, docs/api/webhook-payloads.md extended.
- docs/reference/ regenerated.
Acceptance
- Inbound image: phone sends an image → messages row with media_object_id set; get_media_url returns a short-lived URL; curl of that URL (with the Bearer API key) fetches the image.
- Outbound media (localPath): send_media with a PDF on the host file system → recipient receives it on WhatsApp.
- Outbound media (url): send_media with a public URL → recipient receives it (Meta fetches).
- Interactive round trip: send_interactive_buttons with 3 options → button press on the phone → messages row inserted with payload.selectedId = the pressed button’s id.
- Cross-tenant block: client B with a valid signed URL constructed for client A’s media → 403 from GET /media/:id.
- Path traversal block: manually attempting GET /media/../../etc/passwd returns 403/404 with no filesystem read.
- Media retention: manually backdated media_objects row + dummy file → cron prunes file, keeps row.
- pnpm test:ci green; coverage gates met.
Notes
- Files on disk live at /var/lib/whatsapp-mcp/media/<phone_number_id>/<yyyy>/<mm>/<uuid>.<ext>. Owner whatsapp-mcp, mode 0640. Directory never web-served. storage_path in DB is always relative.
- Templates remain deferred to phase-8-future.md. send_message to a user outside the 24h window still returns OutOfSessionWindowError from Phase 2.
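For the on-disk layout in the first note, the relative-path construction and the traversal check that storage.test.ts exercises could be sketched as follows. buildStoragePath and resolveStoragePath are hypothetical names; using UTC for the year and month folders, and restricting phone_number_id to digits and the extension to short lowercase alphanumerics, are assumptions made for the sketch.

```typescript
import * as path from "node:path";
import { randomUUID } from "node:crypto";

// Builds the relative storage_path: <phone_number_id>/<yyyy>/<mm>/<uuid>.<ext>.
function buildStoragePath(
  phoneNumberId: string,
  ext: string,
  now: Date = new Date(),
): string {
  // Assumption: phone number ids are numeric strings.
  if (!/^\d+$/.test(phoneNumberId)) throw new Error("invalid phone_number_id");
  if (!/^[a-z0-9]{1,8}$/.test(ext)) throw new Error("invalid extension");
  const yyyy = String(now.getUTCFullYear());
  const mm = String(now.getUTCMonth() + 1).padStart(2, "0");
  // The UUID is generated locally; Meta's media id never reaches the filename.
  return path.posix.join(phoneNumberId, yyyy, mm, `${randomUUID()}.${ext}`);
}

// Resolves a stored relative path under MEDIA_ROOT, refusing anything that escapes it.
function resolveStoragePath(mediaRoot: string, storagePath: string): string {
  const root = path.resolve(mediaRoot);
  const full = path.resolve(root, storagePath);
  if (path.isAbsolute(storagePath) || !full.startsWith(root + path.sep)) {
    throw new Error("storage_path escapes MEDIA_ROOT");
  }
  return full;
}
```

Checking the resolved path against the root, rather than scanning the input for "..", also covers inputs that only escape after normalisation.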
Definition of Done
Inbound media
- process-message emits wa/media.download.requested for media types.
- download-media streams to MEDIA_ROOT/<phone>/<yyyy>/<mm>/<uuid>.<ext> with mode 0640.
- Mime sniff against Meta-reported type; mismatch flagged in metadata.
- Size guards per type enforced at stream level.
- Idempotent on (wamid, wa_media_id).
- media_objects row inserted; messages.media_object_id set.
Outbound media
- src/meta/upload-media.ts multipart upload returning Meta media_id.
- src/meta/send-media.ts supports id, link, and (via upload-then-send) localPath.
- send_media tool with full zod input (kind / source / caption / filename / phoneNumberId).
- Pre-upload size + mime validation.
- Scopes tools:send_media + media:write enforced.
Signed URLs + media route
- src/media/signed-url.ts sign/verify with HMAC + exp.
- GET /media/:id runs auth + signature + tenancy check (in that order).
- Path-traversal rejection (relative-path only; reject ..).
- get_media_url tool returns a short-lived URL.
- No location /media block planned in Nginx (Phase 7).
Interactive
- send_interactive_buttons (≤3 buttons; title ≤20 chars).
- send_interactive_list (≤10 sections; total ≤10 rows).
- Webhook normaliser handles button_reply + list_reply → payload.selectedId + preview body.
Retention
- prune-media cron deletes files > 90d; row kept with nulled storage_path.
Tests
- tests/unit/media/signed-url.test.ts at 100% coverage.
- tests/unit/media/storage.test.ts (path construction + traversal block).
- tests/unit/tools/interactive-validation.test.ts.
- tests/integration/media/download.test.ts (happy + replay + mime mismatch).
- tests/integration/media/upload.test.ts (localPath + url).
- tests/integration/media/signed-url.test.ts (happy + expired + tampered + cross-tenant).
- tests/integration/interactive/send-buttons.test.ts.
- tests/integration/interactive/send-list.test.ts.
- tests/integration/retention/prune-media.test.ts.
- Coverage: signed-url.ts = 100%; phase ≥ 80%.
Documentation
- docs/architecture/media.md written.
- docs/api/mcp-tools.md extended (send_media, send_interactive_*, get_media_url).
- docs/api/webhook-payloads.md extended (media + interactive).
- TSDoc @remarks on download-media, signed-url, http-media, send-media, send-interactive-*, normalise.
- docs/reference/ regenerated cleanly.
Acceptance verified
- Inbound image → messages row with media; get_media_url returns URL; curl downloads correctly.
- Outbound localPath (PDF) → recipient receives.
- Outbound url → recipient receives.
- Interactive button round-trip records payload.selectedId.
- Cross-tenant signed URL → 403.
- Path traversal attempt → 403/404 with no fs read.
- Backdated row → cron prunes file, keeps row.
Phase signoff
- Phase 6 complete. README.md status table updated to ✅.