R2 lifecycle policy
Use this when: provisioning a new R2 bucket for muntin-api, or
after rotating R2 API credentials, or after editing
infra/cloudflare/r2-lifecycle.json.
| Last applied | Elapsed | Next review |
|---|---|---|
| Not yet executed | n/a | First Friday of each calendar quarter |
What this runbook covers
R2 bucket-level configuration that the Worker cannot apply itself because R2 doesn't expose lifecycle controls through the binding. The two pieces of bucket policy we care about:
- AbortIncompleteMultipartUpload at 1 day. Cloudflare's
S3-compatible API rejects sub-day minima, so 1 day is the floor. Combined with the per-row retention reaper at apps/api/src/scheduled/retention-reaper.ts (which runs every hour against the committed-document table) this closes the multipart-abandonment privacy gap.
- _(Future)_ Object-level expiry via lifecycle would be cheaper than
the per-row reaper, but R2 lifecycle expiry is bucket-wide and we want per-document retention_until from the documents table. Reaper wins on flexibility; bucket lifecycle wins on cost.
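For orientation, a rule of the kind described in the first bullet could be written as follows in the S3 lifecycle schema that the AWS CLI accepts. This is a sketch, not the contents of infra/cloudflare/r2-lifecycle.json: the rule ID and the scratch-file path are made up for illustration.

```sh
# Write an illustrative lifecycle document to a scratch file.
# Assumptions: the rule ID and the output path are hypothetical; the real
# policy lives in infra/cloudflare/r2-lifecycle.json and may differ.
example_policy="${TMPDIR:-/tmp}/r2-lifecycle-example.json"

cat > "$example_policy" <<'JSON'
{
  "Rules": [
    {
      "ID": "abort-incomplete-multipart-1d",
      "Status": "Enabled",
      "Filter": { "Prefix": "" },
      "AbortIncompleteMultipartUpload": { "DaysAfterInitiation": 1 }
    }
  ]
}
JSON

echo "wrote $example_policy"
```

The empty prefix makes the rule apply to the whole bucket, which matches the bucket-wide behaviour described above.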
Procedure
Prereqs:
- AWS CLI installed (the R2 S3-compatible endpoint accepts the AWS
CLI shape).
- R2 API token with the R2 Edit permission. Generate at
Cloudflare Dashboard -> R2 -> Manage R2 API Tokens. Export as R2_ACCESS_KEY_ID and R2_SECRET_ACCESS_KEY in your shell.
- The Cloudflare account ID exported as R2_ACCOUNT_ID.
- The bucket already created via `wrangler r2 bucket create muntin-documents-iad`.
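A small preflight check can catch a missing variable before the Apply step fails halfway. The sketch below assumes only the variable names listed in the prerequisites; the helper name `require_env` is hypothetical.

```sh
# Preflight: report every prerequisite variable that is unset or empty.
# The helper name require_env is hypothetical; the variable names are the
# ones listed in the prerequisites.
require_env() {
  missing=0
  for name in "$@"; do
    # Indirect variable lookup that works in POSIX sh.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      echo "missing: $name" >&2
      missing=1
    fi
  done
  return "$missing"
}

# Usage (run before the Apply step):
#   require_env R2_ACCOUNT_ID R2_ACCESS_KEY_ID R2_SECRET_ACCESS_KEY \
#     && command -v aws >/dev/null \
#     && echo "prereqs look OK"
```

The function reports all missing names rather than stopping at the first one, so a fresh shell only needs one pass to fix.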
Apply:
```sh
export R2_ACCOUNT_ID=...
export AWS_ACCESS_KEY_ID="$R2_ACCESS_KEY_ID"
export AWS_SECRET_ACCESS_KEY="$R2_SECRET_ACCESS_KEY"
bash scripts/configure-r2-lifecycle.sh
```
Verify:
```sh
aws s3api get-bucket-lifecycle-configuration \
  --bucket muntin-documents-iad \
  --endpoint-url "https://${R2_ACCOUNT_ID}.r2.cloudflarestorage.com"
```
The response should match infra/cloudflare/r2-lifecycle.json modulo the _comment field (R2 strips unknown fields).
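For a quick pass/fail signal, the same call can be wrapped in a coarse check that looks only for the 1-day multipart-abort rule. This is a sketch and does not replace comparing the full response against infra/cloudflare/r2-lifecycle.json; the function name is hypothetical, and it assumes the response uses the standard S3 lifecycle field names.

```sh
# Coarse post-apply check: succeed only if the live policy carries an
# AbortIncompleteMultipartUpload rule with DaysAfterInitiation equal to 1.
# The function name is hypothetical.
lifecycle_has_one_day_abort() {
  aws s3api get-bucket-lifecycle-configuration \
    --bucket muntin-documents-iad \
    --endpoint-url "https://${R2_ACCOUNT_ID}.r2.cloudflarestorage.com" \
    --output json \
    | tr -d ' \n\t' \
    | grep -q '"AbortIncompleteMultipartUpload":{"DaysAfterInitiation":1}'
}

# Usage:
#   lifecycle_has_one_day_abort && echo "1-day abort rule present"
```

Whitespace is stripped before matching so the check does not depend on how the CLI pretty-prints the JSON.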
Enablement (turning the reaper ON safely)
The retention reaper is OFF by default. The published retention policy ("documents deleted ~24h after extraction") is NOT enforced until an operator arms it. Arming it blindly would start deleting customer R2 objects on the next hourly tick, so enable it in two stages — verify read-only first, then arm destruction.
Prerequisites: DATABASE_URL set (NeonDocumentsStore active), migrations 0001..0016 applied, wrangler tail access.
Stage 1 — dry-run verification (deletes NOTHING):
```sh
cd apps/api
wrangler secret put RETENTION_REAPER_DRYRUN   # value: true
# leave RETENTION_REAPER_ENABLED unset / not "true"
```
Wait for the hourly tick (or trigger it: `curl "https://api.muntin.digital/__scheduled?cron=0+*+*+*+*"`). In wrangler tail, confirm a retention_reaper.tick line whose result has "dry_run": true, "scanned": N, "would_purge": N, "purged": 0, "boundary_violations": 0, plus one retention_reaper.dry_run line per expired document. This proves that the live-Neon documents path enumerates expired rows correctly and that the tenant-prefix invariant holds, with zero deletions. If boundary_violations > 0, STOP and investigate before arming: a nonzero count means a row's r2_key escaped its org_<id>/ prefix.
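The Stage 1 pass criteria can be turned into a mechanical gate so that arming does not depend on reading the log line by eye. The sketch below assumes the tick line carries the JSON fields quoted above; the function name and the way the line is captured are illustrative.

```sh
# Gate for Stage 2: succeed only if a retention_reaper.tick log line shows a
# clean dry run (dry_run true, nothing purged, no boundary violations).
# The function name is hypothetical.
tick_is_safe_to_arm() {
  # Strip spaces and tabs so matching does not depend on JSON formatting.
  line=$(printf '%s' "$1" | tr -d ' \t')
  printf '%s' "$line" | grep -q '"dry_run":true' || return 1
  printf '%s' "$line" | grep -Eq '"purged":0([,}]|$)' || return 1
  printf '%s' "$line" | grep -Eq '"boundary_violations":0([,}]|$)' || return 1
}

# Usage: paste the tick line from wrangler tail into a variable, then
#   tick_is_safe_to_arm "$tick_line" && echo "OK to arm" || echo "do NOT arm"
```

Each field is checked separately so a failure in any one of them blocks arming, which mirrors the STOP rule for boundary violations.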
Stage 2 — arm destruction:
```sh
wrangler secret put RETENTION_REAPER_ENABLED   # value: true
wrangler secret delete RETENTION_REAPER_DRYRUN
```
Next tick: retention_reaper.tick with "dry_run": false, "purged": N, and a document.r2_purged audit event per deleted document (the customer sees the deletion in their own audit log). Roll back instantly by deleting RETENTION_REAPER_ENABLED.
Once Stage 2 is verified live, the privacy-policy / /promises retention copy (currently "being enabled") may be restored to the enforced-deletion wording.
When the reaper is enough
The per-document reaper deletes objects when their row's retention_until elapses, regardless of bucket policy. Skipping the bucket-lifecycle step therefore does NOT leave committed customer documents resident indefinitely. What it does leave is _abandoned multipart parts_: they have no committed row, so the reaper never sees them, and under the bucket default they stay resident forever. This is the gap that finding B-priv-7 names; apply the policy.
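To see whether any such parts exist right now, the in-progress multipart uploads can be listed through the same S3-compatible endpoint. This is a sketch that assumes the prerequisites from the Procedure section are exported; the function name is hypothetical.

```sh
# List multipart uploads that were started but never completed or aborted.
# These are the parts the reaper cannot see. The function name is hypothetical.
list_open_multipart_uploads() {
  aws s3api list-multipart-uploads \
    --bucket muntin-documents-iad \
    --endpoint-url "https://${R2_ACCOUNT_ID}.r2.cloudflarestorage.com" \
    --query 'Uploads[].[Initiated,Key,UploadId]' \
    --output text
}

# Usage:
#   list_open_multipart_uploads
```

Entries whose initiation time is well in the past are candidates for the abandoned parts described above.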
After every drill or apply
Update the "Last applied" cell in the table at the top of this file with the current date and the operator's initials.