Privacy Policy
Draft, awaiting counsel review. This is the engineering-level
source of truth for what the architecture actually does. The
published version (after counsel review and product launch) lives
at muntin.digital/ledger/privacy and supersedes anything here. Material changes get notified to all customers 30 days ahead.
Last updated: 2026-05-29 (draft) · Version: 0.2 (post-pivot) — noted that our marketing-site Plausible analytics is self-hosted (Community Edition) and cookieless.
Available in Spanish:
docs/es/privacy-policy.md. If the two versions diverge, the English version prevails until
a bilingual reviewer certifies parity. Routing details in
Muntin Digital LLC ("Muntin," "we," "us") runs Muntin Ledger. This policy describes what data we handle, what we never do with it, and how to make us prove any of it.
We are a one-person studio building tools for small restaurants. The language here is intentionally direct.
What we collect
When you sign up:
- Your email address.
- The workspace name you choose.
- Your role (owner, bookkeeper, etc.).
When you use the product:
- Vendor invoices you upload, paste, snap, or forward. Uploaded
over TLS and read once to extract your ledger; you can turn on a lock so the retained original is encrypted to a key only your recovery phrase can open (opt-in; see "How we secure your data" below).
- The structured data we extract from them (vendor names,
amounts, dates, line items). Stored as cleartext rows scoped to your workspace; this is what enables cross-device sync.
- Records of integration sync attempts (success / failure, timestamps,
destination).
- Audit log entries (what happened, when, by whom).
What we explicitly do NOT collect:
- IP addresses for marketing or behavioural profiling. Request IPs
appear in security-only audit entries that are not joined to any analytics system.
- Browser fingerprinting, session replay, or scroll heatmaps.
- Third-party trackers, ad pixels, or cross-site identifiers.
Our only analytics is Plausible, which we self-host (Community Edition, on our own Fly.io tenant — Plausible-the-company is not a sub-processor) and runs on the marketing site only; never inside the authenticated product. It is cookieless: it sets no cookies and stores no IP address or User-Agent. It counts aggregate visits by deriving a daily-rotating, then-discarded visitor hash — there is no cross-site identifier and no behavioural profile.
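To make "daily-rotating, then-discarded visitor hash" concrete, here is a minimal sketch. It assumes the general shape Plausible's public documentation describes (a hash over a daily salt, the site domain, the request IP, and the User-Agent). The function names are hypothetical, and SHA-256 is swapped in for illustration; this is not the code Plausible runs.

```python
import hashlib
import secrets


def new_daily_salt() -> bytes:
    """Hypothetical helper: a fresh random salt, generated once per day."""
    return secrets.token_bytes(16)


def visitor_hash(daily_salt: bytes, domain: str, ip: str, user_agent: str) -> str:
    """Derive a same-day visitor identifier (illustrative only).

    The raw IP and User-Agent are inputs to the hash and are not stored.
    Each part is length-prefixed so that field boundaries stay unambiguous.
    """
    h = hashlib.sha256()
    for part in (daily_salt, domain.encode(), ip.encode(), user_agent.encode()):
        h.update(len(part).to_bytes(4, "big"))
        h.update(part)
    return h.hexdigest()
```

Once the salt for a given day is deleted, that day's hashes can no longer be recomputed from an IP and User-Agent, which is why a visitor cannot be linked from one day to the next.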
How we use it
To run the product. Specifically:
- Read your invoices and turn them into structured data.
- Store the structured data in your Muntin Ledger. Your records
live in your workspace until you delete them or close your account. We are not a pipe to anyone else's system; the ledger is the destination.
- Compute insights from your records — vendor spend trends,
price-hike anomalies, duplicate-bill detection. These run as deterministic SQL queries against your records.
- Export, on your request. CSV download is built in to every
paid tier. QuickBooks and Xero connectors ship in private beta; they are optional and one-way (Muntin → QBO / Xero), never required.
- Email you about your account.
That is the entire list.
How we read your invoices (no AI in the customer-data path)
Field extraction runs on a deterministic engine we built on top of IBM's open-source docling. The engine resolves the vendor against a seeded catalog, applies a per-vendor template the operator has confirmed in earlier invoices, and falls back to a synonym dictionary + position heuristics when no template exists yet. No third party processes your invoice content. No LLM is in the customer-data path.
- Why the template path: after ~3 invoices from the same
vendor, the engine has memorised where each field lives on that vendor's layout. Subsequent invoices file silently. You confirm; we remember.
- CI enforcement: a build-time gate (scripts/no-llm-ci.sh) fails any
commit that imports an LLM SDK (anthropic, openai, instructor, cohere, together, vertexai, google.generativeai) or reaches an LLM HTTP endpoint. Architecture, not policy.
- PII scrubber: tools/pii-scrubber/ removes SSNs, US phone
numbers, emails, and ACH account numbers from Sentry error events and audit-log target references before persistence.
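The import half of a gate like the CI check above can be sketched in a few lines. This is an illustration only: the real gate is a shell script we have not reproduced here, the function name is hypothetical, and the sketch covers Python imports only, not the HTTP-endpoint check.

```python
import ast

# Deny-list mirroring the SDK names listed in this policy (assumed shape).
BANNED_ROOTS = {"anthropic", "openai", "instructor", "cohere", "together", "vertexai"}
BANNED_PREFIXES = ("google.generativeai",)


def banned_imports(source: str) -> list[str]:
    """Return the banned SDK names that a Python source string imports."""
    hits = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            names = [alias.name for alias in node.names]
        elif isinstance(node, ast.ImportFrom) and node.module:
            # Cover both "from pkg import x" and "from pkg.sub import y".
            names = [node.module] + [f"{node.module}.{a.name}" for a in node.names]
        else:
            continue
        for name in names:
            root = name.split(".")[0]
            if root in BANNED_ROOTS:
                hits.add(root)
            for prefix in BANNED_PREFIXES:
                if name == prefix or name.startswith(prefix + "."):
                    hits.add(prefix)
    return sorted(hits)
```

A CI job built on this idea would run the function over every changed file and fail the build when the returned list is non-empty.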
The v2 architecture (pre-2026-05) routed extraction through Anthropic Claude under a zero-retention contract. The v4 pivot removed that hop entirely; this Policy reflects the current architecture.
What we never do
These are commitments backed by architecture, not just policy:
- We do not train any model — ours or a vendor's — on your invoice
content.
- We do not sell, rent, or share your data with data brokers, ad
networks, or LLM training pipelines.
- We do not use third-party analytics, session replay, or marketing
pixels in the authenticated product. Our sessions table stores a per-session SHA-256 of your browser's User-Agent string ("device fingerprint" in the schema, audit-batch-2 P-F3) only so we can label which device a session belongs to in the eventual /settings/security surface; it is not behavioural fingerprinting and is not joined to any cross-site signal.
- We do not include customer content in application logs, error
reports, or support tickets. Our CI suite enforces this.
- We do not enrich your vendors with data-broker information, even
if you opt in.
- We do not share any data with law enforcement except under
compulsion of valid legal process. We disclose aggregate request counts in our annual transparency report and publish a warrant canary at docs/warrant-canary.md.
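The per-session User-Agent hash mentioned in the list above is a plain one-way digest. A minimal sketch, assuming UTF-8 encoding and hex output (neither is specified in this policy, and the function name is hypothetical):

```python
import hashlib


def device_label_hash(user_agent: str) -> str:
    """SHA-256 of the User-Agent string, used only to tell sessions apart."""
    return hashlib.sha256(user_agent.encode("utf-8")).hexdigest()
```

Two sessions from the same browser build produce the same digest, so they can share a device label. The digest takes no input beyond the User-Agent string itself.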
Sub-processors
We use six service providers to run Muntin Ledger. Each is named, the scope is described, and a Data Processing Agreement is on file. Anthropic is not in this list — the v4 architecture removed the LLM extraction call entirely. Sentry-the-company is not in this list either: we self-host the Sentry error-tracking software on our own Fly.io tenant, so no separable third party processes error metadata; PII is scrubbed by tools/pii-scrubber/ before persistence.
| Provider | What they do |
|---|---|
| AWS | Object Lock WORM mirror for audit-log metadata (active). Per-document envelope encryption is performed client-side (X25519 + AES-256-GCM); AWS KMS is NOT in that path and holds no key material for customer content. No AWS compute, no S3 for customer content. |
| Cloudflare | Edge network, Workers, D1, KV, R2, email routing |
| Fly.io | Compute for the docling + extract services (deterministic engine; no LLM); also hosts our self-hosted Sentry instance |
| Neon | Managed Postgres for extractions + extraction_templates + audit_events; row-level security enforced per org |
| Stripe | Billing and payments |
| Resend | Magic-link and account email |
Full list, contracts, and breach-notification SLAs: docs/sub-processors.md. Sub-processor changes get 30 days' advance email notice with a right to terminate.
How long we keep things
| Data | Retention |
|---|---|
| Raw invoice files (PDFs, photos) | Policy default 24h after extraction (configurable 1h–30d in Cloud; customer-controlled in self-hosted). Automatic time-based purge is performed by the retention reaper, which is being enabled — until then, explicit deletion (honored within 60s) is the guaranteed path. |
| Extracted invoice records | Until you delete them or close your account |
| Computed verdicts (anomalies, duplicates) | Until the underlying record is deleted; cleared on Mark expected |
| Audit log entries | 7 years (regulatory + recommended retention for financial records) |
| Magic-link tokens | 15 minutes; deleted on first use |
| Session cookies | 15 minutes (access token) plus 30 days (refresh token), rotated on every refresh; revoked immediately on sign-out |
| Marketing-site analytics (Plausible) | No cookies are set. Our self-hosted, cookieless Plausible records no IP/UA and no per-visitor identifier — only aggregate counts derived from a daily-rotating, then-discarded hash. Marketing site only; never in the authenticated product. |
| Backups | Inherit residency and retention of the source region |
When you delete a record, the live data is removed within 60 seconds. The audit log entry referencing that record is content-tombstoned — the row stays for chain integrity, but the target reference is replaced with a hash. See docs/threat-model.md for the full mechanism.
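Content-tombstoning can be pictured as follows. This is a sketch under assumptions: the row shape, field names, and the "sha256:" prefix are hypothetical, and the real mechanism is the one documented in docs/threat-model.md.

```python
import hashlib
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class AuditRow:
    seq: int                 # position in the chain (kept)
    action: str              # what happened (kept)
    target_ref: str          # reference to the record (replaced on delete)
    tombstoned: bool = False


def tombstone(row: AuditRow) -> AuditRow:
    """Keep the row, but swap the readable target reference for its hash."""
    digest = hashlib.sha256(row.target_ref.encode("utf-8")).hexdigest()
    return replace(row, target_ref=f"sha256:{digest}", tombstoned=True)
```

The row count and ordering are unchanged, so the chain stays verifiable, while the original reference is no longer readable from the row.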
Waitlist signups (your email, and optionally your name and restaurant) are kept only so we can reach you when early access opens; founding-member signups are tagged so we can honor founder pricing at launch. We never share them and set no tracking. Email privacy@muntin.digital to have yours deleted at any time; signups that never activate are removed within 90 days of launch.
When you close your account, all customer data is purged within 30 days. We send a cryptographic receipt confirming the purge.
Your rights
Under GDPR (if you are in the EU — see DPA for the cross-border posture) and CCPA (if you are in California), you have the right to:
- Access your data: every workspace can download a CSV at any
time (Settings → Export & integrations → Download CSV, or GET /v1/exports/csv). The CSV format is documented at docs/csv-export.md. For full structured export including the audit log, the signed-JSON path lands during MVP weeks 7–10; until then, email privacy@muntin.digital and we will produce the export within five business days.
- Correct your data: every extracted field is editable in the
review pane.
- Delete your data: see "How long we keep things" above.
- Port your data: the CSV download is your portability artifact
for invoice records; the signed JSON export covers the audit log and the full structured payload.
- Object to processing: close your account; we delete within 30
days.
We do not sell personal information as "sale" is defined in Cal. Civ. Code §1798.140(ad) (the CCPA as amended by the CPRA). The California-specific disclosures and rights are in the "For California residents (CCPA/CPRA)" section below.
For California residents (CCPA/CPRA)
This section is the California Consumer Privacy Act / California Privacy Rights Act disclosure. It supplements the rest of this Policy for residents of California; the rest of the Policy continues to apply.
Categories of personal information we collect
| Category (Cal. Civ. Code §1798.140) | What we collect |
|---|---|
| Identifiers | Your email address; the workspace name you choose; your role; request IPs (kept only in security-only audit entries, for rate-limiting and abuse mitigation; never joined to any analytics system). |
| Commercial information | Your subscription tier; payment metadata (last-4, brand, billing ZIP) returned by Stripe; invoice records you store in your ledger. |
| Internet or other electronic network activity | Audit-log entries describing operator actions inside your workspace (which user opened what record, when, from which session). A SHA-256 hash of your browser User-Agent string on each session row (so the eventual /settings/security surface can label devices). |
| Sensitive personal information | We do not knowingly collect SPI. See "Sensitive PI" below. |
We do not collect biometric information. We do not collect precise geolocation. We do not draw inferences from a profile.
Categories of sources
- Directly from the consumer: account sign-up; invoice uploads,
pastes, snaps, or email-forwards.
- Automatically from devices: request IP (security-only),
SHA-256 of User-Agent string.
- From sub-processors: Stripe returns transaction records for
the billing relationship.
Purposes of collection and use
- Providing the Service: running the deterministic extraction
pipeline (per-vendor templates resolved against the seeded catalog) and storing the resulting records in your Muntin Ledger.
- Security: rate-limiting; abuse mitigation; production-access
audit; chain-integrity verification.
- Legal compliance: financial-records retention; sub-processor
contracts; lawful-process response under the warrant-canary process at docs/warrant-canary.md.
Disclosure
We do not sell personal information and we do not share personal information for cross-context behavioral advertising (the CPRA-specific term in Cal. Civ. Code §1798.140(ah)).
We disclose to sub-processors enumerated in docs/sub-processors.md, each under a written Data Processing Agreement and each with a bounded role. The sub-processor list is machine-checked on every CI run by scripts/check-subprocessor-freshness.mjs, which enforces the 30-day-notice clause: a sub-processor cannot become active until at least 30 days have passed since the announcement-list email went out. Sub-processor additions are emailed 30 days in advance to any address subscribed to subprocessors@muntin.digital.
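The 30-day-notice rule reduces to a date comparison. A minimal sketch in Python, for illustration only: the actual check is scripts/check-subprocessor-freshness.mjs, which is not reproduced here, and the function and constant names are hypothetical.

```python
from datetime import date, timedelta

NOTICE_PERIOD = timedelta(days=30)


def may_activate(announced_on: date, today: date) -> bool:
    """True once at least 30 days have passed since the announcement email."""
    return today - announced_on >= NOTICE_PERIOD
```

For example, a sub-processor announced on 1 January could become active on 31 January at the earliest under this rule.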
We may also disclose personal information when compelled by valid legal process; we publish aggregate request counts in our annual transparency report and maintain a warrant canary at docs/warrant-canary.md.
Sensitive personal information
We do not knowingly collect sensitive personal information ("SPI" under CPRA, Cal. Civ. Code §1798.140(ae)). Invoice content can incidentally contain SPI: for example, a sole-proprietor service vendor may use a Social Security Number as a tax identifier on a W-9-derived invoice. We treat all invoice content as confidential under DPA §3 (Nature and purpose of processing). Known SPI patterns (SSNs, US phone numbers, emails, ACH-shaped account numbers) are redacted by the PII scrubber at tools/pii-scrubber/redaction.py before any error event or audit-log target reference is persisted, so SPI does not flow into Sentry or into the audit chain's target-reference column. We do not use SPI to infer characteristics of a consumer.
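Pattern-based redaction of this kind can be sketched with regular expressions. The patterns, placeholder tokens, and function name below are assumptions for illustration, not the contents of tools/pii-scrubber/redaction.py. ACH-shaped account numbers are left out because this policy does not specify their shape.

```python
import re

# Order matters: SSNs are redacted first so the phone pattern never sees them.
PATTERNS = [
    ("[SSN]", re.compile(r"\b\d{3}-\d{2}-\d{4}\b")),
    ("[EMAIL]", re.compile(r"[\w.+-]+@[\w-]+(?:\.[\w-]+)+")),
    ("[PHONE]", re.compile(
        r"(?<!\d)(?:\+?1[ .-]?)?\(?\d{3}\)?[ .-]?\d{3}[ .-]?\d{4}(?!\d)"
    )),
]


def scrub(text: str) -> str:
    """Replace each recognised pattern with a fixed placeholder token."""
    for placeholder, pattern in PATTERNS:
        text = pattern.sub(placeholder, text)
    return text
```

A scrubber like this runs on the error event or target reference before it is written, so the stored value contains only the placeholder.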
Retention
| Category | Retention |
|---|---|
| Session refresh-window state (sessions table) | 30 days from creation; revoked immediately on sign-out. |
| Audit-log entries | 7 years (financial-records retention; see DPA §8). |
| Invoice content (raw files and extracted records) | Per the org's org.retention_seconds setting; policy default 24 hours after extraction, customer-configurable up to 90 days. Automatic time-based deletion is enforced by the retention reaper (being enabled); explicit deletion is honored within 60 seconds today. |
| Stripe payment metadata | Per Stripe's retention policy (Stripe is the controller for its retained payment records under PCI-DSS). |
| Magic-link tokens | 15 minutes from issuance; deleted on first use. |
| Request IPs in security-only audit | 90 days, then purged from the audit row's metadata blob. |
The detailed retention table for invoice content and computed verdicts is in the "How long we keep things" section above; the 12-month-aligned summary here is for CCPA disclosure compliance.
Your California rights
You have the following rights with respect to personal information about you. We honour each of them without discrimination (Cal. Civ. Code §1798.125 — no different prices, no degraded service, no withheld access).
- Right to know what we collect, how, and why — this section
is that disclosure. You may also request a portable copy via Settings → Export & integrations → Download CSV (GET /v1/exports/csv), or by emailing privacy@muntin.digital.
- Right to delete. Use the in-product erasure flow at
/settings/security (which calls POST /v1/erasure), or email privacy@muntin.digital. Live records are removed within 60 seconds; audit-row target references are content-tombstoned so the hash chain survives.
- Right to correct. Every extracted field is editable in the
review pane. For account-level fields (email, workspace name) edit them at /settings/profile, or email privacy@muntin.digital.
- Right to opt out of sale or sharing. Not applicable — we do
not sell or share personal information. As required by CPRA, we still post a "Do not sell or share my personal information" link at /settings/privacy/do-not-sell-or-share that states this status explicitly.
- Right to limit use of sensitive personal information. Not
applicable — we do not knowingly collect SPI (see above).
- Right to non-discrimination. We do not offer a different
price, a different tier, or a degraded service to consumers who exercise their rights.
How to exercise your rights
- In-product: use /settings/security for deletion and
/settings/profile for correction. We verify identity through the existing authenticated session plus a confirmation email to the address on file.
- By email: privacy@muntin.digital. We verify identity by
matching the requester's email to the email on file for the workspace identified in the request, and by confirming any recent session metadata visible at /settings/security.
- Response time: we acknowledge within 10 business days and
substantively respond within 45 calendar days (or notify you of a 45-day extension under §1798.130(a)(2), if a request is unusually complex).
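The response-time arithmetic above can be worked through in code. A minimal sketch, assuming business days mean Monday to Friday and ignoring public holidays; the function names are hypothetical.

```python
from datetime import date, timedelta


def add_business_days(start: date, days: int) -> date:
    """Advance by the given number of weekdays (holidays are ignored here)."""
    current = start
    remaining = days
    while remaining > 0:
        current += timedelta(days=1)
        if current.weekday() < 5:  # 0-4 are Monday to Friday
            remaining -= 1
    return current


def response_deadlines(received: date) -> dict:
    """Deadlines for a request received on the given date."""
    return {
        "acknowledge_by": add_business_days(received, 10),
        "respond_by": received + timedelta(days=45),
        # Only applies if a 45-day extension notice has been sent.
        "extended_respond_by": received + timedelta(days=90),
    }
```

For a request received on Monday 1 June 2026, this gives an acknowledgement deadline of 15 June 2026 and a response deadline of 16 July 2026.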
Authorized agents
A California consumer may use an authorized agent to make a request. We accept authorized-agent requests through privacy@muntin.digital and require:
- A signed power of attorney from the consumer, or written
authorization that we may directly verify with the consumer.
- The consumer's own verification (we may reach out to the email
on file before fulfilling the request).
Updates and cross-references
For the Record of Processing Activities (the per-activity catalogue of what we process, on what legal basis, and with which sub-processors) see docs/ropa.md. For the Data Protection Impact Assessment covering the extraction pipeline see docs/dpia-invoice-extraction.md. Both documents are referenced by this section because they underpin the disclosures here; they are drafts pending counsel review.
Human-in-the-loop commitment
A draft verdict — the extracted invoice record we propose — is not the final ledger entry. You confirm it first.
Here is the path, in plain terms:
- You upload, paste, snap, or forward an invoice.
- The deterministic engine reads it. It applies the per-vendor
template you have already confirmed, or, on the first few invoices from a new vendor, falls back to a synonym dictionary and position heuristics.
- We show you a draft. You approve it, edit it, or reject it.
- Only an approved draft writes to your ledger and to the audit
chain.
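The four steps above amount to a small state machine in which only an approved draft can reach the ledger. A minimal sketch; the class and field names are hypothetical and the real schema is not shown in this policy.

```python
from dataclasses import dataclass, field
from enum import Enum


class Status(Enum):
    DRAFT = "draft"
    APPROVED = "approved"
    REJECTED = "rejected"


@dataclass
class Draft:
    vendor: str
    total_cents: int
    status: Status = Status.DRAFT


@dataclass
class Ledger:
    entries: list = field(default_factory=list)

    def post(self, draft: Draft) -> None:
        """Write to the ledger only if the operator has approved the draft."""
        if draft.status is not Status.APPROVED:
            raise PermissionError("only operator-approved drafts may be written")
        self.entries.append((draft.vendor, draft.total_cents))
```

The design point is that the write path itself refuses unapproved drafts, rather than relying on the caller to check first.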
What this means for your rights under GDPR Article 22:
- There is no automated decision-making that produces legal effects
or similarly significant effects on a data subject. Every verdict is operator-confirmed at the verdict step.
- We do not auto-post to QuickBooks Online, Xero, or any other
downstream system at private beta. You start the post; we do not.
What this means under the EU AI Act:
- Muntin Ledger is a deterministic system with mandatory human
review at the verdict step. It is not a "high-risk AI system" under Annex III. The IBM docling layout engine is open source and used only to recover page layout and reading order; it does not draw inferences about a data subject.
- We will reassess this stance if the AI Act's general-purpose-AI
provisions begin to apply to us, or if we materially change our processing.
If we ever build an auto-posting path — for example, a future "post nightly" feature that commits approved drafts without an explicit human step — we will give you at least 90 days' notice by email and update this Privacy Policy before that path is turned on. You will keep the operator-confirmed path if you want it; auto-posting will not be the only option.
The same commitment, in Controller-Processor language, is in DPA §12.
How we secure your data
The honest split (v6 architecture). Muntin Ledger handles two classes of customer data with two different security postures:
- Scanned documents (PDFs, photos) are uploaded over TLS and
read once on our infrastructure to extract your structured ledger (the extraction step is server-side by necessity; there is no on-device OCR). You can then turn on a lock (opt-in): the retained original is encrypted with a per-document key wrapped to an identity key only your 12-word recovery phrase can unlock. Our servers hold only your public key and cannot reopen a locked document. The plaintext is discarded once the locked copy is written. If you do not turn on the lock, the original is retained only as long as needed and deleted under our retention policy (see "How long we keep things"); either way, Cloudflare R2 also encrypts all objects at rest with provider-managed keys. This is enforced in code and tests; it is opt-in and not yet independently verified in production, so do not rely on it as your sole control until we publish that verification.
- Structured ledger rows (vendor names, totals, dates, line
items, GL classifications) live in cleartext on our servers, scoped to your workspace by per-org Row Level Security. This is the data that makes cross-device sync and the insights engine work. We can read your structured ledger to deliver the product. We read a scanned document once during extraction; for a document you have locked we cannot reopen the retained copy (only your recovery phrase can), and for an unlocked document our access is restricted to the extraction pipeline and the retention policy. The ledger staying readable to us is a deliberate design carve-out, not a gap: it is what makes search and sync work.
Marketing line: _You can lock your scanned originals so only your recovery phrase opens them; your searchable ledger keeps working across your devices._ We do not claim that every document is end-to-end encrypted — the lock is opt-in and the structured ledger is server-readable by design.
How your data is protected, locked or not. Turning on the lock is the strongest control; even without it, every one of these is in force now and independently verifiable:
- In transit: TLS 1.3 for every request.
- At rest: scanned documents live in cloud object storage that
encrypts all objects at rest with provider-managed keys; the structured ledger is in a managed Postgres database.
- Tenant isolation: your structured ledger is scoped to your
workspace by per-org Row Level Security plus an explicit per-request tenant predicate. One tenant cannot read another tenant's records — not even with the database password.
- Tamper-evident audit chain: every privileged access to your
data is appended to a hash chain you can verify yourself; we publish the chain head every six hours so you can detect silent edits or deletions.
- No third-party AI: extraction is a deterministic engine
(docling + our own rules). Your invoices are never sent to an external LLM and are never used to train any model.
- Hardened, ephemeral extraction: the worker that reads an
invoice runs on an isolated, single-job machine with a read-only filesystem, dropped privileges and core dumps disabled, and is destroyed after each document.
- PII kept out of operations: an automated scrubber redacts
emails, SSNs, phone and bank patterns from our logs and error reports before they are written.
- Your rights: you can export your data and request erasure at
any time (see "Your rights" and "Your data deletion rights").
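The tamper-evident audit chain in the list above can be checked by anyone who holds the entries and a published head. A minimal sketch of the idea; the entry layout, the JSON canonicalisation, and the all-zero genesis value are assumptions, and the real format is the one described in docs/threat-model.md.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed starting value for an empty chain


def entry_hash(prev_hash: str, payload: dict) -> str:
    """Hash of an entry: previous hash plus a canonical form of the payload."""
    body = json.dumps(payload, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256((prev_hash + body).encode("utf-8")).hexdigest()


def append(chain: list, payload: dict) -> None:
    prev = chain[-1]["hash"] if chain else GENESIS
    chain.append({"payload": payload, "prev": prev, "hash": entry_hash(prev, payload)})


def verify(chain: list, published_head: str) -> bool:
    """Recompute every link and compare the final hash to the published head."""
    prev = GENESIS
    for entry in chain:
        if entry["prev"] != prev or entry["hash"] != entry_hash(prev, entry["payload"]):
            return False
        prev = entry["hash"]
    return prev == published_head
```

Editing any earlier entry changes its hash and breaks every later link, and removing trailing entries makes the final hash disagree with the published head.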
The document lock (device-held keys for scanned originals) removes our infrastructure's ability to reopen a locked document after extraction. It is opt-in and enforced in code and tests; we will publish independent production verification before describing it as proven, and we do not apply it retroactively to the one-time extraction read.
Cryptographic primitives (auditable).
- TLS 1.3 in transit.
- Per-document Data Encryption Keys (DEKs): AES-256-GCM, random
per document, encrypting the stored original. The DEK is wrapped to your X25519 identity PUBLIC key via an ephemeral-X25519 + HKDF-SHA256 envelope; document_id is bound as Additional Authenticated Data so the server cannot swap one wrapped key for another within your keyspace. The server holds only your public key — it can encrypt to you but cannot decrypt.
- Identity key: a per-user X25519 keypair. The secret key is
sealed with AES-256-GCM under a User Master Key (UMK) and stored server-side only as opaque ciphertext.
- UMK derivation: Argon2id (run on your device, WebAssembly) over
a 12-word / 128-bit BIP-0039 recovery phrase plus a 16-byte per-user salt, at the OWASP / RFC 9106 profile (64 MiB memory, 4 iterations, parallelism 2). The server never sees the phrase or the UMK; both exist only in device memory. A fixed canary, sealed under the UMK, lets a new device detect a wrong phrase before touching any document.
- Recovery and rotation: the phrase IS the key root — there is no
separate fallback. You can rotate to a new phrase; only the sealed identity key is re-wrapped (your public key is unchanged, so every already-encrypted document stays valid). If you lose the phrase, your locked scanned documents are unrecoverable; the server cannot help. Your structured ledger is unaffected and can be re-granted from the workspace owner.
- Audit chain: every access lands on a tamper-evident hash chain.
We publish a chain head every six hours so anyone can verify that nothing was rewritten.
- Implementation + wire pin: the crypto core
(packages/recovery-crypto, WASM Argon2id + @noble X25519/HKDF + WebCrypto AES-GCM) is exercised by known-answer tests, and the exact server↔client byte format is regression-pinned by apps/api/tests/c2-encrypt-encoding.test.ts so the encrypt and decrypt sides cannot silently drift. The native iPhone and desktop apps run the same web client today; native crypto implementations are later work and will be held to the same pinned vectors before they ship.
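Of the primitives listed above, HKDF-SHA256 is the one that can be shown with the standard library alone. The sketch below follows RFC 5869 (extract, then expand). It illustrates only the key-derivation step: the X25519 exchange and the AES-256-GCM wrap are not shown, the input bytes stand in for a real shared secret, and the salt and info labels are hypothetical rather than the values in packages/recovery-crypto.

```python
import hashlib
import hmac

HASH_LEN = 32  # SHA-256 output size in bytes


def hkdf_sha256(ikm: bytes, salt: bytes, info: bytes, length: int = 32) -> bytes:
    """RFC 5869 HKDF with SHA-256: extract a PRK, then expand to `length` bytes."""
    if length > 255 * HASH_LEN:
        raise ValueError("requested output is too long for HKDF-SHA256")
    # Extract: an empty salt defaults to a string of zero bytes, per the RFC.
    prk = hmac.new(salt or b"\x00" * HASH_LEN, ikm, hashlib.sha256).digest()
    # Expand: chain HMAC blocks, each bound to the info label and a counter.
    okm, block, counter = b"", b"", 1
    while len(okm) < length:
        block = hmac.new(prk, block + info + bytes([counter]), hashlib.sha256).digest()
        okm += block
        counter += 1
    return okm[:length]


# Stand-in for an X25519 shared secret (assumption: 32 bytes).
shared_secret = bytes(range(32))
wrap_key = hkdf_sha256(shared_secret, salt=b"", info=b"example-dek-wrap-v1")
```

Using a distinct info label per purpose means the same shared secret yields unrelated keys for unrelated uses.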
Operational controls. Production access requires a named-customer ticket, two-person approval, and time-boxed credentials. Every access is logged into your own audit log within 24 hours. The docling extraction worker runs on ephemeral per-job Fly Machines: read-only root filesystem, dropped Linux capabilities, non-root runtime user, core dumps disabled at both RLIMIT_CORE and container ulimit -c levels. Each Machine terminates after its single document.
See docs/threat-model.md for the full threat model and the defenses for each top-five threat.
Your data deletion rights
You can delete your data in two ways.
Workspace-level deletion. If you own the workspace, you can tear down the whole workspace — including every other user's data inside it — by typing the confirmation phrase "delete my workspace" at the /v1/erasure endpoint. Every user's sessions are revoked, the audit chain is tombstoned, and the retention reaper removes the R2 objects + Postgres rows within 24 hours. Re-signing in with the same email creates a fresh workspace.
User-level deletion. Any user can delete their own personal data without affecting other users in the same workspace. Type the confirmation phrase "delete my data" at the /v1/erasure/user endpoint. We:
- revoke every session for your account (across every workspace
you belong to);
- revoke every passkey you registered;
- delete your sealed identity key and its Argon2id salt, so the
recovery phrase can no longer derive the key that unlocks your documents;
- delete every per-document wrapped key you owned — your locked
PDFs and photos become permanently unreadable because the key needed to decrypt them is gone;
- revoke every workspace membership you had with reason
self_leave;
- anonymise your email address in our users table (replaced
with a deterministic placeholder) and stamp the row as tombstoned.
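The confirmation-phrase check and the email anonymisation step can be sketched as follows. The exact-match comparison, the placeholder format, and the row shape are assumptions for illustration; this policy states only that the placeholder is deterministic.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional

# Phrases quoted in this section, keyed by a hypothetical scope name.
CONFIRMATION_PHRASES = {"workspace": "delete my workspace", "user": "delete my data"}


def is_confirmed(scope: str, typed: str) -> bool:
    """Assumed rule: the typed phrase must match exactly."""
    return typed == CONFIRMATION_PHRASES[scope]


@dataclass(frozen=True)
class UserRow:
    user_id: str
    email: str
    tombstoned_at: Optional[datetime] = None


def tombstone_user(row: UserRow, now: datetime) -> UserRow:
    """Replace the email with a placeholder derived only from the user id."""
    placeholder = f"deleted+{row.user_id}@tombstone.invalid"
    return replace(row, email=placeholder, tombstoned_at=now)
```

Because the placeholder depends only on the pseudonymous identifier, running the step twice gives the same result and the original address cannot be read back from the row.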
Your audit-chain entries stay in place. The chain is tamper-evident; we keep its hash continuity intact. After the tombstone, your audit-chain entries reference a pseudonymous user identifier (a string like usr_xxx) that no longer maps to any personal data: your name, your email, and the documents you handled are all gone. This is the GDPR Art. 4(5) pseudonymisation treatment, valid where the structural integrity of a tamper-evident log requires it.
The 24-hour retention reaper deletes the R2 objects + cleartext Postgres rows you used to own after either deletion path.
Contact privacy@muntin.digital if you cannot exercise either right (for example, you have lost the device that holds your session). The manual recovery path is in runbooks/account-deletion.md.
How to contact us
- Privacy questions: privacy@muntin.digital
- Security reports: see docs/security.txt
- Anything else: hello@muntin.digital
We commit to a first response within three business days.
Changes to this policy
Material changes get 30 days' advance email notice. The current version is always posted at the URL above. A change log will live at docs/privacy-policy-changelog.md once the policy is published (post counsel review).