Skip to content

Zero Data Retention (ZDR) for LLM Providers

This is a reference note based on the article "Zero Data Retention (ZDR) for LLM Providers" by Abu Bakar Siddik.

Key Takeaways

  • Zero Data Retention (ZDR) is a bundle of technical controls and contract terms ensuring customer content (prompts, outputs, files) is not stored at rest by the vendor. It is critical for enterprise adoption of LLMs in regulated sectors (Healthcare, Finance, Government) to move from experimental scripts to production systems.
  • Threat Model: ZDR mitigates risks like training data leakage, abuse monitoring retention, employee access, subpoena/legal discovery, and breaches at the provider. However, it does not mitigate risks from your own logging or prompt injection exfiltration; those require proxy-based redaction and sandboxing.
  • Approaches:
  • Self-hosted (air-gapped): Strongest privacy, open-weight only, high ops cost.
  • Self-hosted (VPC): Very strong privacy, open-weight only, medium ops cost.
  • Cloud ZDR + Private Link: Strong (contractual), frontier models, low/medium setup.
  • SaaS ZDR API: Good (contractual), frontier models, low setup.
  • Gateway with ZDR routing: Good (delegated), multi-provider, low setup.
  • Provider Policies:
  • OpenAI: ZDR/MAM requires enterprise sales approval. store parameter is always treated as false when ZDR is on.
  • Anthropic: ZDR Arrangement via enterprise contract. 7 days default retention (not for training).
  • AWS Bedrock & Fireworks AI: ZDR by default. No prompts/completions logged without explicit opt-in.
  • Google Vertex AI: Abuse monitoring exception via support/invoiced billing.
  • OpenRouter: ZDR provider routing can be enforced per-request via provider.data_collection: "deny".
  • Compliance: For HIPAA, a BAA is required (available from Azure, AWS, Google, Anthropic Enterprise, Fireworks, Together).
  • Verification: ZDR audits require four pillars of evidence:
  • Configuration Artifacts (e.g., Azure ContentLogging=false)
  • Negative Tests (e.g., attempting to retrieve a completion should fail)
  • Environment Audit (checking proxy/logging config for PII)
  • Contractual Proof (BAA, DPA, SOC 2)

Reference Context

  • Article outlines architectural blueprints for Cloud ZDR + Private Network, Self-Hosted Production Stack, and Gateway-Based Multi-Provider ZDR.
  • Data protection beyond ZDR emphasizes early PII redaction (using Presidio, LLM Guard, AWS Bedrock Guardrails) before data leaves the network.