PII Redaction
Personally identifiable information (PII) requires special handling at every layer. MemorySync implements a defense-in-depth approach: encrypting content at rest, anonymizing queries in logs, stripping metadata from context outputs, and providing a complete data deletion pipeline through Data Subject Requests.
PII Protection Philosophy
MemorySync treats PII protection as a multi-layer problem. No single mechanism is sufficient — instead, the platform applies complementary protections at each stage of the data lifecycle:
| Layer | Mechanism | Protection |
|---|---|---|
| At rest | Authenticated encryption | Memory content is encrypted with per-tenant keys before storage. Plaintext never touches disk. |
| In transit | Query anonymization | Raw query strings are never logged. Only opaque hashed identifiers appear in retrieval logs. |
| In context | Metadata stripping | Context builders strip IDs, scores, and metadata before passing content to LLMs. |
| On deletion | Cascading DSR | Complete removal of all user data across all tables and storage systems. |
Query Anonymization
The retrieval pipeline never logs raw query strings. Instead, every query is hashed before it appears in any structured log or metric:
- Opaque hashed identifier. The raw query is replaced with a short, opaque hashed identifier (`query_hash`) before any structured log is written. The original text is discarded.
- No plaintext logging. The structured log emitted by each retrieval call includes `query_hash`, candidate counts, selected counts, latency, and scores, but never the original query text.
- PII-safe debugging. When investigating recall quality, the `query_hash` is used to correlate log entries without exposing user queries.
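The logging behavior described above can be sketched as follows. This is a minimal illustration, not MemorySync's implementation: the keyed SHA-256 construction, the 16-character truncation, the `LOG_HASH_KEY` secret, and every field name other than `query_hash` are assumptions made for the example.

```python
import hashlib
import hmac
import json

# Hypothetical secret used only for log hashing; a real deployment would
# load it from a secret manager rather than hard-coding it.
LOG_HASH_KEY = b"example-log-hash-key"


def query_hash(raw_query: str, key: bytes = LOG_HASH_KEY) -> str:
    """Return a short, opaque identifier for a query (keyed SHA-256, truncated)."""
    digest = hmac.new(key, raw_query.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]


def retrieval_log_entry(raw_query: str, candidates: int, selected: int,
                        latency_ms: float, scores: list[float]) -> str:
    """Build the structured log line; the raw query text is never included."""
    entry = {
        "query_hash": query_hash(raw_query),
        "candidate_count": candidates,
        "selected_count": selected,
        "latency_ms": latency_ms,
        "scores": scores,
    }
    return json.dumps(entry, sort_keys=True)
```

A keyed hash is used here instead of a bare digest because short or common queries could otherwise be recovered by hashing guesses. The same query still maps to the same `query_hash`, which is what makes log correlation possible.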
Memory Encryption
Memory content is encrypted at rest using the encryption settings configured in tenant settings:
- Algorithm. Authenticated encryption is applied per organization, using per-tenant keys protected by envelope encryption. The active algorithm and key rotation parameters are stored in tenant settings.
- Plaintext isolation. Memory content is persisted as ciphertext only. The retrieval pipeline decrypts content in memory when building a response — plaintext is never persisted to disk or backups.
- Key rotation. Encryption keys are rotated on a configurable schedule (default: every 90 days). The active key version is tracked per record so older memories can still be decrypted with their original key.
- Selective access. Internal extraction pipelines and the context builder access plaintext through a controlled decryption layer — raw ciphertext is never returned to callers.
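The key rotation bullet implies some bookkeeping: each record remembers which key version encrypted it, and rotation only changes which version is used for new writes. A rough sketch of that bookkeeping is shown below. The class and field names (`TenantKeyring`, `StoredMemory`, `key_version`) are hypothetical, and the cipher itself is deliberately left out, so this shows version tracking only, not encryption.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone

ROTATION_INTERVAL = timedelta(days=90)  # default schedule stated above


@dataclass(frozen=True)
class KeyVersion:
    version: int
    created_at: datetime


@dataclass(frozen=True)
class StoredMemory:
    memory_id: str
    key_version: int   # version of the key that encrypted this record
    ciphertext: bytes


class TenantKeyring:
    """Tracks key versions for one tenant; key material is omitted on purpose."""

    def __init__(self, first_created: datetime):
        self._versions = {1: KeyVersion(1, first_created)}
        self.active_version = 1

    def rotation_due(self, now: datetime) -> bool:
        active = self._versions[self.active_version]
        return now - active.created_at >= ROTATION_INTERVAL

    def rotate(self, now: datetime) -> int:
        """Add a new active version; older versions stay available for reads."""
        new_version = self.active_version + 1
        self._versions[new_version] = KeyVersion(new_version, now)
        self.active_version = new_version
        return new_version

    def key_for(self, record: StoredMemory) -> KeyVersion:
        """Look up the version a record was written with, not the active one."""
        return self._versions[record.key_version]
```

For example, a memory written under version 1 still resolves to version 1 through `key_for` after `rotate` has made version 2 active.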
DSR Delete: The Nuclear Option
The most powerful PII redaction mechanism is the delete-type Data Subject Request, which permanently removes all traces of a user from the system:
1. Memory deletion. All memories owned by the subject are deleted, including their encrypted content, vectors, metadata, and scores.
2. Event purge. All memory event history for the user is deleted.
3. Recall log purge. All recall log entries for the user are deleted, removing query history and recall patterns.
4. Audit log purge. All audit entries referencing the user (by any of the user’s identifiers) are deleted.
5. User deletion. The user record itself is deleted, completing the cascade.
A final system-level audit entry is created to record that the deletion was performed, including the deleted user’s ID for compliance traceability.
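The cascade order can be illustrated with an in-memory SQLite database. The schema below (table and column names such as `memory_events`, `recall_logs`, and `subject_id`) is invented for the example and is not MemorySync's actual data model. Vector and object storage cleanup, which the first step also covers, is out of scope for this sketch.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE users (id TEXT PRIMARY KEY);
CREATE TABLE memories (id TEXT PRIMARY KEY, user_id TEXT, content BLOB);
CREATE TABLE memory_events (id INTEGER PRIMARY KEY, user_id TEXT);
CREATE TABLE recall_logs (id INTEGER PRIMARY KEY, user_id TEXT, query_hash TEXT);
CREATE TABLE audit_logs (id INTEGER PRIMARY KEY, actor_id TEXT,
                         subject_id TEXT, action TEXT);
"""


def dsr_delete(conn: sqlite3.Connection, user_id: str) -> None:
    """Run the five deletion steps, then record one system-level audit entry."""
    with conn:  # one transaction: commit on success, roll back on any error
        conn.execute("DELETE FROM memories WHERE user_id = ?", (user_id,))
        conn.execute("DELETE FROM memory_events WHERE user_id = ?", (user_id,))
        conn.execute("DELETE FROM recall_logs WHERE user_id = ?", (user_id,))
        conn.execute(
            "DELETE FROM audit_logs WHERE actor_id = ? OR subject_id = ?",
            (user_id, user_id),
        )
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
        # Written last so the audit purge above does not remove it.
        conn.execute(
            "INSERT INTO audit_logs (actor_id, subject_id, action) "
            "VALUES ('system', ?, 'dsr_delete_completed')",
            (user_id,),
        )
```

Running all steps in a single transaction is a design choice of this sketch: either every trace is removed and the completion entry is written, or nothing changes.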
Audit Log Sanitization
Audit logs themselves are designed to avoid storing sensitive content:
- No content logging. Audit entries record actions (created, updated, deleted, accessed) and references (resource type, resource ID) — never the actual memory content.
- Header redaction. Sensitive HTTP headers (authorization tokens, cookies, API keys) are replaced with `[REDACTED]` before being stored in audit metadata.
- Actor attribution. Each audit entry records who performed the action (`actor_id`), what category it belongs to (`compliance`, `security`, `data`), and the severity level.
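A minimal sketch of header redaction and the shape of an audit entry follows. The exact set of sensitive header names and every field name other than `actor_id` are assumptions made for illustration.

```python
# Assumed list of sensitive headers; the platform's actual list is not documented here.
SENSITIVE_HEADERS = frozenset({"authorization", "cookie", "set-cookie", "x-api-key"})
REDACTED = "[REDACTED]"


def redact_headers(headers: dict[str, str]) -> dict[str, str]:
    """Replace sensitive header values, matching header names case-insensitively."""
    return {
        name: (REDACTED if name.lower() in SENSITIVE_HEADERS else value)
        for name, value in headers.items()
    }


def audit_entry(actor_id: str, action: str, resource_type: str, resource_id: str,
                category: str, severity: str, headers: dict[str, str]) -> dict:
    """Build an audit record holding references and redacted metadata, never content."""
    return {
        "actor_id": actor_id,
        "action": action,                # e.g. created, updated, deleted, accessed
        "resource_type": resource_type,
        "resource_id": resource_id,
        "category": category,            # compliance, security, or data
        "severity": severity,
        "metadata": {"headers": redact_headers(headers)},
    }
```

Header names are compared in lowercase because HTTP header names are case-insensitive, so `Authorization` and `authorization` are both redacted.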
Context Builder Safety
When recalled memories are assembled into context for inclusion in an LLM prompt, the context builder applies strict sanitization:
- No IDs. Memory IDs, user IDs, and internal references are stripped from the context output. The LLM sees only the semantic content.
- No metadata. Scores, confidence values, source attribution, and other platform metadata are excluded from the context string.
- No raw vectors. Embedding vectors are never included in context output.
- Whitespace normalization. Content is cleaned, whitespace-collapsed, and trimmed to a maximum line length to prevent injection artifacts.
- Case-insensitive dedup. Duplicate content (same text, different casing) is collapsed within each section to prevent repetitive context.
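The sanitization rules above can be sketched in a few lines. The memory dictionary shape, the `MAX_LINE_LENGTH` value, and the function names are hypothetical. Only the `content` field is read, so IDs, scores, and vectors never reach the output.

```python
import re

MAX_LINE_LENGTH = 200  # hypothetical limit; the real value is not documented here


def clean_line(text: str, max_len: int = MAX_LINE_LENGTH) -> str:
    """Collapse runs of whitespace, trim the ends, and cap the line length."""
    collapsed = re.sub(r"\s+", " ", text).strip()
    return collapsed[:max_len].rstrip()


def build_context_section(memories: list[dict]) -> str:
    """Emit one cleaned line per memory, dropping case-insensitive duplicates."""
    seen: set[str] = set()
    lines: list[str] = []
    for memory in memories:
        line = clean_line(memory.get("content", ""))
        key = line.casefold()
        if not line or key in seen:
            continue
        seen.add(key)
        lines.append(line)
    return "\n".join(lines)
```

When two memories differ only in casing, this sketch keeps the first one it encounters.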
Export Token Guarantees
DSR signing keys and storage parameters are managed by the hosted platform on your behalf. As a cloud customer you don’t need to provision any secrets yourself — the platform guarantees the following properties:
| Property | Guarantee |
|---|---|
| Signed export links | Every export download link is HMAC-signed with a managed signing key. Tampered or replayed links always fail verification. |
| SLA deadline | Default DSR SLA window is 30 days. Adjustable for enterprise plans through support. |
| Download link lifetime | Export download links are valid for 24 hours by default, then automatically expire. |
| Encrypted export storage | Generated export bundles are stored in encrypted long-term storage and are purged after the retention window elapses. |