PII Redaction

Personally identifiable information (PII) requires special handling at every layer. MemorySync implements a defense-in-depth approach: encrypting content at rest, anonymizing queries in logs, stripping metadata from context outputs, and providing a complete data deletion pipeline through Data Subject Requests.

PII Protection Philosophy

MemorySync treats PII protection as a multi-layer problem. No single mechanism is sufficient — instead, the platform applies complementary protections at each stage of the data lifecycle:

Layer	Mechanism	Protection
At rest	Authenticated encryption	Memory content is encrypted with per-tenant keys before storage. Plaintext never touches disk.
In logs	Query anonymization	Raw query strings are never logged. Only opaque hashed identifiers appear in retrieval logs.
In context	Metadata stripping	Context builders strip IDs, scores, and metadata before passing content to LLMs.
On deletion	Cascading DSR	Complete removal of all user data across all tables and storage systems.

Query Anonymization

The retrieval pipeline never logs raw query strings. Instead, every query is hashed before it appears in any structured log or metric:

Opaque hashed identifier. The raw query is replaced with a short, opaque hashed identifier (query_hash) before any structured log is written. The original text is discarded.
No plaintext logging. The structured log emitted by each retrieval call includes query_hash, candidate counts, selected counts, latency, and scores — but never the original query text.
PII-safe debugging. When investigating recall quality, the query_hash is used to correlate log entries without exposing user queries.
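The hashing step can be sketched as follows. This is a minimal illustration under stated assumptions, not MemorySync's implementation: the keyed HMAC-SHA-256 construction, the 16-character truncation, the `HASH_KEY` secret, and the function and field names other than `query_hash` are all hypothetical. The source says only that the identifier is short, opaque, and hashed.

```python
import hashlib
import hmac
import json

# Hypothetical secret. A keyed hash makes short queries harder to recover by guessing.
HASH_KEY = b"example-only-secret"

def query_hash(query: str, key: bytes = HASH_KEY) -> str:
    """Return a short opaque identifier for a query (assumed scheme)."""
    digest = hmac.new(key, query.encode("utf-8"), hashlib.sha256).hexdigest()
    return digest[:16]

def retrieval_log_entry(query: str, candidates: int, selected: int,
                        latency_ms: float, scores: list[float]) -> str:
    """Build the structured log line. The raw query text is never included."""
    entry = {
        "query_hash": query_hash(query),
        "candidate_count": candidates,
        "selected_count": selected,
        "latency_ms": latency_ms,
        "scores": scores,
    }
    return json.dumps(entry, sort_keys=True)

line = retrieval_log_entry(
    "What does John Smith prefer for lunch?", 40, 5, 12.5, [0.91, 0.84]
)
```

Because the same query always maps to the same `query_hash`, log entries for repeated queries can be correlated during debugging without the text ever being written.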

Design rationale

Query strings may contain PII (e.g. "What does John Smith prefer for lunch?"). By hashing before logging, the platform ensures that PII never leaks into log aggregation systems, SIEMs, or error trackers.

Memory Encryption

Memory content is encrypted at rest using the encryption settings configured in tenant settings:

Algorithm. Authenticated encryption is applied per organization, using per-tenant keys protected by envelope encryption. The active algorithm and key rotation parameters are stored in tenant settings.
Plaintext isolation. Memory content is persisted as ciphertext only. The retrieval pipeline decrypts content in memory when building a response — plaintext is never persisted to disk or backups.
Key rotation. Encryption keys are rotated on a configurable schedule (default: every 90 days). The active key version is tracked per record so older memories can still be decrypted with their original key.
Selective access. Internal extraction pipelines and the context builder access plaintext through a controlled decryption layer — raw ciphertext is never returned to callers.
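The per-record key versioning described above can be sketched as a small key ring. This is an illustration only: the `KeyRing` class, its method names, and the record layout are assumptions, and the encryption itself (as well as the wrapping of data keys under a master key for envelope encryption) is deliberately left out.

```python
import os
from dataclasses import dataclass, field
from datetime import datetime, timedelta, timezone

ROTATION_PERIOD = timedelta(days=90)  # default from the text; configurable per tenant

@dataclass
class KeyRing:
    """Hypothetical per-tenant key ring. Every key version is retained."""
    keys: dict[int, bytes] = field(default_factory=dict)
    created: dict[int, datetime] = field(default_factory=dict)
    active_version: int = 0

    def rotate(self, now: datetime) -> int:
        """Generate a new data key and make it the active version."""
        self.active_version += 1
        self.keys[self.active_version] = os.urandom(32)
        self.created[self.active_version] = now
        return self.active_version

    def rotation_due(self, now: datetime) -> bool:
        """True if no key exists yet or the active key has reached the rotation period."""
        if self.active_version == 0:
            return True
        return now - self.created[self.active_version] >= ROTATION_PERIOD

    def key_for(self, version: int) -> bytes:
        """Look up the key a record was encrypted with, by its stored version."""
        return self.keys[version]

ring = KeyRing()
t0 = datetime(2024, 1, 1, tzinfo=timezone.utc)
v1 = ring.rotate(t0)
old_record = {"key_version": v1}           # ciphertext omitted in this sketch
v2 = ring.rotate(t0 + ROTATION_PERIOD)     # scheduled rotation
old_key = ring.key_for(old_record["key_version"])  # old memories stay decryptable
```

Storing the version on each record means rotation never requires re-encrypting existing memories up front; a record is simply decrypted with whichever key its version points to.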

DSR Delete: The Nuclear Option

The most powerful PII redaction mechanism is the delete-type Data Subject Request, which permanently removes all traces of a user from the system:

1. Memory deletion. All memories owned by the subject are deleted, including their encrypted content, vectors, metadata, and scores.
2. Event purge. All memory event history for the user is deleted.
3. Recall log purge. All recall log entries for the user are deleted, removing query history and recall patterns.
4. Audit log purge. All audit entries referencing the user (by any of the user’s identifiers) are deleted.
5. User deletion. The user record itself is deleted, completing the cascade.

A final system-level audit entry is created to record that the deletion was performed, including the deleted user’s ID for compliance traceability.
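The cascade and its closing audit entry can be sketched against a toy relational schema. Everything here is an assumption made for illustration: the table and column names, the use of SQLite, and the single-transaction design are hypothetical and say nothing about how MemorySync stores data.

```python
import sqlite3

# Hypothetical schema, for illustration only.
SCHEMA = """
CREATE TABLE users (id TEXT PRIMARY KEY);
CREATE TABLE memories (id INTEGER PRIMARY KEY, user_id TEXT, content BLOB);
CREATE TABLE memory_events (id INTEGER PRIMARY KEY, user_id TEXT, event TEXT);
CREATE TABLE recall_logs (id INTEGER PRIMARY KEY, user_id TEXT, query_hash TEXT);
CREATE TABLE audit_logs (id INTEGER PRIMARY KEY, actor_id TEXT, subject_id TEXT, action TEXT);
"""

def dsr_delete(conn: sqlite3.Connection, user_id: str) -> None:
    """Delete all data for user_id in the documented order, atomically."""
    with conn:  # commits on success, rolls back if any statement raises
        conn.execute("DELETE FROM memories WHERE user_id = ?", (user_id,))
        conn.execute("DELETE FROM memory_events WHERE user_id = ?", (user_id,))
        conn.execute("DELETE FROM recall_logs WHERE user_id = ?", (user_id,))
        conn.execute(
            "DELETE FROM audit_logs WHERE actor_id = ? OR subject_id = ?",
            (user_id, user_id),
        )
        conn.execute("DELETE FROM users WHERE id = ?", (user_id,))
        # Final system-level entry recording that the deletion was performed.
        conn.execute(
            "INSERT INTO audit_logs (actor_id, subject_id, action) "
            "VALUES ('system', ?, 'dsr_delete')",
            (user_id,),
        )
```

Running all steps in one transaction means a failure partway through leaves no half-deleted user; the final audit entry is written only if every purge succeeded.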

Audit Log Sanitization

Audit logs themselves are designed to avoid storing sensitive content:

No content logging. Audit entries record actions (created, updated, deleted, accessed) and references (resource type, resource ID) — never the actual memory content.
Header redaction. Sensitive HTTP headers (authorization tokens, cookies, API keys) are replaced with [REDACTED] before being stored in audit metadata.
Actor attribution. Each audit entry records who performed the action (actor_id), what category it belongs to (compliance, security, data), and the severity level.
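Header redaction and the shape of an audit entry can be sketched like this. The set of header names, the `audit_entry` function, and the exact field names other than `actor_id` are assumptions for illustration; the source specifies only which kinds of values are redacted and which attributes an entry records.

```python
# Assumed list of sensitive header names, compared case-insensitively.
SENSITIVE_HEADERS = {"authorization", "cookie", "set-cookie", "x-api-key"}

def redact_headers(headers: dict[str, str]) -> dict[str, str]:
    """Replace the values of sensitive headers with the [REDACTED] marker."""
    return {
        name: "[REDACTED]" if name.lower() in SENSITIVE_HEADERS else value
        for name, value in headers.items()
    }

def audit_entry(actor_id: str, action: str, resource_type: str, resource_id: str,
                category: str, severity: str, headers: dict[str, str]) -> dict:
    """Build an audit record from actions and references only, never content."""
    return {
        "actor_id": actor_id,
        "action": action,                # created, updated, deleted, accessed
        "resource_type": resource_type,
        "resource_id": resource_id,
        "category": category,            # compliance, security, data
        "severity": severity,
        "metadata": {"headers": redact_headers(headers)},
    }

entry = audit_entry(
    "user_123", "accessed", "memory", "mem_456", "data", "info",
    {"Authorization": "Bearer abc123", "Accept": "application/json"},
)
```

Note that `audit_entry` has no parameter for memory content at all, so content cannot reach the log by accident.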

Context Builder Safety

When recalled memories are assembled into context for insertion into an LLM prompt, the context builder applies strict sanitization:

No IDs. Memory IDs, user IDs, and internal references are stripped from the context output. The LLM sees only the semantic content.
No metadata. Scores, confidence values, source attribution, and other platform metadata are excluded from the context string.
No raw vectors. Embedding vectors are never included in context output.
Whitespace normalization. Content is cleaned, whitespace-collapsed, and trimmed to a maximum line length to prevent injection artifacts.
Case-insensitive dedup. Duplicate content (same text, different casing) is collapsed within each section to prevent repetitive context.
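The sanitization rules above can be sketched in a few lines. This is a hedged illustration: the memory dictionary layout, the function names, and the 200-character limit are assumptions, since the source does not state the maximum line length or the builder's interface.

```python
import re

MAX_LINE_LENGTH = 200  # assumed value; the source does not give the limit

def clean(text: str, max_len: int = MAX_LINE_LENGTH) -> str:
    """Collapse whitespace runs to single spaces, trim, and cap the length."""
    collapsed = re.sub(r"\s+", " ", text).strip()
    return collapsed[:max_len].rstrip()

def build_section(memories: list[dict]) -> str:
    """Return one context line per unique memory, containing content only."""
    seen: set[str] = set()
    lines: list[str] = []
    for memory in memories:
        # Only "content" is read, so IDs, scores, and vectors never reach the output.
        line = clean(memory["content"])
        key = line.casefold()  # case-insensitive comparison for dedup
        if line and key not in seen:
            seen.add(key)
            lines.append(line)
    return "\n".join(lines)

context = build_section([
    {"id": "m1", "score": 0.93, "vector": [0.1, 0.2], "content": "Likes  tea\n"},
    {"id": "m2", "score": 0.80, "content": "likes tea"},
    {"id": "m3", "score": 0.75, "content": "  Prefers window seats "},
])
# context == "Likes tea\nPrefers window seats"
```

Collapsing newlines inside a memory also keeps each memory on a single line, so stored content cannot introduce line breaks that look like new sections of the prompt.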

Export Token Guarantees

DSR signing keys and storage parameters are managed by the hosted platform on your behalf. As a cloud customer you don’t need to provision any secrets yourself — the platform guarantees the following properties:

Property	Guarantee
Signed export links	Every export download link is HMAC-signed with a managed signing key. Tampered or expired links fail verification.
SLA deadline	Default DSR SLA window is 30 days. Adjustable for enterprise plans through support.
Download link lifetime	Export download links are valid for 24 hours by default, then automatically expire.
Encrypted export storage	Generated export bundles are stored in encrypted long-term storage and are purged after the retention window elapses.
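How an HMAC-signed, expiring link works in general can be sketched as below. This is not the platform's link format: the `/exports/` path, the `expires` and `sig` query parameters, the payload layout, and the key are all hypothetical, and the real signing key is managed by the platform and never exposed to customers.

```python
import hashlib
import hmac
from urllib.parse import parse_qs, urlsplit

SIGNING_KEY = b"example-only-signing-key"  # hypothetical stand-in for the managed key
LINK_LIFETIME_SECONDS = 24 * 60 * 60       # 24-hour default from the table

def _signature(export_id: str, expires: int, key: bytes) -> str:
    """HMAC-SHA-256 over the export ID and expiry timestamp."""
    payload = f"{export_id}:{expires}".encode("utf-8")
    return hmac.new(key, payload, hashlib.sha256).hexdigest()

def sign_export_link(export_id: str, now: float, key: bytes = SIGNING_KEY) -> str:
    """Return a relative download link that carries its expiry and signature."""
    expires = int(now) + LINK_LIFETIME_SECONDS
    return f"/exports/{export_id}?expires={expires}&sig={_signature(export_id, expires, key)}"

def verify_export_link(link: str, now: float, key: bytes = SIGNING_KEY) -> bool:
    """Accept the link only if the signature matches and it has not expired."""
    parts = urlsplit(link)
    params = parse_qs(parts.query)
    try:
        export_id = parts.path.rsplit("/", 1)[-1]
        expires = int(params["expires"][0])
        sig = params["sig"][0]
    except (KeyError, ValueError):
        return False
    expected = _signature(export_id, expires, key)
    # Constant-time comparison avoids leaking how many characters matched.
    return hmac.compare_digest(expected.encode(), sig.encode()) and now < expires

link = sign_export_link("exp_1", now=1_000.0)
```

Because the expiry timestamp is part of the signed payload, editing `expires` in the URL to extend the lifetime invalidates the signature.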

Security

Treat the export download link as a sensitive credential. Anyone with the link can fetch the export until it expires — share it only over secure channels and with personnel authorised to receive the data.
