Memory Compaction

Over time, a project accumulates clusters of semantically similar memories — slight variations of the same fact, repeated preferences, or overlapping context. Memory compaction merges these clusters into single, richer memories that improve recall precision while reducing storage and embedding costs.

What is Memory Compaction

Memory compaction is a cluster-based synthesis process. The compression engine identifies groups of semantically related memories, synthesizes them into a single merged memory that preserves all unique information, and retires the originals. The result is a smaller, higher-quality memory corpus that produces better recall results.

Key guarantee

Compaction never loses information. Every entity, fact, and attribution from the original memories must be present in the synthesized output, verified by strict validation before the merge is committed.

Cluster Discovery

Before any synthesis happens, the engine must identify which memories belong together. Cluster discovery works in two stages:

1. Semantic grouping. The engine computes pairwise vector similarity between memories within a project. Memories with similarity above the compaction threshold (configurable per project) are grouped into candidate clusters.
2. Cluster validation. Each candidate cluster is validated for merge eligibility: all members must share the same user, project, and environment scope. Clusters that span different scopes are rejected.

The engine prefers smaller, tighter clusters (2–5 memories) over large ones. Oversized clusters are split before synthesis to keep the merged output coherent.
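
The two discovery stages can be sketched roughly as follows. This is an illustrative sketch, not the engine's actual implementation: the `Memory` fields, the cosine similarity measure, the union-find grouping, the default `threshold` of 0.9, and the naive chunking of oversized clusters are all assumptions made for the example.

```python
from dataclasses import dataclass
from itertools import combinations
from math import sqrt


@dataclass(frozen=True)
class Memory:
    # Hypothetical record shape; the real schema is not documented here.
    id: str
    user: str
    project: str
    environment: str
    vector: tuple[float, ...]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = sqrt(sum(x * x for x in a)) * sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def discover_clusters(memories, threshold=0.9, max_size=5):
    # Stage 1: semantic grouping. Link every pair whose similarity is
    # above the threshold, using union-find to build connected groups.
    parent = {m.id: m.id for m in memories}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path compression
            x = parent[x]
        return x

    for a, b in combinations(memories, 2):
        if cosine(a.vector, b.vector) > threshold:
            parent[find(a.id)] = find(b.id)

    groups = {}
    for m in memories:
        groups.setdefault(find(m.id), []).append(m)

    clusters = []
    for group in groups.values():
        if len(group) < 2:
            continue  # nothing to merge
        # Stage 2: cluster validation. Reject groups that span scopes.
        scopes = {(m.user, m.project, m.environment) for m in group}
        if len(scopes) != 1:
            continue
        # Split oversized groups into chunks of at most max_size
        # (a simplification; a leftover single memory stays unmerged).
        for i in range(0, len(group), max_size):
            chunk = group[i:i + max_size]
            if len(chunk) >= 2:
                clusters.append(chunk)
    return clusters
```

Comparing every pair is quadratic in the number of memories, so a production system would more likely query an approximate nearest-neighbor index; the exhaustive loop is used here only to keep the example self-contained.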

Synthesis Pipeline

Once a cluster is validated, the synthesis pipeline produces the merged memory:

1. Content assembly. The engine collects the text, metadata, entities, tags, and source attribution from all cluster members into a unified context block.
2. Synthesis. The assembled context is passed to the synthesis prompt, which produces a single merged memory that combines all unique information from the originals.
3. Validation. The synthesized output is validated against strict rules (see next section). If validation fails, the merge is aborted and the originals remain untouched.
4. Embedding. The validated merged memory is embedded and indexed in the vector store, replacing the original vectors.
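
As a rough illustration of how the four steps chain together, the sketch below wires them up with injected callables. The dictionary keys and the `synthesize`, `validate`, and `embed` parameters are hypothetical stand-ins for the synthesis prompt, the validation rules, and the embedding model; none of them are names from the actual engine.

```python
from typing import Callable, Optional


def assemble_context(cluster: list[dict]) -> dict:
    """Step 1: gather text, entities, tags, and source ids into one block."""
    return {
        "texts": [m["text"] for m in cluster],
        "entities": sorted({e for m in cluster for e in m.get("entities", [])}),
        "tags": sorted({t for m in cluster for t in m.get("tags", [])}),
        "source_ids": [m["id"] for m in cluster],
    }


def run_pipeline(
    cluster: list[dict],
    synthesize: Callable[[dict], str],
    validate: Callable[[str, dict], bool],
    embed: Callable[[str], list[float]],
) -> Optional[dict]:
    context = assemble_context(cluster)
    merged_text = synthesize(context)        # Step 2: synthesis
    if not validate(merged_text, context):   # Step 3: validation
        return None                          # abort; originals untouched
    return {
        "text": merged_text,
        "entities": context["entities"],
        "tags": context["tags"],
        "source_ids": context["source_ids"],
        "vector": embed(merged_text),        # Step 4: embedding
    }
```

Returning `None` on a validation failure mirrors the abort behavior described in step 3: nothing is written, so the caller has nothing to undo.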

Strict Validation Rules

The compression engine enforces strict validation to prevent information loss. Every synthesized memory must pass all of these checks before the merge is committed:

Validation Rule	Description
Entity preservation	Every named entity (people, organizations, tools, etc.) that appears in any source memory must appear in the merged output.
Fact completeness	No factual assertion from any source memory may be omitted. Contradictions are preserved with attribution.
Source attribution	The merged memory’s metadata must link back to all original source memories for traceability.
Length bounds	The merged output must not exceed the maximum memory size. If synthesis produces content that is too long, the merge is rejected.
Scope integrity	The merged memory inherits the scope (user, project, environment) from the originals. All originals must share the same scope.
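
A few of these rules lend themselves to mechanical checks, sketched below. The field names, the substring-based entity check, and the `max_chars` limit of 2000 are assumptions chosen for illustration. Fact completeness is deliberately left out because it cannot be verified with simple string checks.

```python
def validate_merge(merged: dict, sources: list[dict], max_chars: int = 2000) -> list[str]:
    """Return a list of validation errors; an empty list means the merge passes."""
    errors = []
    text = merged["text"]

    # Entity preservation: every entity from any source must appear.
    required = sorted({e for src in sources for e in src["entities"]})
    for entity in required:
        if entity not in text:
            errors.append(f"missing entity: {entity}")

    # Source attribution: metadata must link back to all originals.
    if set(merged.get("source_ids", [])) != {s["id"] for s in sources}:
        errors.append("source attribution incomplete")

    # Length bounds: reject output above the maximum size (assumed limit).
    if len(text) > max_chars:
        errors.append("merged text exceeds maximum size")

    # Scope integrity: one shared scope, inherited by the merged memory.
    scopes = {s["scope"] for s in sources}
    if len(scopes) != 1 or merged.get("scope") not in scopes:
        errors.append("scope mismatch")

    return errors
```

Collecting every error instead of stopping at the first one makes the resulting audit log entry more useful when diagnosing a cluster that fails repeatedly.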

Original Memory Handling

After a successful merge, the original memories are not deleted immediately. Instead, they follow a soft-retirement process:

Soft-delete. Original memories are marked as soft-deleted. They no longer appear in recall results but remain in the database for audit purposes.
Vector removal. The original vectors are removed from the active index. Only the merged memory’s vector remains searchable.
Audit link. Each retired memory receives a metadata link to the merged memory that replaced it, creating a full provenance chain.
Retention. Soft-deleted originals are subject to the standard retention policy. They are permanently purged only when the retention window expires.
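
The retirement steps above might look something like this in code. It is a minimal sketch over in-memory dictionaries; the `deleted_at` and `merged_into` field names, the dictionary standing in for the active vector index, and the `purgeable` helper are all hypothetical.

```python
from datetime import datetime, timedelta, timezone


def retire_originals(originals, merged_id, active_index, now=None):
    """Soft-delete originals, drop their vectors, and record the audit link."""
    now = now or datetime.now(timezone.utc)
    for mem in originals:
        mem["deleted_at"] = now                      # soft-delete marker
        mem["metadata"]["merged_into"] = merged_id   # audit link
        active_index.pop(mem["id"], None)            # vector removal


def purgeable(mem, retention: timedelta, now: datetime) -> bool:
    """True once a soft-deleted memory has outlived the retention window."""
    deleted_at = mem.get("deleted_at")
    return deleted_at is not None and now - deleted_at >= retention
```

Because the record itself is only marked, not removed, the `merged_into` link stays resolvable for auditing until the retention window expires.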

When Compaction Runs

Compaction can be triggered in three ways:

Trigger	Description
Scheduled sweep	Background workers periodically scan projects for compaction candidates. The sweep interval is configurable per organization.
Threshold-based	When a project’s memory count exceeds a configured threshold, compaction is automatically triggered to keep the corpus manageable.
Manual API	Admins can trigger compaction for a specific project via the API. This is useful after large bulk imports.

Failure Recovery

The compression engine is designed for safe partial failure:

Atomic merge. Each cluster merge is wrapped in a single database transaction. If any step fails — synthesis, validation, embedding, or soft-delete — the entire merge is rolled back and the original memories remain untouched.
Cluster isolation. Failure in one cluster does not affect other clusters in the same batch. The engine continues processing remaining clusters.
Retry with backoff. Failed clusters are flagged and retried in subsequent sweeps with exponential backoff. Persistent failures are logged and excluded from future sweeps.
No partial merges. There is no state where some originals are retired but the merged memory doesn’t exist. The transaction guarantee prevents this.
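
The atomic-merge and cluster-isolation properties can be illustrated with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: the `memories` table layout, the `merge_cluster` and `run_batch` helpers, and the `validate` callable are invented for the example and say nothing about the engine's real storage layer.

```python
import sqlite3


def merge_cluster(conn, source_ids, merged_text, validate):
    """Insert the merged memory and retire the originals in one transaction."""
    with conn:  # commits on success, rolls back if the body raises
        cur = conn.execute(
            "INSERT INTO memories (text, deleted) VALUES (?, 0)", (merged_text,)
        )
        merged_id = cur.lastrowid
        if not validate(merged_text):
            raise ValueError("validation failed")  # triggers the rollback
        conn.executemany(
            "UPDATE memories SET deleted = 1, merged_into = ? WHERE id = ?",
            [(merged_id, sid) for sid in source_ids],
        )
    return merged_id


def run_batch(conn, batch, validate):
    """Process each cluster independently; return the clusters that failed."""
    failed = []
    for source_ids, merged_text in batch:
        try:
            merge_cluster(conn, source_ids, merged_text, validate)
        except ValueError:
            failed.append(source_ids)  # flag for a later retry, keep going
    return failed
```

Each cluster gets its own transaction, so a rollback in one merge leaves the other clusters in the batch unaffected, and no state exists in which originals are retired without a committed merged row.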

Important

If a compaction run fails repeatedly on the same cluster, check the audit log for validation errors. The most common cause is entity conflicts in the synthesis output.
