Memory Compaction
Over time, a project accumulates clusters of semantically similar memories — slight variations of the same fact, repeated preferences, or overlapping context. Memory compaction merges these clusters into single, richer memories that improve recall precision while reducing storage and embedding costs.
What is Memory Compaction
Memory compaction is a cluster-based synthesis process. The compression engine identifies groups of semantically related memories, synthesizes them into a single merged memory that preserves all unique information, and retires the originals. The result is a smaller, higher-quality memory corpus that produces better recall results.
Cluster Discovery
Before any synthesis happens, the engine must identify which memories belong together. Cluster discovery works in two stages:
1. Semantic grouping. The engine computes pairwise vector similarity between memories within a project. Memories with similarity above the compaction threshold (configurable per project) are grouped into candidate clusters.
2. Cluster validation. Each candidate cluster is validated for merge eligibility: all members must share the same user, project, and environment scope. Clusters that span different scopes are rejected.
The engine prefers smaller, tighter clusters (2–5 memories) over large ones. Oversized clusters are split before synthesis to keep the merged output coherent.
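The two discovery stages can be illustrated with a minimal sketch. It assumes cosine similarity as the vector metric, single-link grouping via union-find, and naive chunking to split oversized clusters; the `Memory` class, `discover_clusters`, and the default `threshold` and `max_size` values are hypothetical names chosen for illustration, not the engine's actual API.

```python
import math
from dataclasses import dataclass
from itertools import combinations


@dataclass(frozen=True)
class Memory:
    # Hypothetical record shape; real memories carry more fields.
    id: str
    user: str
    project: str
    environment: str
    vector: tuple[float, ...]


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def discover_clusters(memories, threshold=0.9, max_size=5):
    # Stage 1: semantic grouping. Link every pair whose similarity is
    # above the threshold, then collect connected groups (union-find).
    parent = {m.id: m.id for m in memories}

    def find(x):
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    for a, b in combinations(memories, 2):
        if cosine(a.vector, b.vector) > threshold:
            parent[find(a.id)] = find(b.id)

    groups = {}
    for m in memories:
        groups.setdefault(find(m.id), []).append(m)

    # Stage 2: cluster validation. Reject singletons and any group that
    # spans more than one (user, project, environment) scope.
    clusters = []
    for members in groups.values():
        scopes = {(m.user, m.project, m.environment) for m in members}
        if len(members) < 2 or len(scopes) != 1:
            continue
        # Assumed splitting strategy: fixed-size chunks of at most max_size.
        for i in range(0, len(members), max_size):
            chunk = members[i:i + max_size]
            if len(chunk) >= 2:
                clusters.append(chunk)
    return clusters
```

Pairwise comparison is quadratic in the number of memories, so a production system would more plausibly query an approximate nearest-neighbor index; the sketch only shows the grouping and validation logic.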
Synthesis Pipeline
Once a cluster is validated, the synthesis pipeline produces the merged memory:
1. Content assembly. The engine collects the text, metadata, entities, tags, and source attribution from all cluster members into a unified context block.
2. Synthesis. The assembled context is passed to the synthesis prompt, which produces a single merged memory that combines all unique information from the originals.
3. Validation. The synthesized output is validated against strict rules (see next section). If validation fails, the merge is aborted and the originals remain untouched.
4. Embedding. The validated merged memory is embedded and indexed in the vector store, replacing the original vectors.
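The four steps can be read as a simple function pipeline. The sketch below is illustrative only: `SourceMemory`, `MergedMemory`, and `run_pipeline` are hypothetical names, and the `synthesize`, `validate`, and `embed` callables stand in for the synthesis prompt, the validation rules, and the embedding model, none of which are specified here.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional, Sequence


@dataclass
class SourceMemory:
    id: str
    text: str
    tags: list[str] = field(default_factory=list)


@dataclass
class MergedMemory:
    text: str
    source_ids: list[str]
    tags: list[str]
    vector: Optional[list[float]] = None


def run_pipeline(
    cluster: Sequence[SourceMemory],
    synthesize: Callable[[str], str],
    validate: Callable[[MergedMemory, Sequence[SourceMemory]], bool],
    embed: Callable[[str], list[float]],
) -> Optional[MergedMemory]:
    # 1. Content assembly: one context block plus the union of all tags.
    context = "\n".join(f"[{m.id}] {m.text}" for m in cluster)
    tags = sorted({t for m in cluster for t in m.tags})

    # 2. Synthesis: the callable stands in for the synthesis prompt.
    merged = MergedMemory(
        text=synthesize(context),
        source_ids=[m.id for m in cluster],
        tags=tags,
    )

    # 3. Validation: abort without side effects if any rule fails.
    if not validate(merged, cluster):
        return None

    # 4. Embedding: only validated output is embedded.
    merged.vector = embed(merged.text)
    return merged
```

Returning `None` before the embedding step mirrors the rule that a failed validation leaves the originals untouched: nothing has been written or indexed at that point.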
Strict Validation Rules
The compression engine enforces strict validation to prevent information loss. Every synthesized memory must pass all of these checks before the merge is committed:
| Validation Rule | Description |
|---|---|
| Entity preservation | Every named entity (people, organizations, tools, etc.) that appears in any source memory must appear in the merged output. |
| Fact completeness | No factual assertion from any source memory may be omitted. Contradictions are preserved with attribution. |
| Source attribution | The merged memory’s metadata must link back to all original source memories for traceability. |
| Length bounds | The merged output must not exceed the maximum memory size. If synthesis produces content that is too long, the merge is rejected. |
| Scope integrity | The merged memory inherits the scope (user, project, environment) from the originals. All originals must share the same scope. |
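As a rough sketch, the mechanically checkable rules in the table could be expressed as follows. Fact completeness is left out because it requires a semantic comparison rather than a set or length check. All names (`Source`, `Merged`, `validation_errors`) and the `MAX_MEMORY_CHARS` limit are assumptions made for illustration.

```python
from dataclasses import dataclass

MAX_MEMORY_CHARS = 2000  # hypothetical limit; the real maximum is not stated


@dataclass
class Source:
    id: str
    scope: tuple[str, str, str]  # (user, project, environment)
    entities: set[str]


@dataclass
class Merged:
    text: str
    scope: tuple[str, str, str]
    entities: set[str]
    source_ids: set[str]


def validation_errors(merged, sources, max_chars=MAX_MEMORY_CHARS):
    """Return a list of rule violations; an empty list means the merge may commit."""
    errors = []

    # Entity preservation: every entity from any source must survive.
    required = set().union(*(s.entities for s in sources))
    missing = required - merged.entities
    if missing:
        errors.append(f"entity preservation: missing {sorted(missing)}")

    # Source attribution: metadata must link back to every original.
    if {s.id for s in sources} - merged.source_ids:
        errors.append("source attribution: not all originals are linked")

    # Length bounds: oversized output is rejected, not truncated.
    if len(merged.text) > max_chars:
        errors.append("length bounds: merged text too long")

    # Scope integrity: one shared scope, inherited by the merged memory.
    scopes = {s.scope for s in sources}
    if len(scopes) != 1 or merged.scope not in scopes:
        errors.append("scope integrity: scope mismatch")

    return errors
```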
Original Memory Handling
After a successful merge, the original memories are not deleted immediately. Instead, they follow a soft-retirement process:
- Soft-delete. Original memories are marked as soft-deleted. They no longer appear in recall results but remain in the database for audit purposes.
- Vector removal. The original vectors are removed from the active index. Only the merged memory’s vector remains searchable.
- Audit link. Each retired memory receives a metadata link to the merged memory that replaced it, creating a full provenance chain.
- Retention. Soft-deleted originals are subject to the standard retention policy. They are permanently purged only when the retention window expires.
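The soft-retirement steps above might look like the following in-memory sketch. The `StoredMemory` fields (`deleted_at`, `merged_into`), the `index` set standing in for the active vector index, and the function names are hypothetical; how the retention window is measured is also an assumption (here, from the soft-delete time).

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class StoredMemory:
    id: str
    deleted_at: Optional[datetime] = None  # set on soft-delete
    merged_into: Optional[str] = None      # audit link to the replacement


def retire_originals(store, index, original_ids, merged_id, now):
    """Soft-delete originals, unindex their vectors, and record the audit link."""
    for mid in original_ids:
        record = store[mid]
        record.deleted_at = now
        record.merged_into = merged_id
        index.discard(mid)  # vector removal from the active index


def recallable(store):
    """Soft-deleted memories stay in the store but are hidden from recall."""
    return [m.id for m in store.values() if m.deleted_at is None]


def purge_expired(store, now, retention: timedelta):
    """Permanently remove soft-deleted memories whose retention window has passed."""
    expired = [
        mid for mid, m in store.items()
        if m.deleted_at is not None and now - m.deleted_at >= retention
    ]
    for mid in expired:
        del store[mid]
    return expired
```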
When Compaction Runs
Compaction can be triggered in three ways:
| Trigger | Description |
|---|---|
| Scheduled sweep | Background workers periodically scan projects for compaction candidates. The sweep interval is configurable per organization. |
| Threshold-based | When a project’s memory count exceeds a configured threshold, compaction is automatically triggered to keep the corpus manageable. |
| Manual API | Admins can trigger compaction for a specific project via the API. This is useful after large bulk imports. |
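A minimal sketch of how the three triggers could be evaluated for one project is shown below. The function name, its parameters, and the precedence order (manual, then threshold, then scheduled) are assumptions made for illustration; no particular ordering or API shape is specified above.

```python
from datetime import datetime, timedelta
from typing import Optional


def compaction_trigger(
    memory_count: int,
    count_threshold: int,
    last_sweep: datetime,
    sweep_interval: timedelta,
    now: datetime,
    manual_request: bool = False,
) -> Optional[str]:
    """Return the name of the trigger that fires, or None if compaction should not run."""
    if manual_request:
        return "manual"      # explicit admin request, e.g. after a bulk import
    if memory_count > count_threshold:
        return "threshold"   # corpus grew past the configured size
    if now - last_sweep >= sweep_interval:
        return "scheduled"   # periodic background sweep is due
    return None
```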
Failure Recovery
The compression engine is designed for safe partial failure:
- Atomic merge. Each cluster merge is wrapped in a single database transaction. If any step fails — synthesis, validation, embedding, or soft-delete — the entire merge is rolled back and the original memories remain untouched.
- Cluster isolation. Failure in one cluster does not affect other clusters in the same batch. The engine continues processing remaining clusters.
- Retry with backoff. Failed clusters are flagged and retried in subsequent sweeps with exponential backoff. Clusters that continue to fail after repeated retries are logged and excluded from future sweeps.
- No partial merges. There is no state where some originals are retired but the merged memory doesn’t exist. The transaction guarantee prevents this.
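The atomic-merge and cluster-isolation guarantees can be demonstrated with a small sketch that uses Python's standard `sqlite3` module as a stand-in database. The schema, the function names, and the `embed` callable are hypothetical. The embedding call is deliberately placed last inside the transaction so that a late failure shows both earlier writes being rolled back.

```python
import sqlite3

# Hypothetical schema for illustration only.
SCHEMA = """
CREATE TABLE memories (
    id TEXT PRIMARY KEY,
    text TEXT NOT NULL,
    deleted INTEGER NOT NULL DEFAULT 0,
    merged_into TEXT
)
"""


def merge_cluster(conn, source_ids, merged_id, merged_text, embed):
    # "with conn" commits if the block succeeds and rolls back if it raises,
    # so the insert and the soft-deletes succeed or fail together.
    with conn:
        conn.execute(
            "INSERT INTO memories (id, text) VALUES (?, ?)",
            (merged_id, merged_text),
        )
        conn.executemany(
            "UPDATE memories SET deleted = 1, merged_into = ? WHERE id = ?",
            [(merged_id, sid) for sid in source_ids],
        )
        embed(merged_text)  # a failure here undoes both statements above


def compact_batch(conn, clusters, embed):
    """Merge each (source_ids, merged_text) cluster; return indexes that failed."""
    failed = []
    for i, (source_ids, merged_text) in enumerate(clusters):
        try:
            merge_cluster(conn, source_ids, f"merged-{i}", merged_text, embed)
        except Exception:
            failed.append(i)  # cluster isolation: keep processing the rest
    return failed


def sweeps_until_retry(attempt, cap=64):
    """Exponential backoff measured in sweeps: 1, 2, 4, ... up to cap."""
    return min(cap, 2 ** attempt)
```

Because each cluster gets its own transaction, a rollback in one merge cannot undo a merge that was already committed for another cluster in the same batch.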