Memory Tiering
Not all memories need instant recall. MemorySync organizes memories into three tiers — hot, warm, and cold — based on age, access patterns, and storage quotas. Hot-tier memories sit in the active vector index for fast recall. Warm memories remain in the database but leave the fast index. Cold memories are archived to long-term storage.
The Three-Tier Model
| Tier | Vector Index | Database | Characteristics |
|---|---|---|---|
| Hot | ✅ Active | ✅ Active | Recently created or frequently accessed memories. Instantly recallable via semantic search. This is the default tier for all new memories. |
| Warm | ❌ Removed | ✅ Active | Older memories that haven’t been accessed recently. Still in the database and queryable by ID, but no longer in the fast vector index. |
| Cold | ❌ Removed | ⚠️ Archived | Long-dormant memories archived to encrypted long-term storage. Must be rehydrated before they can be recalled or queried. |
The tier manager handles all transitions automatically. Memories move down (hot → warm → cold) based on time and access patterns, and up (cold → warm → hot) when they are re-accessed.
Time-Based Transitions
The primary driver of tier transitions is time since last access. The tier manager uses configurable thresholds:
| Transition | Default Threshold | Condition |
|---|---|---|
| Hot → Warm | 30 days | Memory has not been accessed (last_accessed_at) in 30+ days. |
| Warm → Cold | 180 days | Memory has been in warm tier for 180+ days without re-access. |
These thresholds are configurable at the organization level through tenant settings. The tier manager checks transitions during periodic sweep jobs that scan the memory table for candidates.
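The threshold check can be sketched as a small function. This is a minimal illustration, not MemorySync's implementation: the function name, the `entered_tier_at` field, and the constants are hypothetical, and the two defaults are simply the values from the table above.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Defaults from the table above; in practice these come from tenant settings.
HOT_TO_WARM = timedelta(days=30)
WARM_TO_COLD = timedelta(days=180)


def next_tier(tier: str, last_accessed_at: datetime,
              entered_tier_at: datetime, now: datetime) -> Optional[str]:
    """Return the tier a memory should move down to, or None if it stays.

    `entered_tier_at` is an assumed field recording when the memory
    arrived in its current tier.
    """
    if tier == "hot" and now - last_accessed_at >= HOT_TO_WARM:
        return "warm"
    if (tier == "warm"
            and now - entered_tier_at >= WARM_TO_COLD
            and last_accessed_at <= entered_tier_at):  # no re-access while warm
        return "cold"
    return None
```

A sweep job would call a check like this for each candidate row and queue the rows that return a tier.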
POST /memory/query updates last_accessed_at and resets the timer. A memory that is recalled even once stays in its current tier.
Size-Based Transitions
In addition to time-based rules, the tier manager enforces per-user storage quotas. When a user’s hot-tier memory exceeds the configured byte limit, the tier manager proactively moves the oldest, least-accessed memories to the warm tier to make room.
- Excess calculation. The tier manager computes total_hot_bytes - quota_bytes for each user. If the result is positive, memories are transitioned until the hot tier is within quota.
- Eviction order. Memories are ranked by last_accessed_at (oldest first), then by importance (lowest first). High-importance memories are evicted last.
- No data loss. Size-based transitions only move memories down a tier; they never delete content. The data remains fully accessible in the lower tier.
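The excess calculation and eviction order can be sketched as follows. This is an illustrative example under stated assumptions: the `HotMemory` type, its field names, and `select_for_demotion` are hypothetical, and the numeric `importance` scale is assumed.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List


@dataclass
class HotMemory:
    # Hypothetical shape of a hot-tier row; field names are assumptions.
    memory_id: str
    size_bytes: int
    last_accessed_at: datetime
    importance: float


def select_for_demotion(memories: List[HotMemory], quota_bytes: int) -> List[HotMemory]:
    """Pick hot-tier memories to move to warm until the user is within quota."""
    total_hot_bytes = sum(m.size_bytes for m in memories)
    excess = total_hot_bytes - quota_bytes
    if excess <= 0:
        return []  # already within quota

    # Oldest access first, then lowest importance first.
    ranked = sorted(memories, key=lambda m: (m.last_accessed_at, m.importance))

    selected: List[HotMemory] = []
    for memory in ranked:
        if excess <= 0:
            break
        selected.append(memory)
        excess -= memory.size_bytes
    return selected
```

The selected memories are only demoted, never deleted, which matches the no-data-loss rule.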
Cold Tier Archival
When a memory transitions to the cold tier, the tier manager performs a full archival sequence:
1. Vector removal. The memory’s vector is removed from the active vector store. It can no longer appear in semantic search results.
2. Content serialization. The memory’s content, metadata, and vector are serialized into an archive format.
3. Archive upload. The serialized archive is uploaded to encrypted long-term archive storage with server-side encryption enabled.
4. Database marker. The memory row is updated with tier: cold and the archive’s storage path. The row remains in the database as a lightweight pointer.
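The four steps can be walked through with in-memory stand-ins. This is a sketch under several assumptions: plain dictionaries stand in for the vector store, the database row, and archive storage; gzip-compressed JSON stands in for the unspecified archive format; the path layout is invented; and encryption is left to the storage layer, so it is not shown.

```python
import gzip
import json
from typing import Any, Dict, List


def archive_to_cold(memory_id: str,
                    row: Dict[str, Any],
                    vector_index: Dict[str, List[float]],
                    archive_store: Dict[str, bytes]) -> str:
    """Archive one memory and return its (hypothetical) storage path."""
    # 1. Vector removal: a no-op if the vector already left the index.
    vector_index.pop(memory_id, None)

    # 2. Content serialization: content, metadata, and vector together.
    payload = gzip.compress(json.dumps({
        "content": row["content"],
        "metadata": row["metadata"],
        "vector": row["vector"],
    }).encode("utf-8"))

    # 3. Archive upload: a dict write stands in for the real upload.
    path = f"archive/{memory_id}.json.gz"
    archive_store[path] = payload

    # 4. Database marker: keep the row as a lightweight pointer.
    for heavy_field in ("content", "metadata", "vector"):
        row.pop(heavy_field)
    row["tier"] = "cold"
    row["archive_path"] = path
    return path
```

Writing the archive before trimming the row means a failed upload leaves the original data in place.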
Promotion & Rehydration
Memories can move back up to a higher tier when they are re-accessed. The process depends on the current tier:
| Transition | What happens |
|---|---|
| Warm → Hot | The memory’s vector is re-inserted into the active index. last_accessed_at is updated. This is fast — the vector is still in the database. |
| Cold → Warm | The archived content is rehydrated from long-term archive storage and restored to the database. The vector is not re-indexed until an explicit recall triggers warm → hot promotion. |
| Cold → Hot (direct) | Not supported. Cold memories must first rehydrate to warm, then promote to hot on the next access. This prevents expensive rehydrations from blocking recall queries. |
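The promotion rules in the table amount to a one-step-at-a-time state machine. The sketch below uses hypothetical names (`PROMOTION`, `promote_on_access`, `accesses_until_hot`) purely to illustrate that a cold memory needs two accesses to reach hot.

```python
# Each access moves a memory up at most one tier; hot stays hot.
PROMOTION = {"cold": "warm", "warm": "hot", "hot": "hot"}


def promote_on_access(tier: str) -> str:
    """Return the tier a memory occupies after one access."""
    return PROMOTION[tier]


def accesses_until_hot(tier: str) -> int:
    """Count how many accesses it takes for a memory to reach the hot tier."""
    count = 0
    while tier != "hot":
        tier = promote_on_access(tier)
        count += 1
    return count
```

Because there is no direct cold-to-hot entry in the mapping, a recall query never has to wait on an archive download.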
Distributed Locking
Tier transitions involve cross-system operations (database updates, vector store mutations, archive uploads). To prevent race conditions when multiple workers process transitions concurrently, the tier manager uses row-level locking:
- Lock-skip selection. The sweep claims candidate rows with row-level locks that skip already-locked rows rather than blocking on them. Contention between workers never causes a worker to wait.
- Atomic transitions. Each tier transition (database update, vector mutation, archive operation) is wrapped in a single transaction. If any step fails, the transaction rolls back and the memory stays in its current tier.
- No starvation. Skipped rows are picked up in the next sweep. The sweep interval ensures all eligible memories are processed within a bounded time window.
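Lock-skip selection can be illustrated with an in-memory analogy. The sketch below uses one `threading.Lock` per row and a non-blocking acquire; the function and parameter names are hypothetical. The source does not name the database, but in SQL databases that support it (PostgreSQL, for example) the same behavior is commonly expressed with `SELECT ... FOR UPDATE SKIP LOCKED`.

```python
import threading
from typing import Dict, List


def claim_rows(candidates: List[str],
               row_locks: Dict[str, threading.Lock],
               batch_size: int) -> List[str]:
    """Claim up to batch_size rows, skipping any row another worker holds."""
    claimed: List[str] = []
    for memory_id in candidates:
        if len(claimed) >= batch_size:
            break
        # Non-blocking acquire: returns False immediately if the row is locked.
        if row_locks[memory_id].acquire(blocking=False):
            claimed.append(memory_id)
    return claimed
```

The caller releases each claimed lock once its transition commits or rolls back; rows that were skipped become claimable again in a later sweep.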
Batch Processing & Audit
The tier manager processes transitions in configurable batches and maintains a full audit trail:
- Batch size. Each sweep processes a fixed number of transitions (configurable per organization) to control resource usage.
- Tier-history records. Every transition is logged with the old tier, new tier, transition reason (time-based, size-based, or promotion), and timestamp — visible from the dashboard for audit and debugging.
- Rollback on failure. If a batch partially fails, only the committed transitions are recorded. Memories whose transitions did not commit remain in their original tier and are retried in the next sweep.
- Metrics. The tier manager emits per-batch metrics: transitions completed, bytes moved, archive uploads/downloads, and processing duration.
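The batch loop, tier-history records, and per-batch metrics can be sketched together. All names here (`Transition`, `TierHistoryRecord`, `BatchMetrics`, `run_batch`) are hypothetical, and only two of the metrics listed above are modeled to keep the example short.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Callable, List, Tuple


@dataclass(frozen=True)
class Transition:
    memory_id: str
    old_tier: str
    new_tier: str
    reason: str  # "time-based", "size-based", or "promotion"
    size_bytes: int


@dataclass(frozen=True)
class TierHistoryRecord:
    memory_id: str
    old_tier: str
    new_tier: str
    reason: str
    timestamp: datetime


@dataclass
class BatchMetrics:
    transitions_completed: int = 0
    bytes_moved: int = 0


def run_batch(pending: List[Transition], batch_size: int,
              apply_transition: Callable[[Transition], None],
              now: datetime) -> Tuple[List[TierHistoryRecord], BatchMetrics]:
    """Apply up to batch_size transitions; record only those that commit."""
    history: List[TierHistoryRecord] = []
    metrics = BatchMetrics()
    for t in pending[:batch_size]:
        try:
            apply_transition(t)  # assumed to raise if the transaction rolls back
        except Exception:
            continue  # memory stays in its tier and is retried next sweep
        history.append(TierHistoryRecord(t.memory_id, t.old_tier, t.new_tier, t.reason, now))
        metrics.transitions_completed += 1
        metrics.bytes_moved += t.size_bytes
    return history, metrics
```

Recording history only after `apply_transition` returns keeps the audit trail consistent with what actually committed.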