Memory Tiering

Not all memories need instant recall. MemorySync organizes memories into three tiers — hot, warm, and cold — based on age, access patterns, and storage quotas. Hot-tier memories sit in the active vector index for fast recall. Warm memories remain in the database but leave the fast index. Cold memories are archived to long-term storage.

The Three-Tier Model

Tier	Vector Index	Database	Characteristics
Hot	✅ Active	✅ Active	Recently created or frequently accessed memories. Instantly recallable via semantic search. This is the default tier for all new memories.
Warm	❌ Removed	✅ Active	Older memories that haven’t been accessed recently. Still in the database and queryable by ID, but no longer in the fast vector index.
Cold	❌ Removed	⚠️ Archived	Long-dormant memories archived to encrypted long-term storage. Must be rehydrated before they can be recalled or queried.

The tier manager handles all transitions automatically. Memories move down (hot → warm → cold) based on time and access patterns, and up (cold → warm → hot) when they are re-accessed.

Time-Based Transitions

The primary driver of tier transitions is time since last access. The tier manager uses configurable thresholds:

Transition	Default Threshold	Condition
Hot → Warm	30 days	Memory has not been accessed (`last_accessed_at`) in 30+ days.
Warm → Cold	180 days	Memory has been in the warm tier for 180+ days without being re-accessed.

These thresholds are configurable at the organization level through tenant settings. The tier manager checks transitions during periodic sweep jobs that scan the memory table for candidates.
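
The threshold check a sweep applies to each candidate can be sketched roughly as follows. This is a minimal illustration, not MemorySync's actual implementation: `TierThresholds`, `next_tier`, and the `entered_tier_at` field are hypothetical names, and only the 30-day and 180-day defaults come from the table above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone


@dataclass(frozen=True)
class TierThresholds:
    # Defaults mirror the documented thresholds; the class itself is hypothetical.
    hot_to_warm: timedelta = timedelta(days=30)
    warm_to_cold: timedelta = timedelta(days=180)


def next_tier(tier, last_accessed_at, entered_tier_at, now,
              thresholds=TierThresholds()):
    """Return the tier a memory should occupy after a sweep at `now`.

    A single sweep moves a memory down by at most one tier.
    """
    if tier == "hot" and now - last_accessed_at >= thresholds.hot_to_warm:
        return "warm"
    if tier == "warm":
        # Assumption: "without re-access" means no access since entering warm.
        idle_since = max(entered_tier_at, last_accessed_at)
        if now - idle_since >= thresholds.warm_to_cold:
            return "cold"
    return tier


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
stale = now - timedelta(days=31)
print(next_tier("hot", stale, stale, now))
```

Passing `now` explicitly, rather than reading the clock inside the function, keeps the check deterministic and easy to test.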

Access resets the clock

Every successful recall via POST /memory/query updates last_accessed_at and restarts the countdown. A memory that is recalled even once is not moved down a tier until the full threshold elapses again without access.

Size-Based Transitions

In addition to time-based rules, the tier manager enforces per-user storage quotas. When a user’s hot-tier memory exceeds the configured byte limit, the tier manager proactively moves the oldest, least-accessed memories to the warm tier to make room.

Excess calculation. The tier manager computes total_hot_bytes - quota_bytes for each user. If the result is positive, memories are transitioned until the hot tier is within quota.
Eviction order. Memories are ranked by last_accessed_at (oldest first), then by importance (lowest first). High-importance memories are evicted last.
No data loss. Size-based transitions only move memories down a tier — they never delete content. The data remains fully accessible in the lower tier.
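
The excess calculation and eviction order described above might look like this in outline. The `HotMemory` record and `select_for_demotion` function are hypothetical names for illustration; the sketch reads the ranking rule literally, so importance only breaks ties between memories with the same `last_accessed_at`.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass
class HotMemory:
    # Hypothetical record shape; field names are assumptions.
    id: str
    size_bytes: int
    last_accessed_at: datetime
    importance: float


def select_for_demotion(memories, quota_bytes):
    """Pick hot-tier memories to move to warm until the user is within quota."""
    excess = sum(m.size_bytes for m in memories) - quota_bytes
    if excess <= 0:
        return []  # already within quota, nothing to move
    # Oldest access first, then lowest importance first.
    ranked = sorted(memories, key=lambda m: (m.last_accessed_at, m.importance))
    demoted = []
    for memory in ranked:
        if excess <= 0:
            break
        demoted.append(memory)
        excess -= memory.size_bytes
    return demoted
```

The function only selects candidates; nothing is deleted, which matches the guarantee that size-based transitions move memories down a tier rather than removing them.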

Cold Tier Archival

When a memory transitions to the cold tier, the tier manager performs a full archival sequence:

1. Vector removal. The memory’s vector is removed from the active vector store. It can no longer appear in semantic search results.
2. Content serialization. The memory’s content, metadata, and vector are serialized into an archive format.
3. Archive upload. The serialized archive is uploaded to encrypted long-term archive storage with server-side encryption enabled.
4. Database marker. The memory row is updated with tier: cold and the archive’s storage path. The row remains in the database as a lightweight pointer.
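
The four archival steps can be walked through with plain dictionaries standing in for the vector store, the archive storage, and the memory table. Everything here is an assumption made for illustration: the gzip-compressed JSON archive format, the `archive/<id>.json.gz` path layout, and the `archive_memory` function are not taken from MemorySync, and server-side encryption is left to the storage layer.

```python
import gzip
import json


def archive_memory(memory_id, rows, vector_index, archive_store):
    """Move one memory to the cold tier using in-memory stand-ins."""
    row = rows[memory_id]

    # 1. Vector removal (a no-op if the vector already left the index).
    vector_index.pop(memory_id, None)

    # 2. Content serialization: content, metadata, and vector together.
    payload = gzip.compress(json.dumps({
        "content": row["content"],
        "metadata": row["metadata"],
        "vector": row["vector"],
    }).encode("utf-8"))

    # 3. Archive upload (hypothetical path layout).
    path = f"archive/{memory_id}.json.gz"
    archive_store[path] = payload

    # 4. Database marker: the row shrinks to a lightweight pointer.
    rows[memory_id] = {"id": memory_id, "tier": "cold", "archive_path": path}
    return path
```

Writing the archive before shrinking the row means a failure in step 2 or 3 leaves the original row untouched.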

Promotion & Rehydration

Memories can move back up to a higher tier when they are re-accessed. The process depends on the current tier:

Transition	What happens
Warm → Hot	The memory’s vector is re-inserted into the active index. `last_accessed_at` is updated. This is fast — the vector is still in the database.
Cold → Warm	The archived content is rehydrated from long-term archive storage and restored to the database. The vector is not re-indexed until an explicit recall triggers warm → hot promotion.
Cold → Hot (direct)	Not supported. Cold memories must first rehydrate to warm, then promote to hot on the next access. This prevents expensive rehydrations from blocking recall queries.
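
The one-step-at-a-time promotion rule in the table above can be sketched as a small function. The row shape, the dictionary stand-ins for the index and archive, and the `promote_on_access` name are hypothetical; updating `last_accessed_at` on every access, including cold → warm, is also an assumption.

```python
from datetime import datetime, timezone


def promote_on_access(row, vector_index, archive_store, now=None):
    """Promote a memory by at most one tier per access and return its new tier."""
    now = now or datetime.now(timezone.utc)
    if row["tier"] == "cold":
        # Rehydrate to warm only; the vector is restored but NOT re-indexed.
        archived = archive_store[row["archive_path"]]
        row.update(content=archived["content"],
                   vector=archived["vector"],
                   tier="warm")
    elif row["tier"] == "warm":
        # The vector is still in the database, so re-indexing is cheap.
        vector_index[row["id"]] = row["vector"]
        row["tier"] = "hot"
    row["last_accessed_at"] = now
    return row["tier"]
```

A cold memory therefore needs two accesses to become hot: the first rehydrates it to warm, and the second re-inserts its vector into the active index.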

Distributed Locking

Tier transitions involve cross-system operations (database updates, vector store mutations, archive uploads). To prevent race conditions when multiple workers process transitions concurrently, the tier manager uses row-level locking:

Lock-skip selection. The sweep claims candidate rows with row-level locks that skip already-locked rows rather than blocking on them. Contention between workers never causes a worker to wait.
Atomic transitions. Each tier transition (database update, vector mutation, archive operation) is wrapped in a single transaction. If any step fails, the transaction rolls back and the memory stays in its current tier.
No starvation. Skipped rows are picked up in the next sweep. The sweep interval ensures all eligible memories are processed within a bounded time window.
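
Lock-skip selection can be illustrated in-process with non-blocking lock acquisition. The source does not name the database; in PostgreSQL the comparable construct is `SELECT ... FOR UPDATE SKIP LOCKED`. The sketch below is only an analogy using Python threading locks, and `claim_unlocked` and `release` are hypothetical helper names.

```python
import threading


def claim_unlocked(row_locks, candidate_ids, batch_size):
    """Claim up to `batch_size` rows, skipping any row another worker holds."""
    claimed = []
    for memory_id in candidate_ids:
        if len(claimed) >= batch_size:
            break
        # blocking=False returns immediately instead of waiting on contention.
        if row_locks[memory_id].acquire(blocking=False):
            claimed.append(memory_id)
    return claimed


def release(row_locks, claimed):
    """Release the locks taken by claim_unlocked once the work is done."""
    for memory_id in claimed:
        row_locks[memory_id].release()


locks = {mid: threading.Lock() for mid in ("m1", "m2", "m3")}
locks["m2"].acquire()  # simulate another worker holding m2
batch = claim_unlocked(locks, ["m1", "m2", "m3"], batch_size=10)
print(batch)  # m2 is skipped, not waited on
release(locks, batch)
locks["m2"].release()
```

Because the skipped row is simply left for later, a subsequent call picks it up once its lock is free, which mirrors the next-sweep behaviour described above.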

Batch Processing & Audit

The tier manager processes transitions in configurable batches and maintains a full audit trail:

Batch size. Each sweep processes a fixed number of transitions (configurable per organization) to control resource usage.
Tier-history records. Every transition is logged with the old tier, new tier, transition reason (time-based, size-based, or promotion), and timestamp — visible from the dashboard for audit and debugging.
Rollback on failure. If a batch partially fails, only the committed transitions are recorded. Memories whose transitions did not commit remain in their original tier and are retried in the next sweep.
Metrics. The tier manager emits per-batch metrics: transitions completed, bytes moved, archive uploads/downloads, and processing duration.
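
A batch loop with tier-history records and per-batch metrics might be structured like the sketch below. The `TierHistoryRecord` fields follow the description above, but the class name, the `run_batch` function, the `apply_transition` callback, and the metric keys are all hypothetical; bytes moved and archive counts are omitted for brevity.

```python
import time
from dataclasses import dataclass
from datetime import datetime, timezone


@dataclass(frozen=True)
class TierHistoryRecord:
    memory_id: str
    old_tier: str
    new_tier: str
    reason: str  # "time-based", "size-based", or "promotion"
    at: datetime


def run_batch(candidates, apply_transition, batch_size):
    """Apply up to `batch_size` transitions; record only those that succeed."""
    started = time.perf_counter()
    history, failed = [], []
    for memory_id, old_tier, new_tier, reason in candidates[:batch_size]:
        try:
            apply_transition(memory_id, new_tier)
        except Exception:
            # The memory stays in its original tier and is retried next sweep.
            failed.append(memory_id)
            continue
        history.append(TierHistoryRecord(
            memory_id, old_tier, new_tier, reason, datetime.now(timezone.utc)))
    metrics = {
        "transitions_completed": len(history),
        "transitions_failed": len(failed),
        "duration_seconds": time.perf_counter() - started,
    }
    return history, metrics
```

Candidates beyond `batch_size` are untouched and remain eligible for the following sweep.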