Memory Model
Vector Representation
Every memory has at most one vector. This page explains what the vector represents, how it is generated, what travels with it into the index, and what guarantees and fallbacks the platform provides when the embedding step misbehaves.
What a vector actually captures
The vector is a compressed numeric fingerprint of the memory's text. Two memories whose vectors are close together cover similar meaning regardless of wording. The platform never claims the vector encodes truth, importance, or recency — those signals live on other columns and are combined at ranking time.
How the vector is generated
1. The text is preprocessed: stripped of leading/trailing whitespace, validated for non-empty content, and truncated at 8000 characters.
2. A SHA-256 digest of the cleaned text is computed and used as a cache key (emb:{digest[:24]}). If a cache hit returns within ~10 ms, that vector is reused.
3. On a miss, the embedding service is called. If the request fails, an exponential-backoff retry runs up to 5 attempts.
4. On full failure, the platform falls back: first an offline transformer if one is locally cached, otherwise a deterministic hash-based vector. The memory is still stored; recall quality drops, but ingest never fails.
5. On success, the result is cached for 1 hour, written to the index, and persisted on the row as a backup.
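The steps above can be sketched roughly as follows. This is a minimal illustration, not the platform's code: the function names, the in-memory dict standing in for the cache, the 50 ms base delay, and the 8-dimensional hash vector are all assumptions made for the example. The offline-transformer fallback, the ~10 ms cache timeout, and the 1-hour cache TTL are left out for brevity.

```python
import hashlib
import time
from typing import Callable

MAX_CHARS = 8000      # truncation limit from step 1
MAX_ATTEMPTS = 5      # retry budget from step 3
FALLBACK_DIM = 8      # hypothetical; the real vector dimension is not stated


def preprocess(text: str) -> str:
    """Strip whitespace, reject empty text, truncate to MAX_CHARS."""
    cleaned = text.strip()
    if not cleaned:
        raise ValueError("memory text must be non-empty")
    return cleaned[:MAX_CHARS]


def cache_key(cleaned: str) -> str:
    """Build the emb:{digest[:24]} key from a SHA-256 hex digest."""
    digest = hashlib.sha256(cleaned.encode("utf-8")).hexdigest()
    return f"emb:{digest[:24]}"


def hash_vector(cleaned: str, dim: int = FALLBACK_DIM) -> list[float]:
    """Deterministic fallback (assumed scheme): digest bytes mapped to [-1, 1]."""
    digest = hashlib.sha256(cleaned.encode("utf-8")).digest()
    return [digest[i % len(digest)] / 127.5 - 1.0 for i in range(dim)]


def embed(
    text: str,
    call_service: Callable[[str], list[float]],
    cache: dict[str, list[float]],
    sleep: Callable[[float], None] = time.sleep,
    base_delay: float = 0.05,  # assumed; the source gives no delay for this path
) -> list[float]:
    cleaned = preprocess(text)
    key = cache_key(cleaned)
    if key in cache:                      # step 2: cache hit, reuse the vector
        return cache[key]
    for attempt in range(MAX_ATTEMPTS):   # step 3: retry with exponential backoff
        try:
            vector = call_service(cleaned)
            cache[key] = vector           # step 5: cache only on success
            return vector
        except Exception:
            if attempt < MAX_ATTEMPTS - 1:
                sleep(base_delay * 2 ** attempt)
    return hash_vector(cleaned)           # step 4: ingest never fails
```

Because the fallback vector is derived only from the text, the same memory always maps to the same vector, but the result carries no semantic signal. That matches the note that recall quality drops while ingest still succeeds.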
What rides along with the vector into the index
The index stores the vector plus a small metadata payload — not the text. The payload is what makes index-side filtering possible without a round trip to the durable store.
- memory_id: the join key back to the durable record.
- user_id: partitions per-user namespaces.
- project_id: used as a hard filter at query time when set.
- environment: development / staging / production.
- source and tags: used as filters when the caller supplies them.
- importance: float, used by the ranker.
- created_at: ISO string, used by the recency factor.
- tier: used as a hard filter for tier-restricted recall.
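As an illustration of the payload's shape, the sketch below models it as a dataclass and applies the hard filters in plain Python. Only the field names come from this page; the field types (other than importance and created_at), the matches helper, and the "all supplied tags must be present" rule are assumptions for the example.

```python
from dataclasses import asdict, dataclass
from typing import Optional


@dataclass
class IndexPayload:
    memory_id: str
    user_id: str
    project_id: Optional[str]
    environment: str          # development / staging / production
    source: Optional[str]
    tags: list[str]
    importance: float
    created_at: str           # ISO string
    tier: str


def matches(payload: dict, *, project_id=None, tier=None, tags=None) -> bool:
    """Hypothetical index-side filter: each supplied argument must match."""
    if project_id is not None and payload["project_id"] != project_id:
        return False
    if tier is not None and payload["tier"] != tier:
        return False
    # Assumption: every requested tag must be present on the memory.
    if tags and not set(tags) <= set(payload["tags"]):
        return False
    return True


payload = asdict(IndexPayload(
    memory_id="m1", user_id="u1", project_id="p1",
    environment="production", source="chat", tags=["billing", "q3"],
    importance=0.7, created_at="2024-01-01T00:00:00+00:00", tier="hot",
))
```

The payload has no text field, so a filter like this can run against index entries alone and only the surviving memory_id values need to be joined back to the durable store.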
Namespace and similarity metric
- Each user has exactly one collection in the index, named memorysync-user-{user_id}. There are no per-project sub-collections.
- Similarity is cosine. The index returns a distance; the platform converts it to a similarity score with similarity = 1 - clamp(distance, 0, 1).
- Dot product and Manhattan distance are not used.
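The naming rule and the distance-to-similarity conversion are small enough to write out directly. The function names here are illustrative, not the platform's.

```python
def collection_name(user_id: str) -> str:
    """One collection per user; no per-project sub-collections."""
    return f"memorysync-user-{user_id}"


def clamp(value: float, low: float, high: float) -> float:
    return max(low, min(high, value))


def similarity_from_distance(distance: float) -> float:
    """similarity = 1 - clamp(distance, 0, 1)"""
    return 1.0 - clamp(distance, 0.0, 1.0)
```

If the index reports cosine distance as 1 - cos θ, its range is 0 to 2. In that case the clamp means orthogonal and opposing vectors both score 0, so similarity always lands in [0, 1].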
What an update to a memory's content does to its vector
- A new vector_id is generated.
- The old vector is removed from the index by the previous vector_id.
- The new vector is upserted under the new vector_id.
- The row's vector column (the durable backup copy) is overwritten.
- Embedding cache entries for the old text are not actively purged; the next write to the same text just produces a hit.
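The update sequence can be sketched against a toy in-memory index. Everything below is hypothetical scaffolding: the InMemoryIndex class, the dict standing in for the row, the use of a UUID for vector_id, and the choice to embed the new text before deleting the old vector are assumptions, not documented behavior.

```python
import uuid
from typing import Callable


class InMemoryIndex:
    """Stand-in for the vector index: vector_id -> (vector, payload)."""

    def __init__(self) -> None:
        self.points: dict[str, tuple[list[float], dict]] = {}

    def delete(self, vector_id: str) -> None:
        self.points.pop(vector_id, None)

    def upsert(self, vector_id: str, vector: list[float], payload: dict) -> None:
        self.points[vector_id] = (vector, payload)


def update_content(
    row: dict,
    new_text: str,
    embed: Callable[[str], list[float]],
    index: InMemoryIndex,
) -> dict:
    old_vector_id = row.get("vector_id")
    new_vector_id = str(uuid.uuid4())       # a new vector_id is generated
    new_vector = embed(new_text)            # assumed to happen before the delete
    if old_vector_id is not None:
        index.delete(old_vector_id)         # remove by the previous vector_id
    index.upsert(new_vector_id, new_vector, {"memory_id": row["memory_id"]})
    # Overwrite the durable backup copy on the row.
    row.update(content=new_text, vector_id=new_vector_id, vector=new_vector)
    return row
```

Deleting by the old vector_id and upserting under a fresh one means the index never holds two vectors for the same memory once the update completes.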
When a memory has no vector
- vector may be NULL. This happens when ingest succeeded but the embedding service was completely down and no fallback produced a usable result.
- Vector-less memories never appear in semantic recall: they are invisible to POST /memory/query.
- They remain visible to direct lookups and to any full-text endpoint, and a background sweep retries embedding for them.
Reliability around the vector path
- Every index call is wrapped in a retry loop with exponential backoff (50 → 100 → 200 ms). After three failures the call surfaces an error.
- Per-user collections are cached in memory after the first access so steady-state queries do not hit the index server's collection-listing path.
- Tier transitions (hot → warm, warm → cold) always re-upsert the vector so the index payload stays consistent with the row.
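The retry wrapper from the first bullet might look like the sketch below. It takes one reading of the schedule: three attempts in total, with the 50, 100, and 200 ms delays each slept after a failed attempt. The page does not say whether the final 200 ms delay is actually waited out before the error surfaces, and the names with_retry and IndexCallError are made up for the example.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")

BACKOFF_SECONDS = (0.05, 0.1, 0.2)   # 50 -> 100 -> 200 ms


class IndexCallError(RuntimeError):
    """Raised once every attempt has failed (hypothetical error type)."""


def with_retry(call: Callable[[], T], sleep: Callable[[float], None] = time.sleep) -> T:
    last_error = None
    for delay in BACKOFF_SECONDS:
        try:
            return call()
        except Exception as exc:
            last_error = exc
            sleep(delay)              # assumed: a delay follows every failure
    raise IndexCallError(
        f"index call failed after {len(BACKOFF_SECONDS)} attempts"
    ) from last_error
```

Passing sleep as a parameter keeps the wrapper testable without real waiting; a caller would wrap an index operation as with_retry(lambda: index.upsert(...)).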