Vectors & Semantics
Every memory is searchable because it carries a vector — a numerical representation of its meaning. This page covers when the vector is computed, what text gets embedded, how the platform tracks the model behind each vector, and what changes if the model is updated.
What an embedding is, in one paragraph
An embedding is a fixed-length array of floating-point numbers that represents the meaning of a piece of text. Two pieces of text with similar meaning produce vectors that are close to each other under the platform's distance metric. Recall is built on this property — a query becomes a vector, and the platform finds memories whose vectors sit near it.
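As a rough illustration of "close to each other", the sketch below ranks toy vectors by cosine similarity. The page does not say which distance metric the platform uses, so cosine similarity is an assumption here, and the `nearest` helper and the 3-dimensional vectors are made up for the example.

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_a * norm_b)

def nearest(query: list[float], memories: dict[str, list[float]], k: int = 1) -> list[str]:
    """Return the ids of the k memories whose vectors are most similar to the query."""
    ranked = sorted(
        memories,
        key=lambda mid: cosine_similarity(query, memories[mid]),
        reverse=True,
    )
    return ranked[:k]

# Toy vectors; real embeddings have hundreds or thousands of dimensions.
memories = {
    "m1": [0.9, 0.1, 0.0],
    "m2": [0.0, 0.2, 0.9],
}
query = [0.8, 0.2, 0.1]
print(nearest(query, memories))  # ['m1']
```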
When the vector is computed
Embedding is part of the write path. The platform calls the embedding model during /memory/add processing. The call is asynchronous at the language level (it is awaited), but the response is not sent until it completes: when the API returns, the row already has a vector.
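The difference between "async" and "non-blocking for the caller" can be sketched as follows. This is not the platform's handler: `embed` and `handle_memory_add` are hypothetical stand-ins, and the fake vector only shows that the response is built after the awaited call finishes.

```python
import asyncio

async def embed(text: str) -> list[float]:
    """Hypothetical stand-in for the embedding model call."""
    await asyncio.sleep(0)  # yields to the event loop, as a real network call would
    return [float(len(text)), 0.0]  # placeholder vector, not a real embedding

async def handle_memory_add(text: str) -> dict:
    """Hypothetical write handler: the response is built only after the vector exists."""
    vector = await embed(text)  # awaited, so this handler's response waits on it
    return {"text": text, "vector": vector}

row = asyncio.run(handle_memory_add("remember the milk"))
print(row["vector"] is not None)  # True
```

While one request awaits the embedding, the event loop is free to serve other requests; only this request's response is held back.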
What text actually gets embedded
Not the raw input string. The platform first runs the input through the semantic preprocessing step, which normalises whitespace, strips low-signal padding, and may attach context. The result is the text that goes to the embedding model. The original input is preserved encrypted in the text column for reads.
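A minimal sketch of what such a preprocessing step could look like is shown below. The actual rules are not documented on this page, so the filler-word pattern, the `context` prefix format, and the `preprocess` name are all assumptions made for illustration.

```python
import re
from typing import Optional

# Assumed definition of "low-signal padding": a leading filler word.
FILLER = re.compile(r"^(?:um+|uh+|so|well)[,\s]+", re.IGNORECASE)

def preprocess(raw: str, context: Optional[str] = None) -> str:
    """Hypothetical preprocessing: collapse whitespace, drop filler, attach context."""
    text = re.sub(r"\s+", " ", raw).strip()  # normalise whitespace
    text = FILLER.sub("", text)              # strip a leading filler word
    if context:
        text = f"{context}: {text}"          # optionally attach context
    return text

print(preprocess("  um,   call   the\tdentist \n"))
# call the dentist
print(preprocess("call the dentist", context="health"))
# health: call the dentist
```

The string returned here is what would go to the embedding model; the raw input is stored separately and is not changed by this step.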
How each vector is tied to a model version
Every memory carries an embedding_version field that records which model produced its vector. This is what lets the platform reason about which vectors are comparable to which queries, and which rows would need re-embedding after a model upgrade. Callers do not set this value — it is stamped server-side.
Cost and token tracking
Every embedding call records token count and cost in cents (kept as a decimal for precision). These figures roll up into the per-tenant usage counters that drive billing and quota enforcement.
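To show why a decimal type matters for sub-cent amounts, here is a small sketch of per-tenant accumulation. The price constant, the `usage` structure, and `record_embedding_call` are hypothetical; only the idea of tracking tokens and a decimal cost in cents comes from the text above.

```python
from collections import defaultdict
from decimal import Decimal

# Hypothetical price, in cents per 1,000 tokens.
PRICE_CENTS_PER_1K_TOKENS = Decimal("0.002")

# Per-tenant counters: token total and cost in cents.
usage = defaultdict(lambda: {"tokens": 0, "cost_cents": Decimal("0")})

def record_embedding_call(tenant_id: str, tokens: int) -> Decimal:
    """Add one embedding call to the tenant's counters and return its cost in cents."""
    cost = (Decimal(tokens) / Decimal(1000)) * PRICE_CENTS_PER_1K_TOKENS
    usage[tenant_id]["tokens"] += tokens
    usage[tenant_id]["cost_cents"] += cost
    return cost

record_embedding_call("tenant-a", 1500)
record_embedding_call("tenant-a", 500)
print(usage["tenant-a"]["tokens"])  # 2000
print(usage["tenant-a"]["cost_cents"] == Decimal("0.004"))  # True
```

Summing many such fractions of a cent as binary floats would accumulate rounding error; `Decimal` keeps the totals exact.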
What happens when the model is updated
- Existing rows keep their original embedding_version and stay in their original index.
- New writes use the new model.
- Cross-version recall is not automatic: a query embedded with a new model only matches rows that share its embedding_version.
- Plan a re-embedding pass before changing a tenant's effective model.
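The version rules above can be sketched as a filter over rows. The `Row` class, the helper names, and the version strings "v1" and "v2" are invented for the example; only the `embedding_version` field comes from this page.

```python
from dataclasses import dataclass

@dataclass
class Row:
    id: str
    embedding_version: str  # stamped server-side at write time

def candidates(rows: list[Row], query_version: str) -> list[Row]:
    """Only rows that share the query's embedding_version can be matched."""
    return [r for r in rows if r.embedding_version == query_version]

def needs_reembedding(rows: list[Row], target_version: str) -> list[str]:
    """Ids of rows a re-embedding pass would have to touch before switching models."""
    return [r.id for r in rows if r.embedding_version != target_version]

rows = [Row("a", "v1"), Row("b", "v1"), Row("c", "v2")]

print([r.id for r in candidates(rows, "v2")])  # ['c']
print(needs_reembedding(rows, "v2"))           # ['a', 'b']
```

In this toy data set, switching the query path to "v2" without a re-embedding pass would leave rows "a" and "b" out of recall.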
What to watch when debugging recall
- embedding_version on the row: confirms which model produced the vector.
- embedding_version of the query path: must align with the row's version for the row to be a candidate.
- Whether the query's preprocessing matches the write-time preprocessing: divergent normalisation hurts recall.
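These checks could be bundled into a small diagnostic helper along the following lines. The function name, its parameters, and the idea of comparing two preprocessing callables on a sample string are assumptions for illustration, not a platform API.

```python
from typing import Callable

def diagnose_recall(
    row_version: str,
    query_version: str,
    write_preprocess: Callable[[str], str],
    query_preprocess: Callable[[str], str],
    sample: str,
) -> list[str]:
    """Return human-readable reasons why a row might not be recalled."""
    problems = []
    if row_version != query_version:
        problems.append(
            f"version mismatch: row={row_version!r}, query={query_version!r}"
        )
    if write_preprocess(sample) != query_preprocess(sample):
        problems.append("preprocessing differs between write path and query path")
    return problems

def collapse(text: str) -> str:
    """Example normaliser: collapse runs of whitespace into single spaces."""
    return " ".join(text.split())

print(diagnose_recall("v1", "v1", collapse, collapse, "a   b"))  # []
print(len(diagnose_recall("v1", "v2", collapse, str.upper, "a   b")))  # 2
```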