Retrieval Pipeline
Retrieval is how the platform turns a question — a string — into a ranked list of memories. There are several entry points; this page maps each one to what it does, who should reach for it, and what it returns.
The canonical recall endpoint
POST /memory/query is what 99% of traffic uses. It accepts a query string, optional filters, optional ranking weights, an optional session_id for multi-turn context, and a traversal depth. It returns a MemoryQueryResponse with the ranked memories, an inferred query intent, per-memory scoring breakdowns, the SLA target it tried to hit, and routing diagnostics. POST /memory/retrieve is an alias for the same handler.
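As a rough illustration, a recall request might be assembled as follows. This is a minimal sketch, not the platform's client: the base URL is a placeholder, the bearer-style Authorization header is an assumption (this page does not state the auth scheme), and the request is built but never sent.

```python
import json
import urllib.request

# Assumption: placeholder host, not a real deployment URL.
BASE_URL = "https://api.example.com"


def build_query_request(query, api_key, k=5, traversal_depth=2, session_id=None):
    """Build (but do not send) a POST /memory/query request."""
    body = {"query": query, "k": k, "traversal_depth": traversal_depth}
    if session_id is not None:
        # session_id is optional; it is only included for multi-turn context.
        body["session_id"] = session_id
    return urllib.request.Request(
        url=f"{BASE_URL}/memory/query",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Content-Type": "application/json",
            # Assumption: bearer-style auth; the real scheme may differ.
            "Authorization": f"Bearer {api_key}",
        },
        method="POST",
    )


req = build_query_request("What did we decide about pricing?", "sk-demo", session_id="s-1")
print(req.get_method(), req.full_url)
```

Passing the request to urllib.request.urlopen would send it; parsing the MemoryQueryResponse is left out because its field names are not listed here.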
What the handler validates before it runs
- query must be a non-empty string.
- k is clamped to [1, 50] (default 5).
- traversal_depth is clamped to [1, 3] (default 2).
- Authentication is enforced: every request resolves a user_id, a tenant_id, optionally a project_id from X-Project-ID, and an environment from the API key's settings. None of those can be overridden from the body.
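The validation rules above can be mirrored client-side so callers see the values the handler will actually use. This is a sketch under assumptions: the function name is hypothetical, and whether the server treats a whitespace-only query as empty is not stated, so the sketch only rejects the empty string.

```python
def normalise_params(query, k=5, traversal_depth=2):
    """Apply the documented checks: non-empty query, clamped k and depth."""
    if not isinstance(query, str) or not query:
        raise ValueError("query must be a non-empty string")
    return {
        "query": query,
        # k is clamped to [1, 50].
        "k": max(1, min(50, k)),
        # traversal_depth is clamped to [1, 3].
        "traversal_depth": max(1, min(3, traversal_depth)),
    }


print(normalise_params("renewal terms", k=500, traversal_depth=0))
```

Out-of-range values are clamped rather than rejected, which matches the wording above: only an invalid query is an error.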
What the body can shape
| Field | Effect |
|---|---|
| filters.sources | Restrict to specific origin labels. |
| filters.tags | OR-match on tags. |
| filters.since / filters.until | Time range on created_at. |
| filters.include_summaries | When false, drops is_summary=true rows. |
| filters.tier | Restricts to a single tier. |
| weights | Override the default ranking weights. Values are normalised to sum to 1. |
| session_id | Activates session-continuity boost. |
| traversal_depth | Controls multi-hop graph expansion. |
| computation_tier | Force-pick a latency tier (low / medium / high). |
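To make "normalised to sum to 1" concrete, the sketch below rescales a weights mapping and places it in a request body alongside some filters. The weight names (similarity, recency) are hypothetical, the ISO 8601 timestamp format for filters.since is an assumption, and the server performs its own normalisation; this only shows what the result would look like.

```python
def normalise_weights(weights):
    """Rescale weights so they sum to 1, preserving their ratios."""
    total = sum(weights.values())
    if total <= 0:
        raise ValueError("weights must have a positive sum")
    return {name: value / total for name, value in weights.items()}


body = {
    "query": "invoice disputes",
    "filters": {
        "tags": ["billing", "pricing"],      # OR-matched
        "since": "2024-01-01T00:00:00Z",     # assumption: ISO 8601 timestamps
        "include_summaries": False,          # drop is_summary=true rows
    },
    # Hypothetical weight names; the real keys are not listed on this page.
    "weights": normalise_weights({"similarity": 3, "recency": 1}),
    "computation_tier": "low",
}

print(body["weights"])
```

With inputs of 3 and 1, the ratio is preserved and the values become 0.75 and 0.25.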
Routed knowledge search
POST /memory/search/routed is the compliance-aware path. It accepts a query and an optional force_intent (factual / analytical / hybrid) and may refuse the query outright if it triggers a sensitivity rule. Use this for end-user-facing surfaces where you want refusals visible to the caller.
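A caller of the routed path has to handle two outcomes: results or a refusal. The sketch below builds the payload and renders either outcome. The response fields refused, reason, and results are hypothetical names chosen for illustration; this page does not document the response shape.

```python
ALLOWED_INTENTS = {"factual", "analytical", "hybrid"}


def build_routed_payload(query, force_intent=None):
    """Build the body for POST /memory/search/routed."""
    if force_intent is not None and force_intent not in ALLOWED_INTENTS:
        raise ValueError(f"force_intent must be one of {sorted(ALLOWED_INTENTS)}")
    payload = {"query": query}
    if force_intent is not None:
        payload["force_intent"] = force_intent
    return payload


def render_routed_result(response):
    """Turn a routed-search response into a user-facing message.

    Assumption: 'refused', 'reason', and 'results' are hypothetical field
    names, not the documented response schema.
    """
    if response.get("refused"):
        return f"Search declined: {response.get('reason', 'no reason given')}"
    return f"{len(response.get('results', []))} result(s)"


print(build_routed_payload("salary bands", force_intent="factual"))
print(render_routed_result({"refused": True, "reason": "sensitivity rule"}))
```

Surfacing the refusal text instead of an empty result list is the point of this path: the end user can tell a declined query from one that matched nothing.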
Synthesise
POST /memory/synthesize chains a recall and a model pass: it pulls the top-k memories for a query and produces a single coherent narrative or answer instead of a ranked list. It is intended for executive summaries and FAQ-style replies, and it is cost-gated and slower than raw recall.
Graph and cluster endpoints
- GET /memory/graph: returns nodes and typed edges for the relationship graph. Supports scope=user|project|tenant and a user_id filter. Used by the dashboard's graph visualiser.
- GET /memory/clusters: returns semantic clusters and their member counts. Useful for "what themes are in my memories?" panels.
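The graph endpoint's query parameters can be assembled with the standard library, as in the sketch below. The base URL is a placeholder and the helper name is hypothetical; only scope and user_id are taken from the description above.

```python
from urllib.parse import urlencode

GRAPH_SCOPES = ("user", "project", "tenant")


def graph_url(base_url, scope, user_id=None):
    """Build the GET /memory/graph URL for a given scope."""
    if scope not in GRAPH_SCOPES:
        raise ValueError(f"scope must be one of {GRAPH_SCOPES}")
    params = {"scope": scope}
    if user_id is not None:
        # Optional filter narrowing the graph to one user's memories.
        params["user_id"] = user_id
    return f"{base_url}/memory/graph?{urlencode(params)}"


# Assumption: placeholder host.
print(graph_url("https://api.example.com", "project", user_id="u-42"))
```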
When to use which
- Conversational agent → POST /memory/query with session_id.
- End-user search box → POST /memory/search/routed so refusals propagate.
- Briefing or report generation → POST /memory/synthesize.
- Knowledge-graph UI → GET /memory/graph + GET /memory/clusters.