Guides
Build a Chatbot with Memory
Wire MemorySync to your LLM in three places: before generation (recall context), after generation (write turn summary), and on user feedback (boost importance). Total added latency: ~30–80 ms p50.
Architecture
Step 1 — Recall before generation
TYPESCRIPT
async function buildPrompt(userId: string, message: string) {const ms = new MemorySync({ apiKey: process.env.MS_KEY!, projectId: "prj_chatbot" })const result = await ms.memory.query({query: message,k: 5,sessionId: userId, // pin session for cross-turn boostingfilters: { tier: "hot" },weights: { semantic: 0.5, recency: 0.2, importance: 0.2, graph: 0.1 },})const memoryBlock = result.memories.map((m, i) => `[memory ${i+1}] (${m.tier}) ${m.text}`).join("\n")return `You have access to the user's prior context:\n${memoryBlock}\n\nUser: ${message}`}
Step 2 — Store the turn after generation
TYPESCRIPT
async function storeTurn(userId: string, userMsg: string, assistantMsg: string) {// Compress the turn into one memory; the intelligence pipeline will add structure.await ms.memory.add({text: `User asked: "${userMsg}". Assistant answered: "${assistantMsg}".`,source: "chat",tags: ["turn"],importance: 0.4,metadata: { user_id: userId },})}
Step 3 — Use feedback to tune
On a 👍 from the user, bump the importance of the memories that fed the prompt. On 👎, lower importance and increment a feedback counter that the daily re-evaluator reads.
TYPESCRIPT
await ms.memory.update(memoryId, { importance: 0.7 }) // up-voteawait ms.memory.update(memoryId, { importance: 0.2 }) // down-vote
Avoiding turn-spam
- Store one memory per topic shift, not every turn. A simple heuristic: only write when the user introduces a new entity or asks a question.
- Use
deduplicate: true(default). The pipeline collapses near-duplicates within 24h. - Set
ttl_minutesfor ephemeral chat data so old turns auto-archive.
Latency tips
- Run recall in parallel with non-LLM context fetches (user profile, tools).
- Use
computation_tier: "low"for trivial follow-ups (< 6 tokens). - Keep
k≤ 5 unless you actually pass them all into the prompt. - Pin
session_idacross the conversation — repeats hit warmer caches.
Cost
A typical 10-turn conversation: 10 recalls (medium tier) + 10 adds + 10 embeddings ≈ ~150–250 µ¢ depending on provider. Compare this against the LLM completion cost on the same conversation; MemorySync usually accounts for 1–3% of total cost.