Source Provenance
Every memory carries a story of where it came from. MemorySync tracks that story across three places: the source column on the row, the memory_events audit trail, and — for extraction-derived memories — the MemoryCandidate link back to the source object.
The <code>source</code> column
source is a free-form string up to 64 characters, indexed for fast filtering. It is caller-supplied and the platform does not validate the value, but the codebase uses these labels in practice:
- api: direct call to /memory/add.
- email: extracted from email content.
- web: produced by the web crawler.
- Provider names: when the row was extracted from an integration sync.
The <code>event_type</code> column
Where source answers "where did this come from?", event_type answers "what kind of trigger produced it?" Both are indexed, both are caller-supplied, and both default to null.
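As a rough illustration of how a caller might attach both labels, the sketch below builds a request body for /memory/add. The body field name `content`, the helper `build_add_payload`, and the `manual_entry` event type are assumptions made for the example; only `source`, `event_type`, `deduplicate`, and the 64-character limit come from this page. The page does not say how the platform handles an over-long `source`, so the length check here is purely a client-side guard.

```python
MAX_SOURCE_LENGTH = 64  # documented maximum length of the source column


def build_add_payload(content, source=None, event_type=None, deduplicate=False):
    """Build a request body for /memory/add (hypothetical helper and field layout)."""
    # Client-side guard only: the platform does not validate the label's value,
    # and its behaviour for over-long labels is not described on this page.
    if source is not None and len(source) > MAX_SOURCE_LENGTH:
        raise ValueError(f"source must be at most {MAX_SOURCE_LENGTH} characters")
    return {
        "content": content,        # assumed field name for the memory text
        "source": source,          # where the memory came from; defaults to null
        "event_type": event_type,  # what kind of trigger produced it; defaults to null
        "deduplicate": deduplicate,
    }


payload = build_add_payload(
    "User prefers dark mode",
    source="api",
    event_type="manual_entry",  # hypothetical trigger label
)
```

Because the platform accepts any string, agreeing on a small fixed vocabulary of labels on the caller side keeps later filtering predictable.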
The <code>memory_events</code> audit trail
Every meaningful action against a memory is recorded as an event row. This is the trail you use for compliance and debugging.
| Column | Notes |
|---|---|
| event_type | Action label, e.g. created, deduplicated, deleted. |
| source | Origin label echoed from the request. |
| payload | JSON snapshot of the relevant fields at event time. |
| created_at | When the action happened. |
Input dedup with <code>source_input_hash</code>
On write, the platform computes a SHA-256 of the sanitized input and stores it on the row as source_input_hash. With deduplicate=true on the request, a matching hash short-circuits the insert and the response carries an event of type deduplicated instead of created.
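The dedup flow described above can be sketched with an in-memory store. This is a minimal illustration, not the platform's implementation: the `sanitize` rules (trim and collapse whitespace), the `MemoryStore` class, and the response shape are assumptions. Only the SHA-256 hash, the `source_input_hash` field, the `deduplicate` flag, and the `created` / `deduplicated` event types come from this page.

```python
import hashlib
from itertools import count


def sanitize(text):
    # Assumed sanitizer: trims and collapses whitespace. The real rules are not documented here.
    return " ".join(text.split())


def source_input_hash(text):
    # SHA-256 of the sanitized input, as a 64-character hex digest.
    return hashlib.sha256(sanitize(text).encode("utf-8")).hexdigest()


class MemoryStore:
    """Hypothetical in-memory stand-in for the memory table and memory_events."""

    def __init__(self):
        self._ids = count(1)
        self.rows = []
        self.events = []

    def _record(self, memory_id, event_type, source):
        self.events.append(
            {"memory_id": memory_id, "event_type": event_type, "source": source}
        )

    def add(self, content, source=None, deduplicate=False):
        digest = source_input_hash(content)
        if deduplicate:
            for row in self.rows:
                if row["source_input_hash"] == digest:
                    # Matching hash: skip the insert and report a deduplicated event.
                    self._record(row["id"], "deduplicated", source)
                    return {"id": row["id"], "event": "deduplicated"}
        row = {
            "id": next(self._ids),
            "content": content,
            "source": source,
            "source_input_hash": digest,
        }
        self.rows.append(row)
        self._record(row["id"], "created", source)
        return {"id": row["id"], "event": "created"}


store = MemoryStore()
first = store.add("User prefers dark mode", source="api", deduplicate=True)
second = store.add("  User prefers   dark mode ", source="api", deduplicate=True)
```

In this sketch the second call differs from the first only in whitespace, so both inputs sanitize to the same string and the second call returns the existing row with a `deduplicated` event.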
Extraction provenance
Memories that come from the extraction pipeline pass through a MemoryCandidate row first. The candidate carries:
- source_object_id: points at an ExternalObject row that is the upstream artefact (page, document, message).
- source_type: categorises the upstream artefact as one of structured_row, structured_source, insight.
- Promotion writes the candidate's identifying metadata into the memory's metadata_json, so you can trace a memory back to a specific external object.
Tracing a memory back to its origin
1. Read the memory by id; capture source, event_type, and any provider-shaped keys in metadata_json.
2. Query memory_events filtered by the memory's id to see the full event chain.
3. If the memory is extraction-derived, follow the metadata back to the originating ExternalObject.
4. Cross-reference with the integration's sync logs (under /api/v1/integrations) for the network-level event.
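The first three steps above can be sketched over plain dictionaries. This is an illustration under stated assumptions: the `memory_id` key on event rows, the `source_object_id` key inside metadata_json, the `trace_memory` helper, and all sample data are hypothetical, since this page does not name those fields. Step 4 (sync logs) is left out because the log format is not described here.

```python
import json


def trace_memory(memory_id, memories, memory_events, external_objects):
    """Collect provenance for one memory (hypothetical helper over in-memory data)."""
    # Step 1: read the memory and capture source, event_type, and metadata.
    memory = memories[memory_id]
    metadata = json.loads(memory.get("metadata_json") or "{}")

    # Step 2: the full event chain for this memory, oldest first.
    chain = sorted(
        (e for e in memory_events if e["memory_id"] == memory_id),
        key=lambda e: e["created_at"],
    )

    # Step 3: follow the metadata back to the external object, if any.
    # "source_object_id" as a metadata key is an assumption for this sketch.
    object_id = metadata.get("source_object_id")

    return {
        "source": memory.get("source"),
        "event_type": memory.get("event_type"),
        "events": [e["event_type"] for e in chain],
        "external_object": external_objects.get(object_id),
    }


# Sample data, invented for the example.
memories = {
    7: {
        "source": "web",
        "event_type": None,
        "metadata_json": json.dumps({"source_object_id": "obj-1"}),
    }
}
memory_events = [
    {"memory_id": 7, "event_type": "deduplicated", "created_at": "2024-01-02T00:00:00Z"},
    {"memory_id": 7, "event_type": "created", "created_at": "2024-01-01T00:00:00Z"},
    {"memory_id": 8, "event_type": "created", "created_at": "2024-01-01T00:00:00Z"},
]
external_objects = {"obj-1": {"kind": "page"}}

report = trace_memory(7, memories, memory_events, external_objects)
```

A memory that was written directly through the API would have no object id in its metadata, in which case `external_object` comes back as `None` and the event chain is the whole story.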
Why provenance matters
- Compliance — answer "how did this fact about the user get here?"
- Debugging — when a query returns a wrong recall, the audit trail shows which sync produced the offending row.
- Right-to-erasure — a single source label lets you delete every memory derived from a single integration.
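To make the right-to-erasure point concrete, the sketch below removes every row carrying a given source label and records a `deleted` event for each. The `erase_by_source` helper, the row shape, and the `example_provider` label are hypothetical; a real deployment would run the equivalent filter against the indexed source column.

```python
def erase_by_source(memories, memory_events, source_label):
    """Drop all rows with the given source label and log a deleted event for each."""
    kept, erased_ids = [], []
    for row in memories:
        if row.get("source") == source_label:
            erased_ids.append(row["id"])
            memory_events.append(
                {"memory_id": row["id"], "event_type": "deleted", "source": source_label}
            )
        else:
            kept.append(row)
    return kept, erased_ids


# Sample rows, invented for the example.
rows = [
    {"id": 1, "source": "example_provider"},
    {"id": 2, "source": "api"},
    {"id": 3, "source": "example_provider"},
]
audit_log = []
remaining, erased = erase_by_source(rows, audit_log, "example_provider")
```

This only works if every memory from the integration was written with the same label, which is a reason to keep labels consistent given that the platform does not validate them.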