Integrations Overview

MemorySync integrations connect your external data sources and turn them into queryable memories. Every integration follows the same pull-sync model: MemorySync authenticates with the provider, fetches content through its API, extracts text, and stores it as encrypted memories in your project.

What integrations do

An integration is a managed connector that bridges an external service to MemorySync. The platform handles the full lifecycle:

  • Authentication — OAuth 2.0 token exchange and automatic credential storage for each provider. Tokens are encrypted at rest.
  • Data fetching — Cursor-based pagination through the provider's API. Each provider supports an initial bulk sync and efficient incremental updates afterwards.
  • Content extraction — Raw API responses are transformed into normalized sync objects with id, type, content, url, and metadata fields.
  • Memory storage — Extracted content is encrypted, indexed, and stored as queryable memories in your project.
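As an illustration of the content extraction step, the sketch below maps one raw provider payload onto the five normalized fields named above (id, type, content, url, metadata). The `SyncObject` class, the `normalize_github_issue` helper, and the `github:issue:` id prefix are hypothetical names chosen for this example, not MemorySync's actual types. The input keys follow the shape of a GitHub issue payload.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class SyncObject:
    """Hypothetical normalized record; field names come from the list above."""
    id: str
    type: str
    content: str
    url: str
    metadata: dict[str, Any] = field(default_factory=dict)


def normalize_github_issue(raw: dict[str, Any]) -> SyncObject:
    """Map a GitHub-style issue payload onto the normalized fields."""
    title = raw.get("title", "")
    body = raw.get("body") or ""  # the body may be null for empty issues
    return SyncObject(
        id=f"github:issue:{raw['id']}",  # assumed id scheme, for illustration only
        type="issue",
        content=f"{title}\n\n{body}".strip(),
        url=raw.get("html_url", ""),
        metadata={"provider": "github", "state": raw.get("state")},
    )
```

Keeping every provider behind one record shape means the storage step does not need provider-specific logic; anything that does not fit the common fields goes into metadata.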

Supported data sources

| Provider | Auth | What gets synced |
|---|---|---|
| GitHub | OAuth 2.0 | Repositories, issues, pull requests, commits, discussions |
| Notion | OAuth 2.0 | Pages and databases (root-level shared items) |
| Google Drive | OAuth 2.0 | Docs, Sheets, Slides, PDFs (exported to text) |
| OneDrive | OAuth 2.0 | Word, Excel, PowerPoint (text extracted from document XML) |
| Web Crawler | API key | Any public web page (HTML parsed and cleaned to text) |

How data flows in

  1. Connect: You authenticate with the provider via OAuth (or API key for Web Crawler). MemorySync stores the encrypted credentials.
  2. Initial sync: MemorySync paginates through the provider API using cursor-based iteration and yields batches of normalized objects.
  3. Content extraction: Each object's raw content is transformed into clean text. For example, Notion blocks become Markdown, Google Docs are exported as plain text, and OneDrive Word documents are extracted to plain text.
  4. Memory storage: Extracted text is encrypted, embedded, and stored as a memory in your project. Metadata like source, url, and provider-specific fields are preserved.
  5. Incremental sync: On subsequent runs, MemorySync uses the stored cursor to fetch only items modified since the last sync.
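The initial and incremental sync steps share one loop: page through the provider starting from a cursor, store each batch, and checkpoint the cursor so the next run resumes where this one stopped. The sketch below shows that pattern under stated assumptions. The page shape (`items`, `cursor`, `has_more`), the `state` dictionary, and the in-memory fake provider are all hypothetical. Real providers use their own cursor formats, and the fake models "modified since the last sync" as an append-only list.

```python
from typing import Any, Callable, Iterator, Optional

# Assumed page shape: {"items": [...], "cursor": str, "has_more": bool}
FetchPage = Callable[[Optional[str]], dict[str, Any]]


def iter_batches(fetch_page: FetchPage, cursor: Optional[str] = None
                 ) -> Iterator[tuple[list[Any], str]]:
    """Yield (batch, resume_cursor) pairs until the provider has no more pages."""
    while True:
        page = fetch_page(cursor)
        cursor = page["cursor"]
        yield page["items"], cursor
        if not page["has_more"]:
            return


def run_sync(fetch_page: FetchPage, store: Callable[[Any], None],
             state: dict[str, Any]) -> int:
    """Run one sync pass; an empty state means an initial bulk sync."""
    count = 0
    for items, cursor in iter_batches(fetch_page, state.get("cursor")):
        for item in items:
            store(item)
        count += len(items)
        state["cursor"] = cursor  # checkpoint only after the batch is stored
    return count


def make_fake_provider(items: list[Any], page_size: int = 2) -> FetchPage:
    """In-memory stand-in for a provider API; the cursor is a list offset."""
    def fetch_page(cursor: Optional[str]) -> dict[str, Any]:
        start = int(cursor) if cursor else 0
        end = min(start + page_size, len(items))
        return {"items": items[start:end], "cursor": str(end),
                "has_more": end < len(items)}
    return fetch_page
```

Calling `run_sync` twice with the same `state` dictionary performs a full pass first and then fetches only what was added in between. Writing the cursor after each stored batch, rather than once at the end, lets an interrupted sync resume without refetching everything.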

Connection lifecycle

Every integration connection tracks its state through a small set of well-defined statuses:

| Status | Meaning |
|---|---|
| connected | OAuth credentials are valid and syncing is active. |
| disconnected | Integration has been manually disconnected. |
| error | Last sync failed. Check last_sync_error for details. Reconnection may be required. |
| expired | OAuth token has expired and could not be refreshed. Click Reconnect to re-authorize. |
| pending | OAuth flow initiated but not yet completed. |

The needs_reauth property returns true when status is error or expired, indicating the user must reconnect.
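A minimal sketch of how the statuses and the needs_reauth rule could be modeled on the client side. The `ConnectionStatus` enum and `Connection` class are hypothetical names for illustration; only the status strings, last_sync_error, and the needs_reauth rule come from the text above.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class ConnectionStatus(str, Enum):
    """The five documented connection statuses."""
    CONNECTED = "connected"
    DISCONNECTED = "disconnected"
    ERROR = "error"
    EXPIRED = "expired"
    PENDING = "pending"


@dataclass
class Connection:
    """Hypothetical client-side view of an integration connection."""
    provider: str
    status: ConnectionStatus
    last_sync_error: Optional[str] = None

    @property
    def needs_reauth(self) -> bool:
        # True only for the two statuses the documentation names.
        return self.status in (ConnectionStatus.ERROR, ConnectionStatus.EXPIRED)


# Example: decide whether to show a reconnect prompt.
conn = Connection("github", ConnectionStatus("expired"))
show_reconnect_prompt = conn.needs_reauth
```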

Project scoping

Every integration connection is scoped to a tenant (organization). Each organization can have one active connection per provider.

  • Synced memories inherit the organization context of the connection.
  • Sync direction controls data flow: import (external → MemorySync), export, or bidirectional.
  • Connection configuration (workspace IDs, selected channels/repos/folders) is stored alongside the connection. Sensitive values are filtered from API responses.
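To make the last point concrete, here is one way sensitive values could be filtered out of a connection's configuration before it is returned. This is a sketch under assumptions: the `public_config` helper and the marker list are hypothetical, and the rules MemorySync actually uses to decide which keys are sensitive are not documented here.

```python
from typing import Any

# Assumed markers for sensitive keys; the real filter list may differ.
SENSITIVE_MARKERS = ("token", "secret", "password", "api_key")


def public_config(config: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of the config without keys that look sensitive."""
    cleaned: dict[str, Any] = {}
    for key, value in config.items():
        if any(marker in key.lower() for marker in SENSITIVE_MARKERS):
            continue  # drop the key entirely rather than masking its value
        # Recurse so nested settings are filtered as well.
        cleaned[key] = public_config(value) if isinstance(value, dict) else value
    return cleaned
```

The function builds a new dictionary instead of mutating its input, so the stored configuration keeps the credentials that the sync itself needs.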

Sync scheduling

Each connection has a configurable sync_frequency that controls how often MemorySync fetches new data:

| Frequency | Behavior |
|---|---|
| realtime | Sync triggered by webhook events (where supported by the provider). |
| hourly | Scheduled pull every 60 minutes. |
| daily | Scheduled pull every 24 hours. |
| manual | Sync only when explicitly triggered from the dashboard. |

The dashboard displays last_sync_at, last_sync_status, and items_synced for every connection.
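The frequency values above can be read as a simple due-check against last_sync_at. The sketch below is an assumption about how such a check might look, not MemorySync's scheduler: `is_sync_due` and `INTERVALS` are hypothetical names, and the choice to sync immediately when a connection has never synced is also assumed.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Intervals taken from the frequency descriptions above.
INTERVALS = {"hourly": timedelta(minutes=60), "daily": timedelta(hours=24)}
# These frequencies are driven by webhooks or by the user, not by the clock.
UNSCHEDULED = ("realtime", "manual")


def is_sync_due(sync_frequency: str, last_sync_at: Optional[datetime],
                now: datetime) -> bool:
    """Return True when a scheduled pull should run for this connection."""
    if sync_frequency in UNSCHEDULED:
        return False
    if sync_frequency not in INTERVALS:
        raise ValueError(f"unknown sync_frequency: {sync_frequency!r}")
    if last_sync_at is None:
        return True  # assumed: a never-synced connection is due immediately
    return now - last_sync_at >= INTERVALS[sync_frequency]


# Example: an hourly connection last synced 90 minutes ago.
now = datetime(2024, 1, 1, 12, 0, tzinfo=timezone.utc)
due = is_sync_due("hourly", now - timedelta(minutes=90), now)
```

Passing `now` in as an argument, rather than reading the clock inside the function, keeps the check deterministic and easy to test.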