Integration Sync Failures

Integration syncs pull data from external sources (GitHub, Notion, Google Drive, etc.) into MemorySync. Sync jobs are managed with built-in retry logic, concurrency controls, and per-object failure isolation. This page covers how syncs work and how to diagnose failures.

How Sync Jobs Work

Every integration sync runs as a sync job — a tracked unit of work with its own state machine. The system supports three job types:

Job Type | What It Does | When It Runs
--- | --- | ---
FULL | Fetches ALL objects from the external source, regardless of what was synced before | Initial connection, manual resync, after data corruption
INCREMENTAL | Fetches only changes since the last successful sync (using cursors) | Scheduled syncs (realtime/hourly/daily), manual trigger
WEBHOOK | Processes a specific set of objects pushed by the external source's webhook | When the external source notifies MemorySync of changes

Job state machine:

PENDING → RUNNING → COMPLETED
                  ↘ PENDING (retry)
                  ↘ FAILED (max retries exceeded)
         CANCELLED (user-initiated)
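
The diagram above can be read as a transition table. The following is a minimal sketch, not MemorySync's actual implementation: the `JobState` enum, `ALLOWED_TRANSITIONS`, and `can_transition` are hypothetical names, and treating CANCELLED as reachable from both PENDING and RUNNING is an assumption, since the diagram only says that cancellation is user-initiated.

```python
from enum import Enum


class JobState(Enum):
    PENDING = "PENDING"
    RUNNING = "RUNNING"
    COMPLETED = "COMPLETED"
    FAILED = "FAILED"
    CANCELLED = "CANCELLED"


# Transitions read off the diagram. Allowing CANCELLED from both PENDING
# and RUNNING is an assumption made for this sketch.
ALLOWED_TRANSITIONS = {
    JobState.PENDING: {JobState.RUNNING, JobState.CANCELLED},
    JobState.RUNNING: {
        JobState.COMPLETED,
        JobState.PENDING,    # retry
        JobState.FAILED,     # max retries exceeded
        JobState.CANCELLED,  # user-initiated
    },
    # Terminal states have no outgoing transitions.
    JobState.COMPLETED: set(),
    JobState.FAILED: set(),
    JobState.CANCELLED: set(),
}


def can_transition(current: JobState, target: JobState) -> bool:
    """Return True if the table permits moving from current to target."""
    return target in ALLOWED_TRANSITIONS[current]
```

Under these assumptions, a job can return from RUNNING to PENDING for a retry, but nothing leaves COMPLETED, FAILED, or CANCELLED.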

Each job exposes real-time progress in the dashboard: total items, processed items, failed items, and the number of memories created, updated, or deleted by the run.

Concurrency Limits

Each tenant has a cap on how many sync jobs may run at the same time. This prevents one tenant from monopolizing platform resources and keeps scheduling fair for everyone.

When the limit is hit, attempts to start additional jobs are rejected with a clear “too many concurrent syncs” error.

How to fix:

  • Wait for a running job to complete before starting a new one.
  • Cancel a stuck or unneeded job from the dashboard.
  • Stagger your sync schedules across connections so they do not overlap.
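
To illustrate the behavior described in this section, here is a minimal in-memory sketch of a per-tenant cap. `TenantSyncLimiter`, `TooManyConcurrentSyncs`, and the `max_concurrent` value are hypothetical; the actual limit and how the platform enforces it are not documented here.

```python
class TooManyConcurrentSyncs(Exception):
    """Raised when a tenant is already at its concurrent-job cap."""


class TenantSyncLimiter:
    def __init__(self, max_concurrent: int) -> None:
        self.max_concurrent = max_concurrent  # hypothetical cap
        self._running = {}  # tenant_id -> set of running job ids

    def start(self, tenant_id: str, job_id: str) -> None:
        """Register a running job, or reject it if the tenant is at its cap."""
        running = self._running.setdefault(tenant_id, set())
        if len(running) >= self.max_concurrent:
            raise TooManyConcurrentSyncs(
                f"too many concurrent syncs for tenant {tenant_id}"
            )
        running.add(job_id)

    def finish(self, tenant_id: str, job_id: str) -> None:
        """Free a slot when a job completes, fails, or is cancelled."""
        self._running.get(tenant_id, set()).discard(job_id)
```

In this sketch, finishing or cancelling a job frees a slot immediately, which is why waiting for a job or cancelling a stuck one resolves the error. Each tenant is counted separately, so one tenant reaching its cap does not block another.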

Idempotency protection: Each sync job creation accepts an optional idempotency key. If a job with the same key already exists, the create request is rejected so a scheduler firing twice (for example, due to clock skew) cannot create duplicate jobs.
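
The idempotency check can be sketched as a lookup performed before a job is created. This is an illustrative in-memory version only: `JobStore`, `DuplicateJobError`, and the job id format are hypothetical names, not part of the MemorySync API.

```python
from typing import Optional


class DuplicateJobError(Exception):
    """Raised when a job with the same idempotency key already exists."""


class JobStore:
    def __init__(self) -> None:
        self._job_by_key = {}  # idempotency key -> job id
        self._next_id = 1

    def create_job(self, connection_id: str,
                   idempotency_key: Optional[str] = None) -> str:
        """Create a job, rejecting the request if the key was already used."""
        if idempotency_key is not None and idempotency_key in self._job_by_key:
            raise DuplicateJobError(
                f"job already exists for key {idempotency_key!r}"
            )
        job_id = f"job-{self._next_id}"  # hypothetical id format
        self._next_id += 1
        if idempotency_key is not None:
            self._job_by_key[idempotency_key] = job_id
        return job_id
```

A scheduler could, for example, derive the key from the connection id and the scheduled time slot, so that a second firing for the same slot is rejected. Requests without a key are never deduplicated in this sketch.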

Retry Strategy

When a sync job fails, the platform retries it a small number of times with increasing delays. The delays are designed to ride out transient external API outages that typically resolve within a short window.

Between retries: The job goes back into a pending state with a scheduled retry time, the retry count is incremented, and the latest failure reason is surfaced on the job.

After all retries are exhausted: The job is marked as failed and the connection’s health status is updated so the dashboard surfaces the connection as unhealthy.
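
The retry flow above can be sketched as follows. The retry count and delays are not specified on this page, so `MAX_RETRIES`, `BASE_DELAY`, and the doubling schedule are assumptions chosen purely for illustration; `SyncJob` and `handle_failure` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from typing import Optional

MAX_RETRIES = 3                    # assumption: the real limit is not documented
BASE_DELAY = timedelta(minutes=1)  # assumption: the real delays are not documented


@dataclass
class SyncJob:
    state: str = "RUNNING"
    retry_count: int = 0
    last_error: Optional[str] = None
    retry_at: Optional[datetime] = None


def handle_failure(job: SyncJob, error: str, now: datetime) -> SyncJob:
    """Schedule a retry with a growing delay, or fail the job for good."""
    job.last_error = error  # the latest failure reason is always surfaced
    if job.retry_count >= MAX_RETRIES:
        # Retries exhausted. The connection's health status would also be
        # updated at this point; that step is omitted from the sketch.
        job.state = "FAILED"
        job.retry_at = None
        return job
    delay = BASE_DELAY * (2 ** job.retry_count)  # 1, 2, 4 minutes here
    job.retry_count += 1
    job.state = "PENDING"
    job.retry_at = now + delay
    return job
```

Doubling the delay is one common way to implement "increasing delays"; the platform may use a different schedule.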

Incremental Sync Cursors

Incremental syncs use a cursor to track progress and avoid re-fetching already-synced data. When no cursor is available, the sync falls back through the following sources, in order:

  1. Primary: the cursor from the last completed sync.
  2. Fallback: the completion timestamp of the last completed sync, used as a “changes since” marker.
  3. No history: if there has never been a completed sync, the incremental run effectively becomes a full sync.
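
The three-step fallback can be expressed as a small resolver. This is a sketch under assumptions: `CompletedSync`, `resolve_sync_start`, and the `"cursor"` / `"since"` / `"full"` mode labels are hypothetical and only mirror the order listed above.

```python
from dataclasses import dataclass
from datetime import datetime, timezone
from typing import Optional, Tuple


@dataclass
class CompletedSync:
    cursor: Optional[str]   # cursor saved by the last completed sync, if any
    completed_at: datetime  # when that sync completed


def resolve_sync_start(
    last_completed: Optional[CompletedSync],
) -> Tuple[str, Optional[str]]:
    """Pick the starting point for an incremental run.

    Returns a (mode, value) pair, where mode is "cursor", "since", or "full".
    """
    if last_completed is None:
        # No history: the incremental run effectively becomes a full sync.
        return ("full", None)
    if last_completed.cursor:
        # Primary: resume from the saved cursor.
        return ("cursor", last_completed.cursor)
    # Fallback: use the completion timestamp as a "changes since" marker.
    return ("since", last_completed.completed_at.isoformat())
```

A result of `"full"` on a connection that has already synced successfully would point to the "re-fetches everything" symptom described in this section.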

Common cursor issues:

  • “Incremental sync isn’t catching new changes.” If the cursor is correct but the external API is not returning changes, the issue is on the external source’s side (pagination, cursor format change, etc.).
  • “Incremental sync re-fetches everything.” This happens when no usable cursor or completion timestamp is available, or when the provider does not support time-based change detection.

Fix for cursor drift: run a full sync to reset the baseline, then let incremental syncs resume from the fresh cursor.

Per-Object Failure Recording

Sync jobs process objects in batches. When an individual object fails, it does not abort the entire job — the failure is recorded for that object and the job continues with the rest.

What you can see for each per-object failure:

  • The object’s identifier in the external system.
  • The object type (for example, “document” or “page”).
  • A human-readable error description.
  • A snapshot of the object payload to help reproduce the issue.

Interpreting job results: After a job completes, check the failed-items counter. If it is non-zero, some objects failed individually. The job’s overall status can still be completed even with individual object failures — the job-level status reflects whether the sync process itself completed, not whether every object succeeded.

💡 Common cause: The most frequent per-object failure is “object missing ID” — the external source returned an object without an id field. This usually means the external API response format has changed.
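
Per-object failure isolation amounts to catching errors inside the batch loop instead of around it. The sketch below is illustrative only: `ObjectFailure`, `BatchResult`, `process_batch`, and the `"id"` / `"type"` payload keys are assumed names, and the recorded fields follow the list in this section.

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, Iterable, List, Optional


@dataclass
class ObjectFailure:
    external_id: Optional[str]  # identifier in the external system, if present
    object_type: str            # for example "document" or "page"
    error: str                  # human-readable description
    payload: Dict[str, Any]     # snapshot to help reproduce the issue


@dataclass
class BatchResult:
    succeeded: int = 0
    failures: List[ObjectFailure] = field(default_factory=list)


def process_batch(objects: Iterable[Dict[str, Any]],
                  handler: Callable[[Dict[str, Any]], None]) -> BatchResult:
    """Process every object; record failures instead of aborting the batch."""
    result = BatchResult()
    for obj in objects:
        try:
            if not obj.get("id"):
                raise ValueError("object missing ID")
            handler(obj)
            result.succeeded += 1
        except Exception as exc:  # isolate the failure to this one object
            result.failures.append(ObjectFailure(
                external_id=obj.get("id"),
                object_type=obj.get("type", "unknown"),
                error=str(exc),
                payload=dict(obj),  # shallow copy taken at failure time
            ))
    return result
```

In this sketch the batch always runs to the end, which matches a job finishing as completed while still reporting a non-zero failed-items count.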

Connection State Requirements

Before a sync job can start, two preconditions must be met:

1. The connection must be active. If the connection is in any other state (disconnected, error, or revoked), the sync job is rejected with a “connection not active” error.

2. A valid access token must be available. The platform refreshes tokens before they expire. If a refresh fails (for example, the upstream provider revoked the refresh token), the next sync job surfaces a “no access token available” error.

Troubleshooting:

  • “Connection not active” — re-authorize the integration from the dashboard. The OAuth flow re-establishes the connection.
  • “No access token available” — your refresh token has been revoked or has expired. Re-authorize the connection from the dashboard.
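
The two preconditions can be sketched as a guard that runs before a job starts. `SyncRejected` and `check_preconditions` are hypothetical names, and the `"active"` status string is an assumption; only the two error messages come from this page.

```python
from typing import Optional


class SyncRejected(Exception):
    """Raised when a sync job cannot start."""


def check_preconditions(connection_status: str,
                        access_token: Optional[str]) -> None:
    """Raise SyncRejected unless both preconditions hold."""
    # 1. The connection must be active (not disconnected, error, or revoked).
    if connection_status != "active":
        raise SyncRejected("connection not active")
    # 2. A valid access token must be available.
    if not access_token:
        raise SyncRejected("no access token available")
```

Checking the connection state first means a revoked connection reports "connection not active" even if a token is also missing, so in this sketch both errors lead to the same remedy of re-authorizing from the dashboard.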

Monitoring Sync Health

The job status endpoint returns real-time progress, timing, and error information for a sync job, whether it is still running or already finished. A typical response includes:

  • The job’s current state and overall percent complete.
  • Counts for total, processed, and failed items.
  • Counts for memories created, updated, and deleted by the run.
  • Start time, completion time (if finished), and total duration.
  • The latest error message, if any.
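
As a rough illustration of how these fields might be consumed, the sketch below turns a status payload into a one-line summary. The response schema is not documented on this page, so every key name (`state`, `total_items`, `processed_items`, `failed_items`, `last_error`) is an assumption, as is the `summarize_status` helper.

```python
def summarize_status(status: dict) -> str:
    """Build a one-line summary from a (hypothetical) job status payload."""
    total = status["total_items"]
    processed = status["processed_items"]
    failed = status["failed_items"]
    # Guard against division by zero before any items have been counted.
    percent = 100.0 * processed / total if total else 0.0
    line = (
        f'{status["state"]}: {processed}/{total} processed '
        f'({percent:.0f}%), {failed} failed'
    )
    if status.get("last_error"):
        line += f'; last error: {status["last_error"]}'
    return line
```

For example, a payload with state `RUNNING`, 200 total items, 50 processed, and 2 failed would be summarized as `RUNNING: 50/200 processed (25%), 2 failed`.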

Sync scheduling: After a sync completes, the connection’s next scheduled sync is updated based on the configured frequency:

Frequency | Next Sync After
--- | ---
realtime | 5 minutes
hourly | 1 hour
daily | 24 hours
manual | Not automatically scheduled
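
The table above maps directly onto a lookup of intervals. In this sketch, `SYNC_INTERVALS` and `next_sync_at` are hypothetical names; the intervals themselves are taken from the table.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

# Interval until the next scheduled sync, keyed by configured frequency.
SYNC_INTERVALS = {
    "realtime": timedelta(minutes=5),
    "hourly": timedelta(hours=1),
    "daily": timedelta(hours=24),
    "manual": None,  # not automatically scheduled
}


def next_sync_at(frequency: str, completed_at: datetime) -> Optional[datetime]:
    """Return when the next sync is due, or None for manual connections."""
    interval = SYNC_INTERVALS[frequency]
    if interval is None:
        return None
    return completed_at + interval
```

Note that "realtime" in this table still means a polling interval of five minutes after the previous sync completes.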

✅ Health check: A connection is healthy when its last sync completed successfully and the total number of objects synced matches what you expect from the source. If the last sync failed, the dashboard surfaces the most recent error message and how many retries remain.