OneDrive

The OneDrive integration syncs Word documents, Excel spreadsheets, and PowerPoint presentations from Microsoft OneDrive and SharePoint into MemorySync. It uses the Microsoft Graph API with delta queries for incremental sync and extracts text directly from Office document XML.

What gets synced

One memory is created per file. The provider extracts text from Microsoft Office formats using zip-based XML parsing:

  • Word documents (.docx) — paragraph text extracted from the document XML structure.
  • Excel spreadsheets (.xlsx) — text extracted from the shared strings table, capturing the text content of cells across all sheets.
  • PowerPoint presentations (.pptx) — text extracted from each slide's XML, capturing all text frames and content.
  • Plain text and other formats — content downloaded directly where text extraction is possible.

Required OAuth scopes

| Scope | Why it's needed |
| --- | --- |
| Files.Read.All | Read all files the user has access to in OneDrive and SharePoint. |
| User.Read | Read the user's profile to identify the authenticated user. |
| offline_access | Obtain a refresh token for long-lived access without requiring re-authentication. |
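
For orientation, the sketch below shows what a consent request carrying these three scopes generally looks like on the Microsoft identity platform. It is not MemorySync's own code: the client ID, redirect URI, and state value are placeholders, and `build_authorize_url` is a hypothetical helper name. Only the scope names come from the table above.

```python
from urllib.parse import urlencode

# Scopes listed in the table above.
SCOPES = ["Files.Read.All", "User.Read", "offline_access"]

# Microsoft identity platform v2.0 authorization endpoint ("common" tenant).
AUTHORIZE_URL = "https://login.microsoftonline.com/common/oauth2/v2.0/authorize"


def build_authorize_url(client_id: str, redirect_uri: str, state: str) -> str:
    """Return a consent URL for the OAuth 2.0 authorization code flow."""
    params = {
        "client_id": client_id,
        "response_type": "code",
        "redirect_uri": redirect_uri,
        "response_mode": "query",
        "scope": " ".join(SCOPES),  # scopes are space-separated
        "state": state,             # opaque value echoed back on redirect
    }
    return f"{AUTHORIZE_URL}?{urlencode(params)}"


# Placeholder values for illustration only.
url = build_authorize_url(
    client_id="00000000-0000-0000-0000-000000000000",
    redirect_uri="https://example.com/callback",
    state="abc123",
)
```

Because MemorySync registers the application itself, you never need to build this URL by hand; it is shown only to make clear which permissions appear on the consent screen.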

Setup guide

  1. Open the dashboard — navigate to Dashboard → Integrations → OneDrive.
  2. Click Connect — you will be redirected to Microsoft's sign-in and consent screen. MemorySync has already registered the Microsoft application, so you do not need to configure anything on the Microsoft side.
  3. Sign in and approve permissions — review the requested permissions (Files.Read.All, User.Read, offline_access) and approve. You will be redirected back to the dashboard.

Delta sync model

The OneDrive provider uses Microsoft Graph's delta query API for incremental sync:

  1. First sync — the provider calls the delta endpoint without a deltaLink, which returns all files in the drive. The response includes a deltaLink for the next sync.
  2. Subsequent syncs — the provider calls the stored deltaLink, which returns only files that changed since the last query. The new deltaLink is stored for the next cycle.
  3. Deletions — deleted files appear in the delta response with a deleted marker. The provider removes the corresponding memory.
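
The three steps above can be sketched as a single loop. This is a minimal illustration, not the provider's actual implementation: `run_delta_sync`, `fetch_page`, and the in-memory `memories` dict are hypothetical names. The HTTP call is passed in as a function so the logic can be exercised without network access. The response keys (`value`, `@odata.nextLink`, `@odata.deltaLink`, and the `deleted` facet) are those returned by Microsoft Graph delta queries.

```python
from typing import Callable, Dict, Optional

# Microsoft Graph delta endpoint for the signed-in user's default drive.
INITIAL_DELTA_URL = "https://graph.microsoft.com/v1.0/me/drive/root/delta"


def run_delta_sync(
    fetch_page: Callable[[str], dict],
    memories: Dict[str, dict],
    delta_link: Optional[str] = None,
) -> str:
    """Apply one sync cycle to `memories` and return the new deltaLink."""
    # First sync: no stored deltaLink, so start from the plain delta endpoint.
    url = delta_link or INITIAL_DELTA_URL
    while True:
        page = fetch_page(url)
        for item in page.get("value", []):
            if "deleted" in item:
                # Deleted files carry a `deleted` facet: drop the memory.
                memories.pop(item["id"], None)
            elif "file" in item:
                # Folders have no `file` facet and are skipped.
                memories[item["id"]] = item
        if "@odata.nextLink" in page:
            url = page["@odata.nextLink"]    # more pages in this cycle
        else:
            return page["@odata.deltaLink"]  # store this for the next cycle


# Canned responses standing in for Graph, keyed by request URL.
pages = {
    INITIAL_DELTA_URL: {
        "value": [{"id": "a", "file": {}}, {"id": "root", "folder": {}}],
        "@odata.nextLink": "page2",
    },
    "page2": {"value": [{"id": "b", "file": {}}], "@odata.deltaLink": "delta1"},
    "delta1": {
        "value": [{"id": "a", "deleted": {"state": "deleted"}}],
        "@odata.deltaLink": "delta2",
    },
}

memories: Dict[str, dict] = {}
first_link = run_delta_sync(pages.__getitem__, memories)               # full sync
second_link = run_delta_sync(pages.__getitem__, memories, first_link)  # incremental
```

After the second call, only item `b` remains in `memories`, because item `a` arrived with a deleted marker.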

Content extraction

Office documents (DOCX, PPTX, XLSX) are zip archives containing XML files. The provider extracts text without requiring Microsoft Office:

| Format | XML file parsed | Extraction method |
| --- | --- | --- |
| .docx | word/document.xml | Iterates over paragraph elements and extracts text from each run. |
| .pptx | ppt/slides/slide*.xml | Parses each slide's XML for text frames and body content. |
| .xlsx | xl/sharedStrings.xml | Reads the shared strings table, which contains all cell text values. |
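
The table above maps onto a few lines of standard-library Python. The sketch below assumes each file is already downloaded as bytes; the function names are hypothetical, and the provider's real extraction may handle more cases (for example headers, footers, or speaker notes). The XML namespaces are the standard Office Open XML ones.

```python
import io
import re
import zipfile
import xml.etree.ElementTree as ET

# Standard Office Open XML namespaces, in ElementTree's {uri}tag form.
W = "{http://schemas.openxmlformats.org/wordprocessingml/2006/main}"
A = "{http://schemas.openxmlformats.org/drawingml/2006/main}"
S = "{http://schemas.openxmlformats.org/spreadsheetml/2006/main}"


def docx_text(data: bytes) -> str:
    """Join the text runs of each paragraph in word/document.xml."""
    with zipfile.ZipFile(io.BytesIO(data)) as z:
        root = ET.fromstring(z.read("word/document.xml"))
    paragraphs = []
    for p in root.iter(W + "p"):
        paragraphs.append("".join(t.text or "" for t in p.iter(W + "t")))
    return "\n".join(paragraphs)


def pptx_text(data: bytes) -> str:
    """Collect the text elements of every slide, in slide-number order."""
    with zipfile.ZipFile(io.BytesIO(data)) as z:
        names = [n for n in z.namelist()
                 if re.fullmatch(r"ppt/slides/slide\d+\.xml", n)]
        # Sort numerically so slide10 comes after slide2.
        names.sort(key=lambda n: int(re.search(r"\d+", n).group()))
        slides = []
        for name in names:
            root = ET.fromstring(z.read(name))
            slides.append("\n".join(t.text or "" for t in root.iter(A + "t")))
    return "\n\n".join(slides)


def xlsx_text(data: bytes) -> str:
    """Read every entry of the shared strings table."""
    with zipfile.ZipFile(io.BytesIO(data)) as z:
        root = ET.fromstring(z.read("xl/sharedStrings.xml"))
    entries = []
    for si in root.iter(S + "si"):
        # A shared string may be split into several rich-text runs.
        entries.append("".join(t.text or "" for t in si.iter(S + "t")))
    return "\n".join(entries)
```

Note that the shared strings table holds only text values. Numbers and dates live in the individual sheet XML files, so this approach does not surface them.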

Memory structure

Each file produces a memory with the following metadata:

| Field | Description |
| --- | --- |
| drive_id | The ID of the OneDrive or SharePoint drive. |
| item_id | The drive item ID of the file. |
| site_id | SharePoint site ID (if applicable). |
| web_url | Direct link to the file in OneDrive or SharePoint. |
| mime_type | MIME type of the file. |
| size | File size in bytes. |
| last_modified_by | Name or email of the last person who modified the file. |
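
As a rough illustration of where these fields could come from, the sketch below maps a Microsoft Graph driveItem (as a parsed JSON dict) onto the metadata keys in the table. The mapping is an assumption rather than the provider's documented behavior, and `build_metadata` is a hypothetical name. In particular, the `email` fallback for the modifier is a guess; Graph identities always document `displayName` but not necessarily an email address.

```python
from typing import Any, Dict


def build_metadata(item: Dict[str, Any]) -> Dict[str, Any]:
    """Map a Graph driveItem dict onto the memory metadata fields."""
    parent = item.get("parentReference", {})
    user = item.get("lastModifiedBy", {}).get("user", {})
    return {
        "drive_id": parent.get("driveId"),
        "item_id": item.get("id"),
        "site_id": parent.get("siteId"),  # None for personal OneDrive files
        "web_url": item.get("webUrl"),
        "mime_type": item.get("file", {}).get("mimeType"),
        "size": item.get("size"),
        # Prefer the display name; fall back to an email if one is present.
        "last_modified_by": user.get("displayName") or user.get("email"),
    }


# Made-up sample item for illustration.
sample_item = {
    "id": "01ABCDEF",
    "webUrl": "https://contoso.sharepoint.com/Shared%20Documents/plan.docx",
    "size": 18432,
    "file": {"mimeType": "application/vnd.openxmlformats-officedocument."
                         "wordprocessingml.document"},
    "parentReference": {"driveId": "b!xyz", "siteId": "site-123"},
    "lastModifiedBy": {"user": {"displayName": "Dana Smith"}},
}
metadata = build_metadata(sample_item)
```

Missing keys resolve to `None` instead of raising, which suits `site_id` since it applies only to SharePoint files.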

Troubleshooting

  • Tenant admin consent required — in some Azure AD configurations, an admin must approve the app permissions before any user in the tenant can authorize it. Contact your Azure AD admin to grant consent.
  • Conditional Access policies — if your organization uses Conditional Access, the OneDrive integration's service calls may be blocked. Ensure the MemorySync app is allowed through your CA policies.
  • SharePoint vs OneDrive scoping — Files.Read.All grants access to both OneDrive and SharePoint document libraries. If you only want OneDrive files, the sync can be configured to filter by drive type.
  • Token expiration — Microsoft access tokens expire after approximately 1 hour. The provider automatically refreshes them using the offline_access refresh token. If refresh fails, reconnect from the dashboard.
  • Large file handling — very large Office documents may time out during download. The provider enforces content size limits and skips files that exceed them.
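
To make the drive-type filter and the size limit concrete, here is a small sketch of a pre-download check. The 25 MiB limit, the `should_sync` name, and the default allowed types are all assumptions for illustration; the actual limits and configuration options are not stated on this page. The `driveType` values (`personal`, `business`, `documentLibrary`) are the ones Microsoft Graph reports.

```python
from typing import Any, Dict, Iterable

# Hypothetical limit; the provider's real threshold is not documented here.
MAX_BYTES = 25 * 1024 * 1024

# Graph drive types: "personal" and "business" are OneDrive,
# "documentLibrary" is a SharePoint document library.
ONEDRIVE_ONLY = ("personal", "business")


def should_sync(
    item: Dict[str, Any],
    allowed_drive_types: Iterable[str] = ONEDRIVE_ONLY,
    max_bytes: int = MAX_BYTES,
) -> bool:
    """Return True if the driveItem passes the drive-type and size checks."""
    if "file" not in item:
        return False  # folders and other non-file items
    drive_type = item.get("parentReference", {}).get("driveType")
    if drive_type not in tuple(allowed_drive_types):
        return False  # e.g. skip SharePoint libraries
    return item.get("size", 0) <= max_bytes


small_onedrive = {"file": {}, "size": 1024,
                  "parentReference": {"driveType": "business"}}
sharepoint_doc = {"file": {}, "size": 1024,
                  "parentReference": {"driveType": "documentLibrary"}}
huge_onedrive = {"file": {}, "size": MAX_BYTES + 1,
                 "parentReference": {"driveType": "personal"}}

results = [should_sync(i) for i in (small_onedrive, sharepoint_doc, huge_onedrive)]
```

Checking the size reported in the item metadata before downloading avoids starting a transfer that is likely to time out.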