MemorySync
Integrations

GitHub

The GitHub integration syncs repositories, issues, pull requests, commits, and discussions into MemorySync as queryable memories. It uses the GitHub REST API with OAuth 2.0 authentication and supports incremental sync to avoid re-fetching unchanged data.

What gets synced

The GitHub provider fetches the following object types, each becoming a separate memory:

| Type | Content |
| --- | --- |
| repository | Repository description + README contents. Metadata includes full_name, language, stars, forks, topics, default_branch. |
| issue | Issue body text. Both open and closed issues are fetched. Metadata: number, state, labels, assignees, milestone. |
| pull_request | PR body text. Metadata: number, state, base_branch, head_branch, draft, merged, additions, deletions, changed_files. |
| commit | Commit message. Up to 50 recent commits per repository. Metadata: sha, author, verified, stats (additions, deletions, total). |
| pr_comment | Review comments on pull requests. |
| discussion | GitHub Discussions threads. |

Required OAuth scopes

| Scope | Why it's needed |
| --- | --- |
| repo | Read access to private repositories, issues, pull requests, and commits. Use public_repo if you only want public repos. |
| read:org | Read organization membership to list repos the user has access to. |
| read:discussion | Read access to GitHub Discussions. |
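
For orientation, the scopes above are what an OAuth authorization request to GitHub would carry. MemorySync builds this request for you when you click Connect, so the following is only a minimal sketch of its shape. The function name, client ID, and redirect URI are placeholders, not MemorySync values.

```python
import secrets
from urllib.parse import urlencode

GITHUB_AUTHORIZE_URL = "https://github.com/login/oauth/authorize"

# Scopes from the table above.
PRIVATE_SCOPES = ["repo", "read:org", "read:discussion"]
PUBLIC_ONLY_SCOPES = ["public_repo", "read:org", "read:discussion"]


def build_authorize_url(client_id, redirect_uri, public_only=False, state=None):
    """Return (url, state) for a GitHub OAuth authorization request.

    Illustrative helper: client_id and redirect_uri are placeholders.
    """
    scopes = PUBLIC_ONLY_SCOPES if public_only else PRIVATE_SCOPES
    # A random state value lets the callback detect forged redirects.
    state = state or secrets.token_urlsafe(16)
    query = urlencode({
        "client_id": client_id,
        "redirect_uri": redirect_uri,
        "scope": " ".join(scopes),  # GitHub expects space-delimited scopes
        "state": state,
    })
    return f"{GITHUB_AUTHORIZE_URL}?{query}", state
```

Passing public_only=True swaps repo for public_repo, which matches the narrower option mentioned in the table.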

Setup guide

  1. Open the dashboard: navigate to Dashboard → Integrations → GitHub.
  2. Click Connect: you will be redirected to GitHub's OAuth consent screen. MemorySync already has the GitHub app registered, so you do not need to configure anything on the GitHub side.
  3. Authorize the requested scopes: review the permissions GitHub displays and approve. You will be redirected back to the dashboard.
  4. Select repositories: choose which repositories to sync. The initial sync starts immediately and fetches all selected content.

Incremental sync

After the initial sync, the GitHub provider uses smart incremental strategies to minimize API calls:

  • Repository-level skip: each repo's updated_at timestamp is compared to the last sync time. Repos that haven't changed since then are skipped entirely.
  • Issues: uses the since parameter to fetch only issues updated after the last sync.
  • Pull requests: listed by updated time, descending, with the last sync time as the cutoff. Once a PR that was last updated before the cutoff is encountered, pagination stops.
  • Commits: limited to the most recent 50 per repository per sync cycle.
  • Pagination safety: a hard limit of 10 pages per paginated listing prevents runaway pagination on very large repositories.
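
The strategies above can be sketched roughly as follows. This is an illustration under stated assumptions, not the provider's actual code: fetch_page is a hypothetical callable standing in for a paginated GitHub request, and the function names are invented for the example.

```python
from datetime import datetime, timezone

MAX_PAGES = 10  # pagination cap described above
TS_FORMAT = "%Y-%m-%dT%H:%M:%SZ"  # GitHub-style ISO 8601 timestamps


def parse_ts(value):
    """Parse a timestamp such as 2024-05-01T12:00:00Z into an aware datetime."""
    return datetime.strptime(value, TS_FORMAT).replace(tzinfo=timezone.utc)


def repo_needs_sync(repo, last_sync):
    """Repository-level skip: only sync repos updated after the last sync."""
    return parse_ts(repo["updated_at"]) > last_sync


def issue_query_params(last_sync):
    """Issues: let the API filter by update time via the `since` parameter."""
    return {"state": "all", "since": last_sync.strftime(TS_FORMAT)}


def collect_updated_pulls(fetch_page, last_sync):
    """Pull requests: walk pages sorted by update time, newest first.

    `fetch_page(page)` is a hypothetical callable returning the list of PR
    dicts for that page number, or an empty list when results run out.
    """
    fresh = []
    for page in range(1, MAX_PAGES + 1):
        pulls = fetch_page(page)
        if not pulls:
            break
        for pull in pulls:
            if parse_ts(pull["updated_at"]) <= last_sync:
                return fresh  # everything after this is older: stop paginating
            fresh.append(pull)
    return fresh
```

Because the pull request listing is ordered newest first, the first stale entry proves that every later entry is stale too, which is why the loop can return early instead of reading all pages.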

Memory structure

Each synced object produces a memory with type-specific metadata. Here's what gets stored for each type:

| Type | Content | Key metadata fields |
| --- | --- | --- |
| Repository | Description + README | full_name, language, stars, forks, topics |
| Issue | Issue body | repo, number, state, labels, assignees |
| Pull request | PR body | base_branch, head_branch, draft, merged, additions/deletions |
| Commit | Commit message | sha, author, verified, stats |
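
To make the table concrete, here is a minimal sketch of how an issue from the GitHub REST API might map onto a memory. The Memory class and its field names are assumptions made for illustration; the actual storage schema is not documented here. Only the metadata keys come from the table above.

```python
from dataclasses import dataclass, field
from typing import Any


@dataclass
class Memory:
    """Hypothetical shape of one synced memory; field names are illustrative."""
    type: str      # "repository", "issue", "pull_request", "commit", ...
    content: str   # the text stored as the memory body
    metadata: dict[str, Any] = field(default_factory=dict)


def issue_to_memory(repo_full_name, issue):
    """Map a GitHub REST issue payload to a memory using the table's fields."""
    return Memory(
        type="issue",
        content=issue.get("body") or "",  # GitHub returns null for empty bodies
        metadata={
            "repo": repo_full_name,
            "number": issue["number"],
            "state": issue["state"],
            "labels": [label["name"] for label in issue.get("labels", [])],
            "assignees": [user["login"] for user in issue.get("assignees", [])],
        },
    )
```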

Webhook events

When a GitHub webhook is configured, the provider parses incoming events to create, update, or delete memories in real time:

  • issues.opened / issues.edited / issues.closed: creates or updates the corresponding issue memory.
  • pull_request.opened / pull_request.edited / pull_request.closed / pull_request.merged: creates or updates the PR memory. Note that GitHub itself reports a merge as a closed action with the merged flag set to true.
  • discussion.*: handles discussion creation and updates.
  • issues.deleted / pull_request.deleted: marks the memory for deletion.
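
The event handling above can be sketched as a small routing function. GitHub sends the event name in the X-GitHub-Event header and the action in the JSON payload's action field. The function names and the "upsert" / "delete" / "ignore" labels are invented for this example, and treating every discussion action as an upsert is an assumption.

```python
# Actions that create or update a memory, keyed by X-GitHub-Event value.
UPSERT_ACTIONS = {
    "issues": {"opened", "edited", "closed"},
    "pull_request": {"opened", "edited", "closed"},
}
# Actions that mark a memory for deletion.
DELETE_ACTIONS = {
    "issues": {"deleted"},
    "pull_request": {"deleted"},
}


def is_merge(payload):
    """GitHub signals a merge as action "closed" with pull_request.merged true."""
    pull = payload.get("pull_request") or {}
    return payload.get("action") == "closed" and bool(pull.get("merged"))


def route_event(event, payload):
    """Return "upsert", "delete", or "ignore" for one webhook delivery.

    `event` is the X-GitHub-Event header value; `payload` is the decoded body.
    """
    action = payload.get("action")
    if action in DELETE_ACTIONS.get(event, set()):
        return "delete"
    if event == "discussion":
        return "upsert"  # assumption: all discussion.* actions are upserts
    if action in UPSERT_ACTIONS.get(event, set()):
        return "upsert"
    return "ignore"
```

A merged pull request therefore takes the same path as any other closed one, and is_merge can be used to set the merged metadata field.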

The provider validates the webhook signature to ensure the payload is authentic before processing.
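
As an illustration of this kind of check, GitHub signs each delivery with HMAC-SHA256 over the raw request body and sends the result in the X-Hub-Signature-256 header. The sketch below shows the standard verification; the function name is made up, and how MemorySync stores the webhook secret is not covered here.

```python
import hashlib
import hmac


def verify_signature(secret: bytes, body: bytes, signature_header: str) -> bool:
    """Check an X-Hub-Signature-256 header value against the raw request body."""
    if not signature_header or not signature_header.startswith("sha256="):
        return False
    expected = "sha256=" + hmac.new(secret, body, hashlib.sha256).hexdigest()
    # compare_digest runs in constant time, so it does not leak how many
    # leading characters of the signature matched.
    return hmac.compare_digest(expected.encode(), signature_header.encode())
```

The body must be the exact bytes received. Re-serializing the parsed JSON can change whitespace or key order and make a valid signature fail.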

Rate limits & tips

  • 5,000 requests per hour: GitHub enforces this limit for authenticated OAuth requests. MemorySync tracks X-RateLimit-Remaining and pauses sync when approaching the limit.
  • Pagination capped at 10 pages: prevents runaway API consumption on very large repos with thousands of issues.
  • Empty repos return HTTP 409: the provider handles this gracefully and skips the repository without failing the sync.
  • Bot activity is skipped: commits and issues authored by bot accounts are excluded from sync to reduce noise.
  • Sync is one-way: MemorySync never writes back to GitHub. All data flows from GitHub into MemorySync.
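
A few of the behaviors above can be sketched as small helpers. This is a rough illustration, not the provider's implementation: the LOW_WATER_MARK threshold and all function names are assumptions. The X-RateLimit-Remaining and X-RateLimit-Reset headers and the "Bot" account type are part of GitHub's API.

```python
import time

LOW_WATER_MARK = 50  # hypothetical threshold; the real value is not documented


def seconds_to_pause(headers, now=None):
    """Return how long to wait before the next request.

    Uses X-RateLimit-Remaining and X-RateLimit-Reset (UTC epoch seconds).
    """
    now = time.time() if now is None else now
    remaining = int(headers.get("X-RateLimit-Remaining", LOW_WATER_MARK + 1))
    if remaining > LOW_WATER_MARK:
        return 0.0
    reset_at = int(headers.get("X-RateLimit-Reset", now))
    return max(0.0, reset_at - now)


def is_bot(user):
    """GitHub marks app accounts with type "Bot"; their logins end in "[bot]"."""
    if not user:
        return False
    return user.get("type") == "Bot" or user.get("login", "").endswith("[bot]")


def should_skip_repo(status_code):
    """An empty repository returns HTTP 409; skip it rather than fail the sync."""
    return status_code == 409
```

Waiting until the reset time rather than a fixed interval means the sync resumes as soon as GitHub restores the quota.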