Debug Toolkit

Every API response, every error, and every internal state transition in MemorySync is traceable. This page covers the built-in diagnostics that ship with the platform — no extra configuration, no third-party tooling required. Master these tools and you can resolve most issues without ever contacting support.

What You Get Out of the Box

MemorySync includes five built-in debugging instruments that are active on every request, with zero setup:

Instrument	How It Helps	Where to Find It
`X-Request-ID` header	Unique per-request identifier — quote it in every support ticket	Every HTTP response
`X-Trace-ID` header	Cross-system correlation for distributed tracing	Every HTTP response
`GET /health` endpoint	Real-time platform health check with per-component status	Unauthenticated, always available
Audit Log	Every mutating API call logged with full context	Dashboard → Audit Logs
Structured error envelopes	Machine-readable error codes with human-readable messages	Every error response body

These instruments work together. For example, when a webhook delivery fails, the delivery record carries the trace ID from the originating API request, so you can follow the full lifecycle from ingestion to webhook delivery in a single search.

The Health Endpoint

The GET /health endpoint checks the platform’s core dependencies and returns a structured JSON response. It requires no authentication and is rate-limited per IP to keep public probing traffic predictable.

// Healthy response
{
  "status": "ok",
  "db": "healthy",
  "vector_db": "healthy",
  "cache": "healthy"
}

// Degraded response (one component down)
{
  "status": "degraded",
  "db": "healthy",
  "vector_db": "unhealthy",
  "cache": "healthy"
}

How to interpret:

"status": "ok" — all three components responding. Your issue is likely in your request, not the platform.
"status": "degraded" — at least one component is down. Check which one is "unhealthy" to narrow the impact area.
"db": "unhealthy" — the primary datastore is unreachable. All write operations and most reads will fail.
"vector_db": "unhealthy" — the vector index is unreachable. Memory queries will fail, but CRUD operations on metadata still work.
"cache": "unhealthy" — the cache layer is unreachable. Performance may degrade but operations continue while the platform recovers.

⚠️ Important: A healthy /health response does NOT guarantee your specific request will succeed. Health checks only verify connectivity to backend components — they don't check your API key validity, quota status, or data state.
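The interpretation rules above can be sketched as a small helper. This is only an illustration, assuming the `/health` body has exactly the fields shown in the samples. The `IMPACT` table and the `summarize_health` name are hypothetical and not part of MemorySync.

```python
import json

# Impact notes paraphrased from the interpretation list above.
IMPACT = {
    "db": "writes and most reads will fail",
    "vector_db": "memory queries will fail; metadata CRUD still works",
    "cache": "performance may degrade; operations continue",
}


def summarize_health(payload: dict) -> list[str]:
    """Return one line per component that is not "healthy" (empty list if all are).

    A missing component key is treated as not healthy, which is a cautious
    assumption rather than documented platform behavior.
    """
    return [
        f"{name}: {impact}"
        for name, impact in IMPACT.items()
        if payload.get(name) != "healthy"
    ]


degraded = json.loads(
    '{"status": "degraded", "db": "healthy", "vector_db": "unhealthy", "cache": "healthy"}'
)
for line in summarize_health(degraded):
    print(line)
```

An empty result only tells you that the backend components are reachable. As noted above, it says nothing about your API key, quota, or data state.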

Request Tracing

Every request flowing through MemorySync can carry up to three identifiers that enable end-to-end tracing. Two are always present on the response; the third is optional:

Header	Direction	Purpose
`X-Request-ID`	Request → Response	Unique per-request. If you send it, MemorySync echoes it back. If you don’t, the platform generates one for you.
`X-Trace-ID`	Request → Response	Cross-system correlation. If not provided, defaults to the request ID so every request participates in a trace.
`X-Parent-Span-ID`	Request only	Optional. Links this call to its upstream span for trace viewers.

How to use in practice:

# Send a request with your own trace context
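# -i prints the response headers; -d sends a POST with the JSON body
curl -i -H "X-API-Key: ms_..." \
     -H "Content-Type: application/json" \
     -H "X-Trace-ID: my-trace-123" \
     -H "X-Request-ID: req-abc-001" \
     https://api.memorysync.io/memory/query \
     -d '{"query": "user preferences"}'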

# Response headers will echo back:
# X-Request-ID: req-abc-001
# X-Trace-ID: my-trace-123

Audit log entries and webhook deliveries carry the same identifiers. When a webhook fails, its delivery record references the trace ID of the API request that triggered it, so you can follow the path from the original call all the way to the failed delivery.
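If you call the API from application code, the same headers can be generated per request. The following is a minimal sketch, assuming the header names from the table above. The function names `build_trace_headers` and `echoed_correctly` are hypothetical helpers, not part of any MemorySync SDK.

```python
import uuid


def build_trace_headers(api_key, trace_id=None, parent_span_id=None):
    """Build request headers carrying a fresh request ID and a trace ID."""
    request_id = str(uuid.uuid4())
    headers = {
        "X-API-Key": api_key,
        "X-Request-ID": request_id,
        # Mirror the documented default: with no trace ID, the request ID doubles as one.
        "X-Trace-ID": trace_id or request_id,
    }
    if parent_span_id:
        # Optional link to the upstream span.
        headers["X-Parent-Span-ID"] = parent_span_id
    return headers


def echoed_correctly(sent, received):
    """Check that the response echoed the IDs we sent.

    Header names are compared case-insensitively, since HTTP header names
    are not case-sensitive.
    """
    got = {name.lower(): value for name, value in received.items()}
    return all(
        got.get(name.lower()) == sent[name]
        for name in ("X-Request-ID", "X-Trace-ID")
    )
```

Reusing one `trace_id` across several calls groups them under a single trace, while each call still gets its own request ID.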

Structured Error Format

Every error response from MemorySync follows a consistent JSON envelope. The exact shape depends on the error type, but the outer structure is always predictable:

// Standard error (4xx/5xx)
{
  "error": {
    "code": "RATE_LIMIT_EXCEEDED",
    "message": "Rate limit exceeded (per-minute). Try again in 12 seconds."
  },
  "request_id": "a1b2c3d4-e5f6-..."
}

// Validation error (422)
{
  "error": {
    "code": "VALIDATION_ERROR",
    "message": "query: Field required",
    "details": [
      {
        "loc": ["body", "query"],
        "msg": "Field required",
        "type": "missing"
      }
    ]
  },
  "request_id": "a1b2c3d4-e5f6-..."
}

Key fields to check:

error.code — machine-readable error type. Use this in your error-handling code.
error.message — human-readable explanation. Includes specific details like which rate-limit layer blocked you.
error.details — for validation errors only. Array of per-field errors with location path, message, and type.
request_id — always present. Use it to search audit logs and correlate with support.
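A client can turn these fields into a typed value before branching on `error.code`. This is a sketch, assuming only the envelope shapes shown on this page. `ApiError` and `parse_error` are hypothetical names, and the string-valued `error` branch covers the generic 500 body documented under Server-Side Crash Log.

```python
import json
from dataclasses import dataclass, field


@dataclass
class ApiError:
    code: str
    message: str
    request_id: str
    details: list = field(default_factory=list)

    def failed_fields(self):
        """Dotted paths of fields named in a validation error, e.g. 'body.query'."""
        return [".".join(str(part) for part in d.get("loc", [])) for d in self.details]


def parse_error(body: str) -> ApiError:
    """Parse an error response body into an ApiError."""
    doc = json.loads(body)
    err = doc["error"]
    if isinstance(err, str):
        # Generic 500 body: the code and message sit at the top level.
        return ApiError(code=err, message=doc.get("message", ""), request_id=doc["request_id"])
    return ApiError(
        code=err["code"],
        message=err["message"],
        request_id=doc["request_id"],
        details=err.get("details", []),
    )
```

Branch on `code` rather than on `message`: the message is written for people and includes request-specific details such as the retry delay.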

Audit Log

Every mutating API call is recorded in the audit log with full context. The audit log is queryable in the dashboard and captures these fields for every entry:

Field	Example	How to Use
`category`	`"webhook"`	Filter by domain (webhook, billing, memory, team, etc.)
`action`	`"webhook.endpoint_created"`	Exact operation performed
`resource_type`	`"webhook_endpoint"`	What was affected
`resource_id`	`"42"`	Which specific resource
`actor_email`	`"dev@example.com"`	Who triggered it
`severity`	`"warning"`	info / warning / critical
`ip_address`	`"203.0.113.50"`	Client IP that initiated the request
`success`	`false`	Whether the operation succeeded

Debugging workflow: When a webhook delivery fails, an audit entry is recorded with severity: "critical" (final attempt) or severity: "warning" (retries remain). The entry carries the delivery ID, endpoint name, event type, status code, and error message — everything you need to diagnose the failure from the dashboard.
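The same filtering can be done offline if you have audit entries as data. The sketch below assumes the entries are available as a list of dictionaries keyed by the field names in the table above. This page only describes the dashboard, so that data source is an assumption, and the `webhook.delivery_failed` action name in the sample is hypothetical.

```python
# Lower rank sorts first, so final-attempt failures lead the list.
SEVERITY_RANK = {"critical": 0, "warning": 1, "info": 2}


def failed_webhook_entries(entries):
    """Keep failed webhook entries, most severe first."""
    failed = [
        e for e in entries
        if e.get("category") == "webhook" and e.get("success") is False
    ]
    # Unknown severities sort after the known ones.
    return sorted(failed, key=lambda e: SEVERITY_RANK.get(e.get("severity"), len(SEVERITY_RANK)))


sample = [
    {"category": "webhook", "action": "webhook.delivery_failed", "severity": "warning", "success": False},
    {"category": "webhook", "action": "webhook.endpoint_created", "severity": "info", "success": True},
    {"category": "webhook", "action": "webhook.delivery_failed", "severity": "critical", "success": False},
]
for entry in failed_webhook_entries(sample):
    print(entry["severity"], entry["action"])
```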

Server-Side Crash Log

When an unexpected error occurs that doesn’t match a typed error case (validation, HTTP, permission), the platform does two things:

Returns a generic 500 response: {"error": "InternalServerError", "message": "An unexpected error occurred", "request_id": "..."}. Raw error details are never returned to the client.
Records the failure server-side for post-mortem analysis by the support and engineering teams.

If you see a 500 error, the request_id in the response is the key. Provide it to support so they can locate the exact server-side record for your request.

💡 Tip: A 500 with an InternalServerError code always means a bug in the platform, not in your request. If the issue is in your request data, you'll get a 400 or 422 with a specific message instead.

First Five Minutes Checklist

When something goes wrong, follow this systematic triage sequence. Most issues can be diagnosed within these five steps:

Call GET /health — If all three components show "healthy", the platform infrastructure is fine and the issue is in your request or configuration. If any component is "unhealthy", that narrows the blast radius immediately.
Check the HTTP status code: 401 → authentication problem, 403 → permission problem (see Auth Issues for both). 422 → request body validation failure. 429 → rate limited (see Rate Limit Issues). 500 → platform bug (provide the request_id to support).
Read the error.code — Every error response contains a machine-readable code like RATE_LIMIT_EXCEEDED, VALIDATION_ERROR, or TENANT_CONTEXT_REQUIRED. This tells you exactly what went wrong.
Copy the X-Request-ID header — This is the single most important piece of debugging data. It lets support trace the exact request through every subsystem.
Search the audit log — In the dashboard, filter by your tenant and the affected resource type. Look for entries with success: false around the same timestamp.

✅ Pro tip: Replay the failing request with curl -v and include -H "X-Request-ID: debug-$(date +%s)". This gives you a custom, searchable request ID and -v shows you all response headers including rate limit details.
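The first two checklist steps can be condensed into a small decision helper. This is a sketch of the triage order described above, not platform code. `next_step` is a hypothetical name, and the returned strings are illustrative summaries.

```python
def next_step(status_code, health=None):
    """Suggest the next triage step from a status code and an optional /health payload."""
    if health:
        # Step 1: any component that is not "healthy" narrows the impact area.
        down = [name for name, state in health.items() if name != "status" and state != "healthy"]
        if down:
            return "platform degraded: " + ", ".join(down)
    # Step 2: map the status code to the likely cause.
    if status_code in (401, 403):
        return "check credentials and permissions"
    if status_code == 422:
        return "fix the request body using error.details"
    if status_code == 429:
        return "rate limited: back off, then retry"
    if status_code == 500:
        return "platform bug: send the request_id to support"
    # Steps 3 to 5 apply to everything else.
    return "read error.code, then search the audit log for success: false"


print(next_step(429))
print(next_step(200, {"status": "degraded", "db": "healthy", "vector_db": "unhealthy", "cache": "healthy"}))
```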
