Silent Degradation
When your organization hits a quota limit, MemorySync doesn’t crash your LLM pipeline with an error. Instead, the silent degradation layer returns well-formed, empty-but-valid responses so your application keeps running while internal signals flag the quota breach for investigation.
The Silent Contract
Silent degradation is a design principle: billing issues must never surface as errors in downstream LLM pipelines or end-user experiences. When a quota limit is reached, the platform degrades gracefully rather than failing loudly.
This means:
- Memory add requests return `200 OK` with a valid response body. The request is short-circuited: the memory is not stored, not embedded, and not indexed.
- Memory query requests return `200 OK` with an empty `memories` array, as if no relevant memories were found.
- No error codes ever reach the caller. Quota exhaustion is always represented as a successful empty response.
- Internal usage telemetry tracks every silently skipped operation so your team can investigate quota issues from the dashboard.
How Silent Mode Works
When the platform determines that a request would exceed the organization’s monthly plan quota, it intercepts the request before any pipeline work runs:
- Quota check. The quota dependency runs before the route handler. If the monthly quota is exhausted, the request is flagged for silent degradation and no further work is performed.
- Pipeline short-circuit. Embedding, vector indexing, recall ranking, and database writes are all skipped — nothing is persisted and no memory ID is generated.
- Canonical response. The handler returns the appropriate silent response template: a success body for adds, an empty memories array for queries.
- Usage counter. The billing usage counter is not incremented for silently skipped requests — the counter stays at the limit rather than growing past it.
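The four steps above can be sketched in plain Python. This is an illustrative model under stated assumptions, not MemorySync's actual implementation: `QuotaState`, `handle_add`, `handle_query`, and the `store` and `recall` callables are hypothetical names, the `memory_id` field on a normal add response is assumed, and the sketch assumes adds and queries draw on the same counter.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class QuotaState:
    """Hypothetical per-organization quota record (not a real MemorySync type)."""
    limit: int
    used: int = 0
    skipped: List[str] = field(default_factory=list)  # stand-in for internal telemetry

    def exhausted(self) -> bool:
        return self.used >= self.limit


def handle_add(quota: QuotaState, text: str, store: Callable[[str], str]) -> dict:
    # Quota check: runs before any pipeline work.
    if quota.exhausted():
        # Short-circuit: record the skip, leave the counter untouched,
        # and return the canonical success body without calling store().
        quota.skipped.append("add")
        return {"status": "ok"}
    quota.used += 1
    memory_id = store(text)  # embedding, indexing, and the database write would live here
    return {"status": "ok", "memory_id": memory_id}  # normal shape is an assumption


def handle_query(quota: QuotaState, query: str, recall: Callable[[str], list]) -> dict:
    if quota.exhausted():
        quota.skipped.append("query")
        return {"memories": []}  # same shape as a query with no matches
    quota.used += 1
    return {"memories": recall(query)}
```

With `QuotaState(limit=1)`, the first add reaches `store` and the second returns the silent body while `used` stays at 1.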
The Two Response Types
| Operation | HTTP Status | Response Shape |
|---|---|---|
| `POST /memory/add` | `200 OK` | A success body of the shape `{"status":"ok"}`. No memory is created, no embedding is performed, and no database write occurs. The request leaves no trace beyond internal telemetry. |
| `POST /memory/query` | `200 OK` | A query response with an empty `memories` array. The caller sees zero results as if the query had no matches. |
Both response types are structurally valid and indistinguishable from a normal empty result on the wire. Silent degradation is invisible to the API consumer by design.
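Because both silent responses are well-formed, client code needs no quota-specific branch. The sketch below shows one way a caller might consume a query response. The helper names `extract_memories` and `build_prompt` are hypothetical, the query body is assumed to contain only a `memories` key, and memories are assumed to be plain strings.

```python
import json
from typing import List

# Silent response bodies as described in the table above.
SILENT_ADD_BODY = '{"status":"ok"}'
SILENT_QUERY_BODY = '{"memories":[]}'  # assumed minimal shape


def extract_memories(body: str) -> List[str]:
    """Return the memories list from a query response body."""
    payload = json.loads(body)
    return payload.get("memories", [])


def build_prompt(question: str, memories: List[str]) -> str:
    """Prepend recalled memories, if any, to the question."""
    if not memories:
        return question  # same path for "no matches" and "quota exhausted"
    context = "\n".join(f"- {m}" for m in memories)
    return f"Known context:\n{context}\n\nQuestion: {question}"
```

A silent query response and a genuine no-match response take the same path through `build_prompt`, so the LLM call proceeds without recalled context in either case.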
Internal Signal Emission
Every silent skip is recorded in the internal usage telemetry. Each entry includes the organization, the operation type (add or query), and the timestamp. These records power the dashboard’s quota and usage views and can be surfaced via webhooks when silent skips exceed a configurable rate.
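As a rough illustration of how such records could drive a rate-based alert, the sketch below keeps recent skips in a sliding window. The `SilentSkip` fields mirror the three attributes listed above. `SkipRateMonitor`, `max_skips`, and `window_seconds` are hypothetical names, and the sketch assumes one monitor per organization with records arriving in timestamp order.

```python
from collections import deque
from dataclasses import dataclass


@dataclass(frozen=True)
class SilentSkip:
    organization: str
    operation: str    # "add" or "query"
    timestamp: float  # seconds since the epoch


class SkipRateMonitor:
    """Counts skips inside a sliding time window (hypothetical alerting helper)."""

    def __init__(self, max_skips: int, window_seconds: float) -> None:
        self.max_skips = max_skips
        self.window_seconds = window_seconds
        self._recent: deque = deque()

    def record(self, skip: SilentSkip) -> bool:
        """Store the skip; return True when the window holds more than max_skips."""
        self._recent.append(skip)
        cutoff = skip.timestamp - self.window_seconds
        # Drop records that have aged out of the window.
        while self._recent and self._recent[0].timestamp <= cutoff:
            self._recent.popleft()
        return len(self._recent) > self.max_skips
```

A `True` return value is the point at which a webhook notification would be sent in this model.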
Integration with Billing
Silent degradation sits at the boundary between rate limiting and pipeline execution:
- Rate limiter. Per-request rate limits are evaluated first. If a rate limit is hit, a standard `429 Too Many Requests` response is returned. Rate limits are independent of plan quotas and are not silent.
- Quota dependency. Before the route handler executes, the platform performs an atomic check-and-increment against the organization’s monthly quota. If the limit has been reached, the request is flagged for silent degradation and the counter is not incremented.
- Handler short-circuit. The route handler detects the flag and returns the canonical silent response without invoking the gatekeeper, embedding, or any database write.
- Recall path. For query requests, the recall engine is bypassed entirely and an empty result is returned immediately.
This design keeps quota enforcement centralized at the request boundary while keeping the pipeline itself unaware of billing state.
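The ordering and the atomic check-and-increment can be modeled with a lock-guarded counter. This is a minimal in-memory sketch, assuming a single process. `MonthlyQuota`, `try_consume`, and `process` are hypothetical names, the `429` body shape is illustrative, and a production system would more likely perform the check-and-increment in a shared data store.

```python
import threading
from typing import Callable, Tuple


class MonthlyQuota:
    """Hypothetical in-memory quota counter guarded by a lock."""

    def __init__(self, limit: int) -> None:
        self.limit = limit
        self.used = 0
        self._lock = threading.Lock()

    def try_consume(self) -> bool:
        """Atomically increment if under the limit; otherwise leave the counter alone."""
        with self._lock:
            if self.used >= self.limit:
                return False  # counter stays at the limit
            self.used += 1
            return True


def process(rate_limited: bool, quota: MonthlyQuota, operation: str,
            run_pipeline: Callable[[], dict]) -> Tuple[int, dict]:
    """Return (status_code, body) following the ordering described above."""
    # 1. Rate limits come first and are not silent.
    if rate_limited:
        return 429, {"error": "Too Many Requests"}  # illustrative body
    # 2. Quota dependency: flag the request if the quota is exhausted.
    silent = not quota.try_consume()
    # 3./4. Handler short-circuit: the pipeline and recall engine never run.
    if silent:
        return 200, ({"status": "ok"} if operation == "add" else {"memories": []})
    return 200, run_pipeline()
```

Holding the lock across both the comparison and the increment is what prevents concurrent requests from pushing the counter past the limit.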