Billing + Quota Flow
This page traces the complete lifecycle of a billable request — from the moment it arrives at the API to the billing cycle rollover that resets counters. It covers the upgrade flow, the downgrade flow, payment failure handling, and the background reconciliation that keeps everything in sync.
End-to-End Request Lifecycle
Every API request passes through these stages in order:
- Rate limiter. Per-IP and per-route rate limits are checked first. If exceeded, a standard 429 Too Many Requests is returned. This is separate from billing.
- Authentication. The API key (or session token) is validated. Unauthorized requests never reach the billing layer.
- Cycle rollover check. If the current billing cycle has expired, the cycle is rolled over (counters reset, scheduled transitions applied) before proceeding.
- Quota check-and-increment. The atomic check verifies your plan limit and counts the request in one step.
- Pipeline decision. If admitted, the request proceeds to the memory pipeline (storage, embedding, indexing, recall). If over quota, the pipeline is skipped and a silent response is returned.
- Usage recorded. Successful requests are recorded for the dashboard and your invoice. Internal accounting failures never block the request.
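The stage ordering above can be sketched as a single function. This is a minimal illustration under stated assumptions, not the service's actual code: `Org`, `handle_request`, `record_usage`, and the boolean inputs are hypothetical stand-ins, the 401 status for failed authentication is assumed, and the sketch assumes that over-quota requests are not counted.

```python
import threading
from dataclasses import dataclass, field


@dataclass
class Org:
    """Hypothetical per-organization quota state."""
    limit: int
    used: int = 0
    lock: threading.Lock = field(default_factory=threading.Lock, repr=False)

    def check_and_increment(self) -> bool:
        # Check the limit and count the request under one lock, so two
        # concurrent requests cannot both claim the last remaining slot.
        with self.lock:
            if self.used >= self.limit:
                return False
            self.used += 1
            return True


def handle_request(org, rate_ok, auth_ok, cycle_expired, record_usage):
    if not rate_ok:
        return 429            # rate limiting is separate from billing
    if not auth_ok:
        return 401            # assumed status; never reaches the billing layer
    if cycle_expired:
        org.used = 0          # rollover, reduced here to a counter reset
    if not org.check_and_increment():
        return "skipped"      # over quota: the pipeline is not run
    result = "processed"      # stand-in for the memory pipeline
    try:
        record_usage(org)
    except Exception:
        pass                  # accounting failures never block the request
    return result
```

The point of the single `check_and_increment` call is that there is no window between "is there room?" and "count it" in which another request could slip in.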
Calendar-Month Billing Cycle
Every organization has its own billing cycle anchored to the day they subscribed:
- Same-day anchoring. If you subscribe on May 15, your cycle runs May 15 → June 15 → July 15. The cycle is never a fixed 30-day window.
- Short-month clamping. A cycle anchored on January 31 next renews on February 28 (or February 29 in a leap year). The last day of the target month is used when the anchor day exceeds the month length.
- Timezone awareness. The “same day next month” calculation respects the dashboard’s billing timezone, with the actual boundary stored in UTC. This prevents daylight-savings surprises.
- Rollover is on-demand. When the cycle ends, the rollover happens on the next request that comes in — not on a fixed wall-clock cron. If no requests arrive for days, the rollover happens when traffic resumes.
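The anchoring and clamping rules above can be sketched with the standard library. The function names are hypothetical, and the sketch assumes the original anchor day is kept alongside the cycle, so a January 31 anchor returns to the 31st in March after clamping in February; the page does not state this. Passing a `zoneinfo.ZoneInfo` object as `billing_tz` would make the wall-clock arithmetic daylight-saving aware.

```python
import calendar
from datetime import datetime, timezone, tzinfo


def next_cycle_boundary(start_utc: datetime, anchor_day: int,
                        billing_tz: tzinfo) -> datetime:
    """Return the UTC instant of 'same day next month' in the billing timezone."""
    local = start_utc.astimezone(billing_tz)
    if local.month == 12:
        year, month = local.year + 1, 1
    else:
        year, month = local.year, local.month + 1
    # Clamp to the last day of the target month (e.g. Jan 31 -> Feb 28/29).
    last_day = calendar.monthrange(year, month)[1]
    next_local = local.replace(year=year, month=month,
                               day=min(anchor_day, last_day))
    # Do the arithmetic in local wall-clock time, store the boundary in UTC.
    return next_local.astimezone(timezone.utc)


def needs_rollover(cycle_end_utc: datetime, now_utc: datetime) -> bool:
    # On-demand rollover: evaluated when a request arrives, not by a cron job.
    return now_utc >= cycle_end_utc
```

Because `needs_rollover` is only evaluated per request, an idle organization simply rolls over on its first request after the boundary.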
Upgrade Flow
When a payment succeeds, the upgrade is applied atomically in this order:
- Reset counters. Both usage counters are reset to zero before the plan changes. This prevents the old plan’s usage from carrying over.
- Apply plan. If a downgrade was previously scheduled, that scheduled plan takes priority over the plan implied by the payment. Otherwise, the paid plan is applied directly. The organization’s tier is updated at the same time.
- Sync cycle window. If the payment carries explicit period dates, the billing cycle is aligned to those dates exactly. Otherwise, a new cycle starts at the payment timestamp.
- Clear flags. Any past-due markers, scheduled downgrades, and pending cancellations are cleared. The account becomes active again.
The entire transition is committed atomically. If any step fails, the whole transition rolls back — you never end up in a half-applied state.
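As an in-memory illustration of the four steps and the all-or-nothing commit, the sketch below builds the new state in one step from an immutable record. Everything here is hypothetical: the `BillingState` fields, the counter names (assumed to be adds and retrievals), and `apply_upgrade` itself. A real system would presumably rely on a database transaction instead, and the tier update is omitted.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class BillingState:
    plan: str
    adds_used: int = 0
    retrievals_used: int = 0
    cycle_start: Optional[datetime] = None
    cycle_end: Optional[datetime] = None
    scheduled_plan: Optional[str] = None
    past_due: bool = False
    cancel_at_period_end: bool = False


def apply_upgrade(state, paid_plan, paid_at, period=None):
    """Return the post-payment state; the input state is never mutated."""
    # A previously scheduled downgrade takes priority over the paid plan.
    new_plan = state.scheduled_plan or paid_plan
    # Explicit period dates win; otherwise the cycle starts at the payment time
    # (the end date would come from the month-anchoring rule, omitted here).
    start, end = period if period else (paid_at, None)
    # One replace() call: either the whole new state exists or none of it does.
    return replace(
        state,
        adds_used=0, retrievals_used=0,          # 1. reset counters
        plan=new_plan,                           # 2. apply plan
        cycle_start=start, cycle_end=end,        # 3. sync cycle window
        past_due=False, scheduled_plan=None,     # 4. clear flags
        cancel_at_period_end=False,
    )
```

If anything raises before `replace` returns, the caller still holds the old `BillingState`, which mirrors the rollback behaviour described above.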
Downgrade & Cancellation Flow
Downgrades and cancellations are deferred — they only take effect at the next cycle rollover:
| Action | What’s set immediately | What happens at rollover |
|---|---|---|
| Downgrade | A target plan is scheduled for the next cycle | The new plan is applied, counters reset, and a new cycle window starts. The current plan stays active until then. |
| Cancellation | A cancel-at-period-end flag is set | The plan reverts to Free, the paid subscription is cleared, any scheduled plan changes are dropped, and counters reset. |
| Resume | The cancel-at-period-end flag is cleared | Nothing — the cancellation is reversed before it takes effect and the plan continues normally. |
Cancellation takes priority over downgrade. If a cancellation and a scheduled downgrade are both pending, cancellation wins at rollover.
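The priority rule can be expressed as the order of branches in the rollover step. This is a sketch with hypothetical names (`Subscription`, `roll_over`, a single `used` counter); clearing the cancel flag once the cancellation has been applied is an assumption, not something the table states.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Subscription:
    plan: str
    subscription_id: Optional[str] = None
    scheduled_plan: Optional[str] = None
    cancel_at_period_end: bool = False
    used: int = 0


def roll_over(sub: Subscription) -> Subscription:
    if sub.cancel_at_period_end:
        # Cancellation is checked first, so it wins over a scheduled downgrade.
        sub.plan = "free"
        sub.subscription_id = None
        sub.scheduled_plan = None
        sub.cancel_at_period_end = False   # assumed: flag is consumed here
    elif sub.scheduled_plan is not None:
        sub.plan = sub.scheduled_plan
        sub.scheduled_plan = None
    # Counters reset at every rollover, whichever branch ran.
    sub.used = 0
    return sub
```

Resuming before rollover only needs to set `cancel_at_period_end` back to `False`, after which the first branch is never taken.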
Payment Failure Handling
The system distinguishes between two types of payment failures, with very different consequences:
| Failure type | Impact on plan | Details |
|---|---|---|
| Autopay failure (subscription renewal) | Immediate downgrade to Free | Counters reset, cycle window re-anchored, account marked past-due. The paid subscription itself is preserved so a later successful retry can restore the plan automatically. |
| Manual / one-off failure (proration or checkout) | No change | Plan, cycle, and counters are completely untouched. The expectation is that the customer can retry the manual payment. |
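The two rows of the table translate into a single branch. The sketch below uses hypothetical names (`Account`, `on_payment_failed`, the `"autopay"` label) and assumes the cycle is re-anchored to the failure timestamp, which the table does not specify.

```python
from dataclasses import dataclass, replace
from datetime import datetime, timezone
from typing import Optional


@dataclass(frozen=True)
class Account:
    plan: str
    subscription_id: Optional[str]
    used: int
    cycle_start: datetime
    past_due: bool = False


def on_payment_failed(account: Account, kind: str, failed_at: datetime) -> Account:
    if kind != "autopay":
        # Manual / one-off failure: plan, cycle, and counters are untouched.
        return account
    # Autopay failure: drop to Free immediately but keep subscription_id,
    # so a later successful retry can restore the paid plan.
    return replace(account, plan="free", used=0,
                   cycle_start=failed_at,   # assumed re-anchor point
                   past_due=True)
```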
Idempotent Webhooks
Payment events can be delivered more than once. Billing handles this with strict idempotency:
- Event deduplication. Every payment event is recorded by its unique ID before any state changes. A duplicate delivery short-circuits with no side effects.
- No double-counting. A duplicate “invoice paid” event cannot reset counters twice or apply the plan transition twice.
- Safe reconciliation. Periodic reconciliation jobs use the same idempotency guarantee, so they can be re-run safely without double-applying anything.
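One common way to get this guarantee is a uniqueness constraint on the event ID, written in the same transaction as the state change. The sketch below uses SQLite from the standard library purely for illustration; the table names, `make_db`, and `handle_invoice_paid` are hypothetical, and the page does not say which datastore the service uses.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    db = sqlite3.connect(":memory:")
    db.execute("CREATE TABLE processed_events (event_id TEXT PRIMARY KEY)")
    db.execute("CREATE TABLE counters (org TEXT PRIMARY KEY, used INTEGER NOT NULL)")
    db.commit()
    return db


def handle_invoice_paid(db: sqlite3.Connection, event_id: str, org: str) -> bool:
    """Apply the event once; return False for a duplicate delivery."""
    try:
        with db:  # one transaction: commit on success, roll back on error
            # Record the event first. A duplicate ID violates the primary key
            # and aborts before any state change is made.
            db.execute("INSERT INTO processed_events (event_id) VALUES (?)",
                       (event_id,))
            db.execute("UPDATE counters SET used = 0 WHERE org = ?", (org,))
    except sqlite3.IntegrityError:
        return False
    return True
```

A reconciliation job that replays the same event IDs through this handler gets `False` back for everything already applied, which is what makes re-running it safe.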
Budget Gating
In addition to plan-based request limits, organizations can set an optional monthly spending budget. The budget tracker provides a soft gate on the intelligence pipeline:
- Optional and configurable. Budgets are off by default. When enabled, you set a monthly budget amount and an alert threshold from the dashboard.
- Progressive throttling, not a hard cliff. As budget utilization climbs, expensive intelligence work runs in progressively more conservative modes. Above the alert threshold, only the most important work continues.
- Fails open. If budget data is briefly unavailable, requests are not blocked — the priority is keeping your AI pipeline running rather than enforcing a soft limit.
- Independent of plan limits. Hitting your spending budget does not affect your add and retrieval quotas, and vice versa — they are separate signals.
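A soft gate of this kind can be sketched as a function from budget utilization to a pipeline mode. The mode names, the `pipeline_mode` function, and the intermediate step at half the alert threshold are all invented for illustration; the page only states that throttling is progressive and that the most important work continues above the alert threshold.

```python
from typing import Optional


def pipeline_mode(spent: Optional[float], budget: Optional[float],
                  alert_threshold: float) -> str:
    """Pick how conservatively the intelligence pipeline should run."""
    if budget is None or budget <= 0:
        return "full"              # budgets are off by default
    if spent is None:
        return "full"              # fail open: missing data never blocks work
    utilization = spent / budget
    if utilization > alert_threshold:
        return "essential_only"    # only the most important work continues
    if utilization >= alert_threshold / 2:
        return "conservative"      # hypothetical intermediate step
    return "full"
```

Note that the function never consults request counters: plan quotas and the spending budget stay separate signals.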