Importance Scoring
Every memory that enters MemorySync receives a deterministic importance score between 0 and 100. This score drives gating decisions, tier transitions, recall ranking, and pruning — it is the single number that determines how the platform treats a memory throughout its lifecycle.
Scoring Overview
The importance scorer runs synchronously during the POST /memory/add pipeline. It takes the memory’s content, metadata, and source information as input and produces a single integer score from 0 to 100. The scoring is fully deterministic — the same input always produces the same score, with no randomness or ML inference involved.
The score is stored on the memory record as its importance value and is immediately available to downstream systems. It is recomputed only when the re-evaluation engine explicitly triggers a rescore.
The Seven Weighted Dimensions
The scorer evaluates each memory across seven independent dimensions. Each dimension produces a sub-score that is then weighted and summed:
| Dimension | Weight | What it measures |
|---|---|---|
| Recency | Adaptive | How recently the memory was created. Newer memories receive a higher sub-score. |
| Frequency | Adaptive | How often similar content has been added. Repetitive content is penalized. |
| Content Type | Adaptive | The extracted semantic type (preference, fact, intent, relationship). Structured types score higher. |
| Source Trust | Adaptive | The trust level of the source that created the memory (API, integration, web). |
| Size | Adaptive | Content length. Extremely short or extremely long content is penalized; mid-range content scores highest. |
| Density | Adaptive | Information density — the ratio of meaningful entities and facts to total content length. |
| Plan Tier | Adaptive | The organization’s plan tier. Higher plans receive a slight scoring bonus to reflect greater storage and retrieval capacity. |
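To illustrate how a single dimension might be turned into a sub-score, the sketch below implements a size curve that penalizes very short and very long content and gives mid-range content the top value. The function name `size_subscore` and all four length thresholds are hypothetical; the actual curve used by the scorer is not documented here.

```python
def size_subscore(length: int, min_len: int = 20, ideal_lo: int = 100,
                  ideal_hi: int = 2000, max_len: int = 10000) -> float:
    """Return a 0.0-1.0 sub-score that peaks for mid-range content length.

    All thresholds are illustrative assumptions, not documented values.
    """
    if length <= min_len or length >= max_len:
        return 0.0  # extremely short or extremely long content
    if length < ideal_lo:
        # linear ramp up between min_len and ideal_lo
        return (length - min_len) / (ideal_lo - min_len)
    if length <= ideal_hi:
        return 1.0  # mid-range plateau scores highest
    # linear ramp down between ideal_hi and max_len
    return (max_len - length) / (max_len - ideal_hi)
```

A trapezoid like this keeps the sub-score inside the 0.0–1.0 range without a separate clamping step.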
Dimension Weights & Formula
The final importance score is a weighted sum of all seven sub-scores, clamped to the 0–100 range.
Each sub-score is individually normalized to the 0.0–1.0 range before weighting. The weights are adaptive: they adjust based on the distribution of existing memories in the project. For example, if a project already has many preference-type memories, new preferences receive a lower content-type sub-score to encourage diversity.
The final score is rounded to the nearest integer and stored on the memory record. The full breakdown (individual sub-scores and weights) is available through the explainability engine.
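The weighted sum described above can be sketched as follows. This is a minimal illustration, not the platform's implementation: the dimension keys, the renormalization of weights so they sum to 1, and the scaling of the 0.0–1.0 result to 0–100 are all assumptions, and the weight values in any real project are adaptive rather than fixed.

```python
from typing import Dict

# Hypothetical keys for the seven dimensions listed in the table.
DIMENSIONS = ("recency", "frequency", "content_type", "source_trust",
              "size", "density", "plan_tier")

def raw_weighted_score(sub_scores: Dict[str, float],
                       weights: Dict[str, float]) -> float:
    """Combine seven 0.0-1.0 sub-scores into a raw score on a 0-100 scale.

    Clamping and rounding are applied afterwards and are not shown here.
    """
    total_weight = sum(weights[d] for d in DIMENSIONS)
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive value")
    # Assumption: weights are renormalized so that they sum to 1.
    weighted = sum(weights[d] * sub_scores[d] for d in DIMENSIONS) / total_weight
    # Assumption: the 0.0-1.0 weighted sum is scaled to the 0-100 range.
    return 100.0 * weighted
```

With equal weights and every sub-score at 0.5, the raw score comes out at the midpoint of the scale.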
Source Trust Hierarchy
Not all sources are treated equally. The scorer assigns different trust levels based on how the memory entered the platform:
| Source | Trust Level | Rationale |
|---|---|---|
| Direct API call | High | Explicitly added by a developer — highest signal that the content is intentional. |
| Integration sync (GitHub, Notion, etc.) | Medium–High | Automated but curated — the user chose to connect this source. |
| Web crawler | Medium | Broad ingestion with variable content quality. |
| Bulk import | Medium | Batch operations may include low-signal content alongside high-signal content. |
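One way to picture the hierarchy is as a lookup from source to a trust sub-score. In the sketch below the enum members and the numeric values are hypothetical; only the relative ordering (direct API above integration sync, with web crawler and bulk import tied at medium) comes from the table.

```python
from enum import Enum

class Source(Enum):
    API = "api"
    INTEGRATION = "integration"
    WEB_CRAWLER = "web_crawler"
    BULK_IMPORT = "bulk_import"

# Hypothetical numeric values; only the ordering follows the table.
SOURCE_TRUST = {
    Source.API: 1.0,          # High
    Source.INTEGRATION: 0.8,  # Medium-High
    Source.WEB_CRAWLER: 0.6,  # Medium
    Source.BULK_IMPORT: 0.6,  # Medium
}

def source_trust_subscore(source: Source) -> float:
    """Return the 0.0-1.0 trust sub-score for a memory's source."""
    return SOURCE_TRUST[source]
```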
Content-Type Bonuses
The semantic extraction pipeline classifies each memory into a content type. The scorer applies bonuses based on the type’s structural value:
- Preference: highest bonus. User preferences ("prefers dark mode", "likes Python") are rare and high-signal for personalization.
- Fact: high bonus. Concrete facts ("works at Acme Corp") have clear recall value.
- Intent: medium bonus. Expressed intentions ("wants to learn Rust") are useful but may be transient.
- Relationship: medium bonus. Connections between entities ("reports to Jane") enrich the knowledge graph.
- Unstructured / Other: no bonus. Raw text without clear semantic structure receives the baseline score only.
Score Normalization
After the weighted sum is computed, the scorer applies final normalization to ensure the score is meaningful across different projects and use cases:
- Clamping. The raw weighted sum is clamped to the 0–100 integer range. No memory can score below 0 or above 100.
- Rounding. The floating-point result is rounded to the nearest integer for storage and display.
- Determinism guarantee. Given identical inputs, including the project state that drives the adaptive weights, the scorer always produces the identical output. There is no randomness, no sampling, and no external API calls in the scoring path.
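The clamping and rounding steps can be sketched as a small pure function. The name `normalize_score` is hypothetical, and the tie-breaking rule (halves round up) is an assumption, since the text only says "nearest integer". Python's built-in `round` uses round-half-to-even, so the sketch uses `math.floor(x + 0.5)` to make the tie rule explicit.

```python
import math

def normalize_score(raw: float) -> int:
    """Clamp a raw weighted sum to 0-100, then round to the nearest integer."""
    clamped = max(0.0, min(100.0, raw))
    # Assumption: ties (x.5) round up; the source does not state the tie rule.
    return math.floor(clamped + 0.5)
```

Because the function has no hidden state, randomness, or I/O, calling it twice with the same raw value always yields the same integer, which is the property the determinism guarantee describes.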
How Other Systems Use It
The importance score is consumed by multiple downstream systems throughout the memory lifecycle:
| System | How it uses importance |
|---|---|
| Intelligence Gating | Compares the score against the project threshold to decide whether to embed the memory. |
| Recall Engine | Uses importance as one of five factors in the hybrid ranking formula (default weight: 15%). |
| Memory Retrieval | Filters out memories with importance below 0.5 (expressed on a normalized 0–1 scale, i.e. a score below 50) before ranking. |
| Tier Manager | Considers importance when deciding tier transitions — high-importance memories stay in the hot tier longer. |
| Pruning | Memories with importance below 0.4 (expressed on a normalized 0–1 scale, i.e. a score below 40) and no usage for 30+ days are flagged as prunable. |
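Two of the rules in the table, gating and pruning, reduce to simple predicates, sketched below. The function names and signatures are hypothetical. The sketch also assumes that the gating comparison is inclusive and that the 0.4 pruning threshold is expressed on a 0–1 scale, so the 0–100 score is divided by 100 before comparison.

```python
from datetime import datetime, timedelta, timezone
from typing import Optional

def should_embed(score: int, project_threshold: int) -> bool:
    """Gating: embed the memory only if its score meets the project threshold.

    Assumption: the comparison is inclusive (score >= threshold).
    """
    return score >= project_threshold

def is_prunable(score: int, last_used: datetime,
                now: Optional[datetime] = None) -> bool:
    """Pruning: low importance and no usage for 30 or more days."""
    now = now or datetime.now(timezone.utc)
    normalized = score / 100.0  # assumption: 0.4 refers to a 0-1 scale
    return normalized < 0.4 and (now - last_used) >= timedelta(days=30)
```

Passing `now` explicitly keeps the pruning check deterministic, in line with the scorer itself.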