Importance Scoring

Every memory that enters MemorySync receives a deterministic importance score between 0 and 100. This score drives gating decisions, tier transitions, recall ranking, and pruning — it is the single number that determines how the platform treats a memory throughout its lifecycle.

Scoring Overview

The importance scorer runs synchronously during the POST /memory/add pipeline. It takes the memory’s content, metadata, and source information as input and produces a single integer score from 0 to 100. The scoring is fully deterministic — the same input always produces the same score, with no randomness or ML inference involved.

The score is stored on the memory record as its importance value and is immediately available to downstream systems. It is recomputed only when the re-evaluation engine explicitly triggers a rescore.

The Seven Weighted Dimensions

The scorer evaluates each memory across seven independent dimensions. Each dimension produces a sub-score that is then weighted and summed:

Dimension	Weight	What it measures
Recency	Adaptive	How recently the memory was created. Newer memories receive a higher sub-score.
Frequency	Adaptive	How often similar content has been added. Repetitive content is penalized.
Content Type	Adaptive	The extracted semantic type (preference, fact, intent, relationship). Structured types score higher.
Source Trust	Adaptive	The trust level of the source that created the memory (API, integration, web).
Size	Adaptive	Content length. Extremely short or extremely long content is penalized; mid-range content scores highest.
Density	Adaptive	Information density — the ratio of meaningful entities and facts to total content length.
Plan Tier	Adaptive	The organization’s plan tier. Higher plans receive a slight scoring bonus to reflect greater storage and retrieval capacity.

Dimension Weights & Formula

The final importance score is a weighted sum of all seven sub-scores, clamped to the 0–100 range:

importance = clamp(0, 100, ∑(dimension_score[i] × weight[i]))

Each sub-score is individually normalized to the 0.0–1.0 range before weighting. The weights are adaptive: they adjust based on the distribution of existing memories in the project. For example, if a project already has many preference-type memories, new preferences receive a lower content-type sub-score to encourage diversity. Because the weights depend on project state, the determinism guarantee holds for a given memory and a given set of existing memories.

The final score is rounded to the nearest integer and stored on the memory record. The full breakdown (individual sub-scores and weights) is available through the explainability engine.
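
The formula above can be sketched in Python as follows. This is a minimal illustration, not MemorySync's actual implementation: the function name `importance_score`, the dimension keys, and the example weights are all hypothetical. It assumes the weights sum to 100, so that sub-scores in the 0.0–1.0 range produce a result on the 0–100 scale, and it assumes ties round up; the text above states neither.

```python
def importance_score(sub_scores: dict[str, float], weights: dict[str, float]) -> int:
    """Weighted sum of normalized sub-scores, clamped to 0-100, then rounded."""
    if sub_scores.keys() != weights.keys():
        raise ValueError("sub_scores and weights must cover the same dimensions")
    raw = sum(sub_scores[name] * weights[name] for name in sub_scores)
    clamped = max(0.0, min(100.0, raw))
    # Round half up (assumption); safe because clamped is never negative.
    return int(clamped + 0.5)


# Hypothetical sub-scores (0.0-1.0) and weights (assumed to sum to 100).
sub_scores = {
    "recency": 0.75, "frequency": 0.5, "content_type": 1.0, "source_trust": 1.0,
    "size": 0.5, "density": 0.75, "plan_tier": 0.5,
}
weights = {
    "recency": 20, "frequency": 10, "content_type": 20, "source_trust": 20,
    "size": 10, "density": 15, "plan_tier": 5,
}

print(importance_score(sub_scores, weights))  # raw sum is 78.75, prints 79
```

With no randomness and no external calls, repeated invocations with the same sub-scores and weights return the same integer.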

Source Trust Hierarchy

Not all sources are treated equally. The scorer assigns different trust levels based on how the memory entered the platform:

Source	Trust Level	Rationale
Direct API call	High	Explicitly added by a developer — highest signal that the content is intentional.
Integration sync (GitHub, Notion, etc.)	Medium–High	Automated but curated — the user chose to connect this source.
Web crawler	Medium	Broad ingestion with variable content quality.
Bulk import	Medium	Batch operations may include low-signal content alongside high-signal content.
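
One way to picture the hierarchy is as a lookup from source to a normalized sub-score. The sketch below is illustrative only: the `Source` enum, the `source_trust_score` function, and every numeric value are hypothetical. Only the relative ordering (High, then Medium–High, then Medium) comes from the table above.

```python
from enum import Enum


class Source(Enum):
    DIRECT_API = "direct_api"
    INTEGRATION_SYNC = "integration_sync"
    WEB_CRAWLER = "web_crawler"
    BULK_IMPORT = "bulk_import"


# Hypothetical sub-scores on a 0.0-1.0 scale; only the ordering is from the table.
SOURCE_TRUST = {
    Source.DIRECT_API: 1.0,        # High
    Source.INTEGRATION_SYNC: 0.8,  # Medium-High
    Source.WEB_CRAWLER: 0.5,       # Medium
    Source.BULK_IMPORT: 0.5,       # Medium
}


def source_trust_score(source: Source) -> float:
    """Return the (assumed) normalized trust sub-score for a source."""
    return SOURCE_TRUST[source]
```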

Content-Type Bonuses

The semantic extraction pipeline classifies each memory into a content type. The scorer applies bonuses based on the type’s structural value:

Preference — highest bonus. User preferences ("prefers dark mode", "likes Python") are rare and high-signal for personalization.
Fact — high bonus. Concrete facts ("works at Acme Corp") have clear recall value.
Intent — medium bonus. Expressed intentions ("wants to learn Rust") are useful but may be transient.
Relationship — medium bonus. Connections between entities ("reports to Jane") enrich the knowledge graph.
Unstructured / Other — no bonus. Raw text without clear semantic structure receives the baseline score only.
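
The bonus tiers above could be modeled as a lookup with a zero default for anything unclassified. This is a sketch under assumptions: the name `content_type_bonus` and the numeric values are hypothetical, and only the ordering (highest, high, medium, none) is taken from the list.

```python
# Hypothetical bonus values on a 0.0-1.0 scale; only the ordering is from the list above.
CONTENT_TYPE_BONUS = {
    "preference": 1.0,    # highest bonus
    "fact": 0.8,          # high bonus
    "intent": 0.5,        # medium bonus
    "relationship": 0.5,  # medium bonus
}


def content_type_bonus(content_type: str) -> float:
    """Return the bonus for a content type; unstructured or unknown types get none."""
    return CONTENT_TYPE_BONUS.get(content_type.strip().lower(), 0.0)
```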

Score Normalization

After the weighted sum is computed, the scorer applies final normalization to ensure the score is meaningful across different projects and use cases:

Clamping. The raw weighted sum is clamped to the 0–100 range. No memory can score below 0 or above 100.
Rounding. The floating-point result is rounded to the nearest integer for storage and display.
Determinism guarantee. Given identical inputs, the scorer always produces the identical output. There is no randomness, no sampling, and no external API calls in the scoring path.

Score interpretation

0–29: Low-value content likely to be gated.
30–59: Moderate content that passes default gating but may be deprioritized in recall.
60–89: High-value content that ranks well in recall.
90–100: Critical content from trusted sources with high information density.
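
A client that wants to label scores by band could map them with a small helper like the one below. The function name `interpret_score` and the band labels are hypothetical; the boundaries are the ones listed above.

```python
def interpret_score(score: int) -> str:
    """Map a 0-100 importance score to its interpretation band."""
    if not 0 <= score <= 100:
        raise ValueError("importance scores range from 0 to 100")
    if score <= 29:
        return "low"       # likely to be gated
    if score <= 59:
        return "moderate"  # passes default gating, may be deprioritized in recall
    if score <= 89:
        return "high"      # ranks well in recall
    return "critical"      # trusted source, high information density
```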

How Other Systems Use It

The importance score is consumed by multiple downstream systems throughout the memory lifecycle:

System	How it uses importance
Intelligence Gating	Compares the score against the project threshold to decide whether to embed the memory.
Recall Engine	Uses importance as one of five factors in the hybrid ranking formula (default weight: 15%).
Memory Retrieval	Filters out memories with importance below 0.5 (importance expressed on a 0–1 scale, that is, 50 on the 0–100 scale) before ranking.
Tier Manager	Considers importance when deciding tier transitions. High-importance memories stay in the hot tier longer.
Pruning	Memories with importance below 0.4 (40 on the 0–100 scale) and no usage for 30+ days are flagged as prunable.
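
The retrieval filter and the pruning rule from the table can be sketched as two predicates. This is an illustration under assumptions, not the platform's code: the function and constant names are hypothetical, and the 0.5 and 0.4 thresholds are read as values on a 0–1 scale, so the stored 0–100 score is divided by 100 before comparison.

```python
from datetime import datetime, timedelta, timezone

RETRIEVAL_MIN = 0.5              # from the table, assumed to be on a 0-1 scale
PRUNE_MAX = 0.4                  # from the table, assumed to be on a 0-1 scale
PRUNE_IDLE = timedelta(days=30)  # "no usage for 30+ days"


def normalized(importance: int) -> float:
    """Convert a stored 0-100 score to the assumed 0-1 scale."""
    return importance / 100


def passes_retrieval_filter(importance: int) -> bool:
    """Memories below the retrieval threshold are dropped before ranking."""
    return normalized(importance) >= RETRIEVAL_MIN


def is_prunable(importance: int, last_used: datetime, now: datetime) -> bool:
    """Flag low-importance memories that have gone unused for 30 days or more."""
    return normalized(importance) < PRUNE_MAX and (now - last_used) >= PRUNE_IDLE


now = datetime(2024, 6, 1, tzinfo=timezone.utc)
print(is_prunable(35, now - timedelta(days=45), now))  # True
print(is_prunable(35, now - timedelta(days=5), now))   # False: used recently
```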
