How Memory Works
This page explains the core concepts behind KAPEX's memory system. Understanding these concepts will help you predict how KAPEX behaves and make the most of the API.
Memory Nodes
A memory node is the atomic unit of memory in KAPEX. Each node represents a single piece of information extracted from a conversation and contains:
- Topic -- a short label describing what the memory is about (e.g., "sister's birthday", "job change")
- Summary -- a concise natural-language description of the memory
- Salience score -- a number between 0 and 1 indicating how important this memory is right now
- Node type -- its place in the hierarchy (domain, entity, facet, theme, or interest)
- Timestamps -- when the memory was created, last accessed, and last updated
- Processing count -- how many times this memory has been accessed or discussed
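As a rough mental model, the fields above map onto a small record type. The sketch below is illustrative only: the class and attribute names (`MemoryNode`, `last_accessed_at`, and so on) are assumptions made for this example, not the schema the API returns.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum


class NodeType(Enum):
    """The five node types named in the list above."""
    DOMAIN = "domain"
    ENTITY = "entity"
    FACET = "facet"
    THEME = "theme"
    INTEREST = "interest"


def _now() -> datetime:
    """Current time in UTC, used as the default for all timestamps."""
    return datetime.now(timezone.utc)


@dataclass
class MemoryNode:
    """Hypothetical in-memory representation of a memory node."""
    topic: str
    summary: str
    salience: float                      # between 0 and 1
    node_type: NodeType
    created_at: datetime = field(default_factory=_now)
    last_accessed_at: datetime = field(default_factory=_now)
    updated_at: datetime = field(default_factory=_now)
    processing_count: int = 0

    def __post_init__(self) -> None:
        # Reject salience values outside the documented 0..1 range.
        if not 0.0 <= self.salience <= 1.0:
            raise ValueError("salience must be between 0 and 1")


node = MemoryNode(
    topic="sister's birthday",
    summary="The user's sister has a birthday in March.",
    salience=0.6,
    node_type=NodeType.FACET,
)
```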
Nodes are created automatically during the async write path. You can also create them explicitly via the API:
curl -X POST https://api.getkapex.ai/api/v1/ingest \
  -H "X-API-Key: $KAPEX_API_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "user_id": "user_123",
    "content": "User mentioned they prefer Rust for systems work and Python for prototyping."
  }'
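The same request can be built from Python. This sketch uses only the standard library and mirrors the curl call above: same URL, headers, and JSON body. The helper name `build_ingest_request` is hypothetical, and the request is constructed but not sent, so you can inspect it before passing it to `urllib.request.urlopen`.

```python
import json
import os
import urllib.request

INGEST_URL = "https://api.getkapex.ai/api/v1/ingest"  # endpoint from the curl example


def build_ingest_request(user_id: str, content: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a POST request equivalent to the curl call."""
    body = json.dumps({"user_id": user_id, "content": content}).encode("utf-8")
    return urllib.request.Request(
        INGEST_URL,
        data=body,
        headers={"X-API-Key": api_key, "Content-Type": "application/json"},
        method="POST",
    )


req = build_ingest_request(
    "user_123",
    "User mentioned they prefer Rust for systems work and Python for prototyping.",
    os.environ.get("KAPEX_API_KEY", ""),
)
# To send it: urllib.request.urlopen(req)
```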
Salience Score
The salience score S is the central concept in KAPEX. It answers the question: how important is this memory right now?
Salience is not static. It changes over time as memories decay, get reinforced through discussion, or receive external boosts from life events. A memory's salience determines whether it gets injected into the LLM's context and how it is framed when it does.
The salience score is composed of three parts:
Base Score
The base score B represents the intrinsic importance of a memory at the time it was created. It is computed from five linguistic signals extracted from the original conversation:
| Signal | Name | What It Measures |
|---|---|---|
| SDV | Self-Disclosure Velocity | How much meaningful personal information is packed into the statement -- dense, specific disclosures score higher than vague or generic ones |
| CSCV | Cross-Session Consistency Variable | Whether this topic has appeared across multiple sessions -- recurring themes score higher |
| LCS | Linguistic Complexity Score | The structural complexity of the language used -- more elaborate, considered expressions score higher than simple acknowledgments |
| SWV | Sentiment Word Valence | The linguistic weight and valence of sentiment-bearing words -- deliberate, specific word choices score higher |
| PDV | Prevalence Density Variable | How frequently this topic has appeared across the user's session history -- high cross-session frequency indicates sustained importance |
These five signals are combined into a single base score. The base score is computed when a memory is created and is recomputed only when the memory is enriched with additional context.
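This page does not specify how the five signals are combined. To make the idea concrete, the sketch below assumes each signal is normalized to the range 0 to 1 and uses an equal-weight average. Both the normalization and the weights are assumptions for illustration, not KAPEX's actual formula.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Signals:
    """The five signals from the table above, each assumed to lie in [0, 1]."""
    sdv: float
    cscv: float
    lcs: float
    swv: float
    pdv: float


# Hypothetical equal weights; the real weighting is not documented here.
WEIGHTS = {"sdv": 0.2, "cscv": 0.2, "lcs": 0.2, "swv": 0.2, "pdv": 0.2}


def base_score(signals: Signals) -> float:
    """Weighted average of the five signals, clamped to [0, 1]."""
    total = sum(weight * getattr(signals, name) for name, weight in WEIGHTS.items())
    return max(0.0, min(1.0, total))


# A dense, recurring disclosure scores higher than a one-off generic remark.
strong = base_score(Signals(sdv=0.9, cscv=0.8, lcs=0.6, swv=0.7, pdv=0.8))
weak = base_score(Signals(sdv=0.1, cscv=0.0, lcs=0.2, swv=0.1, pdv=0.0))
```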
Temporal Decay
Memories fade over time. KAPEX applies temporal decay to every memory node, gradually reducing its salience score as time passes since the memory was last relevant.
The key insight in KAPEX's decay model is that the more a memory has been processed, the faster it decays. This is counterintuitive but critical: it prevents stale, repetitive memories from permanently dominating the context window. A memory that has been discussed many times has already been "used" and should make room for fresher information -- unless something causes it to spike again.
Conversely, memories that have never been processed decay slowly, which gives them time to surface naturally. Extensively discussed memories decay faster because their information has already been communicated and absorbed.
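One way to picture this behavior is an exponential decay whose rate grows with the processing count. The functional form and both constants below are assumptions chosen for illustration. The only property taken from the text is that more processing means faster decay.

```python
import math

BASE_DECAY_RATE = 0.01    # hypothetical per-day rate for a never-processed memory
PROCESSING_FACTOR = 0.5   # hypothetical extra decay contributed by each processing event


def decay_rate(processing_count: int) -> float:
    """Decay rate that increases linearly with the number of processing events."""
    return BASE_DECAY_RATE * (1.0 + PROCESSING_FACTOR * processing_count)


def decayed_salience(base: float, days_elapsed: float, processing_count: int) -> float:
    """Exponentially decay a base score over the elapsed time."""
    return base * math.exp(-decay_rate(processing_count) * days_elapsed)


# After 30 days, an unprocessed memory retains more salience
# than one that has been discussed ten times.
fresh = decayed_salience(0.8, days_elapsed=30, processing_count=0)
worn = decayed_salience(0.8, days_elapsed=30, processing_count=10)
```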
Spike Coefficient
External events can temporarily boost a memory's salience. The spike coefficient represents a short-term increase triggered by:
- Life events -- a breakup, job change, or health diagnosis can reactivate related memories
- Direct reactivation -- your application can explicitly spike a memory via the API
- Contextual relevance -- when a new conversation touches on a dormant topic, related memories can receive a boost
Spikes are temporary. They provide an immediate salience boost that itself decays over time, allowing the memory to return to its natural trajectory unless further reinforced.
Apart from explicit API calls, spikes are applied automatically during ingestion whenever KAPEX detects a life event, a reactivated topic, or a cross-session reference that relates to an existing memory.
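The temporary nature of a spike can be sketched as a boost that fades exponentially on its own schedule. Adding the boost to the decayed base score, and the fade rate used here, are assumptions for illustration. This page does not state how KAPEX combines the terms.

```python
import math

SPIKE_DECAY_RATE = 0.2  # hypothetical per-day rate at which a spike fades


def spike_contribution(magnitude: float, days_since_spike: float) -> float:
    """Remaining boost from a spike of the given magnitude."""
    return magnitude * math.exp(-SPIKE_DECAY_RATE * days_since_spike)


def salience_with_spike(decayed_base: float, magnitude: float, days_since_spike: float) -> float:
    """Decayed base score plus the remaining spike boost, capped at 1."""
    return min(1.0, decayed_base + spike_contribution(magnitude, days_since_spike))


# A dormant memory (0.1) is lifted by a spike, then drifts back
# toward its natural trajectory as the spike fades.
at_spike = salience_with_spike(0.1, magnitude=0.3, days_since_spike=0)
much_later = salience_with_spike(0.1, magnitude=0.3, days_since_spike=60)
```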
Processing Events
Every time a memory is accessed, discussed, or otherwise used, KAPEX records a processing event. Processing events serve two purposes:
- They increase the memory's decay rate -- as described above, well-processed memories decay faster to prevent context domination
- They provide a usage signal -- the processing count is available for analysis and can inform application-level decisions
Processing events are recorded automatically when KAPEX injects a memory into the LLM context and the conversation touches on that topic. Each time a memory is referenced in conversation, its processing count increments and its decay rate adjusts accordingly.
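The recording rule in the paragraph above has two conditions: the memory was injected, and the conversation actually touched on it. A minimal sketch of that rule follows. `NodeUsage` and `record_turn` are hypothetical names, not part of the KAPEX API.

```python
from dataclasses import dataclass


@dataclass
class NodeUsage:
    """Hypothetical per-node usage counter."""
    processing_count: int = 0


def record_turn(usage: NodeUsage, was_injected: bool, topic_discussed: bool) -> bool:
    """Record a processing event only if the memory was injected AND discussed.

    Returns True when an event was recorded.
    """
    if was_injected and topic_discussed:
        usage.processing_count += 1
        return True
    return False


usage = NodeUsage()
record_turn(usage, was_injected=True, topic_discussed=False)  # injected but ignored: no event
record_turn(usage, was_injected=True, topic_discussed=True)   # injected and discussed: one event
```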
Injection Threshold
Not every memory reaches the LLM. KAPEX enforces an injection threshold of 0.25 -- only memories with a salience score at or above this threshold are eligible for injection into the LLM's context.
This threshold serves several purposes:
- Noise reduction -- low-salience memories (casual mentions, one-off comments) don't clutter the context
- Token efficiency -- the fixed token budget is spent on memories that actually matter
- Privacy by default -- information the user mentioned once in passing and never revisited naturally fades below the threshold
Memories below the threshold are not deleted. They remain in the graph and can be reactivated via spikes if they become relevant again.
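The filtering step can be sketched as follows. The 0.25 threshold comes from the text above. The `tokens` field and the greedy, highest-salience-first packing into a token budget are assumptions for illustration, since this page does not describe how the budget is spent.

```python
from typing import NamedTuple

INJECTION_THRESHOLD = 0.25  # from the text above


class Memory(NamedTuple):
    topic: str
    salience: float
    tokens: int  # hypothetical: estimated token cost of injecting this memory


def select_for_injection(memories: list[Memory], token_budget: int) -> list[Memory]:
    """Keep memories at or above the threshold, then pack them by descending salience."""
    eligible = sorted(
        (m for m in memories if m.salience >= INJECTION_THRESHOLD),
        key=lambda m: m.salience,
        reverse=True,
    )
    selected: list[Memory] = []
    used = 0
    for memory in eligible:
        # Skip any memory that would exceed the remaining budget.
        if used + memory.tokens <= token_budget:
            selected.append(memory)
            used += memory.tokens
    return selected


candidates = [
    Memory("job change", 0.90, 40),
    Memory("sister's birthday", 0.25, 30),
    Memory("restaurant mentioned once", 0.24, 20),
]
chosen = select_for_injection(candidates, token_budget=100)
```

Memories that fall below the threshold are simply left out of `chosen`; nothing is deleted from `candidates`.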
Entity Hierarchy
KAPEX organizes memories into a hierarchical graph built from five node types:
Domain
  |
  +-- Entity
  |     |
  |     +-- Facet
  |     +-- Theme
  |     +-- Interest
Domains
Domains are broad life areas: work, family, health, hobbies, relationships, finances, and so on. They are created automatically as topics emerge in conversation. A user who discusses their job, their kids, and their fitness routine will naturally develop work, family, and health domains.
Entities
Entities are specific named people, places, projects, or concepts within a domain. "Ciana" (a sister), "KAPEX" (a project), and "Dr. Martinez" (a therapist) are all entities. KAPEX uses a three-tier extraction pipeline to identify entities:
- Alias matching -- known names and aliases are matched first
- Pattern matching -- regex patterns catch common entity formats
- LLM classification -- ambiguous cases are resolved by a language model
Entities are classified by gravity -- how central they are to the user's life. High-gravity entities (spouse, children, employer) receive higher retrieval priority than low-gravity ones (a restaurant mentioned once).
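The three-tier pipeline amounts to a fallback chain: try the cheapest matcher first and call the language model only when the earlier tiers find nothing. The sketch below is a simplified illustration. The alias table, the regex, and the `llm_classify` callback are all hypothetical stand-ins for whatever KAPEX uses internally.

```python
import re
from typing import Callable, Optional

# Hypothetical alias table: lowercase surface form -> canonical entity name.
ALIASES = {"ciana": "Ciana", "my sister": "Ciana"}

# Hypothetical pattern: a title followed by a capitalized surname.
TITLE_PATTERN = re.compile(r"\b(?:Dr|Mr|Ms|Mrs)\.\s+[A-Z][a-z]+")


def extract_entity(text: str, llm_classify: Callable[[str], Optional[str]]) -> Optional[str]:
    """Return the first entity found, trying the tiers in order of cost."""
    lowered = text.lower()

    # Tier 1: alias matching against known names.
    for alias, canonical in ALIASES.items():
        if alias in lowered:
            return canonical

    # Tier 2: regex patterns for common entity formats.
    match = TITLE_PATTERN.search(text)
    if match:
        return match.group(0)

    # Tier 3: defer ambiguous cases to a language model (stubbed by the caller).
    return llm_classify(text)


def no_llm(_text: str) -> Optional[str]:
    """Stub classifier that never finds an entity."""
    return None


first = extract_entity("I called my sister last night", no_llm)
second = extract_entity("Dr. Martinez is on vacation until June", no_llm)
```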
Facets
Facets are specific, atomic facts about an entity: "Ciana's birthday is March 12", "KAPEX uses PostgreSQL", "Dr. Martinez is on vacation until June". Each facet is a single piece of retrievable information attached to its parent entity.
Themes
Themes represent recurring patterns detected within a domain: "work-life balance concerns", "weekend cooking experiments", "anxiety about public speaking". Themes are inferred over time as KAPEX observes repeated discussion of related topics.
Interests
Interests are inferred user preferences that evolve through three confidence stages:
- Ephemeral -- mentioned once, might be passing ("I tried rock climbing today")
- Tentative -- mentioned multiple times, gaining confidence ("I've been rock climbing every weekend")
- Persistent -- confirmed through sustained engagement, high confidence ("Rock climbing is a major hobby")
Interests are promoted (or allowed to decay) automatically based on how consistently the user engages with the topic.
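The three stages behave like a small state classification driven by how often and how consistently a topic comes up. The thresholds below are invented for illustration. This page does not give the actual promotion criteria.

```python
from enum import Enum


class InterestStage(Enum):
    EPHEMERAL = "ephemeral"
    TENTATIVE = "tentative"
    PERSISTENT = "persistent"


# Hypothetical thresholds; the real promotion rules are not documented here.
TENTATIVE_MIN_MENTIONS = 2
PERSISTENT_MIN_MENTIONS = 5
PERSISTENT_MIN_SESSIONS = 3


def classify_interest(mentions: int, distinct_sessions: int) -> InterestStage:
    """Map engagement counts to a confidence stage."""
    if mentions >= PERSISTENT_MIN_MENTIONS and distinct_sessions >= PERSISTENT_MIN_SESSIONS:
        return InterestStage.PERSISTENT
    if mentions >= TENTATIVE_MIN_MENTIONS:
        return InterestStage.TENTATIVE
    return InterestStage.EPHEMERAL


# "I tried rock climbing today" -> a single mention in a single session.
stage = classify_interest(mentions=1, distinct_sessions=1)
```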
Memory Lifecycle
Every memory follows a lifecycle:
Creation --> Active Use --> Gradual Decay --> Dormancy --> (Reactivation or Compression)
1. Creation
A memory node is created when KAPEX extracts a meaningful piece of information from a conversation. The node receives an initial base score from the five scoring signals.
2. Active Use
When a memory's salience is above the injection threshold (0.25), it is eligible to appear in the LLM context. Each time it is injected and the conversation touches on it, a processing event is recorded, which gradually increases its decay rate.
3. Gradual Decay
Over time, the memory's salience decreases due to temporal decay. Memories that are frequently processed decay faster than those that are rarely accessed. This is the natural forgetting curve.
4. Dormancy
When salience drops below the injection threshold, the memory becomes dormant. It no longer appears in LLM context but remains in the graph. It can still be found via the API's query and list endpoints.
5. Reactivation or Compression
Dormant memories can be reactivated by salience spikes -- life events, direct API calls, or contextual relevance in new conversations. Alternatively, very old and low-salience memories may be compressed into summary nodes by KAPEX's lifecycle manager, preserving the gist while freeing storage.
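The later lifecycle stages can be summarized as a function of a memory's current salience and age. The 0.25 threshold is taken from this page. The compression cutoffs are hypothetical, because the criteria used by the lifecycle manager are not specified here.

```python
from enum import Enum

INJECTION_THRESHOLD = 0.25    # from this page
COMPRESSION_SALIENCE = 0.05   # hypothetical cutoff for "low-salience"
COMPRESSION_AGE_DAYS = 365    # hypothetical cutoff for "very old"


class Stage(Enum):
    ACTIVE = "active"
    DORMANT = "dormant"
    COMPRESSION_CANDIDATE = "compression_candidate"


def lifecycle_stage(salience: float, age_days: float) -> Stage:
    """Classify a memory by its current salience and age."""
    if salience >= INJECTION_THRESHOLD:
        return Stage.ACTIVE
    if salience < COMPRESSION_SALIENCE and age_days > COMPRESSION_AGE_DAYS:
        return Stage.COMPRESSION_CANDIDATE
    return Stage.DORMANT


# A spike that lifts salience back to the threshold makes a dormant memory active again.
before = lifecycle_stage(salience=0.10, age_days=90)
after = lifecycle_stage(salience=0.40, age_days=90)
```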
Viewing Entity Salience
You can view all entities and their live salience scores for a user via the context endpoint:
curl "https://api.getkapex.ai/api/v1/context/user_123" \
  -H "X-API-Key: $KAPEX_API_KEY"
The /context endpoint returns entities organized by domain with their current salience scores. The /nodes/{node_id} endpoint gives full detail for a specific node. These are useful for debugging retrieval behavior and understanding why certain memories surface while others do not.
For the full scoring breakdown at ingestion time (base score, individual signal values), see the response from the ingest endpoint.
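For scripted debugging, the context lookup can be prepared from Python in the same way as the curl call. The response schema is not described on this page, so the sketch stops at building the request. `build_context_request` is a hypothetical helper name.

```python
import urllib.parse
import urllib.request

API_BASE = "https://api.getkapex.ai/api/v1"  # base URL from the curl example


def build_context_request(user_id: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for a user's context."""
    # Percent-encode the user ID so unusual characters cannot alter the path.
    safe_id = urllib.parse.quote(user_id, safe="")
    return urllib.request.Request(
        f"{API_BASE}/context/{safe_id}",
        headers={"X-API-Key": api_key},
        method="GET",
    )


context_req = build_context_request("user_123", "your-api-key")
# To send it: urllib.request.urlopen(context_req)
```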