Memories
Memories are the core of the zkStash platform. They represent the knowledge that your agents accumulate over time. Unlike the ephemeral context window of an LLM, memories in zkStash are persistent, searchable, and structured.
Core Memory Operation
At its core, every memory operation follows a simple cycle:
- Accept: The system receives conversation data and the current memory state.
- Prompt: An LLM determines how to expand or consolidate the memory.
- Update: The new memory state is saved.
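The cycle above can be sketched in a few lines. This is an illustration only: the `Message` and `MemoryState` types, the `runMemoryCycle` function, and the toy consolidator are hypothetical names, and the real system would call an LLM (asynchronously) where the stand-in function is used here.

```typescript
// Hypothetical types for illustration; zkStash's internal types are not shown in this document.
type Message = { role: "user" | "assistant"; content: string };
type MemoryState = { facts: string[] };

// Stand-in for the LLM step. A real system would prompt a model here.
type Consolidator = (messages: Message[], current: MemoryState) => MemoryState;

// One pass of the cycle: accept the inputs, ask the consolidator, persist the result.
function runMemoryCycle(
  messages: Message[],                 // Accept: conversation data
  current: MemoryState,                // Accept: current memory state
  consolidate: Consolidator,
  save: (next: MemoryState) => void,
): MemoryState {
  const next = consolidate(messages, current); // Prompt: expand or consolidate
  save(next);                                  // Update: store the new state
  return next;
}

// Toy consolidator: appends user statements that are not already stored.
const naiveConsolidator: Consolidator = (messages, current) => {
  const facts = [...current.facts];
  for (const m of messages) {
    if (m.role === "user" && !facts.includes(m.content)) facts.push(m.content);
  }
  return { facts };
};
```

Passing the consolidator and the save function as parameters keeps the sketch independent of any particular model or storage backend.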
Short-Term vs. Long-Term Memory
Short-Term Memory (Context)
Short-term memory typically refers to the context window of the LLM or the state of a specific conversation thread.
- Scope: Limited to a single session or thread.
- Persistence: Ephemeral. Lost when the session ends or the context window is exceeded.
- Use Case: Remembering the user’s name in the current chat, or the last question asked.
Long-Term Memory (Persistence)
zkStash provides long-term memory: knowledge that persists across threads, sessions, and time.
- Scope: Global (per Agent). Accessible across all interactions.
- Persistence: Indefinite. Stored until explicitly deleted or expired.
- Use Case: Remembering a user’s dietary restrictions learned last month, or a fact discovered by another agent in the fleet.
Types of Long-Term Memory
Just as humans use different types of memory for different purposes, AI agents benefit from organizing knowledge into distinct categories:
Semantic Memory (Facts & Knowledge)
Semantic memory stores factual information about the world, users, or domain-specific knowledge. We support two primary patterns:
1. Profiles (Structured)
Profiles are ideal for entities where there should be one active version of the truth.
- Schema Type: Single Record (e.g., UserProfile, CustomerSettings)
- Behavior: Updates replace the previous state.
- Example: “User prefers dark mode”
2. Collections (Unbounded)
Collections are for accumulating an unbounded amount of knowledge that is searched at runtime.
- Schema Type: Multiple Records (e.g., ProductKnowledge, DomainFact)
- Behavior: New facts are appended.
- Example: “Redis is an in-memory database”
Episodic Memory (Experiences & Events)
Episodic memory stores specific events or interactions that happened at a particular time.
- Schema Type: Always Multiple Records (e.g., InteractionLog, TaskHistory)
- Use Case: Learning from past successes/failures.
- Example: “Agent successfully resolved ticket #456 using strategy X”
Procedural Memory (Rules & Strategies)
Procedural memory stores instructions, strategies, or rules that guide behavior.
- Schema Type: Single Record (e.g., AgentInstructions)
- Use Case: Evolving agent behavior based on feedback.
- Example: “For technical issues, escalate to human if confidence < 0.7”
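The difference between Single Record and Multiple Records schemas comes down to replace versus append. The sketch below models that with a plain array; `writeRecord` and the `Cardinality` type are hypothetical names for illustration and not part of the zkStash SDK.

```typescript
// Hypothetical in-memory model; the names below are illustrative, not zkStash APIs.
type Cardinality = "single" | "multiple";
type StoredRecord = { kind: string; data: Record<string, unknown> };

function writeRecord(
  records: StoredRecord[],
  cardinality: Record<string, Cardinality>,
  kind: string,
  data: Record<string, unknown>,
): StoredRecord[] {
  // Single Record: drop the previous state so the new one replaces it.
  // Multiple Records: keep everything and append.
  const kept =
    cardinality[kind] === "single" ? records.filter((r) => r.kind !== kind) : records;
  return [...kept, { kind, data }];
}

const cardinality: Record<string, Cardinality> = {
  UserProfile: "single",       // semantic memory, profile
  DomainFact: "multiple",      // semantic memory, collection
  InteractionLog: "multiple",  // episodic memory
  AgentInstructions: "single", // procedural memory
};

let records: StoredRecord[] = [];
records = writeRecord(records, cardinality, "UserProfile", { theme: "light" });
records = writeRecord(records, cardinality, "UserProfile", { theme: "dark" });
records = writeRecord(records, cardinality, "DomainFact", { fact: "Redis is an in-memory database" });
records = writeRecord(records, cardinality, "DomainFact", { fact: "Redis supports pub/sub messaging" });
```

After these four writes, the array holds one UserProfile (the latest) and two DomainFact entries.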
Writing Memories
There are two primary patterns for when to write memories:
Conscious Formation (Hot Path)
The agent decides to save a memory as part of its normal response flow using MCP.
- Mechanism: The MCP server exposes tools (e.g., create_user_profile) to your agent.
- Pros: Memory is immediately available for the next turn.
- Cons: Adds latency to the user-facing response.
Subconscious Formation (Background)
A separate process analyzes the conversation history and extracts memories after the fact using the zkStash REST API or SDK.
- Mechanism: Background workers call POST /memories.
- Pros: No latency impact; allows for deeper “reflection” and batch processing.
- Cons: Memory is not immediately available.
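The trade-off between the two patterns is mostly about when the write happens relative to the reply. The sketch below makes that ordering explicit. The `WriteMemory` callback is a stand-in for whatever performs the actual write (an MCP tool call or a POST /memories request), and all function names are hypothetical.

```typescript
// Stand-in for the actual write (an MCP tool call or a POST /memories request).
type WriteMemory = (content: string) => void;

// Hot path: the memory is written before the reply is returned, so it is
// available on the next turn, but the write time is added to the response.
function replyOnHotPath(userMessage: string, write: WriteMemory): string {
  write(userMessage);
  return `Reply to: ${userMessage}`;
}

// Background: the reply is returned immediately and the message is queued.
function replyInBackground(userMessage: string, queue: string[]): string {
  queue.push(userMessage);
  return `Reply to: ${userMessage}`;
}

// A worker drains the queue later, which allows batch processing.
function drainQueue(queue: string[], write: WriteMemory): number {
  const batch = queue.splice(0, queue.length);
  batch.forEach((content) => write(content));
  return batch.length;
}
```

In a real deployment the queue would typically be a durable job queue rather than an in-process array, so that pending extractions survive a restart.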
Context Continuity
zkStash maintains a rolling context window for each thread, allowing the extraction process to understand references across multiple API calls.
- Managed History: Recent messages are stored per threadId and provided to the extractor as context.
- Rolling Summarization: When history exceeds a threshold, older messages are summarized to preserve key facts without bloating the context window.
- Idempotency: Duplicate requests (same conversation content) are detected and skipped, making retries safe.
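One way to picture how these three behaviors fit together is the sketch below. It is not zkStash's implementation: the fingerprint function, the `ThreadContext` shape, and the `maxRecent` threshold are assumptions made for illustration, and the summarizer is passed in as a plain function where a real system would use an LLM.

```typescript
type Turn = { role: string; content: string };

interface ThreadContext {
  seen: Set<string>; // fingerprints of requests already processed
  summary: string;   // rolling summary of older messages
  recent: Turn[];    // managed history of recent messages
}

// FNV-1a-style 32-bit hash, used here only as a simple content fingerprint.
function fingerprint(turns: Turn[]): string {
  let h = 0x811c9dc5;
  const text = JSON.stringify(turns);
  for (let i = 0; i < text.length; i++) {
    h ^= text.charCodeAt(i);
    h = Math.imul(h, 0x01000193) >>> 0;
  }
  return h.toString(16);
}

function ingest(
  ctx: ThreadContext,
  turns: Turn[],
  maxRecent: number,
  summarize: (previous: string, older: Turn[]) => string,
): "accepted" | "skipped" {
  const key = fingerprint(turns);
  if (ctx.seen.has(key)) return "skipped"; // duplicate request: safe to retry
  ctx.seen.add(key);
  ctx.recent.push(...turns);
  if (ctx.recent.length > maxRecent) {
    // Fold the oldest messages into the summary and keep only the newest ones.
    const older = ctx.recent.splice(0, ctx.recent.length - maxRecent);
    ctx.summary = summarize(ctx.summary, older);
  }
  return "accepted";
}
```

A production system would more likely use a cryptographic hash for the fingerprint; the short hash here keeps the example free of dependencies.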
Organizing Memories
Memories are organized by Agent ID and optionally by Subject ID and Thread ID.
- Agent ID: The primary namespace. All memories for a specific agent (e.g., “customer-support-bot”) are grouped together.
- Subject ID: Optional tenant isolation scope. Used to separate data between different users or tenants within the same agent (e.g., “tenant-123”).
- Thread ID: Optional sub-namespace for session-specific context. Useful for scoping memories to a particular conversation (e.g., isolating memories from different user chats).
This hierarchical organization allows you to:
- Retrieve all knowledge for an agent across all conversations
- Isolate data per tenant using subjectId
- Filter memories to a specific conversation thread
- Share knowledge between agents by querying across Agent IDs
Multi-Tenancy for Platforms
If you are building a platform where multiple users or tenants share the same agent (e.g., a “Customer Support Bot” serving 10,000 companies), use subjectId to isolate their data.
- Agent ID: customer-support-bot (Your shared agent logic)
- Subject ID: tenant-123 (Company A), tenant-456 (Company B)
This ensures Company A’s queries never retrieve Company B’s memories, even though they use the same agent.
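The scoping rules can be expressed as a simple predicate. The sketch below is a client-side model of the behavior described above, not the zkStash query API: the `Scope` and `ScopedMemory` types and the `inScope` function are hypothetical, and only the agentId, subjectId, and threadId field names come from this page.

```typescript
interface Scope {
  agentId: string;
  subjectId?: string; // optional tenant isolation scope
  threadId?: string;  // optional conversation scope
}

interface ScopedMemory extends Scope {
  text: string;
}

// A memory matches when the agent matches and every scope given in the query matches too.
function inScope(memory: ScopedMemory, query: Scope): boolean {
  if (memory.agentId !== query.agentId) return false;
  if (query.subjectId !== undefined && memory.subjectId !== query.subjectId) return false;
  if (query.threadId !== undefined && memory.threadId !== query.threadId) return false;
  return true;
}

const memories: ScopedMemory[] = [
  { agentId: "customer-support-bot", subjectId: "tenant-123", text: "Company A uses SSO" },
  { agentId: "customer-support-bot", subjectId: "tenant-456", text: "Company B is on the Free plan" },
];

// Company A's query only sees Company A's memories.
const companyA = memories.filter((m) =>
  inScope(m, { agentId: "customer-support-bot", subjectId: "tenant-123" }),
);
```

Omitting subjectId from the query in this model returns memories for every tenant of the agent, which is why a multi-tenant platform should always supply it.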
Memory Response Format
When you search for memories, zkStash returns them in a format optimized for LLM reasoning. The format separates user content from system metadata and includes quality signals.
Structure
{
  "id": "mem_abc123",
  "kind": "UserProfile",
  "quality": {
    "relevance": 0.89,
    "confidence": 0.95
  },
  "data": {
    "dietaryRestrictions": ["vegan", "gluten-free"],
    "favoriteColor": "blue"
  },
  "context": {
    "when": "2024-01-15T10:30:00Z",
    "mentions": [
      { "name": "User", "type": "person" }
    ],
    "tags": ["preferences"],
    "isLatest": true
  },
  "source": "own"
}
Field Descriptions
| Field | Description |
|---|---|
| id | Memory identifier for updates and references. |
| kind | Schema type (e.g., UserProfile, temporal_event). |
| quality.relevance | How well this memory matches your query (0-1). |
| quality.confidence | How certain the system was during extraction (0-1). Use this to weight uncertain vs. definite information. |
| data | The actual content: user-defined fields from your schema. |
| context.when | ISO 8601 timestamp of when the event occurred (if temporal). |
| context.mentions | Named entities referenced, with types for disambiguation (e.g., “Amazon” as company vs. place). |
| context.tags | Topical tags for categorization. |
| context.isLatest | true if this is the current version. false if superseded by a newer memory. |
| source | Where this memory came from: own, shared, or shared:{agentId}. |
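As an example of how the quality signals and isLatest might be used on the client side, the sketch below picks which memories to place in a prompt. The `Memory` interface mirrors the fields in the table above; the `selectForPrompt` function, the 0.7 confidence threshold, and the relevance-times-confidence score are assumptions chosen for illustration, not zkStash recommendations.

```typescript
interface Memory {
  id: string;
  kind: string;
  quality: { relevance: number; confidence: number };
  data: Record<string, unknown>;
  context: { when?: string; tags?: string[]; isLatest: boolean };
  source: string;
}

// Keep current, sufficiently certain memories and rank them by a combined score.
function selectForPrompt(memories: Memory[], minConfidence = 0.7, limit = 5): Memory[] {
  const score = (m: Memory): number => m.quality.relevance * m.quality.confidence;
  return memories
    .filter((m) => m.context.isLatest && m.quality.confidence >= minConfidence)
    .sort((a, b) => score(b) - score(a)) // filter() returned a copy, so the input is not mutated
    .slice(0, limit);
}
```

Superseded memories (isLatest false) are dropped entirely here; an application that needs history, for example to explain how a preference changed, would keep them instead.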
Search Response
A search also includes a searchedAt timestamp at the root level, useful for relative temporal reasoning:
{
  "success": true,
  "memories": [...],
  "searchedAt": "2024-01-20T14:30:00Z"
}
Memory Expiration (TTL)
By default, memories are permanent (on paid plans) or subject to 7-day retention (on the Free plan). However, you can set explicit expiration for any memory using TTL (Time-To-Live).
When to Use TTL
TTL is useful for memories that are:
- Session-specific: Context that’s only relevant during an active task
- Time-sensitive: Information that becomes stale (e.g., “user is currently booking a flight”)
- Temporary overrides: Short-term preferences that shouldn’t persist
Setting Expiration
You can set expiration in two ways:
Duration string (ttl):
// Memory expires in 24 hours
{ kind: "SessionContext", data: { task: "booking" }, ttl: "24h" }
Unix timestamp in milliseconds (expiresAt):
// Memory expires at a specific time
{ kind: "Reminder", data: { text: "Follow up" }, expiresAt: 1735689600000 }
Supported TTL formats: "30s", "15m", "1h", "24h", "7d"
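If you want to validate a duration string before sending it, or convert it to an expiresAt value yourself, a small helper is enough. The sketch below is a client-side convenience and not part of the zkStash SDK. It assumes the only units are s, m, h, and d (the ones that appear in the supported formats above) and that expiresAt is expressed in milliseconds, as the example value suggests.

```typescript
// Milliseconds per unit, for the units that appear in the supported formats.
const UNIT_MS = { s: 1_000, m: 60_000, h: 3_600_000, d: 86_400_000 } as const;

// Convert a duration string such as "24h" into milliseconds.
function ttlToMs(ttl: string): number {
  const match = /^(\d+)([smhd])$/.exec(ttl);
  if (!match) throw new Error(`Unsupported TTL format: ${ttl}`);
  const amount = Number(match[1]);
  const unit = match[2] as keyof typeof UNIT_MS;
  return amount * UNIT_MS[unit];
}

// Convert a duration string into an absolute expiry time in Unix milliseconds.
function ttlToExpiresAt(ttl: string, now: number = Date.now()): number {
  return now + ttlToMs(ttl);
}
```

Accepting `now` as a parameter keeps the conversion deterministic, which makes it easy to test.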
Removing Expiration
To make an expiring memory permanent, update it with expiresAt: null:
// Remove expiry, memory becomes permanent
await client.updateMemory(memoryId, { expiresAt: null });
TTL vs System Retention
| Type | Applies To | Behavior | Who Controls |
|---|---|---|---|
| TTL Expiry | All plans | Memory deleted when expiresAt passes | You (the developer) |
| System Retention | Free plan only | 7-day rolling window | Platform policy |
TTL is user intent—you’re explicitly saying “delete this memory after X time.”
System retention is policy-based—the platform enforces limits on the Free tier.
Note: On the Free plan, both mechanisms apply. A memory with ttl: "1h" will be deleted after 1 hour, even if the 7-day window hasn’t passed. Conversely, a memory without TTL will still be deleted after 7 days unless you have storage insurance.
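The interaction described in the note amounts to taking whichever deadline comes first. The sketch below works that out for a single memory. It is a simplified model under two assumptions that this page does not confirm: that the 7-day window is measured from the memory's creation time, and that storage insurance is not in effect. The function and type names are hypothetical.

```typescript
const FREE_RETENTION_MS = 7 * 24 * 60 * 60 * 1000; // 7 days

type Plan = "free" | "paid";

// Returns the time (Unix ms) at which the memory would be deleted, or null if it is permanent.
function effectiveDeletionTime(
  createdAt: number,
  expiresAt: number | null, // null means no TTL was set
  plan: Plan,
): number | null {
  // Assumption: Free-plan retention counts from creation time.
  const retentionLimit = plan === "free" ? createdAt + FREE_RETENTION_MS : null;
  if (expiresAt === null) return retentionLimit;
  if (retentionLimit === null) return expiresAt;
  return Math.min(expiresAt, retentionLimit); // both apply: the earlier deadline wins
}
```

For a Free-plan memory created at time t with ttl "1h", this yields t + 3,600,000 ms; with no TTL it yields t + 604,800,000 ms; on a paid plan with no TTL it yields null.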
NOTE: Ready to integrate zkStash into your application? Check out the Integrations page for detailed guides on using the REST API, SDK, or MCP.