Memories

Memories are the core of the zkStash platform. They represent the knowledge that your agents accumulate over time. Unlike the ephemeral context window of an LLM, memories in zkStash are persistent, searchable, and structured.

Core Memory Operation

At its core, every memory operation follows a simple cycle:

Accept: The system receives conversation data and the current memory state.
Prompt: An LLM determines how to expand or consolidate the memory.
Update: The new memory state is saved.
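
The cycle above can be sketched as a single function. This is a minimal illustration, not zkStash's implementation: `ChatMessage`, `MemoryState`, `Extractor`, and `runMemoryCycle` are hypothetical names, and the LLM call is replaced by a simple stub.

```typescript
// Hypothetical types for illustration; not part of the zkStash SDK.
type ChatMessage = { role: "user" | "assistant"; content: string };
type MemoryState = { facts: string[] };

// The "Prompt" step. A real extractor would call an LLM (and be async);
// this one is synchronous to keep the sketch short.
type Extractor = (messages: ChatMessage[], current: MemoryState) => string[];

// Stub standing in for the LLM: treats user messages starting with "I " as facts.
const stubExtractor: Extractor = (messages) =>
  messages
    .filter((m) => m.role === "user" && m.content.startsWith("I "))
    .map((m) => m.content);

function runMemoryCycle(
  messages: ChatMessage[], // Accept: conversation data...
  current: MemoryState, // ...and the current memory state
  extract: Extractor,
): MemoryState {
  // Prompt: ask the extractor how to expand or consolidate the memory.
  const candidates = extract(messages, current);
  // Update: return the new state, skipping facts that are already stored.
  const merged = new Set([...current.facts, ...candidates]);
  return { facts: Array.from(merged) };
}
```

Returning a new state instead of mutating the old one keeps each cycle easy to retry or replay.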

Short-Term vs. Long-Term Memory

Short-Term Memory (Context)

Short-term memory typically refers to the context window of the LLM or the state of a specific conversation thread.

Scope: Limited to a single session or thread.
Persistence: Ephemeral. Lost when the session ends or the context window is exceeded.
Use Case: Remembering the user’s name in the current chat, or the last question asked.

Long-Term Memory (Persistence)

zkStash provides long-term memory: knowledge that persists across threads, sessions, and time.

Scope: Global (per Agent). Accessible across all interactions.
Persistence: Indefinite. Stored until explicitly deleted or expired.
Use Case: Remembering a user’s dietary restrictions learned last month, or a fact discovered by another agent in the fleet.

Types of Long-Term Memory

Just as humans use different types of memory for different purposes, AI agents benefit from organizing knowledge into distinct categories:

Semantic Memory (Facts & Knowledge)

Semantic memory stores factual information about the world, users, or domain-specific knowledge. We support two primary patterns:

1. Profiles (Structured)

Profiles are ideal for entities where there should be one active version of the truth.

Schema Type: Single Record (e.g., UserProfile, CustomerSettings)
Behavior: Updates replace the previous state.
Example: “User prefers dark mode”

2. Collections (Unbounded)

Collections are for accumulating an unbounded amount of knowledge that is searched at runtime.

Schema Type: Multiple Records (e.g., ProductKnowledge, DomainFact)
Behavior: New facts are appended.
Example: “Redis is an in-memory database”
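
The difference between the two behaviors can be sketched with a toy in-memory store. Everything here (`SketchStore`, `Cardinality`, the `theme` field) is hypothetical. A real store may also keep superseded versions rather than dropping them, as this sketch does.

```typescript
// Hypothetical in-memory store; names and shapes are illustrative only.
type Cardinality = "single" | "multiple";
type StoredRecord = { kind: string; data: Record<string, unknown> };

class SketchStore {
  private records: StoredRecord[] = [];
  private cardinality: Record<string, Cardinality>;

  constructor(cardinality: Record<string, Cardinality>) {
    this.cardinality = cardinality;
  }

  write(kind: string, data: Record<string, unknown>): void {
    if (this.cardinality[kind] === "single") {
      // Profile: drop the previous record so one version of the truth remains.
      this.records = this.records.filter((r) => r.kind !== kind);
    }
    // Collection: nothing is removed, so the new record is simply appended.
    this.records.push({ kind, data });
  }

  list(kind: string): StoredRecord[] {
    return this.records.filter((r) => r.kind === kind);
  }
}

const sketchStore = new SketchStore({ UserProfile: "single", DomainFact: "multiple" });
sketchStore.write("UserProfile", { theme: "light" });
sketchStore.write("UserProfile", { theme: "dark" }); // replaces the previous profile
sketchStore.write("DomainFact", { text: "Redis is an in-memory database" });
sketchStore.write("DomainFact", { text: "Redis supports pub/sub" }); // appended
```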

Episodic Memory (Experiences & Events)

Episodic memory stores specific events or interactions that happened at a particular time.

Schema Type: Always Multiple Records (e.g., InteractionLog, TaskHistory)
Use Case: Learning from past successes/failures.
Example: “Agent successfully resolved ticket #456 using strategy X”

Procedural Memory (Rules & Strategies)

Procedural memory stores instructions, strategies, or rules that guide behavior.

Schema Type: Single Record (e.g., AgentInstructions)
Use Case: Evolving agent behavior based on feedback.
Example: “For technical issues, escalate to human if confidence < 0.7”

Writing Memories

There are two primary patterns for when to write memories:

Conscious Formation (Hot Path)

The agent decides to save a memory as part of its normal response flow using MCP.

Mechanism: The MCP server exposes tools (e.g., create_user_profile) to your agent.
Pros: Memory is immediately available for the next turn.
Cons: Adds latency to the user-facing response.

Subconscious Formation (Background)

A separate process analyzes the conversation history and extracts memories after the fact using the zkStash REST API or SDK.

Mechanism: Background workers call POST /memories.
Pros: No latency impact; allows for deeper “reflection” and batch processing.
Cons: Memory is not immediately available.
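
As a rough sketch of the background pattern, a worker might assemble a request like the one below after the user-facing response has been sent. Only `POST /memories` comes from this page. The base URL, the bearer-token header, and the body field names are placeholder assumptions, so check the API reference for the real shapes.

```typescript
// Assumptions: base URL, auth header, and body fields are placeholders.
// Only the `POST /memories` route is taken from the documentation.
type ChatMessage = { role: "user" | "assistant"; content: string };

interface ExtractionJob {
  agentId: string;
  threadId?: string;
  messages: ChatMessage[];
}

function buildMemoriesRequest(baseUrl: string, apiKey: string, job: ExtractionJob) {
  return {
    url: `${baseUrl}/memories`,
    init: {
      method: "POST",
      headers: {
        "Content-Type": "application/json",
        Authorization: `Bearer ${apiKey}`,
      },
      body: JSON.stringify(job),
    },
  };
}

// A background worker could then send it with `fetch(memoriesReq.url, memoriesReq.init)`.
const memoriesReq = buildMemoriesRequest("https://api.example.invalid", "YOUR_API_KEY", {
  agentId: "customer-support-bot",
  threadId: "thread-1",
  messages: [{ role: "user", content: "I'm vegan and gluten-free." }],
});
```

Building the request separately from sending it makes the worker easy to test and to retry.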

Context Continuity

zkStash maintains a rolling context window for each thread, allowing the extraction process to understand references across multiple API calls.

Managed History: Recent messages are stored per threadId and provided to the extractor as context.
Rolling Summarization: When history exceeds a threshold, older messages are summarized to preserve key facts without bloating the context window.
Idempotency: Duplicate requests (same conversation content) are detected and skipped, making retries safe.
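
The three behaviors above can be illustrated with a small sketch. It is not how zkStash implements them: the threshold (`MAX_RECENT`), the string-joining `summarize` stand-in, and the JSON-based duplicate check are all assumptions chosen to keep the example short.

```typescript
// Illustrative only: threshold, summarizer, and duplicate check are assumptions.
type ChatMessage = { role: "user" | "assistant"; content: string };

interface ThreadContext {
  summary: string; // condensed form of older messages
  recent: ChatMessage[]; // most recent messages, kept verbatim
  seen: Set<string>; // fingerprints of batches already processed
}

const MAX_RECENT = 4; // hypothetical threshold

// Stand-in for an LLM summarizer: joins message contents into one string.
function summarize(previous: string, older: ChatMessage[]): string {
  const text = older.map((m) => m.content).join(" / ");
  return previous ? `${previous} / ${text}` : text;
}

// Returns false when the same batch was already processed, so retries are safe.
function appendToThread(ctx: ThreadContext, batch: ChatMessage[]): boolean {
  const fingerprint = JSON.stringify(batch);
  if (ctx.seen.has(fingerprint)) return false; // idempotency: skip duplicates
  ctx.seen.add(fingerprint);

  ctx.recent.push(...batch); // managed history for this thread
  if (ctx.recent.length > MAX_RECENT) {
    // Rolling summarization: fold the oldest messages into the summary.
    const overflow = ctx.recent.splice(0, ctx.recent.length - MAX_RECENT);
    ctx.summary = summarize(ctx.summary, overflow);
  }
  return true;
}
```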

Organizing Memories

Memories are organized by Agent ID and optionally by Subject ID and Thread ID.

Agent ID: The primary namespace. All memories for a specific agent (e.g., “customer-support-bot”) are grouped together.
Subject ID: Optional tenant isolation scope. Used to separate data between different users or tenants within the same agent (e.g., “tenant-123”).
Thread ID: Optional sub-namespace for session-specific context. Useful for scoping memories to a particular conversation (e.g., isolating memories from different user chats).

This hierarchical organization allows you to:

Retrieve all knowledge for an agent across all conversations
Isolate data per tenant using subjectId
Filter memories to a specific conversation thread
Share knowledge between agents by querying across Agent IDs

Multi-Tenancy for Platforms

If you are building a platform where multiple users or tenants share the same agent (e.g., a “Customer Support Bot” serving 10,000 companies), use subjectId to isolate their data.

Agent ID: customer-support-bot (Your shared agent logic)
Subject ID: tenant-123 (Company A), tenant-456 (Company B)

This ensures Company A’s queries never retrieve Company B’s memories, even though they use the same agent.
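
Conceptually, scoping works like a filter over the three IDs. The sketch below uses a hypothetical record shape (`ScopedMemory`) and helper (`inScope`) to show the idea. In practice the isolation is enforced by the platform, not by client-side filtering.

```typescript
// Hypothetical record shape and filter; field names mirror the IDs described above.
interface ScopedMemory {
  agentId: string;
  subjectId?: string;
  threadId?: string;
  text: string;
}

interface Scope {
  agentId: string;
  subjectId?: string;
  threadId?: string;
}

// Keep a memory only if it matches every ID present in the scope.
function inScope(m: ScopedMemory, scope: Scope): boolean {
  if (m.agentId !== scope.agentId) return false;
  if (scope.subjectId !== undefined && m.subjectId !== scope.subjectId) return false;
  if (scope.threadId !== undefined && m.threadId !== scope.threadId) return false;
  return true;
}

const allMemories: ScopedMemory[] = [
  { agentId: "customer-support-bot", subjectId: "tenant-123", text: "Company A uses SSO" },
  { agentId: "customer-support-bot", subjectId: "tenant-456", text: "Company B is on the Free plan" },
];

// Company A's query only sees tenant-123 records.
const companyA = allMemories.filter((m) =>
  inScope(m, { agentId: "customer-support-bot", subjectId: "tenant-123" }),
);
```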

Memory Response Format

When you search for memories, zkStash returns them in a format designed to be placed directly into an LLM's context. This format separates user content from system metadata and includes quality signals.

Structure


{
  "id": "mem_abc123",
  "kind": "UserProfile",
  "quality": {
    "relevance": 0.89,
    "confidence": 0.95
  },
  "data": {
    "dietaryRestrictions": ["vegan", "gluten-free"],
    "favoriteColor": "blue"
  },
  "context": {
    "when": "2024-01-15T10:30:00Z",
    "mentions": [
      { "name": "User", "type": "person" }
    ],
    "tags": ["preferences"],
    "isLatest": true
  },
  "source": "own"
}

Field Descriptions

Field	Description
`id`	Memory identifier for updates and references.
`kind`	Schema type (e.g., `UserProfile`, `temporal_event`).
`quality.relevance`	How well this memory matches your query (0-1).
`quality.confidence`	How certain the system was during extraction (0-1). Use this to weight uncertain vs. definite information.
`data`	The actual content: user-defined fields from your schema.
`context.when`	ISO 8601 timestamp of when the event occurred (if temporal).
`context.mentions`	Named entities referenced, with types for disambiguation (e.g., “Amazon” as company vs. place).
`context.tags`	Topical tags for categorization.
`context.isLatest`	`true` if this is the current version. `false` if superseded by a newer memory.
`source`	Where this memory came from: `own`, `shared`, or `shared:{agentId}`.
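
One way to use these fields is to filter and rank memories before adding them to a prompt. The interface below mirrors the documented fields, but which ones are optional is a guess. The `selectForPrompt` helper and its `0.7` threshold are hypothetical.

```typescript
// Mirrors the documented fields; optionality and the helper are assumptions.
interface MemoryResult {
  id: string;
  kind: string;
  quality: { relevance: number; confidence: number };
  data: Record<string, unknown>;
  context: {
    when?: string;
    mentions?: { name: string; type: string }[];
    tags?: string[];
    isLatest: boolean;
  };
  source: string; // "own", "shared", or "shared:{agentId}"
}

// Keep current, sufficiently confident memories, most relevant first.
function selectForPrompt(memories: MemoryResult[], minConfidence = 0.7): MemoryResult[] {
  return memories
    .filter((m) => m.context.isLatest && m.quality.confidence >= minConfidence)
    .sort((a, b) => b.quality.relevance - a.quality.relevance);
}
```

Because `filter` returns a new array, the sort does not reorder the original search results.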

Search Response

A search response also includes a searchedAt timestamp at the root level, which is useful for relative temporal reasoning:


{
  "success": true,
  "memories": [...],
  "searchedAt": "2024-01-20T14:30:00Z"
}
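
For example, you could turn a memory's `context.when` into an age relative to `searchedAt` before handing it to the model. The helper name `daysBeforeSearch` is hypothetical. The timestamps are the ones from the examples on this page.

```typescript
// Whole days between a memory's `context.when` and the response's `searchedAt`.
function daysBeforeSearch(when: string, searchedAt: string): number {
  const MS_PER_DAY = 24 * 60 * 60 * 1000;
  return Math.floor((Date.parse(searchedAt) - Date.parse(when)) / MS_PER_DAY);
}

// 2024-01-15T10:30Z to 2024-01-20T14:30Z is 5 days and 4 hours, so this is 5.
const memoryAgeDays = daysBeforeSearch("2024-01-15T10:30:00Z", "2024-01-20T14:30:00Z");
```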

Memory Expiration (TTL)

By default, memories are permanent (on paid plans) or subject to 7-day retention (on the Free plan). However, you can set explicit expiration for any memory using TTL (Time-To-Live).

When to Use TTL

TTL is useful for memories that are:

Session-specific: Context that’s only relevant during an active task
Time-sensitive: Information that becomes stale (e.g., “user is currently booking a flight”)
Temporary overrides: Short-term preferences that shouldn’t persist

Setting Expiration

You can set expiration in two ways:

Duration string (ttl):


// Memory expires in 24 hours
{ kind: "SessionContext", data: { task: "booking" }, ttl: "24h" }

Unix timestamp in milliseconds (expiresAt):


// Memory expires at a specific time (here 2025-01-01T00:00:00Z, in Unix milliseconds)
{ kind: "Reminder", data: { text: "Follow up" }, expiresAt: 1735689600000 }

Supported TTL formats: "30s", "15m", "1h", "24h", "7d"
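
To see how a duration string relates to an absolute expiry, here is a small sketch that converts the formats listed above into milliseconds. The helpers `ttlToMs` and `ttlToExpiresAt` are hypothetical and only cover the `s`, `m`, `h`, and `d` suffixes shown. The server-side parser may behave differently.

```typescript
// Milliseconds per supported unit suffix.
const UNIT_MS: Record<string, number> = {
  s: 1_000,
  m: 60_000,
  h: 3_600_000,
  d: 86_400_000,
};

// Parses strings such as "30s", "15m", "1h", "24h", "7d" into milliseconds.
function ttlToMs(ttl: string): number {
  const match = /^(\d+)([smhd])$/.exec(ttl);
  if (!match) throw new Error(`Unsupported TTL format: ${ttl}`);
  return Number(match[1]) * UNIT_MS[match[2]];
}

// Equivalent absolute expiry in Unix milliseconds, relative to `now`.
function ttlToExpiresAt(ttl: string, now: number = Date.now()): number {
  return now + ttlToMs(ttl);
}
```

With this, `ttl: "24h"` and `expiresAt: ttlToExpiresAt("24h")` describe the same expiry, up to the moment at which each is evaluated.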

Removing Expiration

To make an expiring memory permanent, update it with expiresAt: null:


// Remove expiry, memory becomes permanent
await client.updateMemory(memoryId, { expiresAt: null });

TTL vs System Retention

Type	Applies To	Behavior	Who Controls
TTL Expiry	All plans	Memory deleted when `expiresAt` passes	You (the developer)
System Retention	Free plan only	7-day rolling window	Platform policy

TTL expresses your intent: you are explicitly saying “delete this memory after X time.”
System retention is policy-based—the platform enforces limits on the Free tier.

Note: On the Free plan, both mechanisms apply. A memory with ttl: "1h" will be deleted after 1 hour, even if the 7-day window hasn’t passed. Conversely, a memory without TTL will still be deleted after 7 days unless you have storage insurance.
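
The interaction between the two mechanisms amounts to "whichever deadline comes first." The sketch below assumes the 7-day window is measured from the memory's creation time and ignores storage insurance. `effectiveDeletion` and `Plan` are hypothetical names.

```typescript
const FREE_RETENTION_MS = 7 * 24 * 60 * 60 * 1000; // the Free plan's 7-day window

type Plan = "free" | "paid";

// Returns the Unix-ms time at which a memory would be deleted, or null if permanent.
// Assumption: Free-plan retention is counted from the memory's creation time.
function effectiveDeletion(createdAt: number, expiresAt: number | null, plan: Plan): number | null {
  const retentionLimit = plan === "free" ? createdAt + FREE_RETENTION_MS : null;
  if (expiresAt === null) return retentionLimit; // no TTL: only retention applies
  if (retentionLimit === null) return expiresAt; // paid plan: only TTL applies
  return Math.min(expiresAt, retentionLimit); // both apply: earliest deadline wins
}
```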

NOTE: Ready to integrate zkStash into your application? Check out the Integrations page for detailed guides on using the REST API, SDK, or MCP.