Lossless recall — no summarization, no missed memories. Then go further than plain vector search: combine semantic search, SQL joins, and analytics in a single API call. One query does what used to take three to five round-trips through your model.
Lossless long-term memory — no summarization, no approximation. Store everything your agents see; recall exactly what matters via semantic search, metadata filters, and analytics — all in a single API call.
Send plain English; we embed it for you with a state-of-the-art model. Bring your own embeddings if you have a custom pipeline.
POST /v1/memory/search
{ "text": "meeting about Q3 budget", "top_k": 5 }

Compose semantic search, metadata filters, SQL joins, and analytics in a single call — from a plain vector lookup to a multi-step pipeline that would normally span three services.
-- Scope a semantic search with SQL, in one call
SELECT m.id
FROM memories m
WHERE m.tenant = $1
  AND m.tier = 'premium'
→ semantic_search(text = "product feedback")
On-start hooks inject relevant memories into your agent's context; on-end hooks save new facts back. Zero memory-management code in your agent logic.
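A minimal sketch of that lifecycle in TypeScript, assuming a memory.search call shaped like the REST endpoint above (the hook names and return shape are illustrative, not the published API):

import { KeyesClient } from "@keyesai/keyes-sdk";

const keyes = new KeyesClient({ apiKey: process.env.KEYES_API_KEY! });

// On-start: recall memories relevant to the task and hand them to the agent.
async function onSessionStart(task: string): Promise<string[]> {
  const hits = await keyes.memory.search({ text: task, top_k: 5 });
  return hits.map((h: { text: string }) => h.text); // inject into context
}

// On-end: persist new facts so the next session can recall them.
async function onSessionEnd(facts: string[]): Promise<void> {
  for (const text of facts) {
    await keyes.memory.store({ text, collection: "agent-facts" });
  }
}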
Segment memory by use-case; each collection has composite-ID encoding that prevents cross-collection leakage at the engine level. Built-in per-tier quotas.
// Collections isolate per tenant
ten_abc123__work_memories
ten_abc123__customer_notes
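In SDK terms, isolation is just a collection argument on every call. A sketch reusing the quick-start client below (the collection filter on search is an assumption):

import { KeyesClient } from "@keyesai/keyes-sdk";
const keyes = new KeyesClient({ apiKey: process.env.KEYES_API_KEY! });

// Writes land in one tenant-scoped collection...
await keyes.memory.store({ text: "Prefers invoices in EUR", collection: "work_memories" });

// ...and reads stay inside it (collection parameter on search assumed).
await keyes.memory.search({ text: "billing preferences", collection: "work_memories", top_k: 3 });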
One API, many doorways: OpenClaw plugin (1-line install), TypeScript SDK for LangChain + custom agents, and a language-agnostic REST API.
// TypeScript SDK
import { KeyesClient } from "@keyesai/keyes-sdk";
const keyes = new KeyesClient({ apiKey });
await keyes.memory.store({ text: "...", collection: "notes" });

Pay per operation, not per token. One UQL call replaces 3–5 LLM round-trips — predictable monthly spend instead of a surprise token bill.
Other memory services give your agents a way to remember — but only at the surface: store text, search by similarity. That's where our Unified Query Language starts. Levels 3–5 let you join, aggregate, and orchestrate across your memories in a single call, replacing 3–5 LLM round-trips.
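For illustration, here is what an L4 aggregate could look like from the TypeScript SDK. The keyes.query method and the UQL text are assumptions modeled on the SQL example above, not a documented signature:

import { KeyesClient } from "@keyesai/keyes-sdk";
const keyes = new KeyesClient({ apiKey: process.env.KEYES_API_KEY! });

// Hypothetical L4 call: filter, group, and rank memories server-side,
// replacing what would otherwise be several LLM round-trips.
const topIssues = await keyes.query(`
  SELECT m.topic, COUNT(*) AS mentions
  FROM memories m
  WHERE m.created_at > now() - interval '7 days'
  GROUP BY m.topic
  ORDER BY mentions DESC
  LIMIT 5
`);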
Give Claude Code or Codex persistent memory across sessions. Every decision, fix, and convention stays recallable — agents pick up exactly where they left off.
Every ticket, every resolution, every pattern — stored and retrievable by meaning. L4 analytics surfaces the top-5 customer issues this week in a single query.
Accumulate thousands of articles, papers, and notes. L3 SQL→Vector joins let you search only within your own tier-1 sources, filtered by date or author (sketched below).
Your preferences, past conversations, recurring plans — remembered accurately, recalled instantly. The assistant that actually knows you, because nothing gets summarized away.
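The research-assistant scoping above, sketched in the same hypothetical keyes.query style (schema fields are illustrative):

import { KeyesClient } from "@keyesai/keyes-sdk";
const keyes = new KeyesClient({ apiKey: process.env.KEYES_API_KEY! });

// Hypothetical L3 join: narrow with SQL first, then rank by meaning.
const hits = await keyes.query(`
  SELECT m.id FROM memories m
  WHERE m.source_tier = 1
    AND m.published_at > '2024-01-01'
  → semantic_search(text = "retrieval-augmented generation")
`);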
Use the language your stack already speaks. Every SDK exposes the same five-level UQL — what works in TypeScript works the same way in Python, Go, and Rust.
npm i @keyesai/keyes-sdk
Core SDK. Sync + async, full L1–L5 UQL coverage, typed responses.
npm i @keyesai/ai-sdk
Memory tools for the Vercel AI SDK — drop into any streamText() call (sketched after this list).
pip install keyesai
Sync + async clients. Works with any framework — LangChain, LlamaIndex, vanilla.
go get github.com/keyesai/go-sdk
Idiomatic Go client with context-aware methods and typed errors.
cargo add keyesai
Async-first crate with serde-typed payloads. Zero unsafe code.
No SDK for your language? keyesmemory.com works from anywhere that can speak HTTPS.
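A sketch of the Vercel AI SDK integration above: streamText() is the real AI SDK entry point, but keyesTools and its options are assumptions, not the published surface of @keyesai/ai-sdk.

import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";
import { keyesTools } from "@keyesai/ai-sdk"; // hypothetical export

const result = streamText({
  model: openai("gpt-4o"),
  prompt: "What did we decide about the Q3 budget?",
  // Hypothetical: exposes memory search/store as tools the model can call.
  tools: keyesTools({ apiKey: process.env.KEYES_API_KEY!, collection: "notes" }),
});

for await (const chunk of result.textStream) process.stdout.write(chunk);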
First-class plugins for popular agent runtimes — install once, your agents get persistent memory across every session.
keyes-claude-code
Persistent memory across Claude Code sessions
A drop-in CLAUDE.md plus optional MCP server. Claude recalls relevant context at session start, stores observations and decisions during work, and summarizes when the session ends. Zero glue code in your project.
# 1. Drop the file
cp CLAUDE.md ~/your-project/

# 2. Set the key
export KEYES_API_KEY="mol_xxxx"

# 3. Open Claude Code — memory is on.
keyes-openclaw
Memory tools + auto-capture for OpenClaw agents
Drops persistent cloud memory into any OpenClaw agent. Registers four L1–L3 memory tools and a session_end hook that auto-stores summaries. Runs alongside OpenClaw's built-in local memory-core — no replacement.
# config.yaml
plugins:
  keyes-memory:
    apiKey: "mol_xxxx"
    collection: "openclaw"
    autoCapture: true

Building a custom agent? Use any of the native SDKs above to wire memory into your own runtime.
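If you go that route, the core loop is recall before the model call and store after it. A minimal sketch with the TypeScript SDK (the search call's shape and callModel are placeholders for your own runtime):

import { KeyesClient } from "@keyesai/keyes-sdk";

const keyes = new KeyesClient({ apiKey: process.env.KEYES_API_KEY! });

// Your own model call, stubbed here.
declare function callModel(msg: string, context: string[]): Promise<string>;

async function runTurn(userMessage: string): Promise<string> {
  // Recall: fetch memories relevant to this turn (search shape assumed).
  const memories = await keyes.memory.search({ text: userMessage, top_k: 5 });

  const reply = await callModel(userMessage, memories.map((m: { text: string }) => m.text));

  // Persist the exchange so future sessions can recall it.
  await keyes.memory.store({ text: `${userMessage} :: ${reply}`, collection: "chat" });
  return reply;
}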
No surprise bills at the end of the month. Every plan measures what matters: how many UQL calls your agents actually make.
For builders kicking the tires
For side projects & personal assistants
For teams shipping agents to users
For regulated orgs, self-hosted
Join the private beta. Get early access to GitDB, Memory, Vector, and Embedded Robotics services.