Lossless recall — no summarization, no missed memories. Then go further than plain vector search: combine semantic search, SQL joins, and analytics in a single API call. One query does what used to take three to five round-trips through your model.
Lossless long-term memory — no summarization, no approximation. Store everything your agents see; recall exactly what matters via semantic search, metadata filters, and analytics — all in a single API call.
Send plain English; we embed it for you with a state-of-the-art model. Bring your own embeddings if you have a custom pipeline.
POST /v1/memory/search
{ "text": "meeting about Q3 budget", "top_k": 5 }

Compose semantic search, metadata filters, SQL joins, and analytics in a single call — from a plain vector lookup to a multi-step pipeline that would normally span three services.
-- Scope a semantic search with SQL, in one call
SELECT m.id
FROM memories m
WHERE m.tenant = $1
  AND m.tier = 'premium'
→ semantic_search(text = "product feedback")
On-start hooks inject relevant memories into your agent's context; on-end hooks save new facts back. Zero memory-management code in your agent logic.
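A minimal sketch of that lifecycle in TypeScript, assuming a memory.search call shaped like the REST endpoint above (the hook names and return shape are illustrative, not the published API):

import { KeyesClient } from "@keyesai/keyes-sdk";

const keyes = new KeyesClient({ apiKey: process.env.KEYES_API_KEY! });

// On-start: recall memories relevant to the task and hand them to the agent.
async function onSessionStart(task: string): Promise<string[]> {
  const hits = await keyes.memory.search({ text: task, top_k: 5 });
  return hits.map((h: { text: string }) => h.text); // inject into context
}

// On-end: persist new facts so the next session can recall them.
async function onSessionEnd(facts: string[]): Promise<void> {
  for (const text of facts) {
    await keyes.memory.store({ text, collection: "agent-facts" });
  }
}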
Segment memory by use-case; each collection has composite-ID encoding that prevents cross-collection leakage at the engine level. Built-in per-tier quotas.
// Collections isolate per tenant
ten_abc123__work_memories
ten_abc123__customer_notes
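In SDK terms, isolation is just a collection argument on every call. A sketch reusing the quick-start client below (the collection filter on search is an assumption):

import { KeyesClient } from "@keyesai/keyes-sdk";
const keyes = new KeyesClient({ apiKey: process.env.KEYES_API_KEY! });

// Writes land in one tenant-scoped collection...
await keyes.memory.store({ text: "Prefers invoices in EUR", collection: "work_memories" });

// ...and reads stay inside it (collection parameter on search assumed).
await keyes.memory.search({ text: "billing preferences", collection: "work_memories", top_k: 3 });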
One API, many doorways: OpenClaw plugin (1-line install), TypeScript SDK for LangChain + custom agents, and a language-agnostic REST API.
// TypeScript SDK
import { KeyesClient } from "@keyesai/keyes-sdk";
const keyes = new KeyesClient({ apiKey });
await keyes.memory.store({ text: "...", collection: "notes" });

Pay per operation, not per token. One UQL call replaces 3–5 LLM round-trips — predictable monthly spend instead of a surprise token bill.
Other memory services give your agents a way to remember — but only at the surface: store text, search by similarity. That's where our Unified Query Language starts. Levels 3–5 let you join, aggregate, and orchestrate across your memories in a single call, replacing 3–5 LLM round-trips.
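For illustration, here is what an L4 aggregate could look like from the TypeScript SDK. The keyes.query method and the UQL text are assumptions modeled on the SQL example above, not a documented signature:

import { KeyesClient } from "@keyesai/keyes-sdk";
const keyes = new KeyesClient({ apiKey: process.env.KEYES_API_KEY! });

// Hypothetical L4 call: filter, group, and rank memories server-side,
// replacing what would otherwise be several LLM round-trips.
const topIssues = await keyes.query(`
  SELECT m.topic, COUNT(*) AS mentions
  FROM memories m
  WHERE m.created_at > now() - interval '7 days'
  GROUP BY m.topic
  ORDER BY mentions DESC
  LIMIT 5
`);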
Give Claude Code or Codex persistent memory across sessions. Every decision, fix, and convention stays recallable — agents pick up exactly where they left off.
Every ticket, every resolution, every pattern — stored and retrievable by meaning. L4 analytics surfaces the top-5 customer issues this week in a single query.
Accumulate thousands of articles, papers, and notes. L3 SQL→Vector joins let you search only within your own tier-1 sources, filtered by date or author (sketched below).
Your preferences, past conversations, recurring plans — remembered accurately, recalled instantly. The assistant that actually knows you, because nothing gets summarized away.
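The research-assistant scoping above, sketched in the same hypothetical keyes.query style (schema fields are illustrative):

import { KeyesClient } from "@keyesai/keyes-sdk";
const keyes = new KeyesClient({ apiKey: process.env.KEYES_API_KEY! });

// Hypothetical L3 join: narrow with SQL first, then rank by meaning.
const hits = await keyes.query(`
  SELECT m.id FROM memories m
  WHERE m.source_tier = 1
    AND m.published_at > '2024-01-01'
  → semantic_search(text = "retrieval-augmented generation")
`);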
Use the language your stack already speaks. Every SDK exposes the same five-level UQL — what works in TypeScript works the same way in Python, Go, and Rust.
npm i @keyesai/keyes-sdk
Core SDK. Sync + async, full L1–L5 UQL coverage, typed responses.
npm i @keyesai/ai-sdk
Memory tools for the Vercel AI SDK — drop into any streamText() call (sketched after this list).
pip install keyesai
Sync + async clients. Works with any framework — LangChain, LlamaIndex, vanilla.
go get github.com/keyesai/go-sdk
Idiomatic Go client with context-aware methods and typed errors.
cargo add keyesai
Async-first crate with serde-typed payloads. Zero unsafe code.
No SDK for your language? keyesmemory.com works from anywhere that can speak HTTPS.
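A sketch of the Vercel AI SDK integration above: streamText() is the real AI SDK entry point, but keyesTools and its options are assumptions, not the published surface of @keyesai/ai-sdk.

import { streamText } from "ai";
import { openai } from "@ai-sdk/openai";
import { keyesTools } from "@keyesai/ai-sdk"; // hypothetical export

const result = streamText({
  model: openai("gpt-4o"),
  prompt: "What did we decide about the Q3 budget?",
  // Hypothetical: exposes memory search/store as tools the model can call.
  tools: keyesTools({ apiKey: process.env.KEYES_API_KEY!, collection: "notes" }),
});

for await (const chunk of result.textStream) process.stdout.write(chunk);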
First-class plugins for popular agent runtimes — install once, your agents get persistent memory across every session.
keyes-claude-code
Persistent memory across Claude Code sessions
A drop-in CLAUDE.md plus optional MCP server. Claude recalls relevant context at session start, stores observations and decisions during work, and summarizes when the session ends. Zero glue code in your project.
# 1. Drop the file
cp CLAUDE.md ~/your-project/

# 2. Set the key
export KEYES_API_KEY="mol_xxxx"

# 3. Open Claude Code — memory is on.
keyes-openclaw
Memory tools + auto-capture for OpenClaw agents
Drops persistent cloud memory into any OpenClaw agent. Registers four L1–L3 memory tools and a session_end hook that auto-stores summaries. Runs alongside OpenClaw's built-in local memory-core — no replacement.
# config.yaml
plugins:
  keyes-memory:
    apiKey: "mol_xxxx"
    collection: "openclaw"
    autoCapture: true

Building a custom agent? Use any of the native SDKs above to wire memory into your own runtime.
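If you go that route, the core loop is recall before the model call and store after it. A minimal sketch with the TypeScript SDK (the search call's shape and callModel are placeholders for your own runtime):

import { KeyesClient } from "@keyesai/keyes-sdk";

const keyes = new KeyesClient({ apiKey: process.env.KEYES_API_KEY! });

// Your own model call, stubbed here.
declare function callModel(msg: string, context: string[]): Promise<string>;

async function runTurn(userMessage: string): Promise<string> {
  // Recall: fetch memories relevant to this turn (search shape assumed).
  const memories = await keyes.memory.search({ text: userMessage, top_k: 5 });

  const reply = await callModel(userMessage, memories.map((m: { text: string }) => m.text));

  // Persist the exchange so future sessions can recall it.
  await keyes.memory.store({ text: `${userMessage} :: ${reply}`, collection: "chat" });
  return reply;
}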
No surprise bills at the end of the month. Every plan measures what matters: how many UQL calls your agents actually make.
For builders kicking the tires
For side projects & personal assistants
For teams shipping agents to users
For regulated orgs, self-hosted
Join the private beta. Get early access to GitDB, Memory, Vector, and Embedded Robotics services.