A common question that's come up over the last few months — from teams reading the recall, memory, and GitDB posts — is the deployment question. "This sounds interesting, but our security team won't let us send data to a SaaS. What are the actual deployment shapes?"
So this post is a short, honest answer to that question. Same engine across both options. The choice is about where it sits and who operates it.
Option 1 — Managed service
The simplest shape. You hit our REST or gRPC endpoint with an API key, we run everything: the vector engine, the memory layer, the GitDB control plane, the storage, the upgrades, the on-call.
Each service has its own domain so you can adopt them independently:
| Product | Domain | What you get |
|---|---|---|
| Vector | keyesbase.com | Indexes, upsert, query, filter |
| Memory | keyesmemory.com | Collections, recall, the full UQL L1–L5 surface |
| GitDB | gitdb.sh | The code-database flagship |
A single tenant-scoped key works across all three (same auth, same dashboard, same bill), but you don't have to commit to all of them on day one. Pricing is per-operation on the public tiers (Free, Starter, Pro, Enterprise), with the details on the pricing page when we open up. It's the shortest path from "I have an idea" to "I have something running."
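If you want a feel for what that looks like from application code, here's a minimal sketch, assuming a Bearer-token REST surface. The endpoint paths, payload shapes, and key format below are illustrative assumptions, not the published API:

```python
import requests

# One tenant-scoped key for all three services (key format is hypothetical).
API_KEY = "kb_live_example"
HEADERS = {"Authorization": f"Bearer {API_KEY}"}

# Upsert into the vector service (illustrative path and payload shape).
resp = requests.post(
    "https://api.keyesbase.com/v1/indexes/docs/upsert",
    headers=HEADERS,
    json={"vectors": [{"id": "doc-1", "values": [0.12, 0.98, 0.33]}]},
    timeout=10,
)
resp.raise_for_status()

# The same key and the same auth header, against the memory service.
resp = requests.post(
    "https://api.keyesmemory.com/v1/collections/notes/recall",
    headers=HEADERS,
    json={"query": "what did we decide about sharding?", "top_k": 5},
    timeout=10,
)
resp.raise_for_status()
print(resp.json())
```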
The right fit when: you want to start tomorrow, your data isn't subject to a regulator that prohibits SaaS, and you'd rather not run infrastructure.
Option 2 — Single-tenant deployment in your cloud (BYOC)
Same engine, single-tenant, deployed into your cloud account. We hand you an AMI for AWS (Google Cloud and Azure are on the same release path), a Terraform module, a Helm chart for the K8s side, and a runbook. The control plane that meters and bills sits with us; the data plane sits inside your VPC. We have access only to operational telemetry — error rates, latency histograms, version state — never to your data.
Why this shape exists: a lot of teams have data they can't send across a shared boundary, but they don't want to staff up a database team either. BYOC is the middle path. You own the data plane and the network perimeter; we run the engine and the upgrades.
The distributed side is already working. The engine clusters horizontally — single-tenant or multi-tenant, with automatic sharding and online rebalancing when you scale the cluster up or down. On K8s the same shape runs as a StatefulSet with operator-managed orchestration.
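The engine's actual placement scheme isn't public, so treat this as a general-idea sketch rather than our implementation: consistent hashing is the classic way to shard so that scaling the cluster up or down moves only a small fraction of the keys, which is what makes online rebalancing cheap.

```python
import bisect
import hashlib

class HashRing:
    """Minimal consistent-hash ring; an illustration, not the engine's scheme."""

    def __init__(self, nodes, vnodes=64):
        self._ring = []  # sorted list of (hash, node) pairs
        for node in nodes:
            self.add(node, vnodes)

    @staticmethod
    def _hash(key: str) -> int:
        return int(hashlib.sha256(key.encode()).hexdigest(), 16)

    def add(self, node: str, vnodes: int = 64) -> None:
        # Each node owns many small arcs of the ring, so adding or removing
        # one node relocates only about 1/N of the keys.
        for i in range(vnodes):
            bisect.insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def node_for(self, key: str) -> str:
        h = self._hash(key)
        idx = bisect.bisect(self._ring, (h, "")) % len(self._ring)
        return self._ring[idx][1]

ring = HashRing(["shard-0", "shard-1", "shard-2"])
print(ring.node_for("vector:doc-42"))  # stable until ring membership changes
```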
The right fit when: your data has to stay inside your cloud account (HIPAA, PCI, sector-specific compliance), you want managed upgrades and observability, and you're comfortable with us seeing operational metadata.
What's the same across both
I want to call out the parts that don't change with deployment shape, because that's the actual product story:
- Same UQL contract. L1 through L5. The application code is identical whether you're on the managed service or BYOC; see the sketch after this list. No "BYOC-only feature."
- Same recall guarantees. 100% Recall@10 at 1536-dim and 3072-dim — same engine, same algorithm, same benchmarks, regardless of where it's deployed.
- Same observability. Prometheus metrics, OpenTelemetry traces, structured logs. The shape of the dashboards looks the same whether we're operating them or you are.
- Same release cadence. When we ship a new engine version, both deployment shapes get it on the same cycle — managed service first (rolling rollout), BYOC second (you trigger when ready).
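To make the "identical application code" point concrete, here's a minimal sketch of what portability across shapes means in practice. The client class, endpoint path, and payload shape are illustrative assumptions, not the published SDK; the point is that only the base URL changes:

```python
import requests

class MemoryClient:
    """Tiny illustrative client, not the published SDK."""

    def __init__(self, base_url: str, api_key: str):
        self.base_url = base_url.rstrip("/")
        self.session = requests.Session()
        self.session.headers["Authorization"] = f"Bearer {api_key}"

    def recall(self, collection: str, query: str, top_k: int = 5):
        # Hypothetical recall endpoint; the call is the same on either shape.
        resp = self.session.post(
            f"{self.base_url}/v1/collections/{collection}/recall",
            json={"query": query, "top_k": top_k},
            timeout=10,
        )
        resp.raise_for_status()
        return resp.json()

# Managed service: point at the public endpoint.
managed = MemoryClient("https://api.keyesmemory.com", "kb_live_example")

# BYOC: the same code, pointed at the endpoint inside your own VPC
# (hostname is a placeholder).
byoc = MemoryClient("https://memory.internal.example.com", "kb_live_example")
```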
What's already built and what's still in flight
Let me be honest about state, since deployment claims are exactly the kind of thing that's easy to overstate:
- Managed service — running in production. Vector and Memory endpoints are live for private beta customers. GitDB endpoint is in closed beta.
- BYOC on AWS — packaged. We have the AMI, the Terraform module, the Helm chart, and an installer that has been exercised against a customer-controlled AWS account. GCP and Azure are on the release path; we haven't shipped them yet.
- Distributed scaling — working. Multi-node clustering and online rebalancing are running in our own internal deployments. We haven't published a public scaling benchmark; we should.
- K8s operator — written, in private testing, not yet GA. The Helm chart works today for static deployments; the operator that handles online upgrades and resharding is the piece we're polishing.
A note on what we don't do
A few honest disclaimers, in the same shape as the rest of this blog:
- We don't claim to be the cheapest at every scale. For a single-digit-million-vector workload on a managed service, the math usually works out in our favor against the established vector DBs. At tens of billions of vectors, true web scale, it's a different conversation, and we're not the right answer for that use case.
- We're not an HSM, a key-management service, or a network appliance vendor. We integrate cleanly with AWS KMS, GCP KMS, and on-prem HSMs, but we don't replace them.
- Maturity. We're in private beta, with a smaller customer base and fewer integrations than the established infrastructure vendors. That's real, and worth weighing.
If either of these shapes fits your situation
If you're a team that's been waiting on the right deployment shape, whether that's "I just want to use the API" or "we need this in our own cloud account," I'd genuinely like to hear about it. We're actively looking for design partners on the BYOC side, and we're opening up managed-service tiers as fast as we can while keeping the operations side careful.
Either we're a fit, or we can be honest about why we're not — and that conversation is useful either way.
— Danny