The on-device memory layer for embodied AI.

One small binary you drop onto every robot. Persistent perception memory, kHz sensor ingestion, crash-safe config — sitting underneath whatever VLA or robot foundation model you run. No external server, no network, no missed frames.

The memory layer for your VLA model.

Every humanoid team is shipping a policy. Almost none of them have solved long-term memory. Embedded sits underneath whatever vision-language-action model, robot foundation model, or in-house stack you run — quietly remembering what the robot saw, did, and learned, and serving it back to your policy in milliseconds.

Any policy

Plug in a VLA, a robot foundation model, or your own in-house policy. Embedded is the substrate, not the boss.

Any embodiment

Wheeled, legged, humanoid, aerial, surgical. From a 6-DoF arm to a 35-joint humanoid — the binary is the same.

Any sensor stack

Cameras, depth, LIDAR, IMU, joint encoders, tactile, force/torque — captured at kHz rates, recalled by meaning.

One small binary, running on every robot.

One unified engine — packaged into a single embedded binary that drops onto a Jetson, an x86 edge box, or any ARM64 controller. Persistent memory, sensor ingestion, and unified queries — on-device, no network required.

Persistent perception memory

Full-fidelity semantic memory, embedded directly inside the robot — on-disk and in-process. The same memory layer your agentic workloads use. Survives restarts. No external server, no network round-trip.

// The robot remembers, even after a power cycle.
100% recall · p50 = 6.35ms
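
The shape of it in code, sketched against a hypothetical Python binding (the keyes module and the memory.remember / memory.recall names are illustrative, not the shipping API):

# Hypothetical Python binding; module and method names are illustrative.
import keyes

db = keyes.open("/var/lib/robot/memory")   # on-disk, in-process store

scene_embedding = [0.0] * 768              # stand-in for your VLA's vision embedding

# Write a perception memory with its metadata at capture time.
db.memory.remember(
    embedding=scene_embedding,
    metadata={"camera_id": 3, "mission": 142},
)

# ...power cycle...

# After restart the same store answers the same query. Nothing is lost.
hits = db.memory.recall(scene_embedding, k=10)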

kHz sensor ingestion

Joint encoders, IMUs, force/torque, depth cameras, tactile arrays — ingested at machine rates into a columnar time-series engine that runs in the same process as your memory layer.

Ingest:  1.6M points/sec  (16× the 100K minimum)
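
A sketch of the ingest path under the same assumed binding; the timeseries.append call and the batching strategy are assumptions, not the shipping API:

# Hypothetical ingest sketch; keyes.open / timeseries.append are illustrative.
# Batching amortizes per-call overhead at 1 kHz.
import time
import keyes

db = keyes.open("/var/lib/robot/memory")

batch = []
while True:
    sample = read_imu()                    # your driver; placeholder here
    batch.append((time.time_ns(), sample["ax"], sample["ay"], sample["az"]))
    if len(batch) == 256:
        db.timeseries.append("imu.accel", batch)   # columnar, in-process
        batch = []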

Cross-engine joins

"Find perception memories from moments when the IMU spiked." Time-series filters and vector search combine in one query, in one process, with zero network round-trips.

// Anomaly-triggered semantic recall:
//   1. TS filter: |accel| > 9.0 in last 60s
//   2. Vector KNN: scenes similar to "near-miss"
//   3. Fused in-process — no copies
join p50 = 6.61ms  (under 10ms budget)
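
Client-side, that anomaly-triggered recall might look like this; the query shape and method names are assumptions, not the shipping UQL:

# Hypothetical join sketch; timeseries.query / memory.recall and their
# parameters are illustrative names, not the shipping API.
import keyes

db = keyes.open("/var/lib/robot/memory")

# 1. Time-series filter: windows in the last 60s where |accel| > 9.0.
spikes = db.timeseries.query("imu.accel_mag", last="60s", where="value > 9.0")

# 2. Vector KNN limited to those windows: scenes similar to "near-miss".
near_miss = [0.0] * 768   # stand-in for a "near-miss" exemplar embedding
hits = db.memory.recall(
    near_miss, k=10,
    time_windows=[(w.start, w.end) for w in spikes],  # fused in-process
)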

Industrial-grade robot config

Calibration parameters, firmware state, LLM prompt templates — stored as BSON inside the embedded unified engine. Power-loss safe by design, production-hardened, sub-millisecond CRUD. The kind of durability your robot was always supposed to have.

Insert p50: 0.22ms
Survives: power loss + process crash (WAL replay)
Format:   BSON (queryable, schema-free)
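
For illustration, a calibration round-trip under the same assumed binding (config.insert and config.find_one are hypothetical names):

# Hypothetical config sketch; collection API names are illustrative.
import keyes

db = keyes.open("/var/lib/robot/memory")

# Schema-free BSON document; the WAL makes the write power-loss safe
# before insert() returns.
db.config.insert("calibration", {
    "arm": "left",
    "tool_offset_mm": [0.0, 0.0, 112.5],
    "force_torque_bias": {"fx": 0.12, "fy": -0.05, "fz": 9.81},
    "firmware": "2.4.1",
})

# Sub-millisecond read on the control path.
cal = db.config.find_one("calibration", {"arm": "left"})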

One unified engine, one frame

Vector, document, and time-series capabilities served by a single engine inside the robot. No trio of separate DB servers, no network round-trips — concurrent queries run zero-copy in the same process.

30 Hz frame budget: 33ms
3 concurrent queries: p50 = 0.36ms
Frames in budget:    100%
Headroom:            91× (yes, ninety-one)
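
Inside a control tick, that means three in-process calls and a sleep; the sketch below uses the same hypothetical binding and a placeholder policy step:

# Hypothetical 30 Hz loop; API names and the act() step are placeholders.
import time
import keyes

db = keyes.open("/var/lib/robot/memory")
FRAME = 1.0 / 30.0                         # 33ms budget per tick
current_embedding = [0.0] * 768            # stand-in for the live scene embedding

while True:
    t0 = time.perf_counter()

    scene = db.memory.recall(current_embedding, k=5)            # vector
    accel = db.timeseries.latest("imu.accel_mag", n=100)        # time-series
    cal   = db.config.find_one("calibration", {"arm": "left"})  # document

    act(scene, accel, cal)                 # your policy; placeholder here

    time.sleep(max(0.0, FRAME - (time.perf_counter() - t0)))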

Filtered episodic recall

Don't just ask "any memory similar to this scene" — ask "memories from camera 3, on mission #142, when force/torque crossed its threshold, similar to this scene." Metadata filters fuse with vector search at the engine level, with zero leakage between filter and rank.

// Episodic recall — filter + semantic in one call
recall(scene_embedding,
       filter={"camera_id": 3, "mission": 142})
Recall@10:  100%
p50:        7.31ms   (under 10ms budget)

Memory that lasts longer than the last five seconds.

Today's leading humanoid robots and self-driving stacks process only the last few seconds of input, then forget. Embedded gives your robot memory at every timescale — from the current kHz sensor stream, through the last mission, all the way to your entire fleet's history.

01 · kHz · Now

Joint encoders, IMU, force/torque, tactile arrays — captured at thousand-hertz rates into the same engine as your perception memory. No drops, no separate logger.

1.6M pts/sec ingest (benchmarked, 16× the 100K minimum)

02 · Episodic · Seconds

What did I just see? Where did I just move? Perception embeddings and time-series sensor data are recallable side-by-side, by meaning or by time.

Vector recall p50 = 6.35ms · range query p50 = 0.40ms

03 · Skill library · Hours → Days

Every mission, every demonstration, every learned skill — persisted on-disk, survives power cycles. Robots stop starting from zero every morning.

100% recall, on-disk · single binary, no external server

04 · Replay & training data · Fleet-wide

The same query layer that serves the robot's policy at the edge feeds fleet replay, evaluation, and training-data export back in the cloud.

Same UQL wire protocol, edge to cloud
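
A sketch of what one protocol buys you; the keyes.connect call and the uql:// scheme are assumptions for illustration:

# Hypothetical sketch; keyes.connect and the uql:// scheme are assumptions.
import keyes

edge  = keyes.open("/var/lib/robot/memory")            # in-process, on-robot
cloud = keyes.connect("uql://fleet.example.com:9000")  # same UQL, over the wire

q = dict(series="imu.accel_mag", last="24h", where="value > 9.0")
edge_hits  = edge.timeseries.query(**q)    # serves the policy at the edge
fleet_hits = cloud.timeseries.query(**q)   # feeds replay/eval in the cloud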

Six phases. All under budget.

End-to-end robotic benchmark: 10K perception embeddings, 100K IMU points, 100 frame cycles. Each phase's actual p50 latency is listed against its real-world budget; the gap between them is headroom.

| Phase | Workload | p50 (actual) | Budget | Headroom |
|-------|----------|--------------|--------|----------|
| 1 | Perception memory · vector KNN ("have I seen this?") | 6.35ms | 10ms (30 Hz) | 1.6× |
| 2 | Filtered recall · metadata-filtered vector search | 7.31ms | 10ms | 1.4× |
| 3 | Robot config · calibration / firmware CRUD | 0.22ms | 1ms | 4.5× |
| 4 | Sensor range query · 1 kHz IMU time-range read | 0.40ms | 1ms | 2.5× |
| 5 | Cross-engine join · sensor → vector (anomaly recall) | 6.61ms | 10ms | 1.5× |
| 6 | Frame budget · 3 concurrent queries at 30 Hz | 0.36ms | 33ms | 91× |

Tightest margin: Phase 5 cross-engine join at 1.5×. Widest margin: Phase 6 frame budget at 91×. Tested on x86 dev server; ARM64 Jetson deployment expected to maintain passing margins for typical embedded collection sizes.

Every robot today forgets.

Most humanoids today only “process the last five seconds.” The state of the art for persistent memory still depends on an external vector server — useful in the lab, fragile in the field. We packaged the whole stack into one binary that runs on the robot.

| Capability | External memory stack | Sensor log files | 3-DB stack | keyes.ai |
|------------|----------------------|------------------|------------|----------|
| Persistent perception memory | Yes — external server, approx recall | No | Vector DB only, approx recall | Yes — embedded, 100% recall |
| Runs on-device without network | No (needs an external vector server) | Yes | No (3 separate servers) | Yes |
| kHz sensor ingestion + time-range query | No | Partial (110 MB/s cap, no query) | Yes (separate DB) | Yes (1.6M pts/sec, in-process) |
| Cross-engine join (sensor → vector) | No | No | No (manual app glue) | Yes (6.61ms p50) |
| 3 engines concurrent in 33ms frame | — | — | No (3 network hops) | Yes (0.36ms p50) |
| Survives restart with full history | Yes | Yes (bag files) | Yes (3 separate DBs) | Yes (single binary) |
| Total external dependencies | External vector server + network | None — but no query | 3 DB servers + network | None |

Benchmarked end-to-end on a 6-phase robotic test (10K embeddings, 100K IMU points, 30 Hz frame loop). All phases pass with 1.4–91× headroom.

Native LIDAR queries, designed for the robot.

Today, production autonomous platforms drop more than half their LIDAR frames because 3D processing can't keep up. The same engine that benchmarks cross-engine joins at 6.61ms is being extended for on-device 3D spatial queries — Z-order indexed point clouds, bounding-box lookups, all in the same single binary. Public benchmarks land with the next release.

3D bounding-box queries

Find every LIDAR point inside a 3D box across the last few seconds, in microseconds. Z-order indexing prunes 75–90% of stored chunks before any scan touches the disk.
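
For intuition, here is the Z-order (Morton) idea in a few lines of plain Python; this illustrates the technique itself, not keyes.ai internals:

def part1by2(n: int) -> int:
    # Spread the low 10 bits of n with two zero bits between each bit.
    n &= 0x3FF
    n = (n | (n << 16)) & 0x030000FF
    n = (n | (n << 8))  & 0x0300F00F
    n = (n | (n << 4))  & 0x030C30C3
    n = (n | (n << 2))  & 0x09249249
    return n

def morton3(x: int, y: int, z: int) -> int:
    # Interleave three 10-bit coordinates into one 30-bit Z-order key.
    return part1by2(x) | (part1by2(y) << 1) | (part1by2(z) << 2)

# Nearby 3D points get nearby keys, so an axis-aligned box maps to a coarse
# key range: every point inside the box has a key between the two corner keys.
# Chunks whose key range misses [lo, hi] are skipped before any disk scan.
lo = morton3(100, 100, 100)   # box min corner (quantized coordinates)
hi = morton3(110, 110, 110)   # box max corner
print(f"candidate chunks must overlap keys {lo}..{hi}")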

One engine for sensors and space

LIDAR sweeps live in the same columnar engine as IMU, joint encoders, and CAN bus. No separate spatial database, no extra process, no extra port to manage on the robot.

Cross-engine spatial reasoning

"Show me perception memories from moments when an obstacle entered the danger zone." One query stitches LIDAR spatial filter to vector recall — no application-level glue.

Engineering status: designed, integration in progress. Benchmarks publish with release.

Built for whatever your robot has to remember.

Retention policy and engine mix are tunable per series — same binary works for a warehouse arm, a humanoid, or a fleet of drones.
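
As a sketch, per-series tuning could be this small; the configure() calls and option names are assumptions, not the shipping config surface:

# Hypothetical tuning sketch; configure() and its options are illustrative.
import keyes

db = keyes.open("/var/lib/robot/memory")

# A warehouse arm keeps raw encoder data briefly; mission memories live longer.
db.timeseries.configure("joint.encoders", retention="15m")
db.timeseries.configure("imu.accel", retention="24h")
db.memory.configure("perception", retention="90d")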

Humanoid robots

Joint encoders at 1–10 kHz, force/torque per joint, tactile arrays, depth cameras, IMU. Persistent memory of every interaction so the robot doesn't relearn the warehouse every morning.

Industrial robotic arms

ISO-9283 calibration parameters and grasp templates served from the embedded unified engine. Sub-millisecond access — your control loop never blocks waiting on the database.

Warehouse AMRs & drones

Depth cameras + IMU + odometry on-board. Persistent memory of every aisle, every pallet, every prior route. Single binary boots in seconds — no separate vector DB, time-series DB, or document store to keep alive.

One binary. Every robot you ship.

One unified binary with everything your robot needs. No SQL, no Postgres, no message queue — none of it ships to the robot.

x86 Edge Server
Arch: Intel Xeon / AMD EPYC
SIMD: AVX-512 VNNI
For:  fleet edge processing, cloud replay

NVIDIA Jetson Orin / Thor · Rolling out · Phase RA
Arch: ARM64 — Cortex-A78AE
SIMD: NEON + SVE2 (via simsimd)
For:  humanoid robots, advanced manipulators

NVIDIA IGX Thor · Rolling out · Phase RA
Arch: ARM64 + GPU
SIMD: NEON + CUDA
For:  industrial robotics, medical

Phase RA brings Mode9 vector search to ARM64 via simsimd NEON kernels. Public benchmarks publish with the release.

Ready to build with keyes.ai?

Join the private beta. Get early access to GitDB, Memory, Vector, and Embedded Robotics services.