[ BENCHMARKS ]

Perfect recall, zero leaks, sub-10ms p95.

Memory Crystal scored 100/100 on a weighted memory benchmark that grades recall, contradiction handling, preferences, and scoped privacy on the same scale. Every case passed. Nothing leaked across tenants. The harness, the fixtures, and the result artifact are all in the open.

30 / 30 cases correct · 0 cross-scope leaks · p50 5.48ms · p95 9.52ms · Open artifact · git 1da070f

START FREE · SEE THE NUMBERS

[ WHY IT WINS ]

Four reasons the score is real.

0

forbidden leaks

Zero cross-tenant leaks

Every scoped-privacy case returned the right memory and only the right memory. Workspaces, peers, and projects stay isolated.

9.52ms

p95 retrieval

Sub-10ms p95 recall

Retrieval is fast enough to drop in front of every model call without changing the user's perception of latency.

5/5

contradiction cases

Contradictions resolved

When a fact changes, the new one wins — every time. No stale priorities, no expired deadlines bleeding into the answer.

100

/100 weighted score

Built for agents, not chatbots

Scope, provenance, and recall ranking are first-class. Drop Memory Crystal in front of any model and keep your stack.
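To make "the new one wins" concrete, here is a minimal sketch of last-write-wins resolution that keeps provenance attached to the surviving fact. It is an illustration only, not Memory Crystal's implementation: the `Fact` fields, the `current_facts` helper, and the sample data are all hypothetical.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Fact:
    key: str               # what the fact is about (hypothetical field)
    value: str             # the stated value
    recorded_at: datetime  # when the fact was recorded
    source: str            # provenance: where the fact came from


def current_facts(facts):
    """Return the most recently recorded fact for each key."""
    latest = {}
    for fact in facts:
        held = latest.get(fact.key)
        # A newer fact supersedes the one currently held for the same key.
        if held is None or fact.recorded_at > held.recorded_at:
            latest[fact.key] = fact
    return latest


# Illustrative data: a deadline that changed between two conversations.
history = [
    Fact("launch_deadline", "June 1", datetime(2026, 1, 10), "kickoff notes"),
    Fact("launch_deadline", "July 15", datetime(2026, 3, 2), "planning call"),
]
resolved = current_facts(history)
print(resolved["launch_deadline"].value, "from", resolved["launch_deadline"].source)
```

The older fact stays in `history`, so the superseded value and its source remain available for auditing even though only the newer one is returned.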

[ HEAD TO HEAD ]

Memory Crystal vs. the field.

Memory Crystal scores come from this repo's harness. Competitor scores are pulled directly from each vendor's own published benchmarks — linked, dated, and unedited.

System	Headline score	Detail	Source
Memory Crystal	100/100	Recall@1 93% · Recall@3 100% · p95 9.52ms · 0 cross-scope leaks	Verified · open artifact
Mem0	LoCoMo 91.6 – 92.5	LongMemEval 93.4 – 94.4 · BEAM 1M 64.1 · BEAM 10M 48.6	Vendor-published ↗
Zep	DMR 94.8%	LongMemEval-S 71.2% with GPT-4o · DMR is not LoCoMo	Paper + vendor blog ↗
Letta	LoCoMo 74.0%	Filesystem agent + GPT-4o-mini · agent-runtime, not service recall	Vendor research blog ↗
Pinecone Assistant	No comparable score	Publishes RAG evaluation APIs, not persistent memory benchmarks	Adjacent category ↗

LoCoMo, LongMemEval, DMR, and BEAM use different scoring rubrics and different conversation corpora, so the numbers in this table are not directly comparable with each other or with Memory Crystal's score. A fair comparison runs every platform through the same harness, which is why we publish the harness instead of a single cherry-picked number.

[ TRACK BY TRACK ]

Every track. Full marks.

The benchmark splits memory into six weighted tracks. Memory Crystal cleared every one without dropping a case and without leaking a scoped memory.

Conversational recall

25/25

Single-hop, multi-hop, temporal, and long-session QA over replayed conversations.

4/4 correct · 0 leaks

Fact recall

15/15

User, project, team, and domain facts pulled across fresh sessions and compactions.

5/5 correct · 0 leaks

Preference recall

15/15

Style, workflow, tool, and communication preferences applied — not just recited.

4/4 correct · 0 leaks

Contradiction handling

15/15

Newer facts supersede older facts with provenance preserved end-to-end.

5/5 correct · 0 leaks

Scoped privacy

15/15

Tenant, workspace, channel, peer, and project boundaries enforced on every read.

10/10 correct · 0 leaks

Latency under load

15/15

Quality stays intact while p50 and p95 stay production-usable.

2/2 correct · 0 leaks
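The six track weights and case counts above add up to the headline figures. The sketch below reproduces that arithmetic. The weights, case counts, and leak counts are taken from this page; the formula (each track earns its weight times its pass rate) is an assumption for illustration, and the published harness may aggregate differently.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Track:
    name: str
    weight: int   # points available for the track, as listed on this page
    correct: int  # cases answered correctly
    total: int    # cases in the track
    leaks: int    # cross-scope leaks observed


TRACKS = [
    Track("Conversational recall", 25, 4, 4, 0),
    Track("Fact recall", 15, 5, 5, 0),
    Track("Preference recall", 15, 4, 4, 0),
    Track("Contradiction handling", 15, 5, 5, 0),
    Track("Scoped privacy", 15, 10, 10, 0),
    Track("Latency under load", 15, 2, 2, 0),
]


def weighted_score(tracks):
    # Assumed formula: weight * (correct / total), summed over tracks.
    return sum(t.weight * t.correct / t.total for t in tracks)


score = weighted_score(TRACKS)
total_cases = sum(t.total for t in TRACKS)
total_leaks = sum(t.leaks for t in TRACKS)
print(f"{score:.0f}/100 over {total_cases} cases, {total_leaks} leaks")
```

Under this assumed formula, a single missed case in a small track costs more than one in a large track: one miss in the two-case latency track would cost 7.5 points, while one miss in the ten-case privacy track would cost 1.5.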

[ LATENCY ]

Fast enough to sit in front of every model call.

Memory Crystal's retrieval path is a vector index with scoped filters, not a multi-step agent. p50 lands around five milliseconds and p95 stays under ten — well below the threshold where users notice the round trip.

Most agent-memory platforms publish quality scores without latency. We publish both, because a perfect recall score behind a slow API is not production memory.
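As a rough picture of what "a vector index with scoped filters" means, the sketch below filters candidates by scope first and only then ranks the survivors by cosine similarity. It is a toy in-memory version under stated assumptions: the `Memory` fields, the `recall` function, and the two-dimensional vectors are hypothetical, and a production index would apply the filter inside the index rather than scanning a list.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Memory:
    text: str
    tenant: str    # hypothetical scope field
    project: str   # hypothetical scope field
    vector: tuple  # embedding; two dimensions here for readability


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(memories, query_vector, tenant, project, k=3):
    # Scope filter runs before ranking, so out-of-scope memories can never
    # appear in the result, no matter how similar they are to the query.
    in_scope = [m for m in memories if m.tenant == tenant and m.project == project]
    ranked = sorted(in_scope, key=lambda m: cosine(m.vector, query_vector), reverse=True)
    return ranked[:k]


store = [
    Memory("Acme deploys on Fridays", "acme", "infra", (1.0, 0.0)),
    Memory("Acme prefers terse replies", "acme", "infra", (0.0, 1.0)),
    Memory("Globex deploys on Mondays", "globex", "infra", (1.0, 0.0)),
]
hits = recall(store, (0.9, 0.1), tenant="acme", project="infra")
for m in hits:
    print(m.text)
```

The Globex memory has the same vector as the top Acme result, yet it is excluded because the scope check happens before any similarity is computed.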

p50

5.48ms

median recall latency

p95

9.52ms

95th percentile

Recall @ 1

93%

first result is the right one

Recall @ 3

100%

target in the top three
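For readers who want to recompute these four figures from raw results, here is a minimal sketch of Recall@k and nearest-rank percentiles. The function names and the sample data are illustrative assumptions, not the harness's actual code or measurements, and the harness may use a different percentile method (for example, interpolation).

```python
import math


def recall_at_k(ranked_results, targets, k):
    """Fraction of queries whose target appears in the top k results."""
    hits = sum(1 for ranked, target in zip(ranked_results, targets) if target in ranked[:k])
    return hits / len(targets)


def percentile(samples, pct):
    """Nearest-rank percentile: the smallest sample with at least pct% at or below it."""
    ordered = sorted(samples)
    rank = max(1, math.ceil(pct / 100 * len(ordered)))
    return ordered[rank - 1]


# Illustrative data only: three queries and five latency samples in milliseconds.
ranked = [["a", "b", "c"], ["x", "y", "z"], ["q", "r", "s"]]
targets = ["a", "y", "s"]
latencies_ms = [4.0, 5.0, 5.5, 6.0, 9.0]

print("Recall@1:", recall_at_k(ranked, targets, 1))
print("Recall@3:", recall_at_k(ranked, targets, 3))
print("p50:", percentile(latencies_ms, 50), "p95:", percentile(latencies_ms, 95))
```

In the sample data only the first query has its target in position one, while all three have it in the top three, which is how Recall@1 can sit below Recall@3 on the same run.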

[ TRANSPARENCY ]

No cherry-picking.

Every Memory Crystal number on this page came from the same harness, the same fixtures, and the same model setup. The artifact JSON is in the repo.

Competitor numbers are vendor-published. We label every row with its source so you can click through, read the methodology, and judge for yourself.

We do not paint over benchmarks we have not reproduced. When we run the same harness against another platform, the row updates — until then it stays clearly attributed.

Scope and privacy are scored the same way as recall. A platform that leaks across tenants does not get to claim a high memory score.
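One way to score privacy on the same footing as recall is to fail any case that returns a forbidden memory, even when the expected memory is also present. The sketch below shows that rule. The `score_case` function and the memory ids are hypothetical, and this is an assumed reading of the rule rather than the harness's exact logic.

```python
def score_case(returned_ids, expected_ids, forbidden_ids):
    """Pass only if every expected memory is returned and no forbidden one is."""
    returned = set(returned_ids)
    leaked = returned & set(forbidden_ids)
    found = set(expected_ids) <= returned
    return {"passed": found and not leaked, "leaks": len(leaked)}


# Clean read: the expected memory comes back and nothing forbidden does.
clean = score_case(["m1"], expected_ids=["m1"], forbidden_ids=["m9"])

# Leaky read: the expected memory is present, but so is another tenant's memory.
leaky = score_case(["m1", "m9"], expected_ids=["m1"], forbidden_ids=["m9"])

print(clean)
print(leaky)
```

Under this rule a leak cannot be offset by good recall: the second case fails outright and its leak is counted separately.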

[ COMPETITOR CLAIMS ]

What everyone else is publishing.

Every claim below is sourced directly from the vendor. We keep them visible — even when the score is impressive — because that is what an honest comparison looks like.

Mem0

vendor documentation

LoCoMo / LongMemEval / BEAM

LoCoMo 91.6; LongMemEval 93.4; BEAM 1M 64.1; BEAM 10M 48.6

Public claim only; not reproduced by the Memory Crystal harness.

Read the source ↗

Mem0

vendor research page

LoCoMo / LongMemEval / BEAM

LoCoMo 92.5; LongMemEval 94.4; BEAM 1M/10M 64.1/48.6

Public claim only; check the page date when citing, because Mem0's benchmark pages have changed over time.

Read the source ↗

Zep

paper

Deep Memory Retrieval

94.8%

Public claim only; DMR is not the same benchmark as LoCoMo, LongMemEval, or BEAM.

Read the source ↗

Zep

public research/blog claim

LongMemEval-S

71.2%

Public claim only; not directly comparable with Memory Crystal until the same harness is run.

Read the source ↗

Letta

vendor research blog

LoCoMo

74.0%

Useful baseline showing LoCoMo can reward retrieval/tool setup; not a direct service-vs-service reproduction.

Read the source ↗

Letta

vendor benchmark page

Context-Bench

Adjacent benchmark; no direct Memory Crystal comparison

Listed as an adjacent benchmark, not as a competitor score against Memory Crystal.

Read the source ↗

Pinecone Assistant

vendor API documentation

Assistant evaluation API

No comparable public LoCoMo / LongMemEval / BEAM score found

This does not imply underperformance; the category is assistant/RAG evaluation rather than persistent personal memory.

Read the source ↗

[ READY? ]

The fastest agent memory you can install today.

100/100 on the benchmark, sub-10ms p95, zero cross-tenant leaks, and a published artifact you can audit line by line. Memory Crystal drops in front of any model and gives your agent a real long-term memory.

START FREE · SEE PRICING

Run mc-seeded-20260525T132140Z · git 1da070f · fixture 03dee778cb · 30 cases