Most agent memory systems are toys

A lot of “memory” implementations are glorified chat logs plus vector search. That fails in production because memory has to support decision continuity, not just fuzzy text matching.

Two memory types you must separate

Episodic memory

Who did what, when, under what constraints.

Semantic memory

Stable facts: architecture choices, owners, conventions.

If you mix them in one flat namespace, retrieval quality degrades quickly: a query for a stable fact has to compete with every event that happens to mention it.
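
One way to keep the two types apart is to give each its own namespace and route every write on an explicit type tag. The sketch below is only an illustration: the MemoryStore class, its method names, and the record fields are assumptions made for this example, not the API of any particular library.

```javascript
// Hypothetical in-memory store that keeps episodic and semantic records
// in separate namespaces instead of one flat list.
class MemoryStore {
  constructor() {
    this.namespaces = new Map([
      ["episodic", []],
      ["semantic", []],
    ]);
  }

  // Look up a namespace, rejecting any type that was never declared.
  bucket(type) {
    const records = this.namespaces.get(type);
    if (!records) {
      throw new Error(`Unknown memory type: ${type}`);
    }
    return records;
  }

  // Route each record by its explicit type tag.
  write(record) {
    this.bucket(record.type).push(record);
  }

  // Queries name the namespace they want, so events never compete with facts.
  query(type, predicate) {
    return this.bucket(type).filter(predicate);
  }
}

const store = new MemoryStore();
store.write({
  type: "episodic",
  text: "Dana rolled back the deploy during the freeze window",
  actor: "Dana",
  at: "2024-03-02T10:15:00Z",
});
store.write({
  type: "semantic",
  text: "The billing service is owned by the payments team",
  entities: ["billing service", "payments team"],
});

const facts = store.query("semantic", (m) => m.text.includes("billing"));
console.log(facts.map((m) => m.text));
```

A production system would more likely use separate collections or indexes than two arrays, but the routing rule stays the same: nothing is stored without a type.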

Why naive approaches fail

no salience scoring
no temporal decay rules
no entity linking
no checkpoint boundaries
no write governance (everything gets stored)

Result: noisy recalls, stale context, and hallucinated continuity.
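
As an example of one of the missing rules, temporal decay can be as simple as an exponential half-life per memory type. This is a minimal sketch: the half-life values, the createdAt field (milliseconds since the epoch), and the function name are assumptions for illustration, not tuned recommendations.

```javascript
// Exponential decay: a memory loses half its recency score every half-life.
// The half-life values below are illustrative assumptions, not tuned numbers.
const HALF_LIFE_DAYS = { episodic: 14, semantic: 180 };
const DEFAULT_HALF_LIFE_DAYS = 30;
const MS_PER_DAY = 24 * 60 * 60 * 1000;

function recencyScore(memory, now = Date.now()) {
  // Clamp at zero so a timestamp in the future cannot score above 1.
  const ageDays = Math.max(0, (now - memory.createdAt) / MS_PER_DAY);
  const halfLife = HALF_LIFE_DAYS[memory.type] ?? DEFAULT_HALF_LIFE_DAYS;
  return Math.pow(0.5, ageDays / halfLife);
}

const now = Date.UTC(2024, 5, 1);
const twoWeeksAgo = now - 14 * MS_PER_DAY;

// The same age costs an event half its score but barely touches a stable fact.
const episodicScore = recencyScore({ type: "episodic", createdAt: twoWeeksAgo }, now);
const semanticScore = recencyScore({ type: "semantic", createdAt: twoWeeksAgo }, now);
console.log(episodicScore, semanticScore);
```

Giving events a shorter half-life than facts is the design choice here: a two-week-old incident fades, while a two-week-old ownership fact is still almost fully trusted.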

Embeddings are table stakes, not strategy

Use strong embeddings, yes. But the winning system combines:

vector similarity
metadata filters
graph relationships
scoring pipeline with intent-aware reranking

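// Blend four signals into one relevance score; the weights sum to 1.0.
function rank(memory, queryIntent) {
  return (
    0.45 * memory.semanticScore + // how closely the text matches the query
    0.25 * memory.recencyScore + // how recent the memory is
    0.20 * memory.salienceScore + // how important the memory is
    0.10 * memory.graphProximity(queryIntent.entities) // closeness to the query's entities
  );
}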
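
To show how the ranking function fits with metadata filters, here is a hedged sketch of a retrieval step that filters first and scores only the survivors. The rank function is repeated so the block runs on its own. Everything else is an assumption for illustration: the makeMemory and retrieve helpers, the precomputed score fields, and a graphProximity defined as the fraction of query entities a memory is linked to.

```javascript
// Weights copied from the rank() function above.
function rank(memory, queryIntent) {
  return (
    0.45 * memory.semanticScore +
    0.25 * memory.recencyScore +
    0.20 * memory.salienceScore +
    0.10 * memory.graphProximity(queryIntent.entities)
  );
}

// Hypothetical memory factory. Graph proximity is approximated here as the
// fraction of query entities that this memory is linked to.
function makeMemory(fields) {
  return {
    ...fields,
    graphProximity(entities) {
      if (entities.length === 0) return 0;
      const linked = entities.filter((e) => fields.entities.includes(e));
      return linked.length / entities.length;
    },
  };
}

// Filter on metadata first, then score the survivors and keep the top k.
function retrieve(memories, queryIntent, k) {
  return memories
    .filter((m) => m.type === queryIntent.type)
    .map((m) => ({ memory: m, score: rank(m, queryIntent) }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.memory);
}

const memories = [
  makeMemory({ id: "a", type: "semantic", entities: ["queue Y"],
    semanticScore: 0.9, recencyScore: 0.2, salienceScore: 0.8 }),
  makeMemory({ id: "b", type: "semantic", entities: ["service X"],
    semanticScore: 0.7, recencyScore: 0.9, salienceScore: 0.3 }),
  makeMemory({ id: "c", type: "episodic", entities: ["queue Y"],
    semanticScore: 0.95, recencyScore: 0.9, salienceScore: 0.9 }),
];

const top = retrieve(memories, { type: "semantic", entities: ["queue Y"] }, 2);
console.log(top.map((m) => m.id)); // "a" ranks above "b"; "c" is filtered out
```

Memory "c" has the best raw scores but never reaches the ranker, because the query asked for semantic memory. That is the point of filtering before scoring.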

Write path matters more than search path

Garbage in, garbage forever. On capture, assign:

type (episodic/semantic)
entities (people, projects, tools)
confidence
TTL policy
source attribution
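
The capture step above can be sketched as a single gate that either tags a candidate memory or rejects it. This is a minimal illustration under stated assumptions: the capture function, the field names, the 0.6 confidence threshold, and the TTL values are all hypothetical.

```javascript
// Hypothetical TTL policy per memory type; null means "no expiry".
const TTL_DAYS = new Map([
  ["episodic", 90],
  ["semantic", null],
]);
const REQUIRED_FIELDS = ["type", "text", "entities", "confidence", "source"];

// Write governance: a candidate is stored only if it is fully tagged,
// has a known type, and clears the confidence threshold.
function capture(candidate, minConfidence = 0.6) {
  const missing = REQUIRED_FIELDS.filter((field) => candidate[field] === undefined);
  if (missing.length > 0) {
    return { stored: false, reason: `missing fields: ${missing.join(", ")}` };
  }
  if (!TTL_DAYS.has(candidate.type)) {
    return { stored: false, reason: `unknown type: ${candidate.type}` };
  }
  if (candidate.confidence < minConfidence) {
    return { stored: false, reason: "confidence below threshold" };
  }
  return {
    stored: true,
    record: {
      type: candidate.type,
      text: candidate.text,
      entities: candidate.entities,
      confidence: candidate.confidence,
      source: candidate.source,
      ttlDays: TTL_DAYS.get(candidate.type),
      capturedAt: new Date().toISOString(),
    },
  };
}

const accepted = capture({
  type: "semantic",
  text: "Service X is owned by the platform team",
  entities: ["service X", "platform team"],
  confidence: 0.9,
  source: "architecture review notes",
});
const rejected = capture({
  type: "episodic",
  text: "Someone maybe mentioned a deadline",
  entities: [],
  confidence: 0.3,
  source: "chat",
});
console.log(accepted.stored, rejected.stored, rejected.reason);
```

Returning a reason for every rejection keeps the write path auditable: you can later count how much was dropped and why.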

Retrieval budget

Don’t inject 30 memories. Give the model 5–10 excellent ones.

Use a token budget allocator:

decisions: high priority
active project context: medium
personal preference context: low/medium unless directly asked
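
A token budget allocator along these lines can be a greedy pass over candidates sorted by category priority, then by score. This is a sketch, not a definitive design: the category names, the precomputed tokens and score fields, the cap of 10 items, and the boost parameter (used when the user asks about a category directly) are assumptions.

```javascript
// Lower number = higher priority. The ordering follows the list above;
// the numeric values themselves are assumptions for this sketch.
const PRIORITY = { decision: 0, project: 1, preference: 2 };

function allocate(memories, tokenBudget, { maxItems = 10, boost = null } = {}) {
  // A boosted category (for example, one the user asked about) goes first.
  const priorityOf = (m) => (m.category === boost ? -1 : PRIORITY[m.category]);
  const ordered = [...memories].sort(
    (a, b) => priorityOf(a) - priorityOf(b) || b.score - a.score
  );

  const selected = [];
  let used = 0;
  for (const memory of ordered) {
    if (selected.length >= maxItems) break;
    // Skip anything that would overflow; a smaller memory may still fit.
    if (used + memory.tokens > tokenBudget) continue;
    selected.push(memory);
    used += memory.tokens;
  }
  return { selected, used };
}

const candidates = [
  { id: "p1", category: "project", tokens: 200, score: 0.8 },
  { id: "pref", category: "preference", tokens: 80, score: 0.7 },
  { id: "d1", category: "decision", tokens: 120, score: 0.9 },
  { id: "p2", category: "project", tokens: 150, score: 0.6 },
];

const normal = allocate(candidates, 400);
const asked = allocate(candidates, 200, { boost: "preference" });
console.log(normal.selected.map((m) => m.id), normal.used);
console.log(asked.selected.map((m) => m.id), asked.used);
```

With a 400-token budget the decision and the stronger project memory go in first, the second project memory is skipped because it would overflow, and the small preference memory fills the remaining space.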

Evaluation loop

Track retrieval quality with deterministic probes:

“Who owns service X?”
“Why did we choose queue Y?”
“What constraint blocked release Z?”

Measure hit rate, ranking quality, and stale-memory frequency.
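
Here is one hedged way to turn those three measurements into numbers. The sketch assumes each probe records the id of the memory that should answer it, and that the retriever under test returns a ranked array of objects with id and stale fields. Mean reciprocal rank stands in for "ranking quality"; that metric choice, like the names below, is an assumption rather than something this post prescribes.

```javascript
// Run fixed probes against a retriever and summarize the results.
// `retrieve` is whatever retrieval function is under test (hypothetical here).
function evaluate(probes, retrieve, k = 5) {
  let hits = 0;
  let reciprocalRankSum = 0;
  let staleCount = 0;
  let returnedCount = 0;

  for (const probe of probes) {
    const results = retrieve(probe.question).slice(0, k);
    const position = results.findIndex((m) => m.id === probe.expectedId);
    if (position !== -1) {
      hits += 1;
      reciprocalRankSum += 1 / (position + 1); // rank 1 scores 1, rank 2 scores 0.5
    }
    staleCount += results.filter((m) => m.stale).length;
    returnedCount += results.length;
  }

  return {
    hitRate: hits / probes.length,
    meanReciprocalRank: reciprocalRankSum / probes.length,
    staleFrequency: returnedCount === 0 ? 0 : staleCount / returnedCount,
  };
}

const probes = [
  { question: "Who owns service X?", expectedId: "owner-x" },
  { question: "Why did we choose queue Y?", expectedId: "queue-y-decision" },
  { question: "What constraint blocked release Z?", expectedId: "release-z-blocker" },
];

// Canned retriever so the example is deterministic.
const canned = {
  "Who owns service X?": [
    { id: "owner-x", stale: false },
    { id: "old-owner-x", stale: true },
  ],
  "Why did we choose queue Y?": [
    { id: "misc", stale: false },
    { id: "queue-y-decision", stale: false },
  ],
  "What constraint blocked release Z?": [{ id: "misc", stale: false }],
};

const report = evaluate(probes, (question) => canned[question] ?? []);
console.log(report);
```

In this run two of the three probes hit, one of them only at rank 2, and one of the five returned memories is stale. Because the probes are fixed, the same report can be regenerated after every change to capture or ranking.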

Pattern that works

short-term stream for recent turns
long-term store for durable facts/events
periodic consolidation job
explicit checkpoints on major milestones

This is how agents stop behaving like amnesiac interns.
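
The four pieces of that pattern can be sketched in one small class. Treat it as an illustration under assumptions: the AgentMemory name, the stream limit, the salience threshold, and the shape of a "turn" are all hypothetical, and a real consolidation job would usually summarize or deduplicate rather than copy text across.

```javascript
// Hypothetical two-tier memory: a bounded short-term stream plus a long-term
// store, with a consolidation step that promotes only salient turns.
class AgentMemory {
  constructor({ streamLimit = 20, salienceThreshold = 0.7 } = {}) {
    this.streamLimit = streamLimit;
    this.salienceThreshold = salienceThreshold;
    this.shortTerm = [];
    this.longTerm = [];
    this.checkpoints = [];
  }

  // Recent turns go to the stream; the oldest falls off once it is full.
  observe(turn) {
    this.shortTerm.push({ ...turn, consolidated: false });
    if (this.shortTerm.length > this.streamLimit) this.shortTerm.shift();
  }

  // Run periodically: promote salient turns that were not promoted before.
  consolidate() {
    const promoted = this.shortTerm.filter(
      (t) => t.salience >= this.salienceThreshold && !t.consolidated
    );
    for (const turn of promoted) {
      turn.consolidated = true;
      this.longTerm.push({ text: turn.text, salience: turn.salience });
    }
    return promoted.length;
  }

  // Mark a milestone: consolidate first, then record the long-term size.
  checkpoint(label) {
    this.consolidate();
    this.checkpoints.push({ label, longTermSize: this.longTerm.length });
  }
}

const memory = new AgentMemory({ streamLimit: 3 });
memory.observe({ text: "hello", salience: 0.1 });
memory.observe({ text: "We chose queue Y for its ordering guarantees", salience: 0.9 });
memory.checkpoint("design review");
console.log(memory.longTerm.length, memory.checkpoints);
```

One caveat with this sketch: a salient turn that falls off the stream before consolidation runs is lost, so the job has to run at least as often as the stream turns over.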

Final take

Memory is a product surface, not a side feature. Design capture quality, retrieval ranking, and lifecycle policies with the same rigor you apply to your API layer.