persistent memory for companions: a graph with decay
an architectural pattern for long-term memory in conversational ai, using vector graphs, temporal decay, and anti-poisoning layers. trade storage for coherence.
building a companion that remembers you isn't just about storing facts. it's about weaving a coherent, evolving narrative of a person across months of interaction. the challenge is making that memory useful, trustworthy, and resilient. here's how we do it at lucy.
the core: a vector graph with semantic retrieval
we store memories not as a flat list but as a graph of entities and relationships, all embedded into vectors and indexed using pgvector. this lets us retrieve not just exact matches but semantically related concepts. when you mention 'my dog buster,' we can pull up not just the fact 'user has a dog named buster' but also related memories like 'user walks buster in the park' or 'buster hates the vacuum.' the graph structure allows for richer, more contextual recall than a simple key-value store.
the tradeoff here is latency. vector search isn't free, but it's a price worth paying for the depth of understanding it enables. we cache aggressively around the current conversation thread to keep response times snappy, and the core memory operations run asynchronously when needed.
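to make the retrieval idea concrete, here is a minimal in-memory sketch: find the node closest to the query embedding, then pull in its one-hop neighbors from the graph. everything in it is an assumption for illustration (the `MemoryNode` shape, the `retrieve` function, the toy 3-dimensional embeddings); in a real deployment the nearest-neighbor step would be a pgvector query rather than a python loop.

```python
import math
from dataclasses import dataclass, field


@dataclass
class MemoryNode:
    # Hypothetical node shape: one remembered statement plus its graph links.
    node_id: str
    text: str
    embedding: list
    neighbors: set = field(default_factory=set)  # ids of linked nodes


def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def retrieve(graph, query_embedding, top_k=1):
    """Return the top_k most similar nodes, followed by their one-hop neighbors."""
    ranked = sorted(
        graph.values(),
        key=lambda n: cosine_similarity(n.embedding, query_embedding),
        reverse=True,
    )
    seeds = ranked[:top_k]
    result = list(seeds)
    seen = {n.node_id for n in seeds}
    for seed in seeds:
        # Sorted only to make the output order deterministic.
        for neighbor_id in sorted(seed.neighbors):
            if neighbor_id in graph and neighbor_id not in seen:
                seen.add(neighbor_id)
                result.append(graph[neighbor_id])
    return result


# Toy embeddings, made up for the example.
graph = {
    "dog": MemoryNode("dog", "user has a dog named buster", [0.9, 0.1, 0.0], {"park", "vacuum"}),
    "park": MemoryNode("park", "user walks buster in the park", [0.6, 0.4, 0.1], {"dog"}),
    "vacuum": MemoryNode("vacuum", "buster hates the vacuum", [0.5, 0.1, 0.6], {"dog"}),
    "job": MemoryNode("job", "user works as a nurse", [0.0, 0.2, 0.9], set()),
}

hits = retrieve(graph, query_embedding=[1.0, 0.1, 0.0], top_k=1)
print([n.node_id for n in hits])  # ['dog', 'park', 'vacuum']
```

the unrelated "job" node never surfaces, while the vacuum memory comes along through its graph edge even though its embedding is not the closest match. that is the gain over pure similarity search.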
temporal decay: weighting what matters now
not all memories are created equal. the fact that you loved sushi last year is less relevant than the fact you're vegan now. we apply a temporal decay function to every memory, weighting recent evidence more heavily than old evidence. this isn't deletion; it's soft, probabilistic deprecation. old memories remain accessible for historical coherence ('i remember you used to love salmon rolls'), but they won't dominate current context unless explicitly queried.
this decay function is tunable and context-aware. some facts (like your name) decay very slowly. others (like your mood today) decay quickly. it's a balance between respecting your history and staying present with you.
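one simple way to get this behavior is exponential decay with a per-category half-life. the sketch below assumes that form; the category names, the half-life values, and the `decay_weight` and `score` helpers are hypothetical, not a description of the production function.

```python
# Hypothetical half-lives in days per fact category; illustrative values only.
HALF_LIFE_DAYS = {
    "identity": 3650.0,   # e.g. the user's name: decays very slowly
    "preference": 180.0,  # e.g. favourite food
    "mood": 1.0,          # e.g. how the user feels today: decays quickly
}


def decay_weight(age_days, category):
    """Exponential decay: the weight halves once per half-life."""
    return 0.5 ** (age_days / HALF_LIFE_DAYS[category])


def score(similarity, age_days, category):
    """Combine semantic similarity with recency. Old memories are
    down-weighted, never removed."""
    return similarity * decay_weight(age_days, category)


# Same similarity, different ages: the recent fact wins the ranking.
sushi = score(0.9, age_days=365, category="preference")
vegan = score(0.9, age_days=7, category="preference")
print(vegan > sushi)  # True
```

because the weight never reaches zero, the sushi memory can still be recalled when asked about directly; it just stops competing with current facts by default.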
confidence scoring and contradiction handling
when new evidence contradicts old ('user lives in tokyo' vs. 'user lives in seattle'), we don't just overwrite; we adjust the confidence scores of both assertions. the new assertion gets a high confidence score based on recency and contextual strength. the old one gets its score reduced, but it isn't deleted. it remains as a historical record, linked to the new one. this allows the system to say things like 'i see you moved from seattle to tokyo' rather than behaving as if seattle never happened.
confidence updates follow explicit rules based on the type of fact and the source. a user's direct statement ('i am a writer') gets higher initial confidence than an inference ('you seem like a writer').
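a minimal sketch of these rules, under stated assumptions: the `Fact` record, the `assert_fact` helper, the initial confidence values, and the contradiction penalty are all hypothetical choices made for illustration.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical values: direct statements start higher than inferences.
INITIAL_CONFIDENCE = {"direct_statement": 0.9, "inference": 0.5}
CONTRADICTION_PENALTY = 0.3  # multiplier applied to a contradicted fact


@dataclass
class Fact:
    subject: str
    attribute: str
    value: str
    source: str
    confidence: float
    supersedes: Optional["Fact"] = None  # link back to the older, contradicted fact


def assert_fact(store, subject, attribute, value, source):
    """Record a fact; on contradiction, demote the old fact and link to it."""
    key = (subject, attribute)
    current = store.get(key)
    initial = INITIAL_CONFIDENCE[source]
    if current is not None and current.value == value:
        # Same claim again: reinforce rather than duplicate.
        current.confidence = max(current.confidence, initial)
        return current
    new_fact = Fact(subject, attribute, value, source, initial)
    if current is not None:
        current.confidence *= CONTRADICTION_PENALTY  # reduced, not deleted
        new_fact.supersedes = current
    store[key] = new_fact
    return new_fact


store = {}
assert_fact(store, "user", "lives_in", "seattle", "direct_statement")
latest = assert_fact(store, "user", "lives_in", "tokyo", "direct_statement")
print(f"i see you moved from {latest.supersedes.value} to {latest.value}")
# prints: i see you moved from seattle to tokyo
```

this sketch lets any newer claim supersede an older one. a fuller version would probably stop a low-confidence inference from displacing a direct statement, and would fold the recency weighting into the score.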
anti-poisoning: keeping memory sane and safe
a persistent memory is a liability if it can be easily corrupted. we use several layers of defense:
- input sanitization: filtering out prompt injection attempts, off-topic noise, and non-sequiturs before they hit the memory graph.
- entity-verification filter: cross-referencing high-stakes facts (like 'my name is...') against existing high-confidence nodes before creating a new one. if there's a conflict, it triggers a clarification dialogue rather than a blind update.
- skiplist for ephemeral context: some conversations are just temporary. we detect and mark context blocks (like roleplay, hypotheticals, or 'let's pretend') so they don't pollute the long-term graph. they exist in a separate, short-lived space.
this isn't foolproof, but it significantly reduces the risk of memory poisoning while maintaining flexibility.
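the three layers above can be sketched as a single routing decision made before any write reaches the graph. this is a toy illustration: the patterns, marker phrases, threshold, and the `route_memory_write` function are assumptions, and real injection detection needs far more than keyword matching.

```python
import re

# Hypothetical patterns and markers, for illustration only.
INJECTION_PATTERNS = [
    re.compile(p, re.IGNORECASE)
    for p in (
        r"ignore (all )?(previous|prior) instructions",
        r"system prompt",
    )
]
EPHEMERAL_MARKERS = ("let's pretend", "hypothetically", "imagine that", "in this roleplay")
HIGH_STAKES_ATTRIBUTES = {"name"}
HIGH_CONFIDENCE = 0.8


def route_memory_write(text, attribute, value, graph):
    """Decide what happens to a candidate memory.

    graph maps attribute -> (value, confidence).
    Returns one of: "reject", "ephemeral", "clarify", "store".
    """
    # Layer 1: input sanitization.
    if any(p.search(text) for p in INJECTION_PATTERNS):
        return "reject"
    # Layer 2: ephemeral context stays out of the long-term graph.
    lowered = text.lower()
    if any(marker in lowered for marker in EPHEMERAL_MARKERS):
        return "ephemeral"
    # Layer 3: entity verification for high-stakes facts.
    existing = graph.get(attribute)
    if (
        attribute in HIGH_STAKES_ATTRIBUTES
        and existing is not None
        and existing[1] >= HIGH_CONFIDENCE
        and existing[0] != value
    ):
        return "clarify"  # ask the user instead of updating blindly
    return "store"


graph = {"name": ("sam", 0.95)}
print(route_memory_write("let's pretend my name is zorg", "name", "zorg", graph))  # ephemeral
print(route_memory_write("my name is alex", "name", "alex", graph))                # clarify
```

checking for ephemeral context before entity verification matters here: a roleplay name never triggers a clarification dialogue, because it was never a candidate for long-term storage.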
the tradeoff: storage and latency for long-term coherence
this architecture isn't free. you pay in storage (vector indexes aren't small) and retrieval latency (semantic search takes time). but what you gain is a companion that doesn't just feel consistent for a few turns; it builds a relationship over months. it remembers your quirks, your history, your growth. it's a pattern that generalizes beyond companions to any long-running, user-facing llm interaction where the system needs to know the person, not just the prompt.
at lucy, we think that's worth the cost.
you can build your own companion with this kind of memory at /companions.
thanks for reading. if this resonated, the product is downstairs.