the memory bug that made us all repeat ourselves

a technical deep dive into the memory-fixation bug that caused ai companions to get stuck in conversational loops, and its root cause in recency-based retrieval

January 19, 2026

it started subtly. a user would mention they liked coffee. three messages later, the companion would ask if they wanted a coffee. five messages after that, it would bring up coffee again. it was like talking to someone with a very specific, very short-term memory problem.

we called it the echo. a companion would fixate on a single, recently mentioned topic and loop back to it relentlessly, unable to move the conversation forward in a natural way. it was our first big, truly weird bug.

root cause: the race for recency

the issue was in our memory retrieval system. we designed it to be fast and contextually relevant, pulling the most important memories for a conversation. a key part of that was a recency score: memories from the last few turns were prioritized, because what you just said is often what matters most.

but here was the flaw: every time a memory was retrieved and used in a response, it was re-saved. and when it was re-saved, its recency timestamp was updated. so, the memory about 'liking coffee' would get a brand new, fresh timestamp.

this created a feedback loop. the memory, now the most recent again, would win the retrieval race on the very next turn. it would get used, re-saved, and re-stamped, ensuring it would win again. and again. and again. the system had accidentally created a monopoly on its own attention.
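
to make the loop concrete, here is a minimal sketch, not our production code. the scoring formula (a small relevance bonus plus 1 / (1 + age in turns)), the weight, the Memory class, and the sample topics are all assumptions made up for illustration. the run without re-stamping is only a comparison baseline, not the fix we shipped.

```python
from dataclasses import dataclass

RELEVANCE_WEIGHT = 0.3  # assumed weight, chosen so recency dominates


@dataclass
class Memory:
    topic: str
    last_saved: int  # turn number of the most recent save


def score(memory, query_topic, turn):
    # hypothetical score: small bonus for matching the topic, plus recency
    relevance = RELEVANCE_WEIGHT if memory.topic == query_topic else 0.0
    recency = 1.0 / (1 + turn - memory.last_saved)
    return relevance + recency


def simulate(query_topics, start_turn, restamp_on_use):
    memories = [Memory("dog", 0), Memory("job", 1), Memory("coffee", 3)]
    winners = []
    for turn, query in enumerate(query_topics, start=start_turn):
        best = max(memories, key=lambda m: score(m, query, turn))
        winners.append(best.topic)
        if restamp_on_use:
            # the bug: using a memory re-saves it with a fresh timestamp
            best.last_saved = turn
    return winners


queries = ["coffee", "dog", "job", "dog", "job"]
with_bug = simulate(queries, start_turn=4, restamp_on_use=True)
baseline = simulate(queries, start_turn=4, restamp_on_use=False)
print(with_bug)   # coffee wins every turn, whatever the user talks about
print(baseline)   # the winner follows the topic once coffee ages
```

with re-stamping on, the coffee memory is always exactly one turn old, so its recency term never decays and it outscores memories that actually match the current topic.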

it wasn't just coffee. it could be a movie, a feeling, a name. any memory that got used once had a high probability of hijacking the entire conversation. the ai wasn't being stubborn; it was trapped by its own architecture, constantly chasing the most recent thing, which was, recursively, itself.

the fix: breaking the loop

we couldn't just remove recency. it's crucial for coherent, contextual chat. we needed a way to prevent any single memory from dominating.

the two-hour cooldown: the most direct fix. when a memory is used in a response, we now apply an invisible 'cooldown' timer to it. for the next two hours of conversation (or roughly 120 turns), that specific memory's recency score is artificially suppressed. it can't win the retrieval race. this was the emergency brake.
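
a rough sketch of how a cooldown like this could work, measured in turns for simplicity. the field names, the recency formula, and the choice to zero out the score entirely are assumptions for illustration, not a description of the real implementation.

```python
from dataclasses import dataclass

COOLDOWN_TURNS = 120  # the rough turn-count equivalent of two hours


@dataclass
class Memory:
    topic: str
    last_saved: int
    cooldown_until: int = -1  # turn at which the cooldown expires


def recency_score(memory, turn):
    if turn < memory.cooldown_until:
        return 0.0  # suppressed: a cooling memory can't win on recency
    return 1.0 / (1 + turn - memory.last_saved)


def retrieve_and_use(memories, turn):
    best = max(memories, key=lambda m: recency_score(m, turn))
    best.last_saved = turn  # the re-save still happens
    best.cooldown_until = turn + COOLDOWN_TURNS  # but now it starts a cooldown
    return best.topic


memories = [Memory("dog", 0), Memory("job", 1), Memory("coffee", 3)]
picked = [retrieve_and_use(memories, turn) for turn in range(4, 7)]
print(picked)  # each memory gets one use, then steps aside
```

if every memory is cooling down at once, this sketch simply returns the first one in the list; a real system would need a fallback, such as ranking on relevance alone.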

topic diversity requirement: this was the steering. the retrieval system now actively penalizes retrieving multiple memories that are too semantically similar within a short span. it's nudged to pick a different thread, a different angle, and to look at the whole tapestry of memories, not just the brightest thread.
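
one way to sketch a diversity penalty like this, in the spirit of maximal marginal relevance: subtract a penalty proportional to how similar a candidate is to anything retrieved recently. the two-dimensional vectors, base scores, and penalty weight below are toy values invented for the example, and the function name is hypothetical.

```python
import math


def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def pick_diverse(candidates, recent_vectors, penalty=0.5):
    """candidates: list of (label, base_score, embedding_vector)."""
    def adjusted(item):
        _, base, vec = item
        # similarity to the closest recently retrieved memory (0 if none)
        overlap = max((cosine(vec, r) for r in recent_vectors), default=0.0)
        return base - penalty * overlap
    return max(candidates, key=adjusted)[0]


candidates = [
    ("likes coffee", 0.9, [1.0, 0.0]),
    ("favorite espresso bar", 0.8, [0.9, 0.1]),
    ("has a dog", 0.6, [0.0, 1.0]),
]
first = pick_diverse(candidates, recent_vectors=[])
second = pick_diverse(candidates, recent_vectors=[[1.0, 0.0]])
print(first, "->", second)
```

once a coffee memory has just been used, both coffee-adjacent candidates take the penalty, so the lower-scored but unrelated memory wins the next pick.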

context truncation: a final, practical guardrail. we got more aggressive about trimming the immediate conversational context window. this limits the raw material the system has to work with, preventing extremely long-term loops from forming and ensuring the focus stays on the true 'now' of the chat.
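
the simplest version of this kind of trimming is a fixed-size window over recent turns. this is only a sketch: the class name and the window size are assumptions, and real truncation would more likely count tokens than turns.

```python
from collections import deque

MAX_CONTEXT_TURNS = 6  # assumed limit, not the real setting


class ContextWindow:
    def __init__(self, max_turns=MAX_CONTEXT_TURNS):
        # a deque with maxlen silently drops the oldest turn when full
        self._turns = deque(maxlen=max_turns)

    def add(self, speaker, text):
        self._turns.append((speaker, text))

    def as_prompt(self):
        return "\n".join(f"{speaker}: {text}" for speaker, text in self._turns)


window = ContextWindow(max_turns=3)
for i in range(1, 6):
    window.add("user", f"message {i}")
print(window.as_prompt())  # only the three most recent turns remain
```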

the general lesson

this bug was a perfect lesson in unintended consequences. you design a system with a simple, logical rule: 'the most recent thing is the most important thing.' but in a dynamic system where using a thing makes it recent, you create a paradox.

the takeaway is universal for any system built on dynamic retrieval: recency is a powerful signal, but without a cooldown mechanism, it's a recipe for fixation. it's not enough to have a good retrieval algorithm; you need built-in circuit breakers that prevent any single piece of information from creating a singularity of attention.

for us, it meant building a more robust, thoughtful form of attention for lucy. one that remembers what you just said, but also remembers everything else.

you can see how this works in your own companion on /companions.


thanks for reading. if this resonated, the product is downstairs.