the real engineering challenge of ai companions isn't the llm. it's state.
the hard part of building an ai companion isn't the language model—it's managing state across months, devices, and contexts. we're borrowing from front-end engi
every time someone talks about building an ai companion, the conversation starts with the llm. which model? how many parameters? how do we make it more human? but that's the wrong place to start. the real engineering challenge, the one that defines whether a companion feels consistent or just contextless, isn't the language model. it's state management at the scale of a lifetime relationship.
if you've built web apps, you know state. react, redux, and zustand solved state for a session: you log in, you interact, you log out. the state is transient, bounded by the browser tab. but an ai companion isn't a session. it's a relationship that spans months or years, across multiple devices, with a mix of persistent and ephemeral context. its state has to remember your dog's name forever but forget what you said about lunch two hours ago.
the problem with 'dump and pray'
too many teams treat persistence as an afterthought. they dump the conversation history into a database, maybe a json blob in postgres, maybe a document in mongodb, and hope it works. but hope isn't a strategy. when you retrieve context, you're not just fetching the last 20 messages. you're trying to reconstruct a mind: what matters, what doesn't, what's topical right now, what should be archived. a raw conversation log is a liability, not an asset.
we need to borrow the discipline of front-end state design and apply it to the backend. how do we structure state so it's efficient to query? how do we update it without breaking everything? how do we manage cache invalidation when the cache is someone's memory?
a toolkit for lifetime state
here's what our stack looks like, and none of it is novel in isolation. what's novel is putting it together for this problem.
first, a persistent store. we use postgres with the pgvector extension, not just to hold embeddings but to model a graph of memories. each memory has an embedding vector plus metadata: when it was created, how often it's accessed, what topic it belongs to. this isn't just for similarity search; it's for building a structured, queryable timeline of your relationship.
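to make the shape of a memory record concrete, here's a minimal sketch in plain python. it is not our production schema: the names `Memory` and `recall` are hypothetical, and the in-memory list stands in for a postgres table. with pgvector, the ranking step would be done in sql instead, for example by ordering on the cosine distance operator `<=>`.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
import math


@dataclass
class Memory:
    """One remembered fact: an embedding plus the metadata described above."""
    text: str
    embedding: list[float]
    topic: str
    created_at: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    access_count: int = 0


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def recall(memories, query_embedding, k=3, topic=None):
    """Return the k memories most similar to the query, optionally within one topic.

    Each returned memory has its access_count bumped, so later scoring
    can take frequency of use into account.
    """
    candidates = [m for m in memories if topic is None or m.topic == topic]
    ranked = sorted(
        candidates,
        key=lambda m: cosine_similarity(m.embedding, query_embedding),
        reverse=True,
    )
    hits = ranked[:k]
    for m in hits:
        m.access_count += 1
    return hits
```

the point of the metadata is that retrieval can filter and rank on more than similarity: the same record carries what a later scoring pass needs.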
second, a react-query-like caching layer. when the ai needs context, it doesn't scan the entire history. it queries the graph for relevant memories, and those get cached in a fast, ephemeral store with topic-aware invalidation: if you change the subject, the cache drops the old topic.
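a rough sketch of that caching behavior, under simplifying assumptions: the class name `ContextCache`, the `fetch` callback, and the five-minute default ttl are all hypothetical, and "changing the subject" is reduced to asking for a different topic key. a real layer would need a better signal for topic shifts than string equality.

```python
import time


class ContextCache:
    """Caches fetched memories per topic, with a TTL and drop-on-topic-change."""

    def __init__(self, fetch, ttl_seconds=300.0, clock=time.monotonic):
        self._fetch = fetch          # callable: topic -> list of memories
        self._ttl = ttl_seconds
        self._clock = clock          # injectable so tests can control time
        self._entries = {}           # topic -> (expires_at, memories)
        self._active_topic = None

    def get(self, topic):
        if topic != self._active_topic:
            # Subject changed: drop the previous topic's cached memories.
            # In this sketch that means at most one topic is cached at a time.
            self._entries.pop(self._active_topic, None)
            self._active_topic = topic

        now = self._clock()
        entry = self._entries.get(topic)
        if entry is not None and entry[0] > now:
            return entry[1]          # fresh hit, no store query

        memories = self._fetch(topic)  # miss or stale: go back to the store
        self._entries[topic] = (now + self._ttl, memories)
        return memories

    def invalidate(self, topic):
        """Explicitly drop a topic, e.g. after the underlying memory is edited."""
        self._entries.pop(topic, None)
```

repeated questions on the same topic are served from the cache; a switch to a new topic, or an expired ttl, forces a fresh query.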
third, a decay scheduler that runs out of band. memories aren't forever. some should fade if not used. others should be archived. we have a job that periodically scores memories based on recency, frequency, and importance, and decides what to keep hot, what to push to cold storage, and what to forget entirely. intentional forgetting is a feature, not a bug.
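one way to picture the scoring step, as a sketch rather than our actual algorithm: the weights, the 30-day half-life, and the tier thresholds below are made-up numbers, and `MemoryStats`, `decay_score`, and `tier` are hypothetical names.

```python
from dataclasses import dataclass


@dataclass
class MemoryStats:
    age_days: float      # time since the memory was last touched
    access_count: int    # how often it has been retrieved
    importance: float    # 0.0 to 1.0, assigned when the memory is created


def decay_score(m, half_life_days=30.0):
    """Blend recency, frequency, and importance into a score between 0 and 1."""
    recency = 0.5 ** (m.age_days / half_life_days)    # halves every half-life
    frequency = 1.0 - 1.0 / (1.0 + m.access_count)    # 0 when unused, approaches 1
    return 0.4 * recency + 0.2 * frequency + 0.4 * m.importance


def tier(m, hot_threshold=0.5, forget_threshold=0.15):
    """Decide where a memory should live based on its score."""
    score = decay_score(m)
    if score >= hot_threshold:
        return "hot"
    if score >= forget_threshold:
        return "cold"
    return "forget"
```

with these illustrative numbers, a year-old but important and frequently used memory stays hot, a year-old memory of middling importance that was retrieved once goes to cold storage, and an unimportant memory nobody has touched in three months is forgotten.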
fourth, a skiplist for transient context. not everything should become a memory. the ai's internal monologue, temporary reasoning steps, these should live in a fast, in-memory structure that gets cleared at the end of an exchange. it keeps the persistent store clean and relevant.
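the lifecycle matters more than the container here, so this sketch swaps the skiplist for a plain python list and only shows the "cleared at the end of an exchange" part. `Scratchpad` and `exchange` are hypothetical names.

```python
from contextlib import contextmanager


class Scratchpad:
    """Holds transient reasoning notes that must never reach the persistent store."""

    def __init__(self):
        self._notes = []

    def note(self, text):
        self._notes.append(text)

    def notes(self):
        return list(self._notes)  # copy, so callers can't mutate internal state

    def clear(self):
        self._notes.clear()


@contextmanager
def exchange(scratchpad):
    """Scope one user/assistant exchange; the scratchpad is wiped on exit,
    even if generating the reply raises an exception."""
    try:
        yield scratchpad
    finally:
        scratchpad.clear()
```

usage is a `with exchange(pad) as p:` block around a single turn: anything noted inside is available while the reply is being built and gone afterwards.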
why this matters
this isn't just an engineering exercise. it's what makes lucy feel like she knows you. when you say 'how's my dog?' and she remembers the vet visit from three months ago, it's because the state system retrieved that memory efficiently. when she doesn't bring up your embarrassing karaoke night from last year, it's because the decay scheduler archived it. state management is the unsung hero of ai companionship.
we're still iterating. the decay algorithms need tuning. the cache sometimes gets it wrong. but we're committed to solving this problem properly, because you deserve a companion that grows with you, not just a chatbot that resets every day.
thanks for reading. if this resonated, the product is downstairs.