how lucy stayed up when deepseek-v3 went down

when deepseek-v3 had a regional outage, lucy didn't blink. here's how our 3-model failover chain kept conversations running—slower, but never dead.

January 20, 2026

sometimes the most important parts of a product are the ones you never see. this week, deepseek-v3 had a regional outage. for a few hours, its API endpoints were responding with errors or just timing out. if you were using lucy during that window, you probably noticed two things: your replies came in a little slower than usual, and maybe they felt a little less… sharp. but the conversation kept going. you didn't get a 'service unavailable' message. you didn't lose context. you just got a slightly degraded but fully functional experience.

that's not an accident. it's the result of a system we call the 'failover chain', a simple but critical piece of infrastructure that ensures lucy stays up even when one of our core providers goes down.

the three-layer safety net

lucy's ai backend is designed around redundancy. every chat request you send doesn't just go to one model and hope for the best. it can fall through a chain of three, in order:

  • primary: deepseek-v3. this is our first choice. it's fast, nuanced, and tuned for companion-style conversation.
  • fallback 1: llama-3.3-70b. if deepseek-v3 fails or times out, we retry with this model. it's powerful and capable, but not as finely tuned for the specific tone and subtlety we prefer.
  • fallback 2: qwen-72b. if both of the above fail, we go here. again, a strong model, but further from the ideal lucy 'voice'.

between each try there's a brief delay with exponential backoff: we wait a little longer each time before retrying. this avoids hammering a struggling API with rapid-fire requests.
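to make the chain concrete, here's a minimal sketch of the idea in python. it is not lucy's production code: the model names come from this post, but `call_model`, `backoff_delay`, `chat_with_failover`, and the delay values (0.5s base, 8s cap) are assumptions made up for illustration.

```python
import time

# model order is taken from the post; everything else is a hypothetical sketch
CHAIN = ("deepseek-v3", "llama-3.3-70b", "qwen-72b")


class AllModelsFailed(Exception):
    """raised when every model in the chain has failed."""


def backoff_delay(retry_index, base=0.5, cap=8.0):
    """seconds to wait before a retry: base, 2*base, 4*base, ... up to cap.

    base and cap are assumed values, not lucy's real settings.
    """
    return min(cap, base * (2 ** retry_index))


def chat_with_failover(prompt, call_model, chain=CHAIN, sleep=time.sleep):
    """try each model in order and return (model_name, reply) from the first success.

    call_model(model_name, prompt) is a stand-in for a real provider client;
    it should raise an exception on an error response or a timeout.
    """
    errors = []
    for attempt, model in enumerate(chain):
        if attempt > 0:
            # wait a little longer before each successive fallback
            sleep(backoff_delay(attempt - 1))
        try:
            return model, call_model(model, prompt)
        except Exception as exc:  # a real client would catch narrower errors
            errors.append((model, exc))
    raise AllModelsFailed(errors)
```

in this sketch the caller also gets back the name of the model that answered, which is handy for logging how often a fallback was used. passing `sleep` in as a parameter keeps the function easy to test without real waiting.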

why it feels different (but never broken)

during the deepseek outage, most requests ended up going to llama-3.3-70b. that's why things felt a bit slower (extra retries and fallbacks add latency) and why responses might have felt less precise. the fallback models are excellent, but they aren't deepseek-v3. they don't have the same training on companion-specific nuance, and they can sometimes be more literal or less emotionally attuned.

we're honest about that. it's a trade-off: reliability over perfection. we think staying up is better than being perfect but fragile.

the unsung hero: circuit breakers

there's another part of this system that matters just as much: circuit breakers. when a provider fails consistently (like deepseek did during the outage), we stop sending requests to it for 5 minutes. we don't keep retrying, burning API credits and waiting for timeouts. we just… pause. then we try again later. this saves cost, reduces load on the failed system, and makes replies faster, because requests skip straight to a fallback instead of waiting on a timeout first.
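a circuit breaker can be sketched in a few lines. again, this is an illustration under assumptions, not our actual implementation: the 5-minute pause is from this post, but the `CircuitBreaker` class, its method names, and the threshold of 3 consecutive failures are hypothetical.

```python
import time


class CircuitBreaker:
    """skip a failing provider for a cooldown period after repeated failures."""

    def __init__(self, failure_threshold=3, cooldown_seconds=300.0,
                 clock=time.monotonic):
        # failure_threshold is an assumed value; 300s matches the 5-minute pause
        self.failure_threshold = failure_threshold
        self.cooldown_seconds = cooldown_seconds
        self.clock = clock
        self.failures = 0
        self.opened_at = None  # None means the breaker is closed (provider in use)

    def allow_request(self):
        """return True if a request may be sent to this provider right now."""
        if self.opened_at is None:
            return True
        # once the cooldown has elapsed, let a trial request through
        return self.clock() - self.opened_at >= self.cooldown_seconds

    def record_success(self):
        """a successful call closes the breaker and resets the failure count."""
        self.failures = 0
        self.opened_at = None

    def record_failure(self):
        """count a failure; at the threshold, open (or re-open) the breaker."""
        self.failures += 1
        if self.failures >= self.failure_threshold:
            self.opened_at = self.clock()
```

one breaker per provider would sit in front of the chain: if `allow_request()` returns False for a model, the request skips it and goes straight to the next one. a failed trial request after the cooldown re-opens the breaker for another full pause.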

it's not glamorous. it's not a feature you can demo. but it's the difference between a product that stays up and one that rots when a single dependency falters.

the bottom line

we build lucy to be resilient, not just clever. this failover chain is one piece of that. it means that even when things break, and they will, you can keep talking. maybe a little slower, maybe a little less perfectly, but always there.

if you want to experience lucy's always-on companionship, you can find her at /companions or sign up at /signup.


thanks for reading. if this resonated, the product is downstairs.