
Your Sleep Forecast Shouldn't Panic After One Weird Night

Circadia's new adaptive forecast is better at handling skipped nights, naps, crashes, recovery sleep, and sudden changes in your sleep pattern. It learns only from sleep data people choose to share — and the more of your own history you share, the better chance Circadia has of understanding your pattern.

A sleep forecast is supposed to help you plan. The old version mostly did that — until your week got weird, at which point it got cautious in expensive ways. The new adaptive forecast is built for the weird weeks.

Skipped nights, unexpected naps, recovery crashes, sudden phase shifts, weeks where your rhythm seems to change its mind halfway through: those are the moments most forecasts go quiet or get stale. The new one moves with you. It is faster to admit "something just changed" without overreacting to noise, honest enough to say "I'm not sure yet" when it should be, and about 2.7x more likely than the version it replaces to recover within two hours of a sudden change.

It also pays attention to more of what you tell it. Medication, illness, stress, screens, light exposure, social pressure, mood, cognition, custom tags — the forecast can now react when you log them, instead of treating every night as if it happened in isolation. If you took melatonin tonight, the model knows. If you're sick, the model knows. If your normal rhythm is being shaped by something unusual today, the model can fold that into its prediction instead of being surprised by it tomorrow.

The problem with weird sleep

Ordinary sleep is not the hard case.

If you usually fall asleep around midnight and wake up around 8 AM, tomorrow probably looks a lot like yesterday. A forecast can be pretty simple and still be useful.

Circadia is built for people whose sleep does not behave like that.

Imagine you usually sleep around 11 PM and wake around 7 AM. One Tuesday you skip a night entirely, then crash for 14 hours starting Wednesday afternoon. What's happening?

A naive forecast might decide that 14-hour crash is your new normal and start predicting afternoons. The new adaptive forecast is more cautious: it notes the crash, treats "this was recovery sleep and you are heading back to baseline" as one strong hypothesis, but keeps "your rhythm has shifted to afternoons" alive too. It widens its prediction window slightly and waits for the next sleep to break the tie.

A few normal sleeps later, the recovery-then-baseline hypothesis has the most weight. The forecast quietly narrows again. If instead you keep sleeping through afternoons, the phase-shift hypothesis wins and the forecast moves with you.
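
The tie-break described above can be sketched as a Bayes update over two competing stories. This is a minimal illustration and not Circadia's actual code: the two hypothesis names, their predicted onset times, the 2-hour spread, and every function name are assumptions made for the example.

```python
import math

def circular_diff_hours(a, b):
    """Smallest signed difference a - b on a 24-hour clock, in hours."""
    return (a - b + 12.0) % 24.0 - 12.0

def gaussian_likelihood(observed, predicted, sigma_hours):
    """Unnormalised Gaussian likelihood of an observed sleep onset."""
    d = circular_diff_hours(observed, predicted)
    return math.exp(-0.5 * (d / sigma_hours) ** 2)

def update_weights(weights, predictions, observed_onset, sigma_hours=2.0):
    """Reweight each hypothesis by how well it predicted the observed onset."""
    scored = {
        name: weights[name]
        * gaussian_likelihood(observed_onset, predictions[name], sigma_hours)
        for name in weights
    }
    total = sum(scored.values())
    return {name: w / total for name, w in scored.items()}

# Hypothetical numbers: the baseline story predicts a 23:00 onset,
# the phase-shift story predicts a 15:00 onset.
predictions = {"recovery_then_baseline": 23.0, "phase_shift": 15.0}
weights = {"recovery_then_baseline": 0.5, "phase_shift": 0.5}

# Three sleeps near the old bedtime move the weight toward the baseline story.
for onset in (23.5, 22.75, 23.25):
    weights = update_weights(weights, predictions, onset)
```

Feeding in afternoon onsets instead would push the weight the other way, which is the "forecast moves with you" case.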

That is the failure mode this update is trying to reduce: forecasts that lock in too early on the wrong story.

Built like a weather model, not a chatbot

Most apps say "we use AI" to sound impressive. Circadia does not, because the forecast does not need it.

Your sleep forecast is a particle filter — a swarm of 64 small competing hypotheses about what your sleep is doing right now. Each hypothesis carries a complete picture of your rhythm: where your internal clock sits at this moment, how fast your phase is drifting, how much sleep pressure you have built up, what your typical sleep duration looks like. Every time you log a sleep, every hypothesis makes its own prediction, then gets weighted by how close it came.

The ones that called it right get more weight. The ones that missed get pruned and replaced with variations of the survivors. Over many cycles, the swarm converges on the picture that actually fits you. Until then, it carries the uncertainty honestly instead of collapsing to a single answer prematurely.
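
The predict, weight, and resample cycle described above looks roughly like this in code. Treat it as a toy sketch under stated assumptions: each hypothesis here tracks only two of the quantities mentioned (onset time and daily drift), and the noise levels, the 1.5-hour observation spread, and all function names are invented for illustration. Only the swarm size of 64 comes from this post.

```python
import math
import random

N_PARTICLES = 64  # swarm size mentioned in the post

def wrap_diff(a, b):
    """Signed difference a - b on a 24-hour clock, in hours."""
    return (a - b + 12.0) % 24.0 - 12.0

def init_particles(rng):
    """Each particle is one hypothesis: next onset hour and daily drift."""
    return [
        {"onset": rng.uniform(0.0, 24.0), "drift": rng.gauss(0.0, 1.0)}
        for _ in range(N_PARTICLES)
    ]

def step(particles, observed_onset, rng, obs_sigma=1.5):
    """One predict / weight / resample cycle for a single logged sleep."""
    # Predict: advance each hypothesis by its own drift, plus a little noise.
    for p in particles:
        p["drift"] += rng.gauss(0.0, 0.05)
        p["onset"] = (p["onset"] + p["drift"] + rng.gauss(0.0, 0.3)) % 24.0
    # Weight: score each hypothesis by how close its prediction came.
    weights = [
        math.exp(-0.5 * (wrap_diff(observed_onset, p["onset"]) / obs_sigma) ** 2)
        for p in particles
    ]
    # Resample: hypotheses are copied in proportion to weight, so poor ones
    # tend to disappear and good ones are duplicated, then diverge via noise.
    chosen = rng.choices(particles, weights=weights, k=len(particles))
    return [dict(p) for p in chosen]

def estimate_onset(particles):
    """Circular mean of the swarm's onset hours."""
    s = sum(math.sin(2 * math.pi * p["onset"] / 24.0) for p in particles)
    c = sum(math.cos(2 * math.pi * p["onset"] / 24.0) for p in particles)
    return (math.atan2(s, c) * 24.0 / (2 * math.pi)) % 24.0

rng = random.Random(0)
particles = init_particles(rng)
# Simulated log: onset starts at 23:00 and drifts one hour later each day.
for day in range(30):
    particles = step(particles, (23.0 + day) % 24.0, rng)
```

The spread of the surviving particles is what a widened or narrowed prediction window would be read from: a tight swarm means a confident forecast, a scattered one means "not sure yet".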

The whole thing runs in milliseconds on your phone. There is no neural network. No language model is generating predictions. No foundation model is in the loop. This is the same family of math used to ensemble weather forecasts (NOAA's global ensemble runs about 30 perturbed variations of its forecast in parallel and combines them), to forecast stock prices, to track moving objects in self-driving cars, and to navigate spacecraft: small, transparent, classical statistical modeling.

If you want the technical term: this is classical machine learning. It fits around 40 parameters to data, then uses those parameters to predict. That is small enough to print on a single page, billions of times smaller than a large language model, and the kind of math that has been used in industry for 50+ years.

Every parameter is open and inspectable in the Model Lab.

What changed

The old adaptive forecast was already more than a simple trend line. It looked at your recent sleep, estimated drift, and tried to handle short sleeps, skipped nights, and recovery sleep without snapping to a new answer too aggressively.

The new forecast takes that idea further. Instead of forcing every sleep into one category right away, it keeps a small crowd of possible sleep stories alive at the same time.

Maybe that afternoon sleep was mostly a nap. Maybe it was partly recovery sleep. Maybe it was weak evidence for a phase shift. The forecast does not have to pick one winner immediately. It can carry those possibilities forward, see what happens next, and let the evidence decide which ones survive.

Circadia is now better able to say, in effect, "I'm not sure yet. Here are the possibilities that still make sense."

That is a more honest way to forecast human sleep.

The headline improvement: messy weeks

The place we most wanted to improve was not calm, steady sleep. It was the ugly part.

Skipped nights. Unexpected naps. Recovery crashes. Sleeps that arrive much earlier or much later than expected. Weeks where your pattern seems to change its mind halfway through.

Those are the moments where a forecast matters most, because you are already disoriented and trying to plan around a body that is not cooperating.

That is where the old forecast could lag behind reality. Sometimes it needed a few more sleeps before it fully believed something had changed.

That caution was not always wrong. If a forecast overreacts to every odd sleep, it becomes jumpy and useless.

But sometimes the caution was too expensive. Users could feel the forecast staring at a changed pattern and saying, effectively, "I need more evidence," while the next prediction was already stale.

In one test built around messy sleep weeks, the new forecast recovered within two hours after a sudden change about 2.7x as often as the previous live adaptive model: 36.7% of the time versus 13.3%. Its next-sleep error in that same test dropped from about 18.6 hours to 14.5 hours.
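
For anyone who wants to check the arithmetic behind those figures, here is the calculation spelled out. The variable names are ours, and the snippet reads the two percentages as the share of test cases in which the forecast recovered within two hours.

```python
old_recovery, new_recovery = 0.133, 0.367  # share recovered within two hours
old_error_h, new_error_h = 18.6, 14.5      # next-sleep error, in hours

# How many times more often the new forecast recovered: about 2.76
recovery_ratio = new_recovery / old_recovery

# Relative drop in next-sleep error: about 0.22, i.e. roughly 22%
error_reduction = (old_error_h - new_error_h) / old_error_h

# Share of cases where the new forecast still did not recover in time: about 0.63
still_missed = 1.0 - new_recovery
```

The last line is the reason for caution: even after the improvement, the forecast did not recover within two hours in roughly 63% of these cases.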

That is the banner number. It is real. It is also not the whole story.

How we tested it

We test forecasts with the same rule described in the previous post: no time travel.

The model sees sleep history up to a point, makes a prediction, and only then gets graded against what actually happened. For longer tests, we freeze the model at a split point and ask what it would have predicted over the next 7 or 14 days without learning from those future sleeps first.
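
The "no time travel" rule can be sketched as two small evaluation loops. This is an illustrative outline, not our test harness: the toy forecaster, the function names, and the sample log are all made up, and the horizon here counts sleeps where the real tests count days.

```python
from statistics import mean

def last_gap_forecaster(history):
    """Toy stand-in model: assume the next gap between onsets repeats the last."""
    if len(history) < 2:
        return history[-1] + 24.0
    return history[-1] + (history[-1] - history[-2])

def walk_forward_errors(onsets, predict, min_history=3):
    """Predict each sleep using only the sleeps before it, then grade it."""
    errors = []
    for i in range(min_history, len(onsets)):
        past = onsets[:i]  # the model never sees onsets[i:]
        errors.append(abs(predict(past) - onsets[i]))
    return errors

def frozen_horizon_errors(onsets, predict, split, horizon):
    """Freeze at `split`, then roll forward on the model's own predictions."""
    simulated = list(onsets[:split])
    errors = []
    for i in range(split, min(split + horizon, len(onsets))):
        guess = predict(simulated)
        errors.append(abs(guess - onsets[i]))
        simulated.append(guess)  # feed back the guess, not the truth
    return errors

# Hypothetical log: onsets in hours since the first midnight, drifting 1 h/day.
onsets = [23.0 + 25.0 * d for d in range(20)]
one_step = mean(walk_forward_errors(onsets, last_gap_forecaster))
frozen = mean(frozen_horizon_errors(onsets, last_gap_forecaster, split=10, horizon=7))
```

The key line in each loop is the one that limits what the model can see. In the frozen test, the model's own guesses are appended to its history, so an early miss compounds the way it would for a real user planning a week ahead.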

For this update, we also added disruption-specific testing.

Average scores can hide the thing users actually feel. A forecast can look decent over a whole month while still being infuriating right after the one skipped night that threw everything off.

So we asked a more pointed question: what happens right after sleep gets weird?

The new forecast looks more promising there. In other tests, the picture is more mixed. It beats older baselines in some places, trails the previous adaptive forecast in others, and is most clearly interesting when sudden changes in sleep pattern are the thing being measured.

That is enough to ship, because the product need is real and the new behavior is closer to what users were telling us they needed.

It is not enough to declare victory.

Tags are part of the forecast now

One of the most important changes is not only how the forecast thinks. It is what the forecast is allowed to notice.

Circadia can now pass more context into adaptive prediction: medication, illness, stress, screens, blackout sleep, outdoor light, social pressure, mood, cognition, and custom tags.

That matters because people do not experience sleep as a clean line on a chart. People know when something changed.

I took melatonin. I was sick. I had a migraine. I slept in a fully dark room. I had a social obligation. I stared at a screen too late. I drank coffee at a weird time. I had a crash day.

A good forecast should not ignore those facts.

The current version is still early. Some tags have much more evidence than others, and rare custom tags are hard to learn from. But the path is now open: if you tell Circadia what changed, the forecast can eventually learn whether that kind of change tends to move your sleep.
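
One way to picture why rare tags are hard to learn from is a per-tag shift that is pulled toward zero until enough nights back it up. The sketch below is an assumption about how such learning could work, not a description of the shipped model: the shrinkage rule, the `prior_strength` value, the tag names, and the sample history are all hypothetical.

```python
from collections import defaultdict

def learn_tag_shifts(nights, prior_strength=5.0):
    """Estimate an onset shift (hours) per tag, shrunk toward zero when evidence is thin.

    `nights` is a list of (tags, residual_hours) pairs, where residual_hours is
    how much later than the untagged forecast the sleep actually started.
    Simplification: a night with several tags credits each tag with the full residual.
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for tags, residual in nights:
        for tag in tags:
            totals[tag] += residual
            counts[tag] += 1
    # Dividing by (n + prior_strength) instead of n pulls rarely seen tags toward 0.
    return {tag: totals[tag] / (counts[tag] + prior_strength) for tag in totals}

def adjust_forecast(base_onset, tags, shifts):
    """Add the learned shift of every logged tag; unknown tags add nothing."""
    return base_onset + sum(shifts.get(tag, 0.0) for tag in tags)

# Hypothetical history: melatonin logged on 20 nights, a custom tag logged once.
nights = [({"melatonin"}, -1.0)] * 20 + [({"late_concert"}, 3.0)]
shifts = learn_tag_shifts(nights)
tonight = adjust_forecast(23.0, {"melatonin"}, shifts)
```

With these made-up numbers, the well-evidenced tag keeps most of its observed effect, while the one-off custom tag is heavily discounted until it shows up again.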

What this means in practice

Circadia learns only from sleep histories people have voluntarily shared. Not your private log, not bought datasets, not scraped data — only opt-in. That is the whole story on data. (The previous post — What happens when you share sleep data — walks through exactly how shared data flows through the system, if you want the full breakdown.)

What that means for accuracy is more nuanced. Some users have logged twenty sleeps. Some have logged hundreds. A few long histories teach the model things short ones never could — slow drift, recovery patterns, what happens when the same person's rhythm changes over months. Those long histories shape what the forecast learns first, because there is simply more in them to learn from.

So the model is unusually good at patterns that resemble what it has already seen. Free-running Non-24 with steady drift? It has seen a lot of that. It will fit fast. Delayed sleep phase with strong weekly variation? Less common in the pool right now, slightly more uncertain at first.

If your pattern is well-represented in the shared pool, the new forecast may feel noticeably better right away, especially after rough patches. If your pattern is unusual — polyphasic, shift work, sudden treatment changes, post-illness recovery weeks — it may take more of your own data before the forecast catches up. Your own shared history teaches the model your rhythm directly. The broader pool fills in the background.
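
The split between "your own history" and "the broader pool" can be illustrated with a simple weighted blend. This is a sketch of the general idea only: the formula, the `pool_weight` of 10, and the drift values are assumptions chosen for the example, not parameters of the real forecast.

```python
def blended_drift(user_drifts, pool_drift, pool_weight=10.0):
    """Blend a user's observed daily drift with the pool's typical drift.

    With no personal data the result is the pool value; as the user logs
    more sleeps, their own average takes over.
    """
    n = len(user_drifts)
    if n == 0:
        return pool_drift
    user_mean = sum(user_drifts) / n
    return (n * user_mean + pool_weight * pool_drift) / (n + pool_weight)

# Hypothetical: the pool drifts 1.0 h/day, this user drifts 0.25 h/day.
after_5 = blended_drift([0.25] * 5, pool_drift=1.0)      # still leans on the pool
after_200 = blended_drift([0.25] * 200, pool_drift=1.0)  # mostly the user's rhythm
```

A user whose pattern matches the pool starts close to the right answer. A user whose pattern differs starts off being pulled toward the pool and needs more of their own data to outweigh it.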

The forecast gets better the more people let it see where it is wrong. That is why sharing matters, and why every piece of feedback shapes the next version.

What this is, and what is next

The new adaptive forecast is a real step forward, especially for the chaotic weeks the previous model struggled with. It is faster to recognize when something genuinely changed, slower to overreact to one weird night, and more honest about uncertainty when the evidence is not clear yet.

It is also a foundation. The model now reads your context — tags, light, illness, stress, meds — and tracks multiple possibilities at once. Both of those capabilities unlock things the old model could not do. Future updates can build on them: better polyphasic handling for users who sleep multiple times per day, smarter recovery detection after crashes or illness, deeper personalization based on which of your tags actually predict your patterns, and better calibration for sleep types the model has not seen enough of yet.

If the forecast works for you, tell us. If it does not, tell us that too. If a specific situation feels like it should be tracked better — a kind of disruption, a tag, a sleep pattern — that is a useful signal as well. Forecasting like this is a loop, not a release: predict, compare against what actually happened, adjust, validate against fresh data, ship the version that holds up, repeat.

We are not done. We are shipping the next version of the loop.

FAQ

Does the forecast use AI / a neural network / an LLM?
No. It's a particle filter: classical statistics, ~40 parameters, the same family used for weather ensembles and spacecraft navigation. No neural net or language model generates predictions.

How does it handle a skipped night or a sudden 14-hour crash?
It doesn't lock onto one story. It keeps multiple hypotheses alive, widens uncertainty, and waits for your next sleep to decide which survives.

How much did it improve, and is it better at everything?
In one disruption-focused test, it recovered within two hours of a sudden change about 2.7x as often (36.7% vs 13.3%) and cut next-sleep error from ~18.6h to ~14.5h. It's clearly better on disruptions; in other tests the picture is mixed.

What data does the forecast learn from?
Only voluntarily shared sleep histories. Not your private log, not bought or scraped data.

Why does it feel better for some users than others?
Patterns common in the shared pool fit quickly; rarer patterns (polyphasic, shift work, post-illness) need more of your own data before it catches up.
