Circadia — Methodology

A description of how Circadia captures and analyzes free-running sleep data, written for circadian-rhythm researchers evaluating whether the app's outputs can be cited or used in secondary analysis.

For the patient-facing version of this content, see HOW_IT_WORKS.md. For the patient-clinician handoff, see the Doctor's Report PDF exported from the app.

Last revised: 2026-05-25 (post-particle-filter deployment).

1. What Circadia measures

Circadia is a self-tracking app for sighted Non-24-Hour Sleep-Wake Disorder (N24SWD) and adjacent free-running phenotypes. Each user logs sleep onsets and wakes (and optionally sleepless gaps). The app derives:

Daily drift — the shift in sleep onset from one cycle to the next, in hours.
τ (tau) — the estimated period of the user's observed sleep-wake rhythm (a behavioral period), in hours. Computed as 24 + mean_drift. Under the limitations described in §13 (no DLMO, self-report, behaviorally derived), this approximates — but is not a direct measurement of — the intrinsic circadian period reported in lab forced-desynchrony protocols (Czeisler 1999; Duffy 2011).
σ_obs — observed variability of the daily drift.
σ_τ — standard error on τ.
|drift| and |drift all| — unsigned drift magnitudes (cohort-side diagnostic, see §10).
Cumulative debt and Process S — homeostatic measures, included for completeness; not the primary contribution.

The primary research contribution is this behavioral τ estimate and its uncertainty, derived from continuous at-home logging over multi-week windows. It is intended as an ambulatory approximation of the intrinsic period measured under controlled lab conditions (Czeisler 1999; Duffy 2011), subject to the limitations that self-reported onset timing implies. See §13 for the full caveat list.

2. Drift between two sessions

For two consecutive clean sleep onsets onset_prev and onset_curr:

gap_h   = (onset_curr − onset_prev) in hours
d_clock = (hour_of_day(onset_curr) − hour_of_day(onset_prev))
drift_h = d_clock                  wrapped to (−12, +12]

drift_h is the per-cycle shift in onset relative to a 24-hour day. A positive drift indicates phase delay; negative indicates phase advance.

The modular wrap to (−12, +12] is the bounded representation. For users with true per-cycle drift exceeding 12 h, this representation aliases: a real +15 h shift records as −9 h. The wraparound-unfold pass in the Smart estimator (§3, step 6) detects and corrects this when the signature is strong enough.

A pair is flagged "post-sleepless" if gap_h exceeds a configurable threshold (default 30 h; per-user adjustable). Post-sleepless pairs are excluded from the Smart and Clean drift averages because the algorithm cannot distinguish a true large drift from a normal-length cycle plus a skipped sleep. They are still surfaced in the per-row log (with an alternate-direction interpretation displayed when |drift| > 8 h) and contribute to the Raw drift estimator and to the cohort-side |drift all| diagnostic.
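The rules in this section can be sketched in JavaScript as follows. This is an illustrative sketch, not the source of calcDrift(): the function names, the returned object shape, and the choice to read hour-of-day in UTC are assumptions made for the example.

```javascript
// Hypothetical sketch of the drift rules above; not the calcDrift() source.
// Onsets are Date objects. Hour-of-day is read in UTC to keep the example deterministic.
function hourOfDayUtc(date) {
  return date.getUTCHours() + date.getUTCMinutes() / 60 + date.getUTCSeconds() / 3600;
}

// Wrap a clock-hour difference into the half-open interval (-12, +12].
function wrapDriftHours(dClock) {
  let d = ((dClock % 24) + 24) % 24; // now in [0, 24)
  if (d > 12) d -= 24;               // now in (-12, 12]
  return d;
}

function driftBetweenOnsets(onsetPrev, onsetCurr, postSleeplessThresholdH = 30) {
  const gapH = (onsetCurr - onsetPrev) / 3600000; // milliseconds to hours
  const driftH = wrapDriftHours(hourOfDayUtc(onsetCurr) - hourOfDayUtc(onsetPrev));
  return { gapH, driftH, postSleepless: gapH > postSleeplessThresholdH };
}
```

For an onset at 23:00 followed by an onset at 00:30 some 25.5 h later, the sketch returns driftH = +1.5 (a 1.5 h delay) and postSleepless = false. A clock difference of +15 h wraps to −9 h, which is the aliasing case described above.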

The reference function is calcDrift() in src/lib/utils.js.

3. τ estimator (Smart)

The default τ estimator is an EWMA-weighted regression with a fading prior, plus modular-wrap unfolding and bidirectional-drift detection. It is implemented as estimatePeriod() in src/lib/utils.js.

Inputs

All sleep entries where neither the xDrift flag (user-excluded) nor the fragmented flag (auto-flagged short session) is set.
A configurable window: either a rolling windowDays cutoff or an explicit [rangeStart, rangeEnd] range.
The user's own postSleeplessThresholdH (cloud-synced setting; see §9).

Step 1 — collect transition pairs

Adjacent entries in the filtered, time-sorted list are paired. Pairs are kept if 0 < gap_h ≤ max(postSleeplessThresholdH, 1.3 × τ_estimate). The current τ estimate is used to relax the gap threshold for users whose true τ is far from 24, preventing systematic loss of valid cycles.

Each surviving pair contributes drift_h with a weight

w_i = exp(−ln(2) × (t_now − t_onset_i) / halfLife)

with halfLife = 28 days. Recent data is weighted more heavily than old data; the half-life is comparable to the timescale on which self-reported τ has been observed to shift in N24 patients (Emens 2013).
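As a minimal sketch of the weight formula, assuming the age of each pair is expressed in days (the function name is hypothetical):

```javascript
// Exponential recency weight: 1.0 for a pair observed now, 0.5 at one half-life.
function ewmaWeight(ageDays, halfLifeDays = 28) {
  return Math.exp(-Math.LN2 * ageDays / halfLifeDays);
}
```

A pair that is 28 days old gets half the weight of a pair logged today, and a pair that is 56 days old gets a quarter.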

Step 2 — prior

A weak prior centers the estimator on the published N24 population mean. The prior contributes priorPseudoCount = 3 pseudo-observations at drift = priorTau − 24 = 0.7 h with variance priorSigma² = 1.0 (Hayakawa 2005 N=57; Kitamura 2013 N=28).

The prior weight fades linearly with sample size:

effective_prior = max(0, priorPseudoCount − N_pairs / 3)

With priorPseudoCount = 3 the prior weight reaches zero at 9 clean pairs. This protects cold-start users from wildly miscalibrated estimates without dragging established users toward the population mean if their individual τ differs.
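The fade can be checked with a one-line sketch (the function name is hypothetical; the formula is the one given above):

```javascript
// Remaining prior weight, in pseudo-observations, after nPairs clean pairs.
function effectivePriorWeight(nPairs, priorPseudoCount = 3) {
  return Math.max(0, priorPseudoCount - nPairs / 3);
}
```

With the default pseudo-count this gives 3 at zero pairs, 2 at three pairs, and 0 from nine pairs onward.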

Step 3 — weighted mean and variance

W            = Σ w_i + effective_prior
mean_drift   = (Σ w_i · drift_i  +  prior_contribution) / W
σ_obs²       = (Σ w_i · (drift_i − mean_drift)²  +  prior_var) / W

An adaptive σ floor prevents pathologically small variance estimates in low-jitter datasets (τ here is 24 + mean_drift, as defined in Step 4):

σ_floor      = max(1.0, 0.5 × √(max(τ − 24, 0.5)))
σ_obs        = max(σ_obs, σ_floor)
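A sketch of Step 3 follows. The formulas above do not spell out prior_contribution or prior_var, so the sketch assumes prior_contribution = effective_prior × (priorTau − 24) and prior_var = effective_prior × priorSigma². Both are assumptions, as are the function and field names; estimatePeriod() may differ.

```javascript
// Hypothetical sketch of the weighted mean, variance, and sigma floor.
function weightedDriftStats(drifts, weights, effectivePrior = 0, priorDrift = 0.7, priorSigma2 = 1.0) {
  let sumW = 0;
  let sumWD = 0;
  for (let i = 0; i < drifts.length; i++) {
    sumW += weights[i];
    sumWD += weights[i] * drifts[i];
  }
  const W = sumW + effectivePrior;
  if (W <= 0) return null; // no pairs and no prior weight: nothing to average

  // ASSUMPTION: the prior acts as effectivePrior observations at priorDrift.
  const meanDrift = (sumWD + effectivePrior * priorDrift) / W;

  let sumSq = 0;
  for (let i = 0; i < drifts.length; i++) {
    sumSq += weights[i] * (drifts[i] - meanDrift) ** 2;
  }
  // ASSUMPTION: the prior adds effectivePrior * priorSigma2 to the squared-deviation sum.
  const varObs = (sumSq + effectivePrior * priorSigma2) / W;

  const tau = 24 + meanDrift;
  const sigmaFloor = Math.max(1.0, 0.5 * Math.sqrt(Math.max(tau - 24, 0.5)));
  const sigmaObs = Math.max(Math.sqrt(varObs), sigmaFloor);
  return { meanDrift, sigmaObs, sigmaFloor };
}
```

Note that the floor only rises above 1.0 h once τ − 24 exceeds 4 h, since 0.5 × √4 = 1.0.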

Step 4 — τ and its uncertainty

τ            = 24 + mean_drift
N_eff        = (Σ w_i)² / Σ w_i²      # Kish effective sample size
σ_τ          = σ_obs / √N_eff

N_eff is computed with the prior pseudo-count included.
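Step 4 can be sketched as below. The function names are hypothetical. How the prior pseudo-count enters the sums is not spelled out here; appending the remaining prior weight to the weights array before calling is one possible reading, and is an assumption.

```javascript
// Kish effective sample size: (sum of weights)^2 / (sum of squared weights).
function kishEffectiveN(weights) {
  let sum = 0;
  let sumSq = 0;
  for (const w of weights) {
    sum += w;
    sumSq += w * w;
  }
  return sumSq > 0 ? (sum * sum) / sumSq : 0;
}

// Period estimate and its standard error from the Step 3 outputs.
function tauWithUncertainty(meanDrift, sigmaObs, weights) {
  const nEff = kishEffectiveN(weights);
  return { tau: 24 + meanDrift, nEff, sigmaTau: sigmaObs / Math.sqrt(nEff) };
}
```

Equal weights give N_eff equal to the number of pairs; a single dominant weight drives N_eff toward 1, which widens σ_τ accordingly.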

Step 5 — two-pass refinement

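The whole procedure is run twice. The first pass uses priorTau to set the gap-keep threshold; the second pass uses the τ estimate from pass 1. This is a stability fix for users with very long τ, for whom the default 30-hour threshold would systematically drop valid cycles. Because the threshold is max(postSleeplessThresholdH, 1.3 × τ_estimate), the τ-dependent term can only relax it; it never tightens the threshold below the user's own postSleeplessThresholdH.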
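The two-pass structure can be sketched as follows. The estimator itself is passed in as a callback (estimateFromPairs, a hypothetical name), and priorTau = 24.7 is taken from the prior drift of 0.7 h in Step 2.

```javascript
// Gap-keep threshold from Step 1.
function gapKeepThresholdH(postSleeplessThresholdH, tauEstimate) {
  return Math.max(postSleeplessThresholdH, 1.3 * tauEstimate);
}

// Run the estimator twice: first with priorTau, then with the pass-1 estimate.
function twoPassTau(pairs, estimateFromPairs, postSleeplessThresholdH = 30, priorTau = 24.7) {
  let tau = priorTau;
  for (let pass = 0; pass < 2; pass++) {
    const limit = gapKeepThresholdH(postSleeplessThresholdH, tau);
    const kept = pairs.filter((p) => p.gapH > 0 && p.gapH <= limit);
    if (kept.length === 0) return tau; // nothing survives: keep the current estimate
    tau = estimateFromPairs(kept);
  }
  return tau;
}
```

With a 30 h setting, the first pass keeps gaps up to about 32.1 h (1.3 × 24.7). A user whose pass-1 estimate is 31 h has the limit raised to 40.3 h in pass 2, so longer valid cycles re-enter the average.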

Step 6 — wraparound unfolding

Modular drift is bounded to (−12, +12]. Users whose true per-cycle drift exceeds 12 h cannot be represented in this range without sign aliasing. A third pass detects the aliasing signature and unfolds it.

The unfolding pass fires only when both of the following hold after pass 2:

mean(|drift_i|)            > 5 h
mean(|drift_i|)            > 2 × |mean(drift_i)|

The first condition selects unusually high-magnitude users. The second detects sign cancellation under the mean — the signature of modular wraparound (real consistent forward shifts recording as alternating ±values).

When triggered, the algorithm:

Picks a bootstrap seed direction from where the large-magnitude pairs cluster (positive vs negative count among |drift_i| > 6 h).
Unwraps each pair's drift to whichever of {d, d+24, d−24} is closest to seed_direction × median(|drift_i|).
Re-computes mean and variance on the unwrapped pairs.
Accepts only if σ_obs(unwrapped) < 0.7 × σ_obs(original) — i.e., the unwrap materially tightens fit. If σ does not tighten, the heuristic is rejected and the original mean is preserved.

The σ-tightening guard prevents false-positive unfolding on genuinely chaotic data. The function returns flags wrapDetected and unwrapApplied plus tauOriginal and tauUnwrapped for diagnostic surfaces (see §10).
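The unfolding pass can be sketched as below. This is a simplified illustration, not the estimatePeriod() source: it uses unweighted means and standard deviations where the estimator uses EWMA weights and the σ floor, it breaks a tie in the seed direction toward positive, and the helper names are hypothetical. The returned field names follow the flags listed above.

```javascript
function meanOf(xs) {
  return xs.reduce((a, b) => a + b, 0) / xs.length;
}
function medianOf(xs) {
  const s = [...xs].sort((a, b) => a - b);
  const mid = Math.floor(s.length / 2);
  return s.length % 2 ? s[mid] : (s[mid - 1] + s[mid]) / 2;
}
function stdDevOf(xs) {
  const m = meanOf(xs);
  return Math.sqrt(meanOf(xs.map((x) => (x - m) ** 2)));
}

function unfoldWraparound(drifts) {
  const absDrifts = drifts.map((d) => Math.abs(d));
  const meanAbs = meanOf(absDrifts);
  const meanSigned = meanOf(drifts);
  const tauOriginal = 24 + meanSigned;

  // Trigger: high magnitude AND sign cancellation under the mean.
  const wrapDetected = meanAbs > 5 && meanAbs > 2 * Math.abs(meanSigned);
  if (!wrapDetected) {
    return { wrapDetected, unwrapApplied: false, tauOriginal, tauUnwrapped: null };
  }

  // Seed direction from where the large-magnitude pairs cluster (ASSUMPTION: tie -> +1).
  const large = drifts.filter((d) => Math.abs(d) > 6);
  const positives = large.filter((d) => d > 0).length;
  const seed = positives >= large.length - positives ? 1 : -1;
  const target = seed * medianOf(absDrifts);

  // Move each drift to whichever of {d, d+24, d-24} is closest to the target.
  const unwrapped = drifts.map((d) =>
    [d, d + 24, d - 24].reduce((best, c) =>
      Math.abs(c - target) < Math.abs(best - target) ? c : best
    )
  );

  // Guard: accept only if the unwrap materially tightens the spread.
  const accepted = stdDevOf(unwrapped) < 0.7 * stdDevOf(drifts);
  return {
    wrapDetected,
    unwrapApplied: accepted,
    tauOriginal,
    tauUnwrapped: accepted ? 24 + meanOf(unwrapped) : null,
  };
}
```

For recorded drifts of [11, −11, 12, −10, 11.5, −10.5] the sketch unwraps to [11, 13, 12, 14, 11.5, 13.5] and reports τ = 36.5 h instead of 24.5 h. On scattered data where the unwrap does not reduce the spread by at least 30%, the guard rejects it and the original mean stands.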

Step 7 — bidirectional-drift detection

When sign reversals across consecutive cycles exceed 40% and at least 4 pairs are available, the estimator returns:

bidirectional = true
driftMedian   = median(drift_i)
tauMedian     = 24 + driftMedian

The median is robust to the cancellation that the weighted mean suffers when forward and backward shifts roughly balance. Consumers (predict tab, banners, doctor's report) display the median path when this flag is set.
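A minimal sketch of this check follows. It assumes a sign reversal means two consecutive drifts with opposite signs, and that the fraction is taken over the N − 1 consecutive comparisons; neither detail is stated above, and the function name is hypothetical.

```javascript
function detectBidirectionalDrift(drifts, minPairs = 4, reversalFraction = 0.4) {
  if (drifts.length < minPairs) return { bidirectional: false };

  // Count consecutive pairs of drifts with opposite signs.
  let reversals = 0;
  for (let i = 1; i < drifts.length; i++) {
    if (drifts[i] * drifts[i - 1] < 0) reversals++;
  }
  const fraction = reversals / (drifts.length - 1);
  if (fraction <= reversalFraction) return { bidirectional: false };

  // Median of the signed drifts.
  const sorted = [...drifts].sort((a, b) => a - b);
  const mid = Math.floor(sorted.length / 2);
  const driftMedian = sorted.length % 2 ? sorted[mid] : (sorted[mid - 1] + sorted[mid]) / 2;
  return { bidirectional: true, driftMedian, tauMedian: 24 + driftMedian };
}
```

For drifts of [1, −1, 2, −2, 1.5] every consecutive pair reverses sign, so the flag is set and tauMedian is 25 h, whereas the plain mean would give 24.3 h.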

4. Alternative drift estimators (for comparison)

Two simpler estimators are also exposed in the UI:

Clean — unweighted arithmetic mean of drift_h over pairs where gap_h ≤ postSleeplessThresholdH. Equivalent to the EWMA estimator with halfLife → ∞, no prior, no σ floor, no unfolding, no bidirectional fallback.
Raw — unweighted mean over ALL transitions, including post-sleepless ones. Post-sleepless transitions are computed as gap_h mod 24 (forced positive). Included for transparency; biases τ upward when users have many skipped sleeps.

When evaluating cohort data, the Smart estimator is the recommended reference. The Clean estimator is appropriate for short, stable windows. Raw should not be used directly for inference but is useful as a comparator — when Raw and Smart diverge by > 3 h on the same user, that user is likely a low-cycle-rate user (most transitions filtered as post-sleepless) and Smart silently undersells their real shift magnitude. See §10 for the cohort-side diagnostic.
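The two simpler estimators and the divergence check can be sketched as follows. The pair shape ({ gapH, driftH }) and the function names are assumptions for the example.

```javascript
// Clean: mean drift over pairs within the threshold.
// Raw: mean over all pairs, with post-sleepless pairs scored as gapH mod 24 (forced positive).
function cleanAndRawDrift(pairs, postSleeplessThresholdH = 30) {
  const mean = (xs) => (xs.length ? xs.reduce((a, b) => a + b, 0) / xs.length : null);
  const clean = pairs.filter((p) => p.gapH <= postSleeplessThresholdH).map((p) => p.driftH);
  const raw = pairs.map((p) => (p.gapH <= postSleeplessThresholdH ? p.driftH : p.gapH % 24));
  return { driftClean: mean(clean), driftRaw: mean(raw) };
}

// Comparator described above: Smart and Raw differing by more than 3 h.
function isLowCycleRate(driftSmart, driftRaw) {
  return Math.abs(driftSmart - driftRaw) > 3;
}
```

With two ordinary pairs at +1 h and one 40 h gap, Clean stays at 1 h while Raw rises to 6 h because the long gap is scored as +16 h. This is the upward bias noted for Raw.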

5. Adaptive forecast (Particle Filter V2)

The Smart estimator above returns a single best-fit τ and its uncertainty. The Adaptive prediction engine is a separate sequential-Monte-Carlo model that maintains a full state distribution for each user. It was deployed on 2026-05-25. Reference implementation: particleFilterPredict() in src/lib/model-lab.js.

Architecture

The user's current sleep state is represented as a swarm of 64 particles, each carrying a complete hypothesis:

Field	Meaning
`anchorH`	current phase (hours-since-epoch)
`velocityH`	current per-cycle drift (h/cycle, soft-clamped)
`accelerationH`	second-order drift change
`durationH`	particle's typical sleep duration estimate
`pressure`	homeostatic sleep pressure (0–1)
`lastWakeH`	last observed wake (for awake-duration math)
`uncertaintyH`	this particle's own uncertainty estimate (h)
`weight`	probability mass relative to the swarm

Update loop

For each historical entry in chronological order:

Predict. Every particle produces a per-row prediction (onset, duration, pressure). Predictions include Gaussian process noise (processNoiseH ≈ 1.2), per-particle velocity jitter, and a pressure-driven onset adjustment.

Score five hypotheses per particle. A softmax over per-hypothesis loss functions assigns weights to:

Hypothesis	Triggered by
`main`	normal cycle, residual small, pressure mid-range
`nap`	short sleep, low evidence for phase
`recovery`	high pressure, longer-than-predicted duration
`shift`	residual indicates genuine phase change
`skip`	gap from prior sleep suggests skipped cycle

Softmax temperature (hypothesisTemp ≈ 0.44) controls how sharply the model commits — low temp behaves like a hard classifier; high temp blends many possibilities softly.

Apply likelihood. Particle weight is multiplied by the probability of the observed onset and duration under each particle's prediction (Gaussian likelihood with observationSigmaH ≈ 2.2).
Normalize weights across the swarm.
Update each particle's state from the observation. The effective learning rate is a hypothesis-weighted blend (mainLR · w_main + recoveryLR · w_recovery + …), so confidently recognized main sleep nights nudge state gently while recognized phase shifts let the model move faster.
Resample when the effective particle count (Kish formula on weights) drops below resampleThreshold ≈ 0.55 × particles.length. Higher-weight particles get cloned with small jitter; lower-weight particles are pruned. This preserves diversity without letting the swarm collapse to a single answer prematurely.
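The scoring, weighting, and resample-trigger steps can be sketched as below. This is a simplified illustration and not the particleFilterPredict() source: the likelihood covers onset only (the model also scores duration), the per-hypothesis losses are taken as given, and predictedOnsetH is a hypothetical field name.

```javascript
// Softmax over negative losses: a lower loss yields a higher hypothesis weight.
function hypothesisWeights(losses, temperature = 0.44) {
  const names = Object.keys(losses);
  const scaled = names.map((n) => -losses[n] / temperature);
  const maxScaled = Math.max(...scaled); // subtract the max for numerical stability
  const exps = scaled.map((s) => Math.exp(s - maxScaled));
  const total = exps.reduce((a, b) => a + b, 0);
  const out = {};
  names.forEach((n, i) => {
    out[n] = exps[i] / total;
  });
  return out;
}

// Unnormalized Gaussian likelihood of the observed onset under one particle's prediction.
function gaussianLikelihood(observedH, predictedH, sigmaH = 2.2) {
  const z = (observedH - predictedH) / sigmaH;
  return Math.exp(-0.5 * z * z);
}

// Multiply each weight by its likelihood, then normalize across the swarm.
function reweightParticles(particles, observedOnsetH) {
  const raw = particles.map((p) => p.weight * gaussianLikelihood(observedOnsetH, p.predictedOnsetH));
  const total = raw.reduce((a, b) => a + b, 0);
  if (total === 0) return particles; // degenerate case: leave weights unchanged
  return particles.map((p, i) => ({ ...p, weight: raw[i] / total }));
}

// Kish effective particle count for normalized weights: 1 / sum(w^2).
function effectiveParticleCount(particles) {
  return 1 / particles.reduce((a, p) => a + p.weight * p.weight, 0);
}

function needsResample(particles, thresholdFraction = 0.55) {
  return effectiveParticleCount(particles) < thresholdFraction * particles.length;
}
```

When every particle predicts the observation equally well, the effective count equals the swarm size and no resample is triggered. When one particle explains the observation far better than the rest, the effective count collapses toward 1 and the resample fires.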

Change-point detection

When the residual for an entry exceeds changePointResidualThresholdH (default 4.5 h), the algorithm temporarily elevates the velocity learning rate from velocityLearningRate ≈ 0.08 to changePointVelocityLearningRate ≈ 0.22 and widens uncertainty by changePointUncertaintyBoostH ≈ 1.1. This is the mechanism behind the disruption-recovery improvement (see §6).
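A sketch of the change-point switch follows. It assumes the residual is compared by absolute value and that the uncertainty boost is additive in hours; both are assumptions, as is the function name. The parameter names and values are the ones quoted above.

```javascript
function changePointAdjust(residualH, uncertaintyH, params = {
  changePointResidualThresholdH: 4.5,
  velocityLearningRate: 0.08,
  changePointVelocityLearningRate: 0.22,
  changePointUncertaintyBoostH: 1.1,
}) {
  // ASSUMPTION: the threshold applies to the magnitude of the residual.
  const isChangePoint = Math.abs(residualH) > params.changePointResidualThresholdH;
  return {
    isChangePoint,
    velocityLearningRate: isChangePoint
      ? params.changePointVelocityLearningRate
      : params.velocityLearningRate,
    // ASSUMPTION: the boost is added to the current uncertainty.
    uncertaintyH: isChangePoint ? uncertaintyH + params.changePointUncertaintyBoostH : uncertaintyH,
  };
}
```

A 5 h residual switches the velocity learning rate to 0.22 and widens a 2.0 h uncertainty to 3.1 h; a 3 h residual leaves both at their normal values.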

Context inputs

Self-reported covariates feed directly into per-particle prediction via small additive deltas to predicted pressure, onset, duration, and uncertainty. Tuned production weights:

Covariate	Pressure boost	Notes
`stress`	+0.05
`illness`	+0.07
`medication`	+0.04
`social`	−0.02
`screensBefore`	+0.03
`blackout`	−0.03
`mood` (1–5)	−0.015 × scaled
`cognition` (1–5)	−0.015 × scaled
`customTag` (per active tag)	+0.02	Bounded by per-user prior with `contextTagPriorCount ≈ 12`; the model learns which of the user's own tags are actually predictive

Light timing (lightOutdoor) and tag-presence indicators also feed into separate onset / duration / velocity channels with smaller weights.
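As a sketch of how the boolean rows of the pressure table combine, assuming the deltas are simply summed for each flag that is set. The mood, cognition, and customTag rows are omitted because their scaling is not specified here; the constant and function names are hypothetical.

```javascript
// Pressure boosts for the boolean covariates, copied from the table above.
const SKETCH_PRESSURE_BOOSTS = {
  stress: 0.05,
  illness: 0.07,
  medication: 0.04,
  social: -0.02,
  screensBefore: 0.03,
  blackout: -0.03,
};

// ASSUMPTION: deltas are additive across flags that are true on the entry.
function pressureDeltaFromFlags(entry) {
  let delta = 0;
  for (const [name, boost] of Object.entries(SKETCH_PRESSURE_BOOSTS)) {
    if (entry[name] === true) delta += boost;
  }
  return delta;
}
```

An entry flagged with both stress and illness would raise predicted pressure by 0.12 under this reading.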

The full parameter list is 40 named numbers, exposed in source at PARTICLE_FILTER_ALPHA_PARAMS. All are inspectable in the public Model Lab (#model-lab route — non-admins can browse the catalog and re-score any candidate model against their own log).

6. Tuning and validation

Training corpus

The production parameter set was fit via offline hyperparameter optimization against voluntarily-shared sleep histories from Circadia alpha users as captured in the tuning corpus on 2026-05-17 (n ≈ 18 active sharers at that snapshot, ranging from a few weeks to ~13 months of logged sleeps) plus two longer-form externally shared datasets. These training-corpus figures are deliberately frozen to that snapshot — they describe what the model was actually fit on, not the current cohort (see §12 for current state). Run artifacts at research/runs/soft-hypothesis-deep-2026-05-17T00-37-13Z.json (prior soft-hypothesis tuning) and research/runs/particle-* (current). Model registry: research/model_registry.json.

Held-out validation

More than 1,600 sleep records across multiple users are reserved for held-out scoring. The model never sees these during fitting.

No-time-travel rule

At each prediction step the model has access only to data preceding the predicted entry. For multi-day forecast tests, the model is frozen at a split point and asked what it would have predicted over the next 7 or 14 days without learning from those future sleeps.
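The rule can be sketched as a walk-forward loop plus a frozen-split forecast. The predictor and forecaster are passed in as callbacks; their names and signatures are hypothetical, and the horizon here counts entries rather than calendar days for simplicity.

```javascript
// One-step-ahead errors: each prediction sees only strictly earlier onsets.
function walkForwardErrors(onsetsH, predictNext) {
  const errors = [];
  for (let i = 1; i < onsetsH.length; i++) {
    const history = onsetsH.slice(0, i); // excludes entry i and everything after it
    errors.push(Math.abs(predictNext(history) - onsetsH[i]));
  }
  return errors;
}

// Frozen forecast: fit on entries before splitIndex, then predict `horizon`
// entries ahead without learning from them.
function frozenForecastErrors(onsetsH, splitIndex, horizon, forecast) {
  const history = onsetsH.slice(0, splitIndex);
  const predicted = forecast(history, horizon);
  const actual = onsetsH.slice(splitIndex, splitIndex + horizon);
  return actual.map((a, k) => Math.abs(predicted[k] - a));
}
```

Because the history slice is cut before the scored entry, a model cannot lower its error by peeking at the value it is asked to predict.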

Disruption-slice testing

Average scores over a whole month can hide failure modes users actually feel, such as a forecast that becomes useless right after one skipped night. We separately score predictions on the rows immediately following a disruption (defined as a residual > disruptionThresholdH ≈ 4 h from the prior fit).

Reported improvements vs prior adaptive (soft-hypothesis)

Recovery within 2 h of a sudden shift: 36.7% vs 13.3% (≈ 2.8×)
Next-sleep error after disrupted sleep: 18.6 h → 14.5 h (≈ 22% reduction)
One long-history dataset (≈347 entries): frozen 7-day forecast error 9.36 → 4.09 (≈ 56% reduction)

Honest qualifiers

Results are mixed elsewhere. The model beats older baselines on disruption response and on long-history datasets; it can trail the previous adaptive model on calm, very steady patterns where the simpler model has nothing to fix. The shape of the training corpus matters (a few long histories shape the model disproportionately); users whose sleep is unlike anything in the pool may need more of their own data before fits converge.

7. Exclusion rules

A session is excluded from drift math (but still counted toward sleep totals) under three conditions:

Fragmentation — a session starts within fragThresholdH hours (default 6 h) of the previous wake. Treated as a continuation of the prior episode, not a new cycle. Threshold is per-user configurable; polyphasic / ME-CFS / split-sleep users typically lower it to 3–5 h, while clean monophasic N24 users can raise it to 8 h.
Nap auto-flag — a session shorter than napThresholdH hours (default 4 h) is auto-flagged as a crash nap. The user can override per entry; the manual choice wins.
Post-sleepless — gap_h from the prior onset exceeds the user's postSleeplessThresholdH (default 30 h). Excluded from Smart and Clean averages; included in Raw (modular wrap) and in the |drift all| cohort diagnostic.

Manual exclusions via the per-entry xDrift toggle override both fragmentation and nap auto-flag in either direction. xDriftManual is preserved separately so threshold changes after the fact don't clobber the user's explicit choice.
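The auto-flag rules and the manual override can be sketched as follows. This is not the markFragmented() source: the entry shape (onsetH and wakeH as hours since a common epoch), the returned field names, and the treatment of xDriftManual as an optional boolean that wins when present are assumptions for the example.

```javascript
// Entries must be sorted by onset. Returns copies with exclusion flags attached.
function flagExclusions(entries, fragThresholdH = 6, napThresholdH = 4) {
  return entries.map((e, i) => {
    const prev = i > 0 ? entries[i - 1] : null;
    // Fragmented: starts within fragThresholdH of the previous wake.
    const fragmented = prev !== null && e.onsetH - prev.wakeH < fragThresholdH;
    // Nap: session shorter than napThresholdH.
    const nap = e.wakeH - e.onsetH < napThresholdH;
    const auto = fragmented || nap;
    // ASSUMPTION: an explicit manual choice overrides the auto-flags in either direction.
    const excludedFromDrift = typeof e.xDriftManual === "boolean" ? e.xDriftManual : auto;
    return { ...e, fragmented, nap, excludedFromDrift };
  });
}
```

A 2 h session that starts 2 h after the previous wake is flagged as both fragmented and a nap. A 2 h session that the user has explicitly re-included (xDriftManual = false) keeps its nap flag but stays in the drift math.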

The reference function is markFragmented() in src/lib/utils.js.

8. Confidence and forecasting

Forward prediction at n cycles uses a random-walk variance model:

σ_prediction(n) = σ_obs × √n

This is the standard model for accumulated jitter in a free-running oscillator and is what the Predict tab displays. It is not a calibrated frequentist interval; it is presented to users as a guide-rail, with documented caveats that real-world predictions degrade beyond ~7 cycles due to compounding τ drift and zeitgeber perturbations.
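The variance model can be sketched as below. The σ formula is the one given above; projecting the forecast center as the last onset plus n × τ is an assumption added for illustration, and the function names are hypothetical.

```javascript
// Random-walk uncertainty after n cycles.
function predictionSigmaH(sigmaObsH, nCycles) {
  return sigmaObsH * Math.sqrt(nCycles);
}

// ASSUMPTION: forecast center advances by tau per cycle from the last observed onset.
function forecastOnsets(lastOnsetH, tauH, sigmaObsH, horizonCycles) {
  const rows = [];
  for (let n = 1; n <= horizonCycles; n++) {
    rows.push({
      cycle: n,
      onsetH: lastOnsetH + n * tauH,
      sigmaH: predictionSigmaH(sigmaObsH, n),
    });
  }
  return rows;
}
```

With σ_obs = 1.5 h the band is ±1.5 h at one cycle and ±3.0 h at four cycles, so uncertainty doubles every time the horizon quadruples.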

The default forecast reports the probability mass inside the user's one-cycle σ_obs tolerance. The adaptive predictor uses the same display contract, but compares its learned residual band against the default one-cycle tolerance. That keeps the percentage comparable across modes and prevents the adaptive model from always showing the same confidence decay sequence merely because both numerator and denominator came from its own sigma.

The Analysis tab also surfaces a Phase Position scatter (Position or Residual mode) that lets users compare predicted vs actual onset for each historical entry. In Adaptive mode the per-entry predictions are the particle-filter's weighted-mean predictions made before observing each row (out.fitted on the particle-filter output) — i.e. genuine forward-in-time fits, not retrospective.

9. Covariates captured per session

For research purposes, each sleep entry can carry:

Core covariates

Field	Type	Meaning
`q`	1–5	Self-rated sleep quality
`wakeType`	`natural` / `forced`	Whether the user woke spontaneously
`stress`	bool	Self-reported stress affecting this session
`illness`	bool	Self-reported illness
`medication`	bool	Self-reported medication change/use
`social`	bool	Social obligation affected timing
`mood`	1–5	Post-wake mood
`cognition`	1–5	Post-wake "brain fog → sharp" rating
`lightOutdoor`	comma-separated subset of {`morning`, `midday`, `evening`, `none`}	Bright outdoor light timing (multi-select May 2026; legacy single-string rows parse to a 1-element set)
`screensBefore`	bool	Screen exposure in the 2 h before onset
`blackout`	bool	Full darkness during sleep
`customTagIds`	string[]	References into the user's custom-tag table
`adHocTags`	string[]	Embedded one-off tags (max 10, capped 32 chars each)
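For secondary analysis, the lightOutdoor field can be parsed into a set so that legacy single-string rows and multi-select rows are handled uniformly. A minimal sketch, assuming unknown tokens are dropped and missing values yield an empty set (the function and constant names are hypothetical):

```javascript
const LIGHT_OUTDOOR_VALUES = ["morning", "midday", "evening", "none"];

// Parse a comma-separated lightOutdoor value into a Set of known buckets.
function parseLightOutdoor(raw) {
  if (typeof raw !== "string" || raw.trim() === "") return new Set();
  const parts = raw
    .split(",")
    .map((s) => s.trim())
    .filter((s) => LIGHT_OUTDOOR_VALUES.includes(s)); // ASSUMPTION: ignore unknown tokens
  return new Set(parts);
}
```

A legacy row such as "midday" parses to a one-element set, and "morning,evening" parses to a two-element set.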

Zeitgeber bundle (May 2026)

Stored as JSONB on the zeitgebers column of circadia_sleep_entries. All fields optional; missing means "not tracked" (not "false"):

Field	Type	Meaning
`morningLight1h`	bool	Bright outdoor or 10,000-lux exposure within 1 h of waking
`firstFood2h`	bool	First food within 2 h of waking
`workout`	`none` / `morning` / `afternoon` / `evening`	Exercise timing bucket
`workoutTime`	`HH:MM`	Optional precise workout time
`caffeine`	bool	Any caffeine intake on this day
`caffeineTime`	`HH:MM`	Optional time of last caffeine
`lastFood3h`	bool	Last food at least 3 h before sleep onset
`melatonin`	bool	Took melatonin on this day
`melatoninTime`	`HH:MM`	Time melatonin was taken
`alcohol`	bool	Alcohol on this day
`alcoholTime`	`HH:MM`	Time of last drink

All covariates are optional, user-reported, and intended as exploratory signals, not ground-truth zeitgeber measurements. Per-user hide controls let users opt out of any zeitgeber they don't track; hidden fields don't appear in either the log form or the Analysis correlation panels.

Per-session derived covariates (computed, not user-reported)

Field	Meaning
`postSleepless`	Whether the gap to the prior onset exceeded user's post-sleepless threshold
`fragmented`	Whether this session started within the user's fragmentation threshold of the prior wake
`driftAmbiguous`	Per-row marker that modular drift on a clean transition lands beyond ±8 h (likely wrap artifact; surfaced but not used by the Smart estimator)

Sleepless periods

Sleepless periods (intentionally skipped sleeps, sometimes lasting 30–48 h in free-running patients) are logged in a separate circadia_wake_periods table to preserve the actual onset/wake timeline. Drift math treats them as gaps; sleep-debt math counts them as ordinary sustained wakefulness.

Per-user settings (cloud-synced)

Setting	Default	Range
`postSleeplessThresholdH`	30	18 – 72
`fragThresholdH`	6	1 – 24
`napThresholdH`	4	1 – 8
`ambiguousThresholdH`	8	4 – 14

Settings sync across devices via the user_settings JSONB column on circadia_user_profiles. The Smart estimator and all derived stats honor the user's own thresholds, not a global default.
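When re-running the estimators on exported data, per-user thresholds can be resolved against the table above. The sketch below fills in defaults and clamps out-of-range values; whether the app itself clamps or rejects such values is not stated here, so the clamping is an assumption, as are the names.

```javascript
// Defaults and ranges copied from the settings table above.
const SKETCH_SETTING_RANGES = {
  postSleeplessThresholdH: { def: 30, min: 18, max: 72 },
  fragThresholdH: { def: 6, min: 1, max: 24 },
  napThresholdH: { def: 4, min: 1, max: 8 },
  ambiguousThresholdH: { def: 8, min: 4, max: 14 },
};

// Missing or non-numeric values fall back to the default; others are clamped (ASSUMPTION).
function resolveUserSettings(userSettings = {}) {
  const out = {};
  for (const [key, r] of Object.entries(SKETCH_SETTING_RANGES)) {
    const v = userSettings[key];
    out[key] = Number.isFinite(v) ? Math.min(r.max, Math.max(r.min, v)) : r.def;
  }
  return out;
}
```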

10. Cohort-side diagnostics

For users analyzing the shared cohort, several additional aggregates are computed in AdminPanel.computeUserStats:

Per-user, recomputed at view time

Stat	Meaning
`tauH`, `sigmaTau`, `sigmaObs`	From the Smart estimator
`driftMean` (Smart), `driftClean`, `driftRaw`	The three estimator outputs
`driftMagnitude`	Unsigned mean of per-cycle drift magnitude (`abs(drift_i)`) over clean transitions only
`driftMagnitudeAll`	Unsigned mean across raw drifts (includes post-sleepless wraps). For users whose pattern is dominated by long awake stretches, this is closer to lived per-cycle shift than `driftMagnitude`
`lowCycleRate`	Boolean flag: Smart and Raw drift diverge by > 3 h. Indicates most transitions are filtered as post-sleepless and Smart silently undersells. Surfaced with ⚠ in the cohort table
`unwrapApplied`, `wrapDetected`	From the Smart wraparound-unfold pass (§3 step 6)
`tauOriginal`, `tauUnwrapped`	τ before vs after unfolding, when applied
`postSleeplessCount`	Number of pairs filtered as post-sleepless

Cohort-level views

τ distribution — histogram (0.25 h bins), mean, median, range
Profile aggregates — self-id, treatments, entrainment status counts
Median |drift| (unsigned shift magnitude across cohort)
Median σ_obs (cycle jitter across cohort)
% with any covariate flag — coverage indicator
Sleepless gaps — count + mean / max / total hours across cohort

Cohort-vs-individual toggle

The cohort table can be computed two ways:

Generic defaults (default view) — every user's Smart estimator re-run with the same thresholds (30 h post-sleepless, 6 h fragmentation). Useful for apples-to-apples comparisons.
Per-user settings — each user's stats computed with the thresholds they chose. Useful for "what the user actually sees."

Drill-down

Admin can load any sharing user's anonymized dataset into the normal Log / Chart / Predict / Clock / Calendar / Analysis views (read-only; the admin's own data is untouched). This is the recommended way to inspect individual users.

11. Data structure and sharing

A user who has opted in via "share my data" appears in the admin/research view under an anonymousId (UUID-style). Their identity-linked user_id is never surfaced to admin or research consumers. Per-session covariates and onset/wake timestamps are exposed in full, but unlinked from any account-level identifier.

Free-text fields not exposed by the shared API:

Sleep notes
Wake-period notes
Profile free-text fields (region, comorbidities_other, treatments_other)

Custom and ad-hoc tag names are a separate opt-in (per-user "share my tag content"). Users sharing data can keep tag content private.

The opt-in is reversible. Revoked sharing deletes the anonymous-share linkage immediately. The underlying sleep data remains under the user's account and is not auto-deleted.

There are three consent items in Circadia, each grantable and withdrawable independently:

Simple sharing — for the Circadia developer and internal product improvement only. Tag-tuning, model fitting, bug investigation.
Research-level sharing — same anonymized data as simple, plus pre-consent for future academic research collaborations under a data-use agreement (DUA), plus a structured research profile (age bucket, sex at birth, country, comorbidities, treatments).
Publication consent — a separate, stricter additive gate that permits a user's anonymized data to be included in publicly-deposited or publication-bound datasets. Publication consent is never implied by simple or research consent; it must be granted explicitly per item. Without it, a user's data is shareable under DUA but not publishable.

Important: Research-level pre-consent does not authorize ad-hoc data transfer. As of May 2026 no academic research collaborations have been initiated. The maintainer will reach out to research-tier sharers individually before any specific collaboration begins.

The DUA / research export (the JSON bundles produced by scripts/export-circadia-shared.mjs) draws from the simple ∪ research union and is appropriate to share with a named researcher under DUA. It is not a publication snapshot — public deposits require the publication tier and a frozen, dated snapshot pipeline that does not yet exist. Do not treat simple-tier OR research-tier data as available for public deposit. See docs/circadia-data-dictionary.md for the full consent-tier table and re-identification caveats.

The reference endpoints are:

GET /api/circadia/admin/cohort/overview — cohort-level aggregates
GET /api/circadia/admin/cohort/shared-entries?anonymousId=… — per-user anonymized sleep log, includes covariates + zeitgebers + custom tag IDs + (when shared) tag names

12. Current cohort state (as of 2026-05-27)

51 signups, 48 email-verified. (Excludes one operational admin account used for testing imports — auth_users.exclude_from_cohort_stats flag; any data logged there is test data, not user behavior.)
23 users sharing data (18 with research-tier consent, 5 with simple-tier only). 15 of the 23 sharing users have additionally granted publication consent. 18 sharing users have logged at least one sleep entry (5 are zero-entry sharers); 16 of those are τ-ready (≥10 clean transition pairs).
Largest individual logs in the shared cohort (counting distinct sleep onsets, not raw rows: re-importing the same Fitbit / Sleep As Android export file creates byte-identical duplicate rows, so we report the distinct count plus the duplicate-import count where it applies):
- 2,447 sleeps across 3,042 days (~8.3 years) — the longest log in the cohort by a wide margin.
- 1,296 sleeps across 1,582 days (~4.3 years).
- 989 sleeps across 854 days (~2.3 years).
- 849 sleeps across 1,834 days (621 duplicate-import rows excluded — user appears to have imported the same Fitbit file 6× in one session).
- Several other users in the 100–400-entry range; a long tail of newer users with weeks to a few months of data.
- These long-history datasets are the strongest single contribution to the τ-estimation work.
Cohort-wide entry totals: 7,145 raw / 6,507 distinct (638 duplicate imports, concentrated in a small number of users).
Observation spans: 3 to 3,042 days per user (median 71 days).
Sleepless gaps logged: 61 events cohort-wide (mean 28.9 h, median 26.6 h, max 62.7 h), distinguishable from typical awake stretches (per-user median typically ~16 h).
Cohort skews adult sighted N24SWD and DSWPD; some ME/CFS overlap; some self-described irregular-sleep-wake / polyphasic patterns.
Geographic distribution: US-majority but not US-exclusive. The research profile collects country plus an optional free-text region field; these fields are not exported in the shared dataset (see re-identification caveat in the data dictionary).

These numbers move daily — they describe the cohort on the date stamped in the section header. The training corpus described in §6 is a frozen earlier snapshot (2026-05-17), deliberately not updated here. Email if you want a current snapshot for a specific analysis.

13. Known limitations

Self-report bias. Onsets and wakes are user-entered; some users log via memory after the fact. There is no actigraphy or PSG ground-truth.
No DLMO. Melatonin onset is not measured; τ is inferred from onset timing alone.
Cycle counting in long gaps. When gap_h exceeds the post-sleepless threshold, the algorithm cannot infer how many 24-hour cycles elapsed. These pairs are dropped from Smart/Clean drift math rather than imputed. Raw and |drift all| diagnostics include them with modular-wrap math.
Sleep-debt model is rough. The 14-day cumulative debt is a linear shortfall vs target — it does not implement the allostatic slow variable of McCauley 2009. Process S uses Borbély 1982 parameters with no individual calibration.
Covariates are correlational only. The dataset does not support causal inference about, e.g., evening screens shifting τ, because exposure is self-reported and unblinded.
Particle filter is fit on a small corpus. Long histories shape the model disproportionately. Patterns dissimilar to anything in the training pool may take longer for fits to converge. See §6.
Tag-correlation panels use n ≥ 3 per group as their reporting floor. These are exploratory; they should not be treated as statistically calibrated.

14. Key references

Estimator priors and τ

Borbély AA. 1982. A two process model of sleep regulation. Hum Neurobiol 1(3):195–204.
Czeisler CA et al. 1999. Stability, precision, and near-24-hour period of the human circadian pacemaker. Science 284(5423):2177–81.
Daan S, Beersma DGM, Borbély AA. 1984. Timing of human sleep: recovery process gated by a circadian pacemaker. Am J Physiol 246(2 Pt 2):R161–83.
Duffy JF et al. 2011. Sex difference in the near-24-hour intrinsic period of the human circadian timing system. PNAS 108 Suppl 3:15602–8.
Emens JS et al. 2013. Circadian misalignment in major depressive disorder. Psychiatry Research 207(1–2):37–43.
Hayakawa T et al. 2005. Clinical analyses of sighted patients with non-24-hour sleep-wake syndrome: a study of 57 consecutively diagnosed cases. Sleep 28(8):945–52.
Kitamura S et al. 2013. Validity of the Japanese version of the Munich ChronoType Questionnaire. Chronobiology International 30(7):918–25.

Homeostatic / debt

McCauley P et al. 2009. A new mathematical model for the homeostatic effects of sleep loss on neurobehavioral performance. J Theor Biol 256(2):227–39.
van Dongen HPA et al. 2003. The cumulative cost of additional wakefulness. Sleep 26(2):117–26.

Zeitgeber correlation backing (used by Analysis tab panels)

Khalsa SB et al. 2003. A phase response curve to single bright light pulses in human subjects. J Physiol 549(Pt 3):945–52.
Damiola F et al. 2000. Restricted feeding uncouples circadian oscillators in peripheral tissues from the central pacemaker in the suprachiasmatic nucleus. Genes Dev 14(23):2950–61.
Stokkan KA et al. 2001. Entrainment of the circadian clock in the liver by feeding. Science 291(5503):490–3.
Burke TM et al. 2015. Effects of caffeine on the human circadian clock in vivo and in vitro. Sci Transl Med 7(305):305ra146.
Chang AM et al. 2015. Evening use of light-emitting eReaders negatively affects sleep, circadian timing, and next-morning alertness. PNAS 112(4):1232–7.
Mason IC et al. 2022. Light exposure during sleep impairs cardiometabolic function. PNAS 119(12):e2113290119.
Youngstedt SD et al. 2019. Human circadian phase-response curves for exercise. J Physiol 597(8):2253–68.
Ebrahim IO et al. 2013. Alcohol and sleep I: effects on normal sleep. Alcohol Clin Exp Res 37(4):539–49.

Modeling family

Kalman RE. 1960. A new approach to linear filtering and prediction problems. J Basic Eng 82(1):35–45. (Lineage for the state-space update.)
Gordon NJ et al. 1993. Novel approach to nonlinear/non-Gaussian Bayesian state estimation. IEE Proc F 140(2):107–13. (Particle-filter founding paper.)
Doucet A et al. 2001. Sequential Monte Carlo Methods in Practice. Springer. (General reference for the particle-filter approach used in Adaptive V2.)

15. Contact

Research-level anonymized data may be made available to researchers under a written data-use agreement and the product's current privacy terms. Simple-tier data is developer-only.

Contact: Dayah Dover, dayahdover@gmail.com

Please reach out to discuss before any data flows. Research-tier consent only pre-approves the act of sharing in principle; specific collaborations require a separate conversation.

If you publish using Circadia-derived data, please cite as:

Dover D. Circadia: free-running sleep tracking for N24SWD. Open alpha, 2026. https://circadia.owlandkestrel.com

A formal DOI deposit on Zenodo is planned.