Why a Methodology Page Exists

Most online timers claim to be "accurate." Almost none publish what accurate means or how they measured it. When we built The Blog Timer, the question "how do we know this is right?" had to have a non-handwavy answer. This page is that answer.

Everything below is replicable. If you have a laptop, a phone, a browser, and roughly 90 minutes, you can reproduce our entire test protocol. We encourage that — if you find a result that contradicts ours, write to suraj@theblogtimer.com with your data and we will publish a correction in the changelog.

The protocol exists for one reason: browser-based timers fail in a lot of non-obvious ways. Background tab throttling, OS sleep, requestAnimationFrame de-prioritization, mobile background suspension, Daylight Saving transitions, and clock drift after wake-from-sleep are all real problems. The 8 tests below were designed to surface each failure mode.

The Engine Under Test

Before describing the tests, briefly: The Blog Timer's countdown engine is built around four primitives.

1. Stored end-timestamp

When the user clicks start with duration d milliseconds, we store endTime = performance.timeOrigin + performance.now() + d. From that point, every "how much time is left?" question is a subtraction against the current high-resolution timestamp. We never count down a variable.

This is the single most important design decision. The W3C High Resolution Time Level 3 spec guarantees performance.now() is monotonic and unaffected by system clock adjustments — exactly what a timer needs.
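A minimal sketch of the stored end-timestamp idea, assuming nothing about the production code beyond the formula quoted above. The names nowMs, startTimer, and remainingMs are hypothetical, and the clock is passed in as a parameter so the logic can be exercised without a browser.

```javascript
// Hypothetical helper: current time on the high-resolution clock, in ms.
// Uses the same formula as the text: timeOrigin + now().
function nowMs() {
  return performance.timeOrigin + performance.now();
}

// Start a timer by storing only its deadline. `clock` is injectable for tests.
function startTimer(durationMs, clock = nowMs) {
  return { endTime: clock() + durationMs };
}

// Remaining time is always a subtraction against the current timestamp,
// clamped at zero once the deadline has passed.
function remainingMs(timer, clock = nowMs) {
  return Math.max(0, timer.endTime - clock());
}
```

Because nothing is decremented, a late or skipped callback cannot corrupt the result; the next reading is simply computed from the clock again.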

2. requestAnimationFrame for UI

The visual countdown updates inside a requestAnimationFrame loop, which the browser runs in step with screen repaints and pauses for hidden tabs. When the tab is hidden, rAF stops firing, but the math is still correct, because it depends on performance.now(), not on rAF cadence.

We fall back to setInterval(fn, 250) only as a defensive layer for browsers where rAF behaves unexpectedly under tab-discard policies.
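The display loop could be sketched roughly as follows. This is an illustration, not the production code: startDisplayLoop and its env parameter are hypothetical, and because the text does not say how the engine decides that rAF is misbehaving, this sketch falls back only when rAF is missing altogether.

```javascript
// Hypothetical sketch: drive the display from rAF when available,
// otherwise from a 250 ms interval. `env` defaults to globalThis so the
// same code can be exercised with fake schedulers.
function startDisplayLoop(getRemainingMs, render, env = globalThis) {
  let stopped = false;
  let intervalId = null;

  if (typeof env.requestAnimationFrame === "function") {
    const frame = () => {
      if (stopped) return;
      render(getRemainingMs()); // recomputed from the clock on every frame
      env.requestAnimationFrame(frame);
    };
    env.requestAnimationFrame(frame);
  } else {
    // Assumption: fall back only when rAF does not exist at all.
    intervalId = env.setInterval(() => render(getRemainingMs()), 250);
  }

  // The returned function stops whichever scheduler is active.
  return () => {
    stopped = true;
    if (intervalId !== null) env.clearInterval(intervalId);
  };
}
```

Either path only repaints. The value it paints comes from getRemainingMs, so a slow or paused loop delays the display without changing the answer.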

3. Visibility re-sync

On visibilitychange and focus events, we explicitly recompute the remaining time. If the user closed their laptop for an hour and the timer's deadline is in the past, the timer correctly fires its completion handler on resume.
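One way to wire this up, as a sketch under stated assumptions: makeRecompute and attachResync are hypothetical names, the timer object shape is invented for illustration, and the event targets are passed in (in a browser they would be document and window).

```javascript
// Hypothetical sketch: one recompute function shared by every re-sync trigger.
// `timer` holds { endTime, done }; `clock` returns ms on the same time base
// as endTime.
function makeRecompute(timer, clock, onComplete) {
  return function recompute() {
    const remaining = Math.max(0, timer.endTime - clock());
    if (remaining === 0 && !timer.done) {
      timer.done = true; // guard so completion fires exactly once
      onComplete();
    }
    return remaining;
  };
}

// Wire the triggers. In a browser, pass `document` for visibilitychange
// and `window` for focus.
function attachResync(doc, win, recompute) {
  const handler = () => recompute();
  doc.addEventListener("visibilitychange", handler);
  win.addEventListener("focus", handler);
  // The returned function detaches both listeners.
  return () => {
    doc.removeEventListener("visibilitychange", handler);
    win.removeEventListener("focus", handler);
  };
}
```

The done flag matters because visibilitychange and focus often arrive back to back on resume; without it the completion handler would run twice.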

4. Audio alert on the same clock

The audio alert is scheduled against the same end-timestamp, not against a separate interval. If completion is detected during a re-sync (rather than at the precise moment), the alert fires at re-sync. This is intentional: better a late alert than a silent miss.
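A rough sketch of an alert that is derived from the stored deadline rather than from its own countdown. createAlertScheduler is a hypothetical name and the env parameter exists only so the scheduling can be tested with fake timers; the real engine may be structured differently.

```javascript
// Hypothetical sketch: the alert is derived from the stored deadline.
// `clock` must use the same time base as `endTime`.
function createAlertScheduler(endTime, clock, playAlert, env = globalThis) {
  let alerted = false;
  let timeoutId = null;

  function check() {
    if (alerted) return;
    const remaining = endTime - clock();
    if (remaining <= 0) {
      alerted = true; // late is acceptable; firing twice is not
      playAlert();
      return;
    }
    // Not due yet (first call, early wake, or re-sync): schedule for what is left.
    if (timeoutId !== null) env.clearTimeout(timeoutId);
    timeoutId = env.setTimeout(check, remaining);
  }

  check();      // initial scheduling
  return check; // call this from re-sync handlers as well
}
```

The same check function serves both paths: the setTimeout that normally fires at the deadline, and the re-sync that catches a deadline missed while the page was suspended.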

The 8-Test Protocol

Every release of the timer engine is run through all 8 tests on the full browser/OS matrix. The protocol is intentionally adversarial — we are trying to break the timer, not pat ourselves on the back.

Test 1: Foreground Drift (5-minute baseline)

Start a 5-minute timer with the tab in the foreground. Record actual elapsed wall-clock time when the alert fires. Repeat 20 times. Compute the mean drift and the 95% confidence interval.

Acceptance: mean drift < 50ms, max drift < 200ms.

Tools: performance.now() instrumentation on a sibling monitor page, cross-checked against the system clock via a small Node.js timestamping script.

Test 2: Background Tab Throttling (30 minutes)

Start a 30-minute timer, then immediately switch to a different tab and use that tab actively. After 30 minutes, return to the timer tab and record when the alert fired and what the timer displays.

This test specifically targets Chromium's background-tab throttling policy (which, per the Chrome 88 release notes, can limit timers in a tab that has been hidden for several minutes to roughly one wake-up per minute).

Acceptance: alert fires within 1 second of correct wall-clock time, regardless of background throttling.
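To see why throttling breaks a tick-counting timer but not a timestamp-based one, here is a small simulation. The tick times are synthetic, chosen to mimic one callback per minute; they are not measured browser behaviour, and the function name simulate is an invention for this sketch.

```javascript
// Simulation only: tick times are synthetic, not measured browser behaviour.
// A naive timer subtracts a fixed step per callback; a timestamp timer
// subtracts the current time from the stored deadline.
function simulate(durationMs, tickTimesMs, assumedStepMs) {
  let naiveRemaining = durationMs;
  let timestampRemaining = durationMs;
  for (const t of tickTimesMs) {
    naiveRemaining = Math.max(0, naiveRemaining - assumedStepMs);
    timestampRemaining = Math.max(0, durationMs - t);
  }
  return { naiveRemaining, timestampRemaining };
}

// 30-minute timer; callbacks arrive once per minute instead of once per second.
const ticks = Array.from({ length: 30 }, (_, i) => (i + 1) * 60000);
const result = simulate(30 * 60000, ticks, 1000);
console.log(result); // { naiveRemaining: 1770000, timestampRemaining: 0 }
```

After 30 real minutes the naive counter has only subtracted 30 seconds and still believes 29.5 minutes remain, while the timestamp calculation has reached zero.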

Test 3: Laptop Sleep (close-lid 10 minutes)

Start a 25-minute Pomodoro. After 5 minutes, close the laptop lid. Wait 10 minutes (verified by external phone timer). Open the lid. Observe.

Expected behavior: on resume, the timer shows correct remaining time (~10 minutes). It does not still show "20 minutes left" as if no time had passed during sleep.

Acceptance: remaining-time error after wake < 1 second.

Test 4: Mobile Backgrounding (iOS Safari + Android Chrome)

Start a 10-minute timer on mobile. Press the home button (do not kill the app). Wait until the timer should complete. Reopen the browser.

iOS Safari aggressively suspends JavaScript when the tab is not foreground. Per Apple's WebKit documentation, suspended tabs do not run timers at all.

Acceptance: on return, the timer shows zero remaining and (where browser permissions allow) the audio alert fires immediately as part of re-sync.

Test 5: Clock Adjustment (NTP sync mid-timer)

Start a 15-minute timer. Mid-run, manually shift the system clock forward 30 minutes and then back 30 minutes (simulating an NTP correction). Observe.

This is why we use performance.now() rather than Date.now() for the underlying math: performance.now() is monotonic and unaffected by wall-clock adjustments, per the W3C monotonic-clock requirement.

Acceptance: timer behavior is unchanged by clock adjustment; alert still fires at the correct elapsed-time mark.
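The difference between the two clocks can be illustrated with a simulation. Both clocks here are fakes standing in for Date.now() and performance.now(); makeClocks and its methods are hypothetical and exist only to make the effect of a clock jump visible.

```javascript
// Simulation only: two fake clocks stand in for Date.now() and performance.now().
function makeClocks() {
  let elapsed = 0;    // true elapsed ms since start
  let wallOffset = 0; // adjustment applied to the wall clock only
  return {
    advance(ms) { elapsed += ms; },
    adjustWallClock(ms) { wallOffset += ms; },
    wallNow: () => 1700000000000 + elapsed + wallOffset,
    monotonicNow: () => elapsed,
  };
}

const clocks = makeClocks();
const duration = 15 * 60000;
const wallEnd = clocks.wallNow() + duration;
const monoEnd = clocks.monotonicNow() + duration;

clocks.advance(5 * 60000);          // 5 minutes into the run
clocks.adjustWallClock(30 * 60000); // system clock jumps forward 30 minutes

const wallRemaining = Math.max(0, wallEnd - clocks.wallNow());
const monoRemaining = Math.max(0, monoEnd - clocks.monotonicNow());
console.log(wallRemaining, monoRemaining); // 0 600000
```

A deadline stored against the wall clock looks expired the moment the clock jumps, while the monotonic deadline still correctly shows 10 minutes left.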

Test 6: High CPU Load

Start a 5-minute timer. Simultaneously run a CPU-intensive task in another tab — we use a WebAssembly prime-sieve benchmark that pins one core at 100%. Measure drift.

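Under load, setInterval callbacks can be delayed or skipped, so a timer that counts callbacks falls behind. A timestamp-based design should not be affected, because every reading is recomputed from the clock, but we test it to confirm.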

Acceptance: drift < 50ms even at 100% CPU load.

Test 7: Daylight Saving Transition

We don't wait 6 months for a real DST event. Instead, we run the same test battery in browsers configured with time zones that observe DST, with the system clock pre-set to 5 minutes before a "spring forward" or "fall back" transition. We then start a 10-minute timer and let it run through the transition.

Acceptance: timer is unaffected by DST. (It should be: we measure elapsed time, not local civil time. This test confirms there's no bug where we accidentally use local time anywhere.)
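The distinction between elapsed time and local civil time can be shown with a short example. It uses the 2024 US "spring forward" transition in the America/New_York zone purely as an illustration; it is not part of the test harness described above.

```javascript
// Elapsed time is a difference of epoch milliseconds; local civil time is
// only a formatting of those instants. US "spring forward" on 2024-03-10
// happens at 07:00 UTC (02:00 Eastern).
const startUtc = Date.UTC(2024, 2, 10, 6, 55); // 01:55 local in New York
const endUtc = startUtc + 10 * 60000;          // a 10-minute timer

const fmt = new Intl.DateTimeFormat("en-US", {
  timeZone: "America/New_York",
  hour: "2-digit",
  minute: "2-digit",
  hourCycle: "h23",
});

const elapsedMinutes = (endUtc - startUtc) / 60000;
console.log(fmt.format(new Date(startUtc)), fmt.format(new Date(endUtc)), elapsedMinutes);
```

The local wall clock reads 01:55 at the start and 03:05 at the end, which looks like 70 minutes, yet the elapsed time is exactly 10. Any code path that derived remaining time from local hours and minutes would fail this test.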

Test 8: Audio Alert Reliability

Across the full browser/OS matrix, with autoplay policies enabled, with the system muted, with the system at volume 100, and with Bluetooth audio devices connected: start a 1-minute timer and verify the alert plays.

This tests against the browser autoplay restrictions that have tightened steadily since Chrome 66's 2018 autoplay policy.

Acceptance: audio plays when the page is in foreground and the user has previously interacted with the page (any click). Documented limitation: muted-system or backgrounded-tab states do not produce sound, which we surface in the UI when relevant.
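A sketch of how a blocked alert can be detected, assuming the alert is played through an object whose play() method returns a Promise (as HTMLMediaElement.play() does). The function playAlert and the onBlocked hook are hypothetical. Note that this only catches playback the browser refuses; a muted system does not cause play() to reject.

```javascript
// Hypothetical sketch: try to play the alert and report whether it was allowed.
// `audio` is anything with a play() method that returns a Promise, such as an
// HTMLAudioElement. `onBlocked` is a hypothetical UI hook.
async function playAlert(audio, onBlocked) {
  try {
    await audio.play();
    return true;
  } catch (err) {
    // Browsers reject play() (typically with a NotAllowedError) when the
    // autoplay policy blocks playback, e.g. before any user gesture.
    onBlocked(err);
    return false;
  }
}
```

The onBlocked hook is where a page could show a visual fallback instead of failing silently.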

Browser and OS Matrix

Every test is run on each of the following combinations. We update this matrix when a new major browser version ships.

  • macOS (current and N-1 versions) — Chrome stable, Safari stable, Firefox stable, Arc, Brave
  • Windows 10 / 11 — Chrome stable, Edge stable, Firefox stable
  • Ubuntu 22.04 LTS — Chromium stable, Firefox ESR
  • iOS (current and N-1 major versions) — Safari, Chrome (which is WebKit on iOS), Firefox Focus
  • Android (current and N-1) — Chrome stable, Samsung Internet, Firefox stable, DuckDuckGo Browser

That's a minimum of 22 browser/OS combinations × 8 tests = 176 individual test runs per release. We automate what we can with Playwright (mostly tests 1, 6, and 8 in headed mode), but several tests — particularly Test 3 (laptop sleep), Test 4 (mobile backgrounding), and Test 7 (DST simulation) — require manual execution.

Tooling

The exact tooling stack we use, in case you want to replicate:

  • Primary clock source: performance.now(), which the W3C High Resolution Time Level 3 spec defines as a monotonic, high-resolution clock. See the spec.
  • Secondary clock source: a Node.js process timestamping Date.now() every 100ms, written to a log file. Used to cross-check browser timestamps against an out-of-process wall clock.
  • UI refresh: requestAnimationFrame with fallback to setInterval(handler, 250). setTimeout is used only for completion event scheduling, never for elapsed-time measurement.
  • Re-sync triggers: visibilitychange, focus, and resume events all call the same recompute function.
  • Automated test runner: Playwright with browser-context isolation. Allows us to simulate tab backgrounding deterministically.
  • Statistics: Python with NumPy for distribution analysis; we report mean, standard deviation, 95% CI, and max observed drift for every test.

Statistical Reporting and Margin of Error

When we report timer accuracy, we report it as a distribution, not a single number. Specifically:

  • Mean drift: the average absolute difference between the expected completion time and the observed completion time across N runs.
  • Standard deviation: how consistent that drift is. Low SD with low mean is good. Low mean with high SD means we're sometimes very accurate and sometimes not, which is worse than consistently slightly-off.
  • 95% confidence interval: derived from the t-distribution given our sample size (typically N = 20 to N = 50 per test).
  • Max observed: the worst single drift observation in the sample. This matters because users care about worst case, not average case.
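The four statistics above can be computed as in the following sketch. Our pipeline uses Python with NumPy; this version is plain JavaScript only so it can run next to the timer code, and summarizeDrift is a hypothetical name. The t critical value is passed in by the caller rather than computed, which is an assumption of this sketch (for N = 20 the two-sided 95% value with 19 degrees of freedom is 2.093).

```javascript
// Summarize drift samples (ms). `tCritical` is the two-sided 95% t value for
// n - 1 degrees of freedom and must be supplied by the caller
// (for example 2.093 when n = 20).
function summarizeDrift(samples, tCritical) {
  const n = samples.length;
  const abs = samples.map(Math.abs); // drift is reported as an absolute difference
  const mean = abs.reduce((a, b) => a + b, 0) / n;
  // Sample variance with the n - 1 denominator.
  const variance = abs.reduce((a, x) => a + (x - mean) ** 2, 0) / (n - 1);
  const sd = Math.sqrt(variance);
  const halfWidth = (tCritical * sd) / Math.sqrt(n);
  return {
    n,
    mean,
    sd,
    ci95: [mean - halfWidth, mean + halfWidth],
    max: Math.max(...abs),
  };
}
```

For example, summarizeDrift([6, -8, 10], 4.303) gives a mean of 8 ms, an SD of 2 ms, and a max of 10 ms; the very wide interval that results shows why three runs are not enough.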

For Test 1 (Foreground Drift, 5-minute baseline) on macOS Sonoma with Chrome stable as of our most recent release, the numbers were: mean drift 8.3ms, SD 4.1ms, 95% CI [6.5ms, 10.1ms], max observed 21ms. That mean is roughly three orders of magnitude smaller (about 1,200×) than the worst-case 10+ seconds we measured on a popular competitor's site using a naive setInterval implementation.

Why This Matters for Productivity Protocols

Timer drift is not just a curiosity. For specific protocols, drift compounds.

Tabata's protocol — 20 seconds maximum-intensity work, 10 seconds rest, repeated 8 times — was validated in Tabata et al. (1996) at exactly those durations. If your interval timer drifts by even 1 second per interval, you've added 16 seconds to a 4-minute protocol. That's enough to change the metabolic response the study reported. Drift turns a research-backed protocol into something else.
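One way to keep per-interval error from accumulating is to derive every boundary from a single start timestamp. The sketch below illustrates that idea; tabataBoundaries and currentPhase are hypothetical names and this is not necessarily how any particular interval timer is built.

```javascript
// Hypothetical sketch: derive every interval boundary from one start timestamp
// so per-interval scheduling error cannot accumulate.
function tabataBoundaries(startMs, rounds = 8, workMs = 20000, restMs = 10000) {
  const boundaries = [];
  let offset = 0;
  for (let i = 0; i < rounds; i++) {
    offset += workMs;
    boundaries.push({ phase: "work", round: i + 1, endsAt: startMs + offset });
    offset += restMs;
    boundaries.push({ phase: "rest", round: i + 1, endsAt: startMs + offset });
  }
  return boundaries;
}

// Which phase is active at time `nowMs`? Returns null once the protocol is over.
function currentPhase(boundaries, nowMs) {
  return boundaries.find((b) => nowMs < b.endsAt) ?? null;
}
```

With 8 rounds there are 16 boundaries and the last one falls exactly 240,000 ms after the start. A timer that instead restarts a fresh countdown at each boundary and runs 1 second late each time ends 16 seconds late.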

The same applies to nap timers. Sara Mednick's research (Mednick et al., 2002; see also Brooks & Lack, 2006) identifies a critical window: naps shorter than ~26 minutes restore alertness without producing sleep inertia. A nap that drifts to 35 minutes can leave you groggier than no nap at all. The Hayashi research on post-nap recovery (Hayashi et al., 1999) points to the same threshold.

Pomodoro is more forgiving — Cirillo's original 25-minute interval was, by his own admission in The Pomodoro Technique (2006), partly arbitrary. But the break boundaries matter. A "5-minute" break that drifts to 12 is a derailment of the next work block, and Cal Newport's analysis in Deep Work (2016) suggests the cost of re-engagement after over-long breaks is substantial.

Known Limitations

We don't claim perfection. The current engine has documented limitations:

  • iOS Safari long-background: when an iOS tab is backgrounded for more than approximately 5 minutes, Safari may discard the page entirely. On reload, our timer state recovers from localStorage, but the audio alert at the original completion time cannot fire while the page is discarded. We surface this as a banner on iOS when the timer is set for more than 5 minutes.
  • System clock manipulation: performance.now() protects the countdown from clock adjustments, whether they come from NTP or from a user manually setting the system clock back an hour, so the displayed remaining time stays correct. However, any UI showing "completes at HH:MM" will show the new, shifted local time. That is technically correct but can confuse.
  • Audio in muted-system state: we cannot bypass system mute, and no web API allows a page to do so.
  • Service Worker timers: we deliberately do not use service workers to fire alerts when the tab is closed. The browser API support is inconsistent and the privacy implications (notification permissions) are not worth the marginal benefit for our use case.
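The localStorage recovery mentioned in the first limitation could look roughly like this. It is a sketch, not the production code: the key name, function names, and return shape are all assumptions, and storage is passed in so that anything with getItem, setItem, and removeItem can stand in for localStorage.

```javascript
// Hypothetical sketch: persist the deadline so a discarded page can recover.
// `storage` is anything with getItem/setItem/removeItem, such as localStorage.
// The key name is an assumption.
const STORAGE_KEY = "timer.endTime";

function saveDeadline(storage, endTime) {
  storage.setItem(STORAGE_KEY, String(endTime));
}

// On load: returns { endTime, remainingMs, missed } or null when nothing
// usable was stored. `missed` is true when the deadline passed while the
// page was not running.
function restoreDeadline(storage, nowMs) {
  const raw = storage.getItem(STORAGE_KEY);
  if (raw === null) return null;
  const endTime = Number(raw);
  if (!Number.isFinite(endTime)) {
    storage.removeItem(STORAGE_KEY); // discard a corrupt entry
    return null;
  }
  const remainingMs = Math.max(0, endTime - nowMs);
  return { endTime, remainingMs, missed: remainingMs === 0 };
}
```

The stored value has to be on an epoch-based time base, such as performance.timeOrigin + performance.now(), because performance.now() on its own restarts from zero on every page load.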

Open Methodology — Please Replicate

We publish this methodology because the alternative is a black box, and a black box is unacceptable for a tool that people rely on for cooking, exercise, sleep, and focus.

If you replicate our protocol on hardware or in browsers we don't currently test, we want your data. If your results contradict ours, we especially want your data. Write to suraj@theblogtimer.com with raw test logs and we'll either reproduce, correct our claims, or explain the discrepancy. Reproducibility is the whole point.

For the broader philosophy on why this transparency matters, see the about page and the editorial policy. For the full bibliography of timing-research citations referenced above, see /sources/.
