OpenSpace smart-poll-loop
Adaptive polling pattern with exponential backoff on failure, automatic recovery on success, and visibility-aware scheduling
```sh
git clone https://github.com/HKUDS/OpenSpace
T=$(mktemp -d) && git clone --depth=1 https://github.com/HKUDS/OpenSpace "$T" && mkdir -p ~/.claude/skills && cp -r "$T/showcase/skills/smart-poll-loop" ~/.claude/skills/hkuds-openspace-smart-poll-loop && rm -rf "$T"
```
showcase/skills/smart-poll-loop/SKILL.md

Smart Poll Loop Pattern
Overview
The Smart Poll Loop pattern implements resilient, adaptive polling for periodic data refresh operations. It combines exponential backoff on consecutive failures with automatic recovery on success, ensuring system resources are protected during outages while maintaining responsiveness when services are healthy.
This pattern is essential for production applications that need to poll external services or APIs continuously but must gracefully handle intermittent failures, rate limits, or temporary service disruptions without overwhelming the system or degrading user experience.
Key Patterns Identified
Pattern 1: Exponential Backoff on Failure
Purpose: Prevent overwhelming failing services with repeated requests while giving them time to recover.
Implementation: Each consecutive failure doubles the polling interval up to a configurable maximum multiplier. This creates exponentially increasing delays: 1x → 2x → 4x → 8x (default max).
Key Elements:
- `consecutiveFailures`: Counter tracking sequential failures
- `currentBackoffMultiplier`: Current delay multiplier (starts at 1)
- `maxBackoffMultiplier`: Upper bound to prevent unbounded delays (typically 4-8x)
- Boolean return from callbacks (`true` = success, `false` = failure)
Code Pattern:
```typescript
// Track failure state per runner
interface RunnerEntry {
  consecutiveFailures: number;
  currentBackoffMultiplier: number;
  intervalMs: number; // base interval
  // ... other fields
}

// On failure, increase backoff
if (!success) {
  entry.consecutiveFailures++;
  const newMultiplier = Math.min(
    Math.pow(2, entry.consecutiveFailures),
    maxBackoffMultiplier
  );
  entry.currentBackoffMultiplier = newMultiplier;

  // Reschedule with the new interval
  const effectiveInterval = entry.intervalMs * entry.currentBackoffMultiplier;
  scheduleNextRun(effectiveInterval);
}
```
Pattern 2: Automatic Recovery on Success
Purpose: Immediately restore normal polling frequency when a service recovers, ensuring fresh data flows without unnecessary delay.
Implementation: On any successful callback execution, reset both the failure counter and backoff multiplier to their initial states (0 and 1), then reschedule at the base interval.
Key Elements:
- Reset logic triggered by `success === true` or no error thrown
- Immediate rescheduling at the base interval (no waiting for the current timer)
- Logging state transitions for observability
Code Pattern:
```typescript
if (success) {
  // Auto-recovery: reset backoff and failure count
  if (entry.consecutiveFailures > 0 || entry.currentBackoffMultiplier > 1) {
    entry.consecutiveFailures = 0;
    entry.currentBackoffMultiplier = 1;

    // Immediately reschedule at the normal interval
    if (!paused && entry.timer) {
      scheduleNextRun(name, entry); // uses base intervalMs
    }
  }
}
```
Pattern 3: Failure Detection via Boolean Return or Exception
Purpose: Provide flexible signaling for success/failure states, supporting both explicit boolean returns and exception-based error handling.
Implementation: Callbacks can return a boolean (explicit success/failure), return void (success assumed), or throw an exception (failure). All three mechanisms trigger the same backoff/recovery logic.
Key Elements:
- Type signature: `() => Promise<boolean | void>`
- `true` = explicit success, triggers recovery
- `false` = explicit failure, triggers backoff
- `void`/`undefined` = implicit success (no news is good news)
- Exception thrown = failure, triggers backoff
Code Pattern:
```typescript
let success = true;
try {
  const result = await entry.fn();
  // A boolean return indicates explicit success/failure
  if (typeof result === 'boolean') {
    success = result;
  }
  // void/undefined is treated as success
  if (success) {
    // trigger recovery...
  } else {
    // trigger backoff...
  }
} catch (err) {
  // An exception is also treated as failure
  success = false;
  // trigger backoff...
}
```
Pattern 4: Visibility-Aware Scheduling
Purpose: Pause or slow down polling when the page is hidden to conserve resources, then resume or refresh immediately when visible.
Implementation: Listen for `visibilitychange` events. On hide, clear all timers. On show, either flush stale data (if enough time has passed) or resume normal polling.
Key Elements:
- `document.hidden` state check
- `pauseWhenHidden` flag (full stop vs slower polling)
- `hiddenSince` timestamp for staleness detection
- Staggered flush on resume (prevents request bursts)
Code Pattern:
```typescript
private onVisibilityChange(): void {
  if (document.hidden) {
    this.hiddenSince = Date.now();
    // Pause all runners
    for (const entry of this.runners.values()) {
      if (entry.timer) {
        clearInterval(entry.timer);
        entry.timer = null;
      }
    }
  } else {
    // Resume: flush stale refreshes first
    this.flushStaleRefreshes();
    // Then restart timers
    for (const [name, entry] of this.runners) {
      if (!entry.timer) {
        this.scheduleNextRun(name, entry);
      }
    }
  }
}

private flushStaleRefreshes(): void {
  if (!this.hiddenSince) return;
  const hiddenMs = Date.now() - this.hiddenSince;
  this.hiddenSince = 0;

  let stagger = 0;
  for (const [name, entry] of this.runners) {
    const effectiveInterval = entry.intervalMs * entry.currentBackoffMultiplier;
    if (hiddenMs < effectiveInterval) continue; // not stale yet

    // Stagger refreshes to avoid a thundering herd
    setTimeout(() => this.runRefresh(name), stagger);
    stagger += 150; // 150ms between each
  }
}
```
Pattern 5: Per-Runner State Management
Purpose: Allow multiple independent polling loops with isolated failure tracking and backoff state.
Implementation: Use a map keyed by runner name, storing all state (timer, interval, backoff, failures) per entry. Each runner progresses through its own backoff cycle independently.
Key Elements:
- `Map<string, RunnerEntry>` for isolated state
- Unique names prevent conflicts
- Per-runner failure counters
- Per-runner backoff multipliers
- In-flight tracking prevents overlapping executions
Code Pattern:
```typescript
private runners = new Map<string, RunnerEntry>();
private inFlight = new Set<string>();

scheduleRefresh(
  name: string,
  fn: () => Promise<boolean | void>,
  intervalMs: number,
): void {
  // Each runner gets isolated state
  const entry: RunnerEntry = {
    timer: null,
    intervalMs,
    fn,
    lastRun: 0,
    consecutiveFailures: 0,
    currentBackoffMultiplier: 1,
  };
  this.runners.set(name, entry);
  this.scheduleNextRun(name, entry);
}

private async runRefresh(name: string): Promise<void> {
  const entry = this.runners.get(name);
  if (!entry) return;

  // Prevent overlapping executions of the same runner
  if (this.inFlight.has(name)) return;
  this.inFlight.add(name);

  try {
    // ... execute and handle backoff/recovery
  } finally {
    this.inFlight.delete(name);
  }
}
```
Complete Code Template
Below is a minimal but complete implementation of the adaptive polling pattern:
```typescript
interface RefreshRegistration {
  name: string;
  fn: () => Promise<boolean | void>;
  intervalMs: number;
  condition?: () => boolean; // optional: skip if returns false
}

interface RunnerEntry {
  timer: ReturnType<typeof setInterval> | null;
  intervalMs: number;
  fn: () => Promise<boolean | void>;
  condition?: () => boolean;
  lastRun: number;
  consecutiveFailures: number;
  currentBackoffMultiplier: number;
}

class AdaptiveScheduler {
  private runners = new Map<string, RunnerEntry>();
  private inFlight = new Set<string>();
  private hiddenSince = 0;
  private readonly maxBackoffMultiplier = 8;

  constructor() {
    document.addEventListener('visibilitychange', () => this.onVisibilityChange());
  }

  private onVisibilityChange(): void {
    if (document.hidden) {
      this.hiddenSince = Date.now();
      for (const entry of this.runners.values()) {
        if (entry.timer) {
          clearInterval(entry.timer);
          entry.timer = null;
        }
      }
    } else {
      this.flushStaleRefreshes();
      for (const [name, entry] of this.runners) {
        if (!entry.timer) {
          this.scheduleNextRun(name, entry);
        }
      }
    }
  }

  private scheduleNextRun(name: string, entry: RunnerEntry): void {
    if (entry.timer) clearInterval(entry.timer);
    const effectiveInterval = entry.intervalMs * entry.currentBackoffMultiplier;
    entry.timer = setInterval(() => this.runRefresh(name), effectiveInterval);
  }

  // Shared by the failure and exception paths
  private applyBackoff(name: string, entry: RunnerEntry): void {
    entry.consecutiveFailures++;
    const newMultiplier = Math.min(
      Math.pow(2, entry.consecutiveFailures),
      this.maxBackoffMultiplier
    );
    if (newMultiplier !== entry.currentBackoffMultiplier) {
      entry.currentBackoffMultiplier = newMultiplier;
      if (!document.hidden && entry.timer) {
        this.scheduleNextRun(name, entry);
      }
    }
  }

  private async runRefresh(name: string): Promise<void> {
    const entry = this.runners.get(name);
    if (!entry) return;
    if (this.inFlight.has(name)) return;
    if (entry.condition && !entry.condition()) return;

    this.inFlight.add(name);
    let success = true;
    try {
      const result = await entry.fn();
      if (typeof result === 'boolean') {
        success = result;
      }
      entry.lastRun = Date.now();
      if (success) {
        // Auto-recovery
        if (entry.consecutiveFailures > 0 || entry.currentBackoffMultiplier > 1) {
          entry.consecutiveFailures = 0;
          entry.currentBackoffMultiplier = 1;
          if (!document.hidden && entry.timer) {
            this.scheduleNextRun(name, entry);
          }
        }
      } else {
        // Exponential backoff
        this.applyBackoff(name, entry);
      }
    } catch (err) {
      success = false;
      this.applyBackoff(name, entry);
      console.error(`Refresh ${name} failed:`, err);
    } finally {
      this.inFlight.delete(name);
    }
  }

  private flushStaleRefreshes(): void {
    if (!this.hiddenSince) return;
    const hiddenMs = Date.now() - this.hiddenSince;
    this.hiddenSince = 0;
    let stagger = 0;
    for (const [name, entry] of this.runners) {
      const effectiveInterval = entry.intervalMs * entry.currentBackoffMultiplier;
      if (hiddenMs < effectiveInterval) continue;
      setTimeout(() => this.runRefresh(name), stagger);
      stagger += 150;
    }
  }

  scheduleRefresh(
    name: string,
    fn: () => Promise<boolean | void>,
    intervalMs: number,
    condition?: () => boolean,
  ): void {
    const existing = this.runners.get(name);
    if (existing?.timer) clearInterval(existing.timer);
    const entry: RunnerEntry = {
      timer: null,
      intervalMs,
      fn,
      condition,
      lastRun: 0,
      consecutiveFailures: 0,
      currentBackoffMultiplier: 1,
    };
    if (!document.hidden) {
      this.scheduleNextRun(name, entry);
    }
    this.runners.set(name, entry);
  }

  trigger(name: string): void {
    this.runRefresh(name);
  }

  destroy(): void {
    for (const entry of this.runners.values()) {
      if (entry.timer) clearInterval(entry.timer);
    }
    this.runners.clear();
    this.inFlight.clear();
  }
}
```
Usage Examples
Example 1: Basic Polling with Explicit Success/Failure
```typescript
const scheduler = new AdaptiveScheduler();

// API endpoint that may fail intermittently
scheduler.scheduleRefresh(
  'fetch-stock-prices',
  async () => {
    try {
      const response = await fetch('/api/stocks');
      if (!response.ok) return false; // explicit failure
      const data = await response.json();
      updateStockPrices(data);
      return true; // explicit success
    } catch (err) {
      console.error('Stock fetch failed:', err);
      return false; // explicit failure
    }
  },
  30_000 // 30-second base interval
);
```
Behavior:
- On success: polls every 30s
- After 1 failure: polls every 60s (2x)
- After 2 failures: polls every 120s (4x)
- After 3+ failures: polls every 240s (8x, max)
- On any success: immediately returns to 30s interval
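The schedule above follows directly from the capped doubling rule. As a sanity check, the effective interval can be written as a pure function (a sketch; `effectiveIntervalMs` is not part of the scheduler class above):

```typescript
// Effective polling interval after `failures` consecutive failures,
// with the multiplier capped at `maxMultiplier` (default 8).
function effectiveIntervalMs(
  baseMs: number,
  failures: number,
  maxMultiplier = 8,
): number {
  return baseMs * Math.min(2 ** failures, maxMultiplier);
}

// With a 30s base interval:
// 0 failures -> 30_000, 1 -> 60_000, 2 -> 120_000, 3+ -> 240_000 (capped)
```

Keeping this as a pure function also makes the backoff math trivially unit-testable, independent of timers.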
Example 2: Multiple Independent Runners
```typescript
const scheduler = new AdaptiveScheduler();

// Weather updates - critical, frequent
scheduler.scheduleRefresh(
  'weather',
  async () => {
    const res = await fetch('/api/weather');
    return res.ok; // boolean indicates success/failure
  },
  60_000 // 1 minute
);

// News feed - less critical, slower
scheduler.scheduleRefresh(
  'news',
  async () => {
    const res = await fetch('/api/news');
    if (!res.ok) return false;
    const articles = await res.json();
    updateNewsFeed(articles);
    return true;
  },
  300_000 // 5 minutes
);

// Each runner has independent backoff state:
// weather failures don't affect news polling, and vice versa
```
Example 3: Conditional Execution
```typescript
scheduler.scheduleRefresh(
  'user-notifications',
  async () => {
    const res = await fetch('/api/notifications');
    if (!res.ok) return false;
    const notifications = await res.json();
    displayNotifications(notifications);
    return true;
  },
  60_000,
  // Only poll if the user is logged in
  () => isUserLoggedIn()
);
```
Example 4: Void Return (Implicit Success)
```typescript
scheduler.scheduleRefresh(
  'analytics-heartbeat',
  async () => {
    // No explicit return = void = success assumed
    await fetch('/api/analytics/heartbeat', { method: 'POST' });
    // If this throws, it's treated as failure;
    // if it completes, it's a success
  },
  120_000 // 2 minutes
);
```
Example 5: Manual Trigger
```typescript
// Schedule the background refresh
scheduler.scheduleRefresh('dashboard-data', fetchDashboard, 60_000);

// User clicks the refresh button - trigger an immediate run
document.getElementById('refresh-btn')?.addEventListener('click', () => {
  scheduler.trigger('dashboard-data');
});
```
Best Practices
Choose Appropriate Base Intervals
- High-frequency data (stock prices, live scores): 10-30 seconds
- Medium-frequency data (weather, news): 1-5 minutes
- Low-frequency data (configuration, settings): 10-30 minutes
Set Reasonable Backoff Limits
- Max multiplier 4-8x: Prevents indefinite delays while allowing sufficient recovery time
- Too low (2x): May overwhelm failing services
- Too high (16x+): Data may become too stale during recovery
Use Explicit Boolean Returns When Possible
```typescript
// Good: explicit failure signaling
async () => {
  const res = await fetch('/api/data');
  if (res.status === 429) return false; // rate limited - back off
  if (res.status === 503) return false; // service unavailable - back off
  if (res.status === 404) return true;  // not found - but don't back off
  return res.ok;
}

// Less ideal: relying on exceptions
async () => {
  const res = await fetch('/api/data');
  // json() rejects only if the body isn't valid JSON, not on HTTP
  // error statuses - a much less explicit failure signal
  await res.json();
}
```
Handle Stale Data on Resume
When the page becomes visible after being hidden, stale refreshes are flushed with staggering. Ensure your data handlers can cope with rapid updates:
```typescript
// Debounce or batch UI updates
let updateTimer: number | null = null;

function updateUI(data: any) {
  if (updateTimer) clearTimeout(updateTimer);
  updateTimer = setTimeout(() => {
    renderData(data); // actual DOM update
  }, 100);
}
```
Monitor Backoff State
Expose backoff state for debugging and monitoring. (The `getBackoffState` and `resetBackoff` accessors used below are not in the template above; they are straightforward additions that read or reset a runner's entry.)
```typescript
// Check the current backoff status
const state = scheduler.getBackoffState('api-poller');
console.log(`Failures: ${state.failures}, Multiplier: ${state.multiplier}x`);

// Manual reset if needed (e.g., after the user fixes credentials)
scheduler.resetBackoff('api-poller');
```
Clean Up on Component Unmount
```typescript
// React example
useEffect(() => {
  const scheduler = new AdaptiveScheduler();
  scheduler.scheduleRefresh('data', fetchData, 30_000);
  return () => scheduler.destroy(); // clear all timers
}, []);
```
When to Use This Pattern
Ideal For:
- Polling external APIs that may experience intermittent failures or rate limits
- Dashboard/monitoring UIs that need to stay fresh but must handle service outages gracefully
- Real-time-ish data where strict real-time isn't required (use WebSockets for true real-time)
- Multiple data sources with different refresh rates and reliability profiles
Not Ideal For:
- True real-time requirements: Use WebSockets, Server-Sent Events, or long polling instead
- One-time operations: Use direct async calls, not scheduled polling
- Critical, must-not-miss updates: Add push notifications or webhooks as a complement
- High-frequency sub-second polling: Consider WebSocket or EventSource
Common Pitfalls to Avoid
Thundering Herd on Resume
Problem: All runners trigger simultaneously when page becomes visible.
Solution: Use staggered flush (already implemented in the pattern):
```typescript
let stagger = 0;
for (const [name, entry] of this.runners) {
  setTimeout(() => this.runRefresh(name), stagger);
  stagger += 150; // stagger by 150ms
}
```
Backoff Not Resetting
Problem: Forgetting to reset backoff on success keeps the service in slow-poll mode forever.
Solution: Always check and reset state on success:
```typescript
if (success && entry.currentBackoffMultiplier > 1) {
  entry.currentBackoffMultiplier = 1;
  entry.consecutiveFailures = 0;
  reschedule(); // take effect immediately
}
```
Overlapping Executions
Problem: Long-running callbacks overlap with the next scheduled run.
Solution: Use in-flight tracking (already implemented):
```typescript
if (this.inFlight.has(name)) return; // skip if already running
this.inFlight.add(name);
try {
  await callback();
} finally {
  this.inFlight.delete(name);
}
```
Ignoring Visibility State
Problem: Polling continues in hidden tabs, wasting resources and battery.
Solution: Always implement visibility-aware pausing (already implemented):
```typescript
if (document.hidden) {
  clearAllTimers();
} else {
  restartTimers();
}
```
Related Patterns
- Circuit Breaker: After N consecutive failures, stop polling entirely until manual reset or timeout
- Jittered Backoff: Add randomization to backoff delays to prevent synchronized retries across clients
- Adaptive Intervals: Adjust base interval based on data change frequency (not just failures)
- WebSocket with Fallback: Use WebSocket for real-time, fall back to smart polling on connection loss
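For the jittered-backoff variant, one simple scheme draws each delay uniformly between the base interval and the fully backed-off interval, so many clients desynchronize while still honoring the backoff cap. A minimal sketch (the `jitteredDelayMs` name and injectable `rng` parameter are illustrative, not part of the scheduler above):

```typescript
// Jittered delay: uniform random in [baseMs, baseMs * multiplier).
// Injecting the RNG keeps the function deterministic in tests.
function jitteredDelayMs(
  baseMs: number,
  multiplier: number,
  rng: () => number = Math.random,
): number {
  const maxMs = baseMs * multiplier;
  return baseMs + Math.floor(rng() * (maxMs - baseMs));
}
```

To wire this in, `scheduleNextRun` would pass the jittered value to `setInterval` instead of the plain `intervalMs * currentBackoffMultiplier` product.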
Performance Considerations
Memory Usage
Each runner entry stores:
- Timer reference: ~8 bytes
- Function reference: ~8 bytes
- State integers: ~16 bytes
- Total: ~32-64 bytes per runner
For 100 concurrent runners: ~6 KB overhead (negligible).
CPU Usage
- Idle state (no failures): Minimal - just timer callbacks
- During backoff: Reduced CPU usage due to longer intervals
- On resume: Brief spike from staggered flush, then normal
Network Usage
- Normal operation: Consistent request rate
- During failures: Exponentially reduced request rate (desired behavior)
- On recovery: Immediate return to normal rate
Testing Strategies
Unit Tests
```typescript
describe('AdaptiveScheduler', () => {
  it('should double the interval on consecutive failures', async () => {
    const scheduler = new AdaptiveScheduler();
    let callCount = 0;
    scheduler.scheduleRefresh('test', async () => {
      callCount++;
      return false; // always fail
    }, 1000);

    await sleep(1000);
    expect(callCount).toBe(1);
    await sleep(2000); // doubled interval
    expect(callCount).toBe(2);
    await sleep(4000); // doubled again
    expect(callCount).toBe(3);
  });

  it('should reset backoff on success', async () => {
    const scheduler = new AdaptiveScheduler();
    let shouldFail = true;
    scheduler.scheduleRefresh('test', async () => {
      return !shouldFail;
    }, 1000);

    // ... cause failures ...
    shouldFail = false;
    scheduler.trigger('test'); // manual trigger

    const state = scheduler.getBackoffState('test');
    expect(state.multiplier).toBe(1);
    expect(state.failures).toBe(0);
  });
});
```
Integration Tests
Mock `fetch` to simulate service failures and recoveries; verify backoff timing and state transitions.
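State transitions can also be verified without timers or mocks by modeling the backoff rules as a pure reducer and feeding it a scripted success/failure sequence. This is a standalone model of the state machine, not the scheduler's actual internals (`nextState`, `BackoffState`, and `trace` are illustrative names):

```typescript
// Minimal model of the backoff state machine: success resets,
// failure doubles the multiplier up to the cap.
interface BackoffState {
  failures: number;
  multiplier: number;
}

function nextState(
  s: BackoffState,
  success: boolean,
  maxMultiplier = 8,
): BackoffState {
  if (success) return { failures: 0, multiplier: 1 };
  const failures = s.failures + 1;
  return { failures, multiplier: Math.min(2 ** failures, maxMultiplier) };
}

// Script: two failures, then a recovery
let s: BackoffState = { failures: 0, multiplier: 1 };
const trace: number[] = [];
for (const success of [false, false, true]) {
  s = nextState(s, success);
  trace.push(s.multiplier);
}
// trace is [2, 4, 1]: backoff doubles twice, then recovery resets to 1x
```

Deterministic tests against this model catch off-by-one errors in the doubling and reset logic before any timer-based integration testing.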
Manual Testing
Use browser DevTools to throttle network, simulate hidden/visible transitions, and observe polling behavior in the Network tab.
References
- Source: worldmonitor/src/services/runtime.ts - `startSmartPollLoop` function
- Implementation: my-daily-monitor/src/services/refresh-scheduler.ts
- AWS Architecture Blog: Exponential Backoff and Jitter
- Google SRE Book: Handling Overload