Edge compute orchestration: cold-start latency vs pre-warming trade-offs
Running a fleet of edge functions across 4 regions (EU-West, US-East, APAC, SA-East) with varying cold-start profiles. We're seeing 800ms-2.5s cold starts on V8 isolates, which is acceptable for async workloads but kills UX for synchronous API paths.

Pre-warming strategy options we're evaluating:
- Ping-based keepalive (cheap but wastes compute during quiet hours)
- Traffic-predictive pre-warming using historical patterns (complex, needs a scheduler)
- Hybrid: keep 1 instance warm per region, scale on demand

Current metrics:
- Warm invocation: ~45ms p95
- Cold invocation: ~1.2s p95 (EU), ~1.8s p95 (APAC)
- Monthly compute budget: tight, pre-warming 24/7 would consume ~40% of budget

How are others balancing cold-start SLAs against cost? Specifically interested in approaches that don't require a dedicated prediction service, i.e. something that can run as a sidecar or cron job.
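
To make "cron-sized" concrete, here is a rough sketch of the sort of thing I have in mind: a keepalive that is gated by an hour-of-week histogram built from historical request timestamps, so pings only go out during slots that are usually busy. Everything here is an assumption for illustration, not something we run today: the function names, the `min_requests` threshold, and especially the idea that pre-warming cost scales linearly with warm hours (so the ~40% figure can be prorated). The actual ping call is left out on purpose.

```python
from collections import defaultdict
from datetime import datetime, timezone

# Assumption: 24/7 pre-warming costs ~40% of the monthly budget (figure from
# the post) and the cost scales linearly with the number of warm hours.
FULL_TIME_BUDGET_SHARE = 0.40
HOURS_PER_WEEK = 7 * 24


def hour_of_week(ts: datetime) -> int:
    """Map a timezone-aware timestamp to a slot in 0..167 (Monday 00:00 UTC is 0)."""
    ts = ts.astimezone(timezone.utc)
    return ts.weekday() * 24 + ts.hour


def build_profile(request_times):
    """Count historical requests per hour-of-week slot."""
    profile = defaultdict(int)
    for ts in request_times:
        profile[hour_of_week(ts)] += 1
    return profile


def warm_slots(profile, min_requests: int):
    """Slots whose historical request count reaches the (hypothetical) threshold."""
    return {slot for slot, count in profile.items() if count >= min_requests}


def should_prewarm(now: datetime, slots) -> bool:
    """Cron entry point: send a keepalive ping only if the current slot is usually busy."""
    return hour_of_week(now) in slots


def estimated_budget_share(slots) -> float:
    """Prorate the 24/7 cost by the fraction of the week kept warm."""
    return FULL_TIME_BUDGET_SHARE * len(slots) / HOURS_PER_WEEK
```

The idea would be to run `should_prewarm` from a cron job every few minutes per region, rebuilding the profile from logs once a day or so. The hybrid option maps onto this by always pinging one instance per region regardless of the slot and using the profile only for anything beyond that. `estimated_budget_share` is just a sanity check on how much of the budget a given threshold would eat under the linear-cost assumption.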