Memory-mapped files vs Redis for sub-millisecond lookups in Python
We're running a feature-flag evaluation service that needs <1ms P99 latency for ~50K flag keys. We're currently on Redis (cached, but there's still a network hop). Two alternatives are on the table:

1. Memory-mapped file (mmap) with a binary index: flags baked into a read-only .dat file, swapped on deploy. Zero network, page-fault latency only.
2. Redis Cluster with pipelining + local LRU cache: keeps hot flags in-process, falls back to the cluster on a miss.

Constraints: flags change ~5x/day via the CI pipeline. The workload is read-heavy (99.9% evals, 0.1% writes). The service runs on Kubernetes with 3 replicas.

Questions:

- Has anyone shipped mmap-backed config in production?
- What's the real page-fault story on the first cold read after a rolling deploy?
- How do you handle the atomicity of swapping the .dat file without a brief window of corrupt reads?

Not looking for theoretical answers. I'm curious about actual P99 numbers and war stories.
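To make option 1 concrete, here is a minimal sketch of the kind of thing we have in mind, not a tested production design. Everything specific in it is an assumption for illustration: the file layout (magic bytes, record count, then sorted fixed-width records), the names `write_flags`, `FlagFile`, and `key_hash`, boolean-only flag values, and treating a 64-bit hash as collision-free for ~50K keys. The writer builds a complete temp file and renames it over the target; the reader binary-searches the mapping.

```python
from __future__ import annotations

import hashlib
import mmap
import os
import struct
import tempfile

# Hypothetical layout: 4-byte magic, 4-byte record count, then records of
# (8-byte key hash, 1-byte flag value) sorted by hash. Little-endian, unpadded.
MAGIC = b"FLG1"
HEADER = struct.Struct("<4sI")
RECORD = struct.Struct("<QB")


def key_hash(key: str) -> int:
    """Stable 64-bit hash of a flag key (built-in hash() is salted per process)."""
    digest = hashlib.blake2b(key.encode("utf-8"), digest_size=8).digest()
    return int.from_bytes(digest, "little")


def write_flags(path: str, flags: dict[str, bool]) -> None:
    """Write a complete file beside `path`, then rename it into place."""
    records = sorted((key_hash(k), int(v)) for k, v in flags.items())
    directory = os.path.dirname(os.path.abspath(path))
    fd, tmp = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(HEADER.pack(MAGIC, len(records)))
            for h, v in records:
                f.write(RECORD.pack(h, v))
            f.flush()
            os.fsync(f.fileno())
        # Rename is atomic on POSIX when source and target share a filesystem.
        os.replace(tmp, path)
    except BaseException:
        if os.path.exists(tmp):
            os.unlink(tmp)
        raise


class FlagFile:
    """Read-only view over one version of the flag file."""

    def __init__(self, path: str) -> None:
        with open(path, "rb") as f:
            # The mapping stays valid after the file object is closed.
            self._mm = mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ)
        magic, self._count = HEADER.unpack_from(self._mm, 0)
        if magic != MAGIC:
            raise ValueError("unexpected file format")

    def get(self, key: str, default: bool = False) -> bool:
        """Binary search over the sorted records; returns `default` on a miss."""
        target = key_hash(key)
        lo, hi = 0, self._count
        while lo < hi:
            mid = (lo + hi) // 2
            h, v = RECORD.unpack_from(self._mm, HEADER.size + mid * RECORD.size)
            if h == target:
                return bool(v)
            if h < target:
                lo = mid + 1
            else:
                hi = mid
        return default

    def close(self) -> None:
        self._mm.close()
```

On Linux, a reader that already has the old file mapped keeps seeing the old inode after the rename, so with this shape the swap question becomes when each replica reopens the file rather than whether a single lookup can see a half-written one. What the sketch does not answer is the cold-read cost of the first page faults after a deploy, which is the part we'd like real numbers on.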