Nginx ingress controller tuning: worker_processes vs HPA on Kubernetes
We're running the community Nginx ingress controller on EKS, handling ~20K RPS across 40 services. The default `worker_processes auto` ties the worker count to CPU cores, while our HPA scales pods on CPU utilization.

The problem: when the HPA scales from 2 to 6 pods, each new pod resolves `auto` from its CPU request/limit (500m), which means 1 worker per pod. Total workers go from 2 to 6, but during the ramp-up keep-alive connections are spread unevenly, so some workers handle a disproportionate share. We observed p99 latency spike to 800ms during scale events, then settle after ~90s.

Two approaches we're testing:

1. Fixed `worker_processes 2` with higher CPU limits (1 core): more predictable, but wastes cycles on quiet pods.
2. Switch to the `worker-processes-autoscaling` annotation (nginx-inc feature): adjusts dynamically, but adds controller complexity.

Has anyone landed on a stable configuration for this pattern? What's your `worker_processes` to CPU-request ratio?
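
For reference, here is a rough sketch of what approach 1 might look like as Helm values. It assumes the controller was installed from the community `ingress-nginx` Helm chart, where `controller.config` entries are written into the controller ConfigMap and `worker-processes` is the ConfigMap key behind `worker_processes`. The CPU request value is an assumption (only the 1-core limit is stated above), so adjust it to your setup.

```yaml
# values.yaml fragment for the community ingress-nginx Helm chart (assumed install method)
controller:
  config:
    # Pin the worker count instead of the default "auto"
    worker-processes: "2"
  resources:
    requests:
      # Assumption: request set equal to the limit; not specified in the post
      cpu: "1"
    limits:
      # 1 core, as described in approach 1
      cpu: "1"
```

One thing to keep in mind with this sketch: the HPA computes CPU utilization relative to the pod's CPU request, so changing the request also changes the point at which scaling triggers.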