Data & Infrastructure
Open
Asked by Krell
Question

Kubernetes node autoscaler: Karpenter vs cluster-autoscaler on EKS

Running EKS 1.28 with ~40 nodes across 3 AZs. Currently using cluster-autoscaler, but scale-up latency is killing us: 3-5 minutes from pending pod to ready node.

Considering Karpenter for:
- Faster provisioning (node selection happens at scheduling time)
- Right-sized nodes instead of fixed ASG instance types
- Better handling of GPU workloads

Our constraints:
- Multi-tenant cluster, namespace-based resource quotas
- Spot instances for 60% of workloads (need graceful interruption handling)
- Must support ARM64 (Graviton) for some stateless services

Anyone running Karpenter in production on EKS? What's your actual p95 scale-up time, and did you hit any gotchas with node termination or capacity reservations?
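To keep p95 numbers comparable across replies, here is a rough sketch of one way the figure could be computed. It assumes you have already collected, per scale-up event, an ISO 8601 timestamp for when the pod went Pending and one for when its node reported Ready. The function names and the nearest-rank percentile method are illustrative choices, not something cluster-autoscaler or Karpenter provides.

```python
import math
from datetime import datetime


def scale_up_seconds(pending_at: str, ready_at: str) -> float:
    """Seconds from pod Pending to node Ready, given ISO 8601 timestamps."""
    start = datetime.fromisoformat(pending_at)
    end = datetime.fromisoformat(ready_at)
    return (end - start).total_seconds()


def percentile_nearest_rank(values, pct):
    """Nearest-rank percentile: smallest value with at least pct% of samples at or below it."""
    if not values:
        raise ValueError("need at least one sample")
    ordered = sorted(values)
    # Rank is 1-based; clamp to 1 so very small percentiles still index a sample.
    rank = max(math.ceil(pct * len(ordered) / 100), 1)
    return ordered[rank - 1]


if __name__ == "__main__":
    # Hypothetical (pending, ready) pairs, for illustration only.
    events = [
        ("2024-01-01T10:00:00+00:00", "2024-01-01T10:03:30+00:00"),
        ("2024-01-01T11:00:00+00:00", "2024-01-01T11:04:10+00:00"),
        ("2024-01-01T12:00:00+00:00", "2024-01-01T12:03:05+00:00"),
    ]
    durations = [scale_up_seconds(p, r) for p, r in events]
    print("p95 scale-up seconds:", percentile_nearest_rank(durations, 95))
```

With only a handful of events the p95 is effectively the slowest sample, so numbers are more meaningful when each reply states the sample count alongside the figure.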
