Kubernetes namespace quotas vs resource limits — what works at scale
We're running a 12-node cluster with 40+ namespaces. We've set ResourceQuotas on each namespace, but the team keeps hitting confusing errors: pods get OOMKilled even though namespace-level memory is only 60% utilized. From what we can tell, the per-container LimitRange defaults don't account for init containers the way we expected, and quota usage looks double-counted once sidecar injection (Istio) is involved.

How do you structure quotas in a multi-tenant cluster? Per-namespace quotas with team-owned LimitRanges? Or a central policy (OPA/Gatekeeper) that enforces limits dynamically? We're leaning toward Gatekeeper for enforcement plus individual namespace budgets for visibility. Would love to hear what's actually working in production for teams past 30 namespaces.
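For reference, here is a rough sketch of how we understand the accounting, in case it helps pin down where our mental model is off. As far as we know, a pod's effective memory limit is the larger of (a) the sum of its regular containers, injected sidecars included, and (b) the largest single init container, and the namespace quota is charged with that declared figure rather than with actual usage. OOMKilled, on the other hand, fires when a single container exceeds its own limit, so it can happen at any quota percentage. The container names and sizes below are made up for illustration, and the sketch ignores pod overhead and restartable (native sidecar) init containers.

```python
from dataclasses import dataclass, field
from typing import List

MI = 1024 * 1024  # bytes in one mebibyte


@dataclass
class Container:
    name: str
    memory_limit: int  # declared limit in bytes (hypothetical values below)


@dataclass
class Pod:
    containers: List[Container]
    init_containers: List[Container] = field(default_factory=list)


def effective_memory_limit(pod: Pod) -> int:
    """Effective pod limit: max(sum of app containers, largest init container).

    Init containers run one at a time before the app containers start,
    so only the largest one matters, not their sum.
    """
    app_total = sum(c.memory_limit for c in pod.containers)
    init_peak = max((c.memory_limit for c in pod.init_containers), default=0)
    return max(app_total, init_peak)


def quota_charged(pods: List[Pod]) -> int:
    """Memory charged against a limits.memory quota: declared, not actual, usage."""
    return sum(effective_memory_limit(p) for p in pods)


# Hypothetical pod after sidecar injection: one app container, one injected
# proxy, one injected init container, and one heavy init job of our own.
pod = Pod(
    containers=[Container("app", 256 * MI), Container("istio-proxy", 128 * MI)],
    init_containers=[Container("istio-init", 64 * MI), Container("migrate", 512 * MI)],
)

print(effective_memory_limit(pod) // MI)  # 512: the init container dominates
```

If this is right, what looked like double-counting to us may just be the sidecar's own limit being added to the app container's, and a heavy init container can push the charged figure above anything the app containers declare.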