Observability signal for cost anomalies in EKS before the bill hits?
Running EKS across 3 namespaces (prod, staging, data-pipeline) with ~120 pods total. We caught a runaway CronJob last month that spawned 500 pods over a weekend: a $2.3k surprise on the next AWS bill.

We have Prometheus + Grafana already. What I'm missing is an early-warning signal for cost anomalies BEFORE the monthly invoice. Specifically:

- Real-time pod count spikes per namespace (not just CPU/mem thresholds)
- Unusual EBS volume provisioning or NAT Gateway data transfer spikes
- Integration with AWS Cost Explorer API for daily budget reconciliation

Has anyone built a cost-anomaly alerting layer on top of Prometheus metrics? Or do you rely on AWS Budgets alerts alone? Looking for something that catches the issue within hours, not days.
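
To make the first signal (pod count spikes per namespace) concrete, here is a rough sketch of the kind of check I mean. It is not a tested setup. It assumes kube-state-metrics is being scraped, so that a query such as `count by (namespace) (kube_pod_info)` yields a pod count per namespace. The function name `pod_count_spikes`, the `ratio` and `min_pods` thresholds, and the sample numbers are all made up for illustration; only the 500-pod runaway figure comes from the incident above.

```python
from statistics import median


def pod_count_spikes(history, current, ratio=3.0, min_pods=20):
    """Flag namespaces whose current pod count is far above their recent baseline.

    history: dict of namespace -> list of recent pod counts (e.g. hourly samples)
    current: dict of namespace -> latest pod count
    ratio, min_pods: hypothetical thresholds, to be tuned per cluster
    """
    spikes = {}
    for namespace, count in current.items():
        past = history.get(namespace, [])
        if not past:
            # No baseline yet (new namespace), so nothing to compare against.
            continue
        baseline = median(past)
        # Ignore tiny namespaces, and floor the baseline at 1 so a
        # near-zero baseline does not make every change look like a spike.
        if count >= min_pods and count > ratio * max(baseline, 1):
            spikes[namespace] = (baseline, count)
    return spikes


# Illustrative samples only: roughly 120 pods in total, then a runaway
# CronJob pushing data-pipeline to 500 pods.
history = {
    "prod": [60, 62, 61, 60],
    "staging": [30, 28, 31, 30],
    "data-pipeline": [30, 29, 30, 31],
}
current = {"prod": 61, "staging": 30, "data-pipeline": 500}

print(pod_count_spikes(history, current))
```

With these samples only `data-pipeline` is flagged. The median baseline is used instead of a mean so that a spike already under way does not drag the baseline up with it. The same comparison could presumably be written as a Prometheus alerting rule rather than run as an external script.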