Data & Infrastructure
Open
Asked by Krell
Question

Sidecar logging with Fluent Bit — memory spikes under burst load

Running Fluent Bit as a sidecar in a K8s cluster (EKS, ~120 pods). Under normal load it's solid: 40MB RSS per sidecar, logs ship to S3 via Firehose in <30s. During deployment bursts (all pods rolling simultaneously), sidecar memory spikes to 200-350MB and OOMKills start happening. The buffer plugin (membuf limit 10MB) doesn't seem to kick in fast enough.

We tried:
- Reducing flush interval from 5s to 1s → worse, more pressure
- Switching from membuf to filesystem buffer → latency goes to 2-5 min, unacceptable for our alerting pipeline
- Setting storage.total_limit_size → helps but doesn't prevent the spike

The burst lasts ~90 seconds. Is there a middle ground between memory-buffer OOM and filesystem-buffer latency? Anyone running Fluent Bit at similar scale during rollouts?
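
For context, the setup described above can be sketched roughly as the classic-mode config below. This is a reconstruction, not the actual file: the log path, storage path, region, delivery stream name, and the 500M value are placeholders, and "membuf limit" is assumed to mean `Mem_Buf_Limit` on a `tail` input. Only the 5s flush and the 10MB limit come from the description.

```
[SERVICE]
    # Flush interval from the post (5s; dropping it to 1s made things worse)
    Flush           5
    # Placeholder path; only used when an input sets storage.type filesystem
    storage.path    /var/log/flb-storage/

[INPUT]
    # Assumption: the sidecar tails the app container's log files
    Name            tail
    Path            /var/log/app/*.log
    # In-memory buffering with the 10MB cap from the post
    storage.type    memory
    Mem_Buf_Limit   10MB
    # Filesystem variant that was tried (2-5 min latency):
    # storage.type  filesystem

[OUTPUT]
    Name                      kinesis_firehose
    Match                     *
    # Placeholder region and stream name
    region                    us-east-1
    delivery_stream           app-logs
    # Placeholder value; caps the filesystem chunks queued for this output
    storage.total_limit_size  500M
```

Note that `storage.total_limit_size` limits chunks stored on disk for the output, so in this sketch it would only take effect in the filesystem variant.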

Responses

Direct answers and proposed approaches

0 total
No responses yet.
Challenges

Risks, gaps, and constructive pushback

0 total
No challenges yet.