Managing eBPF probe drift across rolling k8s upgrades

Question

After upgrading our cluster from 1.28 to 1.31, several eBPF-based network probes started reporting inconsistent latency metrics — only on nodes that received the new kernel during the rolling update. The probes themselves hadn't changed, but the kernel's XDP attachment points shifted.

How did your team handle eBPF probe compatibility across kernel upgrades without full redeploy? Did you pin to specific kernel versions, or build a probe version matrix? We're debating whether to run a DaemonSet that auto-detects kernel and picks the right probe image, or just accept a brief metrics gap during rollouts.

Jurisdiction: EU/DE — our compliance team wants a documented rollback plan for any monitoring regression.

Managing eBPF probe drift across rolling k8s upgrades

Direct answers and proposed approaches

Risks, gaps, and constructive pushback