How eBPF Transformed Kubernetes Networking

You might have seen that while managing a Kubernetes cluster, things seem fine... and then, out of nowhere, you hit that kube-proxy wall. It quietly does its thing, managing all those network rules via iptables in the background. And it works... until the traffic starts picking up or the cluster gets busy. Suddenly, CPU spikes on your proxy pods can bring things to a halt, and tail latencies can creep up, especially during application rollouts. It’s often a silent killer, only revealing itself when users start noticing slowdowns.

Recently, I wanted to see what happens when we swap out the traditional kube-proxy for Cilium, specifically using its eBPF dataplane. I was genuinely surprised by the results, and I want to share them here.

Dealing with iptables

Picture a fairly busy cluster running steady traffic across a bunch of microservices. On each node, kube-proxy was running diligently, but under the surface it was managing thousands upon thousands of iptables rules.

Just checking how many rules were actually generated by the Kubernetes networking components (like Services and Endpoints) on one node gave us a glimpse:

$ sudo iptables-save | grep KUBE- | wc -l
28471

That's 28,471 lines! Okay, not every single one is a rule that every packet hits, but it's a rough measure of what kube-proxy (not the kubelet) generated on this node for Services and their endpoints. It's a lot.
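To see where those lines come from, it helps to group the rules by chain family. Here is a minimal sketch, assuming the usual kube-proxy chain naming (KUBE-SVC-..., KUBE-SEP-..., and so on); `kube_rule_breakdown` is a name I made up for illustration. Unlike the plain `grep`, it only counts rules appended to KUBE-* chains, so the total will be somewhat lower.

```shell
# Group kube-proxy's iptables rules by chain family (KUBE-SVC, KUBE-SEP, ...).
# Reads `iptables-save` output on stdin and prints "<count> <family>" lines,
# largest family first. Chain declarations (":KUBE-...") are not counted.
kube_rule_breakdown() {
  awk '$1 == "-A" && $2 ~ /^KUBE-/ {
         split($2, part, "-")            # KUBE-SVC-ABC123 -> KUBE, SVC, ABC123
         count[part[1] "-" part[2]]++
       }
       END { for (family in count) print count[family], family }' |
    sort -k1,1nr -k2,2
}
```

On a node, this would be run as `sudo iptables-save | kube_rule_breakdown`. In my experience the KUBE-SEP (per-endpoint) and KUBE-SVC (per-service) families dominate, which is why the count grows with every replica.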

Now, throw in the typical chaos of a CI/CD pipeline – deploying new versions, scaling replicas – and things get ugly. Latency starts to degrade noticeably, node CPUs hit their limits during the churn, and managing all those iptables rules becomes a real pain point.

Seeing the Numbers and the Impact of Churn

To get a clearer picture, I ran a simple load test between a frontend and backend service. I looked at percentiles to understand the distribution of latency:

p50 = 18ms
p95 = 67ms  # This is where things start feeling slow for users
p99 = 142ms

So already, without any major stress, the tail was far above the median: p95 was nearly four times p50, and p99 reached triple-digit milliseconds. This is where users feel it. Even a few rollouts per hour seemed to push the p95 latency up significantly. It looked like the iptables churn during these updates was a major contributor.
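For anyone who wants to reproduce figures like these from raw samples, here is a minimal sketch of a nearest-rank percentile in shell. It assumes one latency value per line and an integer percentile; the `percentile` function and the `latencies.txt` file are hypothetical names, not part of my load-test tooling.

```shell
# Nearest-rank percentile: sort the samples and take the value at
# rank ceil(p/100 * N).
# $1 = integer percentile (e.g. 95); stdin = one latency in ms per line.
percentile() {
  sort -n | awk -v p="$1" '
    { sample[NR] = $1 }
    END {
      if (NR == 0) exit 1                  # no samples, nothing to report
      rank = int((p * NR + 99) / 100)      # integer form of ceil(p * NR / 100)
      if (rank < 1) rank = 1
      print sample[rank]
    }'
}
```

Usage would look like `percentile 95 < latencies.txt`. Nearest-rank is only one of several percentile definitions, so results from a load-testing tool may differ slightly on small sample sets.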

eBPF: A Different Engine for Networking

I decided to try Cilium with its eBPF-based dataplane as a replacement for kube-proxy. The Helm command to install it was straightforward:

helm install cilium cilium/cilium \
  --namespace kube-system \
  --set kubeProxyReplacement=strict \
  --set bpf.masquerade=false \
  --set nodeinit.enabled=true \
  --set operator.replicaCount=2 \
  --set hubble.enabled=true \
  --wait

Let's break down the key flags:

kubeProxyReplacement=strict: Tells Cilium to take over all of kube-proxy's service-handling duties. kube-proxy itself still has to be removed from the cluster, and the KUBE-* iptables rules it left behind cleaned up. Newer Cilium releases spell this setting kubeProxyReplacement=true.
bpf.masquerade=false: Turns off Cilium's eBPF implementation of masquerading (SNAT), so masquerading for traffic leaving the cluster falls back to the iptables-based implementation. Setting it to true moves that work into eBPF as well.
nodeinit.enabled=true: Deploys Cilium's node-init DaemonSet, which prepares each node for Cilium before the agent starts. It is mainly needed on some managed Kubernetes platforms.
hubble.enabled=true: Enables Hubble, the observability layer for monitoring network flows.
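After the install, it is worth confirming that the replacement mode is actually in effect. The sketch below pulls the mode out of the agent's status output. It assumes the status text contains a line of the form `KubeProxyReplacement:   <mode>   [...]`, and `kpr_mode` is a hypothetical helper name.

```shell
# Extract the kube-proxy replacement mode from Cilium agent status output.
# Reads the status text on stdin; exits non-zero if the line is missing.
# Assumed line format: "KubeProxyReplacement:   <mode>   [devices...]"
kpr_mode() {
  awk '$1 == "KubeProxyReplacement:" { print $2; found = 1 }
       END { exit found ? 0 : 1 }'
}
```

On a cluster, I would feed it with `kubectl -n kube-system exec ds/cilium -- cilium status | kpr_mode`. Recent Cilium images ship the in-pod binary as `cilium-dbg`, so the command name may need adjusting.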

What Changed: More Than Just Numbers

The immediate impact was dramatic.

Way Fewer iptables Rules: This was the most striking change. Service endpoints now lived in eBPF hash maps inside the kernel, and eBPF programs attached at hooks such as tc or XDP looked them up directly, at roughly O(1) cost per lookup instead of walking a long rule chain.
```
 $ sudo iptables-save | grep KUBE- | wc -l
 3  # Or maybe even less, depending on the exact state
```
Seriously, just a handful of rules remained – mostly for the initial setup or specific edge cases not handled by eBPF. The complexity was gone.
Tail Latency Improves Tangibly: This is what ultimately matters to users.
```
 # Before eBPF
 p50 = 18ms
 p95 = 67ms
 p99 = 142ms

 # After eBPF
 p50 = 17ms
 p95 = 32ms  # ~52% improvement!
 p99 = 58ms
```
The drop in the 95th percentile latency was particularly impressive, while the median barely moved. Even during autoscaling events, when backend pods were being rolled out, network latency stayed much lower and more predictable. The traffic just flowed more smoothly.
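The improvement percentages are just the relative reduction between the before and after figures. A quick sketch of that arithmetic (the `improvement` helper is my own naming):

```shell
# Relative reduction from `before` to `after`, as a percentage with one decimal.
improvement() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.1f\n", (before - after) / before * 100 }'
}

improvement 67 32     # p95: prints 52.2
improvement 142 58    # p99: prints 59.2
improvement 18 17     # p50: prints 5.6
```

So the gain is concentrated in the tail: p99 improved even more than p95 in relative terms, while p50 changed by only a few percent.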

A Real-World Test: Rolling Out Without Breaking a Sweat

To put it to the test, I mimicked a real CI/CD deployment: scale the backend deployment, immediately update its image, and watch the rollout:

kubectl scale deploy/backend --replicas=6
kubectl set image deploy/backend backend=backend:v2
kubectl rollout status deploy/backend -w

Before eBPF: The endpoints fluctuated wildly during the rollout, which often coincided with a spike in latency. It felt like the network was struggling to keep up. Engineers often had to intervene or wait it out.
After eBPF: This was a night-and-day difference. The endpoint updates completed almost instantly (sub-second). The latency stayed consistently low throughout the process. No CPU spikes related to kube-proxy were observed. cilium monitor and Hubble flows became our new debugging tools – fast, reliable, and easy to understand. Goodbye, iptables debugging nightmares!
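To watch a rollout like this yourself, two small helpers are enough. This is a sketch under assumptions: the Service is named `backend`, Cilium runs as the `cilium` DaemonSet in `kube-system`, and the `hubble` binary is available inside the agent pod. The function names and the `KUBECTL` override are my own conventions.

```shell
# KUBECTL can be overridden (e.g. KUBECTL=echo) to print the commands
# instead of running them against a cluster.
KUBECTL="${KUBECTL:-kubectl}"

# Watch the EndpointSlices behind a Service while a rollout is in progress.
# $1 = Service name
watch_endpoints() {
  "$KUBECTL" get endpointslices -l "kubernetes.io/service-name=$1" --watch
}

# Show the most recent dropped flows in a namespace, using the hubble
# binary inside the Cilium agent pod (assumed to be present in the image).
# $1 = namespace
recent_drops() {
  "$KUBECTL" exec -n kube-system ds/cilium -- \
    hubble observe --namespace "$1" --verdict DROPPED --last 20
}
```

Running `watch_endpoints backend` in one terminal while the rollout proceeds in another shows how quickly endpoints appear and disappear, and `recent_drops default` gives a quick check that nothing was dropped along the way.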

Conclusion

Switching to Cilium with its eBPF dataplane can be a game-changer for kube-proxy performance issues. The reduction in iptables complexity was massive, leading to significant CPU savings and dramatically improved application latency, especially noticeable during periods of churn like deployments and scaling. The observability provided by Hubble was also a huge bonus. It wasn't just a theoretical improvement; it was a tangible transformation in how our cluster handles network traffic. It felt like swapping out a clunky, inefficient engine for a smooth, powerful, and modern one.