This post was co-authored by Qi Ke, Corporate Vice President, Azure Kubernetes Service.
Today, we are thrilled to announce the general availability of Azure CNI Overlay. This is a big step forward in addressing networking performance and the scaling needs of our customers.
As cloud-native workloads continue to grow, customers are constantly pushing the scale and performance boundaries of our existing networking solutions in Azure Kubernetes Service (AKS). For instance, the traditional Azure Container Networking Interface (CNI) approaches require planning IP addresses in advance, which can lead to IP address exhaustion as demand grows. In response, we have developed a new networking solution called “Azure CNI Overlay”.
In this blog post, we discuss why we needed to create a new solution, the scale it achieves, and how its performance compares to the existing solutions in AKS.
Solving for performance and scale
In AKS, customers have several network plugin options to choose from when creating a cluster. However, each of these options has its own challenges when it comes to large-scale clusters.
The “kubenet” plugin, an existing overlay network solution, is built on Azure route tables and the bridge plugin. Because kubenet (or host IPAM) relies on route tables for cross-node communication, it was designed for no more than 400 nodes, or 200 nodes in dual-stack clusters.
The Azure CNI VNET option provides IPs from the virtual network (VNET) address space. This can be difficult to implement because it requires a large, unique, and consecutive Classless Inter-Domain Routing (CIDR) space, and customers may not have enough available IPs to assign to a cluster.
Bring your Own Container Network Interface (BYOCNI) brings a lot of flexibility to AKS. Customers can use encapsulation, such as Virtual Extensible Local Area Network (VXLAN), to create an overlay network as well. However, the additional encapsulation increases latency and instability as the cluster size increases.
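To make the IP planning burden of the VNET-based option concrete, here is a rough back-of-the-envelope sketch. It assumes a simplified model in which every node needs one VNET IP and also reserves one VNET IP per potential pod; the node count, the max-pods value, and the function names are hypothetical, and platform-reserved subnet addresses are ignored.

```python
def smallest_prefix(required_ips: int) -> int:
    """Longest IPv4 prefix length whose block holds at least `required_ips` addresses."""
    if required_ips < 1:
        raise ValueError("required_ips must be positive")
    # Number of host bits needed to count `required_ips` distinct addresses.
    host_bits = (required_ips - 1).bit_length()
    return 32 - host_bits


def vnet_ips_flat(nodes: int, max_pods_per_node: int) -> int:
    """VNET IPs consumed when nodes and pods all draw from the VNET (assumed model)."""
    return nodes * (1 + max_pods_per_node)


def vnet_ips_overlay(nodes: int) -> int:
    """VNET IPs consumed when only nodes draw from the VNET and pods use an overlay CIDR."""
    return nodes


# Hypothetical cluster size, chosen only for illustration.
NODES = 1000
MAX_PODS = 30

flat = vnet_ips_flat(NODES, MAX_PODS)
overlay = vnet_ips_overlay(NODES)

print(f"flat VNET model:  {flat} IPs, needs at least a /{smallest_prefix(flat)}")
print(f"overlay model:    {overlay} IPs, needs at least a /{smallest_prefix(overlay)}")
```

Under these assumptions the flat model needs a contiguous block roughly 30 times larger than the overlay model, which is the planning problem described above.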
To address these challenges, and to support customers who want to run large clusters with many nodes and pods without running into performance, scale, or IP exhaustion limits, we have introduced a new solution: Azure CNI Overlay.
Azure CNI Overlay
Azure CNI Overlay assigns IP addresses from a user-defined overlay address space instead of using IP addresses from the VNET. It uses the routing of these private address spaces as a native virtual network feature. This means that cluster nodes do not need to perform any extra encapsulation to make the overlay container network work. It also allows the same overlay address space to be reused for different AKS clusters, even when they are connected via the same VNET.
When a node joins the AKS cluster, we assign it a /24 IP address block (256 IPs) from the Pod CIDR. Azure CNI assigns IPs to Pods on that node from the block, and under the hood, the VNET maintains a mapping of the Pod CIDR block to the node. This way, when Pod traffic leaves the node, the VNET platform knows where to send the traffic. This allows the Pod overlay network to achieve the same performance as native VNET traffic and paves the way to support hundreds of thousands of pods across thousands of nodes.
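The per-node block assignment and the block-to-node mapping can be sketched as a toy model. This is only an illustration of the idea, not the actual Azure implementation: the `OverlayAllocator` class, the node names, and the `10.244.0.0/16` Pod CIDR are assumptions made for the example.

```python
import ipaddress


class OverlayAllocator:
    """Toy model: hand each node a fixed-size block from the Pod CIDR and remember the mapping."""

    def __init__(self, pod_cidr: str, block_prefix: int = 24):
        self.pod_cidr = ipaddress.ip_network(pod_cidr)
        self.block_prefix = block_prefix
        # Lazily yields every /24 (by default) contained in the Pod CIDR, in order.
        self._free_blocks = self.pod_cidr.subnets(new_prefix=block_prefix)
        self.block_to_node = {}

    def capacity(self) -> int:
        """How many node blocks the Pod CIDR can hold in total."""
        return 2 ** (self.block_prefix - self.pod_cidr.prefixlen)

    def add_node(self, node_name: str):
        """Assign the next unused block to a node that joins the cluster."""
        try:
            block = next(self._free_blocks)
        except StopIteration:
            raise RuntimeError("Pod CIDR exhausted: no free blocks left") from None
        self.block_to_node[block] = node_name
        return block

    def node_for_pod_ip(self, pod_ip: str):
        """Resolve a pod IP to its node through the block mapping, or None if unassigned."""
        block = ipaddress.ip_network(f"{pod_ip}/{self.block_prefix}", strict=False)
        return self.block_to_node.get(block)


# Hypothetical Pod CIDR and node names, for illustration only.
alloc = OverlayAllocator("10.244.0.0/16")
block_a = alloc.add_node("node-a")
block_b = alloc.add_node("node-b")

print(block_a, block_b, alloc.capacity())
print(alloc.node_for_pod_ip("10.244.1.37"))
```

In this sketch a /16 Pod CIDR holds 256 node blocks of 256 addresses each, so sizing the Pod CIDR is what bounds the number of nodes, independently of the VNET address space.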
Datapath performance comparison
This section takes a look at some of the datapath performance comparisons we have been running against Azure CNI Overlay.
Note: We used the Kubernetes benchmarking tools available at kubernetes/perf-tests for this exercise. Comparisons can vary based on multiple factors, such as the underlying hardware and Node proximity within a datacenter, among others. Actual results might vary.
Azure CNI Overlay vs. VXLAN-based Overlay
As mentioned before, the only options for large clusters with many Nodes and many Pods are Azure CNI Overlay and BYO CNI. Here we compare Azure CNI Overlay with a VXLAN-based overlay implementation using BYO CNI.
TCP Throughput – Higher is Better (19% gain in TCP Throughput)
Azure CNI Overlay showed a significant performance improvement over the VXLAN-based overlay implementation. We found that the overhead of encapsulating CNIs was a significant factor in performance degradation, especially as the cluster grows. In contrast, Azure CNI Overlay’s native Layer 3 implementation of overlay routing eliminated the double-encapsulation resource utilization and showed consistent performance across various cluster sizes. In summary, Azure CNI Overlay is a highly viable solution for running production-grade workloads in Kubernetes.
Azure CNI Overlay vs. Host Community
This section covers how pod networking performs against node networking and shows how native L3 routing of pod traffic helps the Azure CNI Overlay implementation.
Azure CNI Overlay and Host Network have similar throughput and CPU usage results, which reinforces that the Azure CNI Overlay implementation for Pod routing across nodes using the native VNET feature is as efficient as native VNET traffic.
TCP Throughput – Higher is Better (Similar to HostNetwork)
Azure CNI Overlay powered by Cilium: eBPF dataplane
Up to this point, we’ve only looked at the benefits of Azure CNI Overlay alone. However, through a partnership with Isovalent, the next generation of Azure CNI is powered by Cilium. Some of the benefits of this approach include better resource utilization by Cilium’s extended Berkeley Packet Filter (eBPF) dataplane, more efficient intra-cluster load balancing, Network Policy enforcement by leveraging eBPF over iptables, and more. To read more about Cilium’s performance gains through eBPF, see Isovalent’s blog post on the subject.
In Azure CNI Overlay Powered by Cilium, Azure CNI Overlay sets up the IP address management (IPAM) and Pod routing, and Cilium provisions the Service routing and Network Policy programming. In other words, Azure CNI Overlay Powered by Cilium gives us the same overlay networking performance gains that we’ve seen so far in this blog post, plus more efficient Service routing and Network Policy implementation.
It’s great to see that Azure CNI Overlay powered by Cilium is able to provide even better performance than Azure CNI Overlay without Cilium. The higher pod-to-service throughput achieved with the Cilium eBPF dataplane is a promising improvement. The added benefits of increased observability and more efficient network policy implementation are also important for those looking to optimize their AKS clusters.
TCP Throughput – Higher is Better
To wrap up, Azure CNI Overlay is now generally available in Azure Kubernetes Service (AKS) and offers significant improvements over other networking options in AKS, with performance comparable to Host Network configurations and support for linearly scaling the cluster. Pairing Azure CNI Overlay with Cilium brings even more performance benefits to your clusters. We’re excited to invite you to try Azure CNI Overlay and experience the benefits in your AKS environment.
To get started today, visit the available documentation.