How Kubernetes delivers packets between nodes – AWS EKS

In this post, I explore the networking side of AWS EKS: which CNI plugin it uses by default, and how we can replace it with Calico.

Default EKS CNI plugin – Amazon VPC CNI plugin

By default, AWS EKS uses the Amazon VPC CNI plugin. It does not use any encapsulation for inter-node traffic, so the overhead is kept to a minimum. However, as the name suggests, it only works with an AWS VPC and is therefore AWS-specific. Let's take a look at how it works!

1. Initial State

The picture below depicts the initial state of an EKS cluster. Some pods are deployed in the kube-system namespace, and a DaemonSet deploys the aws-node CNI plugin pod on every node.
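If you want to check this state yourself, you can list the kube-system pods and the aws-node DaemonSet. A minimal sketch (pod and node names will differ in your cluster):

  # System pods deployed by default; aws-node is the Amazon VPC CNI plugin pod
  kubectl get pods -n kube-system -o wide

  # The DaemonSet that places an aws-node pod on every node
  kubectl get daemonset aws-node -n kube-system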

2. Deploy a Pod

Let’s deploy a pod and see what happens. I deployed an Nginx deployment with 3 replicas, and the result is shown below. Notably, a few things happen once the container is deployed on the node, all done by the Amazon VPC CNI plugin (see the command sketch after the list below):

  • a secondary IP address from the same VPC CIDR range is added to the node’s ENI
  • a veth pair is created, and a route entry for the allocated secondary IP address is added, pointing at this veth pair.
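Here is a minimal sketch of the commands involved; the interface name, pod IP, and ENI filter value below are hypothetical examples and will differ in your VPC:

  # Deploy Nginx with 3 replicas
  kubectl create deployment nginx --image=nginx --replicas=3
  kubectl get pods -o wide

  # On a worker node: the CNI plugin added a /32 route for the pod IP,
  # pointing at the veth pair it created (example output)
  ip route
  #   192.168.12.34 dev eniabc123def scope link

  # The same address also shows up as a secondary private IP on the node's ENI
  aws ec2 describe-network-interfaces \
    --filters Name=addresses.private-ip-address,Values=192.168.12.34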

3. Expose a deployment as a service

To access these pods, let’s expose them as a service. In this demo, the Nginx pods, which are listening on port 80, are exposed as a service on port 8080. Note that the AWS CNI plugin is not involved here; kube-proxy is the component that adds the rules to the node’s netfilter.
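A sketch of the expose step and of how to inspect the resulting rules; the KUBE-SERVICES chain is what kube-proxy uses in its default iptables mode:

  # Expose the Nginx pods on service port 8080, targeting container port 80
  kubectl expose deployment nginx --port=8080 --target-port=80
  kubectl get svc nginx

  # On a worker node: kube-proxy has added DNAT rules for the new service IP
  sudo iptables -t nat -L KUBE-SERVICES -n | grep 8080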

4. Packets on the wire

Let’s launch a busybox container and try an HTTP connection to the Nginx service (a command sketch follows the list below).

  1. The container sends out the request packet; the destination is taken from the service (destination: NGINX_SERVICE_IP, destination_port: NGINX_SERVICE_PORT).
  2. The packet’s destination is rewritten based on the netfilter rules. As a result, the service IP address (in this case 10.100.122.140), which is unknown to the AWS VPC, never leaves the originating node.
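A minimal sketch of this test, assuming the service IP 10.100.122.140 from above; the interface name and capture filter are examples:

  # Launch a throwaway busybox pod and send an HTTP request to the service
  kubectl run busybox --image=busybox --rm -it --restart=Never \
    -- wget -qO- http://10.100.122.140:8080

  # On the node's primary interface only VPC pod addresses appear;
  # the service IP has already been translated away by netfilter
  sudo tcpdump -i eth0 -nn 'tcp port 80'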

Replacing the EKS CNI plugin with Calico

The Amazon VPC CNI plugin works great in AWS EKS; however, it does not provide network policy by default. If you want network policies, you can run Calico alongside the Amazon VPC CNI plugin by following these instructions from AWS.

In this demo, I will replace the CNI plugin with Calico. There is no particular reason to do so here other than to show how it works. You may find this useful if you want to avoid vendor lock-in, but you should carefully check compatibility before using this in production. Nevertheless, let’s get started!

1. Initial State

If you follow the guide on Calico, the initial state of the cluster looks like the picture below. Note that “calico-node” is deployed instead of aws-node here.

Another item to note is that a vxlan.calico interface is created on each node. As you can see in the routing table, this VXLAN endpoint is used for communication with pods that reside on other nodes.
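A sketch of how to confirm this on a node; the pod CIDR and route below are hypothetical examples:

  # calico-node has replaced aws-node in kube-system
  kubectl get daemonset -n kube-system

  # On a worker node: the VXLAN device created by Calico
  ip -d link show vxlan.calico

  # Routes to pod networks on other nodes point at vxlan.calico (example output)
  ip route | grep vxlan.calico
  #   172.16.38.0/26 via 172.16.38.0 dev vxlan.calico onlink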

2. Deploy a pod

A veth pair is created, an IP address is allocated to the pod, and the routing table is modified with a route for that specific IP address. Contrary to the Amazon VPC CNI plugin, these IP addresses are contained within the node only, and nothing is modified in the AWS VPC.
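A minimal sketch of what this looks like on the node; the cali interface name and pod IP are made-up examples:

  kubectl get pods -o wide

  # On the worker node: a /32 route to the local pod via its Calico veth pair;
  # unlike the VPC CNI case, no secondary IP is added to the node's ENI
  ip route
  #   172.16.64.2 dev cali0a1b2c3d4e scope link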

3. Expose a service

This is the same as what we saw with the Amazon VPC CNI plugin: kube-proxy modifies netfilter to translate the service IP to the actual pod IP. We can see that 10.100.235.49 is allocated as the service IP here.
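Again, a sketch of how to inspect this, assuming the service IP 10.100.235.49 from above:

  kubectl get svc nginx

  # On a worker node: the DNAT rules kube-proxy installed for the service IP
  sudo iptables -t nat -L KUBE-SERVICES -n | grep 10.100.235.49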

4. Packets on the wire

Let’s check the HTTP request from the busybox pod on 172.16.64.2. When it sends a request to the Nginx service (10.100.235.49:8080), the destination is first translated to the actual pod IP and port (172.16.38.4:80); up to this point the flow is the same as in the previous demo. Now, however, the packet is routed to the vxlan.calico interface, encapsulated in VXLAN, and sent over the wire. So, as the picture below shows, we can only see VXLAN packets on the wire.
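A sketch of the capture on the node's primary interface; VXLAN traffic uses UDP port 4789 by default:

  # Only VXLAN-encapsulated UDP packets between the node addresses are visible here
  sudo tcpdump -i eth0 -nn 'udp port 4789'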

Note, though, that VXLAN does not provide encryption, so we can pass the “-d” option to see what is going on inside the packet. In this case, we can see the HTTP communication taking place between the actual pod IP addresses.
