August 26, 2016 | Kubernetes
This post is the last of a series of three tutorial articles introducing a sample, tutorial project, demonstrating how to provision Kubernetes on AWS from scratch, using Terraform and Ansible. To understand the goal of the project, you’d better start from the first part.
WRITTEN BY
Part 1: Provision the infrastructure, with Terraform
Part 2: Instal and configure Kubernetes, with Ansible
Part 3 (this article): Complete setup and smoke test it, deploying a nginx service
The fully working project is available: https://github.com/opencredo/k8s-terraform-ansible-sample
In the second part, we have completed installation of Kubernetes components. There is still one important step: setting up the routing between Workers (aka Nodes or Minions) to allow Pods living on different machines to talk each other. As a final smoke test, we’ll deploy a nginx service.
Before starting, we have to configure Kubernetes CLI on our machine to remotely interact with the cluster.
The code snippets have been simplified. For the full, working version, please refer to project repository.
For running the following steps, we need to know Kubernetes API ELB public DNS name and Workers public IP addresses. Terraform outputs them at the end of provisioning. In this simplified project, we have to note them down, manually.
This step is not part of the platform set up. We configure Kubernetes CLI locally to interact with the remote cluster.
Setting up the client requires running few shell commands. The save the API endpoint URL and authentication details in the local <a href="http://kubernetes.io/docs/user-guide/kubeconfig-file/">kubeconfig</a>
file. They are all local shell commands, but we will use a playbook (kubectl.yaml
) to run them.
The client uses the CA certificate, generated by Terraform in the first part. User and token must match those in the token file (token.csv
), also used for Kubernetes API Server setup. The API load balancer DNS name must be passed to the playbook as a parameter.
$ ansible-playbook kubectl.yaml --extra-vars "kubernetes_api_endpoint=kube-375210502.eu-west-1.elb.amazonaws.com"
Kubernetes CLI is now configured, and we may use kubectl
to control the cluster.
$ kubectl get componentstatuses NAME STATUS MESSAGE ERROR controller-manager Healthy ok scheduler Healthy ok etcd-2 Healthy {"health": "true"} etcd-1 Healthy {"health": "true"} etcd-0 Healthy {"health": "true"}
Kubernetes uses subnets for networking between Pods. These subnets have nothing to do with the subnet we defined in AWS.
Our VPC subnet is 10.43.0.0/16, while the Pod subnets are part of 10.200.0.0/16 (10.200.1.0/24, 10.200.2.0/24 etc.). We have to setup routes between workers instances for these subnets.
As we are using the Kubenet network plugin, Pod subnets are dynamically assigned. Kube Controller decides Pod subnets within a Pod Cluster CIDR (defined by --cluster-cidr
parameter on kube-controller-manager
startup). Subnets are dynamically assigned and we cannot configure these routes at provisioning time, using Terraform. We have to wait until all Kubernetes components are up and running, discover Pod subnets querying Kubernetes API and then add the routes.
In Ansible, we might use the ec2_vpc_route_table
module to modify AWS Route Tables, but this would interfere with route tables managed by Terraform. Due to its stateful nature, tampering with Terraform managed resources is not a good idea.
The solution (hack?) adopted here is adding new routes directly to the machines, after discovering Pod subnets, using kubectl
. It is the job of kubernetes-routing.yaml
playbook, the Ansible translation of the following steps:
Query Kubernetes API for Workers Pod subnets. Actual Pod subnets (the second column) may be different, but they are not necessarily assigned following Workers numbering.
$ kubectl get nodes --output=jsonpath='{range .items[*]}{.status.addresses[?(@.type=="InternalIP")].address}{.spec.podCIDR}{"\n"}{end}' 10.43.0.30 10.200.2.0/24 10.43.0.31 10.200.0.0/24 10.43.0.32 10.200.1.0/24
Then, on each Worker, add routes for Pod subnets to the owning Node
$ sudo route add -net 10.200.2.0 netmask 255.255.255.0 gw 10.43.0.30 metric 1 $ sudo route add -net 10.200.0.0 netmask 255.255.255.0 gw 10.43.0.31 metric 1 $ sudo route add -net 10.200.1.0 netmask 255.255.255.0 gw 10.43.0.32 metric 1
… and add an IP Tables rule to avoid internal traffic being routed through the Internet Gateway:
$ sudo iptables -t nat -A POSTROUTING ! -d 10.0.0.0/8 -o eth0 -j MASQUERADE
The last step is a smoke test. We launch multiple nginx containers in the cluster, then create a Service exposed as NodePort (a random port, the same on every Worker node). The are three local shell commands. The kubernetes-nginx.yaml) is the Ansible version of them.
$ kubectl run nginx --image=nginx --port=80 --replicas=3 ... $ kubectl expose deployment nginx --type NodePort ... $ kubectl get svc nginx --output=jsonpath='{range .spec.ports[0]}{.nodePort}' 32700
The final step is manual (no playbook!). To test the service we fetch the default page from nginx.
All Workers nodes directly expose the Service. Get the exposed port from the last command you run, get Workers public IP addresses from Terraform output.
This should work for all the Workers:
$ curl http://: ...
Known Simplifications
This article concludes our walk through the sample project.
There is a lot of space for improvement to make it more realistic, using DNS names, a VPN or a bastion, moving Instances in private subnets. A network overlay (Flannel) would be another improvement. Modifying the project to add these enhancements might be a good learning exercise.
This blog is written exclusively by the OpenCredo team. We do not accept external contributions.