This post is the second in a series of three tutorial articles walking through a sample project that demonstrates how to provision Kubernetes on AWS from scratch, using Terraform and Ansible. To understand the goal of the project, it is best to start from the first part.
The complete working project is available here: https://github.com/opencredo/k8s-terraform-ansible-sample
In the previous article, we created all the AWS resources using Terraform. No Kubernetes component has been installed yet.
We have 9 EC2 instances (hosts, in Ansible terms), 3 of each type (a group, in Ansible terms):
- Controllers: Kubernetes HA master
- Will run Kubernetes API Server, Controller Manager and Scheduler services
- Workers: Kubernetes Nodes or Minions
- Will run Docker, Kubernetes Proxy and Kubelet services
- Will have CNI installed for networking between containers
- etcd: a 3-node etcd cluster maintaining the Kubernetes state
All hosts need the certificates we generated, to enable HTTPS.
First of all, we have to install Python 2.5+ on all machines.
Ansible project organisation
The Ansible part of the project is organised as suggested by the Ansible documentation. We also have multiple playbooks, to be run independently:
- Bootstrap Ansible (install Python), then install, configure and start all the required components (infra.yaml)
- Configure the Kubernetes CLI (kubectl) on your machine
- Set up internal routing between containers
- Smoke test it, deploying an nginx service (kubernetes-nginx.yaml) + manual operations
This article walks through the first playbook (infra.yaml).
The code snippets have been simplified. Please refer to the project repository for the complete version.
Installing Kubernetes components
The first playbook takes care of bootstrapping Ansible and installing the Kubernetes components. The actual tasks are separated into roles: a common role (executed on all hosts), plus one role per machine type.
Before proceeding, we have to understand how Ansible identifies and finds hosts.
Ansible works on groups of hosts. Each host must have a unique handle and an address Ansible can use to SSH into the box.
The most basic approach is using a static inventory: a hardwired file associating groups with hosts and specifying the IP address (or DNS name) of each host.
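A static inventory might look like this (group names follow the project's conventions; the IP addresses are illustrative):

```ini
[controller]
10.43.0.10
10.43.0.11
10.43.0.12

[etcd]
10.43.0.20
10.43.0.21
10.43.0.22
```

This does not fit our setup, because Terraform assigns addresses dynamically.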
Since our instances are created dynamically by Terraform, we use the EC2 Dynamic Inventory script (ec2.py) instead. Its configuration file, ec2.ini, downloaded from the Ansible repo, requires some changes. It is very long, so here are the modified parameters only:
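A sketch of the relevant ec2.ini settings (the region and tag value are illustrative):

```ini
# Scan a single AWS region only
regions = eu-west-1

# Connect to instances by IP address, not DNS name
destination_variable = ip_address
vpc_destination_variable = ip_address

# Only include instances carrying our project tag
instance_filters = tag:ansibleFilter=Kubernetes
```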
Note that we use instance tags to filter and identify hosts, and IP addresses to connect to the machines.
A separate file defines groups based on instance tags. It creates nicely named groups, like worker (otherwise we would have to use groups weirdly called tag_ansibleNodeType_worker…). If we add new hosts to a group, this file remains untouched.
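The groups file maps the tag-generated groups to friendly names using the children mechanism; it might look like this (tag-derived group names are illustrative):

```ini
[controller:children]
tag_ansibleNodeType_controller

[worker:children]
tag_ansibleNodeType_worker

[etcd:children]
tag_ansibleNodeType_etcd
```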
To make the inventory work, we put the Dynamic Inventory Python script and its configuration file in the same directory as the groups file:

ansible/
  hosts/
    ec2.py
    ec2.ini
    groups
The final step is configuring Ansible to use this directory as the inventory, in ansible.cfg.
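Assuming the layout above, the relevant ansible.cfg setting might look like this:

```ini
[defaults]
inventory = ./hosts
```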
Now we are ready to execute the playbook (infra.yaml) to install all the components. The first step is installing Python on all boxes with the raw module: it executes a shell command remotely, via SSH, with no bells and whistles.
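A minimal bootstrap play using raw might look like this (the package name assumes a Debian/Ubuntu AMI):

```yaml
- hosts: all
  become: true
  gather_facts: false   # fact gathering needs Python, which is not installed yet
  tasks:
    - name: Install Python for Ansible
      raw: test -e /usr/bin/python || (apt-get -qqy update && apt-get install -qy python-minimal)
```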
Installing and configuring Kubernetes components: Roles
The second part of the playbook installs and configures all Kubernetes components. It applies different roles to hosts, depending on their groups. Note that groups and roles have identical names here, but this is not a general rule.
Ansible executes the common role (omitted here) on all machines. All the other roles do the real job. They install, set up and start services using systemd:
- Copy the certificates and keys (generated with Terraform, in the first part)
- Download service binaries directly from the official source, unpack and copy them to the right directory
- Create the systemd unit file, using a template
- Reload systemd and restart the service
- Verify the service is running
Here are the tasks of the etcd role. The other roles are not substantially different.
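A simplified sketch of those tasks (version numbers, URLs, file names and paths are illustrative):

```yaml
- name: Copy TLS certificates and key
  copy:
    src: "{{ item }}"
    dest: /etc/etcd/
  with_items:
    - ca.pem
    - kubernetes.pem
    - kubernetes-key.pem

- name: Download and unpack etcd binaries
  unarchive:
    src: https://github.com/coreos/etcd/releases/download/v3.0.1/etcd-v3.0.1-linux-amd64.tar.gz
    dest: /opt
    copy: no

- name: Create the systemd unit file from a template
  template:
    src: etcd.service.j2
    dest: /etc/systemd/system/etcd.service

- name: Reload systemd and restart etcd
  systemd:
    name: etcd
    state: restarted
    daemon_reload: yes

- name: Verify etcd is running
  command: systemctl is-active etcd
```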
The etcd.service unit file is created from a template. Ports are hardwired (they could be externalised as variables), but host IP addresses are facts gathered by Ansible.
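The unit file template might look like this (flags, ports and paths are illustrative; the addresses come from Ansible facts):

```ini
# etcd.service.j2 (illustrative sketch)
[Unit]
Description=etcd key-value store

[Service]
ExecStart=/opt/etcd/etcd \
  --name {{ ansible_hostname }} \
  --listen-client-urls https://{{ ansible_default_ipv4.address }}:2379 \
  --advertise-client-urls https://{{ ansible_default_ipv4.address }}:2379
Restart=on-failure

[Install]
WantedBy=multi-user.target
```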
The most significant simplifications, compared to a real world project, concern two aspects:
- Ansible workflow is simplistic: at every execution, it restarts all services. In a production environment, you should add guard conditions, trigger operations only when required (e.g. when the configuration has changed) and avoid restarting all nodes of a cluster at the same time.
- As in the first part, using fixed internal DNS names, rather than IPs, would be more realistic.
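The guard conditions mentioned above are usually implemented with Ansible handlers, restarting a service only when its configuration has actually changed. A sketch (file names are illustrative):

```yaml
# In the role's tasks: notify a handler instead of always restarting
- name: Create the systemd unit file from a template
  template:
    src: etcd.service.j2
    dest: /etc/systemd/system/etcd.service
  notify: restart etcd

# In the role's handlers/main.yml: runs only if the task above reported a change
- name: restart etcd
  systemd:
    name: etcd
    state: restarted
    daemon_reload: yes
```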
The infra.yaml playbook has installed and started all the services required by Kubernetes. In the next article, we will set up routing between containers, allowing Kubernetes Pods living on different nodes to talk to each other.