In the last post, part 3 of this series, we started off at the bottom of the stack with the Terraform module for virtual machine creation. We continue our journey up the stack in this post with the module for creating a Kubernetes cluster.

All the blog posts in this series relate to the Arc-PX-VMware-Faststart repo on GitHub; the material covered by this blog post relates specifically to the kubernetes_cluster Terraform module.
The Beauty of Layering
By separating the Terraform configuration into modules, you can deploy the cluster either to the virtualized infrastructure created using the virtual_machine module from the last post or to your own infrastructure, be that virtualized, on-premises or in the public cloud.
Irrespective of how the infrastructure is provisioned, you need to be able to do the following:
- ssh from the server that Terraform is run from onto the other hosts without having to supply a password
- execute sudo commands on both the server Terraform is run from and all the other node hosts without having to supply a password
Deploying A Kubernetes Cluster 101
Whatever you do when deploying a Kubernetes cluster, somewhere along the line you have to use kubeadm. There is a wealth of material available in blog posts and on the internet in general in which people roll their own scripts around kubeadm; I often suspect that many of these efforts are inspired by Kelsey Hightower’s Kubernetes the hard way. In this post we are emphatically going to do things the easy way. Kubernetes the hard way, which is primarily a learning tool, lists the following steps for deploying a cluster once compute resources have been provisioned:
- Provisioning the CA and Generating TLS Certificates
- Generating Kubernetes Configuration Files for Authentication
- Generating the Data Encryption Config and Key
- Bootstrapping the etcd Cluster
- Bootstrapping the Kubernetes Control Plane
- Bootstrapping the Kubernetes Worker Nodes
- Configuring kubectl for Remote Access
- Provisioning Pod Network Routes
- Deploying the DNS Cluster Add-on
In my day job I have seen organizations attempt to do this, and to be candid, they have ended up in a real mess. We are not going to do that; instead, the kubernetes_cluster module will invoke an Ansible playbook that carries out all of these steps, and as luck would have it, all the hard work has been done for us in the form of the CNCF Kubespray project:

The Kubespray repo on GitHub has 10.2K stars, 4.4K forks and 720 contributors; it is refined, battle tested and has the backing of the CNCF community. TL;DR: you would be hard pressed to come up with something of a similar standard. And it gets better still, because with Kubespray you can:
- rebuild the control plane
- add new nodes
- destroy clusters
Once four basic steps have been performed, deploying a Kubernetes cluster simply requires a single command line that invokes ansible-playbook:
- clone the Kubespray GitHub repo
- install the Python packages Kubespray requires, listed in the kubespray/requirements.txt file
- create an inventory file – this contains information about the hosts the cluster will use
- set up ssh connectivity between your deployment server and the cluster node hosts

ansible-playbook -i <path to inventory.ini file> --become --become-user=root cluster.yml

Tearing down a cluster is equally simple:

ansible-playbook -i <path to inventory.ini file> --become --become-user=root reset.yml

This is why Kubespray is my go-to tool for deploying Kubernetes clusters.
Compute Resources
The storage solution we will use requires three worker nodes, each with the following resources:
CPU | 4 cores |
RAM | 4GB |
Disk | /var: 2GB free, /opt: 3GB free |
This excludes the resources required for a SQL Server 2019 Big Data Cluster or Azure Arc enabled Data Services.
kubernetes_cluster Overview
Of all the modules in Arc-PX-VMware, kubernetes_cluster is the most complex, which is why this post will walk the reader through what it does resource by resource:
- null_resource.kubespray
All the Linux and Python packages necessary for running Kubespray are installed, the Kubespray GitHub repo is cloned, and finally an inventory directory is created with the name specified in the var.kubespray_inventory variable; if such a directory already exists, it is renamed with a date and timestamp suffix before a new inventory directory structure is created.
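Purely as an illustration, a resource of this kind can be sketched along the following lines; the commands, paths and package handling shown here are assumptions rather than the module's actual code:

resource "null_resource" "kubespray" {
  provisioner "local-exec" {
    # Hypothetical bootstrap commands: clone Kubespray, install its Python
    # dependencies and seed an inventory directory named after
    # var.kubespray_inventory, preserving any existing directory of that name.
    # (Linux packages such as git and python3-pip are assumed to be present.)
    command = <<-EOT
      git clone https://github.com/kubernetes-sigs/kubespray.git ~/kubespray
      pip3 install --user -r ~/kubespray/requirements.txt
      if [ -d ~/kubespray/inventory/${var.kubespray_inventory} ]; then
        mv ~/kubespray/inventory/${var.kubespray_inventory} \
           ~/kubespray/inventory/${var.kubespray_inventory}.$(date +%Y%m%d%H%M%S)
      fi
      cp -rfp ~/kubespray/inventory/sample ~/kubespray/inventory/${var.kubespray_inventory}
    EOT
  }
}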
- local_file.kubespray_inventory
This resource creates the inventory file used by the Ansible cluster.yml playbook; in essence, the templatefile function is invoked to take information from the node_hosts variable:
variable "node_hosts" {
default = {
"z-ca-bdc-control1" = {
name = "z-ca-bdc-control1"
compute_node = false
etcd_instance = "etcd1"
ipv4_address = "192.168.123.88"
},
"z-ca-bdc-control2" = {
name = "z-ca-bdc-control2"
compute_node = false
etcd_instance = "etcd2"
ipv4_address = "192.168.123.89"
},
"z-ca-bdc-compute1" = {
name = "z-ca-bdc-compute1"
compute_node = true
etcd_instance = "etcd3"
ipv4_address = "192.168.123.90"
},
"z-ca-bdc-compute2" = {
name = "z-ca-bdc-compute2"
compute_node = true
etcd_instance = ""
ipv4_address = "192.168.123.91"
},
"z-ca-bdc-compute3" = {
name = "z-ca-bdc-compute3"
compute_node = true
etcd_instance = ""
ipv4_address = "192.168.123.92"
}
}
}
and translate it into the format of the Ansible inventory file used by Kubespray:
[all]
z-ca-bdc-compute1 ip=192.168.123.90 etcd_instance=etcd3
z-ca-bdc-control1 ip=192.168.123.88 etcd_instance=etcd1
z-ca-bdc-control2 ip=192.168.123.89 etcd_instance=etcd2
z-ca-bdc-compute2 ip=192.168.123.91
z-ca-bdc-compute3 ip=192.168.123.92
[kube-master]
z-ca-bdc-control1
z-ca-bdc-control2
[etcd]
z-ca-bdc-compute1
z-ca-bdc-control1
z-ca-bdc-control2
[kube-node]
z-ca-bdc-compute1
z-ca-bdc-compute2
z-ca-bdc-compute3
z-ca-bdc-control1
z-ca-bdc-control2
[calico-rr]
[k8s-cluster:children]
kube-master
kube-node
calico-rr
The path to the inventory file is /home/<user>/kubespray/inventory/<var.kubespray_inventory>/inventory.ini.
- local_file.kubernetes_config
The templatefile function is used to set the Kubernetes version for the cluster in an Ansible config file, in accordance with the value of the var.kubernetes_version variable.
- null_resource.kubernetes_cluster
A Kubernetes cluster is created via an invocation of ansible-playbook. So that the cluster can also be destroyed in the event that terraform destroy is executed for the module, a copy of the inventory file is created for use with the destroy provisioner.
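To make the create and destroy paths concrete, here is a minimal sketch of how such a resource can be wired up. The triggers block and working_dir handling are assumptions (the module itself keeps a separate copy of the inventory file for the destroy path), so treat this as an illustration rather than the module's actual code:

resource "null_resource" "kubernetes_cluster" {
  # Record the inventory path at create time so the destroy-time provisioner,
  # which may only reference self, can still find it.
  triggers = {
    inventory_file = local.kubespray_inv_file
  }

  # Stand the cluster up with the same ansible-playbook command shown earlier.
  provisioner "local-exec" {
    working_dir = pathexpand("~/kubespray")
    command     = "ansible-playbook -i ${self.triggers.inventory_file} --become --become-user=root cluster.yml"
  }

  # Tear the cluster down when terraform destroy is run for the module.
  provisioner "local-exec" {
    when        = destroy
    working_dir = pathexpand("~/kubespray")
    command     = "ansible-playbook -i ${self.triggers.inventory_file} --become --become-user=root reset.yml"
  }
}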
- null_resource.kubernetes_context
The Kubernetes context created by Kubespray (what we use to connect to the cluster) is copied to the user’s .kube/config file; if this file already exists, it is backed up first. kubectl is then installed, if it is not already present, on the machine that Terraform is executed from.
- null_resource.taint_control_nodes
I found that certain versions of Kubespray did not apply the NoSchedule taint to all control plane nodes, hence this step was added as a precautionary measure.
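Again as a sketch only: the taint can be applied with a local-exec call to kubectl. The use of for_each over local.master_nodes (defined in the next section), the depends_on wiring and the taint key are all assumptions made for illustration, not the module's exact code:

resource "null_resource" "taint_control_nodes" {
  # One instance per control plane node; master_nodes is the local value
  # covered in the next section.
  for_each   = toset(local.master_nodes)
  depends_on = [null_resource.kubernetes_cluster, null_resource.kubernetes_context]

  provisioner "local-exec" {
    # Ensure the NoSchedule taint is present on the control plane node.
    command = "kubectl taint nodes ${each.key} node-role.kubernetes.io/master=:NoSchedule --overwrite"
  }
}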
Digging Into The Configuration – Points Of Interest
The kubernetes_cluster module introduces:
- local values
Local values provide the ability to associate an expression with a name that can be used throughout a module:
locals {
all_nodes_verbose_etcd = [for k, v in var.node_hosts:
format("%s ip=%s etcd_instance=%s", v.name, v.ipv4_address, v.etcd_instance)
if length(v.etcd_instance) > 0]
all_nodes_verbose = [for k, v in var.node_hosts:
format("%s ip=%s", v.name, v.ipv4_address)
if length(v.etcd_instance) == 0]
master_nodes = [for k, v in var.node_hosts:
v.name
if v.compute_node != true]
etcd_nodes = [for k, v in var.node_hosts:
v.name
if length(v.etcd_instance) > 0]
all_nodes = values(var.node_hosts)[*].name
kubernetes_conf_file = format("%s/kubespray/inventory/%s/group_vars/k8s-cluster/k8s-cluster.yml", pathexpand("~"), var.kubespray_inventory)
kubespray_inv_file = format("%s/kubespray/inventory/%s/inventory.ini", pathexpand("~"), var.kubespray_inventory)
context_artifact = format("%s/kubespray/inventory/%s/artifacts/admin.conf", pathexpand("~"), var.kubespray_inventory)
}
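To make the for expressions concrete, evaluating these locals against the node_hosts variable shown earlier gives values along the following lines (Terraform iterates maps in key order), and these map directly onto the sections of the inventory file shown above:

# Illustrative values, worked out by hand from the node_hosts variable:
all_nodes_verbose_etcd = [
  "z-ca-bdc-compute1 ip=192.168.123.90 etcd_instance=etcd3",
  "z-ca-bdc-control1 ip=192.168.123.88 etcd_instance=etcd1",
  "z-ca-bdc-control2 ip=192.168.123.89 etcd_instance=etcd2",
]
all_nodes_verbose = [
  "z-ca-bdc-compute2 ip=192.168.123.91",
  "z-ca-bdc-compute3 ip=192.168.123.92",
]
master_nodes = ["z-ca-bdc-control1", "z-ca-bdc-control2"]
etcd_nodes   = ["z-ca-bdc-compute1", "z-ca-bdc-control1", "z-ca-bdc-control2"]
all_nodes    = ["z-ca-bdc-compute1", "z-ca-bdc-compute2", "z-ca-bdc-compute3", "z-ca-bdc-control1", "z-ca-bdc-control2"]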
- templates
The concept of templates is best explained by taking an example of where they are used in the kubernetes_cluster module. This is the template used to create the Ansible inventory file for Kubespray; note the lines containing placeholders wrapped in ${ }:
[all]
${k8s_node_host_verbose_etcd}
${k8s_node_host_verbose}
[kube-master]
${k8s_master_host}
[etcd]
${k8s_etcd_host}
[kube-node]
${k8s_node_host}
[calico-rr]
[k8s-cluster:children]
kube-master
kube-node
calico-rr
The templatefile function can be used in conjunction with a template to create a file: in short, you take a template and plug values into it, with the placeholders specified using ${ }. In the snippet below, each join/replace combination simply concatenates a list's elements with newline separators, so that each host lands on its own line in the rendered inventory file:
content = templatefile("${path.module}/templates/kubespray_inventory.tpl", {
k8s_node_host_verbose_etcd = replace(join("\", \"\n", local.all_nodes_verbose_etcd), "\", \"", "")
k8s_node_host_verbose = replace(join("\", \"\n", local.all_nodes_verbose), "\", \"", "")
k8s_master_host = replace(join("\", \"\n", local.master_nodes), "\", \"", "")
k8s_etcd_host = replace(join("\", \"\n", local.etcd_nodes), "\", \"", "")
k8s_node_host = replace(join("\", \"\n", local.all_nodes), "\", \"", "")
})
filename = local.kubespray_inv_file
Coming Up In Part 5
In the next blog post in this series, the module for deploying MetalLB (a software load balancer) to our Kubernetes cluster will be covered.