Deploying Azure Data Services via Terraform Part 4: Deploying a Kubernetes Cluster

In the last post, part 3 of this series, we started at the bottom of the stack with the Terraform module for virtual machine creation. We continue our journey up the stack in this post with the module for creating a Kubernetes cluster.

All the blog posts in this series relate to the Arc-PX-VMware-Faststart repo on GitHub; the material covered by this post relates specifically to the kubernetes_cluster Terraform module.

The Beauty of Layering

By separating the Terraform configuration into modules, you can deploy the cluster either to the virtualized infrastructure created by the virtual_machine module from the last post or to your own infrastructure, be that virtualized, on-premises or in the public cloud.

Irrespective of how the infrastructure is provisioned, you need to be able to do the following:

  • ssh from the server that Terraform is run from onto the other hosts without having to supply a password
  • execute sudo commands on both the server that Terraform is run from and all the other node hosts without having to supply a password

(This is typically achieved with ssh-copy-id and a NOPASSWD entry in sudoers, but use whatever mechanism your environment mandates.)

Deploying A Kubernetes Cluster 101

Whatever you do when deploying a Kubernetes cluster, somewhere along the line you have to use kubeadm. There is a wealth of material available in blog posts and on the internet in general in which people roll their own scripts around kubeadm. I often suspect that many of these efforts are the result of Kelsey Hightower’s Kubernetes The Hard Way. In this post we are emphatically going to do things the easy way; Kubernetes The Hard Way, primarily a learning tool, lists a long sequence of manual steps for deploying a cluster once compute resources have been provisioned: provisioning a certificate authority and TLS certificates, generating kubeconfigs, bootstrapping etcd, the control plane and the worker nodes, and then configuring kubectl, pod networking and DNS.

In my day job I have seen organizations attempt to do this, and to be candid they have ended up in a real mess. We are not going to do that; instead, the kubernetes_cluster module will invoke an Ansible playbook that carries out all of these steps, and as luck would have it, all the hard work has already been done for us in the form of the CNCF Kubespray project.

The Kubespray repo on GitHub has 10.2K stars, 4.4K forks and 720 contributors; it is refined, battle tested and has the backing of the CNCF community. TL;DR: you would be hard pressed to come up with something of a similar standard. And it gets better still, because with Kubespray you can:

  • rebuild the control plane
  • add new nodes
  • destroy clusters

Once four basic steps have been performed, deploying a Kubernetes cluster requires nothing more than a single ansible-playbook command. The four steps are:

  1. clone the Kubespray GitHub repo
  2. install the Python packages Kubespray requires, as listed in the kubespray/requirements.txt file
  3. create an inventory file – this contains information about the hosts the cluster will use
  4. set up ssh connectivity between your deployment server and the cluster node hosts

With those four steps in place, a single command creates the cluster:

    ansible-playbook -i <path to inventory.ini file> --become --become-user=root cluster.yml

Tearing down a cluster is equally simple:

    ansible-playbook -i <path to inventory.ini file> --become --become-user=root reset.yml

This is why Kubespray is my go-to tool for deploying Kubernetes clusters.

Compute Resources

The storage solution we will use requires three worker nodes, each with the following resources:

  • CPU: 4 cores
  • RAM: 4GB
  • Disk: 2GB free in /var and 3GB free in /opt

This excludes the resources required for a SQL Server 2019 Big Data Cluster or Azure Arc enabled Data Services.

kubernetes_cluster Overview

Of all the modules in Arc-PX-VMware-Faststart, kubernetes_cluster is the most complex, which is why this post will walk through what it does resource by resource:

  • null_resource.kubespray
    This resource installs all the Linux and Python packages necessary for running Kubespray, clones the Kubespray GitHub repo and, finally, creates an inventory directory with the name specified in the var.kubespray_inventory variable. If such a directory already exists, it is renamed with a date and timestamp suffix before the new inventory directory structure is created.
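A minimal sketch of how such a resource can be put together is shown below; the repo location, package installation and backup commands here are simplified illustrations rather than the module's exact code:

resource "null_resource" "kubespray" {
  provisioner "local-exec" {
    command = <<-EOT
      # clone Kubespray and install the Python packages it needs
      git clone https://github.com/kubernetes-sigs/kubespray.git ~/kubespray
      pip3 install --user -r ~/kubespray/requirements.txt
      # preserve any existing inventory by renaming it with a date/timestamp suffix
      if [ -d ~/kubespray/inventory/${var.kubespray_inventory} ]; then
        mv ~/kubespray/inventory/${var.kubespray_inventory} \
           ~/kubespray/inventory/${var.kubespray_inventory}_$(date +%Y%m%d%H%M%S)
      fi
      cp -rfp ~/kubespray/inventory/sample ~/kubespray/inventory/${var.kubespray_inventory}
    EOT
  }
}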
  • local_file.kubespray_inventory
    This resource creates the inventory file used by the Ansible cluster.yml playbook. In essence, the templatefile function is invoked to take information from the node_hosts variable:
variable "node_hosts" {
  default = {
    "z-ca-bdc-control1" = {
       name          = "z-ca-bdc-control1"
       compute_node   = false
       etcd_instance = "etcd1"
       ipv4_address  = "192.168.123.88"
    },
    "z-ca-bdc-control2" =  {
       name          = "z-ca-bdc-control2"
       compute_node   = false
       etcd_instance = "etcd2"
       ipv4_address  = "192.168.123.89"
    },
    "z-ca-bdc-compute1" = {
       name          = "z-ca-bdc-compute1"
       compute_node   = true
       etcd_instance = "etcd3"
       ipv4_address  = "192.168.123.90"
    },
    "z-ca-bdc-compute2" = {
       name          = "z-ca-bdc-compute2"
       compute_node   = true
       etcd_instance = ""
       ipv4_address  = "192.168.123.91"
    },
    "z-ca-bdc-compute3" = {
       name          = "z-ca-bdc-compute3"
       compute_node   = true
       etcd_instance = ""
       ipv4_address  = "192.168.123.92"
    }
  }
}

and translate it into the format of the Ansible inventory file used by Kubespray:

[all]
z-ca-bdc-compute1 ip=192.168.123.90 etcd_instance=etcd3
z-ca-bdc-control1 ip=192.168.123.88 etcd_instance=etcd1
z-ca-bdc-control2 ip=192.168.123.89 etcd_instance=etcd2
z-ca-bdc-compute2 ip=192.168.123.91
z-ca-bdc-compute3 ip=192.168.123.92

[kube-master]
z-ca-bdc-control1
z-ca-bdc-control2

[etcd]
z-ca-bdc-compute1
z-ca-bdc-control1
z-ca-bdc-control2

[kube-node]
z-ca-bdc-compute1
z-ca-bdc-compute2
z-ca-bdc-compute3
z-ca-bdc-control1
z-ca-bdc-control2

[calico-rr]

[k8s-cluster:children]
kube-master
kube-node
calico-rr

The path to the inventory file is /home/<user>/kubespray/inventory/<var.kubespray_inventory>/inventory.ini.

  • local_file.kubernetes_config
    The templatefile function is used to set the Kubernetes version for the cluster in an Ansible config file, based on the value of the var.kubernetes_version variable.
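A minimal sketch of the pattern is shown below; the template name (k8s_cluster.tpl) and its kube_version placeholder are illustrative assumptions:

resource "local_file" "kubernetes_config" {
  # render the cluster configuration with the desired Kubernetes version plugged in
  content = templatefile("${path.module}/templates/k8s_cluster.tpl", {
    kube_version = var.kubernetes_version
  })
  filename = local.kubernetes_conf_file
}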
  • null_resource.kubernetes_cluster
    The Kubernetes cluster is created by invoking ansible-playbook. So that the cluster can also be destroyed when terraform destroy is executed against the module, a copy of the inventory file is kept for use by the destroy provisioner.
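A destroy-time provisioner cannot reference variables or locals, which is presumably why the module squirrels away a copy of the inventory. The sketch below illustrates the same create/destroy pairing using the triggers/self pattern instead of a file copy; it is a sketch of the idea, not the module's exact code:

resource "null_resource" "kubernetes_cluster" {
  depends_on = [local_file.kubespray_inventory]

  # stash values in triggers so the destroy-time provisioner can reach them via self
  triggers = {
    inventory     = local.kubespray_inv_file
    kubespray_dir = pathexpand("~/kubespray")
  }

  # create the cluster
  provisioner "local-exec" {
    working_dir = self.triggers.kubespray_dir
    command     = "ansible-playbook -i ${self.triggers.inventory} --become --become-user=root cluster.yml"
  }

  # tear the cluster down when terraform destroy is run against the module
  provisioner "local-exec" {
    when        = destroy
    working_dir = self.triggers.kubespray_dir
    command     = "ansible-playbook -i ${self.triggers.inventory} --become --become-user=root reset.yml"
  }
}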
  • null_resource.kubernetes_context
    The Kubernetes context created by Kubespray (what we use to connect to the cluster) is copied to the user's .kube/config file; if this file already exists, it is backed up first. kubectl is then installed if it is not already present on the machine that Terraform is executed from.
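A sketch of the kind of commands this involves appears below; the backup naming and the kubectl installation method (snap) are assumptions:

resource "null_resource" "kubernetes_context" {
  depends_on = [null_resource.kubernetes_cluster]

  provisioner "local-exec" {
    command = <<-EOT
      mkdir -p ~/.kube
      # back up any existing config before overwriting it
      [ -f ~/.kube/config ] && cp ~/.kube/config ~/.kube/config_$(date +%Y%m%d%H%M%S)
      cp ${local.context_artifact} ~/.kube/config
      # install kubectl if it is not already present on this machine
      command -v kubectl > /dev/null 2>&1 || sudo snap install kubectl --classic
    EOT
  }
}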
  • null_resource.taint_control_nodes
    I found that certain versions of Kubespray did not apply the NoSchedule taint to all control plane nodes, hence this resource was added as a precautionary measure.
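A sketch of what this can look like is shown below; the taint key (node-role.kubernetes.io/master, as used by the Kubernetes versions of that era) is an assumption:

resource "null_resource" "taint_control_nodes" {
  depends_on = [null_resource.kubernetes_context]

  # re-apply the NoSchedule taint to every control plane node; --overwrite makes this idempotent
  provisioner "local-exec" {
    command = join("\n", [for node in local.master_nodes :
      "kubectl taint nodes ${node} node-role.kubernetes.io/master=:NoSchedule --overwrite"
    ])
  }
}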

Digging Into The Configuration – Points Of Interest

The kubernetes_cluster module introduces:

  • local values
    Local values provide the ability to associate an expression with a name that can be used throughout a module:
locals {
   all_nodes_verbose_etcd = [for k, v in var.node_hosts: 
                               format("%s ip=%s etcd_instance=%s", v.name, v.ipv4_address, v.etcd_instance)
                               if length(v.etcd_instance) > 0]

   all_nodes_verbose      = [for k, v in var.node_hosts:
                               format("%s ip=%s", v.name, v.ipv4_address) 
                               if length(v.etcd_instance) == 0] 

   master_nodes           = [for k, v in var.node_hosts:
                               v.name
                               if v.compute_node != true] 

   etcd_nodes             = [for k, v in var.node_hosts:
                               v.name 
                               if length(v.etcd_instance) > 0] 

   all_nodes              = values(var.node_hosts)[*].name

   kubernetes_conf_file = format("%s/kubespray/inventory/%s/group_vars/k8s-cluster/k8s-cluster.yml", pathexpand("~"), var.kubespray_inventory)
   kubespray_inv_file   = format("%s/kubespray/inventory/%s/inventory.ini", pathexpand("~"), var.kubespray_inventory)
   context_artifact     = format("%s/kubespray/inventory/%s/artifacts/admin.conf", pathexpand("~"), var.kubespray_inventory)
}
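For illustration, given the node_hosts variable shown earlier, these locals evaluate to the values below (written out here as literals), which is exactly what ends up in the inventory file:

all_nodes_verbose_etcd = [
  "z-ca-bdc-compute1 ip=192.168.123.90 etcd_instance=etcd3",
  "z-ca-bdc-control1 ip=192.168.123.88 etcd_instance=etcd1",
  "z-ca-bdc-control2 ip=192.168.123.89 etcd_instance=etcd2",
]

all_nodes_verbose = [
  "z-ca-bdc-compute2 ip=192.168.123.91",
  "z-ca-bdc-compute3 ip=192.168.123.92",
]

master_nodes = ["z-ca-bdc-control1", "z-ca-bdc-control2"]

etcd_nodes   = ["z-ca-bdc-compute1", "z-ca-bdc-control1", "z-ca-bdc-control2"]

all_nodes    = ["z-ca-bdc-compute1", "z-ca-bdc-compute2", "z-ca-bdc-compute3",
                "z-ca-bdc-control1", "z-ca-bdc-control2"]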
  • templates
    The concept of templates is best explained with an example of where they are used in the kubernetes_cluster module. This is the template used to create the Ansible inventory file for Kubespray; note the lines containing placeholders wrapped in ${ }:
[all]
${k8s_node_host_verbose_etcd}
${k8s_node_host_verbose}

[kube-master]
${k8s_master_host}

[etcd]
${k8s_etcd_host}

[kube-node]
${k8s_node_host}

[calico-rr]

[k8s-cluster:children]
kube-master
kube-node
calico-rr

The templatefile function is used in conjunction with a template to create a file; in short, you take a template and plug values into it, with the placeholders specified using ${ }:

 # each local passed in below is a list of strings; the join/replace combination
 # simply renders each list as a single newline-separated block of lines
 content = templatefile("${path.module}/templates/kubespray_inventory.tpl", {
    k8s_node_host_verbose_etcd = replace(join("\", \"\n", local.all_nodes_verbose_etcd), "\", \"", "") 
    k8s_node_host_verbose      = replace(join("\", \"\n", local.all_nodes_verbose), "\", \"", "") 
    k8s_master_host            = replace(join("\", \"\n", local.master_nodes), "\", \"", "") 
    k8s_etcd_host              = replace(join("\", \"\n", local.etcd_nodes), "\", \"", "") 
    k8s_node_host              = replace(join("\", \"\n", local.all_nodes), "\", \"", "") 
  })
  filename = local.kubespray_inv_file

Coming Up In Part 5

In the next blog post in this series, the module for deploying MetalLB (a software load balancer) to our Kubernetes cluster will be covered.
