Part 3 of this series will begin the journey up the stack, starting with the deployment of the virtual machines that will host the Kubernetes cluster nodes.

All the blog posts in this series relate to the Arc-PX-VMware-Faststart repo on GitHub; the material covered in this post relates specifically to the virtual_machine Terraform module.
Why Not Just Use VMware Tanzu Kubernetes Grid?
Tanzu Kubernetes Grid, new with vSphere 7.0, provides the ability to create Kubernetes clusters directly from vSphere. In short, it does all the heavy lifting for you, and it would have saved me a lot of effort. However, whilst Azure Arc-enabled Data Services supports Tanzu, SQL Server 2019 Big Data Clusters currently do not. Also, because Tanzu is relatively new, it is by no means a given that everyone who has vSphere also has Tanzu, and it is licensed separately from vSphere. Splitting the creation of the virtual machines that host the Kubernetes cluster nodes from the creation of the Kubernetes cluster itself also enables people who are not using VMware vSphere to deploy a Kubernetes cluster to infrastructure based on:
- Hyper-V
- Bare metal, be that server class hardware or NUCs in a home lab
- Linux KVM
Ubuntu 18.04 Template Creation
The first order of business is to create a virtual machine template in VMware. This is fully documented in the README for the virtual_machine module and demonstrated in the following YouTube recording. Simply create a new virtual machine in vCenter and give it:
- two logical CPUs
- 4GB of memory
- a single 120GB disk
- a NIC
- a CD/DVD drive configured to boot the VM from an Ubuntu 18.04 Server LTS ISO
and then you are good to go . . .
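As an aside, when the Terraform module runs it locates this template, along with the datastore, network and resource pool it deploys to, via vSphere data sources (these are the data.vsphere_* references you will see in the resource excerpt further down). The following is a minimal sketch of what those lookups look like; the object names and the datacenter are assumptions here, in practice they come from the module's variables:

data "vsphere_datacenter" "dc" {
  name = "my-datacenter"                   # assumption: supplied via a variable in practice
}

data "vsphere_datastore" "datastore" {
  name          = "my-datastore"           # assumption
  datacenter_id = data.vsphere_datacenter.dc.id
}

data "vsphere_resource_pool" "pool" {
  name          = "my-cluster/Resources"   # assumption
  datacenter_id = data.vsphere_datacenter.dc.id
}

data "vsphere_network" "network" {
  name          = "VM Network"             # assumption
  datacenter_id = data.vsphere_datacenter.dc.id
}

data "vsphere_virtual_machine" "template" {
  name          = "ubuntu-1804-template"   # assumption: the template created above
  datacenter_id = data.vsphere_datacenter.dc.id
}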
Applying The Terraform Module
To do this you require an Ubuntu virtual machine to act as a deployment server. I have tested this with Ubuntu 18.04 LTS, and I will get around to testing it with Ubuntu 20.10 at some stage. If, for example, the virtual machine was created with a user called azuser, the deployment server should also have an azuser account under which all Terraform commands are executed. To get up and running, you need to:
- Install Git on the deployment server:
sudo apt-get install git
- Clone the Arc-PX-VMware-Faststart repo:
git clone https://github.com/PureStorage-OpenConnect/Arc-PX-VMware-Faststart.git
- Assign values to the variables in Arc-PX-VMware-Faststart/vmware_vm_pool/variables.tf (a sketch of what these root-level declarations might look like follows this list)
- Assign values to the variables in Arc-PX-VMware-Faststart/vmware_vm_pool/modules/virtual_machine/variables.tf
- Execute the following from within the Arc-PX-VMware-Faststart/vmware_vm_pool directory:
terraform apply -target=module.virtual_machine -auto-approve
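For reference, the root module's provider configuration (shown in the next section) consumes three variables: vsphere_user, VSPHERE_PASSWORD and vsphere_server. A minimal sketch of how these might be declared in vmware_vm_pool/variables.tf is below; treat it as illustrative rather than a verbatim copy of the repo's contents:

variable "vsphere_user" {
  description = "vCenter user the vSphere provider authenticates as"
  type        = string
  default     = "administrator@vsphere.local"   # assumption: replace with your own user
}

variable "vsphere_server" {
  description = "vCenter server name or IP address"
  type        = string
}

variable "VSPHERE_PASSWORD" {
  description = "Password for vsphere_user; can also be supplied via the TF_VAR_VSPHERE_PASSWORD environment variable"
  type        = string
  sensitive   = true   # the sensitive argument requires Terraform 0.14 or later
}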
Digging Into The Configuration – Points Of Interest
The Root Module
A root module is where we define child modules, any provider configuration they may share, and any input arguments and outputs they use. This is the root module for all the modules under the vmware_vm_pool folder. Currently there is just one, virtual_machine; however, in the fullness of time a module will be added for standing up Tanzu Kubernetes Grid clusters:
terraform {
  required_providers {
    vsphere = {
      source  = "hashicorp/vsphere"
      version = "~> 1.24.3"
    }
  }
}

provider "vsphere" {
  user                 = var.vsphere_user
  password             = var.VSPHERE_PASSWORD
  vsphere_server       = var.vsphere_server
  allow_unverified_ssl = true
}

module "virtual_machine" {
  source = "./modules/virtual_machine"
}
Here we specify:
- the use of the official HashiCorp vSphere provider, version ~> 1.24.3 (1.24.3 or a newer patch release)
- the configuration for the vSphere provider; the password can be provided either when prompted for after issuing terraform apply -target=module.virtual_machine -auto-approve, or by setting the environment variable TF_VAR_VSPHERE_PASSWORD
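Note also that the module block passes no arguments to the child module, which is why step 4 in the earlier list edits the defaults in the child module's own variables.tf. If you preferred to drive the child module from the root instead, inputs would be passed like this; a hypothetical variation rather than how the repo is currently wired:

module "virtual_machine" {
  source = "./modules/virtual_machine"

  # hypothetical inputs - values passed here would override the defaults
  # declared in the child module's variables.tf
  vm_linked_clone  = false
  virtual_machines = var.virtual_machines
}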
virtual_machine configuration
It's in the virtual_machine module code where things start to get interesting. The concept of complex data types is introduced in the form of the map of maps used to store the virtual machine information; refer to the Arc-PX-VMware-Faststart/vmware_vm_pool/modules/virtual_machine/variables.tf file:
variable "virtual_machines" {
  default = {
    "z-ca-bdc-control1" = {
      name         = "z-ca-bdc-control1"
      compute_node = false
      ipv4_address = "192.168.123.88"
      ipv4_netmask = "22"
      ipv4_gateway = "192.168.123.1"
      dns_server   = "192.168.123.2"
      ram          = 8192
      logical_cpu  = 4
      os_disk_size = 120
      px_disk_size = 0
    },
    "z-ca-bdc-control2" = {
      name         = "z-ca-bdc-control2"
      compute_node = false
      ipv4_address = "192.168.123.89"
      ipv4_netmask = "22"
      ipv4_gateway = "192.168.123.1"
      dns_server   = "192.168.123.2"
      ram          = 8192
      logical_cpu  = 4
      os_disk_size = 120
      px_disk_size = 0
    },
    "z-ca-bdc-compute1" = {
      name         = "z-ca-bdc-compute1"
      compute_node = true
      ipv4_address = "192.168.123.90"
      ipv4_netmask = "22"
      ipv4_gateway = "192.168.123.1"
      dns_server   = "192.168.123.2"
      ram          = 73728
      logical_cpu  = 12
      os_disk_size = 120
      px_disk_size = 120
    },
    "z-ca-bdc-compute2" = {
      name         = "z-ca-bdc-compute2"
      compute_node = true
      ipv4_address = "192.168.123.91"
      ipv4_netmask = "22"
      ipv4_gateway = "192.168.123.1"
      dns_server   = "192.168.123.2"
      ram          = 73728
      logical_cpu  = 12
      os_disk_size = 120
      px_disk_size = 120
    },
    "z-ca-bdc-compute3" = {
      name         = "z-ca-bdc-compute3"
      compute_node = true
      ipv4_address = "192.168.123.92"
      ipv4_netmask = "22"
      ipv4_gateway = "192.168.123.1"
      dns_server   = "192.168.123.2"
      ram          = 73728
      logical_cpu  = 12
      os_disk_size = 120
      px_disk_size = 120
    }
  }
}
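Because virtual_machines is a map of maps, individual attributes can be dereferenced by key. As a purely illustrative example (not something that exists in the module), an output such as the following would surface the IP address of the first control plane host:

output "control1_ipv4_address" {
  value = var.virtual_machines["z-ca-bdc-control1"].ipv4_address
}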
Each map in the virtual_machines variable represents a virtual machine that will host a Kubernetes cluster node, of which there are two types:
- control plane node hosts – compute_node = false
- worker node hosts – compute_node = true
Each virtual machine has to have an operating system disk, and each worker node host requires an additional disk on which to create persistent volumes. To cut a long story short, any worker node that block storage can be presented to can be used for Portworx – our Kubernetes storage solution – hence the px_disk_size attribute in each map element. In the HCL excerpt below for creating virtual machines, I want to highlight the use of:
- for_each to iterate through the virtual_machines variable
- var. to dereference input variables; refer to the last line in the excerpt (var.vm_linked_clone) for an example of this
- the dynamic block, which in conjunction with for_each = each.value.compute_node ? [1] : [] determines whether a virtual machine requires a second disk for Portworx:
resource "vsphere_virtual_machine" "standalone" {
  resource_pool_id = data.vsphere_resource_pool.pool.id
  datastore_id     = data.vsphere_datastore.datastore.id

  for_each = var.virtual_machines

  name     = each.value.name
  memory   = each.value.ram
  num_cpus = each.value.logical_cpu
  guest_id = "ubuntu64Guest"

  network_interface {
    network_id   = data.vsphere_network.network.id
    adapter_type = data.vsphere_virtual_machine.template.network_interface_types[0]
  }

  disk {
    unit_number      = 0
    label            = "OS"
    size             = each.value.os_disk_size
    eagerly_scrub    = data.vsphere_virtual_machine.template.disks.0.eagerly_scrub
    thin_provisioned = data.vsphere_virtual_machine.template.disks.0.thin_provisioned
  }

  dynamic "disk" {
    for_each = each.value.compute_node ? [1] : []
    content {
      unit_number      = 1
      label            = "PX"
      size             = each.value.px_disk_size
      eagerly_scrub    = data.vsphere_virtual_machine.template.disks.0.eagerly_scrub
      thin_provisioned = data.vsphere_virtual_machine.template.disks.0.thin_provisioned
    }
  }

  clone {
    template_uuid = data.vsphere_virtual_machine.template.id
    linked_clone  = var.vm_linked_clone
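The excerpt stops short of closing the clone block. For completeness, in the vSphere provider a clone block of this kind is usually finished off with a customize block that applies the per-machine network settings held in the virtual_machines map. The following is a sketch of how that could look rather than a verbatim copy of the module's code; the domain value in particular is an assumption:

  clone {
    template_uuid = data.vsphere_virtual_machine.template.id
    linked_clone  = var.vm_linked_clone

    customize {
      linux_options {
        host_name = each.value.name
        domain    = "lab.local"              # assumption: in practice this would come from a variable
      }
      network_interface {
        ipv4_address = each.value.ipv4_address
        ipv4_netmask = each.value.ipv4_netmask
      }
      ipv4_gateway    = each.value.ipv4_gateway
      dns_server_list = [each.value.dns_server]
    }
  }
}

The final closing brace ends the vsphere_virtual_machine resource itself.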
Coming Up In Part 4
In the next blog post in this series I will go through the creation of a Kubernetes cluster on top of the virtual machines created via the virtual_machine module.