Deploying Azure Data Services via Terraform Part 5: Deploying a Load Balancer to The Kubernetes Cluster

Our journey up the stack brings us to the installation of MetalLB – a software load balancer for Kubernetes.

All the content in this series of blog posts relates to the Arc-PX-VMware-Faststart repo on GitHub; this specific post relates to the metallb module, which can be found in that repo.

The Terraform module for deploying MetalLB is quite simple – in fact, it is one of the simplest of all the modules that feature in the repo. The subject of making applications running on a Kubernetes cluster available to the outside world is somewhat more nuanced, however, which is why this blog post provides an overview of this whole area.

Services, Node Ports, Load Balancers and Ingress

There are two types of network traffic that a Kubernetes cluster is involved in:

  • ‘East-west’ – internal traffic that takes place on the overlay network; in other words, the cluster’s internal network
  • ‘North-south’ – traffic between the cluster and the outside world

The focus of this blog post is on north-south traffic. In the world of Kubernetes there are two ways (or rather, service types) of facilitating north-south traffic – some people might argue there are three, but an ingress controller still needs to sit behind a load balancer:

  • Node ports
    A service of type NodePort is exposed on every worker node via a specific port; every worker node understands how to route requests to the pods associated with the service (see the sketch after this list).
  • Load Balancers
    A service of type LoadBalancer is exposed via a single IP address; the load balancer then routes traffic to the appropriate pods in the cluster (again, see the sketch after this list).
  • Ingress
    This is a type of Kubernetes object which allows routing rules to be implemented; it is then the job of an “Ingress controller” to perform the actual routing. When using load balancers, each service requires its own load balancer endpoint, whereas a single ingress controller can sit behind one load balancer and act as a routing gateway for many services. At present Ingress only works for HTTP and gRPC – a high-performance RPC framework heavily used by Kubernetes. Also note that Ingress and load balancers operate at different layers of the OSI model: Ingress at layer 7, load balancers of the kind discussed here at layer 4.
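
To make the node port / load balancer distinction concrete, here is a minimal sketch of the same hypothetical nginx application exposed both ways – the names and port numbers are illustrative and not taken from the repo:

apiVersion: v1
kind: Service
metadata:
  name: nginx-nodeport
spec:
  type: NodePort
  selector:
    app: nginx
  ports:
  - port: 80          # port the service listens on inside the cluster
    targetPort: 80    # port the nginx pods listen on
    nodePort: 30080   # reachable via <any-worker-node-ip>:30080 (30000-32767 range)
---
apiVersion: v1
kind: Service
metadata:
  name: nginx-loadbalancer
spec:
  type: LoadBalancer  # a load balancer such as MetalLB assigns a single external IP
  selector:
    app: nginx
  ports:
  - port: 80          # reachable via <external-ip>:80
    targetPort: 80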

Why Mention Ingress At All?

Observant readers will note that most databases and data platforms do not use HTTP or gRPC as client protocols. The whole reason I mention Ingress at all is that Ingress and load balancers are often conflated.

Back to Load Balancers – Why Use Them?

A key Kubernetes tenet is that it provides a clean abstraction between a platform that developers can consume and the infrastructure that underpins it. Whilst node ports are easy to set up and consume, they require that clients know, or can resolve, the IP address(es) of one or more worker nodes, which in my humble opinion makes the abstraction between the platform and the infrastructure leaky. Also, node ports have to use ports in the 30000 to 32767 range so as to avoid clashing with well-known and commonly used ports – 8080, for example. Load balancers provide a much more elegant solution to the problem of how clients can talk to applications running on a Kubernetes cluster.

Introducing MetalLB

MetalLB is a free, open source software load balancer. The metallb module uses MetalLB in its layer 2 configuration mode. Simply put, you create a ConfigMap object with a range of IP addresses that the load balancer can use; then, every time a service of type LoadBalancer is created, MetalLB grabs an IP address from the pool and advertises it to the outside world via ARP (or NDP for IPv6).
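
To illustrate, a rendered layer 2 configuration – using the ConfigMap format MetalLB supported at the time of writing – looks something along these lines; the address range here is purely illustrative:

apiVersion: v1
kind: ConfigMap
metadata:
  namespace: metallb-system
  name: metallb-config
data:
  config: |
    address-pools:
    - name: default
      protocol: layer2
      addresses:
      - 192.168.113.90-192.168.113.99   # pool from which LoadBalancer services get their IPs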

Let’s walk through the configuration for this module resource-by-resource:

  • kubernetes_namespace.metallb_system
    First of all, a Kubernetes namespace is created for the MetalLB pods and ConfigMap objects to reside in:
resource "kubernetes_namespace" "metallb_system" {
  metadata {
    name = "metallb-system"
  }
}
  • helm_release.metallb
    The Helm chart for MetalLB (thanks Bitnami!) is installed on the cluster. For the uninitiated, Helm is effectively a package manager for Kubernetes, in which the packages are referred to as ‘charts’. To install a chart manually you would use helm repo add for the repository that contains the chart you want, followed by helm install for the chart itself; here the helm_release resource does the equivalent for us. Use Helm 3.0 where possible, as it does away with the cluster-side Tiller component, which requires elevated privileges. Because we do not require the default ConfigMap created when the chart is installed, we remove it via a local-exec provisioner.
resource "helm_release" "metallb" {
  name       = "metallb"
  repository = "https://charts.bitnami.com/bitnami"
  chart      = "metallb"
  namespace  = kubernetes_namespace.metallb_system.metadata.0.name

  set {
    name  = "version"
    value = var.helm_chart_version 
  }

  # remove the default ConfigMap created by the chart; we create our own
  # metallb-config ConfigMap below
  provisioner "local-exec" {
    command = "kubectl delete configmap metallb-config -n metallb-system"
  }

  depends_on = [
    kubernetes_namespace.metallb_system
  ]
}
  • kubernetes_config_map.layer2_configuration
    The final thing we do is create a ConfigMap object. To cut a long story short, we plug values into the templatefile function, and this gives MetalLB a range of IP addresses it can use for load balancer services (a sketch of what the template might contain follows the resource below):
resource "kubernetes_config_map" "layer2_configuration" {
  metadata {
    name      = "metallb-config"
    namespace = "metallb-system"
  }

  data = {
    config = templatefile("${path.module}/templates/layer2_configuration.yaml.tpl", {
               ip_range_lower_boundary = var.ip_range_lower_boundary,
               ip_range_upper_boundary = var.ip_range_upper_boundary 
             })
  }

  depends_on = [
    helm_release.metallb 
  ]
}
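
The template file itself is not reproduced in this post; assuming it follows MetalLB’s standard layer 2 schema, layer2_configuration.yaml.tpl would contain something like the following, with the two boundary variables interpolated by templatefile:

address-pools:
- name: default
  protocol: layer2
  addresses:
  - ${ip_range_lower_boundary}-${ip_range_upper_boundary}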

Deploying and Testing The Module

The module is deployed via the following command, which should be executed from the Arc-PX-VMware-Faststart/kubernetes directory:

terraform apply -target=module.metallb -auto-approve

To test this, issue the following command to create a deployment and service using the supplied nginx manifest:

kubectl apply -f modules/metallb/nginx-deployment.yaml
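
For reference, such a manifest pairs an nginx Deployment with a Service of type LoadBalancer; a minimal sketch (not the repo’s file verbatim) might look like this:

apiVersion: apps/v1
kind: Deployment
metadata:
  name: nginx
spec:
  replicas: 2
  selector:
    matchLabels:
      app: nginx
  template:
    metadata:
      labels:
        app: nginx
    spec:
      containers:
      - name: nginx
        image: nginx
        ports:
        - containerPort: 80
---
apiVersion: v1
kind: Service
metadata:
  name: nginx
spec:
  type: LoadBalancer   # MetalLB supplies the EXTERNAL-IP seen in the output below
  selector:
    app: nginx
  ports:
  - port: 8080         # matches the 8080 in the sample output below
    targetPort: 80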

Finally, check that the load balancer service has an external IP address associated with it:

kubectl get svc

The output from this command should contain an IP address in the EXTERNAL-IP column for the nginx service:

NAME         TYPE           CLUSTER-IP     EXTERNAL-IP      PORT(S)          AGE
kubernetes   ClusterIP      10.233.0.1     <none>           443/TCP          6d6h
nginx        LoadBalancer   10.233.58.22   192.168.113.93   8080:30044/TCP   33h

Disclaimer

Please be aware that:

  • MetalLB is not a commercially supported software load balancer
  • Take note of the limitations of using the layer 2 configuration, as per the MetalLB documentation.

Coming Up In Part 6

We are tantalizingly close to deploying a Big Data Cluster or an Azure Arc-enabled Data Services controller; however, before we can do this we need to address the topic of storage, which will be covered by the px-store module in the next post.
