The previous post in this series covered Kubernetes cluster creation via Kubespray. My intention was to cover load balancing in this post. However, at the time of writing, when you create a SQL Server 2019 big data cluster on premises, all services are created with an endpoint type of NodePort, which I will go into in more detail later in the post. With the benefit of hindsight, it also makes more sense to talk about networking in a later post, once a big data cluster has been stood up.
Signing Up To The Early Adopter Program
Use this link to sign up to the SQL Server 2019 Early Adopter Program; you will need to have done this before you can proceed with the steps in the rest of this blog post. Also be aware that, at the time of writing, the CTP (community technology preview) currently available is 2.2, so some of the steps in this post may change as newer CTP versions roll out.
Pre Flight Checks
This post will focus on creating a big data cluster so that you can get up and running as fast as possible. As such, the storage type used will be ephemeral, which is perfectly acceptable for “kicking the tyres”. Production grade installations require integration with a production grade storage platform via a storage plugin. Before we create our cluster, and assuming we are doing this on an on-premises infrastructure, the following pre-requisites need to be met:
- This blog post series uses Ubuntu 16.04, and each node should be running the 4.15 kernel. Check this via:
The critical piece of information we should see back is the 15 after the 4:
- By default each worker node requires 100GB in order to cache docker images, and these are cached on the root file system of each worker node. Issue:
in order to see the available space per file system in KB:
Filesystem     1K-blocks     Used Available Use% Mounted on
udev              933984        0    933984   0% /dev
tmpfs             192924      796    192128   1% /run
/dev/sda2       19993200 11949656   7004904  64% /
tmpfs             964604        0    964604   0% /dev/shm
tmpfs               5120        0      5120   0% /run/lock
tmpfs             964604        0    964604   0% /sys/fs/cgroup
/dev/loop0         90368    90368         0 100% /snap/core/5897
/dev/loop1         91648    91648         0 100% /snap/core/6130
/dev/loop2         91648    91648         0 100% /snap/core/6034
/dev/sda1         523248     6152    517096   2% /boot/efi
tmpfs             192920        0    192920   0% /run/user/1000
The default size for persistent storage volumes for the storage pool is 6GB
- Perform a very basic sanity test on the cluster. First, let's make sure that all of the system pods are in a state of Running. Issue:
kubectl get po -n kube-system
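For anyone who would rather script these checks, here is a rough sketch. It assumes uname -r is used for the kernel release and df for free space, and the function names are made up for illustration rather than being part of any tool.

```shell
#!/bin/sh
# Pre-flight sketch. All function names here are hypothetical helpers.

# Succeeds if a kernel release string (as printed by `uname -r`) is a 4.15 kernel.
kernel_is_4_15() {
    case "$1" in
        4.15.*) return 0 ;;
        *)      return 1 ;;
    esac
}

# Succeeds if the available space in KB (first argument) is at least 100GB.
has_image_cache_space() {
    required_kb=$((100 * 1024 * 1024))
    [ "$1" -ge "$required_kb" ]
}

# Reads `kubectl get po --no-headers` output on stdin and succeeds only if
# every pod has a STATUS (third column) of Running.
all_pods_running() {
    awk '$3 != "Running" { bad = 1 } END { exit (bad ? 1 : 0) }'
}

# Example usage on a worker node (left commented out so nothing runs by accident):
# kernel_is_4_15 "$(uname -r)" || echo "kernel is not 4.15"
# avail_kb=$(df -Pk / | awk 'NR == 2 { print $4 }')
# has_image_cache_space "$avail_kb" || echo "less than 100GB free on /"
# kubectl get po -n kube-system --no-headers | all_pods_running || echo "some system pods are not Running"
```

The df -P flag keeps each file system on a single line, which makes the awk column lookup dependable.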
Kubernetes, to quote Kelsey Hightower, is a platform for building platforms. The ubiquitous tools for creating objects in a cluster when building your platform are kubectl and helm. Getting familiar with Kubernetes will require a shift in thinking for most Microsoft data platform professionals. It is probably for this very reason that Microsoft has elected to go with a tool it has written itself in Python to carry out the heavy lifting for you. This tool is mssqlctl; at the time of writing its current incarnation is CTP 2.2, and it is installed as follows:
sudo -H pip3 install --extra-index-url https://private-repo.microsoft.com/python/ctp-2.2 mssqlctl
Cluster Creation Configuration
The configuration for the cluster we will build is stored in a number of environment variables, and there are several ways of managing these:
- Specify the environment variables in the bash profile by adding the relevant lines to the .bashrc file located in the home directory of the Linux user that the big data cluster will be created under.
- Put the environment variables in a script and then source it in the current environment before creating the big data cluster, e.g. create a file called bdc.env and source it as follows (assuming that it resides in the current working directory):
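As a minimal sketch of the second option, the wrapper below sources bdc.env into the current shell and complains if the file is missing. The function name load_bdc_env is hypothetical; the plain one-liner equivalent is `. ./bdc.env` (or `source ./bdc.env` in bash).

```shell
# Source an environment file into the current shell.
# Defaults to ./bdc.env in the current working directory.
load_bdc_env() {
    env_file="${1:-./bdc.env}"
    if [ ! -f "$env_file" ]; then
        echo "cannot find $env_file" >&2
        return 1
    fi
    # "." is the POSIX spelling of bash's "source"
    . "$env_file"
}

# Example usage:
# load_bdc_env            # loads ./bdc.env
# load_bdc_env /path/to/bdc.env
```

Variables only persist if the function is called in the shell that will later run mssqlctl; running it in a subshell or a separate script has no lasting effect.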
The bare minimum environment variables that need to be specified are:
export ACCEPT_EULA=Y
export CLUSTER_PLATFORM=kubernetes
export CONTROLLER_USERNAME="<choose your own username>"
export CONTROLLER_PASSWORD="<choose your own password>"
export KNOX_PASSWORD="<choose your own password>"
export MSSQL_SA_PASSWORD="<choose your own password>"
export DOCKER_REGISTRY="<value supplied by microsoft>"
export DOCKER_REPOSITORY="<value supplied by microsoft>"
export DOCKER_USERNAME="<Early adopter program sign up email address>"
export DOCKER_PASSWORD="<value supplied by microsoft>"
export DOCKER_EMAIL="<Early adopter program sign up email address>"
export DOCKER_PRIVATE_REGISTRY="1"
export USE_PERSISTENT_VOLUME="false"
<value supplied by microsoft> refers to values provided by Microsoft once you are successfully enrolled in the big data clusters early adopter program.
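Before creating the cluster it can be worth confirming that none of these variables has been forgotten. The following is a small sketch of such a check; the variable names come from the list above, while missing_bdc_vars is a hypothetical helper of my own.

```shell
# The minimum set of variables listed above.
REQUIRED_VARS="ACCEPT_EULA CLUSTER_PLATFORM CONTROLLER_USERNAME CONTROLLER_PASSWORD KNOX_PASSWORD MSSQL_SA_PASSWORD DOCKER_REGISTRY DOCKER_REPOSITORY DOCKER_USERNAME DOCKER_PASSWORD DOCKER_EMAIL DOCKER_PRIVATE_REGISTRY USE_PERSISTENT_VOLUME"

# Print the name of every required variable that is unset or empty.
missing_bdc_vars() {
    for name in $REQUIRED_VARS; do
        # Indirect lookup of the variable whose name is held in $name
        eval "value=\${$name:-}"
        [ -n "$value" ] || echo "$name"
    done
}

# Example usage:
# missing=$(missing_bdc_vars)
# [ -z "$missing" ] || echo "unset variables: $missing"
```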
Big Data Cluster Creation
Creating a cluster is as simple as the following command line:
mssqlctl create <name-of-cluster>
Where things get interesting is when it comes to troubleshooting. The cluster creation process starts off by creating the controller container, and the progress of this can be tracked via kubectl as follows:
kubectl get po -n <name-of-cluster>
The documentation refers to the use of kubectl logs in order to inspect the logs of your containers and pods, however this only works for pods which have got past the stage of creation:
kubectl logs <pod-name> --all-containers=true -n <name-of-cluster>
In the event that the controller pod is stuck in a state of creating, run the following command to see what has caused the pod to get stuck in this state:
kubectl describe pod <pod-name> -n <name-of-cluster>
Other places that may shed light on any potential problems include the Kubernetes logs on each node to be found under /var/log/containers and the syslog files under /var/log.
Our experience has been that the big data cluster creation process stalls when there is insufficient space on the worker nodes to download the docker images for the cluster.
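When several pods are stuck at once, describing them one by one gets tedious. Here is a rough sketch that describes every pod in the cluster's namespace that is not yet Running or Completed. The function name is hypothetical, and it assumes the pods live in a namespace named after the cluster, as in the commands above.

```shell
# Describe each pod in the given namespace whose STATUS column is
# neither Running nor Completed.
describe_stuck_pods() {
    namespace="$1"
    kubectl get po -n "$namespace" --no-headers |
        awk '$3 != "Running" && $3 != "Completed" { print $1 }' |
        while read -r pod; do
            echo "=== $pod ==="
            kubectl describe pod "$pod" -n "$namespace"
        done
}

# Example usage:
# describe_stuck_pods <name-of-cluster>
```

The Events section at the bottom of each description is usually the quickest place to spot image pull failures or disk pressure on a node.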
Connecting To Your Cluster
Services created using Kubernetes-as-a-service platforms with public cloud providers have load balancing endpoints by default, i.e. external clients connect to an IP address and the requests from these clients are load balanced across the pods associated with the service. For vanilla Kubernetes on-premises installations, separate provision has to be made for a load balancer. At the time of writing the easiest way to do this is to use MetalLB with its layer 2 network configuration.
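Without a load balancer in place, the services of the big data cluster are reached through their node ports. A quick way to see which ones are exposed is to list the services in the cluster's namespace and keep only the NodePort entries. This is a sketch; nodeport_services is a hypothetical helper name.

```shell
# Print the name and port mapping of each NodePort service in a namespace.
# `kubectl get svc` columns: NAME TYPE CLUSTER-IP EXTERNAL-IP PORT(S) AGE
nodeport_services() {
    kubectl get svc -n "$1" --no-headers |
        awk '$2 == "NodePort" { print $1, $5 }'
}

# Example usage:
# nodeport_services <name-of-cluster>
```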
Note the service-proxy-nodeport with an external port of 30777, this provides external access to the big data cluster console. For an on-premises installation of a SQL Server 2019 big data cluster, this is accessed via the following URL:
http://<ip address of any worker node host>:30777/portal
When prompted for a username and password, enter the strings associated with the CONTROLLER_USERNAME and CONTROLLER_PASSWORD environment variables respectively. This is what the portal looks like when you first get into it:
For those wondering where Grafana comes into play, this is how various stats derived from the cluster are rendered. As an example of how to see such stats, in the left hand pane of the console click on controller, then monitoring service and finally the view link under node metrics. This, or something very similar, is what you should see:
Let's connect to the cluster with something else: Azure Data Studio. Fire Azure Data Studio up, hit add new connection in the servers pane and enter the following:
- Server: the IP address of any of the worker nodes followed by ,31443
- Authentication type
- Password: the string value that the MSSQL_SA_PASSWORD environment variable is set to
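The same connection details can be tried from the command line. The sketch below assumes that sqlcmd is installed and that the login is sa (inferred from the MSSQL_SA_PASSWORD variable name); bdc_query is a hypothetical wrapper of my own.

```shell
# Run a query against the SQL Server master instance through node port 31443.
# $1 = IP address of any worker node, $2 = T-SQL to run
bdc_query() {
    sqlcmd -S "$1,31443" -U sa -P "$MSSQL_SA_PASSWORD" -Q "$2"
}

# Example usage (WORKER_NODE_IP is a placeholder you set yourself):
# bdc_query "$WORKER_NODE_IP" "SELECT @@VERSION"
```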
Coming Up In Part 4
So far you have enough to kick the tyres with an on-premises installation of a SQL Server 2019 big data cluster. However, because ephemeral storage is being used, the minute a pod is scheduled to run on a node other than the one it created some data on, you will lose that data. For production purposes persistent volumes are required, and this will be the focus of the next post in the series. There is also the small matter of backing up and restoring your data.