Scaling Out Singleton Insert Workloads Using Containers: Part 1

A word of warning before reading on: this post is ‘conceptual’ in nature, because the containers I spun up in my tests fell over with core dumps. Nonetheless, I will put forward my thesis behind why you might want to do this.

Bottlenecks In The Legacy Engine

For what seemed like the first half of 2017 I blogged and presented on the SQL Server community circuit about the pressure points in the legacy database engine. My work came to two main conclusions:

  1. There is an inherent spinlock bottleneck in the legacy engine around the XDESMGR spinlock (used to serialise the distribution of transaction IDs) and the LOGCACHE_ACCESS spinlock.
  2. Spinlock bottlenecks are exacerbated by the nuances of the cache architecture of modern CPUs. This can be illustrated by running a single threaded singleton insert workload first on the same CPU socket as the log writer and then on a different socket: the test running on the same socket as the log writer completes in the shortest amount of time. To understand why, you need to drill down into the way that spinlocks work. When a session's thread wishes to acquire a spinlock, the cache line associated with it (in the case of the LOGCACHE_ACCESS spinlock) needs to travel from the CPU core hosting the log writer thread to the CPU core hosting the session's thread. Once the cache line has completed this journey, it is subjected to a compare and swap (CAS) operation involving the SPID of the session. When the spinlock is released, its cache line has to make the same journey in the opposite direction. The further apart the two CPU cores in question are, the more CPU cycles are burned in this process. Ergo, the closer the singleton insert workload runs to the log writer thread, the less time it takes to complete.

Whilst blogging on this subject, the following image was ubiquitous in many of my blog posts:

[Image: Untitled]

A Recap On Memory Management 101

The unit of transfer between main memory and the CPU is the cache line, which is 64 bytes in size. All memory access is CPU intensive: whilst data and/or instructions are being accessed, CPU cycles are burned regardless of whether they reside in main memory or the CPU cache hierarchy, and there are no waits or locks involved in pure memory access. The final point is that the number of CPU cycles burned when accessing data or instructions is proportional to the distance the cache line has to travel. The best case scenario is that the data or instruction required already resides in a CPU core's register, in which case we are talking about fewer than four CPU cycles, depending on the CPU instruction being executed. The worst case scenario is main memory access, which can cost 160 cycles or more if there is also a translation lookaside buffer (TLB) miss.

Sharding Workloads Across Instances

An approach I mused about in order to get the best possible utilisation from a multi socket server was to ‘Affinitize’ a SQL Server instance to each socket and then shard the workload across databases on each instance. The benefits of doing this would be two-fold:

  1. Two instances = 2 x the number of available XDESMGR and LOGCACHE_ACCESS spinlocks.
  2. The worst case scenario for moving the cache lines associated with the spinlocks would be the latency of the last level (L3) cache, which is less than going from CPU socket to CPU socket via the QuickPath Interconnect, as illustrated by the graph below:

[Graph: memory access latency]

Containers To The Rescue !!!

What if each singleton insert thread had its own database residing on a dedicated instance? If each instance could be tied down to adjacent CPU cores, this would create the best case scenario in terms of minimizing the CPU cycles burned when passing spinlock cache lines around. It just so happens that Docker allows the logical CPUs a container uses to be specified at spin-up time via the cpuset argument. Because SQL Server on Linux “is a thing”, I decided to rebuild my trusty home server using:

  • Ubuntu 16.04 LTS
  • PowerShell for Linux
  • Docker CE 17.12.0-ce, build c97c6d6
  • SQL Server 2017 docker image, the latest version available as of 29/12/2017

The end game to all of this was to achieve linear scalability across all available CPU cores in the server. 

I came up with this PowerShell script for spinning up the containers, creating a SingletonInsert database on each one and then executing one single threaded singleton insert workload per container:

Clear-Host
docker ps -a -q | ForEach-Object { docker rm -f $_ }

$MaxContainers=12

$CpuSets = @( "0,3"  , "4,7"  , "8,11"  , "12,15"  , "16,19"  , "20,23", "24,27", "28,31", "32,35", "36,39",
              "40,43", "44,47" )

for($i=0; $i -lt $MaxContainers; $i++) {
    $CpuSet = $CpuSets[$i]

    $DockerCmd = "docker run --cpuset-cpus=$CpuSet -e `"ACCEPT_EULA=Y`" -e `"MSSQL_SA_PASSWORD=P@ssw0rd!`" " +`
                 "-p " + [string](15560 + $i) + ":1433 --name SqlLinux$i -d microsoft/mssql-server-linux:2017-latest"
Invoke-Expression $DockerCmd
}

Start-Sleep -s 20

for($i=0; $i -lt $MaxContainers; $i++) {
    $DockerCmd = "docker exec -it SqlLinux$i /opt/mssql-tools/bin/sqlcmd "              +`
                 "-S localhost -U SA -P `"P@ssw0rd!`" "                                 +`
                 "-Q `"CREATE DATABASE SingletonInsert;"                                +`
                      "ALTER DATABASE SingletonInsert SET DELAYED_DURABILITY=FORCED`""
    $DockerCmd

    Invoke-Expression $DockerCmd

    $DockerCmd = "docker exec -it SqlLinux$i /opt/mssql-tools/bin/sqlcmd "              +`
                 "-S localhost -d SingletonInsert -U SA -P `"P@ssw0rd!`" "              +`
                 "-Q `"CREATE TABLE [dbo].[si] ( "                                      +`
                      "    c1 UNIQUEIDENTIFIER NOT NULL DEFAULT NEWID() PRIMARY KEY );`""

    Invoke-Expression $DockerCmd

    $DockerCmd = "docker exec -it SqlLinux$i /opt/mssql-tools/bin/sqlcmd "              +`
    "-S localhost -d SingletonInsert -U SA -P `"P@ssw0rd!`" "                           +`
    "-Q `"CREATE PROCEDURE [dbo].[usp_Insert] AS "                                      +`
         "BEGIN "                                                                       +`
         "    SET NOCOUNT ON "                                                          +`
         "    DECLARE @i INTEGER = 0; "                                                 +`
         "    WHILE @i < 1000000000 "                                                   +`
         "    BEGIN "                                                                   +`
         "        INSERT INTO si DEFAULT VALUES; "                                      +`
         "        SET @i += 1; "                                                        +`
         "    END; "                                                                    +`
         "END;`""

    Invoke-Expression $DockerCmd
}

$RunWorkload = { param ($LoopIndex)
    $DockerCmd = "docker exec SqlLinux$LoopIndex /opt/mssql-tools/bin/sqlcmd " +`
                 "-S localhost -U sa -P `'P@ssw0rd!`' -d SingletonInsert " +`
                 "-Q `'EXEC usp_Insert`'"
    $DockerCmd
    Invoke-Expression $DockerCmd
}

$InsertRates = @()

for($i=0; $i -lt $MaxContainers; $i++) {
    $StartMs = Get-Date

    for($j=0; $j -lt ($i + 1); $j++) {
        Start-Job -ScriptBlock $RunWorkload -ArgumentList $j
    }
    while (Get-Job -State Running) { Start-Sleep -Milliseconds 10 }

    $EndMs = Get-Date
    Write-Host "Duration = " (($EndMs - $StartMs).TotalMilliseconds)
    Get-Job | Remove-Job

    $InsertRate = (($i + 1) * 1000000000 * 1000)/(($EndMs - $StartMs).TotalMilliseconds)
    $InsertRates += $InsertRate
}

Clear-Host

for($i=0; $i -lt $MaxContainers; $i++) {
    Write-Host "Inserts/s using " ($i + 1) " containers = " $InsertRates[$i]
}

After ironing out some kinks in the script, I found that not all twenty containers started up (note that this screenshot was taken some time after the last invocation of the script was run):

cadkin@ubuntu:~$ docker ps -a
[sudo] password for cadkin: 
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
e9370509b67c microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Exited (1) 17 hours ago SqlLinux19
c8fcd5d346b0 microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Exited (1) 17 hours ago SqlLinux18
feed2a44559e microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Exited (1) 17 hours ago SqlLinux17
c39325f2a1cf microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Exited (1) 17 hours ago SqlLinux16
af9373fdd1c1 microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Exited (1) 17 hours ago SqlLinux15
77a92099448a microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Exited (1) 17 hours ago SqlLinux14
41443910d1ee microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Exited (1) 17 hours ago SqlLinux13
a261eea040ca microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Exited (1) 17 hours ago SqlLinux12
148627655d39 microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Exited (1) 17 hours ago SqlLinux11
f7bf0775312a microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Exited (1) 17 hours ago SqlLinux10
39f4cb71b9d3 microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Exited (1) 17 hours ago SqlLinux9
b7ef74de5127 microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Up 17 hours 0.0.0.0:15568->1433/tcp SqlLinux8
db91dc2f0c24 microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Up 17 hours 0.0.0.0:15567->1433/tcp SqlLinux7
5ef07d08a682 microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Up 17 hours 0.0.0.0:15566->1433/tcp SqlLinux6
53aea9fd7260 microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Up 17 hours 0.0.0.0:15565->1433/tcp SqlLinux5
ff9428020511 microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Up 17 hours 0.0.0.0:15564->1433/tcp SqlLinux4
76139408c21b microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Up 17 hours 0.0.0.0:15563->1433/tcp SqlLinux3
b5ba59af259d microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Up 17 hours 0.0.0.0:15562->1433/tcp SqlLinux2
d4960fc52ae8 microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Up 17 hours 0.0.0.0:15561->1433/tcp SqlLinux1
fbe948d0bdc9 microsoft/mssql-server-linux:2017-latest "/bin/sh -c /opt/m..." 17 hours ago Up 17 hours 0.0.0.0:15560->1433/tcp SqlLinux0

Running docker logs against some of the containers that had exited revealed something I was not expecting to see:

cadkin@ubuntu:~$ docker logs SqlLinux9
This program has encountered a fatal error and cannot continue running.
The following diagnostic information is available:

       Reason: 0x00000003
      Message: result == 0
   Stacktrace: 0000565228a5c928 0000565228a5c4d0 
      Process: 9 - sqlservr
       Thread: 13 (application thread 0x1000)
  Instance Id: 9dd6e659-44eb-4204-a3d8-574e0655c184
     Crash Id: c2d78278-bdcc-43dc-aac8-a5124cfbd18a
  Build stamp: 440ff4daa7285d760aa0f372321cea94bd5f87ac313be8b29efa1ced69448c93

Capturing core dump and information...
No journal files were found.
No journal files were found.
Attempting to capture a dump with paldumper
WARNING: Capture attempt failure detected
Attempting to capture a filtered dump with paldumper
WARNING: Attempt to capture dump failed.  Reference /var/opt/mssql/log/core.sqlservr.9.temp/log/paldumper-debug.log for details
Attempting to capture a dump with gdb
WARNING: Unable to capture crash dump with GDB. You may need to
allow ptrace debugging, enable the CAP_SYS_PTRACE capability, or
run as root.

Although I was hoping to present some nice graphs that would prove or disprove my theory, this was not to be. However, I will endeavour to come up with a test environment that works; whether the fix will be moving to the next long term support release of Ubuntu, a later version of the SQL Server Docker image or a later version of the Docker engine, I am unsure at present. This is most definitely something I intend to revisit, therefore, watch this space !!!

Addendum 04/01/2018

It appears that my Ubuntu issues were (in part) due to the cgroup and memory accounting settings in the GRUB configuration file on my Ubuntu server, as detailed in this link. Using this in conjunction with specifying the maximum memory size when starting up the containers resolved the core dump issue. However, some of my containers were still falling over; my suspicion is that my swap partition is not configured correctly. The good news is that I managed to get my test working on a Windows Server 2016 machine using Docker CE, the results of which will be presented in part 2 of this blog post.

The second update is that my PowerShell script (inline further up this blog post) has been updated in order to remove some bugs and make the results easier to extract from the script's output. I will enhance this script as part of the effort that will go into writing part 2 of this blog post and put it on GitHub.
