Scaling Out CI Pipelines With Azure DevOps, Docker and SQL Server

The seed of the idea behind this blog post first germinated when I noticed the following YAML:

# Publish Test Results
# Publish test results to Azure Pipelines
- task: PublishTestResults@2
  inputs:
    #testResultsFormat: 'JUnit' # Options: JUnit, NUnit, VSTest, xUnit, cTest
    #testResultsFiles: '**/TEST-*.xml'
    #searchFolder: '$(System.DefaultWorkingDirectory)' # Optional
    #mergeTestResults: false # Optional
    #failTaskOnFailedTests: false # Optional
    #testRunTitle: # Optional
    #buildPlatform: # Optional
    #buildConfiguration: # Optional
    #publishRunAttachments: true # Optional

Note the line that includes mergeTestResults. This got me thinking about running multiple tests in parallel, a train of thought which led to this continuous integration pipeline:

A SQL Server Data Tools project is checked out of GitHub and built into a DacPac; four containerized SQL Server instances are spun up using clones of the ‘Seed’ Docker volume. The DacPac is applied to a database running inside each container, a tSQLt test is then executed against each database, and at the very end the tSQLt results are aggregated and published.
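Expressed as a job graph, the pipeline looks like this (job names and dependencies are taken from the full YAML walked through below; job bodies are omitted):

jobs:
- job: CleanEnvironment
- job: BuildTestEnvironment
  dependsOn: CleanEnvironment
- job: BuildDacPac
- job: DeployAndTestDacPac
  dependsOn:
  - BuildDacPac
  - BuildTestEnvironment
- job: PublishTestResults
  dependsOn: DeployAndTestDacPac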

Pipeline Infrastructure

This example uses the following build infrastructure:

Self-hosted Windows and Linux build agents are used in two agent pools. Each pipeline job includes the line:

pool: $(linuxAgentPool)

or

pool: $(winAgentPool)

i.e. the names of the self-hosted Linux and Windows agent pools are taken from pipeline variables. The code for this example can be found in this GitHub repository; the README for the repo contains full details of the pre-requisites for executing the pipeline. The following sections of this post provide a walk-through of each job in the pipeline.
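Before diving in, here is a minimal sketch of how the two pool variables might be declared; the pool names below are placeholders, and the variables could equally be defined in the Azure DevOps UI rather than in the YAML:

variables:
- name: linuxAgentPool
  value: 'Self-Hosted-Linux'    # placeholder pool name
- name: winAgentPool
  value: 'Self-Hosted-Windows'  # placeholder pool name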

YAML Walk Through

I will start off by making a disclaimer: the Docker volume cloning mechanism in this post ultimately uses the REST API of the storage platform my employer sells. However, in order to get to the point where I could leverage this, I had to work out a number of things that were not readily available through the standard channels most developers use. My hope, therefore, is that if nothing else, the hints and tips I discovered along the way will come in useful to other people.

CleanEnvironment Job

trigger:
- master

variables:
- name: ssdtProject
  value: 'AzureDevOps-Fa-Docker-Volume-Ci-Pipeline'

jobs:
#
# --------------------------------------------------------------------------------------
# 1. Ensure environment is clean following any previous runs
# --------------------------------------------------------------------------------------
#
- job: CleanEnvironment
  pool: $(linuxAgentPool)
  steps:
  - script: |
      sudo docker rm -f $(sudo docker ps -a -q)
      sudo docker volume rm -f $(sudo docker volume ls | grep -v $(seedDockerVolume))
    displayName: 'Clean up old sidecar containers and volumes'

This first job cleans up any remnants from the previous execution of the pipeline, namely any sidecar containers and volumes that are still hanging around.
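One thing to watch out for is that docker rm and docker volume rm error when the command substitution produces an empty list, for example on a freshly provisioned agent. A slightly more defensive sketch of the same step, using xargs -r so that empty lists are simply skipped, might look like this:

  - script: |
      # Remove any containers left over from a previous run; -r skips the call when the list is empty
      sudo docker ps -aq | xargs -r sudo docker rm -f
      # Remove all volumes except the 'Seed' volume
      sudo docker volume ls -q | grep -v $(seedDockerVolume) | xargs -r sudo docker volume rm -f
    displayName: 'Clean up old sidecar containers and volumes'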

BuildTestEnvironment Job

The second job in the pipeline is where things start to get interesting:

#
# --------------------------------------------------------------------------------------
# 2. Create SQL Server sidecar containers with cloned copies of the 'Seed' docker volume
# --------------------------------------------------------------------------------------
#
- job: BuildTestEnvironment
  pool: $(linuxAgentPool)
  dependsOn: CleanEnvironment
  strategy:
    matrix:
      sidecar_01:
        container: SideCar01
        externalPort: 1433
        dockerVolume: Volume01
        tSQLtTestClass: Test01
      sidecar_02:
        container: SideCar02
        externalPort: 1434
        dockerVolume: Volume02
        tSQLtTestClass: Test02
      sidecar_03:
        container: SideCar03
        externalPort: 1435
        dockerVolume: Volume03
        tSQLtTestClass: Test03
      sidecar_04:
        container: SideCar04
        externalPort: 1436
        dockerVolume: Volume04
        tSQLtTestClass: Test04
  steps:
  - script: |
      sudo docker volume create --driver pure --opt o=size=80g $(dockerVolume) 
      sudo docker run -v $(dockerVolume):/var/opt/mssql -e 'ACCEPT_EULA=Y' -e 'SA_PASSWORD=$(saPassword)' -p $(externalPort):1433 --name $(container) -d mcr.microsoft.com/mssql/server:2017-latest
      sudo docker stop $(container)
    displayName: 'Create sidecar'
  - template: templates/clone-docker-volume.yml  
    parameters:
      pfaEndpoint:  '$(pfaEndpoint)'
      pfaApiToken:  '$(pfaApiToken)'
      sourceVolume: 'ca-docker-$(seedDockerVolume)'
      targetVolume: 'ca-docker-$(dockerVolume)'
  - script: |
      sudo docker start $(container)

The matrix parallelization strategy is used to spin up four containers in parallel; each container initially has an empty volume. The code inside the template performs the magic of giving each sidecar container a copy of the ‘Seed’ volume:

# File: templates/clone-docker-volume.yml

parameters:
  pfaEndpoint: ''
  pfaApiToken: ''
  sourceVolume: ''
  targetVolume: ''

steps:
- task: PythonScript@0
  inputs:
    scriptSource: 'inline'
    script: |
      ${{ format('#!/usr/bin/env python3
      import purestorage
      import urllib3
      urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
      fa = purestorage.FlashArray(target="{0}", api_token="{1}")
      fa.copy_volume( "{2}", "{3}", overwrite="true")', parameters.pfaEndpoint, parameters.pfaApiToken, parameters.sourceVolume, parameters.targetVolume) }}  

As previously mentioned, I’m leveraging the REST API for Pure Storage’s FlashArray via Python; a similar capability can be leveraged on other storage platforms (including those from Pure) via Windocks. Azure does support Docker volumes; however, at the time of writing I am not aware of any capabilities native to Azure that allow Docker volumes to be cloned.
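The purestorage module used in the template is not part of a stock Python install, so it needs to be available on the Linux build agents before the PythonScript@0 task runs. One way of ensuring this is an extra step in the job, sketched below; pre-installing the module on the agents works just as well:

  - script: |
      # Install the Pure Storage REST client used by the clone-docker-volume.yml template
      python3 -m pip install --user purestorage
    displayName: 'Install purestorage module'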

A Word On Templates

A ‘template’ is the mechanism by which CI pipeline functionality can be shared and reused across pipelines. A template can reside ‘in-line’ in the same repo as the pipeline that invokes it, or in its own dedicated repo (see the sketch at the end of this section). In this example the template is invoked with multiple parameters:

- template: templates/clone-docker-volume.yml
  parameters:
    pfaEndpoint:  '$(pfaEndpoint)'
    pfaApiToken:  '$(pfaApiToken)'
    sourceVolume: 'ca-docker-$(seedDockerVolume)'
    targetVolume: 'ca-docker-$(dockerVolume)'

In order to correctly use these parameters inside the template itself, the format function must be used along with the {0}, {1}, {2} and {3} placeholders:

parameters:
  pfaEndpoint: ''
  pfaApiToken: ''
  sourceVolume: ''
  targetVolume: ''

steps:
- task: PythonScript@0
  inputs:
    scriptSource: 'inline'
    script: |
      ${{ format('#!/usr/bin/env python3
      import purestorage
      import urllib3
      urllib3.disable_warnings(urllib3.exceptions.InsecureRequestWarning)
      fa = purestorage.FlashArray(target="{0}", api_token="{1}")
      fa.copy_volume( "{2}", "{3}", overwrite="true")', parameters.pfaEndpoint, parameters.pfaApiToken, parameters.sourceVolume, parameters.targetVolume) }}  
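As an aside, if the template lived in its own dedicated repository rather than alongside the pipeline, the invoking pipeline would declare that repository as a resource and reference the template with an @ suffix. A minimal sketch, with hypothetical repository and service connection names:

resources:
  repositories:
  - repository: sharedTemplates        # alias used in the template reference (hypothetical)
    type: github
    name: SomeOrg/pipeline-templates   # hypothetical repository
    endpoint: GitHubServiceConnection  # hypothetical service connection

steps:
- template: templates/clone-docker-volume.yml@sharedTemplates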

BuildDacPac Job

- job: BuildDacPac
  pool: $(winAgentPool)
  workspace:
    clean: all
  steps:
  - task: MSBuild@1
    displayName: 'Build DACPAC'
    inputs:
      solution: '$(ssdtProject).sln'
      msbuildArguments: '/property:OutDir=bin\Release'
  - task: CmdLine@2
    inputs:
      script: 'erase $(Build.ArtifactStagingDirectory)\**' 
  - task: CopyFiles@2
    inputs:
      SourceFolder: $(System.DefaultWorkingDirectory)\$(ssdtProject)\bin\Release
      contents: '**'
      targetFolder: $(Build.ArtifactStagingDirectory)
  - task: PublishBuildArtifacts@1
    inputs:
      pathToPublish: $(Build.ArtifactStagingDirectory)
      artifactName: DacPac
  - task: CmdLine@2
    inputs:
      script: 'echo $(Build.ArtifactStagingDirectory)' 

This job turns the SQL Server Data Tools project into a deployable DacPac. The points of interest in this section are the CopyFiles@2 and PublishBuildArtifacts@1 tasks. At the time of writing this blog post, no MSBuild targets exist for building SQL Server Data Tools projects on Linux; to work around this, the pipeline builds the DacPac on a Windows build agent and then distributes it to the downstream jobs as a build artifact.
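As an aside, newer versions of Azure Pipelines also provide shortcut steps for publishing and downloading pipeline artifacts. A rough sketch of the same hand-off using those steps (note that the download location becomes $(Pipeline.Workspace)\DacPac rather than $(System.ArtifactsDirectory)\DacPac):

# In the BuildDacPac job, in place of PublishBuildArtifacts@1:
- publish: $(Build.ArtifactStagingDirectory)
  artifact: DacPac

# In the DeployAndTestDacPac job, in place of DownloadBuildArtifacts@0:
- download: current
  artifact: DacPac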

DeployAndTestDacPac Job

- job: DeployAndTestDacPac
  dependsOn: 
  - BuildDacPac
  - BuildTestEnvironment
  pool: $(winAgentPool)
  strategy:
    matrix:
      sidecar_01:
        externalPort: 1433
        tSQLtTestClass: 'DecL_Tax'
      sidecar_02:
        externalPort: 1434
        tSQLtTestClass: 'IncL_Tax'
      sidecar_03:
        externalPort: 1435
        tSQLtTestClass: 'DecPS_AvailQty'
      sidecar_04:
        externalPort: 1436
        tSQLtTestClass: 'IncPS_AvailQty'
  variables:
  - name: connString
    value: 'server=$(sideCarServerIpAddr),$(externalPort);database=tpch;user id=sa;password=$(saPassword)'
  - name: dacPacPath
    value: '$(System.ArtifactsDirectory)\DacPac\$(ssdtProject).dacpac'
  steps:
  - script: |
      sqlcmd.exe -S $(sideCarServerIpAddr),$(externalPort) -d tpch -U sa -P $(saPassword) -Q "EXEC sp_configure 'Show Advanced Options', 1; RECONFIGURE;"
      sqlcmd.exe -S $(sideCarServerIpAddr),$(externalPort) -d tpch -U sa -P $(saPassword) -Q "EXEC sp_configure 'clr enabled', 1; RECONFIGURE;"
      sqlcmd.exe -S $(sideCarServerIpAddr),$(externalPort) -d tpch -U sa -P $(saPassword) -Q "EXEC sp_configure 'clr strict security', 0; RECONFIGURE;"
      sqlcmd.exe -S $(sideCarServerIpAddr),$(externalPort) -d master -U sa -P $(saPassword) -Q "ALTER DATABASE tpch SET RECOVERY SIMPLE;"
    displayName: 'Prepare sidecar for tSQLtCLR assembly'
  - task: DownloadBuildArtifacts@0
    inputs:
      artifactName: DacPac
      itemPattern: '**' # Optional
      downloadPath: '$(System.ArtifactsDirectory)' 
      parallelizationLimit: '4'

  - script: sqlpackage.exe /Action:Publish /sf:"$(dacPacPath)" /tcs:"$(connString)" /p:ExcludeObjectTypes=Logins
    displayName: 'Publish DacPac'
  - script: sqlcmd.exe -S $(sideCarServerIpAddr),$(externalPort) -d tpch -U sa -P $(saPassword) -Q "EXEC tSQLt.Run $(tSQLtTestClass);"
    displayName: 'Execute tSQLt Tests'
  - script: sqlcmd.exe -S $(sideCarServerIpAddr),$(externalPort) -d tpch -U sa -P $(saPassword) -y0 -Q "SET NOCOUNT ON;EXEC tSQLt.XmlResultFormatter;" -o $(System.DefaultWorkingDirectory)\$(tSQLtTestClass).xml
    displayName: 'Render tSQLt Results in XML'

The strategy section specifies the external port for each sidecar container along with the name of the tSQLt test class to be executed against it. The first script section runs a number of T-SQL statements to prepare each container's database for the tSQLtCLR assembly; the DacPac artifact is then downloaded and published against each database. Finally, a tSQLt test class is executed and its results are rendered as XML to a file named after the test class, ready for the final job to pick up.

PublishTestResults Job

To wrap things up, the results from each tSQLt test are aggregated and published:

- job: PublishTestResults
  dependsOn: 
  - DeployAndTestDacPac
  pool: $(winAgentPool)
  steps:
  - task: PublishTestResults@2
    inputs:
      testResultsFormat: 'JUnit'
      testResultsFiles: '*.xml' 
      searchFolder: '$(System.DefaultWorkingDirectory)' # Optional
      mergeTestResults: true
      failTaskOnFailedTests: true
    displayName: 'Publish tSQLt Test Results'

Takeaways

This post has illustrated the use of:

  • build artifacts in order to pass build output between jobs running in different self-hosted agent pools,
  • the matrix strategy for parallel execution,
  • templates for code reuse,
  • the format function for parameter substitution when invoking code inside a template, and
  • merging results from multiple tests via the PublishTestResults@2 task.
