I has fortunate enough to attend Thomas Kejser’s super scaling SQL Server training day on the 13 th June, here is a brief synopsis of what was covered:
- The fundamentals of throughput versus response time
- Xperf
- Joining
- The scale-ability of different join types was covered
- Tuning Different Types Of Workload
- Strategies for tuning OLTP versus OLAP workloads
- Getting IO Right
- Single Versus Multi Threaded Performance
- Wait Types
- Spin Locks
This day long session was level 400 and covered the scalability limits of various techniques employed when developing SQL Server applications. For those familiar with Thomas’s “Grade Of The Steel” blog post series, it had a very similar feel to this, it covered things Thomas has already covered here and then some things he has not to date blogged about. Throughput the material Thomas highlighted where things break down, core concepts which need to be grasped in order to make SQL Server scale and he dispelled quite a few things that are held up as best and common practice within the SQL Server community as being myths. I was surprised at the sheer volume of material Thomas managed to get through.
So what did I takeaway from this ?:
- The full power of XPerf when it is wielded in expert hands, the part of the course Thomas presented on finding out where in the call stack latching took place being a case in point. That Xperf shows you the “Whole picture”, one part of the material illustrated that an excessive amount of time was spent on checking data types and that changing the column data types of the table used in the example resulted in a 1.5 times performance improvement. Further slides presented the fact that extended events and DMVs hit brick walls when they allude to things such as ACCESS_METHODS_DATASET_PARENT which have terse descriptions in books online. A further example highlighted a fundamental scalability bottleneck in parallel scans, the resolution being to hash partition the table, I was aware of the use of this technique for scaling out inserts but not parallel scans. I have used Xperf myself on the SSIS data flow engine as per the session I gave at SQL Bits, however I’m not aware of many people in the wider community using it to the extent that Thomas does.
- Where the scalability of core SQL Server concepts breaks down, for example the scale-ability of hash joins breaks down at a MAXDOP of 16.
- Where commonly held beliefs / best practices break down. One particular example that springs to mind is around the use of GUIDs, in particular there is a belief that you do not want expensive NAND flash to be wasted due to fragmentation. For a high volume of concurrent inserts, Thomas illustrates the scalability of this. From my own experience of designing something that had to store click streams in real time, GUIDs on flash was the only practical means of facilitating this. Thomas’s assertion that the parallel query coordinator thread was a myth was eye opening to say the least.
- Thomas’s ‘Fast’ methodology for tuning OLTP and OLAP systems.
- Perhaps one of the most interesting points that was made was that as the speed with which data can be moved around increases, this has an impact on the design of the data model. I would have liked this point to have been covered in more depth. Not that this is a criticism, because Thomas got through a tremendous amount of material.
I enjoyed the day immensely, in summary the course was well constructed and well presented by perhaps one of the few people in the SQL Server community who has expert knowledge of scaling SQL Server at the very highest end it will scale at, and not just the database engine but its associated BI stack also. My only criticism if you can call it one is that there is not a longer version of the course available. A link to the course can be found here and a link to Thomas’s blog can be found here.