Running SQL Server On A “Big Box”: Quantifying The Effect Of NUMA on OLTP Workloads: Part 2

This post is an addendum to my previous post; I hope it answers some of the questions that may have popped into readers' heads whilst going through it:

Q: Would affinitizing connections to a NUMA node reduce pressure on the log writer?

A: No, you only ever get one log writer per instance; the only way around this is to ‘shard’ your database across multiple instances.

Q: Will Hekaton reduce pressure on the log writer?

A: Yes; to quote from the Microsoft Research white paper “Hekaton: SQL Server’s Memory-Optimized OLTP Engine”:

Hekaton’s transaction log is designed for high efficiency and scale. Each transaction is logged in a single, potentially large, log record. The log record contains information about all versions inserted and deleted by the transaction, sufficient to redo them. Since the tail of the transaction log is typically a bottleneck, reducing the number of log records appended to the log can improve scalability and significantly increase efficiency. Furthermore the content of the log for each transaction requires less space than systems that generate one log record per operation. Generating a log record only at transaction commit time is possible because Hekaton does not use write-ahead logging (WAL) to force log changes to durable storage before dirty data. Dirty data is never written to durable storage. Furthermore, Hekaton tries to group multiple log records into one large I/O; this is the basis for group commit and also a significant source of efficiency for Hekaton commit processing. Hekaton is designed to support multiple concurrently generated log streams per database to avoid any scaling bottlenecks with the tail of the log. Multiple log streams can be used because serialization order is determined solely by transaction end timestamps and not by ordering in the transaction log. However the integration with SQL Server leverages only a single log stream per database (since SQL Server only has one). This has so far proven sufficient because Hekaton generates much less log data and fewer log writes compared with SQL Server.

The SQL Server 2014 incarnation of Hekaton comes with a lot of restrictions which may make its implementation impractical for a lot of people, unless the need for raw performance is paramount. Hekaton is more feature rich in SQL Server 2016; the best comparison table of Hekaton in SQL Server 2014 versus 2016 can be found here. One of the design aims behind Hekaton was for it to be lock and latch free in nature so that it scales well on servers with lots of CPU cores; I hope to test this in anger once my test server is built.
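For the curious, here is a minimal sketch of what a durable memory-optimized table looks like in SQL Server 2014; the database, filegroup and table names are purely illustrative:

-- A memory-optimized filegroup must exist before any memory-optimized
-- tables can be created in the database.
ALTER DATABASE MyOltpDb
    ADD FILEGROUP MyOltpDb_mod CONTAINS MEMORY_OPTIMIZED_DATA;

ALTER DATABASE MyOltpDb
    ADD FILE (NAME = 'MyOltpDb_mod', FILENAME = 'C:\Data\MyOltpDb_mod')
    TO FILEGROUP MyOltpDb_mod;
GO

-- DURABILITY = SCHEMA_AND_DATA means the table is still logged, albeit
-- far more efficiently than a conventional disk-based table.
CREATE TABLE dbo.Orders
(
    OrderId INT NOT NULL
        PRIMARY KEY NONCLUSTERED HASH WITH (BUCKET_COUNT = 1048576),
    Payload VARCHAR(100) NOT NULL
)
WITH (MEMORY_OPTIMIZED = ON, DURABILITY = SCHEMA_AND_DATA);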

Q: Will delayed durability make any difference?

A: Yes and no: yes for log flushes, but no for contention on the LOGCACHE_ACCESS spinlock, which must be acquired in order to copy logging information into the log buffer.
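To illustrate, delayed durability can be enabled at the database level and then opted into per commit; the database and table names below are purely illustrative:

-- Permit delayed durable commits database-wide (SQL Server 2014 onwards).
ALTER DATABASE MyOltpDb SET DELAYED_DURABILITY = ALLOWED;
GO

-- Opt in per transaction: COMMIT returns before the log block is flushed,
-- trading a small window of potential data loss for fewer log flush waits.
-- The log record is still formatted and copied into the log buffer, hence
-- the LOGCACHE_ACCESS spinlock must still be acquired.
BEGIN TRANSACTION;
    UPDATE dbo.Orders SET Payload = 'x' WHERE OrderId = 1;
COMMIT TRANSACTION WITH (DELAYED_DURABILITY = ON);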

Q: What can I do to reduce LOGCACHE_ACCESS spins?

A: You can take CPU core 0 of NUMA node 0 out of the CPU affinity mask. If you have hyper-threading enabled this requires taking the first two logical processors out of the affinity mask; otherwise just remove the first logical processor.
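As a rough sketch, assuming a two-socket, twenty-core box with hyper-threading enabled (logical processors 0 to 39, with 0 and 1 mapping onto core 0 of NUMA node 0), the affinity mask could be set as follows:

-- Leave logical processors 0 and 1 out of the affinity mask so that user
-- sessions cannot be scheduled on the core the log writer favours.
ALTER SERVER CONFIGURATION
    SET PROCESS AFFINITY CPU = 2 TO 39;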

Q: Can I get around the bottleneck that the log writer and the LOGCACHE_ACCESS spinlock impose by using a NoSQL database?

A: If you are happy to forego the safety of ACID transactions then yes; however, be aware that things such as eventual consistency come with their own nuances, making this approach patently unsuitable for certain applications, anything financial to name but one.

Q: If I were to use a different relational database engine, would this allow me to get around the log writer and LOGCACHE_ACCESS spinlock bottlenecks?

A: As far as I am aware, all other popular relational database engines impose similar bottlenecks on any application that requires ACID transactional behaviour; Oracle, for example, uses something called the redo allocation latch instead of the LOGCACHE_ACCESS spinlock.

Q: Chris, you have yet to demonstrate pressure on the LOGCACHE_ACCESS spinlock in any of your blog posts?

A: Correct, however . . . hold that thought: I have new hardware on order which will give me twenty CPU cores (Haswell Xeon E5 v3) across two CPU sockets ;-).
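In the meantime, for anyone wanting to gauge this on their own hardware, the (largely undocumented) sys.dm_os_spinlock_stats DMV exposes the relevant counters; sample it before and after a load test and diff the numbers:

-- Cumulative counters since instance start-up; high spins_per_collision
-- values for LOGCACHE_ACCESS suggest pressure on the spinlock.
SELECT name, collisions, spins, spins_per_collision, backoffs
FROM sys.dm_os_spinlock_stats
WHERE name = 'LOGCACHE_ACCESS';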

Q: What is your best advice for minimizing log writer and LOGCACHE_ACCESS pressure?

A:

  • In theory Hekaton should help reduce pressure on this; however, the migration path to it is not a smooth one, certainly not in SQL Server 2014, therefore most people will be constrained to using the conventional engine until the RTM version of SQL Server 2016 is released.
  • Go for high end PCIe flash storage and format it for write-optimised performance (if the vendor's tools allow for this); avoid using a SAN or conventional direct attached storage.
  • Follow basic best practice for minimizing logging, i.e. avoiding page splits, unnecessarily long data types etc.
  • Affinitize the database engine away from socket 0 core 0.
  • Try to isolate the workload to a single CPU socket if possible.
  • If you are maxing out a single CPU socket and experiencing pressure on the log writer and its spinlocks, consider sharding your database across instances: affinitize each instance to its own CPU socket such that no instance crosses a NUMA boundary, and set the max memory for each instance (allowing for the OS, thread stack, CLR and backup buffers, assuming SQL Server 2012 onwards) such that it can be accommodated within the bank of memory associated with each socket; a sketch of this per-instance configuration follows the list.
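
As a sketch of that last point, the configuration might look like this, run once against each instance; the memory figure is purely illustrative for a box with 128GB per NUMA node:

-- Bind this instance to NUMA node 0; repeat on the second instance
-- with NUMANODE = 1.
ALTER SERVER CONFIGURATION SET PROCESS AFFINITY NUMANODE = 0;
GO

-- Cap max server memory below the node's 128GB so that the OS, thread
-- stacks, CLR and backup buffers can be accommodated (figure illustrative).
EXEC sp_configure 'show advanced options', 1;
RECONFIGURE;
EXEC sp_configure 'max server memory (MB)', 110000;
RECONFIGURE;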
