Discovery - when your IO Subsystem out performs processor Cores - how parallelisation really is our friend

Суббота, 23 Июля 2011 г. 17:12 + в цитатник

CPU Saturation – an over performing IO subsystem

It’s not often that you see the IO subsystem able to out drive the cores in the box. To demonstrate this behaviour the TestGuid_HeapInsert table (see http://sqlblogcasts.com/blogs/tonyrogerson/archive/2011/07/19/use-guid-s-uniqueidentifier-for-keys-when-storing-data-on-ssd-s-part-2-initial-database-schema.aspx for set up scripts), a 50 million row table (a heap) 54GBytes in size will be used, processor affinity will be used to lock SQL Server to a specific number of available cores.

Note: the equipment for this test is described here: http://sqlblogcasts.com/blogs/tonyrogerson/archive/2011/07/22/use-guid-s-uniqueidentifier-for-keys-when-storing-data-on-ssd-s-part-3-initial-base-line-with-iometer-and-first-test-group.aspx. In summary, Windows 2008 R2 x64, SQL 2008 R2 x64, 16GB of RAM but SQL Server is limited to 1GB, processor is a AMD Phenom II X6 Six Core 1090T Black Edition.

Interestingly this behaviour will not show up when using the COUNT(*) function without any WHERE clause, performance is consistent regardless of the number of cores used, however, once you have a Compute Scalar in the plan then the core scalability issue described kicks in.

select COUNT(*) from TestGuid_HeapINSERT with ( index = 0 )

Average Disk Bytes/read 404,396
Average Disk Sec/read 0.001
Average Disk Reads/sec 2,806
Average Read Bytes/sec 1,134,827,393

The SQL below has been used; the index hint used to simplify the experiment and make sure a table scan is being performed:

select MAX( cast( expandrow as bigint ) ) from TestGuid_HeapINSERT with ( index = 0 )

Cores	Avg Disk Bytes/Read	Avg Disk Sec/Read	Avg Disk Reads/Sec	Avg Read Bytes/Sec	Logical Reads	Read-Ahead Reads	CPU Time (ms)	Elapsed (ms)
1	411,619	0.000	622	256,306,080	7,142,864	7,142,864	211,225	228,679
2	406,215	0.001	1,244	506,625,778	7,142,864	7,142,864	216,249	116,102
3	407,495	0.001	1,833	746,940,222	7,142,864	7,142,754	220,336	78,957
4	409,191	0.001	2,392	978,956,526	7,142,864	7,142,864	223,797	60,129
5	408,588	0.001	2,795	1,142,225,279	7,142,864	7,142,864	225,248	51,388
6	408,588	0.001	3,270	1,324,789,338	7,142,864	7,142,864	229,429	44,631

Reviewing the figures above Average Disk Bytes per read, Average Disk Sec/Read, Logical Reads, Read-ahead reads and CPU Time (ms) remain fairly constant, however Average Disk Reads/Sec and Average Read Bytes/Sec are stepped and Elapsed (ms) reduces dramatically with a second core then savings diminish as cores are added.

Cores	Avg Disk Reads/Sec	Avg Read Bytes/Sec	Elapsed (ms)
2	622	250,319,698	-112,577
3	589	240,314,444	-37,145
4	559	232,016,304	-18,828
5	403	163,268,753	-8,741
6	475	182,564,059	-6,757

It should be noted, this problem isn’t actually specific to the IO subsystem, and the problem occurs if you have enough of your table in the buffer pool too, essentially the core cannot get the data through quick enough.

Relating to the real world

The figures above show that given a query that has no parallelisation and therefore runs on a single core the duration will be 512% of the capability of the box should all cores be used and will use only 19% of the availability Read Bytes/sec maximum.

In past versions of SQL Server we have been used to turning parallelism off because it more often than not extended the duration of the query, which was fixed in SQL Server 2008 where the parallelism does now work well. However, legacy systems and code and a perpetual myth that needs breaking is that MAXDOP should be removed and just let SQL Server get on with it.

It is true that on a system with a number of concurrent connections that this problem will balance itself out, however, what about the sequential overnight batch jobs that so many of us have?

No direct advice here (yet), may be that will come when I finally do my conclusions around these SSD benchmarks, but you need to be aware that this problem is here and as IO subsystems perform better (which mine does) then these issues will need addressing.