Tony Rogerson is a freelance Database Specialist based in the UK and a leading expert on Microsoft SQL Server; this is his blog, fed from his day-to-day ramblings | tonyrogerson@torver.net mobile: 0796 816 0362 LinkedIn Profile |
My blog has now moved to http://dataidol.com/tonyrogerson |
I've decided to implement a new blogging infrastructure based on WordPress; the community server stuff I'm using has got a bit tired and WordPress offers a wealth of plug-ins to do pretty much everything I want.
So, see http://dataidol.com/tonyrogerson for my new blog. It's not just a move: I have broadened my coverage to encompass all things data. The first couple of posts talk about short-stroking hard disks (something I've presented on a number of times now), and I've also got some Erlang content.
Anyway, enjoy!
T
|
UK SQL Server (SQL Relay 2012) - 5 cities of full day content 21st May - 30th May - Overview and Deepdive - It's free too |
Last October (2011) we held the first ever "SQL Relay". The title was chosen because it was held on five consecutive days, with each of the UK regional SQL Server User Groups hosting an evening event in relay fashion.
These events were a tremendous success; we all had a lot of FUN hosting them, and received some excellent feedback from both our regular and first-time attendees.
Having known about the impending release of SQL Server 2012 for a while now, the same UG Leaders and a few new ones have been working tirelessly behind the scenes to organize the “SQL” (sequel) to “SQL Relay”, the imaginatively titled “SQL Relay 2012” – Hey, we’re DBAs not marketing gurus!
Once again we will be holding five days of regional events. The various user group leaders have been working very hard in the background, and this time we are providing five full-day events, and they are FREE.
These full-day events will consist of a morning overview of SQL Server 2012, given by Microsoft speakers and Partners. The afternoon sessions will focus on deeper technical content on the 2012 features, delivered by UK SQL MVPs (Microsoft Most Valuable Professionals). The evening events are a great way to complete the day, or to join us if you weren’t able to attend the morning or afternoon sessions; their style and content will vary from location to location – more details to follow soon.
Each of the five events has its own registration, and there are limits on the number of attendees each event can hold, so be sure to register as early as possible to secure your place. You can attend as many or as few sessions as you wish; just indicate your preference when registering.
You may be asking “How much does it cost to attend?” These are FREE events, thanks to the HUGE generosity of our sponsors.
When and where are these events being held?
21st May Edinburgh
http://sqlserverfaq.com/?eid=378
Microsoft Edinburgh, Waverley Gate, 2-4 Waterloo Place, Edinburgh, EH1 3EG
22nd May Manchester
http://sqlserverfaq.com/?eid=373
The Co-operative, CIS Tower, Miller Street, Manchester, M60 0AL
23rd May Birmingham
http://sqlserverfaq.com/?eid=357
Lakeside Centre, Aston University's Conference Centre, Birmingham, B4 7ET
24th May Bristol
http://sqlserverfaq.com/?eid=391
Avon Gorge Hotel, Sion Hill, Clifton, Bristol, BS8 4LD
Why not bring along a friend/colleague? Many of you work in teams or have SQL professional friends, so we have a special invitation and competition for you.
Simply register and indicate a friend you’d like to invite – if you both attend, you both have a chance to win a fabulous {book} prize.
Would you like to know more about these events? Fear not, as we will be providing further details over the next few weeks in some follow-up emails.
We like to ensure you get the information right @ your fingertips, so we have also put all the relevant information on the events @ these locations.
FACEBOOK: https://www.facebook.com/SQLRelay2012 & https://www.facebook.com/SQLRelay2012/events
LINKEDIN GROUPS: http://www.linkedin.com/groups?gid=4153765&trk=hb_side_g & http://www.linkedin.com/groups?gid=2904068&trk=hb_side_g
TWITTER: Simply follow this hash tag for all the latest news on the event: #sqlrelay
Want to attend one of the User Group events? Visit our group website www.sqlserverfaq.com for a full listing of all our events. We also have a map of all UK User Groups: http://tsqltidy.com/newmap/map.htm
|
Cost Comparison Hard Disk Drive to Solid State Drive on Price per Gigabyte - dispelling a myth! |
It is often said that Hard Disk Drive storage is significantly cheaper per GiByte than Solid State Devices – this is wholly inaccurate within the database space. People need to look at the cost of the complete solution needed to meet the business requirement, and not just a single component part in isolation.
Buying a single Hitachi Ultrastar 600GB 3.5” SAS 15Krpm hard disk drive will cost approximately £239.60 (http://scan.co.uk, 22nd March 2012) compared to an OCZ 600GB Z-Drive R4 CM84 PCIe costing £2,316.54 (http://scan.co.uk, 22nd March 2012). I’ve not included the FusionIO ioDrive because there is no public pricing available for it – something I never understand; personally, when companies do this I immediately think “what are they hiding?”. Luckily in FusionIO’s case the product is proven, though it is expensive compared to OCZ’s enterprise offerings.
On the face of it the single 15Krpm hard disk has a price per GB of £0.39 and the SSD £3.86; this is what you will see in the press and this is what sales people will use in comparing the two technologies – do not be fooled by this bullshit, people!
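The headline comparison is nothing more than sticker price divided by raw capacity. A minimal sketch of that arithmetic, using the two prices quoted above (the function name is just for illustration); the results land within a penny of the figures quoted in the text:

```python
def price_per_gb(price_gbp: float, capacity_gb: float) -> float:
    """Sticker price divided by raw capacity, in pounds per GB."""
    return price_gbp / capacity_gb


# Prices quoted in the post (scan.co.uk, 22nd March 2012).
hdd = price_per_gb(239.60, 600)    # Hitachi Ultrastar 600GB 15Krpm SAS
ssd = price_per_gb(2316.54, 600)   # OCZ 600GB Z-Drive R4 CM84 PCIe

print(f"HDD: {hdd:.3f} per GB")
print(f"SSD: {ssd:.3f} per GB")
print(f"SSD looks {ssd / hdd:.1f}x dearer on this naive measure")
```

This is the number the sales pitch relies on; it says nothing about how many devices are needed to meet a workload.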
What is the requirement? The database will have a static size of 400GB, kept static through archiving so that growth and trim balance out. The client requires resilience. There will be several hundred call centre staff querying the database; each query reads a small amount of data, but there is no hot spot, so the random access is spread across the entire 400GB. Estimates predict that approximately 4,000 IOps will be required at peak times and, because it’s a call centre system, IO latency is important and must remain below 5ms per IO. The balance between read and write is 70% read, 30% write.
The requirement is now defined and we have three of the most important pieces of the puzzle – space required, estimated IOps and maximum latency per IO.
Something to consider with regard to SQL Server: write activity requires synchronous IO to the storage media, specifically the transaction log. That means the writing thread will wait until the IO is completed and hardened off before it can continue execution. The requirement states that 30% of the system activity will be write, so we can expect a high amount of synchronous activity.
The hardware solution needs to be defined. There are two possible solutions: hard disk based or solid state based. The real question now is how many hard disks are required to achieve the IO throughput, the latency and the resilience; ditto for the solid state.
Hard Drive solution
I ran a test on an HP DL380 with a P410i controller, using IOMeter against a single 15Krpm 146GB SAS drive. The transfer size was 8KiB against a 40GiB file on a freshly formatted disk where the partition is the only partition on the disk; thus the 40GiB file is on the outer edge of the drive, so more sectors can be read before head movement is required. The throughput was:
For 100% sequential IO at a queue depth of 16 with 8 worker threads, 43,537 IOps at an average latency of 2.93ms (340 MiB/s); for 100% random IO at the same queue depth and worker threads, 3,733 IOps at an average latency of 34.06ms (34 MiB/s).
The same test was done on the same disk but with a 130GiB test file: for 100% sequential IO at a queue depth of 16 with 8 worker threads, 43,537 IOps at an average latency of 2.93ms (340 MiB/s); for 100% random IO at the same queue depth and worker threads, 528 IOps at an average latency of 217.49ms (4 MiB/s).
From the result it is clear random performance gets worse as the disk fills up – I’m currently writing an article on short stroking which will cover this in detail.
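The IOps, latency and throughput figures above are tied together by two simple relationships: throughput is IOps multiplied by transfer size, and (by Little's law) average latency is roughly the number of outstanding IOs divided by IOps. The sketch below is only a sanity check, and it assumes each of the 8 workers keeps 16 IOs in flight, giving 128 outstanding IOs; that reading of the IOMeter settings is an assumption, not something stated in the test description.

```python
KIB = 1024
MIB = 1024 * 1024

# Assumption: 8 workers x queue depth 16 = 128 IOs in flight at once.
OUTSTANDING = 8 * 16


def expected_latency_ms(iops: float, outstanding: int = OUTSTANDING) -> float:
    """Little's law: average latency = outstanding IOs / completion rate."""
    return outstanding / iops * 1000


def throughput_mib_s(iops: float, transfer_bytes: int = 8 * KIB) -> float:
    """Throughput is IOps multiplied by the transfer size."""
    return iops * transfer_bytes / MIB


# (IOps, quoted latency in ms) taken from the test results in the post.
results = {
    "40GiB sequential": (43_537, 2.93),
    "40GiB random": (3_733, 34.06),
    "130GiB random": (528, 217.49),
}

for name, (iops, quoted_ms) in results.items():
    print(
        f"{name}: {throughput_mib_s(iops):.1f} MiB/s, "
        f"latency model {expected_latency_ms(iops):.2f} ms "
        f"(quoted {quoted_ms} ms)"
    )
```

Under that assumption the sequential run agrees closely with the quoted latency and throughput, while the random runs agree only roughly, which is about what you would expect from a rule of thumb.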
Given that the workload is random in nature, the random performance of the single drive when only 40 GiB of the 146 GB is used gives near the IOps required, but the latency is way out.
Luckily I have tested 6 x 15Krpm 146GB SAS drives in RAID 0 using the same test methodology. For the same random test on a 130 GiB file, the performance boost is near linear for each drive added: throughput goes up by 5 MiB/sec and IOps by 700, while latency falls roughly in inverse proportion to the number of drives (172 ms, 94 ms, 65 ms, 47 ms, 37 ms, 30 ms for one to six drives). This is because the same 130GiB is spread more thinly as you add drives (130 / 1, 130 / 2, 130 / 3 etc.), so implicit short stroking is occurring: there is less of the file on each drive, so less head movement is required.
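To see how close to linear the scaling is, here is a rough sketch that compares the measured latencies with a simple model in which latency is the single-drive figure divided by the number of drives, and IOps grow by 700 per drive. The model and the function names are illustrative assumptions; only the measured numbers come from the test.

```python
import math

# Measured average latency (ms) for 1..6 drives in RAID 0, from the post.
MEASURED_LATENCY_MS = [172, 94, 65, 47, 37, 30]
IOPS_PER_DRIVE = 700  # approximate gain per drive added, from the post


def modelled_latency_ms(drives: int, single_drive_ms: float = 172.0) -> float:
    """Assumed model: latency falls in inverse proportion to drive count."""
    return single_drive_ms / drives


def drives_for_iops(target_iops: int, per_drive: int = IOPS_PER_DRIVE) -> int:
    """Smallest drive count whose combined IOps meets the target."""
    return math.ceil(target_iops / per_drive)


for n, measured in enumerate(MEASURED_LATENCY_MS, start=1):
    print(f"{n} drive(s): measured {measured} ms, "
          f"model {modelled_latency_ms(n):.1f} ms")

print("Drives needed for 4,000 IOps:", drives_for_iops(4000))
```

The model tracks the measurements reasonably well on this 130GiB file, but it should not be extrapolated blindly to a larger data set, where each drive holds more of the file and short stroking helps less.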
The best latency is still 30 ms, but we now have the IOps required; however, that’s on a 130GiB file and not the 400GiB we need.
A reality check here: the randomness is more likely to be 50/50 and not a full 100%, but the above has highlighted the effect randomness has on the drive, and that the more a drive fills with data the worse the effect.
For argument's sake let us assume that for the given workload we need 8 disks to do the job. For resilience we will need 16, because we need to RAID 1+0 them in order to get both the throughput and the resilience; RAID 5 would degrade performance.
Cost for hard drives: 16 x £239.60 = £3,833.60
For the hard drives we will need disk controllers and a separate external disk array, because the likelihood is that the server itself won’t take the drives. A quick spec off DELL for a PowerVault MD1220, which gives dual pathing, with 16 x 146GB 15Krpm 2.5” disks is priced at £7,438.00; note it’s probably more once we add two controller cards to sit in the server, racking etc.
Minimum cost taking the DELL quote as an example is therefore: {Cost of Hardware} / {Storage Required}
£7,438.00 / 400 = £18.595 per GB
£18.59 per GB is a far cry from the £0.39 we had been told by the salesman and the myth. Yes, the storage array is composed of 16 x 146GB disks in RAID 10 (therefore 8 usable), giving an effective usable capacity of 1168GB, but the actual storage requirement is only 400GB and the extra disks have had to be purchased to get the IOps up.
Solid State Drive solution
A single card significantly exceeds the IOps and latency required; for resilience two will be required.
( £2,316.54 * 2 ) / 400 = £11.58 per GB
With the SSD solution only two PCIe sockets are required, no external disk units, no additional controllers, no redundant controllers etc.
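The two calculations above use the same formula: the cost of the hardware needed to meet the workload, divided by the storage actually required. A small sketch of that comparison in pounds follows; the class and field names are illustrative, and treating the two SSD cards as a mirrored pair with 600GB usable is an assumption.

```python
from dataclasses import dataclass

REQUIRED_GB = 400  # database size from the requirement


@dataclass
class Solution:
    name: str
    hardware_cost_gbp: float
    usable_gb: float

    def cost_per_required_gb(self, required_gb: float = REQUIRED_GB) -> float:
        """Cost of everything bought, spread over the space actually needed."""
        return self.hardware_cost_gbp / required_gb

    def cost_per_usable_gb(self) -> float:
        """Cost spread over all usable space, needed or not."""
        return self.hardware_cost_gbp / self.usable_gb


# DELL PowerVault MD1220 quote: 16 x 146GB in RAID 10, so 8 disks usable.
hdd = Solution("HDD array", 7438.00, usable_gb=8 * 146)
# Two PCIe cards; a mirrored pair (600GB usable) is assumed here.
ssd = Solution("SSD pair", 2 * 2316.54, usable_gb=600)

for s in (hdd, ssd):
    print(f"{s.name}: {s.cost_per_required_gb():.2f} per required GB, "
          f"{s.cost_per_usable_gb():.2f} per usable GB")
```

Dividing by required rather than usable capacity is the whole point of the argument: the spare space on the hard disk array was bought for IOps, not for storage.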
Conclusion
I hope by showing you an example that the myth that hard disk drives are cheaper per GB than Solid State has now been dispelled - £11.58 per GB for SSD compared to £18.59 for Hard Disk.
I’ve not even touched on the running costs; compare the costs of running 16 hard disks – that’s a lot of heat and power compared to two PCIe cards!
Just a quick note: I've left a fair amount of information out due to this being a blog! If in doubt, email me :)
I'll also deal with the myth that SSDs wear out at a later date as well - that's just way overdone still; yes, 5 years ago, but now - no.
|
The enterprise vendor con - connecting SSDs using SATA 2 (3Gbits) thus limiting their performance |
When comparing SSD against hard drive performance it really makes me cross when folk think comparing an array of SSDs running on 3Gbits/sec to hard drives running on 6Gbits/sec is somehow valid. In a paper from DELL (http://www.dell.com/downloads/global/products/pvaul/en/PowerEdge-PowerVaultH800-CacheCade-final.pdf) on increasing database performance using the DELL PERC H800 with Solid State Drives they compare four SSD drives connected at 3Gbits/sec against ten 10Krpm drives connected at 6Gbits/sec [Tony slaps forehead while shouting DOH!].
It is true that in the case of hard drives it probably doesn’t make much difference whether it’s 3Gbit or 6Gbit, because SAS and SATA are both end to end protocols rather than a shared bus architecture like SCSI, so the hard drive doesn’t share bandwidth and probably can’t get near the 600MiBytes/second throughput that 6Gbit gives unless you are doing contiguous reads. In my own tests on a single 15Krpm SAS disk using IOMeter (8 worker threads, queue depth of 16 with a stripe size of 64KiB, an 8KiB transfer size on a drive formatted with an allocation size of 8KiB, for a 100% sequential read test) I only get 347MiBytes per second sustained throughput at an average latency of 2.87ms per IO, equating to 44.5K IOps. Ok, if that was 3Gbits it would be less – around 280MiBytes per second. Oh, but wait a minute [...fingers tap desk]
You’ll struggle to find in the commodity space an SSD that doesn’t have the SATA 3 (6Gbits) interface. SSDs are fast: not only do they offer low latency and high IOps, they also offer a very large sustained transfer rate. Consider the OCZ Agility 3: it so happens that in my masters dissertation I did the same test but on a different box, and I got 374MiBytes per second at an average latency of 2.67ms per IO, equating to 47.9K IOps. The cost of a 240GB Agility 3 is £174.24 (http://www.scan.co.uk/products/240gb-ocz-agility-3-ssd-25-sata-6gb-s-sandforce-2281-read-525mb-s-write-500mb-s-85k-iops), but that same drive set in a box connected with SATA 2 (3Gbits) would only yield around 280MiBytes per second, thus losing almost 100MiBytes per second of throughput and a ton of IOps too.
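The "around 280MiBytes per second" ceiling falls out of the link arithmetic. SATA uses 8b/10b encoding, so only 8 of every 10 bits on the wire carry data; the sketch below converts the line rate into a payload ceiling and caps a drive's measured throughput at it. Further protocol overhead is ignored here, so treat the result as an upper bound, and the function names are purely illustrative.

```python
MIB = 1024 * 1024


def sata_payload_mib_s(line_rate_gbit: float) -> float:
    """Payload ceiling of a SATA link: 8b/10b encoding, then bits to bytes."""
    payload_bytes_per_s = line_rate_gbit * 1e9 * 8 / 10 / 8
    return payload_bytes_per_s / MIB


def capped_throughput_mib_s(drive_mib_s: float, line_rate_gbit: float) -> float:
    """A drive cannot deliver more than its link can carry."""
    return min(drive_mib_s, sata_payload_mib_s(line_rate_gbit))


AGILITY3_MIB_S = 374  # sequential read measured in the post on SATA 3

for rate in (3, 6):
    print(f"{rate} Gbit/s link: ceiling {sata_payload_mib_s(rate):.0f} MiB/s, "
          f"Agility 3 delivers {capped_throughput_mib_s(AGILITY3_MIB_S, rate):.0f} MiB/s")
```

On a 3Gbit link the drive is limited by the interface rather than by the flash, which is exactly the handicap being complained about.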
So why the hell are “enterprise” vendors still only connecting SSDs at 3Gbits? Well, my conspiracy theory is that they have no interest in you moving to SSD because they’ll lose so much money. The argument that they use SATA 2 doesn’t wash: SATA 3 has been out for some time now and all the commodity stuff you buy uses it.
Consider the cost, not in terms of price per GB but price per IOps: SSDs absolutely thrash hard drives on that. It used to be true that the opposite also held, that hard drives thrashed SSDs on price per GB, but is that true now? I’m not so sure – a 300GByte 2.5” 15Krpm SAS drive costs £329.76 ex VAT (http://www.scan.co.uk/products/300gb-seagate-st9300653ss-savvio-15k3-25-hdd-sas-6gb-s-15000rpm-64mb-cache-27ms) which equates to £1.09 per GB, compared to a 480GB OCZ Agility 3 costing £422.10 ex VAT (http://www.scan.co.uk/products/480gb-ocz-agility-3-ssd-25-sata-6gb-s-sandforce-2281-read-525mb-s-write-410mb-s-30k-iops) which equates to £0.88 per GB.
Ok, I compared an “enterprise” hard drive with a “commodity” SSD, so things get a little more complicated here: most “enterprise” SSDs are SLC and most commodity ones are MLC, and SLC gives better performance and wear endurance. I’ll talk about that another day.
For now though, don’t get sucked in by vendor marketing: SATA 2 (3Gbit) just doesn’t cut it. SSDs need 6Gbit to breathe, and they are pushing even that. Alas, SSDs are connected using SATA, and all the controllers I’ve seen thus far from HP and DELL only do SATA 2 – deliberate? Well, I’ll let you decide on that one.
|
Reporting Brick - Reducing the cost of performing Business Intelligence with Commodity Hardware |
Learn how you can load over 44 million rows with an average length of 241 bytes into SQL Server at a rate of over 530K rows per second using kit that costs less than £2K.
https://skydrive.live.com/?mkt=en-gb#cid=DD00BC6E00F55EDF&id=DD00BC6E00F55EDF%21473
The paper is the end result of my two year masters in Business Intelligence at the University of Dundee.
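To put the headline load figures in context, here is a back-of-envelope sketch of what they imply for elapsed time and data rate. It treats "over 44 million" and "over 530K" as exact values, which is an assumption made purely for the arithmetic.

```python
MIB = 1024 * 1024

ROWS = 44_000_000          # "over 44 million rows"
AVG_ROW_BYTES = 241        # average row length
ROWS_PER_SECOND = 530_000  # "over 530K rows per second"


def load_seconds(rows: int, rows_per_second: int) -> float:
    """Elapsed time to load the rows at a steady rate."""
    return rows / rows_per_second


def data_rate_mib_s(rows_per_second: int, row_bytes: int) -> float:
    """Raw row data moved per second, ignoring any storage overhead."""
    return rows_per_second * row_bytes / MIB


print(f"Load time: about {load_seconds(ROWS, ROWS_PER_SECOND):.0f} seconds")
print(f"Data rate: about {data_rate_mib_s(ROWS_PER_SECOND, AVG_ROW_BYTES):.0f} MiB/s")
```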
Later this week I'll be putting http://www.reportingbrick.com live which will be a continuation of the paper where I'll post further research on the subject as time progresses.
This Friday (20th Jan 2012) between 2pm and 3:45pm I'll be at the School of Computing, University of Dundee to demo the kit and answer any questions in person.
|