Monday, May 17, 2010

Server 200x RAID tips and tricks

Since 2003, Windows Server has shipped with a very cool feature: software RAID.

Software RAID has two major advantages over hardware.

First, hardware RAID protects you against disk failure, but not against controller failure. The drive array contains proprietary disk allocation information that varies from manufacturer to manufacturer and controller to controller, so disks are not easily movable between different controllers. So when your controller fails (potentially a long time in the future, when the same boards are no longer available), or a new release of the OS stops supporting your older driver, you may be very much out of luck with your data.

Second, software RAID is considerably more flexible than hardware RAID. Hardware RAID operates on disks as atomic units: you RAID whole disks together. Software RAID operates on volumes, and each volume can be configured with a different level of redundancy. For example, you can have two constructs on two disks at the same time: an OS partition that is RAID-1 mirrored with an image on the second disk, and another volume that combines the rest of the space on the two drives as a single span or RAID-0. The level of redundancy is selected per volume, not per disk.

Also, for those of us who like coding, software RAID has a very nice software interface (http://msdn.microsoft.com/en-us/library/bb986750(v=VS.85).aspx and its undocumented managed counterpart in Microsoft.Storage.Vds.dll) which allows one to code simple things like checking the health of the storage and sending an email if something goes bad.

But what about performance? A while ago when we were designing Windows Home Server, we tested various hardware RAID implementations versus software RAID in Server 2003.

It turns out that both RAID-0 and RAID-1 exhibit very similar performance for both hardware and software solutions. If you think about what the system has to do (write the same data to two disks at the same time in the case of RAID-1) it quickly becomes obvious that a hardware implementation does not really add anything over software in this case: both can write the same data in two places at the same time with the same speed. Big surprise :-).

RAID-5 is a different beast though - there actually is a computation going on (the parity block of each stripe is the XOR of its data blocks), and it is possible to build a specialized chip for doing vector XOR operations that would leave the general purpose x86 in the dust.
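To make the computation concrete, here is a small Python sketch of single-parity XOR over one stripe. The block contents and function name are made up, and a real implementation works on much larger blocks with vectorized code, but the arithmetic is the same: the parity block is the XOR of the data blocks, and any one lost block is the XOR of everything that survived.

```python
from functools import reduce
from operator import xor


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte position by byte position."""
    if not blocks:
        raise ValueError("need at least one block")
    length = len(blocks[0])
    if any(len(b) != length for b in blocks):
        raise ValueError("all blocks must have the same length")
    # zip(*blocks) yields one tuple of ints per byte position.
    return bytes(reduce(xor, column) for column in zip(*blocks))


# Three data blocks of one stripe (hypothetical contents).
data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor_blocks(data)

# Simulate losing the second block and rebuilding it from the survivors.
rebuilt = xor_blocks([data[0], data[2], parity])
```

Every write has to keep the parity block consistent with the data it covers, which is the work a dedicated XOR engine would offload.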

A much bigger problem is the lack of integration between the formatting code and the RAID code. When you format a RAID drive, the default allocation unit that the UI presents is very small.

Due to the way software RAID is implemented, this leads to incredibly slow performance. On my relatively powerful system, writes clock in at only 20-30MBps (and this is going to drives that are supposed to sustain 3Gbps, or 300MBps, transfers). Selecting a more reasonable allocation unit of 64k improves the write speed by a factor of four, to almost 120MBps.
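For those wondering how 3Gbps turns into 300MBps rather than 375MBps, here is the arithmetic as a short Python sketch. It assumes the 3Gbps figure is the SATA link rate, which uses 8b/10b encoding (10 bits on the wire per byte of payload). That detail is my addition here, not something measured on the system above.

```python
# Assumption: 3 Gbit/s is the raw SATA link rate, encoded 8b/10b,
# so only 8 of every 10 bits on the wire carry payload.
LINK_RATE_BITS_PER_S = 3_000_000_000

payload_bits_per_s = LINK_RATE_BITS_PER_S * 8 // 10
payload_bytes_per_s = payload_bits_per_s // 8
link_mbps = payload_bytes_per_s / 1_000_000  # megabytes per second


def speedup(before_mbps, after_mbps):
    """Ratio of the new write speed to the old one."""
    return after_mbps / before_mbps


# Upper end of the slow range versus the speed with a 64k allocation unit.
factor = speedup(30, 120)
share_of_link = 120 / link_mbps
```

Even at 120MBps the array uses well under half of what a single link can carry, so the allocation unit, not the interface, was the bottleneck.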

The other format-related performance problem is that after you create a new RAID volume, format and resync happen at the same time by default. I covered it in the previous blog post here: http://1-800-magic.blogspot.com/2010/05/solution-to-slow-formatting-puzzle.html.

In summary, here are two very simple rules that can make your RAID array much faster:
- Select 64k as the allocation unit when formatting a RAID-5 volume.
- When formatting any new RAID volume, use quick format first, wait for the volume to finish resyncing, then repeat with a full format if you like (remember to keep the large allocation unit in the second format, though!)
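
If you would rather script the two rules than click through the UI, the sketch below builds the matching command line for the Windows format command in Python. The helper name is made up, and it only builds the argument list and does not run anything. /FS, /A and /Q are the command's file system, allocation unit size and quick format switches.

```python
def build_format_command(drive_letter, quick=True, allocation_unit="64K"):
    """Return the argument list for the Windows `format` command.

    Nothing is executed here; pass the list to subprocess.run yourself
    once you are sure about the drive letter.
    """
    if len(drive_letter) != 1 or not drive_letter.isalpha():
        raise ValueError("drive_letter must be a single letter, e.g. 'E'")
    args = [
        "format",
        f"{drive_letter.upper()}:",
        "/FS:NTFS",
        f"/A:{allocation_unit}",  # rule 1: large allocation unit
    ]
    if quick:
        args.append("/Q")  # rule 2: quick format before the resync finishes
    return args


# First pass: quick format with 64k allocation units on a hypothetical E: volume.
first_pass = build_format_command("e")
# Optional second pass after the resync has finished: full format, same unit size.
second_pass = build_format_command("e", quick=False)
```

Waiting for the resync to finish between the two passes is still up to you; this sketch does not check the volume state.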

Happy RAIDing!

2 comments:

Eric Lee Green said...

While it is *possible* for a hardware RAID controller to do a hardware ECC computation, my own experience back in the day was that it didn't -- the processor in the RAID controller was a generic off-the-shelf embedded controller that basically was implementing a software RAID. When we were doing the initial design work for the Agami NAS, we benchmarked both hardware and software RAID systems, and found that with a few simple optimizations of the Linux RAID drivers (such as writing the ECC computations in x64 assembly, since we were AMD Opteron based), we could get far more throughput from software RAID than with any of the then-available PC-based hardware SATA RAID controllers, even with RAID5. With today's multi-core multi-gigahertz processors that should be doubly true: our proof-of-concept benchmarks were done with two Opteron cores and 8-port SATA RAID controllers.

In short, I'm dubious about the notion that common consumer RAID5 controllers are going to have higher performance than software RAID5. Maybe some high-end SCSI RAID5 controllers will, but the low-end SATA stuff is... low-end.

Sergey Solyanik said...

Most of the low-end controllers don't even pretend to implement RAID-5 in hardware - they just do soft-RAID in the driver.

So if you pay less than $500 for your board, it is very, very likely to be the case.

And if you pay more, then, well... what Eric says...