RAID


Introduction

For any organization, whether it is a small business or a data center, lost data means lost business. There are two common practices for protecting that data: backups (protecting your data against total system failure, viruses, corruption, etc.) and RAID (protecting your data against drive failure). Both are necessary to ensure your data is secure.

This white paper discusses the various types of RAID configurations available, their uses, and how they should be implemented into data servers.

NOTE: RAID is not a substitute for regularly scheduled backups. All organizations and users should always have a solid backup strategy in place.

What is RAID?

RAID (Redundant Array of Inexpensive Disks) is a data storage structure that allows a system administrator, designer, builder, or user to combine two or more physical storage devices (HDDs, SSDs, or both) into a logical unit (an array) that the attached system sees as a single drive.

There are three basic RAID elements:

1. Striping (RAID 0) writes some data to one drive and some data to another, minimizing read and write access times and improving I/O performance.
2. Mirroring (RAID 1) replicates data on two drives, preventing loss of data in the event of a drive failure.
3. Parity (RAID 5 and 6) provides fault tolerance by examining the data on two drives and storing the results on a third. When a failed drive is replaced, the lost data is rebuilt from the remaining drives.

It is possible to configure these RAID levels into combination levels called RAID 10, 50, and 60. The RAID controller handles the combining of drives into these different configurations to maximize performance, capacity, redundancy (safety), and cost to suit the user's needs.

Hardware RAID vs. software RAID

RAID can be hardware-based or software-based. Hardware RAID resides on a PCIe controller card or on a motherboard-integrated RAID-on-Chip (ROC). The controller handles all RAID functions in its own hardware processor and memory. The server CPU is not loaded with storage workload, so it can concentrate on handling the software requirements of the server operating system and applications.

Pros: Better performance than software RAID. Controller cards can be easily swapped out for replacement and upgrades.

Cons: More expensive than software RAID.

Software RAID runs entirely on the CPU of the host computer system.

Pros: Lower cost due to the lack of RAID-dedicated hardware.

Cons: Lower RAID performance, as the CPU also powers the OS and applications.

How does RAID work?

In software RAID, the RAID implementation is an application running on the host. This type of RAID uses drives attached to the computer system via a built-in I/O interface or a processorless host bus adapter (HBA). The RAID becomes active as soon as the OS has loaded the RAID driver software.

In hardware RAID, a RAID controller has a processor, memory, and multiple drive connectors that allow drives to be attached either directly to the controller or placed in hot-swap backplanes.

In both cases, the RAID system combines the individual drives into one logical disk. The OS treats the logical disk like any other drive in the computer; it does not know the difference between a single drive connected to a motherboard and a RAID array presented by the RAID controller. Given its performance benefits and flexibility, hardware RAID is better suited for the typical modern server system.

RAID-compatible HDDs and SSDs

Storage manufacturers offer many models of drives. Some are designated as desktop or consumer drives, and others as RAID or enterprise drives. There is a big difference: a consumer drive is not designed for the demands of being connected into a group of drives and is not suitable for RAID. RAID or enterprise drives, on the other hand, are designed to communicate with the RAID controller and act in unison with other drives to form a stable RAID array to run your server.

From a RAID perspective, HDDs and SSDs differ only in their performance and capacity capabilities. To the RAID controller they are all drives, but it is important to take note of the performance characteristics of the RAID controller to ensure it is capable of fully accommodating the performance capabilities of the SSD. Most modern RAID controllers are fast enough to allow SSDs to run at their full potential, but a slow RAID controller could bottleneck data and negatively impact system performance.

Hybrid RAID

Hybrid RAID is a redundant storage solution that combines high-capacity, low-cost SATA or higher-performance SAS HDDs with low-latency, high-IOPS SSDs and an SSD-aware RAID adapter card (Figure 1). In Hybrid RAID, read operations are served from the faster SSD, and write operations happen on both the SSD and HDD for redundancy purposes.

Hybrid RAID arrays offer tremendous performance gains over standard HDD arrays at a much lower cost than SSD-only RAID arrays. Compared to HDD-only RAID arrays, hybrid arrays accelerate IOPS and reduce latency, allowing any server system to host more users and perform more transactions per second on each server, which reduces the number of servers required to support any given workload.

A simple glance at Hybrid RAID functionality does not readily show its common use cases, which range from creating simple mirrors in workstations to high-performance read-intensive applications in the small to medium business arena. Hybrid RAID is also used extensively in the data center to provide greater capacity in storage servers while providing fast boot for those servers. A minimal sketch of the hybrid read/write policy follows.
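The following toy Python sketch (my own illustration; the HybridMirror class and its in-memory stand-ins are hypothetical, not any vendor's API) shows the policy just described: every write goes to both members, and reads are served from the SSD unless it has failed.

```python
# Toy model of a hybrid RAID 1 (mirror) pair: one SSD, one HDD.
# Writes are duplicated to both devices; reads are served from the
# faster SSD, falling back to the HDD if the SSD has failed.
class HybridMirror:
    def __init__(self, size_blocks):
        self.ssd = [None] * size_blocks   # stand-in for the SSD
        self.hdd = [None] * size_blocks   # stand-in for the HDD
        self.ssd_failed = False

    def write(self, block, data):
        # Mirror: every write lands on both members.
        if not self.ssd_failed:
            self.ssd[block] = data
        self.hdd[block] = data

    def read(self, block):
        # Prefer the low-latency SSD; degrade to the HDD on failure.
        if not self.ssd_failed:
            return self.ssd[block]
        return self.hdd[block]

m = HybridMirror(8)
m.write(0, b"hello")
assert m.read(0) == b"hello"   # served from the SSD
m.ssd_failed = True
assert m.read(0) == b"hello"   # still available from the HDD
```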

Who should use RAID?

Any server or high-end workstation, and any computer system where constant uptime is required, is a suitable candidate for RAID.

At some point in the life of a server, at least one drive will fail. Without some form of RAID protection, a failed drive's data would have to be restored from backups, likely with the loss of some data and a considerable amount of time. With a RAID controller in the system, a failed drive can simply be replaced, and the RAID controller will automatically rebuild the missing data from the rest of the drives onto the newly inserted drive. This means that your system can survive a drive failure without the complex and long-winded task of restoring data from backups.

Choosing the right RAID level

There are several different RAID configurations, called levels, such as RAID 0, RAID 1, RAID 10, and RAID 5. While there is little difference in their names, there are big differences in their characteristics and where/when they should be used.

The factors to consider when choosing the right RAID level include: capacity, performance, redundancy (reliability/safety), and price.

There is no one-size-fits-all approach to RAID, because focus on one factor typically comes at the expense of another. Some RAID levels designate drives to be used for redundancy, which means they can't be used for capacity. Other RAID levels focus on performance but not on redundancy. A large, fast, highly redundant array will be expensive. Conversely, a small, average-speed redundant array won't cost much, but it will not be anywhere near as fast as the previous expensive array.

With that in mind, here is a look at the different RAID levels and how they may meet your requirements.

RAID 0 (Striping)

In RAID 0, all drives are combined into one logical disk (Figure 2). This configuration offers low cost and maximum performance, but no data protection: a single drive failure results in total data loss. As such, RAID 0 is not recommended. As SSDs become more affordable and grow in capacity, RAID 0 has declined in popularity. The benefits of fast read/write access are far outweighed by the threat of losing all data in the event of a drive failure.

Usage: Suited only for situations where data isn't mission critical, such as video/audio post-production, multimedia imaging, CAD, data logging, etc., where it's OK to lose a complete drive because the data can be quickly re-copied from the source. Generally speaking, RAID 0 is not recommended.

Pros: Fast and inexpensive. All drive capacity is usable. Quick to set up. Multiple HDDs sharing the data load make it the fastest of all arrays.

Cons: RAID 0 provides no data protection at all. If one drive fails, all data will be lost with no chance of recovery.
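To make the striping idea concrete, here is a minimal Python sketch of how a RAID 0 layout might map a logical block number to a member drive and an offset on that drive. The locate helper is my own illustration; real controllers work in stripe units of many kilobytes rather than single blocks.

```python
# Round-robin RAID 0 layout: logical block i lives on drive (i mod N)
# at per-drive offset (i div N), where N is the number of drives.
def locate(logical_block: int, num_drives: int) -> tuple[int, int]:
    drive = logical_block % num_drives      # which member drive
    offset = logical_block // num_drives    # block offset on that drive
    return drive, offset

# Example: 4-drive stripe. Blocks 0..3 land on drives 0..3,
# block 4 wraps around to drive 0 at offset 1, and so on.
for block in range(8):
    print(block, locate(block, 4))
```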

RAID 1 (Mirroring)

RAID 1 maintains duplicate sets of all data on two separate drives while showing just one set of data as a logical disk (Figure 3). RAID 1 is about protection, not performance or capacity. Since each drive holds a copy of the same data, the usable capacity is 50% of the available drives in the RAID set.

Usage: Generally only used in cases where there is not a large capacity requirement, but the user wants to make sure the data is 100% recoverable in the case of a drive failure, such as accounting systems, video editing, gaming, etc.

Pros: Highly redundant: each drive is a copy of the other. If one drive fails, the system continues as normal with no data loss.

Cons: Capacity is limited to 50% of the available drives, and performance is not much better than a single drive.

NOTE: With the advent of large-capacity SATA HDDs, it is possible to build an approximately 8TB RAID 1 array from two 8TB HDDs. While this may give sufficient capacity for many small business servers, performance will still be limited by the fact that the array has only two spindles operating within it. Therefore, it is recommended to move to RAID arrays that utilize more spinning media when such capacities are required.

RAID 1E (Striped Mirroring)

RAID 1E combines data striping from RAID 0 with data mirroring from RAID 1, and offers more performance than RAID 1 (Figure 4). Data written in a stripe on one drive is mirrored to a stripe on the next drive in the array. As in RAID 1, usable drive capacity in RAID 1E is 50% of the total available capacity of all drives in the RAID set.

Usage: Small servers, high-end workstations, and other environments with no large capacity requirements, but where the user wants to make sure the data is 100% recoverable in the case of a drive failure.

Pros: Redundant, with better performance and capacity than RAID 1. In effect, RAID 1E is a mirror across an odd number of drives.

Cons: Cost is high because only half the capacity of the physical drives is available.

NOTE: RAID 1E is best suited to systems with three drives. For scenarios with four or more drives, RAID 10 is recommended.

RAID 5 (Striping with Parity)

As the most common and best all-round RAID level, RAID 5 stripes data blocks across all drives in an array (at least 3, to a maximum of 32), and also distributes parity data across all drives (Figure 5). In the event of a single drive failure, the system reads the parity data from the working drives to rebuild the data blocks that were lost.

RAID 5 read performance is comparable to that of RAID 0, but there is a penalty for writes: the system must write both the data block and the parity data before the operation is complete. Parity requires the capacity of one drive per RAID set, so usable capacity will always be one drive less than the total number of drives in the configuration.

Usage: Often used in fileservers, general storage servers, backup servers, streaming data, and other environments that call for good performance at the best value for the money. Not suited to database applications due to poor random write performance.

Pros: Good value and good all-around performance.

Cons: One drive's capacity is lost to parity. Can only survive a single drive failure at any one time. If two drives fail at once, all data is lost.

NOTE: It is strongly recommended to have a hot spare set up with RAID 5 to reduce exposure to multiple drive failures.

NOTE: While SSDs are becoming cheaper, and their improved performance over HDDs makes it seem possible to use them in RAID 5 arrays for database applications, the parity overhead on small random writes still means that this RAID level should not be used in a system with a large number of small, random writes. A non-parity array such as RAID 10 should be used instead.
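To make the parity mechanism concrete, here is a minimal Python sketch (an illustration under the simplest possible assumptions, not controller code) that builds the parity for one stripe and rebuilds a lost block by XORing the survivors:

```python
from functools import reduce

def xor_blocks(blocks):
    # Byte-wise XOR of equal-length blocks.
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# One stripe across a 4-drive RAID 5: three data blocks plus parity.
d0, d1, d2 = b"AAAA", b"BBBB", b"CCCC"
parity = xor_blocks([d0, d1, d2])

# The drive holding d1 fails: rebuild it from the survivors and parity.
rebuilt = xor_blocks([d0, d2, parity])
assert rebuilt == d1
```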

RAID 6 (Striping with Dual Parity)

In RAID 6, data is striped across several drives and dual parity is used to store and recover data (Figure 6). It is similar to RAID 5 in performance and capacity capabilities, but the second parity scheme is distributed across different drives, offering extremely high fault tolerance and the ability to withstand the simultaneous failure of two drives in an array. RAID 6 requires a minimum of 4 drives and a maximum of 32 drives. Usable capacity is always two drives less than the number of available drives in the RAID set.

Usage: Similar to RAID 5, including fileservers, general storage servers, backup servers, etc. Poor random write performance makes RAID 6 unsuitable for database applications.

Pros: Reasonable value for the money with good all-round performance. Can survive two drives failing at the same time, or one drive failing and then a second drive failing during the data rebuild.

Cons: More expensive than RAID 5 due to the loss of two drives' capacity to parity. Slightly slower than RAID 5 in most applications.

RAID 10 (Striping and Mirroring)

RAID 10 (sometimes referred to as RAID 1+0) combines RAID 1 and RAID 0 to offer multiple sets of mirrors striped together (Figures 7 and 8). RAID 10 offers very good performance with good data protection and no parity calculations. RAID 10 requires a minimum of four drives, and usable capacity is 50% of the available drives. It should be noted, however, that RAID 10 can use more than four drives, in multiples of two. Each mirror in RAID 10 is called a leg of the array. A RAID 10 array using, say, eight drives (four legs, with the capacity of four drives) will offer extreme performance in both spinning-media and SSD environments, as many more drives split the reads and writes into smaller chunks across each drive.

Usage: Ideal for database servers and any environment with many small random data writes.

Pros: Fast and redundant.

Cons: Expensive because it requires four drives to get the capacity of two. Not suited to large capacities due to cost restrictions. Not as fast as RAID 5 in most streaming environments.

RAID 50 (Striping with Parity)

RAID 50 (sometimes referred to as RAID 5+0) combines multiple RAID 5 sets (striping with parity) with RAID 0 (striping) (Figures 9 and 10). The benefits of RAID 5 are retained, while the spanning RAID 0 allows many more drives to be incorporated into a single logical disk. Up to one drive in each sub-array may fail without loss of data, and rebuild times are substantially shorter than for a single large RAID 5 array.

A RAID 50 configuration can accommodate 6 or more drives, but should only be used with configurations of more than 16 drives. The usable capacity of RAID 50 is 67%-94%, depending on the number of data drives in the RAID set.

It should be noted that you can have more than two legs in a RAID 50. For example, with 24 drives you could have a RAID 50 of two legs of 12 drives each, or a RAID 50 of three legs of eight drives each. The first of these two arrays would offer greater capacity, as only two drives are lost to parity, but the second array would have greater performance and much quicker rebuild times, as only the drives in the leg with the failed drive are involved in the rebuild of the entire array (the sketch below the Cons makes this arithmetic concrete).

Usage: A good configuration for cases where many drives need to be in a single array but the capacity is too large for RAID 10, such as in very large capacity servers.

Pros: Reasonable value for the expense. Very good all-round performance, especially for streaming data, and very high capacity capabilities.

Cons: Requires a lot of drives. Capacity of one drive in each RAID 5 set is lost to parity. Slightly more expensive than RAID 5 due to this lost capacity.
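The leg trade-off is simple arithmetic. A small sketch, assuming equal-sized drives; the helper name raid50_layout is my own:

```python
# Usable drives and rebuild scope for a RAID 50 layout.
# Assumes equal-sized drives; one drive per leg is lost to parity.
def raid50_layout(total_drives: int, legs: int):
    assert total_drives % legs == 0 and total_drives // legs >= 3
    drives_per_leg = total_drives // legs
    usable = total_drives - legs          # one parity drive per leg
    rebuild_set = drives_per_leg          # drives touched by a rebuild
    return usable, rebuild_set

# 24 drives, as in the example above:
print(raid50_layout(24, 2))   # (22, 12): more capacity, bigger rebuilds
print(raid50_layout(24, 3))   # (21, 8): less capacity, faster rebuilds
```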

RAID 60 (Striping with Dual Parity)

RAID 60 (sometimes referred to as RAID 6+0) combines multiple RAID 6 sets (striping with dual parity) with RAID 0 (striping) (Figures 11 and 12). Dual parity allows the failure of two drives in each RAID 6 array, while striping increases capacity and performance without adding drives to each RAID 6 array.

Like RAID 50, a RAID 60 configuration can accommodate 8 or more drives, but should only be used with configurations of more than 16 drives. The usable capacity of RAID 60 is between 50% and 88%, depending on the number of data drives in the RAID set.

Note that all of the multiple-leg configurations that are possible with RAID 10 and RAID 50 are also possible with RAID 60. With 36 drives, for example, you can have a RAID 60 comprising two legs of 18 drives each, or a RAID 60 of three legs with 12 drives in each.

Usage: RAID 60 is similar to RAID 50 but offers more redundancy, making it good for very large capacity servers, especially those that will not be backed up (e.g., video surveillance servers handling large numbers of cameras).

Pros: Can sustain two drive failures per RAID 6 array within the set, so it is very safe. Very large, and reasonable value for money, considering this RAID level won't be used unless there are a large number of drives.

Cons: Requires a lot of drives. Slightly more expensive than RAID 50 due to losing more drives to parity calculations.

When to use which RAID level

We can classify data into two basic types: random and streaming. As indicated previously, there are also two general types of RAID arrays: non-parity (RAID 1, 10) and parity (RAID 5, 6, 50, 60).

Random data is generally small in nature (i.e., small blocks), with a large number of small reads and writes making up the data pattern. This is typified by database-type data. Streaming data is large in nature, and is characterized by such data types as video, images, and general large files.

While it is not possible to accurately determine all of a server's data usage, and servers often change their usage patterns over time, the general rule of thumb is that random data is best suited to non-parity RAID, while streaming data works best and is most cost-effective on parity RAID.

Note that it is possible to set up both RAID types on the same controller, and even possible to set up different RAID types on the same set of drives. So if, for example, you have eight 2TB drives, you can make a RAID 10 of 1TB for your database-type data, and a RAID 5 of the capacity that is left on the drives for your general and/or streaming-type data (approximately 12TB). Having these two different arrays spanning the same drives will not impact performance, but your data will benefit in performance from being situated on the right RAID level. (The sketch at the end of this section checks this arithmetic.)

Drive size and performance

While HDDs are becoming larger, they are not getting any faster: a 1TB HDD and a 6TB HDD from the same product family will have the same performance characteristics. This has an impact when building or rebuilding arrays, as it can take a long time to write all of the missing data to the new replacement drive.

Conversely, SSDs are often faster in larger capacities, so an 80GB SSD and an 800GB SSD from the same product family will have quite different performance characteristics. Check the product specifications from the drive vendor carefully to make sure you are getting the performance you expect from your drives.

With HDDs it is generally better to create an array with more, rather than fewer, drives. A RAID 5 of three 6TB HDDs (12TB capacity) will not have the same performance as a RAID 5 array made from five 3TB HDDs (12TB capacity). With SSDs, however, it is advisable to achieve the required capacity from as few drives as possible by using larger-capacity SSDs. These will have higher throughput than their smaller counterparts and will yield better system performance.

Size of array vs. size of drives

It is a little-known fact that you do not need to use all of your drive capacity when creating a RAID array. When creating the RAID array in the controller BIOS, for example, the controller will show you the maximum possible size of the array based on the drives chosen to make it up. During the creation process, you can reduce the size of the array, and the unused space on the drives will be available for creating additional RAID arrays.

A good example of this is a large server that keeps the operating system and data on separate RAID arrays. Typically you would make a RAID 10 of, say, 200GB for your OS installation, spread across all drives in the server. This would use a minimal amount of capacity from each drive. You can then create a RAID 5 for your general data across the unused space on the drives. This has the added benefit of getting around drive size limitations for boot arrays on non-UEFI servers, as the OS will believe it is dealing with only a 200GB drive when installing the operating system.

Rebuild times and large RAID arrays

The more drives in the array, and the larger the HDDs in the array, the longer the rebuild time when a drive fails and is replaced or a hot spare kicks in. While it is possible to have 32 drives in a RAID 5 array, it becomes somewhat impractical to do this with large spinning media. For example, a RAID 5 made of 32 6TB drives (186TB usable) will have very poor build and rebuild times due to the size, speed, and number of drives. In this scenario, it would be advisable to build a RAID 50 with two legs from those drives (180TB usable). When a drive fails and is replaced, only 16 of the drives (15 existing plus the new drive) will be involved in the rebuild. This improves rebuild performance and reduces the impact on system performance during the rebuild process.

Note, however, that no matter what you do, when it comes to rebuilding arrays with 6TB+ SATA drives, rebuild times will exceed 24 hours even in an absolutely perfect environment (no load on the server). In a real-world environment with a heavily loaded system, rebuild times will be even longer. Rebuild times on SSD arrays, of course, are dramatically quicker, because the drives are smaller and the write speed of SSDs is much faster than that of their spinning-media counterparts.

Default RAID settings

When creating a RAID array in the BIOS or management software, you will be presented with defaults that the controller proposes for the RAID settings. The most important of these is the stripe size. While there is much science, math, and general knowledge involved in working out the best stripe size for your array, in the vast majority of cases the defaults work best, so use the 256KB stripe size suggested by the controller.

SSDs and read/write cache

In an SSD-only RAID array, disabling the read and write cache will improve performance in the vast majority of cases. However, you may want to test whether enabling the read and write cache improves performance even further. Note that it is possible to disable and enable the read and write cache on the fly, without affecting or reconfiguring the array or restarting the server, so testing both configurations is recommended.
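As a quick check of the eight-drive example above, here is a small Python sketch under the stated assumptions (equal 2TB drives, RAID 10 usable capacity equal to half its raw allocation, RAID 5 losing one drive's share to parity):

```python
# Eight 2TB drives shared by a small RAID 10 (for the database data)
# and a RAID 5 built from the remaining space on the same drives.
drives = 8
drive_tb = 2.0

raid10_usable_tb = 1.0                  # desired RAID 10 size
raid10_raw_tb = raid10_usable_tb * 2    # mirroring uses 2x raw space
raw_left_per_drive = drive_tb - raid10_raw_tb / drives

# RAID 5 over the leftover space loses one drive's share to parity.
raid5_usable_tb = raw_left_per_drive * (drives - 1)
print(round(raid5_usable_tb, 2))        # 12.25, "approximately 12TB"
```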

RAID Level Comparison

Minimum # drives: 2 (RAID 0), 2 (RAID 1), 3 (RAID 1E), 3 (RAID 5), 4 (RAID 6), 4 (RAID 10), 6 (RAID 50), 8 (RAID 60)

Data protection: no protection (RAID 0); single-drive failure (RAID 1, 1E, 5); two-drive failure (RAID 6); up to one drive failure in each sub-array (RAID 10, 50); up to two drive failures in each sub-array (RAID 60)

Read performance: High (RAID 0), Medium (RAID 1), Medium (RAID 1E), High (RAID 5), High (RAID 6), High (RAID 10), High (RAID 50), High (RAID 60)

Write performance: High (RAID 0), Medium (RAID 1), Medium (RAID 1E), Medium depending on data type (RAID 5), Low (RAID 6), High (RAID 10), Medium (RAID 50), Medium (RAID 60)

Read performance, degraded: Low (RAID 0), High (RAID 1), Medium (RAID 1E), Medium (RAID 5), Low (RAID 6), High (RAID 10), Medium (RAID 50), Low (RAID 60)

Write performance, degraded: N/A (RAID 0), Medium (RAID 1), Medium (RAID 1E), Low (RAID 5), Low (RAID 6), High (RAID 10), Medium (RAID 50), Low (RAID 60)

Capacity utilization: 100% (RAID 0), 50% (RAID 1), 50% (RAID 1E), 67%-94% (RAID 5), 50%-88% (RAID 6), 50% (RAID 10), 67%-94% (RAID 50), 50%-88% (RAID 60)

Typical usage:
RAID 0: Non-mission-critical data, such as video/audio post-production, multimedia imaging, CAD, data logging, etc., where it's OK to lose a complete drive because the data can be quickly re-copied from the source. GENERALLY SPEAKING, RAID 0 IS NOT RECOMMENDED.
RAID 1: Cases where there is not a large capacity requirement, but the user wants to make sure the data is 100% recoverable in the case of a drive failure, such as accounting systems, video editing, gaming, etc.
RAID 1E: Small servers, high-end workstations, and other environments with no large capacity requirements, but where the user wants to make sure the data is 100% recoverable in the case of a drive failure.
RAID 5: Fileservers, general storage servers, backup servers, streaming data, and other environments that call for good performance at the best value for the money. Not suited to database applications due to poor random write performance.
RAID 6: Similar to RAID 5, including fileservers, general storage servers, backup servers, etc. Poor random write performance makes RAID 6 unsuitable for database applications.
RAID 10: Ideal for database servers and any environment with many small random data writes.
RAID 50: A good configuration for cases where many drives need to be in a single array but the capacity is too large for RAID 10, such as in very large capacity servers.
RAID 60: Similar to RAID 50 but offers more redundancy, making it good for very large capacity servers, especially those that will not be backed up (e.g., video surveillance servers handling large numbers of cameras).

Pros:
RAID 0: Fast and inexpensive. All drive capacity is usable. Quick to set up. Multiple HDDs sharing the data load make it the fastest of all arrays.
RAID 1: Highly redundant: each drive is a copy of the other. If one drive fails, the system continues as normal with no data loss.
RAID 1E: Redundant, with better performance and capacity than RAID 1. In effect, a mirror across an odd number of drives.
RAID 5: Good value and good all-around performance.
RAID 6: Reasonable value for money with good all-round performance. Can survive two drives failing at the same time, or one drive failing and then a second drive failing during the data rebuild.
RAID 10: Fast and redundant.
RAID 50: Reasonable value for the expense. Very good all-round performance, especially for streaming data, and very high capacity capabilities.
RAID 60: Can sustain two drive failures per RAID 6 array within the set, so it is very safe. Reasonable value for money, considering this RAID level won't be used unless there are a large number of drives.

Cons:
RAID 0: No data protection at all. If one drive fails, all data will be lost with no chance of recovery.
RAID 1: Capacity is limited to 50% of the available drives, and performance is not much better than a single drive.
RAID 1E: Cost is high because only half the capacity of the physical drives is available.
RAID 5: One drive's capacity is lost to parity. Can only survive a single drive failure at any one time; if two drives fail at once, all data is lost.
RAID 6: More expensive than RAID 5 due to the loss of two drives' capacity to parity. Slightly slower than RAID 5 in most applications.
RAID 10: Expensive, as it requires four drives to get the capacity of two. Not suited to large capacities due to cost restrictions. Not as fast as RAID 5 in streaming environments.
RAID 50: Requires a lot of drives. The capacity of one drive in each RAID 5 set is lost to parity. Slightly more expensive than RAID 5 due to this lost capacity.
RAID 60: Requires a lot of drives. Slightly more expensive than RAID 50 due to losing more drives to parity calculations.

Types of RAID

There are three types of RAID implementation: software-based, motherboard-based, and adapter-based.

Description:
Software-based: Included in the OS, such as Windows and Linux. All RAID functions are handled by the host CPU, which can severely tax its ability to perform other computations.
Motherboard-based: Processor-intensive RAID operations are off-loaded from the host CPU to a RAID processor integrated into the motherboard.
Adapter-based: Processor-intensive RAID operations are off-loaded from the host CPU to an external PCIe adapter. Battery-backed write-back cache can dramatically increase performance without adding risk of data loss.

Typical usage:
Software-based: Best used for large-block applications such as data warehousing or video streaming, and where servers have the CPU cycles available to manage the I/O-intensive operations certain RAID levels require.
Motherboard-based: Inexpensive.
Adapter-based: Best used for small-block applications such as transaction-oriented databases and web servers.

Pros:
Software-based: Lower cost due to the lack of RAID-dedicated hardware.
Motherboard-based: Lower cost than adapter-based RAID.
Adapter-based: Offloads RAID tasks from the host system, yielding better performance than software RAID. Controller cards can be easily swapped out for replacement and upgrades. Data can be backed up to prevent loss in a power failure.

Cons:
Software-based: Lower RAID performance, as the CPU also powers the OS and applications.
Motherboard-based: No ability to upgrade or replace the RAID processor in the event of hardware failure. May only support a few RAID levels.
Adapter-based: More expensive than software and integrated RAID.

Nonredundant Arrays (RAID 0)

An array with RAID 0 includes two or more disk drives and provides data striping, where data is distributed evenly across the disk drives in equal-sized sections. However, RAID 0 arrays do not maintain redundant data, so they offer no data protection.

Compared to an equal-sized group of independent disks, a RAID 0 array provides improved I/O performance.

Drive segment size is limited to the size of the smallest disk drive in the array. For instance, an array with two 250 GB disk drives and two 400 GB disk drives can create a RAID 0 drive segment of 250 GB, for a total of 1000 GB for the volume, as shown in this figure.

FIGURE F-1 Nonredundant Arrays (RAID 0)
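The drive-segment rule (every member contributes a segment no larger than the smallest drive) is easy to express in code. A minimal sketch, with a function name of my own choosing:

```python
# Usable capacity of a RAID 0 volume: every drive contributes a
# segment limited to the size of the smallest member drive.
def raid0_capacity_gb(drive_sizes_gb: list[float]) -> float:
    segment = min(drive_sizes_gb)
    return segment * len(drive_sizes_gb)

# Two 250 GB and two 400 GB drives, as in Figure F-1:
print(raid0_capacity_gb([250, 250, 400, 400]))   # 1000.0 GB
```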


RAID 1 Arrays

A RAID 1 array is built from two disk drives, where one disk drive is a mirror of the other (the same data is stored on each disk drive). Compared to independent disk drives, RAID 1 arrays provide improved performance, with twice the read rate and an equal write rate of a single disk. However, capacity is only 50 percent of that of independent disk drives.

If the RAID 1 array is built from different-sized disk drives, the drive segment size is the size of the smaller disk drive, as shown in this figure.

FIGURE F-2 RAID 1 Arrays

RAID 1 Enhanced Arrays

A RAID 1 Enhanced (RAID 1E) array, also referred to as a striped mirror, is similar to a RAID 1 array except that data is both mirrored and striped, and more disk drives can be included. A RAID 1E array can be built from three or more disk drives.

In this figure, the large bold numbers represent the striped data, and the smaller, non-bold numbers represent the mirrored data stripes.

FIGURE F-3 RAID 1 Enhanced Arrays

RAID 10 Arrays

A RAID 10 array is built from two or more equal-sized RAID 1 arrays. Data in a RAID 10 array is both striped and mirrored. Mirroring provides data protection, and striping improves performance.

Drive segment size is limited to the size of the smallest disk drive in the array. For instance, an array with two 250 GB disk drives and two 400 GB disk drives can create two mirrored drive segments of 250 GB, for a total of 500 GB for the array, as shown in this figure.

FIGURE F-4 RAID 10 Arrays

RAID 5 Arrays

A RAID 5 array is built from a minimum of three disk drives, and uses data striping and parity data to provide redundancy. Parity data provides data protection, and striping improves performance.

Parity data is an error-correcting redundancy that's used to re-create data if a disk drive fails. In RAID 5 arrays, parity data (represented by Ps in the next figure) is striped evenly across the disk drives with the stored data.

Drive segment size is limited to the size of the smallest disk drive in the array. For instance, an array with two 250 GB disk drives and two 400 GB disk drives can contain 750 GB of stored data and 250 GB of parity data, as shown in this figure.

FIGURE F-5 RAID 5 Arrays

RAID 5EE Arrays

A RAID 5EE array, also referred to as a hot-spare array, is similar to a RAID 5 array except that it includes a distributed spare drive and must be built from a minimum of four disk drives.

Unlike a dedicated hot-spare, a distributed spare is striped evenly across the disk drives with the stored data and parity data, and can't be shared with other logical disk drives. A distributed spare improves the speed at which the array is rebuilt following a disk drive failure.

A RAID 5EE array protects your data and increases read and write speeds. However, capacity is reduced by two disk drives' worth of space, which is used for the parity data and spare data.

In this example, S represents the distributed spare and P represents the distributed parity data.

FIGURE F-6 RAID 5EE Arrays

RAID 50 Arrays

A RAID 50 array is built from at least six disk drives configured as two or more RAID 5 arrays, and stripes stored data and parity data across all disk drives in both RAID 5 arrays. (For more information, see RAID 5 Arrays.)

The parity data provides data protection, and striping improves performance. RAID 50 arrays also provide high data transfer speeds.

Drive segment size is limited to the size of the smallest disk drive in the array. For example, three 250 GB disk drives and three 400 GB disk drives comprise two equal-sized RAID 5 arrays with 500 GB of stored data and 250 GB of parity data. The RAID 50 array can therefore contain 1000 GB (2 x 500 GB) of stored data and 500 GB of parity data.

FIGURE F-7 RAID 50 Arrays

In this example, P represents the distributed parity data.

RAID 6 Arrays

A RAID 6 array, also referred to as dual drive failure protection, is similar to a RAID 5 array because it uses data striping and parity data to provide redundancy. However, RAID 6 arrays include two independent sets of parity data instead of one. Both sets of parity data are striped separately across all disk drives in the array.

RAID 6 arrays provide extra protection for your data because they can recover from two simultaneous disk drive failures. However, the extra parity calculation slows performance (compared to RAID 5 arrays).

RAID 6 arrays must be built from at least four disk drives. Maximum stripe size depends on the number of disk drives in the array.

FIGURE F-8 RAID 6 Arrays

RAID 60 Arrays

Similar to a RAID 50 array (see RAID 50 Arrays), a RAID 60 array, also referred to as dual drive failure protection, is built from at least eight disk drives configured as two or more RAID 6 arrays, and stripes stored data and two sets of parity data across all disk drives in both RAID 6 arrays.

Two sets of parity data provide enhanced data protection, and striping improves performance. RAID 60 arrays also provide high data transfer speeds.

Selecting the Best RAID Level

Use this table to select the RAID levels that are most appropriate for the arrays on your storage space, based on the number of available disk drives and your requirements for performance and reliability.

TABLE F-1 Selecting the Best RAID Level

Each entry below lists: redundancy; disk drive usage; read performance; write performance; built-in hot-spare; minimum disk drives. Performance ratings run from ★ (lowest) to ★★★ (highest).

RAID 0: no redundancy; 100% drive usage; read ★★★; write ★★★; no built-in hot-spare; minimum 2 drives

RAID 1: redundant; 50% drive usage; read ★★; write ★★; no built-in hot-spare; minimum 2 drives

RAID 1E: redundant; 50% drive usage; read ★★; write ★★; no built-in hot-spare; minimum 3 drives

RAID 10: redundant; 50% drive usage; read ★★; write ★★; no built-in hot-spare; minimum 4 drives

RAID 5: redundant; 67-94% drive usage; read ★★★; write ★; no built-in hot-spare; minimum 3 drives

RAID 5EE: redundant; 50-88% drive usage; read ★★★; write ★; built-in hot-spare; minimum 4 drives

RAID 50: redundant; 67-94% drive usage; read ★★★; write ★; no built-in hot-spare; minimum 6 drives

RAID 6: redundant; 50-88% drive usage; read ★★; write ★; no built-in hot-spare; minimum 4 drives

RAID 60: redundant; 50-88% drive usage; read ★★; write ★; no built-in hot-spare; minimum 8 drives

Disk drive usage, read performance, and write performance depend on the number of drives in the array. In general, the more drives, the better the performance.

Migrating RAID Levels

As your storage space changes, you can migrate existing RAID levels to new RAID levels that better meet your storage needs. You can perform these migrations through the Sun StorageTek RAID Manager software. For more information, see the Sun StorageTek RAID Manager Software User's Guide. TABLE F-2 lists the supported RAID level migrations.

TABLE F-2 Supported RAID Level Migrations

Existing RAID level: supported migration RAID levels

Simple volume: RAID 1

RAID 0: RAID 5, RAID 10

RAID 1: Simple volume, RAID 0, RAID 5, RAID 10

RAID 5: RAID 0, RAID 5EE, RAID 6, RAID 10

RAID 6: RAID 5

RAID 10: RAID 0, RAID 5

RAID 2

RAID 2 uses bit-level striping; that is, instead of striping blocks across the disks, it stripes individual bits across the disks. In the diagram, b1, b2, b3 are bits, and E1, E2, E3 are error-correction codes. You need two groups of disks: one group is used to write the data, and the other group is used to write the error-correction codes. RAID 2 uses a Hamming error-correcting code (ECC) and stores this information on the redundancy disks. When data is written to the disks, the ECC code for the data is calculated on the fly, the data bits are striped to the data disks, and the ECC code is written to the redundancy disks. When data is read from the disks, the corresponding ECC code is also read from the redundancy disks to check whether the data is consistent; if required, appropriate corrections are made on the fly. RAID 2 uses a lot of disks and can be set up in different disk configurations. Some valid configurations are: (1) 10 disks for data and 4 disks for ECC, or (2) 4 disks for data and 3 disks for ECC. RAID 2 is not used anymore: it is expensive, implementing it in a RAID controller is complex, and the ECC is redundant nowadays, since hard disks perform this error correction internally themselves. A small sketch of the Hamming-code idea follows.
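As a toy illustration of how a Hamming code pinpoints which bit (and hence which disk) failed, here is a minimal Hamming(7,4) sketch in Python. This is the textbook construction, shown for illustration only; it is not how a real RAID 2 controller laid out its disks.

```python
# Hamming(7,4): 4 data bits protected by 3 parity bits. The syndrome
# computed on read gives the 1-based position of a single flipped bit.
def encode(d):                       # d = [d1, d2, d3, d4], bits
    p1 = d[0] ^ d[1] ^ d[3]
    p2 = d[0] ^ d[2] ^ d[3]
    p3 = d[1] ^ d[2] ^ d[3]
    return [p1, p2, d[0], p3, d[1], d[2], d[3]]   # positions 1..7

def syndrome(c):                     # c = 7-bit codeword
    s1 = c[0] ^ c[2] ^ c[4] ^ c[6]   # checks positions 1,3,5,7
    s2 = c[1] ^ c[2] ^ c[5] ^ c[6]   # checks positions 2,3,6,7
    s3 = c[3] ^ c[4] ^ c[5] ^ c[6]   # checks positions 4,5,6,7
    return s1 + 2 * s2 + 4 * s3      # 0 = clean, else bad position

code = encode([1, 0, 1, 1])
code[4] ^= 1                         # the "disk" at position 5 fails
pos = syndrome(code)                 # syndrome identifies position 5
code[pos - 1] ^= 1                   # correct the identified bit
assert syndrome(code) == 0
```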

RAID 3

RAID 3 uses byte-level striping; that is, instead of striping blocks across the disks, it stripes bytes across the disks. In the diagram, B1, B2, B3 are bytes and p1, p2, p3 are parities. It uses multiple data disks and a dedicated disk to store parity. The disks have to spin in sync to get to the data. Sequential reads and writes have good performance, while random reads and writes have the worst performance. RAID 3 is not commonly used.

RAID 4

RAID 4 uses block-level striping. In the diagram, B1, B2, B3 are blocks and p1, p2, p3 are parities. It uses multiple data disks and a dedicated disk to store parity, with a minimum of 3 disks (2 for data and 1 for parity). Random reads are good, as the data blocks are striped; random writes are bad, as every write has to go to the single parity disk. RAID 4 is somewhat similar to both RAID 3 and RAID 5: it is like RAID 3 in having a dedicated parity disk, but it stripes blocks; it is like RAID 5 in striping blocks across the data disks, but it has only one parity disk. RAID 4 is not commonly used.

RAID 6

Just like RAID 5, RAID 6 does block-level striping; however, it uses dual parity. In the diagram, A, B, C are blocks and p1, p2, p3 are parities. This creates two parity blocks for each data block, so the array can handle two disk failures. This RAID configuration is complex to implement in a RAID controller, as it has to calculate two sets of parity data for each data block.

Q: What is the definition of a "RAID 5" volume?

A: "RAID 5" refers to a "Redundant Array of Inexpensive (or Independent) Disks" that has been established in a Level 5, or striped-with-parity, volume set. A RAID 5 volume is a combination of hard drives configured for data to be written across three (3) or more drives.

Q: What is "parity" or "parity data"?A:In a RAID 5 configuration, additional data is written to the disk that should allow the volume to be rebuilt in the event that a single drive fails. In the event that a single drive does fail, the volume continues to operate in a "degraded" state (no fault tolerance). Once the failed drive is replaced with a new hard drive (of the same or higher capacity), the "parity data" is used to rebuild the contents of the failed drive on the new one.

Q: What are the minimum drive requirements to create a RAID 5 volume?

A: RAID 5 volume sets require a minimum of three (3) hard drives (preferably of the same capacity) to create and maintain a RAID 5 volume. If one drive is of a lower capacity than the others, the RAID controller (whether hardware or software) will treat every hard drive in the array as though it were of that same lower capacity and will establish the volume accordingly.

Q: What are the differences between "hardware" and "software" RAID 5 configurations?

A: With a software-based RAID 5 volume, the hard disk drives use a standard drive controller, and a software utility provides the management of the drives in the volume. A RAID 5 volume that relies on hardware for management will have a physical controller (commonly built into the motherboard, but it can also be a stand-alone expansion card) that provides for the reading and writing of data across the hard drives in the volume.

Q: What are the advantages of RAID 5 volumes?

A: A RAID 5 volume provides faster data access and fault tolerance, or protection against one of the drives failing during use. With a RAID 5 disk volume, information is striped (or written) across all of the drives in the array along with parity data. If one of the hard drives in the array becomes corrupted, drops out of a ready state, or otherwise fails, the remaining hard drives continue to operate as a striped volume with no parity and with no loss of data. The failed drive can be replaced in the array with one of equal or larger capacity, and the data it contained will be automatically rebuilt using the parity data contained on the other drives. Establishing a RAID 5 volume requires a minimum of 3 disk drives.

Q: What are the disadvantages of RAID 5 configurations?

A: There are several disadvantages. RAID 5 results in the loss of storage capacity equivalent to the capacity of one hard drive from the volume. For example, three 500GB hard drives added together comprise 1500GB (roughly 1.5 terabytes) of storage. If the three (3) 500GB drives were established as a RAID 0 (striped) configuration, total data storage would equal the full 1500GB capacity. If these same three (3) drives are configured as a RAID 5 volume (striped with parity), the usable data storage capacity would be 1000GB, not 1500GB, since 500GB (the equivalent of one drive's capacity) is utilized for parity. In addition, if two (2) or more drives fail or become corrupted at the same time, all data on the volume would be inaccessible to the user.

Q: Can data be recovered from a re-formatted RAID 5 volume?

A: Many times the information is still recoverable, depending on how the drives were re-formatted. Re-formatting a volume using Windows, for example, will create what appears to be a new "clean" volume, but the original data will still be on the disk in the "free and available" space. However, a low-level format (usually performed through an on-board RAID controller utility) will "wipe" or overwrite every single block on a drive. Unlike an O/S (or "high-level") format, a low-level format is normally slower, takes a considerable amount of time, and destroys the original data.

Q: Can I run recovery software utilities to recover my RAID volume data?

A: The safest approach to data recovery with a RAID volume (or with any media) is to capture every storage block on each device individually. The resulting drive "images" are then used to help rebuild the original array structure and recover the necessary files and folders. This approach limits continued interaction with the media and helps to preserve the integrity of the original device. One of the dangers in using data recovery software is that it forces the read/write heads to travel repeatedly over areas of the original media which, if physically damaged, could become further damaged and possibly unrecoverable.

Q: If a RAID 5 volume will not mount, should I allow a "rebuild" to run?

A: If one drive fails in a RAID 5 configuration, the volume still operates, but in a degraded state (it no longer writes parity information). The important data should be backed up immediately, and verified to be usable, before any rebuild operation is started. When it comes to critical data, anything that is used to read or write to the original volume represents a risk. Is the hardware operating properly? Are all other drives in the volume functioning correctly? If you are the least bit unsure, a rebuild should not be performed.

Q: If multiple drives fail in a RAID volume all at once, is the data still recoverable?

A: In many cases, the answer is yes. It usually requires that data be recovered from each failed hard drive individually before attempting to address the rest of the volume. The quality and integrity of the data recovered will depend on the extent of the damage incurred by each failed storage device.

Non-Redundant (RAID Level 0)

A non-redundant disk array, or RAID level 0, has the lowest cost of any RAID organization because it does not employ redundancy at all. This scheme offers the best write performance since it never needs to update redundant information. Surprisingly, it does not have the best read performance: redundancy schemes that duplicate data, such as mirroring, can perform better on reads by selectively scheduling requests on the disk with the shortest expected seek and rotational delays. Without redundancy, any single disk failure will result in data loss. Non-redundant disk arrays are widely used in supercomputing environments where performance and capacity, rather than reliability, are the primary concerns.

Sequential blocks of data are written across multiple disks in stripes. (source: Reference 2) The size of a data block, which is known as the "stripe width", varies with the implementation, but is always at least as large as a disk's sector size. When it comes time to read back this sequential data, all disks can be read in parallel. In a multi-tasking operating system, there is a high probability that even non-sequential disk accesses will keep all of the disks working in parallel.

Mirrored (RAID Level 1)

The traditional solution, called mirroring or shadowing, uses twice as many disks as a non-redundant disk array. Whenever data is written to a disk, the same data is also written to a redundant disk, so that there are always two copies of the information. When data is read, it can be retrieved from the disk with the shorter queuing, seek, and rotational delays. If a disk fails, the other copy is used to service requests. Mirroring is frequently used in database applications where availability and transaction time are more important than storage efficiency. (source: Reference 2)

Memory-Style ECC (RAID Level 2)

Memory systems have provided recovery from failed components with much less cost than mirroring by using Hamming codes. Hamming codes contain parity for distinct overlapping subsets of components. In one version of this scheme, four data disks require three redundant disks, one less than mirroring. Since the number of redundant disks is proportional to the log of the total number of disks in the system, storage efficiency increases as the number of data disks increases.

If a single component fails, several of the parity components will have inconsistent values, and the failed component is the one held in common by each incorrect subset. The lost information is recovered by reading the other components in a subset, including the parity component, and setting the missing bit to 0 or 1 to create the proper parity value for that subset. Thus, multiple redundant disks are needed to identify the failed disk, but only one is needed to recover the lost information.

If you are unfamiliar with parity, you can think of the redundant disk as holding the sum of all the data on the other disks. When a disk fails, you can subtract all the data on the good disks from the parity disk; the remaining information must be the missing information.
Parity is simply this sum modulo 2.

A RAID 2 system would normally have as many data disks as the word size of the computer, typically 32. In addition, RAID 2 requires the use of extra disks to store an error-correcting code for redundancy. With 32 data disks, a RAID 2 system would require 7 additional disks for a Hamming-code ECC. Such an array of 39 disks was the subject of a U.S. patent granted to Unisys Corporation in 1988, but no commercial product was ever released. For a number of reasons, including the fact that modern disk drives contain their own internal ECC, RAID 2 is not a practical disk array scheme. (source: Reference 2)

Bit-Interleaved Parity (RAID Level 3)

One can improve upon memory-style ECC disk arrays by noting that, unlike memory component failures, disk controllers can easily identify which disk has failed. Thus, one can use a single parity disk rather than a set of parity disks to recover lost information.

In a bit-interleaved parity disk array, data is conceptually interleaved bit-wise over the data disks, and a single parity disk is added to tolerate any single disk failure. Each read request accesses all data disks, and each write request accesses all data disks and the parity disk. Thus, only one request can be serviced at a time. Because the parity disk contains only parity and no data, it cannot participate in reads, resulting in slightly lower read performance than for redundancy schemes that distribute the parity and data over all disks. Bit-interleaved parity disk arrays are frequently used in applications that require high bandwidth but not high I/O rates. They are also simpler to implement than RAID levels 4, 5, and 6. (source: Reference 2)

Here, the parity disk is written in the same way as the parity bit in normal Random Access Memory (RAM), where it is the Exclusive Or of the 8, 16 or 32 data bits. In RAM, parity is used to detect single-bit data errors, but it cannot correct them because there is no information available to determine which bit is incorrect. With disk drives, however, we rely on the disk controller to report a data read error. Knowing which disk's data is missing, we can reconstruct it as the Exclusive Or (XOR) of all remaining data disks plus the parity disk.

As a simple example, suppose we have 4 data disks and one parity disk. The sample bits are:

Disk 0: 0, Disk 1: 1, Disk 2: 1, Disk 3: 1, Parity: 1

The parity bit is the XOR of these four data bits, which can be calculated by adding them up and writing a 0 if the sum is even and a 1 if it is odd. Here the sum of Disk 0 through Disk 3 is 3, so the parity is 1. Now if we attempt to read back this data and find that Disk 2 gives a read error, we can reconstruct Disk 2 as the XOR of all the other disks, including the parity. In the example, the sum of Disks 0, 1, 3 and Parity is 3, so the data on Disk 2 must be 1.
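The worked example above translates directly into a few lines of Python, as a self-check:

```python
# The bit example above: four data bits and their parity.
d0, d1, d2, d3 = 0, 1, 1, 1
parity = d0 ^ d1 ^ d2 ^ d3            # sum mod 2 -> 1

# Disk 2 reports a read error: rebuild its bit from the others.
rebuilt_d2 = d0 ^ d1 ^ d3 ^ parity
assert rebuilt_d2 == d2               # recovers the lost 1
```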
Block-Interleaved Parity (RAID Level 4)

The block-interleaved parity disk array is similar to the bit-interleaved parity disk array except that data is interleaved across disks in blocks of arbitrary size rather than in bits. The size of these blocks is called the striping unit. Read requests smaller than the striping unit access only a single data disk. Write requests must update the requested data blocks and must also compute and update the parity block. For large writes that touch blocks on all disks, parity is easily computed by exclusive-or'ing the new data for each disk. For small write requests that update only one data disk, parity is computed by noting how the new data differs from the old data and applying those differences to the parity block. Small write requests thus require four disk I/Os: one to write the new data, two to read the old data and old parity for computing the new parity, and one to write the new parity. This is referred to as a read-modify-write procedure.

Because a block-interleaved parity disk array has only one parity disk, which must be updated on all write operations, the parity disk can easily become a bottleneck. Because of this limitation, the block-interleaved distributed-parity disk array is universally preferred over the block-interleaved parity disk array. (source: Reference 2)

Block-Interleaved Distributed-Parity (RAID Level 5)

The block-interleaved distributed-parity disk array eliminates the parity disk bottleneck present in the block-interleaved parity disk array by distributing the parity uniformly over all of the disks. An additional, frequently overlooked advantage to distributing the parity is that it also distributes data over all of the disks rather than over all but one. This allows all disks to participate in servicing read operations, in contrast to redundancy schemes with dedicated parity disks, in which the parity disk cannot participate in servicing read requests. Block-interleaved distributed-parity disk arrays have the best small read and large write performance of any redundant disk array. Small write requests are somewhat inefficient compared with redundancy schemes such as mirroring, however, due to the need to perform read-modify-write operations to update parity. This is the major performance weakness of RAID level 5 disk arrays.

The exact method used to distribute parity in block-interleaved distributed-parity disk arrays can affect performance. The figure referenced here illustrates left-symmetric parity distribution: each square corresponds to a stripe unit, and each column of squares corresponds to a disk. P0 computes the parity over stripe units 0, 1, 2 and 3; P1 computes parity over stripe units 4, 5, 6, and 7; and so on. (source: Reference 1)

A useful property of the left-symmetric parity distribution is that whenever you traverse the striping units sequentially, you will access each disk once before accessing any disk twice. This property reduces disk conflicts when servicing large requests. (source: Reference 2)
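The read-modify-write procedure reduces to one XOR identity: the new parity equals the old parity XORed with the old and new versions of the changed block. A minimal sketch (my own illustration):

```python
from functools import reduce

def xor(*blocks):
    # Byte-wise XOR of equal-length blocks.
    return bytes(reduce(lambda a, b: a ^ b, col) for col in zip(*blocks))

# A stripe with three data blocks; parity covers all of them.
old_data = [b"AAAA", b"BBBB", b"CCCC"]
parity = xor(*old_data)

# Small write to block 1: two reads (old data, old parity) and two
# writes (new data, new parity), four I/Os, with no other disks touched.
new_block = b"XXXX"
parity = xor(parity, old_data[1], new_block)
old_data[1] = new_block

assert parity == xor(*old_data)   # same parity as a full recompute
```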
P+Q Redundancy (RAID Level 6)

Parity is a redundancy code capable of correcting any single, self-identifying failure. As larger disk arrays are considered, multiple failures are possible and stronger codes are needed. Moreover, when a disk fails in a parity-protected disk array, recovering the contents of the failed disk requires successfully reading the contents of all non-failed disks. The probability of encountering an uncorrectable read error during recovery can be significant. Thus, applications with more stringent reliability requirements require stronger error-correcting codes.

One such scheme, called P+Q redundancy, uses Reed-Solomon codes to protect against up to two disk failures using the bare minimum of two redundant disks. P+Q redundant disk arrays are structurally very similar to the block-interleaved distributed-parity disk arrays and operate in much the same manner. In particular, P+Q redundant disk arrays also perform small write operations using a read-modify-write procedure, except that instead of four disk accesses per write request, P+Q redundant disk arrays require six disk accesses due to the need to update both the P and Q information. This creates two parity blocks for each data block, and can handle two disk failures; the configuration is complex to implement in a RAID controller, as it has to calculate two sets of parity data for each data block.

Striped Mirrors (RAID Level 10)

RAID 10 was not mentioned in the original 1988 article that defined RAID 1 through RAID 5. The term is now used to mean the combination of RAID 0 (striping) and RAID 1 (mirroring). Disks are mirrored in pairs for redundancy and improved performance, then data is striped across multiple disks for maximum performance. In the diagram, Disks 0 & 2 and Disks 1 & 3 are mirrored pairs.

Obviously, RAID 10 uses more disk space to provide redundant data than RAID 5. However, it also provides a performance advantage by reading from all disks in parallel while eliminating the write penalty of RAID 5. In addition, RAID 10 gives better performance than RAID 5 while a failed drive remains unreplaced: under RAID 5, each attempted read of the failed drive can be performed only by reading all of the other disks, while on RAID 10 a failed disk can be recovered by a single read of its mirrored pair. (source: Reference 2)

RAID Systems Need Tape Backups

It is worth remembering an important point about RAID systems: even when you use a redundancy scheme like mirroring, RAID 5, or RAID 10, you must still do regular tape backups of your system. There are several reasons for insisting on this, among them:

- RAID does not protect you from multiple disk failures. While one disk is offline for any reason, your disk array is not fully redundant.
- Regular tape backups allow you to recover from data loss that is not related to a disk failure, including human errors, hardware errors, and software errors.

There are three important considerations when selecting the RAID level to be used for a system, namely cost, performance, and reliability. There are many different ways to measure these parameters; for example, performance could be measured as I/Os per second per dollar, bytes per second, or response time. We could also compare systems at the same cost, the same total user capacity, the same performance, or the same reliability. The method used largely depends on the application and the reason for the comparison. For example, in transaction processing applications the primary basis for comparison would be I/Os per second per dollar, while in scientific applications we would be more interested in bytes per second per dollar. In some heterogeneous systems, like file servers, both I/Os per second and bytes per second may be important. Sometimes it is important to consider reliability as the basis for comparison.

Taking a closer look at the RAID levels, we observe that most of the levels are similar to each other. RAID level 1 and RAID level 3 disk arrays can be viewed as a subclass of RAID level 5 disk arrays. Also, RAID level 2 and RAID level 4 disk arrays are generally found to be inferior to RAID level 5 disk arrays.
Given these observations, the problem of selecting among RAID levels 1 through 5 is a subset of the more general problem of choosing an appropriate parity group size and striping unit for RAID level 5 disk arrays.

Some Comparisons

Given below is a table that compares the throughput of various redundancy schemes for four types of I/O requests: small reads, small writes, large reads, and large writes. Remembering that the data has been spread over multiple disks (data striping), a small request refers to an I/O request of one stripe unit, while a large request refers to a request of one full stripe (one stripe unit from each disk in an error-correction group).

RAID Type    | Small Read    | Small Write    | Large Read | Large Write | Storage Efficiency
RAID Level 0 | 1             | 1              | 1          | 1           | 1
RAID Level 1 | 1             | 1/2            | 1          | 1/2         | 1/2
RAID Level 3 | 1/G           | 1/G            | (G-1)/G    | (G-1)/G     | (G-1)/G
RAID Level 5 | 1             | max(1/G, 1/4)  | 1          | (G-1)/G     | (G-1)/G
RAID Level 6 | 1             | max(1/G, 1/6)  | 1          | (G-2)/G     | (G-2)/G

G: the number of disks in an error-correction group.

The table above shows the maximum throughput per dollar, relative to RAID level 0, for RAID levels 0, 1, 3, 5 and 6. For practical purposes we consider RAID levels 2 and 4 inferior to RAID level 5 disk arrays, so those comparisons are not shown. The cost of a system is taken as directly proportional to the number of disks in the disk array. The table thus shows that, given a RAID level 0 system and a RAID level 1 system of equivalent cost, the RAID level 1 system can sustain half the number of small writes per second that the RAID level 0 system can. Equivalently, small writes are twice as expensive in a RAID level 1 system as in a RAID level 0 system.

The table also shows the storage efficiency of each RAID level. The storage efficiency is approximately the inverse of the cost of each unit of user capacity relative to a RAID level 0 system, and it is equal to the performance/cost metric for large writes. (source: Reference 1)

The figures above graph the performance/cost metrics from the table for RAID levels 1, 3, 5 and 6 over a range of parity group sizes. The performance/cost of RAID level 1 systems is equivalent to the performance/cost of RAID level 5 systems when the parity group size is equal to 2. The performance/cost of RAID level 3 systems is always less than or equal to the performance/cost of RAID level 5 systems. This is expected, given that a RAID level 3 system is a subclass of RAID level 5 systems derived by restricting the striping unit size such that all requests access exactly a parity stripe of data. Since the configuration of RAID level 5 systems is not subject to such a restriction, the performance/cost of RAID level 5 systems can never be less than that of an equivalent RAID level 3 system. Of course, such generalizations are specific to the models of disk arrays used in these experiments; in reality, a specific implementation of a RAID level 3 system can have better performance/cost than a specific implementation of a RAID level 5 system.

The question of which RAID level to use is therefore better expressed as the more general configuration questions concerning the size of the parity group and the striping unit. For a parity group size of 2, mirroring is desirable, while for a very small striping unit RAID level 3 would be suited. The figure below plots the performance/cost metrics from the table above for RAID levels 3, 5 and 6.

Reliability of an I/O system has become as important as its performance and cost. This part of the tutorial reviews the basic reliability provided by a block-interleaved parity disk array, and then lists and discusses three factors that can determine the potential reliability of disk arrays. Redundancy in disk arrays is motivated by the need to survive disk failures. Two key factors, MTTF (Mean Time To Failure) and MTTR (Mean Time To Repair), are of primary concern in estimating the reliability of any disk.
Following are formulae for the mean time to failure of the array:

RAID level 5:

MTTF(disk)^2 / (N * (G-1) * MTTR(disk))

Disk array with two redundant disks per parity group (e.g., P+Q redundancy):

MTTF(disk)^3 / (N * (G-1) * (G-2) * MTTR(disk)^2)

N is the total number of disks in the system; G is the number of disks in a parity group.

Factors Affecting Reliability

Three factors can dramatically affect the reliability of disk arrays: system crashes, uncorrectable bit errors, and correlated disk failures.

System Crashes

A system crash refers to any event, such as a power failure, operator error, hardware breakdown, or software crash, that can interrupt an I/O operation to a disk array. Such crashes can interrupt write operations, resulting in states where the data is updated but the parity is not, or vice versa. In either case, the parity is inconsistent and cannot be used in the event of a disk failure. Techniques such as redundant hardware and power supplies can make such crashes less frequent. System crashes can cause parity inconsistencies in both bit-interleaved and block-interleaved disk arrays, but the problem is of practical concern only in block-interleaved disk arrays. For reliability purposes, system crashes in block-interleaved disk arrays are similar to disk failures, in that they may result in the loss of correct parity for stripes that were being modified during the crash.

Uncorrectable Bit Errors

Most uncorrectable bit errors are generated because data is incorrectly written or gradually damaged as the magnetic media ages. These errors are detected only when we attempt to read the data. Our interpretation of uncorrectable bit error rates is that they represent the rate at which errors are detected during reads from the disk during the normal operation of the disk drive. One approach that can be used with or without redundancy is to try to protect against bit errors by predicting when a disk is about to fail. VAXsimPLUS, a product from DEC, monitors the warnings issued by disks and notifies an operator when it determines that a disk is about to fail.

Correlated Disk Failures

Correlated disk failures are caused by common environmental and manufacturing factors. For example, an accident might sharply increase the failure rate for all disks in a disk array for a short period of time. In general, power surges, power failures, and simply switching the disks on and off can place stress on the electrical components of all affected disks. Disks also share common support hardware; when this hardware fails, it can lead to multiple, simultaneous disk failures. In addition, disks are generally more likely to fail either very early or very late in their lifetimes. Early failures are frequently caused by transient defects that were not detected during the manufacturer's burn-in process; late failures occur when a disk wears out. Correlated disk failures greatly reduce the reliability of disk arrays by making it much more likely that an initial disk failure will be closely followed by additional disk failures before the failed disk can be reconstructed.

Mean Time To Data Loss (MTTDL)

Following are formulae to calculate the mean time to data loss (MTTDL).
In a block-interleaved parity-protected disk array, data loss commonly occurs in three ways: a double disk failure; a system crash followed by a disk failure; and a disk failure followed by an uncorrectable bit error during reconstruction. These are the hardest failure combinations, in that we currently have no techniques to protect against them without sacrificing performance.

RAID Level 5

Double disk failure: MTTF(disk)^2 / (N * (G-1) * MTTR(disk))

System crash + disk failure: (MTTF(system) * MTTF(disk)) / (N * MTTR(system))

Disk failure + bit error: MTTF(disk) / (N * (1 - p(disk)^(G-1)))

Software RAID: harmonic sum of the above

Hardware RAID: harmonic sum of the above, excluding the system crash + disk failure mode

Failure characteristics for RAID level 5 disk arrays. (source: Reference 1)

P+Q Disk Array

Triple disk failure: MTTF(disk)^3 / (N * (G-1) * (G-2) * MTTR(disk)^2)

System crash + disk failure: (MTTF(system) * MTTF(disk)) / (N * MTTR(system))

Double disk failure + bit error: MTTF(disk)^2 / (N * (G-1) * (1 - p(disk)^(G-2)) * MTTR(disk))

Software RAID: harmonic sum of the above

Hardware RAID: harmonic sum of the above, excluding the system crash + disk failure mode

Failure characteristics for a P+Q disk array. (source: Reference 1)
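To show how these failure modes combine, here is a minimal Python sketch of the RAID level 5 MTTDL computation, taking the harmonic sum of the modes tabulated above, in the spirit of the tool cited below (Reference 3). All names and parameters are illustrative; p_disk is the probability of reading all sectors on a disk, as defined in the text that follows.

def harmonic_sum(terms):
    """Combine independent failure modes: 1 / sum of reciprocals."""
    return 1.0 / sum(1.0 / t for t in terms)

def raid5_mttdl(n, g, mttf_disk, mttr_disk,
                mttf_sys, mttr_sys, p_disk, hardware=True):
    """MTTDL of a RAID 5 array from the failure modes tabulated above."""
    double_disk = mttf_disk ** 2 / (n * (g - 1) * mttr_disk)
    disk_plus_bit_error = mttf_disk / (n * (1 - p_disk ** (g - 1)))
    modes = [double_disk, disk_plus_bit_error]
    if not hardware:
        # Software RAID is also exposed to system crash + disk failure.
        modes.append(mttf_sys * mttf_disk / (n * mttr_sys))
    return harmonic_sum(modes)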

p(disk) = the probability of reading all sectors on a disk (derived from the disk size, sector size, and bit error rate).

A tool for computing reliability using the above equations is available. (source: Reference 3)

Redundant array of independent disks (RAID) is a storage technology used to improve the processing capability of storage systems. This technology is designed to provide reliability in disk-array systems and to take advantage of the performance gains offered by an array of multiple disks over single-disk storage. RAID's two primary underlying concepts are that distributing data over multiple hard drives improves performance, and that using multiple drives properly allows for any one drive to fail without loss of data and without system downtime. In the event of a disk failure, disk access continues normally and the failure is transparent to the host system.

Logical Drive

A logical drive is an array of independent physical drives. Increased availability, capacity, and performance are achieved by creating logical drives. A logical drive appears to the host the same as a local hard disk drive does.

FIGURE A-1 Logical Drive Including Multiple Physical Drives

Logical Volume

A logical volume is composed of two or more logical drives. A logical volume can be divided into a maximum of 32 partitions for Fibre Channel. During operation, the host sees a nonpartitioned logical volume, or a partition of a logical volume, as one single physical drive.

Local Spare Drive

A local spare drive is a standby drive assigned to serve one specified logical drive. When a member drive of that logical drive fails, the local spare drive becomes a member drive and automatically starts to rebuild.

Global Spare Drive

A global spare drive does not serve only one specified logical drive. When a member drive from any logical drive fails, the global spare drive joins that logical drive and automatically starts to rebuild.

Channels

You can connect up to 15 devices (excluding the controller itself) to a SCSI channel when the Wide function is enabled (16-bit SCSI). You can connect up to 125 devices to a Fibre Channel (FC) channel in loop mode. Each device has a unique ID that identifies the device on the SCSI bus or FC loop.

A logical drive consists of a group of SCSI drives, Fibre Channel drives, or SATA drives. Physical drives in one logical drive do not have to come from the same SCSI channel, and each logical drive can be configured for a different RAID level.

A drive can be assigned as the local spare drive of one specified logical drive, or as a global spare drive. A spare is not available for logical drives that have no data redundancy (RAID 0).

FIGURE A-2 Allocation of Drives in Logical Drive Configurations

You can divide a logical drive or logical volume into several partitions, or use the entire logical drive as a single partition.

FIGURE A-3 Partitions in Logical Drive Configurations

Each partition is mapped to LUNs under host SCSI IDs, or to IDs on host channels. Each SCSI ID/LUN acts as one individual hard drive to the host computer.

FIGURE A-4 Mapping Partitions to Host ID/LUNs

FIGURE A-5 Mapping Partitions to LUNs Under an ID

RAID Levels

There are several ways to implement a RAID array, using a combination of mirroring, striping, duplexing, and parity technologies. These various techniques are referred to as RAID levels. Each level offers a mix of performance, reliability, and cost. Each level uses a distinct algorithm to implement fault tolerance.

There are several RAID level choices: RAID 0, 1, 3, 5, 1+0, 3+0 (30), and 5+0 (50). RAID levels 1, 3, and 5 are the most commonly used. The following table provides a brief overview of the RAID levels.

TABLE A-1 RAID Level Overview

RAID Level | Description | Number of Drives Supported | Capacity | Redundancy
0 | Striping | 2-36 | N | No
1 | Mirroring | 2 | N/2 | Yes
1+0 | Mirroring and striping | 4-36 (even number only) | N/2 | Yes
3 | Striping with dedicated parity | 3-31 | N-1 | Yes
5 | Striping with distributed parity | 3-31 | N-1 | Yes
3+0 (30) | Striping of RAID 3 logical drives | 2-8 logical drives | N - (number of logical drives) | Yes
5+0 (50) | Striping of RAID 5 logical drives | 2-8 logical drives | N - (number of logical drives) | Yes

Capacity refers to the total number (N) of physical drives available for data storage. For example, if the capacity is N-1 and the logical drive contains six 36-Mbyte drives, the disk space available for storage is equal to five drives (5 x 36 Mbyte = 180 Mbyte). The -1 refers to the parity data striped across the six drives, which provides redundancy and occupies the equivalent of one drive.

For RAID 3+0 (30) and 5+0 (50), capacity refers to the total number of physical drives (N) minus one physical drive for each logical drive in the volume. For example, if the logical drive contains twenty 36-Mbyte drives and the total number of logical drives is 2, the disk space available for storage is equal to 18 drives (18 x 36 Mbyte = 648 Mbyte).

RAID 0

RAID 0 implements block striping, where data is broken into logical blocks and striped across several drives. Unlike other RAID levels, there is no facility for redundancy; in the event of a disk failure, data is lost.

In block striping, the total disk capacity is equivalent to the sum of the capacities of all drives in the array. This combination of drives appears to the system as a single logical drive.

RAID 0 provides the highest performance. It is fast because data can be simultaneously transferred to or from every disk in the array, and reads and writes to separate drives can be processed concurrently. A sketch of the block-to-disk mapping follows.
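Here is a minimal Python sketch of the round-robin block mapping just described. The layout is illustrative only; an actual controller's striping algorithm and stripe-unit size may differ.

def raid0_map(logical_block: int, num_disks: int):
    """Map a logical block to (disk index, block offset on that disk)."""
    return logical_block % num_disks, logical_block // num_disks

# With 4 drives, logical blocks 0..7 land on disks 0, 1, 2, 3, 0, 1, 2, 3,
# so consecutive blocks can be transferred concurrently.
print([raid0_map(b, 4) for b in range(8)])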

FIGURE A-6 RAID 0 Configuration

RAID 1

RAID 1 implements disk mirroring, where a copy of the same data is recorded onto two drives. By keeping two copies of data on separate disks, data is protected against a disk failure. If, at any time, a disk in the RAID 1 array fails, the remaining good disk (the copy) can provide all of the data needed, thus preventing downtime.

In disk mirroring, the total usable capacity is equivalent to the capacity of one drive in the RAID 1 array. Thus, combining two 1-Gbyte drives, for example, creates a single logical drive with a total usable capacity of 1 Gbyte. This combination of drives appears to the system as a single logical drive.

Note - RAID 1 does not allow expansion. RAID levels 3 and 5 permit expansion by adding drives to an existing array.

FIGURE A-7 RAID 1 Configuration

In addition to the data protection that RAID 1 provides, this RAID level also improves performance. In cases where multiple concurrent I/Os are occurring, the I/O can be distributed between the disk copies, reducing total effective data access time.

RAID 1+0

RAID 1+0 combines RAID 0 and RAID 1 to offer mirroring and disk striping. Using RAID 1+0 is a time-saving feature that enables you to configure a large number of disks for mirroring in one step. It is not a standard RAID level option that you can select; it does not appear in the list of RAID level options supported by the controller. If four or more disk drives are chosen for a RAID 1 logical drive, RAID 1+0 is performed automatically. A sketch of the resulting placement follows.
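The following minimal Python sketch illustrates RAID 1+0 placement: blocks are striped across mirror pairs, and every block is written to both members of its pair. The pairing convention matches the earlier RAID 10 diagram (with four disks, Disks 0 & 2 and Disks 1 & 3 are mirrored pairs); it is illustrative, not the controller's actual algorithm.

def raid10_map(logical_block: int, num_pairs: int):
    """Return the two physical (disk, offset) targets for one block."""
    pair = logical_block % num_pairs        # stripe across mirror pairs
    offset = logical_block // num_pairs
    # Disk k and disk k + num_pairs mirror each other.
    return (pair, offset), (pair + num_pairs, offset)

# With 4 disks (2 pairs), block 0 is written to disks 0 and 2,
# block 1 to disks 1 and 3, block 2 to disks 0 and 2 again, and so on.
print([raid10_map(b, 2) for b in range(4)])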

FIGURE A-8 RAID 1+0 Configuration

RAID 3

RAID 3 implements block striping with dedicated parity. This RAID level breaks data into logical blocks, the size of a disk block, and then stripes these blocks across several drives. One drive is dedicated to parity. In the event that a disk fails, the original data can be reconstructed using the parity information and the information on the remaining disks; a sketch of this reconstruction follows.

In RAID 3, the total disk capacity is equivalent to the sum of the capacities of all drives in the combination, excluding the parity drive. Thus, combining four 1-Gbyte drives, for example, creates a single logical drive with a total usable capacity of 3 Gbyte. This combination appears to the system as a single logical drive.

RAID 3 provides increased data transfer rates when data is being read in small chunks or sequentially. However, in write operations that do not span every drive, performance is reduced: the information stored on the parity drive must be recalculated and rewritten every time new data is written, which limits simultaneous I/O.
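As referenced above, here is a minimal Python sketch of that reconstruction: the missing block is the XOR of the same-offset blocks on every surviving drive, data and parity alike. The helper name and bytes representation are illustrative assumptions.

from functools import reduce

def reconstruct(surviving_blocks):
    """XOR together same-offset blocks from all surviving drives."""
    return reduce(lambda a, b: bytes(x ^ y for x, y in zip(a, b)),
                  surviving_blocks)

# Example: with parity = d0 XOR d1 XOR d2, a lost d1 is recovered as
# d1 = reconstruct([d0, d2, parity]).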

FIGURE A-9 RAID 3 Configuration

RAID 5

RAID 5 implements multiple-block striping with distributed parity. This RAID level offers redundancy, with the parity information distributed across all disks in the array. Data and its parity are never stored on the same disk. In the event that a disk fails, the original data can be reconstructed using the parity information and the information on the remaining disks. A sketch of the rotating parity placement follows.
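As referenced above, this minimal Python sketch shows one common convention for rotating the parity unit across the disks (the left-symmetric layout discussed earlier in this document); a real controller's layout may differ.

def raid5_parity_disk(stripe: int, num_disks: int) -> int:
    """Disk holding the parity unit for a given stripe."""
    return (num_disks - 1 - stripe) % num_disks

# With 5 disks, parity rotates across stripes: 4, 3, 2, 1, 0, 4, ...
print([raid5_parity_disk(s, 5) for s in range(6)])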

FIGURE A-10 RAID 5 Configuration

RAID 5 offers increased data transfer rates when data is accessed in large chunks or randomly, and reduced data access time during many simultaneous I/O cycles.

Advanced RAID Levels

Advanced RAID levels require the use of the array's built-in volume manager. These combination RAID levels provide the protection benefits of RAID 1, 3, or 5 combined with the performance of striping (RAID 0). To use advanced RAID, first create two or more RAID 1, 3, or 5 arrays, and then join them. The following table provides a description of the advanced RAID levels.

TABLE A-2 Advanced RAID Levels

RAID Level | Description
RAID 3+0 (30) | RAID 3 logical drives that have been joined together using the array's built-in volume manager.
RAID 5+0 (50) | RAID 5 logical drives that have been joined together using the array's built-in volume manager.

Local and Global Spare Drives

The external RAID controllers provide both local spare drive and global spare drive functions. A local spare drive is used only for one specified logical drive; a global spare drive can be used for any logical drive on the array.

A local spare drive always has higher priority than a global spare drive: if a drive fails and both types of spare of sufficient size are available, the local spare is used.

If there is a failed drive in a RAID 5 logical drive, replace the failed drive with a new drive to keep the logical drive working. To identify a failed drive, refer to the Sun StorEdge 3000 Family RAID Firmware User's Guide for your array.

Caution - If, when trying to remove a failed drive, you mistakenly remove the wrong drive, you can no longer access the logical drive, because you have incorrectly failed another drive.

A local spare drive is a standby drive assigned to serve one specified logical drive (see FIGURE A-11). When a member drive of that logical drive fails, the local spare drive becomes a member drive and automatically starts to rebuild.

FIGURE A-11 Local (Dedicated) Spare

A global spare drive is available to all logical drives rather than serving only one (see FIGURE A-12). When a member drive from any logical drive fails, the global spare drive joins that logical drive and automatically starts to rebuild.

FIGURE A-12 Global Spare

Having Both Local and Global Spares

In FIGURE A-13, the member drives in logical drive 0 are 9-Gbyte drives, and the members in logical drives 1 and 2 are all 4-Gbyte drives.

FIGURE A-13 Mixing Local and Global Spares

As noted above, a local spare drive always has higher priority than a global spare drive: if a drive fails and both a local spare and a global spare are available, the local spare is used. In FIGURE A-13, however, it is not possible for the 4-Gbyte global spare drive to join logical drive 0, because of its insufficient capacity. The 9-Gbyte local spare drive aids logical drive 0 once a drive in that logical drive fails. If the failed drive is in logical drive 1 or 2, the 4-Gbyte global spare drive immediately aids the failed drive. A short sketch of this spare-selection rule closes this section.

(source: Reference 1)
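As promised above, here is a minimal Python sketch of the spare-selection rule: a local spare assigned to the failed logical drive takes priority over a global spare, and a spare must be at least as large as the failed member. The data model is hypothetical, not the array firmware's actual interface.

def pick_spare(failed_size, local_spares, global_spares):
    """Return the size of a usable spare, preferring local spares."""
    for pool in (local_spares, global_spares):
        for spare in sorted(pool):          # smallest sufficient spare
            if spare >= failed_size:
                return spare
    return None  # no usable spare; the failed drive must be replaced

# The FIGURE A-13 scenario: a 9-Gbyte member of logical drive 0 fails.
# The 4-Gbyte global spare is too small, so the 9-Gbyte local spare is used.
print(pick_spare(9, local_spares=[9], global_spares=[4]))   # 9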