RAID implementation levels

Written by Ralph Kitzper. Published June 2010.


Just connecting four drives to a SCSI controller won't create a drive array. An array requires special electronics to handle the digital coding and control of the individual drives. The electronics of these systems are proprietary to their manufacturers. The array controller then connects to your PC through a proprietary or standard interface. SCSI is becoming the top choice. Currently, most drive arrays are assembled by computer manufacturers for their own systems, but a growing number are becoming available as plug-in additions to PCs.

In 1988, three researchers at the University of California at Berkeley (David A. Patterson, Garth Gibson, and Randy H. Katz) first outlined five disk array models in a paper entitled "A Case for Redundant Arrays of Inexpensive Disks." They called their models RAID Levels and labeled them RAID 1 through 5. The numerical designations were arbitrary and were not meant to indicate that RAID 1 is better or worse than RAID 5. The numbers simply provide a label for each technology that can be readily understood by the cognoscenti.

In 1993, these levels were formalized in the first edition of the RAIDBook, published by the RAID Advisory Board, an association of suppliers and consumers of RAID-related mass storage products. The book serves one of the RAID Advisory Board's principal objectives, the standardization of the terminology of RAID-related technology. Although the board does not officially set standards, it does prepare them for submission to the recognized standards organizations. The board also tests the function and performance of RAID products and verifies that they perform a basic set of functions correctly.

The RAID Advisory Board currently recognizes nine RAID implementation levels. Five of these conform to the original Berkeley RAID definitions. The other four terms used and acknowledged by the board are RAID Level 0, RAID Level 6, RAID Level 10, and RAID Level 53.

RAID Level 0

Early workers used the term RAID Level 0 to refer to the absence of any array technology. According to the RAID Advisory Board, however, the term refers to an array that simply uses data striping to distribute data across several physical disks. Although this RAID Level 0 offers no greater reliability than the worst of the physical drives making up the array, it can improve the performance of the overall storage system.
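The striping idea can be illustrated with a short sketch. It assumes the simplest possible scheme, in which consecutive logical blocks go to consecutive drives in round-robin order; the function name `locate_block` and the block-sized stripe unit are illustrative choices, not part of any RAID Advisory Board definition.

```python
def locate_block(logical_block, num_drives):
    """Map a logical block number to (drive index, block offset on that drive).

    Assumes round-robin striping with a stripe unit of one block.
    """
    if num_drives < 1:
        raise ValueError("an array needs at least one drive")
    drive = logical_block % num_drives    # which physical drive holds the block
    offset = logical_block // num_drives  # position of the block on that drive
    return drive, offset


# Where the first eight logical blocks land in a four-drive array.
layout = [locate_block(block, 4) for block in range(8)]
```

Because neighboring blocks sit on different drives, a long sequential transfer can keep all the drives busy at once, which is where the performance gain comes from.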

RAID Level 1

The simplest of drive arrays, RAID Level 1, consists of two equal-capacity disks that mirror one another. One disk duplicates all the files of the other, essentially serving as a backup copy. Should one of the drives fail, the other can serve in its stead.

This reliability is the chief advantage of RAID Level 1 technology. The entire system has the same capacity as one of its drives alone. In other words, the RAID Level 1 system yields only 50 percent of its potential storage capacity, making it the most expensive array implementation. Performance depends on the sophistication of the array controller. Simple systems deliver exactly the performance of one of the drives in the array. A more sophisticated controller could potentially double data throughput by simultaneously reading alternate sectors from both drives. Upon the failure of one of the drives, performance reverts to that of a single drive, but no information (and no network time) is lost.
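A toy model may make the mirroring behavior concrete. The class below is a hypothetical in-memory sketch, not a real controller: every write goes to both drives, reads alternate between the drives by sector number, and a read falls back to the surviving drive when its preferred drive has failed.

```python
class MirroredPair:
    """Toy in-memory model of a two-drive RAID Level 1 array (illustrative only)."""

    def __init__(self, num_sectors):
        self.drives = [[b""] * num_sectors, [b""] * num_sectors]
        self.failed = [False, False]

    def write(self, sector, data):
        # Every write is duplicated on each working drive.
        for index in (0, 1):
            if not self.failed[index]:
                self.drives[index][sector] = data

    def read(self, sector):
        # Even sectors prefer drive 0 and odd sectors prefer drive 1,
        # so a sequential read is spread across both drives.
        preferred = sector % 2
        for index in (preferred, 1 - preferred):
            if not self.failed[index]:
                return self.drives[index][sector]
        raise OSError("both drives in the mirror have failed")


pair = MirroredPair(num_sectors=4)
pair.write(1, b"payroll")
pair.failed[1] = True       # simulate losing one drive
recovered = pair.read(1)    # still served from the surviving drive
```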

RAID Level 2

The next step up in array sophistication is RAID Level 2, which interleaves bits or blocks of data as explained earlier in the description of drive arrays. The individual drives in the array operate in parallel, typically with their spindles synchronized.

To improve reliability, RAID Level 2 systems use redundant disks to correct single-bit errors and detect double-bit errors. The number of extra disks needed depends on the error-correction algorithm used. For example, with a Hamming code extended by an overall parity bit, an array of eight data drives needs five error-correction drives, and a high-end array with 32 data drives needs seven. The data, complete with error-correction code, is delivered directly to the array controller. The controller can instantly recognize and correct errors as they occur, without slowing the speed at which information is read and transferred to the host computer.
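How many check drives a given array needs can be estimated with a short calculation. The sketch below assumes a Hamming code, which needs the smallest r satisfying 2^r >= m + r + 1 for m data drives, plus one overall parity drive when double-bit errors must also be detected. Other codes give different counts, and the function name `check_drives` is purely illustrative.

```python
def check_drives(data_drives, detect_double=True):
    """Number of check drives for a Hamming-coded array of `data_drives` drives.

    Assumes one bit per drive per code word. With detect_double=True an
    overall parity drive is added (single-error correction plus
    double-error detection).
    """
    if data_drives < 1:
        raise ValueError("need at least one data drive")
    r = 0
    while 2 ** r < data_drives + r + 1:
        r += 1
    return r + 1 if detect_double else r


small_array = check_drives(8)    # 5 under these assumptions
large_array = check_drives(32)   # 7 under these assumptions
```

The overhead shrinks in relative terms as the array grows, which is one reason the scheme suits large arrays better than small ones.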

The RAID Level 2 design anticipates that disk errors occur often, almost regularly. At one time, mass storage devices might have been error-prone, but no longer. Consequently, RAID Level 2 can be overkill except in the most critical of circumstances.

The principal benefit of RAID Level 2 is performance. Because of their purely parallel nature, it and RAID Level 3 are the best-performing array technologies, at least in systems that require a single, high-speed stream of data. In other words, it yields a high data transfer rate. Depending on the number of drives in the array, an entire byte or even a 32-bit double-word could be read in the same period it would take a single drive to read one bit. Normal single-bit disk errors don't hinder this performance in any way because of RAID Level 2's on-the-fly error correction.

The primary defect in the RAID Level 2 design arises from its basic storage unit being multiple sectors. As with any hard disk, the smallest unit each drive in the array can store is one sector. File sizes must increase in units of multiple sectors, one drawn from each drive. In a ten-drive array, for example, even the tiniest two-byte file would steal ten sectors (5120 bytes) of disk space. (Under DOS, which uses clusters of four sectors, the two-byte file would take a total of 20480 bytes!) In actual applications this drawback is not severe because systems that need the single-stream speed and instant error correction of RAID Level 2 also tend to be those using large files, for example, mainframes.
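The arithmetic behind those figures is easy to reproduce. The sketch below assumes 512-byte sectors, as the example above implies, and treats the allocation unit as one sector (or one cluster) on every drive in the array; the names `SECTOR_BYTES` and `allocated_bytes` are illustrative.

```python
SECTOR_BYTES = 512  # sector size implied by the ten-sector, 5120-byte example


def allocated_bytes(file_bytes, num_drives, sectors_per_cluster=1):
    """Disk space consumed by a file when each allocation unit spans all drives."""
    unit = SECTOR_BYTES * num_drives * sectors_per_cluster
    units = -(-file_bytes // unit)  # ceiling division
    return units * unit


plain = allocated_bytes(2, num_drives=10)                            # 5120
with_dos = allocated_bytes(2, num_drives=10, sectors_per_cluster=4)  # 20480
```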

RAID Level 3

This level is one step down from RAID Level 2. Although RAID Level 3 still uses multiple drives operating in parallel, interleaving bits or blocks of data, it replaces full error correction with simple parity checking. Parity by itself can detect an error but cannot identify which drive produced it, so recovery is not guaranteed unless the faulty drive is known.

Parity checking requires fewer extra drives in the array, typically only one per array, making it a less expensive alternative. When a parity error is detected, the RAID Level 3 controller reads the entire array again to get it right. This re-reading imposes a substantial performance penalty: the disks must spin entirely around again, yielding a delay of about 17 milliseconds (one revolution at 3600 rpm) in reading the data. Of course, the delay appears only when disk errors are detected. Modern hard disks offer such high reliability that the delays are rare. In effect, RAID Level 3 trades RAID Level 2's extra drives for a slight performance penalty that occurs only rarely.
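The parity drive's contents can be pictured as the byte-wise exclusive-OR of the corresponding blocks on the data drives. The following sketch assumes that XOR scheme. It also shows that when the controller knows which drive has failed (an assumption that goes beyond plain parity checking), the missing block can be rebuilt from the survivors. The function names are illustrative.

```python
from functools import reduce


def parity(blocks):
    """Byte-wise XOR of equal-length blocks, one block per drive."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


def rebuild(surviving_blocks, parity_block):
    """Recover the block of a drive known to have failed."""
    return parity(list(surviving_blocks) + [parity_block])


data = [b"\x0f\xf0", b"\x33\x33", b"\x55\xaa"]  # blocks on three data drives
check = parity(data)                            # block on the parity drive

# A clean read XORs to zero across data and parity; any single flipped bit does not.
clean = parity(data + [check]) == b"\x00\x00"

# If the second drive is known to be dead, its block can be recomputed.
restored = rebuild([data[0], data[2]], check)
```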

RAID Level 4

This level interleaves not bits or blocks but sectors. The sectors are read serially, as if the drives in the array were functionally one large drive with more heads and platters. (Of course, for higher performance a controller with adequate buffering could read two or more sectors at the same time, storing the later sectors in fast RAM and delivering them immediately after the preceding sector has been sent to the host computer.) For reliability, one drive in the array is dedicated to parity checking. RAID Level 4 earns favor because it permits small arrays of as few as two drives, although larger arrays make more efficient use of the available disk storage.

The dedicated parity drive is the biggest weakness of the RAID Level 4 scheme. In writing, RAID Level 4 maintains the parity drive by reading the data drives, updating the parity information, then writing the update to the parity drive. This read-update-write cycle adds a performance penalty to every write, although read operations are unhindered.
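The write penalty can be seen in a small sketch. It assumes XOR parity and uses a common shortcut in which the controller reads only the old data sector and the old parity sector rather than every data drive; that shortcut is an assumption of this example, not something the RAID Level 4 definition prescribes. The names `xor_blocks` and `updated_parity` are illustrative.

```python
def xor_blocks(*blocks):
    """XOR equal-length blocks byte by byte."""
    result = bytearray(len(blocks[0]))
    for block in blocks:
        for i, value in enumerate(block):
            result[i] ^= value
    return bytes(result)


def updated_parity(old_parity, old_data, new_data):
    """New parity after one data sector changes: cancel the old data, add the new."""
    return xor_blocks(old_parity, old_data, new_data)


stripe = [b"\x01\x02", b"\x10\x20", b"\xaa\x55"]  # sectors on three data drives
parity_block = xor_blocks(*stripe)                # sector on the parity drive

# Overwriting one sector costs two reads (old data, old parity)
# and two writes (new data, new parity).
new_sector = b"\xff\x00"
parity_block = updated_parity(parity_block, stripe[1], new_sector)
stripe[1] = new_sector
```

Every such update touches the one parity drive, so concurrent writes queue up behind it even when they target different data drives.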

RAID Level 4 offers an extra benefit for operating systems that can process multiple data requests simultaneously. An intelligent RAID Level 4 controller can process multiple input/output requests, reorganize them, and read its drives in the most efficient manner, perhaps even in parallel. For example, while a sector from one file is being read from one drive, a sector from another file can be read from another drive. This parallel operation can improve the effective throughput of such operating systems.

RAID Level 5

This level eliminates the dedicated parity drive from the RAID Level 4 array and allows the parity-check function to rotate through the various drives in the array. Error checking is thus distributed across all disks in the array. In properly designed implementations, enough redundancy can be built in to make the system fault tolerant.
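One way to picture the rotation is a function that names the parity drive for each stripe. The rotation order used here (parity starts on the last drive and moves one drive to the left with each stripe) is only an assumption for illustration; real implementations differ in the order they use.

```python
def parity_drive(stripe, num_drives):
    """Index of the drive holding parity for a given stripe (assumed rotation)."""
    return (num_drives - 1 - stripe) % num_drives


def stripe_layout(stripe, num_drives):
    """Mark each drive in a stripe as data ("D") or parity ("P")."""
    holder = parity_drive(stripe, num_drives)
    return ["P" if drive == holder else "D" for drive in range(num_drives)]


# Layout of the first four stripes of a four-drive array.
for stripe in range(4):
    print(stripe, " ".join(stripe_layout(stripe, 4)))
```

Over any run of as many stripes as there are drives, each drive holds parity exactly once, so parity updates no longer pile up on a single drive.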

RAID Level 5 is probably the most popular drive-array technology currently in use because it works with almost any number of drives, including arrays as small as two, yet permits redundancy and fault tolerance to be built in.

RAID Level 6

To further improve the fault tolerance of RAID Level 5, the same Berkeley researchers who developed the initial five RAID levels proposed one more, now known as RAID Level 6. This level adds a second parity drive to the RAID Level 5 array. The chief benefit is that any two drives in the array can fail without the loss of data. This enables an array to remain in active service, and remain fault tolerant, while an individual physical drive is being repaired. In effect, a RAID Level 6 array with a single failed physical disk becomes a RAID Level 5 array. The drawback of the RAID Level 6 design is that it requires two parity blocks to be written during every write operation. Its write performance is extremely low, although read performance can achieve levels on par with RAID Level 5.

RAID Level 10

Some arrays employ multiple RAID technologies. RAID Level 10 represents a layering of RAID Levels 0 and 1 to combine the benefits of each. (Sometimes RAID Level 10 is called RAID Level 0&1 to point more specifically at its origins.) To improve input/output performance, RAID Level 10 employs data striping, splitting data blocks between multiple drives. Moreover, the Array Management Software can further speed read operations by filling multiple operations simultaneously from the two mirrored arrays (at times when both halves of the mirror are functional, of course). To improve reliability, this RAID level uses mirroring so that the striped arrays are exactly duplicated. This technology achieves the benefits of both of its individual layers. Its chief drawback is cost. As with simple mirroring, it doubles the amount of physical storage needed for a given amount of logical storage.
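The layering can be sketched as a mapping from one logical block to the two physical locations that hold it. The example assumes two identical striped arrays, with drives 0 to k-1 forming one half of the mirror and drives k to 2k-1 forming the other; the drive numbering and the name `raid10_locate` are illustrative choices.

```python
def raid10_locate(logical_block, stripe_width):
    """Return both (drive, offset) locations of a logical block.

    Assumes two mirrored striped arrays of `stripe_width` drives each,
    with round-robin striping inside each half.
    """
    if stripe_width < 1:
        raise ValueError("each striped array needs at least one drive")
    column = logical_block % stripe_width
    offset = logical_block // stripe_width
    primary = (column, offset)                # copy in the first striped array
    mirror = (stripe_width + column, offset)  # identical copy in the second
    return [primary, mirror]


copies = raid10_locate(3, stripe_width=2)  # [(1, 1), (3, 1)]
```

A read can be satisfied from either location, while a write must reach both, which is why the scheme needs twice the physical storage.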

RAID Level 53

This level represents a layering of RAID Level 0 and RAID Level 3: the incoming data is striped between two RAID Level 3 arrays. The capacity of the RAID Level 53 array is the total of the capacity of the individual underlying RAID Level 3 arrays. Input/output performance is enhanced by the striping between multiple arrays. Throughput is improved by the underlying RAID Level 3 arrays. Because the simple striping of the top RAID Level 0 layer adds no redundant data, reliability falls. RAID Level 3 arrays, however, are inherently so fault tolerant that the overall reliability of the RAID Level 53 array far exceeds that of an individual hard disk drive. As with a RAID Level 3 array, the failure of a single drive will not adversely affect data integrity.

Which implementation is best depends on what you most want to achieve with a drive array: efficient use of drive capacity, the fewest drives, the greatest reliability, or the quickest performance. For example, RAID 1 provides the greatest redundancy (thus reliability), and RAID 2 the best performance (followed closely by RAID 3).
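For the capacity side of that trade-off, a rough comparison can be computed from the descriptions above. The sketch assumes equal-sized drives and counts one drive's worth of parity for Levels 3, 4, and 5 and two for Level 6. Levels 2 and 53 are left out because their overhead depends on the error-correction code and on the sub-array sizes. The function name `usable_fraction` is illustrative.

```python
def usable_fraction(level, num_drives):
    """Fraction of raw capacity available for data, assuming equal-sized drives."""
    if level == 0:
        return 1.0                            # striping only, no redundancy
    if level in (1, 10):
        return 0.5                            # everything is stored twice
    if level in (3, 4, 5):
        return (num_drives - 1) / num_drives  # one drive's worth of parity
    if level == 6:
        return (num_drives - 2) / num_drives  # two drives' worth of parity
    raise ValueError("level not covered by this sketch")


for level, drives in [(0, 4), (1, 2), (5, 4), (6, 6), (10, 4)]:
    print(f"RAID {level} with {drives} drives: {usable_fraction(level, drives):.0%} usable")
```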
