NEC Computers SAS - Confidential - Oct 2008 - RAID General Concept
RAID General Concept
Author: Franck THOMAS
RAID Basics
RAID: Redundant Array of Independent (or Inexpensive) Disks
Definition:
Simultaneous use of two or more drives to add fault tolerance, capacity and/or performance to a data storage system.
HDD Performance
Disk technologies differ in price and performance.
Specifications that affect performance: rotation speed, cache size, cache management (NCQ), ports (single or dual).

Technology   7.2k RPM   10k RPM   15k RPM   Dual port (= full duplex)
SATA         YES        YES       NO        NO
SCSI         NO         YES       YES       NO
SAS          NO         YES       YES       YES

SCSI hard disks are no longer used in servers or workstations since the end of 2008. They have been replaced by SAS HDDs.
RAID 0
RAID 0 (striping):
Data are striped across all disks
Offers performance
No redundancy
2 disks minimum; the maximum depends on the RAID controller
Data are split according to the stripe size (16/32/64/128 KB)
With software RAID, a concatenation / spanning mode is also available
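As an illustration of how striping spreads data, the sketch below maps a logical byte offset to a disk and an offset on that disk. It assumes a simple round-robin layout; the function name `locate_block` and the 64 KB default are choices made for this example, not something a given controller guarantees.

```python
def locate_block(offset_bytes, num_disks, stripe_size_kb=64):
    """Map a logical byte offset to (disk index, byte offset on that disk).

    Assumption: stripe units are laid out round-robin, disk 0 first.
    """
    stripe_bytes = stripe_size_kb * 1024
    stripe_number = offset_bytes // stripe_bytes   # index of the stripe unit overall
    disk = stripe_number % num_disks               # round-robin across the disks
    row = stripe_number // num_disks               # full rows written before this unit
    offset_on_disk = row * stripe_bytes + offset_bytes % stripe_bytes
    return disk, offset_on_disk


if __name__ == "__main__":
    # Walk through the first four 64 KB stripe units of a 2-disk RAID 0.
    for unit in range(4):
        print(unit, locate_block(unit * 64 * 1024, num_disks=2))
```

Consecutive stripe units land on alternating disks, which is why sequential transfers can keep all disks busy at once.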
JBOD
Disk “JBOD”: Just a Bunch Of Disks
A JBOD disk is physically connected to the RAID controller but does not use the RAID functionality.
The disk is usable as if it were connected to a simple SCSI controller.
The goal is to have RAID drives and non-RAID drives in the same system.
RAID 1
RAID 1 (mirroring):
Data are duplicated on a second hard disk
Offers redundancy
The equivalent of one disk is lost to redundancy
Uses exactly 2 disks
Tolerates one disk failure
RAID 10
RAID 10 (striping + mirroring):
Aggregation of several mirrors
Offers redundancy
Offers performance
Half of the physical space is lost to duplication
Even number of disks required (4 minimum)
Tolerates one disk failure per mirror
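The rule "one disk failure per mirror" can be sketched as a small check. The pairing convention (disks 2k and 2k+1 form mirror k) and the name `raid10_survives` are assumptions made for this example; real controllers may pair disks differently.

```python
def raid10_survives(num_disks, failed):
    """Return True if a RAID 10 still holds all data after the given failures.

    Assumption: disks 2k and 2k+1 form mirror pair k.
    """
    if num_disks < 4 or num_disks % 2:
        raise ValueError("RAID 10 needs an even number of disks, 4 minimum")
    failed = set(failed)
    # The array is lost only if both members of some mirror have failed.
    return all(not ({2 * k, 2 * k + 1} <= failed) for k in range(num_disks // 2))


if __name__ == "__main__":
    print(raid10_survives(4, [0, 2]))  # one failure in each mirror
    print(raid10_survives(4, [0, 1]))  # both disks of the same mirror
```

Two failures are survivable when they hit different mirrors, and fatal when they hit the same one.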
RAID 5
RAID 5 (striping with parity):
Data striped across all disks
Redundancy provided by parity (XOR logical operator)
Parity distributed across all disks
The equivalent of one disk is used for parity storage (1/n of the raw capacity is lost)
3 disks minimum; the maximum depends on the controller
Tolerates one disk failure
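To make the XOR parity concrete, here is a minimal sketch for a single stripe: the parity strip is the XOR of the data strips, and any one lost strip is the XOR of the survivors. The strip contents and the helper name `xor_blocks` are illustrative, and the rotation of parity across disks is left out.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, byte by byte."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


data = [b"AAAA", b"BBBB", b"CCCC"]   # three data strips of one stripe
parity = xor_blocks(data)            # strip stored on the parity position

# Simulate the loss of the second strip and rebuild it from the survivors.
survivors = [data[0], data[2], parity]
rebuilt = xor_blocks(survivors)

if __name__ == "__main__":
    print(rebuilt == data[1])
```

The same operation serves for writing parity and for rebuilding, which is why a single parity strip covers exactly one failure.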
RAID 6 – Triple Mirror
RAID 6-TM:
Data mirrored on 3 disks
Tolerates up to 2 disk failures
The equivalent of 2 disks is used for redundancy
Exactly 3 disks (minimum and maximum)
RAID 6 – Double Parity
RAID 6 (striping with double parity):
Data striped across all disks
Redundancy provided by parity (XOR logical operator)
Two parity blocks per stripe, distributed in rotation across all disks
The equivalent of 2 disks is used for parity storage (2/n of the raw capacity is lost)
4 disks minimum; the maximum depends on the controller
Tolerates 2 disk failures
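The capacity rules given for RAID 0, 1, 10, 5 and 6 can be gathered into one helper. This is a sketch under the assumption that all disks have the same size; the table `RULES` and the function name `usable_capacity_gb` are made up for the example, and the COD signature overhead is ignored.

```python
# Per level: (minimum disks, extra validity check, number of disks' worth of usable space).
RULES = {
    "0":  (2, lambda n: True,       lambda n: n),
    "1":  (2, lambda n: n == 2,     lambda n: 1),
    "10": (4, lambda n: n % 2 == 0, lambda n: n // 2),
    "5":  (3, lambda n: True,       lambda n: n - 1),
    "6":  (4, lambda n: True,       lambda n: n - 2),
}


def usable_capacity_gb(level, num_disks, disk_size_gb):
    """Usable space of an array of identical disks, before any signature overhead."""
    minimum, is_valid, usable_disks = RULES[level]
    if num_disks < minimum or not is_valid(num_disks):
        raise ValueError(f"RAID {level} cannot be built from {num_disks} disks")
    return usable_disks(num_disks) * disk_size_gb


if __name__ == "__main__":
    for level in ("0", "10", "5", "6"):
        print(level, usable_capacity_gb(level, 4, 300))
```

With four 300 GB disks, the helper shows directly how much space each level trades for redundancy.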
Non-exhaustive list of «exotic / obsolete» RAID levels
RAID 0+1: a mirror of 2 RAID 0 arrays
RAID 1E: a RAID 0 where stripes are written twice and distributed across several disks (= RAID 1 on an odd number of disks)
RAID 3: a RAID 5 where a single disk is dedicated to parity storage
RAID 5E, 5EE: specific to LSI
RAID 7: RAID 0 using concatenation mode (with HDDs of different sizes)
RAID 50: striping of several RAID 5 arrays
RAID 60: striping of several RAID 6 arrays
…
Expansion / Migration
Expansion: possibility to increase the size of a RAID array by adding a disk.
e.g. expand a RAID 5 with a new disk.
Migration: possibility to change the RAID level, possibly by adding a disk.
e.g. migrate from RAID 1 to RAID 0 (still 2 disks).
Always back up data as a precaution, although the operation does not affect the data or access to them.
Some controllers can migrate ‘ONLINE’.
RAID Signature – COD (Conf. On Disk)
The RAID configuration is always written on the disks. The signature is around 700 MB in size.
Today, most RAID controllers do not store any configuration themselves, to avoid configuration mismatches.
Some old controllers (LSI SCSI) stored the RAID configuration, so be careful when swapping a RAID card!
1. Plug the controller into the server without any disk connected.
2. Start the server, « Clear configuration », then stop the server.
3. Reconnect the disks and start the server; the controller will load the configuration automatically.
RAID and operating system
RAID Controller    Operating System
N/A                Files
N/A                Partitions
Logical Drive      Physical Drive
Array              N/A
Physical Disks     N/A
Size of Physical Drive under Operating System
RAID 1
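Example: two 300 GB HDDs in RAID 1 provide less than 279 GB under the OS, because drive vendors count 1 GB = 10^9 bytes while the OS counts 1 GB = 1024^3 bytes.
300,000,000,000 bytes / 1024 / 1024 / 1024 ≈ 279.4 GB
279.4 GB – COD (≈ 700 MB ≈ 0.7 GB) ≈ 278.7 GB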
Limitations on disk size (Windows)
A disk initialized as ‘MBR’ type is limited to 2 TB. A disk on which Windows is installed is always of ‘MBR’ type.
The only way to access more than 2 TB is to create a second logical drive (LD) and convert it to ‘GPT’.
Note: ‘GPT’ is available from Windows Server 2003 SP1 onwards.
More information at the following link: http://technet2.microsoft.com/WindowsServer/en/library/4b35160a-4e27-4258-9e8b-e2088f8a757a1033.mspx
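The 2 TB figure follows from the MBR partition table storing sector counts as 32-bit values. The sketch below redoes that arithmetic, assuming the classic 512-byte sector size; the names `mbr_limit_bytes` and `needs_gpt` are chosen for this example only.

```python
SECTOR_SIZE = 512       # bytes per sector (assumption: classic 512-byte sectors)
MAX_SECTORS = 2 ** 32   # MBR partition entries hold 32-bit sector counts


def mbr_limit_bytes(sector_size=SECTOR_SIZE):
    """Largest size addressable through an MBR partition table."""
    return MAX_SECTORS * sector_size


def needs_gpt(logical_drive_bytes, sector_size=SECTOR_SIZE):
    """True if a logical drive of this size exceeds what MBR can address."""
    return logical_drive_bytes > mbr_limit_bytes(sector_size)


if __name__ == "__main__":
    print(mbr_limit_bytes() / 1024 ** 4, "TB (binary)")
    print(needs_gpt(3 * 10 ** 12))
```

A logical drive larger than this limit has to be converted to GPT before the operating system can use all of it.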
Status: Online or Unconfigured
Disk “ONLINE”:
An ONLINE disk is a physical disk used by, or integrated in, an array.
If all disks of an array are ONLINE, the array status is ONLINE or OPTIMAL.
Disk “READY or UNCONFIGURED”:
Physical disk not used by the controller. It can be removed without impact.
Diagram: a controller with an array whose disks are all ONLINE (array status OPTIMAL or ONLINE) and one READY or UNCONFIGURED disk outside the array.
Status: Offline / Dead or Failed
Disk “OFFLINE or FAILED”:
Such a disk is still in the array but inactive.
The array is now DEGRADED or CRITICAL.
The cause can be a minor error or a status manually forced by the administrator.
A REBUILD is required to reintegrate the disk in the array.
Disk “DEAD or FAILED – NOT RESPONDING”:
Same as the OFFLINE status, but caused by a hardware failure: the controller no longer detects the disk.
Diagram: array A0 is DEGRADED or CRITICAL, with one disk OFFLINE or FAILED (or DEAD, not responding) and the other disks ONLINE.
Status: Offline / Dead or Failed
Array “OFFLINE”:
All disks are OFFLINE. All data could be lost, but you can try to force all disks ONLINE to retrieve the original configuration and data.
Array “FAILED”:
Two or more disks are OFFLINE, but not all disks of the array.
All data could be lost, but you can try to force the disks ONLINE to retrieve the original configuration and data.
Diagram: array A0 is OFFLINE or FAILED, with two or more of its disks OFFLINE, FAILED or DEAD (not responding).
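The disk and array statuses described in these slides can be summarised as a small decision function. This is a sketch assuming an array that tolerates a single disk failure (RAID 1 or RAID 5); the function name `array_status` and the choice of one label per state (OPTIMAL rather than ONLINE, DEGRADED rather than CRITICAL) are simplifications for the example.

```python
def array_status(disk_states):
    """Derive the array status from the status of each member disk.

    Assumption: the array tolerates exactly one failed disk (RAID 1 or RAID 5).
    Any state other than "ONLINE" (OFFLINE, FAILED, DEAD) counts as not available.
    """
    if not disk_states:
        raise ValueError("an array needs at least one disk")
    unavailable = sum(1 for state in disk_states if state != "ONLINE")
    if unavailable == 0:
        return "OPTIMAL"
    if unavailable == len(disk_states):
        return "OFFLINE"
    if unavailable == 1:
        return "DEGRADED"
    return "FAILED"


if __name__ == "__main__":
    print(array_status(["ONLINE", "ONLINE", "ONLINE"]))
    print(array_status(["ONLINE", "OFFLINE", "ONLINE"]))
    print(array_status(["DEAD", "OFFLINE", "ONLINE"]))
    print(array_status(["OFFLINE", "OFFLINE", "OFFLINE"]))
```

Only the DEGRADED state can be repaired by a rebuild; FAILED and OFFLINE arrays need disks forced back ONLINE, with no guarantee that the data survive.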
Status: Hot Spare & Rebuild
Disk “Hot Spare”:
A hot spare is a standby disk ready to automatically replace a failing (OFFLINE or DEAD) drive.
This disk is not used until a failure occurs.
After the rebuild, this disk is part of the array.
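Diagram: after a disk FAILURE in the array (disk FAILED, OFFLINE or DEAD), the controller REBUILDs onto the HOT SPARE by duplicating the data segment by segment (segments 1 to 4).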
Initialization
Preparing a physical drive is called formatting.
Preparing a logical drive is called initialization.
Initialization erases all sectors of the logical drive.
Two modes exist:
• Full initialization: all blocks of the logical drive are erased; longer but safer.
• Quick initialization: only the first blocks of the logical drive are erased and the remaining blocks are erased in the background; shorter.
Patrol Read / Media Patrol
To detect bad sectors independently of normal I/O activity, some controllers offer background sector verification when the system is idle.
Depending on the controller, this feature is called “Patrol Read” or “Media Patrol”.
It can run on non-RAID drives (JBOD, spare) as well as on HDDs in a RAID.
If a bad sector is detected, the controller reports the error.
If the HDD is not in a RAID, data recovery cannot be applied.
Consistency Check
• Consistency Check ensures that data are readable and redundant.
• It applies at logical drive level, for RAID levels that offer redundancy.
• It is a preventive maintenance task to be scheduled monthly.
• In case of inconsistency (e.g. a bad sector), the sector is dynamically remapped using the HDD spare sectors and the data are recovered from the rest of the RAID.
Date Time Source Type Category Event Computer Description
10/01/2008 09:46:21 MR_MONITOR Information VD 66 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check started on VD 0.
10/01/2008 09:46:23 MR_MONITOR Warning VD 63 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check found inconsistent parity on VD strip (VD = 0, strip = 0x800).
10/01/2008 09:46:23 MR_MONITOR Warning VD 63 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check found inconsistent parity on VD strip (VD = 0, strip = 0x7b0).
10/01/2008 09:46:24 MR_MONITOR Warning VD 63 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check found inconsistent parity on VD strip (VD = 0, strip = 0xe5b).
10/01/2008 09:46:24 MR_MONITOR Warning VD 63 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check found inconsistent parity on VD strip (VD = 0, strip = 0xe51).
10/01/2008 09:46:24 MR_MONITOR Warning VD 63 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check found inconsistent parity on VD strip (VD = 0, strip = 0xb22).
10/01/2008 09:46:25 MR_MONITOR Warning VD 64 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check inconsistency logging disabled, too many inconsistencies on VD 0.
10/01/2008 09:46:25 MR_MONITOR Warning VD 63 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check found inconsistent parity on VD strip (VD = 0, strip = 0xe65).
10/01/2008 09:46:25 MR_MONITOR Warning VD 63 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check found inconsistent parity on VD strip (VD = 0, strip = 0xe64).
10/01/2008 09:46:25 MR_MONITOR Warning VD 63 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check found inconsistent parity on VD strip (VD = 0, strip = 0xe63).
10/01/2008 09:46:25 MR_MONITOR Warning VD 63 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check found inconsistent parity on VD strip (VD = 0, strip = 0xe62).
10/01/2008 09:46:25 MR_MONITOR Warning VD 63 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check found inconsistent parity on VD strip (VD = 0, strip = 0xe5f).
10/01/2008 09:54:58 MR_MONITOR Information VD 59 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check done with corrections on VD 0, (corrections = 4401).
10/01/2008 10:46:58 MR_MONITOR Information VD 66 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check started on VD 0.
10/01/2008 10:54:54 MR_MONITOR Information VD 58 FRDCBMSQLCLSN02 Controller ID: 0 Consistency Check done on VD 0.
This example comes from a 36 GB RAID 1 made of 2 x 15k RPM SAS disks attached to an LSI 8408E.
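For a mirror such as the RAID 1 of this example, the idea behind a consistency check can be sketched as a strip-by-strip comparison of the two copies. This is only an illustration of the principle, not the algorithm used by the controller firmware; the function name `consistency_check`, the tiny strip size and the sample contents are assumptions.

```python
def consistency_check(mirror_a, mirror_b, strip_size=4):
    """Return the indices of the strips whose two copies differ."""
    if len(mirror_a) != len(mirror_b):
        raise ValueError("both mirror members must have the same size")
    inconsistent = []
    for start in range(0, len(mirror_a), strip_size):
        # A healthy mirror holds identical bytes at the same position on both disks.
        if mirror_a[start:start + strip_size] != mirror_b[start:start + strip_size]:
            inconsistent.append(start // strip_size)
    return inconsistent


if __name__ == "__main__":
    disk_a = b"AAAABBBBCCCC"
    disk_b = b"AAAABxBBCCCC"   # one corrupted byte in the second strip
    print(consistency_check(disk_a, disk_b))
```

Each index returned corresponds to one "inconsistent strip" entry of the kind seen in the log; a real controller would then rewrite the strip from the copy it considers valid.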
Performance
2 families:
1. « Software » = Disk Controller + Software
• The software part is handled by the ROM BIOS and the driver.
• Workload is on the server CPU.
• No memory cache and no BBU.
• Also known as « HostRAID ».
• Integrated on the motherboard, so it is the cheapest solution.
2. « Hardware » = Disk Controller + RAID Engine
• Dedicated controller, no CPU load.
• Memory cache / BBU.
• Provided on a PCI daughter card.
Diagram (software RAID): SouthBridge, disk controller, HDDs.
Diagram (hardware RAID): SouthBridge, PCI, internal PCI bus, RAID engine, disk controller, HDDs.
Cache memory
To optimise physical access to the hard disks, some RAID controllers offer cache memory (optional or on board).
Cache memory is always used for read access.
Cache memory may also be used for write access, in one of two modes:
Write through = write cache disabled
Write back = write cache enabled
Write back is risky because data are not written to disk immediately: if a power failure occurs, data may be lost.
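The difference between the two write modes can be illustrated with a toy model. The class `CachedDisk` and its methods are invented for this sketch and assume a cache with no battery protection; a real controller is far more elaborate.

```python
class CachedDisk:
    """Toy model of a controller cache in front of a disk."""

    def __init__(self, write_back):
        self.write_back = write_back
        self.disk = {}    # blocks safely stored on the disk
        self.cache = {}   # blocks held in volatile controller memory

    def write(self, block, data):
        self.cache[block] = data
        if not self.write_back:
            # Write through: the block reaches the disk before the write completes.
            self.disk[block] = data

    def flush(self):
        """Write back: cached blocks are committed to disk later."""
        self.disk.update(self.cache)

    def power_failure(self):
        """Assumption: no battery, so the cache content is lost."""
        self.cache.clear()

    def read(self, block):
        return self.cache.get(block, self.disk.get(block))


if __name__ == "__main__":
    for mode in (False, True):
        ctrl = CachedDisk(write_back=mode)
        ctrl.write(0, b"payload")
        ctrl.power_failure()
        print("write back" if mode else "write through", ctrl.read(0))
```

In write-through mode the block survives the power failure; in write-back mode it is gone unless `flush()` ran first, which is the risk described above.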
BBU
The Battery Backup Unit (BBU) option adds fault tolerance against power failure. The BBU powers the cache memory until electricity comes back.
When a BBU is present, the write cache can be set to Write Back.
During maintenance operations, make sure to unplug the battery before removing the memory.
Ex1: LSI SecuRAID321
Ex2: Promise FastTrak S150 SX4 PCI
Performance vs. RAID Levels and technology
Throughput in MB/sec:

          Reads           Writes
          SCSI    SAS     SCSI    SAS
RAID 0    503     838     417     645
RAID 5    494     840     228     304
Questions?