Installing the command line tool
The command line tool used to manage RAID devices should be located at /opt/MegaRAID/MegaCli/MegaCli64. If it is not, you can download the RPM from IBM. It is an architecture-independent package; I have personally used it successfully on Red Hat and openSUSE. The archive used for this documentation is ibm_utl_sraidmr_megacli-8.04.08_linux_32-64.zip. Unzip it, then run rpm -Uvh on both packages (one for the library and one for the actual command line tool).
Getting the big picture
To start with, the following command queries all adapters and returns information about the virtual drives defined, their status and all the physical drives they are made of. The command outputs a lot of information, so I usually grep for a few keywords to shorten it. Here is the command and an extract of the output:

    ./MegaCli64 -LDPDInfo -aALL | egrep "Adapter|Virtual Disk|Name|RAID|State|^Number|^Span|PD:|^Device|Firmware|^$"

    Adapter #0
    Number of Virtual Disks: 2
    Virtual Disk: 0 (target id: 0)
    Name:
    RAID Level: Primary-5, Secondary-0, RAID Level Qualifier-3
    State: Optimal
    Number Of Drives:9
    Span Depth:1
    Number of Spans: 1
    Span: 0 - Number of PDs: 9
    PD: 0 Information
    Device Id: 15
    Firmware state: Online
    PD: 1 Information
    Device Id: 16
    Firmware state: Online
    PD: 2 Information
    Device Id: 17
    Firmware state: Online
    ...
    PD: 8 Information
    Device Id: 23
    Firmware state: Online
    Virtual Disk: 1 (target id: 1)
    Name:data2
    RAID Level: Primary-5, Secondary-0, RAID Level Qualifier-3
    State: Degraded
    Number Of Drives:24
    Span Depth:1
    Number of Spans: 1
    Span: 0 - Number of PDs: 24
    PD: 0 Information
    Device Id: 42
    Firmware state: Online
    PD: 1 Information
    Device Id: 43
    Firmware state: Online
    ...
    PD: 22 Information
    Device Id: 64
    Firmware state: Online
    PD: 23 Information
    Device Id: 65
    Firmware state: Online
    Adapter #1
    Number of Virtual Disks: 1
    Virtual Disk: 0 (target id: 0)
    Name:
    RAID Level: Primary-5, Secondary-0, RAID Level Qualifier-3
    State: Optimal
    Number Of Drives:11
    Span Depth:1
    Number of Spans: 1
    Span: 0 - Number of PDs: 11
    PD: 0 Information
    Device Id: 8
    Firmware state: Online
    PD: 1 Information
    Device Id: 9
    Firmware state: Online
    ...
    PD: 10 Information
    Device Id: 18
    Firmware state: Online
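When only the health of each virtual disk matters, the filtered output can be condensed further. The following is a minimal sketch, assuming the output format shown in the extract above; vd_summary is a hypothetical helper name and the sample text is a shortened copy of that extract, so the script runs without a controller.

```shell
#!/bin/sh
# vd_summary (hypothetical helper) reads "-LDPDInfo -aALL" output on stdin
# and prints one line per virtual disk: adapter, disk number and state.
vd_summary() {
    awk '
        $1 == "Adapter" { adapter = $2; sub("^#", "", adapter) }
        /^Virtual (Disk|Drive):/ { vd = $3 }
        /^State *:/ {
            state = $0
            sub(/^State *: */, "", state)
            printf "adapter=%s vd=%s state=%s\n", adapter, vd, state
        }
    '
}

# Shortened copy of the extract above, used here instead of a live controller.
sample='Adapter #0
Virtual Disk: 0 (target id: 0)
State: Optimal
Firmware state: Online
Virtual Disk: 1 (target id: 1)
State: Degraded
Adapter #1
Virtual Disk: 0 (target id: 0)
State: Optimal'

printf '%s\n' "$sample" | vd_summary
```

On a real machine the function would be fed from the command itself, for example by piping ./MegaCli64 -LDPDInfo -aALL into vd_summary.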
Finding the drive to replace
It's a good idea to start by gathering information about the adapter:

    /opt/MegaRAID/MegaCli> sudo ./MegaCli64 -AdpAllInfo -aALL
    Adapter #0
    ==============================================================================
    Versions
    ================
    Product Name : PERC H700 Integrated
    Serial No : 18P02M3
    FW Package Build: 12.10.2-0004
    ...
    Device Present
    ================
    Virtual Drives : 1
    Degraded : 0
    Offline : 0
    Physical Devices : 5
    Disks : 4
    Critical Disks : 0
    Failed Disks : 0
    ...
So we now know that this machine has a single adapter, adapter 0, and in the following commands we will specify -a0 to address it. Next we get the enclosure information:

    /opt/MegaRAID/MegaCli> sudo ./MegaCli64 -EncInfo -a0
    Number of enclosures on adapter 0 -- 1
    Enclosure 0:
    Device ID : 32
    Number of Slots : 6
    Number of Power Supplies : 0
    Number of Fans : 0
    Number of Temperature Sensors : 0
    Number of Alarms : 0
    Number of SIM Modules : 0
    Number of Physical Drives : 4
    Status : Normal
    Position : 0
    Connector Name : Unavailable
    Enclosure type : SES
    FRU Part Number : N/A
    Enclosure Serial Number : N/A
    ESM Serial Number : N/A
    Enclosure Zoning Mode : N/A
    Partner Device Id : 65535
    Inquiry data :
    Vendor Identification : DP
    Product Identification : BACKPLANE
    Product Revision Level : 1.07
    Vendor Specific : 18NJ5VP
    Exit Code: 0x00

So we have one adapter, a0, and one enclosure with an ID of 32. We now query the logical drive information:

    /opt/MegaRAID/MegaCli> sudo ./MegaCli64 -LDInfo -LALL -a0
    Adapter 0 -- Virtual Drive Information:
    Virtual Drive: 0 (Target Id: 0)
    Name :server
    RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
    Size : 557.75 GB
    Mirror Data : 557.75 GB
    State : Degraded
    Strip Size : 64 KB
    Number Of Drives per span:2
    Span Depth : 2
    Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
    Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
    Default Access Policy: Read/Write
    Current Access Policy: Read/Write
    Disk Cache Policy : Disk's Default
    Encryption Type : None
    Bad Blocks Exist: No
    Is VD Cached: Yes
    Cache Cade Type : Read Only

This shows the RAID level in use (Primary-1 with two drives per span and a span depth of 2, that is RAID 10, a stripe across mirrored pairs) and the status of this RAID device: Degraded.
Now we look for the defective drive with the following command. The field to watch is Firmware state: Failed.

    /opt/MegaRAID/MegaCli> sudo ./MegaCli64 -PDList -a0
    Adapter #0
    ...
    Enclosure Device ID: 32
    Slot Number: 3
    ...
    Firmware state: Failed
    ...

The output has been truncated to show only the relevant information. In our case, it is the drive in slot number 3 of enclosure ID 32 that needs to be replaced.
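Scanning a long -PDList output by eye is error-prone, so the lookup can be scripted. This is a minimal sketch, assuming the three field names shown in the truncated output above; failed_drives is a hypothetical helper name, and the slot 2 entry in the sample is made up purely to show a healthy drive being skipped.

```shell
#!/bin/sh
# failed_drives (hypothetical helper) reads "-PDList" output on stdin and
# prints "enclosure:slot" for each drive whose firmware state starts with Failed.
failed_drives() {
    awk -F': *' '
        /^Enclosure Device ID/ { encl = $2 }
        /^Slot Number/         { slot = $2 }
        /^Firmware state/      { if ($2 ~ /^Failed/) print encl ":" slot }
    '
}

# Illustrative input: the slot 2 drive is invented, slot 3 matches the article.
sample='Enclosure Device ID: 32
Slot Number: 2
Firmware state: Online
Enclosure Device ID: 32
Slot Number: 3
Firmware state: Failed'

printf '%s\n' "$sample" | failed_drives
```

The enclosure:slot pair it prints is the value that goes between the brackets of the -PhysDrv argument in the commands that follow.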
Replacing the drive
We prepare the drive for replacement:

    /opt/MegaRAID/MegaCli> sudo ./MegaCli64 -PDOffline -PhysDrv\[32:3\] -a0
    Adapter: 0: EnclId-32 SlotId-3 state changed to OffLine.
    Exit Code: 0x00
    /opt/MegaRAID/MegaCli> sudo ./MegaCli64 -PDMarkMissing -PhysDrv\[32:3\] -a0
    EnclId-32 SlotId-3 is marked Missing.
    Exit Code: 0x00
    /opt/MegaRAID/MegaCli> sudo ./MegaCli64 -PDPrpRmv -PhysDrv\[32:3\] -a0
    Prepare for removal Success
    Exit Code: 0x00

Now it's time for the physical replacement.
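Because the three preparation commands take a drive offline, it can help to print them for review before running anything. The wrapper below is only a sketch: prepare_removal, MEGACLI and DRY_RUN are hypothetical names introduced here, not MegaCli options, and the default is to print the commands rather than execute them.

```shell
#!/bin/sh
# MEGACLI and DRY_RUN are assumed variable names for this sketch only.
MEGACLI=${MEGACLI:-/opt/MegaRAID/MegaCli/MegaCli64}
DRY_RUN=${DRY_RUN:-1}   # 1 = only print the commands, anything else = run them

# prepare_removal ENCLOSURE SLOT ADAPTER
# Issues the same three actions as above, in the same order, and stops at
# the first one that fails.
prepare_removal() {
    encl=$1 slot=$2 adapter=$3
    for action in -PDOffline -PDMarkMissing -PDPrpRmv; do
        if [ "$DRY_RUN" = 1 ]; then
            echo "$MEGACLI $action -PhysDrv[$encl:$slot] -a$adapter"
        else
            "$MEGACLI" "$action" "-PhysDrv[$encl:$slot]" "-a$adapter" || return 1
        fi
    done
}

# Dry run for the drive identified earlier: enclosure 32, slot 3, adapter 0.
prepare_removal 32 3 0
```

Quoting the -PhysDrv argument has the same effect as the backslash-escaped brackets used on the interactive command line: the shell passes the brackets through literally.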
If everything went smoothly, you should then see the array being rebuilt:

    /opt/MegaRAID/MegaCli> sudo ./MegaCli64 -PDInfo -PhysDrv\[32:3\] -a0
    Enclosure Device ID: 32
    Slot Number: 3
    Drive's postion: DiskGroup: 0, Span: 1, Arm: 1
    Enclosure position: N/A
    Device Id: 3
    ...
    Firmware state: Rebuild
    ...

We can query the controller to see the actual rebuild progress:

    /opt/MegaRAID/MegaCli> sudo ./MegaCli64 -PDRbld -ShowProg -PhysDrv\[32:3\] -a0
    Rebuild Progress on Device at Enclosure 32, Slot 3 Completed 7% in 3 Minutes.
    Exit Code: 0x00
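To follow the rebuild from a script, the percentage can be pulled out of that progress line. This is a minimal sketch, assuming the "Completed N%" wording shown above; rebuild_percent is a hypothetical helper name, and the polling loop is left commented out because it needs a real controller.

```shell
#!/bin/sh
# rebuild_percent (hypothetical helper) reads "-PDRbld -ShowProg" output on
# stdin and prints the completed percentage; it prints nothing when the
# output contains no "Completed N%" text.
rebuild_percent() {
    sed -n 's/.*Completed \([0-9][0-9]*\)%.*/\1/p'
}

# Progress line copied from the output above.
echo 'Rebuild Progress on Device at Enclosure 32, Slot 3 Completed 7% in 3 Minutes.' | rebuild_percent

# Possible polling loop (commented out, it requires the controller):
# while pct=$(sudo ./MegaCli64 -PDRbld -ShowProg -PhysDrv\[32:3\] -a0 | rebuild_percent)
#       [ -n "$pct" ]; do
#     echo "rebuild at ${pct}%"
#     sleep 60
# done
```

The loop would end as soon as the tool stops reporting a percentage, so the array state should still be checked afterwards.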
Checking
Eventually, you should see something like this:

    /opt/MegaRAID/MegaCli> sudo ./MegaCli64 -PDRbld -ShowProg -PhysDrv\[32:3\] -a0
    Device(Encl-32 Slot-3) is not in rebuild process
    Exit Code: 0x00
    /opt/MegaRAID/MegaCli> sudo ./MegaCli64 -LDInfo -LALL -a0
    Adapter 0 -- Virtual Drive Information:
    Virtual Drive: 0 (Target Id: 0)
    Name :server
    RAID Level : Primary-1, Secondary-0, RAID Level Qualifier-0
    Size : 557.75 GB
    Mirror Data : 557.75 GB
    State : Optimal
    Strip Size : 64 KB
    Number Of Drives per span:2
    Span Depth : 2
    Default Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
    Current Cache Policy: WriteBack, ReadAdaptive, Direct, No Write Cache if Bad BBU
    Default Access Policy: Read/Write
    Current Access Policy: Read/Write
    Disk Cache Policy : Disk's Default
    Encryption Type : None
    Bad Blocks Exist: No
    Is VD Cached: Yes
    Cache Cade Type : Read Only
    Exit Code: 0x00
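The same final check can be turned into an exit status, which is convenient for a cron job or a monitoring probe. The following is a sketch under the assumption that every virtual drive reports a "State : ..." line as in the output above; all_optimal is a hypothetical helper name.

```shell
#!/bin/sh
# all_optimal (hypothetical helper) reads "-LDInfo -LALL" output on stdin.
# It succeeds only if at least one State line is present and every State
# line reads Optimal.
all_optimal() {
    awk '
        /^State *:/ {
            seen = 1
            value = $0
            sub(/^State *: */, "", value)
            sub(/ *$/, "", value)
            if (value != "Optimal") bad = 1
        }
        END { if (seen && !bad) exit 0; exit 1 }
    '
}

# State line copied from the output above.
if printf 'State : Optimal\n' | all_optimal; then
    echo "array healthy"
else
    echo "array needs attention"
fi
```

Treating missing State lines as a failure is a deliberate choice in this sketch: an empty or unexpected output from the tool should not be reported as a healthy array.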