Troubleshooting
This section contains troubleshooting procedures for common situations as well as for specific problems.
Common Troubleshooting Procedures
This section describes commands and procedures that can be used in troubleshooting.
Cables Attached Correctly
Verify that the power-supply cord and adapter cables are attached correctly. If the system is having trouble with read and write operations to a particular virtual disk or non-RAID physical disk (if the system hangs, for example), then make sure that the cables attached to the corresponding enclosure or backplane are secure. If the connection is secure but the problem persists, you may need to replace a cable. See also "Isolate Hardware Problems."
On SAS controllers, you should verify that the cable configuration is valid. Refer to the SAS hardware documentation for valid cable configurations. If the cable configuration is invalid, you may receive alerts "2182" or "2356."
System Requirements
Make sure that the system meets all system requirements. In particular, verify that the correct levels of firmware and drivers are installed on the system. For more information on drivers and firmware, see "Drivers and Firmware."
Drivers and Firmware
Storage Management is tested with the supported controller firmware and drivers. In order to function properly, the controller must have the minimum required version of the firmware and drivers installed. The most current versions can be obtained from the Dell™ Support website at support.dell.com.
NOTE: You can verify which firmware and drivers are installed by selecting the Storage object in the tree view and clicking the Information/Configuration tab. You can also check the Alert Log for alerts relating to unsupported firmware and driver versions.
It is also recommended that you periodically obtain and apply the latest Dell PowerEdge™ Server System BIOS to benefit from the most recent improvements. See the Dell PowerEdge system documentation for more information.
Isolate Hardware Problems
If you receive a “timeout” alert related to a hardware device or if you otherwise suspect that a device attached to the system is experiencing a failure, then do the following to confirm the problem:
• Verify that the cables are correctly attached.
• If the cables are correctly attached and you are still experiencing the problem, then disconnect the device cables and reboot the system. If the system reboots successfully, then one of the devices may be defective. Refer to the hardware device documentation for more information.
Rescan to Update Information on SCSI Controllers
On SCSI controllers, use the Rescan controller task to update information for the controller and attached devices. This operation may take a few minutes if there are a number of devices attached to the controller.
If the Rescan does not properly update the disk information, you may need to reboot your system.
Replacing a Failed Disk
You may need to replace a failed disk in the following situations:
Replacing a Failed Disk that is Part of a Redundant Virtual Disk
If the failed disk is part of a redundant virtual disk, then the disk failure should not result in data loss. You should replace the failed disk immediately, however, as additional disk failures can cause data loss.
If the redundant virtual disk has a hot spare assigned to it, then the data from the failed disk is rebuilt onto the hot spare. After the rebuild, the former hot spare functions as a regular physical disk and the virtual disk is left without a hot spare. In this case, you should replace the failed disk and make the replacement disk a hot spare.
NOTE: If the redundant virtual disk does not have a hot spare assigned to it, then replace the failed disk using the procedure described in "Replacing a Physical Disk Receiving SMART Alerts."
Replacing the Disk:
1. Remove the failed disk.
2. Insert a new disk. Make sure that the new disk is the same size as or larger than the disk you are replacing. (On some controllers, you may not be able to use the additional disk space if you insert a larger disk. See "Virtual Disk Considerations for PERC 3/SC, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, 4e/Si, 4e/Di, CERC ATA100/4ch, PERC 5/E, PERC 5/i, PERC 6/E, and PERC 6/I Controllers" for more information.)
A rebuild is automatically initiated because the virtual disk is redundant.
Assigning a Hot Spare:
If a hot spare was already assigned to the virtual disk, then data from the failed disk may already be rebuilt onto the hot spare. In this case, you need to assign a new hot spare. See "Assign and Unassign Dedicated Hot Spare" and "Assign and Unassign Global Hot Spare" for more information.
Replacing a Failed Physical Disk that is Part of a Nonredundant Virtual Disk
If the failed physical disk is part of a nonredundant virtual disk (such as RAID 0), then the failure of a single physical disk causes the entire virtual disk to fail. To proceed, determine when your last backup was made and whether any new data has been written to the virtual disk since that time.
If you have backed up recently and there is no new data on the disks that would be missed, you can restore from backup.
NOTE: If the failed disk is attached to a PERC 3/SC, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, or CERC ATA100/4ch controller, you can attempt to recover data from the disk by using the procedure described in "Using the Physical Disk Online Command on Select Controllers" before continuing with the following procedure.
Do the following:
1. Delete the virtual disk that is currently in a failed state.
2. Remove the failed physical disk.
3. Insert a new physical disk.
4. Create a new virtual disk.
5. Restore from backup.
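The two replacement procedures described so far can be summarized as a small decision sketch. The following Python function is illustrative only and is not part of Storage Management; the function name, parameters, and step wording are hypothetical and simply restate the steps from this section.

```python
from __future__ import annotations


def replacement_steps(redundant: bool, hot_spare_assigned: bool = False) -> list[str]:
    """Return the ordered steps from this section for replacing a failed disk.

    Hypothetical helper: it only restates the documented procedures and does
    not communicate with any controller.
    """
    if redundant:
        steps = [
            "Remove the failed disk.",
            "Insert a new disk of the same size or larger.",
        ]
        if hot_spare_assigned:
            # The data may already have been rebuilt onto the former hot spare,
            # so the virtual disk needs a new hot spare.
            steps.append("Assign the replacement disk as a new hot spare.")
        else:
            # A redundant virtual disk starts the rebuild automatically.
            steps.append("Wait for the automatic rebuild to complete.")
        return steps

    # Nonredundant (for example, RAID 0): the whole virtual disk has failed.
    return [
        "Delete the failed virtual disk.",
        "Remove the failed physical disk.",
        "Insert a new physical disk.",
        "Create a new virtual disk.",
        "Restore from backup.",
    ]
```

For example, `replacement_steps(redundant=False)` returns the five-step procedure for a nonredundant virtual disk, ending with the restore from backup.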
Using the Physical Disk Online Command on Select Controllers
Does my controller support this feature? See "Supported Features."
If you do not have a suitable backup available, and if the failed disk is part of a virtual disk on a controller that supports the Online physical disk task, then you can attempt to retrieve data by selecting Online from the failed disk’s drop-down task menu.
The Online command attempts to force the failed disk back into an Online state. If you are able to force the disk into an Online state, you may be able to recover individual files. How much data you can recover depends on the extent of disk damage. File recovery is only possible if a limited portion of the disk is damaged.
There is no guarantee you will be able to recover any data using this method. A forced Online does not fix a failed disk. You should not attempt to write new data to the virtual disk.
After retrieving any viable data from the disk, replace the failed disk as described previously in "Replacing a Failed Disk that is Part of a Redundant Virtual Disk" or "Replacing a Failed Physical Disk that is Part of a Nonredundant Virtual Disk."
Replacing a Failed Physical Disk in a RAID 1 on a CERC SATA1.5/2s
On a CERC SATA1.5/2s controller, a rebuild may not start automatically when you replace a failed physical disk that is part of a RAID 1 virtual disk. In this circumstance, use the following procedure to replace the failed physical disk and rebuild the redundant data.
1. Shut down the system.
2. Disconnect the SATA cable on the failed physical disk in the RAID 1 virtual disk.
3. Replace the failed physical disk with a formatted physical disk. You can format the physical disk using the Disk Utilities in the controller BIOS. (You may not need to format the entire physical disk. Formatting 1% of the disk may be sufficient.)
4. Reboot the system. When rebooted, the RAID 1 virtual disk should display a Failed Redundancy state.
5. Expand the controller object in the tree view and select the Physical Disks object.
6. Execute the Rebuild task for the physical disk you added.
Recovering from Removing the Wrong Physical Disk
If the physical disk that you mistakenly removed is part of a redundant virtual disk that also has a hot spare, then the virtual disk rebuilds automatically either immediately or when a write request is made. After the rebuild has completed, the virtual disk will no longer have a hot spare since data has been rebuilt onto the disk previously assigned as a hot spare. In this case, you should assign a new hot spare.
If the physical disk that you removed is part of a redundant virtual disk that does not have a hot spare, then replace the physical disk and do a rebuild.
See the following sections for information on rebuilding physical disks and assigning hot spares:
• "Understanding Hot Spares" for RAID controllers
• "Rebuild" for PERC 3/SC, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, CERC ATA100/4ch, PERC 5/E and PERC 5/i controllers
You can avoid removing the wrong physical disk by blinking the LED display on the physical disk that you intend to remove. See "Blink and Unblink (Physical Disk)" for information on blinking the LED display.
Resolving Microsoft® Windows® Upgrade Problems
If you upgrade the Microsoft Windows operating system on a server, you may find that Storage Management no longer functions after the upgrade. The installation process installs files and makes registry entries on the server that are specific to the operating system. For this reason, changing the operating system can disable Storage Management.
To avoid this problem, you should uninstall Storage Management before upgrading. If you have already upgraded without uninstalling Storage Management, however, you should uninstall Storage Management after the upgrade.
After you have uninstalled Storage Management and completed the upgrade, reinstall Storage Management using the Storage Management install media. You can download Storage Management from the Dell Support website support.dell.com.
Virtual Disk Troubleshooting
The following sections describe troubleshooting procedures for virtual disks.
A Rebuild Does Not Work
A rebuild will not work in the following situations:
• The virtual disk is nonredundant. For example, a RAID 0 virtual disk cannot be rebuilt because RAID 0 does not provide data redundancy.
• There is no hot spare assigned to the virtual disk. As long as the virtual disk is redundant, you can do the following to rebuild it:
    • Pull out the failed physical disk and replace it. A rebuild will automatically start on the new disk.
    • Assign a hot spare to the virtual disk and then perform a rebuild.
• You are attempting to rebuild onto a hot spare that is too small. Different controllers have different size requirements for hot spares. See "Considerations for Hot Spares on PERC 3/SC, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, 4e/Si, 4e/Di, CERC ATA100/4ch, PERC 5/E, PERC 5/i, PERC 6/E, PERC 6/I, and CERC 6/I Controllers" and "Considerations for Hot Spares on PERC 3/Si, 3/Di, and CERC SATA1.5/6ch Controllers" for more information on disk size requirements.
• The hot spare has been unassigned from the virtual disk. This could happen on some controllers if the hot spare was assigned to more than one virtual disk and has already been used to rebuild a failed physical disk for another virtual disk. See "Considerations for Hot Spares on PERC 3/Si, 3/Di, and CERC SATA1.5/6ch Controllers" for a description of this situation.
• On SCSI controllers, both redundant and nonredundant virtual disks reside on the same set of physical disks. On the PERC 3/SC, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, and CERC ATA100/4ch controllers, a rebuild is not performed for a physical disk that is used by both redundant and nonredundant virtual disks. In order to rebuild the redundant virtual disk, you need to delete the nonredundant virtual disk. Before deleting this disk, however, you can attempt to recover data from the failed physical disk by forcing it back online. See "Using the Physical Disk Online Command on Select Controllers" for more information.
• A physical disk has been removed, and the system has not yet attempted to write data to the removed disk. In this case, the system will not recognize the removal of a physical disk until it attempts a write operation to the disk. If the physical disk is part of a redundant virtual disk, then the system will rebuild the disk after attempting a write operation. This situation applies to PERC 3/SC, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, and CERC ATA100/4ch controllers.
• The virtual disk includes failed or corrupt physical disks. This situation may generate alert "2083." See alert "2083" for more information.
• The rebuild rate setting is too low. If the rebuild rate setting is quite low and the system is processing a number of operations, then the rebuild may take an unusual amount of time to complete. See "Set Rebuild Rate" for more information.
• The rebuild was cancelled. Another user can cancel a rebuild that you have initiated.
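Several of the situations above can be checked in a fixed order. The following Python sketch is a hypothetical checklist, not a Storage Management interface; the `RebuildContext` fields and the `rebuild_blockers` function are assumed names, and the sketch covers only a subset of the situations listed above.

```python
from dataclasses import dataclass


@dataclass
class RebuildContext:
    """Hypothetical snapshot of some of the conditions listed above."""
    redundant: bool = True
    hot_spare_assigned: bool = True
    hot_spare_large_enough: bool = True
    # Relevant on the controllers named above that do not rebuild a physical
    # disk shared by redundant and nonredundant virtual disks.
    shares_disks_with_nonredundant: bool = False
    cancelled: bool = False


def rebuild_blockers(ctx: RebuildContext) -> list:
    """Return the reasons from the list above that apply to ctx."""
    if not ctx.redundant:
        # A nonredundant virtual disk (for example, RAID 0) cannot be rebuilt,
        # so the remaining checks are irrelevant.
        return ["virtual disk is nonredundant"]
    reasons = []
    if not ctx.hot_spare_assigned:
        reasons.append("no hot spare assigned")
    elif not ctx.hot_spare_large_enough:
        reasons.append("hot spare is too small")
    if ctx.shares_disks_with_nonredundant:
        reasons.append("physical disk is shared with a nonredundant virtual disk")
    if ctx.cancelled:
        reasons.append("rebuild was cancelled")
    return reasons
```

An empty result means that none of the modeled situations applies; the remaining situations, such as a low rebuild rate or alert "2083", still need to be checked manually.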
A Rebuild Completes with Errors
This section applies to PERC 3/SC, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, 4e/Si, 4e/Di, and CERC ATA100/4ch controllers.
In some situations, a rebuild may complete successfully while also reporting errors. This may occur when a portion of the disk containing redundant (parity) information is damaged. The rebuild process can restore data from the healthy portions of the disk but not from the damaged portion.
When a rebuild is able to restore all data except data from damaged portions of the disk, it will indicate successful completion while also generating alert "2163." The rebuild may also report sense key errors. In this situation, take the following actions to restore the maximum data possible:
1. Back up the degraded virtual disk onto a fresh (unused) tape.
2. Perform a "Check Consistency" on the virtual disk that you have backed up onto tape.
3. Restore the virtual disk from the tape onto healthy physical disks.
Cannot Create a Virtual Disk
You might be attempting a RAID configuration that is not supported by the controller. Check the following:
• How many virtual disks already exist on the controller? Each controller supports a maximum number of virtual disks. See "Maximum Number of Virtual Disks per Controller" for more information.
• Is there adequate available space on the disk? The physical disks that you have selected for creating the virtual disk must have an adequate amount of free space available.
• The controller may be performing other tasks, such as rebuilding a physical disk, that must run to completion before the controller can create the new virtual disk.
A Virtual Disk of Minimum Size is Not Visible to Windows Disk Management
If you create a virtual disk using the minimum allowable size in Storage Management, the virtual disk may not be visible to Windows Disk Management even after initialization. This occurs because Windows Disk Management is only able to recognize extremely small virtual disks if they are dynamic. It is generally advisable to create virtual disks of larger size when using Storage Management.
Virtual Disk Errors on Linux
On some versions of the Linux operating system, the virtual disk size is limited to 1TB. If you create a virtual disk that exceeds the 1TB limitation, your system may experience the following behavior:
• I/O errors to the virtual disk or logical drive
• Inaccessible virtual disk or logical drive
• Virtual disk or logical drive size is smaller than expected
If you have created a virtual disk that exceeds the 1TB limitation, you should do the following:
1. Back up your data.
2. Delete the virtual disk.
3. Create one or more virtual disks that are smaller than 1TB.
4. Restore your data from backup.
Whether or not your Linux operating system limits virtual disk size to 1TB depends on the version of the operating system and any updates or modifications that you have implemented. See your operating system documentation for more information.
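When you plan the replacement virtual disks, it can help to work out in advance how many are needed and how large each one can be. The following Python sketch is one possible way to do this arithmetic. It assumes that 1TB means 2^40 bytes, which this section does not specify, and the function name `plan_virtual_disks` is hypothetical.

```python
from __future__ import annotations

TB = 2**40  # Assumption: 1TB is treated as 2**40 bytes.


def plan_virtual_disks(total_bytes: int, limit_bytes: int = TB) -> list[int]:
    """Split total_bytes into the fewest sizes that are each below limit_bytes."""
    if total_bytes <= 0:
        raise ValueError("total_bytes must be positive")
    if limit_bytes <= 1:
        raise ValueError("limit_bytes must be greater than 1")
    max_size = limit_bytes - 1                 # each disk must stay below the limit
    count = -(-total_bytes // max_size)        # ceiling division
    base, extra = divmod(total_bytes, count)   # spread the capacity evenly
    return [base + 1] * extra + [base] * (count - extra)
```

For example, `plan_virtual_disks(3 * TB)` returns four sizes, because three virtual disks of exactly 1TB would not be smaller than the limit. Actual usable sizes also depend on the controller and RAID level, which this sketch does not model.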
Problems Associated With Using the Same Physical Disks for Both Redundant and Nonredundant Virtual Disks
When creating virtual disks, you should avoid using the same physical disks for both redundant and nonredundant virtual disks. This recommendation applies to all controllers. Using the same physical disks for both redundant and nonredundant virtual disks can result in unexpected behavior including data loss.
NOTE: SAS controllers do not allow you to create redundant and nonredundant virtual disks on the same set of physical disks.
Considerations for PERC 3/Si, 3/Di, CERC SATA1.5/6ch, and CERC SATA1.5/2s Controllers When Physical Disks are Shared by Redundant and Nonredundant Virtual Disks
This section describes behavior that may occur on the PERC 3/Si, 3/Di, CERC SATA1.5/6ch, and CERC SATA1.5/2s controllers if you use the same physical disks for both redundant and nonredundant virtual disks. In this type of configuration, the failure or removal of a physical disk can cause the following behavior:
• The nonredundant virtual disk displays a Failed state.
Resolution: This behavior is expected because the virtual disk is nonredundant. In this case, the failure or removal of a single physical disk causes the entire virtual disk to fail with no possibility of recovering the data unless a backup is available.
• The redundant virtual disks display a Degraded state.
Resolution: This behavior is also expected. Data can be recovered if a hot spare is available to rebuild the failed or removed disk.
• Various disks display an Offline state. The Offline state may apply to all physical disks used by the redundant and nonredundant virtual disks.
Resolution: Perform a "Rescan Controller." When the rescan is complete, select each physical disk that is Offline and perform a "Remove Dead Segments" task. You must remove the dead segments before the physical disk can be brought back online. The dead segments are caused by the failure or removal of the shared physical disk.
NOTE: It is recommended that you avoid using the same physical disks for both redundant and nonredundant virtual disks.
Specific Problem Situations and Solutions
This section describes additional problem situations and their solutions.
Physical Disk is Offline or Displays an Error Status
A physical disk may display an error status if it has been damaged, taken offline, or was a member of a virtual disk that has been deleted or initialized. The following actions may resolve the error condition:
• If a user has taken the disk offline, then return the disk to Online status by executing the Online disk task.
• Rescan the controller. This action updates the status of storage objects attached to the controller. If the error status was caused by deleting or initializing a virtual disk, rescanning the controller should resolve this problem.
• Investigate whether there are any cable, enclosure, or controller problems preventing the disk from communicating with the controller. If you find a problem and resolve it, you may need to rescan the controller to return the disk to Online or Ready status. If the disk does not return to Online or Ready status, reboot the system.
• If the disk is damaged, replace it. See "Replacing a Failed Disk" for more information.
A Disk is Marked as Failed When Rebuilding in a Cluster Configuration
If a system in a cluster attempts to rebuild a failed disk but the rebuild fails, another system takes over the rebuild. In this situation, you may notice that the rebuilt disk continues to be marked as failed on both systems even after the second system has completed the rebuild successfully. To resolve this problem, perform a rescan on both systems after the rebuild completes successfully.
A Disk on a PERC 4/Di Controller Does not Return Online after a Prepare to Remove
When you run the Prepare to Remove command on a physical disk attached to a PERC 4/Di controller, you may find that the disk is not displayed in the Storage Management tree view even after a rescan or a reboot.
In this case, do the following to redisplay the disk in the Storage Management tree view:
1. Manually remove and then replace the physical disk.
2. Rescan the controller or reboot the system.
Receive a “Bad Block” Alert with “Replacement,” “Sense,” or “Medium” Error
The following alerts or events are generated when a portion of a physical disk is damaged:
• "2146"
• "2147"
• "2148"
• "2149"
• "2150"
This damage is discovered when the controller performs an operation that requires scanning the disk. Examples of operations that may result in these alerts are as follows:
• Consistency check
• Rebuild
• Virtual disk format
• I/O
If you receive alerts 2146 through 2150 during a rebuild or while the virtual disk is in a degraded state, then data cannot be recovered from the damaged disk without restoring from backup. If you receive alerts 2146 through 2150 under other circumstances, then data recovery may be possible. The following sections describe each of these situations.
Alerts 2146 through 2150 Received during a Rebuild or while a Virtual Disk is Degraded
Do the following if you receive alerts 2146 through 2150 during a rebuild or while the virtual disk is in a degraded state:
1. Replace the damaged physical disk.
2. Create a new virtual disk and allow the virtual disk to completely resynchronize. While the resynchronization is in progress, the status of the virtual disk will be Resynching.
3. Restore data to the virtual disk from backup.
Alerts 2146 through 2150 Received while Performing I/O, Consistency Check, Format, or Other Operation
If you receive alerts 2146 through 2150 while performing an operation other than a rebuild, you should replace the damaged disk immediately to avoid data loss.
Do the following:
1. Back up the degraded virtual disk to a fresh (unused) tape.
2. Replace the damaged disk.
3. Do a rebuild.
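The choice between the two procedures above depends only on when the alert was received. The following Python sketch illustrates that choice; the function name and its boolean parameter are hypothetical, and the returned steps simply restate the procedures from this section.

```python
from __future__ import annotations

# Alert numbers named in this section.
BAD_BLOCK_ALERTS = range(2146, 2151)


def bad_block_procedure(alert_id: int, rebuilding_or_degraded: bool) -> list[str]:
    """Return the documented steps for a bad block alert (2146 through 2150)."""
    if alert_id not in BAD_BLOCK_ALERTS:
        raise ValueError(f"alert {alert_id} is not a bad block alert")
    if rebuilding_or_degraded:
        # Data cannot be recovered from the damaged disk without a backup.
        return [
            "Replace the damaged physical disk.",
            "Create a new virtual disk and let it resynchronize completely.",
            "Restore data to the virtual disk from backup.",
        ]
    # Other operations (I/O, consistency check, format): recovery may be possible.
    return [
        "Back up the degraded virtual disk to a fresh (unused) tape.",
        "Replace the damaged disk.",
        "Do a rebuild.",
    ]
```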
Read and Write Operations Experience Problems
If the system is hanging, timing out, or experiencing other problems with read and write operations, then there may be a problem with the controller cables or a device. For more information, see "Cables Attached Correctly" and "Isolate Hardware Problems."
I/O Stops When a Redundant Channel Fails
If you have implemented channel redundancy on a PERC 3/SC, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, or 4/Di controller, a failure of one channel causes I/O to stop on the other channels included in the channel-redundant configuration. For the resolution to this problem, see "Channel Redundancy on PERC 3/DC, 3/QC, 4/DC, 4e/DC, 4/Di, and 4e/Di Controllers."
A Task Menu Option is Not Displayed
You may notice that the task menus do not always display the same task options. This is because Storage Management only displays those tasks that are valid at the time the menu is displayed. Some tasks are only valid for certain types of objects or at certain times. For example, a Check Consistency task can only be performed on a redundant virtual disk. Similarly, if a disk is already offline, the Offline task option is not displayed.
There may be other reasons why a task cannot be run at a certain time. For example, there may already be a task running on the object that must complete before additional tasks can be run.
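The two examples above can be expressed as a simple filter. The following Python sketch is a simplified illustration, not the logic that Storage Management actually uses; the function name, the object kinds, and the assumption that an offline disk offers the Online task in place of the Offline task are all hypothetical.

```python
from __future__ import annotations


def visible_tasks(kind: str, *, redundant: bool = False, offline: bool = False) -> list[str]:
    """Return the example tasks that would be valid for the given object."""
    tasks = []
    if kind == "virtual_disk" and redundant:
        # Check Consistency is only valid on a redundant virtual disk.
        tasks.append("Check Consistency")
    if kind == "physical_disk":
        # A disk that is already offline does not offer the Offline task.
        tasks.append("Online" if offline else "Offline")
    return tasks
```

For example, `visible_tasks("virtual_disk", redundant=False)` returns an empty list, which corresponds to the Check Consistency option not being displayed for a RAID 0 virtual disk.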
A Corrupt Disk or Drive Message Suggests Running autocheck During a Reboot
Let autocheck run, but do not worry about the message. The reboot will complete after autocheck is finished. Depending on the size of your system, this may take about ten minutes.
Erroneous Status and Error Messages after a Windows Hibernation
Activating the Windows hibernation feature may cause Storage Management to display erroneous status information and error messages. This problem resolves itself when the Windows operating system recovers from hibernation.
Storage Management May Delay Before Updating Temperature Probe Status
In order to display the enclosure temperature and temperature probe status, Storage Management polls the enclosure firmware at regular intervals to obtain temperature and status information. On some enclosures, there is a short delay before the enclosure firmware reports the current temperature and temperature probe status. Because of this delay, Storage Management may require one or two minutes before displaying the correct temperature and temperature probe status.
You are Unable to Log into a Remote System
Access can be denied if you do not enter a user name and password that match an administrator account on the remote computer or if you mistype the login information. The remote system may also not be powered on, or there may be network problems.
Cannot Connect to Remote Windows Server™ 2003 System
When connecting to a remote Windows Server 2003 system, you must log into the remote system using an account that has administrator privileges. By default, Windows Server 2003 does not allow anonymous (null) connections to access the SAM user accounts. Therefore, if you are attempting to connect using an account that has a blank or null password, the connection may fail.
Reconfiguring a Virtual Disk Displays Error in Mozilla Browser
When reconfiguring a virtual disk using the Mozilla browser, the following error message may display:
Although this page is encrypted, the information you have entered is to be sent over an unencrypted connection and could easily be read by a third party.
You can disable this error message by changing a Mozilla browser setting. To disable this error message:
1. Select Edit and then Preferences.
2. Click Privacy and Security.
3. Click SSL.
4. Uncheck the “Sending form data from an unencrypted page to an unencrypted page” option.
Physical Disks Display Under Connector Not Enclosure Tree Object
Storage Management polls the status of physical disks at frequent intervals. When the physical disk is located in an enclosure, Storage Management uses the data reported by the SCSI Enclosure Processor (SEP) to ascertain the status of the physical disk. In the event that the SEP is not functioning, Storage Management is still able to poll the status of the physical disk, but Storage Management is not able to identify the physical disk as being located in the enclosure. In this case, Storage Management displays the physical disk directly below the Connector object in the tree view and not under the enclosure object.
You can resolve this problem by restarting the Server Administrator service or by rebooting the system. For more information on restarting the Server Administrator service, see the Dell OpenManage™ Server Administrator User’s Guide.