Dell OpenManage Server Administrator Version 5.2 Messages Reference Guide
Contents
1. Action: See the Readme file for a list of validated kernel and driver versions. Update the system to meet the minimum requirements and then reinstall Storage Management.
Event ID 2168. Description: The non-RAID SCSI driver version is older than the minimum required level. See readme.txt for the validated driver version. Severity: Warning / Non-critical. Cause: The version of the driver does not meet the minimum requirements. Storage Management may not be able to display the storage or perform storage management functions until you have updated the system to meet the minimum requirements. Action: See the Readme file for the validated driver version. Update the system to meet the minimum requirements and then reinstall Storage Management. Clear event number: None. SNMP trap number: 103.
Event ID 2169. Description: The controller battery needs to be replaced. Severity: Critical / Failure / Error. Cause: The controller battery cannot recharge. The battery may be old or it may have been already recharged the maximum number of times. In addition, the battery charger may not be working. Action: Replace the battery pack. Clear event number: None. SNMP trap number: 1154.
Event ID 2170. Description: The controller battery charge level is normal. Severity: Ok / Normal. Cause: This alert is for informational purposes. Action: None. Clear event number: None. SNMP trap number: 1151.
Table 4-4. Storage Management Messages (continued)
Event ID 2171. Cause: The battery may be recharging the 2172 115
2. Event message: <Processor Entity> thermal tripped was deasserted. Severity: Information. Cause: This event is generated when the processor has recovered from an earlier thermal condition.
Event message: <Processor Entity> configuration error was asserted. Severity: Critical. Cause: This event is generated when the processor configuration is incorrect.
Event message: <Processor Entity> configuration error was deasserted. Severity: Information. Cause: This event is generated when the earlier processor configuration error was corrected.
Event message: <Processor Entity> throttled was asserted. Severity: Warning. Cause: This event is generated when the processor slows down to prevent overheating.
Event message: <Processor Entity> throttled was deasserted. Severity: Information. Cause: This event is generated when the earlier processor throttled event was corrected.
Power Supply Events
The power supply sensors monitor the functionality of the power supplies. These messages provide status and warning information for power supplies for a particular system.
Table 3-5. Power Supply Events
Event message: <Power Supply Sensor Name> power supply sensor removed. Severity: Critical. Cause: This event is generated when the power supply sensor is removed.
Event message: <Power Supply Sensor Name> power supply sensor AC recovered. Severity: Information. Cause: This event is generated when the power supply has been replaced.
Event message: <Power Supply Sensor Name> power supply sensor returned to n Severity: Information. Cause: This event is generated when the power supply
3. The Patrol Read corrected a media error Severity Critical Failure Error Cause and Action Cause Storage Management has lost communication with a controller This may occur if the controller driver or firmware is experiencing a problem The 1 indicates a substitution variable The text for this substitution variable is displayed with the alert in the Alert Log and can vary depending on the situation Action Reboot the system If the problem is not resolved contact technical support See your system documentation for information about contacting technical support by using telephone fax and Internet services Ok Normal Cause This alert is for informational purposes Critical Failure Error Action None Cause A Clear task was being performed on a physical disk but the task was interrupted and did not complete successfully The controller may have lost communication with the disk The disk may have been removed or the cables may be loose or defective Action Verify that the disk is present and not in a Failed state Make sure the cables are attached securely See the online help for more information on checking the cables Restart the Clear task Clear SNMP Event Trap Number Numbers None 104 None 901 None 904 Ok Normal Cause This alert is for informational purposes None 901 Action None Storage Management Message Reference Table 4 4 Storage Management Messages continued
4. The following subsections explain how to open the Windows 2000 Advanced Server, Windows Server 2003, Red Hat Enterprise Linux, and SUSE Linux Enterprise Server event viewers.
Viewing Events in Windows 2000 Advanced Server and Windows Server 2003
1. Click the Start button, point to Settings, and click Control Panel.
2. Double-click Administrative Tools, and then double-click Event Viewer.
3. In the Event Viewer window, click the Tree tab and then click System Log. The System Log window displays a list of recently logged events.
To view the details of an event, double-click one of the event items.
NOTE: You can also look up the dcsys32.log file in the install_path\omsa\log directory to view the separate event log file. The default install_path is C:\Program Files\Dell\SysMgt.
Viewing Events in Red Hat Enterprise Linux and SUSE Linux Enterprise Server
1. Log in as root.
2. Use a text editor such as vi or emacs to view the file named /var/log/messages.
The following example shows the Red Hat Enterprise Linux and SUSE Linux Enterprise Server message log, /var/log/messages. The text in boldface type indicates the message text.
NOTE: These messages are typically displayed as one long line. In the following example, the message is displayed using line breaks to help you see the message text more clearly.
Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service EventID: 1000
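The Linux steps above amount to reading /var/log/messages and picking out the Server Administrator entries. A minimal Python sketch of that filtering follows. It assumes that each relevant line contains the text "Server Administrator", as in the excerpted example; the function name and the sample kernel line are hypothetical and exist only for illustration.

```python
import io
from typing import Iterable, Iterator

# Assumption: Server Administrator entries carry this source text, as in the
# excerpted example. Adjust the marker if your log lines differ.
MARKER = "Server Administrator"


def server_admin_lines(lines: Iterable[str], marker: str = MARKER) -> Iterator[str]:
    """Yield only the log lines that mention the given marker."""
    for line in lines:
        if marker in line:
            yield line.rstrip("\n")


# Stand-in for /var/log/messages so the sketch runs without root access.
# The kernel line is invented purely as a non-matching example.
sample_log = io.StringIO(
    "Feb 6 14:20:40 server01 kernel: example unrelated entry\n"
    "Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service "
    "EventID: 1000 Server Administrator starting\n"
    "Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service "
    "EventID: 1001 Server Administrator startup complete\n"
)

matches = list(server_admin_lines(sample_log))
for entry in matches:
    print(entry)
```

On a real system the same function could be given an open file handle for /var/log/messages instead of the in-memory sample; reading that file normally requires root privileges, as the steps note.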
5. Event ID 1453. Description: Fan enclosure removed from system. Sensor location: <Location in chassis> Chassis location: <Name of chassis>. Severity: Warning. Cause: A fan enclosure has been removed from the specified system. The sensor location and chassis location are provided.
Table 2-10. Fan Enclosure Messages (continued)
Event ID 1454. Description: Fan enclosure removed from system for an extended amount of time. Sensor location: <Location in chassis> Chassis location: <Name of chassis>. Severity: Error. Cause: A fan enclosure has been removed from the specified system for a user-definable length of time. The sensor location and chassis location are provided.
Event ID 1455. Description: Fan enclosure sensor detected a non-recoverable value. Sensor location: <Location in chassis> Chassis location: <Name of chassis>. Severity: Error. Cause: A fan enclosure sensor in the specified system detected an error from which it cannot recover. The sensor location and chassis location are provided.
AC Power Cord Messages
AC power cord messages listed in Table 2-11 provide status and warning information for power cords that are part of an AC power switch, if your system supports AC switching.
Table 2-11. AC Power Cord Messages
Event ID 1500. Description: AC power cord sensor has failed. Sensor location: <Location in chassis> Chassis location: <Name of chassis>
6. (Event ID 1500, continued) Severity: Information. Cause: An AC power cord sensor in the specified system failed. The AC power cord status cannot be monitored. The sensor location and chassis location information are provided.
Event ID 1501. Description: AC power cord is not being monitored. Sensor location: <Location in chassis> Chassis location: <Name of chassis>. Severity: Information. Cause: The AC power cord status is not being monitored. This occurs when a system's expected AC power configuration is set to nonredundant. The sensor location and chassis location information are provided.
Table 2-11. AC Power Cord Messages (continued)
Event ID 1502. Description: AC power has been restored. Sensor location: <Location in chassis> Chassis location: <Name of chassis>. Severity: Information. Cause: An AC power cord that did not have power had the power restored. The sensor location and chassis location information are provided.
Event ID 1503. Description: AC power has been lost. Sensor location: <Location in chassis> Chassis location: <Name of chassis>. Severity: Warning. Cause: An AC power cord has lost its power, but there is sufficient redundancy to classify this as a warning. The sensor location and chassis location information are provided.
Event ID 1504. Description: AC power has been lost. Sensor location: <Location in chassis> Chassis location: <Name of chassis>. Severity: Error. Cause: An AC power cord has lost its power, and lack of redundancy requires this to be classified as an error. The sensor location and chassis location information are provided.
1505 A
7. A foreign configuration has been cleared 93 A foreign configuration has been detected 107 A foreign configuration has been imported 93 A global hot spare failed 91 A global hot spare has been removed 9 A global rescan has initiated 94 A Learn cycle start is pending while the battery charges 99 A mirrored virtual disk has been unmirrored 76 A physical disk is incompatible 104 A physical disk is marked as missing 112 A physical disk that was marked as missing has been replaced 112 A power supply in the enclosure has a DC failure 105 A power supply in the enclosure has an AC failure 105 A previously scheduled system BIOS update has been canceled 13 A redundant path has been restored 99 A redundant path is broken 99 A system BIOS update has been scheduled for the next reboot 13 A user has discarded data from the controller cache 114 Index 121 A virtual disk and all of its member physical disks have been removed while the system was shut down This removal was discovered during system start up 114 A virtual disk and its mirror have been split 76 A virtual disk blink has been initiated 93 A virtual disk blink has ceased 94 A virtual disk is permanently degraded 104 AC power cord is not being monitored 34 AC power cord messages 34 AC power cord sensor 7 AC power cord sensor has failed 34 50 AC power has been
8. Action Replace the physical disk that contains the disk errors Review other alert messages to identify the physical disk that has errors If the virtual disk is redundant you can replace the physical disk and continue using the virtual disk If the virtual disk is non redundant you may need to recreate the virtual disk after replacing the physical disk After replacing the physical disk run Check Consistency to check the data 110 Storage Management Message Reference Table 4 4 Storage Management Messages continued Event Description Severity Cause and Action Clear SNMP ID Event Trap Number Numbers 2341 The Check Ok Normal Cause This alert is for informational purposes None 1201 Consistency made corrections and completed 2342 The Check Warning Consistency found Non critical inconsistent parity data Data redundancy may be lost 2343 The Check Warning Consistency logging Non critical of inconsistent parity data is disabled 2346 Error occurred 1 Warning Non critical 2347 The rebuild failed Critical due to errors on the Failure source physical disk Error Action None Cause The data on a source disk and the None 1203 redundant data on a target disk is inconsistent Action Restart the Check Consistency task If you receive this alert again check the health of the physical disks included in the virtual disk Review the alert messages for significant alerts related to the physical
9. An EMM has been removed There is a bad sensor on an enclosure Severity Cause and Action Clear Event Number Critical Cause The failure may be caused by a loss of None Failure power to the EMM The EMM self test may Error also have identified a failure There could also be a firmware problem or a multi bit error Action Replace the EMM See the hardware documentation for information on replacing the EMM Ok Normal Cause This alert is for informational purposes None Action None Critical Cause A device has been removed and the None Failure system is no longer functioning in optimal Error condition Action Replace the device Ok Normal Cause This alert is for informational purposes None Action None Critical Cause An EMM has been removed None Failure Action Replace the EMM See the Error hardware documentation for information on replacing the EMM Warning Cause The enclosure has a bad sensor The None Non critical enclosure sensors monitor the fan speeds temperature probes etc Action See the hardware documentation for more information SNMP Trap Numbers 854 752 802 852 902 952 1002 1052 1102 1152 1202 754 804 854 904 954 1004 1054 1104 1154 1204 951 954 853 Storage Management Message Reference 101 Table 4 4 Storage Management Messages continued Event Description Severity Cause and Action Clear SNMP ID Event Trap Number Numbers
10. If sensor type is discrete: Discrete current state: <State>
Cause: A current sensor on the power supply for the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
Cause: A current sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and current sensor value are provided.
Chassis Intrusion Messages
Chassis intrusion messages listed in Table 2-6 are a security measure. Chassis intrusion means that someone is opening the cover to a system's chassis. Alerts are sent to prevent unauthorized removal of parts from a chassis.
Table 2-6. Chassis Intrusion Messages
Event ID 1250. Description: Chassis intrusion sensor has failed. Sensor location: <Location in chassis> Chassis location: <Name of chassis> Previous state was: <State> Chassis intrusion state: <Intrusion state>. Severity: Information. Cause: A chassis intrusion sensor in the specified system failed. The sensor location, chassis location, previous state, and chassis intrusion state are provided.
Event ID 1251. Description: Chassis intrusion sensor value unknown. Severity: Information. Cause: A chassis intrusion sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and chassis intrusion state are prov
11. Normal Cause This alert is for informational purposes has been detected Controller event log 1 Controller event log 1 Action None Ok Normal Cause This alert is for informational Warning Non critical purposes The 1 indicates a substitution variable The text for this substitution variable is generated by the controller and is displayed with the alert in the Alert Log This text is from events in the controller event log that were generated while Storage Management was not running This text can vary depending on the situation Action None Cause The 1 indicates a substitution variable The text for this substitution variable is generated by the controller and is displayed with the alert in the Alert Log This text is from events in the controller event log that were generated while Storage Management was not running This text can vary depending on the situation Action If there is a problem review the controller event log and the Server Administrator Alert Log for significant events or alerts that may assist in diagnosing the problem Check the health of the storage components See the hardware documentation for more information Clear SNMP Event Trap Number Numbers None 901 None 751 None 751 None 753 Storage Management Message Reference 109 Table 4 4 Storage Management Messages continued Event Description Severity Cause and Action Clear SNMP ID Event Trap Number Numbers
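The substitution variable described in this excerpt can be pictured as a numbered placeholder that is filled in when the alert is logged. The sketch below assumes the placeholder is written as %1 (with %2, %3 and so on for further values), which the excerpt does not confirm; the function name and the sample controller text are hypothetical.

```python
import re


def fill_alert(template: str, *values: str) -> str:
    """Replace %1, %2, ... in an alert template with the supplied values.

    Placeholders without a matching value are left untouched, so a missing
    value stays visible in the output instead of being silently dropped.
    """
    def substitute(match: "re.Match[str]") -> str:
        index = int(match.group(1)) - 1  # %1 maps to values[0]
        if 0 <= index < len(values):
            return values[index]
        return match.group(0)

    return re.sub(r"%(\d+)", substitute, template)


# Hypothetical controller text; real text is generated by the controller.
message = fill_alert("Controller event log: %1", "example controller text")
print(message)
```

Leaving unmatched placeholders in place is a design choice for this illustration only; it is not a statement about how Storage Management renders alerts.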
12. Sensor location: <Location in chassis> Chassis location: <Name of chassis> Previous state was: <State> If sensor type is not discrete: Current sensor value (in Amps): <Reading> If sensor type is discrete: Discrete current state: <State>
Cause: A current sensor on the power supply for the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
Cause: A current sensor on the power supply for the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
Table 2-5. Current Sensor Messages (continued)
Event ID 1204. Description: Current sensor detected a failure value. Sensor location: <Location in chassis> Chassis location: <Name of chassis> Previous state was: <State> If sensor type is not discrete: Current sensor value (in Amps): <Reading> If sensor type is discrete: Discrete current state: <State>. Severity: Error.
Event ID 1205. Description: Current sensor detected a non-recoverable value. Sensor location: <Location in chassis> Chassis location: <Name of chassis> Previous state was: <State> If sensor type is not discrete: Current sensor value (in Amps): <Reading>. Severity: Error.
13. and the battery may be burnt or corroded Action Replace the battery if it has been removed If the contact point between the battery and the controller is burnt or corroded you will need to replace either the battery or the controller or both See the hardware documentation for information on how to safely access remove and replace the battery The controller Ok Normal Cause This alert is for informational purposes None 1151 battery has been replaced Action None Storage Management Message Reference 87 Table 4 4 Storage Management Messages continued Event ID 2176 2177 2178 2179 2180 2181 Description The controller battery Learn cycle has started The controller battery Learn cycle has completed The controller battery Learn cycle has timed out The controller battery Learn cycle has been postponed The controller battery Learn cycle will start in 1 days The controller battery Learn cycle will start in 1 hours Severity Cause and Action Ok Normal Cause This alert is for informational purposes Action None Ok Normal Cause This alert is for informational purposes Warning Non critical Action None Cause The controller battery must be fully charged before the Learn cycle can begin The battery may be unable to maintain a full charge causing the Learn cycle to timeout Additionally the battery must be able to maintain cached data for a
14. Event ID 2135. Description: Array Manager is installed on the system. Severity: Warning / Non-critical. Cause: Storage Management has been installed on a system that has an Array Manager installation. Action: Installing Storage Management and Array Manager on the same system is not a supported configuration. Uninstall either Storage Management or Array Manager. Clear event number: None. SNMP trap number: 103.
Event ID 2136. Description: Virtual disk initialization. Severity: Ok / Normal. Cause: This alert is for informational purposes. Virtual disk initialization is in progress. Action: None. Clear event number: 2088. SNMP trap number: 1201.
Table 4-4. Storage Management Messages (continued)
Event IDs 2137, 2138, 2139, 2140
Event ID 2137. Description: Communication timeout. Severity: Warning / Non-critical. Cause: The controller is unable to communicate with an enclosure. There are several reasons why communication may be lost. For example, there may be a bad or loose cable. An unusual amount of I/O may also interrupt communication with the enclosure. In addition, communication loss may be caused by software, hardware, or firmware problems, bad or failed power supplies, and enclosure shutdown. When viewed in the Alert Log, the description for this event displays several variables. These variables are: controller and enclosure names, type of communication problem, return code, and SCSI status. Clear event number: 2162. SNMP trap number: 853. Action: Check for problems with the cables. See the online help for more information o
15. Alerts 2255, 2269, 2270, 2274, 2303, 2305, 2309, 2361, 2362, 2363: Management. This change affects the message text of the modified alerts.
Obsolete alerts: 2160 (replaced by 2195), 2161 (replaced by 2196).
Table 4-3. Alert Message Change History
Documentation changes: Documentation updated to indicate clear alert status. Reference to SNMP trap variables removed (see comments). Corresponding Array Manager event numbers removed. Starting with Dell OpenManage 5.0, Array Manager is no longer an installable option. If you have an Array Manager installation and wish to see how the Array Manager events correspond to the Storage Management alerts, refer to the product documentation prior to Storage Management 2.1 or Dell OpenManage 5.1.
Alert Descriptions and Corrective Actions
The following sections describe alerts generated by the RAID or SCSI controllers supported by Storage Management. The alerts are displayed in the Server Administrator Alert subtab or through Windows Event Viewer. These alerts can also be forwarded as SNMP traps to other applications.
SNMP traps are generated for the alerts listed in the following sections. These traps are included in the Dell OpenManage Server Administrator Storage Management management information base (MIB). The SNMP traps for these alerts use all of the SNMP trap variables. For more information on SNMP support and the
16. Event ID 2299. Description: Bad PHY %1. Severity: Critical / Failure / Error. Cause: There is a problem with a physical connection or PHY. The %1 indicates a substitution variable. The text for this substitution variable is displayed with the alert in the Alert Log and can vary depending on the situation. Action: Contact Dell technical support. Clear event number: None. SNMP trap number: 854.
Event ID 2300. Description: The enclosure is unstable. Severity: Critical / Failure / Error. Cause: The controller is not receiving a consistent response from the enclosure. There could be a firmware problem or an invalid cabling configuration. If the cables are too long, they will degrade the signal. Action: Power down all enclosures attached to the system and reboot the system. If the problem persists, upgrade the firmware to the latest supported version. You can download the most current version of the driver and firmware from support.dell.com. Make sure the cable configuration is valid. See the hardware documentation for valid cabling configurations. Clear event number: None. SNMP trap number: 854.
Event ID 2301. Description: The enclosure has a hardware error. Severity: Critical / Failure / Error. Cause: The enclosure or an enclosure component is in a Failed or Degraded state. Action: Check the health of the enclosure and its components. Replace any hardware that is in a Failed state. See the hardware documentation for more information. Clear event number: None. SNMP trap number: 854.
Event ID 2302. Description: The enclosure is not responding. Severity: Critical / Failure / Error. Cause: The enclosure or an enclosure component is in a Failed or Degraded state. Clear event number: None. SNMP trap number: 854. Action: Check the heal
17. 2316 AT 2318 39 100 100 100 100 101 101 101 101 101 101 102 102 102 102 102 103 103 103 103 104 104 104 105 105 105 105 105 106 106 106 2319 106 2320 106 2321 107 Dai OF 2323 107 2324 107 232S MOE 2326 107 2327 108 2328 108 2329 108 2330 108 2331 109 2332 109 2334 109 2335 109 2336 110 2337 110 2338 110 2399 UG 2340 110 Pak II 2342 111 Daars ele 2346 111 2347 111 2348 112 2349 112 2350 112 2351 112 2392 WIZ 239a WIZ 2356 113 2397 WN 2358 113 2359 113 2360 114 2361 114 2362 114 2363 114 2364 114 2366 114 230l S 2368 115 ZEN WS A A bad disk block could not be reassigned during a write operation 112 A bad disk block has been reassigned 109 A block on the physical disk has been punctured by the controller 97 A consistency check on a virtual disk has been paused suspended 75 A consistency check on a virtual disk has been resumed 76 A controller hot plug has been detected 109 A controller rescan has been initiated 93 A dedicated hot spare failed 91 A dedicated hot spare has been automatically unassigned 92 A dedicated hot spare has been removed 91 A device has been inserted 101 A device has been removed 101 A device is in an unknown state 95 A device is missing 95 A disk media error has been corrected 98 A disk media error was corrected during recovery 99
18. Event ID 2336. Description: Controller event log: %1. Severity: Critical / Failure / Error. Cause: The %1 indicates a substitution variable. The text for this substitution variable is generated by the controller and is displayed with the alert in the Alert Log. This text is from events in the controller event log that were generated while Storage Management was not running. This text can vary depending on the situation. Action: See the hardware documentation for more information. Clear event number: None. SNMP trap number: 754.
Event ID 2337. Description: The controller is unable to recover cached data from the battery backup unit (BBU). Severity: Critical / Failure / Error. Cause: The controller was unable to recover data from the cache. Action: Check if the battery is charged and in good health. When the battery charge is unacceptably low, it cannot maintain cached data. Check if the battery has reached its recharge limit. The battery may need to be recharged or replaced. Clear event number: None. SNMP trap number: 1154.
Event ID 2338. Description: The controller has recovered cached data from the BBU. Severity: Ok / Normal. Cause: This alert is for informational purposes. Action: None. Clear event number: None. SNMP trap number: 1151.
Event ID 2339. Description: The factory default settings have been restored. Severity: Ok / Normal. Cause: This alert is for informational purposes. Action: None. Clear event number: None. SNMP trap number: 751.
Event ID 2340. Description: The BGI completed with uncorrectable errors. Severity: Critical / Failure / Error. Cause: The BGI task encountered errors that cannot be corrected. The virtual disk contains physical disks that have unusable disk space or disk errors that cannot be corrected. Clear event number: None. SNMP trap number: 1204.
19. Change write policy 76 Chassis intrusion detected 26 47 Chassis intrusion in progress 26 47 chassis intrusion messages 25 Chassis intrusion returned to normal 25 chassis intrusion sensor 6 Chassis intrusion sensor detected a non recoverable value 26 47 Chassis intrusion sensor has failed 25 Chassis intrusion sensor value unknown 25 47 Communication regained 84 Communication timeout 81 Communication with the enclosure has been lost 100 Controller alarm disabled 82 Controller alarm enabled 82 Controller alarm has been tested 84 Controller battery is reconditioning 72 Controller battery low 82 Controller battery recondition is completed 72 Controller configuration has been reset 84 Controller event log 1 109 110 Controller log file entry 1 95 Controller rebuild rate has changed 82 cooling device messages 18 current sensor 6 Current sensor detected a failure value 24 Current sensor detected a non recoverable value 24 Current sensor detected a warning value 23 Current sensor has failed 22 46 current sensor messages 22 Current sensor returned to a normal value 23 46 Current sensor value unknown 22 Dead disk segments restored 81 Dedicated hot spare assigned Physical disk 1 90 Dedicated hot spare unassigned Physical disk 1 90 Dedicated spare imported as global due to missing arrays 114 Device failed
20. Sensor location: <Location in chassis> Chassis location: <Name of chassis> Previous state was: <State> If sensor type is not discrete: Voltage sensor value (in Volts): <Reading> If sensor type is discrete: Discrete voltage state: <State>. Severity: Error. Cause: A voltage sensor in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Event ID 1155. Description: Voltage sensor detected a non-recoverable value. Sensor location: <Location in chassis> Chassis location: <Name of chassis> Previous state was: <State> If sensor type is not discrete: Voltage sensor value (in Volts): <Reading> If sensor type is discrete: Discrete voltage state: <State>. Severity: Error. Cause: A voltage sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Current Sensor Messages
Current sensors listed in Table 2-5 measure the amount of current (in amperes) that is traversing critical components. Current sensor messages provide status and warning information for current sensors in a particular chassis.
Table 2-5. Current Sensor Messages
Event ID 1200. Description: Current sensor has failed. Severity: Information. Cause: A current sensor on the power supply for the specified system
21. Server Administrator starting
Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service EventID: 1001 Server Administrator startup complete
Feb 6 14:21:21 server01 Server Administrator: Instrumentation Service EventID: 1254 Chassis intrusion detected Sensor location: Main chassis intrusion Chassis location: Main System Chassis Previous state was: OK (Normal) Chassis intrusion state: Open
Feb 6 14:21:51 server01 Server Administrator: Instrumentation Service EventID: 1252 Chassis intrusion returned to normal Sensor location: Main chassis intrusion Chassis location: Main System Chassis Previous state was: Critical (Failed) Chassis intrusion state: Closed
Viewing the Event Information
The event log for each operating system contains some or all of the following information:
Date: The date the event occurred.
Time: The local time the event occurred.
Type: A classification of the event severity: Information, Warning, or Error.
User: The name of the user on whose behalf the event occurred.
Computer: The name of the system where the event occurred.
Source: The software that logged the event.
Category: The classification of the event by the event source.
Event ID: The number identifying the particular event type.
Description: A description of the event. The format and contents of the event description vary, depending on the event type.
Un
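The example log lines and the field list above suggest how a message-log entry might be split into date, time, computer, source, event ID, and description. The following Python sketch does that with a regular expression. The exact punctuation of real log lines is an assumption here (the pattern tolerates a missing colon after "Server Administrator" and "EventID"), and the LoggedEvent class and parse_line function are hypothetical names, not part of Server Administrator.

```python
import re
from dataclasses import dataclass
from typing import Optional

# Assumed layout: "<Mon> <day> <hh:mm:ss> <host> Server Administrator: <service>
# EventID: <number> <description>". Colons after the source and "EventID" are
# optional in the pattern because the excerpt does not confirm them.
LINE_RE = re.compile(
    r"^(?P<date>[A-Z][a-z]{2}\s+\d{1,2})\s+"
    r"(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<computer>\S+)\s+"
    r"(?P<source>Server Administrator):?\s+"
    r".*?EventID:?\s+(?P<event_id>\d+)\s+"
    r"(?P<description>.*)$"
)


@dataclass
class LoggedEvent:
    date: str
    time: str
    computer: str
    source: str
    event_id: int
    description: str


def parse_line(line: str) -> Optional[LoggedEvent]:
    """Return a LoggedEvent for a matching line, or None if it does not match."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    return LoggedEvent(
        date=match["date"],
        time=match["time"],
        computer=match["computer"],
        source=match["source"],
        event_id=int(match["event_id"]),
        description=match["description"],
    )


event = parse_line(
    "Feb 6 14:21:21 server01 Server Administrator: Instrumentation Service "
    "EventID: 1254 Chassis intrusion detected Sensor location: Main chassis intrusion"
)
print(event)
```

Fields such as Type, User, and Category are omitted because the example lines do not show where they would appear; a real parser would need to be checked against actual log output.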
22. changed to 901 2204 Severity changed to Informational SNMP trap changed to 901 60 Storage Management Message Reference Table 4 3 Alert Message Change History Alert Message Change History Obsolete Alerts Documentation Changes 2205 2266 2272 2273 2279 2299 2305 2331 2367 2333 2354 2355 2365 2370 Severity for alert 2163 changed from Ok Normal to Critical Failure Error Severity for alert 2318 changed from Critical Failure Error to Warning Non critical Severity changed to Informational SNMP trap changed to 901 SNMP traps changed to 751 801 851 901 951 1001 1051 1101 1151 1201 Severity changed to Critical SNMP trap changed to 904 Changed corrective action information in the documentation Changed alert message text and documentation for cause and corrective action Changed alert message text Changed corrective action information in the documentation Changed severity to Warning Changed SNMP trap number to 903 Changed severity to Informational Changed SNMP trap number to 901 Changed severity to Warning Changed SNMP trap number to 903 2354 replaced by 2368 Documentation change only made in the Dell OpenManage Server Administrator Messages Reference Guide to reflect the severity displayed in the Server Administrator Alert Log and documented in the Storage Management online help Documentation change only made in the Dell OpenManage Server Adm
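In the change history above, each severity change is paired with a trap number: Informational with 901, Warning with 903, and Critical with 904. The sketch below assumes, purely as an inference from those pairs, that the last digit of a trap number tracks the severity. It is not taken from the MIB, the function name is hypothetical, and any other final digit is reported as unknown.

```python
from typing import Optional

# Inferred from the pairs in the excerpt (901, 903, 904); not confirmed by the MIB.
LAST_DIGIT_TO_SEVERITY = {
    1: "Informational",
    3: "Warning",
    4: "Critical",
}


def severity_for_trap(trap_number: int) -> Optional[str]:
    """Guess the severity from the trap number's last digit, or return None."""
    return LAST_DIGIT_TO_SEVERITY.get(trap_number % 10)


for trap in (901, 903, 904):
    print(trap, severity_for_trap(trap))
```

The authoritative mapping is the one in the Storage Management MIB; a lookup like this is only a convenience for reading the excerpted tables.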
23. Fan Sensor Events
Processor Status Events
Power Supply Events
Memory ECC Events
BMC Watchdog Events
Memory Events
Hardware Log Sensor Events
Drive Events
Intrusion Events
BIOS Generated System Events
R2 Generated System Events
Cable Interconnect Events
Battery Events
Entity Presence Events
4 Storage Management Message Reference
Alert Monitoring and Logging
Alert Message Format with Substitution Variables
Alert Message Change History
Alert Descriptions and Corrective Actions
Index
Introduction
Dell OpenManage Server Administrator produces event messages stored primarily in the operating system or Server Administrator event logs and sometimes in SNMP traps. This document describes the event messages created by Server Administrator version 5.2 or later and displayed in the Server Administrator Alert log. Server Administrator creates events in respon
24. 63 Device returned to normal 77 Diagnostic message 1 105 106 Drive Events 50 Driver version mismatch 80 drives messages 50 E Enclosure alarm disabled 81 Enclosure alarm enabled 81 Enclosure firmware mismatch 76 Enclosure was shut down 75 entity presence messages 56 Error occurred 1 111 event description reference 10 F Failure prediction threshold exceeded due to test 75 Fan enclosure inserted into system 33 fan enclosure messages 33 Fan enclosure removed from system 33 Fan enclosure removed from system for an extended amount of time 34 fan enclosure sensor 7 Fan enclosure sensor detected a non recoverable value 34 Fan enclosure sensor has failed 33 Fan enclosure sensor value unknown 33 fan sensor 6 Index 123 Fan sensor detected a failure value 19 Fan sensor detected a non recoverable value 19 Fan sensor detected a warning value 18 Fan Sensor Events 45 Fan sensor has failed 18 44 fan sensor messages 45 Fan sensor returned to a normal value 18 Fan sensor value unknown 18 44 Firmware version mismatch 80 G Global hot spare assigned 70 Global hot spare unassigned 70 hardware log sensor 7 Hardware Log Sensor Events 49 hardware log sensor messages 49 Hot spare SMART polling failed 99 124 Index Intrusion Events 51 intrusion messages 51 L Log backup created 13 Log monitoring has be
25. Event message: ... sensor detected a failure <Reading>, where <Sensor Name Location> is the entity that this sensor is monitoring (for example, BMC Back Fan or BMC Front Fan) and <Reading> is specified in RPM (for example, 100 RPM). Severity: Critical. Cause: The speed of the specified <Sensor Name Location> fan is not sufficient to provide enough cooling to the system.
Event message: <Sensor Name Location> Fan sensor returned to normal state <Reading>. Severity: Information. Cause: The fan specified by <Sensor Name Location> has returned to its normal operating speed.
Event message: <Sensor Name Location> Fan sensor detected a warning <Reading>. Severity: Warning. Cause: The speed of the specified <Sensor Name Location> fan may not be sufficient to provide enough cooling to the system.
Event message: <Sensor Name Location> Fan Redundancy sensor redundancy degraded. Severity: Information. Cause: The fan specified by <Sensor Name Location> may have failed, and hence the redundancy has been degraded.
Event message: <Sensor Name Location> Fan Redundancy sensor redundancy lost. Severity: Critical. Cause: The fan specified by <Sensor Name Location> may have failed, and hence the redundancy that was degraded previously has been lost.
Event message: <Sensor Name Location> Fan Redundancy sensor redundancy regained. Severity: Information. Cause: The fan specified by <Sensor Name Location> may have started functioning again, and hence the redundancy has been regained.
Processor Status Events
The pro
26. Event ID 2272: Patrol Read found an uncorrectable media error. Severity: Critical / Failure / Error. Clear Event Number: None.
Cause: The Patrol Read task has encountered an error that cannot be corrected. There may be a bad disk block that cannot be remapped.
Action: Back up your data. If you are able to back up the data successfully, fully initialize the disk and then restore from backup.
Event ID 2273: A block on the physical disk has been punctured by the controller. Severity: Critical / Failure / Error. Clear Event Number: None.
Cause: The controller encountered an unrecoverable medium error when attempting to read a block on the physical disk and marked that block as invalid. If the controller encountered the unrecoverable medium error on a source physical disk during a rebuild or reconfigure operation, it will also puncture the corresponding block on the target physical disk. The invalid block will be cleared on a write operation.
Action: Back up your data. If you are able to back up the data successfully, fully initialize the disk and then restore from backup.
Event ID 2274: The physical disk rebuild has resumed. Severity: Ok / Normal. Clear Event Number: None.
Cause: This alert is for informational purposes.
Action: None.
Event ID 2277: The global hot spare is too small. Severity: Warning / Non-critical.
Event ID 2276: The dedicated hot spare is too small. Severity: Warning / Non-critical.
Cause: The dedicated hot spare
27. BIOS Generated System Events
The BIOS-generated messages monitor the health and functionality of the chipsets, I/O channels, and other BIOS-related functions. These system events are generated by the BIOS.
Table 3-12. BIOS Generated System Events
Event message: System Event I/O channel chk. Severity: Critical. Cause: This event is generated when a critical interrupt is generated in the I/O channel.
Event message: System Event PCI Parity Err. Severity: Critical. Cause: This event is generated when a parity error is detected on the PCI bus.
Event message: System Event Chipset Err. Severity: Critical. Cause: This event is generated when a chip error is detected.
Event message: System Event PCI System Err. Severity: Information. Cause: This event indicates historical data and is generated when the system has crashed and recovered.
Event message: System Event PCI Fatal Err. Severity: Critical. Cause: This event is generated when a fatal error is detected on the PCI bus.
Event message: System Event PCIE Fatal Err. Severity: Critical. Cause: This event is generated when a fatal error is detected on the PCIE bus.
Event message: POST Err, POST fatal error <number>. Severity: Critical. Cause: This event is generated when an error occurs during system boot. See the system documentation for more information on the error code.
Event message: Memory Spared redundancy lost. Severity: Critical. Cause: This event is generated when memory spare is no longer redundant.
Event message: Memory Mirrored redundancy lost. Severity: Critical. Cause: This event is generated when memory mirroring is no longer redundant.
Event message: Memory RAID. Severity: Critica
28. Severity: Critical / Failure / Error. Clear Event Number: None. SNMP Trap Number: 754.
Cause: The %1 indicates a substitution variable. The text for this substitution variable is generated by the firmware and is displayed with the alert in the Alert Log. This text can vary depending on the situation. The reference to SMP in this text refers to SAS Management Protocol.
Action: There may be a SAS topology error. See the hardware documentation for information on correct SAS topology configurations. There may be problems with the cables, such as a loose connection or an invalid cabling configuration. See the hardware documentation for information on correct cabling configurations. Check if the firmware is a supported version.
Clear Event Number: None. SNMP Trap Number: 754.
Cause: The %1 indicates a substitution variable. The text for this substitution variable is generated by the firmware and is displayed with the alert in the Alert Log. This text can vary depending on the situation.
Action: There may be a problem with the enclosure. Check the health of the enclosure and its components by selecting the enclosure object in the tree view. The Health subtab displays a red X or yellow exclamation point for enclosure components that are failed or degraded. See the enclosure documentation for more information.
Severity: Ok / Normal. Clear Event Number: None. SNMP Trap Number: 1151.
Cause: This alert is for informational purposes.
Action: None.
Severity: Warning / Non-critical. Clear Event Number: None. SNMP Trap Number: 903.
Cause: The physical disk does not comply
29. If sensor type is not discrete: Temperature sensor value (in degrees Celsius): <Reading>. If sensor type is discrete: Discrete temperature state: <State>.
Cause: A temperature sensor on the backplane board, system board, or the carrier in the specified system failed. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Description: Temperature sensor value unknown. Severity: Information.
Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. If sensor type is not discrete: Temperature sensor value (in degrees Celsius): <Reading>. If sensor type is discrete: Discrete temperature state: <State>.
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal temperature sensor value are provided.
Table 2-2. Temperature Sensor Messages (continued)
Event ID 1052: Temperature sensor returned to a normal value. Severity: Information.
Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous sta
30. Event message: ... on Bank DIMM A B. Severity: Information. Cause: This event is generated when there is a memory error correction on a particular Dual Inline Memory Module (DIMM).
Event message: ECC uncorrectable error detected on Bank DIMM. Severity: Critical. Cause: This event is generated when the chipset is unable to correct the memory errors. Usually a bank number is provided, and the DIMM may or may not be identifiable, depending on the error.
Event message: Correctable memory error logging disabled. Severity: Critical. Cause: This event is generated when the ECC error correction rate in the chipset exceeds a predefined limit.
BMC Watchdog Events
The BMC watchdog operations are performed when the system hangs or crashes. These messages monitor the status and occurrence of these events in a system.
Table 3-7. BMC Watchdog Events
Event message: BMC OS Watchdog timer expired. Severity: Information. Cause: This event is generated when the BMC watchdog timer expires and no action is set.
Event message: BMC OS Watchdog performed system reboot. Severity: Critical. Cause: This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to reboot.
Event message: BMC OS Watchdog performed system power off. Severity: Critical. Cause: This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to power off.
Event message: BMC OS Watchdog performed sy. Severity: Critical. Cause: This event is generated when the BMC watchdog
32. Severity: Critical / Failure / Error.
Cause: A physical disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the rebuild.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red X for its status. Restart the virtual disk rebuild.
Description: Physical disk rebuild failed. Severity: Critical / Failure / Error.
Cause: A physical disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the rebuild.
Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red X for its status. Restart the virtual disk rebuild.
Description: Virtual disk check consistency completed. Severity: Ok / Normal. Cause: This alert is for informational purposes. Action: None.
Description: Virtual disk format completed. Severity: Ok / Normal. Cause: This alert is for informational purposes. Action: None.
Description: Virtual disk initialization completed. Severity: Ok / Normal. Cause: This alert is for informational purposes. Action: None.
Description: Physical disk initialize completed. Severity: Ok / Normal. Cause: This alert is for informational purposes. Action: None.
Description: Virtual disk reconfiguration completed. Severity: Ok / Normal. Cause: This alert is for informational purposes. Action: None.
Description: Physical disk rebuild completed.
Description: Virtual disk rebuild completed. Severity: Ok / Normal. Cause: This alert is for informational purposes. A
33. Table 4-4. Storage Management Messages (continued)
Event ID 2123: Redundancy lost. Severity: Warning / Non-critical. Clear Event Number: 2124. SNMP Trap Number: 1306.
Cause: A virtual disk or an enclosure has lost data redundancy. In the case of a virtual disk, one or more physical disks included in the virtual disk have failed. Due to the failed physical disk or disks, the virtual disk is no longer maintaining redundant (mirrored or parity) data. The failure of an additional physical disk will result in lost data. In the case of an enclosure, more than one enclosure component has failed. For example, the enclosure may have suffered the loss of all fans or all power supplies.
Event ID 2124: Redundancy normal. Severity: Ok / Normal.
Action (Event ID 2123): Identify and replace the failed components. To identify the failed component, select the Storage object and click the Health subtab. The controller status displayed on the Health subtab indicates whether a controller has a failed or degraded component. Click the controller that displays a Warning or Failed status. This action displays the controller Health subtab, which displays the status of the individual controller components. Continue clicking the components with a Warning or Failed status until you identify the failed component. See the online help for more information. See the enclosure documentation for information on replacing enclosure components and for other diagnostic in
35. absent from the chassis. This sensor monitors the chassis and any attached systems.
• AC Power Cord Sensor: Monitors the presence of AC power for an AC power cord.
• Hardware Log Sensor: Monitors the size of a hardware log.
• Processor Sensor: Monitors the processor status in the system.
• Pluggable Device Sensor: Monitors the addition, removal, or configuration errors for some pluggable devices, such as memory cards.
• Battery Sensor: Monitors the status of one or more batteries in the system.
Sample Event Message Text
The following example shows the format of the event messages logged by Server Administrator.
EventID: 1000
Source: Server Administrator
Category: Instrumentation Service
Type: Information
Date and Time: Mon Oct 21 10:38:00 2002
Computer: <computer name>
Description: Server Administrator starting
Data: Bytes in Hex
Viewing Alerts and Event Messages
An event log is used to record information about important events. Server Administrator generates alerts that are added to the operating system event log and to the Server Administrator Alert log. To view these alerts in Server Administrator:
1. Select the System object in the tree view.
2. Select the Logs tab.
3. Select the Alert subtab.
You can also view the event log using your operating system's event viewer. Each operating system's event viewer accesses the applicable operating system event log. Int
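The sample event message above is a flat list of labelled fields, so it is easy to turn into a structured record for filtering or reporting. The following is a minimal Python sketch, assuming the record has been exported as text with one "Label: value" pair per line; the function name `parse_event` and the one-pair-per-line layout are illustrative assumptions, not part of Server Administrator.

```python
import re

# Field labels taken from the sample event message in this guide.
FIELDS = ["EventID", "Source", "Category", "Type",
          "Date and Time", "Computer", "Description", "Data"]

# Matches a line that starts with one of the known labels followed by a colon.
_LINE = re.compile(r"^(%s):\s*(.*)$" % "|".join(re.escape(f) for f in FIELDS))


def parse_event(text):
    """Return a dict mapping each recognised label to its value (hypothetical helper)."""
    event = {}
    for line in text.splitlines():
        match = _LINE.match(line.strip())
        if match:
            event[match.group(1)] = match.group(2)
    return event


SAMPLE = """\
EventID: 1000
Source: Server Administrator
Category: Instrumentation Service
Type: Information
Date and Time: Mon Oct 21 10:38:00 2002
Computer: <computer name>
Description: Server Administrator starting
Data: Bytes in Hex
"""

event = parse_event(SAMPLE)
print(event["EventID"], event["Type"], event["Description"])
```

Keeping the label list explicit means that unrecognised lines are ignored rather than guessed at, which is the safer default when the export format is not documented.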
37. mirrored memory configuration.
Severity: Information. Cause: This event is generated when there is a memory failure in a spared memory configuration.
Severity: Critical. Cause: This event is generated when redundancy is lost in a spared memory configuration.
Severity: Information. Cause: This event is generated when the redundancy lost or degraded earlier is regained in a spared memory configuration.
Hardware Log Sensor Events
The hardware logs provide hardware status messages to the system management software. On particular systems, the subsequent hardware messages are not displayed when the log is full. These messages provide status and warning messages when the logs are full.
Table 3-9. Hardware Log Sensor Events
Event message: Log full detected. Severity: Critical. Cause: This event is generated when the SEL device detects that only one entry can be added to the SEL before it is full.
Event message: Log cleared. Severity: Information. Cause: This event is generated when the SEL is cleared.
Drive Events
The drive event messages monitor the health of the drives in a system. These events are generated when there is a fault in the drives indicated.
Table 3-10. Drive Events
Event message: Drive <Drive> asserted fault state. Severity: Critical. Cause: This event is generated when the specified drive in the array is faulty.
Event message: Drive <Drive> de-asserted fault s. Severity: Information. Cause: This event is generated when the specified drive
38. working. You should also check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot. See the physical disk enclosure documentation for more diagnostic information.
Cause: The physical disk enclosure is too cool. Clear Event Number: 2353. SNMP Trap Number: 1053.
Action: Check if the thermostat setting is too low and if the room temperature is too cool.
Cause: The physical disk enclosure is too hot. A variety of factors can cause the excessive temperature. For example, a fan may have failed, the thermostat may be set too high, or the room temperature may be too hot. Clear Event Number: None. SNMP Trap Number: 1054.
Action: Check for factors that may cause overheating. For example, verify that the enclosure fan is working. You should also check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot. See the physical disk enclosure documentation for more diagnostic information.
Cause: The physical disk enclosure is too cool. Clear Event Number: None. SNMP Trap Number: 1054.
Action: Check if the thermostat setting is too low and if the room temperature is too cool.
Table 4-4. Storage Management Messages (continued)
Event ID: 2104, 2105, 2106, 2107. Description: Smart configuration. Severity: Critical. Cause and Action: Cont
39. Event IDs on this page: 2172, 2173, 2174, 2175.
Description: The controller battery temperature is above normal. Severity: Warning / Non-critical.
Cause: room temperature may be too hot, or the fan in the system may be degraded or failed.
Action: If this alert was generated due to a battery recharge, the situation will correct when the recharge is complete. You should also check if the room temperature is normal and that the system components are functioning properly.
Description: The controller battery temperature is normal. Severity: Ok / Normal. SNMP Trap Number: 1151.
Cause: This alert is for informational purposes.
Action: None.
Description: Unsupported configuration detected. The SCSI rate of the enclosure management modules (EMMs) is not the same. EMM0 %1 EMM1 %2. Severity: Warning / Non-critical. Clear Event Number: None. SNMP Trap Number: 853.
Cause: The EMMs in the enclosure have a different SCSI rate. This is an unsupported configuration. All EMMs in the enclosure should have the same SCSI rate. The percent sign indicates a substitution variable. The text for this substitution variable is displayed with the alert in the Alert Log and can vary depending on the situation.
Action: The EMMs in the enclosure have a different SCSI rate. This is an unsupported configuration. All EMMs in the enclosure should have the same SCSI rate.
Description: The controller battery has been removed. Severity: Warning / Non-critical. Clear Event Number: None. SNMP Trap Number: 1153.
Cause: The controller cannot communicate with the battery, the battery may be removed, or the contact point between the controller
40. Description: AC power has been lost. Severity: Error.
Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>.
Cause: An AC power cord sensor in the specified system failed. The AC power cord status cannot be monitored. The sensor location and chassis location information are provided.
Hardware Log Sensor Messages
Hardware logs provide hardware status messages to systems management software. On certain systems, the hardware log is implemented as a circular queue. When the log becomes full, the oldest status messages are overwritten when new status messages are logged. On some systems, the log is not circular. On these systems, when the log becomes full, subsequent hardware status messages are lost. Hardware log sensor messages listed in Table 2-12 provide status and warning information about the noncircular logs that may fill up, resulting in lost status messages.
Table 2-12. Hardware Log Sensor Messages
Event ID 1550: Log monitoring has been disabled. Severity: Information. Message fields: Log type: <Log type>. Cause: A hardware log sensor in the specified system is disabled. The log type information is provided.
Event ID 1551: Log status is unknown. Severity: Information. Message fields: Log type: <Log type>. Cause: A hardware log sensor in the specified system could not obtain a reading. The log type information is provided.
Event ID 1552: Log size is no longer near or at capacity. Severity: Information. Cause: The hardware log on the specified system is no l
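The difference between the circular and noncircular hardware logs described above can be illustrated with a small model. This is a minimal Python sketch of the two behaviours only; the class names `CircularLog` and `LinearLog` and the capacity value are hypothetical and do not correspond to any Server Administrator or BMC interface.

```python
from collections import deque


class CircularLog:
    """Circular queue: when full, the oldest message is overwritten."""

    def __init__(self, capacity):
        # deque with maxlen silently discards the oldest entry on overflow.
        self.entries = deque(maxlen=capacity)

    def add(self, message):
        self.entries.append(message)


class LinearLog:
    """Noncircular log: when full, new messages are lost."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = []
        self.lost = 0  # count of messages dropped because the log was full

    def add(self, message):
        if len(self.entries) >= self.capacity:
            self.lost += 1
            return False
        self.entries.append(message)
        return True


circular = CircularLog(capacity=3)
linear = LinearLog(capacity=3)
for i in range(5):
    circular.add(f"status {i}")
    linear.add(f"status {i}")

print(list(circular.entries))  # keeps the three newest messages
print(linear.entries, linear.lost)  # keeps the three oldest, two lost
```

The noncircular case is the one the log sensor messages warn about, because once the log fills, nothing new is recorded until it is cleared.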
41. Dell OpenManage Server Administrator Messages Reference Guide
www.dell.com | support.dell.com
Notes and Notices
NOTE: A NOTE indicates important information that helps you make better use of your computer.
NOTICE: A NOTICE indicates either potential damage to hardware or loss of data and tells you how to avoid the problem.
Information in this document is subject to change without notice. © 2003-2007 Dell Inc. All rights reserved. Reproduction in any manner whatsoever without the written permission of Dell Inc. is strictly forbidden.
Trademarks used in this text: Dell, the DELL logo, and Dell OpenManage are trademarks of Dell Inc.; Microsoft and Windows are registered trademarks, and Windows Server is a trademark, of Microsoft Corporation; Red Hat is a registered trademark of Red Hat, Inc.; SUSE is a registered trademark of Novell, Inc. in the United States and other countries.
Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell Inc. disclaims any proprietary interest in trademarks and trade names other than its own.
February 2007
Contents: Introduction; What's New in this Release; Messages Not Described in This Guide; Understanding Event Messages; Sample Event Message Text; Viewing Alerts
42. If sensor type is discrete: Discrete voltage state: <State>.
Cause: A voltage sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal voltage sensor value are provided.
Description: Voltage sensor returned to a normal value. Severity: Information.
Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. If sensor type is not discrete: Voltage sensor value (in Volts): <Reading>. If sensor type is discrete: Discrete voltage state: <State>.
Cause: A voltage sensor in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Description: Voltage sensor detected a warning value. Severity: Warning.
Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. If sensor type is not discrete: Voltage sensor value (in Volts): <Reading>. If sensor type is discrete: Discrete voltage state: <State>.
Cause: A voltage sensor in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
Table 2-4. Voltage Sensor Messages (continued)
Event ID 1154: Voltage sensor detected a failure value
43. Event message: ... hardware incompatibility (BMC Firmware and CPU mismatch) was deasserted. Severity: Information. Cause: This event is generated when an earlier hardware mismatch is corrected.
Event message: SBE Log Disabled: correctable memory error logging disabled was asserted. Severity: Critical. Cause: This event is generated when the ECC single bit error rate is exceeded.
Event message: CPU Protocol Err ... to non-recoverable. Severity: Critical. Cause: This event is generated when the processor protocol enters a non-recoverable state.
Event message: CPU Bus PERR ... to non-recoverable. Severity: Critical. Cause: This event is generated when the processor bus PERR enters a non-recoverable state.
Event message: CPU Init Err ... Severity: Critical. Cause: This event is generated when the processor initialization enters a non-recoverable state.
Event message: CPU Machine Chk ... Severity: Critical. Cause: This event is generated when the processor machine check enters a non-recoverable state.
Event message: Logging Disabled: all event logging disabled was asserted. Severity: Critical. Cause: This event is generated when all event logging is disabled.
Event message: Unknown system event sensor: unknown system hardware failure was asserted. Severity: Critical. Cause: This event is generated when an unknown hardware failure is detected.
R2 Generated System Events
Table 3-13. R2 Generated Events
Description: System Event: OS stop event, OS graceful shutdown detected. Severity: Information. Cause: The OS was shut down or restarted normally.
Description: OEM Event data record after. Severity: Information. Cause: Comment s
44. MIB, see the SNMP Reference Guide. To locate an alert, scroll through the following table to find the alert number displayed on the Server Administrator Alert tab, or search this file for the alert message text or number. See "Understanding Event Messages" for more information on severity levels. For more information regarding alert descriptions and the appropriate corrective actions, see the online help.
Table 4-4. Storage Management Messages
Event ID 2048: Device failed. Severity: Critical / Failure / Error. Clear Event Number: 2121. SNMP Trap Numbers: 754, 804, 854, 904, 954, 1004, 1054, 110...
Cause: A storage component such as a physical disk or an enclosure has failed. The failed component may have been identified by the controller while performing a task such as a rescan or a check consistency.
Action: Replace the failed component. You can identify which disk has failed by locating the disk that has a red X for its status. Perform a rescan after replacing the disk.
Table 4-4. Storage Management Messages (continued)
Event ID 2050: Physical disk offline. Severity: Warning / Non-critical.
Event ID 2051: Physical disk degraded. Severity: Warning / Non-critical.
Event ID 2052: Physical disk inserted.
Event ID 2049: Physical disk removed. Severity: Warning / Non-critical.
Cause: A physical disk has been
45. Messages (continued)
Event ID 2191: Multiple enclosures are attached to the controller. This is an unsupported configuration. Severity: Critical / Failure / Error.
Cause: Too many enclosures are attached to the controller port. When the enclosure limit is exceeded, the controller loses contact with all enclosures attached to the port.
Action: Remove the last enclosure. You must remove the enclosure that was added last and is causing the enclosure limit to be exceeded.
Event ID 2192: The virtual disk Check Consistency has made corrections and completed. Severity: Ok / Normal.
Cause: This alert is for informational purposes. The virtual disk Check Consistency has identified errors and made corrections. For example, the Check Consistency may have encountered a bad disk block and remapped the disk block to restore data consistency.
Action: This alert is for informational purposes only and no additional action is required. As a precaution, monitor the Alert Log for other errors related to this virtual disk. If problems persist, contact Dell Technical Support.
Event ID 2194: The virtual disk Read policy has changed.
Event ID 2195: Dedicated hot spare assigned. Physical disk %1.
Event ID 2196: Dedicated hot spare unassigned. Physical disk %1.
Event ID 2199: The virtual disk cache policy has changed.
Event ID 2193: The virtual disk reconfiguration has resumed. Severity: Ok / Normal.
Cause: This alert is for informational pu
46. Clear Event Number: None. SNMP Trap Numbers: 753, 753, 753, 851, 851, 851, 1051.
Table 4-4. Storage Management Messages (continued)
Event ID 2155: Minimum temperature probe warning threshold value changed. Severity: Ok / Normal.
Cause: This alert is for informational purposes. A user has changed the value for the minimum temperature probe warning threshold.
Action: None.
Event ID 2156: Controller alarm has been tested. Severity: Ok / Normal.
Cause: This alert is for informational purposes. The controller alarm test has run successfully.
Action: None.
Event ID 2157: Controller configuration has been reset. Severity: Ok / Normal.
Cause: This alert is for informational purposes. A user has reset the controller configuration. See the online help for more information.
Action: None.
Event ID 2158: Physical disk online. Severity: Ok / Normal.
Cause: This alert is for informational purposes. An offline physical disk has been made online.
Action: None.
Event ID 2159: Virtual disk renamed. Severity: Ok / Normal.
Cause: This alert is for informational purposes. A user has renamed a virtual disk. When renaming a virtual disk on a PERC 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, CERC ATA100/4ch, PERC 5/E, PERC 5/i, or SAS 5/iR controller, this alert displays the new virtual disk name. On the PERC 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, 4/IM, 4e/Si, 4e/Di, and CERC ATA100/4ch controllers, this alert displays the original virtual disk name. A
47. OS application log. Sends an SNMP trap if the operating system's SNMP service is installed and enabled.
NOTE: Dell OpenManage Server Administrator Storage Management does not log alerts regarding the data I/O path. These alerts are logged by the respective RAID drivers in the system alert log. See the Storage Management Online Help and the Dell OpenManage Server Administrator Storage Management User's Guide for updated information.
Alert Message Format with Substitution Variables
When you view an alert in the Server Administrator alert log, the alert identifies the specific components, such as the controller name or the virtual disk name, to which the alert applies. In an actual operating environment, a storage system can have many combinations of controllers and disks, as well as user-defined names for virtual disks and other components. Because each environment is unique in its storage configuration and user-defined names, an accurate alert message requires that the Storage Management Service be able to insert the environment-specific names of storage components into an alert message. This environment-specific information is inserted after the alert message text, as shown for alert 2127 in Table 4-1.
For other alerts, the alert message text is constructed from information passed directly from the controller or another storage component to the Alert Log. In these cases, the variable in
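The two ways of building an alert described above (appending environment-specific component names after the message text, and filling a substitution variable inside the text) can be sketched as follows. This is a minimal Python illustration, not the Storage Management Service's implementation: the function names, the colon separator, and the `%1`/`%2` token syntax are assumptions (the guide only states that a percent sign marks a substitution variable), and the rate values in the usage lines are made up for the example.

```python
import re


def append_components(event_id, text, components):
    """Append 'Kind name' pairs after the alert text (hypothetical helper)."""
    names = " ".join(f"{kind} {name}" for kind, name in components)
    return f"{event_id} {text}: {names}"


def fill_variables(template, values):
    """Replace %1, %2, ... with the matching entry of values (assumed syntax)."""
    def replace(match):
        index = int(match.group(1)) - 1  # %1 is the first value
        return str(values[index])
    return re.sub(r"%(\d+)", replace, template)


# Component names appended after the message text.
print(append_components(
    2121, "Device returned to normal",
    [("EMM", 1), ("Controller", 1), ("Connector", 0), ("Enclosure", 2)],
))

# Text supplied by the controller or firmware substituted into the message.
print(fill_variables("EMM0 %1 EMM1 %2", ["rate A", "rate B"]))
```

Separating the two steps mirrors the distinction the guide draws: names come from the local configuration, while substitution text comes from the controller or firmware.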
48. Table 4-4. Storage Management Messages (continued)
Event ID 2121: Device returned to normal. Severity: Ok / Normal. SNMP Trap Numbers: 752, 802, 852, 902, 952, 1002, 1052, 1102, 1152, 1202.
Cause: This alert is for informational purposes. A device that was previously in an error state has returned to a normal state. For example, if an enclosure became too hot and subsequently cooled down, then you may receive this alert.
Action: None.
Event ID 2122: Redundancy degraded. Severity: Warning / Non-critical. Clear Event Number: 2124. SNMP Trap Number: 1305.
Cause: One or more of the enclosure components has failed. For example, a fan or power supply may have failed. Although the enclosure is currently operational, the failure of additional components could cause the enclosure to fail.
Action: Identify and replace the failed component. To identify the failed component, select the enclosure in the tree view and click the Health subtab. Any failed component will be identified with a red X on the enclosure's Health subtab. Alternatively, you can select the Storage object and click the Health subtab. The controller status displayed on the Health subtab indicates whether a controller has a failed or degraded component. See the enclosure documentation for information on replacing enclosure components and for other diagnostic information.
49. Variables for Each Storage Object (continued). Storage Object: SAS EMM. Message Variables (A, B, C and X, Y, Z in the following examples are variables representing the storage object name or number). Message Format: EMM X Controller A Connector B Enclosure C. Example: 2121 Device returned to normal: EMM 1 Controller 1 Connector 0 Enclosure 2. Alert Message Change History: The following table describes changes made to the Storage Management alerts from the previous release of Storage Management to the current release. Table 4-3. Alert Message Change History. Alert Message Change History: Storage Management 2.2. Product Versions to which Changes Apply: Storage Management 2.2, Server Administrator 3.2, Dell OpenManage 5.2. Reduction of unnecessary alert generation: Enhancements to Storage Management avoid numerous redundant or inappropriate alerts posted to the Alert Log after an unexpected system shutdown. Comments: In previous versions of Storage Management, an unexpected system shutdown may have caused the controller to repost a large number of alerts to the Alert Log when restarting the system. Modified Alerts: 2095 Severity changed to Informational, SNMP trap changed to 901. 2153 Severity changed to Informational, SNMP trap changed to 851. 2188 Severity changed to Informational, SNMP trap changed to 1151. 2192 Changed documentation for cause and corrective action. 2202 Severity changed to Informational, SNMP trap
50. a problem with the driver or firmware. Action: Check the cables. Check if the controller has a supported version of the driver and firmware. You can download the most current version of the driver and firmware from support.dell.com. Rebooting the system may also resolve this problem. Description: Controller log file entry %1. Severity: Ok/Normal. Cause: This alert is for informational purposes. The %1 indicates a substitution variable. The text for this substitution variable is generated by the controller and is displayed with the alert in the Alert Log. This text can vary depending on the situation. Action: None. Clear Event Number: None. Description: The controller reconstruct rate has changed. Severity: Ok/Normal. Cause: This alert is for informational purposes. Action: None. Clear Event Number: None. SNMP Trap Numbers: 101 101 753 803 853 903 953 1003 1053 1103 1153 1203 753 803 853 903 953 1003 1053 1103 1153 1203 751 801 851 901 951 1001 1051 1101 1151 1201 751. Table 4-4. Storage Management Messages (continued). Event IDs: 2268, 2269, 2270, 2271. Descriptions: %1 Storage Management has lost communication with the controller. An immediate reboot is strongly recommended to avoid further problems. If the reboot does not restore communication, then contact technical support for more information. The physical disk Clear operation has completed. The physical disk Clear operation failed
51. ace from these dead segments has been recovered and is now usable. Any data residing on these dead segments has been lost. Action: None. Severity: Ok/Normal. Cause: This alert is for informational purposes. A user has changed the controller rebuild rate. Action: None. Severity: Ok/Normal. Cause: This alert is for informational purposes. A user has enabled the controller alarm. Action: None. Severity: Ok/Normal. Cause: This alert is for informational purposes. A user has disabled the controller alarm. Action: None. Severity: Warning/Non-critical. Cause: The controller battery charge is low. Action: Recondition the battery. See the online help for more information. Severity: Warning/Non-critical. Cause: A portion of a physical disk is damaged. Action: See the Dell OpenManage Server Administrator Storage Management online help or the Dell OpenManage Server Administrator Storage Management User's Guide for more information. Severity: Warning/Non-critical. Cause: A portion of a physical disk is damaged. Action: See the Dell OpenManage Server Administrator Storage Management online help for more information. Clear Event Number: None None None None None None None. SNMP Trap Numbers: 901 751 751 751 1153 753 753. Table 4-4. Storage Management Messages (continued). Event IDs: 2148, 2149, 2150, 2151, 2152, 2153, 2154. Descriptions: Bad block medium error. Bad block extended se
52. aced. The sensor location, chassis location, previous state, and additional power supply status information are provided. A power supply sensor reading in the specified system exceeded a user-definable warning threshold. The sensor location, chassis location, previous state, and additional power supply status information are provided. Table 2-8. Power Supply Messages (continued). Event IDs: 1354, 1355. Description: Power supply detected a failure. Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. Power Supply type: <type of power supply>. <Additional power supply status information>. If in configuration error state: Configuration error type: <type of configuration error>. Severity: Error. Cause: A power supply has been disconnected or has failed. The sensor location, chassis location, previous state, and additional power supply status information are provided. Description: Power supply sensor detected a non-recoverable value. Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. Power Supply type: <type of power supply>. <Additional power supply status information>. If in configuration error state: Configuration error type: <type of configuration error>. Severity: Error. Cause: A power supply sensor in the specified system detected an error from
53. ancel the flash BIOS update has been canceled BIOS update or an error occurs during the flash. Event ID: 1004. Description: Thermal shutdown protection has been initiated. Severity: Error. Cause: This message is generated when a system is configured for thermal shutdown due to an error event. If a temperature sensor reading exceeds the error threshold for which the system is configured, the operating system shuts down and the system powers off. This event may also be initiated on certain systems when a fan enclosure is removed from the system for an extended period of time. Table 2-1. Miscellaneous Messages (continued). Event ID: 1005. Description: SMBIOS data is absent. Severity: Warning. Cause: The system does not contain the required systems management BIOS version 2.2 or higher, or the BIOS is corrupted. Event ID: 1006. Description: Automatic System Recovery (ASR) action was performed. Action performed was: <Action>. Date and time of action: <Date and time>. Severity: Error. Cause: This message is generated when an automatic system recovery action is performed due to a hung operating system. The action performed and the time of action are provided. Event ID: 1007. Description: User initiated host system control action. Action requested was: <Action>. Severity: Information. Cause: User requested a host system control action to reboot, power off, or power cycle the system. Alternatively, the user had indicated protective measures to be initiated in the event of a thermal shutdown. Event ID: 1008. Description: Systems Management Data Infor
54. and Event Messages 7; Viewing Events in Windows 2000 Advanced Server and Windows Server 2003 8; Viewing Events in Red Hat Enterprise Linux and SUSE Linux Enterprise Server 8; Viewing the Event Information 9; Understanding the Event Description 10; 2 Event Message Reference 13; Miscellaneous Messages 13; Temperature Sensor Messages 15; Cooling Device Messages 18; Voltage Sensor Messages 19; Current Sensor Messages 22; Chassis Intrusion Messages 25; Redundancy Unit Messages 26; Power Supply Messages 29; Memory Device Messages 32; Fan Enclosure Messages 33; AC Power Cord Messages 34; Hardware Log Sensor Messages 35; Processor Sensor Messages 37; Pluggable Device Messages 39; Battery Sensor Messages 40; 3 System Event Log Messages for IPMI Systems 43; Temperature Sensor Events 43; Voltage Sensor Events
55. ary depending on the situation. Action: Upgrade to the same version of the firmware on both EMM modules. Cause: The power supply has an AC failure. Action: Replace the power supply. Clear Event Number: 2325. SNMP Trap Number: 1003. Cause: The power supply has a DC failure. Action: Replace the power supply. Clear Event Number: 2323. SNMP Trap Number: 1003. Cause: Storage Management is unable to monitor or manage SAS devices. Action: Reboot the system. If the problem persists, make sure you have supported versions of the drivers and firmware. Also, you may need to reinstall Storage Management or Server Administrator because of some missing installation components. Clear Event Number: None. SNMP Trap Number: 104. Severity: Ok/Normal. Cause: This alert is for informational purposes. The %1 indicates a substitution variable. The text for this substitution variable is generated by the utility that ran the diagnostics and is displayed with the alert in the Alert Log. This text can vary depending on the situation. Action: None. Clear Event Number: None. SNMP Trap Number: 751. Table 4-4. Storage Management Messages (continued). Event ID 2316: Diagnostic message %1. Event ID 2317: BGI terminated due to loss of ownership in a cluster configuration. Event ID 2318: Problems with the battery or the battery charger have been detected. The battery health is poor. Event ID 2319: Single bit ECC error. The DIMM is degrading. Event ID 2320: Single bit ECC error. The DIMM is critically degraded. Severity: Critical. Cause: A diagnos
56. ation: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. Fan sensor value: <Reading>. Cause: cannot recover. The sensor location, chassis location, previous state, and fan sensor value are provided. Voltage Sensor Messages: Voltage sensors listed in Table 2-4 monitor the number of volts across critical components. Voltage sensor messages provide status and warning information for voltage sensors in a particular chassis. Table 2-4. Voltage Sensor Messages. Event ID: 1150. Description: Voltage sensor has failed. Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. If sensor type is not discrete: Voltage sensor value (in Volts): <Reading>. If sensor type is discrete: Discrete voltage state: <State>. Severity: Information. Cause: A voltage sensor in the specified system failed. The sensor location, chassis location, previous state, and voltage sensor value are provided. Table 2-4. Voltage Sensor Messages (continued). Event IDs: 1151, 1152, 1153. Description: Voltage sensor value unknown. Severity: Information. Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. If sensor type is not discrete: Voltage sensor value (in Volts): <Reading>
57. ation, chassis location, previous state, and processor sensor status are provided. Chassis Location: <Name of chassis>. Previous state was: <State>. Processor sensor status: <status>. Event ID: 1605. Description: Processor sensor detected a non-recoverable value. Sensor location: <Location in chassis>. Chassis Location: <Name of chassis>. Previous state was: <State>. Processor sensor status: <status>. Severity: Error. Cause: A processor sensor in the specified system has failed. The sensor location, chassis location, previous state, and processor sensor status are provided. Pluggable Device Messages: The pluggable device messages listed in Table 2-14 provide status and error information when some devices, such as memory cards, are added or removed. Table 2-14. Pluggable Device Messages. Event IDs: 1650, 1651, 1652, 1653. Description: <Device plug event type unknown>. Severity: Information. Device location: <Location in chassis, if available>. Chassis location: <Name of chassis, if available>. Additional details: <Additional details for the events, if available>. Description: Device added to system. Severity: Information. Device location: <Location in chassis>. Chassis location: <Name of chassis>. Additional details: <Additional details for the events>. Description: Device removed from system. Severity: Information. Device location: <Location in chassis>. Chassis location: <Name of c
58. ature sensor detected a non-recoverable value. Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. If sensor type is not discrete: Temperature sensor value (in degrees Celsius): <Reading>. If sensor type is discrete: Discrete temperature state: <State>. Severity: Error. Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and temperature sensor value are provided. Cooling Device Messages: Cooling device sensors listed in Table 2-3 monitor how well a fan is functioning. Cooling device messages provide status and warning information for fans in a particular chassis. Table 2-3. Cooling Device Messages. Event ID: 1100. Description: Fan sensor has failed. Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. Fan sensor value: <Reading>. Severity: Information. Cause: A fan sensor in the specified system is not functioning. The sensor location, chassis location, previous state, and fan sensor value are provided. Event ID: 1101. Description: Fan sensor value unknown. Sensor location: <Location in chassis>. Severity: Information. Cause: A fan sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state a
59. atus information>. If in configuration error state: Configuration error type: <type of configuration error>. Cause: A power supply sensor in the specified system failed. The sensor location, chassis location, previous state, and additional power supply status information are provided. Cause: A power supply sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and additional power supply status information are provided. Table 2-8. Power Supply Messages (continued). Event IDs: 1352, 1353. Description: Power supply returned to normal. Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. Power Supply type: <type of power supply>. <Additional power supply status information>. If in configuration error state: Configuration error type: <type of configuration error>. Severity: Information. Description: Power supply detected a warning. Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. Power Supply type: <type of power supply>. <Additional power supply status information>. If in configuration error state: Configuration error type: <type of configuration error>. Severity: Warning. Cause: A power supply has been reconnected or repl
60. can vary depending on the situation The 1 indicates a substitution variable The text for this substitution variable is generated by the controller and is displayed with the alert in the Alert Log This text can vary depending on the situation Action Make sure the cables are attached securely If the problem persists replace the cable with a valid cable according to SAS specifications If the problem still persists you may need to replace some devices such as the controller or EMM See the hardware documentation for more information Ok Normal Cause This alert is for informational purposes The 1 indicates a substitution variable The text for this substitution variable is generated by the controller and is displayed with the alert in the Alert Log This text can vary depending on the situation Action None Storage Management Message Reference Clear SNMP Event Trap Number Numbers None 753 None 753 None 753 None 751 Table 4 4 Storage Management Messages continued Event ID 2331 2332 2334 2335 Description A bad disk block has been reassigned Severity Cause and Action Ok Normal Cause The disk has a bad block Data has been readdressed to another disk block and no data loss has occurred Action Monitor the disk for other alerts or indications of poor health For example you may receive alert 2306 Replace the disk if you suspect there is a problem A controller hot plug Ok
61. cessor status messages monitor the functionality of the processors in a system. These messages provide processor health and warning information of a system. Table 3-4. Processor Status Events. Event Message: <Processor Entity> status processor sensor IERR, where <Processor Entity> is the processor that generated the event. For example, PROC for a single processor system and PROC for multiprocessor system. Severity: Critical. Cause: IERR internal error generated by the <Processor Entity>. Event Message: <Processor Entity> status processor sensor Thermal Trip. Severity: Critical. Cause: The processor generates this event before it shuts down because of excessive heat caused by lack of cooling or heat synchronization. Event Message: <Processor Entity> status processor sensor recovered from IERR. Severity: Information. Cause: This event is generated when a processor recovers from the internal error. Event Message: <Processor Entity> status processor sensor disabled. Severity: Warning. Cause: This event is generated for all processors that are disabled. Event Message: <Processor Entity> status processor sensor terminator not present. Severity: Information. Cause: This event is generated if the terminator is missing on an empty processor slot. Event Message: <Processor Entity> presence was deasserted. Severity: Critical. Cause: This event is generated when the system could not detect the processor. Event Message: <Processor Entity> presence was asserted. Severity: Information. Cause: This event is generated when the earlier processor detection error was corrected.
62. continued. Event Message: Drive <Drive> in failed array was deasserted. Severity: Informational. Cause: This event is generated when the drive is removed from the failed array. Event Message: Drive <Drive> rebuild in progress was asserted. Severity: Informational. Cause: This event is generated when the drive is rebuilding. Event Message: Drive <Drive> rebuild aborted was asserted. Severity: Warning. Cause: This event is generated when the drive rebuilding process is aborted. Intrusion Events: The chassis intrusion messages are a security measure. Chassis intrusion alerts are generated when the system's chassis is opened. Alerts are sent to prevent unauthorized removal of parts from the chassis. Table 3-11. Intrusion Events. Event Message: <Intrusion sensor Name> sensor detected an intrusion. Severity: Critical. Cause: This event is generated when the intrusion sensor detects an intrusion. Event Message: <Intrusion sensor Name> sensor returned to normal state. Severity: Information. Cause: This event is generated when the earlier intrusion has been corrected. Event Message: <Intrusion sensor Name> sensor intrusion was asserted while system was ON. Severity: Critical. Cause: This event is generated when the intrusion sensor detects an intrusion while the system is on. Event Message: <Intrusion sensor Name> sensor intrusion was asserted while system was OFF. Severity: Critical. Cause: This event is generated when the intrusion sensor detects an intrusion while the system is off. System
63. ction None 2162 Communication Ok Normal Cause This alert is for informational purposes regained Communication with an enclosure has been restored Action None 2163 Rebuild completed Critical Cause This alert is documented in the Storage with errors Failure Management online help Error Action See the online help for more information 84 Storage Management Message Reference Clear Event Number None None None Clear event None Clear event None SNMP Trap Numbers 1051 751 751 901 1201 851 904 Table 4 4 Storage Management Messages continued Event ID 2164 2165 2166 Description Severity Cause and Action Clear SNMP Event Trap Number Numbers See the Readme file Ok Normal Cause This alert is for informational purposes None 101 for a list of validated controller driver versions The RAID controller Warning firmware and driver Non critical validation was not performed The configuration file cannot be opened The RAID controller Warning firmware and driver Non critical validation was not performed The configuration file is out of date or corrupted Storage Management is unable to determine whether the system has the minimum required versions of the RAID controller drivers Action See the Readme file for driver and firmware requirements In particular if Storage Management experiences performance problems you should v
64. ction None Ok Normal Cause This alert is for informational purposes Action None Clear Event Number None None Clear event Clear event Clear event Clear event Clear event Clear event Clear event SNMP Trap Numbers 1204 904 1201 1201 1201 901 1201 1201 901 Storage Management Message Reference 69 Table 4 4 Storage Management Messages continued Event Description ID 2094 Predictive Failure reported 2095 SCSI sense data 2098 Global hot spare assigned 2099 Global hot spare unassigned Severity Warning Non critical Cause and Action Cause The physical disk is predicted to fail Many physical disks contain Self Monitoring Analysis and Reporting Technology SMART When enabled SMART monitors the health of the disk based on indications such as the number of write operations that have been performed on the disk Action Replace the physical disk Even though the disk may not have failed yet it is strongly recommended that you replace the disk If this disk is part of a redundant virtual disk perform the Offline task on the disk replace the disk and then assign a hot spare and the rebuild will start automatically If this disk is a hot spare then unassign the hot spare perform the Prepare to Remove task on the disk replace the disk and assign the new disk as a hot spare NOTICE if this disk is part of a non
65. d a SMART alert None 903 predictive failure due to test conditions Action None Cause The physical disk enclosure is either None 854 hotter or cooler than the maximum or minimum allowable temperature range Action Check for factors that may cause overheating or excessive cooling For example verify that the enclosure fan is working You should also check the thermostat settings and examine whether the enclosure is located near a heat source Make sure the enclosure has enough ventilation and that the room temperature is not too hot or too cold See the enclosure documentation for more diagnostic information 2114 A consistency check Ok Normal Cause The check consistency operation ona 2115 1201 on a virtual disk has been paused suspended virtual disk was paused by a user Action To resume the check consistency operation right click the virtual disk in the tree view and select Resume Check Consistency Storage Management Message Reference 75 Table 4 4 Storage Management Messages continued Event Description Severity Cause and Action Clear SNMP ID Event Trap Number Numbers 2115 A consistency check Ok Normal Cause This alert is for informational purposes Clear 1201 on a virtual disk has The check consistency operation on a virtual event been resumed disk has resumed processing after being paused by a user Action None 2116 A virtual disk and its Ok Normal Cause This alert is for informational pu
66. d block table is now 80 full Action Back up your data Replace the disk generating this alert and restore from back up 751 905 903 Storage Management Message Reference 103 Table 4 4 Storage Management Messages continued 104 Event ID 2307 2309 2310 Description Bad block table is full Unable to log block 1 A physical disk is incompatible A virtual disk is permanently degraded Severity Critical Failure Error Warning Non critical Critical Failure Error Cause and Action Cause The bad block table is used for remapping bad disk blocks This table fills as bad disk blocks are remapped When the table is full bad disk blocks can no longer be remapped and disk errors can no longer be corrected At this point data loss can occur The 1 indicates a substitution variable The text for this substitution variable is displayed with the alert in the Alert Log and can vary depending on the situation Action Replace the disk generating this alert If necessary restore your data from backup Cause You have attempted to replace a disk with another disk that is using an incompatible technology For example you may have replaced one side of a mirror with a SAS disk when the other side of the mirror is using SATA technology Action See the hardware documentation for information on replacing disks Cause A redundant virtual disk has lost redundancy This ma
67. derstanding the Event Description: Table 1-2 lists, in alphabetical order, each line item that may appear in the event description. Table 1-2. Event Description Reference. Description Line Item: Action performed was: <Action>. Explanation: Specifies the action that was performed, for example: Action performed was: Power cycle. Description Line Item: Action requested was: <Action>. Explanation: Specifies the action that was requested, for example: Action requested was: Reboot, shutdown OS first. Description Line Item: Additional Details: <Additional details for the event>. Explanation: Specifies additional details available for the hot plug event, for example: Memory device: DIMM1_A Serial number: FFFF30B1. Description Line Item: <Additional power supply status information>. Explanation: Specifies information pertaining to the event, for example: Power supply input AC is off, Power supply POK (power OK) signal is not normal, Power supply is turned off. Description Line Item: Chassis intrusion state: <Intrusion state>. Explanation: Specifies the chassis intrusion state (open or closed), for example: Chassis intrusion state: Open. Further line items in this part of the table: Configuration error type: <type of configuration error>; Current sensor value (in Amps): <Reading>; Date and time of action: <Date and time>; Device location: <Location in chassis>; Discrete current state: <State>; Discrete temperature state: <State>. Description Line Item: Chassis location: <Name of chassis>. Explanation: Specifies name of the chassis that generated the
68. disks If you suspect that a physical disk has a problem replace it and restore from backup Cause The Check Consistency can no longer None 1203 report errors in the parity data Action See the hardware documentation for more information Cause A physical device may have an error None 903 The 1 indicates a substitution variable The text for this substitution variable is generated by the firmware and is displayed with the alert in the Alert Log This text can vary depending on the situation Action Verify the health of attached devices Review the Alert Log for significant events Run the PHY integrity diagnostic tests You may need to replace faulty hardware Make sure the cables are attached securely See the hardware documentation for more information Cause You are attempting to rebuild data None 904 that resides on a defective disk Action Replace the source disk and restore from backup Storage Management Message Reference 111 Table 4 4 Storage Management Messages continued Event ID 2348 2349 2350 2351 2352 2353 112 Description The rebuild failed due to errors on the target physical disk A bad disk block could not be reassigned during a write operation There was an unrecoverable disk media error during the rebuild A physical disk is marked as missing A physical disk that was marked as missing has been replaced The enclosure temperature has ret
69. ed restart the check consistency operation Cause A physical disk included in the virtual None disk failed Action Replace the failed physical disk You can identify which physical disk has failed by locating the disk that has a red X for its status Rebuild the physical disk When finished restart the virtual disk format operation Cause A physical disk included in the virtual None disk has failed or a user has cancelled the initialization Action If a physical disk has failed then replace the physical disk Cause The physical disk has failed or is None corrupt Action Replace the failed or corrupt disk You can identify a disk that has failed by locating the disk that has a red X for its status Restart the initialization Cause A physical disk included in the virtual None disk has failed or is corrupt A user may also have cancelled the reconfiguration Action Replace the failed or corrupt disk You can identify a disk that has failed by locating the disk that has a red X for its status If the physical disk is part of a redundant array then rebuild the physical disk When finished restart the reconfiguration Storage Management Message Reference SNMP Trap Numbers 1204 1204 1204 904 1204 Table 4 4 Storage Management Messages continued Event ID 2082 2083 2085 2086 2088 2089 2090 2091 2092 Description Virtual disk rebuild failed
70. ed correctly Action See the hardware documentation for information on correct cabling configurations Cause The controller has flushed the cache and any data in the cache has been lost This may happen if the system has memory or battery problems that cause the controller to distrust the cache Although user data may have been lost this alert does not always indicate that relevant or user data has been lost Action Verify that the battery and memory are functioning properly Cause The system memory is malfunctioning Action Replace the battery pack The controller write Ok Normal Cause The controller battery is unable to policy has been changed to Write Through maintain cached data for the required period of time For example if the required period of time is 24 hours the battery is unable to maintain cached data for 24 hours It is normal to receive this alert during the battery Learn cycle as the Learn cycle discharges the battery before recharging it When discharged the battery cannot maintain cached data Action Check the health of the battery If the battery is weak replace the battery pack Clear SNMP Event Trap Number Numbers None 754 None 753 None 753 None 1151 The controller write Ok Normal Cause This alert is for informational purposes None 1151 policy has been changed to Write Back Action None Storage Management Message Reference 89 90 Table 4 4 Storage Management
71. en disabled 36 51 Log size is near or at capacity 36 Log size returned to a normal level 36 Log status is unknown 36 51 Log was cleared 13 Maximum temperature probe warning threshold value changed 83 Memory device ECC Correctable error count crossed a warning threshold 32 Memory device ECC Correctable error count sensor crossed a failure threshold 32 memory device messages 32 Memory device monitoring has been disabled 32 Memory ECC Events 48 memory ecc messages 48 Memory Events 49 memory modules messages 49 memory prefailure sensor 6 messages AC power cord 34 50 battery 55 battery sensor 40 bios generated system 52 BMC watchdog 48 cable interconnect 55 chassis intrusion 25 cooling device 18 current sensor 22 drives 50 entity presence 56 fan enclosure 33 fan sensor 45 hardware log sensor 9 intrusion 51 memory device 32 memory ecc 48 memory modules 49 miscellaneous 13 pluggable device 39 52 power supply 29 47 processor sensor 37 processor status 46 r2 generated system 55 redundancy unit 26 storage management 63 temperature sensor 15 43 voltage sensor 19 44 Minimum temperature probe warning threshold value changed 84 Multi bit ECC error 100 Multiple enclosures are attached to the controller This is an unsupported configuration 90 P Patrol Read found an uncorrectable media error 97 Physical disk dead segm
72. enclosure management module (EMM) managing the power supply. Example: 2122 Redundancy degraded: Power Supply 1 Controller 1 Connector 0 Target ID 6. Message Format: Power Supply X Controller A Connector B Enclosure C. Example: 2312 A power supply in the enclosure has an AC failure: Power Supply 1 Controller 1 Connector 0 Enclosure 2. Message Format: Temperature Probe X Controller A Connector B Target ID C, where C is the SCSI ID number of the EMM managing the temperature probe. Example: 2101 Temperature dropped below the minimum warning threshold: Temperature Probe 1 Controller 1 Connector 0 Target ID 6. Message Format: Temperature Probe X Controller A Connector B Enclosure C. Example: 2101 Temperature dropped below the minimum warning threshold: Temperature Probe 1 Controller 1 Connector 0 Enclosure 2. Message Format: Fan X Controller A Connector B Target ID C, where C is the SCSI ID number of the EMM managing the fan. Example: 2121 Device returned to normal: Fan 1 Controller 1 Connector 0 Target ID 6. Message Format: Fan X Controller A Connector B Enclosure C. Example: 2121 Device returned to normal: Fan 1 Controller 1 Connector 0 Enclosure 2. Message Format: EMM X Controller A Connector B Target ID C, where C is the SCSI ID number of the EMM. Example: 2121 Device returned to normal: EMM 1 Controller 1 Connector 0 Target ID 6. Table 4-2. Message Format with
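For readers who post-process the Alert Log, the enclosure-component message formats listed above (X, A, B, C variables) can be pulled apart with a small parser. The following is a minimal sketch, assuming the variables are plain integers as in the examples; the function name `parse_storage_object` and the returned field names are hypothetical, and real environments with user-defined component names would need a looser pattern.

```python
import re
from typing import Dict, Optional, Union

# Matches the four component kinds shown in the message formats above,
# followed by either a "Target ID C" (SCSI) or "Enclosure C" suffix.
_OBJECT_PATTERN = re.compile(
    r"^(?P<kind>Power Supply|Temperature Probe|Fan|EMM)\s+(?P<index>\d+)\s+"
    r"Controller\s+(?P<controller>\d+)\s+"
    r"Connector\s+(?P<connector>\d+)\s+"
    r"(?:Target ID\s*(?P<target_id>\d+)|Enclosure\s+(?P<enclosure>\d+))$"
)


def parse_storage_object(text: str) -> Optional[Dict[str, Union[str, int]]]:
    """Split a storage-object string into its named variables.

    Returns None when the text does not follow one of the formats.
    """
    match = _OBJECT_PATTERN.match(text.strip())
    if match is None:
        return None
    # Keep only the groups that matched; convert numeric fields to int.
    return {
        key: (value if key == "kind" else int(value))
        for key, value in match.groupdict().items()
        if value is not None
    }


if __name__ == "__main__":
    print(parse_storage_object("Fan 1 Controller 1 Connector 0 Target ID 6"))
    print(parse_storage_object("EMM 1 Controller 1 Connector 0 Enclosure 2"))
```

Only one of `target_id` or `enclosure` appears in a result, which makes it easy to tell the SCSI form from the enclosure-numbered form.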
73. ents recovered 82 Physical disk degraded 64 Physical disk initialization started 66 Physical disk initialize completed 69 Physical disk initialize failed 68 Physical disk inserted 64 Physical disk offline 64 Physical disk online 84 Physical disk rebuild cancelled 67 Physical disk rebuild completed 69 Physical disk rebuild failed 69 Physical disk rebuild started 67 Physical disk removed 64 Physical disk s have been removed from a virtual disk The virtual disk will be in Failed state during the next system reboot 114 Physical disk s that are part of a virtual disk have been removed while the system was shut down This removal was discovered during system start up 114 pluggable device sensor 7 Power supply detected a failure 31 Power supply detected a warning 30 48 Power Supply Events 47 power supply messages 29 47 Power supply returned to normal 30 48 power supply sensor 6 Power supply sensor detected a non recoverable value 31 Power supply sensor has failed 29 Power supply sensor value unknown 29 Predictive Failure reported 70 Problems with the battery or the battery charger have been detected The battery health is poor 106 processor sensor 7 Processor sensor detected a failure value 38 52 Processor sensor detected a non recoverable value 38 Processor sensor detected a warning value 38 52 Processor sensor has
74. erify that you have the minimum supported versions of the drivers and firmware installed. Cause: Storage Management is unable to determine whether the system has the minimum required versions of the RAID controller firmware and drivers. This situation may occur for a variety of reasons. For example, the installation directory path to the configuration file may not be correct. The configuration file may also have been removed or renamed. Action: Reinstall Storage Management. Clear Event Number: None. SNMP Trap Number: 753. Cause: Storage Management is unable to determine whether the system has the minimum required versions of the RAID controller firmware and drivers. This situation has occurred because a configuration file is unreadable or missing data. The configuration file may be corrupted. Action: Reinstall Storage Management. Clear Event Number: None. SNMP Trap Number: 753. Table 4-4. Storage Management Messages (continued). Columns: Event ID, Description, Severity, Cause and Action, Clear Event Number, SNMP Trap Numbers. 2167. Severity: Warning / Non-critical. Cause: The version of the kernel and the driver do not meet the minimum requirements. Storage Management may not be able to display the storage or perform storage management functions until you have updated the system to meet the minimum requirements. Clear Event Number: None. SNMP Trap Number: 103. Description: The current kernel version and the non-RAID SCSI driver version are older than the minimum required levels. See readme.txt for a list of validated kernel
75. ert is for informational purposes Action None Storage Management Message Reference Clear Event Number None None 2358 None None None None None None 2260 None None SNMP Trap Numbers 1201 1153 1151 1151 901 901 901 901 901 851 851 101 Table 4 4 Storage Management Messages continued Event ID Description Severity Cause and Action Clear Event Number SMART thermal Ok Normal Cause This alert is for informational purposes None 2262 2263 2264 2265 2266 2267 shutdown is enabled Action None SMART thermal Ok Normal Cause This alert is for informational purposes None shutdown is disabled A device is missing _Warning Cause The controller cannot communicate None Non critical with a device The device may be removed There may also be a bad or loose cable Action None Action Check if the device is in and not removed If it is in check the cables You should also check the connection to the controller battery and the battery health A battery with a weak or depleted charge may cause this alert A device is in an Warning Cause The controller cannot communicate None unknown state Non critical with a device The state of the device cannot be determined There may be a bad or loose cable The system may also be experiencing problems with the application programming interface API There could also be
76. f for example in RPM Off Fan sensor value 2600 Fan sensor value Specifies the type of hardware log for example Log type ESM Specifies the name of the memory bank in the system that generated the message for example Memory device bank location Bank_1 Specifies the location of the memory module in the chassis for example Memory device location DIMM_A Specifies the number of power supply or cooling devices required to achieve full redundancy for example Number of devices required for full redundancy 4 Specifies a list of possible causes for the memory module event for example Possible memory module event cause warning error rate exceeded Single bit Single bit error logging disabled Specifies the type of power supply for example Power Supply type VRM Specifies the status of the previous redundancy message for example Previous redundancy state was Lost Specifies the previous state of the sensor for example Previous state was OK Normal Specifies the status of the processor sensor for example Processor sensor status Configuration error Introduction 11 12 Table 1 2 Event Description Reference continued Description Line Item Explanation Redundancy unit lt Redundancy location in chassis gt Sensor location lt Location in chassis gt Temperature sensor value lt Reading gt Voltage sensor value in Volts lt Reading gt Specifies the l
77. failed 37 52 Processor sensor returned toa normal state 37 52 Processor sensor value unknown 37 52 Processor Status Events 46 processor status messages 46 R r2 generated system messages 55 Rebuild completed with errors 84 Rebuild not possible as SAS SATA is not supported in the same virtual disk 115 Recharge count maximum exceeded 92 Index 125 Redundancy degraded 28 77 Redundancy is offline 27 Redundancy lost 28 78 Redundancy normal 78 Redundancy not applicable 27 48 Redundancy regained 28 Redundancy sensor has failed 27 Redundancy sensor value unknown 27 48 redundancy unit messages 26 redundancy unit sensor 6 S SAS expander error 1 113 SAS port report 1 108 SAS SMP communications error 1 113 SCSI sense data 70 SCSI sense sector reassign 79 See the Readme file for a list of validated controller driver versions 85 sensor AC power cord 7 chassis intrusion 6 current 6 fan 6 fan enclosure 7 hardware log 7 126 Index sensor continued memory prefailure 6 power supply 6 processor 7 37 redundancy unit 6 temperature 6 voltage 6 Server Administrator starting 13 Server Administrator startup complete 13 Service tag changed 83 Single bit ECC error limit exceeded 89 Single bit ECC error 100 Single bit ECC error The DIMM is critically degraded 106 Single bit ECC error The DIMM is critica
78. failed. The sensor location, chassis location, previous state, and current sensor value are provided. Sensor location: <Location in chassis> Chassis location: <Name of chassis> Previous state was: <State> If sensor type is not discrete: Current sensor value (in Amps): <Reading> If sensor type is discrete: Discrete current state: <State>. 1201 Current sensor value unknown. Severity: Information. Cause: A current sensor on the power supply for the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal current sensor value are provided. Sensor location: <Location in chassis> Chassis location: <Name of chassis> Previous state was: <State> If sensor type is not discrete: Current sensor value (in Amps): <Reading> If sensor type is discrete: Discrete current state: <State>. Table 2-5. Current Sensor Messages (continued). Columns: Event ID, Description, Severity, Cause. 1202 Current sensor returned to a normal value. Severity: Information. Sensor location: <Location in chassis> Chassis location: <Name of chassis> Previous state was: <State> If sensor type is not discrete: Current sensor value (in Amps): <Reading> If sensor type is discrete: Discrete current state: <State>. 1203 Current sensor detected a warning value. Severity: Warning.
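The current sensor descriptions above end with one of two lines depending on whether the sensor is discrete. A minimal sketch of that branching follows; the function name `current_sensor_lines`, its parameters, and the exact punctuation of the output lines are assumptions made for illustration, not Server Administrator output.

```python
def current_sensor_lines(sensor_location, chassis_location, previous_state,
                         discrete, value):
    """Build the description lines for a current sensor event (illustrative)."""
    lines = [
        f"Sensor location: {sensor_location}",
        f"Chassis location: {chassis_location}",
        f"Previous state was: {previous_state}",
    ]
    if discrete:
        # Discrete sensors report a state rather than a reading.
        lines.append(f"Discrete current state: {value}")
    else:
        # Non-discrete sensors report a reading in amps.
        lines.append(f"Current sensor value (in Amps): {value}")
    return lines


if __name__ == "__main__":
    for line in current_sensor_lines("PS 1 Current", "Main System Chassis",
                                     "OK (Normal)", False, "4.2"):
        print(line)
```

The location, chassis, and reading values in the usage lines are made-up sample inputs.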
79. firmware may have stopped BGI. Action: None. Severity: Critical / Failure / Error. Cause: BGI of a virtual disk has failed. Action: None. Severity: Ok / Normal. Cause: BGI of a virtual disk has completed. This alert is for informational purposes. Action: None. Clear Event Number and SNMP Trap Numbers, in row order: None, 903; 2130, 1201; None, 1201; None, 1204; Clear event, 1201. 2131 Firmware version mismatch. Severity: Warning / Non-critical. Cause: The firmware on the controller is not a supported version. Action: Install a supported version of the firmware. If you do not have a supported version of the firmware available, it can be downloaded from the Dell support site at support.dell.com. If you do not have a supported version of the firmware available, check with your support provider for information on how to obtain the most current firmware. Clear Event Number: None. SNMP Trap Number: 753. 2132 Driver version mismatch. Severity: Warning / Non-critical. Cause: The controller driver is not a supported version. Action: Install a supported version of the driver. If you do not have a supported driver version available, it can be downloaded from the Dell support site at support.dell.com. If you do not have a supported version of the driver available, check with your support provider for information on how to obtain the most current driver. Clear Event Number: None. SNMP Trap Number: 753.
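Each row of the alert table carries the same five fields: event ID, description, severity, clear event number, and SNMP trap number. A lookup keyed on the event ID is one way to hold them. The sketch below encodes only alerts 2131 and 2132 as given above; the `Alert` type, the `ALERTS` dictionary, and the `describe` function are hypothetical names introduced for illustration.

```python
from typing import NamedTuple, Optional


class Alert(NamedTuple):
    description: str
    severity: str
    clear_event: Optional[int]  # None when the table lists "None"
    snmp_trap: int


# Only the two fully legible rows are included; this is not a complete table.
ALERTS = {
    2131: Alert("Firmware version mismatch", "Warning / Non-critical", None, 753),
    2132: Alert("Driver version mismatch", "Warning / Non-critical", None, 753),
}


def describe(event_id):
    """Return a one-line summary for a known event ID."""
    alert = ALERTS.get(event_id)
    if alert is None:
        return f"{event_id}: not in this sketch's table"
    clear = "None" if alert.clear_event is None else str(alert.clear_event)
    return (f"{event_id}: {alert.description} [{alert.severity}] "
            f"clear={clear} snmp={alert.snmp_trap}")


if __name__ == "__main__":
    print(describe(2131))
    print(describe(2132))
```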
80. formation. Cause: This alert is for informational purposes. Data redundancy has been restored to a virtual disk or an enclosure that previously suffered a loss of redundancy. Action: None. Clear Event Number: Clear event. SNMP Trap Number: 1304. 2126 SCSI sense sector reassign. Severity: Warning / Non-critical. Cause: A sector of the physical disk is corrupted and data cannot be maintained on this portion of the disk. This alert is for informational purposes. NOTICE: Any data residing on the corrupt portion of the disk may be lost and you may need to restore your data from backup. Action: If the physical disk is part of a nonredundant virtual disk, then back up the data and replace the physical disk. NOTICE: Removing a physical disk that is included in a nonredundant virtual disk will cause the virtual disk to fail and may cause data loss. If the disk is part of a redundant virtual disk, then any data residing on the corrupt portion of the disk will be reallocated elsewhere in the virtual disk. 2127 Background initialization (BGI) started. Severity: Ok / Normal. Cause: BGI of a virtual disk has started. This alert is for informational purposes. Action: None. 2128 BGI cancelled. Severity: Ok / Normal. Cause: BGI of a virtual disk has been cancelled. A user or the
81. formation is represented with a percent sign in the Storage Management documentation An example of such an alert is shown for alert 2334 in Table 4 1 Table 4 1 Alert Message Format AlertID Message Text Displayed in the Storage Message Text Displayed in the Alert Log with Variable Management Service Documentation Information Supplied 2127 Background Initialization started Background Initialization started Virtual Disk 3 Virtual Disk 3 Controller 1 PERC 5 E Adapter 2334 Controller event log Controller event log Current capacity of the battery is above threshold Controller 1 PERC 5 E Adapter The variables required to complete the message vary depending on the type of storage object and whether the storage object is in a SCSI or SAS configuration The following table identifies the possible variables used to identify each storage object K NOTE Some alert messages relating to an enclosure or an enclosure component such as a fan or EMM are generated by the controller when the enclosure or enclosure component ID cannot be determined Table 4 2 Message Format with Variables for Each Storage Object Storage Object Message Variables A B C and X Y Z in the following examples are variables representing the storage object name or number Controller Battery SCSI Physical Disk SAS Physical Disk Message Format Controller A Name Message Format Controller A Example 2326 A foreign conf
82. hassis gt Additional details lt Additional details for the events gt Device configuration error Error detected Device location lt Location in chassis gt Chassis location lt Name of chassis gt Additional details lt Additional details for the events gt A pluggable device event message of unknown type was received The device location chassis location and additional event details if available are provided A device was added in the specified system The device location chassis location and additional event details if available are provided A device was removed from the specified system The device location chassis location and additional event details if available are provided A configuration error was detected for a pluggable device in the specified system The device may have been added to the system incorrectly Event Message Reference 39 Battery Sensor Messages Battery sensors monitor how well a battery is functioning Battery messages listed in Table 2 15 provide status and warning information for batteries in a particular chassis Table 2 15 Battery Sensor Messages EventID Description Severity Cause 1700 Battery sensor has failed Information A battery sensor in the Sensor location lt Location in chassis gt specified system is not functioning The sensor Chassis location lt Name of chassis gt location chassis location Previous state was lt State gt prev
83. hysical disk blink has ceased The Clear operation has cancelled The physical disk has been started An enclosure blink operation has initiated An enclosure blink has ceased A global rescan has initiated Severity Cause and Action Ok Normal Cause This alert is for informational purposes Warning Non critical O O O O O O O Action None Cause The controller battery charge is weak Action As the charge weakens the charger should automatically recharge the battery If the battery has reached its recharge limit replace the battery pack Monitor the battery to make sure that it recharges successfully If the battery does not recharge replace the battery pack Normal Cause T Action None Normal Cause T Action None Normal Cause T Action None Normal Cause T Normal Cause This alert is for informational purposes Action None Normal Cause This alert is for informational purposes Action None Normal Cause This alert is for informational purposes Action None his alert is for informational purposes his alert is for informational purposes his alert is for informational purposes his alert is for informational purposes Action None Normal Cause T his alert is for informational purposes Action None OK Normal Cause This alert is for informational purposes Action None Ok Normal Cause This al
84. ialization failed 68 Virtual disk initialization started 66 Virtual disk rebuild completed 69 Virtual disk rebuild failed 69 Virtual disk rebuild started 67 Virtual disk reconfiguration completed 69 Virtual disk reconfiguration failed 68 Virtual disk reconfiguration started 66 Virtual disk renamed 84 voltage sensor 6 Voltage sensor detected a failure value 21 45 Voltage sensor detected a non recoverable value 21 Voltage sensor detected a warning value 20 Voltage Sensor Events 44 Voltage sensor has failed 19 45 voltage sensor messages 19 44 Index 129 Voltage sensor returned to a normal value 20 Voltage sensor value unknown 20 45 130 Index
85. ialization started Physical disk initialization started Virtual disk reconfiguration started Non critical Cause and Action Cause 1 This alert message occurs when a physical disk included in a redundant virtual disk fails Because the virtual disk is redundant uses mirrored or parity information and only one physical disk has failed the virtual disk can be rebuilt Action 1 Configure a hot spare for the virtual disk if one is not already configured Rebuild the virtual disk When using an Expandable RAID Controller PERC PERC 3 SC 3 DCL 3 DC 3 QC 4 SC 4 DC 4e DC 4 Di CERC ATA100 4ch PERC 5 E PERC 5f or a Serial Attache SCSI SAS 5 iR controller rebuild the virtual disk by first configuring a hot spare for the disk and then initiating a write operation to the disk The write operation will initiate a rebuild of the disk Cause 2 A physical disk in the disk group has been removed Action 2 If a physical disk was removed from the disk group either replace the disk or restore the original disk You can identify which disk has been removed by locating the disk that has a red X for its status Perform a rescan after replacing the disk Ok Normal Cause This alert is for informational purposes Action None Ok Normal Cause This alert is for informational purposes Action None Ok Normal Cause This alert is for informational purposes Action None Ok Normal Cause This alert
86. ided Chassis location lt Name of chassis gt Previous state was lt State gt Chassis intrusion state lt Intrusion state gt 1252 Chassis intrusion returned Information A chassis intrusion sensor in the specified to normal Sensor location lt Location in chassis gt Chassis location lt Name of chassis gt Previous state was lt State gt Chassis intrusion state lt Intrusion state gt system detected that a cover was opened while the system was operating but has since been replaced The sensor location chassis location previous state and chassis intrusion state are provided Event Message Reference 25 26 Table 2 6 Chassis Intrusion Messages continued EventID Description Severity Cause 1253 Chassis intrusion in Warning A chassis intrusion sensor in the specified progress system detected that a system cover is currently Says locatione See a ae Te and oe is operating in chassis gt he sensor location chassis location previous state and chassis intrusion state are provided Chassis location lt Name of chassis gt Previous state was lt State gt Chassis intrusion state lt Intrusion state gt 1254 Chassis intrusion detected Error A chassis intrusion sensor in the specified system detected that the system cover was Sensor location lt Location Syste d a that thesystemco TN a ehacsiss opene while the system was operating i The sensor location chassis location previous Chassis locat
87. iguration has been detected Controller 1 PERC 5 E Adapter K NOTE The controller name is not always displayed Message Format Battery X Controller A Example 2174 The controller battery has been removed Battery 0 Controller 1 Message Format Physical Disk X Y Controller A Connector B Example 2049 Physical disk removed Physical Disk 0 14 Controller 1 Connector 0 Message Format Physical Disk X Y Z Controller A Connector B Example 2049 Physical disk removed Physical Disk 0 0 14 Controller 1 Connector 0 Storage Management Message Reference Table 4 2 Message Format with Variables for Each Storage Object continued Storage Object Message Variables A B C and X Y Z in the following examples are variables representing the storage object name or number Virtual Disk Message Format Virtual Disk X Name Controller A Name Message Format Virtual Disk X Controller A Example 2057 Virtual disk degraded Virtual Disk 11 Virtual Disk 11 Controller 1 PERC 5 E Adapter K NOTE The virtual disk and controller names are not always displayed Enclosure Message Format Enclosure X Y Controller A Connector B SCSI Power Supply SAS Power Supply SCSI Temperature Probe SAS Temperature Probe SCSI Fan SAS Fan SCSI EMM Example 2112 Enclosure shutdown Enclosure 0 2 Controller 1 Connector 0 Message Format Power Supply X Controller A Connector B Target ID C where C is the SCSI ID number of the
88. ill be unavailable until the operation completes 115 The virtual disk cache policy has changed 90 The virtual disk Check Consistency has made corrections and completed 90 The virtual disk Read policy has changed 90 The virtual disk reconfiguration has resumed 90 There is a bad sensor on an enclosure 101 There was an unrecoverable disk media error during the rebuild 112 Thermal shutdown protection has been initiated 13 U understanding event description 10 Unsupported configuration detected The SCSI rate of the enclosure management modules EMMs is not the same EMMO0 1 EMM1 2 87 User initiated host system reset 14 V viewing event information 9 event messages 7 events in Red Hat Linux 8 events in SUSE Linux Enterprise Server 8 events in Windows 2000 8 Virtual disk check consistency cancelled 67 Virtual disk check consistency completed 69 Virtual disk check consistency failed 68 Virtual disk check consistency started 66 Virtual disk configuration changed 65 Virtual disk created 65 Virtual disk degraded 66 Virtual disk deleted 65 Virtual disk failed 65 Virtual disk format changed 68 Virtual disk format completed 69 Virtual disk format started 66 Virtual disk has inconsistent data 98 Virtual disk initialization 80 Virtual disk initialization cancelled 67 Virtual disk initialization completed 69 Virtual disk init
89. in the chassis and in any attached systems. Fan Sensor: Monitors fans in various locations in the chassis and in any attached systems. Voltage Sensor: Monitors voltages across critical components in various chassis locations and in any attached systems. Current Sensor: Monitors the current (or amperage) output from the power supply (or supplies) in the chassis and in any attached systems. Chassis Intrusion Sensor: Monitors intrusion into the chassis and any attached systems. Redundancy Unit Sensor: Monitors redundant units (critical units such as fans, AC power cords, or power supplies) within the chassis; also monitors the chassis and any attached systems. For example, redundancy allows a second or nth fan to keep the chassis components at a safe temperature when another fan has failed. Redundancy is normal when the intended number of critical components are operating. Redundancy is degraded when a component fails but others are still operating. Redundancy is lost when there is one less critical redundancy device than required. Power Supply Sensor: Monitors power supplies in the chassis and in any attached systems. Memory Prefailure Sensor: Monitors memory modules by counting the number of Error Correction Code (ECC) memory corrections. Fan Enclosure Sensor: Monitors protective fan enclosures by detecting their removal from and insertion into the system, and by measuring how long a fan enclosure is
90. ing information for voltage sensors for a particular chassis. Table 3-2. Voltage Sensor Events. Columns: Event Message, Severity, Cause. Event Message: <Sensor Name/Location> voltage sensor detected a failure <Reading>, where <Sensor Name/Location> is the entity that this sensor is monitoring. Reading is specified in volts. For example, 3.860 V. Severity: Critical. Cause: The voltage of the monitored device has exceeded the critical threshold. Event Message: <Sensor Name/Location> voltage sensor state asserted. Severity: Critical. Cause: The voltage specified by <Sensor Name/Location> is in critical state. Event Message: <Sensor Name/Location> voltage sensor state de-asserted. Severity: Information. Cause: The voltage of a previously reported <Sensor Name/Location> is returned to normal state. Event Message: <Sensor Name/Location> voltage sensor detected a warning <Reading>. Severity: Warning. Cause: Voltage of the monitored entity <Sensor Name/Location> exceeded the warning threshold. Event Message: <Sensor Name/Location> voltage sensor returned to normal <Reading>. Severity: Information. Cause: The voltage of a previously reported <Sensor Name/Location> is returned to normal state. Fan Sensor Events: The cooling device sensors monitor how well a fan is functioning. These messages provide status, warning, and failure messages for fans for a particular chassis. Table 3-3. Fan Sensor Events. Columns: Event Message, Severity, Cause. <Sensor Name/Location> Fan
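The voltage sensor events listed above pair a fixed message phrase with a severity, so a log entry can be classified by searching for the phrase. The sketch below assumes the log line is available as plain text; the names `VOLTAGE_EVENT_SEVERITY` and `voltage_event_severity` are hypothetical, and the sensor names in the usage lines are invented samples.

```python
# Phrase-to-severity pairs taken from the voltage sensor event descriptions.
VOLTAGE_EVENT_SEVERITY = [
    ("voltage sensor detected a failure", "Critical"),
    ("voltage sensor state asserted", "Critical"),
    ("voltage sensor state de-asserted", "Information"),
    ("voltage sensor detected a warning", "Warning"),
    ("voltage sensor returned to normal", "Information"),
]


def voltage_event_severity(message):
    """Return the severity for a voltage sensor message, or None if unknown."""
    normalized = message.lower()
    # Accept "de asserted" and "deasserted" as spellings of "de-asserted".
    normalized = normalized.replace("de asserted", "de-asserted")
    normalized = normalized.replace("deasserted", "de-asserted")
    for phrase, severity in VOLTAGE_EVENT_SEVERITY:
        if phrase in normalized:
            return severity
    return None


if __name__ == "__main__":
    print(voltage_event_severity("Planar 3.3V voltage sensor detected a failure 3.860 V"))
    print(voltage_event_severity("CPU1 VCORE voltage sensor state de asserted"))
```

Matching on the full phrase keeps "state asserted" and "state de-asserted" apart, since the asserted phrase never appears as a substring of the de-asserted one.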
91. inistrator Messages Reference Guide to reflect the severity displayed in the Server Administrator Alert Log and documented in the Storage Management online help Storage Management Message Reference 61 Table 4 3 Alert Message Change History Alert Message Change History Removed alert 2344 Replaced by Documentation change only made in the Dell alert 2070 OpenManage Server Administrator Messages Reference Guide to reflect existing Storage Management online help Removed alert 2345 Replaced by alert Documentation change only made in the Dell 2079 OpenManage Server Administrator Messages Reference Guide to reflect existing Storage Management online help Storage Management 2 1 Comments Product Versions to Storage Management 2 1 which Changes Server Administrator 2 4 Apply ji Dell OpenManage 5 1 New Alerts 2062 see note The alert numbers for the new alerts 2173 2062 2260 were previously unassigned 2195 Alert numbers 2370 and 2371 are new 2196 NOTE Alerts 2062 and 2260 were previously undocumented in the Storage Management 2212 online help Dell OpenManage Server 2213 Administrator Storage Management User s Guide and the Dell OpenManage Server 2214 Administrator Messages Reference Guide 2215 2260 see note 2370 2371 Modified Alerts 2049 2050 2051 2052 2065 2074 2080 The term array disk has been changed to 2083 2089 2092 2141 2158 2249 2251 physical disk throughout Storage 2252
92. ion: <Name of chassis> Previous state was: <State> Chassis intrusion state: <Intrusion state>. Cause (continued): state and chassis intrusion state are provided. 1255 Chassis intrusion sensor detected a non-recoverable value. Sensor location: <Location in chassis> Chassis location: <Name of chassis> Previous state was: <State> Chassis intrusion state: <Intrusion state>. Severity: Error. Cause: A chassis intrusion sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and chassis intrusion state are provided. Redundancy Unit Messages: Redundancy means that a system chassis has more than one of certain critical components. Fans and power supplies, for example, are so important for preventing damage or disruption of a computer system that a chassis may have extra fans or power supplies installed. Redundancy allows a second or nth fan to keep the chassis components at a safe temperature when the primary fan has failed. Redundancy is normal when the intended number of critical components are operating. Redundancy is degraded when a component fails but others are still operating. Redundancy is lost when the number of components functioning falls below the redundancy threshold. Table 2-7 lists the redundancy unit messages. The number of devices required for full redundancy is provided as part of the message when app
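The three redundancy states described above can be read as a comparison of device counts. The following is a minimal sketch under stated assumptions: `intended` is the number of critical components installed, `required` is the redundancy threshold, and `required` is no greater than `intended`. The function name `redundancy_state` is hypothetical and is not how Server Administrator computes the state.

```python
def redundancy_state(operating, intended, required):
    """Classify redundancy from device counts (illustrative only).

    operating: number of critical components currently functioning
    intended:  number of components that should be operating
    required:  assumed redundancy threshold (required <= intended)
    """
    if operating < required:
        # Fewer functioning components than the threshold.
        return "lost"
    if operating < intended:
        # A component has failed but others are still operating.
        return "degraded"
    return "normal"


if __name__ == "__main__":
    for count in (4, 3, 2):
        print(count, redundancy_state(count, intended=4, required=3))
```

With four fans intended and a threshold of three, one failure is reported as degraded and a second failure as lost.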
93. ion None Storage Management Message Reference Table 4 4 Storage Management Messages continued Event Description Severity Cause and Action Clear ID Event Number 2053 Virtual disk created Ok Normal Cause This alert is for informational purposes None Action None 2054 Virtual disk deleted Warning Cause A virtual disk has been deleted None Non critical Performing a Reset Configuration may detect that a virtual disk has been deleted and generate this alert Action None 2055 Virtual disk Ok Normal Cause This alert is for informational purposes None configuration Acton N ne changed 2056 Virtual disk failed Critical Cause One or more physical disks included None Failure in the virtual disk have failed If the virtual Error disk is non redundant does not use mirrored or parity data then the failure of a single physical disk can cause the virtual disk to fail If the virtual disk is redundant then more physical disks have failed than can be rebuilt using mirrored or parity information Action Create a new virtual disk and restore from a backup SNMP Trap Numbers 1201 1203 1201 1204 Storage Management Message Reference 65 Table 4 4 Storage Management Messages continued Event ID Description Severity 2057 Virtual disk degraded Warning 2058 2059 2061 2062 2063 66 Virtual disk check consistency started Virtual disk format started Virtual disk init
94. ious state, and battery sensor status are provided. Battery sensor status: <status>. 1701 Battery sensor value unknown. Sensor Location: <Location in chassis> Chassis Location: <Name of chassis> Previous state was: <State> Battery sensor status: <status>. Severity: Information. Cause: A battery sensor in the specified system could not retrieve a reading. The sensor location, chassis location, previous state, and battery sensor status are provided. 1702 Battery sensor returned to a normal value. Sensor Location: <Location in chassis> Chassis Location: <Name of chassis> Previous state was: <State> Battery sensor status: <status>. Severity: Information. Cause: A battery sensor in the specified system detected that a battery transitioned back to a normal state. The sensor location, chassis location, previous state, and battery sensor status are provided. 1703 Battery sensor detected a warning value. Sensor Location: <Location in chassis> Chassis Location: <Name of chassis> Previous state was: <State> Battery sensor status: <status>. Severity: Warning. Cause: A battery sensor in the specified system detected that a battery is in a predictive failure state. The sensor location, chassis location, previous state, and battery sensor status are provided.
95. is for informational purposes Normal Cause This alert is for informational purposes Normal Cause This alert is for informational purposes Normal Cause This alert is for informational purposes Normal Cause This alert is for informational purposes Normal Cause This alert is for informational purposes Normal Cause This alert is for informational purposes Normal Cause This alert is for informational purposes Normal Cause This alert is for informational purposes Normal Cause This alert is for informational purposes Normal Cause This alert is for informational purposes Normal Cause This alert is for informational purposes Clear Event Number None None None None None None None None None None None 2243 Clear event None SNMP Trap Numbers 1151 1151 751 751 751 751 751 751 751 751 751 751 751 1201 Storage Management Message Reference 93 94 Table 4 4 Storage Management Messages continued Event ID 2245 2246 2247 2248 2249 2251 2252 2254 2255 2259 2260 2261 Description A virtual disk blink has ceased The controller battery is degraded The controller battery is charging The controller battery is executing a Learn cycle The physical disk Clear operation has started The physical disk blink has initiated The p
96. is for informational purposes Action None Ok Normal Cause This alert is for informational purposes Action None Storage Management Message Reference Clear SNMP Event Trap Number Numbers None 1203 2085 1201 2086 1201 2088 1201 2089 901 2090 1201 Table 4 4 Storage Management Messages continued Event Description ID 2064 started 2065 started 2067 Virtual disk check consistency cancelled 2070 Virtual disk initialization cancelled Severity Cause and Action Virtual disk rebuild Ok Normal Cause This alert is for informational purposes Action None Physical disk rebuild Ok Normal Cause This alert is for informational purposes Action None Ok Normal Cause The check consistency operation cancelled because a physical disk in the array has failed or because a user cancelled the check consistency operation Action If the physical disk failed then replace the physical disk You can identify which disk failed by locating the disk that has a red X for its status Perform a rescan after replacing the disk When performing a consistency check be aware that the consistency check can take a long time The time it takes depends on the size of the physical disk or the virtual disk Ok Normal Cause The virtual disk initialization cancelled because a physical disk included in the virtual disk has failed or because a user cancelled the virtual disk initialization Acti
97. is not large None enough to protect all virtual disks that reside on the disk group Action Assign a larger disk as the dedicated hot spare Cause The global hot spare is not large enough None to protect all virtual disks that reside on the controller Action Assign a larger disk as the global hot spare 904 904 901 903 903 Storage Management Message Reference 97 98 Table 4 4 Storage Management Messages continued Event ID 2278 2279 2280 2281 Description The controller battery charge level is below a normal threshold The controller battery charge level is operating within normal limits A disk media error has been corrected Virtual disk has inconsistent data Severity Ok Normal Cause The battery is discharging A battery Ok Normal Cause This alert is provided for Ok Normal Cause A disk media error was detected Ok Normal Cause This alert is for informational purposes None Clear SNMP Event Trap Number Numbers 1154 Cause and Action None discharge is a normal activity during the battery Learn cycle Before completing the battery Learn cycle recharges the battery You should receive alert 2179 when the recharge occurs Action Check if the battery Learn cycle is in progress Alert 2176 indicates that the battery Learn cycle has initiated The battery also displays the Learn state while the Learn cycle is in progress If a Learn cycle is not i
98. l This event is generated when memory RAID is no longer redundant. Event Message: redundancy ese
Event Message: Err Reg Pointer OEM Diagnostic data event was asserted. Severity: Information. Cause: This event is generated when an OEM event occurs.
Event Message: System Board PFault Fail Safe state asserted. Severity: Critical. Cause: This event is generated when the system board voltages are not at normal levels.
Event Message: System Board PFault Fail Safe state deasserted. Severity: Information. Cause: This event is generated when earlier PFault Fail Safe system voltages return to a normal level.
Event Message: Memory Add BANK DIMM presence was asserted. Severity: Information. Cause: This event is generated when memory is added to the system.
Table 3-12. BIOS Generated System Events (continued)
Event Message: Memory Removed BANK DIMM presence was asserted. Severity: Information. Cause: This event is generated when memory is removed from the system.
Event Message: Memory Cfg Err configuration error BANK DIMM was asserted. Severity: Critical. Cause: This event is generated when memory configuration is incorrect for the system.
Event Message: Mem Redun Gain redundancy regained. Severity: Information. Cause: This event is generated when memory redundancy is regained.
Event Message: Mem ECC Warning transition to non-critical from OK. Severity: Warning. Cause: This event is generated when correctable ECC errors have increased from a normal rate.
Event Message: Mem ECC Warning transition to critical from less severe. Severity: Critical. Cause: This event is generated when correctable ECC errors reach a critical rate.
Me
99. lation and that the room temperature is not too hot. See the physical disk enclosure documentation for more diagnostic information. Action 2: If you cannot identify why the disk has reached an unacceptable temperature, then replace the disk. If the physical disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. NOTICE: Removing a physical disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss. Clear Event Number: None. SNMP Trap Numbers: 903.
Table 4-4. Storage Management Messages (continued)
Event ID 2110. Description: SMART warning degraded. Severity: Warning / Non-critical. Cause: A disk is degraded and has received a SMART alert (predictive failure). The disk is likely to fail in the near future. Action: Replace the disk that has received the SMART alert. If the physical disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. NOTICE: Removing a physical disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss. Clear Event Number: None. SNMP Trap Numbers: 903.
Event ID 2112. Description: Enclosure was shut down. Severity: Critical / Failure / Error.
Event ID 2111. Description: Failure prediction threshold exceeded due to test. No action needed. Severity: Warning / Non-critical. Cause: A disk has receive
100. licable for the redundancy unit and the platform. For details on redundancy computation, see the respective platform documentation.
Table 2-7. Redundancy Unit Messages
Event ID 1300. Description: Redundancy sensor has failed. Redundancy unit: <Redundancy location in chassis>; Chassis location: <Name of chassis>; Previous redundancy state was: <State>. Severity: Information. Cause: A redundancy sensor in the specified system failed. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.
Event ID 1301. Description: Redundancy sensor value unknown. Redundancy unit: <Redundancy location in chassis>; Chassis location: <Name of chassis>; Previous redundancy state was: <State>. Severity: Information. Cause: A redundancy sensor in the specified system could not obtain a reading. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.
Event ID 1302. Description: Redundancy not applicable. Redundancy unit: <Redundancy location in chassis>; Chassis location: <Name of chassis>; Previous redundancy state was: <State>. Severity: Information. Cause: A redundancy sensor in the specified system detected that a unit was not redundant. The redundancy location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.
Event ID 1303. Description: Redundancy is offline. Severity: Information. Cause: A redundancy sensor in the s
101. lly degraded. There will be no further reporting, 107
Single bit ECC error. The DIMM is degrading, 106
Smart configuration change, 72
Smart FPT exceeded, 72
SMART thermal shutdown is disabled, 95
SMART thermal shutdown is enabled, 95
Smart warning, 73
Smart warning degraded, 75
Smart warning temperature, 74
SMBIOS data is absent, 14
System Event Log Messages, 43
system management data manager started, 14
system management data manager stopped, 14
T
Temperature dropped below the minimum failure threshold, 71
Temperature dropped below the minimum warning threshold, 71
Temperature exceeded the maximum failure threshold, 71
Temperature exceeded the maximum warning threshold, 71
temperature sensor, 6
Temperature sensor detected a failure value, 17
Temperature sensor detected a non-recoverable value, 17
Temperature sensor detected a warning value, 16
Temperature Sensor Events, 43
Temperature sensor has failed, 15, 43
temperature sensor messages, 15, 43
Temperature sensor returned to a normal value, 16, 43
Temperature sensor value unknown, 15, 43
The AC power supply cable has been removed, 107
The background initialization (BGI) rate has changed, 93
The battery charge cycle is complete, 113
The BGI completed with uncorrectable errors, 110
The Check Consistency found inconsistent parity data. Data redundancy may be lost, 111
The Check Con
102. m CRC Err transition to non-recoverable. Severity: Critical. Cause: This event is generated when CRC errors enter a non-recoverable state.
Event Message: Mem Fatal SB CRC uncorrectable ECC was asserted. Severity: Critical. Cause: This event is generated when CRC errors occur while storing to memory.
Event Message: Mem Fatal NB CRC uncorrectable ECC was asserted. Severity: Critical. Cause: This event is generated when CRC errors occur while removing from memory.
Event Message: Mem Overtemp critical over temperature was asserted. Severity: Critical. Cause: This event is generated when system memory reaches critical temperature.
Event Message: USB Over current transition to non-recoverable. Severity: Critical. Cause: This event is generated when the USB exceeds a predefined current level.
Event Message: Hdwr version err hardware incompatibility BMC Firmware and CPU mismatch was asserted. Severity: Critical. Cause: This event is generated when there is a mismatch between the BMC firmware and the processor in use or vice versa.
Table 3-12. BIOS Generated System Events (continued)
Event Message: Hdwr version err hardware incompatibility BMC Firmware and CPU mismatch was deasserted. Severity: Information. Cause: This event is generated when the earlier mismatch between the BMC firmware and the processor is corrected.
Event Message: Hdwr version err hardware incompatibility BMC Firmware and other TETELE mismatch was asserted. Severity: Critical. Cause: This event is generated when there is a mismatch between the BMC firmware and the processor in use or
Event Message: Hdwr version err
103. mation. Manager Started. Cause: Systems Management Data Manager services were started.
Event ID 1009. Description: Systems Management Data Manager Stopped. Severity: Information. Cause: Systems Management Data Manager services were stopped.
Event ID 1011. Description: RCI table is corrupt. Severity: Warning. Cause: This message is generated when the BIOS Remote Configuration Interface (RCI) table is corrupted or cannot be read by the systems management software.
Event ID 1012. Description: IPMI Status. Interface: <the IPMI interface being used>, <additional information if available and applicable>. Severity: Information. Cause: This message is generated to indicate the Intelligent Platform Management Interface (IPMI) status of the system. Additional information, when available, includes Baseboard Management Controller (BMC) not present, BMC not responding, System Event Log (SEL) not present, and Sensor Data Record (SDR) not present.
Temperature Sensor Messages
Temperature sensors listed in Table 2-2 help protect critical components by alerting the systems management console when temperatures become too high inside a chassis. The temperature sensor messages use additional variables: sensor location, chassis location, previous state, and temperature sensor value or state.
Table 2-2. Temperature Sensor Messages
Event ID 1051.
Event ID 1050. Description: Temperature sensor has failed. Severity: Information. Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>
104. message, for example: Chassis location: Main System Chassis
Specifies the type of configuration error that occurred, for example: Configuration error type: Revision mismatch
Specifies the current sensor value in amps, for example: Current sensor value (in Amps): 7.853
Specifies the date and time the action was performed, for example: Date and time of action: Sat Jun 12 16:20:33 2004
Specifies the location of the device in the specified chassis, for example: Device location: Memory Card A
Specifies the state of the current sensor, for example: Discrete current state: Good
Specifies the state of the temperature sensor, for example: Discrete temperature state: Good
Table 1-2. Event Description Reference (continued)
Description Line Items: Discrete voltage state: <State>; Fan sensor value: <Reading>; Log type: <Log type>; Memory device bank location: <Bank name in chassis>; Memory device location: <Device name in chassis>; Number of devices required for full redundancy: <Number>; Possible memory module event cause: <list of causes>; Power Supply type: <type of power supply>; Previous redundancy state was: <State>; Previous state was: <State>; Processor sensor status: <status>
Explanations: Specifies the state of the voltage sensor, for example: Discrete voltage state: Good. Specifies the fan speed in revolutions per minute (RPM) or On Of
105. n checking the cables. You should also check to see if the enclosure has degraded or failed components. To do so, select the enclosure object in the tree view and click the Health subtab. The Health subtab displays the status of the enclosure components. Verify that the controller has supported driver and firmware versions installed and that the EMMs are each running the same version of supported firmware.
Description: Enclosure alarm enabled. Severity: Ok / Normal. Cause: This alert is for informational purposes. A user has enabled the enclosure alarm. Action: None. Clear Event Number: None. SNMP Trap Numbers: 851.
Description: Enclosure alarm disabled. Severity: Ok / Normal. Cause: A user has disabled the enclosure alarm. Action: None. Clear Event Number: None. SNMP Trap Numbers: 851.
Description: Dead disk segments restored. Severity: Ok / Normal. Cause: This alert is for informational purposes. Disk space that was formerly dead or inaccessible to a redundant virtual disk has been restored. Action: None. Clear Event Number: None. SNMP Trap Numbers: 1201.
Table 4-4. Storage Management Messages (continued)
Event IDs and descriptions: 2141, Physical disk dead segments recovered; 2142, Controller rebuild rate has changed; 2143, Controller alarm enabled; 2144, Controller alarm disabled; 2145, Controller battery low; 2146, Bad block replacement error; 2147, Bad block sense error.
Event ID 2141. Severity: Ok / Normal. Cause: This alert is for informational purposes. Portions of the physical disk were formerly inaccessible. The disk sp
106. n progress, replace the battery pack.
informational purposes. This alert indicates that the battery is recharging during the battery Learn cycle. Action: None. Clear Event Number: None. SNMP Trap Numbers: 1151.
while the controller was completing a background task. A bad disk block was identified. The disk block has been remapped. Action: Consider replacing the disk. If you receive this alert frequently, be sure to replace the disk. You should also routinely back up your data. Clear Event Number: None. SNMP Trap Numbers: 1201.
Action: None. SNMP Trap Numbers: 1201.
Table 4-4. Storage Management Messages (continued)
Event IDs and descriptions: 2282, Hot spare SMART polling failed; 2283, A redundant path is broken; 2284, A redundant path has been restored; 2285, A disk media error was corrected during recovery; 2286, A Learn cycle start is pending while the battery charges; 2287, The Patrol Read is paused; 2288, The patrol read has resumed.
Event ID 2282. Severity: Critical / Failure / Error. Cause: The controller firmware attempted a SMART polling on the hot spare but was unable to complete it. The controller has lost communication with the hot spare. Action: Check the health of the disk assigned as a hot spare. You may need to replace the disk and reassign the hot spare. Make sure the cables are attached securely. See the Cables Attached Correctly section in the Dell OpenManage Server Administrator Storage Management User's Guide for m
107. ncy sensor in the specified system detected that one of the components of the redundant unit has been disconnected, has failed, or is not present. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided. Severity: depending on the number of units that are functional. Redundancy unit: <Redundancy location in chassis>; Chassis location: <Name of chassis>; Previous redundancy state was: <State>
Power Supply Messages
Power supply sensors monitor how well a power supply is functioning. Power supply messages listed in Table 2-8 provide status and warning information for power supplies present in a particular chassis.
Table 2-8. Power Supply Messages
Event ID 1350. Description: Power supply sensor has failed. Severity: Information. Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Power Supply type: <type of power supply>; <Additional power supply status information>; If in configuration error state: Configuration error type: <type of configuration error>
Event ID 1351. Description: Power supply sensor value unknown. Severity: Information. Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Power Supply type: <type of power supply>; <Additional power supply st
108. nd a nominal fan sensor value are provided. Chassis location: <Name of chassis>; Previous state was: <State>; Fan sensor value: <Reading>
Event ID 1102. Description: Fan sensor returned to a normal value. Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Fan sensor value: <Reading>. Severity: Information. Cause: A fan sensor reading on the specified system returned to a valid range after crossing a warning threshold. The sensor location, chassis location, previous state, and fan sensor value are provided.
Event ID 1103. Description: Fan sensor detected a warning value. Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Fan sensor value: <Reading>. Severity: Warning. Cause: A fan sensor reading in the specified system exceeded a warning threshold. The sensor location, chassis location, previous state, and fan sensor value are provided.
Table 2-3. Cooling Device Messages (continued)
Event ID 1104. Description: Fan sensor detected a failure value. Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Fan sensor value: <Reading>. Severity: Error. Cause: A fan sensor in the specified system detected the failure of one or more fans. The sensor location, chassis location, previous state, and fan sensor value are provided.
Event ID 1105. Description: Fan sensor detected a non-recoverable value. Severity: Error. Cause: A fan sensor detected an error from which it
Sensor loc
109. ng / Non-critical; Warning / Non-critical; Warning / Non-critical; Ok / Normal; Warning / Non-critical
because the virtual disk it was assigned to has been deleted. Action: None. Clear Event Number: None. SNMP Trap Numbers: 901.
Cause: The only physical disk available to be assigned as a hot spare is using SATA technology. The physical disks in the virtual disk are using SAS technology. Because of this difference in technology, the hot spare cannot rebuild data if one of the physical disks in the virtual disk fails. Action: Add a SAS disk that is large enough to be used as the hot spare and assign the new disk as a hot spare. Clear Event Number: None. SNMP Trap Numbers: 903.
Cause: The only physical disk available to be assigned as a hot spare is using SAS technology. The physical disks in the virtual disk are using SATA technology. Because of this difference in technology, the hot spare cannot rebuild data if one of the physical disks in the virtual disk fails. Action: Add a SATA disk that is large enough to be used as the hot spare and assign the new disk as a hot spare. Clear Event Number: None. SNMP Trap Numbers: 903.
Cause: The physical disk may not have a supported version of the firmware, or the disk may not be supported by Dell. Action: If the disk is supported by Dell, update the firmware to a supported version. If the disk is not supported by Dell, replace the disk with one that is supported. Clear Event Number: None. SNMP Trap Numbers: 903.
SNMP Trap Numbers: 1151. Cause: This alert is for informational pu
110. nnot replace SATA disks, 92
The only hot spare available is a SATA disk. SATA disks cannot replace SAS disks, 92
The Patrol Read corrected a media error, 96
The patrol read has resumed, 99
The Patrol Read has started, 93
The Patrol Read has stopped, 93
The Patrol Read is paused, 99
The Patrol Read mode has changed, 93
The Patrol Read rate has changed, 93
The physical disk blink has ceased, 94
The physical disk blink has initiated, 94
The physical disk Clear operation failed, 96
The physical disk Clear operation has completed, 96
The physical disk Clear operation has started, 94
The physical disk has been started, 94
The physical disk is not certified, 113
The physical disk is not supported, 92
The physical disk is too small to be used for a rebuild, 103
The physical disk rebuild has resumed, 97
The power supply cable has been inserted, 107
The power supply is switched on, 107
The RAID controller firmware and driver validation was not performed. The configuration file cannot be opened, 85
The RAID controller firmware and driver validation was not performed. The configuration file is out of date or corrupted, 85
The rebuild failed due to errors on the source physical disk, 111
The rebuild failed due to errors on the target physical disk, 112
The SCSI Enclosure Processor (SEP) has been rebooted as part of the firmware download operation and w
111. normally except for a multibit error. Replace the memory module identified in the message during the system's next scheduled maintenance. Clear the memory error on multibit ECC error. The memory device status and location are provided.
Fan Enclosure Messages
Some systems are equipped with a protective enclosure for fans. Fan enclosure messages listed in Table 2-10 monitor whether foreign objects are present in an enclosure and how long a fan enclosure is missing from a chassis.
Table 2-10. Fan Enclosure Messages
Event ID 1450. Description: Fan enclosure sensor has failed. Sensor location: <Location in chassis>; Chassis location: <Name of chassis>. Severity: Information. Cause: The fan enclosure sensor in the specified system failed. The sensor location and chassis location are provided.
Event ID 1451. Description: Fan enclosure sensor value unknown. Sensor location: <Location in chassis>; Chassis location: <Name of chassis>. Severity: Information. Cause: The fan enclosure sensor in the specified system could not obtain a reading. The sensor location and chassis location are provided.
Event ID 1452. Description: Fan enclosure inserted into system. Sensor location: <Location in chassis>; Chassis location: <Name of chassis>. Severity: Information. Cause: A fan enclosure has been inserted into the specified system. The sensor location and chassis location are provided.
112. nse error; Bad block extended medium error; Asset tag changed; Asset name changed; Service tag changed; Maximum temperature probe warning threshold value changed
Severity, Cause and Action (in row order):
Severity: Warning / Non-critical. Cause: A portion of a physical disk is damaged. Action: See the Dell OpenManage Server Administrator Storage Management online help for more information.
Severity: Warning / Non-critical. Cause: A portion of a physical disk is damaged. Action: See the Dell OpenManage Server Administrator Storage Management online help for more information.
Severity: Warning / Non-critical. Cause: A portion of a physical disk is damaged. Action: See the Dell OpenManage Server Administrator Storage Management online help for more information.
Severity: Ok / Normal. Cause: This alert is for informational purposes. A user has changed the enclosure asset tag. Action: None.
Severity: Ok / Normal. Cause: This alert is for informational purposes. A user has changed the enclosure asset name. Action: None.
Severity: Ok / Normal. Cause: An enclosure service tag was changed. In most circumstances, this service tag should only be changed by Dell support or your service provider. Action: Ensure that the tag was changed under authorized circumstances.
Severity: Ok / Normal. Cause: This alert is for informational purposes. A user has changed the value for the maximum temperature probe warning threshold. Action: None.
Clear Event Number: None, None, None, None, None, None
113. nsor Location: <Location in chassis>; Chassis Location: <Name of chassis>; Previous state was: <State>; Processor sensor status: <status>. Cause: could not obtain a reading. The sensor location, chassis location, previous state, and processor sensor status are provided.
Event ID 1602. Description: Processor sensor returned to a normal value. Sensor Location: <Location in chassis>; Chassis Location: <Name of chassis>; Previous state was: <State>; Processor sensor status: <status>. Severity: Information. Cause: A processor sensor in the specified system transitioned back to a normal state. The sensor location, chassis location, previous state, and processor sensor status are provided.
Table 2-13. Processor Sensor Messages (continued)
Event ID 1603. Description: Processor sensor detected a warning value. Sensor Location: <Location in chassis>; Chassis Location: <Name of chassis>; Previous state was: <State>; Processor sensor status: <status>. Severity: Warning. Cause: A processor sensor in the specified system is in a throttled state. The sensor location, chassis location, previous state, and processor sensor status are provided.
Event ID 1604. Description: Processor sensor detected a failure value. Severity: Error. Cause: A processor sensor in the specified system is disabled, has a configuration error, or experienced a thermal trip. The sensor
Sensor Location: <Location in
114. ocation of the redundant power supply or cooling unit in the chassis, for example: Redundancy unit: Fan Enclosure
Specifies the location of the sensor in the specified chassis, for example: Sensor location: CPU1
Specifies the temperature in degrees Celsius, for example: Temperature sensor value (in degrees Celsius): 30
Specifies the voltage sensor value in volts, for example: Voltage sensor value (in Volts): 1.693
Event Message Reference
The following tables list, in numerical order, each event ID and its corresponding description, along with its severity and cause.
NOTE: For corrective actions, see the appropriate documentation.
Miscellaneous Messages
Miscellaneous messages in Table 2-1 indicate that certain alert systems are up and working.
Table 2-1. Miscellaneous Messages
Event ID 0000. Description: Log was cleared. Severity: Information. Cause: User cleared the log from Server Administrator.
Event ID 0001. Description: Log backup created. Severity: Information. Cause: The log was full, copied to backup, and cleared.
Event ID 1000. Description: Server Administrator starting. Severity: Information. Cause: Server Administrator is beginning to initialize.
Event ID 1001. Description: Server Administrator startup complete. Severity: Information. Cause: Server Administrator completed its initialization.
Event ID 1002. Description: A system BIOS update has been scheduled for the next reboot. Severity: Information. Cause: The user has chosen to update the flash basic input/output system (BIOS).
Event ID 1003. Description: A previously scheduled system. Severity: Information. Cause: The user decides to c
115. on ID 2108. Description: Smart warning. Severity: Warning / Non-critical. Cause: A disk has received a SMART alert (predictive failure). The disk is likely to fail in the near future. Action: Replace the disk that has received the SMART alert. If the physical disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. NOTICE: Removing a physical disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss. Clear Event Number: None. SNMP Trap Numbers: 903.
Table 4-4. Storage Management Messages (continued)
Event ID 2109. Description: SMART warning temperature. Severity: Warning / Non-critical. Cause: A disk has reached an unacceptable temperature and received a SMART alert (predictive failure). The disk is likely to fail in the near future. Action 1: Determine why the physical disk has reached an unacceptable temperature. A variety of factors can cause the excessive temperature. For example, a fan may have failed, the thermostat may be set too high, or the room temperature may be too hot or cold. Verify that the fans in the server or enclosure are working. If the physical disk is in an enclosure, you should check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough venti
116. on: If a physical disk failed, then replace the physical disk. You can identify which disk has failed by locating the disk that has a red X for its status. Perform a rescan after replacing the disk. Restart the format physical disk operation. Restart the virtual disk initialization.
Event ID 2074. Description: Physical disk rebuild cancelled. Severity: Ok / Normal. Cause: A user has cancelled the rebuild operation. Action: Restart the rebuild operation.
Clear Event Number, SNMP Trap Numbers: 2091, 1201; 2092, 901; None, 1201; None, 1201; None, 901.
Table 4-4. Storage Management Messages (continued)
Event IDs and descriptions: 2076, Virtual disk check consistency failed; 2077, Virtual disk format failed; 2079, Virtual disk initialization failed; 2080, Physical disk initialize failed; 2081, Virtual disk reconfiguration failed. Severity (all five): Critical / Failure / Error.
Event ID 2076. Clear Event Number: None. Cause: A physical disk included in the virtual disk failed, or there is an error in the parity information. A failed physical disk can cause errors in parity information. Action: Replace the failed physical disk. You can identify which disk has failed by locating the disk that has a red X for its status. Rebuild the physical disk. When finish
117. ondition was corrected.
Entity Presence Events
The entity presence messages are used for detecting different hardware devices.
Table 3-16. Entity Presence Events
Description: <Device Name> presence was asserted. Severity: Information. Cause: This event is generated when the device was detected.
Description: <Device Name> absent was asserted. Severity: Critical. Cause: This event is generated when the device was not detected.
Storage Management Message Reference
The Dell OpenManage Server Administrator Storage Management's alert or event management features let you monitor the health of storage resources such as controllers, enclosures, physical disks, and virtual disks.
Alert Monitoring and Logging
The Storage Management Service performs alert monitoring and logging. By default, the Storage Management Service starts when the managed system starts up. If you stop the Storage Management Service, the alert monitoring and logging stops. Alert monitoring does the following:
• Updates the status of the storage object that generated the alert.
• Propagates the storage object's status to all the related higher objects in the storage hierarchy. For example, the status of a lower-level object will be propagated up to the status displayed on the Health tab for the top-level storage object.
• Logs an alert in the Alert log and the operating system
118. onger near or at its capacity, usually as the result of clearing the log. The log type information is provided. Log type: <Log type>
Event ID 1553. Description: Log size is near or at capacity. Log type: <Log type>. Severity: Warning. Cause: The size of a hardware log on the specified system is near or at the capacity of the hardware log. The log type information is provided.
Event ID 1554. Description: Log size is full. Log type: <Log type>. Severity: Error. Cause: The size of a hardware log on the specified system is full. The log type information is provided.
Event ID 1555. Description: Log sensor has failed. Log type: <Log type>. Severity: Error. Cause: A hardware log sensor in the specified system failed. The hardware log status cannot be monitored. The log type information is provided.
Processor Sensor Messages
Processor sensors monitor how well a processor is functioning. Processor messages listed in Table 2-13 provide status and warning information for processors in a particular chassis.
Table 2-13. Processor Sensor Messages
Event ID 1600. Description: Processor sensor has failed. Sensor Location: <Location in chassis>; Chassis Location: <Name of chassis>; Previous state was: <State>; Processor sensor status: <status>. Severity: Information. Cause: A processor sensor in the specified system is not functioning. The sensor location, chassis location, previous state, and processor sensor status are provided.
Event ID 1601. Description: Processor sensor value unknown. Severity: Information. Cause: A processor sensor in the specified system
Se
119. ore information on checking the cables.
Severity: Warning / Non-critical. Cause: The controller has two connectors that are connected to the same enclosure. The communication path on one connector has lost connection with the enclosure. The communication path on the other connector is reporting this loss. Action: Make sure the cables are attached securely. Make sure both EMMs are healthy.
Severity: Ok / Normal. Cause: This alert is for informational purposes. Action: None.
Severity: Ok / Normal. Cause: This alert is for informational purposes. Action: None.
Severity: Ok / Normal. Cause: This alert is for informational purposes. Action: None.
Severity: Ok / Normal. Cause: This alert is for informational purposes. Action: None.
Severity: Ok / Normal. Cause: This alert is for informational purposes. Action: None.
Clear Event Number (in row order): None; 2284; Clear event; None; None; 2288; Clear event.
SNMP Trap Numbers (in row order): 904; 903; 901; 901; 1151; 751; 751.
Table 4-4. Storage Management Messages (continued)
Event ID 2289. Description: Multi-bit ECC error. Severity: Critical / Failure / Error. Clear Event Number: None. SNMP Trap Numbers: 754. Cause: An error involving multiple bits has been encountered during a read or write operation. The error correction algorithm recalculates parity data during read and write operations. If an error involves only a single bit, it may be p
120. ormal state. Cause: that failed or removed was replaced and the state has returned to normal.
Event Message: <Entity Name> PS Redundancy sensor redundancy degraded. Severity: Information. Cause: Power supply redundancy is degraded if one of the power supply sources is removed or failed.
Event Message: <Entity Name> PS Redundancy sensor redundancy lost. Severity: Critical. Cause: Power supply redundancy is lost if only one power supply is functional.
Event Message: <Entity Name> PS Redundancy sensor redundancy regained. Severity: Information. Cause: This event is generated if the power supply has been reconnected or replaced.
Event Message: <Power Supply Sensor Name> predictive failure was asserted. Severity: Warning. Cause: This event is generated when the power supply is about to fail.
Event Message: <Power Supply Sensor Name> input lost was asserted. Severity: Critical. Cause: This event is generated when the power supply is unplugged.
Event Message: <Power Supply Sensor Name> predictive failure was deasserted. Severity: Information. Cause: This event is generated when the power supply has recovered from an earlier predictive failure event.
Event Message: <Power Supply Sensor Name> input lost was deasserted. Severity: Information. Cause: This event is generated when the power supply is plugged in.
Memory ECC Events
The memory ECC event messages monitor the memory modules in a system. These messages monitor the ECC memory correction rate and the type of memory events that occurred.
Table 3-6. Memory ECC Events
Event Message: ECC error correction detected
121. ossible for the error correction algorithm to correct the error and maintain parity data. An error involving multiple bits, however, usually indicates data loss. In some cases, if the multi-bit error occurs during a read operation, the data on the disk may be correct/valid. If the multi-bit error occurs during a write operation, data loss has occurred. Action: Replace the dual in-line memory module (DIMM). The DIMM is a part of the controller battery pack. See your hardware documentation for information on replacing the DIMM. You may need to restore data from backup.
Event ID 2290. Description: Single-bit ECC error. Severity: Warning / Non-critical. Cause: An error involving a single bit has been encountered during a read or write operation. The error correction algorithm has corrected this error. Action: None. Clear Event Number: None. SNMP Trap Numbers: 753.
Event ID 2291. Description: An EMM has been discovered. Severity: Ok / Normal. Cause: This alert is for informational purposes. Action: None. Clear Event Number: None. SNMP Trap Numbers: 851.
Event ID 2292. Description: Communication with the enclosure has been lost. Severity: Critical / Failure / Error. Cause: The controller has lost communication with an EMM. The cables may be loose or defective. Action: Make sure the cables are attached securely. Reboot the system. Clear Event Number: 2162. SNMP Trap Numbers: 854.
Table 4-4. Storage Management Messages (continued)
Event IDs: 2293, 2294, 2295, 2296, 2297, 2298.
Descriptions: The EMM has failed; A device has been inserted; A device has been removed; An EMM has been inserted
122. pecified system Redundancy unit lt Redundancy location in chassis gt Chassis location lt Name of chassis gt Previous redundancy state was lt State gt detected that a redundant unit is offline The redundancy unit location chassis location previous redundancy state and the number of devices required for full redundancy are provided Event Message Reference 27 28 Table 2 7 Redundancy Unit Messages continued EventID Description Severity Cause 1304 Redundancy regained Information A redundancy sensor in the specified system Redundancy nit lt R dindan y anes that a Me o o has Tea dn ohdsoies been reconnected or replaced full redundancy is in effect The redundancy unit location Chassis location lt Name of chassis location previous redundancy state chassis gt and the number of devices required for full Previous redundancy state was redundancy are provided lt State gt 1305 Redundancy degraded Warning A redundancy sensor in the specified system Redundancy nit lt ke imdancy detected that one of the components of the redundancy unit has failed but the unit is location in chassis gt 7 still redundant The redundancy unit Chassis location lt Name of location chassis location previous redundancy chassis gt state and the number of devices required Previous redundancy state was for full redundancy are provided lt State gt 1306 Redundancy lost Warming or A redunda
123. purposes None 901 imported as global due to missing arrays Action None 114 Storage Management Message Reference Table 4 4 Storage Management Messages continued Event ID 2367 2368 2371 Description Rebuild not possible as SAS SATA is not supported in the same virtual disk The SCSI Enclosure Processor SEP has been rebooted as part of the firmware download operation and will be unavailable until the operation completes Attempted import of Unsupported Virtual Disk type RAID 1 Severity Cause and Action Clear Event Number Warning Cause This alert is for informational purposes None Non critical Action Make sure that all physical disks in the virtual disk are using the same technology For example all physical disks must be either SAS or SATA You cannot use both SAS and SATA physical disks in the same virtual disk Ok Normal Cause This alert is for informational purposes None Action None Ok Normal Cause This alert is for informational purposes None Action None SNMP Trap Numbers 903 851 751 Storage Management Message Reference 115 116 Storage Management Message Reference Index Symbols 1 Storage Management has lost communication with this RAID controller and attached storage An immediate reboot is strongly recommended to avoid further problems If the reboot does not restore communication there may be a hardware failure 96
124. r Events Event Message Severity Cause lt Sensor Name Location gt Critical Temperature of the backplane board system temperature sensor detected a board or the carrier in the specified system failure lt Reading gt where lt Sensor lt Sensor Name Location gt exceeded the critical Name Location gt is the entity threshold that this sensor is monitoring For example PROC Temp or Planar Temp Reading is specified in degree Celsius For example 100 C lt Sensor Name Location gt Warning Temperature of the backplane board system temperature sensor detected board or the carrier in the specified system a warning lt Reading gt lt Sensor Name Location gt exceeded the non critical threshold lt Sensor Name Location gt Warning Temperature of the backplane board system temperature sensor returned board or the carrier in the specified system to warning state lt Reading gt lt Sensor Name Location gt returned from critical state to non critical state lt Sensor Name Location gt Information Temperature of the backplane board system temperature sensor returned to normal state lt Reading gt board or the carrier in the specified system lt Sensor Name Location gt returned to normal operating range System Event Log Messages for IPMI Systems 43 44 Voltage Sensor Events The voltage sensor event messages monitor the number of volts across critical components These messages provide status and warn
125. redundant disk back up your data immediately If the disk fails you will not be able to recover the data Clear SNMP Event Trap Number Numbers None 903 Ok Normal Cause A physical disk has experienced a None 901 temporary error Action None Ok Normal Cause A user has assigned a physical diskasa None 901 global hot spare This alert is for informational purposes Action None Ok Normal Cause A user has unassigned a physical disk None 901 as a global hot spare This alert is for informational purposes Action None 70 Storage Management Message Reference Table 4 4 Storage Management Messages continued Event ID 2100 2101 2102 2103 Description Temperature exceeded the maximum warning threshold Temperature dropped below the minimum warning threshold Temperature exceeded the maximum failure threshold Temperature dropped below the minimum failure threshold Severity Warning Non critical Warning Non critical Critical Failure Error Critical Failure Error Clear SNMP Event Trap Number Numbers 2353 1053 Cause and Action Cause The physical disk enclosure is too hot A variety of factors can cause the excessive temperature For example a fan may have failed the thermostat may be set too high or the room temperature may be too hot Action Check for factors that may cause overheating For example verify that the enclosure fan is
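The four enclosure temperature alerts listed above (2100 to 2103) can be read as a classification of a temperature reading against two warning thresholds and two failure thresholds. The following Python sketch illustrates that reading only. The function name, its parameters, and the idea of classifying a single reading are assumptions made for illustration; the guide does not describe how Storage Management evaluates thresholds internally, and the real alerts are raised when a threshold is crossed.

```python
from typing import Optional


def temperature_event_id(reading_c: float,
                         min_failure: float,
                         min_warning: float,
                         max_warning: float,
                         max_failure: float) -> Optional[int]:
    """Return the event ID (2100-2103) that matches a reading, or None.

    Hypothetical helper: thresholds are supplied by the caller and are
    expected to satisfy min_failure <= min_warning <= max_warning <= max_failure.
    """
    # Failure thresholds are checked first because they are the more severe case.
    if reading_c > max_failure:
        return 2102  # Temperature exceeded the maximum failure threshold
    if reading_c < min_failure:
        return 2103  # Temperature dropped below the minimum failure threshold
    if reading_c > max_warning:
        return 2100  # Temperature exceeded the maximum warning threshold
    if reading_c < min_warning:
        return 2101  # Temperature dropped below the minimum warning threshold
    return None      # Reading is inside the normal range


# Example with made-up thresholds (degrees Celsius).
print(temperature_event_id(45, min_failure=5, min_warning=10,
                           max_warning=40, max_failure=50))
```

Checking the failure thresholds before the warning thresholds ensures that a reading beyond both limits is reported with the Critical/Failure/Error event rather than the Warning/Non-critical one.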
126. removed 2052 903 from the disk group This alert can also be caused by loose or defective cables or by problems with the enclosure Action If a physical disk was removed from the disk group either replace the disk or restore the original disk On some controllers a removed disk has a red X for its status On other controllers a removed disk may have an Offline status or is not displayed on the user interface Perform a rescan after replacing or restoring the disk If a disk has not been removed from the disk group then check for problems with the cables See the online help for more information on checking the cables Make sure that the enclosure is powered on If the problem persists check the enclosure documentation for further diagnostic information Cause A physical disk in the disk group is 2158 903 offline A user may have manually put the physical disk offline Action Perform a rescan You can also select the offline disk and perform a Make Online operation Cause A physical disk has reported an error None 903 condition and may be degraded The physical disk may have reported the error condition in response to a consistency check or other operation Action Replace the degraded physical disk You can identify which disk is degraded by locating the disk that has a red X for its status Perform a rescan after replacing the disk Ok Normal Cause This alert is for informational purposes None 901 Act
127. ription Severity Cause 1704 Battery sensor detected a failure Error A battery sensor in the value specified system detected 1705 Sensor Location lt Location in chassis gt Chassis Location lt Name of chassis gt Previous state was lt State gt Battery sensor status lt status gt Battery sensor detected a non Error recoverable value Sensor Location lt Location in chassis gt Chassis Location lt Name of chassis gt Previous state was lt State gt Battery sensor status lt status gt that a battery has failed The sensor location chassis location previous state and battery sensor status are provided A battery sensor in the specified system detected that a battery has failed The sensor location chassis location previous state and battery sensor status are provided Event Message Reference 41 42 Event Message Reference System Event Log Messages for IPMI Systems The following tables list the system event log SEL messages their severity and cause K NOTE For corrective actions see the appropriate documentation Temperature Sensor Events The temperature sensor event messages help protect critical components by alerting the systems management console when the temperature rises inside the chassis These event messages use additional variables such as sensor location chassis location previous state and temperature sensor value or state Table 3 1 Temperature Senso
128. The location of the event log file depends on the operating system you are using.

In the Microsoft Windows 2000 Advanced Server and Windows Server 2003 operating systems, messages are logged to the system event log and optionally to a unicode text file, dcsys32.log (viewable using Notepad), that is located in the install_path\omsa\log directory. The default install_path is C:\Program Files\Dell\SysMgt.

In the Red Hat Enterprise Linux and SUSE Linux Enterprise Server operating systems, messages are logged to the system log file. The default name of the system log file is /var/log/messages. You can view the messages file using a text editor such as vi or emacs.

NOTE: Logging messages to a unicode text file is optional. By default, the feature is disabled. To enable this feature, modify the Event Manager section of the dcemdy32.ini file as follows:

• In Windows, locate the file at <install_path>\dataeng\ini and set UnitextLog.enabled=True. The default install_path is C:\Program Files\Dell\SysMgt. Restart the DSM SA Event Manager service.
• In Red Hat Enterprise Linux and SUSE Linux Enterprise Server, locate the file at <install_path>/dataeng/ini and set UnitextLog.enabled=True. The default install_path is /opt/dell/srvadmin. Issue the /etc/init.d/dataeng restart command to restart the Server Administrator event manager service. This will also restart the Server Administrator data manager and SNMP services.
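The edit described in the NOTE above can be scripted. The Python sketch below turns the setting on in the text of an INI-style file. It is an illustration under stated assumptions: the section header is assumed to be written as [Event Manager], the key is assumed to be spelled UnitextLog.enabled=True, and the function name is hypothetical. Check the actual dcemdy32.ini file before relying on any of these, and restart the event manager service afterwards as the NOTE describes.

```python
def enable_unitext_log(ini_text: str,
                       section: str = "Event Manager",
                       key: str = "UnitextLog.enabled") -> str:
    """Return ini_text with `key` set to True inside `section`.

    Assumed layout (not confirmed by the guide): "[Section]" headers
    followed by "name=value" lines.
    """
    out = []
    in_section = False
    done = False
    for line in ini_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Leaving the target section without finding the key: add it here.
            if in_section and not done:
                out.append(f"{key}=True")
                done = True
            in_section = stripped[1:-1].strip().lower() == section.lower()
        elif in_section and not done and "=" in stripped:
            name = stripped.split("=", 1)[0].strip()
            if name.lower() == key.lower():
                line = f"{key}=True"  # overwrite the existing value
                done = True
        out.append(line)
    if not done:
        # Section was last in the file, or did not exist at all.
        if not in_section:
            out.append(f"[{section}]")
        out.append(f"{key}=True")
    return "\n".join(out) + "\n"


# Example on an in-memory string; reading and writing the real file is left out.
print(enable_unitext_log("[Event Manager]\nUnitextLog.enabled=False\n"))
```

Working on a string rather than on the file itself keeps the sketch free of path assumptions; a caller would read the file, pass its text through the function, and write the result back.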
129. roller battery is Ok Normal Cause This alert is for informational purposes reconditioning Controller battery recondition is completed Smart FPT exceeded Warning Non critical change Action None Ok Normal Cause This alert is for informational purposes Failure Error Action None Cause A disk on the specified controller has received a SMART alert predictive failure indicating that the disk is likely to fail in the near future Action Replace the disk that has received the SMART alert If the physical disk is a member of a non redundant virtual disk then back up the data before replacing the disk NOTICE Removing a physical disk that is included in a non redundant virtual disk will cause the virtual disk to fail and may cause data loss Cause A disk has received a SMART alert predictive failure after a configuration change The disk is likely to fail in the near future Action Replace the disk that has received the SMART alert If the physical disk is a member of a non redundant virtual disk then back up the data before replacing the disk NOTICE Removing a physical disk that is included in a non redundant virtual disk will cause the virtual disk to fail and may cause data loss Storage Management Message Reference Clear SNMP Event Trap Number Numbers 2105 1151 Clear 1151 event None 903 None 904 Table 4 4 Storage Management Messages continued Event Descripti
130. rposes Action None Ok Normal Cause This alert is for informational purposes Action None Ok Normal Cause This alert is for informational purposes Action None Ok Normal Cause This alert is for informational purposes Action None Ok Normal Cause This alert is for informational purposes Action None Storage Management Message Reference Clear Event Number None None None None 2196 Clear event None SNMP Trap Numbers 854 1203 1201 1201 1201 1201 1201 Table 4 4 Storage Management Messages continued Event Description ID 2201 A global hot spare failed 2202 A global hot spare has been removed 2203 A dedicated hot spare failed 2204 A dedicated hot spare has been removed Severity Warning Non critical Cause and Action Cause The controller is not able to communicate with a disk that is assigned as a dedicated hot spare The disk may have been removed There may also be a bad or loose cable Action Check if the disk is healthy and that it has not been removed Check the cables If necessary replace the disk and reassign the hot spare Ok Normal Cause The controller is unable to Warning Non critical communicate with a disk that is assigned as a global hot spare The disk may have been removed There may also be a bad or loose cable Action Check if the disk is healthy and that it has not been remo
131. rposes None Action None Cause The battery has been recharged more None 1153 times than the battery recharge limit allows Action Replace the battery pack Storage Management Message Reference Table 4 4 Storage Management Messages continued Event ID 2214 2215 2232 2233 2234 2235 2237 2238 2239 2240 2241 2242 2243 2244 Description Battery charge in progress Battery charge process interrupted The controller alarm is silenced The background initialization BGI rate has changed The Patrol Read rate has changed The Check Consistency rate has changed A controller rescan has been initiated The controller debug log file has been exported A foreign configuration has been cleared A foreign configuration has been imported The Patrol Read mode has changed The Patrol Read has started The Patrol Read has stopped A virtual disk blink has been initiated Severity OK Normal Cause This alert is for informational purposes OK Normal Cause This alert is for informational purposes O O O O O O O O O Cause and Action Action None Action None Action None Action None Action None Action None Action None Action None Action None Action None Action None Action None Action None Action None Normal Cause This alert
132. rposes None 1201 mirror have been A user has caused a mirrored virtual disk to be split split When a virtual disk is mirrored its data is copied to another virtual disk in order to maintain redundancy After being split both virtual disks retain a copy of the data although because the mirror is no longer intact updates to the data are no longer copied to the mirror Action None 2117 A mirrored virtual Ok Normal Cause This alert is for informational purposes None 1201 disk has been A user has caused a mirrored virtual disk to be unmirrored unmirrored When a virtual disk is mirrored its data is copied to another virtual disk in order to maintain redundancy After being unmirrored the disk formerly used as the mirror returns to being a physical disk and becomes available for inclusion in another virtual disk Action None 2118 Change write policy Ok Normal Cause This alert is for informational purposes None 1201 A user has changed the write policy for a virtual disk Action None 2120 Enclosure firmware Warning Cause The firmware on the EMM is not the None 853 mismatch Non critical same version It is required that both modules have the same version of the firmware This alert may be caused when a user attempts to insert an EMM module that has a different firmware version than an existing module Action Download the same version of the firmware to both EMM modules 76 Storage Management Message Reference
133. s critically degraded There will be no further reporting The DC power supply is switched off The power supply is switched on The AC power supply cable has been removed The power supply cable has been inserted A foreign configuration has been detected Severity Critical Failure Error Critical Failure Error Cause and Action Clear Event Number Cause The DIMM is malfunctioning None Data loss or data corruption is imminent The DIMM must be replaced immediately No further alerts will be generated Action Replace the DIMM immediately The DIMM is a part of the controller battery pack See your hardware documentation for information on replacing the DIMM Cause The power supply unit is switched off 2323 Either a user switched off the power supply unit or it is defective Action Check if the power switch is turned off If it is turned off turn it on If the problem persists check if the power cord is attached and functional If the problem is still not corrected or if the power switch is already turned on replace the power supply unit Ok Normal Cause This alert is for informational purposes Clear Critical Failure Error Action None event Cause The power cable may be pulled out 2325 or removed The power cable may also have overheated and become warped and nonfunctional Action Replace the power cable Ok Normal Cause This alert is for informational purpo
134. se to sensor status changes and other monitored parameters. The Server Administrator event monitor uses these status change events to add descriptive messages to the operating system event log or the Server Administrator Alert log.

Each event message that Server Administrator adds to the Alert log consists of a unique identifier called the event ID for a specific event source category and a descriptive message. The event message includes the severity, cause of the event, and other relevant information, such as the event location and the monitored item's previous state.

Tables provided in this guide list all Server Administrator event IDs in numeric order. Each entry includes the event ID's corresponding description, severity level, and cause. Message text in angle brackets (for example, <State>) describes the event-specific information provided by the Server Administrator.

What's New in this Release

Modifications have been made to the Storage Management Service events. For more information, see "Alert Message Change History."

Messages Not Described in This Guide

This guide describes only event messages created by Server Administrator and displayed in the Server Administrator Alert log. For information on other messages produced by your system, consult one of the following sources:

• Your system's Installation and Troubleshooting Guide
• Other system documentation
• Operating system documentation
• Application program documen
135. ses Clear Action None event Ok Normal Cause This alert is for informational purposes None The controller has physical disks that were moved from another controller These physical disks contain virtual disks that were created on the other controller See the Import Foreign Configuration and Clear Foreign Configuration section in the Dell OpenManage Server Administrator Storage Management User s Guide for more information Action None SNMP Trap Numbers 754 1004 1001 1004 1001 751 Storage Management Message Reference 107 108 Table 4 4 Storage Management Messages continued Event ID 2327 2328 2329 2330 SAS port report 1 Description The NVRAM has corrupted data The controller is reinitializing the NVRAM The NVRAM has corrupt data SAS port report 1 Severity Warning Non critical Warning Non critical Warning Non critical Cause and Action Cause The NVRAM has corrupted data This may occur after a power surge a battery failure or for other reasons The controller is reinitializing the NVRAM Action None The controller is taking the required corrective action If this alert is generated often such as during each reboot replace the controller Cause The NVRAM has corrupt data The controller is unable to correct the situation Action Replace the controller Cause The text for this alert is generated by the controller and
137. specified period of time in the event of a power loss For example some batteries maintain cached data for 24 hours If the battery is unable to maintain cached data for the required period of time then the Learn cycle will timeout Action Replace the battery pack as the battery is unable to maintain a full charge Ok Normal Cause This alert is for informational purposes Action None Ok Normal Cause This alert is for informational purposes The 1 indicates a substitution variable The text for this substitution variable is displayed with the alert in the Alert Log and can vary depending on the situation Action None Ok Normal Cause This alert is for informational purposes The 1 indicates a substitution variable The text for this substitution variable is displayed with the alert in the Alert Log and can vary depending on the situation Action None Storage Management Message Reference Clear SNMP Event Trap Number Numbers 2177 1151 Clear 1151 event None 1153 None 1151 None 1151 None 1151 Table 4 4 Storage Management Messages continued Event ID 2182 2186 2187 2188 2189 Description Severity An invalid SAS Critical configuration has Failure been detected Error The controller cache Warning has been discarded Non critical Single bit ECC error Warning limit exceeded Non critical Cause and Action Cause The controller and attached enclosures are not cabl
138. stem power cycle detects that the system has crashed timer expired because no response was received from Host and the action is set to power cycle

Memory Events

The memory modules can be configured in different ways in particular systems. These messages monitor the status, warning, and configuration information about the memory modules in the system.

Table 3-8. Memory Events

Event Message: Memory RAID redundancy degraded
Severity: Information
Cause: This event is generated when there is a memory failure in a RAID-configured memory configuration.

Event Message: Memory RAID redundancy lost
Severity: Critical
Cause: This event is generated when redundancy is lost in a RAID-configured memory configuration.

Event Message: Memory RAID redundancy regained
Severity: Information
Cause: This event is generated when the redundancy lost or degraded earlier is regained in a RAID-configured memory configuration.

Event Message: Memory Mirrored redundancy degraded
Severity: Information
Cause: This event is generated when there is a memory failure in a mirrored memory configuration.

Event Message: Memory Mirrored redundancy lost
Severity: Critical
Cause: This event is generated when redundancy is lost in a mirrored memory configuration.

Event Message: Memory Spared redundancy degraded
Event Message: Memory Spared redundancy lost
Event Message: Memory Spared redundancy regained

Event Message: Memory Mirrored redundancy regained
Severity: Information
Cause: This event is generated when the redundancy lost or degraded earlier is regained in a
139. tate recovers from a faulty condition Drive lt Drive gt Informational This event is generated when the drive is installed drive presence was asserted Drive lt Drive gt Warning This event is generated when the drive is about to fail predictive failure was asserted Drive lt Drive gt Informational This event is generated when the drive from earlier Husaiceie failure wee predictive failure is corrected deasserted Drive lt Drive gt Warning This event is generated when the drive is placed in a hot spare was asserted hot spate Drive lt Drive gt Informational This event is generated when the drive is taken out of hot spare was deasserted hot sparg Drive lt Drive gt Warning This event is generated when the drive is placed in consistency check in progress consistency check was asserted Drive lt Drive gt Informational This event is generated when the consistency check of consistency check in progress the drive is completed was deasserted Drive lt Drive gt Critical This event is generated when the drive is placed in in critical array was critical array asserted Drive lt Drive gt Informational This event is generated when the drive is removed imer ecah array Was from critical array deasserted Drive lt Drive gt Critical This event is generated when the drive is placed in the in failed array was asserted System Event Log Messages for IPMI Systems fail array Table 3 10 Drive Events
140. tation

Understanding Event Messages

This section describes the various types of event messages generated by the Server Administrator. When an event occurs on your system, the Server Administrator sends information about one of the following event types to the systems management console.

Table 1-1. Understanding Event Messages

Alert Severity: OK/Normal
Component Status: An event that describes the successful operation of a unit. The alert is provided for informational purposes and does not indicate an error condition. For example, the alert may indicate the normal start or stop of an operation, such as power supply or a sensor reading returning to normal.

Alert Severity: Warning/Non-critical
Component Status: An event that is not necessarily significant, but may indicate a possible future problem. For example, a Warning/Non-critical alert may indicate that a component (such as a temperature probe in an enclosure) has crossed a warning threshold.

Alert Severity: Critical/Failure/Error
Component Status: A significant event that indicates actual or imminent loss of data or loss of function. For example, crossing a failure threshold or a hardware failure such as an array disk.

Server Administrator generates events based on status changes in the following sensors:

• Temperature Sensor: Helps protect critical components by alerting the systems management console when temperatures become too high inside a chassis; also monitors a variety of locations
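The three alert severities in Table 1-1 can be modelled as a small enumeration, which is convenient when filtering or sorting entries taken from the tables in this guide. The Python sketch below is an illustration only. The names Severity and parse_severity are hypothetical, and treating the "Information" and "Error" labels used in other tables of the guide as equivalent to OK/Normal and Critical/Failure/Error is an assumption, not something the guide states.

```python
from enum import Enum


class Severity(Enum):
    OK = "OK/Normal"
    WARNING = "Warning/Non-critical"
    CRITICAL = "Critical/Failure/Error"


# Assumed mapping from the first word of a severity label to a level.
_FIRST_WORD = {
    "ok": Severity.OK,
    "information": Severity.OK,      # assumption: informational events are OK
    "informational": Severity.OK,
    "warning": Severity.WARNING,
    "critical": Severity.CRITICAL,
    "failure": Severity.CRITICAL,
    "error": Severity.CRITICAL,      # assumption: "Error" equals Critical
}


def parse_severity(label: str) -> Severity:
    """Map a label such as 'Warning Non critical' to a Severity member."""
    words = label.replace("/", " ").split()
    if not words or words[0].lower() not in _FIRST_WORD:
        raise ValueError(f"unrecognized severity label: {label!r}")
    return _FIRST_WORD[words[0].lower()]


# Example: labels as they appear in the extracted tables.
for text in ("Ok Normal", "Warning Non critical", "Critical Failure Error"):
    print(text, "->", parse_severity(text).value)
```

Keying on the first word keeps the parser tolerant of the lost punctuation in labels such as "Warning Non critical" while still rejecting text that is not a severity at all.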
141. te was lt State gt If sensor type is not discrete Temperature sensor value in degrees Celsius lt Reading gt If sensor type is discrete Discrete temperature state lt State gt 1053 Temperature sensor detected Warning A temperature sensor on the backplane a warning value Sensor location lt Location in chassis gt Chassis location lt Name of chassis gt Previous state was lt State gt If sensor type is not discrete in lt Reading gt Temperature sensor value degrees Celsius If sensor type is discrete Discrete temperature state lt State gt Event Message Reference board system board CPU or drive carrier in the specified system exceeded its warning threshold The sensor location chassis location previous state and temperature sensor value are provided Table 2 2 Temperature Sensor Messages continued EventID Description Severity Cause 1054 Temperature sensor detected Error A temperature sensor on the backplane a failure value board system board or drive carrier in the eR ee ere ae eee tion in oa a its sesh chasers thresho i ne sensor location chassis location previous state and temperature Chassis location lt Name of sensor value are provided chassis gt Previous state was lt State gt If sensor type is not discrete Temperature sensor value in degrees Celsius lt Reading gt If sensor type is discrete Discrete temperature state lt State gt 1055 Temper
143. th of the enclosure and its components Replace any hardware that is in a Failed state See the hardware documentation for more information 102 Storage Management Message Reference Table 4 4 Storage Management Messages continued Event ID 2303 Description The enclosure Severity Cause and Action Clear Event Number SNMP Trap Numbers 851 Ok Normal Cause This alert is for informational purposes None 2304 2305 2306 cannot support both SAS and SATA physical disks Physical disks may be disabled An attempt to hot plug an EMM has been detected This type of hot plug is not supported The physical disk is too small to be used for a rebuild Bad block table is 80 full Action None Ok Normal Cause This alert is for informational purposes None Warning Non critical Warning Non critical Action None Cause This alert is for informational purposes None Action Use a physical disk that is the same size or larger than the physical disk being replaced See the Replacing a Failed Disk section in the Dell OpenManage Server Administrator Storage Management User s Guide for more information Cause The bad block table is used for None remapping bad disk blocks This table fills as bad disk blocks are remapped When the table is full bad disk blocks can no longer be remapped and disk errors can no longer be corrected At this point data loss can occur The ba
144. tical with the standards set by Dell and is not supported.
Action: Replace the physical disk with a physical disk that is supported.

Table 4-4. Storage Management Messages (continued)

Event ID: 2360
Description: A user has discarded data from the controller cache.
Severity: Ok/Normal
Cause: This alert is for informational purposes.
Action: None
Clear Event Number: None
SNMP Trap Numbers: 751

Event ID: 2361
Description: Physical disk(s) that are part of a virtual disk have been removed while the system was shut down. This removal was discovered during system start-up.
Severity: Ok/Normal
Cause: This alert is for informational purposes.
Action: None
Clear Event Number: None
SNMP Trap Numbers: 751

Event ID: 2362
Description: Physical disk(s) have been removed from a virtual disk. The virtual disk will be in Failed state during the next system reboot.
Severity: Ok/Normal
Cause: This alert is for informational purposes.
Action: None
Clear Event Number: None
SNMP Trap Numbers: 751

Event ID: 2363
Description: A virtual disk and all of its member physical disks have been removed while the system was shut down. This removal was discovered during system start-up.
Severity: Ok/Normal
Cause: This alert is for informational purposes.
Action: None
Clear Event Number: None
SNMP Trap Numbers: 751

Event ID: 2364
Description: All virtual disks are missing from the controller. This situation was discovered during system start-up.
Severity: Ok/Normal
Cause: This alert is for informational purposes.
Action: None
Clear Event Number: None
SNMP Trap Numbers: 751

Event ID: 2366
Description: Dedicated spare
Severity: Ok/Normal
Cause: This alert is for informational
145. tics test failed The 1 Failure indicates a substitution variable The text for Error this substitution variable is generated by the utility that ran the diagnostics and is displayed with the alert in the Alert Log This text can vary depending on the situation Action See the documentation for the utility that ran the diagnostics for more information Ok Normal Cause This alert is for informational purposes Action None Warning Cause The battery or the battery charger is Non critical not functioning properly Action Replace the battery pack Warning Cause The DIMM is beginning to Non critical malfunction Action Replace the DIMM to avoid data loss or data corruption The DIMM is a part of the controller battery pack See your hardware documentation for information on replacing the DIMM Critical Cause The DIMM is malfunctioning Failure Data loss or data corruption may be Error imminent Action Replace the DIMM immediately to avoid data loss or data corruption The DIMM is a part of the controller battery pack See your hardware documentation for information on replacing the DIMM Storage Management Message Reference Clear Event Number None None None None None SNMP Trap Numbers 754 1201 1154 753 754 Table 4 4 Storage Management Messages continued Event ID 2321 2322 2323 2324 2325 2326 Description Single bit ECC error The DIMM i
146. tring accompanying an OS graceful shutdown/restart event. OS shutdown/restart.

Event: System Event: OS stop event, runtime critical stop
  Severity: Critical
  Cause: The OS encountered a critical error and was stopped abnormally.

Event: OEM Event data record after OS bugcheck event
  Severity: Information
  Cause: OS bugcheck code and parameters.

Cable Interconnect Events

The cable interconnect messages are used for detecting errors in the hardware cabling.

Table 3-14. Cable Interconnect Events

Description: <Cable sensor Name/Location> Configuration error was asserted
  Severity: Critical
  Cause: This event is generated when the cable is not connected or is incorrectly connected.

Description: <Cable sensor Name/Location> Configuration error was deasserted
  Severity: Information
  Cause: This event is generated when the earlier cable connection error was corrected.

Battery Events

Table 3-15. Battery Events

Description: <Battery sensor Name/Location> Failed was asserted
  Severity: Critical
  Cause: This event is generated when the sensor detects a failed or missing battery.

Description: <Battery sensor Name/Location> Failed was deasserted
  Severity: Information
  Cause: This event is generated when the earlier failed battery was corrected.

Description: <Battery sensor Name/Location> is low was asserted
  Severity: Warning
  Cause: This event is generated when the sensor detects a low battery condition.

Description: <Battery sensor Name/Location> is low was deasserted
  Severity: Information
  Cause: This event is generated when the earlier low battery c
147. urned to normal

Severity: Critical / Failure / Error
  Cause: You are attempting to rebuild data on a disk that is defective.
  Action: Replace the target disk. If a rebuild does not automatically start after replacing the disk, initiate the Rebuild task. You may need to assign the new disk as a hot spare to initiate the rebuild.

Severity: Critical / Failure / Error
  Cause: A write operation could not complete because the disk contains bad disk blocks that could not be reassigned. Data loss may have occurred and data redundancy may also be lost.
  Action: Replace the disk.

Severity: Critical / Failure / Error
  Cause: The rebuild encountered an unrecoverable disk media error.
  Action: Replace the disk.

Severity: Ok / Normal
  Cause: This alert is for informational purposes. Action: None.

Severity: Ok / Normal
  Cause: This alert is for informational purposes. Action: None.

Severity: Ok / Normal
  Cause: This alert is for informational purposes. Action: None.

Clear Event Number column: None None None 2352 Clear event Clear event
SNMP Trap Numbers column: 904, 904, 904, 901, 901, 851

Table 4-4. Storage Management Messages (continued)

Event ID 2356: SAS SMP communications error %1
Event ID 2357: SAS expander error %1
Event ID 2358: The battery charge cycle is complete.
Event ID 2359: The physical disk is not certified.

Severity column: Critical / Failure / Error, Critical
148. ved. Check the cables. If necessary, replace the disk and reassign the hot spare.

Cause: The controller is unable to communicate with a disk that is assigned as a dedicated hot spare. The disk may have failed or been removed. There may also be a bad or loose cable.
Action: Check if the disk is healthy and that it has not been removed. Check the cables. If necessary, replace the disk and reassign the hot spare.

Severity: Ok / Normal
  Cause: The controller is unable to communicate with a disk that is assigned as a dedicated hot spare. The disk may have been removed. There may also be a bad or loose cable.
  Action: Check if the disk is healthy and that it has not been removed. Check the cables. If necessary, replace the disk and reassign the hot spare.

Clear Event Number / SNMP Trap Numbers columns: None / 903, None / 901, None / 903, None / 901

Table 4-4. Storage Management Messages (continued)

Event ID 2205: A dedicated hot spare has been automatically unassigned.
Event ID 2206: The only hot spare available is a SATA disk. SATA disks cannot replace SAS disks.
Event ID 2207: The only hot spare available is a SAS disk. SAS disks cannot replace SATA disks.
Event ID 2211: The physical disk is not supported.
Event ID 2212: The controller battery temperature is above normal.
Event ID 2213: Recharge count maximum exceeded.

Severity: Ok / Normal
  Cause: The hot spare is no longer required.
Severity: Warni
149. which it cannot recover. The sensor location, chassis location, previous state, and additional power supply status information are provided.

Memory Device Messages

Memory device messages listed in Table 2-9 provide status and warning information for memory modules present in a particular system. Memory devices determine health status by monitoring the ECC memory correction rate and the type of memory events that have occurred.

NOTE: A critical status does not always indicate a system failure or loss of data. In some instances, the system has exceeded the ECC correction rate. Although the system continues to function, you should perform system maintenance as described in Table 2-9.

NOTE: In Table 2-9, <status> can be either critical or non-critical.

Table 2-9. Memory Device Messages

Event ID 1403
  Description: Memory device status is <status>. Memory device location: <location in chassis>. Possible memory module event cause: <list of causes>
  Severity: Warning
  Cause: A memory device correction rate exceeded an acceptable value. The memory device status and location are provided.

Event ID 1404
  Description: Memory device status is <status>. Memory device location: <location in chassis>. Possible memory module event cause: <list of causes>
  Severity: Error
  Cause: A memory device correction rate exceeded an acceptable value, a memory spare bank was activated, or a multibit error occurred. The system continues to function
150. y occur when the virtual disk suffers the failure of multiple physical disks. In this case, both the source physical disk and the target disk with redundant data have failed. A rebuild is not possible because there is no redundancy. Action: Replace the failed disks and restore from backup.

Clear Event Number / SNMP Trap Numbers columns: None / 904, None / 903, None / 1204

Table 4-4. Storage Management Messages (continued)

Event ID 2311: The firmware on the EMMs is not the same version. EMM0 %1 EMM1 %2
Event ID 2312: A power supply in the enclosure has an AC failure.
Event ID 2313: A power supply in the enclosure has a DC failure.
Event ID 2314: The initialization sequence of SAS components failed during system startup. SAS management and monitoring is not possible.
Event ID 2315: Diagnostic message %1

Severity column: Warning / Non-critical, Warning / Non-critical, Warning / Non-critical, Critical / Failure / Error

Event ID 2311, Cause and Action
  Clear Event Number: None. SNMP Trap Numbers: 853.
  Cause: The firmware on the EMM modules is not the same version. It is required that both modules have the same version of the firmware. This alert may be caused if you attempt to insert an EMM module that has a different firmware version than an existing module. The %1 and %2 indicate a substitution variable. The text for these substitution variables is displayed with the alert in the Alert Log and can v
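The excerpt above describes numbered substitution variables whose text is filled in when the alert is written to the Alert Log. The following is a minimal Python sketch of that idea, under stated assumptions: the placeholders are taken to be written as `%1`, `%2`, and so on; `fill_alert` is a hypothetical helper, not a Server Administrator function; and the firmware version strings are invented sample values.

```python
import re


def fill_alert(template, *values):
    """Replace %1, %2, ... in an alert template with positional values.

    Placeholders without a matching value are left unchanged, so a
    partially filled alert is still readable.
    """
    def substitute(match):
        index = int(match.group(1)) - 1  # %1 maps to values[0]
        if 0 <= index < len(values):
            return values[index]
        return match.group(0)

    return re.sub(r"%(\d+)", substitute, template)


if __name__ == "__main__":
    # Template text follows event 2311; the version numbers are sample values.
    template = "The firmware on the EMMs is not the same version. EMM0 %1 EMM1 %2"
    print(fill_alert(template, "1.01", "1.04"))
```

Leaving unmatched placeholders in place, rather than raising an error, is a design choice for this sketch: it makes a missing value visible in the rendered message instead of hiding the alert entirely.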