Dell OpenManage Server Administrator Version 2.3 Messages Reference Guide
Contents
1. Event ID 1053: Temperature sensor detected a warning value
   Severity: Warning
   Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. If sensor type is not discrete: Temperature sensor value (in degrees Celsius): <Reading>. If sensor type is discrete: Discrete temperature state: <State>.
   Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.
   Event ID 1054: Temperature sensor detected a failure value
   Severity: Error
   Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. If sensor type is not discrete: Temperature sensor value (in degrees Celsius): <Reading>. If sensor type is discrete: Discrete temperature state: <State>.
   Cause: A temperature sensor on the backplane board, system board, or drive carrier in the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and temperature sensor value are provided.
   Event ID 1055: Temperature sensor detected a non-recoverable value
   Severity: Error
   Cause: A temperature sensor on the
   Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. If senso
2. Action: Check if the disk is healthy and that it has not been removed. Check the cables. If necessary, replace the disk and reassign the hot spare.
   Cause: The controller is unable to communicate with a disk that is assigned as a dedicated hot spare. The disk may have been removed. There may also be a bad or loose cable.
   Action: Check if the disk is healthy and that it has not been removed. Check the cables. If necessary, replace the disk and reassign the hot spare.
   Cause: The hot spare is no longer required because the virtual disk it was assigned to has been deleted.
   Action: None.
   SNMP trap numbers for these entries: 903, 903, 903, 903. Array Manager event numbers: None, None, None, None.
   Storage Management Message Reference
   Table 4-1. Storage Management Messages (continued)
   Event ID 2206: The only hot spare available is a SATA disk. SATA disks cannot replace SAS disks.
   Severity: Warning / Non-critical
   Cause: The only array disk available to be assigned as a hot spare is using SATA technology. The array disks in the virtual disk are using SAS technology. Due to this difference in technology, the hot spare cannot rebuild data if one of the array disks in the virtual disk fails.
   Action: Add a SAS disk that is large enough to be used as the hot spare and assign the new disk as a hot spare.
   SNMP trap number: 903. Array Manager event number: None.
   Event ID 2207: The only hot spare available
   Severity: Warning
   SNMP trap number: 903. Array Manager event number: None.
   Cause: The only array disk available to be
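   The technology rule behind event 2206 can be sketched as a simple comparison. This is only an illustration under stated assumptions: the `Disk` class, the `can_serve_as_hot_spare` function, the "SAS"/"SATA" labels, and the size rule (the spare must be at least as large as every array disk) are hypothetical and are not part of Storage Management.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    technology: str  # assumed labels: "SAS" or "SATA"
    size_gb: int


def can_serve_as_hot_spare(spare: Disk, array_disks: list) -> bool:
    """Return True when the spare uses the same technology as every
    array disk and is at least as large as each of them (assumed rule)."""
    return all(
        spare.technology == disk.technology and spare.size_gb >= disk.size_gb
        for disk in array_disks
    )


# Hypothetical virtual disk built from two SAS array disks.
sas_array = [Disk("SAS", 146), Disk("SAS", 146)]
sata_spare_ok = can_serve_as_hot_spare(Disk("SATA", 250), sas_array)
sas_spare_ok = can_serve_as_hot_spare(Disk("SAS", 146), sas_array)
print(sata_spare_ok, sas_spare_ok)
```

   In this sketch the SATA spare is rejected even though it is larger, which mirrors the situation the alert warns about.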
3. Memory Events
   The memory modules can be configured in different ways in particular systems. These messages monitor the status, warning, and configuration information about the memory modules in the system.
   Table 3-8. Memory Events
   Event message: Memory RAID redundancy degraded. Severity: Information. Cause: This event is generated when there is a memory failure in a RAID-configured memory configuration.
   Event message: Memory RAID redundancy lost. Severity: Critical. Cause: This event is generated when redundancy is lost in a RAID-configured memory configuration.
   Event message: Memory RAID redundancy regained. Severity: Information. Cause: This event is generated when the redundancy lost or degraded earlier is regained in a RAID-configured memory configuration.
   Event message: Memory Mirrored redundancy degraded. Severity: Information. Cause: This event is generated when there is a memory failure in a mirrored memory configuration.
   Event message: Memory Mirrored redundancy lost. Severity: Critical. Cause: This event is generated when redundancy is lost in a mirrored memory configuration.
   System Event Log Messages for IPMI Systems (page 42)
   Table 3-8. Memory Events (continued)
   Event message: Memory Mirrored redundancy regained. Severity: Information. Cause: This event is generated when the redundancy lost or degraded earlier is regained in a mirrored memory configuration.
   Event message: Memory Spared redundancy degraded. Severity: Information.
   Event message: Memory Spared redundancy lost. Severity: Critical.
   Event message: Memory Spared redundancy regained. Severity: Information.
   Cause: This event i
4. Event ID 2050: Array disk offline
   Severity: Warning / Non-critical
   Cause: A physical disk in the array is offline. A disk can be made offline during a Prepare to Remove operation or because a user manually put the disk offline.
   Action: Perform a rescan. You can also select the offline disk and perform a Make Online operation.
   SNMP trap number: 903.
   Event ID 2051: Array disk degraded
   Severity: Warning / Non-critical
   Cause: An array disk has reported an error condition and may be degraded. The array disk may have reported the error condition in response to a consistency check or other operation.
   Action: Replace the degraded array disk. You can identify which disk is degraded by locating the disk that has a red X for its status. Perform a rescan after replacing the disk.
   SNMP trap number: 903.
   Event ID 2052: Array disk inserted
   Severity: Ok / Normal
   Cause: This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 901.
   Event ID 2053: Virtual disk created
   Severity: Ok / Normal
   Cause: This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 1201.
   Event ID 2054: Virtual disk deleted
   Severity: Warning / Non-critical
   Cause: A virtual disk has been deleted. Performing a Reset Configuration operation may detect that a virtual disk has been deleted and generate this alert.
   Action: None.
   SNMP trap number: 1203.
   Event ID 2055: Virtual disk configuration changed
   Severity: Ok / Normal
   Cause: This alert is provided fo
7. Action: None.
   Event ID 2201: A global hot spare failed
   Severity: Warning / Non-critical
   Cause: The controller is unable to communicate with a disk that is assigned as a global hot spare. The disk may have failed or been removed. There may also be a bad or loose cable.
   Action: Check if the disk is healthy and that it has not been removed. Check the cables. If necessary, replace the disk and reassign the hot spare.
   SNMP trap number: 903. Array Manager event number: None.
   Table 4-1. Storage Management Messages (continued)
   Event ID 2202: A global hot spare has been removed
   Severity: Warning / Non-critical
   Cause: The controller is unable to communicate with a disk that is assigned as a global hot spare. The disk may have been removed. There may also be a bad or loose cable.
   Action: Check if the disk is healthy and that it has not been removed. Check the cables. If necessary, replace the disk and reassign the hot spare.
   Event ID 2204: A dedicated hot spare has been removed
   Severity: Warning / Non-critical
   Event ID 2205: A dedicated hot spare has been automatically unassigned
   Severity: Warning / Non-critical
   Event ID 2203: A dedicated hot spare failed
   Severity: Warning / Non-critical
   Cause: The controller is unable to communicate with a disk that is assigned as a dedicated hot spare. The disk may have failed or been removed. There may also be a bad or loose cable
8. Event ID 2334: Controller event log: %1
   NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
   Severity: Ok / Normal
   Cause: This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 751. Array Manager event number: None.
   Event ID 2335: Controller event log: %1
   NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
   Severity: Warning / Non-critical
   Cause: The text for this alert is generated by the controller and can vary depending on the situation. This text is from events in the controller event log that were generated while Storage Management was not running.
   Action: If there is a problem, review the controller event log and the Server Administrator Alert Log for significant events or alerts that may assist in diagnosing the problem. Check the health of the storage components. See the hardware documentation for more information.
   SNMP trap number: 753. Array Manager event number: None.
   Event ID 2336: Controller event log: %1
   NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
   Severity: Critical / Failure / Error
   Cause: The text for this alert is generated by the controller and can vary depending on the situation. This text is from events in the controller event log that were generated while Storage Management was not running.
   Action: See the hardware documentation for more information.
   SNMP trap number: 754. Array Manager event number: None.
   Event ID 2337: The controller is
   Severity: Critical
   SNMP trap number: 1154. Array Manager event number: None.
   Cause: The controller was unable to recover
9. clearing the log. The log type information is provided.
   Event ID 1553: Log size is near or at capacity
   Severity: Warning
   Message fields: Log type: <Log type>.
   Cause: The size of a hardware log on the specified system is near or at the capacity of the hardware log. The log type information is provided.
   Event ID 1554: Log size is full
   Severity: Error
   Message fields: Log type: <Log type>.
   Cause: The size of a hardware log on the specified system is full. The log type information is provided.
   Event ID 1555: Log sensor has failed
   Severity: Error
   Message fields: Log type: <Log type>.
   Cause: A hardware log sensor in the specified system failed. The hardware log status cannot be monitored. The log type information is provided.

   Processor Sensor Messages
   Processor sensors monitor how well a processor is functioning. Processor messages listed in Table 2-13 provide status and warning information for processors in a particular chassis.
   Table 2-13. Processor Sensor Messages
   Event ID 1600: Processor sensor has failed
   Severity: Information
   Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. Processor sensor status: <status>.
   Cause: A processor sensor in the specified system is not functioning. The sensor location, chassis location, previous state, and processor sensor status are provided.
   Event ID 1601: Processor sensor value unknown
   Severity: Information
   Message fields: Sensor location: <Location in chassis>.
   Cause: A processor sensor in the specified system could not obtain a reading. The sens
10. SNMP trap and Array Manager event numbers for the preceding entries: 851, 676; 851, 677; 1201, None.
   Table 4-1. Storage Management Messages (continued)
   Event ID 2141: Array disk dead segments recovered
   Severity: Ok / Normal
   Cause: Portions of the array disk that were formerly inaccessible have been recovered. This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 901. Array Manager event number: None.
   Event ID 2142: Controller rebuild rate has changed
   Severity: Ok / Normal
   Cause: A user has changed the controller rebuild rate. This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 751. Array Manager event number: 680.
   Event ID 2143: Controller alarm enabled
   Severity: Ok / Normal
   Cause: A user has enabled the controller alarm. This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 751. Array Manager event number: 678.
   Event ID 2144: Controller alarm disabled
   Severity: Ok / Normal
   Cause: A user has disabled the controller alarm. This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 751. Array Manager event number: 679.
   Event ID 2145: Controller battery low
   Severity: Warning / Non-critical
   Cause: The controller battery charge is low.
   Action: Recondition the battery. See the online help for more information.
   SNMP trap number: 1153. Array Manager event number: 580.
   Event ID 2146: Bad block replacement error
   Severity: Warning / Non-critical
   Cause: A portion of an array disk is damaged.
   Action: See the Storage Management online help or the Dell OpenManage Server Administrator Storage Management User's Guide for more information.
   SNMP trap number: 753. Array Manager event number: 691.
   Event ID 2147: Bad block sense error
   Severity: Warning
   Cause: A portion of an array disk is damaged.
   SNMP trap number: 753. Array Manager event number: 691.
11. Action: None.
   Event ID 2306: Bad block table is 80% full
   Severity: Warning / Non-critical
   Cause: The bad block table is used for remapping bad disk blocks. This table fills as bad disk blocks are remapped. When the table is full, bad disk blocks can no longer be remapped and disk errors can no longer be corrected. At this point, data loss can occur. The bad block table is now 80% full.
   Action: Back up your data. Replace the disk generating this alert and restore from backup.
   SNMP trap number: 903. Array Manager event number: None.
   Event ID 2307: Bad block table is full. Unable to log block %1
   NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
   Severity: Critical / Failure / Error
   Cause: The bad block table is used for remapping bad disk blocks. This table fills as bad disk blocks are remapped. When the table is full, bad disk blocks can no longer be remapped and disk errors can no longer be corrected. At this point, data loss can occur.
   Action: Replace the disk generating this alert and restore from backup. You may have lost data.
   SNMP trap number: 904. Array Manager event number: None.
   Table 4-1. Storage Management Messages (continued)
   Event ID 2309: An array disk is incompatible
   Severity: Warning / Non-critical
   SNMP trap number: 903. Array Manager event number: None.
   Cause: You have attempted to replace a disk with another disk that is using an incompatible technology. For example, you may have replaced one side of a mirror with a SAS dis
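   The escalation from event 2306 to event 2307 can be pictured as a threshold check on how full the bad block table is. The sketch below is an assumption-laden illustration only: the function `bad_block_event`, its parameters, and the choice to treat "80% full" as "80% or more" are hypothetical, and the real controller logic is not described in this guide.

```python
from typing import Optional

WARNING_EVENT_ID = 2306   # "Bad block table is 80% full"
CRITICAL_EVENT_ID = 2307  # "Bad block table is full"


def bad_block_event(used_entries: int, capacity: int) -> Optional[int]:
    """Map bad block table usage to an event ID, or None below the threshold.

    Assumes the warning applies at 80% usage or more and the critical
    event applies once no free entries remain.
    """
    if capacity <= 0:
        raise ValueError("capacity must be positive")
    if used_entries >= capacity:
        return CRITICAL_EVENT_ID
    # Integer arithmetic avoids floating-point rounding at the 80% boundary.
    if used_entries * 10 >= capacity * 8:
        return WARNING_EVENT_ID
    return None


# Hypothetical table with room for 100 remapped blocks.
for used in (79, 80, 100):
    print(used, bad_block_event(used, 100))
```

   Under these assumptions the warning leaves time to back up data and replace the disk before the table fills.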
12. Severity: Non-critical
   Action: See the online help for more information.
   Event ID 2148: Bad block medium error
   Severity: Warning / Non-critical
   Cause: A portion of an array disk is damaged.
   Action: See the online help for more information.
   SNMP trap number: 753. Array Manager event number: 691.
   Event ID 2149: Bad block extended sense error
   Severity: Warning / Non-critical
   Cause: A portion of an array disk is damaged.
   Action: See the online help for more information.
   SNMP trap number: 753. Array Manager event number: 691.
   Table 4-1. Storage Management Messages (continued)
   Event ID 2150: Bad block extended medium error
   Severity: Warning / Non-critical
   Cause: A portion of an array disk is damaged.
   Action: See the online help for more information.
   SNMP trap number: 753. Array Manager event number: 691.
   Event ID 2151: Asset tag changed
   Severity: Ok / Normal
   Cause: A user has changed the enclosure asset tag. This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 851. Array Manager event number: None.
   Event ID 2152: Asset name changed
   Severity: Ok / Normal
   Cause: A user has changed the enclosure asset name. This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 851. Array Manager event number: None.
   Event ID 2153: Service tag changed
   Severity: Warning / Non-critical
   Cause: An enclosure service tag was changed. In most circumstances, this service tag should only be changed by Dell support or your service provider.
   Action: Ensure that the tag was changed under authorized circumstances.
   SNMP trap number: 753. Array Manager event number: None.
   Event ID 2154: Maximum temperature probe warning thres
   Severity: Ok / Normal
   SNMP trap number: 1051. Array Manager event number: None.
   Cause: A user has changed the value for the maximum temperature probe warning
13. Table 4-1. Storage Management Messages (continued)
   Event ID 2314: The initialization sequence of SAS components failed during system startup. SAS management and monitoring is not possible.
   Severity: Critical / Failure / Error
   Cause: Storage Management is unable to monitor or manage SAS devices.
   Action: Reboot the system. If the problem persists, make sure you have supported versions of the drivers and firmware. Also, you may need to reinstall Storage Management or Server Administrator because of some missing installation components.
   SNMP trap number: 104. Array Manager event number: None.
   Event ID 2315: Diagnostic message %1
   NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
   Severity: Ok / Normal
   Cause: This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 751. Array Manager event number: None.
   Event ID 2316: Diagnostic message %1
   NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
   Severity: Critical / Failure / Error
   Cause: A diagnostics test failed. The text for this alert is generated by the utility that ran the diagnostics.
   Action: See the documentation for the utility that ran the diagnostics for more information.
   SNMP trap number: 754. Array Manager event number: None.
   Event ID 2317: BGI terminated due to loss of ownership in a cluster configuration
   Severity: Ok / Normal
   Cause: This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 1201. Array Manager event number: None.
   Event ID 2318: Problems wi
14. Description (continued from the preceding entry): version and the non-RAID SCSI driver version are older than the minimum required levels. See the Readme file for a list of validated kernel and driver versions.
   Severity: Non-critical
   Cause: The version of the kernel and the driver do not meet the minimum requirements. Storage Management may not be able to display the storage or perform storage management functions until you have updated the system to meet the minimum requirements.
   Action: See the Readme file for kernel and driver requirements. Update the system to meet the minimum requirements and then reinstall Storage Management.
   SNMP trap number: 103. Array Manager event number: None.
   Event ID 2168: The non-RAID SCSI driver version is older than the minimum required level. See the Readme file for the validated driver version.
   Severity: Warning / Non-critical
   Cause: The version of the driver does not meet the minimum requirements. Storage Management may not be able to display the storage or perform storage management functions until you have updated the system to meet the minimum requirements.
   Action: See the Readme file for the driver requirements. Update the system to meet the minimum requirements and then reinstall Storage Management.
   SNMP trap number: 103.
   Event ID 2169: The controller battery needs to be replaced
   Severity: Critical / Failure / Error
   Cause: The controller battery cannot recharge. The battery may be old or it may have been already recharged the maximum number of times. In addition, the battery charger may not be working.
   Action: Replace the battery pack.
   SNMP trap number: 1154.
   Event ID 2170: The control
15. as a loose connection or an invalid cabling configuration. See the hardware documentation for information on correct cabling configurations. Check if the firmware is a supported version.
   Table 4-1. Storage Management Messages (continued)
   Event ID 2357: SAS expander error: %1
   NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
   Severity: Critical / Failure / Error
   Cause: The text for this alert is generated by the firmware and can vary depending on the situation.
   Action: There may be a problem with the enclosure. Check the health of the enclosure and its components by selecting the enclosure object in the tree view. The Health subtab displays a red X or yellow exclamation point for enclosure components that are failed or degraded. See the enclosure documentation for more information.
   SNMP trap number: 754. Array Manager event number: None.
   Event ID 2358: The battery charge cycle is complete
   Severity: Ok / Normal
   Cause: This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 1151. Array Manager event number: None.
   Event ID 2359: The physical disk is not certified
   Severity: Warning / Non-critical
   Cause: The physical disk does not comply with the standards set by Dell and is not supported.
   Action: Replace the physical disk with a physical disk that is supported.
   SNMP trap number: 903. Array Manager event number: None.
   Event ID 2360: A user has discarded data from the
   Severity: Ok / Normal
   SNMP trap number: 751. Array Manager event number: None.
   Cause: This alert is provided for informatio
16. discrete: Discrete current state: <State>
   Cause: system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and current sensor value are provided.
   Event Message Reference (page 24)

   Chassis Intrusion Messages
   Chassis intrusion messages listed in Table 2-6 are a security measure. Chassis intrusion means that someone is opening the cover to a system's chassis. Alerts are sent to prevent unauthorized removal of parts from a chassis.
   Table 2-6. Chassis Intrusion Messages
   Event ID 1250: Chassis intrusion sensor has failed
   Severity: Information
   Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. Chassis intrusion state: <Intrusion state>.
   Cause: A chassis intrusion sensor in the specified system failed. The sensor location, chassis location, previous state, and chassis intrusion state are provided.
   Event ID 1251: Chassis intrusion sensor value unknown
   Severity: Information
   Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. Chassis intrusion state: <Intrusion state>.
   Cause: A chassis intrusion sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and chassis intrusion state are provided.
   Event ID 1252: Chassis intrusion returned to normal
   Severity: Information
   Cause: A chassis intrusi
17. Description (continued from the preceding entry): is a SAS disk. SAS disks cannot replace SATA disks.
   Severity: Non-critical
   Cause (continued): assigned as a hot spare is using SAS technology. The array disks in the virtual disk are using SATA technology. Due to this difference in technology, the hot spare cannot rebuild data if one of the array disks in the virtual disk fails.
   Action: Add a SATA disk that is large enough to be used as the hot spare and assign the new disk as a hot spare.
   Event ID 2211: The physical disk is not supported
   Severity: Warning / Non-critical
   Cause: The physical disk may not have a supported version of the firmware or the disk may not be supported by Dell.
   Action: If the disk is supported by Dell, update the firmware to a supported version. If the disk is not supported by Dell, replace the disk with one that is supported.
   SNMP trap number: 903. Array Manager event number: None.
   Event ID 2232: The controller alarm is silenced
   Severity: Ok / Normal
   Cause: This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 751. Array Manager event number: None.
   Event ID 2233: The BGI rate has changed
   Severity: Ok / Normal
   Cause: This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 751. Array Manager event number: None.
   Event ID 2234: The Patrol Read rate has changed
   Severity: Ok / Normal
   Cause: This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 751. Array Manager event number: None.
   Table 4-1. Storage Management Messages (continued)
   Event ID 2235: The Check Consistency rate has
   Severity: Ok / Normal
   SNMP trap number: 751. Array Manager event number: None.
   Cause: This alert is provided for informational pu
18. location, chassis location, previous state, and processor sensor status are provided.
   Cause: A processor sensor in the specified system is disabled, has a configuration error, or experienced a thermal trip. The sensor location, chassis location, previous state, and processor sensor status are provided.
   Cause: A processor sensor in the specified system has failed. The sensor location, chassis location, previous state, and processor sensor status are provided.

   Pluggable Device Messages
   The pluggable device messages listed in Table 2-14 provide status and error information when some devices, such as memory cards, are added or removed.
   Table 2-14. Pluggable Device Messages
   Event ID 1650: <Device plug event type unknown>
   Severity: Information
   Message fields: Device location: <Location in chassis, if available>. Chassis location: <Name of chassis, if available>. Additional details: <Additional details for the events, if available>.
   Cause: A pluggable device event message of unknown type was received. The device location, chassis location, and additional event details, if available, are provided.
   Event ID 1651: Device added to system
   Severity: Information
   Cause: A device was added in the specified system. The device location, chassis location, and additional event details, if available, are provided.
   Message fields: Device location: <Location in chassis>. Chassis location: <Name of chassis>. Additional details: <Additional details for the events
19. Description (continued from the preceding entry): on a virtual disk has been paused (suspended)
   Severity: Normal
   Cause (continued): operation on a virtual disk was paused by a user.
   Action: To resume the check consistency operation, right-click the virtual disk in the Storage Management tree view and select Resume Check Consistency.
   SNMP trap number: 1201. Array Manager event number: 604.
   Event ID 2115: A consistency check on a virtual disk has been resumed
   Severity: Ok / Normal
   Cause: The check consistency operation on a virtual disk has resumed processing after being paused by a user.
   Action: This alert is provided for informational purposes.
   SNMP trap number: 1201. Array Manager event number: 605.
   Table 4-1. Storage Management Messages (continued)
   Event ID 2116: A virtual disk and its mirror have been split
   Severity: Ok / Normal
   Cause: A user has caused a mirrored virtual disk to be split. When a virtual disk is mirrored, its data is copied to another virtual disk in order to maintain redundancy. After being split, both virtual disks retain a copy of the data, although because the mirror is no longer intact, updates to the data are no longer copied to the mirror.
   Action: This alert is provided for informational purposes.
   SNMP trap number: 1201. Array Manager event number: 606.
   Event IDs listed next in the table: 2117, 2118, 2120.
   Event ID 2117: A mirrored virtual disk has been unmirrored
   Severity: Ok / Normal
   SNMP trap number: 1201.
   Cause: A user has caused a mirrored virtual disk to be unmirrored. When a virtual disk is mirrored, its data is copied to another virtual disk in order to maintain redundancy. After being
20. run a Check Consistency task to check the data.
   Event ID 2341: The Check Consistency operation made corrections and completed
   Severity: Ok / Normal
   Cause: This alert is provided for informational purposes.
   Action: None.
   SNMP trap number: 1201. Array Manager event number: None.
   Event ID 2342: The Check Consistency task found inconsistent parity data. Data redundancy may be lost.
   Severity: Warning / Non-critical
   Cause: The data on a source disk and the redundant data on a target disk is inconsistent.
   Action: Restart the Check Consistency task. If you receive this alert again, check the health of the array disks included in the virtual disk. Review the alert messages for significant alerts related to the array disks. If you suspect that an array disk has a problem, replace it and restore from backup.
   SNMP trap number: 1203. Array Manager event number: None.
   Table 4-1. Storage Management Messages (continued)
   Event ID 2343: The Check Consistency logging of inconsistent parity data is disabled
   Severity: Warning / Non-critical
   Cause: The Check Consistency operation can no longer report errors in the parity data.
   Action: See the hardware documentation for more information.
   SNMP trap number: 1203. Array Manager event number: None.
   Event ID 2344: The virtual disk initialization terminated
   Severity: Warning / Non-critical
   Cause: A user has cancelled the virtual disk initialization.
   Action: Restart the initialization.
   SNMP trap number: 1203. Array Manager event number: None.
   Event ID 2345: The virtual disk
   Severity: Critical
   Cause: The controller cannot co
21. supply>. <Additional power supply status information>. If in configuration error state: Configuration error type: <type of configuration error>.
   Table 2-8. Power Supply Messages (continued)
   Event ID 1354: Power supply detected a failure
   Severity: Error
   Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. Power Supply type: <type of power supply>. <Additional power supply status information>. If in configuration error state: Configuration error type: <type of configuration error>.
   Cause: A power supply has been disconnected or has failed. The sensor location, chassis location, previous state, and additional power supply status information are provided.
   Event ID 1355: Power supply sensor detected a non-recoverable value
   Severity: Error
   Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. Power Supply type: <type of power supply>. <Additional power supply status information>. If in configuration error state: Configuration error type: <type of configuration error>.
   Cause: A power supply sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and additional power supply status information are provided.

   Memory Device Messages
   Memory
22. Contents (continued)
   Alert Monitoring and Logging
   Alert Descriptions and Corrective Actions
   Index

   Introduction
   Dell OpenManage Server Administrator produces event messages stored primarily in the operating system or Server Administrator event logs and sometimes in SNMP traps. This document describes the event messages created by Server Administrator version 2.0 or later and displayed in the Server Administrator Alert log.
   Server Administrator creates events in response to sensor status changes and other monitored parameters. The Server Administrator event monitor uses these status change events to add descriptive messages to the operating system event log or the Server Administrator Alert log.
   Each event message that Server Administrator adds to the alert log consists of a unique identifier, called the event ID, for a specific event source category and a descriptive message. The event message includes the severity, cause of the event, and other relevant information, such as the event location and the monitored item's previous state.
   Tables provided in this guide list all Server Administrator event IDs in numeric order. Each entry includes the event ID's corresponding description, severity level, and cause. Message text in angle brackets (for example, <State>) describes the event-specific information provided by the Server Administrator.
   Wha
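   As a rough illustration of the angle-bracket convention described in the introduction, the sketch below fills placeholders such as <State> in a message template with event-specific values. The function name `fill_message`, the sample template, and the sample values are assumptions made for this example; they are not part of Server Administrator.

```python
import re


def fill_message(template: str, values: dict) -> str:
    """Replace each <Name> placeholder with values[Name].

    Placeholders without a supplied value are left unchanged.
    """
    def substitute(match):
        name = match.group(1)
        return values.get(name, match.group(0))

    return re.sub(r"<([^<>]+)>", substitute, template)


# Hypothetical template in the style of the message tables.
template = "Previous state was: <State>; Chassis location: <Name of chassis>"
filled = fill_message(
    template,
    {"State": "Normal", "Name of chassis": "Main System Chassis"},
)
print(filled)
```

   Leaving unknown placeholders untouched keeps a partially filled message readable instead of raising an error.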
23. Index
   A
   A consistency check on a virtual disk has been paused (suspended)
   A consistency check on a virtual disk has been resumed
   A mirrored virtual disk has been unmirrored
   A previously scheduled system BIOS update has been canceled
   A system BIOS update has been scheduled for the next reboot
   A virtual disk and its mirror have been split
   AC power cord is not being monitored
   AC power cord messages
   AC power cord sensor
   AC power cord sensor has failed
   AC power has been lost
   AC power has
24. Dell OpenManage Server Administrator Messages Reference Guide
   www.dell.com | support.dell.com

   Notes and Notices
   NOTE: A NOTE indicates important information that helps you make better use of your computer.
   NOTICE: A NOTICE indicates either potential damage to hardware or loss of data and tells you how to avoid the problem.

   Information in this document is subject to change without notice.
   © 2003-2005 Dell Inc. All rights reserved.
   Reproduction in any manner whatsoever without the written permission of Dell Inc. is strictly forbidden.
   Trademarks used in this text: The DELL logo and Dell OpenManage are trademarks of Dell Inc.; Microsoft and Windows are registered trademarks of Microsoft Corporation; Novell and NetWare are registered trademarks of Novell, Inc.; Red Hat is a registered trademark of Red Hat, Inc.
   Other trademarks and trade names may be used in this document to refer to either the entities claiming the marks and names or their products. Dell Inc. disclaims any proprietary interest in trademarks and trade names other than its own.
   November 2005

   Contents
   1 Introduction
     What's New in this Release
     Messages Not Described in This Guide
     Understanding Event Messages
     Viewing Alerts and Event Messages
   2 Event Message Reference
     Miscellan
25. Discrete voltage state: <State>
   Table 2-4. Voltage Sensor Messages (continued)
   Event ID 1152: Voltage sensor returned to a normal value
   Severity: Information
   Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. If sensor type is not discrete: Voltage sensor value (in Volts): <Reading>. If sensor type is discrete: Discrete voltage state: <State>.
   Cause: A voltage sensor in the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
   Event ID 1153: Voltage sensor detected a warning value
   Severity: Warning
   Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. If sensor type is not discrete: Voltage sensor value (in Volts): <Reading>. If sensor type is discrete: Discrete voltage state: <State>.
   Cause: A voltage sensor in the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.
   Event ID 1154: Voltage sensor detected a failure value
   Severity: Error
   Cause: A voltage sensor in the specified
   Message fields: Sensor location: <Location in chassis>. Chassis location: <Name of chassis>. Previous state was: <State>. If sensor type is not discrete: Voltage sensor value (in Volts)
26. Columns: Event ID, Description, Severity, Cause.
Event ID 1300. Description: Redundancy sensor has failed. Severity: Information. Cause: A redundancy sensor in the specified system failed. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided. Message text: Redundancy unit: <Redundancy location in chassis>; Chassis location: <Name of chassis>; Previous redundancy state was: <State>.
Table 2-7. Redundancy Unit Messages (continued)
Event ID 1301. Description: Redundancy sensor value unknown. Severity: Information. Cause: A redundancy sensor in the specified system could not obtain a reading. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided. Message text: Redundancy unit: <Redundancy location in chassis>; Chassis location: <Name of chassis>; Previous redundancy state was: <State>.
Event ID 1302. Description: Redundancy not applicable. Severity: Information. Cause: A redundancy sensor in the specified system detected that a unit was not redundant. The redundancy location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided. Message text: Redundancy unit: <Redundancy location in chassis>; Chassis location: <Name of chassis>; Previous redundancy state was: <State>.
Event ID 1303. Description: Redundancy is offline. Severity: Information. Cause: A redundancy sensor in the; Message text: Redundancy unit: <Redundancy location
27. If sensor type is discrete: Discrete current state: <State>
Cause (continued from the preceding entry): A current sensor on the power supply for the specified system failed. The sensor location, chassis location, previous state, and current sensor value are provided.
Table 2-5. Current Sensor Messages (continued). Columns: Event ID, Description, Severity, Cause.
Event ID 1202. Description: Current sensor returned to a normal value. Severity: Information. Message text: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; if sensor type is not discrete, Current sensor value (in Amps): <Reading>; if sensor type is discrete, Discrete current state: <State>.
Event ID 1203. Description: Current sensor detected a warning value. Severity: Warning. Message text: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; if sensor type is not discrete, Current sensor value (in Amps): <Reading>; if sensor type is discrete, Discrete current state: <State>.
Event ID 1201. Description: Current sensor value unknown. Severity: Information. Message text: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; if sensor type is not discrete, Current sensor value (in Amps): <Reading>; if sensor type is discrete, Discrete current state: <State>. Cause: A current sensor on the power supply for the specified system could not obtain a reading. The sensor loc
28. Self-Monitoring Analysis and Reporting Technology (SMART). When enabled, SMART monitors the health of the disk based on indications such as the number of write operations that have been performed on the disk. Action: Replace the array disk. Even though the disk may not have failed yet, it is strongly recommended that you replace the disk. Review the message text for additional information.
Table 4-1. Storage Management Messages (continued). Columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number.
Event ID 2095. Description: SCSI sense data. If this disk is part of a redundant virtual disk, select the Offline option and then replace the disk. Then configure a hot spare and it will start the rebuild automatically. If this disk is a hot spare, select the Prepare to Remove option and then replace the disk. If this disk is part of a non-redundant disk, you should back up your data immediately. If the disk fails, you will not be able to recover the data. Severity: Warning / Non-critical. Cause: An array disk has failed, is corrupt, or is otherwise experiencing a problem. Action: Replace the array disk. Even though the disk may not have failed yet, it is strongly recommended that you replace the disk. Review the message text for additional information. SNMP trap number: 903. Array Manager event number: 57.
Event ID 2098. Description: Global hot spare assigned. Severity: Ok / Normal. SNMP trap number: 901. Array Manager event number: 574. Cause: A user has assigned an array disk as a global hot spare. This al
29. Si, 3/Di, CERC SATA 1.5/6ch, or CERC SATA 1.5/2s controller, this alert displays the new virtual disk name. On the PERC 2/SC, 2/DC, 3/SC, 3/DCL, 3/DC, 3/QC, 4/SC, 4/DC, 4e/DC, 4/Di, 4/IM, 4e/Si, 4e/Di, and CERC ATA 100/4ch controllers, this alert displays the original virtual disk name. Action: None.
Cause: A user has assigned an array disk as a dedicated hot spare to a virtual disk. See the online help for more information. This alert is provided for informational purposes. Action: None. SNMP trap number: 901. Array Manager event number: 574.
Cause: A user has unassigned an array disk as a dedicated hot spare to a virtual disk. See the online help for more information. This alert is provided for informational purposes. Action: None. SNMP trap number: 901. Array Manager event number: 575.
Cause: Communication with an enclosure has been restored. This alert is provided for informational purposes. Action: None. SNMP trap number: 851. Array Manager event number: None.
Table 4-1. Storage Management Messages (continued). Columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number.
Event ID 2163. Description: Rebuild completed with errors. Severity: Ok / Normal. Cause and Action: See the online help for more information. SNMP trap number: 904. Array Manager event number: 690.
Event ID 2164. Description: See the Readme file for a list of validated controller driver versions. Severity: Ok / Normal. SNMP trap number: 101. Array Manager event number: None. Cause: Storage Management is unable to determine whether the system has the minimum required versions of the RAID controller drivers. Action: This alert is generated for informational purposes. See the Readme file for driver a
30. Supply Sensor Name> power supply sensor AC recovered. Severity: Information. Cause: This event is generated when the power supply has been replaced.
Event message: <Power Supply Sensor Name> power supply sensor returned to normal state. Severity: Information. Cause: This event is generated when the power supply that failed or was removed has been replaced and the state has returned to normal.
Table 3-5. Power Supply Events (continued). Columns: Event Message, Severity, Cause.
Event message: <Entity Name> PS Redundancy sensor redundancy degraded. Severity: Information. Cause: Power supply redundancy is degraded if one of the power supply sources is removed or failed.
Event message: <Entity Name> PS Redundancy sensor redundancy lost. Severity: Critical. Cause: Power supply redundancy is lost if only one power supply is functional.
Event message: <Entity Name> PS Redundancy sensor redundancy regained. Severity: Information. Cause: This event is generated if the power supply has been reconnected or replaced.
Memory ECC Events
The memory ECC event messages monitor the memory modules in a system. These messages monitor the ECC memory correction rate and the type of memory events that occurred.
Table 3-6. Memory ECC Events. Columns: Event Message, Severity, Cause.
Event message: ECC uncorrectable error detected on Bank DIMM. Severity: Critical.
Event message: Correctable memory error logging disabled. Severity: Critical.
Event message: ECC error correction detected on Bank DIMM A B. Severity: Information. Cause: This event is generated when there is a memory error correction on a
31. The following sections describe alerts generated by the RAID or SCSI controllers supported by Storage Management. The alerts are displayed in the Server Administrator Alert subtab or through the Windows Event Viewer. These alerts can also be forwarded as SNMP traps to other applications. SNMP traps are generated for the alerts listed in the following sections. These traps are included in the Storage Management management information base (MIB). The SNMP traps for these alerts use all of the SNMP trap variables. For more information on SNMP support and the MIB, see the SNMP Reference Guide. To locate an alert, scroll through the following table to find the alert number displayed on the Server Administrator Alert tab, or search this file for the alert message text or number. See "Understanding Event Messages" for more information on severity levels.
NOTE: If you have an Array Manager installation, the Array Manager console reports the status of storage components through error icons and graphical displays. When there is a change in status, Array Manager sends events to the Array Manager event log, which can be viewed from the Array Manager console. For more information, see the Dell OpenManage Array Manager User's Guide. For more information regarding alert descriptions and the appropriate corrective actions, see the online help.
Table 4-1. Storage Management Messages. Columns: Event ID, Description, S
33. a normal value. Severity: Information. Cause: A fan sensor reading on the specified system returned to a valid range after crossing a warning threshold. The sensor location, chassis location, previous state, and fan sensor value are provided. Message text: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Fan sensor value: <Reading>.
Event ID 1103. Description: Fan sensor detected a warning value. Severity: Warning. Cause: A fan sensor reading in the specified system exceeded a warning threshold. The sensor location, chassis location, previous state, and fan sensor value are provided. Message text: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Fan sensor value: <Reading>.
Event ID 1104. Description: Fan sensor detected a failure value. Severity: Error. Cause: A fan sensor in the specified system detected the failure of one or more fans. The sensor location, chassis location, previous state, and fan sensor value are provided. Message text: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Fan sensor value: <Reading>.
Table 2-3. Cooling Device Messages (continued). Columns: Event ID, Description, Severity, Cause.
Event ID 1105. Description: Fan sensor detected a non-recoverable value. Severity: Error. Cause: A fan sensor detected an error from which it cannot recover. The sensor location, c
34. as been lost. Action: Verify that the battery and memory are functioning properly.
Event ID 2187. Description: Single bit ECC error limit exceeded. Severity: Warning / Non-critical. Cause: The system memory is malfunctioning. Action: Replace the battery pack. SNMP trap number: 753. Array Manager event number: None.
Table 4-1. Storage Management Messages (continued). Columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number.
Event ID 2188. Description: The controller write policy has been changed to Write Through. Severity: Warning / Non-critical. Cause: The controller battery is unable to maintain cached data for the required period of time. For example, if the required period of time is 24 hours, the battery is unable to maintain cached data for 24 hours. It is normal to receive this alert during the battery Learn cycle, as the Learn cycle discharges the battery before recharging it. When discharged, the battery cannot maintain cached data. Action: Check the health of the battery. If the battery is weak, replace the battery pack. SNMP trap number: 1153. Array Manager event number: None.
Event ID 2189. Description: The controller write policy has been changed to Write Back. Severity: Ok / Normal. Cause: This alert is provided for informational purposes. Action: None. SNMP trap number: 1151. Array Manager event number: None.
Event ID 2191. Description: Multiple enclosures are attached to the controller. This is an unsupported; Severity: Critical / Failure / Error. SNMP trap number: 854. Array Manager event number: None. Cause: Many enclosures are attached to the controller port. When the enclosure limit is exceeded, the controller loses contact with all enclosures attached
35. as received a SMART alert (predictive failure). The disk is likely to fail in the near future. Action: Replace the disk that has received the SMART alert. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss. SNMP trap and Array Manager event numbers listed for the preceding entries: 903, 588; 903, 589.
Table 4-1. Storage Management Messages (continued). Columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number.
Event ID 2111. Description: Failure prediction threshold exceeded due to test. No action needed. Severity: Warning / Non-critical. Cause: A disk has received a SMART alert (predictive failure) due to test conditions. Action: None. SNMP trap number: 903. Array Manager event number: 590.
Event ID 2112. Description: Enclosure was shut down. Severity: Critical / Failure / Error. Cause: The array disk enclosure is either hotter or cooler than the maximum or minimum allowable temperature range. Action: Check for factors that may cause overheating or excessive cooling. For example, verify that the enclosure fan is working. You should also check the thermostat settings and examine whether the enclosure is located near a heat source. Make sure the enclosure has enough ventilation and that the room temperature is not too hot or too cold. See the enclosure documentation for more diagnostic information. SNMP trap number: 854. Array Manager event number: 602.
Event ID 2114. Description: A consistency check; Severity: Ok. Cause: The check consistency
36. ate> are provided.
Event ID 1255. Description: Chassis intrusion sensor detected a non-recoverable value. Severity: Error. Cause: A chassis intrusion sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and chassis intrusion state are provided. Message text: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Chassis intrusion state: <Intrusion state>.
Redundancy Unit Messages
Redundancy means that a system chassis has more than one of certain critical components. Fans and power supplies, for example, are so important for preventing damage or disruption of a system that a chassis may have extra fans or power supplies installed. Redundancy allows a second or nth fan to keep the chassis components at a safe temperature when the primary fan has failed. Redundancy is normal when the intended number of critical components are operating. Redundancy is degraded when a component fails but others are still operating. Redundancy is lost when the number of components functioning falls below the redundancy threshold.
Table 2-7 lists the redundancy unit messages. The number of devices required for full redundancy is provided as part of the message, when applicable, for the redundancy unit and the platform. For details on redundancy computation, see the respective platform documentation.
Table 2-7. Redundancy Unit Messages
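The three redundancy states described above (normal, degraded, lost) can be sketched as a small classifier. This is only an illustration of the wording in the guide, not how Server Administrator computes redundancy; the function name and the `threshold` parameter are hypothetical, and the guide defers the real computation to the platform documentation.

```python
def redundancy_state(functioning: int, intended: int, threshold: int) -> str:
    """Classify a redundancy unit from component counts.

    functioning: components currently operating
    intended:    components expected when redundancy is normal
    threshold:   hypothetical minimum count; below it, redundancy is lost
    """
    if functioning >= intended:
        return "normal"      # the intended number of components are operating
    if functioning >= threshold:
        return "degraded"    # a component failed but others still operate
    return "lost"            # count fell below the redundancy threshold


if __name__ == "__main__":
    # Example: a chassis intended to run three fans, needing at least two.
    for count in (3, 2, 1):
        print(count, redundancy_state(count, intended=3, threshold=2))
```

With three intended fans and a threshold of two, losing one fan reports degraded and losing two reports lost.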
37. ation. Restart the virtual disk initialization.
Cause: A user has cancelled the rebuild operation. Action: Restart the rebuild operation.
Cause: An array disk included in the virtual disk failed or there is an error in the parity information. A failed array disk can cause errors in parity information. Action: Replace the failed array disk. You can identify which disk has failed by locating the disk that has a red X for its status. Rebuild the array disk. When finished, restart the check consistency operation.
Cause: An array disk included in the virtual disk failed. Action: Replace the failed array disk. You can identify which array disk has failed by locating the disk that has a red X for its status. Rebuild the array disk. When finished, restart the virtual disk format operation.
Cause: An array disk included in the virtual disk has failed or a user has cancelled the initialization. Action: If an array disk has failed, then replace the array disk.
Array Manager event numbers listed for the entries above: 532, 536, 538, 539, 541.
Table 4-1. Storage Management Messages (continued). Columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number.
Event ID 2080. Description: Array disk initialize failed. Severity: Critical / Failure / Error. SNMP trap number: 904. Array Manager event number: 542. Cause: The array disk has failed or is corrupt. Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a re
38. ation, chassis location, previous state, and a nominal current sensor value are provided.
Cause: A current sensor on the power supply for the specified system returned to a valid range after crossing a failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
Cause: A current sensor on the power supply for the specified system exceeded its warning threshold. The sensor location, chassis location, previous state, and current sensor value are provided.
Table 2-5. Current Sensor Messages (continued). Columns: Event ID, Description, Severity, Cause.
Event ID 1204. Description: Current sensor detected a failure value. Severity: Error. Cause: A current sensor on the power supply for the specified system exceeded its failure threshold. The sensor location, chassis location, previous state, and current sensor value are provided. Message text: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; if sensor type is not discrete, Current sensor value (in Amps): <Reading>; if sensor type is discrete, Discrete current state: <State>.
Event ID 1205. Description: Current sensor detected a non-recoverable value. Severity: Error. Cause: A current sensor in the specified; Message text: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; if sensor type is not discrete, Current sensor value (in Amps): <Reading>; If sensor type is
39. ational purposes. Action: None.
Cause: The controller has lost communication with an EMM. The cables may be loose or defective. Action: Make sure the cables are attached securely. Reboot the system. SNMP trap number: 854.
Array Manager event numbers listed for the entries above: None.
Table 4-1. Storage Management Messages (continued). Columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number.
Event ID 2293. Description: The EMM has failed. Severity: Critical / Failure / Error. Cause: The failure may be caused by a loss of power to the EMM. The EMM self test may also have identified a failure. There could also be a firmware problem or a multi-bit error. Action: Replace the EMM. See the hardware documentation for information on replacing the EMM. SNMP trap number: 854. Array Manager event number: None.
Event ID 2294. Description: A device has been inserted. Severity: Ok / Normal. Cause: This alert is provided for informational purposes. Action: None. SNMP trap numbers: 752, 802, 852, 902, 952, 1002, 1052, 1102, 1152, 1202. Array Manager event number: None.
Event ID 2295. Description: A device has been removed. Severity: Critical / Failure / Error. Cause: A device has been removed and the system is no longer functioning in optimal condition. Action: Replace the device. SNMP trap numbers: 754, 804, 854, 904, 954, 1004, 1054, 1104, 1154, 1204. Array Manager event number: None.
Event ID 2296. Description: An EMM has been inserted. Severity: Ok / Normal. Cause: This alert is provided for informational purposes. Action: None. SNMP trap number: 851. Array Manager event number: None.
Event ID 2297. Description: An EMM has been removed. Severity: Critical / Failure / Error. SNMP trap number: 854. Array Manager event number: None. Cause: An EMM has been removed. Action: Replace the EMM. See the hardware documenta
40. atus is not being monitored. This occurs when a system's expected AC power configuration is set to nonredundant. The sensor location and chassis location information are provided. Message text: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>.
Event ID 1502. Description: AC power has been restored. Severity: Information. Cause: An AC power cord that did not have AC power has had the power restored. The sensor location and chassis location information are provided. Message text: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>.
Table 2-11. AC Power Cord Messages (continued). Columns: Event ID, Description, Severity, Cause.
Event ID 1503. Description: AC power has been lost. Severity: Warning. Cause: An AC power cord has lost its power, but there is sufficient redundancy to classify this as a warning. The sensor location and chassis location information are provided. Message text: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>.
Event ID 1504. Description: AC power has been lost. Severity: Error. Cause: An AC power cord has lost its power, and lack of redundancy requires this to be classified as an error. The sensor location and chassis location information are provided. Message text: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>.
Event ID 1505. Description: AC power has been lost. Severity: Error. Cause: An AC power cord sensor in the specified system failed. The AC power cord status cannot be monitored. The sensor location and chassis location information are p
41. ay indicate a possible future problem. For example, a Warning / Non-critical alert may indicate that a component (such as a temperature probe in an enclosure) has crossed a warning threshold. Severity: Warning / Non-critical.
A significant event that indicates actual or imminent loss of data or loss of function. For example, crossing a failure threshold or a hardware failure such as an array disk. Severity: Critical / Failure / Error.
Server Administrator generates events based on status changes in the following sensors:
Temperature Sensor: Helps protect critical components by alerting the systems management console when temperatures become too high inside a chassis; also monitors a variety of locations in the chassis and in any attached systems.
Fan Sensor: Monitors fans in various locations in the chassis and in any attached systems.
Voltage Sensor: Monitors voltages across critical components in various chassis locations and in any attached systems.
Current Sensor: Monitors the current (or amperage) output from the power supply (or supplies) in the chassis and in any attached systems.
Chassis Intrusion Sensor: Monitors intrusion into the chassis and any attached systems.
Redundancy Unit Sensor: Monitors redundant units (critical units such as fans, AC power cords, or power supplies) within the chassis; also monitors the chassis and any attached systems. For example, redundancy allows a second or nth fan to keep the chassis components at a safe tem
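Because the guide uses several names for the same severity level (for example Warning or Non-critical, and Critical, Failure, or Error), a script that processes alerts may want to normalize them. The following is a minimal sketch under stated assumptions: the `Severity` enum, the alias table, and both function names are hypothetical, and treating Information as equivalent to Ok / Normal is an assumption made here for illustration, not something the guide states.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical ordering of the severity levels named in the guide."""
    OK = 0
    WARNING = 1
    CRITICAL = 2


# Alias table built from the severity names that appear in the message tables.
# Mapping "information" to OK is an assumption made for this sketch.
_ALIASES = {
    "ok": Severity.OK,
    "normal": Severity.OK,
    "information": Severity.OK,
    "warning": Severity.WARNING,
    "non-critical": Severity.WARNING,
    "critical": Severity.CRITICAL,
    "failure": Severity.CRITICAL,
    "error": Severity.CRITICAL,
}


def parse_severity(label: str) -> Severity:
    """Map a severity name to its level; raises KeyError for unknown names."""
    return _ALIASES[label.strip().lower()]


def worst(labels) -> Severity:
    """Return the most severe level among the given severity names."""
    return max(parse_severity(label) for label in labels)


if __name__ == "__main__":
    print(worst(["Ok", "Non-critical", "Error"]).name)
```

Using an `IntEnum` keeps the levels comparable, so picking the worst status across several alerts is a plain `max`.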
42. ays. Action: None.
Event ID 2367. Description: Rebuild not possible as SAS/SATA is not supported in the same virtual disk. Severity: Ok / Normal. Cause: This alert is provided for informational purposes. Action: None. SNMP trap number: 901. Array Manager event number: None.
Event ID 2368. Description: The SEP has been rebooted as part of the firmware download operation and will be unavailable until the operation completes. Severity: Ok / Normal. Cause: This alert is provided for informational purposes. Action: None. SNMP trap number: 851. Array Manager event number: None.
44. behalf the event occurred. Computer: The name of the system where the event occurred. Source: The software that logged the event. Category: The classification of the event by the event source. Event ID: The number identifying the particular event type. Description: A description of the event. The format and contents of the event description vary depending on the event type.
Understanding the Event Description
Table 1-2 lists in alphabetical order each line item that may appear in the event description.
Table 1-2. Event Description Reference. Columns: Description Line Item, Explanation.
Action performed was: <Action>. Specifies the action that was performed, for example: Action performed was: Power cycle
Action requested was: <Action>. Specifies the action that was requested, for example: Action requested was: Reboot, shutdown OS first
Additional Details: <Additional details for the event>. Specifies additional details available for the hot plug event, for example: Memory device: DIMM1_A Serial number: FFFF30B1
<Additional power supply status information>. Specifies information pertaining to the event, for example: Power supply input AC is off, Power supply POK (power OK) signal is not normal, Power supply is turned off
Chassis intrusion state: <Intrusion state>. Specifies the chassis intrusion state (open or closed), for example: Chassis intrusion state: Open
Chassis location: <Name of chassis>. Specifies the name of the chassis that generated the message, for example: Chassis location: Main System Chassis
Configuration error type: <type of configuration error>. Specifies the type of configuration error that occurred, for example: Configuration error type: Revision mismatch
Current sensor value (in Amps): <Reading>. Specifies the current sensor value in amps, for example: Current sensor value (in Amps): 7.853
Date and time of action: <Date and time>. Specifies the date and time the action was performed, for example: Date and time of action: Tue Mar 21 16:20:33 2006
Device location: <Location in chassis>. Specifies the location of the device in the specified chassis, for example: Device location: Memory Card A
Discrete current state: <State>. Specifies the state of the current sensor, for example: Discrete current state: Good
Discrete temperature state: <State>. Specifies the state of the temperature sensor, for example: Discrete temperature state: Good
45. Table 1-2. Event Description Reference (continued). Columns: Description Line Item, Explanation. Description line items: Discrete voltage state: <State>; Fan sensor value: <Reading>; Log type: <Log type>; Memory device bank location: <Bank name in chassis>; Memory device location: <Device name in chassis>; Number of devic
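The line items in Table 1-2 suggest that an event description can be read as a series of name and value pairs. As a rough sketch, assuming each line item is printed on its own line in the form "Name: value" (the colon-separated layout and the function name `parse_event_description` are assumptions for illustration, not something the guide specifies), the fields could be pulled out like this:

```python
def parse_event_description(text: str) -> dict[str, str]:
    """Split an event description into {line item name: value}.

    Assumes one "Name: value" pair per line; lines without a colon
    (for example the leading message title) are skipped.
    """
    fields = {}
    for line in text.splitlines():
        # Split on the first colon only, so values such as timestamps
        # ("16:20:33") keep their own colons.
        name, sep, value = line.partition(":")
        if sep and value.strip():
            fields[name.strip()] = value.strip()
    return fields


if __name__ == "__main__":
    sample = (
        "Current sensor detected a warning value\n"
        "Chassis location: Main System Chassis\n"
        "Current sensor value (in Amps): 7.853\n"
        "Date and time of action: Tue Mar 21 16:20:33 2006\n"
    )
    for name, value in parse_event_description(sample).items():
        print(f"{name} -> {value}")
```

The sample values are the examples given in Table 1-2; the first line of the sample is a message title and is ignored because it contains no colon.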
46. ck. See your hardware documentation for information on replacing the DIMM.
Event ID 2322. Description: The DC power supply is switched off. Severity: Critical / Failure / Error. Cause: The power supply unit is switched off. Either a user switched off the power supply unit or it is defective. Action: Check if the power switch is turned off. If it is turned off, turn it on. If the problem persists, check if the power cord is attached and functional. If the problem is still not corrected, or if the power switch is already turned on, replace the power supply unit. SNMP trap number: 1004. Array Manager event number: None.
Event ID 2323. Description: The power supply is switched on. Severity: Ok / Normal. Cause: This alert is provided for informational purposes. Action: None. SNMP trap number: 1001. Array Manager event number: None.
Table 4-1. Storage Management Messages (continued). Columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number.
Event ID 2324. Description: The AC power supply cable has been removed. Severity: Critical / Failure / Error. Cause: The power cable may be pulled out or removed. The power cable may also have overheated and become warped and nonfunctional. Action: Replace the power cable. SNMP trap number: 1004. Array Manager event number: None.
Event ID 2325. Description: The power supply cable has been inserted. Severity: Ok / Normal. Cause: This alert is provided for informational purposes. Action: None. SNMP trap number: 1001. Array Manager event number: None.
Event ID 2326. Description: A foreign configuration has been detected. Severity: Ok / Normal. SNMP trap number: 751. Array Manager event number: None. Cause: This alert is provided for informational purposes. The controller has array disks that were moved from another controller. The
47. d X for its status. Restart the initialization.
Event ID 2081. Description: Virtual disk reconfiguration failed. Severity: Critical / Failure / Error. Cause: An array disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the reconfiguration. Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red X for its status. If the array disk is part of a redundant array, then rebuild the array disk. When finished, restart the reconfiguration. SNMP trap number: 1204. Array Manager event number: 543.
Event ID 2082. Description: Virtual disk rebuild failed. Severity: Critical / Failure / Error. Cause: An array disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the rebuild. Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red X for its status. Restart the virtual disk rebuild. SNMP trap number: 1204. Array Manager event number: 544.
Event ID 2083. Description: Array disk rebuild failed. Severity: Critical / Failure / Error. Cause: An array disk included in the virtual disk has failed or is corrupt. A user may also have cancelled the rebuild. Action: Replace the failed or corrupt disk. You can identify a disk that has failed by locating the disk that has a red X for its status. Restart the rebuild. SNMP trap number: 904. Array Manager event number: 545.
Event ID 2085. Description: Virtual disk check consistency completed. Severity: Ok / Normal. Cause: This alert is provided for informational purposes. Action: None. SNMP trap number: 1201. Array Manager event number: 547.
Event ID 2086. Severity: Ok. Cause: This alert is provided for informational purposes. SNMP trap number: 1201. Array Manager event number: 548. Description: Virtual disk format complet
48. device messages listed in Table 2-9 provide status and warning information for memory modules present in a particular system. Memory devices determine health status by monitoring the ECC memory correction rate and the type of memory events that have occurred.
NOTE: A critical status does not always indicate a system failure or loss of data. In some instances, the system has exceeded the ECC correction rate. Although the system continues to function, you should perform system maintenance as described in Table 2-9.
NOTE: In Table 2-9, <status> can be either critical or non-critical.
Table 2-9. Memory Device Messages. Columns: Event ID, Description, Severity, Cause.
Event ID 1403. Description: Memory device status is <status>. Severity: Warning. Cause: A memory device correction rate exceeded an acceptable value. The memory device status and location are provided. Message text: Memory device location: <location in chassis>; Possible memory module event cause: <list of causes>.
Event ID 1404. Description: Memory device status is <status>. Severity: Error. Message text: Memory device location: <location in chassis>; Possible memory module event cause: <list of causes>. Cause: A memory device correction rate exceeded an acceptable value, a memory spare bank was activated, or a multibit ECC error occurred. The system continues to function normally (except for a multibit error). Replace the memory module identified in the message during the system's next scheduled main
49. disks that reside on the controller. Action: Assign a larger disk as the global hot spare.
Cause: The battery is discharging. A battery discharge is a normal activity during the battery Learn cycle. Before completing, the battery Learn cycle recharges the battery. You should receive alert 2179 when the recharge occurs. Action: Check if the battery Learn cycle is in progress. Alert 2176 indicates that the battery Learn cycle has initiated. The battery also displays the Learn state while the Learn cycle is in progress. If a Learn cycle is not in progress, replace the battery pack. SNMP trap number: 1154.
Cause: This alert is provided for informational purposes. This alert indicates that the battery is recharging during the battery Learn cycle. Action: None. SNMP trap number: 1151.
Cause: A disk media error was detected while the controller was completing a background task. A bad disk block was identified. The disk block has been remapped. Action: Consider replacing the disk. If you receive this alert frequently, be sure to replace the disk. You should also routinely back up your data. SNMP trap number: 1201.
Cause: This alert is provided for informational purposes. Action: None. SNMP trap number: 1201.
Array Manager event numbers listed for the entries above: None.
Table 4-1. Storage Management Messages (continued). Columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number.
Event ID 2282. Description: Hot spare SMART; Severity: Critical. Cause: The controller
50. e changed 65 P pluggable device sensor 9 Power supply detected a failure 30 Power supply detected a warning 29 42 Power Supply Events 40 power supply messages 28 Power supply returned to normal 29 42 power supply sensor 8 Power supply sensor detected a non recoverable value 30 Power supply sensor has failed 28 Power supply sensor value unknown 29 Predictive Failure reported 52 processor Sensor 9 Processor sensor detected a failure value 35 44 Processor sensor detected a non recoverable value 35 Processor sensor detected a warning value 35 44 Processor sensor has failed 34 44 Processor sensor returned toa normal state 35 44 Processor sensor value unknown 34 44 Processor Status Events 40 Rebuild completed with errors 67 Redundancy degraded 28 59 Redundancy is offline 27 Redundancy lost 28 60 Redundancy normal 60 Redundancy not applicable 27 41 Redundancy regained 27 Redundancy sensor has failed 26 Redundancy sensor value unknown 27 41 redundancy unit messages 26 41 redundancy unit sensor 8 S SCSI sense data 53 SCSI sense sector reassign 61 Index 105 See readme txt for a list of validated controller driver versions 67 sensor AC power cord 9 chassis intrusion 8 current 8 fan 8 fan enclosure 9 hardware log 9 memory prefailure 8 power supply 8 processor 9 34 44 redundancy unit 8
51. e 4 1 Storage Management Messages continued Event ID Description Severity Cause and Action SNMP Trap Array Numbers Manager Event Number 2109 2110 Smart warning temperature Smart warning degraded Warning Non critical Warning Non critical Cause A disk has reached an unacceptable temperature and received a SMART alert predictive failure The disk is likely to fail in the near future First Action Determine why the array disk has reached an unacceptable temperature A variety of factors can cause the excessive temperature For example a fan may have failed the thermostat may be set too high or the room temperature may be too hot or cold Verify that the fans in the server or enclosure are working If the array disk is in an enclosure you should check the thermostat settings and examine whether the enclosure is located near a heat source Make sure the enclosure has enough ventilation and that the room temperature is not too hot See the enclosure documentation for more diagnostic information Second Action If you cannot identify why the disk has reached an unacceptable temperature then replace the disk If the array disk is a member of a non redundant virtual disk then back up the data before replacing the disk Removing an array disk that is included in a non redundant virtual disk will cause the virtual disk to fail and may cause data loss Cause A disk is degraded and h
52. Table 4-1. Storage Management Messages (continued)

Event ID: 2349
Description: A bad disk block could not be reassigned during a write operation
Severity: Critical / Failure / Error
Cause: A write operation could not complete because the disk contains bad disk blocks that could not be reassigned. Data loss may have occurred and data redundancy may also be lost.
Action: Replace the disk.
SNMP trap number: 904
Array Manager event number: None

Event ID: 2350
Description: There was an unrecoverable disk media error during the rebuild
Severity: Critical / Failure / Error
Cause: The rebuild encountered an unrecoverable disk media error.
Action: Replace the disk.
SNMP trap number: 904
Array Manager event number: None

Event ID: 2351
Description: A physical disk is marked as missing
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 901
Array Manager event number: None

Event ID: 2352
Description: A physical disk that was marked as missing has been replaced
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 901
Array Manager event number: None

Event ID: 2353
Description: The enclosure temperature has returned to normal
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 851
Array Manager event number: None

Event ID: 2354
Description: Enclosure firmware download in progress
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 851
Array Manager event number: None

Table 4-1. Storage Management Messages (continued)

Event ID: 2355
Description: Enclosure f
53. ed Normal informational purposes.
Action: None.

Table 4-1. Storage Management Messages (continued)

Event ID: 2088
Description: Virtual disk initialization completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 1201
Array Manager event number: 550

Event ID: 2089
Description: Array disk initialize completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 901
Array Manager event number: 551

Event ID: 2090
Description: Virtual disk reconfiguration completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 1201
Array Manager event number: 552

Event ID: 2091
Description: Virtual disk rebuild completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 1201
Array Manager event number: 553

Event ID: 2092
Description: Array disk rebuild completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 901
Array Manager event number: 554

Event ID: 2094
Description: Predictive Failure reported
Severity: Warning / Non-critical
SNMP trap number: 903
Array Manager event number: 570
Cause: The array disk is predicted to fail.
If this disk is part of a redundant virtual disk, select the Offline option and then replace the disk. Then configure a hot spare and it will start the rebuild automatically. If this disk is a hot spare, select the Prepare to Remove option and then replace the disk. If this disk is part of a non-redundant disk, you should back up your data immediately. If the disk fails, you will not be able to recover the data.
Many array disks contain
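The guidance for alert 2094 depends on the role the disk plays. The Python sketch below restates that decision as a lookup; it is an illustration only. The names DiskRole and recommended_steps are hypothetical and are not part of Server Administrator, and the step text is paraphrased from the entry above.

```python
from enum import Enum
from typing import List


class DiskRole(Enum):
    """Hypothetical labels for the three cases named in the 2094 entry."""
    REDUNDANT_MEMBER = "member of a redundant virtual disk"
    HOT_SPARE = "hot spare"
    NON_REDUNDANT_MEMBER = "member of a non-redundant disk"


def recommended_steps(role: DiskRole) -> List[str]:
    """Return the steps the 2094 entry gives for a disk in the given role."""
    if role is DiskRole.REDUNDANT_MEMBER:
        return [
            "Select the Offline option",
            "Replace the disk",
            "Configure a hot spare (the rebuild starts automatically)",
        ]
    if role is DiskRole.HOT_SPARE:
        return [
            "Select the Prepare to Remove option",
            "Replace the disk",
        ]
    # Non-redundant case: the data cannot be recovered if the disk fails.
    return ["Back up your data immediately"]


if __name__ == "__main__":
    for role in DiskRole:
        print(role.value, "->", recommended_steps(role))
```

Keeping the three cases in one function makes it easy to see that only the non-redundant case calls for an immediate backup.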
54. ed the Failure A variety of factors can cause the excessive maximum failure Error temperature For example a fan may have threshold failed the thermostat may be set too high or the room temperature may be too hot Action Check for factors that may cause overheating For example verify that the enclosure fan is working You should also check the thermostat settings and examine whether the enclosure is located near a heat source Make sure the enclosure has enough ventilation and that the room temperature is not too hot See the enclosure documentation for more diagnostic information Temperature dropped Critical Cause The array disk enclosure is too cool below the minimum Failure Action Check whether the thermostat failure threshold Error setting is too low and whether the room temperature is too cool Storage Management Message Reference 1053 591 1053 592 1054 593 1054 594 Table 4 1 Storage Management Messages continued Event ID Description Severity Cause and Action SNMP Trap Array Numbers Manager Event Number 2104 Controller battery is Ok Cause This alert is provided for reconditioning Normal informational purposes Action None 2105 Controller battery Ok Cause This alert is provided for recondition is Normal informational purposes completed Action None 2106 Smart FPT exceeded Warning Cause A disk on the specified controller has Non critical received a SMART alert pr
55. ed the loss of all fans or all power supplies.
Action: Identify and replace the failed components. To identify the failed component, select the Storage object and click the Health subtab. The controller status displayed on the Health subtab indicates whether a controller has a failed or degraded component. Click the controller that displays a Warning or Failed status. This action displays the controller Health subtab, which displays the status of the individual controller components. Continue clicking the components with a Warning or Failed status until you identify the failed component. See the online help for more information. See the enclosure documentation for information on replacing enclosure components and for other diagnostic information.

Event ID: 2124
Description: Redundancy normal
Severity: Ok / Normal
Cause: Data redundancy has been restored to a virtual disk or an enclosure that previously suffered a loss of redundancy.
Action: This alert is provided for informational purposes.
SNMP trap number: 1304
Array Manager event number: None

Table 4-1. Storage Management Messages (continued)

Event ID: 2126
Description: SCSI sense sector reassign
Severity: Warning / Non-critical
SNMP trap number: 903
Array Manager event number: None
Cause: A sector of the disk is corrupted and data cannot be maintained on this portion of the disk.
Action: If the disk is part of a non-redundant virtual disk, then replace the disk. Any data residing on the corrupt p
56. ed to replace some devices such as the controller or EMM. See the hardware documentation for more information.

Event ID: 2330
Description: SAS port report %1
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 751
Array Manager event number: None

Event ID: 2331
Description: A bad disk block has been reassigned
Severity: Warning / Non-critical
Cause: The disk has a bad block. Data has been readdressed to another disk block and no data loss has occurred.
Action: Monitor the disk for other alerts or indications of poor health. For example, you may receive alert 2306. Replace the disk if you suspect there is a problem.
SNMP trap number: 903
Array Manager event number: None

Event ID: 2332
Description: A controller hot plug has been detected
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 751
Array Manager event number: None

Event ID: 2333
Description: An enclosure temperature sensor differential has been detected
Severity: Warning / Non-critical
Cause: The firmware has detected a temperature sensor differential in the enclosure.
Action: Monitor the enclosure for other alerts related to the temperature. For example, you may receive alerts related to the fan or temperature probes. Check the health of the enclosure and its components. Replace any component that has failed.
SNMP trap number: 853
Array Manager event number: None

Table 4-1. Storage Management Messages (continued)
57. edictive failure, indicating that the disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.

Event ID: 2107
Description: Smart configuration change
Severity: Critical / Failure / Error
Cause: A disk has received a SMART alert (predictive failure) after a configuration change. The disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.

Event ID: 2108
Description: Smart warning
Severity: Warning / Non-critical
Cause: A disk has received a SMART alert (predictive failure). The disk is likely to fail in the near future.
Action: Replace the disk that has received the SMART alert. If the array disk is a member of a non-redundant virtual disk, then back up the data before replacing the disk. Removing an array disk that is included in a non-redundant virtual disk will cause the virtual disk to fail and may cause data loss.

SNMP trap numbers and Array Manager event numbers listed for this page, in order: 1151 and 581, 1151 and 582, 903 and 585, 904 and 586, 903 and 587.
58. ensor type is not discrete P temperature sensor value Temperature sensor value in degrees are provided Celsius lt Reading gt If sensor type is discrete Discrete temperature state lt State gt 1051 Temperature sensor value unknown Information A temperature sensor on the Sensor location lt Location in chassis gt backplane board system board or drive carrier in the Chassis location lt Name of chassis gt specified system could not If sensor type is not discrete obtain a reading The sensor location chassis location Temperature sensor value in degrees y j previous state and a Celsius lt Reading gt nominal temperature sensor If sensor type is discrete value are provided Discrete temperature state lt State gt 1052 Temperature sensor returned to a normal Information A temperature sensor on the value Sensor location lt Location in chassis gt Chassis location lt Name of chassis gt Previous state was lt State gt If sensor type is not discrete Temperature sensor value in degrees Celsius lt Reading gt If sensor type is discrete Discrete temperature state lt State gt backplane board system board or drive carrier in the specified system returned to a valid range after crossing a failure threshold The sensor location chassis location previous state and temperature sensor value are provided Event Message Reference 17 Table 2 2 Temperature Sensor Messages continued
59. eous Messages
Temperature Sensor Messages
Cooling Device Messages
Voltage Sensor Messages
Current Sensor Messages
Chassis Intrusion Messages
Redundancy Unit Messages
Power Supply Messages
Memory Device Messages
Fan Enclosure Messages
AC Power Cord Messages
Hardware Log Sensor Messages
Processor Sensor Messages
Pluggable Device Messages

3 System Event Log Messages for IPMI Systems
Temperature Sensor Events
Voltage Sensor Events
Fan Sensor Events
Processor Status Events
Power Supply Events
Memory ECC Events
BMC Watchdog Events
Memory Events
Hardware Log Sensor Events
Drive Events
Intrusion Events
BIOS Generated System Events

4 Storage Management Message Reference
60. eplace the battery.

Event ID: 2175
Description: The controller battery has been replaced
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 1151
Array Manager event number: None

Event ID: 2176
Description: The controller battery Learn cycle has started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 1151
Array Manager event number: None

Event ID: 2177
Description: The controller battery Learn cycle has completed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 1151
Array Manager event number: None

Table 4-1. Storage Management Messages (continued)

Event ID: 2178
Description: The controller battery Learn cycle has timed out
Severity: Warning / Non-critical
Cause: The controller battery must be fully charged before the Learn cycle can begin. The battery may be unable to maintain a full charge, causing the Learn cycle to time out. Additionally, the battery must be able to maintain cached data for a specified period of time in the event of a power loss. For example, some batteries maintain cached data for 24 hours. If the battery is unable to maintain cached data for the required period of time, then the Learn cycle will time out.
Action: Replace the battery pack, as the battery is unable to maintain a full charge.
SNMP trap number: 1153
Array Manager event number: None

Event ID: 2179
Description: The controller battery Learn cycle has been postponed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 1151
Array Manager event number: None

Event ID: 2180
Description: The control
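Alerts 2176, 2177, and 2178 mark the start, completion, and timeout of a battery Learn cycle. As a rough illustration, the Python sketch below scans a chronological list of alert IDs and reports whether a Learn cycle still appears to be running. The function name and the choice to treat a timeout as the end of a cycle are assumptions made for this example, and the sketch does not read any real Server Administrator log.

```python
from typing import Iterable

# Event IDs taken from the entries above.
LEARN_STARTED = 2176
LEARN_COMPLETED = 2177
LEARN_TIMED_OUT = 2178


def learn_cycle_in_progress(alert_ids: Iterable[int]) -> bool:
    """Return True if the most recent Learn cycle alert is a start.

    alert_ids must be in chronological order (oldest first).
    Assumption: a timeout (2178) ends the cycle just as completion does.
    """
    in_progress = False
    for alert_id in alert_ids:
        if alert_id == LEARN_STARTED:
            in_progress = True
        elif alert_id in (LEARN_COMPLETED, LEARN_TIMED_OUT):
            in_progress = False
    return in_progress


if __name__ == "__main__":
    print(learn_cycle_in_progress([2175, 2176]))        # cycle started, not finished
    print(learn_cycle_in_progress([2176, 2177]))        # cycle finished
    print(learn_cycle_in_progress([2176, 2178, 2176]))  # restarted after a timeout
```

Alerts that are unrelated to the Learn cycle are ignored, so the list can be a raw stream of event IDs.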
61. ert is provided for informational purposes Action None 2099 Global hot spare Ok Cause A user has unassigned an array disk as 901 575 unassigned Normal a global hot spare This alert is provided for informational purposes Action None Storage Management Message Reference 53 54 Table 4 1 Storage Management Messages continued Event ID Description Severity Cause and Action SNMP Trap Array Numbers Manager Event Number 2100 2101 2102 2103 Temperature Warning Cause The array disk enclosure is too hot exceeded the Non critical A variety of factors can cause the excessive maximum warning temperature For example a fan may have threshold failed the thermostat may be set too high or the room temperature may be too hot Action Check for factors that may cause overheating For example verify that the enclosure fan is working You should also check the thermostat settings and examine whether the enclosure is located near a heat source Make sure the enclosure has enough ventilation and that the room temperature is not too hot See the enclosure documentation for more diagnostic information Temperature dropped Warning Cause The array disk enclosure is too cool below the minimum Non critical warning threshold Action Check whether the thermostat setting is too low and whether the room temperature is too cool Temperature Critical Cause The array disk enclosure is too hot exceed
62. es required for full redundancy lt Number gt Possible memory module event cause lt list of causes gt Power Supply type lt type of power supply gt Previous redundancy state was lt State gt lt State gt Previous state was Processor sensor status lt status gt Specifies the state of the voltage sensor for example Discrete voltage state Good Specifies the fan speed in revolutions per minute RPM or On Off for example Fan sensor value in RPM Off 2600 Fan sensor value Specifies the type of hardware log for example Log type ESM Specifies the name of the memory bank in the system that generated the message for example Memory device bank location Bank_1 Specifies the location of the memory module in the chassis for example Memory device location DIMM_A Specifies the number of power supply or cooling devices required to achieve full redundancy for example Number of devices required for full redundancy 4 Specifies a list of possible causes for the memory module event for example Possible memory module event cause Single bit warning error rate exceeded Single bit error logging disabled Specifies the type of power supply for example Power Supply type VRM Specifies the status of the previous redundancy message for example Previous redundancy state was Lost Specifies the previous state of the sensor for example Previous state was OK Normal Specifies
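Each message variable above is a named placeholder, such as <State>, that is replaced by a value when the message is generated. The Python sketch below shows one way to perform that substitution on a message template. It is illustrative only: the function name is hypothetical, and the colon in the example templates is an assumption about how the lines are punctuated.

```python
import re
from typing import Mapping

# Matches a placeholder written as <name>, for example <State>.
PLACEHOLDER = re.compile(r"<([^<>]+)>")


def fill_message(template: str, values: Mapping[str, str]) -> str:
    """Replace each <name> placeholder with its value.

    Placeholders with no entry in values are left unchanged.
    """
    def substitute(match):
        name = match.group(1)
        return values.get(name, match.group(0))

    return PLACEHOLDER.sub(substitute, template)


if __name__ == "__main__":
    print(fill_message("Previous state was: <State>", {"State": "OK (Normal)"}))
    print(fill_message("Memory device location: <location in chassis>",
                       {"location in chassis": "DIMM_A"}))
```

Leaving unknown placeholders untouched makes a missing value easy to spot in the output.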
63. etected that in chassies depending one of the components in the on the redundant unit has been Chassis location lt Name of chassis gt number of disconnected has failed or is Previous redundancy state was lt State gt units that are not present The redundancy functional unit location chassis location previous redundancy state and the number of devices required for full redundancy are provided Power Supply Messages Power supply sensors monitor how well a power supply is functioning Power supply messages listed in Table 2 8 provide status and warning information for power supplies present in a particular chassis Table 2 8 Power Supply Messages EventID Description Severity Cause 1350 Power supply sensor has failed Information A power supply sensor in the specified system failed The sensor location chassis location previous state and additional Previous state was lt State gt power supply status information are provided Sensor location lt Location in chassis gt Chassis location lt Name of chassis gt Power Supply type lt type of power supply gt lt Additional power supply status information gt If in configuration error state Configuration error type lt type of configuration error gt 28 Event Message Reference Table 2 8 Power Supply Messages continued EventID Description Severity Cause 1351 Power supply sensor value unknown Information A power supply sensor in
64. Event ID: 2048
Description: Device failed
Severity: Critical / Failure / Error
Cause: A physical disk in the array failed. The failed disk may have been identified by the controller or channel. Performing a consistency check can also identify a failed disk.
Action: Replace the failed array disk. You can identify which disk has failed by locating the disk that has a red X for its status. Perform a rescan after replacing the disk.
SNMP trap numbers: 754, 804, 854, 904, 954, 1004, 1054, 1104, 1154, 1204
Array Manager event number: 500

Event ID: 2049
Description: Array disk removed
Severity: Warning / Non-critical
Cause: A physical disk has been removed from the array. A user may have also executed the Prepare to Remove task. This alert can also be caused by loose or defective cables or by problems with the enclosure.
Action: If a physical disk was removed from the array, either replace the disk or restore the original disk. You can identify which disk has been removed by locating the disk that has a red X for its status. Perform a rescan after replacing or restoring the disk. If a disk has not been removed from the array, then check for problems with the cables. See the online help for more information on checking the cables. Make sure that the enclosure is powered on. If the problem persists, check the enclosure documentation for further diagnostic information.
SNMP trap number: 903
Array Manager event number: 501

Table 4-1. Storage Management Messages (continued)
65. firmware attempted to do SMART polling on the hot spare but was unable to complete it. The controller has lost communication with the hot spare.
Description (continued): polling failed
Severity (continued): Failure / Error
Action: Check the health of the disk assigned as a hot spare. You may need to replace the disk and reassign the hot spare. Make sure the cables are attached securely.
SNMP trap number: 904
Array Manager event number: None

Event ID: 2283
Description: A redundant path is broken
Severity: Warning / Non-critical
Cause: The controller has two connectors that are connected to the same enclosure. The communication path on one connector has lost connection with the enclosure. The communication path on the other connector is reporting this loss.
Action: Make sure the cables are attached securely. Make sure both EMMs are healthy.
SNMP trap number: 903
Array Manager event number: None

Event ID: 2284
Description: A redundant path has been restored
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 901
Array Manager event number: None

Event ID: 2285
Description: A disk media error was corrected during recovery
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 901
Array Manager event number: None

Event ID: 2286
Description: A Learn cycle start is pending while the battery charges
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 1151
Array Manager event number: None

Event ID: 2287
Description: The Patrol Read is paused
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 751
Array Manager event number: None

Event ID: 2288
Description: The Patrol Read has resumed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None.
SNMP trap number: 751
Array Manager event number: None
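Entries such as 2283 through 2288 lend themselves to a small lookup table keyed by event ID. The Python sketch below is a minimal illustration of that idea. The field values are copied from the entries above, but the Alert structure and the alerts_needing_action helper are hypothetical and are not an interface that Storage Management provides.

```python
from typing import Dict, List, NamedTuple


class Alert(NamedTuple):
    """One table row; the field names are illustrative."""
    description: str
    severity: str
    snmp_trap: int
    action: str


# Rows copied from the entries above.
ALERTS: Dict[int, Alert] = {
    2283: Alert("A redundant path is broken", "Warning", 903,
                "Make sure the cables are attached securely. "
                "Make sure both EMMs are healthy."),
    2284: Alert("A redundant path has been restored", "Ok", 901, "None"),
    2285: Alert("A disk media error was corrected during recovery", "Ok", 901, "None"),
    2286: Alert("A Learn cycle start is pending while the battery charges",
                "Ok", 1151, "None"),
    2287: Alert("The Patrol Read is paused", "Ok", 751, "None"),
    2288: Alert("The Patrol Read has resumed", "Ok", 751, "None"),
}


def alerts_needing_action(event_ids: List[int]) -> List[int]:
    """Return the known event IDs whose Action is something other than None."""
    return [event_id for event_id in event_ids
            if event_id in ALERTS and ALERTS[event_id].action != "None"]


if __name__ == "__main__":
    for event_id in alerts_needing_action([2284, 2283, 2288]):
        alert = ALERTS[event_id]
        print(event_id, alert.severity, "-", alert.action)
```

Event IDs that are not in the table are skipped rather than raising an error, which suits a filter over mixed log output.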
66. gement has lost 104 communication with a device There may be faulty hardware or loose or defective cables Action Reboot the system If the problem is not resolved check for hardware failures Any failed component must be replaced Make sure the cables are attached securely See the hardware documentation for more diagnostics information Cause This alert is provided for 901 informational purposes Action None None None None Storage Management Message Reference 79 80 Table 4 1 Storage Management Messages continued Event ID Description Severity Cause and Action SNMP Trap Array Numbers Manager Event Number 2270 The array disk Clear Critical Cause A Clear operation was being 904 None operation failed Failure performed on an array disk but it was Error interrupted and did not complete successfully The controller may have lost communication with the disk The disk may have been removed or the cables may be loose or defective Action Check if the disk is in and not in a failed state Make sure the cables are attached securely Restart the Clear operation 2271 The Patrol Read Ok Cause This alert is provided for 901 None corrected a media Normal informational purposes CHO Action None 2272 Patrol Read found Critical Cause The Patrol Read task has faced an 903 None an uncorrectable Failure error that cannot be corrected There may be media error Error a bad disk block that cannot be
67. gt 1652 Device removed from system Information A device was removed from the Device location lt Location in chassis gt specified system The device location chassis location and Chassis location lt Name of chassis gt additional event details if Additional details lt Additional available are provided details for the events gt 1653 Device configuration error detected Error A configuration error was detected Device location lt Location in chassis gt for a pluggable device m the specified system The device may Chassis location lt Name of chassis gt have been added to the system Additional details lt Additional incorrectly details for the events gt Event Message Reference System Event Log Messages for IPMI Systems The following tables list the system event log SEL messages their severity and cause K NOTE For corrective actions see the appropriate documentation Temperature Sensor Events The temperature sensor event messages help protect critical components by alerting the systems management console when the temperature rises inside the chassis These event messages use additional variables such as sensor location chassis location previous state and temperature sensor value or state Table 3 1 Temperature Sensor Events Event Message Severity Cause lt Sensor Name Location gt Critical Temperature of the backplane board temperature sensor detected a system board or the carrier in the specif
68. hassis location previous state and fan sensor value are provided Sensor location lt Location in chassis gt Chassis location lt Name of chassis gt Previous state was lt State gt Fan sensor value lt Reading gt Voltage Sensor Messages Voltage sensors listed in Table 2 4 monitor the number of volts across critical components Voltage sensor messages provide status and warning information for voltage sensors in a particular chassis Table 2 4 Voltage Sensor Messages EventID Description Severity Cause 1150 Voltage sensor has failed Information A voltage sensor in the specified system failed The sensor location chassis location previous state and voltage Previous state was lt State gt sensor value are provided Sensor location lt Location in chassis gt Chassis location lt Name of chassis gt If sensor type is not discrete Voltage sensor value in Volts lt Reading gt If sensor type is discrete Discrete voltage state lt State gt 1151 Voltage sensor value unknown Information A voltage sensor in the specified system could not obtain a reading The sensor location chassis location previous state Previous state was lt State gt and a nominal voltage sensor value are provided Sensor location lt Location in chassis gt Chassis location lt Name of chassis gt If sensor type is not discrete Voltage sensor value in Volts lt Reading gt If sensor type is discrete
69. hold threshold This alert is provided for value changed informational purposes Action None 2155 Minimum Ok Cause A user has changed the value for the 1051 None temperature probe Normal minimum temperature probe warning warning threshold threshold This alert is provided for value changed informational purposes Action None 2156 Controller alarm has Ok Cause The controller alarm test has run 751 None been tested Normal successfully This alert is provided for informational purposes Action None Storage Management Message Reference 65 Table 4 1 Storage Management Messages continued Event ID Description Severity Cause and Action SNMP Trap Array Numbers Manager Event Number 2157 Controller Ok configuration has Normal been reset 2158 Array disk online Ok Normal 2159 Virtual disk renamed Ok Normal 2160 Dedicated hotspare Ok assigned Normal 2161 Dedicated hotspare Ok unassigned Normal 2162 Communication Ok regained Normal 66 Cause A user has reset the controller 751 None configuration See the online help for more information This alert is provided for informational purposes Action None Cause An offline array disk has been made 901 None online This alert is provided for informational purposes Action None Cause A user has renamed a virtual disk 1201 608 This alert is provided for informational purposes NOTE When renaming a virtual disk on a PERC 2 2 Si 3
70. ical disk in the array has failed or because a user cancelled the check consistency operation Action If the physical disk failed then replace the physical disk You can identify which disk failed by locating the disk that has a red X for its status Perform a rescan after replacing the disk When performing a consistency check be aware that the consistency check can take a long time The time it takes depends on the size of the physical disk or the virtual disk Storage Management Message Reference 49 50 Table 4 1 Storage Management Messages continued Event ID Description Severity Cause and Action SNMP Trap Array Manager Event Number 2070 2074 2076 2077 2079 Virtual disk initialization cancelled Array disk rebuild cancelled Virtual disk check consistency failed Virtual disk format failed Virtual disk initialization failed Ok Normal Ok Normal Critical Failure Error Critical Failure Error Critical Failure Error Cause The virtual disk initialization cancelled because a physical disk included in the virtual disk has failed or because a user cancelled the virtual disk initialization Action If a physical disk failed then replace the physical disk You can identify which disk has failed by locating the disk that has a red X for its status Perform a rescan after replacing the disk Restart the format array disk oper
71. ied system <Sensor Name/Location> exceeded the critical threshold.
Event message (continued): failure <Reading>, where <Sensor Name/Location> is the entity that this sensor is monitoring, for example "PROC Temp" or "Planar Temp". Reading is specified in degrees Celsius, for example 100 C.

Event message: <Sensor Name/Location> temperature sensor detected a warning <Reading>
Severity: Warning
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> exceeded the non-critical threshold.

Event message: <Sensor Name/Location> temperature sensor returned to warning state <Reading>
Severity: Warning
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> returned from critical state to non-critical state.

Event message: <Sensor Name/Location> temperature sensor returned to normal state <Reading>
Severity: Information
Cause: Temperature of the backplane board, system board, or the carrier in the specified system <Sensor Name/Location> returned to normal operating range.

Voltage Sensor Events

The voltage sensor event messages monitor the number of volts across critical components. These messages provide status and warning information for voltage sensors for a particular chassis.

Table 3-2. Voltage Sensor Events

Severity: Critical
Event message: <Sensor Name/Location> voltage sen
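The temperature sensor events above depend on which threshold a reading has crossed and on the state the sensor was in before. The Python sketch below illustrates that logic for a sensor with upper thresholds only. The function names and the threshold values in the usage example are hypothetical; actual thresholds depend on the system and are not given in this guide.

```python
from typing import Tuple


def classify(reading_c: float, warning_c: float, critical_c: float) -> str:
    """Return 'critical', 'warning', or 'normal' for an upper-threshold sensor."""
    if reading_c > critical_c:
        return "critical"
    if reading_c > warning_c:
        return "warning"
    return "normal"


def event_for_transition(previous: str, current: str) -> Tuple[str, str]:
    """Map a state change to (severity, message fragment).

    Call this only when the state has changed. The fragments follow the
    temperature sensor event messages listed above.
    """
    if current == "critical":
        return ("Critical", "detected a failure")
    if current == "warning" and previous == "critical":
        return ("Warning", "returned to warning state")
    if current == "warning":
        return ("Warning", "detected a warning")
    return ("Information", "returned to normal state")


if __name__ == "__main__":
    # Hypothetical thresholds in degrees Celsius.
    warning_c, critical_c = 70.0, 85.0
    previous = "normal"
    for reading in (60.0, 75.0, 100.0, 80.0, 55.0):
        current = classify(reading, warning_c, critical_c)
        if current != previous:
            print(reading, event_for_transition(previous, current))
        previous = current
```

Separating classification from event selection mirrors the table: the same warning state produces a different message depending on whether the sensor was previously critical.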
72. in chassis gt from the specified s stem The sensor location and chassis location Chassis location lt Name of chassis gt are provided 1454 Fan enclosure removed from system for Error A fan enclosure has been removed an extended amount of time from the specified system for a F ser defi ime Th Sensor location lt Location in chassis gt User definable length of umg ae sensor location and chassis location Chassis location lt Name of chassis gt are provided 1455 Fan enclosure sensor detected a non Error A fan enclosure sensor in the recoverable value specified system detected an error F ich it c cover Th Sansom Bead Siete ion OS ys nek from which it cannot recove e Chassis location lt Name of chassis gt sensor location and chassis location are provided AC Power Cord Messages AC power cord messages listed in Table 2 11 provide status and warning information for power cords that are part of an AC power switch if your system supports AC switching Table 2 11 AC Power Cord Messages EventID Description Severity Cause 1500 AC power cord sensor has failed Information An AC power cord sensor in the Sensor location lt Location in chassis gt specified system failed The j i AC power cord status cannot be Chassis location lt Name of chassis gt monitored The sensor location and chassis location information are provided 1501 AC power cord is not being monitored Information The AC power cord st
73. inistrator starting Data Bytes in Hex

Viewing Alerts and Event Messages

An event log is used to record information about important events. Storage Management generates alerts that are added to the Microsoft Windows application alert log and to the Server Administrator Alert log. To view these alerts in Server Administrator:
1. Select the System object in the tree view.
2. Select the Logs tab.
3. Select the Alert subtab.
You can also view the event log using your operating system's event viewer. Each operating system's event viewer accesses the applicable operating system event log.

The location of the event log file depends on the operating system you are using.
- In the Microsoft Windows 2000 Advanced Server and Windows Server 2003 operating systems, messages are logged to the system event log and optionally to a unicode text file, dcsys32.log (viewable using Notepad), that is located in the install_path\omsa\log directory. The default install_path is C:\Program Files\Dell\SysMgt.
- In the Red Hat Enterprise Linux operating system, messages are logged to the system log file. The default name of the system log file is /var/log/messages. You can view the messages file using a text editor such as vi or emacs.
NOTE: Logging messages to a unicode text file is optional. By default, the feature is disabled. To enable this feature, modify the Event Manager section of the dcemdy32.ini file as fol
74. irmware download failed. The system was unable to download firmware to the enclosure. The controller may have lost communication with the enclosure. There may have been problems with the data transfer or the download media may be corrupt.
Severity: Warning / Non-critical
Cause: The system was unable to download firmware to the enclosure. The controller may have lost communication with the enclosure. There may have been problems with the data transfer or the download media may be corrupt.
Action: Attempt to download the enclosure firmware again. If problems continue, check if the controller can communicate with the enclosure. Make sure that the enclosure is powered on. Check the cables and the health of the enclosure and its components. To check the health of the enclosure, select the enclosure object in the tree view. The Health subtab displays a red X or yellow exclamation point for enclosure components that are failed or degraded.
SNMP trap number: 853
Array Manager event number: None

2356 SAS SMP communications error %1
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Critical / Failure / Error
Cause: The text for this alert is generated by the firmware and can vary depending on the situation. The reference to SMP in this text refers to SAS Management Protocol.
Action: There may be a SAS topology error. See the hardware documentation for information on correct SAS topology configurations. There may be problems with the cables such
SNMP trap number: 754
Array Manager event number: None
75. k when the other side of the mirror is using SATA technology.
Action: See the hardware documentation for information on replacing disks.

2310 A virtual disk is permanently degraded
Severity: Critical / Failure / Error
Cause: A redundant virtual disk has lost redundancy. This may occur when the virtual disk suffers the failure of multiple array disks. In this case, both the source array disk and the target disk with redundant data have failed. A rebuild is not possible because there is no longer redundancy.
Action: Replace the failed disks and restore from backup.
SNMP trap number: 1204
Array Manager event number: None

2311 The firmware on the EMM(s) is not the same version. EMM0 %1 EMM %2
NOTE: %1 and %2 are substitution variables that will appear in the alert description for specific details about the alert.
Severity: Warning / Non-critical
Cause: The firmware on the EMM modules is not the same version. Both EMM modules are required to have the same version of the firmware. This alert may be caused if you attempt to insert an EMM module that has a different firmware version than an existing module.
Action: Upgrade to the same version of the firmware on both EMM modules.
SNMP trap number: 853
Array Manager event number: None

2312 A power supply in the enclosure has an AC failure
Severity: Warning / Non-critical
Cause: The power supply has an AC failure.
Action: Replace the power supply.
SNMP trap number: 1003
Array Manager event number: None

2313 A power supply in the enclosure has a DC failure
Severity: Warning / Non-critical
Cause: The power supply has a DC failure.
Action: Replace the power supply.
SNMP trap number: 1003
Array Manager event number: None
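In the rows of Table 4-1 shown in this guide, the last digit of each SNMP trap number appears to track the severity column: Ok rows carry numbers such as 751, 851, 901, 1151 and 1201, Warning rows carry 753, 853, 903, 1003 and 1153, and Critical rows carry 754, 904 and 1204. The following Python sketch encodes that observation. It is an inferred pattern, not a rule stated by the guide, and the name `trap_severity` is hypothetical.

```python
# Assumption: the final digit of a Storage Management SNMP trap number
# mirrors the alert severity (1 = Ok, 3 = Warning, 4 = Critical).
# This is inferred from the table rows, not documented behaviour.
SEVERITY_BY_LAST_DIGIT = {
    1: "Ok / Normal",
    3: "Warning / Non-critical",
    4: "Critical / Failure / Error",
}


def trap_severity(trap_number: int) -> str:
    """Return the severity suggested by the last digit of a trap number."""
    return SEVERITY_BY_LAST_DIGIT.get(trap_number % 10, "unknown")


if __name__ == "__main__":
    # Trap numbers taken from alerts 2310, 2311 and 2312.
    for trap in (1204, 853, 1003):
        print(trap, trap_severity(trap))
```

Any trap number whose last digit is not 1, 3 or 4 is reported as "unknown" rather than guessed.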
76. le 4-1. Storage Management Messages (continued)
(columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number)

2289 Multi-bit ECC error
Severity: Critical / Failure / Error
Cause: An error involving multiple bits has been encountered during a read or write operation. The error correction algorithm recalculates parity data during read and write operations. If an error involves only a single bit, it may be possible for the error correction algorithm to correct the error and maintain parity data. An error involving multiple bits, however, usually indicates data loss. In some cases, if the multi-bit error occurs during a read operation, the data on the disk may be all right. If the multi-bit error occurs during a write operation, data loss has occurred.
Action: Replace the dual in-line memory module (DIMM). The DIMM is a part of the controller battery pack. See your hardware documentation for information on replacing the DIMM. You may need to restore data from backup.
SNMP trap number: 754

2290 Single-bit ECC error
Severity: Warning / Non-critical
Cause: An error involving a single bit has been encountered during a read or write operation. The error correction algorithm has corrected this error.
Action: None
SNMP trap number: 753

2292 Communication with the enclosure has been lost
Severity: Critical / Failure / Error

2291 An EMM has been discovered
Severity: Ok / Normal
SNMP trap number: 851
Cause: This alert is provided for inform
77. ler alarm has been tested
Controller battery is reconditioning
Controller battery low
Controller battery recondition is completed
Controller configuration has been reset
Controller rebuild rate has changed
cooling device messages
current sensor
Current sensor detected a failure value
Current sensor detected a non-recoverable value
Current sensor detected a warning value
Current sensor has failed
current sensor messages
Current sensor returned to a normal value
Current sensor value unknown

D
Dead disk segments restored
Dedicated hotspare assigned
Dedicated hotspare unassigned
Device failed
Device returned to normal
Drive Events
Driver version mismatch

E
Enclosure alarm disabled
Enclosure alarm enabled
Enclosure firmware mismatch
Enclosure was shut down
event description reference

F
Failure prediction threshold exceeded due to test
Fan enclosure inserted into system
fan enclosure messages
Fan enclosure removed from system
Fan enclosure removed from system for an extended amount of time
fan enclosure sensor
Fan enclosure sensor detected a non-recoverable value
Fan enclosure sensor has failed
Fan enclosure sensor value unknown
fan sensor
Fan sensor detected
78. ler battery charge level is normal
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 1151

Table 4-1. Storage Management Messages (continued)
(columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number)

2171 The controller battery temperature is above normal
Severity: Warning / Non-critical
Cause: The battery may be recharging, the room temperature may be too hot, or the fan in the system may be degraded or failed.
Action: If this alert was generated due to a battery recharge, the situation will correct itself when the recharge is complete. You should also check that the room temperature is normal and that the system components are functioning properly.
SNMP trap number: 1153
Array Manager event number: None

2172 The controller battery temperature is normal
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 1151
Array Manager event number: None

2174 The controller battery has been removed
Severity: Warning / Non-critical
Cause: The controller cannot communicate with the battery, the battery may be removed, or the contact point between the controller and the battery may be burnt or corroded.
Action: Replace the battery if it is not in. If the contact point between the battery and the controller is burnt or corroded, you will need to replace either the battery or the controller, or both. See the hardware documentation for information on how to safely access, remove, and r
SNMP trap number: 1153
Array Manager event number: None
79. ler battery Learn cycle will start in %1 days
NOTE: The %1 is a variable that will be filled in with the number of days before the Learn cycle starts. You can set the duration to start the Learn cycle.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 1151
Array Manager event number: None

Table 4-1. Storage Management Messages (continued)
(columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number)

2181 The controller battery Learn cycle will start in %1 hours
NOTE: The %1 is a variable that will be filled in with the number of hours before the Learn cycle starts. You can set the duration to start the Learn cycle.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 1151
Array Manager event number: None

2182 An invalid SAS configuration has been detected
Severity: Critical / Failure / Error
Cause: The controller and attached enclosures are not cabled correctly.
Action: See the hardware documentation for information on correct cabling configurations.
SNMP trap number: 754
Array Manager event number: None

2186 The controller cache has been discarded
Severity: Warning / Non-critical
SNMP trap number: 753
Array Manager event number: None
Cause: The controller has flushed the cache and any data in the cache has been lost. This may happen if the system has memory or battery problems that cause the controller to distrust the cache. Although user data may have been lost, this alert does not always indicate that relevant or user data h
80. lows:
• In Windows, locate the file at <install_path>\dataeng\ini and set UnitextLog.enabled=True. The default install_path is C:\Program Files\Dell\SysMgt. Restart the Systems Management Event Manager service.
• In Red Hat Enterprise Linux, locate the file at <install_path>/dataeng/ini and set UnitextLog.enabled=True. The default install_path is /opt/dell/srvadmin. Issue the service dataeng restart command to restart the systems management event manager service. This will also restart the systems management data manager and SNMP services.

The following subsections explain the procedure to open the Windows 2000 Advanced Server, Windows Server 2003, and Red Hat Enterprise Linux event viewers.

Viewing Events in Windows 2000 and Windows Server 2003
1. Click the Start button, point to Settings, and click Control Panel.
2. Double-click Administrative Tools, and then double-click Event Viewer.
3. In the Event Viewer window, click the Tree tab and then click System Log. The System Log window displays a list of recently logged events.
4. To view the details of an event, double-click one of the event items.

NOTE: You can also look up the dcsys32.log file in the <install_path>\omsa\log directory to view the separate event log file. The default install_path is C:\Program Files\Dell\SysMgt.

Viewing Events in Red Hat Enterprise Linux
1. Log in as root.
2. Use a text editor such as vi or emacs to view the file named /var/log/messages.
The following example shows
81. <Reading>
If sensor type is discrete: Discrete voltage state: <State>
Cause: ...system exceeded its failure threshold. The sensor location, chassis location, previous state, and voltage sensor value are provided.

Table 2-4. Voltage Sensor Messages (continued)
(columns: EventID, Description, Severity, Cause)

1155 Voltage sensor detected a non-recoverable value
Severity: Error
Message fields: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; if sensor type is not discrete, Voltage sensor value (in Volts): <Reading>; if sensor type is discrete, Discrete voltage state: <State>
Cause: A voltage sensor in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and voltage sensor value are provided.

Current Sensor Messages
Current sensors, listed in Table 2-5, measure the amount of current (in amperes) that is traversing critical components. Current sensor messages provide status and warning information for current sensors in a particular chassis.

Table 2-5. Current Sensor Messages
(columns: EventID, Description, Severity, Cause)

1200 Current sensor has failed
Severity: Information
Message fields: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; if sensor type is not discrete, Current sensor value (in Amps): <Reading>
82. mmunicate with the attached devices. A disk may be removed or contain errors. The cables may also be loose or defective.
Description (fragment): ...initialization failed
Severity (fragment): Failure / Error
Action: Check the health of attached devices. Review the Alert Log for significant events and make sure the cables are attached securely.
SNMP trap number: 1204
Array Manager event number: None

2346 Error occurred: %1
NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Warning / Non-critical
Cause: The text for this alert is generated by the firmware and can vary depending on the situation.
Action: Check the health of attached devices. Review the Alert Log for significant events. You may need to replace faulty hardware. Make sure the cables are attached securely. See the hardware documentation for more information.
SNMP trap number: 903
Array Manager event number: None

2347 The rebuild failed due to errors on the source physical disk
Severity: Critical / Failure / Error
Cause: You are attempting to rebuild data that resides on a defective disk.
Action: Replace the source disk and restore from backup.
SNMP trap number: 904
Array Manager event number: None

2348 The rebuild failed due to errors on the target physical disk
Severity: Critical / Failure / Error
Cause: You are attempting to rebuild data on a disk that is defective.
Action: Replace the target disk. If a rebuild does not automatically start after replacing the disk, initiate the Rebuild task. You may need to assign the new disk as a hot spare to initiate the rebuild.
SNMP trap number: 904
Array Manager event number: None
83. n: None

Table 4-1. Storage Management Messages (continued)
(columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number)

2260 An enclosure blink has ceased
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 851
Array Manager event number: None

2261 A global rescan has initiated
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 101
Array Manager event number: None

2262 Smart thermal shutdown is enabled
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 101
Array Manager event number: None

2263 Smart thermal shutdown is disabled
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 101
Array Manager event number: None

2264 A device is missing
Severity: Warning / Non-critical
Cause: The controller cannot communicate with a device. The device may be removed. There may also be a bad or loose cable.
Action: Check if the device is in and connected. If it is in, check the cables. Also check the connection to the controller battery and the battery health. A battery with a weak or depleted charge may cause this alert.
SNMP trap numbers: 753, 803, 853, 903, 953, 1003, 1053, 1103, 1153, 1203
Array Manager event number: None

2265 A device is in an unknown state
Severity: Warning / Non-critical
SNMP trap numbers: 753, 803, 853, 903, 953, 1003, 1053
Array Manager event number: None
Cause: The controller cannot communicate with a device. The state of the device cannot be determined. There may be a bad or loose cable. The system may also be experiencing problems with the ap
84. nable to recover cached data from the battery backup unit (BBU)
Severity (fragment): Failure / Error
Cause (fragment): ...data from the cache.
Action: Check if the battery is charged and in good health. When the battery charge is unacceptably low, it cannot maintain cached data. Check if the battery has reached its recharge limit. The battery may need to be recharged or replaced.

Table 4-1. Storage Management Messages (continued)
(columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number)

2338 The controller has recovered cached data from the BBU
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 1151
Array Manager event number: None

2339 The factory default settings have been restored
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 751
Array Manager event number: None

2340 The BGI completed with uncorrectable errors
Severity: Critical / Failure / Error
Cause: The BGI task encountered errors that cannot be corrected. The virtual disk contains array disks that have unusable disk space or disk errors that cannot be corrected.
SNMP trap number: 1204
Array Manager event number: None
Action: Replace the array disk that contains the disk errors. Review other alert messages to identify the array disk that has errors. If the virtual disk is redundant, you can replace the array disk and continue using the virtual disk. If the virtual disk is non-redundant, you may need to recreate the virtual disk after replacing the array disk. After replacing the array disk
85. nal purposes.
Description (fragment): ...controller cache
Action: None

2361 Array disk(s) that are part of a virtual disk have been removed while the system was shut down. This removal was discovered during system start-up.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 751
Array Manager event number: None

2362 Array disk(s) have been removed from a virtual disk. The virtual disk will be in Failed state during the next system reboot.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 751
Array Manager event number: None

Table 4-1. Storage Management Messages (continued)
(columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number)

2363 A virtual disk and all of its member array disks have been removed while the system was shut down. This removal was discovered during system start-up.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 751
Array Manager event number: None

2364 All virtual disks are missing from the controller. This situation was discovered during system start-up.
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 751
Array Manager event number: None

2365 The speed of the enclosure fan has changed
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 851
Array Manager event number: None

2366 Dedicated spare imported as global due to missing arr
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
SNMP trap number: 901
Array Manager event number: None
86. nd firmware requirements. In particular, if Storage Management experiences performance problems, you should verify that you have the minimum supported versions of the drivers and firmware installed.

2165 The RAID controller firmware and driver validation was not performed. The configuration file cannot be opened.
Severity: Warning / Non-critical
Cause: Storage Management is unable to determine whether the system has the minimum required versions of the RAID controller firmware and drivers. This situation may occur for a variety of reasons. For example, the installation directory path to the configuration file may not be correct. The configuration file may also have been removed or renamed.
Action: Reinstall Storage Management.
SNMP trap number: 753
Array Manager event number: None

2166 The RAID controller firmware and driver validation was not performed. The configuration file is out of date or corrupted.
Severity: Warning / Non-critical
Cause: Storage Management is unable to determine whether the system has the minimum required versions of the RAID controller firmware and drivers. This situation has occurred because a configuration file is unreadable or missing data. The configuration file may be corrupted.
Action: Reinstall Storage Management.
SNMP trap number: 753
Array Manager event number: None

Table 4-1. Storage Management Messages (continued)
(columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number)

2167 The current kernel
Severity: Warning
Cause:
87. ...redundant unit is... The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.
Message fields: Chassis location: <Name of chassis>; Previous redundancy state was: <State>

1304 Redundancy regained
Severity: Information
Message fields: Redundancy unit: <Redundancy location in chassis>; Chassis location: <Name of chassis>; Previous redundancy state was: <State>
Cause: A redundancy sensor in the specified system detected that a lost redundancy device has been reconnected or replaced; full redundancy is in effect. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

Table 2-7. Redundancy Unit Messages (continued)
(columns: EventID, Description, Severity, Cause)

1305 Redundancy degraded
Severity: Warning
Message fields: Redundancy unit: <Redundancy location in chassis>; Chassis location: <Name of chassis>; Previous redundancy state was: <State>
Cause: A redundancy sensor in the specified system detected that one of the components of the redundancy unit has failed, but the unit is still redundant. The redundancy unit location, chassis location, previous redundancy state, and the number of devices required for full redundancy are provided.

1306 Redundancy lost
Severity: Warning or Error
Message fields: Redundancy unit: <Redundancy location
Cause: A redundancy sensor in the specified system d
88. oller battery is degraded
Severity: Warning / Non-critical
Cause: The controller battery charge is weak.
Action: As the charge weakens, the charger should automatically recharge the battery. If the battery has reached its recharge limit, replace the battery pack. Monitor the battery to make sure that it recharges successfully. If the battery does not recharge, replace the battery pack.
SNMP trap number: 1153
Array Manager event number: None

The following alerts all have severity Ok / Normal, cause "This alert is provided for informational purposes", action None, and Array Manager event number None:

2247 The controller battery is charging (SNMP trap number: 1151)
2248 The controller battery is executing a Learn cycle (SNMP trap number: 1151)
2249 The array disk Clear operation has started (SNMP trap number: 901)
2251 The array disk blink has initiated (SNMP trap number: 901)
2252 The array disk blink has ceased (SNMP trap number: 901)
2254 The Clear operation has cancelled (SNMP trap number: 901)
2255 The array disk has started (SNMP trap number: 901)
2259 An enclosure blink operation has initiated (SNMP trap number: 851); Actio
89. oller is unable to communicate with an enclosure. There are several reasons why communication may be lost. For example, there may be a bad or loose cable. An unusual amount of I/O may also interrupt communication with the enclosure. In addition, communication loss may be caused by software, hardware, or firmware problems, bad or failed power supplies, and enclosure shutdown. When viewed in the Alert Log, the description for this event displays several variables. These variables are: controller and enclosure names, type of communication problem, return code, and SCSI status.
Action: Check for problems with the cables. See the online help for more information on checking the cables. You should also check to see if the enclosure has degraded or failed components. To do so, select the enclosure object in the tree view and click the Health subtab. The Health subtab displays the status of the enclosure components. Verify that the controller has supported driver and firmware versions installed and that the EMMs are each running the same version of supported firmware.

Cause: A user has enabled the enclosure alarm. This alert is provided for informational purposes.
Action: None

Cause: A user has disabled the enclosure alarm.
Action: None

Cause: Disk space that was formerly dead or inaccessible to a redundant virtual disk has been restored. This alert is provided for informational purposes.
Action: None

853 688 610 611
90. on sensor in the specified system detected that a cover was opened while the system was operating but has since been replaced. The sensor location, chassis location, previous state, and chassis intrusion state are provided.
Message fields: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Chassis intrusion state: <Intrusion state>

1253 Chassis intrusion in progress
Severity: Warning
Message fields: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Chassis intrusion state: <Intrusion state>
Cause: A chassis intrusion sensor in the specified system detected that a system cover is currently being opened and the system is operating. The sensor location, chassis location, previous state, and chassis intrusion state are provided.

Table 2-6. Chassis Intrusion Messages (continued)
(columns: EventID, Description, Severity, Cause)

1254 Chassis intrusion detected
Severity: Error
Message fields: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Chassis intrusion state: <Intrusion st
Cause: A chassis intrusion sensor in the specified system detected that the system cover was opened while the system was operating. The sensor location, chassis location, previous state, and chassis intrusion state
91. operation will initiate a rebuild of the disk.
Cause 2: A physical disk in the array has been removed.
Action 2: If a physical disk was removed from the array, either replace the disk or restore the original disk. You can identify which disk has been removed by locating the disk that has a red X for its status. Perform a rescan after replacing the disk.

2058 Virtual disk check consistency started
Severity: Ok / Normal
Cause: This alert is provided for informational purposes.
Action: None
SNMP trap number: 1201
Array Manager event number: 520

Table 4-1. Storage Management Messages (continued)
(columns: Event ID, Description, Severity, Cause and Action, SNMP Trap Numbers, Array Manager Event Number)

The following alerts all have severity Ok / Normal, cause "This alert is provided for informational purposes", and action None:

2059 Virtual disk format started (SNMP trap number: 1201; Array Manager event number: 521)
2061 Virtual disk initialization started (SNMP trap number: 1201; Array Manager event number: 523)
2063 Virtual disk reconfiguration started (SNMP trap number: 1201; Array Manager event number: 525)
2064 Virtual disk rebuild started (SNMP trap number: 1201; Array Manager event number: 526)
2065 Array disk rebuild started (SNMP trap number: 901; Array Manager event number: 527)

2067 Virtual disk check consistency cancelled
Severity: Ok / Normal
SNMP trap number: 1201
Array Manager event number: 529
Cause: The check consistency operation cancelled because a phys
92. or location, chassis location, previous state, and processor sensor status are provided.
Message fields: Chassis location: <Name of chassis>; Previous state was: <State>; Processor sensor status: <status>

Table 2-13. Processor Sensor Messages (continued)
(columns: EventID, Description, Severity, Cause)

1602 Processor sensor returned to a normal value
Severity: Information
Message fields: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Processor sensor status: <status>
Cause: A processor sensor in the specified system transitioned back to a normal state. The sensor location, chassis location, previous state, and processor sensor status are provided.

1604 Processor sensor detected a failure value
Message fields: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Processor sensor status: <status>

1605 Processor sensor detected a non-recoverable value
Message fields: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Processor sensor status: <status>

1603 Processor sensor detected a warning value
Message fields: Sensor location: <Location in chassis>; Chassis location: <Name of chassis>; Previous state was: <State>; Processor sensor status: <status>
Cause: A processor sensor in the specified system is in a throttled state. The sensor
93. ortion of the disk may be lost and you may need to restore from backup. If the disk is part of a redundant virtual disk, then any data residing on the corrupt portion of the disk will be reallocated elsewhere in the virtual disk.

2127 Background initialization (BGI) started
Severity: Ok / Normal
Cause: BGI of a virtual disk has started. This alert is provided for informational purposes.
Action: None
SNMP trap number: 1201
Array Manager event number: 683

2128 BGI cancelled
Severity: Ok / Normal
Cause: BGI of a virtual disk has been cancelled. A user or the firmware may have stopped BGI.
Action: None
SNMP trap number: 1201
Array Manager event number: 684

2129 BGI failed
Severity: Critical / Failure / Error
Cause: BGI of a virtual disk has failed.
Action: None
SNMP trap number: 1204
Array Manager event number: 685

2130 BGI completed
Severity: Ok / Normal
Cause: BGI of a virtual disk has completed. This alert is provided for informational purposes.
Action: None
SNMP trap number: 1201
Array Manager event number: 686

2131 Firmware version mismatch
Severity: Warning / Non-critical
Cause: The firmware on the controller is not a supported version.
Action: Install a supported version of the firmware. If you do not have a supported version of the firmware available, it can be downloaded from the Dell support website at support.dell.com. If you do not have a supported version of the firmware available, check with your support provider for information on how to obtain the most current firmware.
SNMP trap number: 753
Array Manager event number: None
94. p and cleared

1000 Server Administrator starting
Severity: Information
Cause: Server Administrator is beginning to initialize.

1001 Server Administrator startup complete
Severity: Information
Cause: Server Administrator completed its initialization.

1002 A system BIOS update has been scheduled for the next reboot
Severity: Information
Cause: The user has chosen to update the flash basic input/output system (BIOS).

1003 A previously scheduled system BIOS update has been canceled
Severity: Information
Cause: The user has decided to cancel the flash BIOS update, or an error has occurred during the flash.

1004 Thermal shutdown protection has been initiated
Severity: Error
Cause: This message is generated when a system is configured for thermal shutdown due to an error event. If a temperature sensor reading exceeds the error threshold for which the system is configured, the operating system shuts down and the system powers off. This event may also be initiated on certain systems when a fan enclosure is removed from the system for an extended period of time.

Table 2-1. Miscellaneous Messages (continued)
(columns: EventID, Description, Severity, Cause)

1005 SMBIOS data is absent
Severity: Warning
Cause: The system management BIOS does not contain the required systems management BIOS version 2.2 or higher, or the BIOS is corrupted.

1006 Automatic System Recovery (ASR) action was performed
Severity: Error
Message fields: Action performed was: <Action>
Cause: This message is generated when an automatic system recovery action is performed due to a non re
95. particular Dual Inline Memory Module (DIMM).
This event is generated when the chipset is unable to correct the memory errors. Usually, a bank number is provided, and the DIMM may or may not be identifiable, depending on the error.
This event is generated when the chipset's ECC error correction rate exceeds a predefined limit.

BMC Watchdog Events
The BMC watchdog operations are performed when the system hangs or crashes. These messages monitor the status and occurrence of these events in a system.

Table 3-7. BMC Watchdog Events
(columns: Event Message, Severity, Cause)

BMC OS Watchdog timer expired
Severity: Information
Cause: This event is generated when the BMC watchdog timer expires and no action is set.

BMC OS Watchdog performed system reboot
Severity: Critical
Cause: This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to reboot.

BMC OS Watchdog performed system power off
Severity: Critical
Cause: This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to power off.

BMC OS Watchdog performed system power cycle
Severity: Critical
Cause: This event is generated when the BMC watchdog detects that the system has crashed (timer expired because no response was received from Host) and the action is set to power cycle.
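The relationship between the configured watchdog action and the logged event in Table 3-7 can be written as a small lookup. This is a minimal Python sketch of the table only; the action keys (`None`, "reboot", "power off", "power cycle") and the name `watchdog_event` are hypothetical labels chosen for illustration, not identifiers used by the BMC or by Server Administrator.

```python
from typing import Optional, Tuple

# Table 3-7 as a lookup: configured action -> (event message, severity).
# The dictionary keys are illustrative labels, not real BMC settings.
WATCHDOG_EVENTS = {
    None: ("BMC OS Watchdog timer expired", "Information"),
    "reboot": ("BMC OS Watchdog performed system reboot", "Critical"),
    "power off": ("BMC OS Watchdog performed system power off", "Critical"),
    "power cycle": ("BMC OS Watchdog performed system power cycle", "Critical"),
}


def watchdog_event(action: Optional[str]) -> Tuple[str, str]:
    """Return the (message, severity) pair logged when the timer expires."""
    try:
        return WATCHDOG_EVENTS[action]
    except KeyError:
        raise ValueError(f"action not listed in Table 3-7: {action!r}")


if __name__ == "__main__":
    for action in (None, "reboot", "power off", "power cycle"):
        message, severity = watchdog_event(action)
        print(f"{severity}: {message}")
```

Only the case with no action set is logged as Information; every case where the watchdog acts on the system is logged as Critical.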
96. perature when another fan has failed. Redundancy is normal when the intended number of critical components are operating. Redundancy is degraded when a component fails but others are still operating. Redundancy is lost when there is one less critical redundancy device than required.
• Power Supply Sensor: Monitors power supplies in the chassis and in any attached systems.
• Memory Prefailure Sensor: Monitors memory modules by counting the number of Error Correction Code (ECC) memory corrections.
• Fan Enclosure Sensor: Monitors protective fan enclosures by detecting their removal from and insertion into the system, and by measuring how long a fan enclosure is absent from the chassis. This sensor monitors the chassis and any attached systems.
• AC Power Cord Sensor: Monitors the presence of AC power for an AC power cord.
• Hardware Log Sensor: Monitors the size of a hardware log.
• Processor Sensor: Monitors the processor status in the system.
• Pluggable Device Sensor: Monitors the addition, removal, or configuration errors for some pluggable devices, such as memory cards.

Sample Event Message Text
The following example shows the format of the event messages logged by Server Administrator.
EventID: 1000
Source: Server Administrator
Category: Instrumentation Service
Type: Information
Date and Time: Wed Mar 15 10:38:00 2006
Computer: <computer name>
Description: Server Adm
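The sample event fields listed above (EventID, Source, Category, Type, Date and Time, Computer, Description) can be read mechanically if each field sits on its own line as "Name: value". That layout is an assumption, since the extracted text has lost the original punctuation. The following Python sketch parses such text; the function name `parse_event_message` and the host name `example-host` are hypothetical.

```python
from typing import Dict


def parse_event_message(text: str) -> Dict[str, str]:
    """Split 'Name: value' lines into a dictionary.

    Assumes one field per line. Only the first colon separates the
    name from the value, so times such as 10:38:00 stay intact.
    """
    fields: Dict[str, str] = {}
    for line in text.splitlines():
        name, separator, value = line.partition(":")
        if separator:  # skip lines that contain no colon
            fields[name.strip()] = value.strip()
    return fields


# Sample built from the fields shown in the guide; the computer name
# is a placeholder.
SAMPLE = """EventID: 1000
Source: Server Administrator
Category: Instrumentation Service
Type: Information
Date and Time: Wed Mar 15 10:38:00 2006
Computer: example-host
Description: Server Administrator starting"""

if __name__ == "__main__":
    event = parse_event_message(SAMPLE)
    print(event["EventID"], event["Type"], event["Description"])
```

All values are kept as strings, so a caller that needs a numeric event ID or a parsed timestamp would convert them explicitly.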
97. plication programming 1103 interface API There could also be a 1153 1203 problem with the driver or firmware Action Check the cables Check if the controller has a supported version of the driver and firmware You can download the most current version of the driver and firmware from support dell com Rebooting the system may also resolve this problem Storage Management Message Reference Table 4 1 Storage Management Messages continued Event ID Description Severity Cause and Action SNMP Trap Array Numbers Manager Event Number 2266 Controller log file Ok Cause This alert is provided for 751 None 2267 2268 2269 entry 1 Normal 1 is a substitution variable that will appear in the alert description for specific details about the alert The controller Ok reconstruct rate has Normal changed 1 Storage Critical Management has lost Failure communication with Error this RAID controller and attached storage An immediate reboot is strongly recommended to avoid further problems If the reboot does not restore communication there may be a hardware failure NOTE 1 is a substitution variable that will appear in the alert description for specific details about the alert The array disk Clear Ok operation has Normal completed informational purposes Action None Cause This alert is provided for 751 informational purposes Action None Cause Storage Mana
98. r 1201 informational purposes Action None 502 503 504 505 506 507 Storage Management Message Reference 47 Table 4 1 Storage Management Messages continued Event ID Description Severity Cause and Action SNMP Trap Array Numbers Manager Event Number 2056 Virtual disk failed Critical Cause One or more physical disks included 1204 508 Failure in the virtual disk have failed If the virtual Error disk is non redundant does not use mirrored or parity data then the failure of a single physical disk can cause the virtual disk to fail If the virtual disk is redundant then more physical disks have failed than can be rebuilt using mirrored or parity information Action Create a new virtual disk and restore from a backup 2057 Virtual disk degraded Warning Cause 1 This alert message occurs whena 1203 509 Non critical physical disk included in a redundant virtual disk fails Because the virtual disk is redundant uses mirrored or parity information and only one physical disk has failed the virtual disk can be rebuilt Action 1 Configure a hot spare for the virtual disk if one is not already configured Rebuild the virtual disk When using a Expandable RAID Controller PERC 2 SC 3 SC 2 DC 3 DCL 3 DC 3 QC 4 SC 4 DC 4e DC 4 Di or CERC ATA100 4ch controller rebuild the virtual disk by first configuring a hot spare for the disk and then initiating a write operation to the disk The write
99. r Name Location gt is the entity provide enough cooling to the system that this sensor is monitoring For example BMC Back Fan or BMC Front Fan Reading is specified in RPM For example 100 RPM lt Sensor Name Location gt Fan sensor Information The fan specified by lt Sensor returned to normal state lt Reading gt Name Location gt has returned to its normal operating speed lt Sensor Name Location gt Fan sensor Warning The speed of the specified lt Sensor detected a warning lt Reading gt Name Location gt fan may not be sufficient to provide enough cooling to the system lt Sensor Name Location gt Fan Redundancy Information The fan specified by lt Sensor sensor redundancy degraded lt Sensor Name Location gt Fan Redundancy Critical sensor redundancy lost lt Sensor Name Location gt Fan Redundancy Information sensor redundancy regained Name Location gt may have failed and hence the redundancy has been degraded The fan specified by lt Sensor Name Location gt may have failed and hence the redundancy that was degraded previously has been lost The fan specified by lt Sensor Name Location gt may have started functioning again and hence the redundancy has been regained System Event Log Messages for IPMI Systems 39 Processor Status Events The processor status messages monitor the functionality of the processors in a system These messages provide processor health and warning informa
100. r type is not discrete: Temperature sensor value (in degrees Celsius): <Reading> If sensor type is discrete: Discrete temperature state: <State>
backplane board, system board, or drive carrier in the specified system detected an error from which it cannot recover. The sensor location, chassis location, previous state, and temperature sensor value are provided.
Event Message Reference
Cooling Device Messages
Cooling device sensors listed in Table 2-3 monitor how well a fan is functioning. Cooling device messages provide status and warning information for fans in a particular chassis.
Table 2-3. Cooling Device Messages
EventID: 1100
Description: Fan sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor in the specified system is not functioning. The sensor location, chassis location, previous state, and fan sensor value are provided.
EventID: 1101
Description: Fan sensor value unknown
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  Fan sensor value: <Reading>
Severity: Information
Cause: A fan sensor in the specified system could not obtain a reading. The sensor location, chassis location, previous state, and a nominal fan sensor value are provided.
EventID: 1102
Description: Fan sensor returned to
101. rated when a fatal error is detected on the PCIE bus 44 System Event Log Messages for IPMI Systems Storage Management Message Reference Storage Management s alert or event management features let you monitor the health of storage resources such as controllers connectors array disks and virtual disks Alert Monitoring and Logging The Storage Management Service performs alert monitoring and logging By default the Storage Management Service starts when the managed system starts up If you stop the Storage Management Service then alert monitoring and logging stops Alert monitoring does the following e Updates the status of the storage object that generated the alert e Propagates the storage object s status to all the related higher objects in the storage hierarchy For example the status of a lower level object will be propagated up to the status displayed on the Health tab for the top level storage object e Logs an alert into the Alert log and the Windows application log Sends an SNMP trap if the operating system s SNMP service is installed and enabled K NOTE Storage Management does not log alerts regarding the data 1 0 path These alerts are logged by the respective RAID drivers in the system alert log For updated information lookup the Storage Management Online Help and the Dell OpenManage Server Administrator Storage Management User s Guide Alert Descriptions and Corrective Actions
102. remapped Action Replace the array disk to avoid future data loss 2273 Bad media Critical Cause A source array disk in a redundant 904 None Failure virtual disk has a bad disk block The Error algorithm that maintains redundant data has created a similar bad block on the target redundant disk to maintain consistency in disk block addressing Data has been lost Action Restore from backup 2274 The array disk rebuild Ok Cause This alert is provided for 901 None has resumed Normal informational purposes Action None 2276 The dedicated hot Warning Cause The dedicated hot spare is not large 903 None spare is too small Non critical enough to protect all virtual disks that reside on the disk group Action Assign a larger disk as the dedicated hot spare Storage Management Message Reference Table 4 1 Storage Management Messages continued Event ID Description Severity Cause and Action SNMP Trap Array Numbers Manager Event Number 2277 2278 2279 2280 2281 The global hot spare is too small The controller battery charge level is below a normal threshold The controller battery charge level is above a normal threshold A disk media error has been corrected Virtual disk has inconsistent data Warning Non critical Critical Failure Error Ok Normal Ok Normal Ok Normal Cause The global hot spare is not large 903 enough to protect all virtual
103. ription 12 User initiated host system reset 16 V viewing event information 11 event messages 9 events in NetWare 10 events in Red Hat Linux 10 events in Windows 2000 10 Virtual disk check consistency cancelled 49 Virtual disk check consistency completed 51 Virtual disk check consistency failed 50 Virtual disk check consistency started 48 Virtual disk configuration changed 47 Virtual disk created 47 Virtual disk degraded 48 Virtual disk deleted 47 Virtual disk failed 48 Virtual disk format changed 50 Virtual disk format completed 51 Virtual disk format started 49 Virtual disk initialization 62 Virtual disk initialization cancelled 50 Virtual disk initialization completed 52 Virtual disk initialization failed 50 Virtual disk initialization started 49 Virtual disk rebuild completed 52 Virtual disk rebuild failed 51 Virtual disk rebuild started 49 Virtual disk reconfiguration completed 52 Virtual disk reconfiguration failed 51 Virtual disk reconfiguration started 49 Virtual disk renamed 66 voltage sensor 8 Voltage sensor detected a failure value 21 39 Voltage sensor detected a non recoverable value 22 Voltage sensor detected a warning value 21 Voltage Sensor Events 38 Voltage sensor has failed 20 39 voltage sensor messages 20 39 Voltage sensor returned to a normal value 21 Voltage sensor value unknown 20 39
104. rovided.
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
Hardware Log Sensor Messages
Hardware logs provide hardware status messages to systems management software. On certain systems, the hardware log is implemented as a circular queue. When the log becomes full, the oldest status messages are overwritten when new status messages are logged. On some systems, the log is not circular. On these systems, when the log becomes full, subsequent hardware status messages are lost. Hardware log sensor messages listed in Table 2-12 provide status and warning information about the noncircular logs that may fill up, resulting in lost status messages.
Table 2-12. Hardware Log Sensor Messages
EventID: 1550
Description: Log monitoring has been disabled
  Log type: <Log type>
Severity: Information
Cause: A hardware log sensor in the specified system is disabled. The log type information is provided.
EventID: 1551
Description: Log status is unknown
  Log type: <Log type>
Severity: Information
Cause: A hardware log sensor in the specified system could not obtain a reading. The log type information is provided.
EventID: 1552
Description: Log size is no longer near or at capacity
  Log type: <Log type>
Severity: Information
Cause: The hardware log on the specified system is no longer near or at its capacity, usually as the result of
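The difference between the two hardware log behaviors described in item 104 can be illustrated with a short model. This is only a sketch of the general idea (overwrite the oldest entry versus drop the newest); the class names `CircularLog` and `NonCircularLog` are hypothetical and do not describe how any particular system stores its log.

```python
from collections import deque


class CircularLog:
    """Model of a circular log: when full, the oldest entry is overwritten."""

    def __init__(self, capacity):
        # A deque with maxlen discards the oldest item on overflow.
        self._entries = deque(maxlen=capacity)

    def append(self, message):
        self._entries.append(message)
        return True  # a circular log always accepts the new entry

    def entries(self):
        return list(self._entries)


class NonCircularLog:
    """Model of a noncircular log: when full, new entries are lost."""

    def __init__(self, capacity):
        self._capacity = capacity
        self._entries = []

    def append(self, message):
        if len(self._entries) >= self._capacity:
            return False  # log is full, so the message is dropped
        self._entries.append(message)
        return True

    def entries(self):
        return list(self._entries)
```

In this model, only the noncircular log can silently lose new status messages, which is the situation the hardware log sensor messages warn about.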
105. rposes changed Action None 2237 A controller rescan Ok Cause This alert is provided for 751 None has been initiated Normal informational purposes Action None 2238 The controller debug Ok Cause This alert is provided for 751 None log file has been Normal informational purposes exported Action None 2239 A foreign Ok Cause This alert is provided for 751 None configuration has Normal informational purposes been cleared Action None 2240 A foreign Ok Cause This alert is provided for 751 None configuration has Normal informational purposes been imported Action None 2241 The Patrol Read Ok Cause This alert is provided for 751 None mode has changed Normal informational purposes Action None 2242 The Patrol Read has Ok Cause This alert is provided for 751 None started Normal informational purposes Action None 2243 The Patrol Read has Ok Cause This alert is provided for 751 None stopped Normal informational purposes Action None 2244 A virtual disk blink Ok Cause This alert is provided for 1201 None has been initiated Normal informational purposes Action None 2245 A virtual disk blink Ok Cause This alert is provided for 1201 None 76 has ceased Normal informational purposes Action None Storage Management Message Reference Table 4 1 Storage Management Messages continued Event ID Description Severity Cause and Action SNMP Trap Array Numbers Manager Event Number 2246 The contr
106. s generated when there is a memory failure in a spared memory configuration This event is generated when redundancy is lost in a spared memory configuration This event is generated when the redundancy lost or degraded earlier is regained in a spared memory configuration Hardware Log Sensor Events The hardware logs provide hardware status messages to the system management software On particular systems the subsequent hardware messages are not displayed when the log is full These messages provide status and warning messages when the logs are full Table 3 9 Hardware Log Sensor Events Event Message Severity Cause Log full detected Critical This event is generated when the SEL device detects that only one entry can be added to the SEL before it is full Log cleared Information This event is generated when the SEL is cleared Drive Events The drive event messages monitor the health of the drives in a system These events are generated when there is a fault in the drives indicated Table 3 10 Drive Events Event Message Severity Drive lt Drive gt asserted fault Critical state Drive lt Drive gt de asserted Information fault state Cause This event is generated when the specified drive in the array is faulty This event is generated when the specified drive recovers from a faulty condition System Event Log Messages for IPMI Systems 43 Intrusion Events The chassis intrusion mes
107. s has failed For example a fan or power supply may have failed Although the enclosure is currently operational the failure of additional components could cause the enclosure to fail Action Identify and replace the failed component To identify the failed component select the enclosure in the tree view and click the Health subtab Any failed component will be identified with a red X on the enclosure s Health subtab Alternatively you can select the Storage object and click the Health subtab The controller status displayed on the Health subtab indicates whether a controller has a failed or degraded component See the enclosure documentation for information on replacing enclosure components and for other diagnostic information Storage Management Message Reference 59 Table 4 1 Storage Management Messages continued Event ID Description Severity Cause and Action SNMP Trap Array Numbers Manager Event Number 2123 Redundancy lost Warning Cause A virtual disk or an enclosure has lost 1306 None Non critical data redundancy In the case of a virtual disk one or more array disks included in the virtual disk have failed Due to the failed array disk or disks the virtual disk is no longer maintaining redundant mirrored or parity data The failure of an additional array disk will result in lost data In the case of an enclosure more than one enclosure component has failed For example the enclosure may have suffer
108. sages are a security measure. Chassis intrusion alerts are generated when the system's chassis is opened. Alerts are sent to prevent unauthorized removal of parts from the chassis.
Table 3-11. Intrusion Events
Event Message: <Intrusion sensor Name> sensor detected an intrusion
Severity: Critical
Cause: This event is generated when the intrusion sensor detects an intrusion.
Event Message: <Intrusion sensor Name> sensor returned to normal state
Severity: Information
Cause: This event is generated when the earlier intrusion has been corrected.
BIOS Generated System Events
The BIOS generated messages monitor the health and functionality of the chipsets, I/O channels, and other BIOS related functions. These system events are generated by the BIOS.
Table 3-12. BIOS Generated System Events
Event Message: System Event I/O channel chk
Severity: Critical
Cause: This event is generated when a critical interrupt is generated in the I/O Channel.
Event Message: System Event PCI Parity Err
Severity: Critical
Cause: This event is generated when a parity error is detected on the PCI bus.
Event Message: System Event Chipset Err
Severity: Critical
Cause: This event is generated when a chip error is detected.
Event Message: System Event PCI System Err
Severity: Critical
Cause: This event indicates historical data and is generated when the system has crashed and recovered.
Event Message: System Event PCI Fatal Err
Severity: Critical
Cause: This error is generated when a fatal error is detected on the PCI bus.
Event Message: System Event PCIE Fatal Err
Severity: Critical
Cause: This error is gene
109. se array disks contain virtual disks that were created on the other controller. See Import Foreign Configuration and Clear Foreign Configuration for more information.
Action: None
Event ID: 2327
Description: The NVRAM has corrupted data. The controller is reinitializing the NVRAM.
Severity: Warning / Non-critical
Cause: The NVRAM has corrupted data. This may occur after a power surge, a battery failure, or for other reasons. The controller is reinitializing the NVRAM.
Action: None. The controller is taking the required corrective action. If this alert is generated often (such as during each reboot), replace the controller.
SNMP Trap Numbers: 753
Array Manager Event Number: None
Event ID: 2328
Description: The NVRAM has corrupt data.
Severity: Warning / Non-critical
Cause: The NVRAM has corrupt data. The controller is unable to correct the situation.
Action: Replace the controller.
SNMP Trap Numbers: 753
Array Manager Event Number: None
Table 4-1. Storage Management Messages (continued)
Event ID: 2329
Description: SAS port report: %1
  NOTE: %1 is a substitution variable that will appear in the alert description for specific details about the alert.
Severity: Warning / Non-critical
Cause: The text for this alert is generated by the controller and can vary depending on the situation.
Action: Make sure the cables are attached securely. If the problem persists, replace the cable with a valid cable according to SAS specifications. If the problem still persists, you may ne
SNMP Trap Numbers: 753
Array Manager Event Number: None
110. sor detected a failure lt Reading gt where lt Sensor Name Location gt is the entity that this sensor is monitoring For example CMOS Battery Reading is specified in volts For example 3 860 V lt Sensor Name Location gt voltage Critical sensor state asserted lt Sensor Name Location gt voltage Information sensor state de asserted lt Sensor Name Location gt voltage Warning sensor detected a warning lt Reading gt lt Sensor Name Location gt voltage Information sensor returned to normal lt Reading gt The voltage of the monitored device is out of critical threshold The voltage specified by lt Sensor Name Location gt is in critical state The voltage of a previously reported lt Sensor Name Location gt is returned to normal state Voltage of the monitored entity lt Sensor Name Location gt exceeded the warning threshold The voltage of a previously reported lt Sensor Name Location gt is returned to normal state System Event Log Messages for IPMI Systems Fan Sensor Events The cooling device sensors monitor how well a fan is functioning These messages provide status warning and failure messages for fans for a particular chassis Table 3 3 Fan Sensor Events Event Message Severity Cause lt Sensor Name Location gt Fan sensor Critical The speed of the specified lt Sensor detected a failure lt Reading gt where Name Location gt fan is not sufficient to lt Senso
111. sponsive operating system. The action performed and the time of action are provided.
  Date and time of action: <Date and time>
EventID: 1007
Description: User initiated host system control action
  Action requested was: <Action>
Severity: Information
Cause: User requested a host system control action to reboot, power off, or power cycle the system. Alternatively, the user had indicated protective measures to be initiated in the event of a thermal shutdown.
EventID: 1008
Description: Systems Management Data Manager Started
Severity: Information
Cause: Systems Management Data Manager services were started.
EventID: 1009
Description: Systems Management Data Manager Stopped
Severity: Information
Cause: Systems Management Data Manager services were stopped.
Temperature Sensor Messages
Temperature sensors listed in Table 2-2 help protect critical components by alerting the systems management console when temperatures become too high inside a chassis. The temperature sensor messages use additional variables: sensor location, chassis location, previous state, and temperature sensor value or state.
Table 2-2. Temperature Sensor Messages
EventID: 1050
Severity: Information
Cause: A temperature sensor on the backplane board, system board, or the carrier in the specified system failed. The sensor location, chassis location, previous state, and
Description: Temperature sensor has failed
  Sensor location: <Location in chassis>
  Chassis location: <Name of chassis>
  Previous state was: <State>
  If s
112. t s New in this Release The following changes in Server Administrator are documented in this guide e Support for additional Storage Management messages Removed support for Novell NetWare Messages Not Described in This Guide This guide describes only event messages created by Server Administrator and displayed in the Server Administrator Alert log For information on other messages produced by your system consult one of the following sources e Your system s Installation and Troubleshooting Guide e Other system documentation e Operating system documentation e Application program documentation For more information on Array Manager event messages see the Array Manager documentation Introduction 7 Understanding Event Messages This section describes the various types of event messages generated by the Server Administrator When an event occurs on your system the Server Administrator sends information about one of the following event types to the systems management console Table 1 1 Understanding Event Messages Icon Alert Severity Component Status An event that describes the successful operation of a unit The alert is provided A OK Normal for informational purposes and does not indicate an error condition A For example the alert may indicate the normal start or stop of an operation such as power supply or a sensor reading returning to normal An event that is not necessarily significant but m
113. temperature 8 voltage 8 Server Administrator starting 15 Server Administrator startup complete 15 Service tag changed 65 Smart configuration change 55 Smart FPT exceeded 55 Smart warning 55 Smart warning degraded 56 Smart warning temperature 56 SMBIOS data is absent 16 System Event Log Messages 37 system management data manager started 16 system management data manager stopped 16 106 Index T Temperature dropped below the minimum failure threshold 54 Temperature dropped below the minimum warning threshold 54 Temperature exceeded the maximum failure threshold 54 Temperature exceeded the maximum warning threshold 54 temperature sensor 8 Temperature sensor detected a failure value 18 Temperature sensor detected a non recoverable value 18 Temperature sensor detected a warning value 18 Temperature Sensor Events 37 Temperature sensor has failed 17 37 temperature sensor messages 16 37 Temperature sensor returned to a normal value 17 37 Temperature sensor value unknown 17 37 The current kernel version and the non RAID SCSI driver version are older than the minimum required levels 68 The non RAID SCSI driver version is older than the minimum required level 68 The RAID controller firmware and driver validation was not performed 67 Thermal shutdown protection has been initiated 15 U understanding event desc
114. tenance Clear the memory error on multibit ECC error The memory device status and location are provided Fan Enclosure Messages Some systems are equipped with a protective enclosure for fans Fan enclosure messages listed in Table 2 10 monitor whether foreign objects are present in an enclosure and how long a fan enclosure is missing from a chassis Table 2 10 Fan Enclosure Messages EventID Description Severity Cause 1450 Fan enclosure sensor has failed Information The fan enclosure sensor in the specified system failed The sensor location and chassis location are provided Sensor location lt Location in chassis gt Chassis location lt Name of chassis gt 1451 Fan enclosure sensor value unknown Information The fan enclosure sensor in the specified system could not obtain a reading The sensor location and chassis location are provided Sensor location lt Location in chassis gt Chassis location lt Name of chassis gt 1452 Fan enclosure inserted into system Information A fan enclosure has been inserted into the specified system The sensor location and chassis location Chassis location lt Name of chassis gt are provided Sensor location lt Location in chassis gt Event Message Reference 31 Table 2 10 Fan Enclosure Messages continued EventID Description Severity Cause 1453 Fan enclosure removed from system Warning A fan enclosure has been removed Sensor location lt Location
115. Description: th the battery or the battery charger have been detected. The battery health is poor.
Severity: Critical / Failure / Error
Cause: The battery or the battery charger is not functioning properly.
Action: Replace the battery pack.
SNMP Trap Numbers: 1154
Array Manager Event Number: None
Table 4-1. Storage Management Messages (continued)
Event ID: 2319
Description: Single bit ECC error. The DIMM is degrading.
Severity: Warning / Non-critical
Cause: The DIMM is beginning to malfunction.
Action: Replace the DIMM to avoid data loss or data corruption. The DIMM is a part of the controller battery pack. See your hardware documentation for information on replacing the DIMM.
SNMP Trap Numbers: 753
Array Manager Event Number: None
Event ID: 2320
Description: Single bit ECC error. The DIMM is critically degraded.
Severity: Critical / Failure / Error
Cause: The DIMM is malfunctioning. Data loss or data corruption may be imminent.
Action: Replace the DIMM immediately to avoid data loss or data corruption. The DIMM is a part of the controller battery pack. See your hardware documentation for information on replacing the DIMM.
SNMP Trap Numbers: 754
Array Manager Event Number: None
Event ID: 2321
Description: Single bit ECC error. The DIMM is critically degraded. There will be no further reporting.
Severity: Critical / Failure / Error
Cause: The DIMM is malfunctioning. Data loss or data corruption is imminent. The DIMM must be replaced immediately. No further alerts will be generated.
Action: Replace the DIMM immediately. The DIMM is a part of the controller battery pa
116. the specified system could not obtain a reading The sensor location chassis location Previous state was lt State gt previous state and additional power supply status information are provided Sensor location lt Location in chassis gt Chassis location lt Name of chassis gt Power Supply type lt type of power supply gt lt Additional power supply status information gt If in configuration error state Configuration error type lt type of configuration error gt 1352 Power supply returned to normal Information A power supply has been reconnected or replaced The sensor location chassis location previous state and additional Previous state was lt State gt power supply status information are provided Sensor location lt Location in chassis gt Chassis location lt Name of chassis gt Power Supply type lt type of power supply gt lt Additional power supply status information gt If in configuration error state Configuration error type lt type of configuration error gt 1353 Power supply detected a warning Warning A power supply sensor reading in the specified system exceeded a user definable warning threshold The sensor location Previous state was lt State gt chassis location previous state and additional power supply status information are provided Sensor location lt Location in chassis gt Chassis location lt Name of chassis gt Power Supply type lt type of power
117. the Red Hat Enterprise Linux message log (/var/log/messages). The text in boldface type indicates the message text.
NOTE: These messages are typically displayed as one long line. In the following example, the message is displayed using line breaks to help you see the message text more clearly.
Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service EventID: 1000 Server Administrator starting
Feb 6 14:20:51 server01 Server Administrator: Instrumentation Service EventID: 1001 Server Administrator startup complete
Feb 6 14:21:21 server01 Server Administrator: Instrumentation Service EventID: 1254 Chassis intrusion detected Sensor location: Main chassis intrusion Chassis location: Main System Chassis Previous state was: OK (Normal) Chassis intrusion state: Open
Feb 6 14:21:51 server01 Server Administrator: Instrumentation Service EventID: 1252 Chassis intrusion returned to normal Sensor location: Main chassis intrusion Chassis location: Main System Chassis Previous state was: Critical (Failed) Chassis intrusion state: Closed
Viewing the Event Information
The event log for each operating system contains some or all of the following information:
• Date: The date the event occurred.
• Time: The local time the event occurred.
• Type: A classification of the event severity: Information, Warning, or Error.
• User: The name of the user on whose
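The Red Hat Enterprise Linux log entries in item 117 follow a regular pattern, so the event ID and message text could be pulled out with a regular expression. The following is a minimal sketch, assuming each entry is a single line in the layout `Mon D HH:MM:SS host Server Administrator: <category> EventID: <id> <text>`; that exact punctuation is an assumption, and the names `LINE_RE` and `parse_event_line` are hypothetical.

```python
import re

# Assumed single-line layout (the punctuation is an assumption):
#   Feb 6 14:20:51 server01 Server Administrator: <category> EventID: <id> <text>
LINE_RE = re.compile(
    r"^(?P<month>\w{3})\s+(?P<day>\d{1,2})\s+(?P<time>\d{2}:\d{2}:\d{2})\s+"
    r"(?P<host>\S+)\s+Server Administrator:\s+(?P<category>.+?)\s+"
    r"EventID:\s*(?P<event_id>\d+)\s+(?P<text>.*)$"
)


def parse_event_line(line):
    """Return a dict of fields for a matching log line, or None."""
    match = LINE_RE.match(line.strip())
    if match is None:
        return None
    fields = match.groupdict()
    fields["event_id"] = int(fields["event_id"])  # numeric ID for lookups
    return fields


sample = (
    "Feb 6 14:20:51 server01 Server Administrator: "
    "Instrumentation Service EventID: 1000 Server Administrator starting"
)
parsed = parse_event_line(sample)
```

Lines that do not match the assumed layout return `None`, so a caller can skip unrelated syslog entries rather than fail on them.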
118. the status of the processor sensor for example Processor sensor status Configuration error Introduction 13 14 Table 1 2 Event Description Reference continued Description Line Item Explanation Redundancy unit lt Redundancy location in chassis gt Sensor location lt Location in chassis gt Temperature sensor value lt Reading gt Voltage sensor value in Volts lt Reading gt Specifies the location of the redundant power supply or cooling unit in the chassis for example Redundancy unit Fan Enclosure Specifies the location of the sensor in the specified chassis for example Sensor location CPU1 Specifies the temperature in degrees Celsius for example Temperature sensor value in degrees Celsius 30 Specifies the voltage sensor value in volts for example Voltage sensor value in Volts 1 693 Introduction Event Message Reference The following tables list in numerical order each event ID and its corresponding description along with its severity and cause K NOTE For corrective actions see the appropriate documentation Miscellaneous Messages Miscellaneous messages in Table 2 1 indicate that certain alert systems are up and working Table 2 1 Miscellaneous Messages EventID Description Severity Cause 0000 Log was cleared Information User cleared the log from Server Administrator 0001 Log backup created Information The log was full copied to backu
119. tion for information on replacing the EMM 2298 There is abad sensor Warning Cause The enclosure has a bad sensor The 853 None on an enclosure Non critical enclosure sensors monitor the fan speeds temperature probes etc Action See the hardware documentation for more information Storage Management Message Reference Table 4 1 Storage Management Messages continued Event ID Description Severity Cause and Action SNMP Trap Array Numbers Manager Event Number 2299 Bad PHY 1 Critical Cause There is a problem with a physical 854 None NOTE 1 is a Failure connection or PHY substitution variable Error Action Replace the EMM that contains the that will appear in the bad PHY See the hardware documentation alert description for for information on replacing the EMM specific details about Attach the storage to a different connector if the alert available Make sure the cables are attached securely 2300 The enclosure is Critical Cause The controller is not receiving a 854 None unstable Failure consistent response from the enclosure Error There could be a firmware problem or an invalid cabling configuration If the cables are too long they will degrade the signal Action Power down all enclosures attached to the system and reboot the system If the problem persists upgrade the firmware to the latest supported version You can download the most current version of the driver and firmware from support dell com Make s
120. tion of a system Table 3 4 Processor Status Events Event Message Severity Cause lt Processor Entity gt status processor Critical IERR internal error generated by the sensor IERR where lt Processor lt Processor Entity gt Entity gt is the processor that generated the event For example PROC for a single processor system and PROC for multiprocessor system lt Processor Entity gt status processor Critical The processor generates this event before it sensor Thermal Trip shuts down because of excessive heat caused by lack of cooling or heat synchronizating lt Processor Entity gt status processor Information This event is generated when a processor sensor recovered from IERR recovers from the internal error lt Processor Entity gt status processor Warning This event is generated for all processors that sensor disabled are disabled lt Processor Entity gt status processor Information This event is generated if the terminator is sensor terminator not present missing on an empty processor slot Power Supply Events The power supply sensors monitor the functionality of the power supplies These messages provide status and warning information for power supplies for a particular system Table 3 5 Power Supply Events Event Message Severity Cause lt Power Supply Sensor Name gt power Critical This event is generated when the power supply sensor removed supply sensor is removed lt Power
121. to the port configuration.
  Action: Remove the last enclosure. You must remove the enclosure that has been added last and is causing the enclosure limit to exceed.

Table 4-1. Storage Management Messages (continued)

2192 The virtual disk Check Consistency has made corrections and completed
  Severity: Ok / Normal
  SNMP trap number: 1203
  Array Manager event number: None
  Cause: The virtual disk Check Consistency has identified errors and made corrections. For example, the Check Consistency may have encountered a bad disk block and remapped the disk block to restore data consistency. This alert is provided for informational purposes.
  Action: Monitor the battery and cache health to make sure they are functioning properly. Monitor the Alert Log for events related to the battery and write policy changes. You should also monitor the Alert Log for events related to disk errors. If you suspect that the battery or a disk have problems, replace the battery pack or the disk.

2193 The virtual disk reconfigure has resumed
  Severity: Ok / Normal
  SNMP trap number: 1201
  Array Manager event number: None
  Cause: This alert is provided for informational purposes.
  Action: None.

2194 The virtual disk read policy has changed
  Severity: Ok / Normal
  SNMP trap number: 1201
  Array Manager event number: None
  Cause: This alert is provided for informational purposes.
  Action: None.

2199 The virtual disk cache policy has changed
  Severity: Ok / Normal
  SNMP trap number: 1201
  Array Manager event number: None
  Cause: This alert is provided for informational purposes.
122. umbers Manager Event Number

2132 Driver version mismatch
  Severity: Warning / Non-critical
  SNMP trap number: 753
  Array Manager event number: None
  Cause: The controller driver is not a supported version.
  Action: Install a supported version of the driver. If you do not have a supported driver version available, it can be downloaded from the Dell support site at support.dell.com. If you do not have a supported version of the driver available, check with your support provider for information on how to obtain the most current driver.

2135 Array Manager is installed on the system
  Severity: Warning / Non-critical
  SNMP trap number: 103
  Array Manager event number: None
  Cause: Storage Management has been installed on a system that has an Array Manager installation.
  Action: Installing Storage Management and Array Manager on the same system is not a supported configuration. Uninstall either Storage Management or Array Manager.

2136 Virtual disk initialization
  Severity: Ok / Normal
  SNMP trap number: 1201
  Array Manager event number: None
  Cause: Virtual disk initialization is in progress. This alert is provided for informational purposes.
  Action: None.

Table 4-1. Storage Management Messages (continued)

2137 Communication timeout
  Severity: Warning / Non-critical
2138 Enclosure alarm enabled
  Severity: Ok / Normal
2139 Enclosure alarm disabled
  Severity: Ok / Normal
2140 Dead disk segments restored
  Severity: Ok / Normal

Cause: The contr
123. unmirrored, the disk formerly used as the mirror returns to being an array disk and becomes available for inclusion in another virtual disk.
  Action: This alert is provided for informational purposes.

Change write policy
  Severity: Ok / Normal
  SNMP trap number: 1201
  Cause: A user has changed the write policy for a virtual disk.
  Action: This alert is provided for informational purposes.

Enclosure firmware mismatch
  Severity: Warning / Non-critical
  SNMP trap number: 853
  Cause: The firmware on the enclosure management modules (EMM) is not the same version. It is required that both modules have the same version of the firmware. This alert may be caused when a user attempts to insert an EMM module that has a different firmware version than an existing module.
  Action: Download the same version of the firmware to both EMM modules.

Array Manager Event Number column for these rows: 607, 601, 672

Table 4-1. Storage Management Messages (continued)

2121 Device returned to normal
  Severity: Ok / Normal
  SNMP trap numbers: 752, 802, 852, 902, 952, 1002, 1052, 1102, 1152, 1202
  Array Manager event number: None
  Cause: A device that was previously in an error state has returned to a normal state. For example, if an enclosure became too hot and subsequently cooled down, then you may receive this alert.
  Action: This alert is provided for informational purposes.

2122 Redundancy degraded
  Severity: Warning / Non-critical
  SNMP trap number: 1305
  Array Manager event number: None
  Cause: One or more of the enclosure component
124. ure the cable configuration is valid. See the hardware documentation for valid cabling configurations.

2301 The enclosure has a hardware error
  Severity: Critical / Failure / Error
  SNMP trap number: 854
  Array Manager event number: None
  Cause: The enclosure or an enclosure component is in a Failed or Degraded state.
  Action: Check the health of the enclosure and its components. Replace any hardware that is in a Failed state. See the hardware documentation for more information.

2302 The enclosure is not responding
  Severity: Critical / Failure / Error
  SNMP trap number: 854
  Array Manager event number: None
  Cause: The enclosure or an enclosure component is in a Failed or Degraded state.
  Action: Check the health of the enclosure and its components. Replace any hardware that is in a Failed state. See the hardware documentation for more information.

Table 4-1. Storage Management Messages (continued)

2303 The enclosure cannot support both SAS and SATA array disks. Array disks may be disabled.
  Severity: Ok / Normal
  SNMP trap number: 851
  Array Manager event number: None
  Cause: This alert is provided for informational purposes.
  Action: None.

2304 An attempt to hot plug an EMM has been detected. This type of hot plug is not supported.
  Severity: Ok / Normal
  SNMP trap number: 751
  Array Manager event number: None
  Cause: This alert is provided for informational purposes.
  Action: None.

2305 The array disk is too small to be used for a rebuild
  Severity: Ok / Normal
  SNMP trap number: 901
  Array Manager event number: None
  Cause: This alert is provided for informational purposes.
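
The five alerts in excerpt 124 (2301 through 2305) can be held in a small lookup table so that a script can separate the ones that call for action from the informational ones. The following is only a sketch: the event IDs, descriptions, severities, and SNMP trap numbers are copied from the excerpt, while the names StorageEvent, EVENTS, events_needing_action, and describe are hypothetical and are not part of Server Administrator or any Dell tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StorageEvent:
    """One row of the alert table (hypothetical helper type)."""
    event_id: int
    description: str
    severity: str   # first severity label from the excerpt: "Critical" or "Ok"
    snmp_trap: int


# Values copied from excerpt 124; nothing here is queried from a live system.
EVENTS = {
    e.event_id: e
    for e in (
        StorageEvent(2301, "The enclosure has a hardware error", "Critical", 854),
        StorageEvent(2302, "The enclosure is not responding", "Critical", 854),
        StorageEvent(2303, "The enclosure cannot support both SAS and SATA array disks", "Ok", 851),
        StorageEvent(2304, "An attempt to hot plug an EMM has been detected", "Ok", 751),
        StorageEvent(2305, "The array disk is too small to be used for a rebuild", "Ok", 901),
    )
}


def events_needing_action(events):
    """Return the sorted IDs of events whose severity is anything other than Ok."""
    return sorted(e.event_id for e in events.values() if e.severity != "Ok")


def describe(event_id):
    """Format one event as a single line, or say that the excerpt does not cover it."""
    event = EVENTS.get(event_id)
    if event is None:
        return f"{event_id}: not in this excerpt"
    return f"{event.event_id} [{event.severity}] {event.description} (SNMP trap {event.snmp_trap})"


if __name__ == "__main__":
    for event_id in events_needing_action(EVENTS):
        print(describe(event_id))
```

Keying the table by event ID keeps lookups direct, and an ID that the excerpt does not cover yields an explicit message rather than an exception.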