
Sun Fire X4150, X4250, and X4450 Servers Diagnostics


Contents

1. The DIMM Fault and CPU Fault LEDs operate on stored power for up to a minute when the system is powered down, even after the AC power is disconnected and the motherboard is out of the system. The stored power lasts for about half an hour.

FIGURE B-7 Remind Button and DIMM LEDs on X4150 and X4250 Motherboard

FIGURE B-8 Remind Button and DIMM LEDs on X4450 Motherboard

FIGURE B-10 Remind Button and CPU LEDs on X4450 Motherboard

APPENDIX C
Using the ILOM Service Processor Web Interface to View System Information

This appendix contains information about using the Integrated Lights Out Manager (ILOM) service processor (SP) web interface to view monitoring and maintenance information for your server.
- Making a Serial Connection to the SP
- Viewing ILOM SP Event Logs
- Viewing Replaceable Component Information
- Viewing Sensors

Note: The information in this chapter might not appl
2.
4. To start the serial console, type the following commands:

   cd /SP/console
   start

   To exit console mode and return to the service processor, type Esc-Shift-9.

5. Continue with the following procedures:
   - Viewing ILOM SP Event Logs
   - Viewing Replaceable Component Information
   - Viewing Sensors

Viewing ILOM SP Event Logs

Events are notifications that occur in response to some actions. The IPMI system event log (SEL) provides status information about the server's hardware and software to the ILOM software, which displays the events in the ILOM web interface.

To view event logs:

1. Log in to the SP as administrator or operator to reach the ILOM web interface:
   a. Type the IP address of the server's SP into your web browser. The Sun Integrated Lights Out Manager Login screen appears.
   b. Type your user name and password. When you first try to access the ILOM SP, you are prompted to type the default user name and password. The default user name and password are:
      Default user name: root
      Default password: changeme

2. From the System Monitoring tab, select Event Logs. The System Event Logs page is displayed. See FIGURE C-1 for a page that shows sample information.

FIGURE C-1 System Event Logs Page
3. When you select the Immediate Burn-in Testing menu option, the Continuous Burn-in Testing window is displayed. The screen includes the list of options shown in TABLE 3-3 for running the tests. When a quick.tst, noinput.tst, or full.tst script is loaded, the defaults indicated in the third column are automatically loaded.

TABLE 3-3 Continuous Burn-in Testing Options

Option | Default (General) | Default Using quick.tst, noinput.tst, or full.tst Script | All Possible Choices
Pass Control | Overall Time | Overall Passes | Individual Passes, Overall Passes, or Overall Time
Duration | 01:00 | | Any number to designate the time duration of the test
Script File | N/A | quick.tst, noinput.tst, or full.tst | quick.tst, noinput.tst, or full.tst
Report File | None | None | User defined
Journal File | None | D:\noinput.jrl, D:\quick.jrl, or D:\full.jrl | User defined
Journal Options | Failed Tests | All Tests, Absent Devices, and Test Summary | Failed Tests, All Tests, Absent Devices, and Test Summary
Pause on Error | | N | Y or N
Screen Display | Control Panel | Control Panel | Control Panel or Running Tests
POST Card | N | N | Y or N
Beep Codes | N | N | Y or N
Maximum Fails | Disabled | Disabled | 1-9999
Componen
4. TABLE 3-1 System Information Menu Options (Continued)

Option | Description
CPU Frequency Monitor | Tests the processor speed.
CMOS RAM Utilities | Shows the CMOS settings of the system.
SCSI Utilities | Not relevant to the Sun Fire X4150, X4250, and X4450 servers.
Text File Editor | Opens a text file editor.
Start Up Options | Enables you to set up options for diagnostics testing.

Advanced Diagnostics Tests

Advanced Diagnostics Tests Menu Options

TABLE 3-2 gives the name and a brief description of each option in the Advanced Diagnostics Tests menu.

TABLE 3-2 Advanced Diagnostics Tests Menu Options

Option | Description
Processor | Details information about the processor and includes a Processor Tests menu to test the processor on the system.
Memory | Details information about the memory and includes a Memory Tests menu to test the memory on the system. Also lists each type of memory in the system, such as system, cache, or video memory.
Motherboard | Details information about the motherboard and includes a Motherboard Tests menu to test the motherboard on the system.
Diskettes | Not relevant to the Sun Fire X4150, X4250, and X4450 servers.
Hard Disks | Details information about the hard disk and includes a Hard Disk Tests menu to test hard disks on the system. Refer to "Testing the Hard Disk" on page 19 for detailed information about testing hard di
5. The Device Test options include the Mechanics Stress Test and the Internal Cache Test. These tests are relevant to testing non-media-related devices associated with the hard drive hardware, such as the head and internal cache.

In addition to choosing any of these tests, you can also define several parameters of the test. You can change the parameters within the Test Settings option. Your options within Test Settings include the following:
- Media Test Settings: Enables you to select the test time duration, the percentage of the hard disk to test, and the sectors to be tested on the hard disk.
- Device Test Settings: Enables you to select the test time durations of the devices and the test level.
- Number of Retries: Enables you to select the number of times to retry testing a device before terminating the test.
- Maximum Errors: Enables you to select the number of errors allowed before terminating the test.
- Check SMART First: SMART stands for Self-Monitoring, Analysis, and Reporting Technology.
- HPA Protection: HPA stands for Host Protected Area.
- Exit

Burn-in Tests

Immediate Burn-in Testing

The Immediate Burn-in Testing option enables you to run burn-in test scripts on your server. This section includes the following topics:
- Full System Tests
- Component Tests
- Running the Immediate Burn-in Tests

Full System Tests
6. When the BIOS Main menu appears, navigate to the BIOS Boot menu. Instructions for navigating within the BIOS screens appear on the BIOS screens.

On the BIOS Boot menu screen, select Boot Device Priority. The Boot Device Priority screen appears.

Select the DVD-ROM drive to be the primary boot device.

Save and exit the BIOS screens.

Reboot the server. When the server reboots from the CD in the DVD-ROM drive, the Solaris operating system boots, and SunVTS software starts and opens its first GUI window.

In the SunVTS GUI, press Enter or click the Start button when you are prompted to start the tests. The test suite runs until it encounters an error or the test is completed.

Note: The CD takes approximately nine minutes to boot.

9. When SunVTS software completes the test, review the log files generated during the test. SunVTS provides access to four different log files:
- SunVTS test error log contains time-stamped SunVTS test error messages. The log file path name is /var/opt/SUNWvts/logs/sunvts.err. This file is not created until a SunVTS test failure occurs.
- SunVTS kernel error log contains time-stamped SunVTS kernel and SunVTS probe errors. SunVTS kernel errors are errors that relate to running SunVTS, and not to testing of devices. The log file path name is /var/opt/SUNWvts/logs/vtsk.err. This file is not created until SunVTS reports a SunVTS kernel error.
- Su
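Because the two error logs named in this extract are created only after a failure occurs, a quick existence check can serve as a first pass before opening any file. The following is a minimal sketch, not part of the guide: it assumes the documented paths /var/opt/SUNWvts/logs/sunvts.err and /var/opt/SUNWvts/logs/vtsk.err, and the function name `sunvts_error_logs` is hypothetical.

```python
from pathlib import Path

# Default log directory and file names as given in the guide.
SUNVTS_LOG_DIR = Path("/var/opt/SUNWvts/logs")
ERROR_LOGS = {
    "test error log": "sunvts.err",
    "kernel error log": "vtsk.err",
}


def sunvts_error_logs(log_dir=SUNVTS_LOG_DIR):
    """Map each error log description to its path, or None if it is absent.

    An absent file means SunVTS has not reported that kind of error,
    because the file is created only when the first error occurs.
    """
    found = {}
    for label, name in ERROR_LOGS.items():
        path = Path(log_dir) / name
        found[label] = path if path.is_file() else None
    return found


if __name__ == "__main__":
    for label, path in sunvts_error_logs().items():
        status = path if path else "not created (no such error reported)"
        print(f"{label}: {status}")
```

The check only reports presence; interpreting the messages inside each file still requires reading the log itself.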
7. Front Panel LEDs
Back Panel LEDs
Hard Drive LEDs
Internal Status Indicator LEDs
Using the ILOM Service Processor Web Interface to View System Information
Making a Serial Connection to the SP
Viewing ILOM SP Event Logs
Interpreting Event Log Time Stamps
Viewing Replaceable Component Information
Viewing Sensors
Booting the Tools and Drivers CD from a PXE Server
Setting up the Tools and Drivers CD Image on the PXE Server
Accessing the Tools and Drivers CD From the Target Server
Index

Preface

The Sun Fire X4150, X4250, and X4450 Servers Diagnostics Guide contains information and procedures for using available tools to diagnose problems with the servers.

Before You Read This Document

It is important that you review the safety guidelines in the Sun Fire X4150 Safety and Compliance Guide and the Sun Fire X4250 and X4450 Safety and Compliance Guide.

Related Documentation

The document set for the Sun Fire X4150, X4250, and X4450 Servers is described in the Where To Find Sun Fire X4150, X4250, and X4450 Servers Documentation sheet that is packed with your system. You can also find the documentation at http://docs.sun.com.

Translated versions of some of these documents are available at http://docs.sun.com. Select a language from the drop-down list and navigate to the Sun Fire X41
8. Code ECC errors. See your Solaris documentation for details. To view ECC errors, use the following command:

   fmdump -eV

DIMM Fault LEDs

When you press the Remind button on the motherboard (or memory tray for X4450), the LEDs next to the DIMMs flash to indicate that the system has detected 24 or more CEs in a 24-hour period on that DIMM.
- DIMM fault LED is off: The DIMM is operating properly.
- DIMM fault LED is flashing (amber): At least one of the DIMMs in this DIMM pair has reported 24 CEs within a 24-hour period, or a UE (uncorrectable error).

See FIGURE 2-1 and FIGURE 2-2 for the locations of the Remind button and LEDs on the motherboard.

FIGURE 2-1 DIMMs and LEDs on Motherboard (X4150 and X4250)

FIGURE 2-2 DIMMs and LEDs on Mezzanine (X4450)

Isolating and Correcting DIMM ECC Errors

If your log files report an Error Correction Code (ECC) error or a problem with a DIMM, complete the following steps until you can isolate the fault. In this example, the log file reports an error with the DIMM in D0. The faul
9. FIGURE 1-3 X4450 Server Front Panel (callout: Power button)

2. Remove the server cover. For instructions on removing the server cover, refer to your server's service manual.

3. Inspect the internal status indicator LEDs. These can indicate component malfunction. For the LED locations and descriptions of their behavior, see "Internal Status Indicator LEDs" on page 43.

   Note: The server must be in standby power mode to view the internal LEDs.

4. Verify that there are no loose or improperly seated components.

5. Verify that all cable connectors inside the system are firmly and correctly attached to their appropriate connectors.

6. Verify that any after-factory components are qualified and supported. For a list of supported PCI cards and DIMMs, refer to your server's service manual.

7. Check that the installed DIMMs comply with the supported DIMM population rules and configurations, as described in the service manual for your product.

8. Replace the server cover.

9. To restore the server to main power mode (all components powered on), use a ballpoint pen or other nonconductive stylus to press and release the Power button on the server front panel. See FIGURE 1-1, FIGURE 1-2, and FIGURE 1-3. When main power is applied to the full server, the Power/OK LED next to the Power button lights and remains lit.

10. If the problem with the server is not evident, you
10. [Screenshot: Sun Integrated Lights Out Manager, Sensor Readings page in Windows Internet Explorer, listing sample presence, temperature, and voltage readings for system sensors. Click a sensor name for more information, including threshold values.]
11. SP reboots as a result of the following:
- A complete system power cycle
- An IPMI command, for example, mc reset cold
- A command-line interface (CLI) command, for example, reset /SP
- ILOM web interface operation, for example, from the Maintenance tab, selecting Reset SP
- An SP firmware upgrade

After an SP reboot, the SP clock is changed by the following events:
- When the host is booted. The host's BIOS unconditionally sets the SP time to that indicated by the host's RTC. The host's RTC is set by the following operations:
  - When the host's CMOS is cleared as a result of changing the host's RTC battery or inserting the CMOS-clear jumper on the motherboard. The host's RTC starts at Jan 1 00:01:00 2002.
  - When the host's operating system sets the host's RTC. The BIOS does not consider time zones. Solaris and Linux software respect time zones and will set the system clock to UTC. Therefore, after the OS adjusts the RTC, the time set by the BIOS will be UTC.
  - When the user sets the RTC using the host BIOS Setup screen.
- Continuously via NTP, if NTP is enabled on the SP. NTP jumping is enabled to recover quickly from an erroneous update from the BIOS or user. NTP servers provide UTC time. Therefore, if NTP is enabled on the SP, the SP clock will be in UTC.
- Manually, using the CLI, ILOM web interface, and IPMI.

Vi
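The clock rules above imply that some event log time stamps reflect a reset clock rather than real time. The sketch below is an illustration only: it assumes time stamps in the form shown in the sample event log (for example, Wed Nov 28 09:39:10 2007), and the 30-day window and the function name `classify_timestamp` are assumptions, not values from the guide.

```python
from datetime import datetime, timedelta

# Reset values stated in the guide.
SP_EPOCH = datetime(1970, 1, 1, 0, 0, 0)      # SP clock after an SP reboot
RTC_DEFAULT = datetime(2002, 1, 1, 0, 1, 0)   # host RTC after a CMOS clear

# Assumed format, matching the sample log entries (weekday, month, day, time, year).
STAMP_FORMAT = "%a %b %d %H:%M:%S %Y"


def classify_timestamp(stamp, window=timedelta(days=30)):
    """Guess whether a time stamp was taken before the clock was set.

    The window is a heuristic (an assumption): a stamp that falls shortly
    after a known reset value most likely counts up from that value.
    """
    when = datetime.strptime(stamp, STAMP_FORMAT)
    if SP_EPOCH <= when < SP_EPOCH + window:
        return "sp-clock-unset"
    if RTC_DEFAULT <= when < RTC_DEFAULT + window:
        return "host-rtc-default"
    return "clock-set"


if __name__ == "__main__":
    for stamp in ("Thu Jan 1 00:05:12 1970",
                  "Tue Jan 1 00:03:00 2002",
                  "Wed Nov 28 09:39:10 2007"):
        print(stamp, "->", classify_timestamp(stamp))
```

A stamp classified as clock-set can still be wrong if the RTC was set incorrectly by hand; the check only catches the two documented reset values.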
12. can obtain additional information by viewing the power-on self-test (POST) messages and BIOS event logs during system startup. Continue with "Viewing Event Logs" on page 37.

CHAPTER 2
Troubleshooting DIMM Problems

This chapter describes how to detect and correct problems with the server's Dual Inline Memory Modules (DIMMs). It includes the following sections:
- DIMM Replacement Guidelines
- How DIMM Errors Are Handled by the System
- Isolating and Correcting DIMM ECC Errors

Note: Refer to the service manual or service label for the system that you are servicing for information on DIMM population rules.

DIMM Replacement Guidelines

Replace a DIMM when one of the following events takes place:
- The DIMM fails memory testing under BIOS due to Uncorrectable Memory Errors (UCEs).
- UCEs occur and investigation shows that the errors originated from memory.
- More than 24 Correctable Errors (CEs) originate in 24 hours from a single DIMM and no other DIMM is showing further CEs.

Note: If more than one DIMM has experienced multiple CEs, other possible causes of CEs must be ruled out by a qualified Sun Support specialist before replacing any DIMMs.

Retain copies of the logs showing the memory errors to send to Sun for verification prior to calling Sun.

How DIMM Errors Are Handle
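The correctable-error guideline above amounts to a sliding-window count per DIMM. The following sketch shows one way to apply it to a set of CE time stamps already extracted from the logs. It is an illustration under stated assumptions: the input format, the function names, and the conservative choice to escalate whenever more than one DIMM shows any CE are not taken from the guide.

```python
from datetime import datetime, timedelta


def max_ces_in_window(times, window=timedelta(hours=24)):
    """Return the largest number of CEs that fall inside any one window."""
    times = sorted(times)
    best = 0
    start = 0
    for end, t in enumerate(times):
        # Advance the window start until the span fits inside the window.
        while t - times[start] > window:
            start += 1
        best = max(best, end - start + 1)
    return best


def assess_dimms(ce_times_by_dimm, threshold=24):
    """Apply the CE replacement guideline to per-DIMM CE time stamps.

    Returns ("replace", dimm) when a single DIMM exceeds the threshold and
    no other DIMM shows CEs, ("escalate", None) when more than one DIMM
    shows CEs (support must rule out other causes first), and
    ("monitor", None) otherwise.
    """
    with_ces = [d for d, ts in ce_times_by_dimm.items() if ts]
    over = [d for d in with_ces
            if max_ces_in_window(ce_times_by_dimm[d]) > threshold]
    if len(with_ces) > 1:
        return ("escalate", None)
    if len(over) == 1:
        return ("replace", over[0])
    return ("monitor", None)


if __name__ == "__main__":
    base = datetime(2008, 7, 1)
    burst = [base + timedelta(minutes=30 * i) for i in range(25)]
    print(assess_dimms({"D0": burst, "D1": []}))
```

The sketch covers only the CE rule; uncorrectable errors are handled by the first two bullets of the guideline and are not modeled here.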
13. disk only if that hard disk is completely free of any partitions. You need to delete any existing partitions from a hard disk if you plan to use the hard disk to create a diagnostic partition on it.

Caution: Removing all hard disk partitions destroys all data on the disk.

There are two ways to remove existing partitions from the hard disk:
- Use the Erase Primary Boot Hard Disk utility (Option 3 on the Tools and Drivers CD main menu).
- Use the following procedure:

1. Insert the Tools and Drivers CD into the DVD tray.
2. Reboot the server.
3. From the Tools and Drivers CD main menu, type 3 to exit to DOS.
4. Type fdisk at the command prompt, and press the Enter key.
5. Type 4 to select an alternate fixed disk. The second hard disk as seen from fdisk is the first bootable disk of the system. The first hard disk as seen from fdisk is the bootable Tools and Drivers CD.

   Caution: When performing the following tests, be careful not to delete any operating system partitions that you want to keep. Removing hard disk partitions destroys all data on the disk.

6. Type 2 to delete the DOS partition.
7. Type 1 or 2, depending on the type of partition you want to delete.
8. Type the number of the partition you want to delete.
9. Type Y to erase the data and the partition.
10. Repeat Step 6 through Step 9 until all partitions have been deleted.
11. Press the Esc key to exit, and press any key to reboot the server.
14. or end users, for nuclear weapons, missiles, biological and chemical weapons, or nuclear maritime uses, whether direct or indirect, are strictly prohibited. Exports or reexports to countries under U.S. embargo, or to entities appearing on U.S. export exclusion lists, including but not limited to the list of persons subject to an order not to participate, directly or indirectly, in exports of products or services governed by U.S. export control law, and the list of specially designated nationals, are strictly prohibited. The use of spare parts or replacement CPUs is limited to repair or one-for-one replacement of CPUs in products exported in compliance with U.S. export law. Unless authorized by the U.S. authorities, the use of CPUs to upgrade products is strictly prohibited.

Contents

Preface
Initial Inspection of the Server
Service Troubleshooting Flowchart
Gathering Service Information
System Inspection
Troubleshooting Power Problems
Externally Inspecting the Server
Internally Inspecting the Server
Troubleshooting DIMM Problems
DIMM Replacement Guid
15. partition

   ls diagpart

Accessing the Diagnostic Partition on the Windows Server 2003 Operating System

The Windows Server 2003 operating system (OS) does not allow you to mount a diagnostic partition. There is no way to view or gain access to the diagnostic partition if you are running Windows Server 2003 on the server.

The only way to retrieve the contents (log files) on the diagnostic partition is to attach a USB diskette drive to the server and complete the following procedure:

1. Connect the USB diskette drive to any USB port on the server.
2. Insert the Tools and Drivers CD into the DVD tray.
3. Reboot the server.
4. At the Tools and Drivers CD main menu, type 3 to exit to DOS.
5. Type the following at the DOS command prompt:

   C:\> d:

6. Copy the log file to the diskette. For example, to copy a file named noinput.jrl to the diskette, type:

   D:\> copy d:\noinput.jrl a:\

   The journal file is now saved to the diskette in the USB diskette drive.

Show Results Summary

The summary lists the tests that were run and shows the results. Pass, Fail, or N/A is listed for each option.

The following is a complete listing of all options that are available with the Tools and Drivers CD. If your own system does not have all of these options, they might not be listed when the Show Results Summary is displayed.
- Processor: This section shows the following tests conducted against the p
16. table is updated with the specified events. The fields in the Event Log are described in TABLE C-1.

TABLE C-1 Event Log Fields

Field | Description
Event ID | The number of the event, in sequence from number 1.
Time Stamp | The day and time the event occurred. If the Network Time Protocol (NTP) server is enabled to set the SP time, the SP clock will use Coordinated Universal Time (UTC). For more information about time stamps, see "Interpreting Event Log Time Stamps" on page 51.
Sensor Name | The name of a component for which an event was recorded. The sensor name abbreviations correspond to these components: sys (System or chassis), p0 (Processor 0), p1 (Processor 1), io (I/O board), ps (Power supply), fp (Front panel), ft (Fan tray), mb (Motherboard).
Sensor Type | The type of sensor for the specified event.
Description | A description of the event.

4. To clear the event log, click the Clear Event Log button. A confirmation dialog box appears.

5. Click OK to clear all entries in the log.

6. If the problem with the server is not evident after viewing ILOM SP logs and information, continue with "Running SunVTS Diagnostic Tests" on page 9.

Interpreting Event Log Time Stamps

The system event log time stamps are related to the service processor clock settings. If the clock settings change, the change is reflected in the time stamps.

When the service processor reboots, the SP clock is set to Thu Jan 1 00:00:00 UTC 1970. The
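The sensor name abbreviations in TABLE C-1 can be applied mechanically when reading a long event log. The sketch below expands a slash-separated sensor name into component descriptions. It is only an illustration: the slash-separated name form (such as /SYS/MB/P0/PRSNT), the handling of a trailing unit number (ps1, ft0), and the function name `expand_sensor_name` are assumptions; only the abbreviation table itself comes from the guide.

```python
import re

# Abbreviations from TABLE C-1.
COMPONENTS = {
    "sys": "System or chassis",
    "p0": "Processor 0",
    "p1": "Processor 1",
    "io": "I/O board",
    "ps": "Power supply",
    "fp": "Front panel",
    "ft": "Fan tray",
    "mb": "Motherboard",
}


def expand_sensor_name(name):
    """Expand known component abbreviations in a sensor name.

    Segments that are not in the table are returned unchanged.
    """
    parts = []
    for segment in name.split("/"):
        if not segment:
            continue
        key = segment.lower()
        if key in COMPONENTS:
            parts.append(COMPONENTS[key])
            continue
        # Assumed convention: an abbreviation followed by a unit number.
        match = re.fullmatch(r"([a-z]+)(\d+)", key)
        if match and match.group(1) in COMPONENTS:
            parts.append(f"{COMPONENTS[match.group(1)]} {match.group(2)}")
        else:
            parts.append(segment)
    return parts


if __name__ == "__main__":
    print(" > ".join(expand_sensor_name("/SYS/MB/P0/PRSNT")))
```

Unknown segments are passed through on purpose, so a name that uses an abbreviation outside the table stays readable instead of being dropped.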
17. test script.

Create Diagnostic Partition Option

The diagnostic partition is preinstalled on the server. You need to reinstall the diagnostic partition only if you have reformatted your hard drive. Using the Erase Primary Boot Hard Disk utility on the Tools and Drivers CD preserves the diagnostic partition.

The Create Diagnostic Partition option installs a diagnostic partition on the first bootable disk seen by the server. The first bootable disk is on the primary HDD device.

Note: If you are running the Pc-Check Diagnostics software from a PXE server, you do not need to follow the instructions in Appendix D.

The following sections explain how to create and access the diagnostic partition on the server:
- Removing Existing Partitions From a Hard Disk
- Adding a Diagnostic Partition to the First Bootable Disk
- Creating a Log File on the Diagnostic Partition
- Accessing the Diagnostic Partition on a Red Hat Linux System
- Accessing the Diagnostic Partition on the Solaris 10 Operating System
- Accessing the Diagnostic Partition on the Windows Server 2003 Operating System

Removing Existing Partitions From a Hard Disk

The Create Diagnostic Partition option creates a diagnostic partition on a hard
18. TABLE C-2 Sensor Readings Fields

Field | Description
Status | Reports the status of the sensor, including State Asserted, State Deasserted, Predictive Failure, Device Inserted, Device Present, Device Removed, Device Absent, Unknown, and Normal.
Name | Reports the name of the sensor. The names correspond to the following components: sys (System or chassis), bp (Back panel), fp (Front panel), mb (Motherboard), io (I/O board), p0 (Processor 0), p1 (Processor 1), ft0 (Fan tray 0), ft1 (Fan tray 1), pdb (Power distribution board), ps0 (Power supply 0), ps1 (Power supply 1).
Reading | Reports the rpm, temperature, and voltage measurements.

3. Click the Refresh button to update the sensor readings to their current status.

4. Click a sensor to display its thresholds. A display of properties and values appears. See the example in FIGURE C-4.

FIGURE C-4 Sensor Details Page

[Screenshot: Sun Integrated Lights Out Manager in Mozilla Firefox, showing all of the properties and values for the sensor /SYS/MB/P0/PRSNT.]
19. 50, X4250, or X4450 servers document collection using the Product category link. Available translations for the Sun Fire X4150, X4250, and X4450 Servers include Simplified Chinese, Traditional Chinese, French, Japanese, and Korean.

English documentation is revised more frequently and might be more up-to-date than the translated documentation. For all Sun documentation, go to the following URL: http://docs.sun.com

Typographic Conventions

Typeface | Meaning | Examples
AaBbCc123 | The names of commands, files, and directories; onscreen computer output | Edit your .login file. Use ls -a to list all files. You have mail.
AaBbCc123 | What you type, when contrasted with onscreen computer output | su  Password:
AaBbCc123 | Book titles, new words or terms, words to be emphasized. Replace command-line variables with real names or values. | Read Chapter 6 in the User's Guide. These are called class options. You must be superuser to do this. To delete a file, type rm filename.

Note: The settings on your browser might differ from these settings.

Third-Party Web Sites

Sun is not responsible for the availability of third-party web sites mentioned in this document. Sun does not endorse and is not responsible or liable for any content, advertising, products, or other materials that are available on or through such sites or resources. Sun will not be responsible or liable for any actual or alleged damage or loss
20. 50 and X4450 servers.

Not relevant to the Sun Fire X4150, X4250, and X4450 servers.

Details information about the video card. Initially, the monitor might flicker, but then it brings up a Video Test Options menu that enables you to perform various video tests.

Not relevant to the Sun Fire X4150, X4250, and X4450 servers.

Details information about the Advanced Configuration and Power Interface (ACPI) and includes an ACPI Tests menu to test ACPI.

Testing the Hard Disk

1. From the main menu, choose Advanced Diagnostics Tests.
2. From the Advanced Diagnostics menu, choose Hard Disks.
3. From the Select Drive menu, choose the hard disk you are testing. The Hard Disk Diagnostics window opens, showing both the information for the hard disk you have selected and the Hard Disk Tests menu.

The Hard Disk Tests menu displays the following options:
- Select Drive
- Test Settings
- Read Test
- Read Verify Test
- Non-Destructive Write Test
- Destructive Write Test
- Mechanics Stress Test
- Internal Cache Test
- View Error Log
- Utilities Menu
- Exit

The Media Test options include the Read Test, the Read Verify Test, the Non-Destructive Write Test, and the Destructive Write Test. These tests are relevant to testing the media associated with the hard drive hardware, such as the physical disk.

Caution: Running the Destructive Write Test destroys any data that is on the disk.
21. B/Sec, CD/DVD Transfer Rating, CD/DVD Drive Seek Test, CD/DVD Seek Time (ms), CD/DVD Test Disk Read, and CD/DVD Tray Test.
- ATAPI Devices: This section shows the following tests conducted against ATAPI devices: Linear Read Test, Non-Destructive Write, and Random Read/Write Test.
- Hard Disk: This section shows the following tests conducted against the hard disk: Read Test, Read Verify Test, Non-Destructive Write Test, Destructive Write Test, Mechanics Stress Test, and Internal Cache Test.
- USB: This section shows the following tests conducted against the USB: Controller Tests and Functional Tests.
- Hardware ID: The compare test is used to determine the machine ID for the system. This test is not available for the Sun Fire X4150, X4250, and X4450 servers.

Print Results Report

The Print Results Report option enables you to print system diagnostic results. Ensure that your server is connected to a printer, and then enter the required information to print the results.

About Pc-Check

The About Pc-Check window includes general information about Pc-Check software, including resident and nonresident components, such as mouse devices.

Exit to DOS

Use the Exit to DOS option to exit Pc-Check and return to the DOS prompt.

CHAPTER 4
Using SunVTS Diagnostic Software

This chapter contain
22. Connect the Sun Fire server to the same network as the PXE server.

2. Power on or reboot the Sun Fire server.
3. Press the F12 key during POST.
4. The Boot Message Screen (located on your PXE server at /tftpboot/linux-install/msgs/boot.msg) displays on the screen.
5. Type supp1_tau at the prompt and press Return. The MEMDISK kernel and the bootable portion of the Tools and Drivers CD are downloaded to the test machine over the network and into memory. Once downloaded, the bootable portion of the Tools and Drivers CD will be booted.
6. The main menu of the bootable portion of the Tools and Drivers CD is displayed on the target Sun Fire server.
7. You can now run the hardware diagnostics or update the System BIOS. See Chapter 3 in this document for information on running Pc-Check diagnostics software.

Index

B
BIOS event logs, 37
Bootable Diagnostics CD, 34

C
comments and suggestions, ix
component inventory, viewing with ILOM SP GUI, 52

D
diagnostic partition
  accessing
    Red Hat Linux, 27
    Solaris 10, 28
    Windows XP, 29
  adding, 26
  log file, 26
  removing, 24
diagnostic software
  Bootable Diagnostics CD, 34
  SunVTS, 33
diagnostics
  advanced diagnostics option, 17
  deferred burn-in testing option, 23
  hard disk testing, 19
  immediate burn-in testing option, 20
  main menu options, 14
  PC-CHECK information, 32
  print results reports option, 31
  running from P
23. [Screenshot: Sun Integrated Lights Out Manager, System Monitoring tab, Event Logs page. The Event Log displays every event in the SP, including IPMI, Audit, and FMA events, with columns for Event ID, Class, Severity, Date/Time, and Description. Click the Clear Log button to delete all current log entries.]

3. Select the category of event that you want to view in the log from the drop-down list box. You can select from the following types of events:
- Sensor-specific events. These events relate to a specific sensor for a component, for example, a fan sensor or a power supply sensor.
- BIOS-generated events. These events relate to error messages generated in the BIOS.
- System management software events. These events relate to events that occur within the ILOM software.

After you have selected a category of event, the Event Log
24. FIGURE B-4 and FIGURE B-5 show and describe the back panel LEDs. FIGURE B-6 shows and describes the hard drive LEDs. FIGURE B-7 and FIGURE B-8 show the locations of the internal DIMM LEDs. FIGURE B-9 and FIGURE B-10 show the locations of the internal CPU LEDs.

Front Panel LEDs

FIGURE B-1 Sun Fire X4150 Server Front Panel LEDs

Figure Legend
- Locator LED/Locator button: White
- Service Required LED: Amber
- Power/OK LED: Green
- Power button
- Hard drive map
- Rear PS LED: Amber, Power supply fault
- System Over Temperature LED: Amber
- Top Fan LED: Amber, Service action required on fans

FIGURE B-2 Sun Fire X4250 Server Front Panel LEDs

FIGURE B-3 Sun Fire X4450 Server Front Panel LEDs

Figure Legend (applies to both X4250 and X4450)
- Locator LED/Locator button: White
- Service Required LED: Amber
- Rear PS LED: Amber, Power supply fault
- Power/OK LED: Green
- Top Fan LED: Amber, Service action required on fans
- System Over Temperature LED: Amber
- Power button
- Hard drive map

Back Panel LEDs

FIGURE B-4 Sun Fire X4150 Server Back Panel LEDs

FIG
25. 3. Select a component from the drop-down list. Information about the selected component appears.

4. If the problem with the server is not evident after viewing replaceable component information, continue with "Using Pc-Check Diagnostics Software" on page 13 or "Using SunVTS Diagnostic Software" on page 33.

Viewing Sensors

This section describes how to view the server temperature, voltage, and fan sensor readings. For a complete list of sensors, see the Integrated Lights Out Manager Supplement for the server.

To view sensor readings:

1. Log in to the SP as administrator or operator to reach the ILOM web interface:
a. Type the IP address of the server's SP into your web browser. The Sun Integrated Lights Out Manager Login screen appears.
b. Type your user name and password. When you first try to access the ILOM Service Processor, you are prompted to type the default user name and password:
Default user name: root
Default password: changeme

2. From the System Monitoring tab, select Sensor Readings. The Sensor Readings page appears. See FIGURE C-3.

FIGURE C-3 Sensor Readings Page
26. Sun Microsystems. Sun Fire X4150, X4250, and X4450 Servers Diagnostics Guide. Sun Microsystems, Inc. www.sun.com. Part No. 820-4213-10, July 2008, Revision A. Submit comments about this document at: http://www.sun.com/hwdocs/feedback

Copyright 2008 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, U.S.A. All rights reserved. Unpublished, rights reserved under the Copyright Laws of the United States. THIS PRODUCT CONTAINS CONFIDENTIAL INFORMATION AND TRADE SECRETS OF SUN MICROSYSTEMS, INC. USE, DISCLOSURE OR REPRODUCTION IS PROHIBITED WITHOUT THE PRIOR EXPRESS WRITTEN PERMISSION OF SUN MICROSYSTEMS, INC. This distribution may include materials developed by third parties. Sun, Sun Microsystems, the Sun logo, Java, Solaris, Sun Fire X4150, Sun Fire X4250, and Sun Fire X4450 are trademarks or registered trademarks of Sun Microsystems, Inc. in the U.S. and other countries. AMD Opteron and Opteron are trademarks of Advanced Micro Devices, Inc. Intel is a registered trademark of Intel Corporation. This product is covered and controlled by U.S. Export Control laws and may be subject to the export or import laws in other countries. Nuclear, missile, chemical biological weapons or nuclear maritime end uses or end users, whether direct or indirect, are strictly prohibited. Export or reexport to countries subject to U.S. embargo or to entities identified on U.S. export exclusion lists, including, but not l
27. Refer to this section (continued from TABLE 1-1):
- "Gathering Service Information" on page 2
- "Troubleshooting Power Problems" on page 3
- "Externally Inspecting the Server" on page 3
- "Internally Inspecting the Server" on page 4
- "Troubleshooting DIMM Problems" on page 7
- "Viewing Event Logs" on page 37
- "Using the ILOM Service Processor Web Interface to View System Information" on page 47
- "Using Pc-Check Diagnostics Software" on page 13
- "Using SunVTS Diagnostic Software" on page 33

Gathering Service Information

The first step in determining the cause of a problem with the server is to gather information from the service call paperwork or the onsite personnel. Use the following general guideline steps when you begin troubleshooting.

To gather service information:

1. Collect information about the following items:
- Events that occurred prior to the failure
- Whether any hardware or software was modified or installed
- Whether the server was recently installed or moved
- How long the server exhibited symptoms
- The duration or frequency of the problem

2. Document the server settings before you make any changes. If possible, make one change at a time in order to isolate potential problems. In this way, you can maintain a controlled environment and reduce the scope of troubleshooting.

3. Take note of the results of any change that you make. Include any errors or informational messages.

4. Check for potential device conflicts before you add a new device.

5. Check for
28. Three scripts have already been created for testing your system:

- quick.tst: This script performs a high-level test of all hardware components, including those components that require user input, as well as a more in-depth memory test. The user must interact with the Pc-Check software to progress through these interactive tests. The tests cannot be run unattended and do not contain timeout facilities. The interactive tests wait until the user provides the correct input.

- noinput.tst: This script is used as a first triage of any hardware-related problems or issues. The script performs a high-level test of most hardware components, excluding those components that require user input (keyboard, mouse, sound, video). This test does not require user input.

- full.tst: This script performs the most detailed and comprehensive test on all hardware components, including those components that require user input. This script contains a more in-depth memory test than quick.tst, as well as external port tests, which might require loopback connectors. The user must interact with the test utility to progress through these interactive tests.

Tip: Each of these scripts tests the operating status of your entire system. If you want to test only a certain percentage of your system's hard drives, refer to "Testing the Hard Disk" on page 19 to change the test options.
29. URE B-5 Sun Fire X4250 and X4450 Server Back Panel LEDs

Figure Legend (applies to all servers):

1. Power Supply LEDs
- Power Supply OK: Green
- Power Supply Fail: Amber
- AC OK: Green

2. System LEDs
- Locator LED/Button: White
- Service Required LED: Amber
- Power OK LED: Green

3. Ethernet Port LEDs
- Left side: Green indicates link activity
- Right side, Green: Link is operating at maximum speed
- Right side, Amber: Link is operating at less than maximum speed

Hard Drive LEDs

FIGURE B-6 Hard Drive LEDs

Figure Legend:
1. Ready to remove LED: Blue. Service action is allowed.
2. Fault LED: Amber. Service action is required.
3. Status LED: Green. Blinks when data is being transferred.

Internal Status Indicator LEDs

The server has internal status indicators on the motherboard.

- The DIMM Fault LEDs indicate a problem with the corresponding DIMM. See FIGURE B-7 and FIGURE B-8 for the LED locations. When you press the Remind button, if there is a problem with a DIMM, the corresponding DIMM Fault LED flashes. See "DIMM Fault LEDs" on page 9 for details.

- The CPU Fault LEDs indicate a problem with the corresponding CPU. See FIGURE B-9 and FIGURE B-10 for the CPU LED locations. When you press the Remind button, if there is a problem with a CPU, the corresponding CPU Fault LED flashes.

Note
30. 7. Change to the memdisk directory. For example:
    cd /syslinux-3.63/memdisk

8. Copy the memdisk kernel to the new Tools and Drivers directory created in Step 3. For example:
    cp /syslinux-3.63/memdisk/memdisk /tftpboot/linux-install/suppl_tau

9. Edit the boot message screen as follows:
a. Open the boot.msg file in a text editor:
    vi /tftpboot/linux-install/msgs/boot.msg
b. Type the following line after "0 - Local Machine":
    suppl_tau - Sun Fire xxx Server Tools and Drivers CD
   where xxx is the server number (for example, X4250).
c. Save and close the boot.msg file.

10. Edit the default PXE configuration file as follows:
a. Open the default file in a text editor:
    vi /tftpboot/linux-install/pxelinux.cfg/default
b. Type the following lines after the label 0 section:
    label suppl_tau
    kernel suppl_tau/memdisk
    append initrd=suppl_tau/boot.img
c. Save and close the default file.

11. Test the installation on the test machine.

Accessing the Tools and Drivers CD From the Target Server

You will need the following to run diagnostics on a target Sun Fire server:
- PXE server configured as shown in "Setting up the Tools and Drivers CD Image on the PXE Server" on page 59
- Sun Fire server set up on the same network as the PXE server

1
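Steps 9 and 10 above add one line to boot.msg and one three-line stanza to the PXELINUX default file. The following is a minimal Python sketch of generating that text so it can be reviewed before it is pasted in. The function names are hypothetical, and the `/` and `=` separators follow ordinary PXELINUX syntax, which is an assumption here because the punctuation is not preserved in this copy of the guide.

```python
def pxe_stanza(label="suppl_tau", subdir="suppl_tau"):
    """Build the label stanza for pxelinux.cfg/default.

    Paths are relative to the PXE install directory; separators are assumed.
    """
    lines = [
        f"label {label}",
        f"kernel {subdir}/memdisk",
        f"append initrd={subdir}/boot.img",
    ]
    return "\n".join(lines) + "\n"


def boot_msg_line(server_model, label="suppl_tau"):
    """Build the menu line that is typed into boot.msg for the given server."""
    return f"{label} - Sun Fire {server_model} Server Tools and Drivers CD"


if __name__ == "__main__":
    print(boot_msg_line("X4250"))
    print(pxe_stanza(), end="")
```

Keeping the label in boot.msg identical to the label in the default file matters, because the label is what the operator types at the PXE boot prompt.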
31. XE server 59 show results summary option 30 shut down option 32 system information menu options 16 DIMMs error handling 8 fault LEDs 9 isolating errors 11 E emergency shutdown 4 error handling DIMMs 8 event logs BIOS 37 external inspection 3 external LEDs 39 F faults DIMM 9 FRU inventory viewing with ILOM SP GUI 52 G gathering service visit information 2 general troubleshooting guidelines 2 graceful shutdown 4 guidelines for troubleshooting 2 l ILOM SP GUI general information 47 serial connection 48 time stamps 51 Index 1 viewing component inventory 52 SunVTS viewing sensors 54 Bootable Diagnostics CD 34 viewing SP event log 49 documentation 34 inspection logs 35 external 3 overview 33 internal 4 Integrated Lights Out Manager Service Processor T See ILOM SP GUI third party Web sites viii internal inspection 4 time stamps in ILOM SP SEL 51 isolating DIMM ECC errors 11 Tools and Drivers CD accessing from PXE server 62 L setting up on PXE server 59 LEDs troubleshooting external 39 guidelines 2 LEDs ports and slots illustrated 40 41 42 43 typographic conventions viii locations of ports slots and LEDs illustration 40 41 42 43 P ports slots and LEDs illustrated 40 41 42 43 Power button Power button 4 5 location 4 5 power off procedure 4 power problems troubleshooting 3 PXE server accessing Tools and Driver
32. caused by or in connection with the use of or reliance on any such content, goods, or services that are available on or through such sites or resources.

Sun Welcomes Your Comments

Sun is interested in improving its documentation and welcomes your comments and suggestions. You can submit your comments by going to: http://www.sun.com/hwdocs/feedback

Please include the title and part number of your document with your feedback: Sun Fire X4150, X4250, and X4450 Servers Diagnostics Guide, part number 820-4213-10.

CHAPTER 1 Initial Inspection of the Server

This chapter includes the following topics:
- "Service Troubleshooting Flowchart" on page 1
- "Gathering Service Information" on page 2
- "System Inspection" on page 3

Service Troubleshooting Flowchart

Use the following flowchart as a guideline for using the information in this book to troubleshoot the server.

TABLE 1-1 Troubleshooting Flowchart (columns: To perform this task / Refer to this section)

To perform this task:
- Gather initial service information.
- Investigate any power-on problems.
- Perform external visual inspection and internal visual inspection.
- Troubleshoot DIMM problems.
- View BIOS event logs.
- View service processor logs and sensor information.
- Run diagnostics software.

Refer to this section: Gathering
33. d by the System

This section describes the following topics:
- "Uncorrectable DIMM Errors" on page 8
- "Correctable DIMM Errors" on page 8
- "DIMM Fault LEDs" on page 9

Uncorrectable DIMM Errors

For all operating systems, the behavior is the same for uncorrectable errors (UCEs):

1. When a UCE occurs, the memory controller causes an immediate reboot of the system.
2. During reboot, the BIOS checks the Machine Check registers and determines that the previous reboot was due to a UCE.

The uncorrectable ECC error is displayed in the service processor's system event log (SEL) as shown here:

    Memory Uncorrectable ECC Asserted DIMM A0

Correctable DIMM Errors

If a DIMM has 24 or more correctable errors (CEs) in 24 hours, it is considered defective and should be replaced. CEs are captured in the SEL, and the fault LED lights after 24 single-bit errors are detected in 24 hours. CEs are reported or handled in the supported operating systems as follows:

- Windows server:
a. A Machine Check error message bubble appears on the task bar.
b. Open the Event Viewer to view errors. Access the Event Viewer through this menu path: Start > Administration Tools > Event Viewer
c. View individual errors by time to see the details of the error.

- Solaris: Solaris FMA reports and sometimes retires memory with correctable Error Correction
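The replacement rule above (24 or more correctable errors on one DIMM within 24 hours) is a sliding-window count. The following Python sketch illustrates that arithmetic only; it is not how the BIOS or service processor implements the check. The class name and DIMM labels are hypothetical, and timestamps are assumed to arrive in chronological order.

```python
from collections import deque
from datetime import datetime, timedelta

CE_LIMIT = 24                   # correctable errors that mark a DIMM as defective
WINDOW = timedelta(hours=24)    # period over which errors are counted


class CorrectableErrorTracker:
    """Count correctable errors per DIMM over a sliding 24-hour window."""

    def __init__(self):
        self._events = {}  # DIMM label -> deque of error timestamps

    def record(self, dimm, when):
        """Record one CE; return True if the DIMM has reached the limit."""
        queue = self._events.setdefault(dimm, deque())
        queue.append(when)
        cutoff = when - WINDOW
        # Drop errors that are 24 hours old or older relative to this one.
        while queue and queue[0] <= cutoff:
            queue.popleft()
        return len(queue) >= CE_LIMIT


if __name__ == "__main__":
    tracker = CorrectableErrorTracker()
    start = datetime(2008, 7, 1)
    for minute in range(24):
        defective = tracker.record("A0", start + timedelta(minutes=minute))
    print("replace DIMM A0:", defective)
```

A sliding window is used rather than a fixed calendar day so that a burst of errors spanning midnight is still counted together.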
34. detect and test all motherboard components, ports, and slots. If you encounter any hardware-related error message (such as memory errors or hard disk errors) on your server, run one of the following:
- Advanced Diagnostics Test: A specific hardware component test
- Immediate Burn-in Test: A server diagnostic test script

The following procedure describes how to access these test options from the Tools and Drivers CD.

Accessing the Pc-Check Diagnostics Software

Do one of the following, depending on which method you are using to access the Pc-Check diagnostics software:

- If your server has a DVD drive installed:
a. Insert the Tools and Drivers CD into your DVD drive and reboot the system.
b. Type 1 to run the hardware diagnostics software.

- If you want to run the Pc-Check software from a PXE server, see Appendix D for instructions.

- If you are running the Pc-Check software through the ILOM web interface, do the following:
a. Select Remote Control > Diagnostic tab.
b. Select one of the following:
   - Enabled: Launches the quick diagnostics tests, which run for 3 minutes.
   - Extended: Launches the 30-minute diagnostic test.
   - Manual: Launches the full version of diagnostics, which can run all tests. This option gives you the same results as booting diagnostics from the Tools and Drivers CD.
c. Click Save.
d. Reboot the server. Refer to the I
35. elines 7 How DIMM Errors Are Handled by the System 8 Uncorrectable DIMM Errors 8 Correctable DIMM Errors 8 DIMM Fault LEDs 9 Isolating and Correcting DIMM ECC Errors 11 Using Pc Check Diagnostics Software 13 Pc Check Diagnostics Overview 14 Accessing the Pc Check Diagnostics Software 14 System Information Menu Options 16 Advanced Diagnostics Tests 17 Advanced Diagnostics Tests Menu Options 17 Testing the Hard Disk 19 Burn in Tests 20 Immediate Burn in Testing 20 Full System Tests 20 Component Tests 22 Running the Immediate Burn in Tests 22 Deferred Burn in Testing 23 Create Diagnostic Partition Option 24 Removing Existing Partitions From a Hard Disk 25 Adding a Diagnostic Partition to the First Bootable Disk 26 Creating a Log File on the Diagnostic Partition 26 Accessing the Diagnostic Partition on a Red Hat Linux System 27 Accessing the Diagnostic Partition on the Solaris 10 Operating System 28 Accessing the Diagnostic Partition on the Windows Server 2003 Operating System 29 Show Results Summary 30 Print Results Report 31 About Pc Check 32 Exit to DOS 32 4 Using SunVTS Diagnostic Software 33 Running SunVTS Diagnostic Tests 33 SunVTS Documentation 34 Diagnosing Server Problems With the Bootable Diagnostics CD 34 Requirements 35 Using the Bootable Diagnostics CD 35 iv Sun Fire X4150 X4250 and X4450 Servers Diagnostics Guide July 2008 Viewing Event Logs 37 Status Indicator LEDs 39 External Status Indicator LEDs
36. enterprise/RHEL-4-Manual/sysadmin-guide/
- Tools and Drivers CD
- MEMDISK kernel from the SYSLINUX project. Access this kernel at: http://www.kernel.org/pub/linux/utils/boot/syslinux/

To set up the PXE server:

1. Log in to the PXE server as root (superuser).

2. Determine the directory where the Red Hat image is installed on the PXE server. The default directory for the PXE image is usually /tftpboot/linux-install. The remainder of this procedure assumes that the PXE files have been installed in this directory.

Note: If your PXE files are not installed in the /tftpboot/linux-install directory, modify the procedure as necessary.

3. Make a directory for the Tools and Drivers CD contents:
    mkdir /tftpboot/linux-install/suppl_tau

4. Insert the Tools and Drivers CD into the PXE server and copy the boot.img file, located in the root directory of the CD, to the new server supplemental directory created in Step 3:
    cp /mnt/cdrom/boot.img /tftpboot/linux-install/suppl_tau

5. Download the MEMDISK kernel:
a. Go to the latest SYSLINUX project web site at: http://www.kernel.org/pub/linux/utils/boot/syslinux/
b. Save the latest syslinux-version.zip file to your root directory, where version is the latest SYSLINUX project version.

Note: Version 3.63 was the latest version at the time of this writing.

6. Unzip the .zip file. For example:
    unzip syslinux-3.63.zip
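Steps 3 and 4 above (create the suppl_tau directory, then copy boot.img into it) can be scripted so that a missing image is reported before anything else is changed. The following is a minimal Python sketch under stated assumptions: the function name is hypothetical, and the mount point and install directory are passed in rather than hard-coded because they can differ on your PXE server.

```python
import shutil
from pathlib import Path


def stage_boot_image(cd_mount, pxe_root, subdir="suppl_tau"):
    """Copy boot.img from the mounted CD into <pxe_root>/<subdir>.

    cd_mount -- where the Tools and Drivers CD is mounted (for example /mnt/cdrom)
    pxe_root -- the PXE install directory (for example /tftpboot/linux-install)
    Returns the path of the copied file.
    """
    source = Path(cd_mount) / "boot.img"
    if not source.is_file():
        raise FileNotFoundError(f"boot.img not found under {cd_mount}")
    target_dir = Path(pxe_root) / subdir
    target_dir.mkdir(parents=True, exist_ok=True)  # same effect as mkdir in Step 3
    return Path(shutil.copy2(source, target_dir))  # same effect as cp in Step 4


if __name__ == "__main__":
    copied = stage_boot_image("/mnt/cdrom", "/tftpboot/linux-install")
    print("staged", copied)
```

Checking for boot.img before creating the directory avoids leaving an empty suppl_tau directory behind when the wrong disc is mounted.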
37. enu, which enables you to modify the various options listed in TABLE 3-3 for the currently loaded test script.
- Select Tests: Opens a listing of the tests available for your server configuration and the currently loaded test script.
- Perform Burn-in Tests: Runs the currently loaded burn-in test script.

Tip: The memory tests in Pc-Check detect single-bit ECC failures and report them down to an individual DIMM.

Deferred Burn-in Testing

You can use the Deferred Burn-in Testing option to create and save your own scripts to run at a later time.

From the main menu, choose Deferred Burn-in Testing. The top portion of the window lists the options described in TABLE 3-3, and the bottom portion of the window lists the following Burn-in menu options:

- Load Burn-in Script: To use a pre-written test, enter one of the following: quick.tst, noinput.tst, or full.tst. To use a script that you have created and saved, enter d:\testname.tst, where testname is the name of the script that you have created.
- Save Burn-in Script: To save a burn-in script that you have created, enter d:\testname.tst, where testname is the name of the script that you have created.
- Change Options: Opens the Burn-in Options menu, which enables you to modify the various options listed in TABLE 3-3 for the currently loaded test script.
- Select Tests: Opens a listing of all the possible types of tests available for you to run for the currently loaded
38. ewing Replaceable Component Information

Depending on the component you select, information about the manufacturer, component name, serial number, and part number can be displayed.

To view replaceable component information:

1. Log in to the SP as administrator or operator to reach the ILOM web interface:
a. Type the IP address of the server's SP into your web browser. The Sun Integrated Lights Out Manager login screen appears.
b. Type your user name and password. When you first try to access the ILOM SP, you are prompted to type the default user name and password:
Default user name: root
Default password: changeme

2. From the System Information tab, select Components. The Replaceable Component Information page appears. See FIGURE C-2.

FIGURE C-2 Replaceable Component Information Page (screenshot: System Information tab, Components). The page reads: "View component information from this page. To view further details, click on a Component Name." The Component Management Status table lists Component Name and Type:

Component Name   Type
/SYS             Host System
/SYS/MB          Motherboard
/SYS/MB/P0       Host Processor
/SYS/MB/P0/D0    DIMM
/SYS/MB/P0/D1    DIMM
/SYS/MB/P0/D2    DIMM
/SYS/MB/P0/D3    DIM
39. TABLE 3-1 System Information Menu Options (Option: Description)

- System Overview: Includes basic information about your system, motherboard, BIOS, processor, memory cache, drives, video, modem, network, buses, and ports.
- Hardware ID Image Menu: Enables you to create a document showing information about your system, including comparisons between the updates and the newest versions of your system. XML is the format used to create and display this information, though you can also choose a text (.txt) format.
- System Management Information: Provides information obtained from the system about the BIOS type, system, motherboard, enclosure, processors, memory modules, cache, slots, system event log, memory array, memory devices, memory device mapped addresses, and system boot.
- PCI Bus Information: Includes details about specific devices from pci-config space within the system, similar to the System Management Information section.
- IDE Bus Information: Shows the master/slave devices on the primary and secondary IDE controllers.
- PCMCIA/CardBus Info: Not relevant to the Sun Fire X4150, X4250, and X4450 servers.
- Interrupt Vectors: Lists and details device interrupt vector information.
- IRQ Information: Shows hardware interrupt assignments.
- Device Drivers: Shows device drivers loaded under Open DOS.
- APM Information: Tests the Advanced Power Management (APM) capabilities of the system. You can choose to change the power state, view the power status, indicate CPU usage, get a PM event, or change the interface mode.
- I/O Port Browser: Shows the I/O port assignment for the hardware devices on the system.
- Memory Browser: Enables you to view the mapped memory for the entire system.
- Sector Browser: Reads sector information from the hard disks and DVD disks sector by sector.
40. hod for shutting down the server from main power mode to standby power mode. See FIGURE 1-1, FIGURE 1-2, and FIGURE 1-3 for the location of the Power button for each platform.

- Graceful shutdown: Use a ballpoint pen or other nonconductive stylus to press and release the Power button on the front panel. This causes Advanced Configuration and Power Interface (ACPI) enabled operating systems to perform an orderly shutdown of the operating system. Servers not running ACPI-enabled operating systems shut down to standby power mode immediately.

- Emergency shutdown: Use a ballpoint pen or other nonconductive stylus to press and hold the Power button for four seconds to force main power off and enter standby power mode.

Caution: Performing an emergency shutdown can cause open files to become corrupt. Use an emergency shutdown only when necessary.

When main power is off, the Power/OK LED on the front panel begins flashing, indicating that the server is in standby power mode.

Caution: When you use the Power button to enter standby power mode, power is still directed to the service processor and power supply fans, as indicated by the flashing Power/OK LED. To completely power off the server, you must disconnect the AC power cords from the back panel of the server.

FIGURE 1-1 X4150 Server Front Panel Power Button

FIGURE 1-2 X4250 Server Front Panel
41. imited to, the denied persons and specially designated nationals lists is strictly prohibited. Use of any spare or replacement CPUs is limited to repair or one-for-one replacement of CPUs in products exported in compliance with U.S. export laws. Use of CPUs as product upgrades, unless authorized by the U.S. Government, is strictly prohibited.

Copyright 2008 Sun Microsystems, Inc., 4150 Network Circle, Santa Clara, California 95054, United States. All rights reserved. Unpublished, rights reserved under United States copyright law. THIS PRODUCT CONTAINS CONFIDENTIAL INFORMATION AND TRADE SECRETS OF SUN MICROSYSTEMS, INC. ITS USE, DISCLOSURE, AND REPRODUCTION ARE PROHIBITED WITHOUT THE PRIOR EXPRESS WRITTEN AUTHORIZATION OF SUN MICROSYSTEMS, INC. This distribution may include materials developed by third parties. Sun, Sun Microsystems, the Sun logo, Java, Solaris, and Sun Fire X4150, Sun Fire X4250, and Sun Fire X4450 are trademarks or registered trademarks of Sun Microsystems, Inc. in the United States and other countries. AMD Opteron and Opteron are registered trademarks of Advanced Micro Devices, Inc. Intel is a registered trademark of Intel Corporation. This product is subject to U.S. export control legislation and may be subject to the regulations in force in other countries concerning exports and imports. End uses
42. log and the SP system event log.

Use this procedure to view the BIOS event log and the SP system event log:

1. To turn on main power mode (all components powered on), if necessary, use a ballpoint pen or other nonconductive stylus to press and release the Power button on the server front panel. See FIGURE 1-1. When main power is applied to the full server, the Power/OK LED next to the Power button lights and remains lit.

2. Enter the BIOS Setup utility by pressing the F2 key while the system is performing the power-on self-test (POST). The BIOS Main menu screen is displayed.

3. View the BIOS event log:
a. From the BIOS Main Menu screen, select the Server tab.
b. Select View event log.

4. If the problem with the server is not evident, continue with "Using the ILOM Service Processor Web Interface to View System Information" on page 47.

APPENDIX B Status Indicator LEDs

This appendix contains information about the locations and behavior of the LEDs on the server. It describes the external LEDs that can be viewed on the outside of the server and the internal LEDs that can be viewed only with the main cover removed.

External Status Indicator LEDs

See the following figures and tables for information about the LEDs that are viewable on the outside of the server.
- FIGURE B-1, FIGURE B-2, and FIGURE B-3 show and describe the front panel LEDs.
- F
43. mory Test (vmemtest)

SunVTS software has a sophisticated graphical user interface (GUI) that provides test configuration and status monitoring. The user interface can be run on one system to display the SunVTS testing of another system on the network. SunVTS software also provides a TTY-mode interface for situations in which running a GUI is not possible.

SunVTS Documentation

For the most up-to-date information on SunVTS software, go to: http://docs.sun.com/app/docs/prod/test.validate

Make sure to read the most recent product Release Notes before running SunVTS on your server.

Diagnosing Server Problems With the Bootable Diagnostics CD

The Bootable Diagnostics CD is designed so that the server boots from the CD. The CD boots and starts SunVTS software. Diagnostic tests run and write output to log files that the service technician can use to determine the problem with the server.

Requirements

To use the diagnostics CD, you must have a keyboard, mouse, and monitor attached to the server on which you are performing diagnostics, or available through a remote KVM.

Using the Bootable Diagnostics CD

To use the diagnostics CD to perform diagnostics:

1. With the server powered on, insert the CD into the DVD-ROM drive.
2. Reboot the server and press F2 during the start of the reboot so that you can change the BIOS setting for boot device priority.
44. nVTS information log contains informative messages that are generated when you start and stop the SunVTS test sessions. The log file path name is /var/opt/SUNWvts/logs/sunvts.info. This file is not created until a SunVTS test session runs.
- Solaris system message log is a log of all the general Solaris OS events logged by syslogd. The path name of this log file is /var/adm/messages.

a. Click the Log button. The Log file window appears.
b. Specify the log file that you want to view by selecting it from the Log file window. The content of the selected log file appears in the window.
c. With the three lower buttons, you can perform the following actions:
   - Print the log file: A dialog box appears for you to specify your printer options and printer name.
   - Delete the log file: The file remains on the display, but it will not be available the next time you try to display it.
   - Close the Log file window: The window closes.

Note: If you want to save the log files, be aware that when you use the Bootable Diagnostics CD, the server boots from the CD. Therefore, the test log files are not on the server's hard disk drive, and they are deleted when you power cycle the server. To save the log files, you must save them to a removable media device or FTP them to another system.

APPENDIX A Viewing Event Logs

This appendix contains information about the BIOS event
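Because the logs described above disappear when the server is power cycled, they need to be copied somewhere persistent first. The following is a minimal Python sketch of that copy step, assuming a Python interpreter is available in the environment where the logs live (the guide does not say that it is). The function name and destination directory are hypothetical; the two source paths are the ones named above.

```python
import shutil
from pathlib import Path

# Log paths named in this section; other SunVTS logs can be added to the list.
DEFAULT_LOGS = [
    Path("/var/opt/SUNWvts/logs/sunvts.info"),
    Path("/var/adm/messages"),
]


def save_logs(dest_dir, sources=DEFAULT_LOGS):
    """Copy each existing log file into dest_dir and return the copies made.

    Missing files are skipped, since sunvts.info is not created until a
    SunVTS test session has run.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for source in map(Path, sources):
        if source.is_file():
            saved.append(Path(shutil.copy2(source, dest / source.name)))
    return saved


if __name__ == "__main__":
    # /mnt/usb is a placeholder for wherever the removable media is mounted.
    for copied in save_logs("/mnt/usb/sunvts-logs"):
        print("saved", copied)
```

Skipping missing files rather than failing keeps the copy usable when only the Solaris message log exists.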
45. ntegrated Lights Out Manager 2.0 User's Guide for more information on using the ILOM web interface.

The system boots to the Pc-Check main menu. The system information loads, the Diagnostics main menu opens, and the following menu options are displayed:
- System Information Menu
- Advanced Diagnostics Tests
- Immediate Burn-in Testing
- Deferred Burn-in Testing
- Create Diagnostic Partition
- Show Results Summary
- Print Results Report
- About PC-CHECK
- Exit to DOS

To run a specific hardware component test, select Advanced Diagnostics Test. To run one of the test scripts supplied by Sun, select Immediate Burn-in Testing.

Navigate through the menu items by pressing the arrow keys to move to a menu selection. Use the Enter key to select a menu selection and the ESC key to exit a menu. Navigation instructions are shown at the bottom of each screen.

The following sections in this chapter describe the menu items and tests in detail.

System Information Menu Options

TABLE 3-1 describes each option in the System Information menu.

TABLE 3-1 System Information Menu Options (columns: Option / Description)

Options: System Overview, Hardware ID Image Menu, System Management Information, PCI Bus Information, IDE Bus Information, PCMCIA/CardBus Info, Interrupt Vectors, IRQ Information, Device Drivers, APM Information, I/O Port Browser, Memory Browser, Sector Browser

Description: Includes basic in
46. ny failed DIMMs. For UCEs, if the LEDs indicate a fault with the pair, replace both DIMMs. Ensure that they are inserted correctly with ejector latches secured.
- Reconnect AC power cords to the server.
- Power on the server and run the diagnostics test again.
- Review the log file. If the tests identify the same error, the problem is in the CPU, not the DIMMs.

CHAPTER 3 Using Pc-Check Diagnostics Software

This chapter assists you with using the Diagnostics application on the Tools and Drivers CD that is packaged with your system. Diagnostic output is accessible on systems that are running supported Linux or Solaris operating systems. If you are having specific problems with your system, use the Pc-Check Diagnostics software to diagnose and resolve these issues.

The following sections are included in this chapter:
- "Pc-Check Diagnostics Overview" on page 14
- "Advanced Diagnostics Tests" on page 17
- "Burn-in Tests" on page 20
- "Create Diagnostic Partition Option" on page 24
- "Show Results Summary" on page 30
- "Print Results Report" on page 31
- "About Pc-Check" on page 32
- "Exit to DOS" on page 32

Pc-Check Diagnostics Overview

Sun Fire X4150, X4250, and X4450 server diagnostics are contained in the DOS-based Pc-Check utility. This program can be accessed and executed only from the Tools and Drivers CD. Pc-Check was designed to
47. o the diagnostic partition enabled. The names of log files correspond to the name of the script. For example, a script named noinput.tst creates a log file named noinput.jrl.

The following procedure shows an example of how to create and access a log file on the diagnostic partition for the noinput.tst script:

1. Insert the Tools and Drivers CD into the DVD tray.
2. Reboot the server.
3. From the Tools and Drivers CD main menu, choose 1 to run Hardware Diagnostics.
4. From the Hardware Diagnostics main menu, choose Immediate Burn-In Testing.
5. Select Load Burn-in Script.
6. Type noinput.tst and press Enter. If you are using a test you have created yourself, you need to enter d:\testname.tst into the Load Burn-in Script field, where testname is the name of the test you have created.
7. Select Perform Burn-in Tests to run the script.
8. When the tests are complete, press the Esc key to exit the Display Results window.
9. Select Exit to DOS and press Enter.
10. At the DOS prompt, type the following:
    C:> d:
11. Type the following to list the contents of the diagnostic partition:
    D:> dir
    The noinput.jrl log appears.

Accessing the Diagnostic Partition on a Red Hat Linux System

To access the diagnostic partition on a Red Hat Linux system:

1. Remove the Tools and Drivers CD from the DVD tray.
2. Reboot the server and star
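The naming rule above (a script named noinput.tst produces a log named noinput.jrl) is easy to apply by hand, but a small helper is useful when collecting logs from several scripts. The following is a minimal Python sketch; the function name is hypothetical, and it assumes that a custom script entered as d:\testname.tst follows the same rule as the pre-written scripts.

```python
from pathlib import PureWindowsPath


def journal_name(script):
    """Return the log file name that corresponds to a burn-in script name.

    Accepts a bare name (noinput.tst) or a DOS path (d:\\testname.tst).
    """
    path = PureWindowsPath(script)
    if path.suffix.lower() != ".tst":
        raise ValueError(f"not a burn-in script name: {script!r}")
    # Same base name, with the .tst extension replaced by .jrl.
    return path.with_suffix(".jrl").name


if __name__ == "__main__":
    for name in ("quick.tst", "noinput.tst", "full.tst", "d:\\testname.tst"):
        print(name, "->", journal_name(name))
```

`PureWindowsPath` is used so that the DOS-style drive prefix and backslash are parsed the same way on any host operating system.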
48. rocessor: Core Processor Tests, AMD 64-Bit Core Tests, Math Co-Processor Tests (Pentium Class FDIV and Pentium Class FIST), MMX Operation, 3DNow! Operation, SSE Instruction Set, SSE2 Instruction Set, and MP Symmetry.
- Motherboard: This section shows the following tests conducted against the motherboard: DMA Controller Tests, System Timer Tests, Interrupt Test, Keyboard Controller Tests, PCI Bus Tests, and CMOS RAM/Clock Tests.
- Memory, Cache Memory, and Video Memory: This section shows the following tests conducted against the various types of memory: Inversion Test Tree, Progressive Inv. Test, Chaotic Addressing Test, and Block Rotation Test.
- Input Device: This section shows the following tests conducted against the input device: Verify Device, Keyboard Repeat, and Keyboard LEDs.
- Mouse: This section shows the following tests conducted against the mouse: Buttons, Ballistics, Text Mode Positioning, Text Mode Area Redefine, Graphics Mode Positions, Graphics Area Redefine, and Graphics Cursor Redefine.
- Video: This section shows the following tests conducted against the video: Color Purity Test, True Color Test, Alignment Test, LCD Test, and Test Cord Test.
- Multimedia: This section shows the following tests conducted against the multimedia components: Internal Speaker Test, FM Synthesizer Test, PCM Sample Test, CD/DVD Drive Read Test, CD/DVD Transfer K
49. s CD
diagnostics
setting up Tools and Drivers CD

R
related documentation

S
safety guidelines
sensors, viewing with ILOM SP GUI
serial connection to ILOM SP
Service Processor system event log. See SP SEL
service visit information, gathering
shutdown procedure
slots, ports, and LEDs illustrated
SP event log, viewing with ILOM SP GUI
SP SEL time stamps
50. s information about the SunVTS diagnostic software tool. The SunVTS Bootable Diagnostics CD that contains the Sun Validation Test Suite (SunVTS) software might be an orderable option for your server. You can also download SunVTS from http://www.sun.com/oem/products/vts

Note: SunVTS 7.0ps2 is the minimum version supported with the Sun Fire X4150, X4250, and X4450 servers.

Running SunVTS Diagnostic Tests

SunVTS provides a comprehensive diagnostic tool that tests and validates Sun hardware by verifying the connectivity and functionality of most hardware controllers and devices on Sun platforms. SunVTS software can be tailored with modifiable test instances and processor affinity features.

The following tests are supported on x86 platforms:

- CD/DVD Test (cddvdtest)
- CPU Test (cputest)
- Cryptographics Test (cryptotest)
- Disk and Diskette Drives Test (disktest)
- Data Translation Look-aside Buffer (dtlbtest)
- Emulex HBA Test (emlxtest)
- Floating Point Unit Test (fputest)
- InfiniBand Host Channel Adapter Test (ibhcatest)
- Level 1 Data Cache Test (l1dcachetest)
- Level 2 SRAM Test (l2sramtest)
- Ethernet Loopback Test (netlbtest)
- Network Hardware Test (nettest)
- Physical Memory Test (pmemtest)
- QLogic Host Bus Adapter Test (qlctest)
- RAM Test (ramtest)
- Serial Port Test (serialtest)
- System Test (systest)
- Tape Drive Test (tapetest)
- Universal Serial Board Test (usbtest)
- Virtual Me
51. screte Sensor value Present

5. If the problem with the server is not evident after viewing sensor readings information, continue with Running SunVTS Diagnostic Tests on page 9.

APPENDIX D
Booting the Tools and Drivers CD from a PXE Server

This chapter contains information on booting the Tools and Drivers CD from a PXE server. If you have a server that does not have a DVD drive, you can run the Pc-Check diagnostics and flash the BIOS from a Preboot Execution Environment (PXE) server.

The following topics are included in this section:

- Setting up the Tools and Drivers CD Image on the PXE Server on page 59
- Accessing the Tools and Drivers CD From the Target Server on page 62

Setting up the Tools and Drivers CD Image on the PXE Server

You will need the following to set up the PXE server:

- Red Hat kickstart server with a CD or DVD drive. Instructions for setting up the Red Hat kickstart server can be found in the system administration guides for Red Hat Enterprise Linux:
  - Red Hat Enterprise Linux 3 manual at http://www.redhat.com/docs/manuals/enterprise/RHEL-3-Manual/sysadmin-guide
  - Red Hat Enterprise Linux 4 manual at http://www.redhat.com/docs/manuals
52. Adding a Diagnostic Partition to the First Bootable Disk

Pc-Check can view only the first or second hard disk on the system from the boot loader. The software automatically installs the diagnostic partition on the first bootable disk.

To add the diagnostic partition on the first bootable disk:

1. Insert the Tools and Drivers CD into the DVD tray.
2. Reboot the server.
3. At the Tools and Drivers CD main menu, type 1 to run Hardware Diagnostics.
4. From the main menu, choose Create Diagnostic Partition.
   - If the first bootable disk is clear of partitions, the Sun Microsystems Partitioning Utility window appears. It states: "Your primary hard disk is not partitioned. Would you like to partition it now?" Select Yes and press Enter. A window appears stating: "Partitioning complete. Your machine will now be restarted."
   - If the first bootable disk is not clear of partitions, a window appears stating that the software is unable to create a hardware diagnostic partition because there are already partitions on the disk.
     - If this happens, go to Removing Existing Partitions From a Hard Disk on page 25 to remove the partitions from the disk.
     - Repeat Step 1 through Step 4 of this procedure.
5. Press Enter to reboot your server.

Creating a Log File on the Diagnostic Partition

All the scripts that are loadable with the hardware diagnostics software are predefined with logging t
53. sks and script information.

TABLE 3-2 Advanced Diagnostics Tests Menu Options (Continued)

Options: CD-ROM/DVD, ATAPI Devices, Serial Ports, Parallel Ports, Modems, ATA, USB, FireWire, Network, Keyboard, Mouse, Joystick, Audio, Video, Printers, Firmware ACPI

Option: Description
CD-ROM/DVD: Includes a CD-ROM/DVD menu to test DVD devices on the system.
ATAPI Devices: Details information about devices attached to the IDE controllers on the system other than a DVD or hard disks (for example, zip drives).
Serial Ports: Details information about the serial port and includes a Serial Ports Tests menu to test serial ports on the system. Note: In order for the Serial Port test to pass, the COM1 entry in the BIOS Setup Screen must be set to System. The use of a serial port loopback connector might also be required.
Parallel Ports: Not relevant to the Sun Fire X4150, X4250, and X4450 servers.
Modems: Not relevant to the Sun Fire X4150, X4250, and X4450 servers.
ATA: Includes an ATA test menu.
USB: Details information about the USB devices on the system and includes a USB Tests menu to test the USB.
FireWire: Not relevant to the Sun Fire X4150, X4250, and X4450 servers.
Network: Performs network register controller tests.
Keyboard: Includes a Keyboard Test menu with options for performing different tests on the keyboard.
Mouse: Details information about the mouse and includes a menu to test the mouse on the system.
Joystick: Not relevant to the Sun Fire X4150 X42
54. ssing the Diagnostic Partition on the Solaris 10 Operating System

To access the diagnostic partition on the Solaris 10 operating system (OS):

1. Remove the Tools and Drivers CD from the DVD tray.
2. Reboot the machine and start the Solaris 10 OS.
3. Log in as root (superuser).
4. Type the following command to determine if your diagnostic partition has been configured to be mounted:
   ls /diagpart
   - If this command fails to list the log files created by the hardware diagnostics software, then the OS has never been configured to mount the diagnostic partition. Continue to Step 5.
   - If this command succeeds in listing the log files created by the hardware diagnostics software, then the OS has already been configured to mount the diagnostic partition. All users have read access to this partition. Only the superuser has read/write access to this partition. You do not need to continue this procedure.
5. Insert the Tools and Drivers CD into the DVD tray.
6. When the CD is mounted, open a terminal window.
7. Type the following:
   cd /cdrom/cdrom0/drivers/sx86
8. Type the following to install the diagnostic partition:
   ./install.sh
9. Press the Enter key. The following lines appear if the diagnostic partition is mounted successfully:
   Mounting Diagnostic Partition
   Installation Successful
10. Type the following command to list the contents of the diagnostic
55. t LED on DIMM D0 is on.

To isolate and correct DIMM ECC errors:

1. If you have not already done so, shut down your server to standby power mode and remove the cover.
2. Inspect the installed DIMMs to ensure that they comply with the DIMM population rules in your product service manual.
3. Press the Remind button and inspect the DIMM fault LEDs. See FIGURE 2-1 and FIGURE 2-2. For CEs and UCEs, a flashing LED identifies the DIMM where the error is located.
4. Disconnect the AC power cords from the server.

   Caution: Before handling components, attach an antistatic wrist strap to a chassis ground (any unpainted metal surface). The system's printed circuit boards and hard disk drives contain components that are extremely sensitive to static electricity.

   Note: To recover fault information, look in the SP SEL, as described in the Sun Integrated Lights Out Manager 2.0 User's Guide.

5. Remove the DIMMs from the DIMM slots. Refer to your server's service manual for details.
6. Visually inspect the DIMMs for physical damage, dust, or any other contamination on the connector or circuits.
7. Visually inspect the DIMM slot for physical damage. Look for cracked or broken plastic on the slot.
8. Dust off the DIMMs, clean the contacts, and install them.

   Caution: Use only compressed air to dust DIMMs.

9. If there is no obvious damage, replace a
56. t Tests

There are also a number of tests that can be performed on individual components. Each test is a continuous loop that lasts for 6 minutes. The following scripts are available for testing specific components:

- cdrom.tst: Tests the CD-ROM in the system.
- cpu.tst: Tests all CPUs in the system.
- hdiskx.tst: Tests hard disk x in the system, where x is the number of the hard drive that you are testing (1 to 16). Note: By default, the actual hard disk test lasts 10 minutes.
- mboard.tst: Tests the motherboard in the system.
- video.tst: Tests the video adapter in the system.

Running the Immediate Burn-in Tests

To load one of the scripts available to test the devices on your system, do the following:

From the main menu, choose Immediate Burn-in Testing. The top portion of the window lists the options described in TABLE 3-3, and the bottom portion of the window lists the following Burn-in menu options:

- Load Burn-in Script: To use a pre-written test, enter one of the following: quick.tst, noinput.tst, or full.tst. To use a script that you have created and saved, enter d:\testname.tst, where testname is the name of the script that you have created.
- Save Burn-in Script: To save a burn-in script that you have created, enter d:\testname.tst, where testname is the name of the script that you have created.
- Change Options: Opens the Burn-in Options m
57. t the Red Hat Linux operating system.
3. Log in as root (superuser).
4. Determine if your diagnostic partition has been configured to be mounted by typing the following command:
   ls /diagpart
   - If this command fails to list the log files created by the hardware diagnostics software, then the operating system has never been configured to mount the diagnostic partition. Continue to Step 5.
   - If this command succeeds in listing the log files created by the hardware diagnostics software, then the operating system has already been configured to mount the diagnostic partition. All users have read access to this partition. Only the superuser has read/write access to this partition. You do not need to continue this procedure.
5. Insert the Tools and Drivers CD into the DVD tray.
6. When the CD is mounted, open a terminal window.
7. Type the following command:
   cd /mountpoint/drivers/linux/linux_version
   where mountpoint is the CD mountpoint and linux_version is the version of Linux that you have installed. For example:
   cd /mnt/cdrom/drivers/linux/red_hat
8. Type the following to install the diagnostic partition:
   ./install.sh
9. Press Enter. The following lines appear if the diagnostic partition is mounted successfully:
   Mounting Diagnostic Partition
   Installation Successful
10. Type the following command:
    ls /diagpart
    The contents of the diagnostic partition are listed.

Acce
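The ls check in Step 4 of the procedure above can also be scripted. The following Python sketch lists the log files under the mount point, so an empty result means the operating system has probably not been configured to mount the diagnostic partition. The function name is hypothetical, and the /diagpart path and the .jrl extension are assumptions taken from the examples in this guide.

```python
from pathlib import Path


def diagnostic_logs(mount_point: str = "/diagpart") -> list[str]:
    """List hardware diagnostics log files under the mount point.

    Assumes logs carry the .jrl extension, as in the noinput.jrl
    example. Returns an empty list when the directory is missing.
    """
    root = Path(mount_point)
    if not root.is_dir():
        return []
    return sorted(p.name for p in root.iterdir() if p.suffix.lower() == ".jrl")


if __name__ == "__main__":
    logs = diagnostic_logs()
    if logs:
        print("Diagnostic partition is mounted; logs found:", ", ".join(logs))
    else:
        print("No logs found; continue with Step 5 of the procedure.")
```

An empty list does not distinguish between an unmounted partition and a mounted partition that holds no logs yet, so treat the result as a hint rather than a definitive check.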
58. version dependencies, especially with third-party software.

System Inspection

Controls that have been improperly set and cables that are loose or improperly connected are common causes of problems with hardware components.

Troubleshooting Power Problems

- If the server will power on, skip this section and go to Externally Inspecting the Server on page 3.
- If the server will not power on, check the following:
  - Check that AC power cords are attached firmly to the server's power supplies and to the AC sources.
  - Check that the main cover is firmly in place. An intrusion switch on the motherboard automatically shuts down the server power to standby mode when the cover is removed.

Externally Inspecting the Server

To perform a visual inspection of the external system:

1. Inspect the external status indicator LEDs, which can indicate component malfunction. For the LED locations and descriptions of their behavior, see External Status Indicator LEDs on page 39.
2. Verify that nothing in the server environment is blocking air flow or making a contact that could short out power.
3. If the problem is not evident, continue with the next section, Internally Inspecting the Server on page 4.

Internally Inspecting the Server

To perform a visual inspection of the internal system:

1. Choose a met
59. y if you have a Sun Blade X4150 or X4450 server that is still running Embedded Lights Out Manager.

For more information on using the ILOM SP web interface to maintain the server (for example, configuring alerts), refer to the Integrated Lights Out Manager 2.0 User's Guide.

- If any of the logs or information screens indicate a DIMM error, see Chapter 2.
- If the problem with the server is not evident after viewing ILOM SP logs and information, continue with Using Pc-Check Diagnostics Software on page 13 or Running SunVTS Diagnostic Tests on page 9.

Making a Serial Connection to the SP

To make a serial connection to the SP:

1. Connect a serial cable from the RJ-45 Serial Management port on the server to a terminal device.
2. Press Enter on the terminal device to establish a connection between that terminal device and the ILOM SP.

   Note: If you are connecting to the serial port on the SP during its power-up sequence, you will see boot messages.

   The service processor eventually displays a login prompt. For example:
   SUNSP0003BA84D777 login:
   The first string in the prompt is the default host name for the ILOM SP. It consists of the prefix SUNSP and the MAC address of the ILOM SP. The MAC address for each ILOM SP is unique.
3. Log in to the SP and type the default user name, root, with the default password, changeme. Once you have successfully logged in to the SP, it displays its default command prompt
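Because the default host name is the prefix SUNSP followed by the MAC address of the ILOM SP, the address can be recovered from the login prompt. The Python sketch below is illustrative only: the function name is hypothetical, and it assumes the MAC address appears as 12 hexadecimal digits with no separators, as in the SUNSP0003BA84D777 example. The colon-separated output is a formatting choice, not something the guide prescribes.

```python
import re

# Assumed layout: literal prefix "SUNSP" followed by 12 hex digits.
_SP_HOSTNAME = re.compile(r"^SUNSP([0-9A-Fa-f]{12})$")


def mac_from_sp_hostname(hostname: str) -> str:
    """Extract the SP MAC address from a default ILOM SP host name."""
    match = _SP_HOSTNAME.match(hostname.strip())
    if match is None:
        raise ValueError(f"not a default ILOM SP host name: {hostname!r}")
    digits = match.group(1).upper()
    # Group the digits into octets for readability.
    return ":".join(digits[i:i + 2] for i in range(0, 12, 2))


if __name__ == "__main__":
    print(mac_from_sp_hostname("SUNSP0003BA84D777"))  # prints 00:03:BA:84:D7:77
```

If the host name has been changed from its default, the function raises ValueError rather than guessing.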
