
StorageWorks™ Array Controllers HS Family of

1. [Figure: SW500-series cabinet, front and rear views, showing the CI bulkhead, front-to-rear SCSI-2 cable routing points, CDU A, CDU B, the internal CI cable routing point, storage positions S5 through S8, and controller position C2. CXO-3902A-MC]
   Figure 3-5 shows the loading sequence for storage and controller shelves when TZ8xx-series tape devices are installed.
   • Standard shelf configuration: A standard of one BA350-MA controller shelf connected to six BA350-SB storage shelves in a single SW500-series cabinet is suggested.
   • Two BA350-MA shelves can be housed with a maximum of four BA350-SB shelves as two subsystems.
   [Page 3-6, Configuration Rules and Restrictions]
   Figure 3-5: SW500-Series Cabinet Controller/Storage/Tape Drive Locations
2. [Table of contents]
   6 Diagnostics, Exercisers, and Utilities
   6.1 Initialization
   6.1.1 Built-In Self-Test
   6.1.2 Core Module Integrity Self-Test
   6.1.3 Module Integrity Self-Test DAEMON
   6.1.3.1 Self-Test
   6.2 Disk Inline Exerciser (HSJ- and HSD-Series Controllers)
   6.2.1 Invoking DILX
   6.2.2 Interrupting DILX Execution
   6.2.3 DILX Tests
   6.2.3.1 Basic Function Test (DILX)
   6.2.3.2 User-Defined Test (DILX)
   6.2.4 DILX Test Definition Questions
   6.2.5 DILX Output Messages
   6.2.6 DILX End Message Display
   6.2.7 DILX Event Information Packet Displays
   6.2.8 DILX Data Patterns
   6.2.9 DILX Examples
   6.2.9.1 DILX Example Using All Defaults
   6.2.9.2 DILX Example Using All Functions
   6.2.9.3 DILX Examples: Auto-Configure with All Units
   6.2.10 Interpreting
3. [Figure: cabinet front and rear views showing shelf mounting locations, mounting-hole numbers, storage shelf positions, controller positions C1 through C4, and the cable pass-through. CXO-4161C-MC]
   Single or paired TZ8x7 devices must be connected with a 0.2-meter (8-inch) SCSI-1 to StorageWorks transition
4. [Figure: cabinet front and rear views showing shelf and tape mounting locations, front-to-rear SCSI-2 cable routing points, the CI bulkhead, CDU B, the internal CI cable routing point, tape positions T1 and T2, storage positions S1 through S5, and controller positions C1 and C2. CXO-3903A-MC]
   • Two device shelves per port (jumpered pairs): Two BA350-SB shelves can be joined on the same controller port with the following restrictions:
     - The SCSI-2 cable to the first BA350-SB storage shelf is 1.0 meter or less.
     - The SCSI-2 cable from the first BA350-SB shelf to the second shelf is 0.5 meters or less. This requires the two shelves to be immediately adjacent to each other.
     - The first BA350-SB storage shelf is configured for unterminated single SCSI.
     Controller shelf position C1 can be used with the pairs S1/S2 and S3/S4, and controller shelf position C2 can be used with the pair S7/S8, to satisfy these restrictions. Single subsystem C1 can thus accommodate up to 16 5¼-inch SBBs
   • TZ8x7 hal
5. [Figure: CXO-4119A-MC]
   4. Use a gentle up-and-down rocking motion to help seat the module into the backplane. Press firmly on the module until it is seated. Finally, press firmly once more to make sure the module is seated.
   5. Tighten the four screws on the front bezel using a 3/32-inch Allen wrench (HSJ-series controllers) or flat-head screwdriver (HSD- and HSZ-series controllers).
   6. Connect a maintenance terminal to the MMJ of the new controller.
   Before proceeding: Set initial controller parameters by following the steps in Section 7.1.3.5.
   7. Press and hold the controller's green reset button. Then insert the program card into the new controller. The program card eject button will extend when the card is fully inserted. Release the reset button.
   9. Enter the following command to initialize the controller:
      CLI> RESTART THIS_CONTROLLER
      If the controller initializes correctly, its green reset LED will begin to flash at 1 Hz. If an error occurs during initialization, the OCP will display a code. Refer to Chapter 5 to analyze the code.
   10. If you wish, you may disconnect the maintenance terminal. The terminal is not required for normal controller operation.
   [Page 7-8, Removing and Replacing Field Replaceable Units]
   Figure 7-6: Controller Shelf Rails
6. [Figure labels: cache module, controller module. CXO-4120A-MC]
   11. Close and lock the cabinet doors (SW800-series) using a 5/32-inch Allen wrench.
   7.1.3.5 Restoring Initial Parameters
   A new controller module has no initial parameters, so you must use the maintenance terminal to enter them. Refer to the CONFIGURATION INFO file or the configuration sheet packaged with your system, whichever is more current, for parameters. When installing a replacement, be sure to use the same parameters as the removed controller.
   [Page 7-9, Removing and Replacing Field Replaceable Units]
   After installation of a nonredundant controller, use the CLI to define its parameters in the following order from a maintenance terminal.
   CAUTION: Do not install HSJ-series CI host port cables until after setting all parameters listed here. Failure to follow this procedure may result in adverse effects on the host/cluster.
   Note: Not all steps are applicable to all controller models. Steps applicable to certain models are designated as such.
   1. (HSD-series controller) Turn the controller on before entering parameters.
   2. Enter the following command to set MAX_NODES (HSJ-series controllers):
      CLI> SET THIS_CONTROLLER MAX_NODES=n
      where n is 8, 16, or 32.
   3. Enter the following command to set a valid controller ID:
      CLI> SET THIS_CONTROLLER ID=n
      where n is the HSJ-series controller CI node number, 0 through (MAX_NODES - 1).
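   The MAX_NODES and ID ranges stated in steps 2 and 3 above can be checked before the values are typed at the maintenance terminal. The following is a minimal Python sketch of such a pre-check, not part of the controller firmware or CLI; the function name check_ci_parameters and the idea of a pre-check script are assumptions made for illustration.

```python
# Hypothetical pre-check helper; only the numeric ranges come from the text.
VALID_MAX_NODES = (8, 16, 32)


def check_ci_parameters(max_nodes: int, node_id: int) -> list[str]:
    """Return a list of problems; an empty list means both values fit the stated ranges."""
    problems = []
    if max_nodes not in VALID_MAX_NODES:
        problems.append(f"MAX_NODES must be one of {VALID_MAX_NODES}, got {max_nodes}")
    elif not 0 <= node_id <= max_nodes - 1:
        # The ID range depends on MAX_NODES, so it is only checked when MAX_NODES is valid.
        problems.append(f"ID must be 0 through {max_nodes - 1}, got {node_id}")
    return problems


if __name__ == "__main__":
    for max_nodes, node_id in [(16, 15), (16, 16), (12, 0)]:
        print(max_nodes, node_id, check_ci_parameters(max_nodes, node_id) or "ok")
```

   Checking MAX_NODES first mirrors the order of the procedure, where MAX_NODES is set before the ID.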
7. KB/S: This column indicates the average number of kilobytes of data transferred to and from the unit in the previous screen update interval. This data is only available for disk and tape units.
   Rd: This column indicates what percentage of data transferred between the host and the unit was read from the unit. This data is only contained in the DEFAULT display for disk and tape device types.
   Wr: This column indicates what percentage of data transferred between the host and the unit was written to the unit. This data is only contained in the DEFAULT display for disk and tape device types.
   Cm: This column indicates what percentage of data transferred between the host and the unit was compared. A compare operation may be accompanied by either a read or a write operation, so this column is not cumulative with the read percentage and write percentage columns. This data is only contained in the DEFAULT display for disk and tape device types.
   HT: This column indicates the cache hit percentage for data transferred between the host and the unit.
   PH: This column indicates the partial cache hit percentage for data transferred between the host and the unit.
   MS: This column indicates the cache miss percentage for data transferred between the host and the unit.
   Purge: This column shows the number of blocks purged from the cache in the last update interval.
   BlChd: This column shows the number of blocks added to the cache in the
8. [Table of contents]
   Operator Control Panel
   Command Line Interpreter
   Accessing the CLI
   Exiting the CLI
   4.3.3 Command Sets
   4.3.4 Initial Configuration (Nonredundant Controller)
   4.3.5 Initial Configuration (Dual-Redundant Controllers)
   4.3.6 Configuring Storage Devices
   4.4 Acceptance
   4.5 Maintenance Terminal
   4.6 Virtual Terminal (HSJ- and HSD-Series Controllers)
   4.7 Virtual Terminal (HSZ-Series Controllers)
   4.8 VAXcluster Console System
   4.9 Operating Systems
   4.9.1 Controller Disks as System Initialization Disks
   4.9.2 Operating System Nodes (OpenVMS)
   4.9.3 AUTOGEN.COM (OpenVMS)
   4.9.4 Other Conditions (OpenVMS)
   4.10 Failover
   4.10.1 Setting Failover
   4.10.2 Exiting Failover
9. [Figure residue: rotated VTDPY display screens; text not recoverable. Page 6-70, Diagnostics, Exercisers, and Utilities]
10. [Figure labels: 5¼-inch SBB, 5¼-inch SBB. CXO-3750B-MC]
   Half-rack/full-depth devices (for example, all TZ867 tapes) must be on their own port and cannot be connected as an extension from a BA350-SB shelf. Only two such devices (maximum) may be configured per controller port, and those devices must be physically adjacent to each other at the top of a cabinet. Figure 3-7 shows two adjacent tape drives attached to a single port of the controller shelf.
   [Page 3-8, Configuration Rules and Restrictions]
   Figure 3-7: Adjacent Devices on a Single Port
   [Figure labels: BA350-MA, HSJ40 controller. CXO-3751A-MC]
   • Connecting a 1.0-meter cable from a controller shelf to a device shelf allows for device shelf jumpering. Connecting a 2.0-meter cable does not permit shelf jumpering. Required cable length will vary depending on cabinet type, device shelf position, and controller shelf position.
   3.4 Device Placement
   The following sections describe recommended device configurations for 3½-inch and 5¼-inch SBBs.
   Note: Intermixing disk SBBs and tape SBBs on the same controller port is permitted, provided all other configuration rules in this chapter are also obeyed.
   3.4.1 3½-Inch SBB Restrictions
   There are no restrictions for adding 3½-inch SBBs to a configuration. Refer to your SPD and release notes for a list of specific supported device types.
   3.4.2 5¼-Inch SBB Restrictions
   The following restrictions apply when adding 5¼-inch SBBs
11. (continued on next page)
   Table C-46 (Cont.): Disk Inline Exerciser (DILX) Last Failure Codes
   Code      Description
   800A0100  DILX was not able to restart HIS timer.
   800B0100  DILX tried to issue an IO for an opcode that is not supported.
   800C0100  DILX tried to issue a oneshot IO for an opcode that is not supported.
   800D0100  A DILX device control block contains an unsupported unit_state.
   800E0100  While trying to print an Event Information Packet, DILX discovered an unsupported MSCP error log format.
   80100100  DILX could not compare buffers because no memory was available from EXEC ALLOCATE_MEM_ZEROED.
   80120100  DILX expected an EIP to be on the receive EIP question, but no EIPs were there.
   80130100  DILX was asked to fill a data buffer with an unsupported data pattern.
   80140100  DILX could not process an unsupported answer in dx reuse params.
   Table C-47: Tape Inline Exerciser (TILX) Last Failure Codes
   Code      Description
   81010100  An HTB was not available to issue an IO when it should have been.
   81020100  A unit could not be dropped from testing because an available command failed.
   81030100  TILX tried to release a facility that was not reserved by TILX.
   81040100  TILX tried to change the unit state from MAINTENANCE_MODE to NORMAL but was rejected because of insufficient resources.
   81050100  TILX tried to change the USB unit state from MAINTENANCE_MODE to NORMAL, but TILX never received notification of a successful state change.
   8
12. 6.2.9.2 DILX Example Using All Functions
   In Example 6-7, all functions are chosen for DILX. DILX was invoked from the virtual terminal using the DUP connection from an OpenVMS system. This is an extensive (long) run because the initial write pass was chosen, and because there was enough time for the initial write pass to complete and for normal testing to continue for a reasonable length of time after the initial write pass.
   CAUTION: This test writes to disks. All user data will be destroyed.
   Example 6-7: All Functions (DILX)
      SHOW CLUSTER/CONTINUOUS
      View of Cluster from system ID 9038 node: ENGHRN   7-APR-1993 14:54:01
      SYSTEMS / MEMBERS
      NODE     SOFTWARE   STATUS
      ENGHRN   VMS V5.5   MEMBER
      FORCE    HSC V700
      WODWND   VMS V5.5   MEMBER
      CYMBAL   VMS V5.5   MEMBER
      LUTE     VMS V5.5   MEMBER
      ASS2     HSJ TM41
      ASS1     HSJ XM41
      (Entered a Ctrl/C here)
      UP> set host/dup/server=mscp$dup MASS1/task=DILX
      %HSCPAD-I-LOCPROGEXE, Local program executing - type ^\ to exit
      Copyright © Digital Equipment Corporation 1993
      Disk Inline Exerciser - version 1.4
      The Auto-Configure option will automatically select, for testing, half or all of the disk units configured. It will perform a very thorough test with WRITES enabled. The user will only be able to select the run time and performance summary options and whether or not to test a half or full configuration. The user will not be able to specify specific units to test. The Auto-Configure option is
13. Error 1140: Invalid unit number. Valid unit number range(s) are: <start> to <end>
   Explanation: You attempted to create a unit out of the valid unit ranges. The valid unit ranges are given by the <start> and <end> values. Retry the ADD command, specifying a unit number in the correct range.
   Error 2000: Port must be 1 - <maximum port number>
   Explanation: When adding a device, you specified a port less than 1 or greater than <maximum port number>. Retry the command, specifying a port within the range given.
   Error 2010: Target must be 0 - <maximum target number>
   Explanation: When adding a device, you specified a target greater than <maximum target number>. In single-controller configurations, <maximum target number> is 6. In dual-redundant configurations, <maximum target number> is 5.
   Error 2020: LUN must be 0 - 7
   Explanation: When adding a device, you specified a LUN greater than 7.
   Error 2030: This port, target, and LUN already in use by another device
   Explanation: When adding a device, you specified a PTL that is already specified by another device.
   [Page B-65, Command Line Interpreter]
   Error 2040: Cannot set TRANSPORTABLE when device in use by an upper layer
   Explanation: A disk cannot be set to TRANSPORTABLE once it is being used by an upper-level unit or storage set.
   Error 2050: Cannot set NOTRANSPORTABLE when device in use by an upper layer
   Explanation: A disk ca
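   The port, target, and LUN limits behind errors 2000, 2010, and 2020 above can be sketched as a small Python range check. This is an illustration under stated assumptions, not the controller's own logic: the function name check_ptl is hypothetical, the maximum port number is passed in because the excerpt does not fix it, and rejecting negative targets and LUNs is an added assumption.

```python
# Hypothetical range check; error numbers and limits are taken from the messages above.
def check_ptl(port: int, target: int, lun: int, max_port: int,
              dual_redundant: bool = False) -> list[tuple[int, str]]:
    """Return (error number, message) pairs for each range that is violated."""
    # Maximum target is 6 for a single controller and 5 for a dual-redundant pair.
    max_target = 5 if dual_redundant else 6
    errors = []
    if not 1 <= port <= max_port:
        errors.append((2000, f"Port must be 1 - {max_port}"))
    if not 0 <= target <= max_target:  # negative values rejected by assumption
        errors.append((2010, f"Target must be 0 - {max_target}"))
    if not 0 <= lun <= 7:  # negative values rejected by assumption
        errors.append((2020, "LUN must be 0 - 7"))
    return errors


if __name__ == "__main__":
    print(check_ptl(1, 6, 0, max_port=6))
    print(check_ptl(1, 6, 0, max_port=6, dual_redundant=True))
    print(check_ptl(7, 0, 8, max_port=6))
```

   Error 2030 (PTL already in use) is not modeled, because it depends on the devices already configured rather than on a fixed range.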
14. Qualifiers for HSD controllers
   ID=n
      Specifies the DSSI node number (0 through 7).
   MSCP_ALLOCATION_CLASS=n
      Specifies the allocation class (0 through 255 in a single-controller configuration or 1 through 255 in a dual-redundant configuration). When first installed, the controller's MSCP_ALLOCATION_CLASS is set to 0.
   PATH
   NOPATH
      Enables or disables the DSSI port. When first installed, NOPATH is set.
   PROMPT="new prompt"
      Specifies a 1- to 16-character prompt, enclosed in quotes, that will be displayed when the controller's CLI prompts for input. Only printable ASCII characters are valid. When first installed, the CLI prompt is set to the first three letters of the controller's model number (for example, HSJ>, HSD>, or HSZ>).
   SCS_NODENAME="xxxxxx"
      Specifies a one- to six-character name for the node.
   TERMINAL_PARITY={ODD,EVEN}
   NOTERMINAL_PARITY
      Specifies the parity transmitted and expected. Parity options are ODD or EVEN. NOTERMINAL_PARITY causes the controller not to check for, or transmit, any parity on the terminal lines. When first installed, the controller's terminal parity is set to NOTERMINAL_PARITY.
   TERMINAL_SPEED=baud_rate
      Sets the terminal speed to 300, 600, 1200, 2400, 4800, or 9600 baud. The transmit speed is always equal to the receive speed. When first installed, the controller's terminal speed is set to 9600 baud.
   [Page B-38, Command Line Interpreter, SET THIS_CONTROLLER]
   TMSCP_ALLOCATION_CLASS n
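   The value ranges listed for the HSD qualifiers above lend themselves to a quick sanity check before a configuration sheet is keyed in. The Python sketch below is illustrative only; the function name check_hsd_settings is hypothetical, and treating the space character as printable ASCII is an assumption not stated in the excerpt.

```python
# Hypothetical sanity check; all limits are taken from the qualifier descriptions above.
VALID_BAUD_RATES = (300, 600, 1200, 2400, 4800, 9600)


def check_hsd_settings(node_id: int, allocation_class: int, prompt: str,
                       baud_rate: int, dual_redundant: bool = False) -> list[str]:
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    if not 0 <= node_id <= 7:
        problems.append("ID must be 0 through 7")
    # Allocation class 0 is only allowed in a single-controller configuration.
    lowest_class = 1 if dual_redundant else 0
    if not lowest_class <= allocation_class <= 255:
        problems.append(f"MSCP_ALLOCATION_CLASS must be {lowest_class} through 255")
    # Printable ASCII is taken here as 0x20 through 0x7E (assumption).
    printable = all(0x20 <= ord(ch) <= 0x7E for ch in prompt)
    if not (1 <= len(prompt) <= 16 and printable):
        problems.append("PROMPT must be 1 to 16 printable ASCII characters")
    if baud_rate not in VALID_BAUD_RATES:
        problems.append(f"TERMINAL_SPEED must be one of {VALID_BAUD_RATES}")
    return problems


if __name__ == "__main__":
    print(check_hsd_settings(3, 0, "HSD> ", 9600))
    print(check_hsd_settings(3, 0, "HSD> ", 9600, dual_redundant=True))
```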
15. Initialize a newly inserted disk by entering the following:
      CLI> INITIALIZE container
   where container is either the disk or a group of disks linked as a storage set. This initializes the metadata on each disk in the container, including the one that was just swapped.
   Note: If you think you have failed to perform warm swap exactly as stated here, you should reinitialize the controller. Otherwise, the controller may perform unpredictably.
   [Page 7-41, Removing and Replacing Field Replaceable Units]
   Remember to close and lock the cabinet doors (SW800-series), using a 5/32-inch Allen wrench, after finishing the device warm swap.
   7.11.2 Controller Warm Swap (HSJ-Series Controllers)
   Use warm swap to efficiently remove and replace one controller in a dual-redundant configuration. Warm swap is the most transparent method available to the HS controller subsystem for changing out a controller. Performing warm swap involves removing one controller while forcing the other controller into failover. Because the remaining controller executes failover, it assumes control of the absent controller's devices. This minimizes the impact on system performance and downtime.
   Note: You must warm swap only one controller at a time. Never attempt to remove both controllers in your dual-redundant configuration using warm swap. Try to have a replacement controller available prior to starting warm swap. Otherwise you must t
16. [Table of contents]
   Glossary
   Index
   Examples
   DILX End Message Display
   Controller Error
   Memory Error
   Disk Transfer Error
   Bad Block Replacement Attempt Error
   Using All Defaults (DILX)
   All Functions (DILX)
   Auto-Configuration with All Units
   Auto-Configuration with Half of All Units
   TILX End Message Display
   Controller Error
   Memory Error
   Tape Error
   Using All Defaults (TILX)
   Using All Functions (TILX)
   DILX Sense Data Display
   DILX Deferred Error Display
   Disk Transfer Error Event Log
   Deskew Command Procedure Example
   ERF Error Log Before Command Procedure
   ERF Error Log After Command Procedure
17. SB shelves attached to controllers with 6 controller ports.
   • Maximum number of device shelves: Up to 18 horizontal BA350-SB device shelves are allowed (16 if one or two TZ8x7 tape loaders are present). An earlier cabinet configuration had a provision for 19 horizontal device shelves; however, Digital no longer recommends that configuration.
   [3] Redundant power and dual-redundant controllers are not supported when using 42 devices. This is not a recommended configuration.
   [Page 3-4, Configuration Rules and Restrictions]
   Figure 3-3: SW800-Series Data Center Cabinet Controller/Storage/Tape Drive Locations
   [Figure: cabinet views showing shelf and tape mounting locations, mounting holes, and tape positions T1 and T2]
18. See Section C.2.1 for the description of this field.
   byte count: The number of bytes contained in the affected Nonvolatile Parameter Memory component area (that is, the area bounded by "memory address" through "memory address" + "byte count" - 1).
   number of times written: The number of times the affected Nonvolatile Parameter Memory component area has been written.
   undef: This field is only present to provide longword alignment; its content is undefined.
   C.2.3.4 Backup Battery Failure Event Log (Template 12)
   The HSJ30/40 controller Value Added Services firmware component reports backup battery failure conditions, for the various hardware components that use a battery to maintain state during power failures, via the Backup Battery Failure Event Log.
   [Page C-29, HSJ-Series Error Logging]
   The Backup Battery Failure Event Log will be sent to all host systems that have enabled "Miscellaneous" error logging on a connection, or connections, established with the HSJ30/40 controller's Disk and/or Tape MSCP Server. The Backup Battery Failure Event Log is reported via the T/MSCP Memory Errors error log message format. The format of this event log, including the HSJ30/40 controller-specific fields, is shown in Figure C-19.
   Figure C-19: Backup Battery Failure Event Log (Template 12) Format
   [Figure fields, bits 31 to 0: controller identifier; reserved, chvrsn, csvrsn; memory address; reserved; event time]
   Backup Battery Failure Event Log Format Sp
19. Code      Description
   00120009  Not implemented in DSSI environment.
   001D0009  Virtual circuit closed due to DSSI ID complete failure.
   (continued on next page)
   [Page D-2, HSD-Series Error Logging]
   Table D-2 (Cont.): Host Interconnect Services Status Codes
   Code      Description
   001F0009  Virtual circuit closed due to DSSI retry
   Table D-3: DSSI Port/Port Driver Event Log (Template 32) Instance/MSCP Event Codes
   Instance Code  MSCP Event Code  Description
   4007640A       006A             DSSI Port detected error upon attempting to transmit a packet. This resulted in the closure of the Virtual Circuit.
   Table D-4: Host Interconnect Services Last Failure Codes
   Code      Description
   40000101  An unrecognized DSSI opcode was received by HIS. These packets are packets with DSSI opcodes recognized by the port but not by HIS. Last Failure Parameter 0 contains the DSSI opcode value.
   Table D-5: Host Interconnect Port Services Last Failure Codes
   Code      Description
   420C0100  HP_INIT could not allocate initial HTB for Path A.
   420D0100  HP_INIT could not allocate HPHW structure.
   42350100  HP found a negative offset in a Host Data transfer operation.
   42640100  Scan packet que found bad path select case for DSSI.
   42680102  Dssi_err_isr routine found that 720 report status for initiator mode Last Failure Ped an unexpected status for target mode. Last Failure Parameter 0 contains the 720 chip dstat register value. Last Failure Parameter 1 contains the 720 chip sist1 regi
20. 1000
      Perform data compare (y/n) [n] ? y
      Enter compare percentage (1:100) [2] ? 1
      Tape unit numbers on this controller include: 50 52
      Enter unit number to be tested ? 50
      Is a tape loaded and ready, answer Yes when ready ? y
      Unit 50 successfully allocated for testing
      Select another unit (y/n) [n] ? y
      Enter unit number to be tested ? 52
      Is a tape loaded and ready, answer Yes when ready ? y
      Unit 52 successfully allocated for testing
      Maximum number of units are now configured
      TILX testing started at: 13-JAN-1993 04:38:15
      Test will run for 10 minutes
      Type ^T (if running TILX through VCS) or ^G (in all other cases) to get a current performance summary
      Type ^C to terminate the TILX test prematurely
      Type ^Y to terminate TILX prematurely
      TILX Summary at 13-JAN-1993 04:40:14
      Test minutes remaining: 9, expired: 1
      Unit 50 Total IO Requests 724
         Read Count 3, Write Count 681, Reposition Count 3
         Total KB xfer 6718, Read 10, Write 6707
         No errors detected
      Unit 52 Total IO Requests 731
         Read Count 3, Write Count 687, Reposition Count 3
         Total KB xfer 6743, Read 10, Write 6733
         No errors detected
      Reuse Parameters (stop, continue, restart, change unit) [stop] ?
      TILX - Normal Termination
      HSJ>
   [Page 6-47, Diagnostics, Exercisers, and Utilities]
   6.3.10 Interpreting the TILX Performance Summaries
   A TILX performance display is produced under the following conditions:
   • When the user-selectable performance summary interval elapses
21. [Table of contents]
   7.6.2 Precautions
   7.6.3 Cable Removal
   7.6.4 Cable Replacement/Installation
   7.7 SCSI Host Cables (HSZ-Series)
   7.7.1 Tools Required
   7.7.2 Precautions
   7.7.3 Cable Removal
   7.7.4 Cable Replacement/Installation
   7.8 SCSI Device Port Cables
   7.8.1 Tools Required
   7.8.2 Precautions
   7.8.3 Cable Removal
   7.8.4 Cable Replacement/Installation
   7.9 [title not legible]
   7.9.1 Tools Required
   7.9.2 Precautions
   7.9.3 Blower Removal
   7.9.4 Blower Replacement/Installation
   7.10 Power Supplies
22. and ascq fields are undefined.
   Byte transfer timeout during disk operation. Note that in this instance the asc and ascq fields are undefined.
   Byte transfer timeout during tape operation. Note that in this instance the asc and ascq fields are undefined.
   Byte transfer timeout during media loader operation. Note that in this instance the asc and ascq fields are undefined.
   Byte transfer timeout during operation to a device that is unknown to the controller. Note that in this instance the asc and ascq fields are undefined.
   SCSI bus errors during disk operation. Note that in this instance the asc and ascq fields are undefined.
   (continued on next page)
   Table C-27 (Cont.): Device Services Nontransfer Error Event Log (Template 41) Instance/MSCP Event Codes
   Instance Code  MSCP Event Code  Description
   03854402       01AA             SCSI bus errors during tape operation. Note that in this instance the asc and ascq fields are undefined.
   03B74402       01AA             SCSI bus errors during media loader operation. Note that in this instance the asc and ascq fields are undefined.
   03D24402       01AA             SCSI bus errors during device operation. The device type is unknown to the controller. Note that in this instance the asc and ascq fields are undefined.
   03052002       002A             Device port SCSI chip reported gross error during disk operation. Note that in this instance the asc and ascq fi
23. seconds   Change update interval
      HELP      Display this help message
      REFRESH   Refresh the current display
      QUIT      Terminate program (same as EXIT)
      UPDATE    Update screen display
      VTDPY>
   This is the sample output from executing the HELP command.
   [Page 6-97, Diagnostics, Exercisers, and Utilities]
   6.6 The CONFIG Utility
   The CONFIG utility locates and adds devices to the controller. You should run the CONFIG utility whenever new devices are added to the controller. The CONFIG utility searches all port/target/LUN (PTL) device combinations to determine what devices exist on the subsystem. It adds all new devices that are found. The CONFIG utility does not initialize these devices, and it does not add units or storage sets.
   If a device somewhere in the cluster already has the PTL that the CONFIG utility plans to assign, the program will assign an alpha character after the numbers. For example, if another device is already called DISK100, the program will assign the name DISK100A to the new device. The program compares DISK100A to other PTLs in the cluster, and if DISK100A has already been used, the program increments to DISK100B, and so forth. This avoids the assignment of duplicate PTLs in the same cluster.
   6.6.1 Running the CONFIG Utility
   You can run the CONFIG utility on either a virtual terminal or a maintenance terminal. Before running the CONFIG utility, you may use the SHOW DEVICES command to
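   The duplicate-avoidance rule described in Section 6.6 above (DISK100, then DISK100A, then DISK100B, and so forth) can be sketched in Python as follows. This only illustrates the naming rule as the text describes it and is not the CONFIG utility's code; the function name pick_device_name is hypothetical, and raising an error once A through Z are exhausted is an assumption, since the excerpt does not say what happens after Z.

```python
# Hypothetical illustration of the suffix rule; not the CONFIG utility itself.
import string


def pick_device_name(base: str, names_in_use: set[str]) -> str:
    """Return base if it is free, otherwise base plus the first unused letter A..Z."""
    if base not in names_in_use:
        return base
    for letter in string.ascii_uppercase:
        candidate = base + letter
        if candidate not in names_in_use:
            return candidate
    # Behavior past Z is not covered by the text, so the sketch simply stops here.
    raise ValueError(f"no free suffix left for {base}")


if __name__ == "__main__":
    print(pick_device_name("DISK100", set()))
    print(pick_device_name("DISK100", {"DISK100"}))
    print(pick_device_name("DISK100", {"DISK100", "DISK100A"}))
```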
24. [Figure residue: rotated VTDPY display screens; text not recoverable. Page 6-74, Diagnostics, Exercisers, and Utilities]
   Figure 6-9: VTDPY Brief SCSI Status Display
25. [Figure residue: rotated VTDPY display; text not recoverable. Page 6-75, Diagnostics, Exercisers, and Utilities]
   Display Header
      HSJ40 S/N cxo0000002 SW vi4J HW 2 02   VTDPY Monitor   Copyright 1994 Digital Equipment Corp
   Description: This subdisplay provides title information for the display. For 132-column displays, this subdisplay will be spread across one line of the display.
   • Controller model
   • Controller serial number
   • Controller firmware version
   • Controller hardware version
   • Copyright notice
   [Page 6-76, Diagnostics, Exercisers, and Utilities]
   Date and Time
      29-JAN-1994 13:46:34   Up: 1 3:45:19
   Description: This subdisplay provides time information for the display.
   • System date and time. This information is not displayed for SCSI-based HS controllers.
   • Time in days, hours, minutes, and seconds since the last controller boot.
   [Page 6-77, Diagnostics, Exercisers, and Utilities]
   Controller Performance Summary
      88% I/D Hit   47.2% Idle   1225 KB/S   106 Rq/S
   Description: This subdisplay provides total system performance information.
   • Instruction and data cache hit rate.
   • Policy processor idle rate.
   • Cumulative data transfer rate in kilobytes per second. When logical units are being displayed, this is the transfer rate between the host and the controller. When physical devices a
26. [Index residue: entries under "Codes" for SCSI Buffered Modes Codes, SCSI Command Operation Codes, SCSI Device Type Codes, SCSI Sense Key Codes, "solid (OCP)", and System Communication Services Message Operation Codes, with page references into Appendix C; the individual code-to-page pairs are not recoverable. Page Index-11]
27. Table C-36 (Cont.) Fault Manager Last Failure Codes

Code      Description
040D0100  FM ENABLE_EVENT_NOTIFICATION was called to enable EIP notification, but the specified routine was already enabled to receive EIP notification.
040F0102  The eip->generic.mscp1.flgs field of the EIP passed to FM REPORT_EVENT contains an invalid flag.
          • Last Failure Parameter 0 contains the instance code value.
          • Last Failure Parameter 1 contains the value supplied in the eip->generic.mscp1.flgs field.

Table C-37 Dual Universal Asynchronous Receiver/Transmitter Services Last Failure Codes

Code      Description
06010100  The DUART was unable to allocate enough memory to establish a connection to the CLI.
06020100  A port other than terminal port A was referred to by a set terminal characteristics command. This is illegal.
06030100  A DUP question or default question message type was passed to the DUART driver, but the pointer to the input area to receive the response to the question was NULL.

Table C-38 Failover Control Last Failure Codes

Code      Description
07010100  All available slots in the FOC notify table are filled.
07020100  FOC CANCEL_NOTIFY was called to disable notification for a routine that did not have notification enabled.
07030100  Unable to start the Failover Control Timer before main loop.
07040100  Unable to restart the Failover Control Timer.
07050100 07060100 Una
28. 2. Configure half of all disk units for testing (this is recommended for a dual controller subsystem).
    3. Exit Auto-Configure and DILX.
    Enter Auto-Configure option (1:3) [3] ? 2
    All data on the Auto-Configured disks will be destroyed.
    You MUST be sure of yourself.
    Are you sure you want to continue (y/n) [n] ? y
    Enter execution time limit in minutes (1:65535) [60] ?
    Enter performance summary interval in minutes (1:65535) [60] ?
    Unit 12 successfully allocated for testing
    Unit 21 successfully allocated for testing
    Unit 61 successfully allocated for testing
    DILX testing started at: 13-JAN-1993 04:39:20
    Test will run for 60 minutes
    Type ^T (if running DILX through VCS) or ^G (in all other cases) to get a current performance summary
    Type ^C to terminate the DILX test prematurely
    Type ^Y to terminate DILX prematurely
    DILX Summary at 13-JAN-1993 04:41:39
    Test minutes remaining: 58, expired: 2
    Unit 12 Total IO Requests 8047 No errors detected
    Unit 21 Total IO Requests 15239 No errors detected
    Unit 61 Total IO Requests 19270 No errors detected
    Reuse Parameters (stop, continue, restart, change_unit) [stop] ?
    DILX Normal Termination
    HSJ>

6.2.10 Interpreting the DILX Performance Summaries

A DILX performance display is produced under the following conditions:
• When a specified performance summary interval elapses
• When DILX terminates for any conditions except an abort
• When Ctrl G
29. 2 1 10 Device Ports sey os opere Dep Ve E ato i epee ones 2 1 11 Cache Module i iL A EARS ONERE Oe GNE 2 1 11 1 Common Cache Functions o o 2 1 11 2 Read Cache Module 2 1 12 Host Interface ov s eL RE he 2 1 12 1 HSJ Series CI Interface oo o 2 1 12 2 HSD Series DSSI Interface 0 0 0 eee eee 2 1 12 3 HSZ Series SCSI 2 Interface o ooo ooo oo 2 2 HS Controller Firmware l xvii xxi Ll I 000000 JO ODOHOIA oL de oL oL o cb do oL d RA Poth Be Fo ne festa te Toga P9 C 4O0 C1C1CQ1C1CiC1 5 2 2 1 2 2 1 1 2 2 1 2 2 2 2 2 2 3 2 2 3 1 2 2 3 2 2 2 3 3 2 2 3 4 2 2 3 5 2 2 4 2 2 5 2 2 5 1 2 2 5 2 2 2 5 3 2 3 2 3 1 2 3 2 2 3 3 Core Functions Tests and Diagnostics 0 llle Executive Functions llle eee Host Interconnect Functions oooooooooorooo oo o Operator Interface and Subsystem Management Functions Command Line Interpreter llle Diagnostic Utility Protocol 0 llle HSZ Series Virtual Terminal o o Local Programs a ccc ccc eee eee nes Error Logging and Fault Management sens Device Services Value Added Functions 0 0000 cece eee ee eee eee RAID Failover Caching Addressing Storage Within the Subsystem
30. 31457-01
68-pin SCSI bus terminator 12-37004-03

A.2 Required Tools and Equipment

The following tools and equipment are required for controller maintenance:
• Portable antistatic kit (part number 29-26246-00)
• ESD mat for all module replacement service
• 3/32-inch Allen wrench for replacing HSJ-series controllers
• 5/32-inch Allen wrench for opening the front door of a SW800-series data center cabinet
• Flat-head screwdriver for replacing host cables (HSD-series controllers and HSZ controllers)
• Small flat-head screwdriver for replacing trilink connectors while SCSI cables are attached

An EIA-423 compatible terminal is needed for setting the initial configuration. When using this terminal, a connecting cable between the terminal and the controller that supports EIA-423 communication is required.

A-2 Field Replaceable Units

A.3 Related Field Replaceable Units

The following FRUs are related to the HS controllers. Refer to the appropriate StorageWorks documentation for removal and replacement procedures for these components if not found in this manual.

Table A-5 Controller-Related FRUs

FRU                                               Part Number
CI external cable (BLUE)                          17-01551-xx†
Controller shelf with backplane                   BA350-MA
Device shelf with backplane                       BA350-SB
Shelf power supply                                H7429-AA
NULL modem DECconnect laptop 9-pin cable          H8571-J
DECconnect cable                                  BC16E-xx†
SCSI-1 to SCSI-2 transition cable, 0.2 meter      17-03831-01
8 inch
31. 6 6 output messages 6 14 performance summary 6 27 running from maintenance terminal 6 6 running from VCS 6 6 running from virtual terminal 6 6 test definition questions 6 8 tests available 6 7 user defined test 6 8 using all defaults 6 22 using all functions 6 23 HSZ series abort codes 6 65 basic function test 6 51 data test patterns 6 62 defaults 6 54 deferred error display 6 62 defined 6 50 error codes 6 65 interrupting 6 51 output messages 6 58 performance summary 6 63 running from maintenance terminal 6 51 sense data display 6 61 test definition questions 6 53 tests available 6 51 user defined test 6 52 DIRECTORY command B 14 Disk in line exerciser See DILX DRAB See Shared memory DRAM See Shared memory DSSI cable service precautions 1 9 DSSI host cable 3 19 7 27 installing 7 29 length 3 19 removing 7 28 replacing 7 29 service of 7 27 service precautions 7 27 tools 7 27 DSSI host interconnection supported protocols 2 9 DSSI node number 4 5 4 6 7 10 7 16 DSSI trilink installing 7 29 removing 7 28 replacing 7 29 Index 14 Dual controller port 2 4 Dual data link See DDL Dual redundant controller and downtime 5 1 configuration 3 16 failover 2 4 2 12 5 1 7 3 7 42 initialization 4 1 installing one of 7 15 on separate hosts 4 8 7 17 7 47 removal of one 7 13 replacing one of 7 15 restoring parameters for one 7 16 service consideration 5 1 servic
32. [Figure residue: rotated VTDPY display; screen contents not recoverable]

Diagnostics, Exercisers, and Utilities 6-71

Figure 6-6 VTDPY Unit Cache Performance Display [rotated screen image; contents not recoverable]
33. ENTRY 19 kk ckckckckck ck kk kk kk k kk kk k kk kkkkkkkk kk EM EVENT INFORMATION EVENT CLASS ERROR EVENT OS EVENT TYPE 199 CAM SCSI SEQUENCE NUMBER 19 OPERATING SYSTEM DEC OSF 1 OCCURRED LOGGED ON Tue Mar 15 12 36 47 1994 OCCURRED ON SYSTEM dombek SYSTEM ID x0004000F CPU TYPE DEC CPU SUBTYPE KN15AA EE UNIT INFORMATION CLASS x0000 DISK SUBSYSTEM x0000 DISK BUS x000E x0392 LUN x2 TARGET x2 CAM STRING ROUTINE NAME cdisk check sense CAM STRING ROUTINE NAME cdisk check sense CAM STRING Hardware Error bad block number 0 mE CAM STRING ERROR TYPE Hard Error Detected CAM STRING DEVICE NAME DEC HS240 muc CAM STRING Active CCB at time of error pu CAM STRING CCB request completed with an error ERROR os std os type 11 std type 10 continued on next page E 2 HSZ Series Error Logging Example E 1 Cont The uerf utility Error Event Log MY ADDR CCB LENGTH FUNC CODE CAM_STATUS PATH ID TARGET ID TARGET LUN CAM FLAGS PDRV_PTR NEXT_CCB REQ_MAP VOID CAM_CBFCNP DATA_PTR DXFER_LEN SENSE_PTR SENSE_LEN CDB_LEN SGLIST_CNT CAM_SCSI_STATUS SENSE_RESID RESID CAM CDB IO CAM TIMEOUT MSGB LEN VU FLAGS TAG ACTION EROS CAM STRING x8A960728 x00C0 x01 x0084 CAM REQ CMP ERR AUTOSNS VALID 14 25 2 x00000442 CAM QUEUE ENABLE CAM DIR IN CAM SIM OFRZD
34. N/S N/S
DEC OSF/1 AXP N/S N/S V2.0
Supported with limitations.
Not supported at time of printing. Refer to your firmware release notes for updates to the list of operating system support.

Normal Operation 4-11

Although certain specifics regarding operating systems are covered here, you should refer to the StorageWorks Array Controllers HS Family of Array Controllers User's Guide for complete information on operating system support.

4.9.1 Controller Disks as System Initialization Disks (HSJ-series controllers)

HSJ-series controller disks as VAX 7000 and VAX 10000 initialization devices: HS operating firmware supports manual and automatic initialization for VAX 7000/10000 systems. For a disk drive connected to an HSJ-series controller to be both a VAX 7000/10000 manual and automatic initialization device, the following conditions must be met:
• VAX 7000/10000 console code must be at version V3.2 or higher.
• HS operating firmware must be at version V1.0B or higher.

Note: Contact Digital Multivendor Services if you need to upgrade to V3.2 or greater VAX 7000/10000 console code.

If your VAX 7000/10000 console code version is earlier than V3.2, you are limited to manual initialization. To manually initialize, perform the following steps:
1. Make sure that the disk drives attached to the HSJ-series controller are visible to the initialization driver by entering the SHOW DEVICE command repeatedly from
35. Replace Flags, Serial Number, Bad LB, Old RB, New RB, Cause
Instance x
Template Type x
Requestor Information Size x
Requestor Specific Data bytes (0-7) xx xx xx xx xx xx xx xx
Requestor Specific Data bytes (8-15) xx xx xx xx xx xx xx xx
continued on next page

6-20 Diagnostics, Exercisers, and Utilities

Example 6-5 (Cont.) Bad Block Replacement Attempt Error
Requestor Specific Data bytes xx xx xx xx xx xx xx xx xx xx

6.2.8 DILX Data Patterns

Table 6-2 defines the data patterns used with the DILX Basic Function or User Defined tests. There are 18 unique data patterns. These data patterns were selected as worst case, or the ones most likely to produce errors on disks connected to the controller.

Table 6-2 DILX Data Patterns

Pattern Number           Pattern (in hex)
1                        0000
2                        8B8B
3                        3333
4                        3091
5, shifting 1s           0001 0003 0007 000F 001F 003F 007F 00FF 01FF 03FF 07FF 0FFF 1FFF 3FFF 7FFF
6, shifting 0s           FFFE FFFC FFF8 FFF0 FFE0 FFC0 FF80 FF00 FE00 FC00 F800 F000 E000 C000 8000 0000
7, alternating 1s, 0s    0000 0000 0000 FFFF FFFF FFFF 0000 0000 FFFF FFFF 0000 FFFF 0000 FFFF 0000 FFFF
10 11 12 13 (ripple 1) 14 (ripple 0) 15 16 17 18
Default: Use all of the above patterns in a random method.
B6D9 5555 5555 5555 AAAA AAAA AAAA 5555 5555 AAAA AAAA 5555 AAAA 5555 AAAA 55
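The "shifting 1s" and "shifting 0s" entries in Table 6-2 follow a regular bit progression, so they can be regenerated rather than transcribed. The following Python is an illustrative sketch only, not DILX code. The function names are hypothetical, and it assumes 16-bit words, with the shifting-1s series running 0001 through 7FFF and the shifting-0s series running FFFE through 0000.

```python
def shifting_ones(width=16):
    """Words with 1..width-1 low-order bits set: 0001, 0003, ..., 7FFF."""
    return [(1 << n) - 1 for n in range(1, width)]


def shifting_zeros(width=16):
    """Words with 1..width low-order bits cleared: FFFE, FFFC, ..., 0000."""
    mask = (1 << width) - 1
    return [mask & ~((1 << n) - 1) for n in range(1, width + 1)]


def as_hex(words):
    """Format words as space-separated 4-digit uppercase hex."""
    return " ".join(f"{w:04X}" for w in words)


if __name__ == "__main__":
    print(as_hex(shifting_ones()))
    print(as_hex(shifting_zeros()))
```

Comparing the generated series against a transcribed table is a quick way to spot mistyped words in either direction.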
36. SHOW OTHER_CONTROLLER

Shows the other controller's information.

Note: This command is valid for HSJ and HSD controllers only.

Format
    SHOW OTHER_CONTROLLER

Description
    Shows all controller, port, and terminal information for the other controller.

Qualifiers
    FULL
    If the FULL qualifier is specified, additional information is displayed after the normal controller information.

Examples
1.  CLI> SHOW OTHER_CONTROLLER
    Controller:
        HSJ40 ZG313FF115 Software E140, Hardware 0000
        Configured for dual redundancy with 2630355555
        In dual redundant configuration
        SCSI address 6
    Host port:
        Node name: HSJ306, valid CI node 6, 32 max nodes
        System ID 420010061120
        Path A is ON
        Path B is ON
        MSCP allocation class 3
        TMSCP allocation class 3
    Cache:
        32 megabyte read cache, version 2

    The basic HSJ controller information.

Command Line Interpreter B-49
SHOW OTHER_CONTROLLER

2.  CLI> SHOW OTHER_CONTROLLER
    Controller:
        HSD30 2633400026 Software E140, Hardware 0000
        Configured for dual redundancy with CX40100000
        All devices failed over to this controller
        SCSI address 7
    Host port:
        Node name: HSD001, valid DSSI node 1
        Host path is ON
        MSCP allocation class 9
        TMSCP allocation class 9
    Cache:
        32 megabyte read cache, version 2

    The basic HSD controller information.

3.  CLI> SHOW OTHER_CONTROLLER FULL
    Controller:
        HSJ40 ZG313FF115 Software E140, Hardware 0000
        Configured for dual redundancy with 2630355555
37. The uerf utility Error Event Log 0 000 E 2 SW800 Series Data Center Cabinet o oooooooooo oo 1 2 SW500 Series Cabinet cece ee eee eens 1 3 Shelf Grounding Stud o 1 7 Program Card Eject Button o 1 8 HS Controller Common Hardware Block Diagram 2 2 HS Controller Operator Control Panel 2 3 HSJ Series CI Host Interface Hardware Block Diagram 2 6 HSD Series DSSI Host Interface Hardware Block Diagram 2 7 HSZ Series SCSI 2 Host Interface Hardware Block Diagram 2 7 Controller Storage Addressing 0 0 0 cece eee eens 2 14 Host Storage Addressing HSZ series o eee eee 2 15 SW800 Series Data Center Cabinet Loading 3 3 SW800 Series Data Center Cabinet Controller Storage 1 2 Tape Drive Locations 24 8 hase cde A hs UP E io 3 4 SW800 Series Data Center Cabinet Controller Storage 3 4 Tape Drive Locations mic ew ue eR bak ele 3 EUR Raed ee 3 5 xi xii PPPT a S qt dap ete EEP eae e E I ioo Does eremo po 3 Coe ea A 7 5 PPO HOO OOOO 00 1007 ROI C SW500 Series Cabinet Loading SW500 Series Cabinet Controller Storage Tape Drive Locations Single Extension from Device Shelf to Device Shelf Adjacent Devices on a Single Port oo Balanced Devices Within De
38. Tools Required 22h A dades 7 37 Precautions x22 A re dus due oa EA Da e ue bp EC 7 37 Power Supply Removal esee 7 37 Power Supply Replacement Installati0N ooooo o 7 38 Warm SWAP es toh ss qe trae PR a ipa o UE a 7 38 SBB Warm Swap llle ra 7 38 Tools Required Ax OMA REDIERE 7 38 PrecautiOns 23 gcc y HG a ta she ea MN UP RE RUE 7 39 Device Removal s s E oR da pev Ree pb eee pe a 7 39 Device Replacement eee eens 7 40 Restoring the Device to the Configuration 7 41 Controller Warm Swap HSJ Series Controllers 7 42 Tools Required ci it pc Bain Mops le Te a 7 42 Precautions a a 7 42 Controller Removal o ooo o 7 42 Controller Replacement 0 00 cece eee eee eens 7 44 Restoring Parameters see 7 45 A Field Replaceable Units A 1 A 2 A 3 Controller Field Replaceable Units o A 1 Required Tools and Equipment A 2 Related Field Replaceable Units 0 0 0 0 ce cece ee eee A 3 B Command Line Interpreter viii B 1 CEP Commands serra ed ia da ER Sx ET YE ERE B 1 ADD CDROM iD Bd as B 2 ADD DISK S Lote ee titer a e tente ENA B 3 ADD STRIPESET 1e ey ble eee aer do B 5 ADD TAR Co a I ERES B e AF B 6 ADD UNTE iei Ep loa eden Fl DIM RE Ms B 7 CLEAR ERRORS CLI essere B 11 DELETE contai
39. Typing a question mark (?) after a keyword causes the parser to provide a list of keywords or values that may follow the supplied keyword. The CLI is not case sensitive, so keywords may be entered in uppercase, lowercase, or mixed case.

Upon successful execution of a command other than HELP, the CLI is exited and the display is resumed. Entering a carriage return without a command also exits the CLI and resumes the display. If an error occurs in the command, the user prompts for command expansion help, or the HELP command is entered, the CLI prompts for an additional command instead of returning to the display.

6.5.1.3 How to Interpret the VTDPY Display Fields

This section describes the major fields in the VTDPY displays. Examples of the VTDPY screens are shown, followed by an explanation of each field of the screens.

Diagnostics, Exercisers, and Utilities 6-67

Figure 6-2 VTDPY Default Display for CI Controllers [rotated screen image; contents not recoverable]
40. UNKNOWN SUBCODE 0000 X MSLG Q CNT ID 00134534 01280001 UNIQUE IDENTIFIER 000100134534 X ASS STORAGE CONTROLLER ODEL 40 MSLGS B CNT SVR FF CONTROLLER SOFTWARE VERSION 255 MSLGSB_CNT_HVR 00 CONTROLLER HARDWARE REVISION 0 MSLGSW MULT UNT 0005 MSLG Q UNIT ID 00000001 02FF0000 UNIQUE IDENTIFIER 000000000001 X DISK CLASS DEVICE 166 MODEL 255 SLG B UNIT SVR 0B UNIT SOFTWARE VERSION 11 SLGS B UNIT HVR 0C UNIT HARDWARE REVISION 12 SLGS B LEVEL 01 SLG B_RETRY 00 SLGSL VOL SER 00001492 VOLUME SERIAL 5266 SLGSL HDR CODE 000659B6 LOGICAL BLOCK 416182 GOOD LOGICAL SECTOR continued on next page C 2 HSJ Series Error Logging Example C 1 Cont Disk Transfer Error Event Log CONTROLLER DEPENDENT INFORMATION LONGWORD 1 LO LO LO LO LO LO LO LO LO LO LO LO LO LO LO LO LO LO LO G O R R Dr Ze D 3 03094002 00003C51 00000000 000016D4 00000000 00030002 56415246 20205355 00000501 36325A52 20202020 29432820 43454420 20202020 31202020 100F0002A 59060004 00001686 01030000 000A8001 0 lt ech 1007 Jet E xd CNOT E head RZ26 0 DEC 1 6 LX 3 Less vw HSJ Series Error Logging C 3 The 32 bit instance code always appears in LONGWORD 1 of CONTROLLER DEPENDENT INFORMATION with the f
41. † Where xx equals the length in feet.

When using a TZ8x7, a transition cable must be routed between the TZ8x7 device and the SCSI-2 cable because the device is SCSI-1.

Field Replaceable Units A-3

B Command Line Interpreter

This appendix provides the following information:
• A comprehensive list of all CLI commands
• CLI error messages the operator may encounter
• Examples of some common CLI-based procedures

An overview of using the CLI, as well as a description of how to access and exit the CLI, is provided in Chapter 4.

B.1 CLI Commands

The following sections detail each of the allowable commands in the CLI, with required parameters and qualifiers. The defaults for each qualifier are indicated by a capital D in parentheses (D). Examples are given after the command format, parameters, description, and qualifiers.

Command Line Interpreter B-1

ADD CDROM

Adds a CDROM drive to the known list of CDROM drives.

Note: This command is valid for HSJ and HSD controllers only.

Format
    ADD CDROM container-name SCSI-location

Parameters
    container-name
    Specifies the name that will be used to refer to this CDROM drive. This name will be referred to when creating units and stripesets. The name must start with a letter (A through Z) and can then consist of up to eight more characters made up of A through Z, 0 through 9, period (.), dash (-), and underscore (_), for a tot
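The container-name rule quoted above (a leading letter followed by up to eight more characters from a small set) maps naturally onto a regular expression. The Python below is a minimal sketch for illustration, not part of the controller firmware. The function name is hypothetical, and accepting lowercase letters is an assumption rather than something the excerpt states for container names.

```python
import re

# Assumed reading of the rule: one leading letter, then up to eight more
# characters drawn from letters, digits, period, dash, and underscore.
# Lowercase letters are accepted here as an assumption.
_CONTAINER_NAME = re.compile(r"[A-Za-z][A-Za-z0-9._-]{0,8}")


def is_valid_container_name(name: str) -> bool:
    """Return True if name satisfies the container-name rule as read above."""
    return _CONTAINER_NAME.fullmatch(name) is not None


if __name__ == "__main__":
    for candidate in ("CDROM0", "0DISK", "A.B-C_D", "ABCDEFGHIJ"):
        print(candidate, is_valid_container_name(candidate))
```

Using `fullmatch` rather than `match` matters here, because a prefix match would accept names that are too long.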
42. contains the FX DMA Indirect List Pointer register DILP e Last Failure Parameter 2 contains the FX DMA Page Address register DADDR e Last Failure Parameter 3 contains the FX DMA Command and control register DCMD A processor interrupt was generated by the HSJ30 40 controller s XOR engine FX indicating an unrecoverable error condition e Last Failure Parameter 0 contains the FX Control and Status Register CSR e Last Failure Parameter 1 contains the FX DMA Indirect List Pointer register DILP e Last Failure Parameter 2 contains the FX DMA Page Address register DADDR e Last Failure Parameter 3 contains the FX DMA Command and control register DCMD The logical unit mapping type was detected invalid in va set disk geometry An invalid status was returned from CACHES LOOKUP LOCK e Last Failure Parameter 0 contains the DD address e Last Failure Parameter 1 contains the invalid status continued on next page Table C 34 Cont Value Added Services Last Failure Codes Code Description 02560102 02570102 025A0102 025B0102 025C0102 02620102 02690102 02720100 02730100 02790102 An invalid status was returned from CACHE LOOKUP_LOCK e Last Failure Parameter 0 contains the DD address e Last Failure Parameter 1 contains the invalid status An invalid status was returned from VA XFER during a operation e Last Failure Parameter 0 contains the DD
43. executes. Only part of the controller initialization diagnostics run directly from the program card. The remaining diagnostics, all functional code, and all utilities run from controller shared memory. Refer to the StorageWorks Array Controllers HS Family of Array Controllers User's Guide for information on controller I/O performance using HS operating firmware.

The HS operating firmware consists of five function areas:
• Core functions
• Host interconnect functions
• Operator interface and subsystem management
• Device services
• Value-added functions

These functions are discussed in the following sections.

2-8 Functional Description

2.2.1 Core Functions

HS operating firmware provides the following core functions, in the order they are executed after the controller is turned on:
1. Tests and diagnostics
2. Executive functions

2.2.1.1 Tests and Diagnostics

HS controller tests and diagnostics are integrated in a controller self-test procedure performed when the controller is switched on. The output of self-test is a simple go/no-go status of the controller subsystem. Self-test includes a test of the cache module. See Chapter 6 for additional self-test information.

2.2.1.2 Executive Functions

Executive functions act as the operating system kernel for the HS controller. The executive functions are common among the different controller models described in this manual. Executive functions control firmware execution with
44. last update interval.

6-92 Diagnostics, Exercisers, and Utilities

BlHit: This column shows the number of cached data blocks hit in the last update interval.

Diagnostics, Exercisers, and Utilities 6-93

Device Status

    PTL  ASWF Rq/S RdKB/S WrKB/S Que Tg CR BR TR
    D100 A 0 0 0 1 0 0 0 0
    D120 A 0 0 0 ju 5p d O
    D140 A 0 0 0 0 0 0 0 0
    D210 A 11 93 0 E de D d 8
    D230 A 0 0 0 0 0 0 0 0
    D300 A 11 93 0 2 X de Qe 4
    D310 A 0 0 0 t 0 0 0 0
    D320 A 36 247 Qo 12 10 0 0 0
    D400 A 11 93 0 2s 3 TD 8
    D410 A 0 0 0 0 0 0 0 0
    D420 A 36 247 0 10 8 0 0 0
    D430 A 0 0 0 0 0 0 0 0
    D440 A 0 0 0 0 0 0 0 0
    D450 AS 0 0 0 0 0 0 0
    D500 A 11 93 0 b ub 0 0 0
    D510 A 0 0 0 0 0 0 0 0
    D520 A 0 0 0 0 0 0 0 0
    D530 A 41 0 315 6 5 0 0 0

Description: This subdisplay shows the status of the physical storage devices that are known to the controller firmware. It also shows I/O performance information and bus statistics for these devices. Up to 42 devices may be displayed in this subdisplay.

The PTL column contains a letter indicating the type of device, followed by the SCSI Port, Target, and LUN of the device. The list is sorted by port, target, and LUN. The device type letters that may be displayed are as follows:
• D indicates a disk device
• T indicates a tape device
• L indicates a media loader
• C indicates a CDROM device
• F indicates a device type not listed abov
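The PTL field described above packs a device-type letter and three address digits into one token, which is easy to split apart programmatically. The Python below is an illustrative sketch rather than anything taken from VTDPY. The names `PTL` and `parse_ptl` are hypothetical, and it assumes exactly one decimal digit each for port, target, and LUN, as in sample entries such as D100 and D530.

```python
from typing import NamedTuple

# Device-type letters as listed in the description above.
DEVICE_TYPES = {
    "D": "disk",
    "T": "tape",
    "L": "media loader",
    "C": "CDROM",
    "F": "other",
}


class PTL(NamedTuple):
    device_type: str
    port: int
    target: int
    lun: int


def parse_ptl(field: str) -> PTL:
    """Split a token such as 'D320' into device type, port, target, and LUN."""
    text = field.strip().upper()
    digits = text[1:]
    if len(text) != 4 or text[0] not in DEVICE_TYPES:
        raise ValueError(f"not a PTL field: {field!r}")
    if not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"not a PTL field: {field!r}")
    return PTL(DEVICE_TYPES[text[0]], int(digits[0]), int(digits[1]), int(digits[2]))


def sort_key(field: str):
    """Sort key matching the documented order: port, then target, then LUN."""
    ptl = parse_ptl(field)
    return (ptl.port, ptl.target, ptl.lun)


if __name__ == "__main__":
    print(parse_ptl("D320"))
    print(sorted(["D400", "D120", "D310"], key=sort_key))
```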
45. o o ooooooo oo Controller Storage Addressing o Host Storage Addressing Host Storage Addressing HSZ series o oooo ooo oo 3 Configuration Rules and Restrictions 3 1 3 2 3 2 1 3 2 2 3 3 3 4 3 4 1 3 4 2 3 4 2 1 3 4 3 3 4 4 3 4 5 3 4 6 3 5 3 5 1 3 5 2 3 5 3 3 5 4 3 6 3 6 1 3 6 2 Ordering Considerations 0 o Cabinets SW800 Series Data Center Cabinet 0 000000 eens SW500 series C Shelves Device Placement abitielsz 33d RR Repo Rn ey alerts 3 inch SBB Restrictions lecce ee 54 inch SBB Restrictions llle Table Conventions lt lt o has 3 inch SBBs 5 inch SBBs Intermixing 5 inch and 3 inch SBBS Atypical Configurations o ooooooooooo Controllers Nonredundant Controllers 0 0 ccc cece eee eee Dual Redundant Controllers llle Optimal Performance Configuration 0 0 000 Optimal Availability Configuration oooooooooo oo Host Considerations cc ee es Host Cables Host Adapters 4 Normal Operation 4 1 4 1 1 4 1 2 4 1 3 4 2 4 3 4 3 1 4 3 2 Initialization Controller Initialization 0 0 0 ccc eens Dual Redundant Configuration Initialization Subsystem Initialization
46. soft errors are no longer displayed, but testing continues for the unit.

Enter IO queue depth (1:12) [4] ?
Explanation: Enter the maximum number of outstanding I/Os for each unit selected for testing. The default is 4.

Enter unit number to be tested ?
Explanation: Enter the unit number for the unit to be tested.

Note: When DILX asks for the unit number, it requires the number designator for the disk, where D117 would be specified as unit number 117.

Unit x will be write enabled. Do you still wish to add this unit (y/n) [n] ?
Explanation: This is a reminder of the consequences of testing a unit while it is write enabled. This is the last chance to back out of testing the displayed unit. Enter Y to write enable the unit. Enter N to back out of testing that unit.

6-54 Diagnostics, Exercisers, and Utilities

Select another unit (y/n) [n] ?
Explanation: Enter Y to select another unit for testing. Enter N to begin testing the units already selected. The system will display the following test selections:

    Available tests are:
    1. Basic Function
    2. User Defined Test
    Use the Basic Function 99.9% of the time. The User Defined test is for special problems only.

Enter test number (1:2) [1] ?
Explanation: Enter 1 for the Basic Function test or 2 for the User Defined test. After selecting a test, the system will then display the following messages:

    In the User Defined test, you may define up to 20 comm
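Two of the prompts above have simple input rules: the queue depth is an integer from 1 to 12 with a default of 4, and the unit number is the numeric part of a designator such as D117. The Python below sketches that validation for illustration only. It is not how DILX itself parses input, both function names are hypothetical, and treating an empty answer as "use the default" is an assumption.

```python
def queue_depth(answer: str, low: int = 1, high: int = 12, default: int = 4) -> int:
    """Interpret an answer to the IO queue depth prompt."""
    text = answer.strip()
    if not text:
        # Assumption: an empty reply selects the default shown in the prompt.
        return default
    value = int(text)
    if not low <= value <= high:
        raise ValueError(f"queue depth must be between {low} and {high}")
    return value


def unit_number(designator: str) -> int:
    """Convert a designator such as 'D117' (or plain '117') to unit number 117."""
    text = designator.strip().upper()
    if text.startswith("D"):
        text = text[1:]
    if not (text.isascii() and text.isdigit()):
        raise ValueError(f"not a unit designator: {designator!r}")
    return int(text)


if __name__ == "__main__":
    print(queue_depth(""), queue_depth("12"), unit_number("D117"))
```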
47. [Figure residue: rotated VTDPY display; screen contents not recoverable]

6-72 Diagnostics, Exercisers, and Utilities

Figure 6-7 VTDPY Brief CI Status Display [rotated screen image; contents not recoverable]
48. 0001 DISK MSCP MESSAGE SLGSL CMD REF 5B54001E SLGSW SEQ NUM 0039 SEQUENCE 57 SLG B FORMAT 00 CONTROLLER LOG SLG B FLAGS 00 UNRECOVERABLE ERROR SLGS W EVENT 01CA CONTROLLER ERROR POLICY PROCESS ERROR SLG Q CNT ID 00000021 01280001 UNIQUE IDENTIFIER 000100000021 X MASS STORAGE CONTROLLER MODEL 40 MSLGS B CNT SVR FF CONTROLLER SOFTWARE VERSION 255 MSLG B_CNT_HVR 00 CONTROLLER HARDWARE REVISION 0 CONTROLLER DEPENDENT INFORMATION LONGWORD 1 01010000 Fassa LONGWORD 2 044103CF 1 A LONGWORD 3 00000000 lanai LONGWORD 4 00470000 G LONGWORD 5 00000000 Fw LONGWORD 6 00020000 Fill LONGWORD 7 00000000 daten HSJ Series Error Logging C 125 Example C 4 shows the same ERF error log after running the command procedure notice the deskewed longwords Example C 4 ERF Error Log After Command Procedure VAX VMS SYSTEM ERROR REPORT KARA k koe kk eek ke ke kc e e kx x kx kx kk ENTRY ERROR SEQUENCE 2820 DATE TIME 16 MAR 1993 11 35 45 39 SYSTEM UPTIME 2 DAYS 22 48 03 SCS NODE CNOTE COMPILED 16 MAR 1993 12 30 07 PAGE 144 1 GGG IGE LOGGED ON SID 05903914 SYS_TYPE 00000000 VAX VMS V5 5 2 ERLSLOGMESSAGE ENTRY KA825 HW REV B PATCH REV 28 UCODE REV 20 BI NODE 2 I O SUB SYSTEM UNIT _HSJ402SDUA20 ESSAGE TYPE 0001 SLGSL CMD REF 5B54001E SLGSW SEQ NUM 0039 SLGSB FORMAT 00 SLGSB FLAGS 00 SLGSW EVENT 01CA SLGSQ
49. 5 4 7 6 7 6 1 Controller Module Diagnosing the Shutting Down Controller ii a ea aControler ccce a Nonredundant Controller ooo ooo oo Tools Required ccc eee ee ne eee Precautions Module Removal 0 0 0 cece ce eee tenes Module Replacement Installation Restoring Initial Parameters One Dual Redundant Controller o o Tools Required cin dd LED UE E ee Precautions Module Removal eee ee ene Module Replacement InstallatioN Restoring Initial Parameters Both Dual Redundant Controllers o Cache Module Tools Required Precautions Module Removal orreri inss llle Module Replacement Installation Upgrading Cache Modules 0 0 0 eect eee Program Card Tools Required Precautions Card Removal Card Replacement Installati0N o o ee ens External CI Cables Tools Required Precautions Cable Removal HSJ Series serue ti siaa e a hh ee eas Cable Replacement Installation aaaea Internal CI Cables HSJ series o ooo ooo o Tools Required Precautions Cable Removal Cable Replacement Installation 0 0 0 ccc cece eens DSSI Host Cables HSD series llle Tools Required
50. 59 0002 C 59 0003 C 59 0004 C 59 0005 C 59 0006 C 59 0007 C 59 0008 C 59 0009 C 59 000A C 59 000B C 59 Template Codes 01 C 22 05 C 25 11 C 27 12 C 29 C 55 13 C 31 14 C 34 31 C 36 32 C 38 33 C 40 41 C 43 51 C 45 57 C 47 61 C 50 71 C 52 Cold swap power supply 7 36 SO DS HM EOCOU Command line interpreter See CLI Commands ADD CDROM B 2 ADD DISK B 3 ADD STRIPESET B 5 ADD TAPE B 6 ADD UNIT B 7 CLEAR ERRORS CLI B 11 DELETE container name B 12 DELETE unit number B 13 DIRECTORY B 14 EXIT B 15 HELP B 16 Index 12 Commands cont d INITIALIZE B 17 LOCATE B 18 LOCATE CANCEL B 18 LOCATE DISKS B 18 LOCATE entity B 19 LOCATE PTL SCSI location B 18 LOCATE TAPES B 18 LOCATE UNITS B 18 RENAME B 20 RESTART OTHER CONTROLLER B 21 RESTART THIS CONTROLLER B 23 RUN B 25 SELFTEST OTHER CONTROLLER B 26 SELFTEST THIS CONTROLLER B 28 SET disk container name B 30 SET FAILOVER B 31 SET NOFAILOVER B 33 SET OTHER CONTROLLER B 34 SET stripeset container name B 37 SET THIS_CONTROLLER B 38 SET unit number B 41 SHOW cdrom container name B 45 SHOW CDROMS B 44 SHOW DEVICES B 46 SHOW disk container name B 48 SHOW DISKS B 47 SHOW OTHER_CONTROLLER B 49 SHOW STORAGESETS B 51 SHOW stripeset container name B 53 SHOW STRIPESETS B 52 SHOW tape container name B 55 SHOW TAPES B 54 SHOW THIS CONTROLLER B 56 SHOW unit number B 59 SHOW
51. 80 percent of the time. It is the first phase executed after the initial write pass has completed. It is re-executed at 10-minute intervals, with each cycle lasting approximately 8 minutes. Intervals are broken down into different cycles. The interval is repeated until the user-selected time interval expires.

Data Intensive: Designed to test disk throughput by selecting a starting LBN and repeating transfers to the next sequential LBN that has not been accessed by the previous I/O. The transfer size of each I/O equals the maximum sized I/O that is possible with the memory constraints DILX must run under. This phase continues performing spiraling I/O to sequential tracks. Read and write commands are issued in read/write mode. This phase is executed 20 percent of the time after the initial write pass has completed. This phase always executes after the random I/O phase. It is re-executed at 10-minute intervals, with each cycle approximately 2 minutes.

6.4.3.2 User Defined Test (DILX)

CAUTION: The User Defined test should be run only by very knowledgeable personnel. Otherwise, customer data can be destroyed.

When this test is selected, DILX prompts you for input to define a specific test. In the DILX User Defined test, a total of 20 or fewer I/O commands can be defined. Once all of the commands are issued, DILX issues the commands again in the same sequence. This is repeated until the selected time limit is reached. As you build the tes
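The phase timing described above amounts to a repeating 10-minute interval: roughly 8 minutes of random I/O followed by roughly 2 minutes of data-intensive I/O. The Python below is a simplified model of that schedule, offered as an illustration and not as DILX's actual scheduler. The function name is hypothetical, and it treats the "approximately" 8-minute and 2-minute cycles as exact.

```python
# Assumed exact cycle lengths; the manual gives them as approximate.
RANDOM_IO_MINUTES = 8
DATA_INTENSIVE_MINUTES = 2
INTERVAL_MINUTES = RANDOM_IO_MINUTES + DATA_INTENSIVE_MINUTES


def phase_at(minutes_since_write_pass: float) -> str:
    """Name the phase running a given time after the initial write pass ends."""
    if minutes_since_write_pass < 0:
        raise ValueError("time must not be negative")
    offset = minutes_since_write_pass % INTERVAL_MINUTES
    if offset < RANDOM_IO_MINUTES:
        return "random I/O"
    return "data intensive"


if __name__ == "__main__":
    for minute in (0, 5, 8, 9, 10, 18):
        print(minute, phase_at(minute))
```

Under this model the 80 percent and 20 percent figures fall out directly as 8/10 and 2/10 of each interval.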
52. 800E0100 810E0100 080E0101 402E0101 022E0102 033E0108 020F0100 021F0100 027F0100 028F0100 031F0100 402F0100 408F0100 600F0100 601F0100 603F0100 604F0100 605F0100 810F0100 080F0101 C 115 C 116 C 117 C 117 C 118 C 104 C 99 C 97 C 100 C 100 C 108 C 109 C 111 D 3 C 113 C 114 C 114 C 115 C 115 C 116 C 117 C 117 C 110 C 111 C 97 C 97 C 100 C 100 C 101 C 111 C 113 C 114 C 114 C 115 C 116 C 117 C 117 C 109 C 111 C 98 C 105 C 97 C 97 C 100 C 100 C 101 C 111 C 111 C 113 C 114 C 114 C 115 C 115 C 117 C 109 Codes Last Failure Codes firmware cont d 040F0102 033F0108 hardware 01800080 01812088 01822288 01832288 01842288 01852288 01860080 01870080 01880080 01890080 02392084 03330188 03350188 03360188 03380188 03420188 42332080 42382080 42392080 42442080 42452080 42472080 42482080 018A0080 034A2080 423A2080 023A2084 030B0188 423B2080 42302080 423D2080 424D2080 423E2080 424E2080 423F2080 Port Port Driver Message Operation Codes 0000 0001 0002 0003 0004 0005 0006 C 59 C 59 C 59 C 59 C 59 C 59 C 59 Recommended Repair Action Codes 00 01 02 03 04 05 06 07 C 120 C 120 C 120 C 120 C 120 C 120 C 120 C 120 C 108 C 105 C 93 C 94 C 94 C 95 C 95 C 96 C 96 C 96 C 96 C 96 C 98 C 102 C 102 C 103 C 104 C 106 C 112 C 112 C 112
53. CLD error report, then reset the controller.

Disk unit x does not exist.
Explanation: An attempt was made to allocate a unit for testing that does not exist on the controller.

Unit x successfully allocated for testing.
Explanation: All processes that DILX performs to allocate a unit for testing have been completed. The unit is ready for DILX testing.

6-14 Diagnostics, Exercisers, and Utilities

Unable to allocate unit.
Explanation: This message should be preceded by a reason why the unit could not be allocated for DILX testing.

DILX detected error, code x.
Explanation: The normal way DILX recognizes an error on a unit is through the reception of an EIP. This loosely corresponds to an MSCP error log. However, the following are some errors that DILX will detect without the reception of an EIP:

• Illegal Data Pattern Number found in data pattern header. Unit x. This is code 1. DILX read data from the disk and found that the data were not in a pattern that DILX previously wrote to the disk.
• No write buffers correspond to data pattern. Unit x. This is code 2. DILX read a legal data pattern from the disk at a place where DILX wrote to the disk, but DILX does not have any write buffers that correspond to the data pattern. Thus, the data have been corrupted.
• Read data do not match what DILX thought was written to the media. Unit x. This is code 3. DILX writes data to the disk and then reads it and compares it against
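When scanning a captured DILX session, it can help to expand the numeric code in a "DILX detected error" message into its meaning. The Python sketch below is a hypothetical helper, not part of DILX: the dictionary holds only the three codes named in this excerpt, and the names `DILX_DETECTED_ERRORS` and `describe_dilx_error` are assumptions.

```python
# Codes DILX can raise without receiving an EIP, as listed in this excerpt.
# Only codes 1 through 3 appear here; other codes may exist in the full manual.
DILX_DETECTED_ERRORS = {
    1: "Illegal Data Pattern Number found in data pattern header",
    2: "No write buffers correspond to data pattern",
    3: "Read data do not match what DILX thought was written to the media",
}


def describe_dilx_error(code: int, unit: int) -> str:
    """Return a one-line description for a DILX-detected error code."""
    text = DILX_DETECTED_ERRORS.get(code, "Code not listed in this excerpt")
    return f"Unit {unit}, code {code}: {text}"


if __name__ == "__main__":
    print(describe_dilx_error(2, 10))
```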
54. CNT ID 00000021 01280001 MSLGSB CNT SVR FF MSLGSB CNT HVR 00 CONTROLLER DEPENDENT INFORMATION LONGWORD 1 01010000 LONGWORD 2 044103CF LONGWORD 3 00000000 LONGWORD 4 00470000 LONGWORD 5 00000000 LONGWORD 6 00020000 LONGWORD 7 00000000 LONGWORD DESKEW C 126 HSJ Series Error Logging DIS SEQUENCE 57 CO CO POL UN AS OD TROLLER LOG UNRECOVERABLE E TROLLER ERRO ICY PROCESS QUE IDENTIFI S STORAGE CO EL 40 MSCP MESSAGE RROR R ERROR ER 000100000021 X TROLLER CONTROLLER SOFTWARE VERSION 255 CONTROLLER HARDWARE REVISION 0 continued on next page Example C 4 Cont ERF Error Log After Command Procedure LONGWORD 1 03CF0101 LONGWORD 2 00000441 LONGWORD 3 00000000 LONGWORD 4 00000047 LONGWORD 5 00000000 LONGWORD 6 00000002 HSJ Series Error Logging C 127 D HSD Series Error Logging This appendix details errors the HSD series controller will report in its host error logs under the OpenVMS operating system as well as how to extract the information from the logs Note Host error log translations are correct as of the date of publication of this manual However log information may change with firmware updates Refer to your Storage Works Array Controller Operating Firmware Release Notes for error log information updates D 1 Reading an HSD series Error Log You can interpret an HSD series error log the same way as an
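Comparing the controller-dependent longwords before and after the command procedure in Example C-4 suggests that deskewing amounts to shifting the longword stream by 16 bits. The Python sketch below is inferred only from those example values. It is not the DCL procedure supplied in Appendix C, and the function name `deskew` is an assumption.

```python
# Sketch of the 16-bit shift implied by Example C-4 (before vs. after values).
# Each deskewed longword takes its low half from the high half of the current
# skewed longword and its high half from the low half of the next one.
def deskew(longwords):
    """Return the deskewed longwords for a list of skewed 32-bit values."""
    result = []
    for current, following in zip(longwords, longwords[1:]):
        result.append(((following & 0xFFFF) << 16) | (current >> 16))
    return result


if __name__ == "__main__":
    skewed = [0x01010000, 0x044103CF, 0x00000000, 0x00470000,
              0x00000000, 0x00020000, 0x00000000]
    for value in deskew(skewed):
        print(f"{value:08X}")
```

Seven skewed longwords yield six deskewed ones, which matches the six longwords shown after the command procedure in the example.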
55. COM from OpenVMS software V5.5-2 dealing with devices is shown below. Change one element in the speed list (the 1 shown enclosed in a box) to a 4.

$speed_list="1,2,2,4,4,4,4,4,4,1,1,1,1,4,1,4,1,1,1,2"
$speed_list=speed_list+",4,4,4,2,2,1,1,1,1,2,4,1,1,1,1,1,1,4,4"
$speed_list=speed_list+",1,1,1,4,4,1,4,1,4,4,4,4,1,1,4,1,4,4,1,4"
$speed_list=speed_list+",4,4,1,1,4,4,2,1,1,1,4,1,1,1,4,4,4,4,4,4"
$speed_list=speed_list+",4,4,4,4,1,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4"
$speed_list=speed_list+",4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4,4"
$speed_list=speed_list+",4,4,4,4,4,4,4"
$diskspeed=1
$temp=F$GETDVI("sys$sysdevice","DEVTYPE")
$IF (temp .LE. 126) .AND. (temp .GE. 1) THEN diskspeed=F$ELEMENT(temp,",",speed_list)
$disksize=F$GETDVI("sys$sysdevice","MAXBLOCK")
$IF diskspeed .NE. 1 THEN GOTO getdata30

Normal Operation 4-13

3. Run the AUTOGEN program.

Completing this procedure causes the disk drives to be recognized as supported device types.

OpenVMS VAX V6.0

The AUTOGEN.COM DCL procedure does not support device types above 137, although HSX00 and HSX01 are properly defined in the speed list. To circumvent this problem, perform the following steps:

1. Make a copy of the AUTOGEN.COM DCL file in case restoration of the original state is required.
2. Edit the AUTOGEN COM f
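The lookup performed by the AUTOGEN.COM fragment can be restated outside DCL. The Python sketch below is an approximation for illustration: it uses only the first 20 values of the speed list, the names `disk_speed` and `SPEED_LIST_HEAD` are hypothetical, and returning the default for an index beyond the list is a simplification (DCL's F$ELEMENT returns the delimiter in that case).

```python
# First 20 entries of the speed list quoted in the excerpt (device types 0-19).
SPEED_LIST_HEAD = [1, 2, 2, 4, 4, 4, 4, 4, 4, 1, 1, 1, 1, 4, 1, 4, 1, 1, 1, 2]

DEFAULT_SPEED = 1          # mirrors "$diskspeed=1"
MAX_SUPPORTED_TYPE = 126   # mirrors the ".LE. 126" test in the V5.5-2 fragment


def disk_speed(device_type: int, speed_list=SPEED_LIST_HEAD) -> int:
    """Return the speed entry for a device type, or the default if out of range."""
    in_range = 1 <= device_type <= MAX_SUPPORTED_TYPE
    if in_range and device_type < len(speed_list):
        # F$ELEMENT numbers elements from zero, like a Python list index.
        return speed_list[device_type]
    return DEFAULT_SPEED


if __name__ == "__main__":
    for devtype in (0, 1, 3, 19, 200):
        print(devtype, disk_speed(devtype))
```

The sketch makes the failure mode visible: a device type above the tested limit silently keeps the default speed of 1, which is why the manual has you edit the list or the limit.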
56. Check indicated path cables for proper installation 63 Check the CI adapter on the host system identified in the remote node name field for proper operation Recommended Repair Action codes apply to each reportable event except those reported via the Disk Copy Data Correlation Event Log as identified by the value contained in the Repair Action subfield of the instance code field of the event logs described in Section C 2 For events reported via the Last Failure Event Log the Recommended Repair Action code is contained in the Repair Action subfield of the last failure code field of that event log Disk Copy Data Correlation Event Log Conditions The Recommended Repair Action Code assigned to the following conditions reported via a Disk Copy Data Correlation Event Log is 01 see Table C 51 e Subcommand Error subcode Destination Command Timed Out e Subcommand Error subcode Source Command Timed Out e Subcommand Error subcode Destination Inconsistent State cases C D E and F Controller Error subcode Local Connection Request Failed Insufficient Resources to Request Local Connection HSJ Series Error Logging C 121 Controller Error subcode Remote Connection Request Failed Insufficient Resources to Request Local Connection The Recommended Repair Action Code assigned to the following condition reported via a Disk Copy Data Correlation Event Log is 02 se
57. Codes MSCP Instance Event Code Code Description 01032002 012A Nonvolatile parameter memory component EDC check failed content of the component reset to default settings Table C 21 Backup Battery Failure Event Log Template 12 Instance MSCP Event Codes MSCP Instance Event Code Code Description 02032001 012A Journal SRAM backup battery failure detected during system restart The memory address field contains the starting physical address of the Journal SRAM 02042001 012A Journal SRAM backup battery failure detected during periodic check The memory address field contains the starting physical address of the Journal SRAM HSJ Series Error Logging C 79 Table C 22 Subsystem Built In Self Test Failure Event Log Template 13 Instance MSCP Event Codes MSCP Instance Event Code Code Description 82012002 020A An unrecoverable error was detected during execution of the NCR710 Subsystem Built In Self Test One of the ports on the controller module has failed some all of the attached storage is no longer accessible via this controller 82022202 020A An unrecoverable error was detected during execution of the Cache Memory DRAB Chip Subsystem Built In Self Test that rendered half of the cache memory unusable 82032202 020A An unrecoverable error was detected during execution of the Cache Memory DRAB Chip Subsystem Built In Self Test that rendered the entire cache memory unusable 82042002
58. Component Event Log Template T1955 NR NN Backup Battery Failure Event Log Template 12 Subsystem Built In Self Test Failure Event Log Template 13 Memory System Failure Event Log Template 14 CI Port Event Log Template 31 o CI Port Port Driver Event Log Template 32 CI System Communication Services Event Log Template 33 Device Services Nontransfer Error Event Log Template 41 Disk Transfer Error Event Log Template 51 Disk Bad Block Replacement Attempt Event Log Template A SMe ee ha bak Nes iim Ge di ae TREE Lik Tape Transfer Error Event Log Template 61 Media Loader Error Event Log Template 71 Disk Copy Data Correlation Event Log o ooooooooo C 3 Event Log Codes si ece p Reg DR ROPA AUI OV C 4 Event Notification Recovery Threshold o oooooooo ooo C 5 Recommended Repair Action o o o e C 6 Deskew Command Procedure 0 00 c eee eee eee eens D HSD Series Error Logging D 1 Reading an HSD series Error Log llle D 2 Event Log Formats ia sey de D 3 Event ioe Codes o 2 0 Sites a BS D 4 Recommended Repair Action 0 0 ee eee eens E HSZ Series Error Logging E 1 Reading an HSZ Series Error Log llle B 79 2O0000 PFEP AAN NNNO 00000D0D 0000 POO OOOO 0000 PRPRBRWOWWWNPD O16 O O O amp
59. Ctrl/G (print summary) request or a Ctrl/C (reuse parameters) request was entered before TILX started to test units. TILX cannot satisfy the second two requests, so TILX treats all of these requests as a termination request.

TILX will not change the state of a unit if it is not NORMAL.
Explanation: TILX cannot allocate the unit for testing because it is already in Maintenance mode. Maintenance mode can only be invoked by the firmware. If another TILX session is in use, the unit is considered in Maintenance mode.

Unit is not available - if you dismount the unit from the host, it may correct this problem.
Explanation: The unit has been placed on line by another user (or host), or the media is not present.

Diagnostics, Exercisers, and Utilities 6-39

Soft error reporting disabled. Unit x.
Explanation: This message indicates that the soft error limit has been reached and that no more soft errors will be printed for this unit.

Hard error limit reached, unit x dropped from testing.
Explanation: This message indicates that the hard error limit has been reached and the unit must be dropped from testing.

Soft error reporting disabled for controller errors.
Explanation: This indicates that the soft error limit has been reached for controller errors. Controller soft error reporting is disabled.

Hard error limit reached for controller errors. All units dropped from testing.
Explanation: This message is self-explanatory.

Unit is alread
60. D: Instruction/Data cache (on the controller module)
DRAB: Dynamic RAM Controller and Arbitration Engine (operates controller shared memory)
SRAM: Static RAM
ECC: Error Correction Code
EDC: Error Detection Code
NXM: Nonexistent Memory
Legend: lit continuously, flashing

Error Analysis and Fault Isolation 5-7

5.5 Device LEDs

The storage devices (SBBs) and their power supplies have LEDs to indicate power and status. You can use these LEDs in conjunction with the OCP indicators to isolate certain faults, as discussed in the following sections.

5.5.1 Storage SBB Status

Device shelves monitor the status of the storage SBBs. When a fault occurs, the fault and the SBB device address (SCSI target ID) are reported to the controller for processing. The SBB internal fault/identity bus controls the fault (lower) LED.

As shown in Figure 5-4, each storage SBB has two LED indicators that display the SBB's status. These LEDs have three states: on, off, and flashing.

• The upper LED (green) is the device activity LED and is on or flashing when the SBB is active.

CAUTION: Do not remove a storage SBB when the upper LED is on or flashing. This can cause the loss or corruption of data.

• The lower LED (amber) is the storage SBB fault LED and indicates an error condition when it is either on or flashing. When this LED indicates a fault, the amber controller OCP LED for the device's port will be lit continuously as well. You
61. Description of Error Action a EEHEHE EE 00 DAEMON hard error Replace controller module B BH E E E H 0 Repeated firmware bugcheck Replace controller module E E E EE EEL 02 NVMEM version mismatch Replace program card with later version gH BBE 03 NVMEM write error Replace controller module E EE EE OU E 04 NVMEM read error Replace controller module gH EE E HEO E O5 Inconsistent NVMEM structures RESET the controller repaired E HEHE HO DO 06 Mero Replace controller module E EE E EL 0 Bugcheck with no restart RESET the controller E BH J EE E E 08 NVMEM contents invalid Replace controller module B COOOOOCL 3F No program card seen Replace controller module O off lit continuously DAEMON Diagnostic and Execution Monitor NVMEM Nonvolatile Memory NMI Nonmaskable Interrupt La power failure or controller reset during an NVMEM update causes this error If the error occurs on one controller in a dual redundant configuration a configuration mismatch will probably occur upon restart Try the card in another module If the problem moves with the card replace the card If the problem does not move with the card replace the controller module 5 4 Error Analysis and Fault Isolation Figure 5 3 Flashing OCP Codes Reset 1 2 3 4 5 6 Description of Error Action 01 Program card EDC error Replace program card 04 Timer zero in the timer chip will run when Replace controller b i O N
62. Each of the following sections discusses specific TILX questions. The test questions are listed in the approximate order that they are displayed on your terminal. These questions prompt you to define the runtime parameters for TILX.

Note: Defaults for each question are given inside [ ]. If you press the Return key as a response to a question, the default is used as the response.

Use all defaults (y/n) [y]?
Explanation: Enter Y to use the defaults for TILX; most of the other TILX questions are then not asked. Enter N and the defaults are not used. You must then answer each question as it is displayed. The following defaults are assumed for all units selected for testing:

• Execution time limit: 10 minutes.
• Performance summary interval: 10 minutes.

Diagnostics, Exercisers, and Utilities 6-33

• Displaying performance statistics is disabled. Note: This does not include total I/O requests.
• Displaying hard/soft EIPs and end messages is disabled.
• Hard error limit: 65535. Testing will stop if the limit is reached.
• Hex dump of extended error log information is disabled.
• I/O queue depth: 4. A maximum of 4 I/Os will be outstanding at one time.
• Selected test: Basic Function test.
• The record count: 4096.
• All data patterns are used.
• Data compares are disabled.

Enter execution time limit in minutes (1:65535) [10]?
Explanation: Enter the desired time you want TILX to run. The
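The defaults listed above, together with the "press Return to accept the default" rule, can be captured in a small structure for scripting or documentation purposes. The Python sketch below is a hypothetical illustration: `TilxDefaults` and `parse_time_limit` are invented names, and the field names are assumptions; only the values and the 1 to 65535 range come from the text.

```python
from dataclasses import dataclass


@dataclass
class TilxDefaults:
    """Defaults TILX assumes when 'Use all defaults' is answered Y."""
    execution_time_limit_min: int = 10
    performance_summary_interval_min: int = 10
    display_performance_statistics: bool = False
    display_eips_and_end_messages: bool = False
    hard_error_limit: int = 65535
    hex_dump_extended_error_log: bool = False
    io_queue_depth: int = 4
    selected_test: str = "Basic Function"
    record_count: int = 4096
    use_all_data_patterns: bool = True
    data_compares: bool = False


def parse_time_limit(response: str, default: int = 10) -> int:
    """Interpret an answer to the execution time limit question."""
    text = response.strip()
    if text == "":
        return default  # pressing Return selects the default
    value = int(text)
    if not 1 <= value <= 65535:
        raise ValueError("execution time limit must be 1 through 65535 minutes")
    return value


if __name__ == "__main__":
    print(TilxDefaults())
    print(parse_time_limit(""), parse_time_limit("45"))
```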
63. Enables and disables write protection of the unit. When entering an ADD UNIT command, NOWRITE_PROTECT is the default.

Qualifiers for a unit created from a stripeset

MAXIMUM_CACHED_TRANSFER=n
MAXIMUM_CACHED_TRANSFER=32 (D)
Specifies the maximum size transfer in blocks to be cached by the controller. Any transfers over this size will not be cached. Valid values are 1 through 1024. When entering the ADD UNIT command, MAXIMUM_CACHED_TRANSFER=32 is the default.

READ_CACHE (D)
NOREAD_CACHE
Enables and disables the controller's read cache on this unit. When entering an ADD UNIT command, READ_CACHE is the default.

RUN (D)
NORUN
Enables and disables a unit's ability to be spun up. When RUN is specified, the devices that make up the unit will be spun up. If NORUN is specified, the unit will be spun down. When entering an ADD UNIT command, RUN is the default.

WRITE_PROTECT
NOWRITE_PROTECT (D)
Enables and disables write protection of the unit. When entering an ADD UNIT command, NOWRITE_PROTECT is the default.

Qualifiers for a unit created from a tape drive (HSJ and HSD only)

DEFAULT_FORMAT=format
DEFAULT_FORMAT=DEVICE_DEFAULT (D)
Specifies the tape format to be used unless overridden by the host. Note that not all devices support all formats. The easiest way to determine what formats are supported by a specific device is to enter SHOW tape-unit-number DEFAULT_FORMAT; the valid options will be displayed.

B 42 C
64. Format 31 0 command reference number sequence number reserved controller identifier reserved chvrsn csvrsn instance code reserved event time error id undefined ppd opcode instance code See Section C 2 1 for the description of this field The values that can be reported in this field for this event log are shown in Table C 25 templ See Section C 2 1 for the description of this field This field contains the value 32 for this event log tdisize See Section C 2 1 for the description of this field This field contains the value 10 for this event log HSJ Series Error Logging C 39 reserved offset 1E This field contains the value 0 event time See Section C 2 1 for the description of this field his status error id sre dst intopcd vestate ppd opcode See Section C 2 2 1 for the description of these fields undefined This field is only present to provide longword alignment its content is undefined Note that the content of certain of the fields described previously may be undefined depending on the value supplied in the instance code field See Table C 25 for more detail C 2 3 9 Cl System Communication Services Event Log Template 33 The HSJ30 40 controller Host Interconnect Services firmware component reports errors detected while performing work related to the CI System Communication Services SCS communication layer via the CI System Communication Services Event Log
65. HSJ series error log (Appendix C), with the following exceptions:

• Template type 31 does not exist for HSD series error logs.
• Template types 32 and 33 have changed, as shown in Table D-1.

Table D-1 Template Types

Description                                           Template             Longword Value   Deskewed Value
DSSI Port/Port Driver Event Log                       32 (footnotes 1, 2)  1032xxxx         00001032
DSSI System Communication Services Event Log          33 (footnotes 1, 2)  2C33xxxx         00002C33

The MSLG$B_FORMAT field for these templates will read "00 CONTROLLER LOG", so you may want to run the OpenVMS DCL command procedure provided at the end of Appendix C for deskewing the longwords.

HSD Series Error Logging D-1

D.2 Event Log Formats

In general, the event log formats for the HSD series controller are identical to those for the HSJ series. However, where the HSJ series uses CI to describe the host interface, the HSD series controller uses DSSI. For example, in the following table, the terms in the first column for HSJ series controllers translate to the terms in the second column for HSD series controllers. Be aware of this change in terminology as you use Appendix C to decode your error logs.

HSJ series term                                              HSD series term
CI Host Interconnect Services Common Event Log Fields       DSSI Host Interconnect Services Common Event Log Fields
CI source node address                                       DSSI source node address
CI destination node address                                  DSSI destination node address
CI Virtual Circuit State Codes                               DSSI Virtual Circuit State Codes
CI Port Port Driver Event Log
66. Host Storage Addressing Note The information in this section applies to all controllers However see Section 2 3 3 for additional specialized information on how a SCSI host addresses storage A typical host device interface consists of a number of host ports each connected to a bus containing devices From the host perspective the controller is one of these devices Functional Description 2 13 Figure 2 6 Controller Storage Addressing HOST INTERFACE CONTROLLER PORT ADDRESS SCSI SCSI BUS 2 BUS 6 OPTIONAL SCSIID SCSIID SCSIID SCSIID SCSIID SCSIID SCSIID CONTROLLER TARGET ADDRESS i CONTROLLER LUNO ruNo LUNO Luno f Luno LUNO LUNO LUN ADDRESS 1 6 5 4 3 2 1 0 DEVICE DEVICE DEVICE DEVICE DEVICE DEVICE DEVICE 6 5 4 3 2 1 0 PHYSICAL DEVICES CXO 3993A MC To support certain high level storage subsystem functions such as RAID the controller presents the entire physical device configuration from Figure 2 6 to the host as a group of host logical units A host logical unit often consists of storage space a storage set distributed throughout more than one physical device The controller presents these logical units to the host as individually addressable virtual devices You configure host logical units using the CLI 2 14 Functional Description Note Controller LUNs devices and host logical units may represent the same structure but only if you configure t
67. Host protocol firmware 2 9 Host storage explained 2 13 Host storage HSZ series explained 2 15 Host based volume shadowing See HBVS Hot swap power supply 7 36 HS controller models and error logging 5 16 host protocol 2 9 HS operating firmware See Firmware HSD30 specifications 1 9 HSJ30 specifications 1 9 HSJ40 specifications 1 9 HSZ40 specifications 1 9 HSZUTIL 2 10 4 11 6 100 I D cache 2 2 6 3 6 4 IBR 6 2 Initial boot record See IBR Initialization BIST 6 2 causes of 4 1 6 1 command 4 9 7 12 containers B 78 described 6 1 device port 6 3 dual redundant controller 4 1 4 17 failover 4 17 host port 6 3 nontransportable devices 4 17 subsystem 4 2 tests performed 6 1 time required 6 1 transportable devices 4 18 Initialization disk operating system 4 12 INITIALIZE command B 17 Installation blower 7 36 CI cable external 7 25 CI cable internal 7 26 device port cable 7 33 DSSI host cable 7 29 DSSI trilink 7 29 nonredundant controller 7 7 Index 16 Installation cont d one dual redundant controller 7 15 power supply 7 38 program card 7 22 read cache 7 19 SCSI cable device port 7 33 SCSI host cable 7 31 SCSI trilink 7 31 Instruction Data cache See 1 D cache Intel 80960CA chip 2 1 6 2 L Lamp test B 18 Local programs 2 10 LOCATE CANCEL command B 18 LOCATE command B 18 LOCATE DISKS command B 18 LOCATE entity B 19 LOCAT
68. If the OVERRIDE_ONLINE qualifier is specified, the controller will restart after all customer data is written to disk.

CAUTION: Customer data may be lost or corrupted if the OVERRIDE_ONLINE qualifier is specified.

NOOVERRIDE_ONLINE is the default.

Qualifiers for HSZ controllers

IGNORE_ERRORS
NOIGNORE_ERRORS (D)
If errors result when trying to write user data, the controller will not be restarted unless IGNORE_ERRORS is specified.

CAUTION: Customer data may be lost or corrupted if the IGNORE_ERRORS qualifier is specified.

NOIGNORE_ERRORS is the default.

IMMEDIATE
NOIMMEDIATE (D)
If IMMEDIATE is specified, the controller restarts immediately without checking for online devices.

CAUTION: Customer data may be lost or corrupted if the IMMEDIATE qualifier is specified.

NOIMMEDIATE is the default.

Examples

CLI> RESTART THIS_CONTROLLER
Restarts this controller as long as this controller does not have any units that are on line.

CLI> RESTART THIS_CONTROLLER OVERRIDE_ONLINE
Restarts this controller even if there are units on line to this controller.

B-24 Command Line Interpreter

RUN

Runs a diagnostic or utility on THIS_CONTROLLER.

Format
RUN program-name

Parameters
program-name
The name of the diagnostic or utility to be run. DILX and TILX are examples of utilities and diagnostics that can be run from the CLI.

Description
The RUN command enables various diagnostics and ut
69. In dual redundant configuration SCSI address 6 Host port Node name HSJ306 valid CI node 6 32 max nodes System ID 420010061120 Path A is ON Path B is ON MSCP allocation class 3 TMSCP allocation class 3 Cache 32 megabyte read cache version 2 Extended information Terminal speed 19200 baud eight bit no parity 1 stop bit Operation control 00000005 Security state code 41415 A full HSJ controller information listing B 50 Command Line Interpreter SHOW STORAGESETS SHOW STORAGESETS Shows storage sets and storage set information Format SHOW STORAGESETS Description The SHOW STORAGESETS command displays all the storage sets known by the controller storage set is any collection of containers such as stripesets Stripesets will be displayed first Qualifiers FULL If the FULL qualifier is specified additional amplifying information may be displayed after each storage set Examples d CLI gt SHOW STORAGESETS Name Storageset Uses Used by STi stripeset DISK500 D1 DISK510 DISK520 A basic listing of all storage sets 2 CLI gt SHOW STORAGESETS FULL Name Storageset Uses Used by STi stripeset DISK500 D1 DISK510 DISK520 CHUNKSIZE DEFAULT ST2 stripeset DISK400 D17 DISK410 DISK420 CHUNKSIZE DEFAULT A full listing of all storage sets Command Line Interpreter B 51 SHOW STRIPESETS SHOW STRIPESETS Shows stripesets and related stripeset information Format SHOW STRIPESETS Description
70. Initiator detected error message received 49 00 Invalid message error 4A 00 Command phase error 4B 00 Data phase error 4C 00 Logical unit failed self configuration 4E 00 Overlapped commands attempted 53 00 Media load or eject failed 53 02 Medium removal prevented 57 00 Unable to recover table of contents 5A 00 Operator request or state change input unspecified 5A 01 Operator medium removal request 5B 00 Log exception 5B 01 Threshold condition met 5B 02 Log counter at maximum 5B 03 Log list codes exhausted 63 00 End of user area encountered on this track 64 00 Illegal mode for this track 40 nn Diagnostic failure detected on component nn where nn identifies a specific target device component nn range 80 through FF Refer to documentation provided by the vendor of the target device for a description of the component identified by nn HSJ Series Error Logging Table C 16 SCSI ASC ASCQ Codes For Medium Changer Devices such as jukeboxes ASC ASCQ Code Code Description 00 00 No additional sense information 00 06 T O process terminated 02 00 No seek complete 04 00 Logical unit not ready cause not reportable 04 01 Logical unit is in process of becoming ready 04 02 Logical unit not ready initializing command required 04 03 Logical unit not ready manual intervention required 06 00 No reference position found 07 00 Multiple peripheral devices selected 08 00 Logical unit communication fail
71. MSCP Event Codes Instance Code MSCP Event Code Description 03D8450A 03D9450A 03DA450A 03DB450A 03DC450A 03DD450A 03DE450A 03DF450A 03E0450A 03E1450A C 88 HSJ Series Error Logging 00EB 00EB 00EB 00EB 00EB 00EB 00EB 00EB 00EB 00EB During device initialization the device reported the SCSI Sense Key ILLEGAL REQUEST Indicates that there was an illegal parameter in the command descriptor block or in the additional parameters supplied as data for some commands FORMAT UNIT SEARCH DATA and so forth If the target detects an invalid parameter in the command descriptor block then it shall terminate the command without altering the medium If the target detects an invalid parameter in the additional parameters supplied as data then the target may have already altered the medium This sense key may also indicate that an invalid IDENTIFY message was received During device initialization the device reported the SCSI Sense Key UNIT ATTENTION This indicates that the removable medium may have been changed or the target has been reset During device initialization the device reported the SCSI Sense Key DATA PROTECT This indicates that a command that reads or writes the medium was attempted on a block that is protected from this operation The read or write operation is not performed During device initialization the device reported the SCSI Sense Key BLANK CHECK This in
72. ON disabled module 05 Timer zero in the timer chip decrements Replace controller incorrectly module 06 Timer zero in the timer chip did not interrupt Replace controller the processor when requested module 07 Timer one in the timer chip decrements Replace controller incorrectly module 08 Timer one in the timer chip did not interrupt Replace controller the processor when requested module 09 Timer two in the timer chip decrements Replace controller incorrectly module NR A hm I oO 0 od ON A A NO O O0 A hl b i Of NN 1 Oo O ON O bi ON O N O OA Timer two in the timer chip did not interrupt Replace controller b i ON O X O the processor when requested module OB Memory failure in the I D cache Replace controller module OC No hit or miss to the I D cache when Replace controller expected module OD One or more bits in the diagnostic registers Replace controller did not match the expected reset value module OE Memory error in the nonvolatile journal Replace controller SRAM module OF Wrong image seen on program card Replace program card 10 At least one register in the controller DRAB Replace controller chip does not read as written module E EE H NH NH NH NH NH NH NH NH NH NH OH 0 0 0 rnnrnslIrunrmn rg ng np rm rm Lm rn A NO oo 0 0O O0 0 0 0 0 0 0g O O A A A 11 Main memory is fragmented into too many Replace controller sectio
73. PCB copy of the 710 DBC register Last Failure Parameter 3 contains the PCB copy of the 710 DNAD register Last Failure Parameter 4 contains the PCB copy of the 710 DSP register Last Failure Parameter 5 contains the PCB copy of the 710 DSPS register Last Failure Parameter 6 contains the PCB copies of the 710 SSTAT2 SSTAT1 SSTATO DSTAT registers Last Failure Parameter 7 contains the PCB copies of the 710 LCRC RESERVED ISTAT DFIFO registers Insufficient memory available for static structure allocation Insufficient memory available for static structure allocation DWDs exhausted Diagnostics report all NCR710s are broken C 106 HSJ Series Error Logging Table C 36 Fault Manager Last Failure Codes Code Description 04010101 04020102 04030102 04040103 04050100 04060100 04070103 04080102 04090100 The requestor ID component of the instance code passed to FM REPORT_ EVENT is larger than the maximum allowed for this environment Last Failure Parameter 0 contains the instance code value The requestor s error table index passed to FM REPORT_EVENT is larger than the maximum allowed for this requestor e Last Failure Parameter 0 contains the instance code value e Last Failure Parameter 1 contains the requester error table index value The USB index supplied in the EIP is larger than the maximum number of USBs e Last Failure Parameter 0 contains the instance code value
74. Performance Summaries

A DILX performance display is produced under the following conditions:

• When a specified performance summary interval elapses
• When DILX terminates for any conditions except an abort
• When Ctrl/G or Ctrl/T is entered

The performance display has different formats depending on whether or not performance statistics are requested in the user-specified parameters and if errors are detected.

The following is an example of a DILX performance display where performance statistics were not selected and where no errors were detected:

DILX Summary at 18-JUN-1993 06:18:41
Test minutes remaining: 0, expired: 6
Unit 1 Total IO Requests 482
No errors detected
Unit 2 Total IO Requests 490
No errors detected

The following is an example of a DILX performance display where performance statistics were selected and no errors were detected:

DILX Summary at 18-JUN-1993 06:18:41
Test minutes remaining: 0, expired: 6
Unit 1 Total IO Requests 482
Read Count 292 Write Count 168
KB xfer Read 7223 Write 4981 Total 12204
No errors detected

The following is an example of a DILX performance display where performance statistics were not selected and where errors were detected on a unit under test:

DILX Summary at 18-JUN-1993 06:18:41
Test minutes remaining: 0, expired: 6
Unit 10 Total IO Requests 153259
No errors detected
Unit 40 Total IO Requests 2161368
Err in Hex: IC 031A4002 PTL 04/00/00 Key 04 ASC/Q B0/00 HC 0 SC 1
Tota
75. RESTART OTHER CONTROLLER Refer to Chapter 4 for important information about VMS node names 7 46 Removing and Replacing Field Replaceable Units 10 11 12 Enter the following command to verify the preceding parameters were set CLI gt SHOW OTHER_CONTROLLER FULL Connect the host port cable to the front of the new controller Do not connect the controllers in a dual redundant pair to separate different host CPUs Enter the following commands to enable CI paths A and B to the host CLI gt SET OTHER_CONTROLLER PATH_A CLI gt SET OTHER_CONTROLLER PATH_B If you wish you may disconnect the maintenance terminal The terminal is not required for normal controller operation Close and lock the cabinet doors SW800 series using a 5 32 inch Allen wrench Removing and Replacing Field Replaceable Units 7 47 A Field Replaceable Units This appendix lists HS controller field replaceable units FRUs required tools and equipment and related FRUs A 1 Controller Field Replaceable Units The following FRUs come with the various controller modules Part numbers are correct as of publication of this manual but are subject to change Always verify your information in case part numbers or ordering methods have changed Table A 1 HSJ40 FRUs FRU Part Number HSJ40 CI SCSI controller module 70 30097 01 including OCP and bezel 16 MB read cache module 54 22229 02 discontinued Version 1 32 MB rea
76. SBB 7-38
Warm swap (cont'd)
  storage device 7-38
Write history log 4-14
77. SCS Accept Request
407C0100  SCS command timeout unexpectedly inactive during SCS Reject Request
408E0100  Message receive queue count disagrees with HTBs on the queue
408F0100  Unrecognized HTB ID type
40900100  htb_id type not DG when attempting to xmit DG HTB
40930100  Message receive queue count disagrees with HTBs on the queue
40950100  Create transfer request with 0-byte count
40960100  Create transfer request with 0-byte count
40970100  Create transfer request with 0-byte count
40980100  Create transfer request with 0-byte count
409C0100  Illegal return value from HIS MAP
409D0100  Illegal return value from HIS MAP
40B40101  Invalid value in max_nodes field of se_params structure
          - Last Failure Parameter 0 contains the max_nodes field value

Table C-42 Host Interconnect Port Services Last Failure Codes
Code      Description
42000100  Cmpl_main routine found invalid port transmit status
42020100  Cannot start timer
42030100  Cannot restart work timer
42060100  HP_INIT could not allocate initial buffers
420B0100  HP_INIT could not allocate initial bufs for path a dl_ctl table
42332080  Receive_main found destination address in the rcv packet does not match node address
42340100  HP could not allocate buffe
42382080
42392080
423A2080
423B2080
423C2080
423D2080
423E2080
423F2080
42442080
42452080
42472080
42482080
424B0001
424C0001
424D2080
424E2080
78. STRIPE0
STRIPE0 is deleted from the known list of containers.

DELETE unit-number

Deletes a unit from the list of known units.

Format
  DELETE unit-number

Parameters
  unit-number
  Specifies the logical unit number that is to be deleted (on HSDs and HSJs, D0-D4094 or T0-T4094; on HSZs, D0-D7 or T0-T7). This is the name given the unit when it was created using the ADD UNIT command.

Description
  If the logical unit specified is on line to a host, the unit will not be deleted unless the OVERRIDE_ONLINE qualifier is specified. If any errors occur when trying to flush the user data, the logical unit will not be deleted.

Qualifiers for HSD and HSJ controllers
  OVERRIDE_ONLINE
  NOOVERRIDE_ONLINE (D)
  If the logical unit is on line to the controller, it will not be deleted unless OVERRIDE_ONLINE is specified. If the OVERRIDE_ONLINE qualifier is specified, the unit will be spun down, the user data will be flushed to disk, and the logical unit will be deleted.

  CAUTION: Customer data may be lost or corrupted if the OVERRIDE_ONLINE qualifier is specified.

  NOOVERRIDE_ONLINE is the default.

Examples
  CLI> DELETE D12
  Disk unit number 12 is deleted from the known list of units.

  CLI> DELETE T3 OVERRIDE_ONLINE
  Tape unit number 3 is deleted from the known list of units, even if it is currently on line to a host.

DIRECTORY
DIREC
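The unit-number ranges given for DELETE lend themselves to a quick pre-check before a command is typed at the CLI. The following is a minimal sketch, assuming only the ranges stated in the parameter description (D0-D4094 or T0-T4094 on HSDs and HSJs, D0-D7 or T0-T7 on HSZs); the function name and the controller-family keys are hypothetical and are not part of the controller firmware.

```python
import re

# Highest unit number per controller family, taken from the DELETE
# parameter description. The dictionary keys are illustrative labels.
MAX_UNIT = {"HSD": 4094, "HSJ": 4094, "HSZ": 7}


def is_valid_unit(name: str, controller: str) -> bool:
    """Return True if name is Dn or Tn within the family's range."""
    match = re.fullmatch(r"([DT])(\d+)", name.upper())
    if match is None or controller not in MAX_UNIT:
        return False
    return int(match.group(2)) <= MAX_UNIT[controller]


print(is_valid_unit("D12", "HSJ"))  # unit used in the first DELETE example
print(is_valid_unit("D8", "HSZ"))   # outside D0-D7
```

A check like this only confirms that the name is well formed; whether the unit exists, or is on line to a host, is still decided by the controller.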
79. This field contains the value 0.

event time
  See Section C.2.1 for the description of this field.

his status, error id, src, dst, intopcd, vcstate, ppd opcode, scs opcode
  See Section C.2.2.1 for the description of these fields.

connection id, remote node name
  See Section C.2.2.2 for the description of these fields.

remote connection id
  The remote connection identifier supplied by the host node.

received connection id
  The connection identifier of the System Application (SYSAP) that is receiving the message contained in the Host Transaction Block.

send connection id
  The connection identifier of the System Application (SYSAP) that is sending the message contained in the Host Transaction Block.

connection state
  The connection state code, as shown in Table C-8.

undefined
  This field is only present to provide longword alignment; its content is undefined.

Note that the content of certain of the fields described previously may be undefined depending on the value supplied in the instance code field. See Table C-26 for more detail.

C.2.3.10 Device Services Nontransfer Error Event Log (Template 41)

The HSJ30/40 controller Device Services firmware component reports errors detected while performing nontransfer work related to disk, tape, or media loader device operations via the Device Services Nontransfer Event Log. If the error is associated with a command issued by a host system t
80.
$in_done:
$ close inf
$ if ctrl_inp
$ then
$   gosub convert_longs
$ endif
$ close ouf
$ exit
$convert_longs:
$ index = 1
$ write ouf <FF>
$ write ouf

Example C-2 (Cont.) Deskew Command Procedure Example

$ write ouf
$ write ouf
$ write ouf LONGWORD DESKEW
$ write ouf
$ write ouf
$convert_longs_loop:
$ len = f$length(lw_string)
$ if len .le. 4 then goto convert_longs_done
$ lw = f$extract(len-8, 8, lw_string)
$ write ouf LONGWORD index lw
$ lw_string = f$extract(0, len-8, lw_string)
$ index = index + 1
$ goto convert_longs_loop
$convert_longs_done:
$ write ouf <FF>
$ return

Example C-3 shows an ERF error log before running the command procedure.

Example C-3 ERF Error Log Before Command Procedure

VAX/VMS SYSTEM ERROR REPORT COMPILED 16-MAR-1993 12:30:07 PAGE 144
******************************* ENTRY *******************************
ERROR SEQUENCE 2820 LOGGED ON SID 05903914
DATE/TIME 16-MAR-1993 11:35:45.39 SYS_TYPE 00000000
SYSTEM UPTIME 2 DAYS 22:48:03
SCS NODE CNOTE VAX/VMS V5.5-2
ERL$LOGMESSAGE ENTRY KA825 HW REV B PATCH REV 28 UCODE REV 20 BI NODE 2
I/O SUB-SYSTEM UNIT _HSJ402$DUA20
MESSAGE TYPE
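The convert_longs loop in the deskew procedure peels eight-character longwords off the end of a hexadecimal string and numbers them from 1. The following is a rough Python equivalent, offered as a sketch only: the stop condition (length of 4 or less) is read from the damaged listing and may not match the original procedure exactly, and the function name is hypothetical.

```python
def split_longwords(lw_string: str) -> list[tuple[int, str]]:
    """Return (index, longword) pairs, taking 8 hex digits from the right.

    Mirrors the loop in the listing: stop once 4 or fewer characters
    remain (threshold as read from the garbled source, an assumption).
    """
    longwords = []
    index = 1
    while len(lw_string) > 4:
        longwords.append((index, lw_string[-8:]))  # last 8 characters
        lw_string = lw_string[:-8]                 # drop them and repeat
        index += 1
    return longwords


for index, lw in split_longwords("0123456789ABCDEF"):
    print("LONGWORD", index, lw)
```

The rightmost eight digits come out first, which matches the order in which the loop writes its LONGWORD lines.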
81. Yes

†The dual-redundant controller configuration supports up to six devices per port. Nonredundant configurations support up to seven devices per port, but this sacrifices a convenient upgrade to high availability and redundant backup power options.
On the same or different ports.

1.2 Maintenance Strategy

Maintain the HS controller subsystem by removing and replacing field replaceable units (FRUs) as necessary. Chapter 7 contains FRU removal and replacement procedures. See Appendix A for a list of FRUs and FRU part numbers.

Note: Do not attempt to replace or repair components within field replaceable units (FRUs). Use the controller internal diagnostics and error logs to isolate FRU-level failures.

1.3 Maintenance Features

The HS controllers have the following features to aid in troubleshooting and maintenance:

- Initialization diagnostics
  Various levels of initialization diagnostics execute on the controller. These tests ensure that the subsystem is ready to come on line after it has been reset, powered on, and so forth. You can elect to rerun many of the diagnostics even after initialization completes, in order to test the controller operation. See Chapter 6 for more information about the controller initialization diagnostics.

- Utilities
  You can run the VTDPY utility to display current controller state and performance data, including processor utilization, host po
82. a VCS because it will cause VCS to terminate. VCS acts on the sequence, and the sequence is never sent to DILX. Use Ctrl/T when invoking DILX from a VCS.

- Ctrl/G causes DILX to produce a performance summary. DILX continues normal execution without affecting the runtime parameters.
- Ctrl/C causes DILX to produce a performance summary, stop testing, and ask the reuse parameters question.
- Ctrl/Y causes DILX to abort. The reuse parameters question is not asked.
- Ctrl/T causes DILX to produce a performance summary. DILX then continues executing normally without affecting any of the runtime parameters.

6.2.3 DILX Tests

There are two DILX tests, as follows:
- The Basic Function test
- The User-Defined test

6.2.3.1 Basic Function Test (DILX)

The Basic Function test for DILX executes in three or four phases. The four phases are as follows:

- Initial Write Pass: This is the only optional phase and is always executed first (if selected). The initial write pass writes the selected data patterns to the entire specified data space, or until the DILX execution time limit has been reached. Once the initial write pass has completed, it is not re-executed, no matter how long the DILX execution time is set. The other phases are re-executed on a 10-minute cycle.
- Random I/O: Simulates typical I/O activity with random transfers from one byte to the maximum size I/O possible with the memory constraints D
83. a shelf blower or a power supply is not functioning properly.
- When the lower LED is off, either there is an input power problem or the power supply is not functioning.

Figure 5-5 Power Supply LEDs (callouts: shelf status LED, power supply status LED)

For a detailed explanation of the power supply LED codes, see Tables 5-2 and 5-3.

Table 5-2 Shelf and Single Power Supply Status LEDs
Status LED       State   Indication
Shelf (upper)    On      System is operating normally.
PS (lower)       On

Shelf (upper)    Off     Fault status. There is a shelf fault; there is no power
PS (lower)       On      supply fault. Replace blower as described in Chapter 7.

Shelf (upper)    Off     Fault status. Shelf and power supply fault. Replace power
PS (lower)       Off     supply as described in Chapter 7.

Note: When a shelf has two power supplies, you must observe the LEDs on both power supplies to determine the status (see Table 5-3).

Table 5-3 Shelf and Dual Power Supply Status LEDs
Status LED       PS1†    PS2     Indication
Shelf (upper)    On      On      Normal status. System is operating normally.
PS (lower)       On      On

Shelf (upper)    Off     Off     Fault status. There is a shelf fault; there is no power
PS (lower)       On      On      supply fault. Replace blower as described in Chapter 7.

Shelf (upper)    Off     Off     Fault status. PS1 is operational. Replace PS2 as
PS (lower)       On      Off     described in Chapter 7.

Shel
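The single power supply case in Table 5-2 is small enough to express as a lookup table, which can be handy in a checklist or inspection script. The following is a minimal sketch that transcribes only the three rows shown in Table 5-2; the function name is hypothetical, and the combination of shelf LED on with power supply LED off is reported as unlisted because the table does not cover it.

```python
# Keys are (shelf_led_on, power_supply_led_on), transcribed from Table 5-2.
SINGLE_PS_STATUS = {
    (True, True): "System is operating normally",
    (False, True): "Shelf fault, no power supply fault: replace blower (Chapter 7)",
    (False, False): "Shelf and power supply fault: replace power supply (Chapter 7)",
}


def diagnose(shelf_led_on: bool, ps_led_on: bool) -> str:
    """Return the Table 5-2 indication for a single-power-supply shelf."""
    return SINGLE_PS_STATUS.get(
        (shelf_led_on, ps_led_on), "Combination not listed in Table 5-2"
    )


print(diagnose(False, True))
```

A shelf with two power supplies needs the LEDs of both supplies, so this lookup does not apply to Table 5-3.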
84. actually used.

The Typ column lists the thread type. The following thread types may be displayed:
- FNC: Functional thread. Those threads that are started when the controller boots and never exit.
- DUP: DUP Local Program threads. These threads are only active when run either from a DUP connection or through the command line interpreter's RUN command.
- NULL: The NULL thread does not have a thread type because it is a special type of thread that only executes when no other thread is executable.

The Sta column lists the current thread state. The following thread states may be displayed:
- Bl: The thread is blocked waiting for timer expiration, resources, or a synchronization event.
- Io: A DUP Local Program is blocked waiting for terminal I/O completion.
- Rn: The thread is currently executable.

The CPU% column lists the percentage of execution time credited to each thread since the last screen update. The values may not add up to exactly 100 percent due to both rounding errors and the fact that there may not be enough room to display all of the threads. An unexpected amount of time may be credited to some threads because the controller's firmware architecture allows code from one thread to execute in the context of another thread without a context switch.

Table 6-13 describes the processes that may be displayed in the active thread display.

Note: It is possib
85. address
- Last Failure Parameter 1 contains the invalid status

An invalid status was returned from CACHE LOOKUP LOCK.
- Last Failure Parameter 0 contains the DD address
- Last Failure Parameter 1 contains the invalid status

An invalid mapping type was specified for a logical unit.
- Last Failure Parameter 0 contains the USB address
- Last Failure Parameter 1 contains the Unit Mapping Type

An invalid mapping type was specified for a logical unit.
- Last Failure Parameter 0 contains the USB address
- Last Failure Parameter 1 contains the Unit Mapping Type

An invalid status was returned from CACHE LOOKUP_LOCK.
- Last Failure Parameter 0 contains the DD address
- Last Failure Parameter 1 contains the invalid status

An invalid status was returned from CACHE OFFER WRITE DATA.
- Last Failure Parameter 0 contains the DD address
- Last Failure Parameter 1 contains the invalid status

A request was made to read a device metadata block with an invalid block type.

A request was made to write a device metadata block with an invalid block type.

An invalid status was returned from VA XFER in a complex read operation.
- Last Failure Parameter 0 contains the DD address
- Last Failure Parameter 1 contains the invalid status

Table C-34 (Cont.) Value Added Services Last Failure Codes
Code      Description
027B0102
027D0100
027E0
86. an overview of how to get help.

Format
  HELP

Description
  The HELP command displays a brief description of how to use the question mark (?) to obtain help on any command or function of the CLI.

Examples
  CLI> HELP
  Help may be requested by typing a question mark (?) at the CLI prompt. This will display a list of all available commands. For further information you may enter a partial command and type a space followed by a "?" to print a list of all available options at that point in the command. For example:
    SET THIS_CONTROLLER ?
  Will print a list of all legal SET THIS_CONTROLLER commands

  Displaying help using the HELP command.

  CLI> SET ?
  Your options are:
    FAILOVER
    OTHER_CONTROLLER
    NOFAILOVER
    THIS_CONTROLLER
    Unit number or container name

  Obtaining help on the SET command using the "?" facility.

INITIALIZE

Initializes the metadata on the container specified.

Format
  INITIALIZE container-name

Parameters
  container-name
  Specifies the container name to initialize.

Description
  The INITIALIZE command initializes a container so a logical unit may be created from it. When initializing a single disk drive container, if NOTRANSPORTABLE was specified or allowed to default on the ADD DISK or SET disk-name commands, a small amount of disk space is made inaccessible to the host and used for metadata. The metadata will be initialized. If TRANSPORTABLE was specified
87. any metadata will be destroyed on the device and the full device will be accessible to the host.

CAUTION: The INITIALIZE command destroys all customer data on the container.

When an initialize is required:
- When a unit is going to be created from a newly installed disk
- When a unit is going to be created from a newly created storage set (stripeset)

When an initialize is specifically not required:
- When a unit has been deleted and a new unit is going to be created from the same container
- When a storage set that was initialized in the past has been deleted, then re-added using the same members as before

Examples
  CLI> INITIALIZE DISK0
  Container DISK0 is initialized. If NOTRANSPORTABLE was specified or allowed to default, metadata is written on it.

  CLI> INITIALIZE STRIPE0
  Container STRIPE0 is initialized and metadata is written on it.

LOCATE

Locates devices (disks, tapes, and storage sets) by lighting the amber device fault LED on the StorageWorks building block (SBB).

Format
  LOCATE

Description
  The LOCATE command illuminates the amber device fault LEDs (the lower LED on the front of an SBB) of the containers specified. The LOCATE command can also be used as a lamp test.

Qualifiers
  ALL
  The LOCATE ALL command turns on the amber device fault LEDs of all configured devices. This qualifier can also be used as a lamp test. See LOCATE CANCEL to turn
88. are used in write commands. This question is displayed when writes are enabled for the Basic Function or User-Defined tests. There are 18 unique data patterns to select from. These patterns were carefully selected as worst case, or most likely to produce errors, for disks connected to the controller. See Table 6-2 for a list of data patterns. The default uses all 18 patterns in a random method. This question also allows you to create a unique data pattern of your own choice.

Enter the 8-digit hexadecimal user defined data pattern ?
Explanation: This question is only displayed if you choose to use a user-defined data pattern for write commands. The data pattern is represented in a longword and can be specified with eight hexadecimal digits.

Enter start block number (0:highest lbn on the disk) [0] ?
Explanation: Enter the starting block number of the area on the disk you wish DILX to test. Zero is the default.

Enter end block number (starting lbn:highest lbn on the disk) [highest lbn on the disk] ?
Explanation: Enter the highest block number of the area on the disk you wish DILX to test. The highest block number of that type of disk is the default.

Perform data compare (y/n) [n] ?
Explanation: Enter Y to enable the use of the compare modifier bit with read and write commands. Enter N and no data compare operations are done. This question only applies to the Basic Function test. If the compare modifier is set on wri
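The constraints stated for these questions (a data pattern of exactly eight hexadecimal digits that fits in a longword, and a block range that runs from the start block to the highest block on the disk) can be checked before a test is started. The following is a minimal sketch of such checks, assuming nothing beyond the rules in the explanations; both function names are hypothetical and do not come from DILX.

```python
import re


def parse_data_pattern(text: str) -> int:
    """Convert an 8-digit hexadecimal pattern to its 32-bit value."""
    if re.fullmatch(r"[0-9A-Fa-f]{8}", text) is None:
        raise ValueError("expected exactly eight hexadecimal digits")
    return int(text, 16)


def block_range_ok(start: int, end: int, highest_lbn: int) -> bool:
    """True if 0 <= start <= end <= highest block number on the disk."""
    return 0 <= start <= end <= highest_lbn


print(hex(parse_data_pattern("DEADBEEF")))
print(block_range_ok(0, 1000, 2000))
```

Eight hexadecimal digits always fit in a 32-bit longword, so no separate range check on the pattern value is needed.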
89. ascq fields are undefined.

The SWAP interrupt from the shelf indicated by the port field cannot be cleared. All SWAP interrupts from all ports will be disabled until corrective action is taken. When SWAP interrupts are disabled, both HSJ30/40 controller front panel button presses and removal or insertion of devices are not detected by the HSJ30/40 controller. Note that in this instance the target, asc, and ascq fields are undefined.

Table C-27 (Cont.) Device Services Nontransfer Error Event Log (Template 41) Instance/MSCP Event Codes
Instance Code   MSCP Event Code   Description
03F20064        00EB              The SWAP interrupts have been cleared and reenabled for all shelves. Note that in this instance the port, target, asc, and ascq fields are undefined.
03F30064        00EB              An asynchronous SWAP interrupt was detected by the HSJ30/40 controller for the shelf indicated by the port field. Possible reasons for this occurrence include:
                                  - Device insertion or removal
                                  - Shelf power failure
                                  - SWAP interrupts reenabled
                                  Note that in this instance the target, asc, and ascq fields are undefined.
03D3450A        00EB              During device initialization, the device reported the SCSI Sense Key NO SENSE. This indicates that there is no specific sense key information to be reported for the designated logical unit. This would
03D4450A
03D5450A
03D6450A
03D7450A
90. be enclosed in quotes with an alphabetic character first. Each SCS node name must be unique within its VMScluster.

6. Enter the following command to set the MSCP allocation class:
   CLI> SET THIS_CONTROLLER MSCP_ALLOCATION_CLASS=n
   where n is 1 through 255. Digital recommends providing a unique allocation class value for every pair of dual-redundant controllers in the same cluster.

7. Enter the following command to set the TMSCP allocation class:
   CLI> SET THIS_CONTROLLER TMSCP_ALLOCATION_CLASS=n
   where n is 1 through 255.

   Note: Always restart the controllers after setting the ID, SCS node name, or allocation classes.

8. Restart both controllers, either by pressing the green reset buttons or by entering the following commands:
   CLI> RESTART OTHER_CONTROLLER
   CLI> RESTART THIS_CONTROLLER

9. Enter the following commands to verify the preceding parameters were set:
   CLI> SHOW THIS_CONTROLLER
   CLI> SHOW OTHER_CONTROLLER

10. Connect the host port cables to the front of the controllers. Do not connect the controllers in a dual-redundant pair to separate, different host CPUs.

    HSJ series: Connect the CI cable and tighten its captive screws with a flat-head screwdriver.

    CAUTION: Do not connect host port cables to an HSD-series controller while the power is on to any members on the DSSI bus (including the controller and host). Doing so risks short circuits that may blow fuses on all the members.

    HSD
91. block address out of range
24  00  Invalid field in CDB
25  00  Logical unit not supported
26  00  Invalid field in parameter list
26  01  Parameter not supported
26  02  Parameter value invalid
26  03  Threshold parameters not supported
28  00  Not ready to ready transition (medium may have changed)
29  00  Power on, reset, or bus device reset occurred
29  01  Power on occurred
29  02  SCSI bus reset occurred
29  03  Bus device reset occurred
2A  00  Parameters changed
2A  01  Mode parameters changed
2A  02  Log parameters changed
2B  00  Copy cannot execute because host cannot disconnect
2C  00  Command sequence error
2F  00  Commands cleared by another initiator
30  00  Incompatible medium installed
30  01  Cannot read medium, unknown format
30  02  Cannot read medium, incompatible format
37  00  Rounded parameter
39  00  Saving parameters not supported
3A  00  Medium not present
3D  00  Invalid bits in identify message
3E  00  Logical unit has not self-configured yet
3F  00  Target operating conditions have changed
3F  01  Microcode has been changed

Table C-15 (Cont.) SCSI ASC/ASCQ Codes for CDROM Devices
ASC Code  ASCQ Code  Description
3F  02  Changed operating definition
3F  03  Inquiry data has changed
43  00  Message error
44  00  Internal target failure
45  00  Select or reselect failure
46  00  Unsuccessful soft reset
47  00  SCSI parity error
48  00
92. cable attached to the trilink connector.
6. (Optional) Loosen captive screws and remove the trilink connector from the front of the controller.

7.7.4 Cable Replacement (Installation)

Use the following procedure to replace SCSI host cables:
1. (Optional) Attach the trilink connector to the front of the controller and tighten its captive screws.
2. Position and route the SCSI host cable within the cabinet.
3. Connect the SCSI host cable to the trilink connector on the front of the controller and tighten the captive screws on the SCSI host cable connector.
4. (Optional) Connect and tighten captive screws for the terminator or secondary SCSI host cable at the open connection of the trilink connector.
5. Install any tie wraps as necessary to hold the SCSI host cable in place.
6. Close and lock the cabinet doors (SW800 series) using a 5/32-inch Allen wrench.
7. Connect the other end of the cable to the appropriate device on the bus, removing terminators as necessary.

7.8 SCSI Device Port Cables

Servicing SCSI device port cables causes subsystem down time, because you must remove devices to access SCSI connectors on the BA350-MA controller and BA350-SB device shelf backplanes.

Note: If the desired cable connects to a device shelf in the lower part of a cabinet, it may be easier to remove the device shelf rather than attempt this procedure with the shelf installed. Refer to
93. comm path event

Invalid EVENT_CODE parameter in call to dmscp_dcd_comm_path_event.

Invalid EVENT_CODE parameter in call to mscp_do_disconnect.

An attempt was about to be made to return a progress indicator to the host that was 0xFFFFFFFF, the only invalid value.

A WH_DAF command was requested to be performed by the wrong process.

A non-immediate WHM operation was passed to the dmscp_exec_whm_immediate routine.

This routine found an invalid xfer state, so cannot continue.

HIS did not allocate an HTB when there should have been one reserved for this connection, as determined by mscp_rcv_listen.

HIS did not allocate an HTB when there should have been one reserved for this connection, as determined by dmscp dcd src ges send.

HIS did not allocate an HTB when there should have been one reserved for this connection, as determined by dmscp_dcd_comm_path_event.

When trying to put the extra send HTB on the connection's send htb list, there was already one on the queue.

The VA CHANGE STATE service did not set the Software write protect as requested (for disk).

The VA CHANGE STATE service did not set the Software write protect as requested (for tape).

Initial HIS LISTEN call for MSCP DISK was unsuccessful.

Initial HIS LISTEN call for MSCP TAPE was unsuccessful.

dmscp_dcd_send_cmd received a command on an idle remote source connection that is no longer valid.

Unrecognized or invalid in this context return value fr
94. controller

extended status
  An additional set of status information maintained by the drive that is of interest to a host error log. Extended status is drive-type specific and is not utilized by the controller except as input to the host error log and diagnostic processes.

failover
  A software process that takes place when one controller fails in a dual-redundant configuration and the other controller takes over service to the devices of the failed controller.

fan
  An airflow device mounted in a StorageWorks cabinet.

fast, differential SCSI
  See FD SCSI.

fast, wide, differential SCSI
  See FWD SCSI.

FD SCSI
  The fast, differential SCSI bus with an 8-bit data transfer rate of 10 MB/s. See also FWD SCSI and SCSI.

field replaceable unit
  See FRU.

filler panel
  A sheet metal or plastic panel used to cover unused mounting areas in StorageWorks cabinets and shelves.

firmware executive
  See EXEC.

flush
  To write cached data to storage.

FRU
  Field replaceable unit.

full-height device
  A single device that occupies an entire 5.25-inch SBB carrier. StorageWorks full-height devices have an order number suffix of VA.

FWD SCSI
  The fast, wide, differential SCSI bus with a 16-bit data transfer rate of up to 20 MB/s. See also FD SCSI and SCSI.

half-height device
  A device that occupies half of a 5.25-inch SBB carrier. Two half-height devices can be mounted in a 5.25-inch SBB carrier. The f
95. controller Device Services and Value Added Services firmware components report errors detected while performing work related to disk unit transfer operations via the Disk Transfer Error Event Log.

If the error is associated with a command issued by a host system, the Disk Transfer Error Event Log will be sent to the host system that issued the command, on the same connection upon which the command was received, if "This Host" error logging is enabled on that connection, and to all host systems that have enabled "Other Host" error logging on a connection (or connections) established with the HSJ30/40 controller's Disk and/or Tape MSCP Server.

If the error is associated with a command issued by an HSJ30/40 controller firmware component, the Disk Transfer Error Event Log will be sent to all host systems that have enabled "Miscellaneous" error logging on a connection established with the HSJ30/40 controller's Disk MSCP Server.

The Disk Transfer Error Event Log is reported via the MSCP Disk Transfer Errors error log message format. The format of this event log, including the HSJ30/40 controller-specific fields, is shown in Figure C-26.

Disk Transfer Error Event Log Format Specific Fields

format
  This field contains the value 02 (that is, the MSCP Disk Transfer Errors error log format code).

event code
  The values that can be reported in this field for this event log are shown in Table C-28.

instance code
  See Section C.2.1 for the descr
96. critical to proper controller operation is indicated; immediate attention is required.

02  HARD           Failure of a component that affects controller performance or precludes access to a device connected to the controller is indicated.
0A  SOFT           An unexpected condition detected by a controller firmware component (such as protocol violations, host buffer access errors, internal inconsistencies, and so forth) is indicated.
64  INFORMATIONAL  An event having little or no effect on proper controller or device operation is indicated.

With the exception of events reported via the Disk Copy Data Correlation Event Log, the Event Notification/Recovery Threshold value assigned to a particular event is supplied in the NR Threshold subfield of the instance code field of the event log used to report the event. See Section C.2 for instance code field details.

Disk Copy Data Correlation Event Log Conditions

The Event Notification/Recovery Threshold Classification assigned to the following conditions reported via a Disk Copy Data Correlation Event Log is SOFT (see Table C-50):
- Subcommand Error, subcode Destination Command Timed Out
- Subcommand Error, subcode Source Command Timed Out
- Subcommand Error, subcode Destination Inconsistent State (cases A, B, C, D, E, and F)
- Controller Error, subcode Local Connection Request Failed, Insufficient Resources to Request Local Connection
- Controller Error, subcode Remo
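The threshold values listed above can be turned into a small decoder for instance codes. The following is a sketch only, and it rests on a loudly stated assumption: that the NR Threshold subfield is the low-order byte of the 32-bit instance code. That layout is defined in Section C.2, which is not reproduced here, so verify it before relying on the result. The table covers only the three values whose names appear in this excerpt, and the function name is hypothetical.

```python
# Threshold values and names as listed in the classification text.
# Values not listed in this excerpt are deliberately left out.
THRESHOLDS = {0x02: "HARD", 0x0A: "SOFT", 0x64: "INFORMATIONAL"}


def nr_threshold(instance_code: int) -> str:
    """Classify an instance code by its NR Threshold subfield.

    ASSUMPTION: the subfield is the low-order byte of the instance code.
    """
    low_byte = instance_code & 0xFF
    return THRESHOLDS.get(low_byte, f"unlisted (0x{low_byte:02X})")


# Instance codes that appear in the Template 41 instance code table.
print(nr_threshold(0x03F20064))
print(nr_threshold(0x03D3450A))
```

Under that assumption, the two sample codes end in 64 and 0A respectively, which is consistent with an informational SWAP interrupt message and a soft device initialization event.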
97. device shelves.
- Use predesignated spares on separate controller ports and device shelves.
- Place storage set members on separate controllers when implementing host-based RAID (for example, HBVS).

Figure 3-9 shows examples of optimal configurations for raidset members and designated spares on separate controller ports.

Figure 3-9 Optimal Availability Configurations (diagram labels: BA350-MA, HSJ40 controller, BA350-SB, RAID1 shadowset members, raidset members)

Highest Availability

For highest availability, especially with RAID implementations, follow these guidelines:
- For host-based RAID implementations, split the normal access path between controllers.
- Use redundant power supplies in all shelves.

3.6 Host Considerations

The following sections explain important considerations when configuring the HS controller and subsystem to the host CPU.

3.6.1 Host Cables

Following are special guidelines for configuring host cables (buses) to and from the HS controller.

HSD-series controllers:
- DSSI cable length between nodes (members) on the DSSI bus must be no greater than 16 feet (4.9 meters).
- Total DSSI cable length (end to end) on one DSSI bus must be no gre
98. drives in a way that simulates a high level of user activity.

Disk Inline Exerciser
  See DILX.

DIGITAL Storage Architecture
  See DSA.

DSSI
  Digital's storage system interconnect bus with an 8-bit data transfer rate of 4-5 MB/s.

dual universal asynchronous receiver transmitter
  See DUART.

dual cabinet power configuration
  A cabinet ac power configuration in which two ac sources and two ac power supplies are used to supply dc power to the cabinet's SBB shelves.

dual porting (or dual access)
  The ability of a disk or tape drive to be accessed by two controllers. All DSA drives have a standard dual-port feature. DSA drives can be online to only one controller at a time. However, they are able to disconnect themselves from a failed controller (or be disconnected by a failing controller) and become available for continued service through the other controller.

dual shelf power configuration
  A cabinet ac power configuration in which one ac source and two ac power supplies are used to supply dc power to the cabinet's SBB shelves.

dual-redundant configuration
  A controller configuration consisting of a primary and backup controller in one controller shelf. Both controllers normally share access to each other's devices. If the primary controller fails, the backup controller assumes control over the failing controller's devices.

DUART
  Dual universal asynchronous receiver transmitter. An integrated circuit conta
99.
- Last Failure Parameter 1 contains the USB index value

The event log format found in V fm template table is not supported by the Fault Manager. The bad format was discovered while trying to fill in a supplied eip.
- Last Failure Parameter 0 contains the instance code value
- Last Failure Parameter 1 contains the format code value
- Last Failure Parameter 2 contains the requester error table index value

The Fault Manager could not allocate memory for its Event Information Packet (EIP) buffers.

The Fault Manager could not allocate a Datagram HTB in its initialization routine.

There is more EIP information than will fit into a datagram. The requestor-specific size is probably too large.
- Last Failure Parameter 0 contains the instance code value
- Last Failure Parameter 1 contains the format code value
- Last Failure Parameter 2 contains the requester error table index value

The event log format found in the already built EIP is not supported by the Fault Manager. The bad format was discovered while trying to copy the EIP information into a datagram HTB.
- Last Failure Parameter 0 contains the format code value
- Last Failure Parameter 1 contains the instance code value

The caller of FM CANCEL EVENT NOTIFICATION passed an address of an event notification routine that does not match the address of any routines for which event notification is enabled.
100.
- When TILX terminates for any condition except an abort
- When Ctrl/G is entered (or Ctrl/T when running from a VCS)

The performance display has different formats depending on whether performance statistics were requested in the user-specified parameters and whether errors were detected.

The following is an example of a TILX performance display where performance statistics were not selected and no errors were detected:

TILX Summary at 18-JUN-1993 06:18:41
Test minutes remaining: 0, expired: 6
Unit 1 Total IO Requests 482
  No errors detected
Unit 2 Total IO Requests 490
  No errors detected

The following is an example of a TILX performance display where performance statistics were selected and no errors were detected:

TILX Summary at 18-JUN-1993 06:18:41
Test minutes remaining: 0, expired: 6
Unit 1 Total IO Requests 482
  Read Count 292 Write Count 168 Access Count 21 Erase Count 0
  KB xfer Read 7223 Write 4981 Total 12204
  No errors detected

The following is an example of a TILX performance display where performance statistics were not selected and errors were detected:

TILX Summary at 18-JUN-1993 06:18:41
Test minutes remaining: 0, expired: 6
Unit 10 Total IO Requests 153259
  No errors detected
Unit 40 Total IO Requests 2161368
  Err in Hex: IC:031A4002 PTL:04/00/00 Key:04 ASC/Q:B0/00 HC:0 SC:1
  Total Errs Hard Cnt 0 Soft Cnt 1
Unit 55 Total IO Requests 2017193
  Err in Hex: IC:03094002 PTL:05/05/00 Key:01 AS
101. Last Failure Event Log Template 01 Format
Last Failure Code Format
Failover Event Log Template 05 Format
Nonvolatile Parameter Memory Component Event Log Template 11 Format
Backup Battery Failure Event Log Template 12 Format
Subsystem Built-In Self Test Failure Event Log Template 13 Format
Memory System Failure Event Log Template 14 Format
CI Port Event Log Template 31 Format
CI Port Port Driver Event Log Template 32 Format
CI System Communication Services Event Log Template 33 Format
Device Services Nontransfer Error Event Log Template 41 Format
Disk Transfer Error Event Log Template 51 Format
Disk Bad Block Replacement Attempt Event Log Template 57 Format
Tape Transfer Error Event Log Template 61 Format
Media Loader Error Event Log Template 71 Format
Related Documentation
HS Controller Models
Summary of HS Controller Product Features
HS Controller Specifications
Environmental Specifications
3 Inch SBB Configurations 6 Port Controller
3 Inc
102. example, during a device warm swap.
radio frequency interference
See RFI.
redundant array of independent disks
See RAID.
read cache
A block of high-speed memory used by a controller to buffer data being read from storage devices by a host. A read cache increases the controller's effective device access speed by satisfying host read requests from its local cache memory when possible, instead of from external storage devices. The controller maintains in the cache copies of data recently requested by the host and may fetch blocks of data ahead in anticipation that the controller will access the next sequential blocks. In a basic read cache, host write requests are handled without involving the cache. See also write-through cache.
RAID
Redundant array of independent disks. A set of storage techniques devised to increase the performance and availability of a storage subsystem.
restore
Data previously backed up on tape is retrieved for disk storage using the normal priority. Backup is used to preserve information in the event of a disk failure. Restore is used to recover the information.
RFI
Radio frequency interference. The impairment of a signal by an unwanted radio signal or radio disturbance.
SBB
StorageWorks building block. A device housed in a standard StorageWorks SBB carrier. An SBB has a standard physical and electrical interface that is compatible with those of StorageWorks shelves and enclo
103. execution so that the test will perform exactly as the one that just completed. However, there is one exception: if the previous test was the Basic Function test with the initial write pass, and the initial write pass completed, the initial write pass is not performed when the test is restarted.
• Change unit: DILX allows you to drop or add units to testing. For each unit dropped, another unit must be added, until all units in the configuration have been tested. The unit chosen will be tested with the same parameters that were used for the unit that was dropped from testing. When you have completed dropping and adding units, all performance statistics are initialized and DILX execution resumes with the same parameters as the last run.
Drop unit x (y/n) [n]?
Explanation: This question is displayed if you choose to change a unit as an answer to the previous reuse parameters question. Enter the unit number that you wish to drop from testing.
The new unit will be write enabled. Do you wish to continue (y/n) [n]?
Explanation: This question is displayed if you choose to change a unit as an answer to the reuse parameters question. It is only asked if the unit being dropped was write enabled. This question gives you the chance to terminate DILX testing if you do not want data destroyed on the new unit. Enter N to terminate DILX.
6.4.5 DILX Output Messages
The following message is display
104. field contains the value 71 for this event log tdisize See Section C 2 1 for the description of this field This field contains the value 3C for this event log reserved offset 36 This field contains the value 0 event time See Section C 2 1 for the description of this field ancillary information The format of this field varies depending on whether or not the event being reported is associated with a command issued by a host system or one issued by an HSJ30 40 controller firmware component If the event is associated with a command issued by a host system this field is formatted as described in Section C 2 2 2 If the event is associated with a command issued by an HSJ30 40 controller firmware component this field is considered reserved and contains the value 0 device locator devtype device identification device serial number See Section C 2 2 4 for the description of these fields C 54 HSJ Series Error Logging cmdopcd infoq ercdval segment snsflgs info addsnsl cmdspec asc ascq frucode keyspec See Section C 2 2 5 for the description of these fields C 2 3 15 Disk Copy Data Correlation Event Log The HSJ30 40 controller Disk MSCP Server firmware component reports errors detected while performing Disk Copy Data commands via the Disk Copy Data Correlation Event Log The format of the Disk Copy Data Correlation Event Log is identical to the format of the MSCP Disk Copy Data Correlation error log
105. following units on the other controller: <list of problem units>
Explanation: When attempting to SHUTDOWN, RESTART, or SELFTEST the other controller, some units could not be successfully spun down. This can be caused either by online units or by errors when trying to spin down the units. Either rectify the problems on the problem units or enter the SHUTDOWN, RESTART, or SELFTEST command with the qualifier OVERRIDE_ONLINE or IGNORE_ERRORS.
Error 4160: Unable to rundown the following units on this controller: <list of problem units>
Explanation: When attempting to SHUTDOWN, RESTART, or SELFTEST this controller, some units could not be successfully spun down. This can be caused either by online units or by errors when trying to spin down the units. Either rectify the problems on the problem units or enter the SHUTDOWN, RESTART, or SELFTEST command with the qualifier OVERRIDE_ONLINE or IGNORE_ERRORS.
Error 4170: Only <max_targets> targets may be specified
Explanation: When setting THIS_CONTROLLER ID, you specified too many IDs; you may only specify up to <max_targets> IDs. Retry the SET THIS_CONTROLLER ID command with no more than <max_targets> IDs specified.
Error 4180: Invalid unit number(s) still present that must be deleted before the controller ID may be changed. All unit numbers must be in the range(s): <start> to <end>
Explanation: You attempted to change the cont
106. for a connection we already know about.
HIS has reported a connection event that should not be possible.
Table C-45 System Communication Services Directory Service Last Failure Codes
Code      Description
62000100  HIS LISTEN call failed with INSUFFICIENT RESOURCES.
62010100  Failure to allocate associated work queue.
62020100  Failure to allocate associated timer queue.
62030100  Failure to allocate connection ID timers.
Table C-46 Disk Inline Exerciser (DILX) Last Failure Codes
Code      Description
80010100  An HTB was not available to issue an IO when it should have been.
80020100  A unit could not be dropped from testing because an available command failed.
80030100  DILX tried to release a facility that was not reserved by DILX.
80040100  DILX tried to change the unit state from MAINTENANCE MODE to NORMAL but was rejected because of insufficient resources.
80050100  DILX tried to change the USB unit state from MAINTENANCE MODE to NORMAL but DILX never received notification of a successful state change.
80060100  DILX tried to switch the unit state from MAINTENANCE MODE to NORMAL but was not successful.
80070100  DILX aborted all commands via va d abort but the HTBs have not been returned.
80080100  While DILX was deallocating HIS EIP buffers, at least one could not be found.
80090100  DILX received an end message that corresponds to an opcode not supported by DILX.
107. from the internal CI cable second. Never leave unterminated paths on the star coupler. Never leave cables terminated or not attached at the star coupler and disconnected at the internal CI cable connector. This minimizes adverse effects on the cluster and prevents a short circuit between the two ground references.
Disconnect the external CI cable connectors from the star coupler one at a time in the following order (refer to Figure 7-7): TXA, RXA, TXB, RXB.
Attach terminators to the open star coupler connectors.
Unlock and open the cabinet (SW800 series) using a 5/32-inch Allen wrench.
Disconnect the external CI cables from the internal CI cable.
Loosen the captive screws on the internal CI cable where it attaches to the front of the controller using a flat-head screwdriver, and disconnect the internal CI cable from the controller.
Remove the internal CI cable from the cabinet, cutting tie wraps as necessary.
7.5.4 Cable Replacement Installation
Use the following procedure to replace internal CI cables:
1. Position and route the internal CI cable within the cabinet.
2. Connect the internal CI cable to the front of the controller and tighten the captive screws on the internal CI cable where it attaches to the controller using a flat-head screwdriver.
CAUTION: Always connect the external CI cable to the internal CI cable first, then connect it to the star coupler second. Never leave unterminated paths on th
108. half or all of the disk units configured. It will perform a very thorough test with WRITES enabled. The user will only be able to select the run time and performance summary options and whether to test a half or full configuration. The user will not be able to specify specific units to test. The Auto Configure option is only recommended for initial installations.
Do you wish to perform an Auto Configure (y/n) [n]? n
Use all defaults and run in read only mode (y/n) [y]? y
Disk unit numbers on this controller include:
10
12
14
Enter unit number to be tested? 10
Unit 10 successfully allocated for testing
Select another unit (y/n) [n]? y
Enter unit number to be tested? 12
Unit 12 successfully allocated for testing
Select another unit (y/n) [n]? n
DILX testing started at 13-JAN-1993 04:47:57
Test will run for 10 minutes
Type ^T (if running DILX through VCS) or ^G (in all other cases) to get a current performance summary
Type ^C to terminate the DILX test prematurely
Type ^Y to terminate DILX prematurely
DILX Summary at 13-JAN-1993 04:49:14
Test minutes remaining 9 expired 1
Unit 10 Total IO Requests 4530
No errors detected
Unit 12 Total IO Requests 2930
No errors detected
Reuse Parameters (stop, continue, restart, change unit) [stop]?
DILX Normal Termination
HSJ>
109. have been lost. To correct this problem, enter the SHOW THIS_CONTROLLER and SHOW OTHER_CONTROLLER commands to determine the current controller settings. Use the SET THIS_CONTROLLER and SET OTHER_CONTROLLER commands to restore settings.
NVPM Host Protocol Parameters component initialized to default settings.
Explanation: The tape and disk MSCP allocation class settings for this controller have been lost. To correct this problem, enter the SHOW THIS_CONTROLLER and SHOW OTHER_CONTROLLER commands to determine the current controller settings. Use the SET THIS_CONTROLLER and SET OTHER_CONTROLLER commands to restore settings.
NVPM User Interface Parameters component initialized to default settings.
Explanation: Terminal setting information has been lost. To correct this problem, enter the SHOW THIS_CONTROLLER and SHOW OTHER_CONTROLLER commands to determine the current terminal settings. Compare the terminal settings with the CONFIGURATION INFO output information and use the SET THIS_CONTROLLER and SET OTHER_CONTROLLER commands to restore terminal settings.
The following NVPM Configuration Information component elements were initialized to default settings: n
Explanation: The settings given by n have been initialized in connection with another NVPM error. To clear this error, perform the following procedure:
1. Enter the following commands:
CLI> SHOW DEVICES
CLI> SHOW UNITS
CLI> SHOW STORAGESETS
2. Compare the informa
110. initial write pass. If you respond with Y, the system performs writes starting at the lowest user-selected LBN and issues spiral I/Os with the largest byte count possible. This continues until the specified LBN range has been completely written. Upon completion of the initial write pass, normal functions of the Random I/O phase start. The advantage of selecting the initial write pass is that compare host data commands can then be issued and the data previously written to the media can be verified for accuracy. It also makes sure that all LBNs within the selected range are accessed by DILX. The disadvantage of using the initial write pass is that it may take a long time to complete if a large LBN range was specified. You can bypass this by selecting a smaller LBN range, but this creates another disadvantage in that the entire disk space is not tested. The initial write pass only applies to the Basic Function test.
The write percentage will be set automatically. Enter read percentage for random IO and data intensive phase (0:100) [67]?
Explanation: This question is displayed if read/write mode is selected. It allows you to select the read/write ratio to use in the Random I/O and Data Intensive phases. The default read/write ratio is similar to the I/O ratio generated by a typical OpenVMS system.
Enter data pattern number 0=all, 19=user_defined, (0:19) [0]?
Explanation: The DILX data patterns
111. invalid serial number. This controller cannot be used. Call Digital Services.
Explanation: This error means that an uninitialized controller has slipped out of manufacturing or the NV memory was destroyed. Contact Digital Multivendor Services.
Error 4100: Unable to RESTART other controller
Explanation: A communication error occurred when trying to restart the other controller. Retry the RESTART command.
Error 4110: Unable to SHUTDOWN other controller
Explanation: A communication error occurred when trying to shut down the other controller. Retry the SHUTDOWN command.
Error 4120: Unable to SELFTEST other controller
Explanation: A communication error occurred when trying to self-test the other controller. Retry the SELFTEST command.
Error 4130: Unable to setup controller restart
Explanation: A communication error occurred when trying to RESTART or self-test the other controller. Retry the RESTART or SELFTEST command.
Error 4140: Unable to lock the other controller's NV memory
Explanation: Most configuration commands, such as ADD, DELETE, and SET, require both controllers in a dual-redundant configuration to be up and functioning so configuration changes can be recorded in both controllers. If one controller is not running, this message results when you attempt to change the configuration. Restart the other controller and try the command again, or SET NOFAILOVER on the remaining controller.
Error 4150: Unable to rundown the
112. is entered (or Ctrl/T when running from a VCS)
The performance display has different formats depending on whether performance statistics are requested in the user-specified parameters and whether errors are detected. The following is an example of a DILX performance display where performance statistics were not selected and where no errors were detected:
DILX Summary at 18-JUN-1993 06:18:41
Test minutes remaining 0 expired 6
Unit 1 Total IO Requests 482
No errors detected
Unit 2 Total IO Requests 490
No errors detected
The following is an example of a DILX performance display where performance statistics were selected and no errors were detected:
DILX Summary at 18-JUN-1993 06:18:41
Test minutes remaining 0 expired 6
Unit 1 Total IO Requests 482
Read Count 292 Write Count 168 Access Count 21 Erase Count 0
KB xfer Read 7223 Write 4981 Total 12204
No errors detected
The following is an example of a DILX performance display where performance statistics were not selected and where errors were detected on a unit under test:
DILX Summary at 18-JUN-1993 06:18:41
Test minutes remaining 0 expired 6
O unit 10 Total IO Requests 153259
No errors detected
O unit 40 Total IO Requests 2161368
Err in Hex IC 031A4002 PTL 04 00 00 Key 04 ASC Q B0 00 HC 0 SC 1
Total Errs Hard Cnt 0 Soft Cnt 1
O unit 55 Total IO Requests 2017193
Err in Hex IC 03094002 PTL 05 05 00 Key 01 ASC Q 18 8
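The per-unit request totals in a summary display like the ones above can be pulled out with a short script. The following is a minimal Python sketch, not a tool supplied with the controller. It assumes each unit entry contains the words "Unit <number> Total IO Requests <count>", as in the examples; the exact spacing and punctuation of real output are not visible in this excerpt, and the names `UNIT_ENTRY` and `io_requests_per_unit` are hypothetical.

```python
import re

# Assumed shape of a unit entry: "Unit <number> Total IO Requests <count>".
# The leading "U" may be lower case, as in the error examples above.
UNIT_ENTRY = re.compile(r"[Uu]nit\s+(\d+)\s+Total IO Requests\s+(\d+)")


def io_requests_per_unit(summary: str) -> dict[int, int]:
    """Return {unit number: total I/O requests} for every unit in a summary."""
    return {int(unit): int(count) for unit, count in UNIT_ENTRY.findall(summary)}


sample = (
    "DILX Summary at 18-JUN-1993 06:18:41 "
    "Test minutes remaining 0 expired 6 "
    "Unit 1 Total IO Requests 482 No errors detected "
    "Unit 2 Total IO Requests 490 No errors detected"
)

print(io_requests_per_unit(sample))
```

Because the pattern keys on the words rather than on line breaks, it works whether the display is captured as separate lines or as one run of text.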
113. is shown in Figure C 28 Tape Transfer Error Event Log Format Specific Fields format This field contains the value 05 that is TMSCP Tape Errors error log format code event code The values that can be reported in this field for this event log are shown in Table C 30 instance code See Section C 2 1 for the description of this field The values that can be reported in this field for this event log are shown in Table C 30 C 50 HSJ Series Error Logging Figure C 28 Tape Transfer Error Event Log Template 61 Format 31 0 command reference number sequence number unit number event code flags format controller identifier multiunit code csvrsn unit identifier position object count reserved event time ancillary information device identification device serial number info snsflgs keyspec frucode templ See Section C 2 1 for the description of this field This field contains the value 61 for this event log tdisize See Section C 2 1 for the description of this field This field contains the value 3C for this event log reserved offset 32 This field contains the value 0 HSJ Series Error Logging C 51 event time See Section C 2 1 for the description of this field ancillary information The format of this field varies depending on whether or not the event being reported is associated with a command issued by a host system or one issued by an HSJ30 40 controller firmware compon
114. limit is 65535. Testing will stop if the limit is reached.
• A hex dump of the extended error log information is disabled.
• The I/O queue depth is 4. A maximum of 4 I/Os will be outstanding at any time.
• The selected test is the Basic Function test.
• Read-only mode.
• All user-available LBNs are available for testing.
• Data compares are disabled.
Enter the execution time limit in minutes (1:65535) [10]?
Explanation: Enter the desired time you want DILX to run. The default run time is 10 minutes.
Enter performance summary interval in minutes (1:65535) [10]?
Explanation: Enter a value to set the interval at which a performance summary is displayed. The default is 10 minutes.
Include performance statistics in performance summary (y/n) [n]?
Explanation: Enter Y to see a performance summary that includes the performance statistics: the total count of read, write, access, and erase I/O requests and the kilobytes transferred for each command. Enter N and no performance statistics are displayed.
Display hard/soft errors (y/n) [n]?
Explanation: Enter Y to enable error reporting, including end messages and EIPs. Enter N to disable error reporting, including end messages and EIPs. The default is disabled error reporting.
Display hex dump of Event Information Packet requester specific information (y/n) [n]?
Explanation: Enter Y to enable the hex dump dis
115. message. The HSJ30/40 controller generates Disk Copy Data Correlation Event Logs in accordance with MSCP protocol. If a Controller Error, subcode Local Connection Request Failed Insufficient Resources to Request Local Connection, or a Controller Error, subcode Remote Connection Request Failed Insufficient Resources to Request Remote Connection, condition is detected, the HSJ30/40 controller will store one of the values shown in Table C-32 in the first longword of the event dependent information field of the MSCP Disk Copy Data Correlation error log message to identify the resource that is lacking.
C.3 Event Log Codes
Tables C-2 through C-49 list specific codes contained within the event log information.
Table C-2 Firmware Component Identifier Codes
Code  Description
01    Executive Services
02    Value Added Services
03    Device Services
04    Fault Manager
06    Dual Universal Asynchronous Receiver Transmitter Services
07    Failover Control
08    Nonvolatile Parameter Memory Failover Control
20    Command Line Interpreter
40    Host Interconnect Services
42    Host Interconnect Port Services
60    Disk and Tape MSCP Server
61    Diagnostics and Utilities Protocol Server
62    System Communication Services Directory Service
80    Disk Inline Exerciser (DILX)
81    Tape Inline Exerciser (TILX)
82    Subsystem Built-In Self Tests (BIST)
83    Automatic Device Configuration Program (CONFIG)
Table C-3 Host Int
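The component identifier codes in Table C-2 can be held in a small lookup table. The Python sketch below is illustrative only. The dictionary values are copied from the table. The helper `component_of_last_failure` additionally assumes that the leading byte of an eight-digit last failure code is the component identifier. That assumption is consistent with the 62xxxxxx codes listed for the System Communication Services Directory Service and the 80xxxxxx codes listed for DILX, but the code layout is not stated in this excerpt, and both function names are hypothetical.

```python
# Firmware component identifier codes, copied from Table C-2.
COMPONENTS = {
    0x01: "Executive Services",
    0x02: "Value Added Services",
    0x03: "Device Services",
    0x04: "Fault Manager",
    0x06: "Dual Universal Asynchronous Receiver Transmitter Services",
    0x07: "Failover Control",
    0x08: "Nonvolatile Parameter Memory Failover Control",
    0x20: "Command Line Interpreter",
    0x40: "Host Interconnect Services",
    0x42: "Host Interconnect Port Services",
    0x60: "Disk and Tape MSCP Server",
    0x61: "Diagnostics and Utilities Protocol Server",
    0x62: "System Communication Services Directory Service",
    0x80: "Disk Inline Exerciser (DILX)",
    0x81: "Tape Inline Exerciser (TILX)",
    0x82: "Subsystem Built-In Self Tests (BIST)",
    0x83: "Automatic Device Configuration Program (CONFIG)",
}


def component_name(code_hex: str) -> str:
    """Look up a two-digit hexadecimal component identifier."""
    return COMPONENTS.get(int(code_hex, 16), "unknown component")


def component_of_last_failure(last_failure_hex: str) -> str:
    """Name the reporting component of an eight-digit last failure code.

    Assumption: the most significant byte is the component identifier.
    """
    if len(last_failure_hex) != 8:
        raise ValueError("expected eight hexadecimal digits")
    return component_name(last_failure_hex[:2])


print(component_of_last_failure("80010100"))
```

Unknown identifiers fall through to a placeholder string rather than raising, so a log scan does not stop at the first code missing from the table.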
116. metadata on the device When bringing other HS controller nontransportable devices to an HS controller subsystem simply add the device to your configuration using the ADD command Do not initialize the device or you will reset destroy forced error information on the device When adding devices the controller firmware will verify that metadata is present Ifin doubt try to add the device so that the controller will check for For purposes of setting transportable nontransportable devices the HSC K scsi controllers may be considered compatible with the HS controllers Normal Operation 4 17 metadata If an error stating that there is no metadata occurs initialize the device before adding it A nontransportable device is interchangeable with an HSC K scsi module or another HS controller subsystem Nontransportable devices are MSCP compliant and support forced error e For transportable devices A transportable feature is provided for transfer of devices between non HS controller systems and HS controller arrays Transportable devices do not have metadata on them and initializing a device after setting it as transportable will destroy metadata if any on the device Before moving devices from an HS controller subsystem to a non HS controller system delete the unit associated with the device and set the device as transportable Then initialize the device to remove any metadata When bringing non HS controller devices to a
117. must be run in read-only mode. Thus, DILX can be used to determine the health of a controller and the disks connected to it, and to acquire performance statistics. You can run DILX from a maintenance terminal, virtual terminal, or VCS.
DILX now allows for auto-configuring of drives. This allows for quick configuring and testing of all units at once. Please be aware that customer data will be lost by running this test. Digital recommends using auto-configure only during initial installations.
DILX tests logical units that may consist of storage sets of multiple physical devices. Error reports identify the logical units, not the physical devices. Therefore, if errors occur while running against a unit, its storage set should be reconfigured as individual devices, and then DILX should be run again against the individual devices.
There are no limitations on the number of units DILX may test at one time. However, Digital recommends using DILX only when no host activity is present. If you must run DILX during a live host connection, you should limit your testing to no more than half of any controller's units at one time. This conserves controller resources and minimizes performance degradation on the live units you are not testing.
DILX and the tape inline exerciser (TILX) may run concurrently, with one initiated from a maintenance terminal and the other from a virtual terminal connection. Digital recommends, however, that the exercisers not be run while no
118. name command B 48 SHOW DISKS command B 47 SHOW OTHER CONTROLLER command B 49 SHOW STORAGESETS command B 51 SHOW stripeset container name command B 53 SHOW STRIPESETS command B 52 SHOW tape container name command B 55 SHOW TAPES command B 54 SHOW THIS CONTROLLER command B 56 SHOW unit number command B 59 SHOW UNITS command B 58 SHUTDOWN OTHER CONTROLLER command 7 2 B 60 SHUTDOWN THIS_CONTROLLER command 7 2 B 62 Shutting down 7 2 Software HS controller See Firmware Solid codes OCP 5 4 Specifications cache module 1 9 controller module 1 9 environmental 1 10 HSD30 1 9 HSJ30 1 9 HSJ40 1 9 HSZ40 1 9 Storage controller perspective 2 13 controller PTL 2 13 differences in HSZ series 2 16 host perspective 2 13 Storage cont d host perspective HSZ series 2 15 host PTL HSZ series 2 16 how addressed 2 13 Storage SBB status 5 8 Storage set defined B 51 size 4 14 Storage sets adding B 78 initializing B 78 Stripeset 2 12 4 4 4 14 B 78 Striping 2 12 Subsystem initialization 4 2 Summary of features 1 3 SW500 series cabinets configurations 3 6 SW800 series cabinets configurations 3 2 T Tape in line exerciser See TILX Target HSZ series as one or two 2 15 2 16 Test definition questions HSJ HSD series DILX 6 8 TILX 6 33 HSZ series DILX 6 53 TILX 1 5 2 10 HSJ HSD series abort codes 6 49 basic function test 6 32 data test patterns 6 44 defined 6 30 end mess
119. no copies of TILX running, submit a CLD error report and restart the controller.
TILX detected error, code x.
Explanation: The normal way TILX recognizes an error on a unit is through the reception of an EIP, which loosely corresponds to an error log. However, there are some errors that TILX will detect without the reception of an EIP. These errors are as follows:
• Illegal Data Pattern Number found in data pattern header. Unit x. This is code 1. TILX read data from the tape unit and found that the data were not in a pattern that TILX previously wrote to the tape.
• No write buffers correspond to data pattern. Unit x. This is code 2. TILX read a legal data pattern from the tape at a place where TILX wrote to the tape, but TILX does not have any write buffers that correspond to the data pattern. Thus, the data have been corrupted.
• Read data do not match what TILX thought was written to the media. This is code 3. TILX writes data to the tape and then reads it and compares it against what TILX thought it wrote to the tape. This indicates a compare failure. More information is displayed to indicate where in the data buffer the compare failed and what the data were and should have been.
• TILX Tape record size mismatch. This is code 4. This error would only be detected on a read pass. Because TILX knows what was written to the tape, TILX expects to encounter the records of diff
120. not found 14 01 Record not found 14 02 Filemark or setmark not found 14 03 End of data not found 14 04 Block sequence error 15 00 Random positioning error 15 01 Mechanical positioning error 15 02 Positioning error detected by read of medium 17 00 Recovered data with no error correction applied 17 01 Recovered data with retries 17 02 Recovered data with positive head offset 17 03 Recovered data with negative head offset 18 00 Recovered data with error correction applied 1A 00 Parameter list length error 1B 00 Synchronous data transfer error 20 00 Invalid command operation code 21 00 Logical block address out of range 24 00 Invalid field in CDB 25 00 Logical unit not supported continued on next page HSJ Series Error Logging C 69 Table C 14 Cont SCSI ASC ASCQ Codes For Sequential Access Devices such as magnetic tape ASC ASCQ Code Code Description 26 00 Invalid field in parameter list 26 01 Parameter not supported 26 02 Parameter value invalid 26 03 Threshold parameters not supported 27 00 Write protected 28 00 Not ready to ready transition medium may have changed 29 00 Power on reset or bus device reset occurred 29 01 Power on occurred 29 02 SCSI bus reset occurred 29 03 Bus device reset occurred 2A 00 Parameters changed 2A 01 Mode parameters changed 2A 02 Log parameters changed 2B 00 Copy cannot execute because host cannot disconnect 2C 00 Command seq
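The ASC/ASCQ pairs listed above map naturally onto a lookup keyed by the two code bytes. The Python sketch below covers only a handful of the entries that appear in this excerpt, with descriptions copied from the table. The names `ASC_ASCQ` and `describe_sense` are hypothetical and do not belong to any controller utility.

```python
# A few (ASC, ASCQ) pairs copied from the sequential access device table above.
ASC_ASCQ = {
    (0x14, 0x01): "Record not found",
    (0x14, 0x02): "Filemark or setmark not found",
    (0x15, 0x01): "Mechanical positioning error",
    (0x17, 0x01): "Recovered data with retries",
    (0x20, 0x00): "Invalid command operation code",
    (0x24, 0x00): "Invalid field in CDB",
    (0x27, 0x00): "Write protected",
    (0x29, 0x02): "SCSI bus reset occurred",
}


def describe_sense(asc: int, ascq: int) -> str:
    """Return the description for an ASC/ASCQ pair, if it is in the sample table."""
    return ASC_ASCQ.get((asc, ascq), f"ASC {asc:02X} ASCQ {ascq:02X} not in this sample")


print(describe_sense(0x27, 0x00))
```

Keying on the pair rather than on the ASC alone matters because the same ASC carries different meanings for different ASCQ values, as the 14 and 29 entries show.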
121. off the LEDs. An error is displayed if no devices have been configured.
CANCEL
The LOCATE CANCEL command turns off all amber device fault LEDs on all configured devices. An error is displayed if no devices have been configured.
DISKS
The LOCATE DISKS command turns on the amber device fault LEDs of all configured disks. See LOCATE CANCEL to turn off the LEDs. An error is displayed if no disks have been configured.
TAPES
The LOCATE TAPES command turns on the amber device fault LEDs of all configured tape devices. See LOCATE CANCEL to turn off the LEDs. An error is displayed if no tape devices have been configured.
UNITS
The LOCATE UNITS command turns on the amber device fault LEDs of all devices used by units. This command is useful to determine which devices are not currently configured into logical units. See LOCATE CANCEL to turn off the LEDs. An error is displayed if no units have been configured.
PTL SCSI location
The LOCATE PTL SCSI location command turns on the amber device fault LEDs at the given SCSI location. SCSI location is specified in the form PTL, where P designates the port (1 through 6 or 1 through 3, depending on the controller model), T designates the target ID of the device (0 through 6 in a nonfailover configuration, or 0 through 5 if the controller is in a failover configuration), and L designates the LUN of the device (0 through 7).
Examples
LOCATE When ent
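The PTL ranges given for the LOCATE command can be checked before a location is typed at the CLI. The Python sketch below is a hypothetical helper, not part of the controller firmware. It assumes the location is entered as three whitespace-separated decimal numbers, which this excerpt does not confirm, and it takes the port limit (6 or 3, depending on the controller model) and the failover state as parameters.

```python
def validate_ptl(location: str, max_port: int = 6, failover: bool = False) -> tuple[int, int, int]:
    """Check a "P T L" string against the ranges described for LOCATE.

    max_port is 6 or 3 depending on the controller model.
    In a failover configuration the highest target ID is 5 instead of 6.
    """
    parts = location.split()
    if len(parts) != 3:
        raise ValueError("expected three numbers: port, target, LUN")
    port, target, lun = (int(part) for part in parts)

    max_target = 5 if failover else 6
    if not 1 <= port <= max_port:
        raise ValueError(f"port must be 1 through {max_port}")
    if not 0 <= target <= max_target:
        raise ValueError(f"target must be 0 through {max_target}")
    if not 0 <= lun <= 7:
        raise ValueError("LUN must be 0 through 7")
    return port, target, lun


print(validate_ptl("1 2 0"))
```

Target ID 6 is accepted by default but rejected when `failover=True`, which mirrors the narrower range the text gives for failover configurations.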
122. only recommended for initial installations Do you wish to perform an Auto Configure y n n continued on next page Diagnostics Exercisers and Utilities 6 23 Example 6 7 Cont All Functions DILX Use all defaults and run in read only mode y n y n Enter execution time limit in minutes 1 65535 10 245 Enter performance summary interval in minutes 1 65535 10 45 Include performance statistics in performance summary y n n y Display hard soft errors y n n y Display hex dump of Error Information Packet Requester Specific information y n n y hen the hard error limit is reached the unit will be dropped from testing Enter hard error limit 1 65535 65535 hen the soft error limit is reached soft errors will no longer be displayed but testing will continue for the unit Enter soft error limit 1 65535 32 Enter 10 queue depth 1 20 4 10 Available tests are 1 Basic Function 2 User Defined Use the Basic Function test 99 9 of the time The User Defined test is for special problems only Enter test number 1 2 1 1 CAUTION If you answer yes to the next question user data WILL BE destroyed rite enable disk unit s to be tested y n n y The write percentage will be set automatically Enter read percentage for Random IO and Data Intensive phase 0 100 67 Enter data pattern number 0 ALL 19 USER DEFINED 0 19 0 Perform initial write y n n y T
123. operations in progress simultaneously on both CI paths (Path A and Path B). Receive/receive, receive/transmit, or transmit/transmit operations can be active at the same time. The only restriction is that simultaneous transmits and simultaneous receives may not be active on the same virtual circuit. The packets that are simultaneously active can be to any two separate CI nodes, or a transmit/receive operation may be active to the same node if it also supports DDL operation, such as to a CIXCD adapter. Each CI path (Path A and Path B) runs in half duplex. This means the path can be either transmitting or receiving, but not both at the same time.
2.1.12.2 HSD Series DSSI Interface
Figure 2-4 shows a block diagram of the HSD series to DSSI host interface hardware. The SCSI-to-DSSI interface is implemented with the NCR 53C720 chip plus specific DSSI logic and transceivers. The NCR 53C720 chip reads and runs scripts from controller shared memory to perform command and DMA operations on the DSSI interface. The policy processor sets up and maintains the operation of the NCR 53C720 chip.
[Figure 2-4: HSD Series DSSI Host Interface Hardware Block Diagram (CXO-3981A-MC)]
[Figure 2-5: HSZ Series SCSI-2 Host Interface Hardware Block Diagram]
124. or n is the HSD series controller one-digit DSSI node number (0 through 7). Each controller DSSI node number must be unique on its DSSI interconnect.
or n is the HSZ series controller SCSI target ID(s) (0 through 7).
4. Enter the following command to set the SCS node (HSJ and HSD series controllers):
CLI> SET THIS_CONTROLLER SCS_NODENAME="xxxxxx"
where xxxxxx is a one- to six-character alphanumeric name for this node. The node name must be enclosed in quotes, with an alphabetic character first. Each SCS node name must be unique within its VMScluster. (Refer to Chapter 4 for important information about VMS node names.)
5. Enter the following command to set the MSCP allocation class (HSJ and HSD series controllers):
CLI> SET THIS_CONTROLLER MSCP_ALLOCATION_CLASS=n
where n is 0 through 255.
6. Enter the following command to set the TMSCP allocation class (HSJ and HSD series controllers):
CLI> SET THIS_CONTROLLER TMSCP_ALLOCATION_CLASS=n
where n is 0 through 255.
Note: Always restart the controller after setting the ID, SCS node name, or allocation classes.
7. Restart the controller, either by pressing the green reset button or by entering the following command:
CLI> RESTART THIS_CONTROLLER
8. Enter the following command to verify that the preceding parameters were set:
CLI> SHOW THIS_CONTROLLER FULL
9. Connect the host port cable to the front
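The value limits stated in the steps above (node name of one to six alphanumeric characters beginning with a letter, allocation class 0 through 255, DSSI node number 0 through 7) can be checked before the commands are entered. The following Python sketch is a hypothetical pre-check, not something the controller provides; the function names are invented for illustration.

```python
import re

# One to six alphanumeric characters, the first of which is alphabetic.
NODENAME = re.compile(r"[A-Za-z][A-Za-z0-9]{0,5}")


def valid_scs_nodename(name: str) -> bool:
    """True if the name fits the SCS node name rule quoted in step 4."""
    return NODENAME.fullmatch(name) is not None


def valid_allocation_class(value: int) -> bool:
    """True if the value is a legal MSCP or TMSCP allocation class (0 through 255)."""
    return 0 <= value <= 255


def valid_dssi_node(value: int) -> bool:
    """True if the value is a legal one-digit DSSI node number (0 through 7)."""
    return 0 <= value <= 7


for candidate in ("HSJ01", "1HSJ", "TOOLONG7"):
    print(candidate, valid_scs_nodename(candidate))
```

Uniqueness within the VMScluster or on the DSSI interconnect cannot be checked locally like this; it still has to be confirmed against the other nodes.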
125. phase.
SKSV: The content of the keyspec field is valid if and only if this bit is set to one.
Field Pointer: The Field Pointer subfield indicates which byte of the command descriptor block or of the parameter data was in error. When a multiple-byte field is in error, the pointer points to the most significant (left-most) byte of the field.
If the Sense Key value is RECOVERED ERROR, HARDWARE ERROR, or MEDIUM ERROR, the format of this field is as shown in Figure C-13.
Figure C-13 SCSI Sense Data Byte 0F through 11 (keyspec) Field: Actual Retry Count Bytes Format
SCSI Sense Data Byte 0F through 11 (keyspec) Actual Retry Count Bytes Specific Subfields:
SKSV: The content of the keyspec field is valid if and only if this bit is set to one.
Actual Retry Count: The actual retry count subfield contains implementation-specific information on the actual number of retries of the recovery algorithm used in attempting to recover an error or exception condition.
If the Sense Key value is NOT READY and the last command issued to the device was a FORMAT UNIT, the format of this field is as shown in Figure C-14.
Figure C-14 SCSI Sense Data Byte 0F through 11 (keyspec) Field: Progress Indication Bytes Format
HSJ Series Error Logging C-21
SCSI Sense Data Byte 0F through 11 (keyspec) Progress Indication Bytes Specific Subfields:
SKSV: The c
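As a rough illustration of how the three keyspec bytes (0F through 11 hexadecimal, that is, bytes 15 through 17 of the sense data) might be decoded, the sketch below checks SKSV before trusting the rest of the field. The bit positions are an assumption taken from the general SCSI-2 sense data layout (SKSV as bit 7 of the first byte, a 16-bit value in the following two bytes, most significant byte first); the figures that would confirm them are not reproduced in this excerpt. The function name is hypothetical.

```python
def decode_keyspec(keyspec: bytes) -> dict:
    """Decode sense data bytes 0F-11 (hex).

    Assumed layout (SCSI-2 convention, not confirmed by this excerpt):
      byte 0, bit 7 -> SKSV
      bytes 1-2     -> 16-bit value, most significant byte first
                       (field pointer, actual retry count, or
                       progress indication, depending on the sense key)
    """
    if len(keyspec) != 3:
        raise ValueError("keyspec field is exactly three bytes")
    sksv = bool(keyspec[0] & 0x80)
    value = (keyspec[1] << 8) | keyspec[2]
    # The field content is meaningful if and only if SKSV is set.
    return {"sksv": sksv, "value": value if sksv else None}
```

For example, `decode_keyspec(bytes([0x80, 0x00, 0x03]))` yields a valid value of 3, which for a RECOVERED ERROR sense key would be read as an actual retry count.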
126. question asked.
Do you wish to perform an Auto-Configure (y/n) [n]?
Explanation: Enter Y if you wish to invoke the Auto-Configure option. After the Auto-Configure option is selected, DILX will display the following caution statement:
CAUTION: All data on the Auto-Configured disks will be destroyed. You MUST be sure of yourself. Are you sure you want to continue (y/n) [n]?
Explanation: This question is self-explanatory.
Use All Defaults and Run in Read Only Mode (y/n) [y]?
Explanation: Enter Y to use the defaults for DILX and run in read-only mode; most of the other DILX questions are then not asked. Enter N and the defaults are not used; you must then answer each question as it is displayed. The following defaults are assumed for all units selected for testing:
• Execution time limit = 10 minutes.
• Performance summary interval = 10 minutes.
• Displaying sense data for hard or soft errors is disabled.
• The hard error limit = 65535. Testing will stop if the limit is reached.
• The I/O queue depth = 4. A maximum of 4 I/Os will be outstanding at any time.
• The Selected Test = the Basic Function test.
• Read-only mode.
• All user-available LBNs are available for testing.
• Data compares are disabled.
Diagnostics, Exercisers, and Utilities 6-53
Enter the execution time limit in minutes (1:65535) [10]?
Explanation: Enter the desired time you want DILX to run. The default run time is 10 minutes. Ente
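The defaults listed above can be written down as a small record, which is handy when comparing a logged DILX run against the all-defaults case. This is a sketch only: the class and field names are invented for illustration and do not correspond to anything inside DILX.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DilxDefaults:
    """Values assumed when 'Use All Defaults' is answered with Y.

    Field names are hypothetical; the values come from the list above.
    """
    execution_time_limit_min: int = 10
    performance_summary_interval_min: int = 10
    display_sense_data: bool = False
    hard_error_limit: int = 65535
    io_queue_depth: int = 4
    selected_test: str = "Basic Function"
    read_only: bool = True
    data_compares: bool = False


def valid_execution_time(minutes: int) -> bool:
    """The execution time limit prompt accepts 1 through 65535 minutes."""
    return 1 <= minutes <= 65535
```

Because the dataclass is frozen, a run that overrides one answer is best expressed with `dataclasses.replace(DilxDefaults(), io_queue_depth=8)` rather than by mutating the defaults.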
127. realize the maximum storage set size, all devices that make up the storage set should be identical.
Warning 4000: A restart of this controller will be required before all the parameters modified will take effect.
Explanation: This warning results from a SET THIS_CONTROLLER command. Some controller parameters require a restart before they can take effect. If any of those parameters are changed, this warning is displayed. It is recommended that a restart via the RESTART THIS_CONTROLLER command be done as soon as possible.
Warning 4010: A restart of the other controller will be required before all the parameters modified will take effect.
Explanation: This warning results from a SET OTHER_CONTROLLER command. Some controller parameters require a restart before they can take effect. If any of those parameters are changed, this warning is displayed. Restart the controller and retry the command.
Warning 4020: A restart of both this and the other controller will be required before all the parameters modified will take effect.
Explanation: This warning results from a SET THIS_CONTROLLER or a SET OTHER_CONTROLLER command. Some controller parameters require a restart of both controllers before they can take effect. If any of those parameters are changed, this warning is displayed. Restart both controllers and retry the command.
Warning 6000: Communication failure with other controller while taking controllers out of dual-redundant mode. Enter a SET NOFAILOVER c
128. respect to interrupts, thread control, queuing support, timers, and so forth. The executive functions establish the HS controller environment as a non-preemptive, interrupt-driven process.
2.2.2 Host Interconnect Functions
The three host interconnections that HS operating firmware supports are CI, DSSI, and SCSI. The following list briefly describes the protocols used for host access of controller storage:
• CI: SCS, MSCP and/or TMSCP protocol, and DUP
• DSSI: SCS, MSCP and/or TMSCP protocol, and DUP
• SCSI: SCSI-2 protocol with SCSI pass-through software to the Command Line Interpreter (CLI), tagged command queuing on the host and device side, and mode select/sense support for SCSI
2.2.3 Operator Interface and Subsystem Management Functions
The operator interface and subsystem management functions support the user interface, subsystem management, subsystem verification, and error logging/fault management. These functions are presented in the following sections.
2.2.3.1 Command Line Interpreter
The Command Line Interpreter (CLI) is the primary user interface for HS controllers. The CLI contains firmware for responding to most management functions, plus local program execution. Appendix B contains a full description of CLI operation. Briefly, the CLI provides the following two types of commands:
SET/SHOW commands for the controller itself. This includes setting and showing of controller ID
129. series: Disconnect controller power. Then connect the DSSI cable and the terminator to the trilink connector and tighten their captive screws. Restore power to all members on the DSSI bus.
Footnote: Refer to Chapter 4 for important information about VMS node names.
11. Enter the following commands to enable CI paths A and B to the host (HSJ-series controllers):
CLI> SET THIS_CONTROLLER PATH_A
CLI> SET THIS_CONTROLLER PATH_B
CLI> SET OTHER_CONTROLLER PATH_A
CLI> SET OTHER_CONTROLLER PATH_B
Enter the following commands to enable the host port path (HSD-series controllers):
CLI> SET THIS_CONTROLLER PATH
CLI> SET OTHER_CONTROLLER PATH
12. Use the following commands to verify that your configuration matches the earlier printed configuration before proceeding:
CLI> SHOW DEVICES FULL
CLI> SHOW UNITS FULL
7.1.5 Both Dual-Redundant Controllers
In the rare event that both controllers in your dual-redundant configuration fail, both controllers' green OCP reset LEDs will be lit continuously. You will have to replace both controller modules.
CAUTION: Simultaneously replacing both controllers in a dual-redundant configuration causes system down time for the duration of the service cycle. Digital recommends using this procedure only if both controllers fail, or if your system is off line already for another reason. Otherwise to replace both controlle
130. service supports this feature under control from a local program running at CLI. Device services must quiesce all the SCSI buses in order to safely allow you to remove and replace a controller (see Chapter 7).
2.2.5 Value-Added Functions
HS operating firmware contains value-added functions to enhance availability, performance, subsystem management and maintenance, and connectivity features of the HS controller subsystem. These value-added functions are presented in the following sections.
2.2.5.1 RAID
HS operating firmware supports levels of Redundant Array of Independent Disks (RAID) storage methods.
HS operating firmware supports host-based volume shadowing (HBVS) assistance, also referred to as RAID level 1a. With HBVS assistance, shadow copy operations requested by the host between two units under one controller run under direction from the controller. This leaves the host CPU free for other operations.
HS operating firmware supports RAID level 0 (striping). Striping allows for parallel transfers to all stripeset members. This feature enhances performance in the areas of latency and throughput. Stripesets can be from 2 to 14 members. Striping firmware is tuned to balance the load across devices, and not for maximum data transfer bandwidth.
Refer to The Digital Guide to RAID Storage Technology for a description of RAID and how the various levels of RAID improve data integrity.
2.2.5.2 Failover H
131. short circuits that may blow fuses on all the members.
7.6.1 Tools Required
You will need the following tools to remove or replace DSSI host cables:
• 5/32-inch Allen wrench
• Tie wrap cutters
• Flat-head screwdriver
7.6.2 Precautions
Refer to Chapter 1 for DSSI host cable handling guidelines.
Figure 7-8 DSSI Host Cables (trilink connector, terminator, DSSI host cable; CXO-4206A-MC)
7.6.3 Cable Removal
Use the following procedure to remove DSSI host cables:
1. Enter the following command to halt activity on the host path:
CLI> SET THIS_CONTROLLER NOPATH
2. Disconnect power from all members (including the HSD-series controller and host) on the DSSI bus.
3. Disconnect the DSSI host cable from the host or other device (the device at the other end of the cable from the controller).
4. If necessary to access the HSD-series controller, unlock and open the cabinet (SW800 series) using a 5/32-inch Allen wrench.
5. Loosen the captive screws on the DSSI host cable where it attaches to the trilink connector on the front of the controller, and disconnect the cable.
6. Remove the DSSI host cable from the cabinet, cutting tie wraps as necessary.
Optional Loosen cap
132. small amount of disk space was made inaccessible to the host and used for metadata. The metadata will now be initialized. If TRANSPORTABLE was specified, any metadata on the device will now be destroyed. Refer to Chapter 4 for details on metadata and when INITIALIZE is required.
4. Add the units that use either the devices or the storage sets built from the devices by entering the following command:
CLI> ADD UNIT logical-unit-number container-name
where logical-unit-number is the unit number the host uses to access the device, and container-name identifies the device or the storage set.
5. Use the following commands to verify that your configuration matches the earlier printed configuration:
CLI> SHOW DEVICES FULL
CLI> SHOW UNITS FULL
7.1.4 One Dual-Redundant Controller
CAUTION: To perform the procedures in this section, at least one controller must be functioning.
To replace one controller in a dual-redundant configuration (or both, one at a time), use the second controller to service devices while the first controller is absent. This procedure causes no service outage, but system performance will decrease slightly while one controller does the work of two.
Note (HSD-series controllers): You cannot effectively remove the HSD-series controller in slot SCSI ID 7 because of interference from the trilink connector attached to the companion controller. Remove t
133. submit a CLD error report, then reset the controller.
Disk unit x does not exist.
Explanation: An attempt was made to allocate a unit for testing that does not exist on the controller.
Unit x successfully allocated for testing.
Explanation: All processes that DILX performs to allocate a unit for testing have been completed. The unit is ready for DILX testing.
Unable to allocate unit.
Explanation: This message should be preceded by a reason why the unit could not be allocated for DILX testing.
DILX detected error, code x.
Explanation: The normal way DILX recognizes an error on a unit is through the reception of SCSI sense data. This loosely corresponds to an MSCP error log. However, the following are some errors that DILX will detect using internal checks, without SCSI sense data:
• Illegal Data Pattern Number found in data pattern header. Unit x. This is code 1. DILX read data from the disk and found that the data were not in a pattern that DILX previously wrote to the disk.
• No write buffers correspond to data pattern. Unit x. This is code 2. DILX read a legal data pattern from the disk at a place where DILX wrote to the disk, but DILX does not have any write buffers that correspond to the data pattern. Thus, the data have been corrupted.
• Read data do not match what DILX thought was written to the media. Unit x. This is code 3. DILX writes data to the disk and then r
134. swap 7 42 Fault management firmware 2 10 Features summary 1 3 Field replaceable unit See FRU Field replaceable units 1 4 Firmware when downloaded 6 3 Firmware executive See EXEC Firmware HS controller CLI 2 10 core functions 2 9 description 2 8 Firmware HS controller cont d device services 2 11 DUP 2 10 error logging 2 10 executive functions 2 9 failover 2 12 fault management 2 10 host protocol 2 9 HSZUTIL 2 10 4 11 local programs 2 10 operator interface 2 9 program card 1 1 read cache 2 12 self test 2 9 upgrading 1 1 value added 2 12 version restriction 1 3 Flashing codes OCP 5 4 FRU controller A 1 related A 3 G Green LED 4 1 5 3 6 1 7 2 H Hardware HS controller bus exchanger 2 4 cache module 2 5 description 2 1 device ports 2 5 diagnostic registers 2 2 dual controller port 2 4 host interface 2 5 I D cache 2 2 Intel 80960 chip 2 1 maintenance terminal 2 3 NVMEM 2 4 OCP 2 2 5 2 policy processor 2 1 program card 2 2 shared memory 2 4 HBVS 2 12 HELP command B 16 High availability See Configuration dual redundant Host adapters HSD series controllers 3 20 HSJ series controllers 3 20 HSZ series controllers 3 20 Quiet slot time 3 19 Host interface 2 5 HSD series to DSSI 2 6 3 19 7 27 HSJ series to CI 2 5 7 23 7 25 HSZ series to SCSI 2 7 3 19 7 29 testing 6 3 Index 15 Host port path 4 6 4 8 7 11 7 17 7 47
135. target 0, LUN 0, and named RZ26_100.
CLI> ADD DISK DISK0 2 3 0 NOTRANSPORTABLE
A nontransportable disk is added to port 2, target 3, LUN 0, and named DISK0.
CLI> ADD DISK TDISK0 3 2 0 TRANSPORTABLE
A transportable disk is added to port 3, target 2, LUN 0, and named TDISK0.
B-4 Command Line Interpreter
ADD STRIPESET
Creates a stripeset from a number of containers.
Format
ADD STRIPESET container-name container-name1 container-name2 [container-nameN]
Parameters
container-name
Specifies the name that will be used to refer to this stripeset. The name must start with a letter (A through Z) and can then consist of up to eight more characters made up of A through Z, 0 through 9, period (.), dash (-), and underscore (_), for a total of nine characters.
container-name1 container-name2 container-nameN
The containers that will make up this stripeset. A stripeset may be made up of from two to fourteen containers.
Description
Adds a stripeset to the known list of stripesets and names the stripeset. This command must be used when a new stripeset is to be added to the configuration.
Qualifiers
CHUNKSIZE=n
CHUNKSIZE=DEFAULT (D)
Specifies the chunksize to be used. The chunksize may be specified in blocks (CHUNKSIZE=n), or you may let the controller determine the optimal chunksize (CHUNKSIZE=DEFAULT). When entering an ADD command, CHUNKSIZE=DEFAULT is the default.
Examples
CLI> ADD STRIPESET STRI
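The naming and membership rules in the ADD STRIPESET description lend themselves to a quick pre-check before a command is entered. The sketch below is one way to express them; the helper names are hypothetical, and treating names as case-insensitive is an assumption that the excerpt neither confirms nor rules out.

```python
import re

# A letter first, then up to eight more characters drawn from
# A-Z, 0-9, period, dash, and underscore (nine characters in total).
# Case-insensitive matching is an assumption, not stated in the excerpt.
_CONTAINER_NAME = re.compile(r"[A-Z][A-Z0-9._-]{0,8}", re.IGNORECASE | re.ASCII)


def valid_container_name(name: str) -> bool:
    """Return True if name satisfies the container naming rule."""
    return _CONTAINER_NAME.fullmatch(name) is not None


def valid_stripeset(name: str, members: list[str]) -> bool:
    """A stripeset needs a legal name and from two to fourteen containers."""
    return (
        valid_container_name(name)
        and 2 <= len(members) <= 14
        and all(valid_container_name(m) for m in members)
    )
```

This checks only what the excerpt states; whether the named containers actually exist on the controller can be determined only from the controller's own configuration.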
136. test on this controller.
SELFTEST THIS_CONTROLLER
The SELFTEST THIS_CONTROLLER command shuts down this controller, then restarts it in DAEMON loop-on-self-test mode. The OCP reset button must be pushed to take this controller out of loop-on-self-test mode.
If any disks are on line to this controller, the controller will not self-test unless the OVERRIDE_ONLINE qualifier is specified (HSD and HSJ only). If any user data cannot be flushed to disk, the controller will not self-test unless the IGNORE_ERRORS qualifier is specified.
Specifying IMMEDIATE will cause this controller to self-test immediately, without flushing any user data to the disks, even if drives are on line to a host.
Note: If you enter a SELFTEST THIS_CONTROLLER command and you are using a virtual terminal to communicate with the controller, the connection will be lost when this controller starts the self-test.
Qualifiers for HSD and HSJ controllers
IGNORE_ERRORS
NOIGNORE_ERRORS (D)
If errors result when trying to write user data, the controller will not start self-test unless IGNORE_ERRORS is specified.
CAUTION: Customer data may be lost or corrupted if the IGNORE_ERRORS qualifier is specified.
NOIGNORE_ERRORS is the default.
IMMEDIATE
NOIMMEDIATE (D)
If IMMEDIATE is specified, immediately start the self-test on the controller without checking for online devices.
CAUTION: Customer data may be lost or corrupted if the IMMEDIATE qu
137. the Rights in Technical Data and Computer Software clause at DFARS 252.227-7013.
NOTE: This equipment generates, uses, and may emit radio frequency energy. The equipment has been type tested and found to comply with the limits for a Class A digital device pursuant to Part 15 of the FCC rules. These limits are designed to provide reasonable protection against harmful interference in a residential installation. Any changes or modifications made to this equipment may void the user's authority to operate the equipment. Operation of this equipment in a residential area may cause interference, in which case the user, at his own expense, will be required to take whatever measures may be needed to correct the interference.
Copyright Digital Equipment Corporation 1993, 1994. Printed in U.S.A. All rights reserved.
AXP, CI, DCL, DEC, DECconnect, DECserver, Digital, HSC, HSC95, HSJ, HSD30, HSD05, HSZ, MSCP, OpenVMS, StorageWorks, TMSCP, VAX, VAXcluster, VAX 7000, VAX 10000, VMS, VMScluster, VT, and the DIGITAL logo are trademarks of Digital Equipment Corporation. Intel is a trademark of Intel Corporation. OSF and OSF/1 are trademarks of Open Software Foundation, Inc. All other trademarks and registered trademarks are the property of their respective holders.
The postpaid READER'S COMMENTS card requests the user's critical evaluation to assist in preparing future documentation.
This document was prepared using VAX DOCUMENT Version 2.1. Co
138. the StorageWorks Solutions Shelf and SBB User's Guide for procedures to remove a device shelf and for correct SCSI cable lengths.
7.8.1 Tools Required
You will need the following tools to remove or replace device port cables:
• ESD strap
• 3/32-inch Allen wrench
• 5/32-inch Allen wrench
• Flat-head screwdriver
7.8.2 Precautions
Refer to Chapter 1 for ESD, grounding, module handling, and cable handling guidelines.
7.8.3 Cable Removal
Use the following procedure to remove device port cables:
1. Unlock and open the cabinet (SW800 series) using a 5/32-inch Allen wrench.
2. Remove the controller(s) and cache module(s) by referencing the procedures described in Sections 7.1 and 7.2.
3. Using a flat-head screwdriver, loosen the two captive screws on each side of the volume shield and remove the shield (see Figure 7-10).
Figure 7-10 Volume Shield (CXO-4176A-MC)
4. Remove the cable from the BA350-MA controller shelf backplane by pinching the cable connector side clips and disconnecting the cable.
CAUTION: Digital recommends labelling devices to indicate what slot they were removed from. If SBBs are removed and then returned to a different slot, customer data may be destroyed.
Let disk drives spin down fo
139. the virtual terminal until the disk drives attached to the HSJ-series controller are reported (usually two repetitions are sufficient).
2. Enter the default initialization device string. Refer to the VAX console instructions in the VAX console documentation.
3. Enter BOOT.
HSD-series controllers: An HSD-series unit can be an OpenVMS operating system initialization disk.
HSZ-series controllers: An HSZ-series unit can be a DEC OSF/1 AXP operating system initialization disk if the system unit is LUN 0 as seen by the host CPU.
Footnote 3: See the HSZ-series firmware release notes for restrictions.
4.9.2 Operating System Nodes (OpenVMS)
Be aware of the following condition for HSJ-series controllers:
• If a controller is already an active member of an OpenVMS cluster, and you change its current CI node number but not its CI node name, and then restart the controller with the new node number, access to its devices and overall cluster operation will be adversely affected. This occurs because the OpenVMS operating system makes continuous attempts to establish new virtual circuits with new nodes, and it will find a known node name at a new node address. This operation is a security feature provided by the operating system to prevent one CI node from masquerading as another.
4-12 Normal Operation
• If the controller CI node number and node name are both changed, and you restart the controller while the OpenVMS cluster remains operational, the opera
140. this instance the last failure code and last failure parameters fields are undefined.
07050064  022A  Failover Control received a Last Gasp message from the other HSJ30/40. The other HSJ30/40 is expected to restart itself within a given time period. If it does not, it will be held reset with the Kill line.
Table C-19 (Cont.) Failover Event Log (Template 05) Instance/MSCP Event Codes
Instance Code  MSCP Event Code  Description
07060C01  022A  Failover Control detected that both HSJ30/40s are acting as SCSI ID 6. Because IDs are determined by hardware, it is unknown which HSJ30/40 is the real SCSI ID 6. Note that in this instance the last failure code and last failure parameters fields are undefined.
07070C01  022A  Failover Control detected that both HSJ30/40s are acting as SCSI ID 7. Because IDs are determined by hardware, it is unknown which HSJ30/40 is the real SCSI ID 7. Note that in this instance the last failure code and last failure parameters fields are undefined.
07080B0A  022A  Failover Control was unable to send keepalive communication to the other HSJ30/40. It is assumed that the other HSJ30/40 is hung or not started. Note that in this instance the last failure code and last failure parameters fields are undefined.
Table C-20 Nonvolatile Parameter Memory Component Event Log (Template 11) Instance/MSCP Event
141. to DS PORT BLOCKED, we never got a FALSE status back (which signals that nothing is blocked). Last Failure Parameter 0 contains the port number (1 - n) that we were waiting on to be unblocked.
Table C-41 Host Interconnect Services Last Failure Codes
Code  Description
40000101  An unrecognized CI opcode was received by HIS. These packets are packets with CI opcodes recognized by the port but not by HIS. Last Failure Parameter 0 contains the CI opcode value.
40150100  LOCAL VC Timer in unexpected state.
40280100  Failed to allocate Buffer Name Table.
40290100  Failed to allocate ACB.
402A0100  Failed to allocate ID member template.
402B0100  Failed to allocate DG HTBs.
402C0100  Failed to allocate message HTBs.
402D0101  S_max_node greater than MAX_VC_ENTRIES. Last Failure Parameter 0 contains the S_ci_max_nodes value.
402E0101  S_max_node not set to valid value (8, 16, 32, 64, 128, 256). Last Failure Parameter 0 contains the S_ci_max_nodes value.
402F0100  Failure to allocate a HIS EIP structure.
40300100  Failure in memory allocation.
40510100  htb_id type not DG when attempting to deallocate DG HTB.
40520100  htb_id type not RCV_SND when attempting to dealloc recv queue HTB.
40530100  htb_id type not RCV_SND when attempting to dealloc SCS queue HTB.
40560100  Failed to find a virtual circuit entry for CCB during his_close_connection routine.
407B0100  SCS command timeout unexpectedly inactive during
142. to a configuration. Refer to your SPD and release notes for a list of specific supported device types.
• A maximum of two 5¼-inch SBBs are allowed per port (in a single shelf), or four 5¼-inch SBBs per port (in adjacent jumpered shelves). No more than four 5¼-inch SBBs are allowed on a single port (more would take three shelves, which cannot be configured within SCSI-2 cable limits).
• Intermixing 5¼-inch and 3½-inch SBBs is permitted using up to six devices per port (maximum of two shelves), with no more than three 5¼-inch SBBs. You can use two 5¼-inch SBBs and four 3½-inch SBBs in two BA350-SB shelves, or one 5¼-inch SBB and four 3½-inch SBBs in one BA350-SB shelf.
• When using jumpered shelves, only five jumpered-pair shelves (for a total of ten shelves) can be used within each SW800-series data center cabinet. This leaves the sixth controller port unused. Alternately, four jumpered ports permit two single-shelf connections on the remaining two controller ports, which is preferable. This setup is only permitted in the lower front of the cabinet from the C1 controller position. Five such ports can take up to a maximum of ten front shelf locations, with no allowance for cable access to shelves or devices in the rear of the SW800-series cabinet. Refer to Figure 3-1. A more balanced configuration consists of four 5¼-inch SBBs on each of four ports and two ports each with two 5¼-inch SBBs
143. to run d flashing I D Instruction Data cache on the controller module DRAB Dynamic RAM Controller and Arbitration Engine operates controller shared memory SRAM Static RAM ECC Error Correction Code EDC Error Detection Code NXM Nonexistent Memory 5 6 Error Analysis and Fault Isolation continued on next page Figure 5 3 Cont Flashing OCP Codes Reset 1 2 3 4 5 6 Description of Error Action O O 400 24 The code image was not the same as the image on the card after the contents were copied to memory Replace controller module L1 L1 L1 L The journal SRAM battery is bad Replace controller module d Lll li There was an unexpected interrupt from a read cache or the present and lock bit are not working correctly Replace controller module ld O ld ld There is an interrupt pending to the controller s policy processor when there should be none Replace controller module bl D A D d ld ill There was an unexpected fault during initialization Replace controller module ld ld UU ld There was an unexpected maskable interrupt received during initialization Replace controller module ld ld ld O There was an unexpected nonmaskable interrupt received during initialization Replace controller module An illegal process was activated during initialization Replace controller module L off I
144. undefined 03BA0101 01CA Source driver programming error encountered during media loader operation Note that in this instance the asc and ascq fields are undefined 03CF0101 01CA Source driver programming error encountered during operation to a device that is unknown to the controller Note that in this instance the asc and ascq fields are undefined 03080101 O1EA Miscellaneous SCSI Port Driver coding error detected during disk operation Note that in this instance the asc and ascq fields are undefined continued on next page HSJ Series Error Logging C 85 Table C 27 Cont Device Services Nontransfer Error Event Log Template 41 Instance MSCP Event Codes Instance Code MSCP Event Code Description 03890101 03BB0101 03CB0101 03270101 038A0101 03BC0101 03CC0101 03D04002 03D14002 03F00402 03F 10502 C 86 HSJ Series Error Logging OIEA O1EA O1EA O1EA OIEA O1EA O1EA 01AA 006A 00EB 00EB Miscellaneous SCSI Port Driver coding error encountered during tape operation Note that in this instance the asc and ascq fields are undefined Miscellaneous SCSI Port Driver coding error detected during media loader operation Note that in this instance the asc and asco fields are undefined Miscellaneous SCSI Port Driver coding error detected during operation to a device that is unkown to the controller Not
145. what was written to the disk. This indicates a compare failure. More information is displayed to indicate where in the data buffer the compare failed, and what the data were and should have been.
• Compare Host Data should have reported a compare error but did not. Unit x. This is code 4. A compare host data command was issued in a way that DILX expected to receive a compare error, but no error was received.
DILX terminated. A termination, a print summary, or a reuse parameters request was received but DILX is currently not testing any units.
Explanation: The user entered a Ctrl/Y (termination request), a Ctrl/G (print summary request), or a Ctrl/C (reuse parameters request) before DILX had started to test units. DILX cannot satisfy the second two requests, so DILX treats all of these requests as a termination request.
DILX will not change the state of a unit if it is not NORMAL.
Explanation: DILX cannot allocate the unit for testing because it is already in Maintenance mode. Maintenance mode can only be invoked by the firmware. If another DILX session is in use, the unit is considered to be in Maintenance mode.
Unit is not available - if you dismount the unit from the host, it may correct this problem.
Explanation: The unit has been placed on line by another user (or host), or the media is not present. The most common reason for this message is that the unit is mounted on the host.
Soft error re
146. 0 SB shelf information Each BA350 SB shelf s upper SCSI 2 port connector is cabled to a controller port The lower SCSI 2 port connector is attached to a controller port for 2x3T configurations and is unused for a 1x6T or 1x7T Available for future expansion Nonredundant controller and power not recommended 3 12 Configuration Rules and Restrictions 3 4 4 5 4 inch SBBs Tables 3 3 and 3 4 list some recommended configurations for 5 4 inch SBBs Table 3 3 5 Inch SBB Configurations 6 Port Controller Number of Available Number BA350 SB for 5 inch of Devices Shelves Configure as SBBs Ports Used 1 2 1 1 2x3T 1 0 1 2 3 4 2 2 2x3T 1 0 3 4 5 6 3 3 2x3T 1 0 5 6 7 8 4 2 1x6T 1 0 6 2 2x3T 9 10 5 4 1x6T 1 0 6 1 2x3T 11 12 6 6 1x6T 1 0 13 14 7 6 1x6T 1 0 1 1x6J 15 16 8 6 1x6T 1 0 6 2 1x6J 17 18 9t 6 1x6T 1 0 6 3 1x6J 19 20 10 6 1x6T 1 0 6 4 1x6J Notes Each BA350 SB shelf has its upper connector cable attached to either the adjacent BA350 SB shelf s lower connector 1x6J or a controller port connector 2x3T or 1x6T The lower connector cable is attached to either an adjacent BA350 SB shelf s upper connector 1x6J as in the first list item controller port connector 2x3T or is unused 1x6T Consult the StorageWorks Solutions Shelf User s Guide for BA350 SB shelf information Available for additional 5 4 inch device When used with the cont
147. 0000 FFFF 0000 FFFF 0000 FFFF B6D9 5555 5555 5555 AAAA AAAA AAAA 5555 5555 AAAA AAAA 5555 AAAA 5555 AAAA 5555 AAAA 5555 DB6C 2D2D 2D2D 2D2D D2D2 D2D2 D2D2 2D2D 2D2D D2D2 D2D2 2D2D D2D2 2D2D D2D2 2D2D D2D2 6DB6 0001 0002 0004 0008 0010 0020 0040 0080 0100 0200 0400 0800 1000 2000 4000 8000 FIE FFFD FFFB FFF7 FFEF FFDF FFBF FF7F FEFF FDFF FBFF F7FF EFFF BFFF DFFF 7FFF DB6D B6DB 6DB6 DB6D B6DB 6DB6 DB6D B6DB 6DB6 DB6D B6DB 6DB6 DB6D 3333 3333 3333 1999 9999 9999 B6D9 B6D9 B6D9 B6D9 FFFF FFFF 0000 0000 DB6C DB6C 9999 1999 699C E99C 9921 9921 1921 699C 699C 0747 0747 0747 699C E99C 9999 9999 FFFF 6 3 9 TILX Examples This sections provides some TILX examples with different options chosen 6 3 9 1 TILX Example Using All Defaults In Example 6 14 TILX is run using all defaults This is a semi extensive test even though the test only runs for 10 minutes The only function not performed is data compares Data compares are a time consuming operation with tapes TILX is invoked from a maintenance terminal CAUTION TILX should only be run using scratch tapes This test will write to the tape and destroy any data that exist on the tape Diagnostics Exercisers and Utilities 6 45 Example 6 14 Using All D HSJ gt show tape Name Type TAPE500 tape TAPE520 tape HSJ gt run tilx Copyright Di
148. 002 03D14002 03D24402 03D3450A 031D4002 037D4002 039D4002 036D430A 03D4450A 03D5450A 400D640A 03D6450A 03D7450A 03D8450A 03D9450A 03DA450A 03DB450A 03DC450A 03DD450A 03DE450A 03DF450A 405E020A 03E0450A 03E1450A 03E2450A 030E4002 031E4002 036E4002 037E4002 039E430A 400E640A 03F00402 405F020A 03F 10502 03F 20064 03F 30064 030F4002 C 82 C 80 C 89 C 89 C 91 C 92 C 91 C 81 C 84 C 84 C 84 C 86 C 86 C 85 C 85 C 85 C 89 C 81 C 83 C 86 C 86 C 85 C 87 C 89 C 91 C 92 C 91 C 87 C 87 C 81 C 87 C 87 C 88 C 88 C 88 C 88 C 88 C 88 C 88 C 88 C 83 C 88 C 88 C 89 C 89 C 89 C 91 C 91 C 92 C 81 C 86 C 83 C 86 C 87 C 87 C 89 Index 5 Codes Instance Codes cont d 031F4002 C 90 036F4002 C 91 037F4002 C 91 039F430A C 92 Last Failure Codes firmware 01000100 C 93 01010100 C 93 01020100 C 93 01030100 C 93 01040100 C 93 01050104 C 93 01060100 C 93 01070100 C 93 01082004 C 93 02000100 C 97 02010100 C 97 02040100 C 97 02050100 C 97 02080100 C 97 02090100 C 97 02100100 C 97 02170100 C 97 02180100 C 97 02210100 C 97 02220100 C 97 02270104 C 97 02360101 C 98 02370102 C 98 02440100 C 98 02530102 C 98 02560102 C 99 02570102 C 99 02620102 C 99 02690102 C 99 02720100 C 99 02730100 C 99 02790102 C 99 02800100 C 100 02820100 C 100 02830100 C 100 02840100 C 100 02850100 C 100 02860100 C 100
149. 020 0024 or 0026

    24   00   Invalid field in CDB
    25   00   Logical unit not supported
    26   00   Invalid field in parameter list
    26   01   Parameter not supported
    26   02   Parameter value invalid
    26   03   Threshold parameters not supported
    27   00   Write protected
    28   00   Not ready to ready transition (medium may have changed)
    29   00   Power on, reset, or bus device reset occurred
    29   01   Power on occurred

(continued on next page)
C-66 HSJ Series Error Logging

Table C-13 (Cont.) SCSI ASC/ASCQ Codes for Direct Access Devices (such as magnetic disk)

    ASC   ASCQ
    Code  Code  Description
    29    02    SCSI bus reset occurred
    29    03    Bus device reset occurred
    2A    00    Parameters changed
    2A    01    Mode parameters changed
    2A    02    Log parameters changed
    2B    00    Copy cannot execute because host cannot disconnect
    2C    00    Command sequence error
    2F    00    Commands cleared by another initiator
    30    00    Incompatible medium installed
    30    01    Cannot read medium (unknown format)
    30    02    Cannot read medium (incompatible format)
    30    03    Cleaning cartridge installed
    31    00    Medium format corrupted
    31    01    Format command failed
    32    00    No defect spare location available
    32    01    Defect list update failure
    37    00    Rounded parameter
    39    00    Saving parameters not supported
    3A    00    Medium not present
    3D    00    Invalid bits in identify message
    3E    00    Logical unit has not self-configured yet
    3F    00    Target operating conditions have changed
    3F    01    Microcode has been changed
    3F    02
150. 020A A spurious interrupt was detected during the execution of a Subsystem Built In Self Test 82052002 020A An unrecoverable error was detected during execution of the HOST PORT Subsystem Test The system will not be able to communicate with the host 82062002 020A An unrecoverable error was detected during execution of the UART DUART Subsystem Test This will cause the console to be unusable This will cause failover communications to fail 82072002 020A An unrecoverable error was detected during execution of the FX Subsystem Test 82082002 020A An unrecoverable error was detected during execution of the nbuss init Test Table C 23 Memory System Failure Event Log Template 14 Instance MSCP Event Codes MSCP Instance Event Code Code Description 02072201 012A The CACHE Dynamic RAM Controller and Arbitration engine 0 DRABO failed testing performed by the Cache Diagnostics The memory address field contains the starting physical address of the CACHEAO memory 02082201 012A The CACHE Dynamic RAM Controller and Arbitration engine 1 DRAB1 failed testing performed by the Cache Diagnostics The memory address field contains the starting physical address of the CACHEA1 memory 020C2201 012A Cache Diagnostics have declared the cache bad during testing The memory address field contains the starting physical address of the CACHEAO memory C 80 HSJ Series Error Logging Table C 24 CI Port Event Log Templ
151. 02880100 C 100 02890100 C 100 02900100 C 100 02910100 C 100 02920100 C 100 02950100 C 100 02960100 C 100 02970100 C 100 03020101 C 101 03030101 C 101 03040101 C 101 03050101 C 101 Index 6 Codes Last Failure Codes firmware cont d 03060101 03070101 03080101 03150100 03280100 03290100 03320101 03370108 03390108 03410101 03470100 03480100 03490100 04010101 04020102 04030102 04040103 04050100 04060100 04070103 04080102 04090100 06010100 06020100 06030100 07010100 07020100 07030100 07040100 07050100 07060100 08010101 08020100 08030101 08040101 08050100 08060100 08070100 08080000 08090010 08100101 08110101 08120100 08130100 08140100 08150100 08160100 08170100 08180100 08190100 20010100 20020100 20030100 20070100 20080000 C 101 C 101 C 101 C 101 C 101 C 101 C 102 C 103 C 104 C 105 C 106 C 106 C 106 C 107 C 107 C 107 C 107 C 107 C 107 C 107 C 107 C 107 C 108 C 108 C 108 C 108 C 108 C 108 C 108 C 108 C 108 C 109 C 109 C 109 C 109 C 109 C 109 C 109 C 109 C 109 C 109 C 109 C 109 C 109 C 109 C 109 C 109 C 109 C 110 C 110 C 110 C 110 C 110 C 110 C 110 Codes Last Failure Codes firmware cont d 20090010 40000101 40150100 40280100 40290100 40300100 40510100 40520100 40530100 40560100 40900100 40930100 40950100 40960100 40970100 40980100 42000100
152. 03 3B 08 3B 0D 3B OE 11 0C 0C 00 1C 00 2C 00 4C 00 C 67 C 70 C 67 C 70 C 73 C 76 C 67 C 70 C 73 C 76 C 67 C 67 C 67 C 67 C 71 C 74 C 76 C 67 C 71 C 74 C 76 C 67 C 71 C 74 C 76 C 67 C 71 C 74 C 76 C 67 C 71 C 74 C 76 C 67 C 71 C 74 C 76 C 67 C 71 C 74 C 76 C 71 C 71 C 71 C 71 C 71 C 68 C 71 C 74 C 76 C 71 C 68 C 71 C 74 C 76 C 74 C 74 C 74 C 65 C 69 C 65 C 69 C 72 C 75 C 66 C 69 C 73 C 75 C 67 C 70 C 73 C 75 C 67 C 70 C 73 C 76 C 67 C 71 C 74 C 76 C 68 C 71 C 74 C 76 C 67 C 70 C 73 C 75 C 68 C 71 C 74 C 76 C 67 C 70 C 73 C 75 C 68 C 71 C 68 C 71 C 65 C 66 C 69 C 73 C 75 C 67 C 70 C 73 C 70 C 68 C 71 C 74 C 76 C 68 C 71 C 74 C 76 C 70 C 68 C 71 C 74 C 76 C 70 C 68 C 71 C 74 C 76 C 68 C 71 C 74 C 76 C 70 C 76 C 76 C 65 C 69 C 66 C 67 C 70 C 73 C 75 C 68 C 71 C 74 C 76 Codes SCSI ASC ASCQ Codes cont d 5C 0C 1C 5C 0C 1C 5C 1D 2D 3D 1E 3E 4E 2F 3F 3F 3F 3F 00 C 68 01 C 65 01 C 66 01 C 68 02 C 65 02 C 66 02 C 68 00 C 66 00 C 70 76 00 C 67 C 00 C 66 00 70 C 70 C 73 C 73 C 76 76 00 00 71 70 C 74 73 C 75 70 C 73 C 76 76
153. 1 DRAB CSR Register value e Last Failure Parameter 2 contains the CACHEB1 DRAB Diagnostic CSR Register value e Last Failure Parameter 3 contains the CACHEB1 DRAB Diagnostic Error Register value e Last Failure Parameter 4 contains the CACHEB1 DRAB Error Address Register value e Last Failure Parameter 5 contains the CACHEB1 DRAB Error Data Register value e Last Failure Parameter 6 contains the CACHEB1 DRAB Error Region Register value e Last Failure Parameter 7 contains the CACHEB1 DRAB Region Setup Register value A processor interrupt was generated with an indication that the other controller in a dual controller configuration asserted the KILL line to disable this controller A processor interrupt was generated with an indication that the RESET button on the controller module was depressed A processor interrupt was generated with an indication that the program card was removed A powerfail interrupt occurred because of watch dog timeout Cache region timeout with no other DRAB errors C 96 HSJ Series Error Logging Table C 34 Value Added Services Last Failure Codes Code Description 02000100 Initialization code was unable to allocate enough memory to set up the receive data descriptors 02010100 Initialization code was unable to allocate enough memory to set up the send data descriptors 02040100 Unable to allocate memory necessary for data buffers 02050100 Unable to allocate memory fo
154. 10 byte

    2E    00             WRITE AND VERIFY (10 byte)
    2F    00 05          VERIFY (10 byte)
    30    00 05          SEARCH DATA HIGH (10 byte)

(continued on next page)
HSJ Series Error Logging C-61

Table C-10 (Cont.) SCSI Command Operation Codes

          Supported Device Types
    Code  (See Table C-9)   Description
    31    00 05             SEARCH DATA EQUAL (10 byte)
    32    00 05             SEARCH DATA LOW (10 byte)
    33    00 05             SET LIMITS (10 byte)
    34    01                READ POSITION
    34    00 05             PRE-FETCH
    35    00 05             SYNCHRONIZE CACHE
    36    00 05             LOCK UNLOCK CACHE
    37    00                READ DEFECT DATA (10 byte)
    39    00 01 05          COMPARE
    3A    00 01 05          COPY AND VERIFY
    3B    00 01 05 08       WRITE BUFFER
    3C    00 01 05 08       READ BUFFER
    3E    00 05             READ LONG
    3F    00                WRITE LONG
    40    00 01 05 08       CHANGE DEFINITION
    41    00                WRITE SAME
    42    05                READ SUB-CHANNEL
    43    05                READ TOC (table of contents)
    44    05                READ HEADER
    45    05                PLAY AUDIO (10 byte)
    47    05                PLAY AUDIO MSF
    48    05                PLAY AUDIO TRACK INDEX
    49    05                PLAY TRACK RELATIVE (10 byte)
    4B    05                PAUSE RESUME
    4C    00 01 05 08       LOG SELECT
    4D    00 01 05 08       LOG SENSE
    55    00 01 05 08       MODE SELECT (10 byte)
    5A    00 01 05 08       MODE SENSE (10 byte)
    A5    05                PLAY AUDIO (12 byte)
    A5    08                MOVE MEDIUM
    A6    08                EXCHANGE MEDIUM
    A8    05                READ (12 byte)
    A9    05                PLAY TRACK RELATIVE (12 byte)
    AF    05                VERIFY (12 byte)
    B0    05                SEARCH DATA HIGH (12 byte)
    B1    05                SEARCH DATA EQUAL (12 byte)
    B2    05                S

C-62 HSJ Series Error Logging
155. 100 027F0100 02800100 02820100 02830100 02840100 02850100 02860100 02880100 02890100 028A0100 028B0100 028C0100 028D0100 028E0100 028F0100 02900100 02910100 02920100 02950100 02960100 02970100 029B0100 029C0100 An invalid status was returned from VA XFER in a complex ACCESS operation e Last Failure Parameter 0 contains the DD address e Last Failure Parameter 1 contains the invalid status Unable to allocate memory for a Failover Control Block Unable to allocate memory for a Failover Control Block Unable to allocate memory for a Failover Control Block Unable to allocate memory for a Failover Control Block Unable to allocate memory for the Dirty Count Array Unable to allocate memory for the Cache Buffer Index Array Unable to allocate memory for the XNode Array Cache was declared bad by the Cache Diagnostics after first Meg was tested Cannot recover and use local memory because those initial buffers cannot be retrieved Unable to allocate memory for the Fault Management Event Information Packet used by the Cache Manager in generating error logs to the host Invalid FOC Message in cmfoc_snd_cmd Invalid FOC Message in cmfoc_rcv_cmd Invalid return status from DIAG CACHE_MEMORY_TEST Invalid return status from DIAG CACHE_MEMORY_TEST Invalid error status given to cache_fail Invalid number of banks in cache Cache module is locked when not expected Invalid status returned from CACHE CHECK_META
156. 1060100   TILX tried to switch the unit state from MAINTENANCE_MODE to NORMAL but was not successful.
    81070100   TILX aborted all commands via va d_abort, but the htbs have not been returned.
    81080100   While TILX was deallocating HIS EIP buffers, at least one could not be found.
    81090100   TILX received an end message that corresponds to an opcode not supported by TILX.
    810A0100   TILX was not able to restart the HIS timer.
    810B0100   TILX tried to issue an I/O for an opcode that is not supported.
    810C0100   TILX tried to issue a oneshot I/O for an opcode that is not supported.
    810D0100   A TILX device control block contains an unsupported unit_state.
    810E0100   TILX received an unsupported Value Added status in a Value Added completion message.
    810F0100   TILX found an unsupported device control block substate while trying to build a command for the Basic Function test.
    81100100   TILX found an unsupported device control block substate while trying to build a command for the Read Only test.

(continued on next page)
HSJ Series Error Logging C-117

Table C-47 (Cont.) Tape Inline Exerciser (TILX) Last Failure Codes

    Code       Description
    81110100   TILX found an unsupported device control block substate while trying to build a command for the User Defined test.
    81120100   TILX received an EOT encountered while in a substate where EOT encountered should not occur.
    81130100   TILX calculated an illegal
    81140100
    81140100
    811B0100
    811C0100
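The code listings in this appendix can be sorted by their leading byte. The sketch below is a minimal illustration, assuming (from the listings only) that every code in Table C-34 (Value Added Services) begins with 02 and every code in Table C-47 (TILX) begins with 81. The name `table_for_code` and the mapping itself are hypothetical helpers, not part of the controller firmware, and the mapping covers only the two tables quoted here.

```python
# Hypothetical helper: route a last failure code to the table that lists it.
# The two prefixes below are inferred from the code listings in Tables C-34
# and C-47; they are not a statement of the documented code layout.
TABLE_BY_LEADING_BYTE = {
    0x02: "Table C-34, Value Added Services Last Failure Codes",
    0x81: "Table C-47, Tape Inline Exerciser (TILX) Last Failure Codes",
}


def table_for_code(code_hex: str) -> str:
    """Return the table expected to describe an 8-digit hex last failure code."""
    if len(code_hex) != 8:
        raise ValueError("expected an 8-digit hexadecimal code")
    leading_byte = int(code_hex, 16) >> 24  # top 8 bits of the 32-bit code
    return TABLE_BY_LEADING_BYTE.get(leading_byte, "not covered by this sketch")


if __name__ == "__main__":
    for code in ("810A0100", "02000100", "60000100"):
        print(code, "->", table_for_code(code))
```

Codes with any other leading byte fall through to the default string, so the sketch never guesses at tables it has no listing for.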
157. 2 ee 6 1x7T 5 0

Notes:
2x3T: Two (split) SCSI-2 connections, separately terminated in the shelf. The devices appear as IDs 0, 2, 4 and 1, 3, 5.
1x6T: Single-path SCSI-2 connection terminated in the shelf. The devices appear as IDs 0 through 5.
1x7T: Single-path SCSI-2 connection terminated in the shelf. The devices appear as IDs 0 through 6.
Consult the StorageWorks Solutions Shelf User's Guide for BA350-SB shelf information.
Each BA350-SB shelf's upper SCSI-2 port connector is cabled to a controller port. The lower SCSI-2 port connector is attached to a controller port for 2x3T configurations and is unused for a 1x6T or 1x7T.
Available for future expansion.
Nonredundant controller and power (not recommended).

Configuration Rules and Restrictions 3-11

Table 3-2 3½-Inch SBB Configurations, 3-Port Controller

Number of Devices | Number of BA350-SB Shelves | Configure as | Available for 3½-inch SBBs | Ports Used

1 2 1 1 2x3T 5 4 1 2 3 12 2 1 2x3T 9 0 3 1 1x6T 13 18 3 3 1x6T 5 0 19 21 3 3 1x7T 2 0

Notes:
2x3T: Two (split) SCSI-2 connections, separately terminated in the shelf. The devices appear as IDs 0, 2, 4 and 1, 3, 5.
1x6T: Single-path SCSI-2 connection terminated in the shelf. The devices appear as IDs 0 through 5.
1x7T: Single-path SCSI-2 connection terminated in the shelf. The devices appear as IDs 0 through 6.
Consult the StorageWorks Solutions Shelf User's Guide for BA35
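The table notes above define which SCSI IDs the devices take in each shelf configuration. As a minimal sketch of that mapping, the following encodes the three configurations exactly as the notes describe them. The names `SHELF_CONFIGS`, `bus_ids`, and `devices_per_shelf` are illustrative only and do not come from the manual.

```python
# SCSI IDs seen on each bus of a BA350-SB shelf, per the table notes:
#   2x3T: two split buses, IDs 0,2,4 and 1,3,5
#   1x6T: one bus, IDs 0 through 5
#   1x7T: one bus, IDs 0 through 6
SHELF_CONFIGS = {
    "2x3T": [[0, 2, 4], [1, 3, 5]],
    "1x6T": [list(range(0, 6))],
    "1x7T": [list(range(0, 7))],
}


def bus_ids(config: str) -> list:
    """Return one list of SCSI IDs per bus for the named shelf configuration."""
    try:
        return SHELF_CONFIGS[config]
    except KeyError:
        raise ValueError(f"unknown shelf configuration: {config!r}") from None


def devices_per_shelf(config: str) -> int:
    """Total device IDs available in one shelf for the named configuration."""
    return sum(len(ids) for ids in bus_ids(config))


if __name__ == "__main__":
    for name in SHELF_CONFIGS:
        print(name, bus_ids(name), devices_per_shelf(name))
```

A 2x3T shelf uses two controller ports (one per bus), while 1x6T and 1x7T use one, which is why the lower port connector is unused in the single-path configurations.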
158. 2 1 for the description of this field The values that can be reported in this field for this event log are shown in Table C 19 templ See Section C 2 1 for the description of this field This field contains the value 05 for this event log tdisize See Section C 2 1 for the description of this field This field contains the value 24 for this event log reserved offset 1E This field contains the value 0 event time See Section C 2 1 for the description of this field last failure code last failure parameters These fields contain the last failure information supplied in the last gasp message sent by the other HSJ30 40 controller in a dual redundant configuration as a normal part of terminating controller operation See Section C 2 3 1 for the description of the format of these fields Note that the content of certain of the fields described previously may be undefined depending on the value supplied in the instance code field See Table C 19 for more detail C 2 3 3 Nonvolatile Parameter Memory Component Event Log Template 11 The HSJ30 40 controller Executive firmware component reports errors detected while accessing a Nonvolatile Parameter Memory Component via the Nonvolatile Parameter Memory Component Event Log The Nonvolatile Parameter Memory Component Event Log will be sent to all host systems that have enabled Miscellaneous error logging on a connection or connections established with the HSJ30 40 contr
159. 2 inch Allen wrench.

Remove any terminators from the star coupler connections.

Connect the external CI cable connectors to the star coupler one at a time in the following order (refer to Figure 7-7): RXB, TXB, RXA, TXA.

5. For the replaced path(s), enter the following commands to resume activity on the replaced host path(s):

    CLI> SET THIS_CONTROLLER PATH_A
    CLI> SET THIS_CONTROLLER PATH_B

7.5 Internal CI Cables (HSJ-series)

Servicing internal CI cables causes down time for the affected controller, because both host paths (A and B) must be disabled for the duration of the procedure. Use the procedures in this section when you are removing and replacing internal CI cables.

7.5.1 Tools Required

You will need the following tools to remove or replace internal CI cables:

• 5/32-inch Allen wrench
• Tie wrap cutters
• Flat-head screwdriver

7.5.2 Precautions

Refer to Chapter 1 for CI cable handling guidelines.

Removing and Replacing Field Replaceable Units 7-25

7.5.3 Cable Removal

Use the following procedure to remove internal CI cables:

1. Determine that the paths are in fact suspect before proceeding. Refer to Chapter 5 for troubleshooting guidelines.

2. Enter the following commands to halt activity on both host paths:

    CLI> SET THIS_CONTROLLER NOPATH_A
    CLI> SET THIS_CONTROLLER NOPATH_B

CAUTION: Always disconnect the external CI cable from the star coupler first, then disconnect it
160. 3 IO Requests 40794 rite Count 40793 Erase Count 0 0 Write 326344 ted IO Requests 13282 rite Count 13281 Erase Count 0 Write 106248 Total 326344 0 ted Total 106248 Reuse Parameters stop continue restart change_unit stop DILX Normal Termination HSJ gt 6 2 9 3 DILX Examples Auto Configure with All Units In Example 6 8 DILX is run using the Auto Configure option with the all units option Example 6 8 Auto Configuration with All Units HSJ gt run dilx Copyright Digital Equipment Corporation 1993 Disk Inline Exerciser version 1 4 The Auto Configure option will automatically select for testing half or all of the disk units configured It will perform a very thorough test with WRITES enabled The user will only be able to select the run time and performance summary options and whether or not to test a half or full configuration The user will not be able to specify specific units to test The Auto Configure option is only recommended for initial installations n y If you want to test a dual redundant subsystem it is recommended that you pick option 2 on the first controller and then option 2 on the other controller Auto Configure options are Do you wish to perform an Auto Configure y n 1 Configure all disk units for testing This is recommended for a single controller subsystem 2 Configure half of all disk units for testing this is recommended for a dual controller subsys
162. 3 1 0 T110
           DEC TZ877 (C) DEC 930A
    TAPE130   tape    3  3  0  T130
           DEC TZ877 (C) DEC 930A
    CDROM230  cdrom   2  3  0  D623
           DEC RRD44 (C) DEC 3593
    CDROM240  cdrom   2  4  0  D624
           DEC RRD44 (C) DEC 3593

    A full listing of devices attached to the controller.

B-46 Command Line Interpreter

SHOW DISKS

Shows all disk drives and drive information.

Format

    SHOW DISKS

Description

    The SHOW DISKS command displays all the disk drives known to the controller.

Qualifiers

    FULL
    If the FULL qualifier is specified, additional amplifying information may be displayed after each device.

Examples

    1. CLI> SHOW DISKS
       Name  Type  Port Targ Lun  Used by
       DI0   disk  1    0    0    D100
       DI1   disk  1    1    0    D110

       A basic listing of disks attached to the controller.

    2. CLI> SHOW DISKS FULL
       Name  Type  Port Targ Lun  Used by
       DI0   disk  1    0    0    D100
             DEC RZ35 (C) DEC X388
       DI1   disk  1    1    0    D110
             DEC RZ26 (C) DEC T386

       A full listing of disks attached to the controller.

Command Line Interpreter B-47

SHOW disk-container-name

Shows information about a disk drive.

Format

    SHOW disk-container-name

Parameters

    disk-container-name
    The name of the disk drive that will be displayed.

Description

    The SHOW disk-container-name command is used to show specific information about a particular disk.

Examples

    1. CLI> SHOW DI3
       Name  Type  Port Targ Lun  Used by
       DI3   disk  1    3    0    D130
             DEC RZ26 (C) DEC X388

       A listing of disk DI3.

B-48 Command Line Interpreter
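When SHOW DISKS output is captured to a log file, the listing can be turned into records for later comparison against a configuration sheet. The following is a minimal sketch, assuming the columns are whitespace separated in the order shown in the examples above (Name, Type, Port, Targ, Lun, Used by). The names `DiskEntry` and `parse_show_disks` are hypothetical, and the real terminal layout may differ from this flattened rendering.

```python
from typing import List, NamedTuple


class DiskEntry(NamedTuple):
    """One device row from a captured SHOW DISKS listing (hypothetical type)."""
    name: str
    type: str
    port: int
    target: int
    lun: int
    used_by: str


def parse_show_disks(text: str) -> List[DiskEntry]:
    """Extract device rows; skip prompts, the header, and FULL detail lines."""
    entries = []
    for line in text.splitlines():
        fields = line.split()
        # A device row has a name, a type, and three numeric address fields.
        if len(fields) < 5:
            continue
        if not all(field.isdigit() for field in fields[2:5]):
            continue
        used_by = fields[5] if len(fields) > 5 else ""
        entries.append(
            DiskEntry(fields[0], fields[1], int(fields[2]),
                      int(fields[3]), int(fields[4]), used_by)
        )
    return entries


if __name__ == "__main__":
    sample = """CLI> SHOW DISKS FULL
Name Type Port Targ Lun Used by
DI0 disk 1 0 0 D100
    DEC RZ35 (C) DEC X388
DI1 disk 1 1 0 D110
    DEC RZ26 (C) DEC T386
"""
    for entry in parse_show_disks(sample):
        print(entry)
```

Lines whose third through fifth fields are not all numeric (the prompt, the column header, and the vendor detail lines printed by FULL) are ignored, so the same function handles both the basic and the FULL listing.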
163. 42020100 42030100 42060100 42340100 42350100 42640100 42680102 42690101 42742001 42752002 42760102 42770102 60000100 60010100 60030100 60040100 60050100 60060100 60070100 60080100 60090100 60100100 60110100 60120100 60130100 60140100 60150100 60160100 60170100 60180100 60190100 60250100 60260100 60270100 60280100 60290100 60400100 60410100 C 110 C 111 D 3 C 111 C 111 C 111 C 111 C 111 C 111 C 111 C 111 C 111 C 111 C 111 C 111 C 111 C 111 C 112 C 112 C 112 C 112 T H um N DOUODOUDOJDO A 0000 Ww Codes Last Failure Codes firmware cont d 60420100 60430100 60440100 60450100 60460100 60480100 60490100 60500100 60550100 60560100 60570100 60580100 60610100 60620100 60640100 60650100 60660100 60670100 60680100 61020100 61090100 62000100 62010100 62020100 62030100 80010100 80020100 80030100 80040100 80050100 80060100 80070100 80080100 80090100 80100100 80120100 80130100 80140100 81010100 81020100 81030100 81040100 81050100 81060100 81070100 81080100 81090100 81100100 81110100 81120100 81130100 81140100 83010100 83020100 83030100 C 114 C 114 C 114 C 114 C 115 C 115 C 115 C 115 C 115 C 115 C 115 C 115 C 115 C 115 C 115 C 115 C 115 C 115 C 115 C 116 C 116 C 116 C 116 C 116 C 116 C 116 C 116 C 116 C 116 C 116 C 116 C 116 C 11
164. 55 AAAA 5555 DB6C 2D2D 2D2D 2D2D D2D2 D2D2 D2D2 2D2D 2D2D D2D2 D2D2 2D2D D2D2 2D2D D2D2 2D2D D2D2 6DB6 0001 0002 0004 0008 0010 0020 0040 0080 0100 0200 0400 0800 1000 2000 4000 8000 FFFE FFFD FFFB FFF7 FFEF FFDF FFBF FF7F FEFF FDFF FBFF F7FF EFFF BFFF DFFF 7FFF DB6D B6DB 6DB6 DB6D B6DB 6DB6 DB6D B6DB 6DB6 DB6D B6DB 6DB6 DB6D 3333 3333 3333 1999 9999 9999 B6D9 B6D9 B6D9 B6D9 FFFF FFFF 0000 0000 DB6C DB6C 9999 1999 699C E99C 9921 9921 1921 699C 699C 0747 0747 0747 699C E99C 9999 9999 FFFF

Diagnostics, Exercisers, and Utilities 6-21

6.2.9 DILX Examples

This section provides DILX examples using different options.

6.2.9.1 DILX Example Using All Defaults

In Example 6-6, DILX is run using all defaults. DILX is executed in read-only mode, so no data on the units under test is destroyed. The entire user-available LBN range on each disk is accessible for DILX testing. DILX was invoked from a maintenance terminal.

Example 6-6 Using All Defaults (DILX)

    HSJ> show disk
    Name     Type  Port Targ LUN  Used by
    DISK100  disk  1    0    0    D10
    DISK120  disk  1    2    0    D12
    DISK140  disk  1    4    0    D14
    DISK210  disk  2    1    0    D21
    DISK230  disk  2    3    0    D23
    DISK610  disk  6    1    0    D61
    DISK630  disk  6    3    0    D63
    HSJ> run dilx
    Copyright Digital Equipment Corporation 1993
    Disk Inline Exerciser version 1.4
    The Auto Configure option will automatically select for testing
165. 6 C 116 C 117 C 117 C 117 C 117 C 117 C 117 C 117 C 117 C 117 C 117 C 117 C 117 C 117 C 117 C 118 C 118 C 118 C 118 C 118 C 118 C 118 Index 7 Codes Last Failure Codes firmware cont d Index 8 080A0000 200A0000 020A0100 028A0100 030A0100 032A0100 081A0100 402A0100 601A0100 602A0100 604A0100 800A0100 810A0100 811A0100 025A0102 424B0001 020B0100 021B0100 028B0100 029B0100 032B0100 080B0100 200B0100 402B0100 407B0100 420B0100 600B0100 601B0100 602B0100 603B0100 604B0100 606B0100 800B0100 810B0100 811B0100 081B0101 426B0101 025B0102 027B0102 40B40101 424C0001 020C0100 021C0100 028C0100 029C0100 080C0100 200C0100 402C0100 407C0100 409C0100 420C0100 600C0100 601C0100 602C0100 603C0100 C 109 C 110 C 97 C 100 C 101 C 101 C 110 C 111 C 113 C 114 C 115 C 117 C 117 C 118 C 99 C 112 C 97 C 97 C 100 C 100 C 101 C 109 C 110 C 111 C 111 C 112 C 113 C 113 C 114 C 114 C 115 C 115 Codes Last Failure Codes firmware cont d 606C0100 610C0100 800C0100 810C0100 811C0100 033C0101 025C0102 021D0100 027D0100 028D0100 040D0100 080D0100 409D0100 420D0100 600D0100 601D0100 602D0100 604D0100 605D0100 606D0100 800D0100 810D0100 200D0101 402D0101 020E0100 021E0100 027E0100 028E0100 03180100 408E0100 600E0100 601E0100 602E0100 605E0100 606E0100
166. 6-7 lists TILX-defined error codes and definitions for TILX-detected errors.

Table 6-7 TILX Abort Codes and Definitions

    Definition
    Illegal Data Pattern Number found in data pattern header.
    No write buffers correspond to data pattern.
    Read data do not match write buffer.
    TILX/TAPE record size mismatch.
    A tape mark was detected in a place where it was not expected.
    EOT encountered in unexpected position.

6.4 Disk Inline Exerciser (HSZ-series Controllers)

Note: The information on DILX for the HSZ-series controllers is presented separately because the messages and performance summaries differ from those of the HSJ- and HSD-series controllers.

DILX is a diagnostic tool used to exercise the data transfer capabilities of selected disks connected to an HSZ-series controller. DILX exercises disks in a way that simulates a high level of user activity. Using DILX, you can read and write to all customer-available data areas. DILX can also be run on CDROMs, but only in read-only mode. Thus, DILX can be used to determine the health of a controller and the disks connected to it, and to acquire performance statistics. You can run DILX from a maintenance terminal.

6-50 Diagnostics, Exercisers, and Utilities

DILX now allows for auto-configuring of drives. This allows for quick configuring and testing of all units at once. Please be aware that customer data will be lost by running this test. D
167. 7 C 77 C 77 C 77 Instance Codes 01010302 C 78 01032002 C 79 02020064 C 90 02032001 C 79 02042001 C 79 02072201 C 80 02082201 C 80 02090064 C 89 02110064 C 90 03010101 C 84 03022002 C 84 03034002 C 84 03044402 C 84 03052002 C 85 03062002 C 85 03070101 C 85 03080101 C 85 03094002 C 89 03104002 C 89 03134002 C 89 03144002 C 89 03154002 C 89 03164002 C 89 03170064 C 89 03180064 C 89 03194002 C 89 03204002 C 90 03214002 C 90 03224002 C 90 03234002 C 90 03244002 C 90 03254002 C 90 03270101 C 86 03644002 C 91 03674002 C 91 03694002 C 91 03704002 C 91 03714002 C 91 03720064 C 91 03730064 C 91 03744002 C 91 03754002 C 91 Index 4 Codes Instance Codes cont d 03760101 03774002 03784002 03794002 03804002 03820101 03832002 03844002 03854402 03862002 03872002 03880101 03890101 03964002 03994002 07050064 40016001 40026001 40440064 82012002 82022202 82032202 82042002 82052002 82062002 82072002 82082002 01020304 0311430A 03124304 03264504 03284504 03680004 03814504 40036404 40040204 40076404 40096404 40150204 40290104 40510204 40520204 40530204 40540204 40550204 40560204 40570204 40580204 40590204 40600204 40610204 40620204 40630204 40640204 40650204 40660204 00 NN 0 1 Ol TOF bee Pe Oo Oo Rh CYXO0 OQ CC Y0 3001006060 000 OOo Ooo Qo Qo Qo Oo Oo Oo Qo Qo OO o
168. 7 04 17 05 17 06 17 07 17 08 18 00 18 01 18 02 18 03 18 04 18 05 18 06 19 00 19 01 19 02 19 03 20 00 21 00 21 01 22 00 24 00 25 00 26 00 26 01 26 02 26 03 27 00 28 00 28 01 29 00 29 01 29 02 29 03 30 00 30 01 30 02 30 03 31 00 31 01 32 00 Index 10 o0 C2 2 Q ATAO aon 252 25 GOS aa sax Q2 02 73 C 75 73 C 75 EI oO O so ao 73 C 75 73 C 75 73 C 75 73 C 75 73 C 75 73 C 75 OQQA anaaaaaa 1 1 1 1 1 10 0 oO oO o o o o so amp 73 C 75 73 C 75 73 C 75 73 C 75 73 C 75 73 C 76 Q000000 nO eOQ0o0000n0 2 0 10 0 10 1 1 1 1 ocooooooooo PEP CP ELEC C2 C2 PELE LP 2 2 2 2 2 2 2 Q2 C2 2 EP ca Ca PPE Ca a Ca Ca C2 C a C e a e e ca ca ca e e ea ARARARRARARWRARABRAARBRBBABAADBBBHARDDBADBBHBAARBABDBBABBARBBBAABAABABWAGS ANANN N N N N N NAAA O OQ O OQ O OQ O 000000050 O OQ O Qo YAAA O O AANA O O Ot Ot Ot Ot O OON Codes SCSI ASC ASCQ Codes cont d 32 01 33 00 37 00 39 00 40 00 41 00 42 00 43 00 44 00 45 00 46 00 47 00 48 00 49 00 50 00 50 01 50 02 51 00 52 00 53 00 53 01 53 02 57 00 63 00 64 00 11 0A 0A 00 1A 00 2A 00 3A 00 4A 00 5A 00 2A 01 5A 01 2A 02 5A 02 5A 03 11 0B 1B 00 2B 00 3B 00 4B 00 5B 00 3B 01 5B 01 3B 02 5B 02 5B
169. 800 Series Data Center Cabinet Installation and User's Guide for more details.

Note: In Figures 3-1 through 3-5, "S" indicates a BA350-SB storage shelf and "C" indicates a BA350-MA controller shelf.

Figure 3-1 shows the loading sequence for storage and controller shelves in an SW800-series data center cabinet. Figure 3-2 shows the loading sequence for storage and controller shelves when one or two TZ8xx-series tape devices are installed. Figure 3-3 shows the loading sequence for storage and controller shelves when three or four TZ8xx-series tape devices are installed.

• Standard shelf configuration
  A standard of three or four BA350-MA shelves connected to 18 BA350-SB shelves in a single SW800-series data center cabinet is suggested.

• Two device shelves per port (jumpered pairs)
  Two BA350-SB shelves can be joined on the same controller port with the following restrictions:
    - The SCSI-2 cable to the first BA350-SB storage shelf is 1.0 meter or less.
    - The SCSI-2 cable from the first BA350-SB shelf to the second shelf is 0.5 meters or less. This requires the two shelves to be immediately adjacent to each other.
    - The first BA350-SB storage shelf is configured for an unterminated single SCSI cable.

• TZ8x7 half-rack tape loader
  Any TZ8x7 half-rack tape loader device must be located at the top front positions, filling two or four top BA350-SB shelf positions (front and back). Note that each tape l
170. 9 HC 0 SC 1
    Err in Hex IC 03094002 PTL 05 05 00 Key 01 ASC Q 18 86 HC 0 SC 1
    Total Errs Hard Cnt 0 Soft Cnt 2

where:

Represents the unit number and the total I/O requests to this unit.

Represents the unit number and total I/O requests to this unit. All values for the following codes are described in Appendices C and D. This also includes the following items associated with this error, and the total number of hard and soft errors for this unit:

• The HSJ/HSD-series Instance code (in hex)
• The Port/Target/LUN (PTL)
• The SCSI Sense Key
• The SCSI ASC and ASCQ (ASC/Q) codes
• The total hard and soft count for this error

Represents information about the first two unique errors. All values for the following codes are described in Appendices C and D. This also includes the following items associated with this error, and the total number of hard and soft errors for this unit:

• The HSJ/HSD-series Instance code (in hex)
• The Port/Target/LUN (PTL)
• The SCSI Sense Key

6-28 Diagnostics, Exercisers, and Utilities

• The SCSI ASC and ASCQ (ASC/Q) codes
• The total hard and soft count for this error

A line of this format may be displayed up to three times in a performance summary. There is one line for each unique error reported to DILX for this unit, up to three errors.

Represents the total hard and soft errors experienced for this unit.

The following is an example of a DILX performance display where
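The fields listed above (Instance code, PTL, Sense Key, ASC/Q, and the hard and soft counts) can be pulled out of a captured "Err in Hex" line with a regular expression. The following is a minimal sketch that accepts either the space-separated form shown in this excerpt or a form with colon and slash separators; the exact separators printed by DILX are an assumption here, and `parse_err_line` is a hypothetical name.

```python
import re
from typing import Optional

# Field separators may be spaces, colons, or slashes; this tolerance is an
# assumption, since the excerpt shows the line with punctuation stripped.
_HEX2 = r"([0-9A-Fa-f]{2})"
ERR_LINE = re.compile(
    r"\bIC[:\s]*([0-9A-Fa-f]{8})\s+"
    r"PTL[:\s]*" + _HEX2 + r"[/\s]" + _HEX2 + r"[/\s]" + _HEX2 + r"\s+"
    r"Key[:\s]*" + _HEX2 + r"\s+"
    r"ASC[/\s]Q[:\s]*" + _HEX2 + r"[/\s]" + _HEX2 + r"\s+"
    r"HC[:\s]*(\d+)\s+SC[:\s]*(\d+)"
)


def parse_err_line(line: str) -> Optional[dict]:
    """Return the fields of one 'Err in Hex' summary line, or None if no match."""
    match = ERR_LINE.search(line)
    if match is None:
        return None
    ic, port, target, lun, key, asc, ascq, hard, soft = match.groups()
    return {
        "instance_code": ic.upper(),
        "ptl": (int(port, 16), int(target, 16), int(lun, 16)),
        "sense_key": int(key, 16),
        "asc": int(asc, 16),
        "ascq": int(ascq, 16),
        "hard_count": int(hard),
        "soft_count": int(soft),
    }


if __name__ == "__main__":
    print(parse_err_line(
        "Err in Hex IC 03094002 PTL 05 05 00 Key 01 ASC Q 18 86 HC 0 SC 1"
    ))
```

The instance code and the ASC/ASCQ pair can then be looked up in the tables of Appendix C. Here ASC 18 with ASCQ 86 and Sense Key 01 identify a recovered (soft) error, consistent with the soft count of 1.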
171. A shelf's SCSI connectors. This slot is SCSI ID 7. By using SCSI ID 7, SCSI ID 6 (the other controller slot) is available as an additional ID on the device shelf.

The maximum recommended controller subsystem configuration is six devices per controller port. This allows for the addition of another controller and additional power supplies in the storage shelves. A nonredundant controller configuration can support seven devices per port. However, Digital still recommends six devices per port to ease future upgrades.

HSZ-series controller: The HSZ-series controller may currently be configured only as nonredundant. Two nonredundant HSZ-series controllers may not be placed in the same BA350-MA controller shelf.

Configuration Rules and Restrictions 3-15

3.5.2 Dual-Redundant Controllers

The following guidelines apply to dual-redundant controllers:

Only HSJ- and HSD-series controllers may be configured as dual-redundant.

Dual-redundant controllers are located in the same BA350-MA shelf and are connected to each other through the shelf backplane. Both controllers have access to all the devices on each other's ports. This setup increases availability and provides for failover when one controller in the pair fails. The surviving controller takes over service to all devices.

Dual-redundant configurations follow the same guidelines as nonredundant configurations, except there is no option to increase to seven devices per port. Bo
172. ACK
    0004   CONNECT_REC
    0005   ACCEPT_SENT
    0006   REJECT_SENT
    0007   OPEN
    0008   DISCONNECT_SENT
    0009   DISCONNECT_REC
    000A   DISCONNECT_ACK
    000B   DISCONNECT_MATCH

Table C-9 Supported SCSI Device Type Codes

    Code  Description
    00    Direct Access Devices (such as magnetic disk)
    01    Sequential Access Devices (such as magnetic tape)
    05    CDROM Devices
    08    Medium Changer Devices (such as jukeboxes)

C-60 HSJ Series Error Logging

Table C-10 SCSI Command Operation Codes

          Supported Device Types
    Code  (See Table C-9)   Description
    00    00 01 05 08       TEST UNIT READY
    01    01                REWIND
    01    00 05 08          REZERO UNIT
    03    00 01 05 08       REQUEST SENSE
    04    00                FORMAT UNIT
    05    01                READ BLOCK LIMITS
    07    08                INITIALIZE ELEMENT STATUS
    07    00                REASSIGN BLOCKS
    08    00 01 05          READ (6 byte)
    0A    00 01             WRITE (6 byte)
    0B    00 05             SEEK (6 byte)
    0F    01                READ REVERSE
    10    01                WRITE FILEMARKS
    11    01                SPACE
    12    00 01 05 08       INQUIRY
    13    01                TAPE VERIFY
    14    01                RECOVER BUFFERED DATA
    15    00 01 05 08       MODE SELECT (6 byte)
    16    00 01 05 08       RESERVE UNIT
    17    00 01 05 08       RELEASE UNIT
    18    00 01 05          COPY
    19    01                ERASE
    1A    00 01 05 08       MODE SENSE (6 byte)
    1B    00 05             START STOP UNIT
    1B    01                LOAD UNLOAD
    1C    00 01 05 08       RECEIVE DIAGNOSTIC RESULTS
    1D    00 01 05 08       SEND DIAGNOSTIC
    1E    00 01 05 08       PREVENT ALLOW MEDIUM REMOVAL
    25    00 05             READ CAPACITY
    28    00 05             READ (10 byte)
    2A    00                WRITE (10 byte)
    2B    08                POSITION TO ELEMENT
    2B    01                LOCATE
    2B    00 05             SEEK
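Because several operation codes in Table C-10 (01, 07, 1B, 2B) mean different commands on different device types, an opcode from an event log can only be named once the device type code from Table C-9 is known. The following is a minimal sketch of that two-key lookup using a handful of rows copied from the tables above. The names `DEVICE_TYPES`, `OPCODE_ROWS`, and `describe_opcode` are illustrative, and the row list is deliberately partial.

```python
from typing import Optional

# Device type codes from Table C-9.
DEVICE_TYPES = {
    0x00: "Direct Access",
    0x01: "Sequential Access",
    0x05: "CDROM",
    0x08: "Medium Changer",
}

# A partial copy of Table C-10: (opcode, supported device types, description).
OPCODE_ROWS = [
    (0x00, (0x00, 0x01, 0x05, 0x08), "TEST UNIT READY"),
    (0x01, (0x01,), "REWIND"),
    (0x01, (0x00, 0x05, 0x08), "REZERO UNIT"),
    (0x07, (0x08,), "INITIALIZE ELEMENT STATUS"),
    (0x07, (0x00,), "REASSIGN BLOCKS"),
    (0x1B, (0x00, 0x05), "START STOP UNIT"),
    (0x1B, (0x01,), "LOAD UNLOAD"),
    (0x2B, (0x08,), "POSITION TO ELEMENT"),
    (0x2B, (0x01,), "LOCATE"),
]


def describe_opcode(opcode: int, device_type: int) -> Optional[str]:
    """Name an opcode for a given device type, or None if no row matches."""
    for code, types, description in OPCODE_ROWS:
        if code == opcode and device_type in types:
            return description
    return None


if __name__ == "__main__":
    for dev in (0x00, 0x01):
        print(DEVICE_TYPES[dev], "opcode 1B:", describe_opcode(0x1B, dev))
```

Keeping the device types as a tuple per row mirrors the layout of the printed table, so additional rows can be added by copying them across without restructuring.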
173. ADAPTER
    SYSGEN> EXIT
    $

4-10 Normal Operation

• For OpenVMS AXP:

    MCR SYSMAN
    SYSMAN> IO CONNECT FYA0/NOADAPTER/DRIVER=SYS$FYDRIVER
    SYSMAN> EXIT

Once FYDRIVER is loaded, you may make the virtual terminal connection as follows:

    SET HOST/LOG=CONFIGURATION.INFO/DUP/SERVER=MSCP$DUP/TASK=CLI SCS_nodename

4.7 Virtual Terminal (HSZ-series Controllers)

A virtual terminal port can be created through a host-based application called HSZUTIL (HSZ-series controller). This program uses SCSI diagnostic send and receive commands to deliver and receive characters to and from the HSZ-series CLI and local programs. See Chapter 6 for more information on the HSZUTIL application.

4.8 VAXcluster Console System

You can run VAXcluster Console System (VCS) with any HS controller. If you are unfamiliar with VCS, refer to the VCS Software Manual for instructions.

Note: VCS can only be used from a terminal connected to a maintenance terminal port.

4.9 Operating Systems

The following sections describe particulars of the host operating systems that may help in understanding and servicing the HS controllers. The two primary operating systems that support the HS controllers are the OpenVMS and DEC OSF/1 AXP operating systems, as shown in Table 4-1.

Table 4-1 Operating System Support

    Operating System  HSJ-series  HSD-series  HSZ-series
    OpenVMS AXP       V1.5        V1.5        N/S
    OpenVMS VAX       V5.5-2      V5.5-2      N/S
    VAX VMS V5 5 1
174. ART OTHER command

2. Press and hold the Reset button while inserting the program card.
3. Release Reset; the controller will initialize.
4. Configure the new controller by referring to the HS Array Controller User's Guide.

If the controller initializes correctly, its green reset LED begins to flash at 1 Hz. If an error occurs during initialization, the OCP displays a code. Refer to Chapter 5 to analyze the code. Restore parameters for the new controller using the steps in Section 7.11.2.5.

7.11.2.5 Restoring Parameters

The new controller module has no initial parameters, so you must use a maintenance terminal to enter them. For the parameters, refer to the CONFIGURATION.INFO file or to the configuration sheet packaged with your system, whichever is more current. When installing a replacement, be sure to use the same parameters as the removed controller. Follow these steps:

CAUTION: Do not install HSJ-series CI host port cables until after setting all parameters listed here. Failure to follow this procedure may result in adverse effects on the host cluster.

Removing and Replacing Field Replaceable Units 7-45

CAUTION: SET FAILOVER establishes controller-to-controller communication and copies configuration information. Always enter this command on one controller only. COPY=configuration-source specifies where the good configuration data are located. Never blindly specify SET FAILOVER. Know where yo
175. ATE remote node name remote connection id and connection state fields are undefined Received an unrecognized SCS message Note that in this instance if the connection ID field is zero the content of the VCSTATE remote node name remote connection id and connection state fields are undefined Received SCS CONNECT_RSP with an unrecognized status Connection is broken by Host Interconnect Services Received SCS REJECT_REQ with an invalid reason Received SCS APPL_MSG with no receive credit available HSJ Series Error Logging C 83 C 84 Table C 27 Device Services Nontransfer Error Event Log Template 41 Instance MSCP Event Codes Instance Code MSCP Event Code Description 021B0064 021A0064 03010101 03820101 03B40101 03C80101 03022002 03832002 03B52002 03C92002 03034002 03844002 03B64002 03CA4002 03044402 HSJ Series Error Logging 0014 0014 006A 006A 006A 006A 002A 002A 002A 002A 016A 016A 016A 016A 01AA Disk Bad Block Replacement attempt completed for a read of controller metadata from a location outside the user data area of the disk Note that due to the way Bad Block Replacement is performed on SCSI disk drives information on the actual replacement blocks is not available to the controller and is therefore not included in the event report Disk Bad Block Replacement attempt c
176. B-10 Command Line Interpreter

CLEAR_ERRORS CLI

Stops the display of errors at the CLI prompt.

Format

  CLEAR_ERRORS CLI

Description

Errors detected by controller firmware are listed before the CLI prompt. These errors are listed even after the error condition is rectified, until either the controller is restarted or the CLEAR_ERRORS CLI command is entered.

Note: This command does not clear the error conditions; it only clears the reporting of the errors at the CLI prompt.

Examples

  CLI>
  All NVPM components initialized to their default settings.
  CLI> CLEAR_ERRORS CLI
  CLI>

This clears the message "All NVPM components initialized to their default settings." that was displayed at the CLI prompt.

DELETE container-name

Deletes a container from the list of known containers.

Format

  DELETE container-name

Parameters

container-name
  Specifies the name that identifies the container. This is the name given the container when it was created using the ADD command (ADD DEVICE, ADD STRIPESET, and so forth).

Description

Checks to see if the container is used by any other containers or a unit. If the container is in use, an error will be displayed and the container will not be deleted. If the container is not in use, it is deleted.

Examples

  CLI> DELETE DISK0

DISK0 is deleted from the known list of containers.

  CLI> DELETE
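The in-use check described for DELETE can be pictured as a small dependency test: a container may be removed only if no other container and no unit refers to it. The following Python sketch models that rule under stated assumptions; the Registry class, its method names, and the ContainerInUse exception are hypothetical and are not the controller's actual implementation.

```python
class ContainerInUse(Exception):
    """Raised when a container is still referenced and cannot be deleted."""


class Registry:
    """Hypothetical model of the controller's list of known containers."""

    def __init__(self):
        self.containers = {}  # container name -> set of member container names
        self.units = {}       # unit name -> container name

    def add_container(self, name, members=()):
        self.containers[name] = set(members)

    def add_unit(self, unit, container):
        self.units[unit] = container

    def delete(self, name):
        # Refuse if another container (for example, a stripeset) uses it.
        for other, members in self.containers.items():
            if name in members:
                raise ContainerInUse(f"{name} is used by {other}")
        # Refuse if a unit is built on it.
        for unit, container in self.units.items():
            if container == name:
                raise ContainerInUse(f"{name} is used by unit {unit}")
        del self.containers[name]
```

With this model, a disk that belongs to a stripeset cannot be deleted until the stripeset is deleted, and the stripeset cannot be deleted until its unit is removed.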
177. C register Last Failure Parameter 3 contains the PCB copy of the 710 DNAD register Last Failure Parameter 4 contains the PCB copy of the 710 DSP register Last Failure Parameter 5 contains the PCB copy of the 710 DSPS register Last Failure Parameter 6 contains the PCB copies of the 710 SSTAT2 SSTAT1 SSTATO DSTAT registers Last Failure Parameter 7 contains the PCB copies of the 710 LCRC RESERVED ISTAT DFIFO registers continued on next page HSJ Series Error Logging C 103 Table C 35 Cont Device Services Last Failure Codes Code Description 03380188 03390108 033C0101 C 104 HSJ Series Error Logging A 710 s DSTAT register contains multiple asserted bits or an invalidly asserted bit or both Last Failure Parameter 0 contains the PCB reg710_ptr value Last Failure Parameter 1 contains the PCB copy of the 710 TEMP register Last Failure Parameter 2 contains the PCB copy of the 710 DBC register Last Failure Parameter 3 contains the PCB copy of the 710 DNAD register Last Failure Parameter 4 contains the PCB copy of the 710 DSP register Last Failure Parameter 5 contains the PCB copy of the 710 DSPS register Last Failure Parameter 6 contains the PCB copies of the 710 SSTAT2 SSTAT1 SSTATO DSTAT registers Last Failure Parameter 7 contains the PCB copies of the 710 LCRC RESERVED ISTAT DFIFO registers An unknown interrupt code was found in a 710 s DSPS register Last
178. Index: Codes, Recommended Repair Action Codes (cont'd); SCSI ASC ASCQ Codes.
179. C 56 000E0009 C 57 D 2 62 C 56 001E0009 C 57 80 C 56 000F0009 C 57 81 C 56 001F0009 C 57 D 3 82 C 56 HSJ30 40 Controller Vendor Specific SCSI 83 C 56 ASC ASCQ Codes flashing OCP 5 4 80 03 C 77 Format Codes 80 06 C 77 00 C 22 C 26 C 32 C 36 C 38 C 40 80 07 C 77 C 43 8201 C 77 01 C 28 C 30 C 34 8404 C 77 02 C 45 8505 C 77 05 C 50 89 00 C 77 09 C 48 93 00 C 77 OA C 54 8A 00 C 77 Host Interconnect Services Status Codes AO 00 C 77 00000000 C 56 AO 01 C 77 00000001 C 56 AO 02 C 77 00000002 C 56 AO 03 C 77 00000003 C 56 AO 04 C 77 00000004 C 56 AO 05 C 77 00000009 C 56 A1 00 C 77 00000032 C 56 A1 01 C 77 00000033 C 56 A1 02 C 77 00000034 C 56 A1 03 C 77 00000035 C 56 BO 00 C 77 00000036 C 57 BO 01 C 77 00000064 C 57 D 2 8C 04 C 77 00000065 C 57 D 2 DO 01 C 77 00010009 C 57 D 2 D0 02 C 77 00020009 C 57 D0 03 C 77 00030009 C 57 D 2 D1 00 C 77 00040009 C 57 D 2 D1 02 C 77 00050009 C 57 D1 03 C 77 00060009 C 57 D1 04 C 77 00070009 C 57 D 2 D1 05 C 78 00080009 C 57 D1 07 C 78 00090009 C 57 D1 08 C 78 00100009 C 57 D 2 D1 09 C 78 00110009 C 57 D1 0A C 78 00120009 C 57 D 2 D1 0B C 78 00130009 C 57 D2 00 C 78 Index 3 Codes HSJ30 40 Controller Vendor Specific SCSI ASC ASCQ Codes cont d D3 00 D4 00 D5 02 D7 00 8F 00 3F 85 3F 87 3F 88 3F 90 3F Co 3F C2 3F D2 C 78 C 78 C 78 C 78 C 77 C 77 C 77 C 77 C 7
180. C/Q:18/89 HC:0 SC:1
Err in Hex: IC:03094002 PTL:05/05/00 Key:01 ASC/Q:18/86 HC:0 SC:1
Total Errs: Hard Cnt:0 Soft Cnt:2

where:

o Represents the unit number and the total I/O requests to this unit.

o Represents the unit number and total I/O requests to this unit. All values for the following codes are described in Appendices C and D. This also includes the items associated with this error and the total number of hard and soft errors for this unit:
  - The HSJ/HSD-series Instance code (in hex)
  - The Port/Target/LUN (PTL)
  - The SCSI Sense Key
  - The SCSI ASC and ASCQ (ASC/Q) codes
  - The hard and soft count for this error

6-48 Diagnostics, Exercisers, and Utilities

o Represents information about the first two unique errors for this unit. All values for the following codes are described in Appendices C and D. This also includes the items associated with this error and the total number of hard and soft errors for this unit:
  - The HSJ/HSD-series Instance code (in hex)
  - The Port/Target/LUN (PTL)
  - The SCSI Sense Key
  - The SCSI ASC and ASCQ (ASC/Q) codes
  - The hard and soft count for this error

A line of this format may be displayed up to three times in a performance summary. There is one line for each unique error reported to TILX for this unit, up to three errors.

The following is an example of a TILX performance display where performance statistics were not selected and where a controller error was dete
181. C closed due to data scan
001A0009  VC closed due to data timeout
001B0009  VC closed due to unrecognized packet
001C0009  VC closed due to data transmit failure
001D0009  VC closed due to CI ID complete failure
001E0009  VC closed due to lost command
001F0009  Not implemented in CI environment

Table C-4 CI Message Operation Codes

Code  Description
00    Reserved
01    DG
02    MSG
03    CNF
04    MCNF
05    IDREQ
06    RST
07    STRT
08    DATREQ0
09    DATREQ1
0A    DATREQ2
0B    ID
0C    PSREQ
0D    LB
0E    MDATREQ
0F    RETPS
10    SNTDAT
11    RETDAT
12    SNTMDAT
13    RETMDAT

Table C-5 CI Virtual Circuit State Codes

Code  Description
0001  VC_CLOSED
0002  START_SENT
0003  START_REC
0004  VC_OPEN
0005  VC_CLOSING

Table C-6 Port/Port Driver Message Operation Codes

Code  Description
0000  START
0001  STACK
0002  ACK
0003  SCS_DG
0004  SCS_MSG
0005  ERROR_LOG
0006  NODE_STOP

Table C-7 System Communication Services Message Operation Codes

Code  Description
0000  CONNECT_REQ
0001  CONNECT_RSP
0002  ACCEPT_REQ
0003  ACCEPT_RSP
0004  REJECT_REQ
0005  REJECT_RSP
0006  DISCONNECT_REQ
0007  DISCONNECT_RSP
0008  CREDIT_REQ
0009  CREDIT_RSP
000A  APPL_MSG
000B  APPL_DG

Table C-8 CI Connection State Codes

Code  Description
0000  CLOSED
0001  LISTENING
0002  CONNECT_SENT
0003  CONNECT_
182. CLI> ADD UNIT D100 DISK0

Note that no INITIALIZE is required because DISK0 has already been initialized.

Creating a Transportable Unit From a Disk Device

  CLI> ADD DISK DISK0 2 0 0 TRANSPORTABLE
  CLI> INITIALIZE DISK0
  CLI> ADD UNIT D0 DISK0

or

  CLI> ADD DISK DISK0 2 0 0
  CLI> SET DISK0 TRANSPORTABLE
  CLI> INITIALIZE DISK0
  CLI> ADD UNIT D0 DISK0

Deleting the Unit, Stripeset, and All Disks Associated With the Stripeset

  CLI> DELETE D0
  CLI> DELETE STRIPE0
  CLI> DELETE DISK0
  CLI> DELETE DISK1
  CLI> DELETE DISK2
  CLI> DELETE DISK3

C HSJ Series Error Logging

This appendix details the errors that the HSJ-series controller reports in its host event logs under the OpenVMS operating system, and explains how to extract the information from the logs.

Note: Host event log translations are correct as of the date of publication of this manual. However, log information may change with firmware updates. Refer to your StorageWorks Array Controller Operating Firmware Release Notes for event log information updates.

C.1 Reading an HSJ Series Error Log

To understand the error logs, use the following guidelines. Each error log contains an MSLG$B_FORMAT field in the upper portion of the log, plus a CONTROLLER DEPENDENT INFORMATION area in the lower portion of the log. CONTROLLER DEPENDENT INFORMATION will vary according to the MSLG$B_FOR
183. Change_unit: If you select this option, TILX allows you to drop a unit from testing and add a unit to testing. For each unit dropped, another unit must be added, until all units in the configuration have been tested. The unit chosen is tested with the same parameters that were chosen for the unit dropped from testing. When you have finished adding and dropping units, all performance statistics are initialized and TILX execution resumes with the same parameters as the last run.

  Drop unit x (y/n) [n]?

Explanation: This question is displayed if you choose to change a unit as an answer to the reuse parameters question. It is asked for every unit that was tested. After entering Y, you are prompted for the unit number. Enter the unit number to drop from testing. Enter N if you do not wish to drop a unit from testing.

Note: For each unit dropped from testing, one must be added.

6.3.5 TILX Output Messages

The following message is displayed when TILX is started:

  Copyright © Digital Equipment Corporation 1993
  Tape Inline Exerciser version 1.4

This message identifies the internal program as TILX and gives the TILX software version number.

  TILX Normal Termination

Explanation: This message is displayed when TILX terminates under normal conditions.

  Insufficient resources

Explanation: Following this line is a second line that gives more information about the problem wh
184. Changed operating definition

ASC   ASCQ
Code  Code  Description
3F    03    Inquiry data has changed
40    00    RAM failure (should use 8040 through FF40)
41    00    Data path failure (should use 8040 through FF40)
42    00    Power on or self test failure (should use 8040 through FF40)
43    00    Message error
44    00    Internal target failure
45    00    Select or reselect failure
46    00    Unsuccessful soft reset
47    00    SCSI parity error
48    00    Initiator detected error message received
49    00    Invalid message error
4A    00    Command phase error

(continued on next page)

Table C-13 (Cont.) SCSI ASC/ASCQ Codes For Direct Access Devices (such as magnetic disk)

ASC   ASCQ
Code  Code  Description
4B    00    Data phase error
4C    00    Logical unit failed self configuration
4E    00    Overlapped commands attempted
53    00    Media load or eject failed
53    02    Medium removal prevented
5A    00    Operator request or state change input (unspecified)
5A    01    Operator medium removal request
5A    02    Operator selected write protect
5A    03    Operator selected write permit
5B    00    Log exception
5B    01    Threshold condition met
5B    02    Log counter at maximum
5B    03    Log list codes exhausted
5C    00    RPL status change
5C    01    Spindles synchronized
5C    02    Spindles not synchronized
40    nn    Diagnostic failure detected on component nn, where nn identifies a specific target device component (nn range: 80 through FF). Refer to documentation provided by the vendor of the target device for a descr
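Most ASC/ASCQ pairs are fixed table entries, but the 40 nn entry is a range: any ASCQ from 80 through FF under ASC 40 identifies a failing component. A decoder therefore has to test for that range before falling back to a plain table lookup. The Python sketch below shows one way to do this; the names ASC_ASCQ and describe are hypothetical, and only a handful of rows from the table are included.

```python
# A few rows copied from the ASC/ASCQ table; the full table is much longer.
ASC_ASCQ = {
    (0x3F, 0x03): "Inquiry data has changed",
    (0x40, 0x00): "RAM failure",
    (0x43, 0x00): "Message error",
    (0x47, 0x00): "SCSI parity error",
    (0x5B, 0x01): "Threshold condition met",
}


def describe(asc, ascq):
    """Return a description for an ASC/ASCQ pair, or None if not listed."""
    # ASC 40 with ASCQ 80 through FF reports a diagnostic failure on
    # component nn, where nn is the ASCQ value itself.
    if asc == 0x40 and 0x80 <= ascq <= 0xFF:
        return f"Diagnostic failure detected on component {ascq:02X}"
    return ASC_ASCQ.get((asc, ascq))
```

The range test comes first so that a vendor-specific component code is never mistaken for a missing table entry.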
185. Control Protocol
The message-level protocol used by the HSJ- and HSD-series controllers to communicate with a host computer. The three types of MSCP communication are sequential messages, datagrams, and block data transfers.

Glossary-9
Glossary-10

Network Interconnect
See NI.

NI
One of two standard interconnects used in the System Interconnect Architecture (CI is the other). The NI, also known as the Ethernet, connects communications servers and compute servers, creating a local area network.

node
An intelligent entity in a distributed computing configuration. Nodes are independent but linked, as in a network or a cluster, becoming parts of a whole. In a cluster, HSJ-series controllers and host computers are cluster nodes.

nonredundant
A configuration in which there is no backup hardware in place for the hardware that is present.

nontransportable
A device setting that indicates the device is MSCP compliant and contains metadata. Nontransportable devices can be moved among HS controller subsystems but not taken directly to non-HS controller systems. See also transportable.

nonvolatile
See NV.

nonvolatile memory
See NVMEM.

nonvolatile parameters memory
See NVPM.

NV
Nonvolatile. A term used to describe memory the contents of which survive loss of power.

NVMEM
Nonvolatile memory. NVMEM is the battery backed-up SRAM on the controller module.

NVPM
Nonvolatile parameter memory. NVPM is a portion of NVMEM u
186.
D0351 a r 0 0 0 0 0 0 0 0 0 0
D0911 a r 0 0 0 0 0 0 0 0 0 0
D1000 a r 0 0 0 0 0 0 0 0 0 0

This subdisplay shows the status of the logical units that are known to the controller firmware. It also shows I/O performance information and caching statistics for the units. Up to 42 units may be displayed in this subdisplay.

o The Unit column contains a letter indicating the type of unit, followed by the unit number of the logical unit. The list is sorted by unit number. Unit numbers may be duplicated between devices of different types; if this happens, the order of these devices is arbitrary. The device type letters that may be displayed are as follows:
  - D indicates a disk device.
  - T indicates a tape device.
  - L indicates a media loader.
  - C indicates a CDROM device.
  - F indicates a device type not listed above.
  - U indicates the device type is unknown.

The ASWC columns indicate the availability, spindle state, write protect state, and cache state, respectively, of the logical unit. The availability state is indicated using the following letters:
  - a  Available. Available to be mounted by a host system.
  - d  Offline, Disabled by Digital Multivendor Services. The unit has been disabled for service.
  - e  Online, Exclusive Access. Unit has been mounted for exclusive access by a user.
  - f  Offline, Media Format Error. The unit cannot be made available due to a media format inconsistency.
  - i  Offli
187. DATA Unable to allocate memory for the First Cache Buffer Index Array Invalid metadata combination detected in build_raid_node Unable to handle that many bad dirty pages exceeded MAX_BAD_ DIRTY Cache memory is bad Invalid DCA state detected in start_crashover Invalid DCA state detected in start_failover Invalid DCA state detected in init_failover The host port software has insufficient resources to set up a block data transfer operation for a WRITE command The host port software has insufficient resources to set up a block data transfer operation for a COMPARE command C 100 HSJ Series Error Logging Table C 35 Device Services Last Failure Codes Code Description 03020101 03030101 03040101 03050101 03060101 03070101 03080101 030A0100 030B0188 03150100 031E0100 031F0100 03280100 03290100 032A0100 032B0100 Invalid SCSI direct access device opcode in miscellaneous command DWD Last Failure Parameter 0 contains the SCSI command opcode Invalid SCSI sequential access device opcode in miscellaneous command DWD Last Failure Parameter 0 contains the SCSI command opcode Invalid SCSI CDROM device opcode in miscellaneous command DWD Last Failure Parameter 0 contains the SCSI command opcode Invalid SCSI medium changer device opcode in miscellaneous command DWD Last Failure Parameter 0 contains the SCSI command opcode Invalid SCSI device type in PUB Last Failure Parameter 0
188. DSSI Port/Port Driver Event Log (Template 32)
CI System Communication Services Event Log (Template 33)    DSSI System Communication Services Event Log (Template 33)

D.3 Event Log Codes

Tables D-2 through D-5 show some important differences in reported codes between HSJ- and HSD-series controllers. Some entries show identical numeric codes with different description text, while other entries are HSD-series-only codes and descriptions. Be aware of these differences when decoding HSD-series controller error logs using Appendix C.

Table D-2 Host Interconnect Services Status Codes

Code      Description
00000064  The limit on DSSI IDREQ packets sent without receiving a DSSI ID in response has been reached on Path A; the remote node is acknowledging the packets but not responding to them.
00000065  A DSSI ID or DSSI CNF packet transmitted by the thread on behalf of Host Interconnect Services could not be successfully transmitted.
00010009  Virtual circuit closed due to DSSI ID request failure.
00030009  Virtual circuit closed due to DSSI START failure.
00040009  Virtual circuit closed due to DSSI STACK failure.
00070009  Virtual circuit closed due to NAK ADP retry DSSI ID transmit failure.
000A0009  Not implemented in DSSI environment.
000B0009  Virtual circuit closed due to NOR ADP retry DSSI ID transmit failure.
000E0009  Not implemented in DSSI environment.
00100009  Not implemented in DSSI environment.
189.
Device  Name    Port  Target  LUN
Disk    DISK0   1     0       0
Disk    DISK1   2     0       0
Disk    DISK2   3     0       0
Disk    DISK3   4     0       0
Disk    DISK4   4     1       0
Tape    TAPE0   5     1       0
CDROM   CDROM0  6     0       0

B.3.6 Adding Storage Sets

Storage sets are created from disks. In the previous example, devices were given names to make them identifiable. Use these names when creating storage sets.

  CLI> ADD STRIPESET STRIPE0 DISK0 DISK1 DISK2 DISK3

This example creates a stripeset named STRIPE0 using disks DISK0, DISK1, DISK2, and DISK3 from Section B.3.5. All members of the storage set (a stripeset) must have been previously defined using ADD DISK. Tapes and CDROMs cannot be bound to storage sets.

B.3.7 Initializing Containers

Disks and storage sets are also called containers. Containers must be initialized before they are made available to a host via the ADD UNIT command. The following initializes containers from the previous examples:

  CLI> INITIALIZE STRIPE0
  CLI> INITIALIZE DISK4

Initializing a tape or CDROM is not required and is not allowed.

B.3.8 Adding Logical Units

Units can be created from any container, either device or storage set. Tapes and CDROMs are always bound directly to units because they cannot comprise a storage set.

The following makes the devices and containers from the previous examples available to the host as units:

  CLI> ADD UNIT D0 STRIPE0
  CLI> ADD UNIT D100 DISK4
  CLI> ADD UNIT D120 CDROM0
  CLI> ADD U
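The sections above imply a fixed order for presenting a stripeset to a host: define the member disks, bind them into a stripeset, initialize the container, then add the unit. As a rough illustration, the Python sketch below emits the CLI commands in that order. The function name configure_stripeset and its arguments are hypothetical; only the command forms (ADD DISK, ADD STRIPESET, INITIALIZE, ADD UNIT) are taken from the examples in this appendix.

```python
def configure_stripeset(stripeset, disks, unit):
    """Return the CLI commands, in order, that present a stripeset as a unit.

    disks maps a device name to its (port, target, lun) tuple.
    """
    if not disks:
        raise ValueError("a stripeset needs at least one disk")
    commands = []
    # 1. Define every member disk first (ADD DISK name port target lun).
    for name, (port, target, lun) in disks.items():
        commands.append(f"ADD DISK {name} {port} {target} {lun}")
    # 2. Bind the named disks into a stripeset.
    commands.append(f"ADD STRIPESET {stripeset} {' '.join(disks)}")
    # 3. Initialize the container before it is offered to a host.
    commands.append(f"INITIALIZE {stripeset}")
    # 4. Make the container available to the host as a unit.
    commands.append(f"ADD UNIT {unit} {stripeset}")
    return commands
```

Generating the list rather than typing commands by hand makes the ordering constraint explicit; the output would still be entered at a maintenance or virtual terminal.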
190. E PTL SCSI location command B 18 LOCATE TAPES command B 18 LOCATE UNITS command B 18 Logical Unit Number See LUN Logical units adding B 78 Low availability See Configuration nonredundant LUN controller perspecctive 2 13 host perspective HSZ series 2 16 Maintenance strategy 1 4 Maintenance terminal 1 5 2 3 Mirroring See HBVS MIST 6 2 6 3 See also Core MIST See also DAEMON Mixing disk and tape 3 9 Mixing SBB sizes 3 14 MMJ 2 3 Modified modular jack See MMJ Module handling guidelines 1 6 Module integrity self test See MIST Modules 1 1 Moving devices between controllers 4 17 MSCP 4 5 4 7 7 10 7 17 7 46 MSCP timeout 4 14 N Nonredundant controller and downtime 5 1 configuration 3 15 installing 7 7 removal 7 4 replacing 7 7 restoring parameters 7 9 service consideration 5 1 service of 7 3 service precautions 7 3 shelf rails 7 7 tools 7 3 Nontransportable devices 4 17 Nonvolatile memory See NVMEM Nonvolatile Parameters in Memory See NVPM NOTRANSPORTABLE qualifier 4 9 7 12 NVMEM 2 4 NVPM 5 12 error messages 5 12 O OCP 1 5 2 2 4 2 5 2 amber LEDs 5 3 codes 5 4 fault notification 5 4 6 2 6 3 flashing codes 5 4 green LED 5 3 normal operation 5 3 reset button 5 3 solid codes 5 4 OpenVMS AUTOGEN COM file 4 13 cluster size 4 14 initialization disk 4 12 MSCP timeout 4 14 polling parameters 4 15 shadow member timeout 4 15 shadow set
191. EARCH DATA LOW (12 byte)

(continued on next page)

Table C-10 (Cont.) SCSI Command Operation Codes

Code  Supported Device Types  Description
      (See Table C-9)
B3    05                      SET LIMITS (12 byte)
B5    08                      REQUEST VOLUME ELEMENT ADDRESS
B6    08                      SEND VOLUME TAG
B8    08                      READ ELEMENT STATUS

Table C-11 SCSI Buffered Modes Codes

Code  Description
0     The target shall not report GOOD status on write commands until the data blocks are actually written on the medium.
1     The target may report GOOD status on write commands as soon as all the data specified in the write command has been transferred to the target's buffer. One or more blocks may be buffered prior to writing the block(s) to the medium.
2     The target may report GOOD status on write commands as soon as (1) all the data specified in the write command has been successfully transferred to the target's buffer, and (2) all buffered data from different initiators has been successfully written to the medium.
3     Reserved for future use
4     Reserved for future use
5     Reserved for future use
6     Reserved for future use
7     Reserved for future use

Table C-12 SCSI Sense Key Codes

Code  Description
0     NO SENSE. Indicates that there is no specific sense key information to be reported for the designated logical unit. This would be the case for a successful command or a command that received CHECK CONDITION or COMMAND TERMINATED sta
192. EDs will also flash.

Note: The length of time required for I/O to stop can vary from zero seconds to several minutes, depending on load, device type, and cache status.

After you remove the SBB, the flashing pattern on the OCP stops and normal operation on the ports resumes. At this time, the removed SBB's port LED turns on. The LED stays on until the SBB is returned to its slot or until another SBB is inserted in the slot. The remaining port LEDs turn off.

7.11.1.4 Device Replacement

Use a replacement device of the same type as the removed device. Otherwise, subsystem failures (for example, when establishing stripesets) may occur. Use the following procedure to replace a device:

1. Quiesce the SBB's port by pressing and holding the port button for the SBB. Continue holding the button until all amber OCP LEDs light.

   Note: Only one port may be quiesced at any time. If the button is not held long enough, or multiple buttons are pushed in quick succession, all buttons are ignored (no ports are quiesced). You must press and hold the button again to quiesce the port.

2. Wait until the chosen port LED flashes alternately with the other port LEDs; this indicates that I/O has stopped. The alternating pattern flashes for approximately 30 seconds, during which you may insert the SBB. If the pattern does not appear after a minute or two, another shelf is asserting a fault signal t
193. ERRORS is the default.

IMMEDIATE
NOIMMEDIATE (D)
  If IMMEDIATE is specified, immediately restart the controller without checking for online devices.

  CAUTION: Customer data may be lost or corrupted if the IMMEDIATE qualifier is specified.

  NOIMMEDIATE is the default.

OVERRIDE_ONLINE
NOOVERRIDE_ONLINE (D)
  If any units are on line to the controller, the controller will not be restarted unless OVERRIDE_ONLINE is specified. If the OVERRIDE_ONLINE qualifier is specified, the controller will restart after all customer data is written to disk.

  CAUTION: Customer data may be lost or corrupted if the OVERRIDE_ONLINE qualifier is specified.

  NOOVERRIDE_ONLINE is the default.

Examples

  CLI> RESTART OTHER_CONTROLLER

Restart the other controller as long as the other controller does not have any units that are on line.

  CLI> RESTART OTHER_CONTROLLER OVERRIDE_ONLINE

Restart the other controller even if there are units on line to the other controller.

RESTART THIS_CONTROLLER

Restarts this controller.

Format

  RESTART THIS_CONTROLLER

Description

The RESTART THIS_CONTROLLER command restarts this controller. If any disks are on line to this controller, the controller will not restart unless the OVERRIDE_ONLINE qualifier is specified (HSD and HSJ only). If any user data cannot be flus
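The interaction of the IMMEDIATE and OVERRIDE_ONLINE qualifiers described above reduces to a short decision rule. The Python sketch below restates that rule for RESTART OTHER_CONTROLLER as described in the text; the function restart_allowed and its parameter names are hypothetical and say nothing about how the firmware implements the check or about the data-loss risks noted in the CAUTION statements.

```python
def restart_allowed(units_online, override_online=False, immediate=False):
    """Return True if the restart proceeds under the described qualifiers."""
    if immediate:
        # IMMEDIATE skips the check for online devices entirely.
        return True
    if units_online and not override_online:
        # Default (NOOVERRIDE_ONLINE): refuse while units are on line.
        return False
    # Either nothing is on line, or OVERRIDE_ONLINE was given.
    return True
```

With both defaults in effect, the command succeeds only when no units are on line, which matches the first example above.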
194. Failure Parameter 0 contains the PCB reg710_ptr value Last Failure Parameter 1 contains the PCB copy of the 710 TEMP register Last Failure Parameter 2 contains the PCB copy of the 710 DBC register Last Failure Parameter 3 contains the PCB copy of the 710 DNAD register Last Failure Parameter 4 contains the PCB copy of the 710 DSP register Last Failure Parameter 5 contains the PCB copy of the 710 DSPS register Last Failure Parameter 6 contains the PCB copies of the 710 SSTAT2 SSTAT1 SSTATO DSTAT registers Last Failure Parameter 7 contains the PCB copies of the 710 LCRC RESERVED ISTAT DFIFO registers An invalid code was seen by the error recovery thread in the er_funct_step field of the PCB Last Failure Parameter 0 contains the PCB er_funct_ step code continued on next page Table C 35 Cont Device Services Last Failure Codes Code Description 033E0108 033F0108 03410101 An attempt was made to restart a 710 at the SDP DBD Last Failure Parameter 0 contains the PCB reg710_ptr value Last Failure Parameter 1 contains the PCB copy of the 710 TEMP register Last Failure Parameter 2 contains the PCB copy of the 710 DBC register Last Failure Parameter 3 contains the PCB copy of the 710 DNAD register Last Failure Parameter 4 contains the PCB copy of the 710 DSP register Last Failure Parameter 5 contains the PCB copy of the 710 DSPS register Last Failure Parameter 6 co
195. Glossary Lists acronyms and terms specific to the HS controllers xvii Related Documentation Table 1 lists documents containing information related to this product xviii Table 1 Related Documentation Document Title Order Number HSJxx Array Controller Software Product Description SPD47 26 04 HSD30 Array Controller Software Product Description SPD53 53 00 HSZ40 Array Controller Software Product Description SPD53 54 00 StorageWorks Array Controllers HS Family of Array Controllers Pocket Service Guide StorageWorks Array Controllers HS Family of Array Controllers User s Guide StorageWorks Array Controllers HSJ40 and HSJ30 Array Controller Operating Firmware Release Notes StorageWorks Array Controllers HSD30 Array Controller Operating Firmware Release Notes StorageWorks Array Controllers HSZ40 Array Controller Operating Firmware Release Notes StorageWorks Solutions Building Block User s Guide StorageWorks Solutions Controller Shelf User s Guide StorageWorks Solutions Configuration Guide StorageWorks Solutions Shelf and SBB User s Guide StorageWorks Solutions Shelf Metric Mounting Kit User s Guide StorageWorks Solutions SW800 Series Data Center Cabinet Installation and User s Guide StorageWorks Solutions SW800 Series Data Center Cabinet Cable Distribution Unit Installation Sheet StorageWorks Solutions SW500 Series Cabinet Installation and User s Guide StorageWorks Solutions SW500 S
196. HIS XMIT APPL MSG processing completed non automatic end message 60040100 Invalid return value from routine HIS XFER BLOCK DATA processing return of Write History Log to host buffers 60050100 Invalid return value from routine HISSCONNECT while DCD attempting to establish connection to a remote subsystem 60060100 Invalid return value from routine HIS XMIT APPL MSG while dmscp_ dcd send cmd attempting to send a command to a remote subsystem 60070100 Invalid return value from routine HIS MAP while dmscp ded allocate bh attempting to map a buffer 60080100 Invalid return value from routine HIS XMIT APPL MSG while dmscp_ dcd src gcs send attempting to send a GCS command to a remote subsystem 60090100 Invalid return value from routine HIS SDISCONNECT while dmscp_dcd_ comm path event attempting to disconnect a remote source connection 600B0100 Invalid return value from routine HIS SPREPARE MSG XMIT processing TMSCP Write Read or Compare Host Data command 600C0100 Invalid return value from routine RESMGR ALLOCATE DATA SEGMENT 600D0100 Opcode field in command being aborted is not valid 600E0100 Opcode of command to be initiated is invalid 600F0100 Opcode of command to be initiated is invalid 60100100 Opcode field in non sequential command being inititated is invalid 60110100 Opcode of command to be initiated is invalid 60120100 Opcode of TMSCP command to be aborted is invalid 60130100 tmscp clear cdl cmpl rtn de
197. HSC K.scsi HSD05 HS Controller
Transportable     Yes  No   Yes  Yes
Nontransportable  No   Yes  No   Yes

5 Error Analysis and Fault Isolation

This chapter describes the errors, faults, and significant events that may occur during HS controller initialization and normal operation. It also gives a translation of each event and, in most cases, explains how to respond to it.

The error and event descriptions isolate failures to the field replaceable unit (FRU). In most cases, however, additional diagnostic information beyond the FRU level is given. This information will help increase your knowledge of controller functions and assist with your report to depot repair personnel.

CAUTION: Do not attempt to replace or repair components within FRUs, or equipment damage may result. Use the controller fault indications and error logs to isolate FRU-level failures.

5.1 Special Considerations

Some or all of the situations presented in the following sections may apply when your controller detects a fault.

5.1.1 Nonredundant Configurations

When a controller, its cache module, or both fail in a nonredundant configuration, a short period of system downtime is needed to remove the faulty unit and install a replacement. The devices attached to that controller will be off line for the duration of the remove and replace cycle.

5.1.2 Dual-redundant Configurations

When a controller fails in a dual-redundant configuration f
198. HSZ series DILX 6 62 DDL 2 6 DEC OSF 1 AXP initialization disk 4 12 support 4 11 Defaults HSJ HSD series DILX 6 9 HSZ series DILX 6 54 Deferred error display HSZ series DILX 6 62 DELETE container name command B 12 DELETE unit number command B 13 Device LEDs 5 8 SBB active LED 5 8 SBB fault LED 5 8 storage SBB faults 5 8 Device port cable 7 31 installing 7 33 removing 7 32 replacing 7 33 service of 7 31 service precautions 7 31 tools 7 31 Device ports 2 5 running on fewer 6 3 testing 6 3 Device services firmware 2 11 Device shelf status power supply faults 5 9 power supply LEDs 5 9 shelf faults 5 9 single power supply power supply faults 5 10 shelf faults 5 10 Device warm swap 7 38 device removal 7 39 device replacement 7 40 precautions 7 39 tools 7 38 Devices adding 4 9 7 12 B 78 configurations 3 9 configuring automatic 6 98 initializing 4 17 4 18 B 78 moving between controllers 4 17 nontransportable 4 17 transportable 4 18 Diagnostic and execution monitor See DAEMON Diagnostic registers 2 2 Diagnostic utility protocol See DUP Diagnostics 4 1 6 1 DILX 1 5 2 10 HSJ HSD series abort codes 6 29 basic function test 6 7 configuring all units 6 25 data test patterns 6 21 defaults 6 9 defined 6 5 end message display 6 18 error codes 6 30 Index 13 DILX HSJ HSD series cont d error information packets 6 18 examples 6 22 interrupting
199. I host LUNs exist on a SCSI host's device interface. Controller LUNs and SCSI host LUNs may represent the same structure, but only if the user configures up to eight controller devices in a one-to-one unit relationship with the host. This situation may or may not occur under normal operation.

Host Port/Target/LUN Addressing (HSZ series)

Note: Non-SCSI hosts (CI, DSSI), though they access virtual devices, do not use a PTL addressing scheme. Any unit seen by these hosts is simply called a host logical unit, not a LUN.

Host PTL addressing is the process by which a SCSI host selects a logical unit made up of physical devices connected to an HSZ series controller. The process takes place in three steps:
1. The port selection. The host selects the SCSI bus that has the HSZ series controller connected to it.
2. The target selection. The host selects the controller's SCSI ID (that is, the target) on that port (bus). The HSZ series controller may act as one or two target IDs.
3. The LUN selection. The host presents the controller with the LUN of the desired host logical unit. The controller translates the LUN into the physical device addresses required to allow the host access to the virtual device.

3 Configuration Rules and Restrictions

This chapter describes rules and restrictions as they apply to the physical configuration and connection of the following HS controller subsystem hardwar
200. ILOVER COPY configuration source

COPY configuration source
Specifies where the good copy of the device configuration resides. If THIS_CONTROLLER is specified for configuration source, all the device configuration information on THIS_CONTROLLER (the one that either the maintenance terminal or the virtual terminal is connected to) is copied to the other controller. If OTHER_CONTROLLER is specified for configuration source, all the device configuration information on the OTHER_CONTROLLER (the controller that neither the maintenance terminal nor the virtual terminal is connected to) is copied to this controller.

The SET FAILOVER command places THIS_CONTROLLER and the OTHER_CONTROLLER in a dual-redundant configuration. After entering this command, if one of the two controllers fails, the devices attached to the failed controller become available to and accessible through the operating controller.

CAUTION: All device configuration information on the controller not specified by the COPY parameter is destroyed and overwritten by the configuration information found in the controller specified by the COPY parameter. Make sure you know where your good configuration information is stored, or that you have a complete copy of the device configuration, BEFORE entering this command. A considerable amount of work and effort will be lost by overwriting a good configuration with incorrect information if the wrong controll
201. ILX runs under. Note that the length of all I/Os is in bytes and is evenly divisible by the sector size (512 bytes). Read, write, access, and erase commands are issued using random logical block numbers (LBNs). In the read/write mode, DILX issues the reads and writes in the ratio specified previously under read/write ratio, and issues access and erase commands in the ratio specified previously under access/erase ratio. When read-only mode is chosen, only read and access commands are issued. If compares are enabled, compares are performed on write and read commands using the data compare modifier and DILX internal checks. The percentage of compares to perform can be specified. This phase is executed 60 percent of the time. It is the first phase executed after the initial write pass has completed. It is re-executed at 10-minute intervals, with each cycle lasting approximately 6 minutes. Intervals are broken down into different cycles. The interval is repeated until the user-selected time interval expires.

• Data Intensive: Designed to test disk throughput by selecting a starting LBN and repeating transfers to the next sequential LBN that has not been written to by the previous I/O. The transfer size of each I/O equals the maximum sized I/O that is possible with the memory constraints DILX must run under. This phase continues performing spiraling I/O to sequential tracks. Read and write commands are issued in read/write mode. This phase is execute
202. ILX to abort. The reuse parameters question is not asked.

6.4.3 DILX Tests

There are two DILX tests, as follows:
• The Basic Function test
• The User-Defined test

6.4.3.1 Basic Function Test (DILX)

The Basic Function test for DILX executes in two or three phases. The three phases are as follows:

• Initial Write Pass: Is the only optional phase and is always executed first (if selected). The initial write pass writes the selected data patterns to the entire specified data space or until the DILX execution time limit has been reached. Once the initial write pass has completed, it is not re-executed no matter how long the DILX execution time is set. The other phases are re-executed on a 10-minute cycle.

• Random I/O: Simulates typical I/O activity with random transfers from one byte to the maximum size I/O possible with the memory constraints DILX runs under. Note that the length of all I/Os is in bytes and is evenly divisible by the sector size (512 bytes). Read and write (if enabled) commands are issued using random logical block numbers (LBNs). In the read/write mode, DILX issues the reads and writes in the ratio specified previously under read/write ratio. When read-only mode is chosen, only read commands are issued. If compares are enabled, compares are performed on read commands using DILX internal checks. The percentage of compares to perform can be specified. This phase is executed
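As an illustration only, the Random I/O rules described above (lengths that are whole multiples of the 512-byte sector size, random starting LBNs, and a read/write ratio) can be sketched in Python. The function name, its parameters, and the choice to draw the length as a whole number of sectors are assumptions made for this sketch; they are not taken from the DILX firmware.

```python
import random

SECTOR_SIZE = 512  # bytes, the sector size stated in the text


def random_io(rng, max_io_bytes, unit_blocks, read_pct, write_enabled=True):
    """Pick one hypothetical random I/O: (opcode, starting LBN, length in bytes).

    rng           -- a random.Random instance (assumed helper, for repeatability)
    max_io_bytes  -- largest transfer the exerciser's memory allows (assumed input)
    unit_blocks   -- number of logical blocks on the unit under test
    read_pct      -- percentage of commands that should be reads
    """
    max_sectors = max(1, max_io_bytes // SECTOR_SIZE)
    sectors = rng.randint(1, min(max_sectors, unit_blocks))
    # Keep the whole transfer inside the unit's data space.
    lbn = rng.randint(0, unit_blocks - sectors)
    if not write_enabled or rng.random() * 100 < read_pct:
        opcode = "READ"
    else:
        opcode = "WRITE"
    return opcode, lbn, sectors * SECTOR_SIZE
```

Passing `write_enabled=False` models the read-only mode, in which only read commands are issued.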
203. INVALID BYTE COUNT
6 A timer is in an unexpected expired state that prevents it from being started
7 The semaphore was set after a oneshot I/O was issued, but nothing was found in the received HTB queue
8 A termination, a print summary, or a reuse parameters request was received when DILX was not testing any units
9 User requested an abort via ^Y

6.2.12 DILX Error Codes

Table 6-4 lists the DILX error codes and definitions for DILX-detected errors.

Table 6-4 DILX Error Codes and Definitions
Value   Definition
1       Illegal Data Pattern Number found in data pattern header
2       No write buffers correspond to data pattern
3       Read data does not match write buffer
4       Compare Host Data should have reported a compare error but did not

6.3 Tape Inline Exerciser (HSJ and HSD Series Controllers)

TILX is a diagnostic tool used to exercise the data transfer capabilities of selected tape drives connected to an HSJ or HSD series controller. TILX exercises tape drives in a way that simulates a high level of user activity. Thus, TILX can be used to determine the health of the controller and the tape drives connected to it. You can run TILX from a maintenance terminal or from a virtual terminal. DILX and TILX may run concurrently, with one initiated from a maintenance terminal and the other from a virtual terminal connection. Digital recommends, however, that the exercisers not be run while normal I/O operations are in progress, as system
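When DILX output is captured to a file for later review, the four error values in Table 6-4 can be translated with a small lookup. The following Python sketch copies the definitions from the table; the dictionary and function names are hypothetical and exist only for this example.

```python
# Definitions copied from Table 6-4; the names below are illustrative only.
DILX_ERROR_CODES = {
    1: "Illegal Data Pattern Number found in data pattern header",
    2: "No write buffers correspond to data pattern",
    3: "Read data does not match write buffer",
    4: "Compare Host Data should have reported a compare error but did not",
}


def describe_dilx_error(value):
    """Return the Table 6-4 definition for a DILX error value."""
    return DILX_ERROR_CODES.get(value, f"Unknown DILX error code {value}")
```

For example, `describe_dilx_error(3)` returns the "Read data does not match write buffer" definition, and any value outside the table is reported as unknown rather than guessed at.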
204. IS x8A960428 x00000000 x8A971E00 x003B5520 x40023230 x00000200 x8A960450 xA0 x06 x0000 x0002 SCSI_STAT_CHECK_CONDITION x00 x00000000 x0000000000000001DA681B08 x0000003C x0000 x4000 x20 Error exception or abnormal _condition HARDWARE ERROR Nonrecoverable hardware error continued on next page HSZ Series Error Logging E 3 Example E 1 Cont The uerf utility Error Event Log ERROR CODE x0070 CODE x70 SEGMENT x00 SENSE KEY x0004 HARDWARE ERR INFO BYTE 3 x00 INFO BYTE 2 x00 INFO BYTE 1 x00 INFO BYTE 0 x00 ADDITION LEN x98 CMD SPECIFIC 3 x00 CMD SPECIFIC 2 x00 CMD SPECIFIC 1 x0 CMD SPECIFIC 0 x00 ASC x44 ASQ x00 FRU x00 SENSE SPECIFIC x000000 ADDITIONAL SENSE 0000 00030000 01080108 00000206 40020000 XN eA hehe id ii Q 0010 01510309 08002800 01DA681B 01000000 Beso O54 trees DS sut es 0020 00000700 20202020 58432020 33323130 eee CX0123 0030 37363534 5A373845 00000000 36333400 4567E872 436 0040 325A5241 20202038 43282020 45442029 ARZ28 C DE 0050 00000043 00000000 00000004 00000000 Na eur e pora areae 0060 01080000 00000000 00000000 00000000 Pitt puce ener rios x 0070 00000000 00000000 00000000 00000000 ur 0080 00000000 00000000 00000000 00000000 SE 0090 7E250000 00005E3C 00000000 00000000 Win BOR riri Vid E 4 HSZ Series Error Logging Glossary ac distribution The method of d
205. If the cmdopcd is an 18 (COPY), 39 (COMPARE), or 3A (COPY AND VERIFY), this field contains the number of the current segment descriptor.

snsflgs
This field contains byte 2 of the Sense Data returned in the response of a SCSI REQUEST SENSE command. This field is formatted as shown in Figure C-11.

Figure C-11 SCSI Sense Data Byte Two (snsflgs) Field Format

  7     6     5     4       3  2  1  0
  FM    EOM   ILI   rsvd    Sense Key

SCSI Sense Data Byte Two (snsflgs) Specific Subfields

Sense Key: The sense key provides generic categories in which events can be reported. The sense keys are described in Table C-12.

ILI: An incorrect length indicator (ILI) bit of one usually indicates that the requested logical block length did not match the logical block length of the data on the medium.

EOM: For sequential access devices (that is, devtype is 1), an end-of-medium (EOM) bit set to one indicates that the unit is at or past the early warning if the direction was forward, or that the command could not be completed because beginning of partition was encountered if the direction was reverse.

FM: A filemark (FM) bit set to one indicates that the current command has read a filemark or setmark. The Additional Sense Code field (see asc field description) may be used to indicate whether or not a filemark or setmark was read. Note that the reporting of setmarks is optional.

info
This field contains bytes 3 through 6 Info
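The snsflgs subfields described above can be separated with simple bit masks. This Python sketch assumes the standard SCSI-2 fixed-format sense data layout for byte 2 (FM in bit 7, EOM in bit 6, ILI in bit 5, sense key in bits 3 through 0); the function name and the returned dictionary keys are hypothetical.

```python
def decode_snsflgs(byte2):
    """Split SCSI sense data byte 2 into its flag bits and sense key.

    Bit positions follow the SCSI-2 fixed-format sense data layout
    (an assumption stated in the lead-in; bit 4 is reserved).
    """
    return {
        "filemark": bool(byte2 & 0x80),   # FM, bit 7
        "eom": bool(byte2 & 0x40),        # EOM, bit 6
        "ili": bool(byte2 & 0x20),        # ILI, bit 5
        "sense_key": byte2 & 0x0F,        # bits 3..0
    }
```

Applied to a byte of hex 04, the sketch reports no flags set and a sense key of 4, which is the HARDWARE ERROR sense key value.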
206. MAT field. Example C-1 shows an example of an ERF-translated host error log (a Disk Transfer Event log). See Example C-1 to find MSLG$B_FORMAT and CONTROLLER DEPENDENT INFORMATION. The key to interpreting error logs is a 32-bit instance code located in the CONTROLLER DEPENDENT INFORMATION area. The instance code uniquely identifies the following:
• The error or condition
• The component reporting the condition
• The recommended repair action
• The threshold when the repair action should be taken

Note: The instance code is the single most important part of interpreting the error log. This is a departure from HSC-based error logs, where other fields in the error information contained values of primary interest.

Example C-1 Disk Transfer Error Event Log

VAX/VMS SYSTEM ERROR REPORT COMPILED 16-MAR-1993 11:05:04 PAGE 146
******************************* ENTRY 12 *******************************
ERROR SEQUENCE 2832 LOGGED ON SID 05903914
DATE/TIME 16-MAR-1993 10:27:58.95 SYS_TYPE 00000000
SYSTEM UPTIME 4 DAYS 02:11:34
SCS NODE CNOTE VAX/VMS V5.5-2
ERL$LOGMESSAGE ENTRY KA825 HW REV B PATCH REV 28 UCODE REV 20 BI NODE 2
I/O SUB-SYSTEM UNIT _FRED$DUA115
MESSAGE TYPE 0001 DISK MSCP MESSAGE
MSLG$L_CMD_REF 9DB30013
MSLG$W_UNIT 0073 UNIT 115
MSLG$W_SEQ_NUM 0002 SEQUENCE 2
MSLG$B_FORMAT 02 DISK TRANSFER LOG
MSLG$B_FLAGS 00 UNRECOVERABLE ERROR
MSLG$W_EVENT 000B DRIVE ERROR
207. NIT T0 TAPE0

This creates disk unit 0 from stripeset STRIPE0, disk unit 100 from DISK4, disk unit 120 from CDROM0, and tape unit 0 from TAPE0. At the UNIT level, CDROMs are treated as disks, but only a subset of the disk SET commands are available for CDROMs.

B.3.9 Device Configuration Examples

The following examples show some different device configurations.

Creating a Unit From a Disk Device
CLI> ADD DISK DISK0 2 0 0
CLI> INITIALIZE DISK0
CLI> ADD UNIT D0 DISK0

Creating a Unit From a Tape Device
CLI> ADD TAPE TAPE0 3 0 0
CLI> ADD UNIT T0 TAPE0

Creating a Unit From a Four-Member Stripeset
CLI> ADD DISK DISK0 1 0 0
CLI> ADD DISK DISK1 2 0 0
CLI> ADD DISK DISK2 3 0 0
CLI> ADD DISK DISK3 1 1 0
CLI> ADD STRIPESET STRIPE0 DISK0 DISK1 DISK2 DISK3
Warning 3000: This storageset is configured with more than one disk per port. This will cause a degradation in performance.
CLI> INITIALIZE STRIPE0
CLI> ADD UNIT D0 STRIPE0

Creating a Write-Protected Unit From a Disk
CLI> ADD DISK DISK0 2 0 0
CLI> INITIALIZE DISK0
CLI> ADD UNIT D0 DISK0 WRITE_PROTECT

Write-Protecting an Existing Unit
CLI> ADD DISK DISK0 2 0 0
CLI> INITIALIZE DISK0
CLI> ADD UNIT D0 DISK0
CLI> SET D0 WRITE_PROTECT

Renumbering Disk Unit 0 to Disk Unit 100
CLI> ADD DISK DISK0 2 0 0
CLI> INITIALIZE DISK0
CLI> ADD UNIT D0 DISK0
CLI> DELETE D0
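The Warning 3000 condition in the four-member stripeset example (more than one member disk on the same port) can be checked before any commands are typed. The following Python sketch is a planning aid only; the function name and the dictionary layout are assumptions, and the PTL values are the ones used in the stripeset example above with the fourth member placed at 1 1 0.

```python
from collections import Counter


def ports_with_multiple_members(members):
    """Return the ports that carry more than one member of a storageset.

    members maps a device name to its (port, target, lun) location.
    """
    counts = Counter(port for port, _target, _lun in members.values())
    return sorted(port for port, n in counts.items() if n > 1)


# Member locations from the four-member stripeset example.
stripe0 = {
    "DISK0": (1, 0, 0),
    "DISK1": (2, 0, 0),
    "DISK2": (3, 0, 0),
    "DISK3": (1, 1, 0),
}
```

For this layout the function reports port 1, which is the port shared by the first and fourth members and the reason the controller issues the performance warning.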
208. ONLINE qualifier is specified. NOOVERRIDE_ONLINE is the default.

Qualifiers for HSZ controllers

IGNORE_ERRORS
NOIGNORE_ERRORS (D)
If errors result when trying to write user data, the controller will not be shut down unless IGNORE_ERRORS is specified.
CAUTION: Customer data may be lost or corrupted if the IGNORE_ERRORS qualifier is specified.
NOIGNORE_ERRORS is the default.

IMMEDIATE
NOIMMEDIATE (D)
If IMMEDIATE is specified, immediately shuts down the controller without checking for online devices.
CAUTION: Customer data may be lost or corrupted if the IMMEDIATE qualifier is specified.
NOIMMEDIATE is the default.

Examples

CLI> SHUTDOWN THIS_CONTROLLER
Shuts down this controller as long as this controller does not have any units on line.

CLI> SHUTDOWN THIS_CONTROLLER OVERRIDE_ONLINE
Shuts down this controller even if there are units on line to this controller.

B.2 CLI Messages

The following sections describe messages you may encounter during interactive use of the CLI.

B.2.1 Error Conventions

An Error nnnn means that the command did not complete. Except for a few of the failover messages (6000 series), no part of the command was executed. When encountering an error going into or exiting dual-redundant mode, some synchronization problems are unavoidable; the error message in such a case will tell you what to do to get things back in synchro
209. PE0 DISK0 DISK1 DISK2 DISK3
A STRIPESET is created out of four disks (DISK0, DISK1, DISK2, and DISK3). Because the chunksize was not specified, the chunksize will be the default.

CLI> ADD STRIPESET STRIPE0 DISK0 DISK1 DISK2 DISK3 CHUNKSIZE 16
A STRIPESET is created out of four disks (DISK0, DISK1, DISK2, and DISK3). The chunksize will be 16 blocks.

ADD TAPE

Adds a tape drive to the known list of tape drives.

Note: This command is valid for HSJ and HSD controllers only.

Format
ADD TAPE device-name SCSI-location

Parameters

device-name
Specifies the name that will be used to refer to this tape drive. This name will be referred to when creating units. The name must start with a letter (A through Z) and can then consist of up to eight more characters made up of A through Z, 0 through 9, period (.), dash (-), and underscore (_), for a total of nine characters.

SCSI-location
The location of the tape drive to be added, in the form PTL, where P designates the port (1 through 6 or 1 through 3, depending on the controller model), T designates the target ID of the tape drive (0 through 6 in a nonfailover configuration, or 0 through 5 if the controller is in a failover configuration), and L designates the LUN of the tape drive (0 through 7). When entering PTL, at least one space must separate the port, target, and LUN.

Description
Adds a tape drive to the known list of tape dr
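The device name and PTL rules for ADD TAPE given above lend themselves to a quick pre-check of a planned configuration. The Python sketch below encodes only the ranges stated in the parameter descriptions; the function names are hypothetical, and treating names as uppercase-only is an assumption of this example, not a documented CLI behavior.

```python
import re

# One leading letter, then up to eight more characters from the allowed set
# (nine characters total). Uppercase-only matching is an assumption.
_NAME_PATTERN = re.compile(r"[A-Z][A-Z0-9._-]{0,8}")


def valid_device_name(name):
    """Check a device name against the stated naming rule."""
    return _NAME_PATTERN.fullmatch(name) is not None


def valid_tape_ptl(port, target, lun, ports=6, failover=False):
    """Check a tape drive PTL against the stated ranges.

    ports    -- 6 or 3, depending on the controller model
    failover -- True if the controller is in a failover configuration
    """
    max_target = 5 if failover else 6
    return 1 <= port <= ports and 0 <= target <= max_target and 0 <= lun <= 7
```

For instance, a target ID of 6 passes in a nonfailover configuration but is rejected when `failover=True`, matching the narrower 0 through 5 range for failover configurations.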
210. RY command data The most significant character of the product identification data will appear in the low order byte of the first longword of this field while the least significant character appears in the high order byte of the last long word device serial number Eight bytes of ASCII data as defined by the device vendor in the Product Serial Number field of the SCSI Unit Serial Number Page data The most significant character of the serial number data will appear in the low order byte of the first longword of this field while the least significant character appears in the high order byte of the last longword Note that the number of characters of serial number data supplied may vary from vendor to vendor as well as from device to device If the serial number data supplied is less than eight characters this field is ASCII space filled from the lowest order byte relative to the low order byte of the first longword containing a serial number character through the high order byte of the last longword If the serial number data supplied are greater than eight characters the serial number data are truncated at eight bytes that is the least significant character s of the serial number data are lost If the serial number data are not available at all this field is ASCII space filled C 2 2 5 SCSI Device Sense Data Common Fields The fields common to certain event logs generated by the Device Services and Value Added firmware component
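The space-fill and truncation rules for the eight-byte device serial number field can be illustrated in a few lines of Python. This is a sketch under stated assumptions: the function names are hypothetical, and the byte ordering (most significant character in the low-order byte of the first longword) is modeled by reading the field as two little-endian 32-bit longwords.

```python
import struct


def serial_number_field(serial):
    """Build the 8-byte field: truncate to 8 bytes, then pad with ASCII spaces.

    Truncation drops the trailing (least significant) characters; an empty
    serial number yields a field that is entirely space filled.
    """
    return serial[:8].ljust(8, b" ")


def as_longwords(field):
    """View the 8-byte field as two little-endian 32-bit longwords."""
    return struct.unpack("<2I", field)
```

With the serial number `b"ABCDEFGH"`, the low-order byte of the first longword holds `A` and the high-order byte of the last longword holds `H`, as the field description requires.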
211. Figure 7-3 Eject Button (HSJ40 Controller)
[Figure: controller module front view with callouts for the mounting screws, program card, eject button, and CI host cable. CXO-4118A-MC]

7. Remove the program card by pushing the eject button shown in Figure 7-3. Pull the card out and save it for use in the replacement controller module.

8. HSJ series: Loosen the captive screws on the CI cable connector (shown in Figure 7-3) with a flat-head screwdriver and remove the cable from the front of the controller module.

CAUTION: Do not remove host port cables from an HSD series controller while the power is on to any members on the DSSI bus, including the controller and host. Doing so risks short circuits that may blow fuses on all the members.

HSD series: Turn off power to all members on the DSSI bus. Then, with a flat-head screwdriver, loosen the captive screws on the DSSI cable connector and terminator and remove them from the trilink connector (shown in Figure 7-4).

Figure 7-4 Trilink Connector
[Figure: rear view and front view of the trilink connector. CXO-3851A-MC]

HSZ series: With a small flat-head screwdriver, loosen the captive screws on the trilink con
212. SCP unit Uses D150 DI5
Switches: RUN READ_CACHE NOWRITE_PROTECT NOTRANSPORTABLE MAXIMUM_CACHED_TRANSFER_SIZE 32
State: ONLINE to this controller. No exclusive access.
A listing of a specific disk unit.

CLI> sho t110
MSCP unit Uses T110 TAPE110
Switches: DEFAULT_FORMAT DEVICE_DEFAULT
State: AVAILABLE. No exclusive access.
A listing of a specific tape unit.

SHUTDOWN OTHER_CONTROLLER

Shuts down and does not restart the other controller.

Note: This command is valid for HSJ and HSD controllers only.

Format
SHUTDOWN OTHER_CONTROLLER

Description
The SHUTDOWN OTHER_CONTROLLER command shuts down the other controller. If any disks are on line to the other controller, the controller will not shut down unless the OVERRIDE_ONLINE qualifier is specified (HSD and HSJ only). If any user data cannot be flushed to disk, the controller will not shut down unless the IGNORE_ERRORS qualifier is specified. Specifying IMMEDIATE will cause the other controller to shut down immediately without flushing any user data to the disks, even if drives are on line to the host.

Qualifiers for HSD and HSJ controllers

IGNORE_ERRORS
NOIGNORE_ERRORS (D)
If errors result when trying to write user data, the controller will not be shut down unless IGNORE_ERRORS is specified.
CAUTION: Customer data may be lost or corrupted if the IGNORE_ERRORS qualifier is specifi
213. SCSI ASC ASCQ Codes For Direct Access Devices such as magnetic disk ous V Ret Sh eek reU an UE wha ENS nU SCSI ASC ASCQ Codes For Sequential Access Devices such as maegne c tape O e S SCSI ASC ASCQ Codes For CDROM Devices SCSI ASC ASCQ Codes For Medium Changer Devices such as JUKEDOXES reist ende xit Vet I a ur EOS HSJ30 40 Controller Vendor Specific SCSI ASC ASCQ Codes Last Failure Event Log Template 01 Instance MSCP Event nn Pe DLE Failover Event Log Template 05 Instance MSCP Event Codes 6 21 6 30 6 30 6 45 6 50 6 50 6 62 6 65 6 65 6 66 6 67 6 80 7 20 7 21 7 21 7 21 7 43 7 45 o o 00 00 c l o NRP Qe ex oO Qe DAAMDRDAAAAAAN e o K ie O al C 20 C 21 C 22 C 23 C 24 C 25 C 26 C 27 C 28 C 29 C 30 C 31 C 32 C 33 C 34 C 35 C 36 C 37 C 38 C 39 C 40 C 41 C 42 C 43 C 44 C 45 C 46 C 47 C 48 C 49 C 50 C 51 Nonvolatile Parameter Memory Component Event Log Template 11 Instance MSCP Event Codes 00 cc cece eee ees Backup Battery Failure Event Log Template 12 Instance MSCP Event Codes Jue ode bags edis wes a Oed dva ed Subsystem Built In Self Test Failure Event Log Template 13 Instance MSCP Event Codes llle eee Memory System Failure Event Log Template 14 Instance MSCP Event Codes unit ls mte aedi di ae och ane UU ld CI Port Event Log Temp
214. SET THIS_CONTROLLER PATH The host port path for HSZ series controllers is always on, so no command is needed.

4.3.5 Initial Configuration (Dual-redundant Controllers)

In a dual-redundant configuration, one terminal can set both controller configurations. After installation of both controllers, use the CLI to define their parameters in the following order from a maintenance terminal connected to one controller.

CAUTION: Do not install HSJ series CI host port cables until after setting all parameters listed here. Failure to follow this procedure may result in adverse effects on the host cluster.

Note: Not all steps are applicable to all controller models. Steps applicable to certain models are designated as such.

1. Enter the following command to set the MAX_NODES (HSJ series controllers):
CLI> SET THIS_CONTROLLER MAX_NODES n
where n is 8, 16, or 32.

2. Enter the following command to set a valid controller ID:
CLI> SET THIS_CONTROLLER ID n
where n is the HSJ series controller CI node number (0 through MAX_NODES - 1), or n is the HSD series controller one-digit DSSI node number (0 through 7). Each controller DSSI node number must be unique on its DSSI interconnect.

3. Enter the following command to set the SCS node:
CLI> SET THIS_CONTROLLER SCS_NODENAME xxxxxx
where xxxxxx is a one- to six-character alphanumeric name for this node. The node name must be enc
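The value ranges in steps 1 and 2 can be checked on paper or with a short script before the parameters are entered. The Python sketch below encodes only the limits stated in those steps; the function names and the `ids_in_use` argument are hypothetical helpers for this example.

```python
VALID_MAX_NODES = (8, 16, 32)  # allowed MAX_NODES values from step 1


def valid_ci_node_id(node_id, max_nodes):
    """HSJ series: the CI node number must be 0 through MAX_NODES - 1."""
    if max_nodes not in VALID_MAX_NODES:
        raise ValueError("MAX_NODES must be 8, 16, or 32")
    return 0 <= node_id <= max_nodes - 1


def valid_dssi_node_id(node_id, ids_in_use=()):
    """HSD series: one-digit DSSI node number, 0 through 7, unique on the bus.

    ids_in_use lists node numbers already taken on the same DSSI interconnect.
    """
    return 0 <= node_id <= 7 and node_id not in ids_in_use
```

As an example of the dependency between the two steps, node number 16 is acceptable when MAX_NODES is 32 but not when MAX_NODES is 16.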
215. SJ and HSD series controllers. A failover component (FOC) in HS operating firmware links two controllers in a dual-redundant configuration. The controllers exchange status signals and configuration information. When one controller fails, the surviving controller takes over service to the failed controller's units. An HSJ40 controller can execute failover within 15 seconds. Failover also allows for easier system management because only one terminal connection is required to access both controllers. See Chapter 4 for more information on failover.

2.2.5.3 Caching

Cache firmware within the value-added section of HS operating firmware will address the following areas:
• Read caching
• Write-through caching
• Handling of up to 32 MB of cache
• Logical Block Number (LBN) extent locking
• Least Recently Used (LRU) replacement policy. Refer to Section 2.1.11.1 for a description of the LRU algorithm.
• Read and write-through caching enabled on a per logical unit basis

The cache policies for the product are as follows:
• Transfer defined extent (TDE) based cache
• Data caching based on transfer size (maximum read and write size is changed on a per logical unit basis)
• All I/O subject to locking

2.3 Addressing Storage Within the Subsystem

This section provides an overview about how storage is addressed in a controller subsystem. Storage is seen in two different ways depending on your perspective and cont
216. Sequence Format Flags Event Code Controller ID Controller SW ver Controller HW ver ulti Unit Code Unit ID 0 Unit ID 1 Uni to ct ct ct Recovery Level Retry Count Position Software Rev Unit Hardware Rev x x MM KM KM DX v4 MM v4 KM MK X X Formatter SW version x Formatter HW version x Instance Template Type X X Requestor Information Size x Requestor Speci Requestor Speci Requestor Speci 6 3 8 TILX Data Patterns ic Data bytes 0 7 XX XX XX XX XX XX XX XX ic Data bytes 8 15 XX XX XX XX XX XX XX XX ic Data bytes XX XX XX XX XX XX XX XX XX XX Table 6 5 defines the data patterns used with the TILX Basic Function or User Defined tests There are 18 unique data patterns These data patterns were selected as worst case or the ones most likely to produce errors on tapes connected to the controller 6 44 Diagnostics Exercisers and Utilities Table 6 5 TILX Data Pattern Definitions Pattern Number Pattern in hex 1 2 3 4 5 shifting 1s 6 shifting Os 7 alternating 1s Os 10 11 12 13 ripple 1 14 ripple 0 15 16 17 18 Default Use all of the above patterns in a random method 0000 8B8B 3333 3091 0001 0003 0007 000F 001F 003F 007F OOFF O1FF 03FF 07FF OFFF 1FFF 3FFF 7FFF FTE FFFC FFFC FFFC FFEO FFEO FFEO FFEO FEOO FC00 F800 F000 E000 C000 8000 0000 0000 0000 0000 FFFF FFFF FFFF 0000 0000 FFFF FFFF
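The "shifting 1s" entry in the pattern table follows a regular rule: each 16-bit word has one more low-order bit set than the word before it. The Python sketch below regenerates the fifteen words that are legible for that pattern (0001 through 7FFF); the function name and the `count` parameter are hypothetical, and whether the original table continues past 7FFF is not assumed here.

```python
def shifting_ones(count=15):
    """Generate 16-bit words with 1, 2, ..., count low-order bits set."""
    return [(1 << n) - 1 for n in range(1, count + 1)]


def as_hex_words(words):
    """Format words as four-digit uppercase hex, the notation used in the table."""
    return " ".join(f"{w:04X}" for w in words)
```

Calling `as_hex_words(shifting_ones())` yields a string that begins `0001 0003 0007 000F` and ends `3FFF 7FFF`, which can be compared against the table entry.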
217. Setting HSJ Series Parameters Dual Redundant SET SET SET SET SET SET THIS_CONTROLLER MAX_NODES 16 THIS CONTROLLER ID 5 SCS NODENAME HSJ01 THIS CONTROLLER MSCP ALLOCATION CLASS 4 TMSCP ALLOCATION CLASS 4 FAILOVER COPY THIS OTHER CONTROLLER MAX NODES 16 OTHER CONTROLLER ID 7 SCS NODENAME HSJ02 RESTART OTHER CONTROLLER other controller restarts at this point RESTART THIS CONTROLLER this controller restarts at this point SET THIS CONTROLLER PATH A PATH B SET OTHER CONTROLLER PATH A PATH B B 3 3 Setting HSZ Series Parameters SET THIS CONTROLLER ID 5 RESTART THIS CONTROLLER this controller restarts at this point B 3 4 Setting Terminal Speed and Parity SET THIS CONTROLLER TERMINAL SPEED 19200 NOTERMINAL PARITY Note Garbage will appear on the terminal after setting the controller s terminal speed until you set the terminal s speed to match the new speed Command Line Interpreter B 77 B 3 5 Adding Devices This example shows how to define the devices on a six port controller Define devices one at a time through the ADD command specifying device type DISK TAPE CDROM device name and device PTL location CLI gt ADD DISK DISKO CLI gt ADD DISK DISK1 CLI gt ADD DISK DISK2 CLI gt ADD DISK DISK3 CLI gt ADD DISK DISK4 CLI gt ADD TAPE TAPEO 5 CLI gt ADD CDROM CDROMO Hs Hs GW DH Lr Le a e Sa a ea eS GOG COQ C303 0 This example created the following devices Device Type
218. Shelf and SBB User's Guide.
• A 5¼-inch SBB may be located in the same shelf with three or four 3½-inch SBBs.

3.4.6 Atypical Configurations

By unbalancing the number of devices per controller port, configurations can be devised with a smaller shelf count. This results in lower performance and/or availability. The minimum shelf count for various numbers of 3½-inch SBBs is listed in Tables 3-5 and 3-6.

Table 3-5 Small Shelf Count Configurations (6-Port Controller)
Number of Devices   Number of BA350-SB Shelves   Configure as
1-6                 1                            1x6T
7-12                2                            1x6T
13-18               3                            1x6T
19-24               4                            1x6T
25-30               5                            1x6T
31-36               6                            1x6T
37-42               6                            1x7T
Notes: Consult the StorageWorks Solutions Shelf User's Guide for BA350-SB shelf information. Nonredundant controller and power configurations not recommended.

Table 3-6 Small Shelf Count Configurations (3-Port Controller)
Number of Devices   Number of BA350-SB Shelves   Configure as
1-6                 1                            1x6T
7-12                2                            1x6T
13-18               3                            1x6T
19-21               3                            1x7T
Notes: Consult the StorageWorks Solutions Shelf User's Guide for BA350-SB shelf information. Nonredundant controller and power configurations not recommended.

3.5 Controllers

This section describes specifics of configuring the controllers.

3.5.1 Nonredundant Controllers

The following guidelines apply to nonredundant controllers:
• A single controller must be installed in the slot furthest from the BA350 M
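The shelf counts in Tables 3-5 and 3-6 follow a simple pattern: one shelf per six devices, capped at six shelves for a 6-port controller and three shelves for a 3-port controller, with the last rows using seven devices per shelf. The Python sketch below reproduces the shelf-count column only; the function name is hypothetical, and equating the shelf cap with the port count is an inference from the two tables rather than a stated rule.

```python
import math


def min_shelves(devices, ports):
    """Minimum BA350-SB shelf count for a given device count.

    ports -- 6 or 3 (controller port count); the shelf cap is assumed to
             equal the port count, as the two tables suggest.
    """
    if ports not in (3, 6):
        raise ValueError("ports must be 3 or 6")
    max_devices = 7 * ports  # 42 for a 6-port, 21 for a 3-port controller
    if not 1 <= devices <= max_devices:
        raise ValueError("device count outside the range covered by the tables")
    return min(math.ceil(devices / 6), ports)
```

For example, 37 devices on a 6-port controller still need only six shelves, because the final table row places seven devices in a shelf.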
219. Specifies the allocation class (0 through 255 in a single controller configuration, or 1 through 255 in a dual-redundant configuration). When first installed, the controller's TMSCP_ALLOCATION_CLASS is set to 0.

Qualifiers for HSJ controllers

ID n
Specifies the CI node number (0 through MAX_NODES - 1).

MAX_NODES n
Specifies the maximum number of nodes (8, 16, or 32). When first installed, the controller's MAX_NODES is set to 16.

MSCP_ALLOCATION_CLASS n
Specifies the allocation class (0 through 255 in a single controller configuration, or 1 through 255 in a dual-redundant configuration). When first installed, the controller's MSCP_ALLOCATION_CLASS is set to 0.

PATH_A
NOPATH_A
Enables or disables CI Path A. When first installed, NOPATH_A is set.

PATH_B
NOPATH_B
Enables or disables CI Path B. When first installed, NOPATH_B is set.

PROMPT new prompt
Specifies a 1- to 16-character prompt, enclosed in quotes, that will be displayed when the controller's CLI prompts for input. Only printable ASCII characters are valid. When first installed, the CLI prompt is set to the first three letters of the controller's model number (for example, HSJ>, HSD>, or HSZ>).

SCS_NODENAME xxxxxx
Specifies a one- to six-character name for node.

TERMINAL_PARITY ODD EVEN
NOTERMINAL_PARITY
Specifies the parity transmitted and expected. Parity options are ODD or EVEN. NOTERMINAL_PARITY causes the controller not to check for or trans
220. Storage Works Array Controllers HS Family of Array Controllers Service Manual Order Number EK HSFAM SV B01 This manual contains necessary servicing information for the HS family of array controllers Information included pertains to configuration normal operating procedures troubleshooting and error analysis field replaceable units and removal and replacement procedures Digital Equipment Corporation Maynard Massachusetts April 1994 While Digital believes the information included in this manual is correct as of the date of publication it is subject to change without notice Digital Equipment Corporation makes no representations that the interconnection of its products in the manner described in this document will not infringe existing or future patent rights nor do the descriptions contained in this document imply the granting of licenses to make use or sell equipment or software in accordance with the description Possession use or copying of the software or firmware described in this documentation is authorized only pursuant to a valid written license from Digital an authorized sublicensor or the identified licensor No responsibility is assumed for the use or reliability of firmware on equipment not supplied by Digital Equipment Corporation or its affiliated companies Restricted Rights Use duplication or disclosure by the U S Government is subject to restrictions as set forth in subparagraph K 1 ii of
221. [Figure: block diagram showing the port processor, SCSI differential transceivers (XCVRS), and the SCSI connector to/from the host. CXO-3982A-MC]

2.1.12.3 HSZ Series SCSI-2 Interface

Figure 2-5 shows a block diagram of the HSZ series to SCSI-2 host interface hardware. The HSZ series interfaces with a SCSI-2 Fast Wide Differential (FWD) 16-bit host bus or a SCSI-2 8-bit differential bus. The hardware consists of the NCR 53C720 chip and transceivers, and functions in much the same way as the DSSI interface (refer to Section 2.1.12.2).

CAUTION: Although the HSD series and HSZ series interfaces are similar, care should be taken not to accidentally install an HSD series controller in an HSZ series system, or vice versa. Equipment damage would result.

2.2 HS Controller Firmware

The HS controller firmware, or hierarchical storage operating firmware, consists of functional code, diagnostics, utilities, and exercisers. HS operating firmware is stored in a PCMCIA program card. Digital ships the card along with your HS controller. Thereafter, each time HS operating firmware is updated, new cards are manufactured. You can purchase the update cards on a per-release basis or through an update service contract. Once the card is installed in the HS controller, the contents are validated and loaded into shared memory. Any time you reset the controller, this validating and loading process is repeated. Because of this scheme, when the firmware
222. TILX. Use Ctrl/T when invoking TILX from a VCS.
Ctrl/G causes TILX to produce a performance summary. TILX continues normal execution without affecting the runtime parameters.
Ctrl/C causes TILX to produce a performance summary, stop testing, and ask the reuse parameters question.
Ctrl/Y causes TILX to terminate. The reuse parameters question is not asked.
Ctrl/T causes TILX to produce a performance summary. TILX then continues executing normally without affecting any of the runtime parameters.
6.3.3 TILX Tests. There are three TILX tests, as follows: the Basic Function test, the User Defined test, and the Read Only test.
6.3.3.1 Basic Function Test (TILX). The Basic Function test executes a write pass followed by a read pass. The write pass executes in two phases, as follows:
Data Intensive: The first one-third of the records are written in this phase. All records written to the tape have a byte count of 16 kilobytes. With this high byte count and the default queue depth, this phase should test the streaming capability (if supported) of the tape unit.
Random: This phase is performed for the remaining two-thirds of the selected record count. It consists of writes with random byte counts. Intermixed is the sequence write, reposition back one record, read. This sequence is performed three times in a row. Tape mark writing is also intermixed in the test. The write pass is complete
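The two-phase split of the Basic Function write pass described above can be sketched as a small calculation. This is an illustrative Python sketch only, not TILX code: the function name `plan_write_pass`, the integer rounding at the one-third boundary, and the reading of 16 kilobytes as 16 * 1024 bytes are all assumptions made for the example.

```python
# Hypothetical sketch of the TILX Basic Function write-pass split.
# Assumptions (not from the manual): rounding down at the one-third
# boundary, and 16 kilobytes taken as 16 * 1024 bytes.

DATA_INTENSIVE_BYTE_COUNT = 16 * 1024  # fixed record size in phase 1


def plan_write_pass(record_count: int) -> dict:
    """Split a selected record count into the two write-pass phases."""
    if record_count < 0:
        raise ValueError("record_count must be non-negative")
    data_intensive = record_count // 3          # first one-third
    random_phase = record_count - data_intensive  # remaining two-thirds
    return {
        "data_intensive_records": data_intensive,
        "data_intensive_bytes": data_intensive * DATA_INTENSIVE_BYTE_COUNT,
        "random_records": random_phase,
    }


if __name__ == "__main__":
    print(plan_write_pass(300))
```

The random phase is returned only as a record count because its byte counts are random by definition.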
223. [Figure residue: cabinet layout showing storage positions (S1 through S14 legible), cable pass-through holes, and numbered mounting holes; CABINET FRONT, CABINET REAR; CXO-4220A-MC] Number of devices: Up to 42 devices can be attached using 7 3½-inch SBBs in each of 6 BA350
224. DIRECTORY: Lists the diagnostics and utilities available on THIS CONTROLLER.
Format: DIRECTORY
Description: The DIRECTORY command lists the various diagnostics and utilities that are available on THIS CONTROLLER. A directory of diagnostics and utilities available on this controller is displayed. For specific information about the diagnostics and utilities available, refer to the StorageWorks Array Controllers HS Family of Array Controllers Service Manual.
Examples:
1. CLI> DIRECTORY
TILX X067
DILX X067
VTDPY X067
ECHO X067
DIRECT X067
CLI X067
A directory listing.
EXIT: Exits the CLI and breaks a virtual terminal connection.
Format: EXIT
Description: When entering the EXIT command from a host using a virtual terminal connection, the connection is broken and control is returned to the host. If entered from a maintenance terminal, the EXIT command restarts the CLI, displaying the copyright notice, the controller type, and the last fail packet.
Examples:
1. CLI> EXIT
Copyright Digital Equipment Corporation 1993
HSJ40 Software version E140 Hardware version 0000
Last fail code 01800080
Press at any time for help
CLI>
An EXIT command issued from a maintenance terminal.
2. CLI> EXIT
Control returned to host
An EXIT command entered on a terminal that was connected to the CLI via a DUP connection.
HELP: Displays
225. [Figure residue: rotated VTDPY display screens; contents not recoverable] Figure 6-4 VTDPY Default Display for SCSI Controllers
226. Table C 26 Cl System Communication Services Event Log Template 33 Instance MSCP Event Codes MSCP Instance Event Code Code Description 40150204 006A Remote SYSAP sent an SCS APPL_MSG but no receive credit was available 40290104 006A Illegal connection state Not in CONNECT_REC connection state when an SCS ACCEPT_REQ is pending 402A010A 006A Illegal connection state Not in CONNECT_REC connection state when an SCS REJECT_REQ is pending 402B010A 006A Illegal connection state Not in CLOSED connection state when an SCS CONNECT REQ is pending 402C010A 006A Illegal connection state Not in OPEN or DISCONNECT REC connection state when an SCS DISCONNECT_REQ is pending 4051020A 006A Received SCS CONNECT RSP when not in CONNECT SENT connection state 4052020A 006A Received SCS CONNECT RSP when the connection is no longer valid 4053020A 006A Received SCS ACCEPT REQ when not in CONNECT ACK connection state 4054020A 006A Received SCS ACCEPT RSP when not in the ACCEPT SENT connection state 4055020A 006A Received SCS REJECT_REQ when not in the CONNECT ACK connection state 4056020A 006A Received SCS REJECT RSP when not in the REJECT SENT connection state 4057020A 006A Received SCS DISCONNECT_REQ when not in the OPEN DISCONNECT SENT or DISCONNECT_ACK connection state 4058020A 006A Received SCS DISCONNECT_RSP when not in the DISCONNECT SENT or DISCONNECT MATCH connection state 4059020A 006A Received SCS CREDIT REQ when in
227. The CI Communication Services Event Log will be sent to all host systems that have enabled Miscellaneous error logging on a connection or connections established with the HSJ30 40 controller s Disk and or Tape MSCP Server The CI System Communication Services Event Log is reported via the T MSCP Controller Errors error log message format The format of this event log including the HSJ30 40 controller specific fields is shown in Figure C 24 CI System Communication Services Event Log Format Specific Fields format This field contains the value 00 that is T MSCP Controller Errors error log format code event code The values that can be reported in this field for this event log are shown in Table C 26 C 40 HSJ Series Error Logging Figure C 24 Cl System Communication Services Event Log Template 33 Format 31 0 controller identifier instance code tdisize templ reserved event time scs opcode remote node name reserved offset 16 This field contains the value 0 instance code See Section C 2 1 for the description of this field The values that can be reported in this field for this event log are shown in Table C 26 templ See Section C 2 1 for the description of this field This field contains the value 33 for this event log HSJ Series Error Logging C 41 tdisize See Section C 2 1 for the description of this field This field contains the value 2C for this event log reserved offset 1E
228. The SHOW STRIPESETS command displays all the stripesets known by the controller.
Qualifiers: FULL. If the FULL qualifier is specified, additional amplifying information may be displayed after each storageset.
Examples:
1. CLI> SHOW STRIPESETS
Name  Storageset  Uses     Used by
ST1   stripeset   DISK500  D1
                  DISK510
                  DISK520
ST2   stripeset   DISK400  D17
                  DISK410
                  DISK420
A basic listing of all stripesets.
2. CLI> SHOW STRIPESETS FULL
Name  Storageset  Uses     Used by
ST1   stripeset   DISK500  D1
                  DISK510
                  DISK520
      CHUNKSIZE DEFAULT
ST2   stripeset   DISK400  D17
                  DISK410
                  DISK420
      CHUNKSIZE DEFAULT
A full listing of all stripesets.
SHOW stripeset-container-name: Shows information about a stripeset.
Format: SHOW stripeset-container-name
Parameters: stripeset-container-name. The name of the stripeset that will be displayed.
Description: The SHOW stripeset-container-name command is used to show specific information about a particular stripeset.
Examples:
1. CLI> SHOW STRIPE0
Name     Storageset  Uses     Used by
STRIPE0  stripeset   DISK500  D1
                     DISK510
                     DISK520
         CHUNKSIZE DEFAULT
A listing of stripeset STRIPE0.
SHOW TAPES: Shows all tape drives and tape drive information.
Note: This command is valid for HSJ and HSD controllers only.
Format: SHOW TAPES
Description: The SHOW TAPES command displays all the tape drives known to the con
229. The subsystem is unable to clear the swap signal for a swapped device where xx is the shelf number This could indicate an unsupported SBB or no power to the device shelf SWAP signal cleared all SWAP interrupts re enabled Explanation This message indicates that the swap signal is now cleared Shelf xx has a bad power supply or fan Explanation Troubleshoot the system to isolate and replace the failed component Shelf xx fixed Explanation Shelf number xx has been correctly repaired 5 6 5 Failover Messages The messages in this section are generated during failover between dual redundant controllers Received LAST GASP message from other controller Explanation One controller in a dual redundant configuration is attempting an automatic restart after failing or undergoing a bugcheck See Section 5 7 for more information on this message Other controller restarted Explanation The other controller in a dual redundant pair has successfully restarted after failing or undergoing a bugcheck See Section 5 7 for more information on this message Other controller not responding RESET signal asserted Explanation One controller in a dual redundant configuration is locked up not responding or the kill line to it is asserted SCSI Device and HSxxx controller both configured at SCSI address 6 Explanation This message appears when a device is accidentally configured as SCSI ID 6 and two controllers SCSI IDs 6 and 7 are in
230. U StorageWorks battery backup unit that extends power availability after the loss of primary ac power or a power supply to protect against the corruption or loss of data BIST Built in self test BIST is the internal self test routine for the HS controller module microprocessor chip block A stream of data transferred as a unit Used interchangeably with the term sector for disk drives to represent 512 bytes for 16 and 32 bit host architectures or 576 bytes for 36 bit architectures A block is the smallest data unit addressable on a subunit It occupies a specific physical position relative to the index and is available for reading or writing once per disk rotation The five types of blocks follow 1 Diagnostic block Used for drive read or write diagnostics The diagnostic block area is not visible to the host operating system However it is visible to the controller Diagnostic block addresses are 28 bits wide and are called diagnostic block numbers DBNs 2 External block Contains the format control tables The external block area is not visible to the host operating system However it is visible to the controller External block addresses are 28 bits wide and are called external block numbers XBNs 3 Logical block Contains the host applications area and the Replacement Control Table All logical blocks are visible to the host operating system Logical block addresses are 28 bits wide and are called logical block n
231. UNITS B 58 SHUTDOWN OTHER_CONTROLLER B 60 SHUTDOWN THIS_CONTROLLER B 62 CONFIG command 2 10 6 98 CONFIG utility 6 98 Configuration 3 inch SBB restrictions 3 9 5 inch SBB restrictions 3 9 3 inch SBBs 3 10 54 inch SBBs 3 13 atypical 3 14 available 1 1 cabinets 3 1 combination 3 1 CONFIGURATION INFO file 4 3 designation 3 10 devices 3 9 dual redundant 1 1 3 16 4 6 7 16 7 46 restrictions 1 3 highest availability 3 19 highest performance 3 17 mismatch 5 4 mixing disk and tape 3 9 Configuration cont d mixing SBB sizes 3 14 nonredundant 1 1 4 4 7 9 nonredundant controller 3 15 optimal availability 3 18 optimal performance 3 16 ordering 3 1 predefined 3 1 shelf 3 8 small shelf count 3 14 starter subsystem 3 1 SW500 series cabinets 3 6 SWS800 series cabinets 3 2 Configured to order See CTO Containers initializing B 78 Controller ID 4 5 4 6 7 10 7 16 7 46 Controller module failures 7 2 shutting down 7 2 warm swap 7 2 Controller storage explained 2 13 Controller warm swap 7 42 controller removal 7 42 controller replacement 7 44 precautions 7 42 tools 7 42 Core functions firmware 2 9 Core MIST 6 2 hardware tests 6 2 IBR 6 2 program card validation 6 2 CTO 3 1 C_SWAP command 2 10 7 42 D DAEMON 6 3 6 4 manually running 6 4 manually stopping 6 5 Data test patterns HSJ HSD series DILX 6 21 TILX 6 44
232. 4. Listen for the connector to snap into place.
CAUTION: Return a device to the slot from which it was removed. If SBBs are removed and then returned to a different slot, customer data may be destroyed.
5. Insert the SBBs into the device shelf, making sure that all SBBs are returned to their original slots. The SBB mounting tabs will snap into place as the SBBs are locked into the shelf.
6. Replace the volume shield in the controller shelf and tighten the captive screws finger-tight using a flat-head screwdriver (refer to Figure 7-10).
7. Replace the cache module(s) and controller(s) by following the procedures described in Sections 7.1 and 7.2.
8. Close and lock the cabinet doors (SW800 series) using a 5/32-inch Allen wrench.
WARNING: Service procedures described in this manual that involve blower removal or access to the rear of the shelf must be performed only by qualified service personnel.
7.9 Blowers. The BA350-MA and BA350-SB StorageWorks shelves have two rear-mounted blowers that cool the controllers and storage devices (see Figure 7-12). Connectors on the shelf backplane provide 12 Vdc power to operate them. When either blower fails, the shelf status (upper) LED on the power SBB turns off and an error message is passed to the controller or host. As long as one blower is operating, there is sufficient air flow to prevent an overtemperature condition. If both blowers fail, the shelf can o
233. X is started Copyright Digital Equipment Corporation 1993 Disk Inline Exerciser version 1 4 This message identifies the internal program as DILX and gives the DILX software version number Change Unit is not a legal option if Auto Configure was chosen Explanation This message will be displayed if the user selected the Auto Configure option and selected the change unit response to the reuse parameters question You cannot drop a unit and add a unit if all units were selected for testing DILX Normal Termination Explanation This message is displayed when DILX terminates under normal conditions Insufficient resources Explanation Following this line is a second line that gives more information about the problem which could be one of the following messages e Unable to allocate memory DILX was unable to allocate the memory it needed to perform DILX tests You should run DILX again but choose a lower queue depth and or choose fewer units to test e Cannot perform tests DILX was unable to allocate all of the resources needed to perform DILX tests You should run DILX again but choose a lower queue depth and or choose fewer units to test Unable to change operation mode to maintenance DILX tried to change the operation mode from normal to maintenance using the SYSAP CHANGE_STATE routine but was not successful due to insufficient resources This problem should not occur If it does occur submit a
234. [Figure residue: rotated VTDPY display screens; contents not recoverable] Figure 6-8 VTDPY Brief DSSI Status Display
235. a dual redundant configuration Both HSxxx controllers are using SCSI address 6 Explanation There is a hardware problem with the BA350 MA shelf This problem probably involves the shelf backplane Both HSxxx controllers are using SCSI address 7 Explanation There is a hardware problem with the BA350 MA shelf This problem probably involves the shelf backplane Error Analysis and Fault Isolation 5 15 5 6 6 Other CLI Messages The previous sections detailed automatic messages you may encounter For a list of other messages you may see during interactive use of the CLI see Appendix B Consult your firmware release notes for updates to the list of error messages 5 7 Host Error Logs Events related to controller and device operation are recorded in the host error log If the OCP device LEDs or error messages cannot help you determine the cause of a problem review the host error logs They provide the greatest level of detail about the controller and connected devices 5 7 1 Translation Utilities OpenVMS systems have the Errorlog Report Formatter ERF to aid in error log translation The tool reads the information from the log and provides the operator with more information about what the log means with respect to controller operation and repair ERF provides bit to text translation of the binary log so that the operator can read the information The OpenVMS DCL command ANALYZE ERROR_LOG invokes ERF For a description of th
236. a list of data patterns. The default uses all 18 patterns in a random method. This question also allows you to create a unique data pattern of your own choice.
Enter the 8-digit hexadecimal user defined data pattern. Explanation: This question is only displayed if you choose to use a User Defined data pattern for write commands. The data pattern is represented in a longword and can be specified with eight hexadecimal digits.
Enter start block number (0:highest lbn on the disk) [0]? Explanation: Enter the starting block number of the area on the disk you wish DILX to test. Zero is the default.
Enter end block number (starting lbn:highest lbn on the disk) [highest lbn on the disk]? Explanation: Enter the highest block number of the area on the disk you wish DILX to test. The highest block number of that type of disk is the default.
Perform data compare (y/n) [n]? Explanation: Enter Y to enable data compares. Enter N and no data compare operations are done. This question is only asked if you select the initial write option. Data compares are only performed on reads. This option can be used to test data integrity.
Enter compare percentage (1:100) [5]? Explanation: This question is displayed only if you choose to perform data compares. This question allows you to change the percentage of read and write commands that will have a data compare operation performed. Enter a value indicating the compare percen
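The input rules stated in these questions (an eight-digit hexadecimal pattern held in a longword, and a compare percentage from 1 to 100 with a default of 5) can be illustrated with a small validation sketch. This is a hypothetical Python example, not DILX code: the function names, the treatment of an empty answer as "take the default", and the error messages are assumptions.

```python
import re

# Hypothetical validators mirroring the DILX question rules quoted above.
# Function names and error handling are assumptions for illustration.


def parse_data_pattern(answer: str) -> int:
    """Accept exactly eight hexadecimal digits and return the 32-bit longword."""
    text = answer.strip()
    if not re.fullmatch(r"[0-9A-Fa-f]{8}", text):
        raise ValueError("pattern must be exactly 8 hexadecimal digits")
    return int(text, 16)


def parse_compare_percentage(answer: str, default: int = 5) -> int:
    """Accept 1..100; an empty answer selects the default of 5."""
    text = answer.strip()
    if text == "":
        return default
    value = int(text)
    if not 1 <= value <= 100:
        raise ValueError("compare percentage must be in the range 1 to 100")
    return value


if __name__ == "__main__":
    print(hex(parse_data_pattern("A5A5A5A5")))
    print(parse_compare_percentage(""))
```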
237. a unit out if it
Error 9300: Metadata found on container. Are you sure this is a TRANSPORTABLE container? An INITIALIZE must be issued before this container may be used. Explanation: Metadata was found on a TRANSPORTABLE container. Enter an INITIALIZE command.
Error 9330: NV memory write collision. Please try again. Explanation: Two users were trying to configure the CLI at the same time. Check the configuration you were trying to modify to make sure it is unchanged, and retry the command.
Error 9350: Metadata found on container but the chunksize is different. Either a SET <storageset name> CHUNKSIZE <chunksize> or an INITIALIZE <storageset name> must be issued before this container may be used. Explanation: The chunksize defined by the ADD or SET command is different from that on the media. Either INITIALIZE the storageset or SET the chunksize to the given value.
Error 9360: A tape is not installed at the PTL <port> <target> <lun>. Cannot set tape switches unless a tape is installed. Explanation: A SET or ADD command specified a tape format, but there was no tape installed at the tape's PTL. Install a tape and retry the command.
Error 9370: A <tape name> is an unsupported device. Tape switches cannot be set on unsupported devices. Explanation: The tape installed is not currently supported by the controller. Replace the tape with a supported device and ret
238. a unit while it is write enabled. This is the last chance to back out of testing the displayed unit. Enter Y to write enable the unit. Enter N to back out of testing that unit.
Select another unit (y/n) [n]? Explanation: Enter Y to select another unit for testing. Enter N to begin testing the units already selected. The system will display the following test selections:
Available tests are:
1. Basic Function
2. User Defined Test
Use the Basic Function 99.9% of the time. The User Defined test is for special problems only.
Enter test number (1:2) [1]? Explanation: Enter 1 for the Basic Function test or 2 for the User Defined test. After selecting a test, the system will then display the following message:
IMPORTANT: If you answer yes to the next question, user data WILL BE destroyed.
Write enable disk unit (y/n) [n]? Explanation: Enter Y to write enable the unit. Write commands are enabled for the currently selected test. Data within your selected LBN range will be destroyed. Be sure of your actions before answering this question. This question applies to all DILX tests. Enter N to enable read only mode, where read and access commands are the only commands enabled.
Perform initial write (y/n) [n]? Explanation: Enter Y to write to the entire user-selected LBN range with the user-selected data patterns. Enter N for no
239. able to get LOCAL STATIC memory from exec for use as Write History Log Allocation Failure Lists
60640100  Invalid condition when there exists no unused Write History Log Entries
60650100  Attempting to block incoming requests for the tape loader when it was unexpectedly found already blocked
60660100  Loader boundary block request to stall incoming requests to the tape loader unit was not set up as expected
60670100  Invalid return value from routine HIS XMIT_APPL_MSG
60680100  VAS ENABLE NOTIFICATION failed with insufficient resources at init time
606B0100  mscp foc receive cmd detected that the message sent from the other controller had an illegal usb index
606C0100  mscp foc receive cmd detected that the message sent from the other controller had an illegal exclusive access state
Table C-43 (Cont.) Disk and Tape MSCP Server Last Failure Codes
Code      Description
606D0100  FOC provided mscp foc send cmpl rtn with an invalid status for the FOC SEND transmit command completion
606E0100  FOC provided mscp foc send rsp done with an invalid transmit status for the FOC SEND transmit response completion
Table C-44 Diagnostics and Utilities Protocol Server Last Failure Codes
Code      Description
61020100  HIS LISTEN call failed with INSUFFICIENT RESOURCES
61090100  LISTEN CONNECTION ESTABLISHED event from HIS specified a connection ID
610C0100
240. ace on an approved ESD work surface or mat 9 If necessary you may now remove the cache module as described in Section 7 2 3 Removing and Replacing Field Replaceable Units 7 43 Once you remove the controller you will see the following displayed as the subsystem uses the remaining controller to service the quiesced ports Restarting ALL ports Port 1 restarted Port 2 restarted Port 3 restarted Port 4 restarted Port 5 restarted Port 6 restarted 7 11 2 4 Controller Replacement Use the following procedure to replace the controller il The system will prompt you with the following to replace the controller Do you have a replacement HSJ40 readily available N Try to have a replacement available If you do not have one you must answer with N Then the warm swap sequence will terminate and you must restart the routine later when you have a replacement When you find a replacement you can restart the sequence by entering the RUN C_SWAP command again The system responds with the following Do you have a replacement HSJ40 readily available N Answer Y if you have the controller The following is displayed next Sequence to INSERT other HSJ40 has begun Do you wish to INSERT the other HSJ40 N Answer Y to insert the controller Remember to first reinsert the cache module if applicable Attempting to quiese all ports Port 1 quiesced Port 2 quiesced Port 3 quiesced Port 4 quie
241. aceable Units 7 37 7 10 4 Power Supply Replacement Installation Use the following procedure to replace a power supply refer to Figure 7 13 CAUTION The power supply is relatively heavy and can be damaged if dropped Always use both hands to fully support the power supply during replacement 1 Hold the power supply in both hands and firmly push it into the shelf until you hear the mounting tabs snap into place 2 Plug the power cord back into the power supply Observe the power and shelf status LEDs to make sure both turn on If both LEDs do not turn on refer to Chapter 5 for troubleshooting basics 4 Close and lock the cabinet doors SW800 series using a 5 32 inch Allen wrench 7 11 Warm Swap When you warm swap a storage SBB or a controller you quickly and efficiently remove the hardware and install a replacement Warm swap is possible without taking your controllers out of service or adversely affecting activity on the rest of the subsystem Using warm swap also preserves data integrity Note Warm swap is not applicable to service on unpowered StorageWorks shelves Do not attempt to execute warm swap on an unpowered shelf 7 11 1 SBB Warm Swap Device warm swap involves quickly removing and replacing a disk drive tape drive or other storage SBB You can safely remove SBBs without taking your system or controller off line However before removing a device either the controller or the operator mus
242. adapters Supersedes CIBCA A CIBCA A is no longer supported See the HSZ series firmware release notes for restrictions 3 20 Configuration Rules and Restrictions 4 Normal Operation This chapter describes operating conditions and procedures for the HS controllers Included is information about both storage and controller configurations The configurations discussed in this chapter are those set by the operator employing user interfaces such as the HS operating firmware and or operating system commands Refer to Chapter 3 for physical configuration of the subsystem hardware Also given are cross references to other sections of this manual where more information about controller operation is provided 4 1 Initialization The following sections discuss the operating conditions surrounding initialization of the controller and subsystem 4 1 1 Controller Initialization The controller will initialize after any of the following conditions e Power is turned on e The firmware resets the controller The operator presses the green reset button e The host clears the controller Note Keep the program card in its slot during controller subsystem operation If the program card is removed the controller will reset See Chapter 6 for a description of the initialization of both the controller and its cache module The process is described in Chapter 6 because some of the initialization diagnostics are avai
243. after resetting the controller If the error remains the same look up information for that error If the indication changes look up information for the newer error Refer to Chapter 5 for detailed information about errors and repair actions Before Proceeding You should decide exactly what you will be servicing a nonredundant controller one dual redundant controller or both dual redundant controllers before proceeding to the following sections as each procedure varies and has different consequences 7 1 2 Shutting Down a Controller Controller failures are not the only reason to remove and replace a controller module You may be moving resources or removing a functioning controller for use as a replacement somewhere else in your system Note If you wish to quickly remove and replace one controller in a dual redundant configuration you may warm swap see Section 7 11 2 the controller with a replacement if you have one This method provides the fastest most transparent way of exchanging controllers with minimal system impact and no down time Unless you are warm swapping a controller you must shut down a functional controller before removing it Use the following guidelines to shut down a controller Always stop all processes on and dismount devices attached to a controller you intend to shut down To enter any CLI gt SHUTDOWN command your terminal must be connected to a fully or partially functio
244. age display 6 42 error codes 6 50 error information packets 6 42 examples 6 45 interrupting 6 31 output messages 6 37 performance summary 6 48 read only test 6 33 running from maintenance terminal 6 31 running from VCS 6 31 running from virtual terminal 6 31 test definition questions 6 33 tests available 6 32 user defined test 6 32 using all defaults 6 45 using all functions 6 46 TMSCP 4 5 4 7 7 10 7 17 7 46 Index 19 TMSCP timeout 4 14 Transportable devices 4 18 TRANSPORTABLE qualifier 4 9 7 12 Troubleshooting 5 2 5 11 7 2 and error logs 5 2 and visual indicators 5 2 error messages 5 11 fault notification 5 2 using OCP 5 2 U uerf invoking 5 16 Units adding B 78 creating from disk B 79 creating from stripeset B 79 creating from tape B 79 deleting B 80 renumbering B 80 transportable B 80 write protection B 79 UNIX Errorlog Report Formatter See uerf Upgrade cache memory capacity 7 20 firmware 1 1 User defined test HSJ HSD series DILX 6 8 TILX 6 32 HSZ series DILX 6 52 V Value added firmware 2 12 VAXcluster console system See VCS VCS 2 4 4 11 6 5 6 6 6 31 Virtual terminal 1 5 2 3 HSZ series controllers 6 100 VTDPY 1 5 2 10 6 65 help 6 97 W Warm swap 7 38 See also Controller warm swap See also Device warm swap controller 1 5 2 10 7 42 controller module 7 2 defined 7 38 HSZ series controller 7 3
245. ailable due to a media format inconsistency.
i  Offline, Inoperative: The unit is inoperative and cannot be brought available by the controller.
m  Offline, Maintenance: The unit has been placed in Maintenance mode for diagnostic or other purposes.
o  Online: Mounted by at least one of the host systems.
r  Offline, Rundown: The CLI SET NORUN command has been issued for this unit.
v  Offline, No Volume Mounted: The device does not contain media.
x  Online to other controller: Not available for use by this controller.
A space in this column indicates the availability is unknown.
The spindle state is indicated using the following characters:
^  For disks, this symbol indicates the device is at speed. For tapes, it indicates the tape is loaded.
>  For disks, this symbol indicates the device is spinning up. For tapes, it indicates the tape is loading.
<  For disks, this symbol indicates the device is spinning down. For tapes, it indicates the tape is unloading.
v  For disks, this symbol indicates the device is stopped. For tapes, it indicates the tape is unloaded.
For other types of devices, this column is left blank.
For disks and tapes, a w in the write protect column indicates the unit is write protected. This column is left blank for other device types.
The data caching state is indicated using the following letters:
r  Read caching is enabled.
A s
246. ails during normal operation, the controller will continue to operate. You will have to extract the host error log to determine the cause of this error.
The following sections list automatic messages you may encounter. The controller sends these messages when the specific fault is detected, regardless of whether or not you are interactively viewing or using the virtual or maintenance terminal. In this respect these messages differ from the ones listed in Appendix B, which appear based on your inputs to the CLI.
Be aware that not all the error messages listed in this section will pertain to your model of controller. Some messages are specific to the HSJ, HSD, or HSZ series controllers.
5.6.1 Diagnostic Messages
This section contains error messages that may be displayed if a fault occurs during initialization or self-test diagnostics. See Chapter 6 for more information on diagnostics.
Half CACHE FAILED Diagnostics
Explanation: Up to 50 percent of the cache memory has failed diagnostic tests.
Whole CACHE FAILED Diagnostics
Explanation: The cache module has failed diagnostic tests.
SCSI port n FAILED Diagnostics
Explanation: A SCSI-2 port has failed diagnostics. This message can appear even if you do not have a host connection. The variable n indicates which port failed.
HOST port FAILED Diagnostics
Explanation: The host port of the controller has failed diagnostics.
CI Path x has FAILED extern
247. ains error threshold values Section C 5 contains recommended repair actions e When you look up a specific instance code you will notice that each error belongs to one of fifteen template types Each template type has a one byte value identifying it which is also located in the CONTROLLER DEPENDENT INFORMATION area longwords as shown in Table C 1 You may be able to use Table C 1 to quickly identify the template type after examining the longwords in the CONTROLLER DEPENDENT INFORMATION area However since the location of the value identifying the template varies the safest way to determine the template is to use the instance code The template type is always the very next byte after the instance code C 4 HSJ Series Error Logging Table C 1 Template Types Deskewed Description Template Longword Value Value Last Failure Event Log 014 2 2401xxxx 00002401 Failover Event Log 057 2 0005xxxx 00000005 Host buffer Access Error Event Log 10 2 00000C10 Nonvolatile Parameter Memory 11 2 00000811 Component Event Log Backup Battery Failure Event Log 12 2 00000012 Subsystem Built In Self Test Failure 131 2 2413xxxx 00002413 Event Log Cache Memory Failure Event Log 14 2 00002414 CI Port Event Log 311 2 0C31xxxx 00000C31 CI Port Port Driver Event Log 321 2 1032xxxx 00001032 CI System Communication Services 331 2 2C33xxxx 00002C33 Event Log Device Services Nontransfer Event Error 417 2 0441 xxxx 00000441 Log Disk Transfer E
248. al of nine characters.
SCSI-location: The location of the CDROM drive to be added, in the form PTL, where P designates the port (1 through 6, or 1 through 3, depending on the controller model), T designates the target ID of the CDROM drive (0 through 6 in a nonfailover configuration, or 0 through 5 if the controller is in a failover configuration), and L designates the LUN of the CDROM drive (0 through 7). When entering PTL, at least one space must separate the port, target, and LUN.
Adds a CDROM drive to the known list of CDROM drives and names the drive. This command must be used when a new SCSI-2 CDROM drive is to be added to the configuration.
CLI> ADD CDROM CD_PLAYER 1 0 0
A CDROM drive is added to port 1, target 0, LUN 0, and named CD_PLAYER.
ADD DISK
Adds a disk drive to the known list of disk drives.
Format: ADD DISK container-name SCSI-location
Parameters:
container-name: Specifies the name that will be used to refer to this disk drive. This name will be referred to when creating units and stripesets. The name must start with a letter (A through Z) and can then consist of up to eight more characters made up of A through Z, 0 through 9, period, dash, and underscore, for a total of nine characters.
SCSI-location: The location of the disk drive to be added, in the form PTL, where P designates the port 1 through 6 or 1 through 3
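The naming and PTL rules described for ADD CDROM and ADD DISK can be checked before a command is typed at the CLI. The following is only a minimal sketch in Python, not part of the controller firmware. The function names, the max_port and failover parameters, and the decision to accept lowercase input are assumptions made for illustration.

```python
import re

# Rule from the text: one letter, then up to eight more characters drawn
# from A-Z, 0-9, period, dash, and underscore (nine characters in total).
NAME_RE = re.compile(r"[A-Z][A-Z0-9._-]{0,8}")


def is_valid_container_name(name):
    """Return True if name follows the container naming rule.

    Assumption: input is folded to uppercase before checking.
    """
    return NAME_RE.fullmatch(name.upper()) is not None


def parse_ptl(text, max_port=6, failover=False):
    """Parse a space-separated 'P T L' string and range-check each field.

    max_port is 6 or 3 depending on the controller model (hypothetical
    parameter); failover=True narrows the target range to 0 through 5.
    """
    parts = text.split()
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError("PTL must be three numbers separated by spaces")
    port, target, lun = (int(p) for p in parts)
    max_target = 5 if failover else 6
    if not 1 <= port <= max_port:
        raise ValueError(f"port must be 1 through {max_port}")
    if not 0 <= target <= max_target:
        raise ValueError(f"target must be 0 through {max_target}")
    if not 0 <= lun <= 7:
        raise ValueError("LUN must be 0 through 7")
    return port, target, lun
```

For example, parse_ptl("1 0 0") returns the tuple (1, 0, 0), while parse_ptl("1 6 0", failover=True) raises ValueError because target 6 is not allowed in a failover configuration.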
249. al loop back Diagnostics
Explanation: The CI path named by x has failed the loop-back diagnostics. x can be A or B.
Local Terminal Port FAILED Diagnostics
Explanation: The maintenance (EIA-423) terminal port has failed diagnostics.
5.6.2 NVPM Messages
The messages listed in this section are displayed because of a problem or fault associated with the nonvolatile parameters in memory (NVPM).
Note: Some NVPM messages will read "NVPM component-name component initialized to default settings." For some of these initialization cases, corrective action may only clear the error message until the next time the controller is reset, because the error could be caused by a fault in NVPM itself. If the error persists, replace the controller module.
NVPM Revision level updated from n to N
Explanation: The format of the NVPM has changed as a result of installing a newer program card containing updated firmware. However, all subsystem configuration information has been retained.
NVPM Failover Information component initialized to default settings
Explanation: The identity of the other controller in a dual-redundant pair has been lost. Enter the SET FAILOVER COPY=OTHER_CONTROLLER command to correct this problem.
NVPM Host Interconnect Parameters component initialized to default settings
Explanation: The SCS node name, CI node number, or Path A or Path B enable settings for this controller
250. al maintenance terminal protocol allows asynchronous delivery of control characters using the CLI SEND DIAGNOSTIC PAGE command. The CLI Command Code field is set to ANSWER, and the control character is placed in the first byte of the ASCII text buffer. Any other characters in the ASCII text buffer are ignored.
There is no fixed connection made between the host process and the HSZ series controller. It is therefore possible to implement a host interface that allows a user to exit the host program while a program is running within the HSZ series controller. The terminal session could be resumed at a later time. This also implies that if multiple users attempt to have simultaneous virtual terminal sessions, the resulting responses from the controller may be unpredictable.
6.7.5.2 Host Virtual Terminal I/O Algorithm
The following is the sequence of events in the host virtual maintenance terminal I/O algorithm:
1. Obtain the device information.
2. Enter a SCSI INQUIRY command and display the returned INQUIRY information.
3. Make sure the remote device supports the protocol's diagnostic pages by entering a SCSI RECEIVE DIAGNOSTIC RESULTS command for page 0 and comparing the received list with the virtual terminal protocol list. If the diagnostic pages are not supported, then exit.
4. Enter SCSI TEST UNIT READY commands until either the device becomes available or a failure occurs. If a failure occurs, then exit.
5. T
251. alifier is specified. NOIMMEDIATE is the default.
OVERRIDE_ONLINE / NOOVERRIDE_ONLINE (D)
If any units are on line to the controller, SELFTEST will not take place unless OVERRIDE_ONLINE is specified. If the OVERRIDE_ONLINE qualifier is specified, the controller will start self-test after all customer data is written to disk.
CAUTION: Customer data may be lost or corrupted if the OVERRIDE_ONLINE qualifier is specified. NOOVERRIDE_ONLINE is the default.
Qualifiers for HSZ controllers
IGNORE_ERRORS / NOIGNORE_ERRORS (D)
If errors result when trying to write user data, the controller will not start self-test unless IGNORE_ERRORS is specified.
CAUTION: Customer data may be lost or corrupted if the IGNORE_ERRORS qualifier is specified. NOIGNORE_ERRORS is the default.
IMMEDIATE / NOIMMEDIATE (D)
If IMMEDIATE is specified, the self-test starts on the controller immediately, without checking for online devices.
CAUTION: Customer data may be lost or corrupted if the IMMEDIATE qualifier is specified. NOIMMEDIATE is the default.
Examples
CLI> SELFTEST THIS_CONTROLLER
Starts the self-test on this controller, as long as this controller does not have any units on line.
CLI> SELFTEST THIS_CONTROLLER OVERRIDE_ONLINE
Starts the self-test on this controller even if there are units on line to this controller.
Command Line Inte
252. and need be entered to make the command unique (usually three characters). For example, SHO is equivalent to SHOW.
4.3.4 Initial Configuration (Nonredundant Controller)
After installing a nonredundant controller, use the CLI from a maintenance terminal to define its parameters in the following order.
CAUTION: Do not install HSJ series CI host port cables until after setting all parameters listed here. Failure to follow this procedure may result in adverse effects on the host cluster.
Note: Not all steps are applicable to all controller models. Steps applicable to certain models are designated as such.
1. Enter the following command to set MAX_NODES (HSJ series controllers):
CLI> SET THIS_CONTROLLER MAX_NODES=n
where n is 8, 16, or 32.
2. Enter the following command to set a valid controller ID:
CLI> SET THIS_CONTROLLER ID=n
where:
• n is the HSJ series controller CI node number (0 through MAX_NODES - 1), or
• n is the HSD series controller one-digit DSSI node number (0 through 7). Each controller DSSI node number must be unique on its DSSI interconnect. Or
• n is the HSZ series controller SCSI target ID(s) (0 through 7).
3. Enter the following command to set the SCS node name (HSJ and HSD series controllers):
CLI> SET THIS_CONTROLLER SCS_NODENAME="xxxxxx"
where xxxxxx is a one- to six-character alphanumeric name for this node. The node name must be enclosed in qu
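The ID ranges given in steps 1 and 2 depend on the controller series, and for the HSJ series also on MAX_NODES. A minimal Python sketch of that rule follows. It is an illustration only; the function name and the series strings are hypothetical, and the sketch checks a single ID rather than a list of HSZ target IDs.

```python
# Values the text allows for MAX_NODES on HSJ series controllers.
VALID_MAX_NODES = (8, 16, 32)


def is_valid_controller_id(series, node_id, max_nodes=None):
    """Check a single controller ID against the ranges stated in the text.

    series    -- "HSJ", "HSD", or "HSZ" (hypothetical labels)
    node_id   -- CI node number, DSSI node number, or SCSI target ID
    max_nodes -- required for HSJ only; must be 8, 16, or 32
    """
    if series == "HSJ":
        if max_nodes not in VALID_MAX_NODES:
            raise ValueError("MAX_NODES must be 8, 16, or 32")
        # CI node number: 0 through MAX_NODES - 1
        return 0 <= node_id <= max_nodes - 1
    if series in ("HSD", "HSZ"):
        # DSSI node number or SCSI target ID: 0 through 7
        return 0 <= node_id <= 7
    raise ValueError("unknown controller series: " + repr(series))
```

Uniqueness of a DSSI node number on its interconnect cannot be checked from the number alone, so the sketch leaves that to the operator.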
253. ands They will be executed in the order entered The commands will be repeated until the execution time limit expires CAUTION If you define write commands user data will be destroyed Enter command number x read write quit Explanation This question only applies to the User Defined test It allows you to define command x as a read or write command Enter quit to finish defining the test After making your command selection s the following message is displayed by DILX IMPORTANT If you answer yes to the next question user data WILL BE destroyed Write enable disk unit y n n Explanation Enter Y to write enable the unit Write commands are enabled for the currently selected test Data within your selected LBN range will be destroyed Be sure of your actions before answering this question This question applies to all DILX tests Enter N to enable read only mode where read and access commands are the only commands enabled Perform initial write y n n Explanation Enter Y to write to the entire user selected LBN range with the user selected data patterns Enter N for no initial write pass If you respond with Y the system performs writes starting at the lowest user selected LBN and issues spiral I Os with the largest byte count possible This continues until the specified LBN range has been completely written Upon completion of the initial write pass normal functions of the Ran
254. anel
The operator control panel (OCP) on the front of the controller has seven buttons and LEDs. The buttons and LEDs serve different functions with respect to controlling the SCSI ports and/or reporting fault and normal conditions. See Chapter 5 for a complete description of the OCP.
1.4 Precautions
This section describes necessary precautions and procedures for properly maintaining and servicing HS controllers.
1.4.1 Electrostatic Discharge Protection
Electrostatic discharge (ESD) is a common problem for any electronic device and may cause data loss, system down time, and other problems. The most common source of static electricity is the movement of people in contact with carpets and clothing. Low humidity also increases the amount of static electricity. You must discharge all static electricity prior to touching electronic equipment.
In general, you should follow routine ESD protection procedures when handling controller modules and cache modules, and when working around the cabinet and shelf that houses the modules. Follow these guidelines to further minimize ESD problems:
• Maintain more than 40 percent humidity in the room where the equipment is installed.
• Place the subsystem cabinet away from heavy traffic paths.
• Do not place the subsystem on carpet, if possible. If carpet is necessary, choose antistatic carpet. If the carpet is already in place, place antistatic mats around the
255. ard and soft errors experienced for this unit The following is an example of a DILX performance display where performance statistics were not selected and where a controller error was detected DILX Summary at 18 JUN 1993 06 18 41 Test minutes remaining 0 expired 6 Cnt err in HEX 1C 07080064 Key 06 ASC Q A0 05 HC 1 SC 0 Total Cntrl Errs Hard Cnt 1 Soft Cnt 0 Unit 1 Total IO Requests 482 No errors detected Unit 2 Total IO Requests 490 No errors detected For the previous examples the following definitions apply These codes are translated in Appendix E e C The HSZ series Instance code e ASC Q The SCSI ASC and ASCQ code associated with this error e HC The hard count of this error e SC The soft count of this error e PTL The location of the unit Port Target LUN 6 64 Diagnostics Exercisers and Utilities The performance displays contain error information for up to three unique errors Hard errors always have precedence over soft errors A soft error represented in one display may be replaced with information on a hard error in subsequent performance displays 6 4 10 DILX Abort Codes Table 6 9 lists the DILX abort codes and definitions Table 6 9 DILX Abort Codes and Definitions Value Definition 1 An IO has timed out 2 dcb_p gt htb_used_count reflects an available HTB to test IOs but none could be found FAO returned either FAO BAD FORMAT or FAO OVERFLOW TS SEND TERMINAL DATA return
256. ard(s) out. If you are updating firmware, follow the instructions included with your new firmware for used card return or disposal.
7.3.4 Card Replacement/Installation
Use the following procedure to replace the program card.
Note: If you are updating firmware, install your new program card(s) by following the instructions included with the card(s). Otherwise, you may use the following guidelines to replace the program card(s).
1. For a nonredundant configuration: Press and hold the controller green OCP reset button. Then insert the program card. The program card eject button will extend when the card is fully inserted.
For a dual-redundant configuration: Press and hold both green reset buttons at the same time, even if you are only replacing one of the cards. Then insert the program card(s). The program card eject button will extend when the card is fully inserted.
2. Release the reset button(s) to initialize the controller(s). If the controller(s) initialize correctly, the green reset LED(s) will begin to flash at 1 Hz. If an error occurs during initialization, the OCP(s) will display a code. Refer to Chapter 5 to analyze any codes.
3. If you wish, you may disconnect the maintenance terminal. The terminal is not required for normal controller operation.
4. Close and lock the cabinet doors (SW800 series) using a 5/32-inch Allen wrench.
7.4 External CI Cable
257. are written to the tape The data are then read from the tape and compared against the corresponding TILX buffers On read commands the data are read from the tape into the TILX buffers read again and then compared against the corresponding TILX buffers If a discrepancy is found an error is reported Enter N and the compare modifier bit is disabled The default is to have the bit disabled Enter compare percentage 1 100 2 Explanation This question is displayed only if you choose to perform data compares It allows you to enter the percentage of read and write commands that will have a data compare operation performed Enter command number x red wrt rew wtm rpr rpf quit Explanation This question only applies to the User Defined test It allows you to define command x as a read write rewind write tape mark reposition records or reposition file marks Enter quit to finish defining the test Reposition towards EOT y EOT n BOT y Explanation If you specify the reposition records or reposition file marks command in the User Defined test this question is displayed Enter the direction of the reposition operation you want either towards the end of tape EOT or at the beginning of tape BOT Enter number of records to reposition 1 255 1 Explanation If you specify the reposition records command in the User Defined test this question is displayed The question is self explanatory Enter numb
258. arget device. The Additional Sense Code (ASC) field and the Additional Sense Code Qualifier (ASCQ) field together describe the event being reported. The standard SCSI ASC/ASCQ codes are devtype dependent, as shown in Tables C-13 through C-16.
Note that the SCSI standard defines ASCs within the range 80 through FF (in combination with ASCQs within the range 00 through FF), and ASCQs within the range 80 through FF (regardless of ASC value), as being vendor specific. Refer to documentation provided by the vendor of the target device for a description of an ASC/ASCQ value that falls within the defined vendor-specific ranges.
If the value contained in the addsns field is 6 or greater and the dssd subfield of the sdqual field is equal to 0, the asc and ascq fields contain HSJ30/40 controller vendor-specific SCSI ASC/ASCQ codes generated by the HSJ30/40 on behalf of the target device. See Table C-17 for the descriptions of the HSJ30/40 controller vendor-specific SCSI ASC/ASCQ codes.
frucode: If the value contained in the addsnsl field is 7 or greater, this field contains byte 0E (Field Replaceable Unit field) of the Sense Data returned in the response of a SCSI REQUEST SENSE command. If this field is nonzero, the target device is identifying the field replaceable unit that has failed. Refer to documentation for the target device for complete details of the meaning of this value.
keyspec: If the value contained in th
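The vendor-specific ranges described above reduce to a simple test on the two bytes: a pair is vendor specific when either the ASC or the ASCQ is 80 (hex) or higher. A minimal Python sketch of that test follows; the function name is hypothetical and the sketch does not look up any code meanings.

```python
def is_vendor_specific(asc, ascq):
    """Return True if an ASC/ASCQ pair falls in a vendor-specific range.

    Per the ranges quoted in the text: ASC 80-FF with any ASCQ, or
    ASCQ 80-FF with any ASC. Both arguments are single-byte values.
    """
    for value in (asc, ascq):
        if not 0x00 <= value <= 0xFF:
            raise ValueError("ASC and ASCQ must each fit in one byte")
    return asc >= 0x80 or ascq >= 0x80
```

A pair that passes this test should be looked up in the target device vendor's documentation (or, when the controller generated the code, in the controller's own vendor-specific table) rather than in the standard tables.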
259. arity transmitted and expected. Parity options are ODD or EVEN. NOTERMINAL_PARITY causes the controller not to check for or transmit any parity on the terminal lines. When first installed, the controller's terminal parity is set to NOTERMINAL_PARITY.
TERMINAL_SPEED=baud_rate
Sets the terminal speed to 300, 600, 1200, 2400, 4800, or 9600 baud. The transmit speed is always equal to the receive speed. When first installed, the controller's terminal speed is set to 9600 baud.
Examples
1. CLI> SET THIS_CONTROLLER PATH_A PATH_B SPEED=1200
Turns on this HSJ controller's two CI paths and sets the terminal speed to 1200 baud.
2. CLI> SET THIS_CONTROLLER ID=5
Sets this HSZ controller so it responds to requests for target 5.
3. CLI> SET THIS_CONTROLLER ID=(2,5)
Sets this HSZ controller so it responds to requests for targets 2 and 5.
SET unit-number
Modifies the unit parameters.
Format: SET unit-number
Parameters:
unit-number: Specifies the logical unit number (on HSDs and HSJs, D0 through D4094 or T0 through T4094; on HSZs, D0 through D7 or T0 through T7) whose software switches are to be modified. This is the name given the unit when it was created using the ADD UNIT command.
Description: The SET command is used to change logical unit parameters.
Qualifiers for a unit created from a CDROM drive (HSJ and HSD only)
MAXIMUM_CACHED_TRANSFER=n / MAXIMUM_CACHED_TRANSFER=32 (D)
Specifies the maximum size tran
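The unit-number ranges for SET unit-number and the allowed TERMINAL_SPEED values lend themselves to a quick pre-check. The sketch below, in Python, is an illustration only. The function names and series strings are hypothetical, and accepting lowercase input is an assumption.

```python
import re

# A unit name is D or T followed by a decimal number.
UNIT_RE = re.compile(r"([DT])(\d+)")

# Baud rates listed for TERMINAL_SPEED in the text.
TERMINAL_SPEEDS = (300, 600, 1200, 2400, 4800, 9600)


def is_valid_unit_number(name, series):
    """Check a unit name such as D12 or T3 against the stated ranges.

    HSJ and HSD: D0-D4094 or T0-T4094. HSZ: D0-D7 or T0-T7.
    """
    if series not in ("HSJ", "HSD", "HSZ"):
        raise ValueError("unknown controller series: " + repr(series))
    match = UNIT_RE.fullmatch(name.upper())
    if match is None:
        return False
    limit = 7 if series == "HSZ" else 4094
    return int(match.group(2)) <= limit


def is_valid_terminal_speed(baud):
    """Return True if baud is one of the listed terminal speeds."""
    return baud in TERMINAL_SPEEDS
```

For instance, D4094 is accepted for an HSJ series controller but rejected for an HSZ series controller, where the highest unit is D7.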
260. as given to the reuse parameters question This is not a valid response if the run time has expired Reinvoke TILX When TILX starts to exercise the tape units the following is displayed with the current time of day TILX testing started at xx xx xx Test will run for x minutes Type T if running TILX through a VCS or G in all other cases to get a current performance summary Type C to terminate the TILX test prematurely Type Y to terminate TILX prematurely Diagnostics Exercisers and Utilities 6 41 6 3 6 TILX End Message Display The Value Added Status field corresponds to the TMSCP end message status Example 6 10 is an example of a TILX end message display Example 6 10 TILX End Message Display Bad Value Added Completion Status for unit x End message in hex Event Code Op Code Cmd Ref Number End Flags Host Xfer Byte Count Tape Rec Byte Count Tape Position Sequence Number xxx KM MM MM 6 3 7 TILX Error Information Packet Displays Contact Digital Multivendor Services for assistance in deciphering the EIP fields A TILX EIP display may or may not include a hex dump of the Requestor Specific Data This is an option you can select for TILX selectable parameters The EIP will be in one of the following formats that corresponds to MSCP error log formats e Controller Error e Memory Error e Tape Error Examples 6 11 through 6 13 are samples of each display Each display includes the optional requestor s
261. ate 31) Instance/MSCP Event Codes
Instance Code   MSCP Event Code   Description
40016001        006A              CI A/B transmit cables are crossed.
40026001        006A              CI A/B receive cables are crossed.
4009640A        006A              CI Port detected bad Path A upon attempting to transmit a packet.
400A640A        006A              CI Port detected bad Path B upon attempting to transmit a packet.
400B640A        006A              CI Port detected bad Path A upon attempting to transmit a packet.
400C640A        006A              CI Port detected bad Path B upon attempting to transmit a packet.
400D640A        006A              CI Port detected bad Path A upon attempting to transmit a packet.
400E640A        006A              CI Port detected bad Path B upon attempting to transmit a packet.
Table C-25 CI Port/Port Driver Event Log (Template 32) Instance/MSCP Event Codes
Instance Code   MSCP Event Code   Description
4003640A        006A              CI Port detected a Dual Receive condition that resulted in the closure of the Virtual Circuit. This error condition will be eliminated in a future CI interface chip.
4004020A        006A              Host Interconnect Services detected protocol error upon validating a received packet.
4007640A        006A              CI Port detected error upon attempting to transmit a packet. This resulted in the closure of the Virtual Circuit.
403D020A        006A              Received packet with an unrecognized PPD opcode. Note that the content of the vcstate field is undefined in this instance.
40440064        006A              Received a PPD NODE STOP and closed virtual circuit.
262. ater than 60 feet (18.3 meters).
HSZ series controllers: The maximum end-to-end length of fast and slow buses is summarized in Table 3-8.
Table 3-8 SCSI Bus Maximum Lengths
Bus Type               Transfer Rate   Meters   Feet
8-bit, single-ended    5 MB/s          6        19.7
8-bit, single-ended    10 MB/s         3        9.8
16-bit, differential   20 MB/s         25       82.0
3.6.2 Host Adapters
The HSJ series controllers follow the same CI configuration rules as the HSC controller product family, which supports from 1 to 31 host nodes. Consult your HSJ series controller software product description (SPD) and firmware release notes for specific restrictions and a current list of supported host adapters.
Also, for the HSJ series controllers, all host adapter CI ports in a CI configuration must have the quiet slot time set to 10. Some older systems may have the quiet slot time set to 7, which will cause incorrect operation of the CI.
The following host adapters currently are supported:
• HSJ series controllers: CIXCD for XMI-based systems; CIBCA-B for BI-based systems; CI780 for SBI-based systems
• HSD series controllers: SHAC for various DEC and VAX systems; D4000 for DEC 4000 systems; KFMSA for XMI-based systems
• HSZ series controllers: KZTSA for DEC 3000 systems; KZMSA for DEC 7000/10000 systems (via DWZZA)
Consult your controller SPD and firmware release notes for current lists of supported host
263. ault isolation and corrective actions are similar to a nonredundant configuration. However, failover takes place, so the surviving controller takes over the failed controller's ports and devices.
5.1.3 Cache Module Failures
If a cache module fails, its controller still functions; however, Digital recommends that you replace the cache module as soon as possible. When a cache module fails in a dual-redundant configuration, cache failover occurs so that the companion cache module can take over all caching operations.
5.2 Types of Error Reporting
The controller can notify you of an error through one or more of the following means:
• The OCP
• Device LEDs
• Error messages at a host virtual terminal, or error messages at a maintenance terminal (if attached)
• Host error logs
5.3 Troubleshooting Basics
When an error occurs, use the following steps as top-level guidelines for fault isolation:
1. Make a note of all visual indicators (OCP, device LEDs, or error messages) available to you.
2. Extract and read host error logs (see Section 5.7).
3. Errors can be intermittent; reset the controller to see if the error clears.
4. See if the error indication changes after resetting the controller. If the error remains the same, look up the cause for that error. If the indication changes, look up the cause for the newer error.
See Sections 5.4 through 5.6 for detailed information about errors and
264. aximum Nonoperating Environment (Range)
Temperature: -40° to +66°C (-40° to +151°F), during transportation and associated short-term storage
Relative humidity: 8% to 95% in original shipping container (noncondensing); otherwise 50% (noncondensing)
Nonoperating altitude: From -300 m (-1000 ft) to +3600 m (+12,000 ft) MSL (mean sea level)
2 Functional Description
This chapter provides a detailed functional description of the HS controller hardware and firmware.
2.1 HS Controller Hardware
The HS controller provides a connection between a host computer and an array of SCSI-2 compatible storage devices. The controller hardware consists of core circuitry common to all models of HS controllers, as follows:
• Policy processor
• Program card
• Diagnostic registers
• Operator control panel
• Maintenance terminal port
• Dual controller port
• Nonvolatile memory (NVMEM)
• Bus exchangers
• Shared memory
• Device ports
• Cache module
Each controller model also has a unique interface tailored to the appropriate host system. Figure 2-1 shows a block diagram of the HS controller hardware.
2.1.1 Policy Processor
The policy processor consists of microprocessor hardware necessary for running the HS controller.
2.1.1.1 Intel 80960CA
The heart of the policy processor is an Intel 80960CA processor chip. This processor chip runs the firmware from the program card and provid
265. ay Bad Value Added Completion Status for unit x End message in hex Event Code Op Code Cmd Ref Number Byte Count Error Byte Count Sequence Number Flags XXXXxxo 6 2 7 DILX Event Information Packet Displays A DILX EIP display may or may not include a hex dump of the Requestor Specific Data This is an option you can select as a DILX parameter The EIP will be in one of the following formats that corresponds to MSCP error log formats e Controller Error e Memory Error e Disk Transfer Error Bad Block Replacement Attempt Error Examples 6 2 through 6 5 are examples of each display Each display includes the optional requestor specific information In all cases the Instance code template type and all requestor specific information correspond to event error log device dependent parameters while everything else has a one to one correspondence to error log fields See Appendices C and D for a translation of these codes 6 18 Diagnostics Exercisers and Utilities Example 6 2 Controller Error Error Information Packet in hex Cmd Ref Number Unit Number Log Sequence Format Flags Event Code Controller ID Controller SW ver Controller HW ver ulti Unit Code Instance Template Requestor Requestor Requestor xxx MM MM xoxo Type Information Size Example 6 3 Memory Error Error Information Packet in hex Cmd Ref Number Unit Number Log Sequence Format Flags Event Code Controller ID Contro
266. be the case for a successful command or a command that received CHECK CONDITION or COMMAND TERMINATED status because one of the FM EOM or ILI bits is set to one in the sense data flags field 00EB During device initialization the device reported the SCSI Sense Key RECOVERED ERROR This indicates the last command completed successfully with some recovery action performed by the target 00EB During device initialization the device reported the SCSI Sense Key NOT READY This indicates that the logical unit addressed cannot be accessed Operator intervention may be required to correct this condition 00EB During device initialization the device reported the SCSI Sense Key MEDIUM ERROR This indicates that the command terminated with a nonrecovered error condition that was probably caused by a flaw in the medium or an error in the recorded data This sense key may also be returned if the target is unable to distinguish between a flaw in the medium and a specific hardware failure HARDWARE ERROR sense key 00EB During device initialization the device reported the SCSI Sense Key HARDWARE ERROR This indicates that the target detected a nonrecoverable hardware failure for example controller failure device failure parity error and so forth while performing the command or during a self test continued on next page HSJ Series Error Logging C 87 Table C 27 Cont Device Services Nontransfer Error Event Log Template 41 Instance
267. binets, such as the SW800 series, in which StorageWorks components can be mounted.
device driver: An operating system software module used to physically control an I/O device. In DSA, conventional device drivers are replaced by a single driver for an entire class of devices (such as disk drives) and a single port driver for the host-to-controller transport mechanism. For example, a host computer communicating with an HSJ series controller uses disk and tape class drivers and the CI port driver.
device shelf: A StorageWorks shelf designed to contain SBBs.
Diagnostic And Execution MONitor: See DAEMON.
Diagnostics and Utilities Protocol: See DUP.
digital audio tape: See DAT.
DIGITAL Standard Disk Format: See DSDF.
DSDF: The Digital Storage Architecture (DSA) standard for disk media format. DSDF specifies the mechanism for mapping a contiguous logical block address space into a possibly imperfect physical space, as well as defining diagnostic and factory areas. DSDF is transparent to the system.
DIGITAL Storage Architecture: See DSA.
DSA: A set of specifications and interfaces describing standards for designing mass storage products. DSA defines the functions performed by host computers, controllers, and drives. It also specifies how they interact to accomplish mass storage management.
DIGITAL Storage System Interconnect: See DSSI.
DILX: Disk inline exerciser. Diagnostic firmware used to test the data transfer capabilities of disk
268. ble to allocate flush buffer.
Unable to allocate active receive FCB.
Table C-39 Nonvolatile Parameter Memory Failover Control Last Failure Codes
Code       Description
08010101   A remote state change was received from the FOC thread that NVFOC does not recognize. Last Failure Parameter 0 contains the unrecognized state value.
08020100   No memory could be allocated for a NVFOC information packet.
08030101   Work received on the S_nvfoc_bque did not have a NVFOC work ID. Last Failure Parameter 0 contains the ID type value that was received on the NVFOC work queue.
08040101   Unknown work value received by the S_nvfoc_bque. Last Failure Parameter 0 contains the unknown work value.
08050100   An unlock was received and the controller was not locked by the other controller.
08060100   A read/write command was received when the NV memory was not locked.
08070100   A write to NV memory was received while not locked.
08080000   The other controller requested this controller to restart.
08090010   The other controller requested this controller to shutdown.
080A0000   The other controller requested this controller to selftest.
080B0100   Could not get enough memory to build a FCB to send to the remote routines on the other controller.
080C0100   Could not get enough memory for FCBs to receive information from the other controller.
080D0100   Could not get enough memory to build a FCB to reply to a request from the other control
269. bles or disables CI Path B. When first installed, NOPATH_B is set.

PROMPT new prompt
Specifies a 1- to 16-character prompt, enclosed in quotes, that is displayed when the controller's CLI prompts for input. Only printable ASCII characters are valid. When first installed, the CLI prompt is set to the first three letters of the controller's model number (for example, HSJ>, HSD>, or HSZ>).

SCS_NODENAME xxxxxx
Specifies a one- to six-character name for node.

TERMINAL_PARITY ODD EVEN
NOTERMINAL_PARITY
Specifies the parity transmitted and expected. Parity options are ODD or EVEN. NOTERMINAL_PARITY causes the controller not to check for or transmit any parity on the terminal lines. When first installed, the controller's terminal parity is set to NOTERMINAL_PARITY.

TERMINAL_SPEED baud_rate
Sets the terminal speed to 300, 600, 1200, 2400, 4800, or 9600 baud. The transmit speed is always equal to the receive speed. When first installed, the controller's terminal speed is set to 9600 baud.

TMSCP_ALLOCATION_CLASS n
Specifies the allocation class: 0 through 255 in a single controller configuration, or 1 through 255 in a dual-redundant configuration. When first installed, the controller's TMSCP_ALLOCATION_CLASS is set to 0.

Examples
CLI> SET OTHER_CONTROLLER PATH_A PATH_B SPEED 1200
Turns on the other HSJ controller's two CI paths and sets the terminal spee
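The value ranges quoted for these qualifiers can be checked before a command is typed at the CLI. The following is a minimal sketch, assuming Python on a maintenance host; the function names are hypothetical and are not part of the controller firmware or of any Digital utility. It encodes only the limits stated in the qualifier descriptions.

```python
# Hypothetical helpers: they only restate the documented value ranges.
VALID_TERMINAL_SPEEDS = (300, 600, 1200, 2400, 4800, 9600)


def check_prompt(prompt: str) -> bool:
    """1 to 16 characters, printable ASCII only (space through tilde)."""
    return 1 <= len(prompt) <= 16 and all(32 <= ord(ch) <= 126 for ch in prompt)


def check_node_name(name: str) -> bool:
    """SCS_NODENAME: one to six characters."""
    return 1 <= len(name) <= 6


def check_terminal_speed(baud: int) -> bool:
    """TERMINAL_SPEED: one of the six documented baud rates."""
    return baud in VALID_TERMINAL_SPEEDS


def check_allocation_class(n: int, dual_redundant: bool) -> bool:
    """0-255 for a single controller, 1-255 for a dual-redundant pair."""
    lowest = 1 if dual_redundant else 0
    return lowest <= n <= 255


if __name__ == "__main__":
    print(check_terminal_speed(1200))         # speed used in the example above
    print(check_allocation_class(0, True))    # 0 is not allowed when dual-redundant
```

The controller performs its own validation; a check like this only catches typing mistakes earlier.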
270. byte and sense data, and the next higher order byte contains the starting byte number of an area, relative to Sense Byte 0, that contains (unchanged) the destination logical unit's status byte and sense data. If the low order or next higher order byte of this field contains the value zero, no status byte or sense data was supplied for the corresponding source or destination logical unit. The content of the highest order two bytes of this field is undefined.
• If the emdopcd is a 7 (REASSIGN BLOCKS), this field contains the logical block address of the first defect descriptor not reassigned. If information about the first defect descriptor not reassigned is not available, or if all the defects have been reassigned, this field will contain the value FFFFFFFF.
• If the emdopcd is a 31 (SEARCH DATA EQUAL), 30 (SEARCH DATA HIGH), or 32 (SEARCH DATA LOW) and the Sense Key subfield of the snsflgs field (refer to Figure C-11) value is EQUAL, this field contains the record offset of the matching record.

asc, ascq
If the value contained in the addsns field is 6 or greater and the dssd subfield of the sdqual field is equal to 1, the asc and ascq fields contain the values supplied in the byte 0C (Additional Sense Code) and byte 0D (Additional Sense Code Qualifier) fields, respectively, of the Sense Data returned in the response of a SCSI REQUEST SENSE command issued to the t
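The rule for when asc and ascq are meaningful can be expressed as a short check. This is a sketch only, assuming that addsns mirrors byte 07 (additional sense length) of SCSI-2 fixed-format sense data and that the dssd flag is passed in by the caller; the function name is hypothetical, and the controller's own log layout is not reproduced here.

```python
from typing import Optional, Tuple


def asc_ascq(sense: bytes, dssd: int) -> Optional[Tuple[int, int]]:
    """Return (asc, ascq) when the documented conditions hold, else None.

    Assumption: `sense` is raw fixed-format REQUEST SENSE data, so the
    additional sense length is at byte 0x07, the ASC at byte 0x0C and the
    ASCQ at byte 0x0D.
    """
    if dssd != 1 or len(sense) < 8:
        return None
    addsns = sense[0x07]
    # An additional length of 6 or more means bytes 0x08..0x0D are present.
    if addsns < 6 or len(sense) <= 0x0D:
        return None
    return sense[0x0C], sense[0x0D]


if __name__ == "__main__":
    # Hypothetical sense buffer: sense key 06, ASC A0, ASCQ 05.
    sample = bytes([0x70, 0, 0x06, 0, 0, 0, 0, 0x0A,
                    0, 0, 0, 0, 0xA0, 0x05, 0, 0, 0, 0])
    print(asc_ascq(sample, dssd=1))
```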
271. cable (order number 17-03831-01), then to a 2-meter SCSI-2 cable (order number BN21H-02) that connects to one of the controller SCSI-2 ports.
Use of an upper controller shelf: By convention, controller shelf C3 would use only the top three or four storage shelves in the front of the cabinet; the fourth controller shelf (C4) would use the top three or four storage shelves in the back of the cabinet.
[Figure 3-2: SW800 Series Data Center Cabinet Controller/Storage/Tape Drive Locations]
272. cate memory for WARPs and RMDs.
02210100  Invalid parameters in CACHE OFFER META call.
02220100  No buffer found for CACHE MARK META DIRTY call.
02270104  A callback from DS on a transfer request has returned a bad or illegal DWD status.
          • Last Failure Parameter 0 contains the DWD Status.
          • Last Failure Parameter 1 contains the DWD address.
          • Last Failure Parameter 2 contains the PUB Address.
          • Last Failure Parameter 3 contains the Device Port.

Table C-34 (Cont.): Value Added Services Last Failure Codes

Code      Description
022E0102  An invalid mapping type was specified for a logical unit.
          • Last Failure Parameter 0 contains the USB address.
          • Last Failure Parameter 1 contains the Unit Mapping Type.
02360101  Unrecognized state supplied to FOC SEND callback routine va_dap_snd_cmd_complete. Last Failure Parameter 0 contains the unrecognized value.
02370102  Unsupported return from HIS GET CONN INFO routine.
          • Last Failure Parameter 0 contains the DD address.
          • Last Failure Parameter 1 contains the invalid status.
02392084  A processor interrupt was generated by the HSJ30/40 controller's XOR engine (FX) with no bits set in the CSR to indicate a reason for the interrupt.
          • Last Failure Parameter 0 contains the FX Control and Status Register (CSR).
          • Last Failure Parameter 1
023A2084
02440100
02530102
273. che 16 MB: 12.5 inches, 7.75 inches, 1.5 W, 300 mA, 2 mA
Read cache 32 MB: 12.5 inches, 7.75 inches, 2.0 W, 300 mA, 2 mA
Refer to the StorageWorks Solutions Controller Shelf User's Guide for power requirements for the BA350-MA controller shelf.

1.6 Controller Environmental Specifications
The HS controllers are intended for installation in a Class A computer room environment. The StorageWorks product line environmental specifications listed in Table 1-4 are the same as for other Digital storage devices.

Table 1-4: Environmental Specifications

Condition            Specification
Optimum Operating Environment
Temperature          18 to 24°C (65 to 75°F)
Rate of change       3°C (5.4°F)
Step change          3°C (5.4°F)
Relative humidity    40 to 60% (noncondensing) with a step change of 10% or less (noncondensing)
Altitude             From sea level to 2400 m (8000 ft)
Air quality          Maximum particle count 5 micron or larger, not to exceed 500,000 particles per cubic ft of air
Inlet air volume     0.026 cubic m per second (50 cubic ft per minute)
Maximum Operating Environment (Range)
Temperature          10 to 40°C (50 to 104°F). Derate 1.8°C for each 1000 m (1.0°F for each 1000 ft) of altitude. Maximum temperature gradient 11°C/hr (20°F/hr) 2°C/hr (4°F/hr)
Relative humidity    10 to 90% (noncondensing). Maximum wet bulb temperature: 28°C (82°F). Minimum dew point: 2°C (36°F). M
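The altitude derating in the maximum operating range lends itself to a quick calculation. The sketch below assumes the derating is linear and is applied to the 40°C upper limit starting at sea level; the table states the rate but not how it is applied, so treat the function (a hypothetical name) as an illustration rather than a site-planning rule.

```python
MAX_TEMP_SEA_LEVEL_C = 40.0   # upper limit of the maximum operating range
DERATE_C_PER_1000_M = 1.8     # derating rate from the specification table


def max_operating_temp_c(altitude_m: float) -> float:
    """Derated upper temperature limit at a given altitude (assumed linear)."""
    if altitude_m < 0:
        raise ValueError("altitude below sea level is outside this sketch")
    return MAX_TEMP_SEA_LEVEL_C - DERATE_C_PER_1000_M * (altitude_m / 1000.0)


if __name__ == "__main__":
    for altitude in (0, 1000, 2000):
        print(altitude, "m ->", round(max_operating_temp_c(altitude), 1), "C")
```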
274. cial purpose servers, such as the HS controller, providing a special set of services to the rest of the nodes.
command line interpreter: See CLI.
cold swap: A method of device replacement that requires that power be removed from all shelves in a cabinet. This method is used when conditions preclude the use of the warm swap or hot swap methods.
container: Either a single disk device or group of disk devices linked as a storage set.
controller: A hardware/software device that facilitates communications between a host and one or more devices.
controller shelf: A StorageWorks shelf designed to contain controller and cache memory modules.
CRC: A checkword (polynomial checksum) generally appended to a disk data transfer. CRC is computed using data message bits as coefficients, divided by a generating polynomial. The resulting remainder is the CRC. When a transmitter computes and transmits a CRC following a data transfer, the receiver can recompute and compare it with the received version to verify correct reception. EDC and ECC, both used by disks, are examples of CRC checkwords.
cyclic redundancy check: See CRC.
DAEMON: Diagnostic And Execution MONitor. DAEMON is a part of HS controller self-testing that includes port and cache initialization and self-test routines.
DAT: Digital Audio Tape. A format for recording digital data on a cartridge tape.
data center cabinet: A generic reference to the large ca
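The CRC entry describes polynomial division with the remainder sent after the data and recomputed by the receiver. A minimal sketch of that idea follows, assuming an 8-bit CRC with generating polynomial x^8 + x^2 + x + 1 chosen purely for illustration; the glossary does not say which polynomial or width the controller or its disks use, and the function names are hypothetical.

```python
def crc8(data: bytes, poly: int = 0x07) -> int:
    """Remainder of the message bits divided by x^8 + x^2 + x + 1 (bitwise)."""
    crc = 0
    for byte in data:
        crc ^= byte
        for _ in range(8):
            if crc & 0x80:
                crc = ((crc << 1) ^ poly) & 0xFF
            else:
                crc = (crc << 1) & 0xFF
    return crc


def transmit(data: bytes) -> bytes:
    """Transmitter side: append the CRC after the data transfer."""
    return data + bytes([crc8(data)])


def verify(frame: bytes) -> bool:
    """Receiver side: recompute the CRC and compare with the received one."""
    if len(frame) < 1:
        return False
    data, received = frame[:-1], frame[-1]
    return crc8(data) == received


if __name__ == "__main__":
    frame = transmit(b"disk block")
    print(verify(frame))                      # intact frame
    damaged = bytearray(frame)
    damaged[0] ^= 0x01                        # flip one data bit
    print(verify(bytes(damaged)))             # damaged frame
```

A table-driven version is faster, but the bit-by-bit loop shows the division step the glossary describes.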
275. cmpl_rtn found the state change failed.
604A0100  tmscp_clear_cdl_cmpl_rtn found the state change failed.
604B0100  Subroutine process event returned a value to dmscp_dcd_comm_path event that indicates that an internal disconnect request occurred while processing an immediate communications event.
604D0100  Subroutine process event returned a value to dmscp_dcd_comm_path event that indicates that a connection established event occurred while no DCD commands were active.
604F0100  tmscp_set_cmpl_rtn found the state change failed.
60500100  dmscp_dcd_op_cmpl found an unrecognized P_STS value in a DCD HTB status field.
60550100  mscp_initialize unable to get LOCAL STATIC memory from exec for use as a local connection ITB.
60560100  mscp_initialize unable to get LOCAL STATIC memory from exec for use as an AVAILABLE ITB.
60570100  mscp_initialize unable to get LOCAL STATIC memory from exec for use as an AVAILABLE state change ITB.
60580100  mscp_initialize unable to get LOCAL STATIC memory from exec for use as a state change ITB.
605D0100  tmscp_onl_cleanup_rtn detected a failure in enabling variable speed mode suppression.
605E0100  tmscp_suc_cmpl_rtn detected a failure in enabling variable speed mode suppression.
605F0100  tmscp_suc_cmpl_rtn detected a failure in enabling variable speed mode suppression.
60610100  mscp_initialize unable to get BUFFER STATIC memory from exec for use as Write History Logs.
60620100  mscp_initialize un
276. code, as shown in Table C-6.

scs opcode
The System Communication Services layer opcode, as shown in Table C-7.

C.2.2.2 Host/Server Connection Common Fields
The fields common to certain event logs generated by the Disk and Tape MSCP Server, CI Host Interconnect Services, Device Services, and Value Added firmware components are shown in Figure C-4.
[Figure C-4: Host/Server Connection Common Fields (connection id, remote node name)]

connection id
Identifies the host/server connection associated with the event being reported. If this value is zero, the host/server connection information was invalidated before the event could be reported.

remote node name
An 8-byte ASCII string that represents the node name associated with the host/server connection identified in the connection id field. If the connection id field is zero, the content of this field is undefined.

C.2.2.3 Byte Count/Logical Block Number Common Fields
The fields common to certain event logs generated by the Device Services and Value Added firmware components are shown in Figure C-5.
[Figure C-5: Byte Count/Logical Block Number Common Fields (byte count, logical block number, reserved)]

byte count
Number of bytes of the HSJ30/40 controller firmware component initiated transfer successful
277. configuration must contain the same version of firmware. Use the procedures in this section when you are removing and replacing only the program card.

7.3.1 Tools Required
You will need a 5/32-inch Allen wrench to remove or replace the program card.

7.3.2 Precautions
Refer to Chapter 1 for program card handling guidelines.

7.3.3 Card Removal
Use the following procedure to remove the program card:
1. If you have not done so already, unlock and open the cabinet doors (SW800 series) using a 5/32-inch Allen wrench.
2. Examine the green OCP reset LED(s) on the controller(s). They should be flashing. If a green LED is lit continuously, its controller has failed. To service the controller, refer to Section 7.1.
   Note: You need not record configuration information; the configuration information is not lost when removing a program card.
3. Connect a maintenance terminal to the MMJ of the controller(s) you are removing the program card from, and shut down the controller(s) by following the guidelines in Section 7.1.2. The green LED(s) should light continuously when shutdown completes.
   Note: Earlier controller models had a program card EMI shield. This shield may be discarded.
4. Unsnap and discard the program card EMI shield(s), if attached.
5. Remove the program card(s) by pushing the eject button(s) (refer to Figure 7-3) next to the card(s). Pull the c
278. contains the SCSI device type.
Invalid CDB Group Code detected during create of miscellaneous command DWD. Last Failure Parameter 0 contains the SCSI command opcode.
Invalid SCSI OPTICAL MEMORY device opcode in miscellaneous command DWD. Last Failure Parameter 0 contains the SCSI command opcode.
Error DWD not found in port in_proc_q.
A dip error was detected when pcb_busy was set.
• Last Failure Parameter 0 contains the PCB reg710 ptr value.
• Last Failure Parameter 1 contains the new info NULL SSTAT0 DSTAT ISTAT.
• Last Failure Parameter 2 contains the PCB copy of the 710 DBC register.
• Last Failure Parameter 3 contains the PCB copy of the 710 DNAD register.
• Last Failure Parameter 4 contains the PCB copy of the 710 DSP register.
• Last Failure Parameter 5 contains the PCB copy of the 710 DSPS register.
• Last Failure Parameter 6 contains the PCB copies of the 710 SSTAT2, SSTAT1, SSTAT0, DSTAT registers.
• Last Failure Parameter 7 contains the PCB copies of the 710 LCRC, RESERVED, ISTAT, DFIFO registers.
More DBDs than allowed for in mask.
Cannot find in_error DWD on in-process queue.
Either DWD_PTR is null or bad value in DSPS.
SCSI CDB contains an invalid group code for a transfer command.
The required error information packet (EIP) or device work descriptor (DWD) were not supplied to the Device Services error logging code.
HIS GET_CONN_INFO returned an unexpected completion code.
A Device Work D
279. controller can access any devices. Table 1-2 summarizes the main features of each HS controller.

Table 1-2: Summary of HS Controller Product Features

Feature                               HSJ40              HSJ30              HSD30              HSZ40
Host system bus                       CI                 CI                 DSSI               SCSI-2
Host protocol                         SCS MSCP TMSCP     SCS MSCP TMSCP     SCS MSCP TMSCP     SCSI-2
Storage device protocol               SCSI-2             SCSI-2             SCSI-2             SCSI-2
Number of SCSI-2 ports                6                  3                  3                  6
Number of SCSI-2 devices per port     6 or 7             6 or 7             6 or 7             6 or 7
Maximum number of SCSI-2 devices      36 or 42           18 or 21           18 or 21           36 or 42
Shared memory nonvolatile memory      32 KB              32 KB              32 KB              32 KB
Read cache module                     16 or 32 MB        16 or 32 MB        16 or 32 MB        16 or 32 MB
RAID levels supported                 RAID 0 1a          RAID 0 1a          RAID 0 1a          RAID 0 1a
Mixed disk and tape support           Yes                Yes                Yes                No (tapes)
Tape drive media loader support       Sequential access device  Sequential access device  Sequential access device  N/A
Dual redundant configurations         Yes                Yes                Yes                No
Program card firmware update          Yes                Yes                Yes                Yes

Error detection code (EDC). Error correction code (ECC) on cache and shared memory. Power fail write nonvolatile journal. Data integrity and byte parity, all buses, memory. Validation of program card firmware: Yes Yes Yes Yes
280. controller begins self-test. For self-test to properly execute, you must have a valid configuration and enable the host paths. To run self-test, enter one of the following commands (which command you need will depend on your configuration, which controller the terminal is connected to, and which controller you wish to test):

CAUTION: Do not use the OVERRIDE_ONLINE qualifier for the SELFTEST command, as customer data may be overwritten.

CLI> SELFTEST THIS_CONTROLLER
CLI> SELFTEST OTHER_CONTROLLER

See Appendix B for more information on the command and its qualifiers. When you run self-test, all outstanding I/O operations complete. The controller will also attempt to flush the cache. However, even if self-test fails to flush the cache, the program will continue to execute. Self-test will halt if it detects a fault. Otherwise, the self-test loop continues until you press the reset button or cycle the controller power off and on, after which the controller reinitializes.

6.2 Disk Inline Exerciser (HSJ and HSD Series Controllers)
The disk inline exerciser (DILX) is a diagnostic tool used to exercise the data transfer capabilities of selected disks connected to an HSJ or HSD series controller. DILX exercises disks in a way that simulates a high level of user activity. Using DILX, you can read and write to all customer-available data areas. DILX can also be run on CDROMs but
281. cted
TILX Summary at 18-JUN-1993 06:18:41
Test minutes remaining 0 expired 6
Cnt err in HEX 1C 07080064 Key 06 ASC Q A0 05 HC 1 SC 0
Total Cntrl Errs Hard Cnt 1 Soft Cnt 0
Unit 1 Serial Number 1 Total IO Requests 482
No errors detected
Unit 2 Serial Number 2 Total IO Requests 490
No errors detected

The performance displays contain error information on up to three unique errors. Hard errors always take precedence over soft errors. A soft error represented in one display may be replaced with information on a hard error in subsequent performance displays.

6.3.11 TILX Abort Codes
Table 6-6 lists TILX abort codes and definitions.

Table 6-6: TILX Abort Codes and Definitions
Value / Definition: 1 2 3 4
An IO has timed out.
A HTB was not available to issue an IO when it should have been.
FAO returned either FAO BAD FORMAT or FAO OVERFLOW.
TS SEND TERMINAL DATA returned either an ABORTED or INVALID BYTE COUNT.
TS READ TERMINAL DATA returned either an ABORTED or INVALID BYTE COUNT.
A timer is in an unexpected expired state that prevents it from being started.
The semaphore was set after a oneshot IO was issued, but nothing was found in the received HTB que.
A termination, or a print summary, or a reuse parameters request was received when TILX was not testing any units.
User requested abort via control Y.

6.3.12 TILX Error Codes
Table
282. cted
08 00  Logical unit communication failure
08 01  Logical unit communication time-out
08 02  Logical unit communication parity error
09 00  Track following error
09 01  Tracking servo failure
09 02  Focus servo failure
09 03  Spindle servo failure
0A 00  Error log overflow
11 00  Unrecovered read error
11 05  L-EC uncorrectable error
11 06  CIRC unrecovered error
14 00  Recorded entity not found
14 01  Record not found
15 00  Random positioning error
15 01  Mechanical positioning error
15 02  Positioning error detected by read of medium
17 00  Recovered data with no error correction applied
17 01  Recovered data with retries
17 02  Recovered data with positive head offset
17 03  Recovered data with negative head offset
17 04  Recovered data with retries and/or CIRC applied
17 05  Recovered data using previous sector id
18 00  Recovered data with error correction applied

Table C-15 (Cont.): SCSI ASC/ASCQ Codes for CDROM Devices

ASC Code  ASCQ Code  Description
18 01  Recovered data with error correction and retries applied
18 02  Recovered data, data auto-reallocated
18 03  Recovered data with CIRC
18 04  Recovered data with LEC
18 05  Recovered data, recommend reassignment
18 06  Recovered data, recommend rewrite
1A 00  Parameter list length error
1B 00  Synchronous data transfer error
20 00  Invalid command operation code
21 00  Logical
283. may be put into operation only after approval has been granted by the telecommunications office (Fernmeldeamt) with a radio interference measurement station that is responsible for the intended installation site. A registration postcard (in the appendix of the manual) stating the FTZ serial test number serves as the application for approval. The operator must complete the lower part of the postcard and send it to the local telecommunications office. The upper part stays with the device.

Operator notice
The device has been carefully suppressed and tested for radio interference. The marking with the approval number assures you that this device does not interfere with other telecommunications equipment, including radio equipment. If, exceptionally, radio interference nevertheless occurs with these devices (for example, in the most unfavorable case, when they are interconnected with other EVA devices), additional interference suppression measures by the user may be required in individual cases. If you have questions about this, contact the radio interference measurement station of your telecommunications office that is responsible for your location.

External data cables
If the data cables specified by Digital need to be replaced, the operator must ensure, for proper interference suppression, that the replacement cables match the original Digital cable in construction and shielding quality.

Marking
The devices are marked with the approval number during manufacture and with a registration p
284. d to 1200 baud.

SET stripeset container name
Modifies the characteristics of a stripeset.

Format
SET stripeset container name

Parameters
stripeset container name
Specifies the name of the stripeset whose characteristics will be modified.

Description
Changes the characteristics of a stripeset.

Qualifiers
CHUNKSIZE n
CHUNKSIZE DEFAULT (D)
Specifies the chunksize to be used. The chunksize may be specified in blocks (CHUNKSIZE n), or you may let the controller determine the optimal chunksize (CHUNKSIZE DEFAULT). When entering an ADD command, CHUNKSIZE DEFAULT is the default.
Note: The chunksize may not be changed if the stripeset is currently in use by a unit. To change the chunksize, the unit must first be deleted; then the chunksize may be changed.
CAUTION: If the chunksize is changed, the stripeset must be initialized, which will destroy all customer data on the stripeset.

Examples
CLI> SET STRIPE0 CHUNKSIZE 32
Stripeset STRIPE0's chunksize is set to 32.

SET THIS_CONTROLLER
Modifies this controller's parameters (the controller that the maintenance terminal is connected to, or the target of the DUP connection).

Format
SET THIS_CONTROLLER

Description
The SET THIS_CONTROLLER command allows you to modify controller parameters on THIS_CONTROLLER in single and dual-redundant configurations
285. d 20 percent of the time after the initial write pass has completed. This phase always executes after the random I/O phase. It is re-executed at 10-minute intervals, with each cycle approximately 2 minutes.
• Seek Intensive: Is designed to stimulate head motion on the selected disk units. Single-sector erase and access commands are issued if the test is write enabled. Each I/O uses a different track on each subsequent transfer. The access and erase commands are issued in the ratio that you selected using the access/erase ratio parameter. This phase is executed 20 percent of the time after the initial write pass has completed. This phase always executes after the data intensive I/O phase. It is re-executed at 10-minute intervals, with each cycle approximately 2 minutes.

6.2.3.2 User Defined Test (DILX)
CAUTION: The User Defined test should be run only by very knowledgeable personnel. Otherwise, customer data can be destroyed.
When this test is selected, DILX prompts you for input to define a specific test. In the DILX User Defined test, a total of 20 or fewer I/O commands can be defined. Once all of the commands are issued, DILX issues the commands again in the same sequence. This is repeated until the selected time limit is reached. As you build the test, DILX collects the following information from you for each command:
• The I/O command name (write, read, access, erase, or quit). Note t
286. d as described in Appendix B.

6.7 HSZUTIL Virtual Maintenance Terminal Application
This section describes the virtual maintenance terminal application, HSZUTIL. The HSZUTIL program is a host-resident user application that provides a virtual maintenance terminal facility for communicating with an HSZ series controller over its host SCSI bus interface. The virtual maintenance terminal communication protocol was developed explicitly for the HSZ series controller.

6.7.1 General Implementation Considerations
The HSZUTIL application is written entirely in the C language. The portion of the code that is system dependent is contained in separate system-specific modules. The terminal interface uses portable C I/O functions and therefore does not support asynchronous terminal I/O. This is not a restriction of the virtual maintenance terminal protocol. SCSI commands used by the HSZUTIL application in communicating with the HSZ series controller are as follows:
• TEST UNIT READY
• INQUIRY
• SEND DIAGNOSTIC
• RECEIVE DIAGNOSTIC RESULTS

6.7.2 Restrictions
There are several restrictions that must be noted before running the HSZUTIL application, as follows:
• Though the programming interface allows access to most SCSI commands, HSZUTIL is not intended to provide functions beyond those required for maintaining a virtual terminal session. The existing code contains code for several additional SCSI functions. This c
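The four SCSI commands named for HSZUTIL are all six-byte (group 0) commands in the SCSI-2 standard. The sketch below uses the standard SCSI-2 operation codes and builds two command descriptor blocks; it is written in Python for brevity, the helper names are hypothetical, and it does not reproduce how HSZUTIL itself (a C program) fills in its commands or what diagnostic payloads it exchanges.

```python
# Operation codes as defined by SCSI-2; the dictionary itself is illustrative.
OPCODES = {
    "TEST UNIT READY": 0x00,
    "INQUIRY": 0x12,
    "RECEIVE DIAGNOSTIC RESULTS": 0x1C,
    "SEND DIAGNOSTIC": 0x1D,
}


def group_code(opcode: int) -> int:
    """The top three bits of the operation code select the CDB group."""
    return (opcode >> 5) & 0x07


def test_unit_ready_cdb() -> bytes:
    """Six-byte TEST UNIT READY CDB: opcode 00 and all other bytes zero."""
    return bytes(6)


def inquiry_cdb(allocation_length: int) -> bytes:
    """Six-byte INQUIRY CDB requesting standard inquiry data."""
    if not 0 <= allocation_length <= 255:
        raise ValueError("allocation length must fit in one byte")
    return bytes([OPCODES["INQUIRY"], 0, 0, 0, allocation_length, 0])


if __name__ == "__main__":
    for name, opcode in OPCODES.items():
        print(f"{name}: opcode {opcode:02X}, group {group_code(opcode)}")
    print(inquiry_cdb(36).hex())
```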
287. d cache module                             54-22229-01 (discontinued), Version 1
16 MB read cache module                    54-22910-02, Version 2
32 MB read cache module                    54-22910-01, Version 2
StorageWorks HSJ40 program card            BG-PYU6A-0A
CI internal cables (GRAY)                  17-03427-02
SCSI-2 device port cables                  BN21H-02

Table A-2: HSJ30 FRUs
FRU                                        Part Number
HSJ30 CI/SCSI controller module            70-30097-02 (including OCP and bezel)
16 MB read cache module                    54-22910-02, Version 2
32 MB read cache module                    54-22910-01, Version 2
StorageWorks HSJ30 program card            BG-PYU6A-0A
CI internal cables (GRAY)                  17-03427-02
SCSI-2 device port cables                  BN21H-02

Table A-3: HSD30 FRUs
FRU                                        Part Number
HSD30 DSSI/SCSI controller module          70-31458-01 (including bezel and trilink connector)
16 MB read cache module                    54-22910-02, Version 2
32 MB read cache module                    54-22910-01, Version 2
StorageWorks HSD30 program card            BG-Q6HL0-0A
SCSI-2 device port cables                  BN21H-02
Trilink connector                          12-39921-02 (included in 70-31458-01)
50-pin DSSI bus terminator                 12-31281-01

Table A-4: HSZ40 FRUs
FRU                                        Part Number
HSZ40 SCSI-to-SCSI controller module       70-31457-01 (including bezel and trilink connector)
16 MB read cache module                    54-22910-02, Version 2
32 MB read cache module                    54-22910-01, Version 2
StorageWorks HSZ40 program card            BG-Q6HN0-0A
SCSI-2 device port cables                  BN21H-02
Trilink connector                          12-39921-01 (included in 70
288. d communication between bus devices and shared memory. One bus exchanger handles address lines, while the other exchanger handles data lines. The bus exchangers are classified as four-way cross-point switches, which means the bus exchangers allow connections between one port and any other port on the switch.

2.1.9 Shared Memory
Shared memory consists of a dynamic RAM controller and arbitration engine (DRAB) gate array controller and 8 MB of associated dynamic RAM (DRAM). Shared memory uses parity-protected 9-bit error correction code (ECC) and error detection code (EDC) for improved data integrity. The shared memory stores the HS controller firmware and is shared between bus devices for data structures as well as data buffers. One portion of shared memory contains instructions for the Intel 80960CA processor chip, firmware variables, and data structures, including the look-up table for the Intel 80960CA processor chip. In the absence of the HS controller cache module, another portion of shared memory acts as a cache. Otherwise, this portion contains cache module context for cache look-ups when a cache module is in place.

2.1.10 Device Ports
The HS controller SCSI-2 device ports are a combination of NCR 53C710 SCSI port processors and SCSI transceivers. The 53C710 processors perform operations in 8-bit, single-ended normal or fast mode. The 53C710 processors execute scripts read from shared memory and under contro
289. d is therefore not included in the Old RBN and New RBN fields. The content of those fields is undefined.
02020064  0014  Disk Bad Block Replacement attempt completed for a write within the user data area of the disk. Note that due to the way Bad Block Replacement is performed on SCSI disk drives, information on the actual replacement blocks is not available to the controller and is therefore not included in the Old RBN and New RBN fields. The content of those fields is undefined.

Table C-30: Tape Transfer Error Event Log (Template 61) Instance/MSCP Event Codes

Instance Code  MSCP Event Code  Description
020A0064  0007  A data compare error was detected during the execution of a compare modified READ or WRITE command. Note that in this instance, the SCSI Device Sense Data fields (emdopcd through keyspec) are undefined.
03644002  000B  An unrecoverable tape drive error was encountered while performing work related to tape unit operations.
038B450A  000B  The tape device reported standard SCSI Sense Data.
03674002  014B  A Drive failed because a Test Unit Ready command or a Read Capacity command failed.
0368000A  0103  hia was failed by a Mode Select command received from the host.
03694002  00EB  Drive failed due to a deferred error reported by drive.
03644002  00E8  Unrecovered Read or Write error.
036B4002  002B  No response from one or more drives.
036C430A  012B  Nonvolatile memo
290. default run time is 10 minutes.

Enter performance summary interval in minutes (1:65535) [10]?
Explanation: Enter a value to set the interval for which a performance summary is displayed. The default is 10 minutes.

Include performance statistics in performance summary (y/n) [n]?
Explanation: Enter Y to see a performance summary that includes the performance statistics, which include the total count of read and write I/O requests and the kilobytes transferred for each command. Enter N and no performance statistics are displayed.

Display hard/soft errors (y/n) [n]?
Explanation: Enter Y to enable error reporting, including end messages and EIPs. Enter N to disable error reporting, including end messages and EIPs. The default is disabled error reporting.

Display hex dump of Event Information Packet Requester Specific information (y/n) [n]?
Explanation: Enter Y to enable the hex dump display of the requester-specific information contained in the EIP. Enter N to disable the hex dump.

When the hard error limit is reached, the unit will be dropped from testing. Enter hard error limit (1:65535) [65535]?
Explanation: Enter a value to specify the hard error limit for all units under test. If the hard error limit is reached, TILX discontinues testing the unit that reaches the hard error limit. If other units are currently being tested b
291. dentification, device serial number
See Section C.2.2.4 for the description of these fields. Note that the content of certain of the fields described previously may be undefined, depending on the value supplied in the instance code field. See Table C-29 for more detail.

C.2.3.13 Tape Transfer Error Event Log (Template 61)
The HSJ30/40 controller Device Services and Value Added Services firmware components report errors detected while performing work related to tape unit transfer operations via the Tape Transfer Error Event Log.
If the error is associated with a command issued by a host system, the Tape Transfer Error Event Log will be sent to the host system that issued the command, on the same connection upon which the command was received, if "This Host" error logging is enabled on that connection, and to all host systems that have enabled "Other Host" error logging on a connection or connections established with the HSJ30/40 controller's Disk and/or Tape MSCP Server.
If the error is associated with a command issued by an HSJ30/40 controller firmware component, the Tape Transfer Error Event Log will be sent to all host systems that have enabled "Miscellaneous" error logging on a connection established with the HSJ30/40 controller's Tape MSCP Server.
The Tape Transfer Error Event Log is reported via the TMSCP Tape Errors error log message format. The format of this event log, including the HSJ30/40 controller specific fields
292. depending on the controller model), T designates the target ID of the disk drive (0 through 6 in a nonfailover configuration, or 0 through 5 if the controller is in a failover configuration), and L designates the LUN of the disk drive (0 through 7). When entering PTL, at least one space must separate the port, target, and LUN.

Adds a disk drive to the known list of disk drives and names the drive. This command must be used when a new SCSI-2 disk drive is to be added to the configuration.

TRANSPORTABLE
NOTRANSPORTABLE (D)
In normal operations, the controller makes a small portion of the disk inaccessible to the host and uses this area to store metadata, which improves data reliability, error detection, and recovery. This improvement comes at the expense of transportability. If NOTRANSPORTABLE is specified (or allowed to default) and there is no valid metadata on the unit, the unit must be initialized. If TRANSPORTABLE is specified and there is valid metadata on the unit, the unit will have to be initialized in order to remove the metadata.
Note: Digital recommends that you avoid specifying TRANSPORTABLE unless transportability of disk drives or media is imperative and there is no other way to accomplish the movement of data.
When entering an ADD DISK command, NOTRANSPORTABLE is the default.

Examples
CLI> ADD DISK RZ26 10010 0
A nontransportable disk is added to port 1
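The PTL rules above (space-separated port, target, and LUN, with the target range narrowing in a failover configuration) can be checked with a few lines of code. This is a sketch under stated assumptions: the function name is hypothetical, ports are assumed to be numbered from 1, and the upper port limit is passed in by the caller because the text describing the port range is cut off in this excerpt.

```python
from typing import Tuple


def parse_ptl(text: str, max_port: int, failover: bool = False) -> Tuple[int, int, int]:
    """Split 'P T L' on whitespace and range-check each field."""
    parts = text.split()
    if len(parts) != 3 or not all(p.isdigit() for p in parts):
        raise ValueError("PTL needs three numbers separated by spaces")
    port, target, lun = (int(p) for p in parts)
    max_target = 5 if failover else 6      # 0-5 failover, 0-6 nonfailover
    if not 1 <= port <= max_port:          # assumption: ports start at 1
        raise ValueError("port out of range")
    if not 0 <= target <= max_target:
        raise ValueError("target ID out of range")
    if not 0 <= lun <= 7:
        raise ValueError("LUN out of range")
    return port, target, lun


if __name__ == "__main__":
    print(parse_ptl("1 0 0", max_port=6))
    print(parse_ptl("3  5 0", max_port=3, failover=True))
```

Entering the three digits without separating spaces is rejected, which mirrors the requirement that at least one space separate the fields.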
293. dicates that a write once device or a sequential access device encountered blank medium or format defined end of data indication while reading or a write once device encountered a non blank medium while writing During device initialization the device reported a SCSI Vendor Specific Sense Key This sense key is available for reporting vendor specific conditions During device initialization the device reported the SCSI Sense Key COPY ABORTED This indicates a COPY COMPARE or COPY AND VERIFY command was aborted due to an error condition on the source device the destination device or both During device initialization the device reported the SCSI Sense Key ABORTED COMMAND This indicates the target aborted the command The initiator may be able to recover by trying the command again During device initialization the device reported the SCSI Sense Key EQUAL This indicates a SEARCH DATA command has satisfied an equal comparison During device initialization the device reported the SCSI Sense Key VOLUME OVERFLOW This indicates a buffered peripheral device has reached the end of partition and data may remain in the buffer that has not been written to the medium A RECOVER BUFFERED DATA command s may be issued to read the unwritten data from the buffer During device initialization the device reported the SCSI Sense Key MISCOMPARE This indicates the source data did not match the data read from the medium continued on next pag
294. dom I/O phase start. The advantage of selecting the initial write pass is that compare host data commands can then be issued and the data previously written to the media can be verified for accuracy. It makes sure that all LBNs within the selected range are accessed by DILX. The disadvantage of using the initial write pass is that it may take a long time to complete because a large LBN range was specified. You can bypass this by selecting a smaller LBN range, but this creates another disadvantage in that the entire disk space is not tested. The initial write pass applies only to the Basic Function test. The write percentage will be set automatically. Enter read percentage for random IO and data intensive phase (0:100) [67] ? Explanation: This question is displayed if read/write mode is selected. It allows you to select the read/write ratio to use in the Random I/O and Data Intensive phases. The default read/write ratio is similar to the I/O ratio generated by a typical OpenVMS system. Enter data pattern number 0=all, 19=user_defined, (0:19) [0] ? Explanation: The DILX data patterns are used in write commands. This question is displayed when writes are enabled for the Basic Function or User Defined tests. There are 18 unique data patterns to select from. These patterns were carefully selected as worst case, or most likely to produce errors, for disks connected to the controller. See Section 6.4.8 for
295. ductive ESD mat; 3/32-inch Allen wrench; 5/32-inch Allen wrench; flat-head screwdriver. 7.2.2 Precautions: Refer to Chapter 1 for ESD grounding, module handling, and program card handling guidelines. Ground yourself to the cabinet grounding stud (Figure 7-1) before servicing the read cache module. 7.2.3 Module Removal: Use the following procedure to remove the read cache module. 1. The controller module is seated in front of the read cache module. Any time you service a read cache, you must shut down the controller(s), based on considerations of configuration, down time, and so on. Refer to Section 7.1. 2. To access the read cache module, remove its controller module. Refer to Section 7.1. 3. Use a gentle up-and-down rocking motion to loosen the module from the shelf backplane. 4. Slide the read cache module out of the shelf, noting which rails it was seated in, and place it on an approved ESD mat. 7.2.4 Module Replacement/Installation: Use the following procedure to replace the read cache module. 1. The controller module is seated in front of the read cache module. Any time you service a read cache, you must shut down the controller(s), based on considerations of configuration, down time, and so on. Refer to Section 7.1. 2. To replace the read cache module, its controller module must already be removed. You should replace the read cache module before reinstalling the controller module. 3. Slide the read cache module into the shelf usi
296. e Table C 27 Cont Device Services Nontransfer Error Event Log Template 41 Instance MSCP Event Codes MSCP Instance Event Code Code Description 03E2450A 00EB During device initialization the device reported a reserved SCSI Sense Key Table C 28 Disk Transfer Error Event Log Template 51 Instance MSCP Event Codes MSCP Instance Event Code Code Description 02090064 0007 A data compare error was detected during the execution of a compare modified READ or WRITE command Note that in this instance the SCSI Device Sense Data fields emdopcd through keyspec are undefined 03094002 000B An unrecoverable disk drive error was encountered while performing work related to disk unit operations 0328450A 000B The disk device reported standard SCSI Sense Data 030C4002 014B A Drive failed because a Test Unit Ready command or a Read Capacity command failed 030D000A 0103 Drive was failed by a Mode Select command received from the host 030E4002 00EB Drive failed due to a deferred error reported by drive 030F4002 00E8 Unrecovered Read or Write error 03104002 002B No response from one or more drives 0311430A 012B Nonvolatile memory and drive metadata indicate conflicting drive configurations 03124304 012B The Synchronous Transfer Value differs between drives in the same storageset 03134002 012B Maximum number of errors for this data transfer operation exceeded 03144002 00CB Drive reported recovered error without
297. e e Cabinets e Shelves e Devices e Controllers Hosts The information in this chapter describes physical configurations with respect to both standard and nonstandard customized subsystems Further information can be found in the specific StorageWorks cabinet shelf and configuration documentation Note Configuration rules and restrictions apply to all controllers unless stated otherwise 3 1 Ordering Considerations Digital provides the following configuration approaches for ordering controller subsystems e Preconfigured packaged starter subsystems e Configured to order CTO subsystems custom configurations e A combination of preconfigured and CTO subsystems Refer to the StorageWorks Array Controllers HS Family of Array Controllers User s Guide for a list of preconfigured controller subsystem option numbers Not all controller models have preconfigured subsystem option numbers 3 2 Cabinets The following sections present information to keep in mind when loading controller and storage shelves in SW800 series data center cabinets and SW500 series cabinets Preconfigured subsystems include a range of solutions for various capacities performance levels and availability Configuration Rules and Restrictions 3 1 3 2 1 SW800 Series Data Center Cabinet This section presents the rules to apply to subsystem configurations in SW800 series data center cabinets Refer to the Storage Works Solutions SW
298. e addsnsl field is 10 or greater, this field contains bytes 0F through 11 (the Sense Key Specific field) of the Sense Data returned in the response of a SCSI REQUEST SENSE command. The definition of this field is determined by the value of the Sense Key subfield of the snsflgs field. This field is reserved for Sense Key values other than ILLEGAL REQUEST, RECOVERED ERROR, HARDWARE ERROR, MEDIUM ERROR, and NOT READY. If the Sense Key value is ILLEGAL REQUEST, the format of this field is as shown in Figure C-12. [Figure C-12: SCSI Sense Data Byte 0F through 11 (keyspec) Field, Field Pointer Bytes Format; bit positions 7 through 0, with the Field Pointer running from MSB to LSB.] SCSI Sense Data Byte 0F through 11 (keyspec) Field Pointer Bytes Specific Subfields. Bit Pointer and BPV: A bit pointer valid (BPV) bit of zero indicates that the value in the Bit Pointer subfield is not valid. A BPV bit of one indicates that the Bit Pointer subfield specifies which bit of the byte designated by the Field Pointer field is in error. When a multiple-bit field is in error, the Bit Pointer subfield points to the most significant (left-most) bit of the field. C/D: A command data (C/D) bit of one indicates that the illegal parameter is in the command descriptor block. A C/D bit of zero indicates that the illegal parameter is in the data parameters sent by the initiator during the DATA OUT
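The subfields described in this excerpt can be unpacked with a few bit masks. The sketch below is illustrative only: because Figure C-12 did not survive extraction, the bit positions are an assumption taken from the standard SCSI-2 sense data layout (SKSV in bit 7 and C/D in bit 6 of byte 0F, BPV in bit 3, Bit Pointer in bits 2 through 0, Field Pointer in bytes 10 and 11, most significant byte first). The function name `decode_keyspec` is hypothetical and is not part of the controller firmware or any host utility.

```python
def decode_keyspec(keyspec):
    """Decode the 3-byte keyspec field for an ILLEGAL REQUEST sense key.

    Assumed layout (standard SCSI-2, not confirmed by the garbled figure):
      byte 0: bit 7 = SKSV, bit 6 = C/D, bit 3 = BPV, bits 2-0 = Bit Pointer
      bytes 1-2: Field Pointer, most significant byte first
    """
    if len(keyspec) != 3:
        raise ValueError("keyspec must be exactly 3 bytes (sense data bytes 0F-11)")
    flags = keyspec[0]
    bpv = bool(flags & 0x08)
    return {
        "sksv": bool(flags & 0x80),
        # C/D = 1: error is in the command descriptor block; 0: in the data parameters
        "illegal_parameter_in": (
            "command descriptor block" if flags & 0x40 else "data parameters"
        ),
        "bpv": bpv,
        # The Bit Pointer is only meaningful when BPV is one
        "bit_pointer": (flags & 0x07) if bpv else None,
        "field_pointer": (keyspec[1] << 8) | keyspec[2],
    }


if __name__ == "__main__":
    print(decode_keyspec(bytes([0xCB, 0x00, 0x04])))
```

Returning `None` for the Bit Pointer when BPV is zero mirrors the rule in the text that the subfield is not valid in that case.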
299. e • U indicates the device type is unknown. The ASWF columns indicate the allocation, spindle state, write-protect state, and fault state, respectively, of the device. The availability state is indicated using the following letters: • A: Allocated to this controller. • a: Allocated to the other controller. • U: Unallocated, but owned by this controller. • u: Unallocated, but owned by the other controller. • A space in this column indicates the allocation is unknown. The spindle state is indicated using the following characters: • For disks, this symbol indicates the device is at speed. For tapes, it indicates the tape is loaded. • > For disks, this symbol indicates the device is spinning up. For tapes, it indicates the tape is loading. • < For disks, this symbol indicates the device is spinning down. For tapes, it indicates the tape is unloading. • v For disks, this symbol indicates the device is stopped. For tapes, it indicates the tape is unloaded. • For other types of devices, this column is left blank. For disks and tapes, a W in the write-protect column indicates the device is hardware write protected. This column is left blank for other device types. An F in the fault column indicates an unrecoverable device fault. If this field is set, the device fault indicator will also be illuminated. Rq/S: This column shows the average I/O requ
300. e that in this instance the asc and ascq fields are undefined A disk related error code was reported that was unknown to the Fault Management firmware Note that in this instance the asc and ascq fields are undefined A tape related error code was reported that was unknown to the Fault Management firmware Note that in this instance the asc and ascq fields are undefined A media loader related error code was reported that was unknown to the Fault Management firmware Note that in this instance the asc and ascq fields are undefined A error code was reported that was unknown to the Fault Management firmware Note that in this instance the asc and ascq fields are undefined A failure occurred while attempting a SCSI Test Unit Ready or Read Capacity command to a device The device type is unknown to the controller Note that in this instance the asc and ascq fields are undefined The identification of a device does not match the configuration information The actual device type is unknown to the controller Note that in this instance the asc and ascq fields are undefined The shelf indicated by the port field is reporting a problem This could mean one or both of the following e If the shelf is using dual power supplies one power supply has failed e One of the shelf cooling fans has failed Note that in this instance the target asc and
301. e ES 4 16 4 10 3 Pad A AR e A BNI e d 4 16 4 10 4 Failover Setup Mismatch 4 17 4 11 Moving Devices Between Controllers o o o oooo oo 4 17 5 Error Analysis and Fault Isolation 5 1 Special Considerations cece ee eee eee 5 1 5 1 1 Nonredundant Configurations o 5 1 5 1 2 Dual redundant Configurations 5 1 5 1 3 Cache Module Failures oooooocoooooooo ooo ooo 5 1 5 2 Types of Error Reporting ooooooooooooor es 5 2 5 3 Troubleshooting BasicS 5 2 5 4 Operator Control Panel o 5 2 5 4 1 Normal Operation 5 3 5 4 2 Fault Notification cai Sk ee E aate RS EAM ti 5 4 5 5 Device LEDS A Mala ee tees eg 5 8 5 5 1 Storage SBB Status o o oooo ooo o mo 5 8 5 5 2 Device Shelf Status and Power Supply Status 5 9 5 6 Error Messages ni DER AUR A da 5 11 5 6 1 Diagnostic Messages cee eee eens 5 12 5 6 2 NVPM Messages aci as 5 12 5 6 3 CLI Automatic Messages o 5 14 5 6 4 Shelf Messages 5 15 5 6 5 Failover Messages ccc ee eee s 5 15 5 6 6 Other CLI Messages 0 ccc rs 5 16 5 7 Host Error Logs iee tee 205 tee g Rob I Sa RA wae a 5 16 5 7 1 Translation Utilities o ooooooooooooooooo ooo 5 16 5 7 2 Host Error Log Translation
302. e SCSI buffered mode selected on the device The various SCSI Buffered Modes are shown in Table C 11 C 14 HSJ Series Error Logging uweuo This bit is set to one if and only if an unrecoverable write error was detected while unwritten objects that is data blocks filemarks or setmarks remain in the buffer msbd This bit is set to one if and only if the MODE SENSE block descriptor is nonzero fow This bit is set to one if and only if the Fixed bit of the WRITE command is set to one rsvd Reserved for future use dssd This bit is set to one if and only if the Sense Data contained in the ercdval through keyspec fields are supplied by the target device If this bit is zero the Sense Data contained in the ercdval through keyspec fields are generated by the HSJ30 40 controller on behalf of the target device because the Sense Data could not be obtained from that device ercdval This field contains byte 0 of the Sense Data returned in the response of a SCSI REQUEST SENSE command This field is formatted as shown in Figure C 10 Figure C 10 SCSI Sense Data Byte Zero ercdval Field Format 7 6 5 4 3 2 1 0 SCSI Sense Data Byte Zero ercdval Specific Subfields Error Code An error code of 70 indicates that the event being reported occurred during the execution of the current command identified in the cmdopcd field HSJ Series Error Logging C 15 An error code of 71 indica
303. e Table C 51 e Subcommand Error subcode Source Inconsistent State cases A and B All other conditions that can be reported via the Disk Copy Data Correlation Event Log are not assigned a specific Recommended Repair Action Code because they can be correlated with the associated condition specific event log C 122 HSJ Series Error Logging C 6 Deskew Command Procedure Example C 2 presents a command procedure to deskew the CONTROLLER DEPENDENT INFORMATION for a CONTROLLER LOG type error log Example C 2 Deskew Command Procedure Example Pl Input file name P2 Output file name on warning then exit inew_entry U KKKKKKKKKKKKKKKKKKKKKKKKKKKKKKE ENTRY ctrl_entry CONTROLLER LOG lw_entry LONGWORD ctrl_inp FALSE lw_string open read inf pl open write ouf p2 in loop read end in done inf inr inrlen f length inr if fSlocate new_entry inr ne inrlen then write sysSoutput inr if ctrl_inp then gosub convert_longs ctrl_inp FALSE endif else if f locate ctrl entry inr ne inrlen then write sysSoutput inr ctrl inp TRUE lw string endif if fSlocate lw_entry inr ne inrlen and ctrl inp then lw f element 2 fSedit inr TRIM COMPRESS if lw string eqs then lw string f extract 0 4 1w else lw string lw lw string endif endif endif write ouf inr goto in loop n UA LU UU UY 17 UY DU LU LU UU Y LN Ur US UY LN
304. e VMS Analyze Error Log Utility, including more information about this command and its qualifiers, refer to the VMS Error Log Utility Reference Manual or call Digital Multivendor Services. DEC OSF/1 AXP systems use the UNIX Errorlog Report Formatter (uerf) to assist in error log translation. This tool also reads information from the log and provides the operator with indications as to what the log means with respect to controller/host operation. Invoke uerf using the uerf -R -o full command. 5.7.2 Host Error Log Translation: The format of transmitted error information varies according to the model of HS controller. Consequently, the descriptions of the error logs and how to read them are broken into separate appendices for each model. See the following: • For HSJ-series controllers, see Appendix C. • For HSD-series controllers, see Appendix D. • For HSZ-series controllers, see Appendix E. Note: Host error log translations are correct as of the date of publication of this manual. However, log information may change with firmware updates. Refer to your StorageWorks Array Controller Operating Firmware Release Notes for error log information updates. 6 Diagnostics, Exercisers, and Utilities: This chapter discusses the automatic and manual programs available to assist operation and diagnosis of the HS controller subsystem, including the following: • Initialization and self-test ro
305. • When using jumpered shelves, only two jumpered-pair shelves (for a total of four shelves) can be used with an SW500 series cabinet. When five ports (SW800) or two ports (SW500) have doubled shelves for 5¼-inch SBBs 4 2, TZ8x7 tapes cannot be connected or even mounted in the cabinet, because all or most front shelf locations are needed for the 5¼-inch SBBs. 3.4.2.1 Table Conventions: The following describes the designations used in Tables 3-1 through 3-6. The designation shows the possible devices in each shelf and the possible number of devices in similarly configured shelves: (n)mxoT or (n)mxoJ, where n is the number of device shelves, m is the number of SCSI-2 connections to a device shelf, o is the number of devices on each SCSI-2 connection, T indicates the device shelf is terminated, and J indicates the device shelf is jumpered. According to the formula, m x o is the possible number of devices in each shelf, and n x m x o is the possible number of devices in similarly configured shelves. 3.4.3 3½-Inch SBBs: Tables 3-1 and 3-2 list some recommended configurations for 3½-inch SBBs.
Table 3-1 3½-Inch SBB Configurations, 6-Port Controller
Number of Devices   Number of BA350-SB Shelves   Configure as       Available for 3½-inch SBBs   Ports Used
1-2                 1                            2x3T               5-4                          1-2
3-4                 2                            (2)2x3T            9-8                          3-4
5-18                3                            (3)2x3T            13-0                         5-6
19-24               4                            (2)2x3T (2)1x6T    5-0                          6
25-30               5                            (1)2x3T (4)1x6T    5-0                          6
31 36 6 1x6T 5 0 97 A4
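The designation arithmetic from the Table Conventions text (m x o devices per shelf, n x m x o devices across similarly configured shelves) can be checked with a short script. This is a minimal sketch, not a Digital-supplied tool: the function name `parse_designation` and the returned dictionary keys are hypothetical, and the accepted spellings ("(2)2x3T" or "2 2x3T") are an assumption about how the designation is written.

```python
import re

# n = device shelves, m = SCSI-2 connections per shelf,
# o = devices per connection, T = terminated, J = jumpered
_DESIGNATION = re.compile(r"^\(?(\d+)\)?\s*(\d+)x(\d+)([TJ])$")


def parse_designation(text):
    """Parse a shelf designation such as '(2)2x3T' and compute device counts."""
    match = _DESIGNATION.match(text.strip())
    if match is None:
        raise ValueError(f"not a shelf designation: {text!r}")
    n, m, o = (int(match.group(i)) for i in (1, 2, 3))
    return {
        "shelves": n,
        "connections_per_shelf": m,
        "devices_per_connection": o,
        "shelf_type": "terminated" if match.group(4) == "T" else "jumpered",
        "devices_per_shelf": m * o,        # m x o
        "total_devices": n * m * o,        # n x m x o
    }


if __name__ == "__main__":
    for designation in ("(2)2x3T", "(4)1x6T"):
        print(designation, parse_designation(designation))
```

For example, (3)2x3T gives 6 devices per shelf and 18 devices in total, which agrees with the 5-18 device row of Table 3-1.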
306. e and updates the cache memory to make sure that the memory does not contain obsolete data This technique increases the chances that future host read requests can be filled from the cache The host sees the write operation as complete only after the external storage device has been updated Also see read cache 3 4 Inch SBBs configurations 3 10 restrictions 3 9 5 4 Inch SBBs configurations 3 13 restrictions 3 9 A Abort codes HSJ HSD series DILX 6 29 TILX 6 49 HSZ series DILX 6 65 Acceptance test 4 10 ADD CDROM command B 2 ADD DISK command B 3 ADD STRIPESET command B 5 ADD TAPE command B 6 ADD UNIT command B 7 Adding physical devices 4 9 7 12 B 78 Adding storage sets B 78 Adding units B 78 Allocation class 4 5 4 7 7 10 7 17 7 46 Amber LEDs 5 3 AUTOGEN COM file recognized devices 4 13 required modifications 4 13 Availability configuration 3 18 3 19 Basic function test HSJ HSD series DILX 6 7 TILX 6 32 HSZ series DILX 6 51 BIST 6 2 Bit Flags Connection State Codes 0000 C 60 0001 C 60 0002 C 60 0003 C 60 0004 C 60 Index Bit Flags Connection State Codes cont d 0005 C 60 0006 C 60 0007 C 60 0008 C 60 0009 C 60 000A C 60 000B C 60 Virtual Circuit State Codes 0001 C 58 0002 C 58 0003 C 58 0004 C 58 0005 C 58 Blower 7 34 installing 7 36 removing 7 35 replacing 7 36 service of 7 34 service precautions 7 34 tools 7 34 Boot See I
307. e change requests for the other controller in a dual controller configuration The thread that manages state changes initiated by the other controller in a dual controller configuration The thread that manages the data buffer pool continued on next page 6 80 Diagnostics Exercisers and Utilities Table 6 13 Cont Thread Description Thread Name Description SCS The SCS directory thread SCSIVT A thread that provides a virtual terminal connection to the CLI over the host SCSI bus SHIS The host SCSI protocol interface thread for SCSI controllers TILX A Local Program that exercises tape devices VA The thread that provides host protocol independent logical unit services VTDPY A Local Program thread that provides a dynamic display of controller configuration and performance information Diagnostics Exercisers and Utilities 6 81 CI DSSI Host Port Characteristics Node HSJ501 Q Port 13 O SysId 4200100D0720 O Description This subdisplay shows the current host port identification information This subdisplay is only available for CI or DSSI based controllers SCS node name O Port number O SCS system ID 6 82 Diagnostics Exercisers and Utilities SCSI Host Port Characteristics Description Xfer Rate 10 0 9 0 1 W 7 10 00 2 W AsyncG This subdisplay shows the current host port SCSI target identification any initiator that has negotiated synchronous transfers and the negotiated transf
308. e errors on disks connected to the controller Table 6 8 DILX Data Patterns Pattern Number Pattern in hex 1 0000 2 8B8B 3 3333 4 3091 5 shifting 1s 0001 0003 0007 000F 001F 003F 007F OOFF 01FF O3FF 07FF OFFF 1FFF 3FFF 7FFF 6 shifting Os FTE FFFC FFFC FFFC FFEO FFEO FFEO FFEO FEOO FCO00 F800 F000 F000 C000 8000 0000 7 alternating 1s 0s 0000 0000 0000 FFFF FFFF FFFF 0000 0000 FFFF FFFF 0000 FFFF 0000 FFFF 0000 FFFF B6D9 5555 5555 5555 AAAA AAAA AAAA 5555 5555 AAAA AAAA 5555 AAAA 5555 AAAA 5555 AAAA 5555 10 DB6C 11 2D2D 2D2D 2D2D D2D2 D2D2 D2D2 2D2D 2D2D D2D2 D2D2 2D2D D2D2 2D2D D2D2 2D2D D2D2 12 6DB6 13 ripple 1 0001 0002 0004 0008 0010 0020 0040 0080 0100 0200 0400 0800 1000 2000 4000 8000 14 ripple 0 FIE FFFD FFFB FFF7 FFEF FFDF FFBF FF7F FEFF FDFF FBFF F7FF EFFF BFFF DFFF 7FFF 15 DB6D B6DB 6DB6 DB6D B6DB 6DB6 DB6D B6DB 6DB6 DB6D B6DB 6DB6 DB6D 16 3333 3333 3333 1999 9999 9999 B6D9 B6D9 B6D9 B6D9 FFFF FFFF 0000 0000 DB6C DB6C 17 9999 1999 699C E99C 9921 9921 1921 699C 699C 0747 0747 0747 699C E99C 9999 9999 continued on next page 6 62 Diagnostics Exercisers and Utilities Table 6 8 Cont DILX Data Patterns Pattern Number Pattern in hex 18 FFFF Default Use all of the above patterns in a random method 6 4 9 Interpreting the DILX
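Two of the patterns in Table 6-8 follow a simple rule, which the sketch below reproduces: the "shifting 1s" sequence (pattern 5) sets one more low-order bit in each successive 16-bit word, and the "ripple 1" sequence (pattern 13) moves a single set bit from bit 0 to bit 15. The function names are hypothetical and this is not DILX code. The sketch assumes that the letter O appearing in a few extracted words (for example "OOFF") is a misread zero, and it generates 15 words for pattern 5 because that is how many the extracted table lists.

```python
def shifting_ones(count=15):
    """Pattern 5 style: 0001, 0003, 0007, ... each word adds one more low bit."""
    return [(1 << (i + 1)) - 1 for i in range(count)]


def ripple_one(width=16):
    """Pattern 13 style: a single set bit moving from bit 0 up to bit 15."""
    return [1 << i for i in range(width)]


def as_hex(words):
    """Format 16-bit words as 4-digit uppercase hex, as in Table 6-8."""
    return " ".join(f"{word:04X}" for word in words)


if __name__ == "__main__":
    print("shifting 1s:", as_hex(shifting_ones()))
    print("ripple 1:   ", as_hex(ripple_one()))
```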
309. e on CI based controllers O Packets received from a remote node O Packets sent to a remote node that were acknowledged ACK O Packets sent to a remote node that were not acknowledged NAK O Packets sent to a remote node for which no response was received 6 84 Diagnostics Exercisers and Utilities DSSI Performance Display DSSI Pkts Pkts S RCV 5710 519 ACK 11805 1073 0 NAK 2073 188 NOR 1072 90 Description This display indicates the number of packets sent and received through the DSSI port and the packet rate This display is only available on DSSI based controllers O Packets received from a remote node O Packets sent to a remote node that were acknowledged ACK O Packets sent to a remote node that were not acknowledged NAK O Packets sent to a remote node for which no response was received Diagnostics Exercisers and Utilities 6 85 CI DSSI Connection Status Connections 0123456789 Description This display shows the current status of any connections to a remote CI or DSSI node This display is available only on CI and DSSI based controllers O Each position in the data field represents one of the possible nodes to which the controller can communicate To locate the connection status for a given node use the column on the left to determine the high order digit of the node number and use the second row to determine the low order digit For CI controllers the number of nodes displayed is determi
310. e precautions 7 13 servicing both of 7 18 servicing one of 7 13 tools 7 13 DUP 2 10 E EDC 6 2 6 3 EIA 423 terminal port 2 3 Electrostatic discharge See ESD End message display HSJ HSD series DILX 6 18 TILX 6 42 Environmental specifications 1 10 ERF invoking 5 16 Error codes HSJ HSD series DILX 6 30 TILX 6 50 HSZ series DILX 6 65 Error detection code See EDC Error information packets HSJ HSD series DILX 6 18 TILX 6 42 Error logging 1 5 5 16 and controller model 5 16 and ERF 5 16 and uerf 5 16 firmware 2 10 translations 5 16 Error messages 5 11 automatic 5 11 cache module 5 12 CLI automatic 5 14 CLI interactive 5 16 during failover 5 15 Error messages cont d from diagnostics 5 12 NVPM 5 12 shelf 5 15 Errorlog Report Formatter See ERF ESD See also Precautions danger 1 6 grounding 1 6 guidelines 1 6 module guidelines 1 6 subsystem placement 1 6 subsystem room 1 6 Examples HSJ HSD series DILX 6 22 TILX 6 45 EXEC 6 3 Executive functions firmware 2 9 Exercisers See DILX 6 5 See TILX 6 5 EXIT command B 15 F Failover 2 4 4 15 and SHUTDOWN 7 3 copying configuration 4 7 correcting mismatch 4 17 error messages 5 15 exiting 4 16 firmware 2 12 initialization 4 17 of cache 5 1 reviving failed controller 4 16 setup for 4 16 setup mismatch 4 17 shared commands 4 15 testing for 4 17 time required for 4 16 warm
311. e star coupler. Never leave cables, terminated or not, attached at the star coupler and disconnected at the internal CI cable connector. This minimizes adverse effects on the cluster and prevents a short circuit between the two ground references. 3. Connect the external CI cables to the internal CI cable. 4. Remove any terminators from the star coupler connections. 5. Connect the external CI cable connectors to the star coupler, one at a time, in the following order (refer to Figure 7-7): RXB, TXB, RXA, TXA. 6. Install any tie wraps as necessary to hold the internal CI cable in place. 7. Close and lock the cabinet doors (SW800 series) using a 5/32-inch Allen wrench. 8. Enter the following commands to resume activity on the host paths: CLI> SET THIS_CONTROLLER PATH_A CLI> SET THIS_CONTROLLER PATH_B 7.6 DSSI Host Cables (HSD series): Servicing DSSI host cables (Figure 7-8) causes system down time for all bus members, because all power must be disconnected from every member on the DSSI bus before cable removal or replacement. Use the procedures in this section when you are removing and replacing DSSI host cables. Optional: The trilink connector may be considered part of the DSSI host cable during service. CAUTION: Do not service the host port cables of an HSD-series controller while the power is on to any members on the DSSI bus, including the controller and host. Doing so risks
312. e type gt cannot be used in a lt storageset type gt Explanation The device specified cannot be used in the storage set specified for example tapes cannot be bound into a stripeset Reexamine the configuration and correct the incompatibility Error 9100 A lt storageset type gt must have from lt least gt to lt most gt entities Explanation The wrong number of devices was specified for this storage set Different storage sets require different numbers of devices Reexamine the configuration then correct the number of devices Error 9130 Cannot delete ONLINE unit Explanation The unit specified in a DELETE command is on line to a host Dismount the unit at the host then retry the command Or add the OVERRIDE_ONLINE qualifier to the DELETE command Error 9140 Cannot delete exclusive access unit Explanation The unit specified in a DELETE command is set up for exclusive access Take the unit out of exclusive access mode and retry the command Command Line Interpreter B 71 Error 9150 INITIALIZE is no longer supported at the unit level You must INITIALIZE the container that makes up this unit Explanation You tried to initialize a unit Units may no longer be initialized The container that makes up the unit must be initialized before a unit is created out of the container Error 9160 Non disk devices cannot be INITIALIZED Explanation Tapes and CDROMS may not be initialized Error 9170 lt device type gt lt de
313. eads it and compares it against what was written to the disk This indicates a compare failure More information is displayed to indicate where in the data buffer the compare failed and what the data were and should have been DILX terminated A termination a print summary or a reuse parameters request was received but DILX is currently not testing any units Explanation You entered a Ctrl Y termination request a Ctrl G print summary request or a Ctrl C reuse parameters request before DILX had started to test units DILX cannot satisfy the second two requests so DILX treats all of these requests as a termination request DILX will not change the state of a unit if it is not NORMAL Explanation DILX cannot allocate the unit for testing because it is already in Maintenance mode Maintenance mode can only be invoked by the firmware If another DILX session is in use the unit is considered in Maintenance mode Unable to bring unit online Explanation This message is self explanatory Soft error reporting disabled Unit x Explanation This message indicates that the soft error limit has been reached and therefore no more soft errors will be displayed for this unit Hard error limit reached unit x dropped from testing Explanation This message indicates that the hard error limit has been reached and the unit is dropped from testing Soft error reporting disabled for controller errors Explanation This indicates that t
314. ecific Fields format This field contains the value 01 that is T MSCP Memory Errors error log format code C 30 HSJ Series Error Logging event code The values that can be reported in this field for this event log are shown in Table C 21 memory address The content of this field depends on the value supplied in the instance code field See Table C 21 for more detail instance code See Section C 2 1 for the description of this field The values that can be reported in this field for this event log are shown in Table C 21 templ See Section C 2 1 for the description of this field This field contains the value 12 for this event log tdisize See Section C 2 1 for the description of this field This field contains the value 00 for this event log reserved offset 22 This field contains the value 0 event time See Section C 2 1 for the description of this field C 2 3 5 Subsystem Built In Self Test Failure Event Log Template 13 The HSJ30 40 controller Subsystem Built In Self Tests firmware component reports errors detected during test execution via the Subsystem Built In Self Test Failure Event Log The Subsystem Built In Self Test Failure Event Log will be sent to all host systems that have enabled Miscellaneous error logging on a connection or connections established with the HSJ30 40 controller s Disk and or Tape MSCP Server The Subsystem Built In Self Test Failure Event Log is reported via t
315. ecified (HSD and HSJ only). If any user data cannot be flushed to disk, the controller will not shut down unless the IGNORE_ERRORS qualifier is specified. Specifying IMMEDIATE will cause this controller to shut down immediately without flushing any user data to the disks, even if drives are on line to a host. Note: If you enter a SHUTDOWN THIS_CONTROLLER command, communication with the controller will be lost when this controller shuts down. Qualifiers for HSD and HSJ controllers: IGNORE_ERRORS, NOIGNORE_ERRORS (D): If errors result when trying to write user data, the controller will not be shut down unless IGNORE_ERRORS is specified. CAUTION: Customer data may be lost or corrupted if the IGNORE_ERRORS qualifier is specified. NOIGNORE_ERRORS is the default. IMMEDIATE, NOIMMEDIATE (D): If IMMEDIATE is specified, the controller shuts down immediately without checking for online devices. CAUTION: Customer data may be lost or corrupted if the IMMEDIATE qualifier is specified. NOIMMEDIATE is the default. OVERRIDE_ONLINE, NOOVERRIDE_ONLINE (D): If any units are on line to the controller, the controller will not be shut down unless OVERRIDE_ONLINE is specified. If the OVERRIDE_ONLINE qualifier is specified, the controller will shut down after all customer data is written to disk. CAUTION: Customer data may be lost or corrupted if the OVERRIDE_
316. ecifies a one- to six-character name for node. TERMINAL_PARITY ODD EVEN, NOTERMINAL_PARITY: Specifies the parity transmitted and expected. Parity options are ODD or EVEN. NOTERMINAL_PARITY causes the controller not to check for or transmit any parity on the terminal lines. When first installed, the controller's terminal parity is set to NOTERMINAL_PARITY. TERMINAL_SPEED baud_rate: Sets the terminal speed to 300, 600, 1200, 2400, 4800, or 9600 baud. The transmit speed is always equal to the receive speed. When first installed, the controller's terminal speed is set to 9600 baud. TMSCP_ALLOCATION_CLASS n: Specifies the allocation class (0 through 255 in a single controller configuration, or 1 through 255 in a dual-redundant configuration). When first installed, the controller's TMSCP_ALLOCATION_CLASS is set to 0. Qualifiers for HSJ controllers: ID n: Specifies the CI node number (0 through MAX_NODES - 1). MAX_NODES n: Specifies the maximum number of nodes (8, 16, or 32). When first installed, the controller's MAX_NODES is set to 16. MSCP_ALLOCATION_CLASS n: Specifies the allocation class (0 through 255 in a single controller configuration, or 1 through 255 in a dual-redundant configuration). When first installed, the controller's MSCP_ALLOCATION_CLASS is set to 0. PATH_A, NOPATH_A: Enables or disables CI Path A. When first installed, NOPATH_A is set. PATH_B, NOPATH_B: Ena
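The value ranges listed for these qualifiers lend themselves to a quick pre-check before a command is typed at the CLI. The sketch below is a hypothetical host-side helper written for illustration only; the function names are assumptions and nothing here is part of the controller firmware. It encodes only the limits stated in the excerpt: the six supported baud rates, the allocation class range (which excludes 0 in a dual-redundant configuration), and the CI node number range of 0 through MAX_NODES - 1.

```python
VALID_TERMINAL_SPEEDS = (300, 600, 1200, 2400, 4800, 9600)
VALID_MAX_NODES = (8, 16, 32)


def check_terminal_speed(baud_rate):
    """TERMINAL_SPEED accepts only the listed baud rates."""
    return baud_rate in VALID_TERMINAL_SPEEDS


def check_allocation_class(n, dual_redundant):
    """MSCP/TMSCP allocation class: 0-255 single controller, 1-255 dual-redundant."""
    lowest = 1 if dual_redundant else 0
    return lowest <= n <= 255


def check_ci_node_id(node_id, max_nodes=16):
    """ID must fall in 0 through MAX_NODES - 1; 16 is the as-installed MAX_NODES."""
    if max_nodes not in VALID_MAX_NODES:
        raise ValueError("MAX_NODES must be 8, 16, or 32")
    return 0 <= node_id < max_nodes


if __name__ == "__main__":
    print(check_terminal_speed(9600))
    print(check_allocation_class(0, dual_redundant=True))
    print(check_ci_node_id(16, max_nodes=32))
```

Note that the as-installed allocation class of 0 fails the dual-redundant check, which matches the stated range of 1 through 255 for that configuration.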
317. ed. NOIGNORE_ERRORS is the default. IMMEDIATE, NOIMMEDIATE (D): If IMMEDIATE is specified, the controller shuts down immediately without checking for online devices. CAUTION: Customer data may be lost or corrupted if the IMMEDIATE qualifier is specified. NOIMMEDIATE is the default. OVERRIDE_ONLINE, NOOVERRIDE_ONLINE (D): If any units are on line to the controller, the controller will not be shut down unless OVERRIDE_ONLINE is specified. If the OVERRIDE_ONLINE qualifier is specified, the controller will shut down after all customer data is written to disk. CAUTION: Customer data may be lost or corrupted if the OVERRIDE_ONLINE qualifier is specified. NOOVERRIDE_ONLINE is the default. Examples: 1. CLI> SHUTDOWN OTHER_CONTROLLER Shuts down the other controller as long as the other controller does not have any units on line. 2. CLI> SHUTDOWN OTHER_CONTROLLER OVERRIDE_ONLINE Shuts down the other controller even if there are units on line to the other controller. SHUTDOWN THIS_CONTROLLER: Shuts down and does not restart this controller. Format: SHUTDOWN THIS_CONTROLLER. Description: The SHUTDOWN THIS_CONTROLLER command shuts down this controller. If any disks are on line to this controller, the controller will not shut down unless the OVERRIDE_ONLINE qualifier is sp
318. ed either an ABORTED or INVALID BYTE COUNT. 5. TS READ TERMINAL DATA returned either an ABORTED or INVALID BYTE COUNT. 6. A timer is in an unexpected expired state that prevents it from being started. 7. The semaphore was set after a oneshot IO was issued, but nothing was found in the received HTB queue. 8. A termination, a print summary, or a reuse parameters request was received when DILX was not testing any units. 9. User requested an abort via ^Y. 6.4.11 DILX Error Codes: Table 6-10 lists the DILX error codes and definitions for DILX-detected errors. Table 6-10 DILX Error Codes and Definitions (Value: Definition): 1: Illegal Data Pattern Number found in data pattern header. 2: No write buffers correspond to data pattern. 3: Read data do not match write buffer. 6.5 VTDPY Utility: The VTDPY utility gathers and displays system state and performance information for the HS family of modular storage controllers. The information displayed includes processor utilization, host port activity and status, device state, logical unit state, and cache and I/O performance. The VTDPY utility requires a video terminal that supports ANSI control sequences, such as a VT220, VT320, or VT420 terminal. A graphics display that provides emulation of an ANSI-compatible video terminal also can be used. For DSSI and CI based HS controllers, VTDPY can be run on terminals either directly connected to the HS controller or on terminals connected through a h
319. ed when DILX is started: Copyright © Digital Equipment Corporation 1993. Disk Inline Exerciser version 1.4. This message identifies the internal program as DILX and gives the DILX software version number. Change Unit is not a legal option if Auto Configure was chosen. Explanation: This message is displayed if the user selected the Auto Configure option and selected the change unit response to the reuse parameters question. You cannot drop a unit and add a unit if all units were selected for testing. DILX Normal Termination. Explanation: This message is displayed when DILX terminates under normal conditions. Insufficient resources. Explanation: Following this line is a second line that gives more information about the problem, which could be one of the following messages: • Unable to allocate memory: DILX was unable to allocate the memory it needed to perform DILX tests. You should run DILX again, but choose a lower queue depth and/or choose fewer units to test. • Cannot perform tests: DILX was unable to allocate all of the resources needed to perform DILX tests. You should run DILX again, but choose a lower queue depth and/or choose fewer units to test. • Unable to change operation mode to maintenance: DILX tried to change the operation mode from normal to maintenance using the SYSAP CHANGE_STATE routine, but was not successful due to insufficient resources. This problem should not occur. If it does occur
320. egister value. • Last Failure Parameter 1 contains the CACHEA0 DRAB CSR Register value. • Last Failure Parameter 2 contains the CACHEA0 DRAB Diagnostic CSR Register value. • Last Failure Parameter 3 contains the CACHEA0 DRAB Diagnostic Error Register value. • Last Failure Parameter 4 contains the CACHEA0 DRAB Error Address Register value. • Last Failure Parameter 5 contains the CACHEA0 DRAB Error Data Register value. • Last Failure Parameter 6 contains the CACHEA0 DRAB Error Region Register value. • Last Failure Parameter 7 contains the CACHEA0 DRAB Region Setup Register value. Table C-33 (Cont.) Executive Services Last Failure Codes (Code: Description): 01832288: A processor interrupt was generated by the CACHEA1 Dynamic RAM Controller and Arbitration engine (DRAB) with an indication that an unrecoverable memory access problem occurred. • Last Failure Parameter 0 contains the CACHEA1 DRAB Setup Register value. • Last Failure Parameter 1 contains the CACHEA1 DRAB CSR Register value. • Last Failure Parameter 2 contains the CACHEA1 DRAB Diagnostic CSR Register value. • Last Failure Parameter 3 contains the CACHEA1 DRAB Diagnostic Error Register value. • Last Failure Parameter 4 contains the CACHEA1 DRAB Error Address Register value. • Last Failure Parameter 5 contains the CACHEA1 DRAB Error Data Register value. • Last Failure Parameter 6 contains the CACHEA1 DRAB Error Region Register
321. elds are undefined. 03862002 002A: Device port SCSI chip reported gross error during tape operation. Note that in this instance the asc and ascq fields are undefined. 03B82002 002A: Device port SCSI chip reported gross error during media loader operation. Note that in this instance the asc and ascq fields are undefined. 03CD2002 002A: Device port SCSI chip reported gross error during operation to a device that is unknown to the controller. Note that in this instance the asc and ascq fields are undefined. 03062002 008A: Non-SCSI bus parity error during disk operation. Note that in this instance the asc and ascq fields are undefined. 03872002 008A: Non-SCSI bus parity error during tape operation. Note that in this instance the asc and ascq fields are undefined. 03B92002 008A: Non-SCSI bus parity error during media loader operation. Note that in this instance the asc and ascq fields are undefined. 03CE2002 008A: Non-SCSI bus parity error during operation to a device that is unknown to the controller. Note that in this instance the asc and ascq fields are undefined. 03070101 01CA: Source driver programming error encountered during disk operation. Note that in this instance the asc and ascq fields are undefined. 03880101 01CA: Source driver programming error encountered during tape operation. Note that in this instance the asc and ascq fields are
322. emoving is still functioning (green LED flashing), you must shut down the controller by following the guidelines in Section 7.1.2. If the controller's green LED is lit continuously, it has already shut down and the surviving controller has assumed service to its devices. Note: Early controller models had a program card EMI shield. This shield may be discarded. 6. On the controller you are removing, unsnap and discard the program card EMI shield, if attached (refer to Figure 7-2). 7. Remove the program card by pushing the eject button (refer to Figure 7-3) next to the card. Pull the card out and save it for use in the replacement controller module. 8. HSJ series: Loosen the captive screws on the CI cable connector (refer to Figure 7-3) with a flat-head screwdriver and remove the cable from the front of the controller module. CAUTION: Do not remove host port cables from an HSD-series controller while the power is on to any members on the DSSI bus, including the controller and host. Doing so risks short circuits that may blow fuses on all the members. HSD series: Turn off power to all members on the DSSI bus. Then, with a flat-head screwdriver, loosen the captive screws on the DSSI cable connector and terminator and remove them from the trilink connector. If necessary for controller access, loosen the captive screws on the trilink connector and remove it from the front of the companion controller. 9. Remove the ma
323. ent If the event is associated with a command issued by a host system, this field is formatted as described in Section C.2.2.2. If the event is associated with a command issued by an HSJ30/40 controller firmware component, this field is considered reserved and contains the value 0. device locator, devtype, device identification, device serial number: See Section C.2.2.4 for the description of these fields. cmdopcd, infoq, ercdval, segment, snsflgs, info, addsnsl, cmdspec, asc, ascq, frucode, keyspec: See Section C.2.2.5 for the description of these fields. Note that the content of certain of the fields described previously may be undefined, depending on the value supplied in the instance code field. See Table C-30 for more detail. C.2.3.14 Media Loader Error Event Log (Template 71): The HSJ30/40 controller Device Services firmware component reports errors detected while performing work related to media loader operations via the Media Loader Error Event Log. If the error is associated with a command issued by a host system, the Media Loader Error Event Log will be sent to the host system that issued the command, on the same connection upon which the command was received, if "This Host" error logging is enabled on that connection, and to all host systems that have enabled "Other Host" error logging on a connection or connections established with the HSJ30/40 controller's Disk and/or Tape MSCP Server.
324. er configurations. • Controller commands: Set and show the basic controller parameters. Set the controller ID (CI or DSSI node number, or SCSI target ID). Set the resident terminal characteristics. Restart the controller. Run resident diagnostics and utilities (see Chapter 6). • Device commands: Device commands specify and show the location of physical SCSI-2 devices attached to the controller. Locations of devices are specified using the SCSI Port-Target-LUN (PTL) designation. Only devices that have been defined by the ADD command are seen or used by the controller. Devices that have been placed in a shelf but have not been added will not be automatically used by the controller. Use the CONFIG utility to quickly add such devices (see Chapter 6). • Storage set commands: Storage set commands add, modify, rename, and show storage sets, such as stripesets. • Logical unit commands: Logical unit commands add, modify, and show logical units built from devices and storage sets. • Exerciser commands: The exerciser commands invoke disk and tape exercisers that test device data transfer capabilities. The exercisers, DILX and TILX, are fully described in Chapter 6. Note: Remember these two guidelines when using the CLI: • Not all configuration parameters need to be specified on one line. They can be entered by using multiple SET commands. • Only enough of each comm
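A Port-Target-LUN location is three numbers, and the LOCATE description elsewhere in this document states that at least one space must separate the port, target, and LUN. As an illustration only, a PTL string could be split as below; parse_ptl is a hypothetical helper, and because the permitted ranges for each field are not given here, the sketch checks only that the fields are non-negative integers.

```python
from typing import Tuple


def parse_ptl(text: str) -> Tuple[int, int, int]:
    """Split a 'port target lun' string into three integers.

    Hypothetical helper: range limits for each field depend on the controller
    model and are deliberately not enforced here.
    """
    fields = text.split()  # any run of whitespace separates the fields
    if len(fields) != 3:
        raise ValueError("expected three fields: port target lun")
    if not all(f.isdigit() for f in fields):
        raise ValueError("port, target, and LUN must be non-negative integers")
    port, target, lun = (int(f) for f in fields)
    return port, target, lun
```

For example, parse_ptl("1 2 0") yields port 1, target 2, LUN 0.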
325. er is specified by the COPY parameter. Also note that, due to the amount of information that must be passed between the two controllers, this command may take up to 1 minute to complete. 1. CLI> SET FAILOVER COPY THIS_CONTROLLER This places two controllers into a dual-redundant configuration, where the good data was on the controller that the maintenance terminal or virtual terminal connection was connected to. 2. CLI> SET FAILOVER COPY OTHER_CONTROLLER This places two controllers into a dual-redundant configuration, where the good data was on the controller that the maintenance terminal or virtual terminal connection was not connected to. SET NOFAILOVER: Removes THIS_CONTROLLER and OTHER_CONTROLLER (if reachable) from a dual-redundant configuration. Format: SET NOFAILOVER. Description: The SET NOFAILOVER command removes THIS_CONTROLLER and the OTHER_CONTROLLER (if currently reachable) from a dual-redundant configuration. Before or immediately after entering this command, one controller should be physically removed, because the sharing of devices is not supported by single controller configurations. The controller on which the command was entered will always be removed from a dual-redundant state, even if the other controller is not currently reachable. No configuration information is lost when lea
326. er method currently in use between the controller and the initiators. This subdisplay is only available for SCSI-based HS controllers. • SCSI host port target ID. • Transfer width: W indicates 16-bit (wide) transfers are being used. A space indicates 8-bit transfers are being used. • The initiator with which synchronous communication has been negotiated. • A numeric value indicates the synchronous data rate that has been negotiated with the initiator at the specified SCSI ID. The value is listed in megahertz (MHz). In this example, the negotiated synchronous transfer rate is approximately 3.57 MHz. To convert this number to the period in nanoseconds, invert it and multiply by 1000. The period for this rate is approximately 280 nanoseconds. • Async indicates communication between this target and all initiators is being done in asynchronous mode. This is the default communication mode and will be used unless the initiator successfully negotiates for synchronous communications. If there is no communication with a given target ID, the communication mode will be listed as asynchronous. CI Performance Display: Path A Pkts Pkts/S RCV 5710 519 ACK 11805 1073 NAK 2073 188 NOR 1072 970 Path B Pkts Pkts/S RCV 5869 533 ACK 11318 1028 NAK 2164 196 NOR 445 40 Description: This display indicates the number of packets sent and received over each CI path and the packet rate. This display is only availabl
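The rate-to-period conversion described for the SCSI host port display (invert the rate in MHz and multiply by 1000) is easy to check numerically. The function name below is hypothetical; the arithmetic is exactly the rule stated in the text.

```python
def sync_period_ns(rate_mhz: float) -> float:
    """Convert a negotiated synchronous rate in MHz to a period in nanoseconds.

    1 MHz corresponds to a 1000 ns period, so period_ns = 1000 / rate_mhz.
    """
    if rate_mhz <= 0:
        raise ValueError("rate must be positive")
    return 1000.0 / rate_mhz


# The example rate from the text: 3.57 MHz is roughly a 280 ns period.
print(round(sync_period_ns(3.57)))
```

The same rule gives 200 ns for a 5 MHz rate and 100 ns for 10 MHz.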
327. er of tape marks to reposition (1:255) [1]? Explanation: If you specify the reposition file marks command in the User Defined test, this question is displayed. The question is self-explanatory. Enter IO size in bytes (1:65535)? Explanation: This question is only asked in the User Defined test for read or write commands. The question is self-explanatory. Enter in HEX the TMSCP Command Modifiers [0]? Explanation: This question only applies to the User Defined test. It allows you to specify the TMSCP command modifiers. You must understand the meaning of the TMSCP command modifiers before entering any value other than the default. Contact Digital Multivendor Services if you wish to use other than default values. Reuse Parameters (stop, continue, restart, change_unit) [stop]? Explanation: This question is displayed after the TILX execution time limit expires, after the hard error limit is reached for every unit under test, or after you enter Ctrl/C. The options are as follows: • Stop: TILX terminates normally. • Continue: TILX resumes execution without resetting the remaining TILX execution time or any performance statistics. If the TILX execution time limit has expired, or all units have reached their hard error limit, TILX terminates. • Restart: TILX resets all performance statistics and restarts execution so that the test will perform exactly as the test that just completed. •
328. er of this appendix, you should become familiar with MSCP and TMSCP protocols, especially in the area of error log message formats. C.2.1 Implementation Dependent Information Area: With the exception of the Disk Copy Data Correlation error log message format, each of the error log message formats listed in Section C.2 provides an implementation dependent information area located at the end of the message. For HSJ30/40 controller specific event logs, this area is formatted as shown in Figure C-1. Note that the fields shown in Figure C-1 always begin on a longword boundary within HSJ30/40 controller specific event logs. If the implementation dependent information area of a particular MSCP error log message format does not begin on a longword boundary, a reserved field containing the appropriate number of bytes is appended to the format to provide the necessary alignment (such as offset 16 in Figure C-15). Implementation Dependent Information Fields: instance code: A number that uniquely identifies the event being reported. The format of this field is shown in Figure C-2. [Figure C-1: Implementation Dependent Information Format; longword layout, bits 31 to 0, with fields including size, reserved, event time, and template dependent information.] [Figure C-2: Instance Code Format; bits 31-24 Component ID, bits 23-16 Event Number, bits 15-8 Repair Action, bits 7-0 NR Threshold.] Instance Code Specific Subfields: NR Threshold: The notifica
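The instance code is a 32-bit value made of four one-byte subfields, so it can be unpacked with shifts and masks. The sketch below assumes the Figure C-2 layout is Component ID in bits 31-24, Event Number in bits 23-16, Repair Action in bits 15-8, and NR Threshold in bits 7-0; that layout is read from a damaged figure and should be confirmed against the printed manual. The names InstanceCode and decode_instance_code are hypothetical.

```python
from typing import NamedTuple


class InstanceCode(NamedTuple):
    component_id: int   # bits 31-24 (assumed layout)
    event_number: int   # bits 23-16 (assumed layout)
    repair_action: int  # bits 15-8  (assumed layout)
    nr_threshold: int   # bits 7-0   (assumed layout)


def decode_instance_code(code: int) -> InstanceCode:
    """Split a 32-bit instance code into its four byte-wide subfields."""
    if not 0 <= code <= 0xFFFFFFFF:
        raise ValueError("instance code must fit in 32 bits")
    return InstanceCode(
        component_id=(code >> 24) & 0xFF,
        event_number=(code >> 16) & 0xFF,
        repair_action=(code >> 8) & 0xFF,
        nr_threshold=code & 0xFF,
    )


# 03062002 appears in this document's instance-code listing
# (non-SCSI bus parity error during disk operation).
fields = decode_instance_code(0x03062002)
print([f"{value:02X}" for value in fields])
```

Under the assumed layout, 03062002 splits into component 03, event 06, repair action 20, and NR threshold 02.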
329. erconnect Services Status Codes (Code: Description): 00000000: Request succeeded. 00000001: The remote sent a message over a connection that has been invalidated. 00000002: The remote sent a message for which no receive credit is available. 00000003: Received a message from the remote while in an invalid or illegal connection state. 00000004: Pending work exists, but connection state is invalid or illegal. 00000009: Request failed; no additional information available. 00000032: A PPD message was received from the remote, but the Virtual Circuit is in an invalid or illegal state. 00000033: A PPD START was received from the remote, but the Virtual Circuit state indicates that the Virtual Circuit is already OPEN. 00000034: A PPD NODE_STOP was received from the remote. 00000035: The "PPD START send without receiving a PPD START in response" limit has been reached; the remote node is acknowledging the packets but not responding to them. Table C-3 (Cont.) Host Interconnect Services Status Codes (Code: Description): 00000036: The "PPD STACK send without receiving a PPD ACK in response" limit has been reached; the remote node is acknowledging the packets but not responding to them. 00000064: The "CI IDREQ send without receiving a CI ID in response" limit has been reached on both Path A and Path B; the remote node is acknowledging the packets but not respond
330. erent sizes, tape marks, and the EOT in exactly the same positions as previously written. This error most likely means that the tape unit has a positioning problem. • A tape mark was detected in a place not expected by TILX. This is code 5. This error would only be detected on a read pass. Because TILX knows what was written to the tape, TILX expects to encounter the records, tape marks, and the EOT in exactly the same positions as previously written. This error most likely means that the tape unit has a positioning problem. • Record Data Truncated not generated. This is code 6. This error would only be detected on a read pass. Occasionally, TILX issues a read with a byte count less than what TILX knows was written to the current tape record. Thus, TILX would expect to receive a Record Data Truncated status. If TILX does not receive the Record Data Truncated status when expected, this TILX-detected error is reported. • EOT encountered in unexpected position. This is code 7. This error would only be detected on a read pass. Because TILX knows what was written to the tape, TILX expects to encounter the records, tape marks, and the EOT in exactly the same positions as previously written. This error most likely means that the tape unit has a positioning problem. TILX terminated. A termination, a print summary, or a reuse parameters request was received, but TILX is currently not testing any units. Explanation: A Ctrl/Y termination request
331. eries Cabinet Cable Distribution Unit Installation Sheet; The Digital Guide to RAID Storage Technology; VAXcluster Console System User's Guide; VAXcluster Systems Guidelines for VAXcluster System Configurations. Order numbers: AE-PYTGA-TE, AE-Q6HKA-TE, AE-Q6HMA-TE, EK-HSFAM-PS, EK-HSFAM-UG, EK-HSFAM-RN, EK-HSD30-RN, EK-HSZ40-RN, EK-SBB35-UG, EK-350MA-UG, EK-BA350-CG, EK-BA350-UG, EK-35XRD-IG, EK-SW800-IG, EK-SWCDU-IS, EK-SW500-IG, EK-SW5CU-IS, EC-B1960-45, AA-GVA5D-TE, EK-VAXCS-CG. Documentation Conventions: The following conventions are used in this manual. boldface type: Boldface type in examples indicates user input. Boldface type in text indicates the first instance of terms defined in either the text, the glossary, or both. italic type: Italic type indicates emphasis, variables in command strings, and complete manual titles. UPPERCASE: Words in uppercase text indicate a command, the name of a file, or an abbreviation for a system privilege. Ctrl/x: Ctrl/x indicates that you hold down the Ctrl key while you press another key, indicated by x. For DILX and TILX, the caret symbol (^) is equivalent to the Ctrl key and these same instructions apply. CDROM: This refers to both a command and a hardware device. The proper usage of CD-ROM with a hyphen is not used, to avoid reader confusion. HSJ series: This refers to all CI-based controllers covered in this manual, as listed in Table 1-1. HSD series: This refers to a
332. ering the PTL, at least one space must separate the port, target, and LUN. See LOCATE CANCEL to turn off the LEDs. An error is displayed if the port, target, or LUN is invalid, or if no device is configured at that location. entity (device or storage set name, or unit number): LOCATE entity turns on the amber device fault LEDs of the devices that make up the entity supplied. If a device name is given, the device's LED is lit. If a storage set name is given, all device LEDs that make up the storage set are lit. If a unit number is given, all device LEDs that make up the unit are lit. See LOCATE CANCEL to turn off the LEDs. An error is displayed if no entity by that name or number has been configured. Examples: 1. CLI> LOCATE DISK0 Turns on the device fault LED on device DISK0. 2. CLI> LOCATE D12 Turns on the device fault LEDs on all devices that make up disk unit number 12. 3. CLI> LOCATE DISKS Turns on the device fault LEDs on all disk devices. RENAME: Renames a container. Format: RENAME old-container-name new-container-name. Parameters: old-container-name: Specifies the existing name that identifies the container. new-container-name: Specifies the new name to identify the container. This name is referred to when creating units and storage sets. The name must start with a letter (A through Z) and can then consist of up to eight more characters made up of A through Z, 0 through 9, period, dash, and unde
333. SHOW UNITS: Shows all units and unit information. Format: SHOW UNITS. Description: The SHOW UNITS command displays all the units known by the controller. First disks (including CDROMs) are listed, then tapes. Qualifiers: FULL: If the FULL qualifier is specified after UNITS, additional amplifying information may be displayed after each unit number, such as the switch settings. Examples: 1. CLI> SHOW UNITS MSCP unit Uses D100 DI0 D110 DI1 D150 DI5 A basic listing of units available on the controller. 2. CLI> SHOW UNITS FULL MSCP unit Uses D100 DI0 Switches: RUN READ_CACHE NOWRITE_PROTECT NOTRANSPORTABLE MAXIMUM_CACHED_TRANSFER_SIZE 32 State: ONLINE to this controller No exclusive access D110 DI1 Switches: RUN READ_CACHE NOWRITE_PROTECT NOTRANSPORTABLE MAXIMUM_CACHED_TRANSFER_SIZE 32 State: ONLINE to this controller No exclusive access D150 DI5 Switches: RUN READ_CACHE NOWRITE_PROTECT NOTRANSPORTABLE MAXIMUM_CACHED_TRANSFER_SIZE 32 State: ONLINE to this controller No exclusive access A full listing of units available on the controller. SHOW unit-number: Shows information about a unit. Format: SHOW unit-number. Parameters: unit-number: The unit number of the unit to display. Description: The SHOW unit-number command is used to show specific information about a particular unit. Examples: CLI> SHOW D150 M
334. error related to timing. 51 00: Erase failure. 52 00: Cartridge fault. 53 00: Media load or eject failed. 53 01: Unload tape failure. 53 02: Medium removal prevented. 5A 00: Operator request or state change input (unspecified). 5A 01: Operator medium removal request. 5A 02: Operator selected write protect. 5A 03: Operator selected write permit. 5B 00: Log exception. 5B 01: Threshold condition met. 5B 02: Log counter at maximum. 5B 03: Log list codes exhausted. 40 nn: Diagnostic failure detected on component nn, where nn identifies a specific target device component (nn range: 80 through FF). Refer to documentation provided by the vendor of the target device for a description of the component identified by nn. Table C-15 SCSI ASC/ASCQ Codes for CDROM Devices (ASC code, ASCQ code: Description): 00 00: No additional sense information. 00 06: I/O process terminated. 00 11: Audio play operation in progress. 00 12: Audio play operation paused. 00 13: Audio play operation successfully completed. 00 14: Audio play operation stopped due to error. 00 15: No current audio status to return. 02 00: No seek complete. 04 00: Logical unit not ready, cause not reportable. 04 01: Logical unit is in process of becoming ready. 04 02: Logical unit not ready, initializing command required. 04 03: Logical unit not ready, manual intervention required. 06 00: No reference position found. 07 00: Multiple peripheral devices sele
335. es System ID 420010061120 Path A is ON Path B is ON MSCP allocation class 3 TMSCP allocation class 3 Cache: 32 megabyte read cache, version 2 The basic HSJ controller information. CLI> SHOW THIS_CONTROLLER Controller: HSD30 2633400026 Software E140, Hardware 0000 Configured for dual-redundancy with CX40100000 All devices failed over to this controller SCSI address 7 Host port: Node name: HSD001, valid DSSI node 1 Host path is ON MSCP allocation class 9 TMSCP allocation class 9 Cache: 32 megabyte read cache, version 2 The basic HSD controller information. CLI> SHOW THIS_CONTROLLER Controller: HSZ40 SC00103056 Software E140, Hardware 0000 SCSI address 6 Host port: valid SCSI target 2 Cache: 32 megabyte read cache, version 2 The basic HSZ controller information. CLI> SHOW THIS_CONTROLLER FULL Controller: HSJ40 ZG313FF115 Software E140, Hardware 0000 Configured for dual-redundancy with 2630355555 In dual-redundant configuration SCSI address 6 Host port: Node name: HSJ306, valid CI node 6, 32 max nodes System ID 420010061120 Path A is ON Path B is ON MSCP allocation class 3 TMSCP allocation class 3 Cache: 32 megabyte read cache, version 2 Extended information: Terminal speed 19200 baud, eight bit, no parity, 1 stop bit Operation control: 00000005 Security state code: 41415 A full HSJ controller information listing.
336. es (such as magnetic disk) (ASC code, ASCQ code: Description): 00 00: No additional sense information. 00 06: I/O process terminated. 01 00: No index/sector signal. 02 00: No seek complete. 03 00: Peripheral device write fault. 04 00: Logical unit not ready, cause not reportable. 04 01: Logical unit is in process of becoming ready. 04 02: Logical unit not ready, initializing command required. 04 03: Logical unit not ready, manual intervention required. 04 04: Logical unit not ready, format in progress. 06 00: No reference position found. 07 00: Multiple peripheral devices selected. 08 00: Logical unit communication failure. 08 01: Logical unit communication time-out. 08 02: Logical unit communication parity error. 09 00: Track following error. 0A 00: Error log overflow. 0C 01: Write error recovered with auto reallocation. 0C 02: Write error, auto reallocation failed. 10 00: ID CRC or ECC error. 11 00: Unrecovered read error. 11 01: Read retries exhausted. 11 02: Error too long to correct. 11 03: Multiple read errors. 11 04: Unrecovered read error, auto reallocate failed. 11 0A: Miscorrected error. 11 0B: Unrecovered read error, recommend reassignment. 11 0C: Unrecovered read error, recommend rewrite the data. 12 00: Address mark not found for ID field. 13 00: Address mark not found for data field. 14 00: Recorded entity not found. 14 01: Record not found. 15 00: Random positioning error. 15 01: Mechanical positioning error. 15 02: Pos
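When reading event logs, the asc and ascq bytes are looked up as a pair. A minimal lookup sketch follows; it holds only a handful of the direct-access entries listed above, the names ASC_ASCQ_DISK and describe_sense are hypothetical, and a real tool would need the complete per-device-type tables.

```python
from typing import Dict, Tuple

# A small subset of the direct-access (disk) ASC/ASCQ entries listed above.
ASC_ASCQ_DISK: Dict[Tuple[int, int], str] = {
    (0x00, 0x00): "No additional sense information",
    (0x04, 0x01): "Logical unit is in process of becoming ready",
    (0x04, 0x03): "Logical unit not ready, manual intervention required",
    (0x08, 0x01): "Logical unit communication time-out",
    (0x11, 0x00): "Unrecovered read error",
    (0x11, 0x0B): "Unrecovered read error, recommend reassignment",
    (0x15, 0x01): "Mechanical positioning error",
}


def describe_sense(asc: int, ascq: int) -> str:
    """Return the description for an ASC/ASCQ pair, or a fallback showing the raw bytes."""
    text = ASC_ASCQ_DISK.get((asc, ascq))
    if text is None:
        return f"Unknown ASC/ASCQ {asc:02X} {ascq:02X}"
    return text


print(describe_sense(0x11, 0x0B))
print(describe_sense(0x7F, 0x00))
```

Keying the table on the (asc, ascq) tuple keeps the two bytes together, which matches how the codes are tabulated; an unlisted pair falls through to a message that preserves the raw values for manual lookup.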
337. es a consistent 25 MIPS. The processor chip controls all but low-level device and host port operations. [Figure 2-1: HS Controller Common Hardware Block Diagram; blocks shown include the policy processor (Intel 80960CA) with 32 KB I/D cache, dual controller port, maintenance terminal port, shared memory, 32 MB read cache option, host interface, value-added functions, and device ports 1 through 6.] 2.1.1.2 Instruction/Data Cache: Although the Intel 80960CA processor chip has an internal cache, the internal cache is not large enough to offset performance degradation caused by shared memory. To compensate for this, a separate Instruction/Data (I/D) cache is part of the policy processor. This 32 KB static RAM (SRAM) cache helps the Intel 80960CA processor chip achieve faster access to instructions and variables. A write-through cache design maintains data coherency in the I/D cache. 2.1.2 Program Card: The program card is a PCMCIA-standard program card device containing the firmware for operating the controller. The firmware is validated and then loaded from the program card into shared memory each time the controller is initialized. 2.1.3 Diagnostic Registers: The HS controller has two write and two read diagnostic registers. Diagnostic and functional firmware use the write diagnostic re
338. es adverse effects on the cluster and prevents a short circuit between the two ground references. Always terminate the connections of the star coupler when removing external CI cables. When handling or moving the internal CI cables, it is very important that the connectors do not become grounded. No metal may contact the metal connectors on these cables, other than an external CI host cable connector. 1.4.4.2 DSSI Cable: Turn off all power to HSD-series controllers and all other devices, including the host CPU, on a DSSI bus before removing a DSSI host cable. If you accidentally short DSSI connector pins while aligning and inserting or removing a DSSI connector, you risk blowing the fuses of all members on the DSSI bus. 1.4.4.3 SCSI Cable: Always terminate open active SCSI connections to the host CPU when SCSI cables are removed. 1.5 Controller Specifications: Table 1-3 lists the physical and electrical specifications for the HS controllers and their cache modules. Note: Measurements in Table 1-3 are nominal measurements; tolerances are not listed. Table 1-3 HS Controller Specifications (Hardware: Length, Width, Power, Current at 5 V, Current at 12 V): HSJ40 controller module: 12.5 inches, 9.50 inches, 40.5 W, 6.2 A, 670 mA. HSJ30 controller module: 12.5 inches, 9.50 inches, 40.5 W, 6.2 A, 670 mA. HSD30 controller module: 12.5 inches, 8.75 inches, 20.9 W, 3.2 A, 10 mA. HSZ40 controller module: 12.5 inches, 8.75 inches, 24.8 W, 4.6 A, 10 mA. Read ca
339. est good or the cache will be declared bad. If cache is locked by the other controller (dual-redundant configurations), then all cache DAEMON diagnostics are postponed. During functional code, when the cache manager determines that the cache is unlocked, the cache manager will test the DRAB, followed by the memory. The tests run by DAEMON and the cache manager are summarized in Table 6-1. Table 6-1 Cache Module Testing (columns: Test, DAEMON, Cache Manager): DRAB: • All memory is initialized • No memory is initialized • Full address test • Address test on diagnostic pages only. Memory: • Never invoked • Always invokes all memory tests • Read only or read/write. After successful test completion, DAEMON releases control. At this time, initialization is finished and functional controller firmware takes over. DAEMON handles all interrupts and errors received during cache module testing. If DAEMON receives any interrupt, it stops initialization. DAEMON displays any errors as a code on the OCP. 6.1.3.1 Self-Test: Self-test is a special function of DAEMON where you set DAEMON to run in a continuous loop. Self-test allows you to diagnose intermittent hardware failures, because the loop will continue until an error is detected. In addition, self-test checks the controller hardware without affecting devices on any ports. Digital recommends you run self-test from the maintenance terminal, because the host port will disconnect once the
340. est rate for the device during the last update interval. These requests are up to 8 kilobytes long and are either generated by host requests or cache flush activity.
RdKB/S: This column shows the average data transfer rate from the device in kilobytes during the previous screen update interval.
WrKB/S: This column shows the average data transfer rate to the device in kilobytes during the previous screen update interval.
Que: This column shows the maximum number of transfer requests waiting to be transferred to the device during the last screen update interval.
Tg: This column shows the maximum number of transfer requests queued to the device during the last screen update interval. If a device does not support tagged queueing, the maximum value will be 1.
CR: This column indicates the number of SCSI command resets that occurred since VTDPY was started.
BR: This column indicates the number of SCSI bus resets that occurred since VTDPY was started.
TR: This column indicates the number of SCSI target resets that occurred since VTDPY was started.
Device SCSI Port Performance
[Sample display: one row per port, with column headings that include Port, RdKB/S, WrKB/S, CR, BR, and TR.]
Description: This subdisplay shows the accumulated I/O performance values and bus statistics for the SCSI device ports. The subdisplay for a controller that has si
341. et enough memory for FLS FCBs to receive information from the other controller.
08190100: An unlock command was received when the NV memory was not locked.
081A0100: Unable to allocate memory for remote work.
081B0101: Bad remote work received on remote work queue. Last Failure Parameter 0 contains the ID type value that was received on the NVFOC remote work queue.
Table C-40 Command Line Interpreter Last Failure Codes
Code       Description
20010100   The action for work on the CLI queue should be CLI_CONNECT, CLI_COMMAND_IN, or CLI_PROMPT. If it is not one of these three, this bugcheck will result.
20020100   The FAO returned a non-successful response. This will only happen if a bad format is detected or the formatted string overflows the output buffer.
20030100   The type of work received on the CLI work queue was not of type CLI.
20070100   A work item of an unknown type was placed on the CLI's DUP Virtual Terminal thread work queue by the CLI.
20080000   This controller requested this controller to restart.
20090010   This controller requested this controller to shut down.
200A0000   This controller requested this controller to self-test.
200B0100   Could not get enough memory for FCBs to receive information from the other controller.
200C0100   After a CLI command the NV memory was still locked. The CLI should always unlock NV memory when the command is complete, whether it had an error or not.
200D0101   After many calls
342. eter 1 contains the address of the fault.
• Last Failure Parameter 2 contains the actual data value.
• Last Failure Parameter 3 contains the expected data value.
01800080: A powerfail interrupt occurred.
Table C-33 (Cont.) Executive Services Last Failure Codes
01812088: A processor interrupt was generated by the Master Dynamic RAM Controller and Arbitration engine (DRAB) with an indication that an unrecoverable memory access problem occurred.
• Last Failure Parameter 0 contains the Master DRAB Setup Register value.
• Last Failure Parameter 1 contains the Master DRAB CSR Register value.
• Last Failure Parameter 2 contains the Master DRAB Diagnostic CSR Register value.
• Last Failure Parameter 3 contains the Master DRAB Diagnostic Error Register value.
• Last Failure Parameter 4 contains the Master DRAB Error Address Register value.
• Last Failure Parameter 5 contains the Master DRAB Error Data Register value.
• Last Failure Parameter 6 contains the Master DRAB Error Region Register value.
• Last Failure Parameter 7 contains the Master DRAB Region Setup Register value.
01822288: A processor interrupt was generated by the CACHEA0 Dynamic RAM Controller and Arbitration engine (DRAB) with an indication that an unrecoverable memory access problem occurred.
• Last Failure Parameter 0 contains the CACHEA0 DRAB Setup R
343. eters
Digital recommends that the SYSGEN parameters PAPOLLINTERVAL and PANUMPOLL be set such that all nodes in the cluster are polled within 30 seconds or less. This ensures proper operation of the HSJ-series CI in the event of controller reinitialization. Failure to set this value may result in MSCP command timeouts. The default values are set to poll 16-node clusters every 5 seconds and 32-node clusters every 10 seconds.
4.10 Failover
Failover takes place when one controller fails in a dual-redundant configuration. To support failover, information is shared between the two controllers, such as:
• Physical device PTL configurations
• Storage set names
• Logical unit definitions
Prior to failover, all resources are considered unbound to a particular controller until a logical unit is brought on line by the host through one of the controllers. At this point, all containers used by the logical unit become solely accessible through the one controller.
In a failover configuration, all commands are shared between the two controllers except the following:
SET THIS_CONTROLLER
SET OTHER_CONTROLLER
SHOW THIS_CONTROLLER
SHOW OTHER_CONTROLLER
RESTART THIS_CONTROLLER
RESTART OTHER_CONTROLLER
SHUTDOWN THIS_CONTROLLER
SHUTDOWN OTHER_CONTROLLER
In these cases the command will be directed to the correct controller:
• THIS_CONTROLLER refers to the controller to which the terminal is connected.
• OTHER_CONTROLLER refers to the other cont
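The polling guidance in the excerpt above can be checked with a little arithmetic. The following is a minimal sketch, not anything taken from the manual: it assumes PANUMPOLL is the number of nodes polled per interval and PAPOLLINTERVAL is the interval in seconds, which is an interpretation consistent with the stated defaults (16 nodes every 5 seconds). The function name `full_poll_seconds` is hypothetical.

```python
import math


def full_poll_seconds(node_count, panumpoll, papollinterval):
    """Return the seconds needed to poll every node once.

    Assumed model: each interval of `papollinterval` seconds polls
    up to `panumpoll` nodes, so a full sweep takes one interval per
    group of nodes.
    """
    if node_count <= 0 or panumpoll <= 0 or papollinterval <= 0:
        raise ValueError("all arguments must be positive")
    passes = math.ceil(node_count / panumpoll)
    return passes * papollinterval


# Check the two cluster sizes mentioned in the text against the
# recommended 30-second limit, using the assumed defaults.
for nodes in (16, 32):
    seconds = full_poll_seconds(nodes, panumpoll=16, papollinterval=5)
    verdict = "within limit" if seconds <= 30 else "exceeds limit"
    print(f"{nodes} nodes: {seconds} s ({verdict})")
```

Under this model, raising PAPOLLINTERVAL or lowering PANUMPOLL lengthens the full sweep, so either change should be rechecked against the 30-second recommendation.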
344. exists. This display is available only on CI- and DSSI-based controllers.
Each position in the data field represents one of the possible nodes to which the controller can communicate. To locate the path status for a given node, use the column on the left to determine the high-order digit of the node number, and use the second row to determine the low-order digit. For CI controllers, the number of nodes displayed is determined by the controller's MAX NODE parameter. The maximum supported value for this parameter is 32. For DSSI controllers, the number of nodes is fixed at 8.
Each location in the grid contains a character to indicate the path status:
A indicates only CI path A is functioning properly. In this example, node 12 demonstrates this. This value will not be displayed for DSSI-based controllers.
B indicates only CI path B is functioning properly. In this example, node 14 demonstrates this. This value will not be displayed for DSSI-based controllers.
X indicates the CI cables are crossed. In this example, node 27 demonstrates this. This value will not be displayed for DSSI-based controllers.
indicates the single DSSI path or both CI paths are functioning properly. In this example, nodes 8, 9, and 15 demonstrate this.
If a period (.) is in a position corresponding to a node, that node does not have any virtual circuits or connections to this controller, so either the path status cannot be determined or neither path is functioni
345. f (upper) Off Off: Fault status. PS (lower) Off On: PS2 is operational. Replace PS1 as described in Chapter 7.
Shelf (upper) Off Off: Fault status. PS (lower) Off Off: Possible PS1 and PS2 fault or input power problem.
† Shelf power supply installed in slot 7.
‡ Redundant power supply installed in slot 6.
5.6 Error Messages
The HS operating firmware is designed to send messages to a virtual terminal and/or maintenance terminal under certain fault conditions. The messages appear on the lines just before the CLI prompt, as shown in the following example:
SWAP signal cleared - all SWAP interrupts re-enabled
CLI>
You might not have a remote or maintenance terminal connected to display messages. In this case, the HS operating firmware saves messages for you. You need only connect a terminal and press the Return key to see the 15 most recently received error messages. Often, messages will continue to appear each time Return is pressed. To clear the terminal of the errors, enter the CLEAR_ERRORS command. You may want to make a note of the errors before clearing them, because they cannot be recalled afterwards.
Note: Because the severity of errors varies, the controller may or may not initialize or operate (or both) even though an error message appears. For example, if all of the SCSI ports, or the host port and local terminal port, fail diagnostics, the controller will not operate. However, if the cache module f
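The message-saving behavior described in the excerpt above (only the 15 most recent messages are kept, and cleared messages cannot be recalled) can be pictured as a bounded buffer. This is a minimal sketch for illustration only; the class and method names are hypothetical and nothing here reflects how the firmware is actually implemented.

```python
from collections import deque


class SavedMessages:
    """Bounded store that keeps only the most recent messages."""

    def __init__(self, capacity=15):
        # A deque with maxlen silently drops the oldest entry
        # whenever a new one is appended to a full buffer.
        self._messages = deque(maxlen=capacity)

    def save(self, text):
        """Record one message, discarding the oldest if full."""
        self._messages.append(text)

    def show(self):
        """Return the saved messages, oldest first."""
        return list(self._messages)

    def clear_errors(self):
        """Discard all saved messages; they cannot be recalled."""
        self._messages.clear()


log = SavedMessages()
for n in range(1, 21):
    log.save(f"message {n}")

print(len(log.show()))   # number of messages retained
print(log.show()[0])     # oldest message still retained
log.clear_errors()
print(log.show())        # nothing left after clearing
```

The sketch also shows why the text advises noting the errors before clearing them: once the buffer is emptied there is no second copy.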
346. f rack tape loader Figure 3 5 Any TZ8x7 half rack tape loader must be located at the top front positions filling the two top BA350 SB shelf positions front and rear Note that each tape loader occupies the full cabinet depth Up to two tape drive loader devices can be loaded in an SW500 series cabinet displacing shelves S4 S9 and S7 S8 moving the CDUs to shelf location S7 Single or paired TZ8x7 The associated BA350 MA controller shelf must be located near enough to satisfy this restriction Configuration Rules and Restrictions 3 7 3 3 Shelves devices must be connected to a controller port as in the SW800 series data center cabinet Use of a second controller shelf By convention the first controller shelf C1 would use positions S1 S4 and S9 the second controller shelf C2 would use positions S5 S7 and S8 This permits two subsystems one with up to 24 28 3 inch SBB devices in the front and the other with 18 21 3 inch SBB devices in the rear Device shelves can be arranged in any SCSI 2 legal configuration subject to the following No more than a single extension joining two BA350 SB device shelves is permitted The two BA350 SB shelves must be physically adjacent to each other Figure 3 6 shows an example of device shelves in a single extension configuration Figure 3 6 Single Extension from Device Shelf to Device Shelf BA350 MA HSJ40 CONTROLLERS BA350 SB 5 1 4 SBB 5 1 4 SBB
347. fied, this warning is displayed. Check both the logical and physical configuration of the devices that make up the unit and correct any mismatches.
Warning 9060: <device type> <device name> at PTL <port> <target> <lun> Incorrect device type installed
Explanation: When a unit is added, the configuration of the disks that make up the unit is checked. If a non-disk device is found at the PTL specified, this warning is displayed. Check both the logical and physical configuration of the devices that make up the unit and correct any mismatches.
B.3 Examples
The following examples show some commonly performed CLI functions. Your subsystem parameters will, of course, differ from those shown here.
B.3.1 Setting HSD-Series Parameters, Nonredundant
SET THIS_CONTROLLER ID=5
SET THIS_CONTROLLER SCS_NODENAME="HSD03"
SET THIS_CONTROLLER MSCP_ALLOCATION_CLASS=4
SET THIS_CONTROLLER TMSCP_ALLOCATION_CLASS=4
RESTART THIS_CONTROLLER
(this controller restarts at this point)
SET THIS_CONTROLLER PATH
These commands could optionally be entered on fewer lines:
SET THIS_CONTROLLER ID=5 SCS_NODENAME="HSD03"
SET THIS_CONTROLLER MSCP_ALLOCATION_CLASS=4 TMSCP_ALLOCATION_CLASS=4
RESTART THIS_CONTROLLER
(this controller restarts at this point)
SET THIS_CONTROLLER PATH
B.3.2
348. fining the test.
Enter starting LBN for this command 0 highest_lbn_on_the_disk
Explanation: This question only applies to the User-Defined test. It allows you to set the starting LBN for the command currently being defined. Enter the starting LBN for this command.
Enter the IO size in 512 byte blocks for this command 1 size in blocks
Explanation: This question only applies to the User-Defined test. It allows you to set the I/O size in 512-byte blocks for the command currently being defined. Enter values indicating the I/O size for this command.
Enter in HEX the MSCP Command Modifiers 0
Explanation: This question only applies to the User-Defined test. It allows you to specify the MSCP command modifiers. You must understand the meaning of the MSCP command modifiers before you enter any value other than the default.
Reuse parameters (stop, continue, restart, change_unit) [stop]?
Explanation: This question is displayed after the DILX execution time limit expires, after the hard error limit is reached for every unit under test, or after you enter Ctrl/C. These options are as follows:
• Stop: DILX terminates normally.
• Continue: DILX resumes execution without resetting the remaining DILX execution time or any performance statistics. If the DILX execution time limit has expired, or all units have reached their hard error limit, DILX terminates.
• Restart: DILX resets all performance statistics and restarts executi
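The two User-Defined test questions in the excerpt above (a starting LBN and an I/O size in 512-byte blocks) describe a byte range on the disk. The sketch below shows that arithmetic under stated assumptions: the function name, the example disk size, and the rule that a transfer must not run past the highest LBN are all illustrative choices, not documented DILX behavior.

```python
BLOCK_SIZE = 512  # bytes per block, as stated in the prompt text


def io_extent(start_lbn, size_blocks, highest_lbn):
    """Return (byte_offset, byte_length) for one command.

    Assumption: valid LBNs run from 0 to highest_lbn inclusive and
    the whole transfer must fit on the disk.
    """
    if not 0 <= start_lbn <= highest_lbn:
        raise ValueError("starting LBN is outside the disk")
    if size_blocks < 1:
        raise ValueError("I/O size must be at least 1 block")
    last_lbn = start_lbn + size_blocks - 1
    if last_lbn > highest_lbn:
        raise ValueError("transfer runs past the end of the disk")
    return start_lbn * BLOCK_SIZE, size_blocks * BLOCK_SIZE


# Hypothetical disk with 1000 blocks (LBNs 0 through 999).
offset, length = io_extent(start_lbn=100, size_blocks=16, highest_lbn=999)
print(offset, length)
```

A 16-block request is 8192 bytes, so a request starting at LBN 100 covers the bytes from offset 51200 up to, but not including, offset 59392.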
349. firmware changes your device and/or storage set names and sends this message. The functional operation of your configuration is not changed when this message appears.
Controllers misconfigured. Type SHOW THIS_CONTROLLER
Explanation: If this message appears, examine the SHOW THIS_CONTROLLER display to determine the source of the misconfiguration.
Taken out of failover due to serial number format error
Explanation: An invalid serial number format was entered for the second controller of a dual-redundant pair.
Serial number initialized due to format error
Explanation: An invalid serial number was entered for the second controller of a dual-redundant pair.
Configuration information deleted due to internal inconsistencies
Explanation: This message is displayed if a test of nonvolatile memory shows corruption. The configuration information for the controller is deleted when this message is displayed.
Restart of the other controller required
Explanation: When changing some parameters, you must reinitialize the companion controller in a dual-redundant pair to have the parameter take effect.
Restart of this controller required
Explanation: A changed parameter requires reinitialization of this controller to take effect.
5.6.4 Shelf Messages
This section lists messages displayed by the controller shelf.
Unable to clear SWAP signal on shelf xx - all SWAP interrupts disabled
Explanation
350. following command connects a host terminal to the CLI (the command requires the DIAGNOSE privilege).
Note: The controller SCS node name must be specified.
SET HOST/LOG=CONFIGURATION.INFO/DUP/SERVER=MSCP$DUP/TASK=CLI SCS_nodename
Establishing a virtual terminal for HSZ-series controllers requires using the HSZUTIL application, which is described in Chapter 6.
Note: Your CLI> prompt may be factory-set to reflect your controller model, such as HSJ>, HSD>, or HSZ>. Appendix B provides details on how to change the CLI> prompt.
4.3.2 Exiting the CLI
When exiting the CLI, keep the following guidelines in mind:
• If you are using a maintenance terminal, you cannot exit the CLI. Entering the EXIT command merely restarts the CLI and redisplays the copyright notice, controller type, and any last fail error information.
• If you are using the DUP connection (virtual terminal), enter the following command to exit the CLI and return the terminal to the host:
CLI> EXIT
• If you connect a virtual terminal via the OpenVMS VAX operating system, you can specify the qualifier /LOG=CONFIGURATION.INFO on the DCL command line. This qualifier creates a log file of your CLI session. Then, when you exit the CLI, you can open the log file to review how you configured your subsystem.
4.3.3 Command Sets
The CLI consists of the following six command sets:
• Failover commands: Failover commands support dual-redundant controll
351. for the description of this field. This field contains the value 41 for this event log.
tdisize: See Section C.2.1 for the description of this field.
Figure C-25 Device Services Nontransfer Error Event Log (Template 41) Format
[Figure: longword layout, bits 31 to 0, with the fields command reference number, sequence number, unit number, controller identifier, reserved, chvrsn, csvrsn, reserved, and event time.]
This field contains the value 04 for this event log.
reserved (offset 1E): This field contains the value 0.
event time: See Section C.2.1 for the description of this field.
port: The SCSI bus number affected by the error being reported.
target: The SCSI target number on the port affected by the error being reported.
asc, ascq: The asc and ascq fields contain the values supplied in the byte 0C (Additional Sense Code) and byte 0D (Additional Sense Code Qualifier) fields, respectively, of the Sense Data returned in the response of a SCSI REQUEST SENSE command issued to the target device. The description of the value supplied in the instance code field (see Table C-27) describes the Sense Key value supplied in the Sense Data returned.
Note that the content of certain of the fields described previously may be undefined depending on the value supplied in the instance code field. See Table C-27 for more detail.
C.2.3.11 Disk Transfer Error Event Log (Template 51)
The HSJ30/40
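The asc and ascq description in the excerpt above names two byte offsets in the Sense Data. The sketch below pulls those bytes, plus the Sense Key, out of a fixed-format sense buffer. The offsets for asc (0C) and ascq (0D) come from the text; reading the Sense Key from the low four bits of byte 2 follows the SCSI-2 fixed sense data layout rather than anything stated here. The function name and the sample buffer are hypothetical.

```python
def decode_sense(sense):
    """Extract Sense Key, ASC, and ASCQ from fixed-format sense data."""
    if len(sense) < 0x0E:
        raise ValueError("sense data too short to contain ASC and ASCQ")
    return {
        "sense_key": sense[2] & 0x0F,  # low nibble of byte 2 (SCSI-2)
        "asc": sense[0x0C],            # Additional Sense Code
        "ascq": sense[0x0D],           # Additional Sense Code Qualifier
    }


# Hypothetical 18-byte buffer: response code 70h, Sense Key 6,
# additional length 0Ah, ASC 29h, ASCQ 00h.
sample = bytes([0x70, 0x00, 0x06, 0x00, 0x00, 0x00, 0x00, 0x0A,
                0x00, 0x00, 0x00, 0x00, 0x29, 0x00, 0x00, 0x00,
                0x00, 0x00])
fields = decode_sense(sample)
print({name: f"{value:02X}" for name, value in fields.items()})
```

Masking byte 2 matters because its upper bits carry other flags (filemark, end-of-medium, incorrect length) that are not part of the Sense Key.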
352. fore Proceeding
Restore initial controller parameters by following the steps in Section 7.1.4.5.
Press and hold both controllers' green reset buttons. Then insert the program card into the new controller. The program card eject button will extend when the card is fully inserted.
Release both reset buttons.
Enter the following command to initialize the controller:
CLI> RESTART THIS_CONTROLLER
If the controllers initialize correctly, their green LEDs will begin to flash at 1 Hz. If an error occurs during initialization, the OCP will display a code. Refer to Chapter 5 to analyze the code.
If you wish, you may disconnect the maintenance terminal. The terminal is not required for normal controller operation.
Close and lock the cabinet doors (SW800 series) using a 5/32-inch Allen wrench.
7.1.4.5 Restoring Initial Parameters
A new controller module has no initial parameters, so you must use a maintenance terminal to enter them. Refer to information in the CONFIGURATION.INFO file or on the configuration sheet packaged with your system (whichever is most current) for parameters. Be sure to use the same parameters from the removed controller when installing a replacement. Follow these steps:
CAUTION: Do not install HSJ-series CI host port cables until after setting all parameters listed here. Failure to follow this procedure may result in adverse effects on the ho
353. g order (see Figure 7-7): TXA, RXA, TXB, RXB.
Attach terminators to the open star coupler connectors.
If necessary to access the internal/external CI cable connector, unlock and open the cabinet (SW800 series) using a 5/32-inch Allen wrench.
6. Disconnect the external CI cables from the internal CI cable.
7. Remove the cable.
Figure 7-7 External and Internal CI Cables (HSJ series)
[Figure: reverse view of the internal CI cable running to the front of the HSJ controller, with red and green path labels, joined to the external CI cables.]
7.4.4 Cable Replacement/Installation
Use the following procedure to replace the external CI cables:
CAUTION: Always connect the external CI cable to the internal CI cable first, then connect it to the star coupler second. Never leave unterminated paths on the star coupler. Never leave cables (terminated or not) attached at the star coupler and disconnected at the internal CI cable connector. This minimizes adverse effects on the cluster and prevents a short circuit between the two ground references.
1. Connect the external CI cables to the internal CI cable.
2. If necessary, close and lock the cabinet doors (SW800 series) using a 5/3
354. gisters to control HS controller and StorageWorks operations. Certain bits in the registers activate test modes for forcing errors in the HS controller. Other bits control the operator control panel (OCP) LEDs. The policy processor reads the read diagnostic registers to determine the cause of an interrupt when an interrupt occurs.
2.1.4 Operator Control Panel
The OCP includes the following:
• One reset button with embedded green LED
• One button per SCSI port
• Six amber LEDs
Figure 2-2 shows an example of the OCP from the HSZ40 controller. The buttons and LEDs serve different functions with respect to controlling the SCSI ports and/or reporting fault and normal conditions. See Chapter 5 for further information on using the OCP.
Figure 2-2 HS Controller Operator Control Panel
[Figure: reset button, LEDs, and port buttons.]
2.1.5 Maintenance Terminal Port
Each HS controller has a modified modular jack (MMJ) on its front bezel that can support an EIA-423 compatible maintenance terminal. You must connect the maintenance terminal during controller installation to set initial controller parameters. During normal operation, you may use either the maintenance terminal or a virtual (host) terminal to add devices and storage sets, or to perform other storage configuration tasks. However, a maintenance terminal is required whe
355. gital Equipme efaults TILX Port Targ LUN Used by 5 0 0 T50 9 2 0 T52 nt Corporation 1993 Tape Inline Exerciser version 1 4 Use all defaults y n y Ye Tape unit numbers on this controller include 5 52 Enter Is a Unit Selec unit number to be tes tape loaded and ready 50 successfully alloca t another unit y n Enter unit number to be tes Is a tape loaded and ready Unit 52 successfully alloca Maximum number of units are TILX testing s Test will run tarted at for 10 mi ted 50 answer ted for n y ted 52 answer Yes when ready y ted for testing now configured 13 JAN 1993 04 35 08 nutes Yes when ready y testing Type T to get Type C Type Y TILX Summary at Test minu Unit 50 To if running TILX a curr to ter to ter through VCS ent performance summa inate the TILX test p inate TILX prematurel 13 JAN 1993 04 36 24 aining 9 expired 1 Requests 868 tes re tal IO or G in all other cases ry rematurely y No errors detected Unit 52 Total IO Requests No errors detected Reuse Parameters stop con TILX Normal Termination HSJ gt 860 tinue restart change_unit stop 6 3 9 2 TILX Example Using All Functions In Example 6 15 TILX is run using all functions and using a longer run time and higher record count than the default The performance statistics and a performance summary are displayed every 15 minu
356. h SBB Configurations 3 Port Controller 5 Inch SBB Configurations 6 Port Controller 5 Inch SBB Configurations 3 Port Controller Small Shelf Count Configurations 6 Port Controller Small Shelf Count Configurations 3 Port Controller High performance Devices per Port o ooooooo ooo SCSI Bus Maximum Lengths o o Operating System SUPppolt ooo ooo Transportable and Nontransportable Devices Storage SBB Status LEDS o o o Shelf and Single Power Supply Status LEDs Shelf and Dual Power Supply Status LEDS Cache Module Testing C 20 C 21 C 21 C 23 C 24 C 26 C 28 C 30 C 32 C 35 C 37 C 39 C 41 C 44 C 46 C 49 C 51 C 53 xviii 1 1 1 4 1 9 1 10 3 11 3 12 3 13 3 14 3 15 3 15 3 16 3 19 4 11 4 18 5 9 5 10 5 11 6 4 xiii xiv gt L L OOND T HE A wet oopooooooooonoo a aA ON OOo A CO rn DILX Data Patterns 1 ao a a E ee ee eens DILX Abort Codes and Definitions llle DILX Error Codes and Definitions ooooooooooo TILX Data Pattern Definitions 0 0 cece eee eee TILX Abort Codes and Definitions o ooooooooooo TILX Abort Codes and Definitions o oooooooooo
357. 7.1.3.3 Module Removal
Use the following procedure to remove the controller module:
1. If you have not done so already, unlock and open the cabinet doors (SW800 series) using a 5/32-inch Allen wrench.
2. Examine the green OCP reset LED (shown in Figure 7-2) on the controller. If the green LED stays lit continuously after troubleshooting (refer to Section 7.1.1), the controller has failed and is already shut down. Proceed to step 6.
3. If the controller is fully or partially functioning (green LED flashing), connect a maintenance terminal to its MMJ (shown in Figure 7-2) and enter the following commands:
CLI> SHOW THIS_CONTROLLER FULL
CLI> SHOW DEVICES FULL
CLI> SHOW UNITS FULL
Figure 7-2 Reset LED (HSJ40 Controller)
[Figure: locations of the EMI shield, reset LED, and MMJ.]
4. Record the output from the commands and keep it available for reference.
Note: Never remove a controller while it is still servicing devices.
5. Because the controller is still functioning, you must shut down the controller by following the guidelines listed in Section 7.1.2.
Note: Earlier controller models had a program card EMI shield. This shield may be discarded.
6. Unsnap and discard the program card EMI shield, if attached (see Figure 7-2).
Removing and
358. hat an invalid IDENTIFY message was received.
UNIT ATTENTION: Indicates that the removable medium may have been changed or the target has been reset.
DATA PROTECT: Indicates that a command that reads or writes the medium was attempted on a block that is protected from this operation. The read or write operation is not performed.
BLANK CHECK: Indicates that a write-once device or a sequential-access device encountered blank medium or format-defined end-of-data indication while reading, or a write-once device encountered a non-blank medium while writing.
Vendor Specific: This sense key is available for reporting vendor-specific conditions.
COPY ABORTED: Indicates a COPY, COMPARE, or COPY AND VERIFY command was aborted due to an error condition on the source device, the destination device, or both.
ABORTED COMMAND: Indicates that the target aborted the command. The initiator may be able to recover by trying the command again.
EQUAL: Indicates a SEARCH DATA command has satisfied an equal comparison.
VOLUME OVERFLOW: Indicates that a buffered peripheral device has reached the end-of-partition and data may remain in the buffer that has not been written to the medium. A RECOVER BUFFERED DATA command(s) may be issued to read the unwritten data from the buffer.
MISCOMPARE: Indicates that the source data did not match the data read from the medium.
RESERVED
Table C-13 SCSI ASC/ASCQ Codes For Direct Access Devic
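The sense key names listed in the excerpt above can be held in a small lookup table for use when reading event logs. In the sketch below the names come from the text, while the numeric values (6 through F hex) are the SCSI-2 sense key assignments, which this excerpt does not show; the dictionary and function names are hypothetical. Keys below 6 are left out because their names are not part of this excerpt.

```python
# Sense key values per SCSI-2; names as given in the listing above.
SENSE_KEY_NAMES = {
    0x6: "UNIT ATTENTION",
    0x7: "DATA PROTECT",
    0x8: "BLANK CHECK",
    0x9: "Vendor Specific",
    0xA: "COPY ABORTED",
    0xB: "ABORTED COMMAND",
    0xC: "EQUAL",
    0xD: "VOLUME OVERFLOW",
    0xE: "MISCOMPARE",
    0xF: "RESERVED",
}


def sense_key_name(key):
    """Return the name for a 4-bit sense key, or a placeholder."""
    if not 0 <= key <= 0xF:
        raise ValueError("a sense key is a 4-bit value")
    return SENSE_KEY_NAMES.get(key, f"not listed here ({key:X}h)")


print(sense_key_name(0x6))
print(sense_key_name(0xB))
print(sense_key_name(0x2))
```

Returning a placeholder for unlisted keys, instead of raising, keeps a log reader usable when it meets a value that this partial table does not cover.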
359. hat prevents any quiesce function on this controller. To correct the problem, you must locate the suspect shelf and do one of three things:
• Remove all devices from the shelf.
• Disconnect the shelf's SCSI device cables (Section 7.8).
• Repair or replace the shelf power supply (Section 7.10).
While the OCP LEDs are flashing, any SBBs on the quiesced port that have status LEDs will also flash.
Note: The length of time required for I/O to stop can vary from zero seconds to several minutes, depending on load, device type, and cache status.
Hold the SBB in both hands and firmly push it into the shelf until you hear the mounting tabs snap into place.
7.11.1.5 Restoring the Device to the Configuration
After you insert the SBB, the flashing pattern on the OCP stops and normal operation on the ports resumes. At this time, the port LEDs will turn off.
If you inserted a new device in a previously unused slot, that port's LED remains lit until the device is added by entering the following command (see Appendix B):
CLI> ADD device
If a tape SBB is inserted in a slot where a disk SBB was previously installed, the port LED remains lit until the device is added (using the ADD command) and you delete the previously installed device from the list of known devices as follows:
CLI> DELETE device-name
If the new disk is to be part of a storage set, you must delete the storage set from the configuration and create (ADD) it again
360. hat quit is not a command; instead, it indicates to DILX that you have finished defining the test.
• The starting Logical Block Number (LBN)
• The size of the I/O in 512-byte blocks
• The MSCP command modifiers
6.2.4 DILX Test Definition Questions
The following text is displayed when running DILX. The text includes questions that are listed in the approximate order that they are displayed on your terminal. These questions prompt you to define the runtime parameters for DILX.
Note: Defaults for each question are given inside [ ]. If you press the Return key as a response to a question, the default is used as the response.
After DILX has been started, the following message describing the Auto-Configure option is displayed:
The Auto-Configure option will automatically select, for testing, half or all of the disk units configured. It will perform a very thorough test with WRITES enabled. The user will only be able to select the run time and performance summary options and whether to test a half or full configuration. The user will not be able to specify specific units to test. The Auto-Configure option is only recommended for initial installations. It is the first question asked.
Do you wish to perform an Auto-Configure (y/n) [n]?
Explanation: Enter Y if you wish to invoke the Auto-Configure option.
DILX next displays the following information:
If you want to test a dua
361. he Device Services Nontransfer Error Event Log will be sent to the host system that issued the command, on the same connection upon which the command was received, if "This Host" error logging is enabled on that connection, and to all host systems that have enabled "Other Host" error logging on a connection or connections established with the HSJ30/40 controller's Disk and/or Tape MSCP Server.
If the error is associated with a command issued by an HSJ30/40 controller firmware component, the Device Services Nontransfer Error Event Log will be sent to all host systems that have enabled "Miscellaneous" error logging on a connection established with the HSJ30/40 controller's Disk and/or Tape MSCP Server.
The Device Services Nontransfer Error Event Log is reported via the T/MSCP "Controller Errors" error log message format. The format of this event log, including the HSJ30/40 controller-specific fields, is shown in Figure C-25.
Device Services Nontransfer Error Event Log Format Specific Fields:
format: This field contains the value 00 (that is, the T/MSCP "Controller Errors" error log format code).
event code: The values that can be reported in this field for this event log are shown in Table C-27.
reserved (offset 16): This field contains the value 0.
instance code: See Section C.2.1 for the description of this field. The values that can be reported in this field for this event log are shown in Table C-27.
templ: See Section C.2.1
362. he T/MSCP "Controller Errors" error log message format. The format of this event log, including the HSJ30/40 controller-specific fields, is shown in Figure C-20.
Figure C-20 Subsystem Built-In Self Test Failure Event Log (Template 13) Format
[Figure: longword layout, bits 31 to 0, with the fields controller identifier, instance code, tdisize, templ, reserved, event time, hdrflgs, hdrtype, tflags, temd, and return code.]
Subsystem Built-In Self Test Failure Event Log Format Specific Fields:
format: This field contains the value 00 (that is, the T/MSCP "Controller Errors" error log format code).
event code: The values that can be reported in this field for this event log are shown in Table C-22.
reserved (offset 16): This field contains the value 0.
instance code: See Section C.2.1 for the description of this field. The values that can be reported in this field for this event log are shown in Table C-22.
templ: See Section C.2.1 for the description of this field. This field contains the value 13 for this event log.
tdisize: See Section C.2.1 for the description of this field. This field contains the value 24 for this event log.
reserved (offset 1E): This field contains the value 0.
event time: See Section C.2.1 for the description of this field.
undefined: This field is only present to provide longword alignment; its content is undefined.
hdrtype, hdrflgs, te, tnum, temd, tflags, error code, retur
363. he companion's trilink connector first in this case.
7.1.4.1 Tools Required
You will need the following tools to remove or replace the controller module:
• ESD strap
• 3/32-inch Allen wrench
• 5/32-inch Allen wrench
• Flat-head screwdriver
7.1.4.2 Precautions
Refer to Chapter 1 for ESD, grounding, module handling, and program card handling guidelines. Ground yourself to the cabinet grounding stud (refer to Figure 7-1) before servicing the controller module.
7.1.4.3 Module Removal
Use the following procedure to remove the controller module:
1. If you have not done so already, unlock and open the cabinet doors (SW800 series) using a 5/32-inch Allen wrench.
2. Examine the green OCP reset LED (refer to Figure 7-2) on both controllers. At least one green LED should not remain lit continuously after basic troubleshooting (refer to Section 7.1.1). If both green LEDs stay lit continuously, both controllers have failed. Refer to Section 7.1.5.
3. Connect a maintenance terminal to the MMJ (refer to Figure 7-2) of each functioning or partially functioning controller and enter the following commands:
CLI> SHOW THIS_CONTROLLER FULL
CLI> SHOW DEVICES FULL
CLI> SHOW UNITS FULL
4. Record the output from the commands and keep it available for reference.
Note: Never remove a controller while it is still servicing devices.
5. If the controller you are r
364. he controller devices in a one-to-one unit relationship with the host. This situation may or may not occur under normal operation. For this reason, host addressing is often tied to a virtual storage device (a storage set).
2.3.3 Host Storage Addressing (HSZ series)
Figure 2-7 shows a typical connection between an HSZ series controller and its host. In this case, the SCSI host device interface consists of device ports, each connected to a SCSI bus containing up to eight devices. The HSZ series controller resides on one of the SCSI buses and can be assigned one or two SCSI IDs on that bus.
Figure 2-7 Host Storage Addressing (HSZ series)
(Figure labels: host port address, SCSI bus 1 through SCSI bus N; host SCSI ID, target address; host interface; device interface; host LUN address, LUN 0 through LUN 7; virtual devices. CXO-4107A-MC)
Functional Description 2-15
A SCSI host also sees host logical units through the controller. However, in SCSI systems there can be at most eight units per ID. For the HSZ series controller, this means up to 16 units, or eight per ID. Furthermore, the host addresses each unit by a SCSI logical unit number, also called a LUN.
Note: Although they share the same name, controller LUNs and SCSI host LUNs are logical addresses for two different storage structures. Controller LUNs exist on the controller's device interface and SCS
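The addressing limits described in excerpt 364 can be checked with a little arithmetic. The following Python sketch is illustrative only: the function and constant names are hypothetical, and the 0 to 7 range for SCSI IDs is an assumption based on the eight-device bus the text describes.

```python
# Hypothetical helper names; the limits come from the text above:
# eight LUNs per SCSI ID, and one or two SCSI IDs per HSZ series controller.
LUNS_PER_SCSI_ID = 8
MAX_CONTROLLER_IDS = 2


def host_unit_addresses(target_ids):
    """Return every (target ID, LUN) pair a host could address."""
    ids = list(target_ids)
    if not 1 <= len(ids) <= MAX_CONTROLLER_IDS:
        raise ValueError("an HSZ series controller uses one or two SCSI IDs")
    if len(set(ids)) != len(ids):
        raise ValueError("SCSI IDs must be distinct")
    # Assumption: IDs on an eight-device bus are numbered 0 through 7.
    if any(not 0 <= t <= 7 for t in ids):
        raise ValueError("SCSI IDs must be in the range 0 to 7")
    return [(t, lun) for t in ids for lun in range(LUNS_PER_SCSI_ID)]


if __name__ == "__main__":
    print(len(host_unit_addresses([2])))     # 8 units with one ID
    print(len(host_unit_addresses([2, 3])))  # 16 units with two IDs
```

With two IDs assigned, the host can address 2 x 8 = 16 units, which matches the limit stated in the text.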
365. he description of these fields.
cmdopcd, infoq, ercdval, segment, snsflgs, info, addsnsl, cmdspec, asc, ascq, frucode, keyspec: See Section C.2.2.5 for the description of these fields. Note that the content of certain of the fields described previously may be undefined, depending on the value supplied in the instance code field. See Table C-28 for more detail.
C.2.3.12 Disk Bad Block Replacement Attempt Event Log (Template 57)
The HSJ30/40 controller Value Added firmware component reports disk unit bad block replacement attempt results via the Disk Bad Block Replacement Attempt Event Log.
If the replacement is associated with a command issued by a host system, the Disk Bad Block Replacement Attempt Event Log will be sent to the host system that issued the command, on the same connection upon which the command was received, if "This Host" error logging is enabled on that connection, and to all host systems that have enabled "Other Host" error logging on a connection or connections established with the HSJ30/40 controller's Disk and/or Tape MSCP Server.
If the replacement is associated with a command issued by an HSJ30/40 controller firmware component, the Disk Bad Block Replacement Attempt Error Event Log will be sent to all host systems that have enabled "Miscellaneous" error logging on a connection established with the HSJ30/40 controller's Disk MSCP Server.
The Disk Bad Block Replacement A
366. he erase percentage will be set automatically.
Enter access percentage for Seek Intensive phase (0:100) [90] ?
Perform data compare (y/n) [n] ? y
Enter compare percentage (1:100) [5] ?
Disk unit numbers on this controller include: 10 2 4 21 23 61 63
Enter unit number to be tested ?
Unit 10 will be write enabled. Do you still wish to add this unit (y/n) [n] ? y
Enter start block number (0:1664214) [0] ?
Enter end block number (0:1664214) [1664214] ?
Unit 10 successfully allocated for testing
Select another unit (y/n) [n] ? y
Enter unit number to be tested ?
Unit 12 will be write enabled. Do you still wish to add this unit (y/n) [n] ? y
Enter start block number (0:832316) [0] ?
Enter end block number (0:832316) [832316] ?
Unit 12 successfully allocated for testing
Select another unit (y/n) [n] ? n
DILX testing started at: 13-JAN-1993 04:52:26
Test will run for 45 minutes
Type ^T (if running DILX through VCS) or ^G (in all other cases) to get a current performance summary
Type ^C to terminate the DILX test prematurely
Type ^Y to terminate DILX prematurely
(continued on next page)
6-24 Diagnostics, Exercisers, and Utilities
Example 6-7 (Cont.) All Functions (DILX)
(Summary display partly lost in extraction. Recoverable content: DILX Summary at 13-JAN-1993 04:56:20; test minutes remaining 42; Unit 10 and Unit 12 each show Read Count 0 and Access Count 0, with no errors detected.)
367. he soft error limit has been reached for controller errors. Thus, controller soft error reporting is disabled.
Hard error limit reached for controller errors. All units dropped from testing.
Explanation: This message is self-explanatory.
Unit is already allocated for testing.
Explanation: This message is self-explanatory.
No drives selected.
Explanation: DILX parameter collection was exited without choosing any units to test.
Maximum number of units are now configured.
Explanation: This message is self-explanatory. Testing will start after this message is displayed.
Unit is write protected.
Explanation: The user wants to test a unit with write and/or erase commands enabled, but the unit is write protected.
The unit status and/or the unit device type has changed unexpectedly. Unit x dropped from testing.
Explanation: The unit status may change if the unit experienced hard errors or if the unit is disconnected. Either way, DILX cannot continue testing the unit.
Last Failure Information follows. This error was NOT produced by running DILX. It represents the reason why the controller crashed on the previous controller run.
Explanation: This message may be displayed while allocating a unit for testing. It does not indicate any reason why the unit is or is not successfully allocated, but rather represents the reason why the controller went down in the previous run. The information tha
368. hed to disk, the controller will not restart unless the IGNORE_ERRORS qualifier is specified. Specifying IMMEDIATE will cause this controller to restart immediately without flushing any user data to the disks, even if drives are on line to a host.
The RESTART THIS_CONTROLLER command will not cause a failover to the other controller in a dual-redundant configuration. This controller will restart and resume operations where it was interrupted.
Note: If you enter a RESTART THIS_CONTROLLER command and you are using a virtual terminal to communicate with the controller, the connection will be lost when this controller restarts.
Qualifiers for HSD and HSJ controllers
IGNORE_ERRORS
NOIGNORE_ERRORS (D)
If errors result when trying to write user data, the controller will not be restarted unless IGNORE_ERRORS is specified.
CAUTION: Customer data may be lost or corrupted if the IGNORE_ERRORS qualifier is specified.
NOIGNORE_ERRORS is the default.
IMMEDIATE
NOIMMEDIATE (D)
If IMMEDIATE is specified, the controller restarts immediately without checking for online devices.
CAUTION: Customer data may be lost or corrupted if the IMMEDIATE qualifier is specified.
Command Line Interpreter B-23
NOIMMEDIATE is the default.
OVERRIDE_ONLINE
NOOVERRIDE_ONLINE (D)
If any units are on line to the controller, the controller will not be restarted unless OVERRIDE_ONLINE is specified.
369. hed with the HSJ30/40 controller's Disk and/or Tape MSCP Server.
The Last Failure Event Log is reported via the T/MSCP Controller Errors error log message format. The format of this event log, including the HSJ30/40 controller-specific fields, is shown in Figure C-15.
Last Failure Event Log Format Specific Fields
format: This field contains the value 00 (that is, the T/MSCP Controller Errors error log format code).
Figure C-15 Last Failure Event Log (Template 01) Format
(Longword layout, bits 31 to 0. Fields shown: controller identifier, instance code, tdisize, templ, reserved, event time, last failure code, last failure parameters.)
event code: The values that can be reported in this field for this event log are shown in Table C-18.
reserved (offset 16): This field contains the value 0.
instance code: See Section C.2.1 for the description of this field. The values that can be reported in this field for this event log are shown in Table C-18.
templ: See Section C.2.1 for the description of this field. This field contains the value 01 for this event log.
tdisize: See Section C.2.1 for the description of this field. This field contains the value 24 for this event log.
reserved (offset 1E): This field contains the value 0.
event time: See Section C.2.1 for the description of this field.
last failure code: A number that uniquely describes the unrecoverable condition being reported, as s
370. hown in Tables C-33 through C-48. The format of this field is shown in Figure C-16.
Note: Do not confuse this field with the instance code field. They are similar in format but convey different information.
Figure C-16 Last Failure Code Format
Bits 31-24: Component ID
Bits 23-16: Error Number
Bits 15-8: Repair Action
Bit 7: HW
Bits 6-4: Restart Code
Bits 3-0: Parameter Count
Last Failure Code Specific Subfields
Parameter Count: The number of longwords of supplemental information provided in the last failure parameters field.
Restart Code: A number that describes the actions taken to restart the controller after the unrecoverable condition was detected, as shown in Table C-49.
HW: Hardware/firmware flag. If this flag is equal to 1, the unrecoverable condition is due to a hardware-detected fault. If this flag is equal to 0, the unrecoverable condition is due to a firmware-detected inconsistency.
Repair Action: The recommended repair action code assigned to the condition. This value indicates what notification/recovery action should be taken. See Section C.5 for more detail.
Error Number: A number that, when combined with the value contained in the Component ID subfield, uniquely identifies the condition detected.
Component ID: A number that uniquely identifies the firmware component that reported the condition, as shown in Table C-2.
last failure parameters: This field contains supplementa
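The subfields of the last failure code described in excerpt 370 lend themselves to a small decoder. The Python sketch below is not taken from the manual. It assumes the bit positions read from the damaged Figure C-16 (Component ID in bits 31-24, Error Number in 23-16, Repair Action in 15-8, HW in bit 7, Restart Code in 6-4, Parameter Count in 3-0), and the class and function names are hypothetical.

```python
from typing import NamedTuple


class LastFailureCode(NamedTuple):
    """Subfields of a 32-bit last failure code (names are illustrative)."""
    component_id: int
    error_number: int
    repair_action: int
    hw_detected: bool
    restart_code: int
    parameter_count: int


def decode_last_failure_code(code: int) -> LastFailureCode:
    """Split a last failure code into its subfields.

    Assumed layout: 31-24 Component ID, 23-16 Error Number,
    15-8 Repair Action, 7 HW flag, 6-4 Restart Code, 3-0 Parameter Count.
    """
    if not 0 <= code <= 0xFFFFFFFF:
        raise ValueError("last failure code must fit in 32 bits")
    return LastFailureCode(
        component_id=(code >> 24) & 0xFF,
        error_number=(code >> 16) & 0xFF,
        repair_action=(code >> 8) & 0xFF,
        hw_detected=bool((code >> 7) & 0x1),
        restart_code=(code >> 4) & 0x7,
        parameter_count=code & 0xF,
    )


if __name__ == "__main__":
    fields = decode_last_failure_code(0x03330188)
    print(fields)
```

Under this layout, code 03330188 decodes to Component ID 03, Error Number 33 (hex), Repair Action 01, HW flag set, Restart Code 0, and a Parameter Count of 8.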
371. ia changer device reported standard SCSI Sense Data.
03994002  0097  A Drive failed because a Test Unit Ready command or a Read Capacity command failed.
039A000A  0077  E was failed by a Mode Select command received from the host.
039B4002  0097  Drive failed due to a deferred error reported by drive.
039C4002  0097  Unrecovered Read or Write error.
039D4002  0037  No response from one or more drives.
039E430A  0097  Nonvolatile memory and drive metadata indicate conflicting drive configurations.
039F430A  0097  The Synchronous Transfer Value differs between drives in the same storageset.
03A04002  0097  Maximum number of errors for this data transfer operation exceeded.
03A14002  0097  Drive reported recovered error without transferring all data.
03A24002  0097  Data returned from drive is invalid.
03A34002  0097  Request Sense command to drive failed.
03A40064  0016  Illegal command for pass-through mode.
03A50064  0016  Data transfer request error.
03A64002  0097  Premature completion of a drive command.
03A74002  0037  Command timeout.
03A80101  0037  Watchdog timer timeout.
03A94002  0037  Disconnect timeout.
03AA4002  0097  Unexpected bus phase.
03AB4002  0097  Disconnect expected.
03AC4002  0097  ID Message not sent by drive.
03AD4002  0097  Synchronous negotiation error.
03AE4002  0097  The drive unexpectedly disconnected from the SCSI bus.
03AF4002  0097  Unexpected message.
03B04002  0097  Unexpected Tag message.
03B14002  0097  Channel busy.
03B24002  0097  Message Reject received on a
372. ich could be one of the following messages:
• Unable to allocate memory. TILX was unable to allocate the memory needed to perform TILX tests. You should run TILX again but choose a lower queue depth and/or choose fewer units to test.
• Cannot perform tests. TILX was unable to allocate all of the resources needed to perform TILX tests. You should run TILX again but choose a lower queue depth and/or choose fewer units to test.
• Unable to change operation mode to maintenance. TILX tried to change the operation mode from normal to maintenance using the SYSAP CHANGE_STATE routine but was not successful due to insufficient resources. This problem should not occur. If it does occur, submit an error report. Then reset the controller.
Tape unit x does not exist.
Explanation: An attempt was made to allocate a unit for testing that does not exist on the controller.
Unit x successfully allocated for testing.
Explanation: All processes that TILX performs to allocate a unit for testing have been completed. The unit is ready for TILX testing.
Unable to allocate unit.
Explanation: This message should be preceded by a reason why the unit could not be allocated for TILX testing.
Cannot enable eip notification.
Explanation: This message indicates that TILX was not successful in enabling EIP notification. This should only occur if another copy of TILX is running. Wait for the first copy to finish, or terminate the second copy. If there are
373. igital recommends using the Auto Configure option only during initial installations.
DILX tests logical units that may consist of storage sets of multiple physical devices. Error reports identify the logical units, not the physical devices. Therefore, if errors occur while running against a unit, its storage set should be reconfigured as individual devices, and DILX should then be run again against the individual devices.
There are no limitations on the number of units DILX may test at one time. However, Digital recommends using DILX only when no host activity is present. If you must run DILX during a live host connection, you should limit your testing to no more than half of any controller's units at one time. This conserves controller resources and minimizes performance degradation on the live units you are not testing.
6.4.1 Invoking DILX
To invoke DILX from a maintenance terminal, enter the following command at the CLI> prompt:
CLI> RUN DILX
6.4.2 Interrupting DILX Execution
Use the following guidelines to interrupt DILX execution.
Note: The ^ symbol is equivalent to the Ctrl key. You must press and hold the Ctrl key and type the character key given.
• Ctrl/G or Ctrl/T causes DILX to produce a performance summary. DILX continues normal execution without affecting the runtime parameters.
• Ctrl/C causes DILX to produce a performance summary, stop testing, and ask the "reuse parameters" question.
• Ctrl/Y causes D
374. iguration, device service is interrupted for the duration of the service cycle.
HSZ series controllers: In effect, following these procedures to remove and replace an HSZ series controller is warm swapping the controller, because other targets on the host SCSI bus remain unaffected. However, take care not to confuse removing and replacing an HSZ series controller with the special warm swap procedure for HSJ series controllers described in Section 7 11 2.
7.1.3.1 Tools Required
You will need the following tools to remove or replace the controller module:
• ESD strap
• 3/32-inch Allen wrench
• 5/32-inch Allen wrench
• Flat-head screwdriver
• Small flat-head screwdriver
7.1.3.2 Precautions
Refer to Chapter 1 for ESD grounding, module handling, and program card handling guidelines. Ground yourself to the cabinet grounding stud (shown in Figure 7-1) before servicing the controller module.
Nonredundant controllers will always be installed in slot SCSI ID 7. Slot 7 is the controller shelf slot furthest from the SCSI device cable connectors.
Figure 7-1 Cabinet Grounding Stud
(Figure label: 800 series cabinet grounding stud.)
375. ile. Change the value 137 in the following statement to 142:
$IF temp .LE. 137 .AND. temp .GE. 1
3. Run the AUTOGEN program.
This change will allow AUTOGEN to run successfully against the controller-attached disk drives used as system disks.
OpenVMS VAX V6.1
The OpenVMS VAX V6.1 operating system does not require modifications to AUTOGEN.COM as described in the previous sections.
4.9.4 Other Conditions (OpenVMS)
The following conditions and recommendations also apply to controllers running under the OpenVMS operating system:
• MSCP and TMSCP controller timeouts. The MSCP and TMSCP controller timeouts have been split, and the TMSCP timeout has been increased from 200 to 255 seconds. This is to reduce host resets from the TU driver in OpenVMS VAX V5.5-2 that occur when the driver sends multiple position commands to a tape drive with shorter timeouts. This change in HSJ and HSD series controller firmware will reduce, but not eliminate, the rate of these host resets.
• Write history log. The write history log has been increased from 512 to 2048 entries. The allocation failure entry table has also been increased from 128 to 512 entries. This should eliminate or drastically reduce VMS crashes from entries and tables filling up while the OpenVMS software is using Host Based Volume Shadowing (HBVS) on the HSJ or HSD series controller.
• Increased storage set size. Fourteen-member RAID 0 storage sets are now supported. Previous versi
376. ilink 7-31
Reset button 4-1, 4-17, 5-3, 6-1, 6-5, 7-2
RESTART OTHER_CONTROLLER command B-21
RESTART THIS_CONTROLLER command B-23
Restoring initial parameters
  nonredundant controller 7-9
  one dual-redundant controller 7-16
RUN command B-25
S
Safety. See Precautions
SCS node name 4-5, 4-6, 7-10, 7-16, 7-46
  restriction 4-12
SCSI cable
  service precautions 1-9
SCSI cable, device port. See Device port cable
SCSI host cable 3-19, 7-29
  installing 7-31
  length 3-19
  removing 7-30
  replacing 7-31
  service of 7-29
  service precautions 7-30
  tools 7-29
SCSI host interconnection
  supported protocols 2-9
SCSI hosts and storage 2-15
SCSI target ID 4-5, 7-10
SCSI trilink
  installing 7-31
  removing 7-30
  replacing 7-31
Self-test 1-5, 2-9, 6-4. See also DAEMON
  running 6-4
  stopping 6-5
SELFTEST OTHER_CONTROLLER command B-26
SELFTEST THIS_CONTROLLER command B-28
Sense data display, HSZ series DILX 6-61
SET disk container name command B-30
SET FAILOVER command B-31
SET NOFAILOVER command B-33
SET OTHER_CONTROLLER command B-34
SET stripeset container name command B-37
SET THIS_CONTROLLER command B-38
SET unit number command B-41
Shadow member timeout 4-15
Shadow sets 4-15
Shared memory 2-4, 6-3
  testing 6-3
Shelf
  configurations 3-8
  error messages 5-15
SHOW cdrom container name command B-45
SHOW CDROMS command B-44
SHOW DEVICES command B-46
SHOW disk container
377. ilities on THIS CONTROLLER. Diagnostics and utilities can only be run on the controller where the terminal or DUP connection is connected. For specific information about available diagnostics and utilities, refer to the StorageWorks Array Controllers HS Family of Array Controllers Service Manual.
Examples
CLI> RUN DILX
Copyright © Digital Equipment Corporation 1993
Disk Inline Exerciser - version 1.0
How the diagnostic DILX would be run.
SELFTEST OTHER_CONTROLLER
Runs a self-test on the other controller.
Note: This command is valid for HSJ and HSD controllers only.
Format
SELFTEST OTHER_CONTROLLER
Description
The SELFTEST OTHER_CONTROLLER command shuts down the other controller, then restarts it in DAEMON loop-on-self-test mode. The OCP reset button must be pushed to take the other controller out of loop-on-self-test mode.
If any disks are on line to the other controller, the controller will not self-test unless the OVERRIDE_ONLINE qualifier is specified (HSD and HSJ only). If any user data cannot be flushed to disk, the controller will not self-test unless the IGNORE_ERRORS qualifier is specified. Specifying IMMEDIATE will cause the other controller to self-test immediately without flushing any user data to the disks, even if drives are on line to the host.
Qualifiers for HSD and HSJ controllers
IGNORE_ERRORS
NOIGNORE_ERRORS (D)
If erro
378. ill be spun up. If NORUN is specified, the unit will be spun down. When entering an ADD UNIT command, RUN is the default.
WRITE_PROTECT
NOWRITE_PROTECT (D)
Enables and disables write protection of the unit. When entering an ADD UNIT command, NOWRITE_PROTECT is the default.
Qualifiers for a unit created from a tape drive (HSJ and HSD only)
DEFAULT_FORMAT=format
DEFAULT_FORMAT=DEVICE_DEFAULT (D)
Specifies the tape format to be used unless overridden by the host. Note that not all devices support all formats. The easiest way to determine what formats are supported by a specific device is to enter SHOW <tape unit number> DEFAULT_FORMAT; the valid options will be displayed.
Supported tape formats are as follows:
• DEVICE_DEFAULT. The default tape format is the default that the device uses or, in the case of devices that are settable via switches on the front panel, the settings of those switches.
• 800BPI_9TRACK
• 1600BPI_9TRACK
• 6250BPI_9TRACK
• TZ85
• TZ86
• TZ87_NOCOMPRESSION
• TZ87_COMPRESSION
• DAT_NOCOMPRESSION
• DAT_COMPRESSION
• 3480_NOCOMPRESSION
• 3480_COMPRESSION
When entering the ADD UNIT command for a tape device, DEFAULT_FORMAT=DEVICE_DEFAULT is the default.
Examples
1. CLI> ADD UNIT D0 DISK0
   Disk unit number 0 is created from container DISK0.
2. CLI> ADD UNIT T0 TAPE12
   Tape unit number 0 is created from container TAPE12.
379. ing of CDROMs.
SHOW cdrom container name
Shows information about a CDROM.
Format
SHOW cdrom container name
Parameters
cdrom container name: The name of the CDROM drive that will be displayed.
Description
The SHOW cdrom container name command is used to show specific information about a particular CDROM drive.
Examples
CLI> SHO CDROM230
Name      Type   Port Targ Lun  Used by
CDROM230  cdrom  2    3    0    D623
          DEC RRD44 (C) DEC 3593
A listing of CDROM CDROM230.
SHOW DEVICES
Shows physical devices and physical device information.
Format
SHOW DEVICES
Description
The SHOW DEVICES command displays all the devices known to the controller. First disks are shown, then tapes, and finally CDROMs.
Qualifiers
FULL
If the FULL qualifier is specified, additional amplifying information may be displayed after each device. Information contained in the amplifying information is dependent on the device type.
Examples
1. CLI> SHOW DEVICES
Name      Type   Port Targ Lun  Used by
DI0       disk   1    0    0    D100
DI1       disk   1    1    0    D110
TAPE110   tape   3    1    0    T110
TAPE130   tape   3    3    0    T130
CDROM230  cdrom  2    3    0    D623
CDROM240  cdrom  2    4    0    D624
A basic listing of devices attached to the controller.
2. CLI> SHOW DEVICES FULL
Name      Type   Port Targ Lun  Used by
DI0       disk   1    0    0    D100
          DEC RZ35 (C) DEC X388
DI1       disk   1    1    0    D110
          DEC RZ26 (C) DEC T386
TAPE110   tape
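When the output of SHOW DEVICES is recorded for later reference, it can help to turn the listing into structured records. The Python sketch below is a hypothetical illustration and not part of the controller software. It assumes the six-column layout seen in the example in excerpt 379 (Name, Type, Port, Targ, Lun, Used by) and simply skips the header and the indented FULL detail lines.

```python
def parse_show_devices(listing):
    """Parse a captured SHOW DEVICES listing into a list of dicts.

    Assumption: each device row has a name, a type, and three numeric
    address fields (port, target, LUN), optionally followed by the unit
    that uses the device. Other lines are ignored.
    """
    devices = []
    for line in listing.splitlines():
        fields = line.split()
        if len(fields) < 5 or not all(f.isdigit() for f in fields[2:5]):
            continue  # header, detail, or blank line
        devices.append({
            "name": fields[0],
            "type": fields[1],
            "port": int(fields[2]),
            "target": int(fields[3]),
            "lun": int(fields[4]),
            "used_by": fields[5] if len(fields) > 5 else None,
        })
    return devices


SAMPLE = """\
Name      Type   Port Targ Lun  Used by
DI0       disk   1    0    0    D100
          DEC RZ35 (C) DEC X388
TAPE110   tape   3    1    0    T110
CDROM230  cdrom  2    3    0    D623
"""

if __name__ == "__main__":
    for device in parse_show_devices(SAMPLE):
        print(device["name"], device["port"], device["target"], device["lun"])
```

The parser keys on the numeric Port, Targ, and Lun columns rather than on fixed column positions, since the exact spacing of the terminal output is not known from this excerpt.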
380. ing to them.
00000065  A CI ID or CI CNF packet transmitted by the thread on behalf of Host Interconnect Services could not be successfully transmitted.
00010009  VC closed due to CI ID request failure.
00020009  VC closed due to unexpected SCS state.
00030009  VC closed due to CI START failure.
00040009  VC closed due to CI STACK failure.
00050009  VC closed due to PPD ACK failure.
00060009  VC closed due to PPD NODE_STOP or PPD START message received.
00070009  VC closed due to NAK ADP retry CI ID transmit failure.
00080009  VC closed due to NAK ADP retry transmit failure.
00090009  VC closed due to NOR DDL retry transmit failure on Path A.
000A0009  VC closed due to NOR DDL retry transmit failure on Path B.
000B0009  VC closed due to NOR ADP retry CI ID transmit failure.
000C0009  VC closed due to NOR ADP retry transmit failure.
000D0009  VC closed due to NAK DDL retry transmit failure on Path A.
000E0009  VC closed due to NAK DDL retry transmit failure on Path B.
000F0009  VC closed due to arbitration timeout on Path A.
00100009  VC closed due to arbitration timeout on Path B.
00110009  VC closed due to Path A off.
00120009  VC closed due to Path B off.
00130009  VC closed due to dual receive.
00140009  VC closed due to invalid receive data structure state.
00150009  VC closed due to no path.
00160009  VC closed due to message transmit closed.
00170009  VC closed due to data transmit closed.
00180009  VC closed due to message scan.
00190009  V
381. ining two serial asynchronous transceiver circuits.
DUP: Diagnostic and Utility Protocol. Host application software that allows a host operator terminal to connect to the controller's command line interpreter. See also virtual terminal.
ECC: One or more cyclic redundancy check (CRC) words that allow detection of a mismatch between transmitted and received data in a communications system, or between stored and retrieved data in a storage system. The ECC allows for location and correction of an error in the received/retrieved data. All ECCs have limited correction power.
EDC: One or more checksum words that allow detection of a mismatch between transmitted and received data in a communications system, or between stored and retrieved data in a storage system. The EDC has no data correction capability.
EIP: Error information packet. The EIP includes bytes of data meant to be decoded into information explaining error events.
electromagnetic interference: See EMI.
electrostatic discharge: See ESD.
EMI: Electromagnetic interference. The impairment of a signal by an electromagnetic disturbance.
error correction code: See ECC.
error detection code: See EDC.
error information packet: See EIP.
ESD: Electrostatic discharge. The discharge of a potentially harmful static electric voltage as a result of improper grounding.
EXEC: Firmware executive. EXEC is the portion of HS controller firmware that acts as the operating system for the
382. intenance terminal cable, if attached.
10. Loosen the four screws (refer to Figure 7-3) on each side of the front bezel with a 3/32-inch Allen wrench (HSJ series controllers) or flat-head screwdriver (HSD and HSZ series controllers).
11. Use a gentle up-and-down rocking motion to loosen the module from the shelf backplane.
12. Slide the module out of the shelf (noting which rails the module was seated in) and place it on an approved ESD work surface or mat.
13. If necessary, you may now remove the cache module as described in Section 7.2.3.
7.1.4.4 Module Replacement/Installation
Use the following procedure to replace the controller module:
Replace the cache module now, if you removed it. Refer to Section 7.2.4.
Make sure the OCP cable (HSJ series only) is correctly plugged into the underside of the module (refer to Figure 7-5).
Slide the controller module into the shelf using its slot's rightmost rails as guides (refer to Figure 7-6).
Use a gentle up-and-down rocking motion to help seat the module into the backplane. Press firmly on the module until it is seated. Finally, press firmly once more to make sure the module is seated.
Tighten the four screws on the front bezel using a 3/32-inch Allen wrench (HSJ series controllers) or flat-head screwdriver (HSD and HSZ series controllers).
Connect a maintenance terminal to the MMJ of the new controller. Be
383. inutes.
Note: Do not remove the controller with the blinking green LED (reset button).
You have 5 minutes to remove the controller, following the steps described in Table 7-5. Your terminal will update you with the time remaining to complete the removal procedure, as shown in the following example:
Time remaining 4 minutes 40 seconds
Note: If you fail to remove the controller within five minutes, the subsystem will restart the quiesced ports and you will have to begin this procedure again.
Table 7-5 Module Removal
Step  Description
1  Ground yourself to the cabinet grounding stud (refer to Figure 7-1).
2  Unlock and open the cabinet doors (SW800 series) using a 5/32-inch Allen wrench.
3  Unsnap and discard the program card EMI shield, if attached (refer to Figure 7-2).
4  Remove the program card by pushing the eject button (refer to Figure 7-3) next to the card. Pull the card out and save it for use in the replacement controller module.
5  Loosen the captive screws on the host interface (CI) cable connector (refer to Figure 7-3) with a flat-head screwdriver and remove the cable from the front of the controller module.
6  Loosen the four screws (refer to Figure 7-3) on each side of the front bezel with a 3/32-inch Allen wrench.
7  Use a gentle up-and-down rocking motion to loosen the module from the shelf backplane.
8  Slide the module out of the shelf (noting which rails the module was seated in) and pl
384. iption of the component identified by nn.
Table C-14 SCSI ASC/ASCQ Codes for Sequential Access Devices (such as magnetic tape)
ASC Code  ASCQ Code  Description
00  00  No additional sense information
00  01  Filemark detected
00  02  End of partition/medium detected
00  03  Setmark detected
00  04  Beginning of partition/medium detected
00  05  End of data detected
00  06  I/O process terminated
03  00  Peripheral device write fault
03  01  No write current
03  02  Excessive write errors
04  00  Logical unit not ready, cause not reportable
04  01  Logical unit is in process of becoming ready
04  02  Logical unit not ready, initializing command required
04  03  Logical unit not ready, manual intervention required
04  04  Logical unit not ready, format in progress
07  00  Multiple peripheral devices selected
08  00  Logical unit communication failure
08  01  Logical unit communication time-out
08  02  Logical unit communication parity error
09  00  Track following error
0A  00  Error log overflow
0C  00  Write error
11  00  Unrecovered read error
11  01  Read retries exhausted
11  02  Error too long to correct
11  03  Multiple read errors
11  08  Incomplete block read
11  09  No gap found
11  0A  Miscorrected error
14  00  Recorded entity
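A table such as Table C-14 maps naturally onto a lookup keyed by the (ASC, ASCQ) pair. The Python sketch below is a minimal illustration: it carries only a handful of rows copied from the table in excerpt 384, and the dictionary and function names are hypothetical.

```python
# A few rows from Table C-14, keyed by (ASC, ASCQ). Not the complete table.
ASC_ASCQ_SEQUENTIAL = {
    (0x00, 0x00): "No additional sense information",
    (0x00, 0x01): "Filemark detected",
    (0x00, 0x05): "End of data detected",
    (0x04, 0x01): "Logical unit is in process of becoming ready",
    (0x08, 0x01): "Logical unit communication time-out",
    (0x11, 0x00): "Unrecovered read error",
}


def describe_sense(asc, ascq):
    """Return the table description for an ASC/ASCQ pair, if listed."""
    return ASC_ASCQ_SEQUENTIAL.get(
        (asc, ascq), f"ASC/ASCQ {asc:02X}/{ascq:02X} not in this subset"
    )


if __name__ == "__main__":
    print(describe_sense(0x04, 0x01))
    print(describe_sense(0x3B, 0x00))
```

Keying on the pair rather than on the ASC alone matters because several ASC values (00, 04, 08, 11) appear in the table with more than one ASCQ qualifier.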
385. iption of this field.
Figure C-26 Disk Transfer Error Event Log (Template 51) Format
(Longword layout, bits 31 to 0. Fields shown: command reference number, sequence number, unit number, event code, flags, format, controller identifier, unit identifier, reserved, event time, ancillary information, device identification, device serial number, info, snsflgs, keyspec, frucode.)
The values that can be reported in this field for this event log are shown in Table C-28.
templ: See Section C.2.1 for the description of this field. This field contains the value 51 for this event log.
tdisize: See Section C.2.1 for the description of this field. This field contains the value 3C for this event log.
reserved (offset 32): This field contains the value 0.
event time: See Section C.2.1 for the description of this field.
ancillary information: The format of this field varies depending on whether the event being reported is associated with a command issued by a host system or one issued by an HSJ30/40 controller firmware component. If the event is associated with a command issued by a host system, this field is formatted as described in Section C.2.2.2. If the event is associated with a command issued by an HSJ30/40 controller firmware component, this field is formatted as described in Section C.2.2.3.
device locator, devtype, device identification, device serial number: See Section C.2.2.4 for t
386. ire horizontal alignment. If desired, vertical shelf locations can be used for most disk drives. Refer to the device-specific documentation for requirements. Any of the vertical shelves can be used. However, Digital recommends surrendering controller positions C4, then C3, first for storage shelves. Refer to Figure 3-1.
Configuration Rules and Restrictions 3-5
3.2.2 SW500 series Cabinets
The rules presented in this section apply to subsystem configurations in SW500 series cabinets. Refer to the StorageWorks Solutions SW500 Series Cabinet Installation and User's Guide for more details.
Figure 3-4 shows the loading sequence for storage and controller shelves in an SW500 series cabinet.
Figure 3-4 SW500 Series Cabinet Loading
(Figure labels: mounting locations at holes 2, 8, 14, 20, 26, and 32; storage positions S9, S4, S3, S1, and S2; controller position C1.)
387. irst half-height device is normally mounted in the lower part of the carrier. The second device is normally mounted in the upper part of the carrier.
HBVS: Host Based Volume Shadowing. Also known as Phase 2 Volume Shadowing.
HBVS assistance: RAID level 1a. The HS controller performs HBVS assistance by independently directing shadow copy operations that were requested by the host between two units under the given controller.
Hierarchical Storage Controller: See HSC.
HIS: Host Interconnect Services. The firmware that communicates with the host in HS family controllers.
host: The primary or controlling computer to which a storage subsystem is attached.
Host Based Volume Shadowing: See HBVS.
Host Interconnect Services: See HIS.
host logical unit: A virtual group of devices addressable as a unit. See also logical unit.
hot swap: A method of device replacement whereby the complete system remains on line and active during device removal and reinstallation. The device being removed or reinstalled is the only device that cannot perform operations during this process.
HSC: Hierarchical Storage Controller. An intelligent mass storage server used on the CI bus. Capable of supporting a total of eight disk and/or tape data channels, the HSC is part of the System Interconnect Architecture and Digital Storage Architecture. By performing as an I/O manager, the HSC can be classified as an I/O server, removing the burden of I/O management from
388. is enabled and disabled through the Command Line Interpreter (CLI).

Unable to continue, run time expired
Explanation: A continue response was given to the "reuse parameters" question. This is not a valid response if the run time has expired. Reinvoke DILX.

When DILX starts to exercise the disk units, the following message is displayed with the current time of day:

DILX testing started at xx:xx:xx
Test will run for x minutes
Type ^T (if running DILX through a VCS) or ^G (in all other cases) to get a current performance summary
Type ^C to terminate the DILX test prematurely
Type ^Y to terminate DILX prematurely

6.4.6 DILX Sense Data Display

To interpret the sense data fields correctly, refer to the SCSI-2 specifications. Example 6-16 is an example of a DILX sense data display.

Example 6-16 DILX Sense Data Display
Sense data in hex for unit x
Sense Key x
Sense ASC x
Sense ASQ x
Instance x

6.4.7 DILX Deferred Error Display

Example 6-17 is an example of a DILX deferred error display.

Example 6-17 DILX Deferred Error Display
Deferred error detected, hard error counted against each unit
Sense Key x
Sense ASC x
Sense ASQ x
Instance x

6.4.8 DILX Data Patterns

Table 6-8 defines the data patterns used with the DILX Basic Function or User-Defined tests. There are 18 unique data patterns. These data patterns were selected as worst case, or the ones most likely to produc
389. iscriptor (DWD) was supplied with a NULL Physical Unit Block (PUB) pointer.

Table C-35 (Cont.) Device Services Last Failure Codes

Code      Description

03320101  An invalid code was passed to the error recovery thread in the error_stat field of the PCB.
          • Last Failure Parameter 0 contains the PCB error_stat code.

03330188  A parity error was detected by a 710 while sending data out onto the SCSI bus.
          • Last Failure Parameter 0 contains the PCB reg710_ptr value.
          • Last Failure Parameter 1 contains the PCB copy of the 710 TEMP register.
          • Last Failure Parameter 2 contains the PCB copy of the 710 DBC register.
          • Last Failure Parameter 3 contains the PCB copy of the 710 DNAD register.
          • Last Failure Parameter 4 contains the PCB copy of the 710 DSP register.
          • Last Failure Parameter 5 contains the PCB copy of the 710 DSPS register.
          • Last Failure Parameter 6 contains the PCB copies of the 710 SSTAT2/SSTAT1/SSTAT0/DSTAT registers.
          • Last Failure Parameter 7 contains the PCB copies of the 710 LCRC/RESERVED/ISTAT/DFIFO registers.

03350188  The TEA (bus fault) signal was asserted into a 710.
          • Last Failure Parameter 0 contains the PCB reg710_ptr value.
          • Last Failure Parameter 1 contains the PCB copy of the 710 TEMP register.
          • Last Failure Parameter 2 contains the PCB copy of the 710 DBC register.
          • Last Failure Parameter 3 con
390. istributing ac power in a cabinet.

ac power supply: A power supply designed to produce dc power from an ac input.

adapter: A device that converts the protocol and hardware interface of one bus type into that of another without changing the functionality of the bus. See signal converter.

American National Standards Institute: See ANSI.

ANSI: American National Standards Institute. An organization that develops and publishes electronic and mechanical standards.

array controller: A hardware/software device that facilitates communications between a host and one or more devices organized in an array. The HS controllers are array controllers.

BA350-Mx controller shelf: The StorageWorks controller shelf used for HS family controller modules, cache modules, and shelf power units.

BA350-Sx SBB shelf: A StorageWorks shelf used for only power units and SBBs.

bad block: A block containing a defect that:
• Exceeds the correction capability of the subsystem error correction scheme.
• Exceeds a drive-specified error threshold. Once a block exceeds this threshold, data integrity is not guaranteed.
• Imposes too great a strain on system performance. In this case, the subsystem still assures data integrity, but the extensive error correction required for each block access causes too great a strain on system performance.

bad block replacement: See BBR.

battery backup unit: See BBU.

BBR: Bad block replacement. BB
391. it is broken and reestablished.

For manual configuration, the following steps add devices, storage sets, and logical units. Use the CLI to complete these steps so that the host will recognize the storage device. These steps can be run from a virtual terminal.

Add the physical devices by using the following command:

CLI> ADD device-type device-name SCSI-location

For example:

CLI> ADD DISK DISK100 1 0 0
CLI> ADD TAPE TAPE510 5 1 0
CLI> ADD CDROM CDROM0 6 0 0

where:
device-type is the type of device to be added. This can be DISK, TAPE, or CDROM.
device-name is the name used to refer to that device. The name is referenced when creating units or storage sets.
SCSI-location is the port, target, and LUN (PTL) for the device. When entering the PTL, at least one space must separate the port, target, and LUN.

Add the storage sets for the devices. See Appendix B for examples of adding storage sets. If you do not want storage sets in your configuration, skip this step.

CAUTION: The INITIALIZE command destroys all data on a container. See Appendix B for specific information on this command.

Enter the following command to initialize the containers (devices, storage sets, or both) prior to adding logical units to the configuration:

CLI> INITIALIZE container-name

where container-name is a device or storage set that will become part of a unit. When initializing a single device con
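The ADD syntax described in this excerpt (a device type, a device name, then a space-separated port, target, and LUN) can be checked mechanically before a command is typed at the CLI. The following is a minimal sketch only; the function name parse_add_device and its return shape are hypothetical and are not part of the controller firmware or any Digital tool.

```python
# Hypothetical helper: splits an ADD command into its three documented parts.
# It does not talk to a controller; it only checks the shape of the text.
VALID_DEVICE_TYPES = {"DISK", "TAPE", "CDROM"}  # types named in the excerpt


def parse_add_device(command):
    """Return (device_type, device_name, (port, target, lun))."""
    parts = command.split()
    if len(parts) != 6 or parts[0].upper() != "ADD":
        raise ValueError("expected: ADD device-type device-name port target lun")
    device_type = parts[1].upper()
    if device_type not in VALID_DEVICE_TYPES:
        raise ValueError("device type must be DISK, TAPE, or CDROM")
    # The PTL fields must be separated by at least one space; split() above
    # already tolerates extra spaces between them.
    port, target, lun = (int(field) for field in parts[3:])
    return device_type, parts[2], (port, target, lun)


if __name__ == "__main__":
    print(parse_add_device("ADD DISK DISK100 1 0 0"))
```

The sketch deliberately does no range checking on the port, target, or LUN, because the excerpt does not state the valid ranges.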
392. ith firmware Version 1.4.

2. See Tables 7-1 through 7-4 to find and order the part number you need for the upgrade.

Table 7-1 Cache Upgrade, HSJ40 Controller
Current Cache   Desired Cache   Option Required
16 MB           32 MB           HSJ40-XE Ver 1 or 2

Table 7-2 Cache Upgrade, HSJ30 Controller
Current Cache   Desired Cache   Option Required
None            16 MB           HSJ30-XD
None            32 MB           HSJ30-XF
16 MB           32 MB           HSJ30-XE

Table 7-3 Cache Upgrade, HSD30 Controller
Current Cache   Desired Cache   Option Required
None            16 MB           HSD30-XD
None            32 MB           HSD30-XF
16 MB           32 MB           HSD30-XE

Table 7-4 Cache Upgrade, HSZ40 Controller
Current Cache   Desired Cache   Option Required
None            16 MB           HSZ40-XD
None            32 MB           HSZ40-XF
16 MB           32 MB           HSZ40-XE

3. If necessary, remove the cache module as described in Section 7.2.3.

4. Insert the upgraded cache module by following the steps in Section 7.2.4.

7.3 Program Card

Whenever you remove a failed controller module (refer to Section 7.1), you remove the PCMCIA program card. However, there are times when you need to remove only the program card, such as when you install updated firmware. You are allowed to remove one or both program cards from a dual-redundant configuration, or one card from a nonredundant configuration.

Note: When you update firmware, you must remove both program cards from a dual-redundant configuration. Furthermore, the two cards in a dual redundant
393. itioning error detected by read of medium
16 00  Data synchronization mark error
17 00  Recovered data with no error correction applied

Table C-13 (Cont.) SCSI ASC/ASCQ Codes for Direct Access Devices (such as magnetic disk)

ASC   ASCQ
Code  Code  Description
17    01    Recovered data with retries
17    02    Recovered data with positive head offset
17    03    Recovered data with negative head offset
17    05    Recovered data using previous sector ID
17    06    Recovered data without ECC, data auto-reallocated
17    07    Recovered data without ECC, recommend reassignment
17    08    Recovered data without ECC, recommend rewrite
18    00    Recovered data with error correction applied
18    01    Recovered data with error correction and retries applied
18    02    Recovered data, data auto-reallocated
18    05    Recovered data, recommend reassignment
18    06    Recovered data, recommend rewrite
19    00    Defect list error
19    01    Defect list not available
19    02    Defect list error in primary list
19    03    Defect list error in grown list
1A    00    Parameter list length error
1B    00    Synchronous data transfer error
1C    00    Defect list not found
1C    01    Primary defect list not found
1C    02    Grown defect list not found
1D    00    Miscompare during verify operation
1E    00    Recovered ID with ECC correction
20    00    Invalid command operation code
21    00    Logical block address out of range
22    00    Illegal function (should use 0
394. ives and names the drive. This command must be used when a new SCSI-2 tape drive is to be added to the configuration.

Examples

CLI> ADD TAPE TAPE0 1 0 0
A tape drive is added to port 1, target 0, LUN 0, and named TAPE0.

ADD UNIT

Adds a logical unit to the controller.

Format

ADD UNIT unit-number container-name

Parameters

unit-number (HSJ and HSD only)
The device type letter followed by the logical unit number that the host will use to access the unit. The device type letter is either D for disk devices (including CDROMs) or T for tape devices. Using this format, logical unit 3, which is made up of a disk or disks (such as a stripeset), would be specified as D3. Logical unit 7, which is made up of a tape device, would be T7.

unit-number (HSZ only)
The unit number determines both the target (0 through 7) and the LUN from which the device will be made available. The 100s place of the unit number is the target and the 1s place is the LUN. For example, D401 would be target 4, LUN 1; D100 would be target 1, LUN 0; D5 would be target 0, LUN 5.

Note: Only target numbers that were previously specified in the SET THIS_CONTROLLER ID=(n1,n2) command may be used in the unit number. A target number that has not been specified by the SET THIS_CONTROLLER ID command may not be used.

container-name
The name of the container that will be u
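The HSZ unit-number rule in this excerpt (hundreds place is the target, ones place is the LUN) is simple arithmetic, and a short sketch makes the three documented examples easy to verify. The function name decode_hsz_unit is hypothetical. The sketch assumes the unit name starts with the letter D, as in every HSZ example in the excerpt, and it returns everything below the hundreds place as the LUN because the excerpt does not say how the tens place is used.

```python
# Hypothetical helper illustrating the documented HSZ unit-number rule.
def decode_hsz_unit(unit):
    """Split an HSZ unit number such as 'D401' into (target, lun)."""
    if len(unit) < 2 or unit[0].upper() != "D" or not unit[1:].isdigit():
        raise ValueError("expected a unit number such as D401")
    number = int(unit[1:])
    # Hundreds place selects the target; the remainder is taken as the LUN.
    target, lun = divmod(number, 100)
    if not 0 <= target <= 7:
        raise ValueError("target must be 0 through 7")
    return target, lun


if __name__ == "__main__":
    for name in ("D401", "D100", "D5"):
        print(name, decode_hsz_unit(name))
```

Whether a decoded target is actually usable still depends on the IDs set with SET THIS_CONTROLLER ID, which this sketch does not model.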
395. l Errs Hard Cnt 0 Soft Cnt 1 O unit 55 Total IO Requests 2017193 Err in Hex 1C 03094002 PTL 05 05 00 Key 01 ASC Q 18 89 HC 0 SC 1 Err in Hex 1C 03094002 PTL 05 05 00 Key 01 ASC Q 18 86 HC 0 SC 1 o Total Errs Hard Cnt 0 Soft Cnt 2 where O Represents the unit number and the total 1 0 requests to this unit Diagnostics Exercisers and Utilities 6 63 O Represents the unit number and total I O requests to this unit All values for the following codes are described in Appendix E This also includes the following items associated with this error and the total number of hard and soft errors for this unit e The HSZ series Instance code in hex e The Port Target LUN PTL The SCSI Sense Key e The SCSI ASC and ASQ ASC Q codes e The total hard and soft count for this error O Represents information about the first two unique errors for this unit All values for the following codes are described in Appendix E This also includes the following items associated with this error and the total number of hard and soft errors for this unit e The HSZ series Instance code in hex e The Port Target LUN PTL e The SCSI Sense Key e The SCSI ASC and ASQ ASC Q codes e The total hard and soft count for this error A line of this format may be displayed up to three times in a performance summary There would be a line for each unique error reported to DILX for up to three errors for each unit O Represents the total h
396. l information specific to the failure being reported The content of the parameters supplied if any are described in the individual last failure code descriptions contained in Tables C 33 through C 48 C 2 3 2 Failover Event Log Template 05 The HSJ30 40 controller Failover Control firmware component reports errors and other conditions encountered during redundant controller communications and failover operation via the Failover Event Log The Failover Event Log will be sent to all host systems that have enabled Miscellaneous error logging on a connection or connections established with the HSJ30 40 controller s Disk and or Tape MSCP Server The Failover Event Log is reported via the T MSCP Controller Errors error log message format The format of this event log including the HSJ30 40 controller specific fields is shown in Figure C 17 HSJ Series Error Logging C 25 Figure C 17 Failover Event Log Template 05 Format 31 0 controller identifier instance code tdisize templ reserved event time last failure code last failure parameters Failover Event Log Format Specific Fields format This field contains the value 00 that is T MSCP Controller Errors error log format code event code The values that can be reported in this field for this event log are shown in Table C 19 reserved offset 16 This field contains the value 0 C 26 HSJ Series Error Logging instance code See Section C
397. l of the policy processor. Each SCSI-2 port can have up to six or seven attached devices, depending on controller configuration (dual-redundant and nonredundant, respectively). In a dual-redundant configuration, subsystem availability improves because each controller has access to the other controller's devices.

2.1.11 Cache Module

The HS controllers can run with a companion read cache module, available in 16 or 32 MB.

2.1.11.1 Common Cache Functions

The HS controller cache module increases the controller I/O performance. During normal operation, a host read operation accesses data either from the fast memory of the cache module or from an I/O device. If a host read is a cache hit (data already in the cache), the data are supplied to the host immediately, improving I/O performance by reducing latency. If the host read is a cache miss (data not in the cache), the HS controller accesses the appropriate disk to satisfy the request. Then the controller reads the data, returns it to the host, and writes it to the cache.

Cache entry sizes are fixed at 64 KB (128 logical blocks) for each logical unit. Read caching is enabled by default, but can be optionally disabled using the CLI Logical Unit SET command on a per-unit basis (see Appendix B).

The data replacement algorithm is a least recently used (LRU) replacement algorithm. When the cache is full and new data must be written, the LRU algorithm removes the oldest resident cached da
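The read path and LRU replacement described in this excerpt can be illustrated with a small model. This is a sketch of the general technique only, not the controller firmware: the class name ReadCache, the read_from_disk callback, and the use of an ordered dictionary are all assumptions made for illustration. The only values taken from the excerpt are the 128 logical blocks per cache entry and the hit, miss, and least-recently-used behavior.

```python
from collections import OrderedDict

BLOCKS_PER_ENTRY = 128  # from the excerpt: one 64 KB entry covers 128 logical blocks


class ReadCache:
    """Toy read cache with least-recently-used replacement (illustrative only)."""

    def __init__(self, max_entries, read_from_disk):
        self.max_entries = max_entries
        self.read_from_disk = read_from_disk  # hypothetical callback: (unit, entry) -> data
        self.entries = OrderedDict()  # oldest entry first, most recent last
        self.hits = 0
        self.misses = 0

    def read(self, unit, lbn):
        key = (unit, lbn // BLOCKS_PER_ENTRY)
        if key in self.entries:
            # Cache hit: serve from the cache and mark the entry as recently used.
            self.hits += 1
            self.entries.move_to_end(key)
            return self.entries[key]
        # Cache miss: go to the device, then store the data in the cache.
        self.misses += 1
        data = self.read_from_disk(unit, key[1])
        if len(self.entries) >= self.max_entries:
            self.entries.popitem(last=False)  # evict the least recently used entry
        self.entries[key] = data
        return data


if __name__ == "__main__":
    cache = ReadCache(2, lambda unit, entry: f"{unit}:{entry}")
    for lbn in (0, 5, 128, 0, 256, 128):
        cache.read("D1", lbn)
    print(cache.hits, cache.misses)
```

In the usage lines, blocks 0 and 5 share one cache entry, so the second read is a hit; reading block 256 with the cache full evicts the entry that was used least recently.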
398. l redundant subsystem, it is recommended that you pick option 2 on the first controller and then option 2 on the other controller.

Auto-Configure options are:
1. Configure all disk units for testing. This is recommended for a single-controller subsystem.
2. Configure half of all disk units for testing. This is recommended for a dual-controller subsystem.
3. Exit Auto-Configure and DILX.

Enter Auto-Configure option (1:3) [3]?
Explanation: This is self-explanatory. After you enter the desired Auto-Configure option, DILX displays the following caution statement:

CAUTION: All data on the Auto-Configured disks will be destroyed. You MUST be sure of yourself.

Are you sure you want to continue (y/n) [n]?
Explanation: This question is asked only if the Auto-Configure option was selected and the user selected Auto-Configure option 1 or 2, as described in the previous question.

Use All Defaults and Run in Read Only Mode (y/n) [y]?
Explanation: Enter Y to use the defaults for DILX; DILX then runs in read-only mode and most of the other DILX questions are not asked. Enter N and the defaults are not used. You must then answer each question as it is displayed. The following defaults are assumed for all units selected for testing:
• Execution time limit: 10 minutes.
• Performance summary interval: 10 minutes.
• Displaying hard or soft error Event Information Packets (EIPs) and end messages is disabled.
• The hard error
399. lable as a controller self-test function for the operator.

4.1.2 Dual-Redundant Configuration Initialization

The controllers in a dual-redundant configuration run the same initialization sequence that is described in Chapter 6, except they exchange signals during their individual initialization sequences. The first signal occurs after one controller starts initializing. The signal informs the other controller that an initialization is occurring. This way, the other controller will not assume that the initializing controller is not functioning and will not attempt to disable it.

4.1.3 Subsystem Initialization

Full StorageWorks subsystem initialization takes place when the subsystem is switched on for the first time. In the event of a reset due to one of the following conditions, a subset of the initialization sequence is run:
• A partial or complete power failure
• Equipment failure
• An error condition

A complete StorageWorks subsystem initialization includes the following:

1. When the subsystem is turned on, all shelves in the subsystem are reset. Then entities in the shelves, including storage devices, controllers, and cache modules, run their initialization and self-test sequences.

2. During initialization, the controller interrogates the entities with which it has connections, including other controllers in the subsystem.

3. When the initialization sequence on all entities is completed, the controller begin
400. late 31 Instance MSCP Event Codes CI Port Port Driver Event Log Template 32 Instance MSCP Event Codes usse ed oeque do bes dedo RE oe do Dos Ug Ye odo CI System Communication Services Event Log Template 33 Instance MSCP Event Codes lese een Device Services Nontransfer Error Event Log Template 41 Instance MSCP Event Codes llle eee Disk Transfer Error Event Log Template 51 Instance MSCP Event Codes oto Ree Rescue RN e e ded UR e RA Re he Disk Bad Block Replacement Attempt Event Log Template 57 Instance MSCP Event Codes llle ees Tape Transfer Error Event Log Template 61 Instance MSCP Event Codes wi iE PE et e bte NR eee ls edis eb eta eet Media Loader Error Event Log Template 71 Instance MSCP Event rH Disk Copy Data Correlation Event Log event dependent information Valles gue ga A v Panes p EN Bae Mnt y Saws Executive Services Last Failure CodesS o oooooooo Value Added Services Last Failure Codes oooooo o Device Services Last Failure Codes o oooooooo ooo Fault Manager Last Failure Codes o Dual Universal Asynchronous Receiver Transmitter Services Last Failure Codes lr o dead aseado Failover Control Last Failure Codes oooooooo oo o Nonvolatile Parameter Memory Failover Control Last Failure A A NN Command Line Interpreter Last Failure Codes Host Interco
401. le that different versions of the controller firmware will have different threads or different names for the threads.

Table 6-13 Thread Description

Thread Name  Description
CLI          A local program that provides an interface to the controller's command line interpreter thread.
CLIMAIN      The command line interpreter (CLI) thread.
CONFIG       A local program that locates and adds devices to an HS array controller configuration.
DILX         A local program that exercises disk devices.
DIRECT       A local program that returns a listing of available local programs.
DS_0         Device error recovery management thread.
DS_1         The thread that handles successful completion of physical device requests.
DS_HB        The thread that manages the device and controller error indicator lights and port reset buttons.
DUART        The console terminal interface thread.
DUP          The DUP protocol server thread.
FMTHREAD     The thread that performs error log formatting and fault reporting for the controller.
FOC          The thread that manages communication between the controllers in a dual-controller configuration.
HIS          The SCS protocol interface thread for CI and DSSI controllers.
HPT          The thread that handles interaction with the host port logic and PPD protocol for CI and DSSI controllers.
MSCP         The MSCP and TMSCP protocol server thread.
NULL         The process that is scheduled when no other process can be run.
NVFOC        The thread that initiates stat
REMOTE
RMGR
402. ler

Table C-39 (Cont.) Nonvolatile Parameter Memory Failover Control Last Failure Codes

Code      Description
080E0101  An out-of-range receiver ID was received by the NVFOC communication utility (master send to slave send ACK). Last Failure Parameter 0 contains the bad ID value.
080F0101  An out-of-range receiver ID was received by the NVFOC communication utility (received by master). Last Failure Parameter 0 contains the bad ID value.
08100101  A call to NVFOC TRANSACTION had a from field (id) that was out of range for the NVFOC communication utility. Last Failure Parameter 0 contains the bad ID value.
08110101  NVFOC tried to defer more than one FOC send. Last Failure Parameter 0 contains the master ID of the connection that had the multiple delays.
08120100  Unable to lock other controller's NVmemory despite the fact that the running and handshake_complete flags are set.
08130100  Could not allocate memory to build a callback context block on an unlock NVmemory call.
08140100  Could not allocate memory to build a workblock to queue to the NVFOC thread.
08150100  A lock was requested by the other controller, but the memory is already locked by the other controller.
08160100  A request to clear the remote configuration was received, but the memory was not locked.
08170100  A request to read the next configuration was received, but the memory was not locked.
08180100  Could not g
403. ler module O O O Replace controller module L off AM lit continuously 15 The controller DRAB chip failed to report forced failed ECC L 16 The controller DRAB chip failed some operation in the reporting validating and testing of the multibit ECC memory error ld d 17 The controller DRAB chip failed some operation in the reporting validating and testing of the multiple single bit ECC memory error Ol L 18 The controller main memory did not write correctly in one or more sized memory transfers L d 19 The controller did not cause an I to N bus timeout when accessing a reset host port chip ld L 1A The controller DRAB chip did not report an I to N bus timeout when accessing a reset host port chip d 1B The controller DRAB did not interrupt the controller processor when expected L EL 1C The controller DRAB did not report an NXM error when nonexistent memory was accessed L d 1D The controller DRAB did not report an address parity error when one was forced 4 1E There was an unexpected nonmaskable interrupt from the controller DRAB during the DRAB memory test L L 20 Therequired amount of memory available for the code image to be loaded from the program card is insufficient O 21 The required amount of memory available in the pool area is insufficient for the controller to run ld d 23 The required amount of memory available in the buffer area is insufficient for the controller
404. ll DSSI based controllers covered in this manual as listed in Table 1 1 This refers to all SCSI based controllers covered in this manual as listed in Table 1 1 Xix Manufacturer s Declarations CAUTION This is a class A product In a domestic environment this product may cause radio interference in which case the user may be required to take adequate measures ACHTUNG Dieses ist ein Ger t der Funkst rgrenzwertklasse A In Wohnbereichen k nnen bei Betrieb dieses Ger tes Rundfunkst rungen auftreten in welchen Fallen die Benutzer f r entsprechende Gegenma nahmen verantwortlich sind ATTENTION Ceci est un produit de Classe A Dans un environment domestique ce produit risque de cr er des interf rences radi lectriques il appartiendra alors l utilisateur de prendre les mesures sp cifiques appropri es F r Bundesrepublik Deutschland For Federal Republic of Germany Pour la R publique f derale d Allemagne Hochfrequenzger tezulassung und Betriebsgenehmigung Bescheinigung des Herstellers Importeurs Hiermit wird bescheinigt da die Einrichtung in bereinstimmung mit den Bestimmungen der DBP Verf gung 523 1969 Amtsblatt 113 1969 und Grenzwertklasse A der VDE0871 funkenst rt ist Das Bundesamt f r Zulassungen in der Telekommunikation der Deutschen Bundespost DBP hat diesem Ger t eine FTZ Serienpr fnummer zugeteilt Betriebsgenehmigung Hochfrequenzger te
405. ll not be placed in a dual-redundant configuration. You should both logically and physically reconfigure the drives so that target 6 is not used.

Error 6100: Allocation classes cannot be zero for a dual-redundant configuration. Set MSCP and TMSCP allocation classes to non-zero.
Explanation: In a dual-redundant configuration, the allocation class must not be set to zero.

Error 6110: This controller already in failover mode. You must issue a SET NOFAILOVER command first.
Explanation: A SET FAILOVER cannot be entered on a controller already in failover.

Error 6120: Other controller already in failover mode. You must issue a SET NOFAILOVER command first.
Explanation: A SET FAILOVER command was entered and, although this controller was not configured for dual redundancy, the other controller was.

Error 6170: An <controller type> and <controller type> cannot configured for failover
Explanation: Two different controllers (such as an HSJ and an HSZ) cannot be configured for failover. Replace the other controller with the same model as this one and reenter the command.

Error 9000: Cannot rename a unit
Explanation: Only devices and storage sets may be renamed. If you attempt to rename a unit, this message results.

Error 9010: <name> is an illegal name, it must be from 1 to 9 characters
Explanation: This error results from an ADD command with an illegal name given.

Error 9020: <name> is an illegal name, it
406. ller SW ver Controller HW ver ulti Unit Code emory Address Instance Template Type Requestor Information Size xx KKM MK MM KM X OX Specific Data bytes 0 7 Specific Data bytes 8 15 Requestor Specific Data bytes xx xx Requestor Specific Data bytes 0 7 Requestor Specific Data bytes 8 15 Requestor Specific Data bytes xx xx X X X XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX X X X XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX XX Diagnostics Exercisers and Utilities 6 19 Example 6 4 Disk Transfer Error Error Information Packet in hex Cmd Ref Number Unit Number Log Sequence Format Flags Event Code Controller ID Controller SW ver Controller HW ver Multi Unit Code Unit ID 0 Unit ID 1 Unit Software Rev Unit Hardware Rev Recovery Level Retry Count Serial Number Header Code Instance X Template Type X Requestor Information Size x Requestor Specific Data bytes 0 7 XX XX XX XX XX XX XX XX Requestor Specific Data bytes 8 15 xXx XX XX XX XX XX XX XX tick Gt ct x x MMMM KKM v4 DX v4 MM v4 KM MK X Requestor Specific Data bytes xx XX XX XX XX XX XX XX XX XX Example 6 5 Bad Block Replacement Attempt Error Error Information Packet in hex Cmd Ref Number Unit Number Log Sequence Format Flags Event Code Controller ID Controller SW ver Controller HW ver Multi Unit Code Unit ID 0 Unit ID 1 Unit Software Rev Unit Hardware Rev
407. losed in quotes with an alphabetic character first. Each SCS node name must be unique within its VMScluster.

4. Enter the following command to set the MSCP allocation class:

CLI> SET THIS_CONTROLLER MSCP_ALLOCATION_CLASS=n

where n is 1 through 255. Digital recommends providing a unique allocation class value for every pair of dual-redundant controllers in the same cluster.

5. Enter the following command to set the TMSCP allocation class:

CLI> SET THIS_CONTROLLER TMSCP_ALLOCATION_CLASS=n

where n is 1 through 255.

CAUTION: The SET FAILOVER command establishes controller-to-controller communication and copies configuration information. Always enter this command on one controller only. COPY=configuration-source specifies where the good configuration data are located. Never blindly specify SET FAILOVER. Know where your good configuration information resides before entering the command.

6. Enter the following command to copy parameters to the other controller (the one you are not connected to):

CLI> SET FAILOVER COPY=THIS_CONTROLLER

Note: Always restart the controllers after setting the ID, SCS node name, or allocation classes.

7. Restart both controllers, either by pressing the green reset buttons or by entering the following commands:

CLI> RESTART OTHER_CONTROLLER
CLI> RESTART THIS_CONTROLLER

See Section 4.9.2 for important information about VMS node names.

8. Enter
408. lover configuration, as well as the other controller if it is reachable. No device configuration information is lost from either controller.

4.10.3 Failing Over

A failed or unresponsive controller in a dual-redundant configuration is disabled by its companion controller. The functioning controller sends a signal to the other controller to induce failover. The functioning controller assumes control of the storage devices that were on line to the disabled controller. Maintenance can now take place on the failed controller.

Failover should normally complete in 30 seconds or less (15 seconds or less for three-port controllers). If there is no outstanding drive I/O activity at the time of controller failure, failover should require substantially less than 30 seconds. If drive I/O is in progress at the time of failure, the surviving controller must reset any SCSI buses with outstanding I/O. These bus resets can require up to 5 seconds per port to complete.

Whenever you need to revive a controller that was disabled, you must enter the following command from a terminal connected to the functioning controller:

CLI> RESTART OTHER_CONTROLLER

Then press the reset button to initialize the controller.

You may test failover by removing the program card from one of the controllers. The other controller will assume service to the dormant controller's devices until you reinsert the program card and reinitialize (restart) the co
409. ly of modular data storage products that allows customers to design and configure their own storage subsystems. Components include power, packaging, cabling, devices, controllers, and software. Customers can integrate devices and array controllers in StorageWorks enclosures to form storage subsystems.

StorageWorks building block: See SBB.

stripeset: In a RAID configuration, a virtual disk drive with its physical data spread across multiple physical disks. Stripeset configurations do not include a data recovery mechanism.

supported device: A device tested as functionally compatible with an approved StorageWorks hardware and software configuration.

surviving controller: The controller in a dual-redundant pair that assumes service to its companion's devices when the companion fails. See also failover.

System Communication Architecture: See SCA.

System Communications Services: See SCS.

Tape Inline Exerciser: See TILX.

Tape Mass Storage Control Protocol: See TMSCP.

target: A member of a SCSI bus responsible for carrying out operations requested by an initiator. The physical storage devices are targets of the HS controller. Also, the HSZ-series controller is a target of its host CPU.

TILX: Tape Inline Exerciser. Diagnostic firmware used to test the data transfer capabilities of tape drives in a way that simulates a high level of user activity.

TMSCP: Tape Mass Storage Control Protocol. An applications
410. ly transferred logical block number Starting logical block number of the HSJ30 40 controller firmware component initiated transfer reserved Reserved for future use currently contains the value 0 C 2 2 4 Device Location Identification Common Fields The fields common to certain event logs generated by the Device Services and Value Added firmware components are shown in Figure C 6 Device Location Identification Common Fields device locator The location within the HSJ30 40 controller s subsystem of the target device involved in the event being reported This field is formatted as shown in Figure C 7 HSJ Series Error Logging C 11 Figure C 6 Device Location Identification Common Fields 3 22 1 4 3 0 device identification device serial number Figure C 7 Device Locator Field Format lun target port Device Locator Specific Subfields port The SCSI bus number to which the target device is connected target The SCSI target number on the port to which the target device is connected lun The logical unit number on the target by which the target device is logically addressed C 12 HSJ Series Error Logging devtype The SCSI device type of the device The various SCSI device types supported by the HSJ30 40 controller are shown in Table C 9 device identification Sixteen bytes of ASCII data as defined by the device vendor in the Product Identification field of the SCSI INQUI
411. me Shield vex ad aa Ws SCSI Device Cables iii a a ti a Replacing a Blower ccc ee eee ene Power Supply Removal llle SBB Warm Swap saciid shoe ha epe eee bh eG wea es Implementation Dependent Information Format Instance Code Format 0 0 ccc eee eee nes CI Host Interconnect Services Common Event Log Fields Host Server Connection Common Fields Byte Count Logical Block Number Common Fields Device Location Identification Common Fields Device Locator Field Format llle SCSI Device Sense Data Common Fields Sense Data Qualifier Field Format llle SCSI Sense Data Byte Zero ercdval Field Format SCSI Sense Data Byte Two snsflgs Field Format Co w Go Qo O 050 0 ON DD 7 8 7 24 7 28 7 30 7 32 7 33 7 35 7 37 7 40 C 7 C 7 C 10 C 11 C 12 C 12 C 14 C 14 C 15 C 17 C 19 C 20 C 21 C 22 C 23 C 24 C 25 C 26 C 27 C 28 C 29 3 4 3 6 3 7 3 8 4 1 4 2 5 1 5 2 5 3 6 1 SCSI Sense Data Byte OF through 11 keyspec Field Field Pointer Bytes Format a deh Be A aa SCSI Sense Data Byte OF through 11 keyspec Field Actual Retry Count Bytes Format llle eee SCSI Sense Data Byte OF through 11 keyspec Field Progress Indication Bytes Format
412. mit any parity on the terminal lines. When first installed, the controller's terminal parity is set to NOTERMINAL_PARITY. TERMINAL_SPEED=baud_rate: Sets the terminal speed to 300, 600, 1200, 2400, 4800, or 9600 baud. The transmit speed is always equal to the receive speed. When first installed, the controller's terminal speed is set to 9600 baud. Command Line Interpreter B-39. SET THIS_CONTROLLER. TMSCP_ALLOCATION_CLASS=n: Specifies the allocation class (0 through 255 in a single controller configuration, or 1 through 255 in a dual redundant configuration). When first installed, the controller's TMSCP_ALLOCATION_CLASS is set to 0. Qualifiers for HSZ controllers: ID=n or ID=(n1,n2): Specifies one or two SCSI target IDs (0 through 7). If two target IDs are specified, they must be enclosed in parentheses and separated by a comma. Note: The unit number determines which target the LUN will be available under. For example, D203 would be target 2, LUN 3; D500 would be target 5, LUN 0; D5 would be target 0, LUN 5. PROMPT="new prompt": Specifies a 1- to 16-character prompt, enclosed in quotes, that will be displayed when the controller's CLI prompts for input. Only printable ASCII characters are valid. When first installed, the CLI prompt is set to the first three letters of the controller's model number (for example, HSJ>, HSD>, or HSZ>). TERMINAL_PARITY=ODD, TERMINAL_PARITY=EVEN, or NOTERMINAL_PARITY: Specifies the p
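The unit-number note above (D203, D500, D5) can be illustrated with a short sketch. This is not controller firmware: the function name `decode_hsz_unit` is hypothetical, and the rule "hundreds and above give the target, the remainder gives the LUN" is an assumption inferred only from the three examples in the note. The 0 through 7 limit on the LUN is taken from the HSZ wording of Error 1000.

```python
import re


def decode_hsz_unit(unit_name):
    """Split an HSZ unit name such as 'D203' into (letter, target, lun).

    Assumption: target = number // 100 and lun = number % 100, which
    matches the three examples in the manual (D203, D500, D5).
    """
    match = re.fullmatch(r"([A-Za-z])(\d+)", unit_name.strip())
    if match is None:
        raise ValueError(f"not a unit name: {unit_name!r}")
    letter = match.group(1).upper()
    target, lun = divmod(int(match.group(2)), 100)
    if not 0 <= target <= 7:
        raise ValueError(f"target {target} is outside 0 through 7")
    if not 0 <= lun <= 7:
        raise ValueError(f"LUN {lun} is outside 0 through 7")
    return letter, target, lun


if __name__ == "__main__":
    for name in ("D203", "D500", "D5"):
        print(name, decode_hsz_unit(name))
```

The range checks raise an exception rather than returning a CLI error number, because the manual does not say how the controller itself reports an out-of-range target in a unit name.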
413. mmand Explanation: Communication was lost in the middle of a SET FAILOVER command. Follow the instructions included in the error message. Error 6070: Illegal command, this controller not configured for dual redundancy. Explanation: A command was entered to a single controller configuration that requires two controllers to be in dual redundant mode. If two controllers are supposed to be in dual redundant mode, enter a SET FAILOVER command. If not, do not enter the command that resulted in the error. Error 6080: Illegal command, this controller not currently in dual redundant mode. Explanation: A command was entered to a dual redundant configured controller, but the other controller was not available for communication. Restart the other controller and wait until it is communicating with this controller. If this controller is no longer supposed to be in dual redundant mode, enter a SET NOFAILOVER command. Error 6090: In failover, no device may be configured at target 6. <device type> <device name> is at PTL port target lun. Explanation: Target addresses 6 and 7 are used by the controllers when in a dual redundant configuration. When in a single controller configuration, target 6 is available for use by devices. If devices are configured at target 6 and you attempted to install a dual redundant configuration, this error is displayed for all devices that use target 6 and the controllers wi
414. must start with A-Z. Explanation: This error results from an ADD command with an illegal name given. Error 9030: <name> is an illegal name, characters may consist only of A-Z, 0-9, or _. Explanation: This error results from an ADD command with an illegal name given. Error 9040: <name> conflicts with keyword <keyword>. Explanation: The name given in an ADD command conflicts with a CLI keyword. Specify another name. Error 9050: Configuration area full. Explanation: The total number of units, devices, and storage sets that can be configured is 195, in any combination. This error results when you exceed that number of nodes. Delete some units or devices in order to recover some configuration nodes. Error 9060: <name> does not exist. Explanation: Some operation (SET, DELETE, INITIALIZE, and so forth) specified a name that does not exist. Check the name and retry the command. Error 9070: <name> is still part of a configuration. Delete upper configuration first. Explanation: Devices may not be deleted if they are still in use by storage sets or units. Storage sets may not be deleted if they are still used by units. Delete configurations from the top down: delete units, then stripesets, and then finally devices. Error 9080: <name> is already used. Explanation: An ADD command specified a name that is already in use. Specify another name. Error 9090: A <devic
415. n HS controller subsystem, initialize the device after setting it transportable, then copy the data on the device to another nontransportable unit. Then reinitialize the device after setting it nontransportable, thereby putting metadata on the device. You must initialize these devices because they may contain intact metadata blocks, which can fool the controller into attempting to run with the device. CAUTION: Do not keep any device set as transportable on an HS controller subsystem. Doing so sacrifices forced error support on all units attached to the device. Forced error support is mandatory for HBVS and for improving data integrity on the entire array. A transportable device is interchangeable with any SCSI interface that does not utilize the device metadata (for example, a VAX workstation, an SZ200, or a PC). Transportable devices are not MSCP compliant, do not support forced error, and may not be members of a shadow set. A controller error (see Chapter 5) will occur if the operating system attempts to write forced error information to a transportable device. Note: Be careful not to confuse the terms transportable and nontransportable with the commands TRANSPORTABLE and NOTRANSPORTABLE. See Appendix B for more information on these commands. Transportable/nontransportable device support is summarized in Table 4-2. Table 4-2 Transportable and Nontransportable Devices VAX or AXP Media Format Workstation
416. n Self Test. The controller begins initialization by executing its policy processor's internal built-in self-test (BIST). BIST always executes upon initialization because it is an integral part of the Intel 80960CA chip (i960) microcode. BIST runs entirely from the i960 chip and a small portion of the firmware program card. Successful completion of BIST means the i960 chip is functioning properly. If BIST fails, the controller will show no activity and all port indicators on the OCP will be off. The green reset LED will be solidly lit. BIST will fail if an incorrect program card is present. 6.1.2 Core Module Integrity Self-Test. After BIST completes successfully, initialization routines and diagnostics expand to testing of the controller module itself. The tests are part of the program card firmware and are known as core module integrity self-test (core MIST). Just before beginning core MIST, the controller reads the initial boot record (IBR) to determine the address of hardware setup parameters and process control information. After reading the IBR, the firmware within the program card is initialized to the IBR parameters. Program card firmware then executes core MIST as follows: 1. MIST checks the initial state of the read/write diagnostic register. 2. The test validates program card contents by reading each memory location and computing an error detection code (EDC). The test then compares the computed EDC with a predetermined EDC. The pr
417. n a host connection is not available. [Footnote 1] The HSJ series has the amber LEDs embedded in the port buttons. Functional Description 2-3. Note: If you connect a maintenance terminal to one controller in a dual redundant configuration and both controllers are functioning, you can communicate with both controllers. A VAXcluster console system (VCS) or serial interface can also be connected to the EIA-423 terminal port for maintenance. 2.1.6 Dual Controller Port. The HSJ series and HSD series controllers have an internal serial port for communication with a second controller of the same model. The second controller needs to be mounted in the same controller shelf, with communication passing through the ports and shelf backplane. A dual redundant configuration allows one controller to take over for another failed controller. The takeover process is called failover. During failover, the surviving controller supports the SCSI-2 devices linked to the failed controller. Note: The HSZ series controller does not support dual redundant configurations; thus, failover cannot occur. 2.1.7 Nonvolatile Memory. The HS controller has 32 KB of nonvolatile memory (NVMEM). NVMEM is implemented using battery backed-up SRAM. This memory stores parameter and configuration information, such as device and unit number assignments entered by you and by the HS controller firmware. 2.1.8 Bus Exchangers. Bus exchange devices allow high spee
418. n code, address of error, expected error data, actual error data, extra status 1, extra status 2, extra status 3: The content of these fields varies depending on the HSJ30/40 controller Subsystem Built-in Self-Test that detected the error condition and the error condition that was detected. C.2.3.6 Memory System Failure Event Log (Template 14). The HSJ30/40 controller Executive firmware component and the Cache Manager part of the Value Added firmware component report the occurrence of memory errors via the Memory System Failure Event Log. The Memory System Failure Event Log will be sent to all host systems that have enabled Miscellaneous error logging on a connection, or connections, established with the HSJ30/40 controller's Disk and/or Tape MSCP Server. The Memory System Failure Event Log is reported via the T/MSCP Memory Errors error log message format. The format of this event log, including the HSJ30/40 controller specific fields, is shown in Figure C-21. Memory System Failure Event Log Format Specific Fields: format: This field contains the value 01 (that is, the T/MSCP Memory Errors error log format code). event code: The values that can be reported in this field for this event log are shown in Table C-23. memory address: The content of this field depends on the value supplied in the instance code field. See Table C-23 for more detail. instance code: See Section C.2.1 for the descri
419. nal controller. A fully functional controller's green OCP reset LED flashes at 1 Hz. A partially functional controller's green LED may flash at 3 Hz. Record which devices have lit or flashing fault LEDs before resetting, as a reset may temporarily clear the LED even though the fault remains. 7-2 Removing and Replacing Field Replaceable Units. You cannot enter CLI> SHUTDOWN commands from terminals connected to failed controllers (green LED lit continuously). For dual redundant configurations only: You may enter the CLI> SHUTDOWN OTHER_CONTROLLER command from a terminal connected to one of the controllers. The other (shutdown) controller's green LED will light continuously when shutdown completes. After you shut down one controller in a dual redundant configuration, the other (surviving) controller takes over service to the shut down controller's devices. This process is called failover. For both nonredundant and dual redundant configurations: You may enter the CLI> SHUTDOWN THIS_CONTROLLER command from a terminal connected to the controller you want to shut down. The shutdown controller's green LED will light continuously when shutdown completes. See Appendix B for a complete description of the SHUTDOWN command and its qualifiers. Be sure to understand the consequences to data and devices when using any qualifiers. 7.1.3 Nonredundant Controller. When you replace the controller module in a nonredundant conf
420. name path controls and other vital information. Configuration commands to add/delete devices, storage sets, and logical units. 2.2.3.2 Diagnostic Utility Protocol. Diagnostic Utility Protocol (DUP) from the host is supported over CI and DSSI (HSJ and HSD series controllers). DUP allows you to access the CLI and local programs through a host virtual terminal in much the same way as using a maintenance terminal. See Chapter 4 for more information. 2.2.3.3 HSZ Series Virtual Terminal. A virtual terminal port can be created using a host-based application called HSZUTIL (HSZ series controller). The HSZUTIL application uses SCSI diagnostic send/receive commands to deliver and receive characters to/from the HSZ series CLI and local programs. See Chapter 6 for more information on the HSZUTIL application. 2.2.3.4 Local Programs. There are several local utilities available for HS controller subsystem management/verification, as follows: DILX and TILX allow you to test and verify operation of the controller with attached SCSI-2 storage under a high or low I/O load. These utilities place the load on the controller, bypassing the host port. Chapter 6 provides a full description of DILX and TILX. VTDPY allows the user to display current controller state and performance data, including processor utilization, host port activity and status, device state, logical unit state, and cache and I/O performance. It is similar to the VTDPY for an HSC50 controlle
421. ne Inoperative: The unit is inoperative and cannot be brought available by the controller. m: Offline, Maintenance. The unit has been placed in maintenance mode for diagnostic or other purposes. o: Online. Mounted by at least one of the host systems. r: Offline, Rundown. The CLI SET NORUN command has been issued for this unit. Diagnostics, Exercisers, and Utilities 6-91. v: Offline, No Volume Mounted. The device does not contain media. x: Online to other controller. Not available for use by this controller. A space in this column indicates the availability is unknown. The spindle state is indicated using the following characters: ^ For disks, this symbol indicates the device is at speed. For tapes, it indicates the tape is loaded. > For disks, this symbol indicates the device is spinning up. For tapes, it indicates the tape is loading. < For disks, this symbol indicates the device is spinning down. For tapes, it indicates the tape is unloading. v For disks, this symbol indicates the device is stopped. For tapes, it indicates the tape is unloaded. For other types of devices, this column is left blank. For disks and tapes, a w in the write protect column indicates the unit is write protected. This column is left blank for other device types. The data caching state is indicated using the following letters: r: Read caching is enabled. A space in this column indicates caching is disabled.
422. nector and remove the trilink from the front of the controller. You will have to work around any SCSI cable or terminator connections when removing the trilink. Do not remove cables or terminators from the trilink, or you will interrupt the host SCSI bus. 9. Remove the maintenance terminal cable, if attached. 10. Loosen the four mounting screws (refer to Figure 7-3) on each side of the front bezel with a 3/32-inch Allen wrench (HSJ series controllers) or flat-head screwdriver (HSD and HSZ series controllers). 11. Use a gentle up-and-down rocking motion to loosen the module from the shelf backplane. 12. Slide the module out of the shelf, noting which rails the module was seated in, and place it on an approved ESD work surface or mat. 13. If necessary, you may now remove the cache module as described in Section 7.2.3. 7.1.3.4 Module Replacement/Installation. Use the following procedure to replace or install the controller module: 1. You should replace the cache module now if you removed it. See Section 7.2.4 for further information on replacing or installing the cache module. 2. Make sure the OCP cable (HSJ series only) is correctly plugged into the underside of the module, as shown in Figure 7-5. 3. Slide the controller module into the shelf using its slot's rightmost rails as guides (see Figure 7-6). [Figure 7-5: OCP Cable, HSJ Series Controller (OCP ribbon cable connection).]
423. ned by the controller's MAX_NODES parameter. The maximum supported value for this parameter is 32. For DSSI controllers, the number of nodes is fixed at 8. Each location in the grid contains a character to indicate the connection status: C indicates one connection to that node. In this example, node 12 shows one connection. This usually happens if a host has multiple adaptors and is using more than one adaptor for load balancing. M indicates multiple connections to that node. Because each host system can make a separate connection to each of the disk, tape, and DUP servers, this field frequently shows multiple connections to a host system. In this example, nodes 8, 9, and 14 show multiple connections. V indicates that only a virtual circuit is open and no connection is present. This happens prior to establishing a connection. It also will happen when there is another controller on the same network and when there are systems with multiple adaptors connected to the same network. Node 15 demonstrates this principle. If a period (.) is in a position corresponding to a node, that node does not have any virtual circuits or connections to this controller. A space indicates the address is beyond the visible node range for this controller. CI/DSSI Host Path Status. [Display: Path Status, with column headings 0123456789.] Description: This display indicates the path status to any system for which a virtual circuit
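As a reading aid for the connection status characters described above (C, M, V, period, space), here is a minimal lookup sketch. The function name `describe_connections` and the idea of passing one row of the grid as a string, one character per node, are assumptions for illustration; the manual only defines what each character means.

```python
# Meanings taken from the connection status description in the manual.
CONNECTION_STATUS = {
    "C": "one connection",
    "M": "multiple connections",
    "V": "virtual circuit open, no connection",
    ".": "no virtual circuits or connections",
    " ": "address beyond the visible node range",
}


def describe_connections(row, first_node=0):
    """Map each character of one grid row to its meaning, keyed by node number.

    Assumption: the row holds one status character per node, starting at
    node number `first_node`.
    """
    return {
        first_node + offset: CONNECTION_STATUS.get(mark, "unrecognized character")
        for offset, mark in enumerate(row)
    }


if __name__ == "__main__":
    # Hypothetical row fragment covering nodes 8 through 12.
    for node, meaning in describe_connections("MM..C", first_node=8).items():
        print(node, meaning)
```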
424. ner name 06 cc hh hah B 12 B 2 B 2 1 B 2 2 B 2 3 B 2 4 B 3 1 B 3 2 B 3 3 B 3 4 B 3 5 B 3 6 B 3 7 B 3 8 DELETE unit number DIRECTORY RESTART OTHER CONTROLLER esee RESTART THIS CONTROLLER eeeeee 00 ee eee SELFTEST OTHER CONTROLLER sse SELFTEST THIS CONTROLLER 0 0 000 0 en SET disk container name SET FAILOVER SET NOFAILOVER SET OTHER CONTROLLER eran SET stripeset container name o SET THIS CONTROLLER era SET unit number SHOW CDROMS SHOW cdrom container name ooo SHOW DEVICES SHOW DISKS SHOW disk container name ccce eh SHOW OTHER CONTROLLER ees SHOW STORAGESETS SHOW STRIPESETS SHOW stripeset container name ooo SHOW TAPES SHOW ape container name oo SHOW THIS CONTROLLER esee SHOW UNITS SHOW unit number SHUTDOWN OTHER CONTROLLER 00004 SHUTDOWN THIS CONTROLLER sssee eee CLI Messages Error Conventions CLI Error Messages Warning Conventions CLI Warning Messages rm Examples Setting HSD Series Parameters Nonredundant Setting HSJ Series Parameters Dual Redundant Setting HSZ Series Parameters o ooo o o Setting Terminal Speed and Parity ooo
425. ng its slot's leftmost rails as guides (refer to Figure 7-6). 4. Press firmly and use a gentle up-and-down rocking motion on the module until it is seated. Finally, press firmly once more to make sure the module is seated. 5. Replace the controller module. Refer to Section 7.1. 7.2.5 Upgrading Cache Modules. You can upgrade a cache module by increasing memory capacity as follows: 1. Determine your cache module type by entering the CLI> SHOW THIS_CONTROLLER command. The following information is displayed: CLI> SHOW THIS_CONTROLLER Controller: HSJ40 CX01234561 Software V1.4, Hardware 0000 Not configured for dual redundancy SCSI address 7 Host port: Node name: HSJA7, valid CI node 29, 32 max nodes System ID 4200101DF52F Path A is O Path B is O MSCP allocation class 3 TMSCP allocation class 3 Cache: 16 megabyte read cache, version 1 Cache is GOOD. Note the cache module size, cache version number, and firmware version. Note: If you upgrade from 16 to 32 MB read cache, you will need to return the 16 MB module to Digital for replacement when you order the upgrade. An HSJ40 controller may have a version 1 or 2 cache module. All HSJ30, HSD30, and HSZ40 models will have version 2 cache modules. You must also run HS operating firmware Version 1.4 or higher to operate any version 2 or higher cache module. Version 1 cache modules are also compatible w
426. ng properly. A space indicates the address is beyond the visible node range for this controller. Device SCSI Status. [Display: Device SCSI Status grid, with SCSI targets 0 through 7 as column headings and one row per device port (ports 1 through 6); each cell holds a device type character.] Description: This display shows what devices the controller has been able to identify on the device buses. Note: The controller will not look for devices that are not configured into the nonvolatile memory using the CLI ADD command. The column headings indicate the SCSI target numbers for the devices. SCSI targets are in the range 0 through 7. Target 7 is always used by a controller. In a dual controller configuration, target 6 is used by the second controller. The device grid contains a letter signifying the device type in each port/target location where a device has been found: C indicates a CDROM device. D indicates a disk device. F indicates a device type not listed above. H indicates bus position of this controller. h indicates bus position of the other controller. L indicates a media loader. T indicates a tape device. A period (.) indicates the device type is unknown. A space indicates there is no device configured at this location. This subdisplay contains a row for each SCSI device port supported by the controller. The subdisplay for a controller that has six SCSI device ports is shown. U
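The device grid legend above lends itself to a small sketch that turns grid rows into a list of port/target locations. Everything about the input format here is an assumption: the function name `found_devices` is hypothetical, and each port's row is assumed to be an 8-character string with one character per SCSI target 0 through 7. Only the letter meanings come from the manual.

```python
# Device type characters from the Device SCSI Status description.
DEVICE_TYPES = {
    "C": "CDROM",
    "D": "disk",
    "F": "device type not listed",
    "L": "media loader",
    "T": "tape",
    ".": "unknown device type",
}
# H and h mark controller bus positions, not storage devices; a space
# means nothing is configured at that location. Both are skipped below.


def found_devices(grid_rows):
    """Return (port, target, device_type) tuples for every device in the grid.

    Assumption: grid_rows maps a port number to a string holding one
    character per SCSI target, starting at target 0.
    """
    devices = []
    for port in sorted(grid_rows):
        for target, mark in enumerate(grid_rows[port]):
            if mark in DEVICE_TYPES:
                devices.append((port, target, DEVICE_TYPES[mark]))
    return devices


if __name__ == "__main__":
    # Hypothetical grid: two ports in a dual controller configuration.
    sample = {1: "DD T  hH", 2: "T     hH"}
    for location in found_devices(sample):
        print(location)
```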
427. nit Status (abbreviated). [Display: Unit Status subdisplay, with columns Unit, ASWC, and KB/S followed by performance columns; sample rows for units D0110, D0120, D0130, T0220, and T0230.] Description: This subdisplay shows the status of the logical units that are known to the controller firmware. It also indicates performance information for the units. Up to 42 units may be displayed in this subdisplay. The Unit column contains a letter indicating the type of unit, followed by the unit number of the logical unit. The list is sorted by unit number. There may be duplication of unit numbers between devices of different types. If this happens, the order of these devices is arbitrary. The following device type letters may be displayed: D indicates a disk device. T indicates a tape device. L indicates a media loader. C indicates a CDROM device. F indicates a device type not listed above. U indicates the device type is unknown. The ASWC columns indicate, respectively, the availability, spindle state, write protect state, and cache state of the logical unit. The availability state is indicated using the following letters: a: Available. Available to be mounted by a host system. d: Offline, Disabled by Digital Multivendor Services. The unit has been disabled for service. e: Online, Exclusive Access. Unit has been mounted for exclusive access by a user. f: Offline, Media Format Error. The unit cannot be brought av
428. nitialization Built in self test See BIST Bus exchanger 2 4 C Cabinet grounding stud 7 3 Cabinets configurations 3 1 Cable See also CI cable external See also CI cable internal See also Device port cable See also DSSI host cable See also SCSI host cable CI 1 9 7 23 7 25 DSSI 1 9 7 27 handling guidelines 1 8 SCSI 1 9 7 29 SCSI device port 7 31 Cache module 2 5 6 4 7 19 See also Read cache DAEMON 6 4 Index 1 Cache module cont d error messages 5 12 failover 5 1 how to identify 7 20 operation 2 5 read cache 2 5 service consideration 5 1 service of 7 19 size restriction 1 3 specifications 1 9 testing of 6 4 upgrading 7 20 write through 2 5 Certification Class A xxi EMI xxi Federal Republic of Germany xxi Chunksize How to change B 37 CI cable service precautions 1 9 CI cable external 7 23 installing 7 25 order for removal 7 24 order for replacement 7 25 removing 7 23 replacing 7 25 service of 7 23 service precautions 7 23 tools 7 23 CI cable internal 7 25 installing 7 26 removing 7 26 replacing 7 26 service of 7 25 service precautions 7 25 tools 7 25 CI host interconnection supported protocols 2 9 CI node number 4 5 4 6 7 10 7 16 7 46 restriction 4 12 CLEAR_ERRORS CLI command B 11 CLI accessing 4 2 command sets 4 3 described 4 2 error conventions B 64 error messages B 64 error messages automatic 5 14 error messages interacti
429. nization. Multiple error messages may result from one command. Items in angle brackets (<>) will be replaced at run time with names, numbers, and so on. B.2.2 CLI Error Messages. For HSJ and HSD30 controllers: Error 1000: Unit number must be from 0 to 4094. For HSZ controllers: Error 1000: The LUN portion of the unit number must be from 0 to 7. Explanation: This error results from an ADD UNIT command where the n in the Dn or Tn specified is out of range. The MSCP or TMSCP unit number after the D or T must be in the range of 0 to 4094. Retry the ADD UNIT command with a correct number. Error 1010: Maximum cached transfer size must be 1 through 1024 blocks. Explanation: This error results from a SET <unit number> or an ADD UNIT command where MAXIMUM_CACHED_TRANSFER_SIZE was specified. MAXIMUM_CACHED_TRANSFER_SIZE must be in the range 1 through 1024. Retry the SET or ADD command with a correct number. Error 1020: CHUNKSIZE must be from <minimum> to <maximum>. Explanation: This error results from a SET storageset container name or an ADD storage set type command where CHUNKSIZE was specified. The chunksize must be DEFAULT VOLUME or greater than 15. Retry the SET or ADD command with DEFAULT VOLUME or a correct number. Error 1030: Cannot set chunksize on a storageset that is still part of a configuration. Explanation: Chunksize must be set before a storage set is bound to a unit. If yo
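The two numeric ranges behind Errors 1000 and 1010 can be checked before a command is ever typed at the CLI. The sketch below is a hypothetical pre-check, not the controller's own parser: the function name `check_add_unit` is invented for illustration, and it covers only the HSJ/HSD wording of Error 1000 (unit number 0 to 4094) and the 1 through 1024 block range of Error 1010.

```python
def check_add_unit(unit_number, max_cached_transfer_size=None):
    """Return the CLI error numbers an ADD UNIT command would be expected to raise.

    Ranges are taken from the Error 1000 (HSJ/HSD) and Error 1010
    explanations; an empty list means both values are in range.
    """
    errors = []
    if not 0 <= unit_number <= 4094:
        errors.append(1000)
    if max_cached_transfer_size is not None:
        if not 1 <= max_cached_transfer_size <= 1024:
            errors.append(1010)
    return errors


if __name__ == "__main__":
    print(check_add_unit(110))                                  # both in range
    print(check_add_unit(5000))                                 # unit number too large
    print(check_add_unit(110, max_cached_transfer_size=2048))   # transfer size too large
```

Returning a list mirrors the statement that multiple error messages may result from one command, although whether the controller reports both of these together is not stated in the manual.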
430. nnect Services Last Failure Codes Host Interconnect Port Services Last Failure Codes Disk and Tape MSCP Server Last Failure Codes Diagnostics and Utilities Protocol Server Last Failure Codes System Communication Services Directory Service Last Failure Codes iii a PA HS epa Disk Inline Exerciser DILX Last Failure Codes Tape Inline Exerciser TILX Last Failure Codes Automatic Device Configuration Program CONFIG Last Failure CA es des ae eee ee ol ths eae Se a ea Si Controller Restart Codes ccc ee eee eens Event Notification Recovery Threshold Classifications Recommended Repair Action Codes 0 0 0 0 ee eee ene Template Types meseria Se Peis ek AS e guid WEYees ioa ales C 82 C 84 C 89 C 90 C 91 C 92 C 93 C 93 C 97 C 101 C 107 C 108 C 108 C 109 C 110 C 111 C 112 C 113 C 116 C 116 C 116 C 117 C 118 C 118 C 119 C 120 XV D 2 Host Interconnect Services Status Codes o ooooo oo D 2 D 3 DSSI Port Port Driver Event Log Template 32 Instance MSCP Event CODES rerepi ali a Lv D 3 D 4 Host Interconnect Services Last Failure Codes D 3 D 5 Host Interconnect Port Services Last Failure Codes D 3 D 6 Recommended Repair Action Codes o o oooooo oo D 4 xvi Preface This manual describes how to maintain and service
431. nnot be set to NOTRANSPORTABLE once it is being used by an upper level unit or storage set. Error 4000: The CLI prompt must have 1 to 16 characters. Explanation: This error results from a SET THIS_CONTROLLER or SET OTHER_CONTROLLER command with the qualifier PROMPT. The length of the CLI prompt must be at least one character and may not exceed 16 characters. Retry the command with the correct number of characters. Error 4010: Illegal character in CLI prompt. Explanation: A nonprintable character was specified. Only ASCII characters space through tilde (hex 20 through 7E) may be specified. Error 4020: Terminal speed must be 300, 1200, 2400, 4800, 9600, or 19200. Explanation: This error results from a SET THIS_CONTROLLER or SET OTHER_CONTROLLER command with the argument TERMINAL_SPEED. The only valid baud rates that may be specified are 110, 300, 1200, 2400, 4800, or 9600 baud. Retry the command with a correct terminal speed. Error 4030: Controller ID must be in the range 0 to <max nodes minus 1>. Explanation: The ID was specified with a number greater than <max nodes minus 1>. If increasing the controller's ID, set MAX_NODES first, then the controller's ID. Error 4040: SCS nodename length must be from 1 to 6 characters. Explanation: This error results from a SET THIS_CONTROLLER or SET OTHER_CONTROLLER command with the argument SCS_NODENAME. The SCS node name must consist of one to six alphanumeric character
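The prompt rules behind Errors 4000 and 4010 (1 to 16 characters, each between hex 20 and hex 7E) are simple enough to sketch as a check. The function name `check_prompt` is hypothetical, and testing the length before the character set is an assumption; the manual does not say which error the controller reports first when both rules are broken.

```python
def check_prompt(prompt):
    """Return 4000, 4010, or None for a proposed CLI prompt string.

    4000: length is not 1 to 16 characters.
    4010: a character falls outside ASCII space through tilde (0x20-0x7E).
    None: the prompt satisfies both rules.
    """
    if not 1 <= len(prompt) <= 16:
        return 4000
    if any(not 0x20 <= ord(ch) <= 0x7E for ch in prompt):
        return 4010
    return None


if __name__ == "__main__":
    for candidate in ("HSJ>", "", "X" * 17, "HSJ\t>"):
        print(repr(candidate), check_prompt(candidate))
```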
432. nous 8-bit single-ended device support. Tagged queueing for SCSI-2 devices. Read and write physical device addressing and access: This is the read and write path to and from devices, and from and to the value-added portion of HS operating firmware. Specified device support per HS operating firmware release: Refer to your HS operating firmware release notes to identify specifically supported devices. Mixed disk and tape support: You can mix disk and tape storage on one controller. Furthermore, disks and tapes may be placed together on one of the controller's six SCSI-2 ports. Note: Tapes are not currently supported for the HSZ series controller. Refer to your StorageWorks Array Controller Operating Firmware Release Notes for specific information and restrictions for tape drives. Device warm swap: You can remove and replace devices without taking the subsystem off line (see Chapter 7). Device shelf and SBB observation and control: This service monitors SHELF OK signals and alerts you of fan and power supply failures. This firmware also controls the fault LEDs on the SBBs for use in warm swap and identifying device failures or configuration mismatches. Device error recovery: This service performs error recovery and read and write retries directly, making every attempt to serve data to and from the host before declaring an unrecoverable error or marking a device as failed. Controller warm swap (HSJ series controllers): This
433. ns on HSD series controllers. Be aware of the difference when decoding HSD series controller error logs using Appendix C. Table D-6 Recommended Repair Action Codes. Code 63: Check the DSSI adapter on the host system identified in the remote node name field for proper operation. D-4 HSD Series Error Logging. E HSZ Series Error Logging. This appendix details errors the HSZ series controller will report in its host event logs under the DEC OSF/1 AXP operating system, as well as how to extract the information from the logs. Note: Host event log translations are correct as of the date of publication of this manual. However, log information may change with firmware updates. Refer to your StorageWorks Array Controllers HSZ40 Array Controller Operating Firmware Release Notes for error log information updates. E.1 Reading an HSZ Series Error Log. Example E-1 shows an example of a uerf-translated host error log. The uerf utility under the DEC OSF/1 AXP operating system will show the target and LUN of the unit in question. Use your current configuration information to match the unit to the devices it is mapped to. Then test and/or service the devices on a case-by-case basis. Example E-1 was generated using the uerf -o full command on an HSZ40 controller with a KZTSA host adapter. Example E-1 The uerf Utility Error Event Log
434. ns for the number of entries in the good module memory list E E Y O Ol Ah Ol 12 The controller DRAB chip does not arbitrate Replace controller correctly module B O 4d O O d d 13 The controller DRAB chip failed to detect Replace controller forced parity or detected parity when not module forced O off lit continuously flashing I D Instruction Data cache on the controller module DRAB Dynamic RAM Controller and Arbitration Engine operates controller shared memory ECC Error Correction Code EDC Error Detection Code SRAM Static RAM NXM Nonexistent Memory continued on next page Error Analysis and Fault Isolation 5 5 Figure 5 3 Cont Flashing OCP Codes Reset 1 2 Description of Error Action 14 The controller DRAB chip failed to verify the EDC correctly Replace controller module b xd O Replace controller module op QO O Y 0 0o a D bI 1 O Replace controller module E bl O Ah Replace controller module Replace controller module Replace controller module Replace controller module bl hb h N Replace controller module Replace controller module Replace controller module 0 rn pr Oo p O O N N b N N A b A b O0 O0 0 a Replace controller module H Y O O O Replace controller module A O O O Replace control
435. nsl: This field contains byte 7 (Additional Sense Length field) of the Sense Data returned in the response of a SCSI REQUEST SENSE command. This field contains the number of additional Sense Data bytes to follow. If this value is less than 10, the content of some or all of the remaining event log fields (that is, cmdspec, asc, ascq, frucode, and keyspec) may be undefined. The cmdspec field is undefined unless this value is 4 or greater. The asc and ascq fields are undefined unless this value is 6 or greater. The frucode field is undefined unless this value is 7 or greater. The keyspec field is undefined unless this value is 10 or greater. If this value is greater than 10, the device supplied the Additional Sense Bytes field, which begins at byte 12 of the Sense Data. The content of the Additional Sense Bytes field is not included in the event log. cmdspec: If the value contained in the addsnsl field is 4 or greater, this field contains bytes 8 through 0B (Command Specific Information field) of the Sense Data returned in the response of a SCSI REQUEST SENSE command. The content of this field varies depending on the value contained in the cmdopcd field, as follows: If the cmdopcd is an 18 (COPY), 39 (COMPARE), or 3A (COPY AND VERIFY), the low order byte of this field contains the starting byte number of an area, relative to Sense Data byte 0, that contains (unchanged) the source logical unit's status
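The addsnsl thresholds described above decide which of the later event log fields can be trusted. A minimal sketch of that rule follows; the function name `defined_fields` is hypothetical, and the thresholds (4, 6, 6, 7, 10) are copied directly from the field description.

```python
# Minimum addsnsl value at which each event log field becomes defined,
# as stated in the addsnsl field description.
FIELD_THRESHOLDS = [
    ("cmdspec", 4),
    ("asc", 6),
    ("ascq", 6),
    ("frucode", 7),
    ("keyspec", 10),
]


def defined_fields(addsnsl):
    """Return the names of the event log fields that are defined for this addsnsl value."""
    return [name for name, minimum in FIELD_THRESHOLDS if addsnsl >= minimum]


if __name__ == "__main__":
    for value in (0, 4, 6, 7, 10):
        print(value, defined_fields(value))
```

A log decoder could call this before interpreting asc/ascq or keyspec, so that undefined bytes are not reported as if they were real sense data.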
436. ntains the PCB copies of the 710 SSTAT2, SSTAT1, SSTAT0, DSTAT registers. Last Failure Parameter 7 contains the PCB copies of the 710 LCRC, RESERVED, ISTAT, DFIFO registers. An EDC error was detected on a read of a soft-sectored device (path not yet implemented). Last Failure Parameter 0 contains the PCB reg710_ptr value. Last Failure Parameter 1 contains the PCB copy of the 710 TEMP register. Last Failure Parameter 2 contains the PCB copy of the 710 DBC register. Last Failure Parameter 3 contains the PCB copy of the 710 DNAD register. Last Failure Parameter 4 contains the PCB copy of the 710 DSP register. Last Failure Parameter 5 contains the PCB copy of the 710 DSPS register. Last Failure Parameter 6 contains the PCB copies of the 710 SSTAT2, SSTAT1, SSTAT0, DSTAT registers. Last Failure Parameter 7 contains the PCB copies of the 710 LCRC, RESERVED, ISTAT, DFIFO registers. Invalid SCSI device type in PUB. Last Failure Parameter 0 contains the PUB SCSI device type. (continued on next page) HSJ Series Error Logging C-105. Table C-35 (Cont.) Device Services Last Failure Codes. Columns: Code, Description. Codes: 03420188, 03470100, 03480100, 03490100, 034A2080. A UDC interrupt could not be associated with either a DWD or the non-callable scripts. Last Failure Parameter 0 contains the PCB reg710_ptr value. Last Failure Parameter 1 contains the PCB copy of the 710 TEMP register. Last Failure Parameter 2 contains the
437. ntents
Preface
Manufacturer's Declarations
1 General Information and Subsystem Overview
Technical Overview
Maintenance Strategy
Maintenance Features
Precautions
1 Electrostatic Discharge Protection
2 Module Handling Guidelines
3 Program Card Handling Guidelines
4 Cable Handling Guidelines
4.1 CI Cable
4.2 DSSI Cable
4.3 (title not recoverable)
Controller Specifications
Controller Environmental Specifications
2 Functional Description
2.1 HS Controller Hardware
2.1.1 Policy Processor
2.1.1.1 Intel 80960CA
2.1.1.2 Instruction/Data Cache
2.1.2 Program Card
2.1.3 Diagnostic Registers
2.1.4 Operator Control Panel
2.1.5 Maintenance Terminal Port
2.1.6 Dual Controller Port
2.1.7 Nonvolatile Memory
2.1.8 Bus Exchangers
2.1.9 Shared Memory
438. ntroller 4.10.4 Failover Setup Mismatch. During failover mismatch, one controller will function while the second controller will not recognize any devices. Although it is rare, a failover mismatch may occur during the following scenarios: If the controllers initialize at exactly the same time, one controller may be set for failover while the other is not. If one controller is running (operating normally) while the second controller is initialized, mismatch may occur; for example, this can happen after one controller was undergoing maintenance. To correct a failover mismatch, stop all processes on the devices for both controllers. Then enter the following commands to determine which controller has the desired (good) configuration information: CLI> SHOW UNITS CLI> SHOW STORAGESETS CLI> SHOW DEVICES After deciding on one of the two configurations, use the SET FAILOVER command to copy the good information from one controller to the other. 4.11 Moving Devices Between Controllers. The moving of devices from one controller to another is supported under the following conditions: For nontransportable devices: Under normal operation, the controller makes a small portion of a disk inaccessible to the host and uses this area to store metadata. Metadata improves error detection and media defect management. Devices utilizing metadata are called nontransportable. Initializing a device that is set as nontransportable will place reset
439. ntroller Vendor Specific SCSI ASC/ASCQ Codes
ASC Code, ASCQ Code, Description
D1 05 Synchronous negotiation error
D1 07 Unexpected disconnect
D1 08 Unexpected message
D1 09 Unexpected Tag message
D1 0A Channel busy
D1 0B Device initialization failure, device sense data available
D2 00 Miscellaneous SCSI driver error
D3 00 Drive SCSI chip reported gross error
D4 00 Non-SCSI bus parity error
D5 02 Message Reject received on a valid message
D7 00 Source driver programming error
Table C-18 Last Failure Event Log (Template 01) Instance/MSCP Event Codes
Instance Code, MSCP Event Code, Description
01010302 03EA EXEC BUGCHECK called with HW flag set; that is, an unrecoverable hardware detected fault occurred.
0102030A 040A EXEC BUGCHECK called with HW flag clear; that is, an unrecoverable firmware inconsistency was detected.
Table C-19 Failover Event Log (Template 05) Instance/MSCP Event Codes
Instance Code, MSCP Event Code, Description
07030B0A 022A Failover Control detected a receive packet sequence number mismatch. The HSJ30/40s are out of synchronization with each other and are unable to communicate. Note that in this instance the last failure code and last failure parameters fields are undefined.
07040B0A 022A Failover Control detected a transmit packet sequence number mismatch. The HSJ30/40s are out of synchronization with each other and are unable to communicate. Note that in
440. ny time a fault occurs during core MIST, the OCP will display a code. Refer to Chapter 5. 6.1.3 Module Integrity Self-Test DAEMON. Once initialization control is passed to EXEC, EXEC calls the diagnostic and execution monitor (DAEMON). DAEMON tests the device port hardware, host port hardware, and cache module. To test the device ports, DAEMON checks each NCR 53C710 SCSI processor chip. Initialization continues unless all SCSI device ports fail testing. In other words, it is possible for the controller to run with only one functioning device port. DAEMON tests the host port hardware for the particular controller model. For HSJ series controllers, this test focuses primarily on the YACI chip. For the HSD and HSZ series controllers, the NCR 53C720 host processor chip is tested. Initialization continues even if the host port tests fail. However, DAEMON stops initialization if the DUART test (from core MIST) and the host port tests fail. Diagnostics, Exercisers, and Utilities 6-3. DAEMON tests the cache module as follows: Note: The controller still functions if the cache module fails its testing. In this case, the controller will use its on-board shared memory for caching operations. DAEMON tests the DRAB (memory controller) on the read cache module. After DAEMON completes and functional code takes control of the firmware, the cache manager tests the memory on the cache. At least the first megabyte of the memory must t
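The continue-or-stop rules that DAEMON applies to the device port and host port tests can be summarized in a few lines. The following Python sketch is illustrative only: the function name and its boolean inputs are hypothetical and are not part of the controller firmware.

```python
def initialization_continues(device_ports_ok, host_port_ok, duart_ok):
    """Return True if initialization would continue under the rules described above.

    device_ports_ok: list of booleans, one per SCSI device port (hypothetical input).
    host_port_ok:    result of the host port tests.
    duart_ok:        result of the DUART test from core MIST.
    """
    # Initialization stops only when every device port fails testing.
    if not any(device_ports_ok):
        return False
    # A host port failure alone is tolerated; combined with a DUART failure it is not.
    if not host_port_ok and not duart_ok:
        return False
    return True
```

A controller with a single working device port and a failed host port would therefore still continue, provided the DUART test passed.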
441. o start communication, enter Ctrl/C to place the HSZ series controller CLI into a known state. This is done by entering a SCSI SEND DIAGNOSTIC command for the CLI DATA PAGE with the CLI Command Code set at ANSWER and a Ctrl/C character in the first byte of the ASCII Text field. If this fails, exit. 6-102 Diagnostics, Exercisers, and Utilities. 6. Process the following code: Do: If a message was received from the drive, process it. If the message length is greater than 2: Print the message. If we have a log file, log the message. If the message was a SCSI_CLI_INPUT_REQUEST: Get terminal input. If we have a log file, log the terminal input. If the first character is a ^, the user is trying to send a control character, so convert the string into the appropriate control character. If we got End of File on the input string, put a ^C in the input string to abort the program. Send the input string to the remote program. Else: This is a keep-alive message, so ignore it. 7. If the CLI has asked for a polling delay, sleep for the delay period. Until End Of File is received on the terminal read or until an error occurs while communicating with the HSZ series controller. Diagnostics, Exercisers, and Utilities 6-103. 7 Removing and Replacing Field Replaceable Units. This chapter describes how to remove and replace (install) the following FRUs in both dual-redundant and nonredundant configurations: Controller mod
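The message-processing loop in steps 6 and 7 of the virtual terminal procedure could be sketched in Python roughly as follows. This is a sketch under stated assumptions, not the HSZUTIL implementation: the callables receive, send, and read_line, the (text, wants_input, delay) message tuple, and the nesting of the conditions are all hypothetical choices made to get a runnable example.

```python
import time

CTRL_C = "\x03"


def run_session(receive, send, read_line, log=None, sleep=time.sleep):
    """Relay CLI traffic until terminal end-of-file or a communication error.

    receive()   -> (text, wants_input, delay) or None; raises OSError on error (hypothetical).
    send(text)  -> delivers terminal input to the controller CLI (hypothetical).
    read_line() -> one line of terminal input, or None at end-of-file (hypothetical).
    """
    while True:
        try:
            message = receive()
        except OSError:
            break  # error while communicating with the controller
        if message is None:
            continue  # nothing was received on this pass
        text, wants_input, delay = message
        if len(text) > 2:
            print(text, end="")
            if log is not None:
                log.write(text)
            if wants_input:
                line = read_line()
                at_eof = line is None
                if at_eof:
                    line = CTRL_C  # abort the remote program
                else:
                    if log is not None:
                        log.write(line)
                    # "^G" style input is converted to the control character.
                    if len(line) == 2 and line[0] == "^" and line[1].isalpha():
                        line = chr(ord(line[1].upper()) - 64)
                send(line)
                if at_eof:
                    break
        # Messages of length 2 or less are treated as keep-alives and ignored.
        if delay:
            sleep(delay)
```

Passing the transport as callables keeps the sketch independent of any particular SCSI pass-through interface, which the text does not specify at this level.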
442. o terminate the warm swap program and restart it later when you have a replacement. 7.11.2.1 Tools Required. You will need the following tools to warm swap a controller: ESD strap; 3/32-inch Allen wrench; 5/32-inch Allen wrench; flat-head screwdriver. 7.11.2.2 Precautions. Refer to Chapter 1 for ESD grounding, module handling, and program card handling guidelines. Ground yourself to the cabinet grounding stud (refer to Figure 7-1) before servicing the controller module. 7.11.2.3 Controller Removal. Use the following procedure to remove the controller: 1. Connect either a virtual terminal or a maintenance terminal to the controller you will not be removing. 2. Enter the RUN C_SWAP command. The system responds with the following: Controller Warm Swap, Software Version V1.4. Copyright Digital Equipment Corporation 1993. Sequence to REMOVE other HSJ40 has begun. Do you wish to REMOVE the other HSJ40 Y/N [N] ? 3. Enter Y to continue the procedure. Will its cache module also be removed Y/N [N] ? 7-42 Removing and Replacing Field Replaceable Units. Enter Y only if you will be removing the controller's cache module as well. Killing other controller. Attempting to quiese all ports. Port 1 quiesced. Port 2 quiesced. Port 3 quiesced. Port 4 quiesced. Port 5 quiesced. Port 6 quiesced. All ports quiesced. Remove the other HSJ40 (the one WITHOUT a blinking green LED) within 5 m
443. oader occupies the full cabinet depth. Up to four tape drive loader devices can be loaded in an SW800 series data center cabinet, displacing shelves S6 and S12 S18, leaving 10 BA350-SB shelves remaining. The associated BA350-MA controller shelf must be located near enough to satisfy this restriction. 3-2 Configuration Rules and Restrictions. Figure 3-1 SW800 Series Data Center Cabinet Loading [figure; surviving labels: CONTROLLER POSITION C4, CONTROLLER POSITION C2, CDU A]
444. ode is currently disabled. The HSZUTIL application does not support the RZxx SCSI DUP protocol. 6.7.3 DEC OSF/1 for Alpha AXP Implementations. The DEC OSF/1 AXP version issues SCSI commands through the CAM User Agent interface. The user identifies the HSZ series controller through its bus, target, and LUN identifiers. The HSZ series controller therefore does not need to be configured into the system prior to accessing it through HSZUTIL. SUPERUSER privilege is required to run the HSZUTIL application on DEC OSF/1 AXP. 6.7.3.1 Running HSZUTIL Under DEC OSF/1 AXP. The HSZUTIL application is installed in the /usr/local/bin directory by setld. The program is invoked as follows: HSZUTIL bus target LUN where: bus is the number of the SCSI bus; target is the target ID of the HSZ series controller; LUN is the logical unit number of one of the devices connected to the HSZ series controller. If specified, the parameters must be specified in order. HSZUTIL prompts for missing parameters. The specified device need not be known to the operating system. To exit the program, enter Ctrl/D. 6-100 Diagnostics, Exercisers, and Utilities. Control characters to be delivered to the HSZ series controller CLI are entered by typing the ^ character followed by the appropriate letter. For example, Ctrl/G would be entered as ^G. 6.7.4 Description of HSZ series Controller Virtual Terminal Protocol Diagnostic Pages. Figures 6-10 and 6-11 presen
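The caret convention for control characters maps a letter to its ASCII control code. A minimal Python sketch of that mapping follows; the function name caret_to_control is hypothetical, and the rule of leaving any other input unchanged is an assumption, since the text covers only the caret-plus-letter case.

```python
def caret_to_control(text):
    """Convert caret notation such as '^G' into the matching control character.

    Input that is not a caret followed by a single letter is returned unchanged.
    """
    if len(text) == 2 and text[0] == "^" and text[1].isalpha():
        # ASCII control codes are the letter's code minus 64 (Ctrl/A is 1).
        return chr(ord(text[1].upper()) - 64)
    return text
```

With this mapping, `^G` yields the BEL character (code 7) and `^C` yields code 3.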
445. of the controller. HSJ series: Connect the CI cable and tighten its captive screws with a flat-head screwdriver. CAUTION: Do not connect host port cables to an HSD series controller while the power is on to any members on the DSSI bus, including the controller and host. Doing so risks short circuits that may blow fuses on all the members. HSD series: Disconnect controller power. Then connect the DSSI cable and the terminator to the trilink connector and tighten their captive screws. Restore power to all members on the DSSI bus. HSZ series: Connect the SCSI cable trilink connector to the front of the controller and tighten its captive screws with a small flat-head screwdriver. You will have to work around any SCSI cable or terminator connections when replacing the trilink. Do not remove cables or terminators from the trilink or you will interrupt the host SCSI bus. 10. Enter the following commands to enable CI paths A and B to the host (HSJ series controllers): CLI> SET THIS_CONTROLLER PATH_A CLI> SET THIS_CONTROLLER PATH_B Enter the following command to enable the host port path (HSD series controllers): CLI> SET THIS_CONTROLLER PATH The host port path for HSZ series controllers is always on, so no command is needed. To automatically configure devices on the controller, use the CONFIG utility described in Chapter 6. Removing and Replacing Field Replaceable Units 7-11. For manual configuration the follo
446. og HSJ Series Error Logging C-37. reserved (offset 1E): This field contains the value 0. event time: See Section C.2.1 for the description of this field. his status, error id, src, dst, intopcd: See Section C.2.2.1 for the description of these fields. undef: This field is only present to provide longword alignment; its content is undefined. C.2.3.8 CI Port/Port Driver Event Log (Template 32). The HSJ30/40 controller Host Interconnect Services firmware component reports errors detected while performing work related to the CI Port/Port Driver (PPD) communication layer via the CI Port/Port Driver Event Log. The CI Port/Port Driver Event Log will be sent to all host systems that have enabled Miscellaneous error logging on a connection, or connections, established with the HSJ30/40 controller's Disk and/or Tape MSCP Server. The CI Port/Port Driver Event Log is reported via the T/MSCP Controller Errors error log message format. The format of this event log, including the HSJ30/40 controller specific fields, is shown in Figure C-23. CI Port/Port Driver Event Log Format Specific Fields. format: This field contains the value 00 (that is, T/MSCP Controller Errors error log format code). event code: The values that can be reported in this field for this event log are shown in Table C-25. reserved (offset 16): This field contains the value 0. C-38 HSJ Series Error Logging. Figure C-23 CI Port/Port Driver Event Log (Template 32)
447. ogram card contents are valid if both EDCs match. 6-2 Diagnostics, Exercisers, and Utilities. Core MIST then tests and/or checks module hardware attached to the buses: Timer operation. DUART operation. DRAB/DRAM (shared memory) operation: The test writes to and reads all legal addresses. Then boundaries are checked by attempting to access nonexistent addresses. To pass this test, the first two megabytes of memory must test good. If bad segments are found, the bad segments may divide total memory into no more than 16 good, continuous sections. The test selects a device, then checks whether or not the bus has selected that device. The test verifies that each allowable memory transfer size works and that illegal transfer sizes do not. Bus parity. Registers: The test checks registers for frozen bits. Journal SRAM: The test writes to and reads all journal SRAM addresses. I/D cache. After core MIST successfully tests the program card and bus hardware, the initialization routine loads the firmware into the first two megabytes of controller shared memory. The initialization routine then uses the EDC method to compare the memory contents with the program card to make sure of a successful download. The policy processor is initialized to the new parameters (the ones read from the IBR). At this time, control of initialization passes to the firmware executive (EXEC). EXEC runs from controller shared memory. If at a
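The pass criteria for the shared memory test (first two megabytes good, and no more than 16 good continuous sections overall) can be checked mechanically. The Python sketch below is an illustration under assumptions: the segment granularity is not given in the text, so the caller supplies one boolean per segment and states how many segments make up the first two megabytes; the function names are hypothetical.

```python
def good_sections(segment_ok):
    """Count runs of consecutive good segments in a list of booleans."""
    sections = 0
    previous = False
    for ok in segment_ok:
        if ok and not previous:
            sections += 1  # a new good run starts here
        previous = ok
    return sections


def memory_test_passes(segment_ok, segments_in_first_2mb, max_sections=16):
    """Apply the two pass criteria described above to per-segment test results."""
    first = segment_ok[:segments_in_first_2mb]
    first_2mb_good = len(first) == segments_in_first_2mb and all(first)
    return first_2mb_good and good_sections(segment_ok) <= max_sections
```

A single bad segment in the middle of memory splits it into two good sections, which is well within the limit of 16.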
448. [Figure residue: cabinet loading diagram; surviving labels include TAPE POSITION T3 and T4, STORAGE POSITION S1 through S4 and S7 through S11, CABLE PASS THROUGH, CABINET FRONT, CABINET REAR, CXO-4162C-MC] Vertical device shelves: Vertical shelves are not used for device shelves because some devices requ
449. oller's Disk and/or Tape MSCP Server. The Nonvolatile Parameter Memory Component Event Log is reported via the T/MSCP Memory Errors error log message format. The format of this event log, including the HSJ30/40 controller specific fields, is shown in Figure C-18. HSJ Series Error Logging C-27. Figure C-18 Nonvolatile Parameter Memory Component Event Log (Template 11) Format. Bits 31 to 0; fields: command reference number, sequence number, reserved, controller identifier, reserved, chvrsn, csvrsn, memory address, reserved, event time, byte count, number of times written. Nonvolatile Parameter Memory Component Event Log Format Specific Fields. format: This field contains the value 01 (that is, T/MSCP Memory Errors error log format code). event code: The values that can be reported in this field for this event log are shown in Table C-20. C-28 HSJ Series Error Logging. memory address: The physical address of the beginning of the affected Nonvolatile Parameter Memory component area. instance code: See Section C.2.1 for the description of this field. The values that can be reported in this field for this event log are shown in Table C-20. templ: See Section C.2.1 for the description of this field. This field contains the value 11 for this event log. tdisize: See Section C.2.1 for the description of this field. This field contains the value 08 for this event log. reserved (offset 22): This field contains the value 0. event time
450. ollowing exceptions: When MSLG$B_FORMAT reads 09 (BAD BLOCK REPLACEMENT ATTEMPT), the instance code does not appear because ERF does not provide CONTROLLER DEPENDENT INFORMATION. When MSLG$B_FORMAT reads 0A (MEDIA LOADER LOG), the instance code appears in LONGWORD 2. When MSLG$B_FORMAT reads 00 (CONTROLLER LOG), the instance code appears in part of both LONGWORD 1 and LONGWORD 2. For this MSLG$B_FORMAT, the code is skewed and not directly readable as a longword. The code's low order bytes appear in the two high order bytes of LONGWORD 1, and the code's high order bytes appear in the two low order bytes of LONGWORD 2. For example: CONTROLLER DEPENDENT INFORMATION: LONGWORD 1 = 030A0000, LONGWORD 2 = 24010102. In this case, the instance code is 0102030A. An OpenVMS DCL command procedure is provided at the end of this appendix (see Section C.6) for deskewing this particular instance code. Running the command procedure will make the error log directly readable when used in conjunction with the other information supplied in this appendix. Once you locate and identify the instance code, see the following sections for further information: Section C.3 contains the Event Log Code tables (Tables C-2 through C-49). These tables list specific code descriptions. Section C.2 contains detailed error packet descriptions based on template type. Section C.4 cont
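The deskewing described for the CONTROLLER LOG format is a matter of recombining two 16-bit halves. The Python sketch below illustrates the arithmetic only; it is not the DCL command procedure the appendix refers to, and the function name is hypothetical.

```python
def deskew_instance_code(longword1, longword2):
    """Rebuild the instance code from the two skewed longwords.

    The code's low order bytes sit in the two high order bytes of LONGWORD 1,
    and its high order bytes sit in the two low order bytes of LONGWORD 2.
    """
    low_half = (longword1 >> 16) & 0xFFFF
    high_half = longword2 & 0xFFFF
    return (high_half << 16) | low_half


# Values from the example above: LONGWORD 1 = 030A0000, LONGWORD 2 = 24010102.
code = deskew_instance_code(0x030A0000, 0x24010102)
print(f"{code:08X}")  # prints 0102030A
```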
451. om routine RESMGR ALLOCATE DATA SEGMENT while dmscp_dcd_allocate_dseg was attempting to allocate a data segment. Unrecognized or invalid (in this context) return value from routine RESMGR ALLOCATE DATA BUFFERS while dmscp_dcd_allocate_dbuf was attempting to allocate a data buffer. dmscp_dcd_rmte_end_msg was unable to find a command message that corresponds to the end message it is currently processing. dmscp_dcd_src_gcs_send was entered even though remote connection lost is indicated. This condition should not occur because the command timer is deactivated when a connection is lost, and the server is running at the same priority as HIS and cannot invalidate a connection. dmscp_dcd_src_gcs_cmpl found the command being GCSed is no longer at the head of the remote connection's queue. dmscp_dcd_errlog_rvc found that an error log is not associated with a command, internal miscellaneous error logs are assumed to not be associated with a connection, and remote miscellaneous error logs generation was not requested. (continued on next page) Table C-43 (Cont.) Disk and Tape MSCP Server Last Failure Codes. Columns: Code, Description. 60460100: dmscp_dcd_elrt_scc_send was entered to issue a remote source connection SCC but was unable to find an available HTB on the connection's htb_list. With no active DCDs, the connection should always have HTBs available. 60480100: tmscp_suc_avl_cmpl_rtn found the unit not in the available state. 60490100: tmscp_clear_cdl_
452. ommand Line Interpreter. Examples. SET unit-number. Supported tape formats are as follows: DEVICE_DEFAULT: The default tape format is the default that the device uses or, in the case of devices that are settable via switches on the front panel, the settings of those switches. 800BPI_9TRACK, 1600BPI_9TRACK, 6250BPI_9TRACK, TZ85, TZ86, TZ87_NOCOMPRESSION, TZ87_COMPRESSION, DAT_NOCOMPRESSION, DAT_COMPRESSION, 3480_NOCOMPRESSION, 3480_COMPRESSION. When entering the ADD UNIT command for a tape device, DEFAULT_FORMAT=DEVICE_DEFAULT is the default.
CLI> SET D1 WRITE_PROTECT NOREAD_CACHE
Write protect and turn off the read cache on unit D1.
CLI> SET T47 DEFAULT_FORMAT=1600BPI_9TRACK
Set unit T47 to 1600 bpi.
Command Line Interpreter B-43. SHOW CDROMS: Shows all CDROM drives and drive information. Note: This command is valid for HSJ and HSD controllers only. Format: SHOW CDROMS. Description: The SHOW CDROMS command displays all the CDROM drives known to the controller. Qualifiers: FULL: If the FULL qualifier is specified, additional amplifying information may be displayed after each device. Examples:
CLI> SHO CDROM
Name Type Port Targ Lun Used by
CDROM230 cdrom 2 3 0 D623
CDROM240 cdrom 2 4 0 D624
A normal listing of CDROMs.
CLI> SHO CDROM FULL
Name Type Port Targ Lun Used by
CDROM230 cdrom 2 3 0 D623
DEC RRD44 (C) DEC 3593
CDROM240 cdrom 2 4 0 D624
DEC RRD44 (C) DEC 3593
A full list
453. ommand on the other controller. Explanation: This error results from a SET NOFAILOVER command. This controller was unable to communicate with the other controller to notify it that it is no longer in dual-redundant mode. Typically, this occurs when the other controller has already been removed prior to the SET NOFAILOVER command. A SET NOFAILOVER command should be entered on the other controller as soon as possible. Warning 9030: Cannot determine if the correct device type is at the PTL specified. Explanation: When a device is added, the location specified is checked to see if the correct device type is present. This error results when no device responds from the location specified. Check the physical configuration and the PTL that was specified. Command Line Interpreter B-75. Warning 9040: There is currently a <device type> at the PTL specified. Explanation: When a device is added, the location specified is checked to see if the correct device type is present. This error results when a device different from the one specified is found at the location specified (for example, a tape is found where a disk was added). Check the physical configuration and the PTL that was specified. Warning 9050: <device type> <device name> at PTL <port> <target> <lun> No device installed. Explanation: When a unit is added, the configuration of the disks that make up the unit is checked. If no device is found at the PTL speci
454. ompleted for a write of controller metadata to a location outside the user data area of the disk. Note that due to the way Bad Block Replacement is performed on SCSI disk drives, information on the actual replacement blocks is not available to the controller and is therefore not included in the event report. No command control structures available for disk operation. Note that in this instance the asc and ascq fields are undefined. No command control structures available for tape operation. Note that in this instance the asc and ascq fields are undefined. No command control structures available for media loader operation. Note that in this instance the asc and ascq fields are undefined. No command control structures available for operation to a device that is unknown to the controller. Note that in this instance the asc and ascq fields are undefined. SCSI interface chip command timeout during disk operation. Note that in this instance the asc and ascq fields are undefined. SCSI interface chip command timeout during tape operation. Note that in this instance the asc and ascq fields are undefined. SCSI interface chip command timeout during media loader operation. Note that in this instance the asc and ascq fields are undefined. SCSI interface chip command timeout during operation to a device that is unknown to the controller. Note that in this instance the asc
455. on so that the test will perform exactly as the one that just completed. However, there is one exception: If the previous test was the Basic Function test with the initial write pass, and the initial write pass completed, the initial write pass is not performed when the test is restarted. Change unit: DILX allows you to drop or add units to testing. For each unit dropped, another unit must be added, until all units in the configuration have been tested. The unit chosen will be tested with the same parameters that were used for the unit that was dropped from testing. When you have completed dropping and adding units, all performance statistics are initialized and DILX execution resumes with the same parameters as the last run. Drop unit x (y/n) [n] ? Explanation: This question is displayed if you choose to change a unit as an answer to the previous (reuse parameters) question. Enter the unit number that you wish to drop from testing. Diagnostics, Exercisers, and Utilities 6-13. The new unit will be write enabled. Do you wish to continue (y/n) [n] ? Explanation: This question is displayed if you choose to change a unit as an answer to the reuse parameters question. It is only asked if the unit being dropped was write enabled. This question gives you the chance to terminate DILX testing if you do not want data destroyed on the new unit. Enter N to terminate DILX. 6.2.5 DILX Output Messages. The following message is displayed when DIL
456. onfigured yet
3F 00 Target operating conditions have changed
3F 01 Microcode has been changed
3F 02 Changed operating definition
3F 03 Inquiry data has changed
43 00 Message error
44 00 Internal target failure
45 00 Select or reselect failure
46 00 Unsuccessful soft reset
47 00 SCSI parity error
48 00 Initiator detected error message received
49 00 Invalid message error
4A 00 Command phase error
4B 00 Data phase error
4C 00 Logical unit failed self configuration
4E 00 Overlapped commands attempted
53 00 Media load or eject failed
53 02 Medium removal prevented
5A 00 Operator request or state change input (unspecified)
5A 01 Operator medium removal request
5B 00 Log exception
5B 01 Threshold condition met
5B 02 Log counter at maximum
5B 03 Log list codes exhausted
40 nn Diagnostic failure detected on component nn, where nn identifies a specific target device component (nn range: 80 through FF). Refer to documentation provided by the vendor of the target device for a description of the component identified by nn.
C-76 HSJ Series Error Logging
Table C-17 HSJ30/40 Controller Vendor Specific SCSI ASC/ASCQ Codes
ASC Code, ASCQ Code, Description
3F 85 Test Unit Ready or Read Capacity command failed
3F 87 Drive failed by a Host Mode Select command
3F 88 Drive failed due to a deferred error reported by drive
3F 90 Unrecovered Read/Write error
3F C0 No response from one or more drive
457. onger valid. Note that in this instance, if the connection ID field is zero, the content of the VCSTATE, remote node name, remote connection id, and connection state fields are undefined. Received SCS DISCONNECT_REQ on a connection that is no longer valid. Note that in this instance, if the connection ID field is zero, the content of the VCSTATE, remote node name, remote connection id, and connection state fields are undefined. Received SCS DISCONNECT_RSP on a connection that is no longer valid. Note that in this instance, if the connection ID field is zero, the content of the VCSTATE, remote node name, remote connection id, and connection state fields are undefined. Received SCS CREDIT_REQ on a connection that is no longer valid. Note that in this instance, if the connection ID field is zero, the content of the VCSTATE, remote node name, remote connection id, and connection state fields are undefined. Received SCS CREDIT_RSP on a connection that is no longer valid. Note that in this instance, if the connection ID field is zero, the content of the VCSTATE, remote node name, remote connection id, and connection state fields are undefined. Received SCS APPL_MSG on a connection that is no longer valid. Note that in this instance, if the connection ID field is zero, the content of the VCST
458. ons of HS operating firmware supported only five-member storage sets. The OpenVMS VAX operating system maximum capacity restriction for file-structured volumes (16,777,216 blocks, about 8.5 gigabytes) remains in effect for operating system versions prior to V6.0. The CLUSTER_SIZE qualifier for large devices or storage sets: Digital recommends that the formula displayed by the OpenVMS HELP DEVICE INIT CLUSTER_SIZE command be used to determine the proper OpenVMS file system cluster size. Using too small a file system cluster size may prevent some of the device or storage set capacity from being accessed; too large a cluster size usually wastes storage capacity by allocating large blocks of storage for small files. 4-14 Normal Operation. Shadow set operation: In the OpenVMS VAX operating system versions earlier than V6.0, timed-out (T/O) requests to shadow set members may lead to member disks attached to controllers being dropped from shadow sets. In some cases, this may lead to host crashes. To avoid this possibility, Digital recommends changing the value of the SYSGEN parameter SHADOW_MBR_TMO to at least 120 seconds for systems running operating system versions earlier than V6.0. Be aware that your system may temporarily pause during the 120-second interval. Version 6.0 of the OpenVMS VAX operating system avoids this problem by retrying timed-out operations to shadow set members several times. PAPOLLINTERVAL and PANUMPOLL param
459. ontent of the keyspec field is valid if and only if this bit is set to one. Progress Indication: This subfield is a percent complete indication in which the returned value is the numerator that has 10000 as its denominator. The progress indication is based upon the total format operation, including any certification or initialization operations. C.2.3 Specific Event Log Formats. In addition to the common fields generated across certain event logs, there is specific information for each log based on template type. The specific information is described in Sections C.2.3.1 through C.2.3.14. C.2.3.1 Last Failure Event Log (Template 01). Unrecoverable conditions detected by either the firmware or hardware, and certain operator-initiated conditions, result in the termination of HSJ30/40 controller operation. In most cases, following such a termination, the controller will attempt to restart (that is, reinitialization) with hardware components and firmware data structures initialized to the states necessary to perform normal operations. If the restart is successful, and communications are reestablished with the host system(s), and Miscellaneous error logging is enabled by one or more host systems, the HSJ30/40 controller will send a Last Failure Event Log that describes the condition that caused controller operation to terminate to all host systems that have enabled Miscellaneous error logging on a connection, or connections, establis
460. ontroller Host Interconnect Services firmware component reports errors detected while performing work related to the CI Port communication layer via the CI Port Event Log. The CI Port Event Log will be sent to all host systems that have enabled Miscellaneous error logging on a connection or connections established with the HSJ30/40 controller's Disk and/or Tape MSCP Server. The CI Port Event Log is reported via the T/MSCP Controller Errors error log message format. The format of this event log, including the HSJ30/40 controller-specific fields, is shown in Figure C-22.
CI Port Event Log Format Specific Fields
format: This field contains the value 00, that is, the T/MSCP Controller Errors error log format code.
event code: The values that can be reported in this field for this event log are shown in Table C-24.
reserved (offset 16): This field contains the value 0.
[Figure C-22: CI Port Event Log (Template 31) Format. Fields shown: controller identifier, reserved, chvrsn, csvrsn, instance code, reserved, event time]
instance code: See Section C.2.1 for the description of this field. The values that can be reported in this field for this event log are shown in Table C-24.
templ: See Section C.2.1 for the description of this field. This field contains the value 31 for this event log.
tdisize: See Section C.2.1 for the description of this field. This field contains the value 0C for this event l
461. DILX Data Patterns; DILX Abort Codes and Definitions; DILX Error Codes and Definitions; VTDPY Control Keys; VTDPY Commands; Thread Description; Cache Upgrade: HSJ40 Controller; Cache Upgrade: HSJ30 Controller; Cache Upgrade: HSD30 Controller; Cache Upgrade: HSZ40 Controller; Module Removal; Module Replacement; HSJ40 FRUs; HSJ30 FRUs; HSD30 FRUs; HSZ40 FRUs; Controller Related FRUs; Template Types; Firmware Component Identifier Codes; Host Interconnect Services Status Codes; CI Message Operation Codes; CI Virtual Circuit State Codes; Port/Port Driver Message Operation Codes; System Communication Services Message Operation Codes; CI Connection State Codes; Supported SCSI Device Type Codes; SCSI Command Operation Codes; SCSI Buffered Modes Codes; SCSI Sense Key Codes
462. Instance Codes (cont'd): 40670204 039A000A 020A0064 021A0064 038A0101 402A010A 405A020A 03A04002 03A14002 03A24002 03A34002 031A4002 036A4002 03744002 03A40064 03A50064 03A64002 400A640A 03A74002 03A80101 03A94002 03AA4002 03AB4002 03AC4002 03AD4002 03AE4002 03AF4002 021B0064 031B0101 402B010A 405B020A 03B04002 07030B0A 07040B0A 07080B0A 03B14002 03B24002 03B3450A 036B4002 037B4002 039B4002 03B40101 038B450A 03B52002 03B64002 400B640A 03B74402 03B82002 03B92002 03BA0101 03BB0101 03BC0101 03BD450A 07060C01 07070C01 402C010A
Pages: C-83 C-92 C-91 C-84 C-86 C-82 C-82 C-92 C-92 C-92 C-92 C-89 C-91 C-91 C-92 C-92 C-92 C-81 C-92 C-92 C-92 C-92 C-92 C-92 C-92 C-92 C-92 C-84 C-89 C-82 C-82 C-92 C-78 C-78 C-79 C-92 C-92 C-92 C-91 C-91 C-92 C-84 C-91 C-84 C-84 C-81 C-85 C-85 C-85 C-85 C-86 C-86 C-92 C-79 C-79 C-82
Instance Codes (cont'd): 405C020A 020C2201 030C4002 031C4002 037C4002 03904002 036C430A 400C640A 03C80101 03C92002 03CA4002 03CB0101 03CC0101 03CD2002 03CE2002 03CF0101 030D000A 403D020A 405D020A 03D04
463. VTDPY Utility; How to Run VTDPY; Using the VTDPY Control Keys; Using the VTDPY Command Line; How to Interpret the VTDPY Display Fields; The CONFIG Utility; Running the CONFIG Utility; HSZUTIL Virtual Maintenance Terminal Application; General Implementation Considerations; Restrictions; DEC OSF/1 for Alpha AXP Implementations; Running HSZUTIL Under DEC OSF/1 AXP; Description of HSZ-series Controller Virtual Terminal Protocol Diagnostic Pages; Virtual Maintenance Terminal Communications Protocol; Protocol Notes; Host Virtual Terminal I/O Algorithm
7 Removing and Replacing Field Replaceable Units
464. Adding Devices; Adding Storage Sets; Initializing Containers; Adding Logical Units; Device Configuration Examples
C HSJ Series Error Logging
C.1 Reading an HSJ Series Error Log
C.2 Event Log Formats
C.2.1 Implementation Dependent Information
C.2.2 Common Event Log Fields
C.2.2.1 CI Host Interconnect Services Common Event Log Fields
C.2.2.2 Host Server Connection Common Fields
C.2.2.3 Byte Count Logical Block Number Common Fields
C.2.2.4 Device Location Identification Common Fields
C.2.2.5 SCSI Device Sense Data Common Fields
C.2.3 Specific Event Log Formats
C.2.3.1 Last Failure Event Log (Template 01)
C.2.3.2 Failover Event Log (Template 05)
C.2.3.3 Nonvolatile Parameter Memory
465. ort buttons at this time will light its corresponding amber LED and quiesce its SCSI-2 port. You must quiesce a port to remove or warm swap a device on the SCSI-2 bus for that port. Once you replace the device, you can press the button again to turn off the LED and reactivate the port. See Chapter 7 for a detailed description of removing and replacing devices.
5.4.2 Fault Notification
The OCP LEDs display information when the HS controller encounters a problem with a device configuration, a device, or the controller itself. Should a configuration mismatch or a device fault occur, the amber LED for the affected device's bus will light continuously. For controller problems, LED codes determined by internal diagnostics and operating firmware will indicate either controller faults or HS operating firmware program card faults. In either case, the single green reset LED lights continuously when an error is detected. The remaining amber LEDs display the error codes in two different ways:
• The error code will be lit continuously for faults detected by internal diagnostic and initialization routines. See Figure 5-2 to determine what these codes mean.
• The error code will flash at 3 Hz, representing faults that occur during normal controller operation. See Figure 5-3 to determine what these codes mean.
[Figure 5-2: Solid OCP Codes. LED columns: Reset, 1, 2, 3, 4, 5, 6]
466. ost based DUP connection. For SCSI-based HS controllers, the VTDPY utility can be run only on terminals connected to the HS controller maintenance terminal port.
Note: VCS can only be used from a terminal attached to the EIA-423 terminal port of the controller.
The VTDPY utility is conceptually based on the HSC utility of the same name. Though the information displayed differs from the HSC utility due to system implementation differences, a user familiar with the HSC utility should be able to easily understand this display terminology. The following sections show how to use the VTDPY utility.
6.5.1 How to Run VTDPY
Only one VTDPY session can be run on each controller at one time.
Note: Prior to running VTDPY, be sure the terminal is set in NOWRAP mode. Otherwise, the top line of the display scrolls off of the screen.
To initiate VTDPY from the maintenance terminal, enter the following command at the CLI> prompt:
CLI> RUN VTDPY
To initiate VTDPY from a virtual terminal, refer to Chapter 4.
6.5.1.1 Using the VTDPY Control Keys
Use the following control key sequences to work the VTDPY display.
Table 6-11 VTDPY Control Keys
Control Key Sequence    Function
Ctrl/C                  Prompts for commands
Ctrl/G                  Updates the screen (same as Ctrl/Z)
Ctrl/O                  Pauses or resumes screen updates
Ctrl/R                  Refreshes current screen display (same as Ctrl/W)
Ctrl/W                  Refreshes current sc
467. ost buses unterminated during service. How you service your cables, and what devices you may leave running, terminated, and so on, will depend on your configuration.
(Optional) The trilink connector may be considered part of the SCSI host cable during service.
7.7.1 Tools Required
You will need the following tools to remove or replace SCSI host cables:
• 5/32-inch Allen wrench
• Tie wrap cutters
• Flat-head screwdriver
[Figure 7-9: SCSI Host Cable. Labels: trilink connector, terminator, SCSI host cable (CXO-4205A-MC)]
7.7.2 Precautions
Refer to Chapter 1 for SCSI host cable handling guidelines.
7.7.3 Cable Removal
Use the following procedure to remove SCSI host cables:
1. Disconnect the SCSI host cable from the host or other device (the device at the other end of the cable from the controller).
2. If necessary to access the HSZ-series controller, unlock and open the cabinet (SW800 series) using a 5/32-inch Allen wrench.
3. Loosen the captive screws on the SCSI host cable where it attaches to the trilink connector on the front of the controller, and disconnect the cable.
4. Remove the SCSI host cable from the cabinet, cutting tie wraps as necessary.
5. (Optional) Loosen captive screws and remove the terminator or secondary SCSI host
468. [...] postcard. Should the labeling and the registration postcard temporarily not be included in the delivery, please contact your nearest Digital Equipment customer service office.
1 General Information and Subsystem Overview
This chapter contains general information and technical overview information on the hierarchical storage (HS) controller. For purposes of this manual, HS controller refers to several models, as shown in Table 1-1.
Table 1-1 HS Controller Models
Type          Model
HSJ series    HSJ40, HSJ30
HSD series    HSD30
HSZ series    HSZ40
Controllers not covered in this manual: any HSC controller, HSD05, HSZ1x
1.1 Technical Overview
The HS controllers are an integral part of Digital's family of array controllers. The controllers connect Small Computer System Interface generation 2 (SCSI-2) storage devices to a variety of host interfaces, including CI, DSSI, and SCSI. Each HS controller consists of the following:
• A controller module
• A read cache module (optional)
The two modules are housed together in a BA350-MA controller shelf. The controller shelf can be inserted in different StorageWorks cabinets. The cabinets are shown in Figures 1-1 and 1-2. Firmware that controls the HS controllers (hierarchical storage operating firmware) resides on a Personal Computer Memory Card International Association (PCMCIA) program card. The card plugs into the controller module. To receive the most cu
469. otes with an alphabetic character first. Each SCS node name must be unique within its VMScluster. (See Section 4.9.2 for important information about VMS node names.)
4. Enter the following command to set the MSCP allocation class (HSJ- and HSD-series controllers):
CLI> SET THIS_CONTROLLER MSCP_ALLOCATION_CLASS=n
where n is 0 through 255.
5. Enter the following command to set the TMSCP allocation class (HSJ- and HSD-series controllers):
CLI> SET THIS_CONTROLLER TMSCP_ALLOCATION_CLASS=n
where n is 0 through 255.
Note: Always restart the controller after setting the ID, SCS node name, or allocation classes.
6. Restart the controller, either by pressing the green reset button or by entering the following command:
CLI> RESTART THIS_CONTROLLER
7. Enter the following command to verify the preceding parameters were set:
CLI> SHOW THIS_CONTROLLER
CAUTION: Do not plug the host port cable into an HSD-series controller while the power is on to any devices on the DSSI bus. Doing so risks short circuits that may blow fuses on all the devices.
8. Connect the host port cable to the front of the controller (see Chapter 7).
9. Enter the following commands to enable CI paths A and B to the host (HSJ-series controllers):
CLI> SET THIS_CONTROLLER PATH_A
CLI> SET THIS_CONTROLLER PATH_B
Enter the following command to enable the host port path (HSD-series controllers):
CLI>
470. other controller has exclusive access declared for this unit.
Explanation: This message is self-explanatory.
This unit is marked inoperative.
Explanation: The unit could not be allocated for testing because the controller internal tables have the unit marked as inoperative.
The unit does not have any media present.
Explanation: The unit could not be allocated for testing because no media is present.
The RUNSTOP_SWITCH is set to RUN_DISABLED.
Explanation: The unit could not be allocated for testing because the RUNSTOP_SWITCH is set to RUN_DISABLED. This is enabled and disabled through the Command Line Interpreter (CLI).
Unable to continue, run time expired.
Explanation: A continue response was given to the "reuse parameters" question. This is not a valid response if the run time has expired. Reinvoke DILX.
When DILX starts to exercise the disk units, the following message is displayed with the current time of day:
DILX testing started at: xx:xx:xx
Test will run for x minutes
Type ^T (if running DILX through a VCS) or ^G (in all other cases) to get a current performance summary
Type ^C to terminate the DILX test prematurely
Type ^Y to terminate DILX prematurely
6.2.6 DILX End Message Display
To interpret the end message fields correctly, you must contact Digital Multivendor Services. Example 6-1 is an example of a DILX end message display.
Example 6-1 DILX End Message Displ
471. [Tail of the preceding example, Auto Configure with All Units. The console listing is not recoverable apart from its closing lines: DILX Normal Termination, followed by the HSJ> prompt.]
In Example 6-9, DILX is run using the Auto Configure option with the half of all units option.
Example 6-9 Auto Configuration with Half of All Units
HSJ> run dilx
Copyright Digital Equipment Corporation 1993
Disk Inline Exerciser version 1.4
The Auto Configure option will automatically select for testing half or all of the disk units configured. It will perform a very thorough test with WRITES enabled. The user will only be able to select the run time and performance summary options and whether or not to test a half or full configuration. The user will not be able to specify specific units to test. The Auto Configure option is only recommended for initial installations.
Do you wish to perform an Auto Configure (y/n) [n] ? y
If you want to test a dual redundant subsystem, it is recommended that you pick option 2 on the first controller and then option 2 on the other controller.
Auto Configure options are:
1. Configure all disk units for testing. This is recommended for a single controller subsystem
472. p does not disable the shelf or its contents.
• Use cold swap during installation, or when there is no operational shelf power supply. Should this occur on a controller shelf, the controller, cache module, and all associated SCSI buses are disabled until power is restored. On a device shelf, those particular devices are disabled, though their controller will still service devices on other shelves.
7.10.1 Tools Required
You will need a 5/32-inch Allen wrench to remove or replace a power supply.
7.10.2 Precautions
Refer to Chapter 1 for safety guidelines.
7.10.3 Power Supply Removal
Use the following procedure to remove a power supply (see Figure 7-13).
[Figure 7-13: Power Supply Removal (CXO-4177A-MC)]
Note: The cold swap procedure is identical, except you should take the shelf contents (devices or controllers) off line before removing the power supply.
1. Unlock and open the cabinet doors (SW800 series) using a 5/32-inch Allen wrench.
2. Make sure the power status (lower) LED on the power supply is off.
3. Unplug the power supply.
4. Press the two mounting tabs together to release the power supply from the shelf.
CAUTION: The power supply is relatively heavy and can be damaged if dropped. Always use both hands to fully support the power supply during removal.
5. Use both hands to pull the power supply out of the shelf.
473. pace in this column indicates caching is disabled.
KB/S: This column indicates the average number of kilobytes of data transferred to and from the unit in the previous screen update interval. This data is only available for disk and tape units.
Rd%: This column indicates what percentage of data transferred between the host and the unit was read from the unit. This data is only contained in the DEFAULT display for disk and tape device types.
Wr%: This column indicates what percentage of data transferred between the host and the unit was written to the unit. This data is only contained in the DEFAULT display for disk and tape device types.
Cm%: This column indicates what percentage of data transferred between the host and the unit was compared. A compare operation may be accompanied by either a read or a write operation, so this column is not cumulative with the read percentage and write percentage columns. This data is only contained in the DEFAULT display for disk and tape device types.
HT%: This column indicates the cache hit percentage for data transferred between the host and the unit.
Unit Status (full) Description
[Sample unit status display. The column header row is not recoverable; the data rows are:]
D0003 o x 382 0 100 0 0 0 0 0 6880 0
D0250 o r 382 100 0 0 0 0 100 0 6880 0
D0251 o r 284 100 0 0 0 0 100 0 5120 0
D0262 a r 0 0 0 0 0 0 0 0 0 0
D0280 o x 497 44 55 0 0 0 100 0 9011 0
474. pecific information. In all cases, the Instance code, template type, and all requestor-specific information correspond to event/error log device-dependent parameters, while everything else has a one-to-one correspondence to error log fields. See Appendices C and D for a translation of these codes.
Example 6-11 Controller Error
Error Information Packet in hex. Fields, each followed by its hexadecimal value: Cmd Ref Number, Unit Number, Log Sequence, Format, Flags, Event Code, Controller ID, Controller SW ver, Controller HW ver, Multi Unit Code, Instance, Template Type, Requestor Information Size, Requestor Specific Data bytes 0-7, Requestor Specific Data bytes 8-15, Requestor Specific Data bytes xx-xx.
Example 6-12 Memory Error
Error Information Packet in hex. Fields, each followed by its hexadecimal value: Cmd Ref Number, Unit Number, Log Sequence, Format, Flags, Event Code, Controller ID, Controller SW ver, Controller HW ver, Multi Unit Code, Memory Address, Instance, Template Type, Requestor Information Size, Requestor Specific Data bytes 0-7, Requestor Specific Data bytes 8-15, Requestor Specific Data bytes xx-xx.
Example 6-13 Tape Error
Error Information Packet in hex. Fields: Cmd Ref Number, Unit Number, Log
475. performance statistics were not selected and where a controller error was detected:
DILX Summary at 18-JUN-1993 06:18:41
Test minutes remaining: 0, expired: 6
Cnt err in HEX IC:07080064 Key:06 ASC/Q:A0/05 HC:1 SC:0
Total Cntrl Errs Hard Cnt: 1 Soft Cnt: 0
Unit 1 Total IO Requests 482 No errors detected
Unit 2 Total IO Requests 490 No errors detected
For the previous examples, the following definitions apply. These codes are translated in Appendices C and D.
• IC: The HSJ/HSD-series Instance code.
• ASC/Q: The SCSI ASC and ASCQ code associated with this error.
• HC: The hard count of this error.
• SC: The soft count of this error.
• PTL: The location of the unit (Port, Target, LUN).
The performance displays contain error information for up to three unique errors. Hard errors always have precedence over soft errors. A soft error represented in one display may be replaced with information on a hard error in subsequent performance displays.
6.2.11 DILX Abort Codes
Table 6-3 lists the DILX abort codes and definitions.
Table 6-3 DILX Abort Codes and Definitions
Value   Definition
1       An IO has timed out.
2       dcb_p->htb_used_count reflects an available HTB to test IOs, but none could be found.
3       FAO returned either FAO_BAD_FORMAT or FAO_OVERFLOW.
4       TS SEND TERMINAL DATA returned either an ABORTED or INVALID BYTE COUNT.
5       TS READ TERMINAL DATA returned either an ABORTED or
476. performance will degrade due to the heavy load the exercisers impose on the controller.
6.3.1 Invoking TILX
Note: Before running TILX, be sure that all units you wish to test have been dismounted from the host.
The following describes how to invoke TILX from a maintenance terminal at the CLI> prompt, from a VCS, or from a virtual terminal through the DUP connection.
To invoke TILX from a maintenance terminal, enter the following command at the CLI> prompt:
CLI> RUN TILX
To invoke TILX from a maintenance terminal using a VCS, enter the following command at the CLI> prompt:
CLI> VCS CONNECT node name
where node name is the controller's SCS node name. Consult the VAXcluster Console System User's Guide for complete details on using a VCS.
Note: The node name must be specified for a VCS.
To invoke TILX from a virtual terminal, enter the following command for OpenVMS software:
$ SET HOST/DUP/SERVER=MSCP$DUP/TASK=TILX SCS_nodename
where SCS_nodename indicates where TILX will execute.
6.3.2 Interrupting TILX Execution
Use the following guidelines to interrupt TILX execution.
Note: The ^ symbol is equivalent to the Ctrl key. You must press and hold the Ctrl key and type the character key given.
Note: Do not use Ctrl/G from a VCS, because it will cause VCS to terminate. VCS acts on the sequence, and the sequence is never sent to
477. planation: One of the TILX I/Os to this unit did not complete within the command timeout interval and, when examined, was found not progressing. This indicates a failing controller.
TILX terminated prematurely by user request.
Explanation: A Ctrl/Y was entered. TILX interprets this as a request to terminate. This message is then displayed and TILX terminates.
Unit is owned by another sysap.
Explanation: TILX could not allocate the unit specified because the unit is currently allocated by another system application. Terminate the other system application or reset the controller.
Exclusive access is declared for this unit.
Explanation: The unit could not be allocated for testing because exclusive access has been declared for the unit.
The other controller has exclusive access declared for this unit.
Explanation: This message is self-explanatory.
This unit is marked inoperative.
Explanation: The unit could not be allocated for testing because the controller internal tables have the unit marked as inoperative.
The unit does not have any media present.
Explanation: The unit could not be allocated for testing because no media is present.
The RUNSTOP_SWITCH is set to RUN_DISABLED.
Explanation: The unit could not be allocated for testing because the RUNSTOP_SWITCH is set to RUN_DISABLED. This is enabled and disabled through the Command Line Interpreter (CLI).
Unable to continue, run time expired.
Explanation: A continue response w
478. play of the requester-specific information contained in the EIP. Enter N to disable the hex dump.
When the hard error limit is reached, the unit will be dropped from testing. Enter hard error limit (1:65535) [65535] ?
Explanation: Enter a value to specify the hard error limit for all units to test. This question is used to obtain the hard error limit for all units under test. If the hard error limit is reached, DILX discontinues testing the unit that reaches the hard error limit. If other units are currently being tested by DILX, testing continues for those units.
When the soft error limit is reached, soft errors will no longer be displayed but testing will continue for the unit. Enter soft error limit (1:65535) [32] ?
Explanation: Enter a value to specify the soft error limit for all units under test. When the soft error limit is reached, soft errors are no longer displayed, but testing continues for the unit.
Enter IO queue depth (1:12) [4] ?
Explanation: Enter the maximum number of outstanding I/Os for each unit selected for testing. The default is 4.
Enter unit number to be tested ?
Explanation: Enter the unit number for the unit to be tested.
Note: When DILX asks for the unit number, it requires the number designator for the disk, where D117 would be specified as unit number 117.
Unit x will be write enabled. Do you still wish to add this unit (y/n) [n] ?
Explanation: This is a reminder of the consequences of testing
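The ranges and defaults quoted in the DILX questions above can be summarized in a small validation sketch. This is only an illustration of the stated limits, not DILX itself: the names are hypothetical, and the assumption that an empty answer selects the default is not confirmed by the entry.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PromptRange:
    low: int
    high: int
    default: int


# Ranges and defaults as quoted in the DILX questions above.
DILX_PROMPTS = {
    "hard_error_limit": PromptRange(1, 65535, 65535),
    "soft_error_limit": PromptRange(1, 65535, 32),
    "io_queue_depth": PromptRange(1, 12, 4),
}


def resolve_answer(name: str, answer: str) -> int:
    """Return the value an answer selects for the named question."""
    spec = DILX_PROMPTS[name]
    text = answer.strip()
    if not text:
        return spec.default  # assumption: an empty answer selects the default
    value = int(text)
    if not spec.low <= value <= spec.high:
        raise ValueError(f"{name} must be in {spec.low}..{spec.high}")
    return value


def unit_number_from_name(unit_name: str) -> int:
    """Strip the letter designator: D117 is entered as unit number 117."""
    return int(unit_name.strip()[1:])


print(resolve_answer("io_queue_depth", ""))      # 4
print(resolve_answer("soft_error_limit", "100"))  # 100
print(unit_number_from_name("D117"))              # 117
```

An out-of-range answer, such as an IO queue depth of 13, raises `ValueError` in this sketch; how DILX itself responds to such input is not described in the entry.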
479. ported. This field is divided into separate fields specific to the template identified in the templ field. The template-specific fields common to multiple event logs are described in separate subsections of Section C.2.2, to avoid duplication of the field descriptions in Section C.2.3.
C.2.2 Common Event Log Fields
Common fields are generated across certain event logs. These common fields are described in Sections C.2.2.1 through C.2.2.5.
C.2.2.1 CI Host Interconnect Services Common Event Log Fields
The fields common to certain event logs generated by the CI Host Interconnect Services firmware component are shown in Figure C-3.
[Figure C-3: CI Host Interconnect Services Common Event Log Fields. Fields shown include scs opcode and ppd opcode]
CI Host Interconnect Services Common Fields
his status: The Host Interconnect Services status code, as shown in Table C-3.
error id: The address of the Host Interconnect Services routine that detected the event.
src: The CI source node address.
dst: The CI destination node address.
intopcd: The CI message opcode, as shown in Table C-4.
vcstate: The virtual circuit state code, as shown in Table C-5.
Note: The setting of the high-order bit (Bit 7) in this field indicates the state of ID polling for the virtual circuit. If Bit 7 is set, ID polling is complete. Otherwise, ID polling is incomplete.
ppd opcode: The Port/Port Driver layer op
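The note on the virtual circuit state field in the entry above can be illustrated with a short decoding sketch. It assumes the field is a single byte whose low seven bits carry the state code from Table C-5; the entry only specifies the meaning of Bit 7, and the function and constant names are illustrative.

```python
from typing import Tuple

ID_POLLING_COMPLETE_MASK = 0x80  # Bit 7, the high-order bit of the field
STATE_CODE_MASK = 0x7F           # assumption: remaining bits hold the state code


def decode_vc_state(field: int) -> Tuple[int, bool]:
    """Split the virtual circuit state field into (state code, ID polling complete)."""
    if not 0 <= field <= 0xFF:
        raise ValueError("virtual circuit state field must be one byte")
    state_code = field & STATE_CODE_MASK
    id_polling_complete = bool(field & ID_POLLING_COMPLETE_MASK)
    return state_code, id_polling_complete


print(decode_vc_state(0x82))  # (2, True)
print(decode_vc_state(0x02))  # (2, False)
```

Masking the flag before looking up the state code avoids treating a value with Bit 7 set as an unknown state.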
480. porting disabled. Unit x
Explanation: This message indicates that the soft error limit has been reached, and therefore no more soft errors will be displayed for this unit.
Hard error limit reached, unit x dropped from testing.
Explanation: This message indicates that the hard error limit has been reached and the unit must be dropped from testing.
Soft error reporting disabled for controller errors.
Explanation: This indicates that the soft error limit has been reached for controller errors. Thus, controller soft error reporting is disabled.
Hard error limit reached for controller errors. All units dropped from testing.
Explanation: This message is self-explanatory.
Unit is already allocated for testing.
Explanation: This message is self-explanatory.
No drives selected.
Explanation: DILX parameter collection was exited without choosing any units to test.
Maximum number of units are now configured.
Explanation: This message is self-explanatory. Testing will start after this message is displayed.
Unit is write protected.
Explanation: The user wants to test a unit with write commands or erase commands, or both, enabled, but the unit is write protected.
The unit status and/or the unit device type has changed unexpectedly. Unit x dropped from testing.
Explanation: The unit status may change if the unit experienced hard errors or if the unit is disconnected. Either way, DILX cannot continue testing the unit.
Last Failu
481. position type value while trying to generate a command for the position-intensive phase of the Basic Function test.
While trying to display an EIP, TILX discovered an unsupported MSCP error log format.
TILX expected a deferred error to be on the receive deferred error question, but no deferred errors were there.
TILX was asked to fill a data buffer with an unsupported data pattern.
TILX could not process an unsupported answer in tx reuse params.
Table C-48 Automatic Device Configuration Program (CONFIG) Last Failure Codes
Code       Description
83010100   The CLI prompt was not returned to the Auto Config virtual terminal code within a reasonable amount of time.
83020100   An unsupported message type or terminal request was received by the Auto Config virtual terminal code from the CLI.
83030100   Not all alter_device requests completed within the timeout interval.
Table C-49 Controller Restart Codes
Code   Description
0      Full restart
1      No restart
C.4 Event Notification/Recovery Threshold
An Event Notification/Recovery Threshold value is assigned to each significant event that can be reported by an HSJ30/40 controller. The Event Notification/Recovery Threshold values and their meanings are shown in Table C-50.
Table C-50 Event Notification/Recovery Threshold Classifications
Threshold Value   Classification   Description
01                IMMEDIATE        Failure or potential failure of a component
482. protocol used by the HSJ- and HSD-series controllers to communicate with the host computer. TMSCP is tape specific, but overlaps and shares certain portions of MSCP.
transportable
A device setting that indicates the device is not MSCP compliant and does not contain metadata. Transportable devices can be moved between HS controller subsystems and non-HS controller systems. However, such devices do not support forced error and should not be set to transportable after correct installation in an HS controller subsystem. See also nontransportable.
VAXcluster System Console
See VCS.
VCS
VAXcluster System Console. This terminal allows access to hosts by networks. Another method of accessing the controller. See also DUP.
virtual terminal
A software path from an operator terminal on the host to the controller's CLI interface. The path can be established via the host port on the controller (using DUP) or via the maintenance port through an intermediary host (VCS). A virtual terminal is also sometimes called a host console.
warm swap
A controller function that allows devices to be added, removed, or replaced while the subsystem remains operational. All activity on the device's SCSI bus must normally be halted for the duration of the warm swap operation.
write-through cache
A technique for handling host write requests in read caches. When the host requests a write operation, the cache writes data directly to the external storage devic
483. pting to use the same SCSI ID (either 6 or 7, as indicated in the event report). Note that the other HSJ30/40 controller of the dual-redundant pair has been reset with the Kill line by the HSJ30/40 controller that reported the event. Two possible problem sources are indicated:
• An HSJ30/40 controller hardware failure
• A controller backplane failure
Follow repair action 20 for the Killed HSJ30/40 controller. If the problem persists, then follow repair action 20 for the Surviving HSJ30/40 controller. If the problem still persists, then replace the controller backplane.
20   Replace HSJ30/40 controller module. Refer to Chapter 7 for proper replacement procedure.
22   Replace indicated HSJ30/40 cache module.
40   If the Sense Data FRU field is nonzero, follow repair action 41. Otherwise, replace the appropriate FRU associated with the device's SCSI interface, or the entire device.
41   Consult the device's maintenance manual for guidance on replacing the indicated device FRU.
43   Update the configuration data to correct the problem.
44   Replace the SCSI cable for the failing SCSI bus. If the problem persists, replace the controller backplane, drive backplane, or controller module.
45   Interpreting the device-supplied Sense Data is beyond the scope of the HSJ30/40 controller firmware. Refer to the device documentation to determine the appropriate repair action, if any.
60   Swap the transmit and receive cables for the indicated path.
61
484. ption of this field. The values that can be reported in this field for this event log are shown in Table C-23.

templ
See Section C.2.1 for the description of this field. This field contains the value 14 for this event log.

tdisize
See Section C.2.1 for the description of this field. This field contains the value 34 for this event log.

reserved (offset 22)
This field contains the value 0.

event time
See Section C.2.1 for the description of this field.

Figure C-21 Memory System Failure Event Log (Template 14) Format
[Figure: longword layout, bits 31 through 0. Field labels that survive in this excerpt: command reference number, sequence number, reserved, controller identifier, memory address, reserved, event time, byte count, reserved, chvrsn, csvrsn.]

byte count
The number of bytes contained in the bad memory area, that is, the area bounded by memory address through memory address + byte count - 1.

dsr, csr, desr, der, ear, edr, err, rsr
These fields contain the values contained in the registers of the DRAB that detected the memory failure.

rdr0, rdr1, wdr0, wdr1
These fields contain the values contained in the HSJ30/40 controller's Read and Write Diagnostic registers.

Note that the content of certain of the fields described previously may be undefined, depending on the value supplied in the instance code field. See Table C-23 for more detail.

C.2.3.7 CI Port Event Log (Template 31)
The HSJ30/40 c
485. r. See Chapter 6 for detailed information on this utility.

Controller warm swap (C_SWAP), for HSJ series controllers, efficiently removes and replaces one controller in a dual-redundant configuration. When you warm swap a controller, you are changing out a controller in the most transparent method available to the HS controller subsystem. Warm swapping a controller has minimal system and device impact, as explained in Chapter 7.

Configure (CONFIG) checks the SCSI device ports for any device not previously added. This utility adds and names these devices. See Chapter 6 for more information on the configuration utility.

2.2.3.5 Error Logging and Fault Management
Error Logging and Fault Management is an integrated function that collects system errors in a central firmware location and sends the error information to the host. See Chapter 5 and Appendices C through E for more information on error logging.

2.2.4 Device Services
SCSI-2 device service firmware includes device port drivers, mixed disk and tape support on one controller, and physical device addressing and access. Device service consists of normal functions, such as read and write, plus error recovery code. It also contains firmware for controlling and observing the BA350-SB shelf and StorageWorks building blocks (SBBs), such as LED, power, and blower monitoring. Specific features include the following:
• Normal SCSI-2 (8-bit, single-ended) support
• FAST synchro
486. r at least 30 seconds prior to removing them from the device shelf. Gyroscopic motion from a spinning disk may cause you to drop and damage the SBB.
5. Remove any SBBs necessary to access the SCSI cable, as shown in Figure 7-11. Press down on the two SBB mounting tabs to release the SBB from the shelf, and pull the device straight out.
6. Remove the cable from the BA350-SB device shelf backplane by pinching the cable connector side clips and disconnecting the cable.

Figure 7-11 SCSI Device Cables
[Figure: device shelf and SCSI device cables (CXO 4123A MC).]

7.8.4 Cable Replacement (Installation)
Use the following procedure to replace device port cables.

CAUTION: Be very careful when inserting cable connectors into connectors within the BA350-MA and BA350-SB shelves. Inserting a poorly aligned cable connector can damage the shelf connector. You must replace the entire shelf if its connectors are damaged.

1. For the device shelf connector, gently slide the cable connector in from one side to the other, and rock the connector from top to bottom to seat it. Listen for the connector to snap into place. For the controller shelf connector, gently slide the cable connector in from one side to the other, and rock the connector from top to bottom to seat it.
487. r performance summary interval in minutes (1:65535) [10]?
Explanation: Enter a value to set the interval at which a performance summary is displayed. The default is 10 minutes.

Include performance statistics in performance summary (y/n) [n]?
Explanation: Enter Y to see a performance summary that includes the performance statistics, that is, the total count of read and write I/O requests and the kilobytes transferred for each command type. Enter N and no performance statistics are displayed.

Display hard/soft errors (y/n) [n]?
Explanation: Enter Y to enable displays of sense data and deferred errors. Enter N to disable error reporting. The default is disabled error reporting.

When the hard error limit is reached, the unit will be dropped from testing. Enter hard error limit (1:65535) [65535]?
Explanation: Enter a value to specify the hard error limit for all units under test. If the hard error limit is reached, DILX discontinues testing the unit that reaches the limit. If other units are currently being tested by DILX, testing continues for those units.

When the soft error limit is reached, soft errors will no longer be displayed but testing will continue for the unit. Enter soft error limit (1:65535) [32]?
Explanation: Enter a value to specify the soft error limit for all units under test. When the soft error limit is reached
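The hard and soft error limits in the excerpt above have different consequences once reached. The following is a minimal Python sketch of that behavior, not DILX code: the `UnitState` record and the `record_error` helper are hypothetical names introduced only for illustration, and treating "reached" as "count equals or exceeds the limit" is an assumption.

```python
from dataclasses import dataclass


@dataclass
class UnitState:
    """Hypothetical per-unit bookkeeping (not a DILX data structure)."""
    hard_errors: int = 0
    soft_errors: int = 0
    testing: bool = True           # False once the unit is dropped from testing
    show_soft_errors: bool = True  # False once the soft error limit is reached


def record_error(unit, kind, hard_limit=65535, soft_limit=32):
    """Count one error against a unit and apply the limit rules.

    Defaults mirror the prompt defaults quoted in the excerpt
    (hard limit 65535, soft limit 32).
    """
    if kind == "hard":
        unit.hard_errors += 1
        if unit.hard_errors >= hard_limit:
            # Hard limit reached: this unit is dropped, others keep running.
            unit.testing = False
    elif kind == "soft":
        unit.soft_errors += 1
        if unit.soft_errors >= soft_limit:
            # Soft limit reached: stop displaying soft errors, keep testing.
            unit.show_soft_errors = False
    else:
        raise ValueError(f"unknown error kind: {kind!r}")
    return unit
```

Each unit carries its own counters, which matches the statement that testing continues for the other units when one unit reaches its hard error limit.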
488. r the Free Buffer Array.
02080100  A call to EXEC$ALLOCATE_MEM_ZEROED failed to return memory when populating the disk read DWD stack.
02090100  A call to EXEC$ALLOCATE_MEM_ZEROED failed to return memory when populating the disk write DWD stack.
020A0100  A call to EXEC$ALLOCATE_MEM_ZEROED failed to return memory when populating the tape read DWD stack.
020B0100  A call to EXEC$ALLOCATE_MEM_ZEROED failed to return memory when populating the tape write DWD stack.
020C0100  A call to EXEC$ALLOCATE_MEM_ZEROED failed to return memory when populating the miscellaneous DWD stack.
020E0100  A call to RESMGR$ALLOCATE_SEND_DATA_DESC failed to return a send data descriptor when populating the send_dd_stack.
020F0100  A call to RESMGR$ALLOCATE_RCV_DATA_DESC failed to return a receive data descriptor when populating the rcv_dd_stack.
02100100  A call to EXEC$ALLOCATE_MEM_ZEROED failed to return memory when creating the device services state table.
02170100  Unable to allocate memory for the Free Node Array.
02180100  Unable to allocate memory for the Free Buffer Descriptor Array.
021B0100  A call to EXEC$ALLOCATE_MEM_ZEROED failed to return memory when populating the disk read EDC DWD stack.
021C0100  A call to EXEC$ALLOCATE_MEM_ZEROED failed to return memory when populating the disk write EDC DWD stack.
021D0100  Unable to allocate memory for the Free Buffer Array.
021E0100  Unable to allocate memory for the Free Strip Node Array.
021F0100  Unable to allo
489. r was found not to be running. If the other controller is in the process of restarting, retry the command later. If the other controller is shut down or turned off, start it. If the other controller is no longer present, enter a SET NOFAILOVER command to take it out of dual-redundant mode.

Error 6020: Initial failover handshake not yet complete
Explanation: For a short period of time after start-up, the two controllers must communicate to set up dual-redundant mode. This setup time is typically less than 1 minute. If commands that require controller-to-controller communication are entered during this setup time, error 6020 results. Retry the command later.

Error 6030: Unable to communicate with the other controller to setup FAILOVER
Explanation: FAILOVER could not be set up due to communication problems between the controllers. Retry the command later.

Error 6040: The write of the other controller's configuration information did not succeed; information may be in an inconsistent state. Before further use both controllers should be removed from dual redundant mode (SET NOFAILOVER) and then placed back into dual redundant mode (SET FAILOVER) to assure consistency
Explanation: Communication was lost in the middle of a SET FAILOVER command. Follow the instructions included in the error message.

Error 6050: Communication failure with other controller while putting controllers into dual redundant mode. Reissue the SET FAILOVER co
490. re Information follows. This error was NOT produced by running DILX. It represents the reason why the controller crashed on the previous controller run.
Explanation: This message may be displayed while allocating a unit for testing. It does not indicate any reason why the unit is or is not successfully allocated, but rather represents the reason why the controller went down in the previous run. The information that follows this message is the contents of an EIP.

Disk unit numbers on this controller include:
Explanation: After this message is displayed, a list of disk unit numbers on the controller is displayed.

IO to unit x has timed out. DILX aborting.
Explanation: One of the DILX I/Os to this unit did not complete within the command timeout interval and, when examined, was found not to be progressing. This indicates a failing controller.

DILX terminated prematurely by user request.
Explanation: A Ctrl/Y was entered. DILX interprets this as a request to terminate. This message is displayed and DILX terminates.

Unit is owned by another sysap.
Explanation: DILX could not allocate the unit specified because the unit is currently allocated by another system application. Terminate the other system application or reset the controller.

Exclusive access is declared for this unit.
Explanation: The unit could not be allocated for testing because exclusive access has been declared for the unit. The
491. re being displayed, this is the transfer rate between the controller and the devices.
• Cumulative unit or device request rate per second. When logical units are being displayed, this is the request rate between the host and the controller. When physical devices are being displayed, this is the request rate between the controller and the devices.

Controller Threads Display
[Screen display: one row per active controller thread, with columns that include Pr, Name, Stk, Max, Typ, and Sta. The sample row data is not recoverable from this excerpt.]

Description
This display shows the status and characteristics of the active threads in the controller. Threads that are not active, such as DUP Local Program threads, are not displayed until they become active. If the number of active threads exceeds the available space, not all of them are displayed.
• The Pr column lists the thread priority. The higher the number, the higher the priority.
• The Name column contains the thread name. For DUP Local Program threads, this is the name used to invoke the program.
• The Stk column lists the allocated stack size in 512-byte pages.
• The Max column lists the number of stack pages
492. reen display (same as Ctrl/R).
Ctrl/Y  Terminates VTDPY and resets screen characteristics.
Ctrl/Z  Updates the screen (same as Ctrl/G).

Note: While VTDPY and the maintenance terminal interface support passing all of the listed control characters, some host-based terminal interfaces restrict passing some of the characters. All of the listed characters have equivalent text string commands.

6.5.1.2 Using the VTDPY Command Line
VTDPY contains a command line interpreter that is invoked by entering Ctrl/C any time after the program has begun execution. The command line interpreter is used to modify the characteristics of the VTDPY display. Commands also exist to duplicate the function of the control keys listed in Section 6.5.1.1.

Table 6-12 VTDPY Commands
Command String        Function
DISPLAY CACHE         Uses 132-column unit caching statistics display.
DISPLAY DEFAULT       Uses default 132-column system performance display.
DISPLAY DEVICE        Uses 132-column device performance display.
DISPLAY STATUS        Uses 80-column controller status display.
EXIT                  Terminates program (same as QUIT).
INTERVAL <seconds>    Changes update interval.
HELP                  Displays help message text.
REFRESH               Refreshes the current display.
QUIT                  Terminates program (same as EXIT).
UPDATE                Updates screen display.

The keywords in the command strings can be abbreviated to the minimum number of characters that are necessary to uniquely identify the keyword
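The abbreviation rule at the end of the excerpt above (a keyword may be shortened to the fewest characters that still identify it uniquely) can be illustrated with a small unique-prefix matcher. This is a sketch in Python, not the VTDPY parser: the function name `resolve_keyword` and its error handling are assumptions, and only the keyword lists come from Table 6-12.

```python
# Keywords taken from the command strings in Table 6-12.
FIRST_KEYWORDS = ["DISPLAY", "EXIT", "INTERVAL", "HELP", "REFRESH", "QUIT", "UPDATE"]
DISPLAY_KEYWORDS = ["CACHE", "DEFAULT", "DEVICE", "STATUS"]


def resolve_keyword(abbrev, keywords):
    """Return the single keyword that starts with abbrev (case-insensitive).

    Raises ValueError if the abbreviation is empty, unknown, or ambiguous.
    """
    text = abbrev.strip().upper()
    if not text:
        raise ValueError("empty keyword")
    if text in keywords:
        # A fully spelled keyword always wins.
        return text
    matches = [k for k in keywords if k.startswith(text)]
    if len(matches) == 1:
        return matches[0]
    if not matches:
        raise ValueError(f"unknown keyword: {abbrev!r}")
    raise ValueError(f"ambiguous keyword: {abbrev!r} matches {matches}")
```

With these lists, `resolve_keyword("d", FIRST_KEYWORDS)` resolves to `DISPLAY`, while `DE` is ambiguous between `DEFAULT` and `DEVICE`, so at least `DEF` or `DEV` is needed for the second keyword.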
493. repair actions.

5.4 Operator Control Panel
The operator control panel (OCP) includes the following:
• One reset button with an embedded green LED
• One button per SCSI port
• Six amber LEDs
Figure 5-1 shows the OCP from the HSZ40 controller. The buttons and LEDs serve different functions with respect to controlling the SCSI ports and/or reporting fault and normal conditions. Button and LED functions are discussed in the following sections.

1 Record which devices have lit or flashing fault LEDs before resetting, as a reset may temporarily clear the LED even though the fault remains.
2 The HSJ series has the amber LEDs embedded in the port buttons.

Figure 5-1 HS Controller Operator Control Panel
[Figure: OCP showing the reset button, port LEDs, and port buttons (CXO 4204A MC).]

5.4.1 Normal Operation
The green LED reflects the state of the controller and the host interface. Once controller initialization completes and its firmware is functioning, the green button flashes continuously at 1 Hz. Pressing the green button during this normal operation resets the controller.

Under normal operation, the amber LEDs indicate the state of the respective SCSI-2 buses attached to the controller. When the devices on the buses are functioning correctly, the amber LEDs are not lit or flashing. Pressing one of the p
494. rmal I/O operations are in progress, as system performance will degrade due to the heavy load the exercisers impose on the controller.

6.2.1 Invoking DILX
Note: Before running DILX, be sure that all units that you wish to test have been dismounted from the host.

The following describes how to invoke DILX from a maintenance terminal at the CLI> prompt, from a VCS, or from a virtual terminal through a DUP connection:
• To invoke DILX from a maintenance terminal, enter the following command at the CLI> prompt:
  CLI> RUN DILX
• To invoke DILX from a maintenance terminal using a VCS, enter the following command at the CLI> prompt:
  CLI> VCS CONNECT node-name
  where node-name is the controller's SCS node name. Consult the VAXcluster Console System User's Guide for complete details on using a VCS.
  Note: The node name must be specified for a VCS.
• To invoke DILX from a virtual terminal using a DUP connection, enter the command for the OpenVMS operating system:
  SET HOST/DUP/SERVER=MSCP$DUP/TASK=DILX SCS_nodename
  Specify the controller's SCS node name to indicate where DILX will execute.

6.2.2 Interrupting DILX Execution
Use the following guidelines to interrupt DILX execution.
Note: The symbol ^ is equivalent to the Ctrl key. You must press and hold the Ctrl key and type the character key given.
Note: Do not use Ctrl/G from
495. rmation field of the Sense Data returned in the response of a SCSI REQUEST SENSE command. The content of this field varies depending on the values contained in the devtype and cmdopcd fields and the bufmode, uweuo, msbd, and fbw subfields of the sdqual field, as follows:
• Regardless of the value of the devtype field and the sdqual subfields, if the cmdopcd is an 18 (COPY), 39 (COMPARE), or 3A (COPY AND VERIFY), this field contains the difference (residue) of the requested number of blocks minus the actual number of blocks copied or compared for the current segment descriptor.
• Regardless of the value of the sdqual subfields, if devtype is 0 (Direct Access Devices, such as magnetic disk) or 5 (CDROM Devices) and cmdopcd is not an 18 (COPY), 39 (COMPARE), or 3A (COPY AND VERIFY), this field contains the unsigned logical block address associated with the value contained in the Sense Key subfield of the snsflgs field (see Figure C-11).
• Regardless of the value of cmdopcd, if devtype is 1 (Sequential Access Devices, such as magnetic tape) and uweuo is 1 and bufmode is either 1 or 2, this field contains the following:
  - The total number of objects in the buffer, if msbd and fbw are both 1.
  - The number of bytes in the buffer, including filemarks and setmarks, if msbd is 1 and fbw is 0.
adds
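The rules in the excerpt above amount to a small decision procedure over devtype, cmdopcd, and the sdqual subfields. A hedged Python sketch follows; the function name and the returned description strings are hypothetical, the opcode values are read as hexadecimal, and checking the COPY-type opcodes first is an assumption about precedence. Cases beyond the point where the excerpt is cut off are reported as not covered rather than guessed.

```python
# SCSI opcodes named in the excerpt: COPY, COMPARE, COPY AND VERIFY (hex).
COPY_TYPE_OPCODES = {0x18, 0x39, 0x3A}


def describe_information_field(devtype, cmdopcd, bufmode=0, uweuo=0, msbd=0, fbw=0):
    """Say what the Sense Data information field holds, per the excerpt.

    Only the cases visible in the excerpt are handled.
    """
    if cmdopcd in COPY_TYPE_OPCODES:
        # Applies regardless of devtype and the sdqual subfields.
        return "block residue: requested blocks minus blocks copied or compared"
    if devtype in (0, 5):
        # Direct access (disk) or CDROM devices.
        return "unsigned logical block address"
    if devtype == 1 and uweuo == 1 and bufmode in (1, 2):
        # Sequential access (tape) devices with buffered data.
        if msbd == 1 and fbw == 1:
            return "total number of objects in the buffer"
        if msbd == 1 and fbw == 0:
            return "number of bytes in the buffer, including filemarks and setmarks"
    return "not covered by this excerpt"
```

For example, a disk (devtype 0) reporting on a READ(6) command (opcode 0x08) falls into the logical block address case.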
496. roller ID(s) when there were still units using those IDs. The current valid unit ranges are given by the <start> and <end> values. Either delete the units that use the ID that will no longer be specified, or retry the SET THIS_CONTROLLER ID= command, specifying the ID being used by the existing units.

Error 5000: A program name may be from 1 to 6 characters
Explanation: This error results from a RUN <program name>.

Error 5010: The requested program is currently busy
Explanation: This error results from a RUN <program name>. The program requested is being run by someone else.

Error 5020: The requested program is unknown
Explanation: This error results from a RUN <program name>. Enter DIR to get a list of available programs.

Error 5030: Insufficient memory for request
Explanation: This error results from a RUN <program name> resource problem. Retry the command later.

Error 6000: Communication failure with other controller
Explanation: There was a communication problem with the other controller. This typically happens if the other controller is shutting down. If these messages happen often when the other controller is not shutting down, call Digital Multivendor Services.

Error 6010: Other controller not present
Explanation: When asked to communicate with another controller (the result of any one of a number of commands), the other controlle
497. roller MMJ. You do not need a maintenance terminal for normal operation. However, you must connect a maintenance terminal for initial controller configuration. Thereafter, use either a maintenance terminal or a host virtual terminal to communicate with the controller. Follow this procedure to connect a maintenance terminal:
1. Make sure the power switch on the back of the terminal is off (O).
2. Connect one end of the terminal cable to the back of the terminal.
3. Connect the other end of the terminal cable to the MMJ on the controller.
4. Set your terminal at 9600 baud, 8 data bits, 1 stop bit, and no parity. Refer to your terminal documentation for terminal setup instructions.

4.6 Virtual Terminal (HSJ and HSD Series Controllers)
After installation and setting of initial controller parameters through a maintenance terminal, controller functions may be executed from a virtual (host) terminal through a DUP connection. Refer to Section 4.3.1 for information on making the virtual connection.

Establishing a virtual terminal session under the OpenVMS VAX and OpenVMS AXP operating systems (SET HOST/DUP) requires the FYDRIVER. The following error indicates that the FYDRIVER has not been loaded:
%HSCPAD-F-DRVNOTLOAD, FYDRIVER not loaded
-SYSTEM-W-NOSUCHDEV, no such device available
If you receive this message, load the FYDRIVER as follows:
• For OpenVMS VAX:
  MCR SYSGEN
  SYSGEN> LOAD SYS$LOADABLE_IMAGES:FYDRIVER
  SYSGEN> CONNECT FYA0/NO
498. roller in the C1 position in an SW800 series or SW500 series cabinet. Refer to Figures 3-1 and 3-5.
† Cannot be configured in SW500 series cabinets.

Table 3-4 5¼-Inch SBB Configurations, 3-Port Controller
Columns: Number of Devices; Number of BA350-SB Shelves; Configure as; Available for 5¼-inch SBBs; Ports Used
[Row data as extracted; the column alignment is not recoverable from this excerpt: 1 2 1 1 2x3T 1 0 1 2 3 4 2 1 2x3T 1 0 3 1 1x6T 5 6 3 3 1x6T 1 0 7 8 4 2 1x6T 1 0 1 1x6J 9 10 5 1 1x6T 1 0 3 2 1x6J 11 12 61 3 1x6J 1 0 3]

Notes:
Each BA350-SB shelf has its upper connector cable attached to either the adjacent BA350-SB shelf's lower connector (1x6J) or a controller port connector (2x3T or 1x6T). The lower connector cable is attached to either an adjacent BA350-SB shelf's upper connector (1x6J, as in the first list item) or a controller port connector (2x3T), or is unused (1x6T).
Consult the StorageWorks Solutions Shelf User's Guide for BA350-SB shelf information.
Available for additional 5¼-inch device.
Cannot be configured in SW500 series cabinets.

3.4.5 Intermixing 5¼-inch and 3½-inch SBBs
Use these guidelines for intermixing 5¼-inch and 3½-inch SBBs:
• Treat each 5¼-inch SBB as three 3½-inch SBBs.
• Each 5¼-inch SBB must have its SCSI-2 ID set manually using the address switch on the rear of the SBB, or by setting the switch to automatic and letting the slot connector dictate the device address. Refer to the StorageWorks Solutions
499. roller in the dual-redundant pair.

4.10.1 Setting Failover
To place two controllers into failover configuration, enter the following command:
CLI> SET FAILOVER COPY=configuration-source
where configuration-source is either THIS_CONTROLLER or OTHER_CONTROLLER, depending on where the good copy of device configuration information is found.

CAUTION: Digital recommends that the controllers be set for failover before any device configuration commands are entered. Then, as devices, storage sets, and units are added to one controller's configuration, they are automatically added to the other controller's configuration.
Given two controllers, it is possible to fully configure one controller and then enter the SET FAILOVER command, but if the wrong configuration source is specified, all device configuration information will be lost (overwritten). Never blindly specify SET FAILOVER. Know where your good configuration information resides before entering the command. A considerable amount of work and effort could easily be lost by overwriting good information.

Note: Due to the amount of information that must be passed between the two controllers, the SET FAILOVER command may take up to one minute to complete.

4.10.2 Exiting Failover
To take two controllers out of the failover configuration, enter the following command:
CLI> SET NOFAILOVER
This removes the controller from the fai
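Because a wrong configuration source overwrites good device configuration, a script that prepares this command may want to validate the source before producing it. The Python sketch below is only an illustration: the helper `set_failover_command` is hypothetical, and the `=` between COPY and the source is an assumption, since the excerpt's punctuation was lost in extraction.

```python
# The two configuration sources named in the excerpt.
VALID_SOURCES = ("THIS_CONTROLLER", "OTHER_CONTROLLER")


def set_failover_command(configuration_source):
    """Build the CLI command text for SET FAILOVER with an explicit source.

    Refuses anything other than the two documented sources, so the command
    is never issued without a deliberate choice.
    """
    source = configuration_source.strip().upper()
    if source not in VALID_SOURCES:
        raise ValueError(
            f"configuration source must be one of {VALID_SOURCES}, got {configuration_source!r}"
        )
    # Assumed syntax: qualifier and value joined with "=".
    return f"SET FAILOVER COPY={source}"
```

The helper only formats text; deciding which controller holds the good configuration remains the operator's responsibility, as the CAUTION in the excerpt stresses.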
500. roller model:
• From the controller SCSI device interface: at the physical device level
• From the host interface: at the virtual device level
Following are descriptions of both levels of storage addressing.

2.3.1 Controller Storage Addressing
Note: This section on controller storage applies to all controller models.

Figure 2-6 shows a typical physical storage device interface for a controller. Each of the controller's six device ports supports a SCSI bus connected with up to six devices. The devices typically reside in a StorageWorks BA350-SB storage shelf. The current implementation of all controllers supports only one controller LUN per physical device. LUN 0 is the default controller LUN address for each device.

Controller Port Target LUN Addressing
Controller Port Target LUN (PTL) addressing is the process by which the controller selects storage space within a specific physical storage device. The process takes place in three steps:
1. The port selection: The controller selects the SCSI bus port connected to a particular device.
2. The target selection: The controller selects the device's SCSI ID (that is, the target) on that port.
3. The LUN selection: The controller selects the desired LUN within that physical device. In the current implementation, there is only one LUN on each device, and its LUN address is always 0.
Note that controller PTL addressing is always tied to a physical storage device.

2.3.2
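The three-step PTL selection in the excerpt above maps naturally onto a small record type. The following Python sketch is illustrative only: the class name `PTL`, the port numbering 1 through 6, and the target ID range 0 through 7 (the ID space of an 8-bit SCSI bus) are assumptions; the excerpt itself states only that there are six device ports and that the LUN is always 0.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PTL:
    """Hypothetical Port-Target-LUN address for one physical device."""
    port: int      # step 1: SCSI bus port on the controller
    target: int    # step 2: the device's SCSI ID on that port
    lun: int = 0   # step 3: always 0 in the implementation described

    def __post_init__(self):
        if not 1 <= self.port <= 6:
            raise ValueError("port must be 1..6 (assumed numbering of six ports)")
        if not 0 <= self.target <= 7:
            raise ValueError("target must be a valid 8-bit SCSI ID, 0..7")
        if self.lun != 0:
            raise ValueError("only LUN 0 is supported per physical device")

    def selection_steps(self):
        """Return the three selections in the order the controller makes them."""
        return [("port", self.port), ("target", self.target), ("lun", self.lun)]
```

For instance, `PTL(port=1, target=3)` describes the device at SCSI ID 3 on port 1, and its LUN defaults to 0.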
501. If the error is associated with a command issued by an HSJ30/40 controller firmware component, the Media Loader Error Event Log will be sent to all host systems that have enabled Miscellaneous error logging on a connection established with the HSJ30/40 controller's Tape MSCP Server.

The Media Loader Error Event Log is reported via the T/MSCP Media Loader Errors error log message format. The format of this event log, including the HSJ30/40 controller-specific fields, is shown in Figure C-29.

Figure C-29 Media Loader Error Event Log (Template 71) Format
[Figure: longword layout. Field labels that survive in this excerpt: command reference number, sequence number, unit number, event code, flags, format, controller identifier, multiunit code, csvrsn, unit identifier, reserved, uhvrsn, usvrsn, media loader identifier, ml unit number, mlsvrsn, instance code, tdisize, templ, reserved, ancillary information, device identification, device serial number, keyspec, frucode.]

Media Loader Error Event Log Format Specific Fields

format
This field contains the value 0A, that is, the T/MSCP Media Loader Errors error log format code.

event code
The values that can be reported in this field for this event log are shown in Table C-31.

instance code
See Section C.2.1 for the description of this field. The values that can be reported in this field for this event log are shown in Table C-31.

templ
See Section C.2.1 for the description of this field. This
502. SET disk-container-name
Modifies the characteristics of a disk drive.

Format
SET disk-container-name

Parameters
disk-container-name
Specifies the name of the disk drive whose characteristics will be modified.

Description
Changes the characteristics of a disk drive.

Qualifiers
TRANSPORTABLE / NOTRANSPORTABLE (D)
In normal operations, the controller makes a small portion of the disk inaccessible to the host and uses this area to store metadata, which improves data reliability, error detection, and recovery. This vast improvement comes at the expense of transportability.
If NOTRANSPORTABLE is specified or allowed to default and there is no valid metadata on the unit, the unit must be initialized. If TRANSPORTABLE is specified and there is valid metadata on the unit, the unit will have to be initialized in order to remove the metadata.
Note: Digital recommends that you avoid specifying TRANSPORTABLE unless transportability of disk drives or media is imperative and there is no other way to accomplish the movement of data.
When entering an ADD DISK command, NOTRANSPORTABLE is the default.

Examples
CLI> SET DISK130 TRANSPORTABLE
DISK130 is made transportable.

SET FAILOVER
Places THIS_CONTROLLER and OTHER_CONTROLLER into a dual-redundant configuration.

Format
SET FA
503. rrent controller and device support. Digital recommends replacing this card with the latest firmware as each new version is released.

Figure 1-1 SW800 Series Data Center Cabinet
[Figure: cabinet photograph (CXO 3658B PH).]

The HSJ and HSD series controllers can be configured alone (nonredundant) or in conjunction with a second controller of the same model (dual-redundant) for improved availability. Dual-redundant configurations support six SCSI-2 devices per port (for example, 36 devices on an HSJ40 controller). Nonredundant, low-availability configurations support up to seven devices per SCSI-2 port, but this setup sacrifices a convenient upgrade to high availability and redundant backup power options.

Note: The HSZ series controllers can only be configured alone (nonredundant).

Digital recommends the dual-redundant configuration for the HSJ and HSD series controllers, to support up to six SCSI-2 storage devices per port.

Figure 1-2 SW500 Series Cabinet
[Figure: cabinet photograph (CXO 4138A PH).]

Refer to the appropriate StorageWorks Array Controller Operating Firmware Release Notes and StorageWorks Firmware Array Controller Software Product Description for supported devices.

Note: In the dual-redundant configuration, make sure that both controllers' cache modules have the same number of megabytes and that both firmware versions are identical. If there is a mismatch, neither
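The device counts in the excerpt above follow from simple arithmetic on ports and devices per port. A minimal Python sketch, assuming a hypothetical helper name `max_devices`; the per-port figures (six when dual-redundant, seven when nonredundant) and the 36-device HSJ40 example come from the excerpt, and the port count is passed in rather than assumed for any particular model.

```python
def max_devices(ports, dual_redundant):
    """Maximum SCSI-2 devices a controller configuration can address.

    Dual-redundant configurations support six devices per port;
    nonredundant configurations support up to seven.
    """
    if ports < 1:
        raise ValueError("ports must be a positive integer")
    devices_per_port = 6 if dual_redundant else 7
    return ports * devices_per_port
```

With six ports, the dual-redundant case gives the 36 devices quoted for the HSJ40, and the nonredundant case gives 42.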
504. rror 9210: Cannot check if drives are online to the other controller
Explanation: When trying to check for online drives on the other controller, there was a communication failure. Retry the command.

Error 9230: Unable to modify switches requested
Explanation: This error results from a SET command. The system is currently busy. Retry the SET command later.

Error 9240: Cannot delete unit in maintenance mode
Explanation: When trying to delete a unit, the unit was found to be in Maintenance mode. This is typically the result of trying to delete a unit that is in use by DILX or TILX. Make sure that DILX and TILX are not being run against the unit that is to be deleted, and retry the command.

Error 9250: Initialize of disk failed
Explanation: Unable to write metadata on disk. Make sure the disk is functioning properly.

Error 9260: Cannot INITIALIZE a container that is still part of a configuration. Delete upper configuration first
Explanation: A container that is part of another configuration or is being used by a unit cannot be initialized. Delete the upper configuration and reenter the INITIALIZE command.

Error 9270: No metadata found on container, unit not created. An INITIALIZE <container name> must be issued before this container may be used
Explanation: You attempted to create a unit from a container that did not have valid metadata. INITIALIZE the metadata on the container, then create
505. rror Event Log                                        51   2              00003C51
Disk Bad Block Replacement (BBR) Attempt Event Log         57   No Longwords
Tape Transfer Error Event Log                              61   2              00003C61
Media Loader Error Event Log                               71   3              00003C71

The MSLG$B_FORMAT field for these templates will read 00 CONTROLLER LOG, so you may want to run the OpenVMS DCL command procedure provided at the end of this appendix (Section C.6) for deskewing the longwords.
• You should use the template type to learn even more from the error log. Information available in longwords other than the instance code includes the following:
  - Template type
  - Template information size
  - Event time
  - Drive sense data
  - Other information specific to the template
Knowing the template type allows you to better use Section C.2 to obtain a complete description of each template and determine where information is located within the associated CONTROLLER DEPENDENT INFORMATION.

C.2 Event Log Formats
Note: The numeric code values discussed in the figures and tables of this appendix are hexadecimal unless otherwise stated.

The HSJ30/40 controller reports significant events that occur during normal controller operation using the following standard MSCP and TMSCP error log message formats:
• Controller Errors
• Memory Errors
• Disk Transfer Errors
• Bad Block Replacement Attempts
• Tape Errors
• Media Loader Errors
• Disk Copy Data Correlation
To more fully use the remaind
506. rs, follow the steps in Section 7.1.4. Replace the controllers one at a time and maintain device service.

Use the following guidelines to simultaneously replace both controllers:
1. Examine the green OCP reset LED on both controllers. Follow basic troubleshooting guidelines (refer to Section 7.1.1) if necessary.

For any fully or partially functioning controller, connect a terminal and enter the following commands:
CLI> SHOW THIS_CONTROLLER FULL
CLI> SHOW DEVICES FULL
CLI> SHOW UNITS FULL
Record the output from the commands and keep it available for reference.

CAUTION: Never remove a controller while it is still servicing devices.

Shut down any fully or partially functioning controller (green LED flashing) by following the guidelines in Section 7.1.2.

Remove both controllers by referring to steps 6 through 13 in Section 7.1.3.3.

Replace the first of the controllers as if this were a nonredundant configuration (refer to Section 7.1.3.4).

Replace the second controller by following the dual-redundant procedure (refer to Section 7.1.4.4).

7.2 Cache Module
Most controller modules will have a read cache module installed behind them in the controller shelf. Currently, there are two read cache modules available: 16 MB and 32 MB.

7.2.1 Tools Required
You will need the following tools to remove or replace the read cache module:
• ESD strap
• noncon
507. rs for I/O rundown in VC Close.
Ci_isr found that the YACI hardware had invalid transmit status on Path A, no bits set.
Ci_isr found that the YACI hardware had invalid transmit status on Path B, no bits set.
CI ISR found the abort bit set without any valid reason, Path A.
CI ISR found transmit parity error without abort bit set, Path A.
CI ISR found buffer underflow without abort bit set, Path A.
CI ISR found the abort bit set without any valid reason, Path B.
CI ISR found transmit parity error without abort bit set, Path B.
CI ISR found buffer underflow without abort bit set, Path B.
Ci_isr found that YACI hardware had a parity error.
Ci_isr found that YACI hardware had a bus timeout error.
Ci_isr found Data parity on Transmit Path A.
Ci_isr found Data parity on Transmit Path B.
Ci_isr found Host Reset on Path A. Last Failure Parameter 0 contains the node number of the resetting node.
Ci_isr found Host Reset on Path B. Last Failure Parameter 0 contains the node number of the resetting node.
Ci_isr found Fetch parity on Transmit Path A.
Ci_isr found Fetch parity on Transmit Path B.

Table C-43 Disk and Tape MSCP Server Last Failure Codes
Code      Description
60000100  Invalid return value from routine HIS$PREPARE_MSG_XMIT, processing write command.
60010100  Invalid return value from routine HIS$PREPARE_MSG_XMIT, processing read command.
60030100  Invalid return value from routine
508. rs result when trying to write user data, the controller will not start self-test unless IGNORE_ERRORS is specified.
CAUTION: Customer data may be lost or corrupted if the IGNORE_ERRORS qualifier is specified.
NOIGNORE_ERRORS is the default.
IMMEDIATE
NOIMMEDIATE (D)
If IMMEDIATE is specified, the self-test starts on the controller immediately, without checking for online devices.
CAUTION: Customer data may be lost or corrupted if the IMMEDIATE qualifier is specified.
NOIMMEDIATE is the default.
B-26 Command Line Interpreter
OVERRIDE_ONLINE
NOOVERRIDE_ONLINE (D)
If any units are on line to the controller, self-test will not take place unless OVERRIDE_ONLINE is specified. If the OVERRIDE_ONLINE qualifier is specified, the controller starts self-test after all customer data is written to disk.
CAUTION: Customer data may be lost or corrupted if the OVERRIDE_ONLINE qualifier is specified.
NOOVERRIDE_ONLINE is the default.
Examples
CLI> SELFTEST OTHER_CONTROLLER
Start the self-test on the other controller, as long as the other controller does not have any units that are on line.
CLI> SELFTEST OTHER_CONTROLLER OVERRIDE_ONLINE
Start the self-test on the other controller even if there are units on line to the other controller.
SELFTEST THIS_CONTROLLER
Format
SELFTEST THIS_CONTROLLER
Description
Runs a self
509. rscore _ for a total of nine characters
Description
Gives a known container a new name by which to be referred.
Examples
CLI> RENAME DISK0 DISK100
Rename container DISK0 to DISK100.
RESTART OTHER_CONTROLLER
Restarts the other controller.
Note: This command is valid for HSJ and HSD controllers only.
Format
RESTART OTHER_CONTROLLER
Description
The RESTART OTHER_CONTROLLER command restarts the other controller. If any disks are on line to the other controller, the controller will not restart unless the OVERRIDE_ONLINE qualifier is specified (HSD and HSJ only). If any user data cannot be flushed to disk, the controller will not restart unless the IGNORE_ERRORS qualifier is specified. Specifying IMMEDIATE causes the other controller to restart immediately without flushing any user data to the disks, even if drives are on line to the host.
The RESTART OTHER_CONTROLLER command will not cause a failover to this controller in a dual-redundant configuration. The other controller will restart and resume operations where it was interrupted.
Qualifiers for HSD and HSJ controllers
IGNORE_ERRORS
NOIGNORE_ERRORS (D)
If errors result when trying to write user data, the controller will not be restarted unless IGNORE_ERRORS is specified.
CAUTION: Customer data may be lost or corrupted if the IGNORE_ERRORS qualifier is specified.
NOIGNORE
510. rt activity and status, device state, logical unit state, and cache and I/O performance. See Chapter 6 for detailed information on this utility.
The configuration utility checks the SCSI device ports for any device not previously added. This utility will add and name these devices. See Chapter 6 for more information on the configuration utility.
Exercisers
The controller can run both the disk exerciser (DILX) and the tape exerciser (TILX). These exercisers simulate high levels of user activity, so running them provides performance information you can use to determine the health of the controller and the devices attached to it. See Chapter 6 for more information about the exercisers.
Terminal access
You can use a virtual (host) terminal or a maintenance terminal to check status and set operating parameters. The terminal connection provides access to the following:
• Command Line Interpreter (CLI): see Chapter 4 and Appendix B
• Error messages: see Chapter 5
• Error logs: see Chapter 5 and Appendices C through E
Controller warm swap (HSJ series controller)
You can efficiently remove and replace, or warm swap, one controller in a dual-redundant configuration. Warm swapping is the most transparent method of changing out a controller that is available to the HS controller subsystem, and it has minimal system and device impact. For more information on warm swapping, see Chapter 7.
Operator control p
511. ry and drive metadata indicate conflicting drive configurations
036D430A  012B  The Synchronous Transfer Value differs between drives in the same storageset
036E4002  012B  Maximum number of errors for this data transfer operation exceeded
036F4002  00CB  Drive reported recovered error without transferring all data
03704002  00E8  Data returned from drive is invalid
03714002  012B  Request Sense command to drive failed
03720064  0016  Illegal command for pass-through mode
03730064  0016  Data transfer request error
03744002  012B  Premature completion of a drive command
03754002  002B  Command timeout
03760101  002B  Watchdog timer timeout
03774002  002B  Disconnect timeout
03784002  012B  Unexpected bus phase
03794002  012B  Disconnect expected
037A4002  012B  ID Message not sent by drive
037B4002  012B  Synchronous negotiation error
037C4002  012B  The drive unexpectedly disconnected from the SCSI bus
037D4002  012B  Unexpected message
037E4002  012B  Unexpected Tag message
037F4002  012B  Channel busy
03804002  012B  Message Reject received on a valid message
03814504  00EB  The tape device reported Vendor Unique SCSI Sense Data
Table C-31 Media Loader Error (Event Log Template 71) Instance, MSCP Event Codes
Instance Code  MSCP Event Code  Description
03964002  0097  An unrecoverable media loader error was encountered while performing work related to media loader operations
03BD450A  0097  The med
512. ry the command
Error 9380: Unable to allocate unit for NORUN to RUN transition
Explanation: The unit could not be allocated so the controller could do a RUN NORUN transition. Retry the command. If this error persists, call Digital Multivendor Services.
Error 9390: Cannot change default tape format while tape drive online to host
Explanation: The default tape format cannot be changed when the tape drive is on line to a host. Dismount the tape drive from the host and retry the command.
Error 9400: Cannot rundown or allocate unit in order to delete it
Explanation: Retry the command. If this error persists, call Digital Multivendor Services.
B.2.3 Warning Conventions
A Warning nnnn means that the command completed, but there is a situation that you should be aware of. Typically, a warning will result in an unusable configuration; you will have to either logically reconfigure the cabinet using the CLI or physically reconfigure the cabinet by moving the disks around. Multiple warning messages may result from one command. Items in angle brackets (<>) will be replaced at run time with names, numbers, and so on.
B.2.4 CLI Warning Messages
Warning 3000: This storageset is configured with more than one disk per port. This will cause a degradation in performance
Explanation: This error results from an ADD storageset-type command. The storage set specified has more than one member per port. One method of increasing the con
513. s
3F C2  NV memory and drive metadata indicate conflicting drive configurations
3F D2  Synchronous Transfer Value differences between drives
80 03  Fault Manager detected an unknown error code
80 06  Maximum number of errors for this I/O exceeded
80 07  Drive reported recovered error without transferring all data
82 01  No command control structures available
84 04  Command failed: SCSI ID verification failed
85 05  Data returned from drive is invalid
89 00  Request Sense command to drive failed
8A 00  Illegal command for pass-through mode
8C 04  Data transfer request error
8F 00  Premature completion of a drive command
93 00  Drive returned vendor unique sense data
A0 00  Last failure event report
A0 01  Nonvolatile parameter memory component event report
A0 02  Backup battery failure event report
A0 03  Subsystem built-in self-test failure event report
A0 04  Memory system failure event report
A0 05  Failover event report
A1 00  Shelf OK is not properly asserted
A1 01  Unable to clear SWAP interrupt; interrupt disabled
A1 02  Swap interrupt reenabled
A1 03  Asynchronous SWAP detected
B0 00  Command timeout
B0 01  Watchdog timer timeout
D0 01  Disconnect timeout
D0 02  Chip command timeout
D0 03  Byte transfer timeout
D1 00  Bus errors
D1 02  Unexpected bus phase
D1 03  Disconnect expected
D1 04  ID Message not sent
Table C-17 (Cont.) HSJ30/40 Co
514. s 4 15 storage set size 4 14 support 4 11 TMSCP timeout 4 14 write history log 4 14 Operating system initialization disk 4 12 support 4 11 Operator control panel See OCP Operator interface firmware 2 9 maintenance terminal 2 3 virtual terminal 2 3 OSF 1 initialization disk 4 12 support 4 11 Output messages HSJ HSD series DILX 6 14 TILX 6 37 HSZ series DILX 6 58 P Parameters initial 4 4 4 6 7 9 7 16 7 46 Path host port 4 6 4 8 7 11 7 17 7 47 PCMCIA 1 1 Performance configuration 3 16 3 17 Performance summary HSJ HSD series DILX 6 27 TILX 6 48 HSZ series DILX 6 63 Personal Computer Memory Card Industry Association See PCMCIA Policy processor 2 1 6 2 Polling parameters 4 15 Port Target LUN See PTL Power supply 7 36 cold swap 7 36 hot swap 7 36 installing 7 38 removing 7 37 replacing 7 38 service of 7 36 service precautions 7 37 tools 7 37 Precautions 1 6 cable guidelines 1 8 ESD 1 6 grounding 1 6 module guidelines 1 6 program card guidelines 1 7 subsystem placement 1 6 subsystem room 1 6 Program card 1 1 2 2 7 21 contents 6 2 handling guidelines 1 7 installing 7 22 removing 1 1 4 1 4 17 6 1 7 22 replacing 1 1 7 22 self test 6 2 service of 7 21 service precautions 7 21 Index 17 Program card cont d tools 7 21 validation 6 2 version restriction 1 3 PTL controller perspective 2 13 host per
515. s (HSJ Series)
Use the procedures in this section when you are removing and replacing external CI cables.
7.4.1 Tools Required
You will need a 5/32-inch Allen wrench to remove or replace external CI cables.
7.4.2 Precautions
Refer to Chapter 1 for CI cable handling guidelines.
7.4.3 Cable Removal
Use the following procedure to remove external CI cables:
1. The CI interface includes two connections, paths A and B. Determine which paths are suspect before proceeding. Refer to Chapter 5 for troubleshooting guidelines.
Note: When only one external CI cable requires replacement, you need only halt activity and disconnect cables for the one suspect path.
2. For the suspect path(s), enter one or both of the following commands to halt activity on the suspect host path(s):
   CLI> SET THIS_CONTROLLER NOPATH_A
   CLI> SET THIS_CONTROLLER NOPATH_B
CAUTION: Always disconnect the external CI cable from the star coupler first, then disconnect it from the internal CI cable second. Never leave unterminated paths on the star coupler. Never leave cables, terminated or not, attached at the star coupler and disconnected at the internal CI cable connector. This minimizes adverse effects on the cluster and prevents a short circuit between the two ground references.
3. Disconnect the external CI cable connectors from the star coupler one at a time in the followin
516. s enclosed in quotes with an alphabetic character first. Retry the command with a correct SCS node name length.
Error 4050: SCS nodename must start with an alpha character and contain only A-Z and 0-9
Explanation: This error results from a SET THIS_CONTROLLER or SET OTHER_CONTROLLER command with the argument SCS_NODENAME. The SCS node name must consist of alphanumeric characters enclosed in quotes, with an alphabetic character first. Retry the command with a correct SCS node name.
Error 4060: Allocation class must be from <minimum> to 255
Explanation: An illegal MSCP or TMSCP allocation class was specified. The minimum is 0 for a single-controller configuration or 1 for a dual-redundant configuration.
Error 4070: Max nodes must be 2, 8, 16, or 32
Explanation: This error results from a SET THIS_CONTROLLER or SET OTHER_CONTROLLER command with the argument MAX_NODES. Max nodes must be 2, 8, 16, or 32 nodes. Retry the command with a correct max node number.
Error 4080: Current node ID too large for requested max nodes setting
Explanation: This error results from a SET THIS_CONTROLLER or SET OTHER_CONTROLLER command with the arguments MAX_NODES or ID. MAX_NODES was specified with a number less than the controller's ID, or the controller's ID was specified with a number greater than MAX_NODES - 1. If decreasing MAX_NODES, set the controller's ID first, then MAX_NODES.
Error 4090: Module has
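The rules behind Errors 4050 through 4080 can be sketched as a small pre-check. This is only an illustration of the constraints stated in the explanations above, not the controller's own logic: the function name `check_settings` and the `max_name_len` limit are assumptions introduced here, and the strict uppercase match simply mirrors the "A-Z and 0-9" wording.

```python
import re

VALID_MAX_NODES = (2, 8, 16, 32)               # values listed under Error 4070
NODENAME_RE = re.compile(r"^[A-Z][A-Z0-9]*$")  # rule stated under Error 4050


def check_settings(max_nodes, node_id, alloc_class, scs_nodename,
                   dual_redundant=False, max_name_len=6):
    """Return a list of problems; an empty list means the values look valid.

    check_settings and max_name_len are hypothetical, for illustration only.
    """
    problems = []
    if max_nodes not in VALID_MAX_NODES:
        problems.append("MAX_NODES must be 2, 8, 16, or 32")
    elif not 0 <= node_id <= max_nodes - 1:
        # Error 4080: the ID may not exceed MAX_NODES - 1
        problems.append("node ID must be from 0 to MAX_NODES - 1")
    # Error 4060: minimum is 0 (single controller) or 1 (dual-redundant)
    min_class = 1 if dual_redundant else 0
    if not min_class <= alloc_class <= 255:
        problems.append(f"allocation class must be from {min_class} to 255")
    if len(scs_nodename) > max_name_len or not NODENAME_RE.match(scs_nodename):
        problems.append("bad SCS node name")
    return problems
```

When MAX_NODES is being lowered, such a check would be run against the new ID first, matching the advice to set the controller's ID before MAX_NODES.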
517. s 7-35
7.9.4 Blower Replacement Installation
WARNING: To reduce the risk of electrical energy hazard, disconnect the power cables from the shelf power supplies before replacing shelf blower assemblies or performing service in the backplane area.
Use the following procedure to replace a blower:
1. Align the replacement blower connector and push the blower straight in, making sure it is fully seated and that both mounting tabs lock in place.
2. Replace the safety screw in the corner of the blower using a Phillips screwdriver.
3. If you had to remove the shelf to access the blowers, replace the shelf as described in the StorageWorks Solutions Shelf and SBB User's Guide. Then replace its SCSI device cables as described in Section 7.8.
4. Connect the shelf power cables and verify that the shelf and all SBBs are operating properly.
Note: If the upper power supply LED (shelf status) does not come on and all the shelf power supplies are operating, the second blower may have failed or the wrong blower may have been replaced.
5. Close and lock the cabinet doors (SW800 series) using the 5/32-inch Allen wrench.
7.10 Power Supplies
There are two methods for replacing power supply SBBs: hot swap and cold swap.
• Use hot swap to replace a power supply only when there are two power supplies in a shelf. Hot swap allows you to remove the defective power supply while the other supply furnishes power.
Note: Hot swa
518. s are shown in Figure C-8. The first two fields shown in Figure C-8, the cmdopcd and sdqual fields, are supplied by the HSJ30/40 controller to provide qualifying information required to interpret the other SCSI Sense Data Common fields. The other fields, ercdval through keyspec, contain standard Sense Data returned in response to a SCSI REQUEST SENSE command issued to the target device, or generated by the HSJ30/40 controller on the target device's behalf.
Figure C-8 SCSI Device Sense Data Common Fields (field names recoverable from the figure: ascq, cmdspec, keyspec, frucode)
Figure C-9 Sense Data Qualifier Field Format (bits 7 through 0)
SCSI Device Sense Data Common Fields
cmdopcd
   The operation code of the SCSI command issued to the target device. SCSI command operation codes vary according to device type (see Table C-10), so the content of this field depends on the content of the devtype field. See the description of the ercdval field for information regarding the validity of this field.
sdqual
   This field contains the information necessary to determine whether the Sense Data contained in the ercdval through keyspec fields is supplied by an attached device or generated by the HSJ30/40 controller itself, and to qualify the content of the info field. This field is formatted as shown in Figure C-9.
Sense Data Qualifier Specific Subfields
bufmode
   Th
519. s data transfer and other operations with the host.
4.2 Operator Control Panel
The operator can use the operator control panel (OCP) to reset the controller, control the SCSI-2 buses attached to the controller, and interpret error conditions that result in LED error codes. The OCP and its use are described in Chapter 5.
4.3 Command Line Interpreter
The Command Line Interpreter (CLI) is the user interface to the controller. The CLI allows you to control storage and controller configurations through commands. The following sections explain how to use the CLI and how it defines and modifies configurations. A detailed description of CLI commands is provided in Appendix B.
4.3.1 Accessing the CLI
You can access the CLI through a maintenance terminal (see Section 4.5) or through a virtual terminal. To access the CLI through a maintenance terminal (all controllers), connect the terminal and press the Return key.
You must use a maintenance terminal to set the controller's initial configuration, because the controller arrives with an invalid ID and its host ports (HSJ and HSD series controllers) are initially off. Thereafter, you may use a virtual (host) terminal to modify the configuration. The method of establishing the virtual terminal connection varies depending on your operating system and interface.
4-2 Normal Operation
For example, for HSJ and HSD series controllers under the OpenVMS operating system for VAX hardware, the
520. sary for dismounting a device.
2. Unlock and open the cabinet doors (SW800 series) using a 5/32-inch Allen wrench.
3. Quiesce the SBB's port by pressing and holding the controller port button for the SBB. Continue holding the button until all amber OCP LEDs light.
Note: Only one port may be quiesced at any time. If the button is not held long enough, or multiple buttons are pushed in quick succession, all buttons are ignored (no ports are quiesced). You must press and hold the button again to quiesce the port.
4. Wait until the chosen port LED flashes alternately with the other port LEDs; this indicates I/O has stopped. The alternating pattern flashes for approximately 30 seconds, during which you may remove the SBB. If the pattern does not appear after a minute or two, another shelf is asserting a fault signal that prevents any quiesce function on this controller. To correct the problem, you must locate the suspect shelf and do one of three things:
   • Remove all devices from the shelf.
   • Disconnect the shelf's SCSI device cables (Section 7.8).
   • Repair or replace the shelf power supply (Section 7.10).
5. To remove the SBB, press its two mounting tabs together to release it from the shelf, and pull it out using both hands (see Figure 7-14).
Figure 7-14 SBB Warm Swap
While the OCP LEDs are flashing, any SBBs on the quiesced port that have status L
521. sced
Port 5 quiesced
Port 6 quiesced
All ports quiesced
Insert the other HSJ40 WITHOUT its program card, and press Return.
Insert the cache (if applicable) and controller now. Follow the steps outlined in Table 7-6.
Table 7-6 Module Replacement
Ground yourself to the cabinet grounding stud (refer to Figure 7-1).
You should replace the cache module now if you removed it. Refer to Section 7.2.4.
Make sure the OCP cable is correctly plugged into side two of the module (refer to Figure 7-5).
Slide the controller module into the shelf using its slot's rightmost rails as guides (refer to Figure 7-6). Use a gentle up-and-down rocking motion to help seat the module into the backplane. Press firmly on the module until it is seated. Finally, press firmly once more to make sure the module is seated.
Tighten the four screws on the front bezel using a 3/32-inch Allen wrench.
Connect a maintenance terminal to the MMJ of the other controller (the one you did not replace), if one is not already connected.
6 Restarting ALL ports
Port 1 restarted
Port 2 restarted
Port 3 restarted
Port 4 restarted
Port 5 restarted
Port 6 restarted
The configuration has two contollers
Follow the steps in the system message.
The Controller Warm Swap program has terminated.
To restart the other controller
1. Enter the REST
522. sed to create the unit.
The ADD UNIT command is used to add a logical unit for the host to access. All requests by the host to the logical unit number will be mapped as requests to the container specified in the ADD UNIT command.
For disk devices (and stripesets built out of disk devices), the metadata on the container must be initialized before a unit may be created from it. If the container's metadata cannot be found, or is incorrect, an error will be displayed and the unit will not be created.
Qualifiers for a unit created from a CDROM drive (HSJ and HSD only)
MAXIMUM_CACHED_TRANSFER=n
MAXIMUM_CACHED_TRANSFER=32 (D)
Specifies the maximum size transfer, in blocks, to be cached by the controller. Any transfers over this size will not be cached. Valid values are 1 through 1024. When entering the ADD UNIT command, MAXIMUM_CACHED_TRANSFER=32 is the default.
READ_CACHE (D)
NOREAD_CACHE
Enables and disables the controller's read cache on this unit. When entering an ADD UNIT command, READ_CACHE is the default.
RUN (D)
NORUN
Enables and disables a unit's ability to be spun up. When RUN is specified, the devices that make up the unit will be spun up. If NORUN is specified, the unit will be spun down. When entering an ADD UNIT command, RUN is the default.
Qualifiers for a unit created from a disk drive
MAXIMUM_CACHED_TRANSFER=n
MAXIMUM_CACHED_TRANSFER=32 (D)
Specifies the maximum si
523. sed to store controller configuration data.
OCP
   Operator control panel. The control/indicator panel associated with a device. The OCP is usually mounted on the device and is accessible to the operator.
offline
   One of the possible status conditions of a mass storage device or server. When a device is offline, it is not capable of communicating with the controller. When the controller is offline, it is inaccessible to any node in the configuration.
operator control panel
   See OCP.
PCMCIA
   Personal Computer Memory Card Industry Association. An organization that develops standards for ROM memory cards.
Personal Computer Memory Card Industry Association
   See PCMCIA.
port
   The hardware and software used to connect a host controller to a communication bus, such as a CI, SCSI, or SDI bus.
port/target/LUN
   See PTL.
program card
   The PCMCIA card containing the HS controller operating firmware.
PTL
   Port/target/LUN. PTL is a three-number hierarchical value representing a device location to a SCSI initiator. For example, PTL 143 is a device on port 1 of the initiator, target 4 on port 1, and LUN 3 under target 4.
qualified device
   A device that has been fully tested in all appropriate StorageWorks hardware and software configurations, and is in complete compliance with Digital and country-specific standards (for example, FCC and TUV).
quiesce
   To make a bus inactive or dormant. The operator must quiesce SCSI bus operations for
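The PTL example in the glossary entry above (PTL 143 meaning port 1, target 4, LUN 3) can be illustrated with a short parser. This is a sketch under one stated assumption: each of the three components is written as a single decimal digit, as in the glossary example. The names `PTL` and `parse_ptl` are hypothetical and not part of the controller firmware or CLI.

```python
from typing import NamedTuple


class PTL(NamedTuple):
    port: int    # device port on the initiator
    target: int  # SCSI target on that port
    lun: int     # logical unit under that target


def parse_ptl(text: str) -> PTL:
    """Split a three-digit PTL string such as '143' into its components.

    Assumes one decimal digit per component, as in the glossary example.
    """
    digits = text.strip()
    if len(digits) != 3 or not (digits.isascii() and digits.isdigit()):
        raise ValueError(f"expected three digits, got {text!r}")
    return PTL(*(int(d) for d in digits))
```

A named tuple keeps the hierarchy explicit (port, then target, then LUN) while still comparing equal to a plain tuple.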
524. sfer, in blocks, to be cached by the controller. Any transfers over this size will not be cached. Valid values are 1 through 1024. When entering the ADD UNIT command, MAXIMUM_CACHED_TRANSFER=32 is the default.
READ_CACHE (D)
NOREAD_CACHE
Enables and disables the controller's read cache on this unit. When entering an ADD UNIT command, READ_CACHE is the default.
RUN (D)
NORUN
Enables and disables a unit's ability to be spun up. When RUN is specified, the devices that make up the unit will be spun up. If NORUN is specified, the unit will be spun down. When entering an ADD UNIT command, RUN is the default.
Qualifiers for a unit created from a disk drive
MAXIMUM_CACHED_TRANSFER=n
MAXIMUM_CACHED_TRANSFER=32 (D)
Specifies the maximum size transfer, in blocks, to be cached by the controller. Any transfers over this size will not be cached. Valid values are 1 through 1024. When entering the ADD UNIT command, MAXIMUM_CACHED_TRANSFER=32 is the default.
READ_CACHE (D)
NOREAD_CACHE
Enables and disables the controller's read cache on this unit. When entering an ADD UNIT command, READ_CACHE is the default.
RUN (D)
NORUN
Enables and disables a unit's ability to be spun up. When RUN is specified, the devices that make up the unit will be spun up. If NORUN is specified, the unit will be spun down. When entering an ADD UNIT command, RUN is the default.
WRITE_PROTECT
NOWRITE_PROTECT (D)
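The qualifier defaults and the 1 through 1024 range described above can be modeled in a few lines. This is a hedged sketch of the documented behavior only; `resolve_unit_qualifiers` and `is_cache_eligible` are hypothetical helper names, and how the controller actually applies these settings is not shown in the text.

```python
# Defaults marked (D) in the qualifier descriptions above.
DEFAULTS = {
    "MAXIMUM_CACHED_TRANSFER": 32,  # blocks
    "READ_CACHE": True,
    "RUN": True,
    "WRITE_PROTECT": False,         # NOWRITE_PROTECT is the default
}


def resolve_unit_qualifiers(**given):
    """Merge user-supplied qualifiers over the defaults and range-check them."""
    settings = dict(DEFAULTS)
    for name, value in given.items():
        if name not in DEFAULTS:
            raise KeyError(f"unknown qualifier {name}")
        settings[name] = value
    mct = settings["MAXIMUM_CACHED_TRANSFER"]
    if not 1 <= mct <= 1024:        # "Valid values are 1 through 1024"
        raise ValueError("MAXIMUM_CACHED_TRANSFER must be 1 through 1024")
    return settings


def is_cache_eligible(transfer_blocks, settings):
    """Transfers over the maximum size are not cached; NOREAD_CACHE disables caching."""
    return settings["READ_CACHE"] and transfer_blocks <= settings["MAXIMUM_CACHED_TRANSFER"]
```

For instance, with the defaults a 32-block read is eligible for caching while a 33-block read is not.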
525. should record which devices have lit or flashing fault LEDs before resetting the controller, as a reset may temporarily clear the LED even though the fault remains.
5-8 Error Analysis and Fault Isolation
Figure 5-4 Storage SBB LEDs (device activity: green; device fault: amber)
Table 5-1 defines the valid states for these LEDs.
Table 5-1 Storage SBB Status LEDs
Device activity  Device fault  Indication
On               Off           SBB is operating normally.
Flashing         Off           SBB is operating normally.
Off              Off           SBB is operating normally. The SBB is inactive and there is no fault.
On               On            Fault status. SBB is probably not responding to control signals. It is recommended that you replace the SBB.
Off              On            Fault status. SBB is inactive and spun down. Digital recommends that you replace the SBB.
On               Flashing      Fault status. SBB is active and is spinning down because of the fault.
5.5.2 Device Shelf Status and Power Supply Status
The status of both the device shelf blowers and power supplies is displayed on the power supply LEDs, as shown in Figure 5-5. The upper LED displays the shelf status and the lower LED displays the power supply status.
When the upper LED is on, both the shelf blowers and the power supplies are functioning properly. When the upper LED is off, either
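The LED combinations in Table 5-1 amount to a small lookup from (device activity, device fault) to an indication. The sketch below encodes only the six states listed in the table; the names `SBB_STATUS` and `sbb_indication`, and the fallback string for unlisted combinations, are assumptions made for illustration.

```python
# (device activity LED, device fault LED) -> indication, per Table 5-1
SBB_STATUS = {
    ("on", "off"): "normal",
    ("flashing", "off"): "normal",
    ("off", "off"): "normal: inactive, no fault",
    ("on", "on"): "fault: probably not responding to control signals; replace SBB",
    ("off", "on"): "fault: inactive and spun down; replace SBB",
    ("on", "flashing"): "fault: active, spinning down because of the fault",
}


def sbb_indication(activity: str, fault: str) -> str:
    """Look up the indication for a pair of LED states (case-insensitive)."""
    key = (activity.lower(), fault.lower())
    return SBB_STATUS.get(key, "state not listed in Table 5-1")
```

Recording these pairs before a controller reset preserves the fault picture even if the reset temporarily clears an LED.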
526. spective HSZ series 2 16 Q Quiet slot time 3 19 R RAID firmware 2 12 HBVS 2 12 level 0 2 12 44 4 14 B 78 level la 2 12 striping 2 12 Read cache 7 19 and power failure 2 5 firmware 2 12 hardware 2 5 installing 7 19 removing 7 19 replacing 7 19 service of 7 19 service precautions 7 19 specifications 1 9 testing 6 4 tools 7 19 Read only test HSJ HSD series TILX 6 33 Related documents xviii Removal blower 7 35 both dual redundant controllers 7 18 CI cable external 7 23 CI cable internal 7 26 device port cable 7 32 DSSI host cable 7 28 DSSI trilink 7 28 nonredundant controller 7 4 of controller using warm swap 7 42 of devices using warm swap 7 39 one dual redundant controller 7 13 power supply 7 37 program card 1 1 4 1 4 17 6 1 7 22 read cache 7 19 SCSI cable device port 7 32 SCSI host cable 7 30 SCSI trilink 7 30 RENAME command B 20 Replaceable parts See Field replaceable units Index 18 Replacement blower 7 36 both dual redundant controllers 7 18 CI cable external 7 25 CI cable internal 7 26 device port cable 7 33 DSSI host cable 7 29 DSSI trilink 7 29 nonredundant controller 7 7 of controller using warm swap 7 44 of devices using warm swap 7 40 one dual redundant controller 7 15 power supply 7 38 program card 1 1 7 22 read cache 7 19 SCSI cable device port 7 33 SCSI host cable 7 31 SCSI tr
527. st cluster
CAUTION: SET FAILOVER establishes controller-to-controller communication and copies configuration information. Always enter this command on one controller only. COPY=configuration-source specifies where the good configuration data are located. Never blindly specify SET FAILOVER. Know where your good configuration information resides before entering the command.
Note: Not all steps are applicable to all controller models. Steps applicable to certain models are designated as such.
1. (HSD series controller) Power the controller on before entering parameters.
2. Enter the following command to copy configuration information to the new controller:
   CLI> SET FAILOVER COPY=OTHER_CONTROLLER
3. Enter the following command to set the MAX_NODES (HSJ series controllers):
   CLI> SET THIS_CONTROLLER MAX_NODES=n
   where n is 8, 16, or 32.
4. Enter the following command to set a valid controller ID:
   CLI> SET THIS_CONTROLLER ID=n
   where n is the HSJ series controller CI node number (0 through MAX_NODES - 1), or n is the HSD series controller one-digit DSSI node number (0 through 7). Each controller DSSI node number must be unique on its DSSI interconnect.
5. Enter the following command to set the SCS node:
   CLI> SET THIS_CONTROLLER SCS_NODENAME="xxxxxx"
   where xxxxxx is a one- to six-character alphanumeric name for this node. The node name must
528. ster value
42690101  Dssi_isr routine found that the 720 script reported an invalid Receive status. Last Failure Parameter 0 contains the receive interrupt status written by the 720 chip.
426B0101  Dssi_err_isr routine found that 720 interrupted without status. Last Failure Parameter 0 contains the 720 chip istat register value.
42742001  Dssi_err_isr routine found that 720 reported a bus error on the FIB internal bus. Last Failure Parameter 0 contains the 720 chip dstat register value.
42752002  Dssi_err_isr routine found that 720 reported a bus error on the FIB internal bus. Last Failure Parameter 0 contains the 720 chip dstat register value. Last Failure Parameter 1 contains the 720 chip dcmd register value.
42760102  Dssi_err_isr routine found that 720 reported an unexpected status for initiator mode. Last Failure Parameter 0 contains the 720 chip dstat register value. Last Failure Parameter 1 contains the 720 chip sist1 register value.
D-3 HSD Series Error Logging
Table D-5 (Cont.) Host Interconnect Port Services Last Failure Codes
Code      Description
42770102  Dssi_err_isr routine found that 720 reported an unexpected status for initiator mode. Last Failure Parameter 0 contains the 720 chip dstat register value. Last Failure Parameter 1 contains the 720 chip sist1 register value.
D.4 Recommended Repair Action
Table D-6 shows a difference in description text for recommended repair actio
529. subsystem.
• Use ESD wrist straps, antistatic bags, and grounded ESD mats when handling FRUs.
• Obey the module handling guidelines listed in Section 1.4.2.
1.4.2 Module Handling Guidelines
Prior to handling the controller module or cache module, follow these grounding guidelines:
CAUTION: Refer to the ESD guidelines in Section 1.4.1 prior to handling the controller module or cache module. Damage to the modules can result if the guidelines are not followed.
• Obtain and wear an ESD wrist strap on your wrist. Make sure the strap fits snugly.
• Plug the ESD strap into the grounding stud located on the vertical rail between the BA350-MA controller shelves and the device shelves. You can find the stud approximately halfway down from the top of the rail (Figure 1-3).
• After removing a module from the shelf, place the module into an approved antistatic bag or onto a grounded antistatic mat.
Footnotes: Not required for handling the program card. The grounding stud is moveable and can be relocated to another part of the cabinet.
1-6 General Information and Subsystem Overview
Figure 1-3 Shelf Grounding Stud (800-series cabinet)
• Remain grounded while installing a replacemen
530. sures
SBB shelf
   A StorageWorks shelf, such as the BA350-SB, designed to house plug-in SBB modules.
SCA
   The interface specifications and protocols defining the connection of independent computer systems into clusters.
SCS
   System Communication Services. A delivery protocol for packets of information (commands or data) to or from the host.
SCSI
   Small Computer System Interface. An ANSI interface defining the physical and electrical parameters of a parallel I/O bus used to connect hosts to a maximum of seven devices. The StorageWorks device interface is implemented according to the SCSI-2 standard, allowing the synchronous transfer of 8-bit data at rates of up to 10 MB/s.
shelf brackets
   Sheet metal components designed to attach and position StorageWorks shelves in their associated enclosures.
signal converter
   A device that converts the protocol and hardware interface of one bus type into that of another without changing the functionality of the bus. See adapter.
single cabinet power configuration
   A cabinet ac power configuration in which only one ac source and one ac power supply is used to supply dc power to the cabinet's SBB shelves.
skirt
   A trim panel designed to mount around the base of the cabinet.
Small Computer System Interface
   See SCSI.
standard disk interface
   See SDI.
standard tape interface
   See STI.
storage set
   A grouping of disk drives that make up a new distinct container.
StorageWorks
   Digital's fami
531. t DILX collects the following information from you for each command:
• The I/O command name (write, read, or quit). Quit is not a command; instead, it indicates to DILX that you have finished defining the test.
• The starting logical block number (LBN).
• The size of the I/O in 512-byte blocks.
6-52 Diagnostics, Exercisers, and Utilities
6.4.4 DILX Test Definition Questions
The following text is displayed when running DILX. The text includes questions that are listed in the approximate order that they are displayed on your terminal. These questions prompt you to define the runtime parameters for DILX.
Note: Defaults for each question are given inside [ ]. If you press the Return key as a response to a question, the default is used as the response.
After DILX has been started, the following message and prompt is displayed:
It is recommended that DILX only be run when there is no host activity present on the HSZ series controller. Do you want to continue (y/n) [n] ?
The following message describing the Auto-Configure option is displayed:
The Auto-Configure option will automatically select, for testing, all of the disk units configured. It will perform a very thorough test with WRITES enabled. The user will only be able to select the run time and performance summary options. The user will not be able to specify specific units to test. The Auto-Configure option is only recommended for initial installations. It is the first
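The three items DILX collects per command (command name, starting LBN, and size in 512-byte blocks) map naturally onto a small record. The sketch below is only a model of that description, with "quit" ending the definition as stated; `IoCommand` and `define_test` are hypothetical names and do not reflect how DILX stores its test definition internally.

```python
from dataclasses import dataclass

BLOCK_SIZE = 512  # bytes per block, per the text above


@dataclass(frozen=True)
class IoCommand:
    name: str         # "read" or "write"
    start_lbn: int    # starting logical block number
    size_blocks: int  # I/O size in 512-byte blocks

    def byte_range(self):
        """Return (first byte offset, one-past-last byte offset) of the I/O."""
        start = self.start_lbn * BLOCK_SIZE
        return start, start + self.size_blocks * BLOCK_SIZE


def define_test(entries):
    """Collect commands until 'quit', which only marks the end of the definition."""
    commands = []
    for name, lbn, size in entries:
        if name == "quit":
            break
        if name not in ("read", "write"):
            raise ValueError(f"unknown I/O command {name!r}")
        commands.append(IoCommand(name, lbn, size))
    return commands
```

For example, a 4-block read starting at LBN 16 covers bytes 8192 up to (but not including) 10240.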
532. t determine that the swap is necessary:
• The controller determines that a device is bad by trying to access the device, receiving no response from the device, or detecting excessive errors from the device.
• The operator decides to remove a device by examining the OCP codes, the SBB LEDs, system messages, or system error log information.
7.11.1.1 Tools Required
You will need a 5/32-inch Allen wrench to warm swap a device.
Footnote 5: You may also use the SBB warm swap procedure to add a device to an empty shelf slot.
7-38 Removing and Replacing Field Replaceable Units
7.11.1.2 Precautions
Refer to Chapter 1 for safety guidelines.
7.11.1.3 Device Removal
CAUTION: Warm swap supports removal and replacement of only one SBB at a time. Should another SBB need to be swapped, you must repeat the entire warm swap procedure.
You must follow the steps in this section in their exact order to ensure the following:
• Preserve data integrity, especially for devices with older SCSI interface designs.
• Reduce the chances of making a port unusable for a long period, which can render several devices inaccessible.
• Prevent the controller from performing unpredictably.
Use the following procedure to remove a device:
1. You must dismount the device from the host before proceeding. For example, enter the DISMOUNT command if you are using the OpenVMS operating system. Refer to your operating system documentation for procedures neces
533. t follows this message is the contents of an EIP.
Disk unit numbers on this controller include
Explanation: After this message is displayed, a list of disk unit numbers on the controller is displayed.
IO to unit x has timed out DILX aborting
Explanation: One of the DILX I/Os to this unit did not complete within the command timeout interval and, when examined, was found not progressing. This indicates a failing controller.
DILX terminated prematurely by user request
Explanation: A Ctrl/Y was entered. DILX interprets this as a request to terminate. This message is displayed and DILX terminates.
Unit is owned by another sysap
Explanation: DILX could not allocate the unit specified because the unit is currently allocated by another system application. Terminate the other system application or reset the controller.
This unit is reserved
Explanation: The unit could not be allocated for testing because a host has reserved the unit.
This unit is marked inoperative
Explanation: The unit could not be allocated for testing because the controller internal tables have the unit marked as inoperative.
The unit does not have any media present
Explanation: The unit could not be allocated for testing because no media is present.
The RUNSTOP_SWITCH is set to RUN_DISABLED
Explanation: The unit could not be allocated for testing because the RUNSTOP_SWITCH is set to RUN_DISABLED. This
534. t module
1.4.3 Program Card Handling Guidelines
Follow these guidelines when handling the program card.
CAUTION: Follow program card guidelines or damage to the program card and firmware may result.
• Keep the program card in its original carrying case when not in use.
• Do not twist or bend the program card.
• Do not touch the card contacts.
• Keep the card out of direct sunlight.
• Do not immerse the card in water or chemicals.
General Information and Subsystem Overview 1-7
• Always push the program card eject button (shown in Figure 1-4) to remove the card; do not pull on the card.
Figure 1-4 Program Card Eject Button (CXO-4203A-MC)
1.4.4 Cable Handling Guidelines
Use the guidelines presented in the following sections when handling the host interface cables to the controller. See Chapter 7 for host cable removal and replacement instructions.
Note: Always halt activity on the host path to the target controller before servicing its host cables (see Chapter 7).
1.4.4.1 CI Cable
CAUTION: If the internal CI cable connectors should become grounded, damage to the equipment can result. Never leave external CI cables (terminated or not) attached at the star coupler and disconnected at the internal CI cable connector. This minimiz
535. t the formats of both the send and receive diagnostic page formats.
Figure 6-10 HSZ series Controller CLI Send Diagnostic Page Format
Byte    Field (bits 7 through 0)
0       Page Code (80h): CLI Data page (vendor specific)
1       Reserved
2       (MSB) Page Length (n-3)
3       (LSB)
4       CLI Cmd Code: INQUIRY (1) or ANSWER (2)
5       Reserved
6 to n  ASCII Text, 132 bytes maximum (used for ANSWER only)
Figure 6-11 HSZ series Controller CLI Receive Diagnostic Page Format
Byte    Field (bits 7 through 0)
0       Page Code (80h): CLI Data page (vendor specific)
1       Reserved
2       (MSB) Page Length (n-3)
3       (LSB)
4       Status: SUCCESS (1) or INPUT_REQUESTED (2)
5       Delay (0 to 10 second delay before next cmd)
6 to n  ASCII Text, 132 bytes maximum
6.7.5 Virtual Maintenance Terminal Communications Protocol
The following sections describe the communications protocol developed to support the virtual maintenance terminal utility.
6.7.5.1 Protocol Notes
The virtu
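The two page layouts in Figures 6-10 and 6-11 can be illustrated with a short Python sketch. This is only an illustration of the byte layout, not controller or host driver code: the function and constant names are hypothetical, and the sketch assumes that the two-byte Page Length is stored most significant byte first (as the MSB/LSB labels suggest) and that an INQUIRY page carries no text.

```python
import struct

PAGE_CODE_CLI = 0x80            # Page Code 80h, CLI Data page (vendor specific)
CMD_INQUIRY, CMD_ANSWER = 1, 2  # CLI Cmd Code values from Figure 6-10
STATUS_SUCCESS, STATUS_INPUT_REQUESTED = 1, 2  # Status values from Figure 6-11
MAX_TEXT = 132                  # ASCII text limit in both pages


def build_send_page(cmd_code, text=b""):
    """Build a CLI Send Diagnostic page (hypothetical helper)."""
    if cmd_code not in (CMD_INQUIRY, CMD_ANSWER):
        raise ValueError("cmd_code must be INQUIRY (1) or ANSWER (2)")
    if cmd_code == CMD_INQUIRY and text:
        # Assumption: the text field is "used for ANSWER only".
        raise ValueError("text is only sent with ANSWER")
    if len(text) > MAX_TEXT:
        raise ValueError("ASCII text is limited to 132 bytes")
    last_byte = 5 + len(text)      # n, the index of the last byte in the page
    page_length = last_byte - 3    # Page Length field holds n-3
    header = struct.pack(">BBHBB", PAGE_CODE_CLI, 0, page_length, cmd_code, 0)
    return header + text


def parse_receive_page(page):
    """Split a CLI Receive Diagnostic page into (status, delay, text)."""
    if len(page) < 6:
        raise ValueError("page is shorter than the 6-byte header")
    code, _reserved, page_length, status, delay = struct.unpack(">BBHBB", page[:6])
    if code != PAGE_CODE_CLI:
        raise ValueError("not a CLI Data page")
    end = page_length + 4          # n + 1, one past the last byte
    if len(page) < end:
        raise ValueError("page is truncated")
    return status, delay, page[6:end].decode("ascii")
```

For example, an ANSWER page carrying a 12-byte command string is 18 bytes long and has a Page Length of 14.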
536. t you want to run on all selected units. The following questions define the TILX tests.
Enter data pattern number 0=all, 19=user_defined, (0:19) [0] ?
Explanation: The TILX data patterns are used in write commands. This question is displayed for the Basic Function and User Defined tests. There are 18 unique data patterns from which to select. These patterns were carefully selected as worst case, or most likely to produce errors, for tapes connected to the controller. See Table 6-5 for a list of the data patterns. The default uses all 18 patterns in a random method. This question also allows you to create a unique data pattern of your choice.
Enter record count (1:4294967295) [4096] ?
Explanation: Enter the number of records to write to the tape.
Note: The record count does not include tape marks that are intermixed with the records written to the tape in the Basic Function test.
Enter the 8-digit hexadecimal user defined data pattern
Explanation: This question is only displayed if you choose to use a User Defined data pattern for write commands. The data pattern is represented in a longword and can be specified with eight hexadecimal digits.
Perform data compare (y/n) [n] ?
Explanation: Enter Y to enable the compare modifier bit with the read and write commands. This question only applies to the Basic Function test. If the compare modifier is set on write commands, the data
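As a small illustration of the two inputs described above, the following Python sketch validates a user-defined data pattern (eight hexadecimal digits held in a 32-bit longword) and a record count in the range 1 to 4294967295 with a default of 4096. The function names are hypothetical, and the sketch only mirrors the ranges stated in the prompts; it is not TILX code.

```python
import re

RECORD_COUNT_MIN = 1
RECORD_COUNT_MAX = 4294967295
RECORD_COUNT_DEFAULT = 4096


def parse_user_pattern(text):
    """Convert exactly eight hexadecimal digits into a 32-bit longword value."""
    if not re.fullmatch(r"[0-9A-Fa-f]{8}", text):
        raise ValueError("pattern must be exactly eight hexadecimal digits")
    return int(text, 16)


def parse_record_count(text):
    """Return the record count; an empty answer (Return key) selects the default."""
    if text.strip() == "":
        return RECORD_COUNT_DEFAULT
    value = int(text)
    if not RECORD_COUNT_MIN <= value <= RECORD_COUNT_MAX:
        raise ValueError("record count out of range")
    return value
```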
537. ta with the least number of references and replaces it with the new data.
2.1.11.2 Read Cache Module
During a host write operation using the read cache, data are written to the disk and the cache. This is known as write-through caching, and it improves the performance of subsequent reads because often the requested data were previously written to the cache. The read cache consists of DRAM storage. However, the read cache is volatile: subsystem power failures will cause the loss of all data in the read cache.
2.1.12 Host Interface
The following sections provide descriptions of the host interface hardware for each series of HS controller.
2.1.12.1 HSJ Series CI Interface
Figure 2-3 shows a block diagram of the HSJ series to CI host interface hardware.
Functional Description 2-5
Figure 2-3 HSJ Series CI Host Interface Hardware Block Diagram (CXO-3980A-MC) [block diagram labels: YACI CI gate array, CIRT, CI connectors, dual path to/from host]
The CI interface for the HSJ series controllers consists of a YACI CI gate array and CI receiver/transmit (CIRT) chips for the individual CI ports. The YACI allows direct memory access of data between the host CI port and controller shared memory. Specialized host port firmware running on the policy processor sets up and maintains the CI port. The HSJ series controller supports dual data link (DDL) operations on the CI bus. With DDL, the controller can have
538. tage. The default is 5.
Enter command number x (read, write, quit) ?
Explanation: This question only applies to the User Defined test. It allows you to define command x as a read, write, access, or erase command. Enter quit to finish defining the test.
Enter starting LBN for this command (0:highest_lbn_on_the_disk) ?
Explanation: This question only applies to the User Defined test. It allows you to set the starting LBN for the command currently being defined. Enter the starting LBN for this command.
Enter the IO size in 512 byte blocks for this command (1:size in blocks) ?
Explanation: This question only applies to the User Defined test. It allows you to set the I/O size in 512-byte blocks for the command currently being defined. Enter values indicating the I/O size for this command.
Reuse parameters (stop, continue, restart, change_unit) [stop] ?
Explanation: This question is displayed after the DILX execution time limit expires, after the hard error limit is reached for every unit under test, or after you enter Ctrl/C. The options are as follows:
• Stop: DILX terminates normally.
• Continue: DILX resumes execution without resetting the remaining DILX execution time or any performance statistics. If the DILX execution time limit has expired, or all units have reached their hard error limit, DILX terminates.
• Restart: DILX resets all performance statistics and restarts
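The three values that define a User Defined test command (command name, starting LBN, and I/O size in 512-byte blocks) can be modeled as in the following Python sketch. The class and function names are hypothetical. The check that a transfer must end at or before the highest LBN is an assumption made for the sketch; the text above only states the range for the starting LBN.

```python
from dataclasses import dataclass

BLOCK_SIZE = 512  # bytes per logical block, as stated in the prompts


@dataclass(frozen=True)
class IoCommand:
    name: str         # "read" or "write"
    start_lbn: int    # starting logical block number
    size_blocks: int  # I/O size in 512-byte blocks

    def byte_range(self):
        """Return (first byte offset, one past the last byte offset)."""
        start = self.start_lbn * BLOCK_SIZE
        return start, start + self.size_blocks * BLOCK_SIZE


def make_command(name, start_lbn, size_blocks, highest_lbn):
    """Validate one user-defined command against the disk's highest LBN."""
    if name not in ("read", "write"):
        raise ValueError("command must be read or write")
    if not 0 <= start_lbn <= highest_lbn:
        raise ValueError("starting LBN out of range")
    if size_blocks < 1:
        raise ValueError("I/O size must be at least one block")
    # Assumption: the transfer may not run past the last block on the disk.
    if start_lbn + size_blocks - 1 > highest_lbn:
        raise ValueError("I/O would extend past the end of the disk")
    return IoCommand(name, start_lbn, size_blocks)
```

For example, a write of 8 blocks starting at LBN 100 covers byte offsets 51200 up to, but not including, 55296.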
539. tainer
• If NOTRANSPORTABLE (the default) was specified when the device was added, a small amount of disk space was made inaccessible to the host and used for metadata. The metadata will now be initialized.
• If TRANSPORTABLE was specified, any metadata on the device will now be destroyed.
See Appendix B for details on metadata and when INITIALIZE is required.
Add the units that use either the devices or the storage sets built from the devices by entering the following command:
CLI> ADD UNIT logical-unit-number container-name
where logical-unit-number is the unit number the host uses to access the device, and container-name identifies the device or the storage set.
Normal Operation 4-9
4.4 Acceptance Test
After you install, set parameters for, and configure your controller, follow the guidelines in this section to acceptance test your subsystem:
1. Turn your system on. This resets all shelves and starts the spin-up cycle on devices within the shelves. This includes the initialization diagnostics on the controller(s) and device self-tests.
2. Run DILX using the default answers to the test questions (see Chapter 6). This tests all disk devices in your subsystem.
3. Run TILX using the default answers to the test questions (see Chapter 6). This tests all tape devices in your subsystem.
4.5 Maintenance Terminal
A maintenance terminal is a locally connected EIA-423 compatible terminal (a terminal connected directly to the cont
540. tains the PCB copy of the 710 DNAD register Last Failure Parameter 4 contains the PCB copy of the 710 DSP register Last Failure Parameter 5 contains the PCB copy of the 710 DSPS register Last Failure Parameter 6 contains the PCB copies of the 710 SSTAT2 SSTAT1 SSTATO DSTAT registers Last Failure Parameter 7 contains the PCB copies of the 710 LCRC RESERVED ISTAT DFIFO registers continued on next page Table C 35 Cont Device Services Last Failure Codes Code Description 03360188 03370108 A 710 s host bus watchdog timer expired Last Failure Parameter 0 contains the PCB reg710_ptr value Last Failure Parameter 1 contains the PCB copy of the 710 TEMP register Last Failure Parameter 2 contains the PCB copy of the 710 DBC register Last Failure Parameter 3 contains the PCB copy of the 710 DNAD register Last Failure Parameter 4 contains the PCB copy of the 710 DSP register Last Failure Parameter 5 contains the PCB copy of the 710 DSPS register Last Failure Parameter 6 contains the PCB copies of the 710 SSTAT2 SSTAT1 SSTATO DSTAT registers Last Failure Parameter 7 contains the PCB copies of the 710 LCRC RESERVED ISTAT DFIFO registers A 710 detected an illegal script instruction Last Failure Parameter 0 contains the PCB reg710_ptr value Last Failure Parameter 1 contains the PCB copy of the 710 TEMP register Last Failure Parameter 2 contains the PCB copy of the 710 DB
541. te Connection Request Failed Insufficient Resources to Request Local Connection All other conditions that can be reported via the Disk Copy Data Correlation Event Log are not assigned a specific Event Notification Recovery Threshold Classification because they can be correlated with the associated condition specific event log HSJ Series Error Logging C 119 C 5 Recommended Repair Action A Recommended Repair Action code is assigned to each significant event that can be reported by an HSJ30 40 controller The Recommended Repair Action codes and their meanings are shown in Table C 51 Table C 51 Recommended Repair Action Codes Code Description 00 01 02 03 04 05 06 07 08 09 0A OB No action necessary An unrecoverable hardware detected fault occurred or an unrecoverable firmware inconsistency was detected proceed with HSJ30 40 controller support avenues Contact Digital Multivendor Services Inconsistent erroneous information received from the operating system proceed with operating system software support avenues Contact Digital Multivendor Services Follow the recommended repair action contained in the last failure code field Two possible problem sources are indicated e In the case of a shelf with dual power supplies one of the power supplies has failed Follow repair action 07 for the power supply with the power LED out e One of the shelf blowers has failed Follow repair ac
542. te commands, the data are written to the disk. The data are then read from the disk and compared against the corresponding DILX buffers. On read commands, the data are read from the disk into the DILX buffers, read again, then compared against the corresponding DILX buffers. If a discrepancy is found, an error is reported. If the initial write was chosen for the Basic Function test and you enter Y to this question, compare host data commands are then enabled, and data previously written to the media are verified for accuracy.
Enter compare percentage (1:100) [5] ?
Explanation: This question is displayed only if you choose to perform data compares. This question allows you to change the percentage of read and write commands that will have a data compare operation performed. Enter a value indicating the compare percentage. The default is 5.
The erase percentage will be set automatically
Enter access percentage for Seek Intensive Phase (0:100) [90] ?
Explanation: This question only applies to the Seek Intensive phase if writes are enabled. It allows you to select the percentage of access and erase commands to be issued. Enter a value indicating the access percentage.
Enter command number x (read, write, access, erase, quit) ?
Explanation: This question only applies to the User Defined test. It allows you to define command x as a read, write, access, or erase command. Enter quit to finish de
543. tected an unexpected opcode 60140100 tmscp clear cdl cmpl rtn detected an unexpected opcode 60150100 VA CHANGE STATE failed to change the SW Write protect when requested to do so as part of the Disk Set Unit Characteristics command 60160100 VA CHANGE STATE failed to change the SW Write protect when requested to do so as part of the Tape Set Unit Characteristics command 60170100 Invalid type in entry of long interval work queue 60180100 mscp short interval found an Invalid type in entry of long interval work queue 60190100 dmscp ded send cmd found that the SIWI Work Item code supplied is unrecognized or invalid in this context during DCD inhibited processing 601A0100 dmscp ded send cmd found that the SIWI Work Item code supplied is unrecognized or invalid in this context during HIS XMIT APPL MSG failure processing 601B0100 Invalid EVENT CODE parameter in call to dmscp connection event continued on next page HSJ Series Error Logging C 113 Table C 43 Cont Disk and Tape MSCP Server Last Failure Codes Code Description 601C0100 601D0100 601E0100 601F0100 60250100 60260100 60270100 60280100 60290100 602A0100 602B0100 602C0100 602D0100 602E0100 603B0100 603C0100 603F0100 60400100 60410100 60420100 60430100 60440100 60450100 C 114 HSJ Series Error Logging Invalid EVENT CODE parameter in call to tmscp connection event Invalid EVENT CODE parameter in call to dmscp ded
544. tem 3 Exit Auto Configure and DILX 3 Enter Auto Configure option 1 3 21 k Caution All data on the Auto Configured disks will be destroyed You MUST be sure of yourself continued on next page Diagnostics Exercisers and Utilities 6 25 Example 6 8 Cont Auto Configuratio Are you sure you want to continue y n Enter execution time limit in minutes Enter performance summary interval in mi Unit 10 successfully allocated for testi Unit 12 successfully allocated for testi Unit 14 successfully allocated for testi Unit 21 successfully allocated for testi Unit 23 successfully allocated for testi Unit 61 successfully allocated for testi Unit 63 successfully allocated for testi DILX testing started at 13 JAN 1993 Test will run for 60 minutes Type T if running DILX through VCS to get a current performance summa Type C to terminate the DILX test p Type Y to terminate DILX prematurel DILX Summary at 13 JAN 1993 04 44 11 Test minutes remaining 59 expired Unit 10 Total IO Requests 9595 o errors detected Unit 12 Total IO Requests 5228 o errors detected Unit 14 Total 10 Requests 10098 o errors detected Unit 2 Total IO Requests 9731 o errors detected Unit 23 Total 10 Requests 5230 o errors detected Unit 6 Total 10 Requests 11283 o errors detected Unit 63 Total IO Requests 5232 o errors detected Reuse Parameters sto
545. tes TILX is invoked from a maintenance terminal This is an extensive test Example 6 15 Using All Functions TILX HSJ gt run tilx Copyright Digital Equipme Tape Inline Exerciser ver Enter TILX hex debug flags 6 46 Diagnostics Exercisers and Utilities nt Corporation 1993 sion 1 4 0 ffff 0 E continued on next page Example 6 15 Cont Using All Functions TILX Use all defaults y n y n nter execution time limit in minutes 10 65535 10 nter performance summary interval in minutes 1 65535 10 nclude performance statistics in performance summary y n n y isplay hard soft errors y n n y isplay hex dump of Error Information Packet requester specific cy nformation y n n hen the hard error limit is reached the unit will be dropped from testing nter hard error limit 1 65535 32 hen the soft error limit is reached soft errors will no longer be isplayed but testing will continue for the unit nter soft error limit 1 65535 32 nter IO queue depth 1 20 4 6 uppress caching y n n Available tests are 1 Basic Function 2 User Defined 3 Read Only P Q e P mc m Co Ed Ed Q zj Ed zj b CO U r4 E Ed Use the Basic Function test 99 9 of the time The User Defined test is for special problems only Enter test number 1 3 1 1 Enter data pattern number 0 ALL 19 USER DEFINED 0 19 0 Enter record count 1 4294967295 4096
546. tes that the event being reported occurred during execution of a previous command for which GOOD status has already been returned. The emdopcd field is undefined in this case.
For error codes 70 and 71, the remaining fields of the event log (such as segment, snsflgs, info, and so forth) will contain the standard SCSI Sense Data fields (bytes 1 through 17) returned in the response of a SCSI REQUEST SENSE command.
An error code of 7F indicates that the Sense Data fields are in a vendor-specific format, so the content of the remaining event log fields can only be determined from documentation provided by the vendor of the target device.
The SCSI standard states that error code values 72 through 7E are currently reserved for future use and that error codes 00 through 6F are not defined. Should this field contain any of those codes, the remaining event log fields are undefined.
Valid: If this bit is set to one, the content of the Sense Data Information field (bytes 3 through 6) is valid and its content is as defined by the SCSI standard (see the description of the info field for the SCSI definition of the Sense Data Information field). Otherwise, the Sense Data Information field is not as defined by the SCSI standard; refer to documentation provided by the device vendor for their definition of the field.
segment: This field contains byte 1 (Segment field) of the Sense Data returned in the response of a SCSI REQUEST SENSE command
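The error code ranges described above can be summarized in a short Python sketch. The function names and the returned labels are hypothetical. The split of Sense Data byte 0 into a Valid bit (bit 7) and a 7-bit error code, and the naming of 70 as a current error and 71 as a deferred error, follow the SCSI-2 sense data definition rather than anything stated in this excerpt, so treat them as assumptions to check against that standard.

```python
def split_sense_byte0(byte0):
    """Split Sense Data byte 0 into (valid bit, 7-bit error code)."""
    if not 0 <= byte0 <= 0xFF:
        raise ValueError("byte0 must fit in one byte")
    return bool(byte0 & 0x80), byte0 & 0x7F


def classify_error_code(code):
    """Map a 7-bit error code (hexadecimal 00 through 7F) to a category label."""
    if not 0 <= code <= 0x7F:
        raise ValueError("error code must be in the range 00 through 7F")
    if code == 0x70:
        return "current"          # standard Sense Data fields follow
    if code == 0x71:
        return "deferred"         # event belongs to an earlier command
    if code == 0x7F:
        return "vendor_specific"  # consult the device vendor's documentation
    if 0x72 <= code <= 0x7E:
        return "reserved"         # remaining event log fields are undefined
    return "undefined"            # 00 through 6F
```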
547. th controller's cache modules must have the same number of megabytes, and both firmware versions must be identical. If there is a mismatch, neither controller will access any devices.
Dual redundant HSJ series controllers must be on the same star coupler. Dual redundant HSD series controllers must be on the same DSSI bus.
3.5.3 Optimal Performance Configuration
For optimal performance, configure to the following guidelines:
• Balance the number of devices on each port of a controller. For example, for 18 3½-inch SBBs, place 3 devices on each of 6 ports. This permits parallel activity on the controller's available ports to the attached devices. Figure 3-8 is an example of how to balance devices across ports.
• Evenly distribute higher performance devices across separate ports so that higher and lower performance devices are intermixed on the same port. For example, put multiple solid state disks on separate ports. This intermixing of higher and lower performance devices on the same port benefits overall performance. Use the guidelines in Table 3-7.
Table 3-7 High-performance Devices per Port
Number of high-performance devices    Number of high-performance devices per port
1-6                                   1
7-12                                  2
13-18                                 3
• Limit the number of devices per controller port to three in dual redundant configurations. In doing so, both controllers access three devices per each other's port, maintaining six SCSI-2 devices total.
• Maximize the amount of cache memor
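The balancing guideline and Table 3-7 can be expressed as a small Python sketch. The function names are hypothetical, and the six-port count is taken from the example above (3 devices on each of 6 ports). The round-robin assignment is just one simple way to spread devices evenly, not a procedure taken from the manual.

```python
import math

PORTS = 6  # device ports assumed from the "3 devices on each of 6 ports" example


def high_perf_devices_per_port(device_count):
    """Reproduce Table 3-7: 1-6 gives 1, 7-12 gives 2, 13-18 gives 3."""
    if not 1 <= device_count <= 18:
        raise ValueError("Table 3-7 covers 1 through 18 devices")
    return math.ceil(device_count / PORTS)


def balance_across_ports(devices, ports=PORTS):
    """Assign devices to ports round-robin so port loads differ by at most one."""
    layout = [[] for _ in range(ports)]
    for index, device in enumerate(devices):
        layout[index % ports].append(device)
    return layout
```

With 18 devices, `balance_across_ports` places exactly 3 on each of the 6 ports, matching the example in the first guideline.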
548. the CPU
initiator: The SCSI bus member that requests an operation be performed by another member (target). When the HS controller interacts with physical storage devices, it is the initiator. Furthermore, when the host CPU interacts with the HSZ series controller, the host is the initiator.
instance code: The four-byte value transmitted in the error log packet that is key to interpreting the error.
KILL line: The controller-to-controller disable signal used in a dual redundant configuration.
least recently used: See LRU.
logical unit: A virtual group of devices addressable as a unit. Also called host logical unit.
logical unit number: See LUN.
LRU: Least recently used. This is cache terminology for the block replacement policy for the read cache.
LUN: A value of 0 through 7 that identifies a logical unit to a SCSI initiator.
maintenance terminal: The operator terminal used to identify an HS family controller, to enable its host paths, to define its subsystem configuration, and to check its status. The HS family maintenance terminal interface is designed to accept any terminal conforming to EIA-423. A maintenance terminal is only required to initially configure a controller and is not required for normal operations.
Mass Storage Control Protocol: See MSCP.
MIST: Module integrity self-test. MIST tests controller functions upon initialization. See also DAEMON.
Module integrity self-test: See MIST.
MSCP: Mass Storage
549. the DILX Performance Summaries DILX Abort Codes o ooooooo oo DILX Error Codes o ooooo oo Tape Inline Exerciser HSJ and HSD Series Controllers Invoking TILX llle Interrupting TILX Execution TILX Tests a aia the Basic Function Test TILX User Defined Test TILX Read Only Test TILX TILX Test Definition Questions TILX Output Messages TILX End Message Display TILX Error Information Packet Displays TILX Data Patterns oooooooo TILX Examples o o oooooo ooo TILX Example Using All Defaults TILX Example Using All Functions Interpreting the TILX Performance Summaries TILX Abort Codes o o oooooooooo TILX Error Codes o oooooooooo Disk Inline Exerciser HSZ Series Controllers Invoking DILX erioa para a Interrupting DILX Execution DILX Tests gan ee es Basic Function Test DILX User Defined Test DILX DILX Test Definition Questions DILX Output Messages DILX Sense Data Display DILX Deferred Error Display DILX Data PatterdS oooo o Interpreting the DILX Performance Summaries DILX Abort Codes 0 00 00 oo DILX Error Codes o o o ooo
550. the DISCONNECT REC or DISCONNECT MATCH connection state 405A020A 006A Received SCS APPL MSG when in the DISCONNECT SENT or DISCONNECT ACK connection state 405B020A 006A Received SCS ACCEPT REQ on a connection that is no longer valid Note that in this instance if the connection ID field is zero the content of the VCSTATE remote node name remote connection id and connection state fields are undefined 405C020A 006A Received SCS ACCEPT RSP on a connection that is no longer C 82 HSJ Series Error Logging valid Note that in this instance if the connection ID field is zero the content of the VCSTATE remote node name remote connection id and connection state fields are undefined continued on next page Table C 26 Cont Cl System Communication Services Event Log Template 33 Instance MSCP Event Codes Instance Code MSCP Event Code Description 405D020A 405E020A 405F020A 40600204 40610204 40620204 40630204 4064020A 40650204 4066020A 40670204 006A 006A 006A 006A 006A 006A 006A 006A 006A 006A 006A Received SCS REJECT_REQ on a connection that is no longer valid Note that in this instance if the connection 1D field is zero the content of the VCSTATE remote node name remote connection id and connection state fields are undefined Received SCS REJECT_RSP on a connection that is no l
551. the HS family of array controllers. The manual details configuration, controls and indicators, normal operating procedures, error reporting, troubleshooting and fault analysis, field replaceable units (FRUs), and removal and replacement procedures.
Intended Audience
This manual is intended for Digital Multivendor Services personnel and customers who need assistance in operating and maintaining the HS array controllers. Familiarity with the StorageWorks Array Controllers HS Family of Array Controllers User's Guide is assumed.
Structure
This manual contains the following chapters:
Chapter 1: Provides an overview of the HS controllers.
Chapter 2: Provides a technical explanation of HS controller hardware and firmware.
Chapter 3: Defines physical configuration rules for the HS controller subsystem.
Chapter 4: Provides operation and configuration instructions.
Chapter 5: Discusses how to translate error information and perform initial fault analysis.
Chapter 6: Details the diagnostics, inline exercisers, and utilities for the HS controllers.
Chapter 7: Provides procedures for the removal and replacement of FRUs.
Appendix A: Lists the HS controller FRUs, including part numbers and related FRUs.
Appendix B: Provides complete details for CLI commands and their usage.
Appendix C: Describes HSJ series controller error logging.
Appendix D: Describes HSD series controller error logging.
Appendix E: Describes HSZ series controller error logging.
552. the following commands to verify the preceding parameters were set:
CLI> SHOW THIS CONTROLLER
CLI> SHOW OTHER CONTROLLER
CAUTION: Do not plug host port cables into an HSD series controller while the power is on to any members on the DSSI bus, including the controller and host. Doing so risks short circuits that may blow fuses on all the members.
9. Connect the host port cables to the front of the controllers (see Chapter 7). Do not connect the two controllers in a dual redundant pair to separate or different star couplers (HSJ series) or DSSI buses (HSD series).
10. Enter the following commands to enable CI paths A and B to the host (HSJ series controllers):
CLI> SET THIS CONTROLLER PATH A
CLI> SET THIS CONTROLLER PATH B
CLI> SET OTHER CONTROLLER PATH A
CLI> SET OTHER CONTROLLER PATH B
Enter the following commands to enable the host port path (HSD series controllers):
CLI> SET THIS CONTROLLER PATH
CLI> SET OTHER CONTROLLER PATH
4.3.6 Configuring Storage Devices
To automatically configure devices on the controller, use the CONFIG utility described in Chapter 6.
Note: If you use the ADD command to add a removable media device (such as a tape or CDROM) to an HSJ or HSD series controller, the host will not be able to access the device until one of the following occurs:
• The media is loaded into the device.
• The controller is reinitialized.
• The host is reinitialized.
• The virtual circu
553. ting system will establish communication with the controller using the new CI node address and CI node name Normal operation will occur with the exception that the controller s devices will be assigned new device names based on the controller s new node name e Ifit is necessary to change only the controller s CI node number all CI host CPU nodes must be shut down and then restarted 4 9 3 AUTOGEN COM OpenVMS The OpenVMS AUTOGEN COM file must be edited for HSJ and HSD series controller attached disks to be recognized If AUTOGEN is run without modification in a system that includes such controller attached disk drives the following error message is displayed WARNING unsupported system disk type Using speed and size characteristics of an RK07 The AUTOGEN program does not recognize the device types of the controller s attached devices The OpenVMS DCL lexical F GETDVI returns the following values OpenVMS VAX V6 0 VAX VMS V5 5 1 OpenVMS VAX V6 1 OpenVMS VAX V5 5 2 141 HSX00 35 unknown device 142 HSX01 35 unknown device The AUTOGEN COM DCL procedure must be modified as follows to support these values VAX VMS V5 5 1 and OpenVMS V5 5 2 The AUTOGEN COM DCL procedure will select a 1 unsupported device from the speed list To circumvent this problem perform the following steps 1 Make a copy of the AUTOGEN COM DCL file in case restoration of the original state is required 2 The section of AUTOGEN
554. tion 06.
05: Four possible problem sources are indicated:
• Total power supply failure on a shelf. Follow repair action 09.
• A device inserted into a shelf that has a broken internal SBB connector. Follow repair action 0A.
• A standalone device is connected to the HSJ30/40 controller with an incorrect cable. Follow repair action 08.
• An HSJ30/40 controller hardware failure. Follow repair action 20.
06: Determine which blower has failed and replace it. Refer to Chapter 7 for the blower removal procedure.
07: Replace power supply. Refer to Chapter 7 for the power supply removal procedure.
08: Replace the cable. Refer to the specific device documentation.
09: Determine power failure cause.
0A: Determine which SBB has a failed connector and replace it. Refer to Chapter 7.
0B: The other HSJ30/40 controller in a dual redundant configuration has been reset with the Kill line by the HSJ30/40 controller that reported the event. To restart the Killed HSJ30/40 controller, enter the CLI RESTART OTHER command on the Surviving HSJ30/40 controller and then press the RESET button on the Killed HSJ30/40 controller. If the other HSJ30/40 controller is repeatedly being Killed for the same or a similar reason, follow repair action 20.
0C: Both HSJ30/40 controllers in a dual redundant configuration are attem
555. tion displayed with a printout of the CONFIGURATION INFO file or with a copy of the most current configuration.
3. Reconfigure the necessary devices, units, or storage sets. See the CLI commands described in Appendix B.
Error Analysis and Fault Isolation 5-13
CAUTION: Replace the controller immediately if any of the following messages occur. Do not continue to use the controller.
NVPM Controller Characteristics component initialized to default settings
The following NVPM Manufacturing Failure Information component elements were initialized to default settings: list of component elements
NVPM Recursive Bugcheck Information component initialized to default settings
NVPM System Information Page component initialized to default settings
NVPM Volume Serial Number component initialized to default settings
All NVPM components initialized to their default settings
Unknown NVPM Revision Level
Unknown reformat stage encountered during NVPM Revision Level 1 to 2 reformat
Controller Characteristics component reformat failed during NVPM Revision Level 1 to 2 reformat
Host Access Disabled
5.6.3 CLI Automatic Messages
This section lists the automatic messages displayed by the CLI.
Device and/or Storageset names changed to avoid conflicts
Explanation: Digital adds new CLI keywords at each new HS operating firmware release that can conflict with existing device and/or storage set names. When this happens, HS operating
556. tion recovery threshold assigned to the event. This value indicates when notification/recovery action should be taken. See Section C.4 for more detail.
Repair Action: The recommended repair action code assigned to the event. This value indicates what notification/recovery action should be taken when the NR Threshold is reached. See Section C.5 for more detail.
Event Number: A number that, when combined with the value contained in the Component ID subfield, uniquely identifies the event.
HSJ Series Error Logging C-7
Component ID: A number that uniquely identifies the firmware component that detected the event, as shown in Table C-2.
templ: A number that uniquely describes the format of the template dependent information field.
tdisize: The number of bytes contained in the template dependent information field.
reserved: Reserved for future use.
event time: The time the event occurred, according to the power-on time value maintained by the HSJ30/40 controller operational firmware. The power-on time value is a 64-bit unsigned integer that represents the total number of seconds the HSJ30/40 controller operational firmware has executed on the HSJ30/40 controller board. Note that the time expended during controller reinitializations, power-on diagnostics, and system initialization is not accounted for by this value.
template dependent information: A variable-length field containing information specific to the event being re
557. tive screws and remove the terminator or secondary DSSI host cable attached to the trilink connector.
(Optional) Loosen captive screws and remove the trilink connector from the front of the controller.
7.6.4 Cable Replacement Installation
Use the following procedure to replace DSSI host cables:
1. (Optional) Attach the trilink connector to the front of the controller and tighten its captive screws.
2. Position and route the DSSI host cable within the cabinet.
3. Connect the DSSI host cable to the trilink connector on the front of the controller and tighten the captive screws on the DSSI host cable connector.
4. (Optional) Connect and tighten captive screws for the terminator or secondary DSSI host cable at the open connection of the trilink connector.
5. Install any tie wraps as necessary to hold the DSSI host cable in place.
6. Close and lock the cabinet doors (SW800 series) using a 5/32-inch Allen wrench.
7. Connect the other end of the cable to the appropriate device on the bus.
8. Reapply power to the controller and devices on the DSSI bus.
9. Enter the following command to resume activity on the host path:
CLI> SET THIS_CONTROLLER PATH
7.7 SCSI Host Cables (HSZ Series)
Servicing SCSI host cables (Figure 7-9) causes subsystem down time because the host path will be disconnected for the duration of the procedure. Use the procedures in this section when you are removing and replacing SCSI host cables.
CAUTION: Never leave active SCSI h
558. to allow a specific test to be defined. In a User-Defined test, a total of 20 or fewer I/O commands can be defined. Once all of the commands are issued, TILX issues the commands again in the same sequence. This is repeated until the selected time limit is reached. As you build the test, TILX collects the following information for each command:
• The I/O command operation (write, read, reposition record, reposition file, write tape mark, rewind, quit). Note that quit is not a command; instead, it indicates to TILX that you have finished defining the test.
• The number of times to repeat the command. Applies only to write, read, and write tape mark.
• The number of records or file marks to reposition.
• The data pattern to use.
• The direction of the reposition operation (toward EOT or BOT).
• The size of the I/O in bytes.
• The TMSCP command modifiers.
6.3.3.3 Read Only Test (TILX)
The Read Only test should be used only to verify that a tape is readable. The Read Only test reads records until the EOT or the selected record count is reached. At that point, the tape is rewound and another read pass proceeds. Tape marks are ignored. This test will most likely issue reads with incorrect record sizes. If there are record size mismatches, they will be ignored. All other errors will be recorded.
6.3.4 TILX Test Definition Questions
The following section lists the questions that TILX asks to collect the parameters needed to perform a TILX test
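The repeat-until-time-limit behavior of the User-Defined test described above can be sketched in a few lines of Python. This is only an illustration of the looping rule, not TILX itself: the function name `run_user_defined`, the `issue` callback, and the injectable `clock` are all hypothetical, and the choice to check the time limit before every command (rather than once per pass) is an assumption the text does not settle.

```python
import itertools
import time

MAX_COMMANDS = 20  # the text allows "a total of 20 or fewer" I/O commands


def run_user_defined(commands, time_limit_s, issue, clock=time.monotonic):
    """Issue `commands` in order, restarting the sequence until time runs out.

    `issue` is a hypothetical callback that performs one I/O command.
    Returns the number of commands issued.
    """
    if not 1 <= len(commands) <= MAX_COMMANDS:
        raise ValueError("define between 1 and 20 commands")
    deadline = clock() + time_limit_s
    issued = 0
    # itertools.cycle replays the defined sequence in the same order forever;
    # the deadline check (an assumption) is what ends the test.
    for command in itertools.cycle(commands):
        if clock() >= deadline:
            break
        issue(command)
        issued += 1
    return issued
```

Passing a fake `clock` makes the loop easy to exercise without waiting on a real timer.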
559. transferring all data
03154002  00E8  Data returned from drive is invalid
03164002  012B  Request Sense command to drive failed
03170064  0016  Illegal command for pass through mode
03180064  0016  Data transfer request error
03194002  012B  Premature completion of a drive command
031A4002  002B  Command timeout
031B0101  002B  Watchdog timer timeout
031C4002  002B  Disconnect timeout
031D4002  012B  Unexpected bus phase
031E4002  012B  Disconnect expected
(continued on next page)
HSJ Series Error Logging C-89
Table C-28 (Cont.) Disk Transfer Error Event Log (Template 51) Instance/MSCP Event Codes
Instance Code  MSCP Event Code  Description
031F4002  012B  ID Message not sent by drive
03204002  012B  Synchronous negotiation error
03214002  012B  The drive unexpectedly disconnected from the SCSI bus
03224002  012B  Unexpected message
03234002  012B  Unexpected Tag message
03244002  012B  Channel busy
03254002  012B  Message Reject received on a valid message
03264504  00EB  The disk device reported Vendor Unique SCSI Sense Data
Table C-29 Disk Bad Block Replacement Attempt Event Log (Template 57) Instance/MSCP Event Codes
Instance Code  MSCP Event Code  Description
02110064  0014  Disk Bad Block Replacement attempt completed for a read within the user data area of the disk. Note that due to the way Bad Block Replacement is performed on SCSI disk drives, information on the actual replacement blocks is not available to the controller an
560. troller
Qualifiers
FULL
If the FULL qualifier is specified, additional amplifying information may be displayed after each device.
Examples
CLI> sho t0
MSCP unit  Uses
T0         TAPE0
Switches: DEFAULT_FORMAT = TZ87_NOCOMPRESSION
State: AVAILABLE, No exclusive access
CLI>
Shows an individual tape unit.
B-54 Command Line Interpreter
SHOW tape-container-name
Shows information about a tape drive.
Format
SHOW tape-container-name
Parameters
tape-container-name
The name of the tape drive that will be displayed.
Description
The SHOW tape-container-name command is used to show specific information about a particular tape drive.
Examples
HSJBO> SHOW TAPE230
Name     Type  Port  Targ  Lun  Used by
TAPE230  tape  2     3     0    T230
DEC TSZ07 0309
A listing of TAPE230.
Command Line Interpreter B-55
SHOW THIS_CONTROLLER
Shows this controller's information.
Format
SHOW THIS_CONTROLLER
Description
Shows all controller, port, and terminal information for this controller.
Qualifiers
FULL
If the FULL qualifier is specified, additional amplifying information is displayed after the normal controller information.
Examples
CLI> SHOW THIS_CONTROLLER
Controller:
HSJ40 2G313FF115 Software E140, Hardware 0000
Configured for dual-redundancy with 2630355555
In dual-redundant configuration
SCSI address 6
Host port:
Node name: HSJ306, valid CI node 6, 32 max nod
561. troller's performance is through parallel transfers to members of a storage set. If multiple members of a storage set are on one port, transfers must be done in serial to those members. Though multiple storage set members on one port will work, it is strongly recommended that the storage set be deleted and reconfigured with one member per port.
Warning 3010: Unable to check all device types that make up this storageset. If the storageset is made up of different device types, it may result in a storageset of reduced size.
Explanation: This error results from an ADD storageset-type command. Device types being added to a storage set are checked to make sure that they are the correct device types. If one or more devices could not be checked, this warning is displayed. You should check all the devices to make sure that they are correctly installed and configured.
Command Line Interpreter
Warning 3020: This storageset is configured with different device types. This may result in a storageset of reduced size.
Explanation: This error results from an ADD storageset-type command. Device types being added to a storage set are checked to assure that they are the same types. If all devices are not the same, this warning is displayed. Storage set size is determined by the size of the smallest device, so the storage set configured will be of reduced size. If a reduced-size storage set is acceptable, nothing need be done in response to this warning. To
562. ttempt Event Log is reported via the MSCP Bad Block Replacement Attempt error log message format. The format of this event log, including the HSJ30/40 controller specific fields, is shown in Figure C-27.
Disk Bad Block Replacement Attempt Event Log Format Specific Fields
format: This field contains the value 09, that is, the MSCP Bad Block Replacement Attempt error log format code.
event code: The values that can be reported in this field for this event log are shown in Table C-29.
reserved (offset 36): This field contains the value 0.
instance code: See Section C.2.1 for the description of this field. The values that can be reported in this field for this event log are shown in Table C-29.
templ: See Section C.2.1 for the description of this field. This field contains the value 57 for this event log.
tdisize: See Section C.2.1 for the description of this field. This field contains the value 1C for this event log.
C-48 HSJ Series Error Logging
Figure C-27 Disk Bad Block Replacement Attempt Event Log (Template 57) Format
[Figure: field layout showing command reference number, event code, flags, format, controller identifier, unit identifier, reserved, cause, instance code, tdisize, templ, reserved, event time, device identification, device serial number]
reserved (offset 3E): This field contains the value 0.
event time: See Section C.2.1 for the description of this field.
HSJ Series Error Logging C-49
device locator devtype device i
563. tus because one of the FM, EOM, or ILI bits is set to one in the snsflgs field.
RECOVERED ERROR: Indicates that the last command completed successfully with some recovery action performed by the target. Details may be determinable by examining the info field.
NOT READY: Indicates that the logical unit addressed cannot be accessed. Operator intervention may be required to correct this condition.
MEDIUM ERROR: Indicates that the command terminated with a non-recovered error condition that was probably caused by a flaw in the medium or an error in the recorded data. This sense key may also be returned if the target is unable to distinguish between a flaw in the medium and a specific hardware failure (sense key 4).
HARDWARE ERROR: Indicates that the target detected a non-recoverable hardware failure (for example, controller failure, device failure, parity error, and so forth) while performing the command or during a self test.
ILLEGAL REQUEST: Indicates that there was an illegal parameter in the command descriptor block or in the additional parameters supplied as data for some commands (FORMAT UNIT, SEARCH DATA, and so forth). If the target detects an invalid parameter in the command descriptor block, then it shall terminate the command without altering the medium. If the target detects an invalid parameter in the additional parameters supplied as data, then the target may have already altered the medium. This sense key may also indicate t
564. u wish to change the chunksize, delete the unit and then change it.
CAUTION: After changing the chunksize, an INITIALIZE command is required to rewrite the container's metadata. This will destroy customer data.
Command Line Interpreter
Error 1090: Tape unit numbers must start with the letter T
Explanation: All tape unit numbers are of the form Tn. This error is displayed if you add a tape unit and do not begin the unit number with the letter T. Retry the ADD command with a T at the start of the unit number.
Error 1100: Disk unit numbers must start with the letter D
Explanation: All disk unit numbers are of the form Dn. This error is displayed if you add a disk unit and do not begin the unit number with the letter D. Retry the ADD command with a D at the beginning of the unit number.
Error 1110: Unit numbers may not have leading zeros
Explanation: Tape and disk unit numbers may not be of the form D03; for example, D3 should be specified. Retry the ADD command without any leading zeros.
Error 1120: LUN <lun> is already used
Explanation: LUN number <lun> has already been used by a disk or tape. Retry the ADD command specifying a different LUN.
Error 1130: The unit number cannot exceed <max_unit>
Explanation: You specified a unit number that was out of bounds. Try to add the unit again using a unit number that is less than or equal to <max_unit>.
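The unit-number rules behind Errors 1090, 1100, 1110, and 1130 can be checked before an ADD command is typed. The Python sketch below is a hypothetical pre-check, not controller firmware: the function name `check_unit_number` and its arguments are invented for illustration, the upper bound is passed in by the caller because the text gives it only as the placeholder for the maximum unit, and the check order is an assumption.

```python
import re


def check_unit_number(unit, device_kind, max_unit):
    """Return None if `unit` looks acceptable, else the matching error text.

    `device_kind` is "tape" or "disk"; `max_unit` is supplied by the caller.
    """
    letter = {"tape": "T", "disk": "D"}[device_kind]
    unit = unit.upper()
    if not unit.startswith(letter):
        code = 1090 if device_kind == "tape" else 1100
        return (f"Error {code}: {device_kind.capitalize()} unit numbers "
                f"must start with the letter {letter}")
    digits = unit[1:]
    if not re.fullmatch(r"[0-9]+", digits):
        # Not covered by the errors quoted above; treated as a caller mistake.
        raise ValueError("unit number must be a letter followed by digits")
    if len(digits) > 1 and digits.startswith("0"):
        return "Error 1110: Unit numbers may not have leading zeros"
    if int(digits) > max_unit:
        return f"Error 1130: The unit number cannot exceed {max_unit}"
    return None
```

For example, `check_unit_number("D03", "disk", 4094)` reports the leading-zero error, while `"D3"` passes.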
565. uence error
2D 00  Overwrite error on update in place
2F 00  Commands cleared by another initiator
30 00  Incompatible medium installed
30 01  Cannot read medium, unknown format
30 02  Cannot read medium, incompatible format
30 03  Cleaning cartridge installed
31 00  Medium format corrupted
33 00  Tape length error
37 00  Rounded parameter
39 00  Saving parameters not supported
3A 00  Medium not present
3B 00  Sequential positioning error
3B 01  Tape position error at beginning of medium
3B 02  Tape position error at end of medium
3B 08  Reposition error
3D 00  Invalid bits in identify message
3E 00  Logical unit has not self-configured yet
3F 00  Target operating conditions have changed
3F 01  Microcode has been changed
3F 02  Changed operating definition
3F 03  Inquiry data has changed
(continued on next page)
C-70 HSJ Series Error Logging
Table C-14 (Cont.) SCSI ASC/ASCQ Codes For Sequential Access Devices (such as magnetic tape)
ASC Code  ASCQ Code  Description
43 00  Message error
44 00  Internal target failure
45 00  Select or reselect failure
46 00  Unsuccessful soft reset
47 00  SCSI parity error
48 00  Initiator detected error message received
49 00  Invalid message error
4A 00  Command phase error
4B 00  Data phase error
4C 00  Logical unit failed self-configuration
4E 00  Overlapped commands attempted
50 00  Write append error
50 01  Write append position error
50 02  Position
566. ule, including its mounting bracket, OCP, and bezel
• Cache module
• Program card
• Internal host cable (CI)
• External host cables (CI)
• Host cable (DSSI and SCSI)
• SCSI device port cables
• Blowers
• Power supplies
CAUTION: Do not attempt to replace or repair components within FRUs or equipment damage may result. Use the controller fault indications and error logs to isolate FRU-level failures.
This chapter also discusses how to warm swap controllers and storage devices.
7.1 Controller Module
Servicing a controller module involves several considerations:
• Diagnosing the controller
• Shutting down controllers
• Deciding what to replace
• A nonredundant controller
• One dual-redundant controller
• Both dual-redundant controllers
Removing and Replacing Field Replaceable Units 7-1
7.1.1 Diagnosing the Controller
If you are presented with a controller failure, you should be aware of the following: Generally, if the green OCP reset button is lit continuously, the controller module needs replacing. However, you need to be as familiar as possible with the failure or reason for replacing the module. Be sure you have followed troubleshooting basics:
1. Make a note of all visual indicators (OCP, device LEDs, and/or error messages) available to you.
2. Extract and read host error logs (Chapter 5).
3. Errors can be intermittent. Reset the controller to see if the error clears.
4. See if the error indication changes
567. umbers (LBNs).
4. Physical block: Contains all the blocks on a subunit. DBNs, LBNs, RBNs, and XBNs are subsets of the physical block area. Physical block addresses are 28 bits wide and are called physical block numbers (PBNs).
5. Replacement block: A reserved block used as a replacement for a bad block on a subunit. Replacement block addresses are 28 bits wide and are called replacement block numbers (RBNs).
blower
An airflow device mounted in a StorageWorks shelf.
Built-in self test
See BIST.
cable distribution unit
See CDU.
carrier
A standard StorageWorks shelf-compatible plastic shell into which a device can be installed. Sometimes called SBB carrier.
CDU
Cable distribution unit. The power entry device for StorageWorks center cabinets. The unit provides the connections necessary to distribute ac power to cabinet shelves and fans.
CI bus
Digital's computer interconnect bus, using two serial paths, each with a transfer rate of 70 Mb/s (8.75 MB/s).
CIRT
CI receiver/transmitter.
CI20
DECSYSTEM-20 interface to the CI bus.
CI750
VAX-11/750 and VAX-11/751 interface to the CI bus.
CI780
VAX-11/780 and VAX-11/782 interface to the CI bus.
CLI
Command line interpreter for, and user interface to, the HS family controller firmware.
cluster
A collection of processors, called nodes, attached to each other by a high-speed bus. These processors are independent and survivable. They may be general-purpose computers or spe
568. ur good configuration information resides before entering the command.
1. Enter the following command to copy configuration information to the new controller:
CLI> SET FAILOVER COPY=THIS_CONTROLLER
2. Enter the following command to set the MAX_NODES:
CLI> SET OTHER_CONTROLLER MAX_NODES=n
where n is 8, 16, or 32.
3. Enter the following command to set a valid controller ID:
CLI> SET OTHER_CONTROLLER ID=n
where n is the CI node number (0 through MAX_NODES - 1).
4. Enter the following command to set the SCS node:
CLI> SET OTHER_CONTROLLER SCS_NODENAME="xxxxxx"
where xxxxxx is a one- to six-character alphanumeric name for this node. The node name must be enclosed in quotes, with an alphabetic character first. Each SCS node name must be unique within its VMScluster.
5. Enter the following command to set the MSCP allocation class:
CLI> SET OTHER_CONTROLLER MSCP_ALLOCATION_CLASS=n
where n is 1 through 255. Digital recommends providing a unique allocation class value for every pair of dual-redundant controllers in the same cluster.
6. Enter the following command to set the TMSCP allocation class:
CLI> SET OTHER_CONTROLLER TMSCP_ALLOCATION_CLASS=n
where n is 1 through 255.
Note: Always restart the new controller after setting the ID, SCS node name, or allocation classes.
7. Restart the new controller, either by pressing its green reset button or entering the following command:
CLI>
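The value ranges given in steps 2 through 6 above lend themselves to a quick pre-check before the SET commands are entered. The following Python sketch is a hypothetical helper, not part of the CLI: the function name `check_controller_settings` and its argument names are invented here, and it only encodes the ranges stated in the steps (it cannot check that a node name is unique within the cluster).

```python
import re

VALID_MAX_NODES = (8, 16, 32)
# One to six alphanumeric characters, with an alphabetic character first.
NODENAME_RE = re.compile(r"[A-Za-z][A-Za-z0-9]{0,5}")


def check_controller_settings(max_nodes, node_id, scs_nodename,
                              mscp_class, tmscp_class):
    """Return a list of problems; an empty list means every value is in range."""
    problems = []
    if max_nodes not in VALID_MAX_NODES:
        problems.append("MAX_NODES must be 8, 16, or 32")
    elif not 0 <= node_id <= max_nodes - 1:
        # The ID range depends on MAX_NODES, so it is checked only when
        # MAX_NODES itself is valid.
        problems.append(f"ID must be 0 through {max_nodes - 1}")
    if not NODENAME_RE.fullmatch(scs_nodename):
        problems.append("SCS_NODENAME must be 1 to 6 alphanumerics, letter first")
    for label, value in (("MSCP_ALLOCATION_CLASS", mscp_class),
                         ("TMSCP_ALLOCATION_CLASS", tmscp_class)):
        if not 1 <= value <= 255:
            problems.append(f"{label} must be 1 through 255")
    return problems
```

Collecting all problems in one list, rather than stopping at the first, lets an operator correct every value in a single pass.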
569. ure
08 01  Logical unit communication time-out
08 02  Logical unit communication parity error
0A 00  Error log overflow
15 00  Random positioning error
15 01  Mechanical positioning error
1A 00  Parameter list length error
1B 00  Synchronous data transfer error
20 00  Invalid command operation code
21 00  Logical block address out of range
21 01  Invalid element address
24 00  Invalid field in CDB
25 00  Logical unit not supported
26 00  Invalid field in parameter list
26 01  Parameter not supported
26 02  Parameter value invalid
26 03  Threshold parameters not supported
28 00  Not ready to ready transition (medium may have changed)
28 01  Import or export element accessed
29 00  Power on, reset, or bus device reset occurred
29 01  Power on occurred
29 02  SCSI bus reset occurred
29 03  Bus device reset occurred
2A 00  Parameters changed
2A 01  Mode parameters changed
2A 02  Log parameters changed
2C 00  Command sequence error
2F 00  Commands cleared by another initiator
(continued on next page)
HSJ Series Error Logging C-75
Table C-16 (Cont.) SCSI ASC/ASCQ Codes For Medium Changer Devices (such as jukeboxes)
ASC Code  ASCQ Code  Description
30 00  Incompatible medium installed
37 00  Rounded parameter
39 00  Saving parameters not supported
3A 00  Medium not present
3B 0D  Medium destination element full
3B 0E  Medium source element empty
3D 00  Invalid bits in identify message
3E 00  Logical unit has not self c
570. utines
• Disk exerciser (HSJ and HSD series controllers)
• Tape exerciser (HSJ and HSD series controllers)
• Disk exerciser (HSZ series controllers)
• VTDPY utility
• CONFIG utility
• HSZUTIL virtual terminal host-resident application
6.1 Initialization
The controller will initialize after any of the following conditions:
• Power is turned on.
• The firmware resets the controller.
• The operator presses the green reset button.
• The host clears the controller.
Whenever the controller initializes, it steps through a three-phase series of tests designed to detect any hardware or firmware faults. The three test areas are as follows:
• Built-in self test
• Core module integrity self test
• Module integrity self test DAEMON
Initialization time will vary depending on your model of controller and what size and type of cache module, if any, you are running. However, initialization will always complete in under 1 minute. Figure 6-1 shows the initialization process.
Diagnostics, Exercisers, and Utilities 6-1
Figure 6-1 Controller Initialization
[Figure: initialization flow from the i960 BIST (policy processor), through Core MIST (read/write diagnostic register, program card contents, timer, DUART, DRAB/DRAM, bus parity, registers, journal SRAM, I/D cache), to the EXEC and MIST DAEMON (device ports, host port, cache module self tests), to functional code. CXO-3697B-MC]
6.1.1 Built I
571. valid message
03B3450A  0097  The media changer device reported Vendor Unique SCSI Sense Data
C-92 HSJ Series Error Logging
Table C-32 Disk Copy Data Correlation Event Log event dependent information Values
Value  Description
00000001  Unable to allocate a sufficient number of DCD Context Blocks to support this host
00000002  Unable to find an inactive Unit Path Block
00000003  Unable to find an inactive Source Unit Block
00000004  Insufficient resources returned by HISSCONNECT
Table C-33 Executive Services Last Failure Codes
Code  Description
01000100  Memory allocation failure during executive initialization
01010100  An interrupt without any handler was triggered
01020100  Entry on timer que was not of type AQ or BQ
01030100  Memory allocation for a facility lock failed
01040100  Memory initialization called with invalid memory type
01050104  The i960 reported a fault
• Last Failure Parameter 0 contains the PC value
• Last Failure Parameter 1 contains the AC value
• Last Failure Parameter 2 contains the fault type and subtype values
• Last Failure Parameter 3 contains the address of the faulting instruction
01060100  An attempt was made to do EXEC UART I/O when there is no support for it
01070100  Timer chip setup failed
01082004  The core diagnostics reported a fault
• Last Failure Parameter 0 contains the error code value (same as blinking OCP LEDs error code)
• Last Failure Param
572. value
• Last Failure Parameter 7 contains the CACHEA1 DRAB Region Setup Register value
01842288  A processor interrupt was generated by the CACHEB0 Dynamic RAM Controller and Arbitration engine (DRAB) with an indication that an unrecoverable memory access problem occurred
• Last Failure Parameter 0 contains the CACHEB0 DRAB Setup Register value
• Last Failure Parameter 1 contains the CACHEB0 DRAB CSR Register value
• Last Failure Parameter 2 contains the CACHEB0 DRAB Diagnostic CSR Register value
• Last Failure Parameter 3 contains the CACHEB0 DRAB Diagnostic Error Register value
• Last Failure Parameter 4 contains the CACHEB0 DRAB Error Address Register value
• Last Failure Parameter 5 contains the CACHEB0 DRAB Error Data Register value
• Last Failure Parameter 6 contains the CACHEB0 DRAB Error Region Register value
• Last Failure Parameter 7 contains the CACHEB0 DRAB Region Setup Register value
(continued on next page)
HSJ Series Error Logging C-95
Table C-33 (Cont.) Executive Services Last Failure Codes
Code  Description
01852288 01860080 01870080 01880080 01890080 018A0080
A processor interrupt was generated by the CACHEB1 Dynamic RAM Controller and Arbitration engine (DRAB) with an indication that an unrecoverable memory access problem occurred
• Last Failure Parameter 0 contains the CACHEB1 DRAB Setup Register value
• Last Failure Parameter 1 contains the CACHEB
573. ve 5 16 exiting 4 3 firmware 2 10 warning conventions B 74 warning messages B 74 CLI commands B 1 Cluster size 4 14 Index 2 Codes CI Message Operation Codes 00 C 58 01 C 58 02 C 58 03 C 58 04 C 58 05 C 58 06 C 58 07 C 58 08 C 58 09 C 58 10 C 58 11 C 58 12 C 58 13 C 58 0A C 58 OB C 58 0C C 58 0D C 58 OE C 58 OF C 58 Controller Restart Codes 0 C 118 1 C 118 Event Codes 0007 C 89 C 9 0014 C 84 C 90 0016 C 89 C 9 0037 C 9 0077 C 9 0097 C 9 0103 C 8 002A C 84 8 006A C 8 008A C 8 012A C 7 016A C 8 020A C 8 022A C 7 040A C 7 01AA C 000B C 8 002B C 8 8 8 aon do a P Qo o 012B C 014B C 89 01CA C 00CB C 00E8 C O1EA C 03EA C O0EB C C 91 Event Notification Recovery Threshold Classification Value 01 C 119 02 C 119 64 C 119 ge epe Oo OO HORKA P e m PPE e E C 83 C 84 C 86 D 3 Codes Codes Event Notification Recovery Threshold Host Interconnect Services Status Codes cont d Classification Value cont d 00140009 C 57 0A C 119 00150009 C 57 Firmware Component Identifier Codes 00160009 C 57 01 C 56 00170009 C 57 02 C 56 00180009 C 57 03 C 56 00190009 C 57 04 C 56 000A0009 C 57 D 2 06 C 56 001A0009 C 57 07 C 56 000B0009 C 57 D 2 08 C 56 001B0009 C 57 20 C 56 000C0009 C 57 40 C 56 001C0009 C 57 42 C 56 000D0009 C 57 60 C 56 001D0009 C 57 D 2 61
574. verheat in as little as 60 seconds.
7.9.1 Tools Required
You will need the following tools to remove or replace the blower:
• 5/32-inch Allen wrench
• Phillips screwdriver (#2)
7.9.2 Precautions
Refer to Chapter 1 for safety guidelines.
7-34 Removing and Replacing Field Replaceable Units
Figure 7-12 Replacing a Blower
[Figure: blower with callouts for the connector, Phillips screw, and mounting tab. CXO-3659A-PH]
7.9.3 Blower Removal
WARNING: To reduce the risk of electrical energy hazard, disconnect the power cables from the shelf power supplies before removing shelf blower assemblies or performing service in the backplane area.
Use the following procedure to remove a blower:
1. Unlock and open the cabinet doors (SW800 series) using a 5/32-inch Allen wrench.
2. If you cannot access the rear of the shelf, remove its SCSI device cables as described in Section 7.8. Then remove the shelf as described in the StorageWorks Solutions Shelf and SBB User's Guide.
3. Disconnect the power cables from the shelf power SBBs. The primary power supply cord is black. The secondary power supply cord is gray.
4. Use a Phillips screwdriver to remove the safety screw in the upper right corner or lower left corner of the blower.
5. Press the upper and lower blower mounting tabs together to release the blower.
6. Pull the blower straight out to disconnect it from the shelf power connector.
Removing and Replacing Field Replaceable Unit
575. verify the list of devices that are currently configured on the controller, as shown in the following example. The example shows the CONFIG utility as it is run on an HSJ or HSD series controller. The text of the prompts may change slightly when run on other controllers in the HS controller family.
HSJ> SHOW DEVICES
No devices
HSJ> RUN CONFIG
Copyright © Digital Equipment Corporation 1993
Config Local Program Invoked
Config will search all port/target/LUN combinations to determine what devices exist on the subsystem. It will then add all disk, tape, and cdrom devices that are found. It will not initialize devices, add units, or storage sets.
Do you want to continue (y/n) [y]? YES
Config is building its tables and determining what devices exist on the subsystem. Please be patient.
add disk DISK100 1 0 0
add disk DISK120 1 2 0
add disk DISK140 1 4 0
add disk DISK210 2 1 0
add disk DISK230 2 3 0
add disk DISK500 5 0 0
add disk DISK520 5 2 0
add tape TAPE600 6 0 0
add tape TAPE610 6 1 0
Config Normal Termination
HSJ>
6-98 Diagnostics, Exercisers, and Utilities
HSJ> SHOW DEVICES
Name     Type  Port  Targ  Lun
DISK100  disk  1     0     0
DISK120  disk  1     2     0
DISK140  disk  1     4     0
DISK210  disk  2     1     0
DISK230  disk  2     3     0
DISK500  disk  5     0     0
DISK520  disk  5     2     0
TAPE600  tape  6     0     0
TAPE610  tape  6     1     0
HSJ>
After you run the CONFIG utility, you may have to initialize your containers using the INITIALIZE comman
576. vice Shelves
Optimal Availability Configurations
HS Controller Operator Control Panel
Solid OCP Codes
Flashing OCP Codes
Storage SBB LEDs
Power Supply LEDs
Controller Initialization
VTDPY Default Display for CI Controllers
VTDPY Default Display for DSSI Controllers
VTDPY Default Display for SCSI Controllers
VTDPY Device Performance Display
VTDPY Unit Cache Performance Display
VTDPY Brief CI Status Display
VTDPY Brief DSSI Status Display
VTDPY Brief SCSI Status Display
HSZ-series Controller CLI Send Diagnostic Page Format
HSZ-series Controller CLI Receive Diagnostic Page Format
Cabinet Grounding Stud
Reset LED (HSJ40 Controller)
Eject Button (HSJ40 Controller)
Trilink Connector
OCP Cable (HSJ Series Controller)
Controller Shelf Rails
External and Internal CI Cables (HSJ series)
DSSI Host Cables
SCSI Host Cable
Volu
577. vice name> at PTL <port> <target> <lun> No device installed
Explanation: When a unit is added or initialized, the configuration of the devices that make up the unit is checked. If no device is found at the PTL specified, this error is displayed. Check both the logical and physical configuration of the unit and correct any mismatches.
Error 9180: <device type> <device name> at PTL <port> <target> <lun> Incorrect device type installed
Explanation: When a unit is added or initialized, the configuration of the devices that make up the unit is checked. If a non-disk device is found at the PTL specified, this error is displayed. Check both the logical and physical configuration of the unit and correct any mismatches.
Error 9190: Unit <unum> is currently online
Explanation: When a SHUTDOWN, RESTART, or SELFTEST command is entered without the OVERRIDE_ONLINE qualifier and online devices are found, the command is aborted and the units that are currently online are listed. Either retry the command with the OVERRIDE_ONLINE qualifier or dismount all devices from the hosts.
Error 9200: <name> conflicts with unit names
Explanation: This error results from an ADD command. Names in the format of Dn and Tn, where n is a number from 0 to 4094, are reserved for units. Rename the storage set or device that is being added so it does not conflict with the unit names, and retry the command.
E
578. ving a dual-redundant state.
CLI> SET NOFAILOVER
The two controllers are taken out of dual-redundant configuration.
Command Line Interpreter B-33
SET OTHER_CONTROLLER
Modifies the other controller's parameters (in a dual-redundant configuration, the controller that the maintenance terminal is not connected to, or the controller that is not the target of the DUP connection).
Note: This command is valid for HSJ and HSD controllers only.
Format
SET OTHER_CONTROLLER
Description
The SET OTHER_CONTROLLER command allows you to modify the controller parameters of the other controller in a dual-redundant configuration.
Qualifiers for HSD controllers
B-34
ID=n
Specifies the DSSI node number (0 through 7).
MSCP_ALLOCATION_CLASS=n
Specifies the allocation class (0 through 255 in a single controller configuration, or 1 through 255 in a dual-redundant configuration). When first installed, the controller's MSCP_ALLOCATION_CLASS is set to 0.
PATH
NOPATH
Enables or disables the DSSI port. When first installed, NOPATH is set.
PROMPT="new prompt"
Specifies a 1- to 16-character prompt, enclosed in quotes, that will be displayed when the controller's CLI prompts for input. Only printable ASCII characters are valid. When first installed, the CLI prompt is set to the first three letters of the controller's model number (for example, HSJ>, HSD>, or HSZ>).
SCS_NODENAME="xxxxxx"
Sp
579. when the selected record count is reached or if the end of tape (EOT) is reached. The tape is rewound and the read pass is started. The read pass consists of the following three phases:
• Data Intensive: Consists of reads of fixed record sizes with a byte count equal to the expected tape record byte count. When tape marks are encountered, forward position commands are issued.
• Random: Begins at the point where random-sized records were written to the tape. Most reads are issued with a byte count equal to the expected tape record byte count. Occasionally, reads will be intermixed with a byte count less than or greater than the expected tape record byte count. When tape marks are encountered, forward position commands are issued.
• Position Intensive: Begins halfway down from the start of the area where random-sized records are located. In the Position Intensive phase, reads and position commands are intermixed so that the test gradually proceeds toward the EOT. When tape marks are encountered, forward position commands are issued.
In all phases, if the EOT is detected, the tape is rewound to the beginning of tape (BOT) and the write pass is again entered.
6.3.3.2 User-Defined Test (TILX)
CAUTION: The User-Defined test should be run only by very knowledgeable personnel. Otherwise, customer data can be destroyed.
6-32 Diagnostics, Exercisers, and Utilities
When the TILX User-Defined test is selected, TILX prompts you for input
580. wing steps add devices, storage sets, and logical units. Use the CLI to complete these steps so that the host will recognize the storage device. These steps can be run from a virtual terminal.
1. Add the physical devices by using the following command:
   CLI> ADD device-type device-name SCSI-location
   where:
   device-type is the type of device to be added. This can be DISK, TAPE, or CDROM.
   device-name is the name to refer to that device. The name is referenced when creating units or storage sets.
   SCSI-location is the port, target, and LUN (PTL) for the device. When entering the PTL, at least one space must separate the port, target, and LUN.
   For example:
   CLI> ADD DISK DISK100 1 0 0
   CLI> ADD TAPE TAPE510 5 1 0
   CLI> ADD CDROM CDROMO 6 0 0
2. Add the storage sets for the devices. See Appendix B for examples for adding storage sets. If you do not desire storage sets in your configuration, proceed to step 3.
   CAUTION: The INITIALIZE command destroys all data on a container. See Appendix B for specific information on this command.
3. Enter the following command to initialize the containers (devices and/or storage sets) prior to adding logical units to the configuration:
   CLI> INITIALIZE container-name
   where container-name is a device or storage set that will become part of a unit.
   When initializing a single-device container: If NOTRANSPORTABLE (the default) was specified when the device was added, a
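The ADD command in step 1 takes a device type, a device name, and a space-separated PTL. As a rough illustration of that syntax, the Python sketch below assembles such command strings. The helper names `parse_ptl` and `add_device_command` are hypothetical, and the sketch only formats text; it does not talk to a controller or check that a port, target, or LUN exists.

```python
DEVICE_TYPES = ("DISK", "TAPE", "CDROM")


def parse_ptl(text: str) -> tuple[int, int, int]:
    """Split a PTL such as "5 1 0" into (port, target, lun)."""
    parts = text.split()  # any run of spaces counts as one separator
    if len(parts) != 3:
        raise ValueError("PTL needs three space-separated numbers: port target LUN")
    port, target, lun = (int(part) for part in parts)
    return port, target, lun


def add_device_command(device_type: str, name: str, ptl: str) -> str:
    """Build the text of an ADD command for one physical device."""
    kind = device_type.upper()
    if kind not in DEVICE_TYPES:
        raise ValueError(f"unknown device type: {device_type}")
    port, target, lun = parse_ptl(ptl)
    return f"ADD {kind} {name} {port} {target} {lun}"


if __name__ == "__main__":
    print(add_device_command("DISK", "DISK100", "1 0 0"))
    print(add_device_command("TAPE", "TAPE510", "5 1 0"))
```

Rejecting a PTL written without spaces (for example "100") mirrors the rule that at least one space must separate the port, target, and LUN.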
581. x SCSI device ports is shown.
Port   The Port column indicates the number of the SCSI device port.
Rq/S   This column shows the average I/O request rate for the port during the last update interval. These requests are up to 8 kilobytes long and are either generated by host requests or cache flush activity.
RdKB/S This column shows the average data transfer rate from all devices on the SCSI bus, in kilobytes, during the previous screen update interval.
WrKB/S This column shows the average data transfer rate to all devices on the SCSI bus, in kilobytes, during the previous screen update interval.
CR     This column indicates the number of SCSI command resets that occurred since VTDPY was started.
BR     This column indicates the number of SCSI bus resets that occurred since VTDPY was started.
TR     This column indicates the number of SCSI target resets that occurred since VTDPY was started.
Help Example
VTDPY> HELP
Available VTDPY commands:
^C               Prompt for commands
^G or ^Z         Update screen
^O               Pause/Resume screen updates
^Y               Terminate program
^R or ^W         Refresh screen
DISPLAY CACHE    Use 132 column unit caching statistics display
DISPLAY DEFAULT  Use default 132 column system performance display
DISPLAY DEVICE   Use 132 column device performance display
DISPLAY STATUS   Use 80 column
EXIT             Terminate program (same as QUIT)
INTERVAL
582. y TILX testing continues for those units.
When the soft error limit is reached, soft errors will no longer be displayed but testing will continue for the unit.
Enter soft error limit (1:65535) [32] ?
Explanation: Enter a value to specify the soft error limit for all units under test. If the soft error limit is reached for a unit under test, soft error reporting is disabled for that unit only. However, testing continues for that unit.
Enter IO queue depth (1:20) [4] ?
Explanation: Enter the maximum number of outstanding I/Os for each unit selected for testing. The default is 4.
Enter unit number to be tested ?
Explanation: Enter the unit number for the tape drive unit to be tested.
Note: When TILX asks for the unit number, it requires the actual number of the tape, where T177 would be specified as unit number 177.
Is a tape loaded and ready, answer Yes when ready ?
Explanation: This question is self-explanatory.
Select another unit (y/n) [n] ?
Explanation: Enter Y to select another unit to test. Enter N to begin testing the units selected. The system will display the following test selections:
Available tests are:
1. Basic Function
2. User Defined Test
3. Read Only
Use the Basic Function test 99.9% of the time. The User Defined test is for special problems only.
Enter test number (1:3) [1] ?
Explanation: This question allows you to pick which TILX tes
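Each numeric TILX question above carries an allowed range and a default, and the unit question expects the bare number (177 for T177). The Python sketch below models how such answers could be checked. It is an illustration only, not TILX code: the function names are hypothetical, and the behavior "an empty answer selects the default" is an assumption.

```python
def read_answer(answer: str, low: int, high: int, default: int) -> int:
    """Interpret one typed answer for a ranged numeric question."""
    text = answer.strip()
    if not text:
        return default  # assumption: pressing Return accepts the default
    value = int(text)
    if not low <= value <= high:
        raise ValueError(f"{value} is outside {low}..{high}")
    return value


def unit_number(unit_name: str) -> int:
    """Turn a tape unit name such as T177 into the number TILX asks for."""
    return int(unit_name.lstrip("Tt"))


if __name__ == "__main__":
    soft_error_limit = read_answer("", 1, 65535, 32)  # default of 32
    queue_depth = read_answer("8", 1, 20, 4)          # explicit value
    print(soft_error_limit, queue_depth, unit_number("T177"))
```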
583. y allocated for testing
Explanation: This message is self-explanatory.
No drives selected
Explanation: TILX parameter collection was exited without choosing any units to test.
Maximum number of units are now configured
Explanation: This message is self-explanatory. Testing will start after this message is displayed.
Unit is write protected
Explanation: The user wants to test a unit with write and/or erase commands enabled, but the unit is write protected.
The unit status and/or the unit device type has changed unexpectedly. Unit x dropped from testing
Explanation: The unit status may change if the unit experienced hard errors or if the unit is disconnected. Either way, TILX cannot continue testing the unit.
Last Failure Information follows. This error was NOT produced by running TILX. It represents the reason why the controller crashed on the previous controller run.
Explanation: This message may be displayed while allocating a unit for testing. It does not indicate any reason why the unit is or is not successfully allocated, but rather represents the reason why the controller went down in the previous run. The information that follows this message is the contents of an EIP.
Tape unit numbers on this controller include
Explanation: After this message is displayed, a list of tape unit numbers on the controller is displayed.
IO to unit x has timed out. TILX aborting.
Ex
584. y per controller with the 16 or 32 MB cache module option
[Figure 3-8: Balanced Devices Within Device Shelves. Panels: UNBALANCED (6 devices/port on 3 ports) and BALANCED (3 devices/port); each panel shows a BA350-MA controller shelf connected to BA350-SB storage shelves.]
Highest Performance
To obtain the highest performance possible, use a dual-redundant configuration and balance the number of devices across the two controllers. Do this through your operating system by ordering how devices are mounted or sequenced, and by setting preferred path definitions. Following this guideline results in approximately half of the devices normally accessed through each controller. Should one controller fail, the surviving controller automatically will assume service to the failed controller's devices.
3.5.4 Optimal Availability Configuration
For optimal availability, configure to the following guidelines:
• Use dual-redundant controllers and redundant power supplies in all shelves.
• Place storage set members on different controller ports and different
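The balancing idea behind Figure 3-8 (spreading devices evenly over the available device ports rather than filling a few ports) can be sketched as a round-robin assignment. The Python below is only an illustration under stated assumptions: the function name, the device names, and the choice of six ports are hypothetical, and actual placement is done by physically installing devices in shelves, not by software.

```python
def balance_across_ports(devices: list[str], port_count: int) -> dict[int, list[str]]:
    """Assign devices to ports 1..port_count in round-robin order."""
    if port_count < 1:
        raise ValueError("at least one port is required")
    ports: dict[int, list[str]] = {port: [] for port in range(1, port_count + 1)}
    for index, device in enumerate(devices):
        ports[index % port_count + 1].append(device)
    return ports


if __name__ == "__main__":
    devices = [f"DEV{n:02d}" for n in range(18)]  # hypothetical device names
    for port, members in balance_across_ports(devices, 6).items():
        print(port, members)
```

With 18 devices, six ports receive three devices each, while restricting the same devices to three ports puts six on each port, which matches the two cases labeled in the figure.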
585. ze transfer, in blocks, to be cached by the controller. Any transfers over this size will not be cached. Valid values are 1 through 1024. When entering the ADD UNIT command, MAXIMUM_CACHED_TRANSFER=32 is the default.
READ_CACHE (D) / NOREAD_CACHE
  Enables and disables the controller's read cache on this unit. When entering an ADD UNIT command, READ_CACHE is the default.
RUN (D) / NORUN
  Enables and disables a unit's ability to be spun up. When RUN is specified, the devices that make up the unit will be spun up. If NORUN is specified, the unit will be spun down. When entering an ADD UNIT command, RUN is the default.
WRITE_PROTECT / NOWRITE_PROTECT (D)
  Enables and disables write protection of the unit. When entering an ADD UNIT command, NOWRITE_PROTECT is the default.
Qualifiers for a unit created from a stripeset:
MAXIMUM_CACHED_TRANSFER=n / MAXIMUM_CACHED_TRANSFER=32 (D)
  Specifies the maximum size transfer, in blocks, to be cached by the controller. Any transfers over this size will not be cached. Valid values are 1 through 1024. When entering the ADD UNIT command, MAXIMUM_CACHED_TRANSFER=32 is the default.
READ_CACHE (D) / NOREAD_CACHE
  Enables and disables the controller's read cache on this unit. When entering an ADD UNIT command, READ_CACHE is the default.
RUN (D) / NORUN
  Enables and disables a unit's ability to be spun up. When RUN is specified, the devices that make up the unit w
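The MAXIMUM_CACHED_TRANSFER rule (transfers larger than the limit are not cached; valid limits are 1 through 1024 blocks, default 32) can be expressed as a small check. This Python sketch models only the size limit described above. The function name is hypothetical, and other conditions that affect caching, such as NOREAD_CACHE, are deliberately left out.

```python
DEFAULT_MAXIMUM_CACHED_TRANSFER = 32  # blocks; the ADD UNIT default


def within_cache_limit(transfer_blocks: int,
                       maximum_cached_transfer: int = DEFAULT_MAXIMUM_CACHED_TRANSFER) -> bool:
    """Return True when a transfer is small enough to be cached."""
    if not 1 <= maximum_cached_transfer <= 1024:
        raise ValueError("MAXIMUM_CACHED_TRANSFER must be 1 through 1024")
    # Any transfer over the limit is not cached; at or below it qualifies.
    return transfer_blocks <= maximum_cached_transfer


if __name__ == "__main__":
    print(within_cache_limit(32))       # at the default limit
    print(within_cache_limit(33))       # one block over the default limit
    print(within_cache_limit(33, 128))  # same transfer with a larger limit
```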
