Service Manual
Contents
1. Error Log Status/Event Codes (Cont)
   Hex    Octal   Description
   43     103     Unit inoperative. For SDI drives, the controller has marked the drive inoperative due to an unrecoverable error in a previous level 2 exchange, or the drive has a duplicate unit identifier.
   83     203     Duplicate unit number.
   103    403     Unit disabled by field service or diagnostic. For SDI drives, the DD bit is set.
   Unit Available (Subcode Hex 4)
   NOT USED
   Media Format Error (Subcode Hex 5)
   A5     245     Format mismatch. Disk is not formatted with 512 byte sectors. The disk's FCT indicates it is formatted with 576 byte sectors and either the controller or the drive only supports 512 byte sectors.
   C5     305     FCT corrupted. Disk is not formatted or the FCT is corrupted.
   105    405     RCT corrupted. The RCT search algorithm encounters an invalid RCT entry.
   125    445     No replacement block available.
   Write Protected (Subcode Hex 6)
   1006   10006   Unit is software write protected.
   2006   20006   Unit is hardware write protected.
   Compare Error (Subcode Hex 7)
   NOT USED
   Data Error (Subcode Hex 8); hex codes 8, 48, 68, 88, E8, 108, 128, 148; octal codes 10, 110, 150, 210, 350, 410, 450, 510
   8      10      Sector written with Force Error modifier.
   48     110     Invalid header. The subsystem reads an invalid or inconsistent header for the requested sector. Causes of an invalid header include header mis-sync, header sync ti
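The codes in this table can be separated into a group value and a subcode. The following is a minimal Python sketch, assuming the low five bits of each code hold the group value (4 through 8 in the headings above) and the remaining upper bits hold the subcode. That split is an assumption that agrees with every code listed here; the function names are hypothetical and are not part of any DEC software.

```python
# Group names taken from the headings of the status/event code table.
MAJOR_CODES = {
    0x04: "Unit Available",
    0x05: "Media Format Error",
    0x06: "Write Protected",
    0x07: "Compare Error",
    0x08: "Data Error",
}


def split_event_code(code):
    """Split a status/event code into (major, subcode).

    Assumption: the low five bits are the major code and the
    remaining upper bits are the subcode.
    """
    major = code & 0x1F
    subcode = code >> 5
    return major, subcode


def describe(code):
    """Return a one-line description showing hex, octal, group and subcode."""
    major, subcode = split_event_code(code)
    name = MAJOR_CODES.get(major, "not listed in this excerpt")
    return f"hex {code:X} = octal {code:o}: {name}, subcode {subcode}"


if __name__ == "__main__":
    for code in (0x105, 0x1006, 0x48):
        print(describe(code))
```

The same helper also converts between the hex and octal columns, which is useful when an error log prints octal and the table is searched by hex.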
2. UDA50 Switch Setting for Address 772150
   Real Time Drive State Bits
   Drive Status
   Status Message
   Decode of Controller Identification Message
   Decode of Unit Identification Message
   Decoding VAX/VMS Error Report (SDI)
   MSCP Packet Word Organization
   Decoding Message Error Format
   MSCP Packet for SDI Error Format
   MSCP Packet for Disk Transfer Error Format
   MSCP Packet for Controller Error Format
   MSCP Packet for Host Memory Access Error
   UDA and Drive Status Error Words 22 through 27
   UNIBUS Delay
   UDA50 LED Error Codes
   SA Register
   Real Time Drive State Code Interpretation
   Byte Description of Status Message
   Subsystem Diagnostic Error Code List
3. UDA50 host resident diagnostics contain four tests that isolate subsystem faults to the UNIBUS disk drives. A system exerciser program is also provided to test the performance of the entire disk subsystem.
   1.5 UDA50 ADDRESS SWITCHES AND JUMPERS
   The UDA50 Disk Controller contains two registers visible in the UNIBUS I/O page. They are the initializing and polling (IP) register and the status and address (SA) register. The IP and SA registers are assigned octal UNIBUS addresses of 772150 and 772152, respectively.
   The UNIBUS address selector switches and a jumper plug (W13) are used to set the UNIBUS address for the IP register. The location of these switches and jumper plug on UDA50 module M7485 is shown in Figure 1-4. Set the UNIBUS address switches and jumpers to the positions shown in Figure 1-5 to select UNIBUS address 772150. If 772150 (the default address shipped with the UDA) cannot be used, alternate addresses are 760334 and 760340.
   [Figure 1-4 M7485 Address Switch and Jumper Locations (CX-073A). Labels: edge strip (this side), logic analyzer data bus and control, UNIBUS address selector switch, address jumper plug W13, UNIBUS delay jumpers T4 to T7.]
   [Figure 1-5, first part. Binary code for address bits 17 to 9; UDA50 switches S10 to S1 and W13; setting: ON OFF ON OFF]
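The relationship between the IP address, the SA address, and the switch positions can be checked with a short calculation. The Python sketch below assumes that the ten address switches correspond to address bits 12 down to 3 and that ON means a one bit. That mapping is an assumption; it is used here only because it reproduces the ON/OFF sequence printed with Figure 1-5 for address 772150. Jumper W13 is not modeled, and the function names are hypothetical.

```python
IP_ADDRESS = 0o772150  # default IP register address given in the text


def sa_address(ip_address):
    """Return the SA register address, one 16-bit word above the IP register."""
    return ip_address + 2


def switch_pattern(ip_address):
    """Return ON/OFF for address bits 12 down to 3.

    Assumption: switch ON = 1 and the ten switches cover bits 12..3.
    """
    return ["ON" if (ip_address >> bit) & 1 else "OFF" for bit in range(12, 2, -1)]


if __name__ == "__main__":
    print(f"IP = {IP_ADDRESS:o}, SA = {sa_address(IP_ADDRESS):o}")
    print(" ".join(switch_pattern(IP_ADDRESS)))
```

Always confirm a setting against Figure 1-5 and the module itself before changing switches; the calculation is only a cross-check.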
4. When asserted, this bit indicates the drive has entered the drive available state. The signal is negated when the drive leaves the available state.
   Table 2-3 describes the possible drive state codes. The error printout example 8001 indicates the signal R/W RDY is active and the drive is ready to receive. This represents the normal drive online state of a drive in a UDA50 subsystem.
   Table 2-3 Real Time Drive State Code Interpretation
   RTDS hex codes listed: 0000, 0001, 0002, 0003, 0040, 0041, 0042, 0043, 8000
   RTDS Hex Code   Description
   0000            The drive is either in initialization or in an offline state.
   0001            The drive is online. Possibly an error state was recently cleared or the drive spun down with the RUN/STOP switch out.
   0002            This code indicates an invalid drive state. ATTN is asserted and the drive cannot receive controller commands with RCVR RDY negated.
   0003            The drive is online and one of two conditions exists: (1) the disks are spinning and there is an error state; (2) the disks are not spinning and there is a switch change active.
   0040            This code indicates an invalid drive state. RCVR RDY should be asserted if the drive is in the available state.
   0041            The drive is available but cannot be spun up. The RUN/STOP switch is not pushed in, or there could be an open module interlock preventing spinup.
   0042            This code indicates an invalid drive state. ATTN is asserted and the drive cannot receive controller commands with RCVR RDY negated.
   0043            The drive is available
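An RTDS code can be read as a set of state bits. The Python sketch below assumes bit positions inferred from the Table 2-3 descriptions (8001 is described as R/W RDY plus ready to receive, 0002 as ATTN without RCVR RDY, and 0040 as the available state without RCVR RDY). These positions are an inference from this excerpt, not a copy of Figure 2-1, and `decode_rtds` is a hypothetical helper name.

```python
# Bit masks inferred from the Table 2-3 code descriptions (assumption).
RTDS_BITS = {
    0x0001: "RCVR RDY",
    0x0002: "ATTN",
    0x0040: "AVAIL",
    0x8000: "R/W RDY",
}


def decode_rtds(code):
    """Return the names of the state bits that are set in an RTDS code."""
    return [name for mask, name in RTDS_BITS.items() if code & mask]


if __name__ == "__main__":
    for code in (0x8001, 0x0041, 0x0002):
        print(f"{code:04X}: {decode_rtds(code)}")
```

For example, `decode_rtds(0x8001)` lists RCVR RDY and R/W RDY, which matches the normal online state described in the text.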
5. DD FO DB S7
   The DB bit is set, indicating it is possible for the host to use a diagnostic cylinder on the drive.
   The FO bit is set, indicating the drive can be formatted.
   No bits set in the W4 to W1 field indicate no subunit is write protected.
   The DD bit is not set, indicating the drive has not been disabled by a controller due to some error or diagnostic routine.
   The S7 bit is not set, indicating the 512 byte sector format is selected for the drive.
   Byte 6 is the error byte, and for this example none of the above errors have occurred (DE, RE, PE, DF, WE).
   Byte 7 is the controller byte, and for this example a normal drive status is observed (C1 to C4 are zeros). The S4 to S1 bits being cleared indicate that the UDA50 is to interrupt the host CPU whenever any drive on the subsystem raises its available line to the UDA50.
   Byte 8 is the retry count/failure code, and for this example no retries by the diagnostic have been attempted.
   2.5 SUBSYSTEM ERROR MESSAGES
   In addition to the three error messages above, the subsystem diagnostic tests may also print out one of the following:
   Host error message
   UDA initialization error message
   Interrupt handler error message
   Diagnostic machine error message
   A sample printout of each of these four types of subsystem error messages is given here for the EVRLA diagnostics. Each error message contains a number and a description of the cause of the error.
   Sample printout o
6. Subsystem Diagnostics Preparation
   The subsystem diagnostics consist of two programs: ZZ-EVRLA (UDA host resident diagnostic) and ZZ-EVRLC (SDI generic disk exerciser). Also included with the diagnostic kit for the VAX subsystem is a disk formatter program, ZZ-EVRLB. The formatter (EVRLB) is not a diagnostic; do not run it unless specifically instructed. ZZ-EVRLC tests the read and write ability of any SDI disk drive and displays differences in the read and write data to the operator.
   Before running the subsystem EVRLA diagnostic tests, the system must be set up and tested under the diagnostic supervisor. On a VAX this requires a DW UNIBUS adapter, a UDA50 disk controller, and an RAnn disk drive to be attached. The following printout shows how to attach these devices.
   [Printout not legible in this copy]
   NOTE: An RA60 must be attached by using the following sequence. Note the instead of the U. This is how VMS identifies the drive as removable.
   2.3 RUNNING THE HOST RESIDENT DIAGNOSTICS
   Host resident diagnostic CZUDC/EVRLA contains tests 1 through 4 linked together to run automatically in sequence. However, if an attempt to read or write on the customer data area is desired during test 4, manual intervention is necessary. A detailed description of these diagnostics is available on microfiche under RA80 diagnostics (CZUDC). If a printout of the test progress is want
7. The chart in Figure 2-11 shows how to interpret the MSCP packet for the controller error format. The tables in the appendix show how to decode the packet words.
   [Figure 2-11 MSCP Packet for Controller Error Format (CX-085). Word groups 0-3, 4-7, 8-11, 12-15, 16-19, 20-23, 24-27. Labeled fields include: reserved; sequence number; low word, mid word, and high word of controller serial number. Note in figure: A variable amount of controller or disk dependent information. The length of this message is dependent on the length of the error log message sent to the host software by the controller microcode. Often no controller or disk dependent information is provided. This information will typically not be interpreted by error log programs and will thus be printed as a series of octal values.]
   2.9.4.4 Host Memory Access Error Format Chart
   The host memory access error format (determined from the low byte of word 4) is used by the SDI type disk controllers to report errors that occur while attempting to access host memory. The failing operation may be retried. The number of retries is a function of the controller. The results of each retry are logged by the error log program with the same command sequence number (words 0 and 1). Use the chart in Figure 2-12 to interpret the MSCP packet for the host memory access error format. Refer to the tables in the appendix to decode the packet words.
   2 WORDS DRIVE NUMBER
8. [Sample error log printout, largely illegible. Recoverable labels: DEVICE SUPPLIED STATUS FIELD; CONTROLLER FLAGS: NORMAL OPERATION; DRIVE ERROR; TRANSMISSION ERROR; RA80/81 STATUS BYTE 8: SEEK AND RECAL RETRY COUNT; RA80/81 STATUS BYTES 10 through 15, including CYLINDER ADDRESS, CURRENT GROUP, LED ERROR CODE, and CONTROL PANEL FAULT CODE; SDI ERROR]
   RSX interprets this field (RA80/81 Status Byte 9) incorrectly. The field should display the last SDI command issued to the drive. Instead, RSX is decoding it as an MSCP command. It should have been decoded SEEK (SDI opcode 012). The proper decoding for byte 9 is shown in Table 2-4.
   Table 2-4 RA80/81 Status Byte 9 Decode
   Octal  Hex  Command              Octal  Hex  Command
          03   Diagnose             207    87   Get Common Characteristics
          05   Drive Clear          210    88   Get Subunit Characteristics
          06   Error Recovery       213    8B   On Line
               Initiate Seek        215    8D   Read Memory
          0C   Run                  216    8E   Recalibrate
          0F   Write Memory         220    90   Topology
          81   Change Mode          771    FF   Select Group
          82   Mod Flags
          84   Disconnect
   2.10.2 N
9. SDI Error Format
   [Sample printout from the RSX-11M-PLUS ERROR LOGGING SYSTEM, largely illegible. Recoverable labels under DEVICE SUPPLIED INFORMATION:]
   MESSAGE ENVELOPE FIELD
   CONNECTION I.D.: MSCP DISK
   MESSAGE TYPE: DATAGRAM MESSAGE
   CREDITS
   MESSAGE LENGTH: 22 WORDS
   COMMAND REFERENCE NUMBER FIELD
   ERROR DOES RELATE TO A UNIT NUMBER
   UNIT NUMBER
   MESSAGE SEQUENCE NUMBER
   ERROR PACKET IS COMPLETE
   MESSAGE FORMAT FIELD: SDI ERROR; CONTINUING; SEQUENCE NUMBER RESET
   EVENT CODE FIELD: SUBCODE: DRIVE DETECTED ERROR; MAJOR STATUS: DRIVE ERROR
   CONTROLLER MODEL FIELD
10. UBA/UBI
    19  Failed to clear UNIBUS status
    20  Not enough memory to test units
    21  Failed GETBUF routine
    22  Failed SETMAP routine
    EVRLA Test 1 Host Error Messages
    1   Failed to initialize device bus UBA/UBI
    2   Failed to clear UBA/UBI status
    3   Failed while checking UBA/UBI status
    4   Error trying to address UDAIP
    5   Failed while checking UBA/UBI status
    6   Error trying to address UDASA
    7   Failed to initialize device bus UBA/UBI
    8   Failed to clear UBA/UBI status
    9   Step bit did not set UDASA register during initialization
    10  UDA resident diagnostic detected a failure
    11  UDASA register failed to change during port loop diagnostic
    12  Data comparison error during port loop test diagnostic
    13  UDASA register failed to change during port loop diagnostic
    14  Data comparison error during port loop test diagnostic
    15  Failed to initialize device bus UBA/UBI
    16  Failed to clear UBA/UBI status
    17  Step bit did not set UDASA register during initialization
    18  UDA resident diagnostic detected a failure
    Table A-2 Subsystem Diagnostic Error Code List (Cont)
    Decimal Error Number   Description
    19  Channel services interrupt enable failure
    20  Channel services interrupt disable failure
    21  UDA failed to interrupt
    22  Unexpected interrupt encountered
    23  Unknown interrupt encountered
    24  Expected/received bus request (BR) levels do not match
    25  Expected/received vectors do not match
    26  Failed to initialize device bus UBA/UBI
    27  Failed to c
11. separate error log messages with verbal descriptions, the RSTS/E error log prints out the contents of an MSCP packet that must be decoded. For the following example, it is easier to convert MSCP messages from their octal format to hexadecimal format. This section shows how to decode the information in a RSTS/E ERRDIS error log report. A sample RSTS/E error log report is given with note numbers on the right-hand side. Refer to the numbered notes following the sample printout to interpret the error log message.
    2.9.1 Sample Printout Of RSTS/E Error Log Report
    [Printout not legible in this copy]
    2.9.2 Notes For RSTS/E Error Log Report
    Note 1: This line of the sample error log printout reads RA80 for all RA81 and RA60 drive errors.
    Note 2: Note the last two digits of the second MSCP envelope word (000020). If they read 20, as shown here, the following MSCP packet contains error information. If the last two digits read 01, the MSCP message is an end packet and contains limited useful information. Do not use this document when the envelope word indicates this is an end packet.
    Note 3: The status code of the packet message is reporting the specific error or event that causes this error log report. If a coded message is given instead of a verbal description (for example, drive error), refer to Table A-5 in the appendix to interpret the m
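The octal-to-hexadecimal conversion and the Note 2 check on the envelope word can be sketched in a few lines of Python. This is only an illustration under stated assumptions: words are taken as six-digit octal strings as printed in the report, and the function names are hypothetical rather than part of ERRDIS.

```python
def octal_word_to_hex(word_text):
    """Convert a printed octal word such as '000020' to four hex digits."""
    value = int(word_text, 8)
    return f"{value:04X}"


def envelope_kind(word_text):
    """Classify the second envelope word by its last two octal digits (Note 2)."""
    last_two = word_text[-2:]
    if last_two == "20":
        return "error information"
    if last_two == "01":
        return "end packet"
    return "not covered by Note 2"


if __name__ == "__main__":
    word = "000020"  # envelope word shown in Note 2
    print(octal_word_to_hex(word), envelope_kind(word))
```

Converting every word of the packet the same way gives the hexadecimal form that the appendix tables are easiest to search.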
12. [Figure 2-10 MSCP Packet for Disk Transfer Error Format (CX-084B). Word groups 0-3, 4-7, 8-11, 12-15, 16-19, 20-23, 24-27. Labeled fields include: command reference number (2 words); drive number / logical unit number; sequence number; format code and flag; status/event code; low, mid, and high word of controller serial number; controller class and model; reserved; low, mid, and high word of drive serial number; drive class and model; low and high word of pack/HDA serial number; logical block number. Note in figure: A variable amount of controller or disk dependent information. The length of this message is dependent on the length of the error log message sent to the host software by the controller microcode. Often no controller or disk dependent information is provided. This information will typically not be interpreted by the error log programs and will thus be printed as a series of octal values.]
    2.9.4.3 Controller Error Format Chart
    The controller error format (as determined from the low byte of word 4) is used by the SDI type disk controllers to report errors that occur within the controller. The failing operation may be retried. The number of retries is a function of the type of error, the type of drive, and the type of controller. The results of each retry will be logged by the error log program with the same command reference number (words 0 and 1).
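Selecting the right format chart starts with the low byte of word 4. The Python sketch below shows that step, using the format codes listed in Table A-3 (Error Log Event Format Codes). It assumes the packet is available as a list of 16-bit integers indexed from word 0; that representation and the function name are illustrative assumptions only.

```python
# Format codes as listed in Table A-3, Error Log Event Format Codes.
ERROR_LOG_FORMATS = {
    0: "Controller errors",
    1: "Host memory access errors",
    2: "Disk transfer errors",
    3: "SDI errors",
}


def error_format(packet_words):
    """Return (format_code, description) from the low byte of word 4."""
    code = packet_words[4] & 0xFF
    return code, ERROR_LOG_FORMATS.get(code, "unknown format")


if __name__ == "__main__":
    # Hypothetical packet: only word 4 matters for this lookup.
    packet = [0, 0, 0, 0, 0x0103, 0, 0, 0]
    print(error_format(packet))
```

Once the format is known, the matching figure (SDI, disk transfer, controller, or host memory access) gives the layout of the remaining words.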
13. Table 2-1 LED Error and Symptom Codes (continued across pages)
    Columns: M7485 LEDs (8421), M7486 LEDs (8421), Error Symptoms, Most Likely Failure
    LED patterns as printed      Error symptoms                                             Most likely failure
    0001 XXXX                    Hex 1, undefined                                           Undefined
    0010 0000                    Hex 2, microcode stuck in init step 2                      M7485 or software
    0011 0000                    Hex 3, microcode stuck in init step 3                      M7485 or software
    0100 0000                    Hex 4, microcode stuck in init step 4                      M7485 or UNIBUS timeout error; host inactive
    0101 0000                    Hex 5, test complete; UDA50 communicating with host software   No problem
    0110 XXXX, XXXX 0110         Hex 6, undefined                                           Undefined
    0111 XXXX, XXXX 0111         Hex 7, undefined                                           Undefined
    1000 0000                    Hex 8, wrap bit 14 set in SA register                      M7485 or software
    1001 0000, 0000 1001         Hex 9, board one error                                     M7485
    1010 0000, 1010 1010         Hex A, board two error                                     M7486
    1011 XXXX, XXXX 1011         Hex B, undefined                                           Undefined
    XXXX 1100, 1100 XXXX         Hex C, timeout error; check error code in SA register      Many causes
    1101 XXXX, XXXX 1101         Hex D, RAM parity error                                    M7486
    1110 XXXX, XXXX 1110         Hex E, ROM parity error                                    M7485
    1111 1111                    Hex F, error                                               M7485
    Cycling pattern (both)       None                                                       No problem
    Cycling pattern (both)       The cycling pattern continues beyond the start of the host software initialization process. The UDA50 is not responding to the host CPU.   M7485
    The LEDs normally cycle while the UDA50 is waiting for the host to start the initialization process. At that time it responds to the initialization and the cycling pattern s
14. [Figure 1-5 UDA50 Switch Setting for Address 772150 (CX-262A), continued. Setting: OFF OFF ON ON OFF ON. Labels: T1, T2, ALWAYS ONES, ALWAYS ZEROS]
    NOTE: The UNIBUS address switches and jumpers should be set for a floating address when a second UDA50 is installed on a system. Check the system configuration and UNIBUS addresses of all devices already installed. Common floating addresses are 760340 and 760330.
    In past disk products, a vector address was also physically selectable. This is not true with the UDA50 Disk Controller. A vector address (typically 154 octal) will be supplied by the software.
    1.5.1 UNIBUS Tuning
    A UNIBUS system may experience data late conditions that can be remedied by tuning the UNIBUS. This process involves changing the relative positions of the nonprocessor request (NPR) devices on the bus. The device at the front of the bus (near the host) will have the highest priority. The device at the end of the bus will have the lowest priority.
    1.5.1.1 UNIBUS Device Positions
    The UDA should be placed at the end of the UNIBUS (lowest NPR priority) because it is heavily buffered. Other NPR devices should be placed along the UNIBUS depending on their buffering. The NPR devices with the least amount of buffering should be placed at the front of the UNIBUS.
    1.5.1.2 UDA NPR Priority Jumper
    A jumper has been inserted on the M7485 module to help tune the UNIBUS system. The jumper changes the average number of UDA NPR requests over a given amount
15. Subsystem Diagnostic Error Code List (Cont)
    Decimal error numbers listed on this page: 02005 through 02030
    Decimal Error Number   Description
    02005  Echo during receive of echo response from drive
    02006  Echo command responded with different data
    02007  Error bit set in get status response after drive clear
    02008  Timeout on send of online command to drive
    02009  Error during receive of online response from drive
    02010  Online command was unsuccessful
    02011  Online command did not return expected response code
    02012  Timeout on send of get unit characteristics command to drive
    02013  Error during receive of get unit characteristics command
    02014  Get unit characteristics command was unsuccessful
    02015  Get unit characteristics command did not return expected response code
    02016  Host program gave DM code improper data
    02017  Timeout on send of diagnose command to drive
    02018  Error during receive of diagnose command response from drive
    02019  Diagnose command was unsuccessful
    02020  Diagnose command did not return expected response code
    02021  Drive diagnostic reports a hard error
    02022  Host program downline loaded a diagnostic with zero byte count
    02023  Diagnostic requested by drive could not be supplied by host
    02024  Timeout on send of memory read command to drive
    02025  Error during receive of memory read response from drive
    02026  Memory read command was unsuccessful
    02027  Memory read command did not return expected response code
    02028  Timeout on send o
16. [Figure 2-1 Real Time Drive State Bits (CX-075A). X = do not care condition]
    The following four terms define the state of the drive as seen from the controller.
    Drive offline: The drive is not operational and may not communicate via the drive control protocol.
    Drive unavailable: The drive is operating, is visible to, and may at times communicate with the controller. However, the controller may not fully utilize the drive because it is online to the other controller.
    Drive available: The drive is visible to, capable of communicating with, and capable of executing an online command. However, the drive is not currently online to either controller.
    Drive online: The drive is dedicated to the exclusive use of one controller and is not available to the other.
    The following paragraphs explain the causes, effects, and relationships among the four state bits within the RTDS message.
    RCVR RDY (receiver ready): When asserted, this bit indicates the drive is ready to receive a command on the SDI interface write/command line. RCVR RDY is negated while the drive is processing a command.
    ATTN (attention): This notifies the controller that a potentially significant status change has occurred in the drive. The drive asserts this signal in the online state whenever any of its generic status bits change. The following three cases are exceptions to this rule.
    1. A generic status bit changes as a direct consequence of the correct operation
17. assembly, an I/O bulkhead assembly, and assorted hardware. Figure 1-1 illustrates the major field replaceable units (FRUs) in a UDA50 assembly.
    [Figure 1-1 UDA50 Illustrated Parts (CX-261A). Labels and part numbers in printed order: 50 pin flat cable; 40 flat cable; SDI cable assy 70-18455-6K; M7486; cable tie 90-07032-00; rear shield; inside cable housing retainer bracket 74-26094-01, 74-26095-01; UDA bulkhead sub assembly 70-18454-01; rear connector mount 74-26090-01; I/O panel with MASSBUS cutout; outside cable retainer bracket 74-26095-01]
    1.4 UDA50 MAINTENANCE FEATURES
    The UDA50 Disk Controller has the following maintenance features:
    UDA50 resident diagnostics
    UDA50 LED maintenance displays
    UDA50 host resident diagnostics
    The UDA50 resident diagnostic is a PROM based microcode program that performs UDA50 self diagnosis upon powerup or hard initialization.
    A UDA50 maintenance display is located on each UDA50 module. Each display consists of four LEDs. These LEDs display current resident diagnostic activity and error codes caused by malfunctions. Figures 1-2 and 1-3 show the location of the maintenance LEDs on each module.
    [Figure 1-2 Diagnostic LED Locations on UDA50 Module M7486]
    [Figure 1-3 Diagnostic LED Locations on UDA50 Module M7485 (CX-277C). Labels: MSb, LSb]
18. blocks
    04054  Operator error: bad block number exceeds maximum
    04055  Operator error: start cylinder greater than ending cylinder
    04056  Operator error: random and sequential seek cannot be mixed
    04057  Operator error: overflow calculating the LBN/DBN from cylinder
    04058  Operator error: track exceeds maximum for device or group exceeds maximum for device
    Table A-2 Subsystem Diagnostic Error Code List (Cont)
    Decimal Error Number   Description
    04059  Operator error: two identical tracks or groups
    04062  Operator error: cylinder too large, DBN/LBN exceeds maximum
    04063  Real time state received error during write
    04064  Real time state received error during read
    04068  Unknown error code during write
    04069  Unknown error code during read
    04070  Timeout of send
    04071  Timeout of receive
    04072  First word received was not a start frame
    04073  Framing error on level 0 receive
    04074  Checksum error on level 0 receive
    04075  Buffer size smaller than receive
    04076  Response level 2 command not as expected
    04077  Drive never deasserted receiver ready after send
    04078  Unknown error code returned from level 2 receive
    05000  Unable to find requested drive for testing
    Table A-3 Error Log Event Format Codes
    Format Code (Dec)   Format Code (Octal)   Format Code (Hex)   Format Description
    0                   0                     0                   Controller errors
    1                   1                     1                   Host memory access errors
    2                   2                     2                   Disk transfer errors
    3                   3                     3                   SDI errors
    Table A-4 E
19. CONTENTS (fragment)
    UDA50 FAULT ISOLATION (Cont)
    2.10    DECODING RSX ERROR LOGS
    2.10.1  Sample Printout Of RSX-11 Error Log Report (SDI error format)
    2.10.2  Notes for RSX-11 Error Log Report (SDI error format)
    2.10.3  Sample Printout Of RSX-11 Error Log Report (Disk Transfer format)
    2.10.4  Notes for RSX-11 Error Log Report (DISK TRANSFER error format)
            Remaining Error Log Packet Formats
            INTRODUCTION TO SPEAR
            Faulty FRU Selection
            Replace The FRU Suggested By SPEAR
            SUMMARY
    APPENDIX
    FIGURES
    TABLES
            UDA50 Illustrated Parts
            Diagnostic LED Locations On UDA50 Module M7486
            Diagnostic LED Locations On UDA50 Module M7485
            M7485 Address Switch and Jumper
20. of a command.
    2. A generic status bit changes as the result of an error in the reception, validation, or execution of a command.
    3. The RE status bit changes due to a transmission error outside of a command. The RE bit is described in byte 6 of the drive status message.
    An online drive may assert ATTN regardless of whether a command is in progress or not. The drive will continue to assert this signal until it receives a valid GET STATUS command from the controller. At this point, the drive will negate the ATTN signal.
    A spinning drive in the available state always asserts the ATTN signal. The ATTN signal is negated if any condition arises that would prevent the available drive from spinning up under controller command.
    R/W RDY (read/write ready): This indicates the drive is capable of handling a data transfer to or from the disk surface. Upon receipt of the start frame of a command, the drive negates R/W RDY prior to reasserting RCVR RDY. The signal will remain negated until the drive has processed the command and has transmitted the end frame of the response, if required. Any head motion negates this signal until the operation is completed and the drive is again ready to perform I/O operations. The drive asserts R/W RDY after the successful completion of a seek operation. If the operation is unsuccessful, the drive will keep the R/W RDY signal negated and use ATTN to signal the problem.
    AVAIL (available):
21. of time by delaying the request for 0, 6.2, or 10 microseconds. Table 1-1 shows the amount of delay and jumper configuration.
    Table 1-1 UNIBUS Delay
    Amount of Delay      Jumper Configuration
    0 microseconds       T4, T6
    6.2 microseconds     T5, T6
    10 microseconds      T6, T7
    If data late conditions are observed after setting the delay to the 6.2 microseconds position, the jumper should be set to the 10 microseconds position (T6, T7).
    On some systems it will not be possible to remedy data late errors by changing the UDA NPR Priority Jumper. The following is a list of systems that cannot use a UDA, along with rules on how many UDAs can be installed on a system.
    The UDA/RK07/DMR11 configuration on an 11/70 only gives data late errors from the RK07, regardless of the UDA's jumper setting. Either an RK07 or a UDA, but not both, can be configured on the 11/70 when a 1 megabit per second DMR11 is present.
    On both PDP-11 and VAX systems, no more than two UDAs may be installed on a UNIBUS with nonbuffered UNIBUS peripheral devices.
    NOTE: If a bus repeater is used, a greater possibility of data late errors exists. In general, the longer the UNIBUS, the greater the possibility of data late errors.
    1.5.1.3 UDA Burst Parameter
    The UDA burst parameter is a host software value that indicates how many long words (32 bits) the UDA will attempt to transfer when it accesses the UNIBUS. The default for this parameter is 1 but
22. two response from the drive has an invalid opcode, an improper length, or is not a possible response in the context of the exchange.
    Hex    Octal   Description
    16B    553     Clock resumption fails after initialization. For SDI drives, the drive clock does not start after a controller attempt to initialize the drive.
    18B    613     Clock persists after initialization. For SDI drives, the drive clock continues beyond drive initialization.
    1AB    653     Receiver ready collision. For SDI drives, the controller attempts to assert its receiver ready to receive a response and the drive's receiver ready is still asserted to receive a command.
    1CB    713     Response overflow.
    Table A-6 Controller Class Values
    Class Byte (Decimal)   Subsystem Type
    0                      Reserved
    1                      Mass storage controllers
    2                      Disk class device (DEC Standard 166 format)
    3                      Tape class device
    4                      Disk class device (DEC Standard 144 format)
    Table A-7 Controller Model Values
    Model Byte (Decimal)   Controller Type
    0                      Reserved
    1                      HSC50
    4                      VMS MSCP server
    5                      TU81
    6                      UDA50
    Table A-8 Drive Model Number Values
    Model Byte (Decimal)   Device Model
    1                      RA80 fixed media disk drive
    4                      RA60 removable media disk drive
    5                      RA81 fixed media disk drive
    Table A-9 MSCP Error Codes
    Octal Code (values not legible in this copy)   Definition
    Error is logged by the bad block replacement module.
    Driver is sending a command at the time of the error.
    Driver can not find a free command packet.
    Driver determined th
23. [Hex and octal code columns, fragment: 353; 10B, 413; 12B, 453]
    Description
    Data memory error. The controller detects an error in an internal memory, such as a parity error or a nonresponding address. This subcode only applies to errors not reported via MSCP. These errors do not affect the controller's ability to properly generate end and error log messages. For most controllers, this subcode is only returned for controller memory errors in data or buffer memory and noncritical control structures. If the controller has several such memories, the specific memory involved is reported as part of the error address in the error log message.
    PLI reception buffer parity error. N/A for UDA.
    PLI transmission buffer parity error. N/A for UDA.
    Drive command timeout. For SDI drives, the controller's timeout expires for either a level two exchange or the assertion of read/write ready after an initiate seek.
    Controller detected transmission error. For SDI drives, the controller detects an invalid framing code or a checksum error in a level two response from the drive. The UDA50 also returns this subcode for controller detected protocol errors. All other SDI controllers return subcode 9 for protocol errors.
    Positioner error (mis-seek). The drive reports a seek operation is successful, but the controller has determined the drive has positioned itself to an incorrect cylinder.
    Lost read/write ready during or between transfers. For SDI drives, read/write ready is n
24. [Figure 2-3 Status Message Interpretation (CX-077). Bit layouts, MSB to LSB:]
    CONTROLLER BYTE:  S4 S3 S2 S1 C1 C2 C3 C4
    ERROR BYTE:       DE RE PE DF WE 0 0 0
    MODE BYTE:        W4 W3 W2 W1 DD FO DB S7
    REQUEST BYTE:     OA RR DR SR EL 0 PS RU
    SUBUNIT MASK BITS AND FOUR HI ORDER BITS OF UNIT NUMBER
    8 BIT UNIT NUMBER
    Byte 1 is the GET STATUS response code and is not printed out.
    Byte 2 and the lower half of byte 3 comprise a three digit hexadecimal unit number. In the example, the unit number is 020 hexadecimal or 32 decimal.
    Byte 3 (upper half) reflects the subunit mask and indicates the drive sending the status is subunit zero (0001).
    Byte 4 is the request byte and breaks down as follows:
    0001 0011
    OA RR DR SR EL 0 PS RU
    The RU bit is set, indicating the drive RUN switch is depressed.
    The PS bit is set, indicating the port select switch for the UDA requesting the status is depressed. The drive is available to the UDA50.
    The SR bit is set, indicating the drive spindle is up to speed.
    The OA bit is not set, indicating the drive is at a drive available state. If it had been set, it would indicate online.
    The RR bit is not set, indicating the selected drive does not need an internal adjustment.
    The DR bit is not set, indicating the selected drive has no request for an external diagnostic to be loaded into it.
    Byte 5 is the mode byte and breaks down as follows:
    0000 0110
    W4 W3 W2 W
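The bit layouts above lend themselves to a small table-driven decoder. The Python sketch below assumes each byte is available as an integer and uses the MSB-to-LSB bit names from the figure. The mode byte value used in the example (binary 0000 0110) is inferred from the bits the text describes as set (FO and DB); the function names are hypothetical.

```python
# Bit names listed MSB first, as in the status message figure.
REQUEST_BITS = ["OA", "RR", "DR", "SR", "EL", None, "PS", "RU"]
MODE_BITS = ["W4", "W3", "W2", "W1", "DD", "FO", "DB", "S7"]
ERROR_BITS = ["DE", "RE", "PE", "DF", "WE", None, None, None]


def set_bits(value, names):
    """Return the names of the bits set in an 8-bit value (MSB first)."""
    return [name for i, name in enumerate(names) if name and value & (0x80 >> i)]


def unit_number(byte2, byte3):
    """Combine byte 2 with the low four bits of byte 3 into the unit number."""
    return ((byte3 & 0x0F) << 8) | byte2


if __name__ == "__main__":
    print(set_bits(0b00010011, REQUEST_BITS))  # request byte from the example
    print(set_bits(0b00000110, MODE_BITS))     # mode byte inferred from the text
    print(unit_number(0x20, 0x10))             # unit 020 hex, subunit mask 0001
```

Decoding the example request byte yields SR, PS, and RU, which agrees with the bit-by-bit explanation in the text.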
25. CP and SDI bus structure compensates for differences between disk drives. Also included with the PDP-11 subsystem diagnostic is a disk formatter program, CZUDE. The formatter (CZUDE) is not a diagnostic; do not run it unless specifically instructed.
    The CZUDC diagnostic program asks both hardware and software questions of the user. A sample printout of these questions, when the default conditions are selected, is shown below.
    If the manual intervention question is answered yes, the following series of questions will be asked. Manual intervention should be used only to further isolate problems after running the test with the default answers shown above.
    [Printout not legible in this copy]
    Answer this question to bypass further questioning. A Y answer results in the following questions.
    The last two questions will be asked one to four times, depending on the answer to the previous question.
    If answered 2: The last question may be asked one to seven times, depending on the answer to the previous question.
    If answered 3: The last question may be asked one to seven times, depending on the answer to the previous question.
    The following question is asked only if options 2 or 3 were requested.
    The following question will be asked if the LIMIT THE CYLINDERS question was answered Y within options 2 and 3, or if option 4 was selected for the area to test.
    [Printout not legible in this copy]
26. CP error codes. Status code of packet: The status code of the packet is a summary statement of the condition which prompted the error log entry. The condition statement is obtained from an analysis of the status event field. This statement is very general and does not give a detailed cause for the error. Refer to Table A-10 in the appendix for a list of status codes for the packet. 17. SDI status message: An SDI status message is received when the error log is reporting an SDI error event (message format 03). In a VMS SYE error log report, this message is given in the form of three 8-character fields. This message contains the contents of bytes 4 through 15 of the drive status message shown in Figure 2-2. Bytes 4 through 8 contain controller-specific information, and bytes 9 through 15 contain drive-specific information. Status byte 15 contains the same drive error code normally obtained through the hand-held terminal. In the RSTS ERRDIS error log, the SDI status information is found in words 22 through 27 of the MSCP packet. This packet is printed out in the form of 37 character groups. Refer to the sample printout of the ERRDIS error log report. 2.8 DECODING VMS ERROR LOG REPORTS. This section shows how to decode the information in a VAX/VMS error log report. Two sample VAX/VMS error log printouts are given with note numbers on the right-hand side. These numbered notes after each sample printout show how to interpret the error log mess
27. EK-UDA50-SV-003 UDA50 Service Manual. Prepared by Educational Services of Digital Equipment Corporation. First Edition, May 1983. Second Edition, August 1984. Copyright 1984 by Digital Equipment Corporation. All Rights Reserved. The material in this manual is for informational purposes and is subject to change without notice. Digital Equipment Corporation assumes no responsibility for any errors which may appear in this manual. The UDA50 Controller is designed to work with Digital Equipment Corporation host computers, tape and disk products. Digital Equipment Corporation assumes no responsibility or liability if the computers, tape or disk products of another manufacturer are used with the UDA50 subsystem. Printed in U.S.A. This document was generated using DSRPLUS. Class A Computing Devices Notice: This equipment generates, uses, and may emit radio frequency energy. The equipment has been type tested and found to comply with the limits for a Class A computing device pursuant to Subpart J of Part 15 of FCC Rules, which are designed to provide reasonable protection against such radio frequency interference when operated in a commercial environment. Operation of this equipment in a residential area may cause interference, in which case the user at his own expense may be required to take measures to correct interference. The following are trademarks of Digital Equipment Corporation Maynard Massach
28. Error Log Event Format Codes; Error Log Message Flags; Error Log Status Event Codes; Controller Class Values; Controller Model Values; Drive Model Number; MSCP Error Codes; Status Code of MSCP Packet; UDA50 Internal Error Code. 1 INTRODUCTION. 1.1 SCOPE OF MANUAL. The UDA50 Service Manual describes the maintenance and troubleshooting procedures needed to support the UDA50 Disk Controller. The manual covers both UDA50-resident diagnostic and UDA50 host-resident diagnostic operating procedures. The service manuals for individual disk products provide device-specific service information for troubleshooting disk subsystem problems. 1.2 UDA50 MAINTENANCE PHILOSOPHY. The maintenance philosophy planned for the UDA50 Disk Controller is module replacement. Field Service personnel should not attempt to replace or repair component parts within these modules. 1.3 UDA50 FIELD REPLACEABLE PARTS. The UDA50 Disk Controller consists of two hex modules, two flat ribbon intermodule cables, an unshielded standard disk interface (SDI) cable
29. NOSTICS; INTERPRETING HOST RESIDENT DIAGNOSTIC MESSAGES; Status Address (SA) Register Contents; Data Comparison Errors; Real Time Drive State and Status; Status Message Bytes; Status Message Interpretation; SUBSYSTEM ERROR MESSAGES; HOST ERROR LOG; COMMON ERROR LOG MESSAGE DEFINITIONS; DECODING VMS ERROR LOG REPORTS; Sample Printout Of VMS Error Report Sequence 1; Notes For VMS Error Report Sequence 1; Sample Printout Of VMS Error Report Sequence 2; Notes For VMS Error Report Sequence 2; DECODING RSTS/E ERROR; Sample Printout Of RSTS/E Error Log; Notes For RSTS/E Error Log; Decoding MSCP Packet Message; Recovering Error Information In MSCP; Status Word Availability; Decode Status
30. [Garbled sample RSX-11 error log printout. Recoverable field names: UNIT MODEL FIELD, UNIT CLASS FIELD (DISK CLASS DEVICE), UNIT SOFTWARE VERSION FIELD, UNIT HARDWARE VERSION NUMBER FIELD, RETRY LEVEL/COUNT FIELD, VOLUME SERIAL NUMBER FIELD, HEADER VALUE FIELD.] 2.10.4 Notes For RSX-11 Error Log Report (DISK TRANSFER Error Format). Note 1: For transfer errors, the block number (decimal) is given here. Note that it converts to the HEADER VALUE FIELD. Note 2: The MESSAGE FORMAT FIELD in this case is a Disk Transfer Error Log Format. Note 3: The EVENT CODE always gives the UDA50 interpretation of the error. Note 4: The CONTROLLER SOFTWARE VERSION NUMBER must be 3 or higher. Note 5: Pack/HDA serial number. This is good to make note of on data transfer errors. Note 6: This is the LBN address of the error sector. This converts to 37541 decimal, which is the same number RSX indicated for the block number. See Note 1 above. 2.10.5 Remaining Error Log Packet Formats. The two remaining packet formats are the CONTROLLER ERROR MSCP PACKET FORMAT and the HOST MEMORY ACCESS ERROR PACKET FORMAT. These formats are not shown as examples but the f
31. [Garbled sample error log printout with margin note callouts. Recoverable field names: CONTROLLER MODEL, CONTROLLER CLASS (MASS STORAGE CONTROLLERS), CONTROLLER SOFTWARE VERSION NUMBER FIELD (FIRMWARE VERSION), CONTROLLER HARDWARE VERSION NUMBER FIELD (UDA HARDWARE VERSION), MULTIUNIT CODE FIELD (ACCESS PATH, SHARED SPINDLE), UNIT MODEL FIELD, UNIT CLASS FIELD (DISK CLASS DEVICE), UNIT SOFTWARE VERSION NUMBER FIELD, UNIT HARDWARE VERSION NUMBER FIELD, GROUP FIELD, VOLUME SERIAL NUMBER FIELD, HEADER VALUE FIELD.] NOTE: The following bytes are entered into the error log entry from information provided by the drive. This information is obtained from the drive via a host GET STATUS command. [Garbled printout of the SDI SUPPLIED STATUS FIELD. Recoverable entries: FORMAT OPERATION ENABLED, 512 SECTOR FORMAT, SPINDLE READY, EL SET, PORT SWITCH IN, RUN SWITCH.]
32. PEAR Again consult with the system manager before placing the disk drive offline. If the drive is already offline, there is no danger of bringing down the customer's operating system. Use the appropriate drive service manual to see how to replace the FRU. 2.12 SUMMARY. UDA50 problems generally appear in diagnostic printouts as SA register error codes. Drive problems appear in the error report without SA register error codes. Instead, the error report will print the real-time drive state code and the general status. When this happens, the drive maintenance guide or service manual should be consulted. These manuals give instructions on how to run the drive-specific diagnostics to identify the drive failure. APPENDIX TABLES. Listed below are Tables A-1 through A-11.
Table A-1 Bit Description of Status Message Bytes
Byte 1  Response Code Field: Byte 1 is the response code to a controller command.
Byte 2  Unit Number: The unit number consists of two hexadecimal digits representing the unit number of the selected disk drive returning the status (0-254).
Byte 3  Subunit Mask: The subunit mask is a four-bit representation of the subunit that is returning the status message. The right-most bit position represents subunit 0. The left-most bit position represents subunit 3. Only one bit can be set at a time
33. [Garbled sample error logging system printout. Recoverable field names: I/O OPERATION INFORMATION (FUNCTION, TYPE OF ERROR), DEVICE SUPPLIED INFORMATION, MESSAGE ENVELOPE FIELD (CONNECTION: MSCP DISK; MESSAGE TYPE: DATAGRAM MESSAGE; CREDITS; MESSAGE LENGTH), COMMAND REFERENCE NUMBER FIELD, UNIT NUMBER FIELD, MESSAGE SEQUENCE NUMBER, flags field (SEQUENCE NUMBER RESET), EVENT CODE FIELD (ECC ERROR; MAJOR STATUS: DATA ERROR), CONTROLLER MODEL FIELD, CONTROLLER CLASS, CONTROLLER SOFTWARE VERSION NUMBER FIELD, CONTROLLER HARDWARE VERSION NUMBER FIELD, MULTIUNIT C
34. [Figure 2-12 MSCP Packet for Host Memory Access Error Format: word layout including the sequence number, logical unit number, controller serial number (low, mid, and high words), and host memory address.] 2.9.5 Status Word Availability. The status words contain important controller status, drive status, and error information. These status words are only available when the error log program is reporting an SDI error format message (format code 03). The availability of the status words is determined by checking which error format code the error log event is reporting. In the VMS error log report, this is determined by reading the MSLG$B_FORMAT line. In a RSTS/E error log report, word 4 of the MSCP packet must first be decoded as shown in Figure 2-9. 2.9.6 Decode Status Words. Decode the error log report as an SDI error format type (format code 03). SDI error format reports contain useful information on the UDA and disk drive in the SDI status message. For VMS error log reports, the SDI status word information is found on the MSLG$Z_SDI line. It is reported as three lines of hexadecimal data and shown in decoded form in Figure 2-6. For RSTS/E error log reports, the SDI status word information is given in words 22 through 27 of the MSCP packet. This information is given in an octal word format and is decoded as shown in Figures 2-7, 2-8, and 2-13. After t
35. SPEAR (Standard Package for Error Analysis and Reporting) program should be run if it is available on the operating system. SPEAR goes beyond the typical error log capabilities. It not only accumulates data, but it has the capability of analyzing and predicting which FRU is at fault. SPEAR is a library of functions that sorts, evaluates, and reports on events recorded in the local system event file. SPEAR is currently used on TOPS-10, TOPS-20, and VMS operating systems. Plans are underway to incorporate RAnn drives under its diagnostic analysis. SPEAR Reference Cards may be ordered from Printing and Circulation Services under part number EK-SPEAR-RC. This reference card summarizes the SPEAR function codes and system event codes, and provides other useful information. If SPEAR is not available to you, permission should be obtained from the operating system manager to run the subsystem diagnostics. This requires customers to give up their operating system temporarily. 2.11.1 Faulty FRU Selection. If SPEAR did not pick out the faulty FRU, then consult with the system manager to see if the suspect disk drive can be taken offline to run the drive-resident diagnostics. It is important to consult with the system manager first, since the operating system may depend on the disk pack for a system image or back-up file. With the system manager's approval, place the suspect disk offline and run the drive-resident diagnostics. 2.11.2 Replace The FRU Suggested By S
36. UDA50 subsystems can handle only drives that contain no subunits. Therefore, the only valid number in this status byte is a hexadecimal 1. Figure 2-4 shows the bit layout. For drives that contain no subunits (e.g., the RA80), the right-most bit position is always set to a 1, indicating subunit 0.
High Unit Number: Byte 3 contains the upper four bits of a 12-bit (3 hexadecimal digits) unit number.
OA: A binary one in this position indicates the drive is unavailable to the UDA50. A binary zero indicates the drive is available to the UDA50.
RR: A binary one in this position indicates the drive requires an internal readjustment. Some drives do not use this bit.
DR: A binary one in this position indicates there is a request for a diagnostic to be loaded in the drive microprocessor memory. A binary zero indicates that no diagnostic is being requested of the host system.
SR: A binary one in this position indicates the drive spindle is up to speed. A binary zero indicates the drive spindle is not up to speed.
EL: A binary one in this bit position indicates that there is loggable information in the extended status area (Bytes 9-15). A binary zero indicates that no information is available in the extended status area.
PS: A binary one in this bit position indicates the drive port select switch for this controller is pushed in (selected). A binary zero indicates the switch is out.
RU: A binary one in this position indicat
37. ages 2.8.1 Sample Printout Of VMS Error Report Sequence 1. [Sample printout not recoverable from the extraction.] 2.8.2 Notes For VMS Error Report Sequence 1. Note 1: This is the command number which caused this error to be reported. It will be zero if no host command is associated with this error. Note 2: This drive unit number relates to the error log message. Note 3: Presently unused. Will be zero. Note 4: The 02 indicates the format of this message is that of a disk transfer error. Refer to Table A-3 in the appendix. Note 5: The 41 indicates the sequence number reset flag (01) and the operation continuing flag (40) are set. Refer to Table A-4 in the appendix. Note 6: The 00CB identifies a specific error or event being reported by this error log message. The error is LOST RECEIVER READY DURING TRANSFER. Refer to Table A-5 in the appendix. Note 7: The controller identification message decodes as shown in Figur
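Note 5 above describes the flags value 41 as the combination of two flag bits. A minimal sketch of that decode follows; it assumes the value is hexadecimal (as the other fields in the VMS report appear to be), uses a hypothetical function name, and knows only the two flags named in the note. The full flag list is in Table A-4, which this excerpt does not reproduce.

```python
# Only the two flags named in Note 5 are listed here; any other bits are
# returned as a leftover mask rather than guessed at.
KNOWN_FLAGS = {
    0x01: "sequence number reset",
    0x40: "operation continuing",
}


def decode_flags(value):
    """Split a flags byte into known flag names and a mask of unknown bits."""
    names = [name for mask, name in sorted(KNOWN_FLAGS.items()) if value & mask]
    known_mask = 0
    for mask in KNOWN_FLAGS:
        known_mask |= mask
    unknown = value & ~known_mask & 0xFF
    return names, unknown


if __name__ == "__main__":
    names, unknown = decode_flags(0x41)
    print(names, hex(unknown))
```

Returning the unknown bits separately keeps the sketch honest about what the excerpt does and does not define.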
38. and capable of being spun up. This code indicates an invalid drive state. R/W RDY should not be asserted with RCVR RDY negated.
Table 2-3 Real Time Drive State Code Interpretation (Cont)
RTDS Hex Code  Description
8001  This is the normal drive online state.
8002  This code indicates an invalid drive state. ATTN is asserted and RDY is negated, preventing the drive from receiving controller commands.
8003  The drive is online and one of two conditions exists: (1) one of the switches on the drive operator control panel has been pushed; (2) the drive is reporting a successful retry of a seek with recalibration.
8040  This code indicates an invalid drive state. R/W RDY and AVAIL should never be asserted together. Also, ATTN should be asserted when the drive is available and capable of being spun up.
8041  This code indicates an invalid drive state. R/W RDY and AVAIL should never be asserted together. Also, ATTN should be asserted when the drive is available and capable of being spun up.
8042  This code indicates an invalid drive state. R/W RDY and AVAIL should never be asserted together. Also, ATTN is asserted and the drive cannot receive controller commands with RCVR RDY negated.
8043  This code indicates an invalid drive state. R/W RDY and AVAIL should never be asserted together.
FFFF  The controller is unable to get a valid drive state.
2.4.4 Status Message Bytes. The status line information found in the error messa
39. and the controller model. The controller unique number is blasted into ROM. The controller class refers to whether it is a mass storage controller or some other disk or tape device. Refer to Table A-6 in the appendix for controller class values and Table A-7 for the controller model values. Drive identification: The drive identification message contains a drive unique number blasted into ROM, which is not necessarily the drive serial number etched on the drive S/N tag. It also contains a controller class number, shown previously in Table A-6, and a drive model number, given in Table A-8. Hardware and software revisions: These revision levels are given for both the controller and the drive. The values are blasted into ROM for each device. Pack/HDA serial number: This is the low-order 32 bits of the serial number of the HDA mounted on the drive. This serial number is written on the media at the factory. The bit field is zero if the format of the media does not provide for a media serial number. The bit field is an undefined number if the media is not mounted or the serial number cannot be read. Header or logical block number: This message gives the logical block number (LBN) of the physical sector where the error occurred. If the high four bits are 0000 binary, then the low 28 bits are the logical block number where the error occurred. If the high four bits are 0110 binary, the low 28 bits are the replacement block number where the error occurred. Erro
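The header rule described above (high four bits select the block type, low 28 bits carry the block number) can be sketched as follows. The function name and the "unrecognized" fallback are assumptions made for illustration; the excerpt defines only the 0000 and 0110 cases.

```python
def decode_header(header):
    """Split a 32-bit header value into (block type, 28-bit block number)."""
    high = (header >> 28) & 0xF        # high four bits select the type
    block = header & 0x0FFFFFFF        # low 28 bits hold the block number
    if high == 0b0000:
        kind = "logical block"
    elif high == 0b0110:
        kind = "replacement block"
    else:
        kind = "unrecognized"          # not defined in this excerpt
    return kind, block


if __name__ == "__main__":
    print(decode_header(37541))        # an LBN with the high bits clear
    print(decode_header(0x60000005))   # hypothetical replacement block 5
```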
40. are the drive inoperative and mark it offline. The number of retries is different for each disk drive. The chart in Figure 2-9 is used to interpret the MSCP packet for the SDI error format. The tables in the appendix show how to decode the packet words. [Figure 2-9 MSCP Packet for SDI Error Format: word layout covering the command reference number, logical unit number, sequence number, format and flags, status/event code, controller serial number (low, mid, and high words), drive serial number, pack/HDA serial number, logical block number, UDA status and error information, and error count.] 2.9.4.2 Disk Transfer Error Format Chart. The disk transfer error format, as determined from the low byte of word 4, is used by the SDI-type disk controllers to report errors that occur during a disk transfer. This format is generally used to report the results of a series of retries. Each retry is recorded by the error log program with the same command reference number. If the retries are unsuccessful, the controller may declare the drive inoperative and mark it offline. The chart in Figure 2-10 shows how to interpret the MSCP packet for the disk transfer error format. The tables in the appendix show how to decode the packet words.
41. can range from 1 to 32. Increasing the UDA burst parameter to a number greater than increases the overall system efficiency. However, data late conditions are more likely to exist. 1.6 UDA50 PRIORITY PLUG. All UDA50 M7485 modules are shipped with a level 5 priority plug. Because this is the recommended priority level for UDA50 disk subsystems, the priority plug need not be changed for the majority of installations. If another priority level is required in some special circumstance, the current priority plug must be removed and the new one inserted. The location of the priority plug is shown in Figure 1-4. It should be inserted so the notch on the priority plug aligns with the hole on the module socket. 1.7 INSTALLATION OF BOOTSTRAP ROM. The proper bootstrap ROMs are shipped with the UDA50. Bootstrap 23-767A9-00 must be installed on the PDP-11 bootstrap ROM module M9312. Bootstrap ROM 23-990A9-00 must be installed on the VAX-11/750. 1.8 RELATED DOCUMENTATION. DIGITAL customers can order the following list of UDA50-related manuals: UDA50 USER GUIDE (EK-UDA50-UG); UDA50 SERVICE MANUAL (EK-UDA50-SV); UDA50 MAINTENANCE GUIDE 185; UDA50 FIELD MAINTENANCE PRINT SET (MP-01331); DSA CONTROLLER DOCUMENTATION KIT (QP906-GZ); DSA DRIVES DOCUMENTATION KIT (QP907-GZ). The DSA Controller kit consists of a small looseleaf binder and the maintenance guides for all the DSA controllers. The DSA Drives Kit consis
42. d CZUDC Host Error Messages. Error numbers: 00001 00002 00003 00004 00005 00006 00007 00008 00009 00010 00013 00020 00021 00022 00023 00024 00025 00026. Descriptions, in table order: The program does not like the way you answered the hardware questions, or the UDA50 was given more than 1 vector or BR level or burst rate. The program does not like the way you answered the hardware questions; two units select the same drive. The program does not like the way you answered the hardware question; more than eight drives selected on UDA. Not enough room in memory to test the units selected. Checksum error in DM program file. Table inconsistency error; reload program. Error in DM program file. DM program not found. Two UDAs use the same vector. Illegal configuration for test 4. Wrong APT diagnostic is being used with this controller; use CIUDX. Microcode reports controller model that did not match get status response. Memory error trying to read UDA registers; check UNIBUS selection switches on UDA module M7485 or UNIBUS. UDA resident diagnostic failure; replace M7486. Step bit did not set in UDASA register; replace M7485. UDA did not clear ring structure in host memory during initialization; replace M7485. UDASA register did not go to zero after step 3 write of initialization; suspect either 7485 or the UNIBUS. UDA did not return correct UDASA register information; replace 7485. Data compare error, port loop test; replace 7485. Table A-2 Subsystem Diag
43. e 2-4. Use Tables A-6 and A-7 to decode the controller class and controller model. [Figure 2-4 Decode of Controller Identification Message: unique controller ID 000000000000; model 06 = UDA50; class 01 = mass storage controller class.] Note 8: This is the microcode version from the UDA. Listed below are the possible codes: 02, old UDA, needs upgrade; 03, 09 microcode; 04, U21 microcode. Note 9: Not used. Will be zero. Note 10: For a UDA50, the LSB is the port number (0-3). All other bits are zero. Note 11: The unit identification message gives the drive unique device number (serial number for RA80/RA81, but a unique number for RA60), the device class, and the drive model. Figure 2-5 shows how to decode the unit identification message. Refer to Tables A-6 and A-8 for the controller class and drive model. [Figure 2-5 Decode of Unit Identification Message: unique identifier 000000000482; drive model 04 = RA60; class 02 = disk class device.] Note 12: This message indicates the error recovery procedure used. Codes of zero and 255 indicate no special error recovery is used. Note 13: This message indicates the number of retries attempted under the error recovery procedure given in note 11. Note 14: The volume serial number gives the serial number of the disk media. Note 15: The header message gives the physical location where the error occurred. I
44. e unit is hung. Disk unit size is too big (over pack cluster size 16). Controller is offline. Unit is not functional. Command timed out. Data error during read/write command.
Table A-10 Status Code of the MSCP Packet
Packet Code  MSCP Packet Status
Success  The command or retry of a failed command is successfully completed.
Invalid command  An invalid command or command parameters are received by the controller.
Command aborted  The controller aborts a command in progress.
Unit off-line  The unit identified in the unit number field is in the off-line state.
Unit available  The unit identified in the unit number field is in the available state.
Media format error  The pack or HDA mounted in the drive appears to be formatted incorrectly.
Write protected  A command requiring a write operation is attempted on a write-protected unit.
Compare error  A compare host data command finds a difference in the data that is written and the data in host memory (like a write check command).
Data error  Invalid or uncorrectable data is obtained from the drive.
Table A-10 Status Code of MSCP Packet (Cont)
Packet Code  MSCP Packet Status
Host buffer access error  The controller encounters an error (like UNIBUS timeout) when trying to access host memory.
Controller error  The controller encounters an internal controller error.
Drive error  The controller discovers an error within a drive.
Suc
45. ecksum error on level 0 response
03006  Response longer than expected
03007  Code received from subsystem unintelligible
Table A-2 Subsystem Diagnostic Error Code List (Cont)
Decimal Error Number  Description
03008  Command did not return expected response code
03009  Drive not asserting receiver ready in drive state
03010  Failed to receive valid drive state
03011  Cannot receive drive state from drive; check power
03012  Drive state received with bad parity
03013  No valid state from drive
03014  Subunit characteristics say there are zero read-only groups in diagnostic area
03015  Subunit characteristics say less than 1 read/write groups in diagnostic area
03016  Neither read/write ready nor attention set after recalibration command
03017  Subunit characteristics say less than 1 diagnostic cylinder
03018  Read/write ready dropped before format operation
03019  Format operation reported timeout failure
03020  After recalibration, error bits were set
03022  Read/write ready dropped before write operation
03023  Could not read or write any block on this track (write operation failure)
03024  Read/write ready dropped before read operation
03025  Could not read or write any block on track (read operation failure)
03026  Could not read or write any block on track (data compare word failure)
03027  Seek complete timeout; read/write ready did not set
03028  No block on this track can be read (last block tried)
03029  Available was not asserted a
46. ed, type SET TRACE on a VAX prior to starting the test. The following is a sample test printout. [Sample printout not recoverable from the extraction. Surviving fragments: UDA DISK SUBSYSTEM DIAGNOSTIC; DRIVE DIAG ON UDA ADDRESS; INITIAL WRITE COMPLETE.] The disk exerciser diagnostic in test 4 will continue to run until halted with a CTRL/C. Type CTRL/C to return to the diagnostic supervisor prompt (DS>), then type ABORT. If test 1 is successful, the problem is probably drive related. Tests 2 through 4 should detect the failure. 2.4 INTERPRETING HOST RESIDENT DIAGNOSTIC MESSAGES. The VAX and PDP-11 diagnostics display the same error messages. Error messages take on three distinct formats, which provide SA register contents, data comparison error information, or real-time drive state and status. Consult the drive service manual or maintenance guide for interpretation of the status messages. 2.4.1 Status Address (SA) Register Contents. The following sample error message gives the UDA50 SA register contents. For a description of the error and a callout of the most likely faulty FRU, find the SA register contents (100004) in Table 2-2. 2.4.2 Data Compar
47. ed the controller microcode from being able to keep up with the data transfer to or from the drive.
112  EDC error. The sector is read with correct or correctable ECC and an invalid EDC. There is most likely a fault in the ECC logic of this controller or the controller that last wrote the sector.
152  Inconsistent internal control structure. Some high-level check detects an inconsistent data structure. For example, a reserved field contains a nonzero value, or the value in a field is outside its valid range. This error usually implies the existence of a microcode bug.
212  Internal EDC error. Some low-level check detects an inconsistent data structure. For example, a microcode-implemented checksum or vertical parity (hardware parity is horizontal) associated with internal sector data is inconsistent. This error usually implies a fault in the memory addressing logic of one or more of the controller's processing elements. It may also result from a double-bit error or other error that exceeds the error detection capability of the controller's hardware memory checking circuitry.
412  Data bus overrun. The controller attempts to perform too many concurrent transfers, causing one or more of them to fail due to a data overrun or underrun.
Table A-5 Error Log Status Event Codes (Cont)
Hex Code, Octal Code: 12A 452; 14A 512; 16A 552. Drive Error Subcode (Hex B): 2B 53; 4B 113; 6B 153; 8B 213; 253 31
48. egated when the controller attempts to initiate a transfer or at the completion of a transfer. Read/write ready is previously asserted, indicating the completion of the previous seek. This usually results from a drive-detected transfer error, in which case an additional error log message may be generated containing the drive-detected error subcode.
Drive clock dropout. For SDI drives, either data clock or state clock is missing when it should be present. This is usually detected by a timeout.
Lost receiver ready for transfer. For SDI drives, receiver ready is negated when the controller attempts to initiate a transfer, or does not assert at the completion of a transfer. This includes all cases of the controller's timeout expiring for a transfer operation (level one real-time command).
Drive detected error. For SDI drives, the controller receives a get status or unsuccessful response with the EL flag set. The controller may also receive this response with the DR flag set. It does not support automatic diagnosis for that drive type.
Controller detected pulse or data parity error. For SDI drives, the controller detects a pulse error on either the state or data line, or the controller detects a parity error in a state frame.
Drive requested error log (EL bit set).
Table A-5 Error Log Status Event Codes (Cont)
Hex Code  Octal Code  Description
14B  513  Response length or opcode error. For SDI drives a level
49. ernal Error Code (Cont). Internal Error Code (hex): C, E, 17. Descriptions, in table order: ER DMX, diagnostic mode XFC error. Invalid virtual circuit identifier. The UDA50 is trying to execute an MSCP command in the wrong mode. ER IWR, interrupt write error on UNIBUS. Set if the UDA cannot write to the host and set the Response ring not zero. ER SUN, too many subunits on the UDA50. Digital Equipment Corporation, Colorado Springs, CO 80919
50. es the RUN/STOP switch is pushed in (RUN). A binary zero indicates the switch is out (STOP).
Table A-1 Bit Description of Status Message Bytes (Cont)
Status Byte  Bit Description
Byte 5  W4-W1: Binary ones in any of these four bit positions represent the write protect status for the subunit represented; e.g., a 0001 indicates subunit 0 within the selected drive is write protected.
Byte 5  DD: A binary one in this bit position indicates the drive has been disabled by a controller error routine or diagnostic. The FAULT light is on when this bit is set. A binary 0 indicates the drive was enabled by a controller error routine or diagnostic.
Byte 5  FO: A binary one in this position indicates the drive can be formatted.
Byte 5  DB: A binary one in this position indicates the diagnostic cylinders on the drive can be accessed.
Byte 5  S7: A binary one in this bit position indicates the 576-byte sector format is selected. A binary zero indicates that the 512-byte sector format is selected. The UDA50 does not support 576-byte format.
Byte 6  DE: A binary one in this position indicates a drive error has occurred and the drive FAULT lamp may be on.
Byte 6  RE: A binary one in this position indicates an error occurred in the transmission of a command between the drive and the UDA50. The error could be a checksum error or an incorrectly forma
51. essage. Note 4: The MSCP error code reports conditions that cause a failure in MSCP communications. Refer to Table A-9 in the appendix for the list of possible causes. Note 5: If the second word of the MSCP envelope ends in 20 (note 2), the 30 octal words in the MSCP packet contain useful error information. Figure 2-7 shows how the MSCP packet message is organized into words. Word 4 should be decoded to determine the error message format before the error information is decoded. The next paragraph shows you how to determine the message format. [Figure 2-7 MSCP Packet Word Organization: the packet is shown in rows of four octal words (words 0-3, 4-7, 8-11, 12-15, 16-19, 20-23, 24-27), with words 28-29 always zero. Octal values surviving in the extraction: 107001 000000 000400 000004; 000000 000000 002023 000000; 013400 000005 001653 000000; 000000 000000.] 2.9.3 Decoding MSCP Packet Message Format. The error information in the MSCP message packet may come in one of four kinds of formats. It is important to determine the format of the message first. The four error message formats are: controller error (format code 0); host memory access error (format code 1); disk transfer error (format code 2); SDI error (format code 3). The error message format codes are found in the low byte of word 4 of the MSCP packet. Figure 2-8 shows how to interpret the format code from word 4 in the sample RSTS/E error log printout given previously. T
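The format-code lookup described above can be sketched as follows. This is an illustration under stated assumptions: the function names are hypothetical, the word is assumed to be supplied as the octal string printed by the error log, and the sample value 000403 is invented for the example rather than taken from the printout. The rule that SDI status words are present only for format code 3 comes from the Status Word Availability section of the manual.

```python
# Format codes carried in the low byte of word 4 of the MSCP packet.
FORMATS = {
    0: "controller error",
    1: "host memory access error",
    2: "disk transfer error",
    3: "SDI error",
}


def message_format(word4_octal):
    """Return (format code, format name) for word 4 given as an octal string."""
    word = int(word4_octal, 8)
    code = word & 0xFF                 # low byte holds the format code
    return code, FORMATS.get(code, "unknown")


def has_sdi_status_words(word4_octal):
    """SDI status words (22 through 27) are present only for format code 3."""
    return message_format(word4_octal)[0] == 3


if __name__ == "__main__":
    print(message_format("000403"))          # hypothetical word 4 value
    print(has_sdi_status_words("000402"))    # disk transfer error: no SDI words
```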
52. f a UDA initialization error message

Sample printout of an interrupt handler error message

Sample printout of a host error message

Sample printout of a diagnostic machine (DM) error message

Each of the above subsystem error messages gives an error number in the printout. For example, in the DM error message sample printout, an error number of 02005 is given. A description of the error may be found in Table A-2 in the appendix.

2.6 HOST ERROR LOG EXAMINATION

This section assumes you already know how to run the error log program or you have access to an operator who can run error log reports for you. Training courses are available on this topic. This section focuses on how to interpret error log reports you might encounter for MSCP SDI devices. Sample error log printouts for both VMS (SYE) and RSTS (ERRDIS) are examined. They report similar information but in different formats; therefore, decoding charts are provided. Select the error log format that applies to your operating system.

2.7 COMMON ERROR LOG MESSAGE DEFINITIONS

Even though the error log printouts on different operating systems vary in appearance, there is common message information. Read the following definitions before examining the individual sample printouts.

1. Command reference number: A command reference number is given for each error log e
53. f memory write command to drive
Error during receive of memory write command response from drive
Memory write command was unsuccessful

Table A-2 Subsystem Diagnostic Error Code List (Cont)

Decimal Error Number  Description
02031  Memory write command did not return expected response code
02032  Timeout on send of run command to drive
02033  Error during receive of run command response from drive
02034  Run command was unsuccessful
02035  Run command did not return expected response code
02036  Timeout on send of recalibrate command to drive
02037  Error during receive of recalibrate response from drive
02038  Recalibrate command was unsuccessful
02039  Recalibrate command did not return expected response code
02040  Timeout on send of get status command to drive
02041  Error during receive of get status response from drive
02042  Get status command was unsuccessful
02043  Get status command did not return expected response code
02044  Timeout on send of drive clear command to drive
02045  Error during receive of drive clear command from drive
02046  Drive clear command was unsuccessful
02047  Drive clear command did not return expected response code
05000  Unable to find requested drive for testing

Test 3 DM Error Messages
03001  Timeout on send of a level 2 command
03002  Timeout of receive on get common characteristics command
03003  First word received was not a start frame
03004  Framing error on level 0 response
03005  Ch
54. f the MSB is a 0, the header is from LBN space. If the MSB is a B, the header is from RBN space.

2.8.3 Sample Printout Of VMS Error Report Sequence 2

2.8.4 Notes for VMS Error Report Sequence 2

Note 1 The command reference number for error sequence 2 is the same as for error sequence 1. This means both error events are related to the same MSCP command.

Note 2 This error report has a message format code of 03. Table A-3 in the appendix shows code 03 is an SDI error report message format. The last message on this report, MSLG$Z_SDI, gives three lines of SDI error code information, interpreted in note 4.

Note 3 The MSLG$W_EVENT message reports a hexadecimal status event code. Refer to Table A-5 in the appendix; for this example, Table A-5 shows a drive detected error. This information is also given in the verbal description on the right side of the VMS error log printout. It is important to learn how to check these status event codes because not all error log reports may give this English interpretation on the printout.

Note 4 The MSLG$Z_SDI message gives valuable controller-specific and drive-specific troubleshooting information. This information comes in three lines of hexadecimal characters, shown in Figure 2-6.

Decoding the VAX/VMS Error Report SDI Message

Figure 2-6 shows the information contained in each byte of the VAX/VMS error report SDI message. Each byte con
55. fter disconnect
03030  Invalid command aaaa was successful
03031  Command with type length A was successful
03032  Unit did not report transmission error
03033  Unit accepted an invalid group number from group select level 1
03034  Unable to correctly read overlay

Table A-2 Subsystem Diagnostic Error Code List (Cont)

Decimal Error Number  Description
03035  Successfully wrote in DBN area while drive was write protected
05000  Unable to find requested drive for testing

Test 4 DM Error Messages
04001  Attention asserted during seek (error or loggable information)
04002  Attention asserted unexpectedly (asynchronous drive error or log)
04003  Seek did not complete; neither attention nor read/write ready asserted
04004  RCT area corrupted; could not find replacement for RCT LBN
04005  Header not found during write
04006  Select track and write level 1 command not executed
04007  ECC detected error
04008  ECC detected error but correction failed
04009  ECC corrections exceeded threshold
04010  ECC correction succeeded but EDC detects error
04011  Error recovery tried all levels without success
04012  Data comparison failed, whether detected by ECC or EDC or not
04013  Drive not on line to UDA and not spinnable
04014  Unable to complete seek; tried three times
04015  Seek required nnn retries before completing
04016  Errors during drive initialization and setup
04017  No valid states from drive; no drive clocks
04018  Attempt to write
56. ge above is the result of the diagnostic performing a GET STATUS command. Fourteen of fifteen possible status bytes are printed in the error message. Figure 2-2 shows the breakdown of the fifteen status bytes. The first byte is not printed because it is the UDA50 response code to the GET STATUS command. Bytes 9 through 15 contain drive-specific status bits. The drive service manual or maintenance guide should be consulted for interpretation. Table A-1 in the appendix describes status bytes 1 through 8 as shown in Figure 2-2.

[Figure 2-2 Drive Status Bytes: byte 1, response code; byte 2, unit; byte 3, subunit mask and high unit; then the generic status bits; byte 8, retry count/failure code; bytes 9 through 15, drive-type-specific extended status for logging purposes (7 bytes).]

2.4.5 Status Message Interpretation

Figure 2-3 shows the breakdown of the status results from the real time drive state and status sample error message.

[Figure 2-3: status bytes 15 through 2 of the sample message, values 00 01 11 00 00 00 0 00 00 00 06 13 10 20.] Bytes 9 through 15 contain specific status bits. Consult the appropriate drive diagnostic maintenance guide for interpretation. RETRY COUNT FAILURE CODE CONTROLLER
57. h errors are usually mechanical in nature since they are reported as data errors.

Table A-11 UDA50 Internal Error Codes

Internal Error Code (Hex)  Description
1  ER PRD  UNIBUS packet read error. This error can occur when the U RECV routine fails to get the MSCP packet from host memory.
ER PWR  UNIBUS packet write error. This error is set if the UNIBUS write (UNB WR) times out while attempting to send a packet to the host.
ER RRP  UDA50 ROM and RAM parity error.
ER RAP  UDA50 RAM parity error. This error can be corrected by changing UDA50 module M7486.
ER ROP  UDA50 ROM parity error. This error can be corrected by changing UDA50 module M7485.
ER RRD  UNIBUS ring read error. Set if the UNIBUS read times out while reading the host command or response ring.
ER RWR  UNIBUS ring write error. Set if the UNIBUS write to update the command or response ring descriptor fails.
ER INT  UNIBUS interrupt master failure. Set if the UDA50 fails to become interrupt master. This is a long timeout.
ER HTO  Host access timeout error. This error occurs if the host timer is not reset and times out. The timer is reset by the U SEND, U RECV, or U BFSV routines.
ER NIM  Host exceeded command limit. Set if a nonimmediate command is loaded in LOG PKT space.
ER MST  Bus master error. Set if the UDA50 fails to become bus master during the start-up of a UNIBUS read or a UNIBUS write.

Table A-11 UDA50 Int
58. he SDI status information is decoded, the table codes found in the appendix determine the failing FRU to replace.

[Figure 2-13 UDA and Drive Status Error Words 22 through 27: labels the mode, request, controller, and error bytes, and the high and low bytes of words 24 through 27, whose contents differ between the RA60 and the RA80/81.]

2.10 DECODING RSX ERROR LOGS

This section shows how to decode the information in an RSX-11 error log report. Two samples of RSX-11 error log printouts are given, with note numbers on the right-hand side. These numbered notes show how to interpret the error log messages after each sample printout. The first example is of an SDI error format and the second is of a disk transfer error format. The controller error MSCP error log packet format and the host memory access MSCP error log packet format are not shown as examples; rather, the differences between them and the examples are pointed out.

2.10.1 Sample Printout Of RSX-11 Error Log Report
59. his procedure involves converting the octal coded message into its hexadecimal equivalent. Remember, the message only provides useful information if the second word of the MSCP envelope equals 20 hex.

[Figure 2-8 Decoding MSCP Message Error Format: word 4 = 040403 octal = 0100 0001 0000 0011 binary. Low byte of word 4 = hex 03 (error format); high byte of word 4 = hex 41 (message flag). This MSCP message is in an SDI error format because the low byte of word 4 contained a hex code of 03.]

2.9.4 Recovering Error Information In MSCP Packet

After the error format of the MSCP packet is determined, one of the following four error format charts is used to interpret the rest of the message. The contents of each word may be decoded and found in the appendix tables. If the tables do not give the octal code, convert the octal word to its hexadecimal equivalent.

2.9.4.1 SDI Error Format Chart

The SDI error format, as determined from the low byte of word 4, is used by the SDI-type disk controllers to report drive detected errors and SDI communication (drive bus) errors. Since the controller may retry a failed command, separate error log entries are recorded for each attempt. Each retry for the same command has the same command reference number (words 0 and 1). If recovery from the error condition is unsuccessful, the controller may decl
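The octal-to-hexadecimal regrouping that Figure 2-8 performs by hand can be sketched as follows. The helper names are hypothetical; the sample value 040403 is the word 4 shown in the figure.

```python
def split_word(octal_text: str) -> tuple:
    """Convert a 16-bit word printed in octal into (high_byte, low_byte)."""
    word = int(octal_text, 8)
    if word > 0xFFFF:
        raise ValueError("value does not fit in a 16-bit word")
    return word >> 8, word & 0xFF


def binary_nibbles(octal_text: str) -> str:
    """Show the same word as four 4-bit groups, one per hex digit."""
    word = int(octal_text, 8)
    return " ".join(f"{(word >> shift) & 0xF:04b}" for shift in (12, 8, 4, 0))


high, low = split_word("040403")
print(binary_nibbles("040403"))   # 0100 0001 0000 0011
print(f"{high:02X} {low:02X}")    # 41 03
```

The same two helpers apply to any other packet word that has to be looked up in a hexadecimal table.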
60. ison Errors

The UDA can be put into a mode where the UDASA acts as a wrap port. In this mode, any data being sent to the UDASA will be displayed within a small period of time. If the data in the UDASA does not match the data sent to it, the following error message is displayed. In some instances, the VAX error message prints the name of the failing FRU.

2.4.3 Real Time Drive State And Status

In the following sample error message, the last two lines contain the real time drive state as supplied by the drive and a status message. The real time drive state (RTDS) message consists of four hexadecimal digits. Only four state bits within these hexadecimal digits are of diagnostic value to the field service engineer. The rest of the bits are too transitory and are masked out before the RTDS message is printed. The following are the four important state bits:

Read/write ready
Drive available
Attention
Receiver ready

The location of these four state bits within the hexadecimal code is shown in Figure 2-1. The interpretation of the RTDS message requires an understanding of the causes and effects of each bit. It also requires an understanding of drive online, drive offline, drive available, and drive unavailable. Definitions of each of the four RTDS message bits and the online and available states follow.

[Figure 2-1 Real Time Drive State Bits: state bit positions within hex digits 3 through 0]
61. lag: If set, this flag indicates the operation causing this error log message has been successfully completed. If clear, this flag indicates the operation is not yet successfully completed.

• Operation continuing flag: If set, this flag indicates the retry sequence for this operation is continuing. If clear, the retry sequence for this operation has terminated. Ignore this flag status if the operation successful flag is set. If the operation successful and the operation continuing flags are both clear, the error log message is reporting a hard (unrecoverable) error.

• Sequence number reset flag: If set, this flag indicates the MSCP command sequence number has been reset by the MSCP server since the last error log message was sent. Note this bit is always set if the MSCP server does not implement the error log sequence number feature. If clear, this flag indicates the command sequence number has not been reset, implying it may be used to detect missing error log messages.

Refer to Table A-4 in the appendix to decode these message flags.

Status/Event code: The status event code identifies the specific error event being reported by this error log message. Refer to Table A-5 in the appendix for a list of all the UDA MSCP status event error log codes.

Controller identification: The controller identification message provides the controller unique number controller class
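The flag rules above can be condensed into a small sketch. The bit values (hex 80, 40, and 1 in the high byte of word 4) are taken from Table A-4; the constant and function names are hypothetical.

```python
# Bit values from Table A-4 (high byte of word 4).
OPERATION_SUCCESSFUL = 0x80
OPERATION_CONTINUING = 0x40
SEQUENCE_NUMBER_RESET = 0x01


def describe_flags(flags: int) -> str:
    """Summarize the message flags held in the high byte of word 4."""
    if flags & OPERATION_SUCCESSFUL:
        # The continuing flag is ignored once the operation has succeeded.
        outcome = "operation successful"
    elif flags & OPERATION_CONTINUING:
        outcome = "retry sequence continuing"
    else:
        outcome = "hard (unrecoverable) error"
    if flags & SEQUENCE_NUMBER_RESET:
        sequence = "sequence number reset"
    else:
        sequence = "sequence number not reset"
    return f"{outcome}; {sequence}"


print(describe_flags(0x41))  # retry sequence continuing; sequence number reset
```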
62. lear UBA UBI status 28 Failed to initialize device bus UBA UBD 29 Failed to clear UBA UBI status 30 Failed to initialize device bus UBA UBI 31 Failed to clear UBA UBI status 32 Channel services interrupt enable failure 33 Channel services interrupt disable failure EVRLA Test 2 Host Error Messages 1 Failed setmap routine 2 Failed to initialize device bus 3 Failed to clear UBA UBI status 4 Channel services interrupt enable failure 5 Channel services interrupt disable failure Table A 2 Subsystem Diagnostic Error Code List Cont Decimal Description Error Number EVRLA Test 3 Host Error Messages 1 Failed setmap routine 2 Failed to initialize device bus UBA UBI 3 Failed to clear UBA UBI status 4 Channel services interrupt enable failure 5 Channel services interrupt disable failure EVRLA Test 4 Host Error Messages 1 Failed setmap routine 2 Failed to initialize device bus UBA UBD 3 Failed to clear UBA UBI status 4 Channel services interrupt enable failure 5 Channel services interrupt disable failure EVRLA UDA Initialization error messages 100 Not enough free memory to test units selected 101 Step bit did not set in UDASA register during initialization 102 UDA resident diagnostics detected a failure 103 UDA did not return correct data in UDASA register during initialization 104 UDASA register did not go to zero after step 3 write of initialization 105 Step bit did not set in UDASA register during initializati
63. levant drive maintenance guide. In this case, 043 octal = 23 hexadecimal.

Note 24 This is the error code displayed on the front panel switches.

2.10.3 Sample Printout Of RSX-11 Error Log Report Disk Transfer Format

The UDA50 uses the DISK TRANSFER ERROR log format to report transfer data errors. The controller may attempt to retry the failing command, so multiple entries with the same Command Reference Number, and possibly the same error, may occur.

NOTE This example assumes the previous example (RSX SDI error format) has been reviewed. Not all the fields in this example are explained. The previous example contains additional information for fields that you need more information on.

[Sample printout: RSX-11 error log entry in disk transfer format; legible fields include DEVICE MESSAGE (ECC ERROR) and SYSTEM IDENTIFICATION.]
64. me out or an inconsistent header
Data sync time out: Data sync is not found.
Correctable error in ECC field: A transfer encounters a correctable error in which only the ECC field is affected.
Uncorrectable ECC error: A transfer encounters an ECC error that exceeds the correction capability of the subsystem's error correction algorithm.
One symbol ECC error
Two symbol ECC error
Three symbol ECC error

Table A-5 Error Log Status Event Codes (Cont)

Hex Code  Octal Code  Description
168  550  Four symbol ECC error
188  610  Five symbol ECC error
1A8  650  Six symbol ECC error
1C8  710  Seven symbol ECC error
1E8  750  Eight symbol ECC error
A transfer encounters a correctable ECC error with the specified number of ECC symbols in error. The number of symbols in error corresponds to the severity of the error.

Host Buffer Access Error Subcode (Hex 9)
9  11  Host buffer access error. The controller is unable to access a host buffer to perform a transfer and has no visibility into the cause of the error.
29  51  Odd transfer address
49  111  Odd byte count
69  151  Nonexistent memory error
89  211  Host memory parity error

Controller Error Subcode (Hex A)
A  12  Reserved for host command timeout expired
2A  52  SERDES overrun/underrun error. Either the drive is too fast for the controller or a controller hardware fault has prevent
65. nostic Error Code List Cont Decimal Error Number 00027 00028 00029 00030 00031 00032 00033 00036 00037 00038 Description Could not write to UDASA register replace M7485 UDA did not interrupt the host replace M7485 UDA interrupted at different BR level replace M7485 UDA reported error in UDASA register list of UDASA codes Assume DM program hung Unknown request number in DM message buffer suspect UNIBUS UDA or corrupted DM program Response packet from UDA does not contain expected data No interrupt received from UDA for 30 seconds UDA reports error in UDASA register Memory error trying to read UDASA register check UNIBUS select switches on M7486 or UNIBUS or replace M7485 Common EVRLA and CZUDC DM Program Error Messages 01000 01001 01002 01003 01004 01005 01006 02000 02001 02002 02003 02004 TEST 1 DM Error Messages nonexistent host memory error Parity error on read from UNIBUS Memory location did not contain own address Nonexistent memory error trying to read from UNIBUS buffer Parity error on read from UNIBUS within buffer Data compare failed after read then write from UNIBUS UNIBUS addressing error Test 2 DM Error Messages Host specified unit number can not be found Cannot receive valid drive state check drive power Drive state received with bad parity Drive not asserting receiver ready Timeout on send of echo command to drive A 9 Table A 2
66. of two types of information depending upon the status of the DF bit (Byte 6). The DF bit monitors the drive initialization process. The DF bit remains a zero if initialization is successful. In this case, Byte 8 contains the retry count from the previous operation; e.g., if a seek operation required fourteen retries to be successful and a get status command is then initiated, Byte 8 contains the number 14. The DF bit being set indicates the drive initialization failed, and Byte 8 now contains a specific drive error code. This error code can be looked up in the appropriate drive service manual.

Table A-2 Subsystem Diagnostic Error Code List

Decimal Error Number  Description

EVRLA Host Error Messages

EVRLA Initialization Error Messages
1  Failed GETBUF routine
2  Error trying to read DM data file
3  Invalid program name found
4  Invalid DM program version
5  Failed RELBUF routine
6  Failed GETBUF routine
7  Error trying to read DM data file
8  Error trying to read DM data file
9  Failed to read P table
10  Invalid controller encountered
11  Invalid controller encountered
12  Invalid UBA UBI encountered
13  Selected devices are on multiple UNIBUS adapters
14  Duplicate controller address found
15  Duplicate controller vectors found

Table A-2 Subsystem Diagnostic Error Code List (Cont)

Decimal Error Number  Description
16  Failed to initialize channel adapter
17  Failed to clear UBA UBI status
18  Failed to initialize device bus
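The two meanings of Byte 8 described at the start of this passage can be sketched as below. The DF bit is passed in as a boolean because its position within Byte 6 is not given here; the function name and the returned labels are hypothetical.

```python
def interpret_byte8(df_set: bool, byte8: int) -> tuple:
    """Classify status byte 8 according to the DF bit of byte 6.

    df_set is True when the DF bit is set (drive initialization failed).
    Returns a (meaning, value) pair; the labels are illustrative only.
    """
    if not 0 <= byte8 <= 0xFF:
        raise ValueError("byte 8 must fit in one byte")
    if df_set:
        # Look this code up in the appropriate drive service manual.
        return ("drive error code", byte8)
    return ("retry count", byte8)


print(interpret_byte8(False, 14))  # ('retry count', 14)
```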
67. ollowing section will highlight what is unique about them.

2.10.5.1 Controller Error Packet Format

The UDA50 uses the controller error packet format to report controller errors to the host. This error log entry will look similar to the previous two examples, except it will be shorter than either. The first field to examine closely is the EVENT CODE FIELD because it is the first item that should indicate the problem. The list of various codes for this field can be found in appendix Table A-5. If the event code is 12 (controller error), the field following the controller hardware version number should be examined. This field is then called the UDA50 INTERNAL ERROR CODE FIELD and can be decoded by looking at appendix Table A-11.

2.10.5.2 Host Memory Access Packet Format

The UDA50 uses the Host Memory Access Error Log Packet Format to report host memory access problems. Although this entry will look very similar to the other examples, there are a couple of key items to look at closely. The EVENT CODE FIELD should give a good indication of the kind of problem the UDA is experiencing. The event code should be looked up in appendix Table A-5. The last item displayed should be a field showing the Host Memory Address. This address is valuable since it may indicate what host memory address (or, if there are several of these entries, what sequence of addresses) the UDA is having trouble with.

2.11 INTRODUCTION TO SPEAR

The
68. on 106 UDA resident diagnostics detected a failure 107 UDA did not return correct data in UDASA register during initialization 108 UDA did not clear ring structure in host memory A 6 Table A 2 Subsystem Diagnostic Error Code List Cont Decimal Description Error Number EVRLA DUP Protocol Errors 200 Diagnostic machine DM program asked for data from unknown drive 300 Response packet from UDA does not contain expected data 401 Microcode reported M7485 7486 that did not match get status response 402 Microcode reported unknown controller model 403 Response packet from UDA does not contain expected data 500 UDA reported a fatal error while loading DM program 501 UDA failed to interrupt 502 UDA reported a fatal error while waiting for get status response 503 UDA failed to interrupt 601 Unknown request received from DM program 602 DM program asked for data from unknown drive 603 Same as 602 604 Same as 602 605 Same as 602 606 Same as 602 607 Same as 602 608 Same as 602 609 Same as 602 610 No interrupt received from DM program 611 Fatal error while running DM program 612 Failed RELBUF routine 700 Failed GETBUF routine 701 Failed SETMAP routine 750 Failed GETBUF routine Table 2 Subsystem Diagnostic Error Code List Cont Decimal Error Number 800 801 802 Description EVRLA Interrupt Handler Error Messages Unknown interrupt encountered Same as 800 Unexpected interrupt encountere
69. on write protected drive (error code from UDA)
04019  Header not found during read
04020  Select track and read level 1 command not executed
04021  Drive not formatted in 512 byte mode
04023  Unable to continue testing: port switch out, or run/stop switch out, or spindle dropped ready
04024  EDC detected error but ECC did not

Table A-2 Subsystem Diagnostic Error Code List (Cont)

Decimal Error Number  Description
04025  Write attempted maximum times
04026  Read attempted maximum times
04028  Both read only and write only bits set (host error)
04033  Unable to correctly read overlay
04034  SERDES overrun error during read
04035  Data or state clock timeout during read
04036  Data synchronization timeout during read
04037  Read/write ready dropped during read
04038  Receiver ready dropped during read
04040  All copies of RCT read with errors (LBN with header not found)
04041  Could not find replacement for LBN that was revectored
04042  Time out waiting for sector or index pulse
04044  Seek or head select error detected during write
04045  Seek or head select error detected during read
04047  Data or state clock timeout during write
04048  Read/write ready dropped during write
04049  Receiver ready dropped during write
04050  Operator error: beginning block number greater than ending number
04051  Operator error: the begin/end sets overlap
04052  Operator error: begin/end block number exceeds maximum
04053  Operator error: duplicate bad
70. or write mode wrap serdes error; D processor read mode serdes RSGEN & ECC error; U processor ALU error; U processor control register error

Possibly the host CPU is at fault.

Most Likely FRU Failed (in table order): M7485, M7485, M7485 or M7486, M7486, M7485, M7485, M7485, M7485, M7485, M7485, M7486, M7486, M7485, M7485, M7485, M7485, M7485, M7485, M7486, M7486, M7486, M7486, M7486, M7485, M7485

Table 2-2 SA Register Error Codes (Cont)

Error Code (Octal)  Error Description  Most Likely FRU Failed
106042  U processor DFAIL, control ROM parity, BD 1 test CNT  M7485
106047  processor constant PROM error with D processor running SDI test  M7485
106055  Unexpected trap found, abort diagnostic  M7485
106071  processor constant PROM error  M7485
106072  U processor control ROM parity error  M7485
106200  Step 1 data error (MSB not set or RE INIT)  M7485
107103  U processor RAM parity error  M7486
107107  U processor RAM buffer error  M7486
107115  Test count was wrong (BD 2)  M7486
112300  Step 2 error  M7485
122240  NPR error  M7485
122300  Step 3 error  M7485
142300  Step 4 error  M7485

Possibly the host CPU is at fault.

2.2 HOST RESIDENT DIAGNOSTICS

A different approach must be taken to isolate faults if no LED error code exists in the UDA50. The first step is to examine the customer error log to determine the source of the problem. The next step is to cycle the power off and back on, resetting much of the logic. The UDA50 then starts
71. otes For RSX-11 Error Log Report SDI Error Format

Note 1 DEVICE MESSAGE is a unique RSX feature that gives a good description of the failure.

Note 2 The DEVICE and TYPE fields supply logical drive address and drive type information.

Note 3 The PACK S/N and DRIVE S/N fields give the pack (HDA) serial number and drive serial number.

Note 4 DEVICE FUNCTION describes what operation was being performed when the error occurred. The TYPE OF ERROR field is the same information previously decoded in the DEVICE MESSAGE (see note 1).

Note 5 There are two message types:
• Datagram message: These usually contain serious failure information.
• Sequential message: These usually contain status information.

Note 6 A COMMAND REFERENCE NUMBER is given to each command the host issues to the UDA. It is possible to have the same number in multiple error log entries if the command caused more than one error or if the UDA did retries.

Note 7 The MESSAGE FORMAT defines the type of MSCP error log entry this entry describes. It also specifies how the host is to interpret the various message fields (see Figure 2-8).

Note 8 See Table A-4, Error Log Message Flags, in the appendix.

Note 9 The EVENT CODE gives a good description of a controller detected error. However, if the error is drive detected, the drive-specific information in the SDI SUPPLIED STATUS FIELD will be the best source of information.

Note 10 UDA50 firmware ver
72. r recovery level: The error recovery level reflects the most recent attempt at a data transfer. Each device has a specified number of error recovery levels that corresponds to the mechanisms it has available to attempt an error recovery. For example, if data cannot be read, the drive might try offsetting its head position slightly in case it had been altered since the data was written. For each such attempt, the error recovery level will be incremented. The values zero and 255 (all ones) indicate no special error recovery procedures are used.

Error retry count: This message gives the retry count within the current error recovery level of the most recent transfer attempt. This value starts at one and increments for each subsequent attempt at the same error recovery level. It continues until a drive-dependent maximum number is reached; then the retry count is set to one and the next error recovery level, if any, is tried.

Host memory address: This message gives the host memory address being used at the time the error was detected. For a UDA50 on a RSTS system, the maximum address will occupy 18 bits (0-17). Host memory access errors include UNIBUS parity errors (PA, PB lines) and UNIBUS timeouts (SSYN timeout, etc.).

MSCP error code: This message is printed out at the bottom of the ERRDIS error log printout only. It tells why the MSCP communications between the host and controller failed. Refer to Table A-9 in the appendix for a list of the MS
73. rror Log Message Flags

Bit Set in High Byte of Word 4  Octal  Hex  Error Message Flag Description
7  200  80  Operation successful flag
6  100  40  Operation continuing flag
0  1  1  Sequence number reset flag

Table A-5 Error Log Status Event Codes

The first list in this table is a group of codes that determine the major status or event being reported, such as a media format error or a drive error. Within these major categories are more specialized subcodes that break down the major category further. For example, if a hex code of B is a drive error, a hex code of AB reveals there is also a drive clock dropout. A separate list is given for each of the subcode values.

Hex Code  Octal Code  Description
0  0  Success
1  1  Invalid command
2  2  Command aborted
3  3  Unit offline
4  4  Unit available
5  5  Media format error
6  6  Write protected
7  7  Compare error
8  10  Data error
9  11  Host buffer access error
A  12  Controller error
B  13  Drive error
Status event code mask

Success Subcode (Hex 0)
20  40  Spindown ignored
40  100  Still connected
80  200  Duplicate unit number
100  400  Already online
200  1000  Still online

Invalid Command Subcode (Hex 1)
1  1  Invalid message length

Command Aborted Subcode (Hex 2)
NOT USED

Unit Offline Subcode (Hex 3)
3  3  Unit unknown or online to another controller
23  43  No volume mounted or drive disabled via RUN/STOP switch

Table A-5
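The relationship between a full status event code and its major category can be sketched as follows. The mask value of hex 1F is an assumption: the mask entry in Table A-5 is not legible here, but 1F is consistent with the listed codes (AB reduces to B, 23 reduces to 3). The names are hypothetical.

```python
# Assumption: the low five bits of the event code hold the major category.
STATUS_EVENT_MASK = 0x1F

# Major categories from the first list of Table A-5 (hex codes).
MAJOR_CODES = {
    0x0: "Success",
    0x1: "Invalid command",
    0x2: "Command aborted",
    0x3: "Unit offline",
    0x4: "Unit available",
    0x5: "Media format error",
    0x6: "Write protected",
    0x7: "Compare error",
    0x8: "Data error",
    0x9: "Host buffer access error",
    0xA: "Controller error",
    0xB: "Drive error",
}


def major_status(event_code: int) -> str:
    """Return the major status/event category for a full event code."""
    major = event_code & STATUS_EVENT_MASK
    return MAJOR_CODES.get(major, f"unlisted major code {major:X}")


print(major_status(0xAB))  # Drive error
print(major_status(0x23))  # Unit offline
```

The full code still has to be looked up in the matching subcode list to find the specific event.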
74. sion must be 003 or higher.

Note 11 The Unit I.D. is the drive serial number. Note that RSX prints the serial number in hex.

Note 12 The UNIT MODEL FIELD is the drive type (i.e., RA81, RA60, etc.).

Note 13 In general, the UNIT CLASS is either a disk drive or a tape. However, the UDA50 supports only disk drives. Therefore, for the UDA50, the UNIT CLASS has to be a disk class device.

Note 14 The UNIT SOFTWARE VERSION NUMBER FIELD is the drive version.

Note 15 The UNIT HARDWARE VERSION NUMBER FIELD is the drive hardware version.

Note 16 N/A (reserved).

Note 17 Pack (HDA) serial number.

Note 18 The HEADER VALUE FIELD is the Logical Block Number (LBN) of the sector being accessed when the error occurred.

Note 19 The SDI SUPPLIED STATUS FIELD represents the state of the drive when the error occurred (word 22 of Drive Status Error). See Table A-1 in the appendix along with Figure 2-13.

Note 20 More state-of-the-drive information (word 23 of Drive Status Error).

Note 21 STATUS BYTE 8 is the retry count for drive-initiated retries.

Note 22 STATUS BYTES 10 through 13 represent the drive detected SDI errors (see Figure 2-13).

Note 23 This is the best information for drive detected errors. Note that the error is listed as 043 octal. The drive error code charts are published in hexadecimal, so it is easier to convert the octal number to hexadecimal and then look up the error in the re
75. tains two hexadecimal characters that must be decoded further. In this example, there is controller-specific information available in bytes 4 through 8. Use the bit map on the right side of Figure 2-6 for interpretation. Then use Table A-1 to decode the meaning of each controller bit mnemonic.

[Figure 2-6 Decoding VAX/VMS Error Report SDI Message: bytes 4 through 8 contain controller-specific information (refer to Table A-1); in the sample, byte 4 = 1B, byte 5 = 04, byte 6 = 40, byte 7 = 00, byte 8 (retry count/fail code) = 00. Bytes 9 through 15 contain drive-specific information; message content is different for the RA60 than for the RA80/RA81. RA80/RA81: byte 9, last position command (RA60: previous cylinder low); byte 10, SDI error status; byte 11, current cylinder low; byte 12, current cylinder high; byte 13, current group; byte 14, microprocessor LEDs; byte 15, front panel fault code. Note: all cylinder references in the drive charts are to physical cylinders.]

Figure 2-6 also shows the drive-specific information contained in bytes 9 through 15. Determine which model drive unit is reporting this error message before you decode the extended status area information found in bytes 9 through 15.

2.9 DECODING RSTS/E ERROR LOGS

The RSTS/E ERRDIS error log report is not as sophisticated as the VMS error log report. Instead of giving
76. the initialization routine. Only the first phase will run, as completion takes host software interaction. If the first phase passes, the next step is to run the available host-resident diagnostics.
Host-resident diagnostics available to diagnose UDA50 problems are the PDP-11 CZUDC and the VAX EVRLA. Both diagnostic programs have the same tests and error messages, with the exception of the program name (CZUDC or EVRLA). The programs consist of the following four tests:
Test 1: UNIBUS interrupt address test. Checks out UDA50 functionality.
Test 2: Disk-resident diagnostic test. Runs the drive-resident diagnostics.
Test 3: Disk function test. Performs minimum drive functional tests.
Test 4: Disk exerciser test. Performs a limited read and write test in the diagnostic cylinder area (the diagnostic will not write in the customer data area unless specifically instructed to do so, and a warning message will be printed).
There are two modes of operation for test 4:
1. Default operation on the diagnostic cylinder or customer area, with all parameters selected by the default answers shown below.
2. Manual intervention to the test using new parameters that may include the customer data area. This manual intervention is referred to as a fifth test in test printouts.
2.2.1 PDP-11 Subsystem Diagnostics Preparation
The PDP-11 subsystem diagnostics (CZUDC UDA50 host-resident diagnostics) run on any disk drive cabled to the UDA50. The MS
77. tops. This normally occurs in about two seconds.
Note: 1 = LED ON, 0 = LED OFF, x = may be ON or OFF. When two codes are given for the same error, both indicate the same failure.
2.1.2 Status Address Register Error Codes
More detailed information on UDA50 functional and diagnostic error codes is reported through the SA register. The contents of this register may be examined manually through the CPU console at the UDA50 UNIBUS address plus 2. This address is normally 772152. Table 2-2 lists the SA error codes and indicates the most likely FRU at fault.
Table 2-2 SA Register Error Codes
Error Code (Octal)   Error Description
100001   UNIBUS packet read error
100002   UNIBUS packet write error
100003   UDA ROM or RAM parity error
100004   UDA RAM parity error
100005   UDA ROM parity error
100006   UNIBUS ring read error
100007   UNIBUS ring write error
100010   UNIBUS interrupt master failure
100011   Host access timeout error
100012   Host exceeded command limit
100013   UDA SI hardware fatal error
100014   DM fatal error
100015   Hardware timeout of instruction loop
100016   Invalid virtual circuit identifier
100017   Interrupt write error on UNIBUS
104000   Fatal sequence error
104040   D processor ALU
104041   D processor control ROM parity error
105102   D processor with no BD 2 or RAM parity error
105105   D processor RAM buffer error
105152   D processor SDI error
105153   D process
105154
106040
106041
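As a rough illustration of working with the SA register, the Python sketch below computes the SA address as the UDA50 UNIBUS address plus 2 and looks up a few SA error codes. It assumes a base address of 772150 (octal), which is consistent with the "normally 772152" figure above, and it assumes that the codes and descriptions in Table 2-2 pair up in the order listed. The names `sa_register_address` and `describe_sa_code` are hypothetical, and only a subset of the table is included.

```python
UDA50_BASE_ADDRESS = 0o772150  # assumed UNIBUS base address (octal)

# Subset of Table 2-2; pairing assumes codes and descriptions are in the same order.
SA_ERROR_CODES = {
    0o100001: "UNIBUS packet read error",
    0o100002: "UNIBUS packet write error",
    0o100006: "UNIBUS ring read error",
    0o100007: "UNIBUS ring write error",
    0o100011: "Host access timeout error",
}


def sa_register_address(base: int = UDA50_BASE_ADDRESS) -> str:
    """Return the SA register address (base + 2) as an octal string."""
    return format(base + 2, "o")


def describe_sa_code(code: int) -> str:
    """Look up an SA error code; codes outside this subset are not decoded."""
    return SA_ERROR_CODES.get(code, "code not in this sketch's subset")


print(sa_register_address())
print(describe_sa_code(0o100011))
```

Keeping the codes as octal literals lets values read from the console be compared directly against the table without manual conversion.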
78. ts of the two small binders containing the current maintenance guides for disks that operate on the DSA controllers.
Employees: The User Guide and Service Manual can be ordered directly from Publication and Circulation Services, 10 Forbes Road, Northboro, Massachusetts 01532 (RCS Code NR12, Mail Code NR03/W3). The Maintenance Guide, Maintenance Guide Looseleaf Binder, Maintenance Documentation Kit, and the Field Maintenance Print Set can be ordered directly from the Software Distribution Center, 444 Whitney Street, Northboro, Massachusetts 01532 (RCS Code MSDC, Mail Code NR02-1/J6).
Non-Employees: The above documents can be ordered directly from Digital Equipment Corporation, P.O. Box CS2008, Nashua, New Hampshire 03061, or by calling toll free 800-258-1710. Outside the United States, consult local DIGITAL offices.
2 UDA50 FAULT ISOLATION
2.1 UDA50 RESIDENT DIAGNOSTICS
There are two ways of obtaining resident diagnostic information from the UDA50 Disk Controller. The first is through the UDA50 LED error codes. The second is by examining the contents of the UDA50 status address (SA) register. The SA register contents are also supplied to the host CPU for error logs and diagnostic error reports.
2.1.1 UDA50 LED Error Codes
Table 2-1 lists the LED error codes and which FRU is most likely at fault.
Table 2-1 LED Error and Symptom Codes
M7485 LEDs   M7486 LEDs   Error Symptoms   Most Likely Failure
8421
79. tted command string.
PE: A binary one in this position indicates improper command codes or parameters were issued to the drive.
DF: A binary one in this position indicates a failure in the initialization routine of the drive.
WE: A binary one in this position indicates a write lock error has occurred.
S4-S1: This is a four-bit representation of the sub-units that have their attention available messages suppressed in the UDA50. The rightmost bit position represents sub-unit 0. The leftmost bit position represents sub-unit 3. If one of the bits is set, it indicates the controller is not to interrupt the host CPU with an attention available message when the specified sub-unit raises its available real-time drive status line to the UDA50. The S4-S1 bits reflect the result of a change controller flags command where attention available messages are not desired for certain sub-units.
C1-C4: This is a four-bit drive status code indicating various states of drive operation. At the present time, only three codes are valid:
A code of 0000 = drive normal operation.
A code of 1000 = the drive is offline due to being under control of a diagnostic.
A code of 1001 = the drive is offline due to another drive having the same unit identifier (e.g., serial number, drive type, class, etc.).
Table A-1 Bit Description of Status Message Bytes (Cont)
Status Byte   Bit Description
Byte 8   RETRY COUNT/FAILURE CODE. This 8-bit byte contains one
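The two four-bit fields described above (the sub-unit suppression bits and the C1-C4 drive status code) can be decoded with a short Python sketch. The excerpt does not say where these fields sit within the status byte, so the functions below take the already-extracted 4-bit value as input, and the drive status code is interpreted in the bit order in which the manual writes it. The function names are hypothetical.

```python
# The three drive status codes the manual lists as currently valid.
DRIVE_STATUS_CODES = {
    0b0000: "drive normal operation",
    0b1000: "drive offline: under control of a diagnostic",
    0b1001: "drive offline: another drive has the same unit identifier",
}


def suppressed_subunits(s_bits: int) -> list:
    """Return sub-unit numbers whose bit is set; the rightmost bit is sub-unit 0."""
    return [unit for unit in range(4) if s_bits & (1 << unit)]


def drive_status(c_bits: int) -> str:
    """Decode the four-bit drive status code; other values are not currently valid."""
    return DRIVE_STATUS_CODES.get(c_bits & 0xF, "not a currently valid code")


print(suppressed_subunits(0b0101))  # bits for sub-units 0 and 2 are set
print(drive_status(0b1001))
```

A set bit means attention available messages from that sub-unit are suppressed, so an empty list means the controller will interrupt the host for every sub-unit.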
80. usetts
DEC, DECnet, OMNIBUS, DECUS, DECsystem-10, OS/8, DIGITAL, DECSYSTEM-20, PDT, DECwriter, RSTS, PDP, DIBOL, RSX, UNIBUS, EduSystem, VMS, VAX, IAS, VT, UDA50, MASSBUS, RA80, HSC50, RA60, RA81
CONTENTS
INTRODUCTION
SCOPE OF MANUAL
UDA50 MAINTENANCE
UDA50 FIELD REPLACEABLE
UDA50 MAINTENANCE
UDA50 ADDRESS SWITCHES AND JUMPERS
UNIBUS TUNNE
UDA50 PRIORITY PLUG
INSTALLATION OF BOOTSTRAP
RELATED DOCUMENTATION
UDA50 FAULT ISOLATION
UDA50 RESIDENT DIAGNOSTICS
UDA50 LED Error Codes
Status Address Register Error Codes
HOST RESIDENT DIAGNOSTICS
PDP-11 Subsystem Diagnostics
VAX Subsystem Diagnostics
RUNNING THE HOST RESIDENT DIAG
81. vent message. It is the MSCP command number that caused the error message to be reported. The command reference number is zero if the error message is not related to an outstanding MSCP command. When a seek command is issued, the drive attempts a number of retries. Each retry for that seek command has the same command reference number.
Drive number (logical unit address): The unit message number refers to the logical unit number of the device. The unit number may be zero if the error message does not refer to a specific device or unit.
Sequence number: A sequence number is assigned to an MSCP packet when it is passing information to the host error logger. The use of the sequence number is dependent on the MSCP server in the controller microcode. If the sequence number is zero, the controller does not support the use of this feature.
Message format: A format code associated with each error log event determines the error message format. It is important to determine the format code first to interpret the remainder of the error message correctly. The format code will reveal whether the error event is reporting a controller error (code 0), a host memory access error (code 1), a disk transfer error (code 2), or an SDI error (code 3). Use Table A-3 in the appendix to decode the format of each error log event.
Message flags: The error log message flags report which of the following three operating conditions apply:
Operation successful f
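The four message format codes named in the text above lend themselves to a simple lookup, sketched here in Python. The dictionary contents come from the text; the names `MESSAGE_FORMATS` and `message_format` are hypothetical, and any code outside 0 through 3 is simply reported as unknown rather than decoded.

```python
# Format codes as listed in the manual text.
MESSAGE_FORMATS = {
    0: "controller error",
    1: "host memory access error",
    2: "disk transfer error",
    3: "SDI error",
}


def message_format(format_code: int) -> str:
    """Name the error message format so the rest of the packet can be interpreted."""
    return MESSAGE_FORMATS.get(format_code, "unknown format code")


print(message_format(3))
```

Determining the format first matters because the layout of the remaining words in the packet depends on it.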