Compex Systems SA33-3285-02 Network Card User Manual
Contents
1. Figure 48. Installing the Fast Write Cache Option Card. 4. Carefully plug the Fast Write Cache Option Card H into the connector i. Ensure that you push the cache card fully home. 5. Invert the adapter card so that its components are downward. (User's Guide and Maintenance Information, page 334) 6. Refer to Figure 49. Figure 49. Installing the Mounting Screw of the Fast Write Cache Option Card. 7. Install the mounting screw and tighten it fully. The screw is supplied with the Fast Write Cache Option Card. Reinstall the adapter into the using system (see the Installation and Service Guide for the using system). Note: The battery on the Fast Write Option Card is not fully charged. After the adapter is connected to the power, initial battery charging completes in 5 to 60 minutes. During this time, the fast write disks can be enabled and used, but the fast write function remains inactive. To determine whether the fast write cache is active or inactive, give the command: ssa_fw_status -a ssaX -c (Chapter 15, Removal and Replacement Procedures, page 335) Removing the Battery Assembly from the Fast Write Cache Option Card of an Advanced SerialRAID Adapter. Attention: The ad
2. Figure 17. Primary Disk in Building 1, Secondary Disks in Building 2: Distributed Spares. Now assume that, for some reason, the disk drives in building 2 are no longer available. The example array is now in the Offline state, because one of its four primary disk drives is in building 2. Only the three primary disk drives that are in building 1 are operational. (Chapter 5, Hot Spare Management, page 47) This problem can be solved if, in each building, a hot spare pool is created for the disk drives. In Figure 18, all the disk drives in building 1 have been made members of Pool A1, and all the disk drives in building 2 have been made members of Pool A2. A failure of a member disk drive in Pool A1 now causes pdisk1 to be selected as the hot spare disk drive. Figure 18. Primary Disk in Building 1, Secondary Disks in Building 2: Pools. Hot spare pools can be configured in other ways, as shown in the figures that follow. Figure 19 shows how RAID 5 arrays can be protected against the complete failure of an SSA enclosure. Each pdisk of each arr
3. If you need help with an item, move the cursor to that item and press F1 (Help). 5. The SSA Physical Disks menu is displayed:
   SSA Physical Disks
   Move cursor to desired item and press Enter.
     List All Defined SSA Physical Disks
     List All Supported SSA Physical Disks
     Add an SSA Physical Disk
     Change/Show Characteristics of an SSA Physical Disk
     Remove an SSA Physical Disk
     Configure a Defined SSA Physical Disk
     Generate Error Report
     Trace an SSA Physical Disk
     Show Physical to Logical SSA Disk Relationship
     List Adapters Connected to an SSA Physical Disk
     List SSA Physical Disks Connected to an SSA Adapter
     Identify an SSA Physical Disk
     Cancel all SSA Disk Identifications
     Show Connection Paths to an SSA Physical Disk
   F1=Help  F2=Refresh  F3=Cancel  F8=Image  F10=Exit  Enter=Do
   If you need help with an item, move the cursor to that item and press F1 (Help). Getting Access to the SSA RAID Arrays SMIT Menu. 1. For fast path access to the SSA RAID Array SMIT menus, type smitty ssaraid and press Enter. Otherwise: a. Type smitty and press Enter. The System Management menu is displayed. b. Select Devices. The Devices menu is displayed. c. Select SSA RAID Arrays. The SSA RAID Arrays menu is displayed:
   SSA RAID Arrays
   Move cursor to desired item and press Enter.
     List All Defined SSA RAID Arrays
     List All Supported SSA RAID Arrays
     List All SS
4. The array enters the Good state for both systems. Failure of a Primary Disk Drive in a Dual Host System. Figure 34 shows a dual host system in which the primary disk drive of a RAID 1 array has failed. Figure 34. RAID 1 Disk Drive Failure. Because both adapters can still detect the secondary disk drive of the array, the Split Array Resolution flag is automatically set to Secondary, and I/O operations continue. The array is in the Exposed or Degraded state. When the primary disk drive is reconnected, it is resynchronized from the secondary disk drive, and the Split Array Resolution flag is automatically set to Primary. (Chapter 8, Split Site Management, page 201) RAID 1 Failure of a Host System and a Primary Disk Drive. Figure 35 shows system 2 disconnected from system 1. The array is in the Offline state. If you set Split Array Resolution to Secondary, the array goes into the Good state. Figure 35. RAID 1 Failure of a Host System and a Primary Disk Drive. RAID 10 Failure of a Host System and a Primary Disk Drive. Figure 34 shows system 2 disconnected from system 1. The array is in the Offline state. If you set Split Array Resolution to Secondary, the array goes into the Go
5. SSA physical disks that are free. F1=Help F2=Refresh F3=Cancel F7=Select F8=Image F10=Exit Enter=Do /=Find n=Find Next. Select the disk drives that you want to add to the hot spare pool, and press Enter. Notes: a. If all the member disk drives of an hdisk are now in pool zero, you can select the hdisk. This action adds all the member pdisks of the array to the hot spare pool that you are creating. If some of the member pdisks of an hdisk have already been assigned to another hot spare pool, the hdisk cannot be selected and is shown with a comment flag in front of it, as shown in the example screen. b. If a free disk is selected from the list, it is converted to a hot spare disk drive when it is added to the hot spare pool. (Chapter 6, Using the RAID Array Configurator, page 85) Adding Disks to, or Removing Disks from, a Hot Spare Pool. This option allows you to exchange the disk drives that are in the hot spare pool, or to resynchronize the state of the pool. If you are not sure how to configure hot spare pools, read K before you proceed. 1. For fast path, type smitty chg_hsm_pool_adap and press Enter. Otherwise, select Change/Show/Delete a Hot Spare Pool from the SSA RAID Arrays menu. 2. A list of adapters is displayed in a window:
   SSA RAID Arrays
   Move cursor to desired item and press Enter.
     List All Defined SSA RAID Arrays
     List All Supported SSA RAID Arrays
     List All SSA RAID Arrays Co
6. Figure 32. System 1 Split from System 2. On system 1, the array is in the Exposed or the Degraded state. On system 2, the array is in the Offline state, and SRN 48750 is generated. The array remains in the Offline state on system 2 until: the two systems are reconnected; the Split Array Resolution flag is set to Secondary. Attention: You must set the Split Array Resolution flag to Secondary on both halves of the array if you need access from system 2. This action ensures that write operations can be performed only on the secondary half of the array. If data had been allowed to be written to both halves of the array, the data would become unsynchronized. If the two halves are later reconnected, and the Split Array Resolution flag is set differently on the primary and secondary side, the array enters the Offline state to both systems. (Chapter 8, Split Site Management, page 199) Normal Reconnection. When the systems are reconnected, the secondary disk drives are resynchronized with the primary disk drives, as shown in Figure 33. Figure 33. Reconnection of a Split Array.
7. not available for several reasons: the disk drive has failed; the disk drive has been removed from the subsystem; an SSA link has failed; a power failure has occurred. Action: If the SSA service aids are available, run the Link Verification service aid (see k j) to find any failed disk drives, failed SSA links, or power failures that might have caused the problem. If you find any faults, go to the Start MAP or eee es in the enclosure service information to isolate the problem, then go to of to return the array to the Good state. If the SSA service aids are not available, or the Link Verification service aid does not find any faults, go to EN isolate the problem. A RAID 5 array causes this error code if a disk drive is not available to the array. A RAID 1 or RAID 10 array causes this error if the array has one or more exposed mirrors. A RAID 1 or RAID 10 mirror becomes exposed when one disk drive in the mirror pair is not available to the array. (Chapter 18, SSA Problem Determination Procedures, page 429) SRN 49500. Description: No hot spare disk drives are available for an array that is configured for hot spare disk drives. Action: Determine whether any disk drives have failed or have been rejected. To do this action: 1. Display the use of the disk drives that are attached to the SSA are attached to the adapter that logged this error 3 If
8. Each separate write operation is treated separately by the target, so, when reading, each separate write operation requires a separate read operation. Configuring the SSA Target Mode. Each using system requires its own unique node number. The SSA adapter software specifies this node number, which is used by Target Mode SSA. The configuration database contains the ssar device. The node_number attribute sets the number for the node. Failure to have unique node numbers in the SSA loops causes unpredictable results with the target mode interface. Node numbers that are not unique cause error logs. You can use the ssavfynn command to check for duplicate node numbers. When the node is configured, it automatically inspects the existing SSA loops. It detects all nodes that are using the target mode SSA interface now. Each detected node is then added to the configuration database if it is not already part of it. For each node that is added, tmssaXX is created, where XX is the node number of the detected node. When configuration is complete, special files exist in the /dev directory. These files allow you to use the target mode interface with each node that is defined in the configuration database. Configuration does not need communication to be actually possible between the relevant using systems. Communication is needed only for the write operation. Buffer Management. You can set the buffer sizes tha
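The uniqueness rule for node numbers in excerpt 8 can be illustrated with a minimal sketch. It does not call ssavfynn or read the configuration database; the function names and the input mapping (using-system name to node_number) are hypothetical and exist only for illustration.

```python
from collections import defaultdict


def find_duplicate_node_numbers(node_numbers):
    """Return {node_number: [system names]} for every number used more than once.

    node_numbers is a hypothetical mapping of using-system name -> node_number.
    """
    systems_by_number = defaultdict(list)
    for system, number in node_numbers.items():
        systems_by_number[number].append(system)
    return {
        number: sorted(systems)
        for number, systems in systems_by_number.items()
        if len(systems) > 1
    }


def tmssa_device_name(node_number):
    """Build the tmssaXX name, where XX is the node number of the detected node."""
    return f"tmssa{node_number}"


if __name__ == "__main__":
    loop = {"system1": 1, "system2": 2, "system3": 1}
    print(find_duplicate_node_numbers(loop))
    print(tmssa_device_name(2))
```

A check of this kind only reports the conflict; the excerpt states that the real check on a using system is done with the ssavfynn command.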
9. Select Change Show Delete a Hot Spare Pool Add the hot spare disk drives to the correct pool see Adding Disks Select Change Member Disks in an SSA RAID Array Swap the pdisk whose name you noted in step fd with a hot spare disk drive that is in the pool that you noted in step Pee rarer Disks in an A RAID Arra If the customer chooses not to have the disk drive swapped Note the only way to clear this error condition is to remove all the member disk drives from this hot spare pool then recreate the pool If you do these actions you change the configuration of the hot spare pool Select Change Show Use of an SSA Physical Disk Change to Hot Spare Disk the Current Use parameter of the disk drive pdisk whose name you noted in step id Select Change Show Delete a Hot Spare Pool Add to the correct pool the hot spare disk drive that you created in step id An array member has used a hot spare disk drive from a pool other than its specified pool If the hot spare pools have been correctly configured this error indicates that more than one disk drive might have failed Possible FRUs Device 100 bn page 319 432 User s Guide and Maintenance Information SRN Problem Possible Causes 49530 Description The number of disk drives that remain in a hot spare pool is The number of hot spare disk less than the specified number drives that are now in the assigned pool is l
10. 3. When you have selected an adapter, a list is displayed that shows all the devices that are connected to the adapter:
   LINK VERIFICATION 802386
   SSA Link Verification for: nunu:ssa0 04-02 IBM SSA 160 SerialRAID Adapter
   To Set or Reset Identify, move cursor onto selection, then press <Enter>.
   Physical        Serial     Adapter Port       Status
                              A1  A2  B1  B2
   [TOP]
   nunu:pdisk2     AC7AA09A    0  12             Good
   nunu:pdisk9     AC7AA2D6    1  11             Good
   nunu:pdisk8     AC7AA0BD    2  10             Good
   nunu:pdisk3     AC7AA0B1    3   9             Good
   ssd32:ssa0:A                4   8
   nunu:pdisk6     AC7AA0B5    5   7             Good
   nunu:pdisk1     AC7AA052    6   6             Good
   nunu:pdisk7     AC7AA0B9    7   5             Good
   nunu:pdisk5     AC7AA0B3    8   4             Good
   [MORE...3]
   F3=Cancel  F10=Exit
   The columns of information displayed on the screen have the following meanings. The Physical column lists the devices as they appear on the SSA loop. A device can be either a pdisk or an adapter. If the device is a pdisk, it is listed as systemname:pdiskname (for example, nunu:pdisk3), where systemname is the name of the using system to which the pdisk is connected, and pdiskname is the physical disk drive resource identifier. If the device is an adapter, it is listed as systemname:adaptername:loop (for example, ssd32:ssa0:A), where systemname is the name of the using system that contains the SSA adapter, adaptername is the adapter resource identifier, and loop is the loop connection: A indicates that the adapter is connected through ports A1 and A2; B indicates
11. Configuration information of the array is held in a reserved area (sector) on each of the first three member disk drives of the array. If fewer than two of these sectors can be read or written, the array normally goes into the Offline state. An important characteristic of RAID 10 is that the mirrored pairs can be located in different sites, in different power domains. The availability of a RAID 10 array is therefore better than that of a RAID 5 array. However, if both domains of a two-site configuration are operational, but communication is lost between the sites, it is important to ensure that each system does not continue to operate on its own copy of the array. Under this condition, the data might not be consistent. To prevent this problem from occurring, the first, third, and fifth member disk drives of the array are the primary members; the second, fourth, and sixth member disk drives are the secondary members. Access to at least one of the primary disk drives that contain the configuration information is normally required for array operations to continue. Therefore: If a network partition exists, the using system that has access to the primary configuration disk drives continues to operate. The using system that has access to only the secondary configuration disk drives cannot normally access the array. If the using system fails at the site that contains the secondary member disk drive, the using system that has access to the primary confi
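The primary and secondary member rule in excerpt 11 can be sketched as follows. This is a simplified model under stated assumptions: member positions are 1-based, the configuration sectors are on the first three members, and the Split Array Resolution flag is ignored. All function names are hypothetical.

```python
CONFIG_MEMBER_COUNT = 3  # configuration sectors are held on the first three members


def member_role(position):
    """Return 'primary' for odd 1-based positions and 'secondary' for even ones."""
    if position < 1:
        raise ValueError("member positions are 1-based")
    return "primary" if position % 2 == 1 else "secondary"


def primary_config_members():
    """Positions of the primary members that also hold configuration information."""
    return [p for p in range(1, CONFIG_MEMBER_COUNT + 1) if member_role(p) == "primary"]


def can_continue(accessible_positions):
    """True if at least one primary configuration member is accessible.

    Simplification: the text says this access is 'normally' required, and the
    Split Array Resolution flag (not modelled here) can change the outcome.
    """
    return any(p in accessible_positions for p in primary_config_members())


if __name__ == "__main__":
    print([member_role(p) for p in range(1, 7)])
    print(can_continue({2, 4, 6}))  # a site that sees only secondary members
    print(can_continue({1, 3, 5}))  # a site that sees the primary members
```

In a network partition, the site that sees only the even-numbered members fails the check, which matches the statement that it cannot normally access the array.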
12. List all the types of array objects. List all the types of objects that can be created. List all types of object. Notes about the ssaraid Command: 1. You can specify RAID object names (arrays or member disk drives) as either the 15-character connection location or as the device name. The preferred name is the 15-character connection location. This name is the same as the SSA serial number for the device. 2. You can specify Boolean attribute values as any of the following: 0, 1, f, t, false, true, n, y, no, yes, off, on. The attributes must be in lowercase. Command Syntax ssaraid ssaraid M o 1 RaidManager ssaraid 1 RaidManager n Name m p x e t ObjectType r Name c ssaraid I 1 RaidManager n Name m p x e t ObjectType r Name c a Attribute Value o z h S o ssaraid C 1 RaidManager t CreateType s Member Member a Attribute Value d k DeviceName n Name r Name ssaraid D 1 RaidManager n Name u ssaraid H 1 RaidManager n Name u d k DeviceName a Attribute Value ssaraid A 1 RaidManager n Name i InstructType a Attribute Value ssaraid Y c a o 1 RaidManager Legend: RaidManager: The name of the SSA adapter that has RAID array support. Name: The name of the specific device upon which the operation is to be performed. ObjectType: The type of device upon which the
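Note 2 in excerpt 12 lists the accepted Boolean spellings. A minimal sketch of a parser for those spellings follows; it is not part of the ssaraid command, and the function name is hypothetical. It rejects uppercase input because the excerpt says the attributes must be in lowercase.

```python
TRUE_VALUES = {"1", "t", "true", "y", "yes", "on"}
FALSE_VALUES = {"0", "f", "false", "n", "no", "off"}


def parse_boolean_attribute(value):
    """Map one of the listed lowercase spellings to a Python bool."""
    if value in TRUE_VALUES:
        return True
    if value in FALSE_VALUES:
        return False
    raise ValueError(f"not a recognized lowercase Boolean value: {value!r}")


if __name__ == "__main__":
    for text in ("yes", "off", "t", "0"):
        print(text, parse_boolean_attribute(text))
```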
13. Use the Display Download Microcode service aid to check whether the microcode is at the latest available level on the disk drive that ou have just installed see Select Download Microcode to all SSA Physical Disk Drives Select Continue with Microcode Installation Note No microcode is downloaded if the latest available level of microcode is already on the disk drive The disk drive might have been configured with new hdisk and pdisk numbers You can change these numbers For example if the disk drive is a replacement disk drive you might want to make its pdisk and hdisk numbers match those of the original disk drive If you want to change the numbers see EE When you have changed the numbers return to this section and go to Step B2 of page 22a If you do not want to change the numbers go to step B3 on page 325 324 User s Guide and Maintenance Information 23 24 25 26 27 If the disk drive that you are installing is a replacement for a disk drive that was a member of an SSA RAID array go to step Otherwise go no further with these instructions Type smitty ssaraid and press Enter Select Change Show Use of an SSA Physical Disk The pdisk that has been exchanged is listed under SSA Physical Disks that are system disks Select the pdisk from the list Change the Current Use parameter to Hot Spare Disk or to Array Candidate Disk Note Itis the user who should make the choice of Current Use parameter T
14. b To take out of use by the existing computer system uncouple To separate the copy array from the parent RAID 1 or RAID 10 array The metadata of the copy array is updated to show that it is no longer part of the parent RAID array The metadata of the parent array is updated to show that it is no longer associated with the copy array The copy array is brought online and becomes available as a free resource 492 User s Guide and Maintenance Information unrecoverable error An error for which recovery is impossible without the use of recovery methods that are outside the normal computer programs user mode In the operating system a mode in which a process is run in the user s program rather than in the kernel V vary off To make a device control unit or line not available for its normal intended use vary on To make a device control unit or line available for its normal intended use vital product data VPD In the operating system information that uniquely defines system hardware software and microcode elements of a processing system VPD Vital product data Index Numerics 128 MB Memory Module feature 5 3 way copy array copy services 173 Delete a RAID Array Copy option 183 Delete a Volume Group Logical Volumes or Filesystems Copy option 184 List All Copy Candidates option 179 List All Uncoupled Copies option 181 List All Uncoupled Volume Groups option 182 Prepare a Copy option 175 U
15. screen. Figure 54. Configuration Shown by the Service Aid. This screen shows the same configuration, but here the link is broken between pdisk1 and pdisk2:
   LINK SPEED 802438
   SSA Link Speed for: systemname:ssa0 00-03 IBM SSA 160 SerialRAID Adapter
   To set or reset Identify, move cursor onto selection, then press <Enter>.
   Source                 Speed   Destination
   systemname:ssa0:A1     40      systemname:pdisk3
   systemname:pdisk3      40      systemname:pdisk2
   systemname:pdisk2      00      ????
   ????                   00      systemname:pdisk1
   systemname:pdisk1      40      systemname:pdisk0
   systemname:pdisk0      40      systemname:ssa0:A2
   systemname:ssa0:B1     40      systemname:pdisk5
   systemname:pdisk5      40      systemname:pdisk4
   systemname:pdisk4      40      systemname:ssa0:B2
   F10=Exit
   Figure 55 gives a physical representation of the configuration that is shown on the screen. Figure 55. Configuration with Broken Link Shown by the Service Aid. (Chapter 17, SSA Service Aids, page 399) Service Aid Service Request Numbers (SRNs). If the SSA service aids detect an unrecoverable error and are unable to continue, one of the following service request numbers (SRNs)
16. 47000 Description An attempt has been made to store in the SSA adapter the User action details of more than 128 arrays Those arrays that cannot be stored become offline to the adapter Possible FRUs seh adapter card 100 Action The system user must delete from the SSA soaps the details of old arrays see Deleting an Old If no details of old arrays are present or if existing old arrays cannot be removed exchange the FRU for a new FRU 47500 Description Part of the array data might have been lost e An unreadable data sector Action G existed on a disk drive when age i an array was created An unreadable data sector was detected during a rebuilding operation on an array The SSA adapter was reset during an attempt to recover a failed disk drive 48000 Description The SSA adapter has detected a link configuration that is not SSA loop configuration valid problem Action See 6 e 48500 Description The array filter has detected a link configuration that is not SSA loop configuration problem 424 User s Guide and Maintenance Information SRN Problem Possible Causes 48600 Description All the member disk drives of an array are not on the same SSA loop configuration SSA loop The array is in the Exposed state and write operations to the problem array are inhibited Action All the member disk drives of an array must be on the same SSA loop Find all the members of the array 1 Type smitt
17. A hardware error that has not been recovered. A software error that has been detected by the device driver. The target mode device driver passes error recovery responsibility for all detected errors to the caller. For these errors, the target mode device driver does not know if this type of error is permanent or temporary. These types of errors are handled as temporary errors. Only errors that the target mode device driver can itself recover through retry operations can be determined to be either temporary or permanent. The error is ignored if it succeeds during retry (a recovered error). The return code to the caller indicates success if a recovered error occurs, or failure if an unrecovered error occurs. The caller can retry the command or operation, but the probability of success is low for unrecovered errors. TMSSA does no error logging. If an error occurs, that error might be logged by the adapter device driver. (Chapter 13, Using the Programming Interface, page 303) tmssa Special File. Purpose: To provide access to the SSA tmssa device driver. Description: The Serial Storage Architecture (SSA) target mode device driver provides an interface that allows the SSA interface to be used for data transfer from using system to using system. You can access the data transfer functions through character special files that are named /dev/tmssann.xx, where nn is the node number of the node with which you are communicating. The xx can be either im initiator
18. Availability. Redundant Array of Independent Disks (RAID) technology provides: larger disk size; immediate availability and recovery of data; redundancy of data at a level that you can choose. RAID technology stores data across groups of disk drives that are known as arrays. These arrays are contained in array subsystems, which can be configured with one or more arrays. All arrays, except RAID 0 arrays, can provide data redundancy that ensures that no data is lost if one disk drive in the array fails. An Advanced SerialRAID Adapter, which uses microcode below level 5000, provides RAID 0 and RAID 5 functions to control the arrays of the RAID subsystem. An Advanced SerialRAID Plus Adapter, which uses microcode at or above level 5000, provides RAID 0, RAID 1, RAID 5, and RAID 10 functions. The main characteristics of the various RAID types are as follows. RAID 0 provides data striping across disk drives, but provides no added protection against loss of data. RAID 1 provides data mirroring across two member disk drives to protect against loss of data. RAID 5 provides data striping with parity data across disk drives to provide protection against loss of data. RAID 10 provides data striping and data mirroring across disk drives to provide protection against loss of data. Availability is an important consideration that can affect the way you configure your arrays. It is the ability of a system to continue operating although
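The relationship between microcode level and available RAID functions in excerpt 18 can be written as a small lookup. This is only an illustration of the stated rule; the function names are hypothetical, and the microcode level is assumed to be available as an integer.

```python
PLUS_MICROCODE_LEVEL = 5000  # at or above this level: Advanced SerialRAID Plus Adapter


def supported_raid_levels(microcode_level):
    """Return the RAID levels the text lists for a given microcode level."""
    if microcode_level < PLUS_MICROCODE_LEVEL:
        return (0, 5)
    return (0, 1, 5, 10)


def provides_redundancy(raid_level):
    """All listed array types except RAID 0 can provide data redundancy."""
    if raid_level not in (0, 1, 5, 10):
        raise ValueError(f"RAID level not covered by the text: {raid_level}")
    return raid_level != 0


if __name__ == "__main__":
    print(supported_raid_levels(4999))
    print(supported_raid_levels(5000))
    print(provides_redundancy(0), provides_redundancy(10))
```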
19. (Chapter 17, SSA Service Aids, page 385) Notes: a. In the lists of physical disk drives (pdisks) that are displayed by the service aids, you might see: ????? These question marks show where an SSA loop is broken. The service aid cannot display which, if any, devices are missing from this configuration. ***** These asterisks indicate an unconfigured device, that is, an SSA device that is in the SSA network but whose type is not known. Such a condition can occur if the device has not been configured on the SSA network, or it has not been configured into the using system. If you have just switched on an SSA device or a disk drive enclosure subsystem, you might need to wait for up to 30 seconds before the device is configured on the SSA network. If a new device has been added to the SSA network, you must give the cfgmgr command to configure that device into the using system. For example:
   LINK VERIFICATION 802386
   SSA Link Verification for: nunu:ssa0 04-03 IBM SSA 160 SerialRAID Adapter
   To Set or Reset Identify, move cursor onto selection, then press <Enter>.
   Physical   Serial   Adapter Port (A1 A2 B1 B2)   Status
   [TOP]
   nunu:pdisk1   AC515E90   0     Good
   nunu:pdisk2   AC515EAB   1     Good
   ?????
   nunu:pdisk3   AC515EB1   1     Good
   nunu:pdisk4   AC515EB9   0     Good
   ganges:ssa0:A            0  7
   nunu:pdisk6   AC51606E   1  6  Good
   *****                    2  5  Good
   nunu:pdisk0   AC506D6E   3  4  Good
   [MORE...4]
   F3=Cancel  F10=Exit
   This example shows that the SSA loop is broken bet
20. Expansion drawer. The location code shows only the position of the SSA adapter in the using system and the type of device that is attached. The location of the device within the SSA loop must be found by use of a service aid. The service aids use the IEEE standard 16-digit unique ID of the device. Pdisks, Hdisks, and Disk Drive Identification. The physical disk drives (pdisks) in an SSA subsystem can be configured as logical units (LUNs). A LUN is also known as an hdisk, and can consist of one or more physical disk drives. An hdisk in an SSA subsystem might, therefore, consist of one pdisk or several pdisks. The configuration software allocates an identification (hdisk and pdisk number) to each disk drive during the configuration of the SSA link. The disk drives do not have fixed physical addresses. The numeric identifiers of pdisks, hdisks, and the disk drive slots are not related to each other. For example, pdisk1 is not necessarily in slot 1 of the physical unit in which it is installed. (Chapter 2, Introducing SSA Loops, page 19) The configuration software first recognizes the disk drive by its machine-readable serial number. The serial number of the disk drive is also displayed by the service aids. The service aids show the number as the last eight digits of the IEEE SSA Unique ID. Service actions are always related to physical disk drives. For this reason, errors that occur on SSA disk drives are always logged against the physical dis
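Excerpt 20 says that the service aids show the serial number as the last eight digits of the 16-digit IEEE SSA Unique ID. A minimal sketch of that extraction follows. The function name and the sample ID are hypothetical, and the sketch assumes the ID is supplied as a 16-character hexadecimal string.

```python
import string

UNIQUE_ID_LENGTH = 16
SERIAL_LENGTH = 8


def serial_from_unique_id(unique_id):
    """Return the last eight digits of a 16-digit unique ID."""
    if len(unique_id) != UNIQUE_ID_LENGTH:
        raise ValueError("expected a 16-digit unique ID")
    if any(ch not in string.hexdigits for ch in unique_id):
        raise ValueError("expected hexadecimal digits only")
    return unique_id[-SERIAL_LENGTH:].upper()


if __name__ == "__main__":
    # Made-up ID for illustration only; not taken from a real device.
    print(serial_from_unique_id("1234567890ABCDEF"))
```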
21. Figure 9 shows an example configuration that has two loops and two adapters. Figure 9. Two Loops with Two Adapters. (Chapter 2, Introducing SSA Loops, page 15) Large Configurations. Up to eight SSA adapters can be connected in a particular SSA loop, and up to 48 disk drives can be included in that loop. Figure 10 shows an example of a large configuration that has eight adapters in eight using systems. Figure 10. A Large Configuration of Thirty-Two Disk Drives Connected to Eight SSA Adapters in Eight Using Systems. Figure 11 shows an example of a large configuration that has eight adapters in four using systems. Adapter Adapter Usi
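The two loop limits stated in excerpt 21 (up to eight SSA adapters and up to 48 disk drives in one loop) can be checked with a short sketch. The function name is hypothetical, and other configuration rules that the manual may define elsewhere are not modelled.

```python
MAX_ADAPTERS_PER_LOOP = 8
MAX_DISK_DRIVES_PER_LOOP = 48


def check_loop_limits(adapter_count, disk_drive_count):
    """Return a list of limit violations; an empty list means both limits are met."""
    problems = []
    if adapter_count > MAX_ADAPTERS_PER_LOOP:
        problems.append(
            f"{adapter_count} adapters exceeds the limit of {MAX_ADAPTERS_PER_LOOP}"
        )
    if disk_drive_count > MAX_DISK_DRIVES_PER_LOOP:
        problems.append(
            f"{disk_drive_count} disk drives exceeds the limit of {MAX_DISK_DRIVES_PER_LOOP}"
        )
    return problems


if __name__ == "__main__":
    # The Figure 10 example: thirty-two disk drives and eight adapters.
    print(check_loop_limits(8, 32))
    print(check_loop_limits(9, 49))
```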
22. O object types Command Line Interface 238 ODM attributes PCI 257 Offline state RAID 10 37 Index 501 Offline state RAID 5 34 one loop with two adapters in each of two using systems 12 one loop with two adapters in one using system 11 open and close subroutines adapter device driver 258 open subroutine tmssa device driver 296 open read write and close subroutines disk device driver 272 operation after a loss of member disks split site management 194 adapter not known to remaining half of the array 203 one half of the array is not present 195 split and join procedure not performed correctly 205 options of the RAID Command Line Interface 238 P part numbers 340 paths data SSA link 7 examples broken loop cable removed 403 broken loop disk drive removed 406 normal loops 401 PCI adapter ODM attributes 257 pdisks and hdisks changing numbers 326 explanation of 19 identification 19 reformatting a pdisk as an hdisk 387 physical disk change attributes RAID 5 249 physical relationship of disk drives and adapters 24 one pair of adapter connectors in the loop 24 pairs of adapter connectors in the loop mainly shared data 26 pairs of adapter connectors in the loop some shared data 25 port addresses Advanced SerialRAID Adapter type 4 P 6 POSTs power on self tests adapter 317 powering off using systems in a large configuration 17 powering on using systems in a large configuration 17 Prepare a Copy option 175 prob
23. Type: PERM. Resource Name: ssa0. Resource Class: adapter. Resource Type: ssa. Location: 04-07. The Type field can have the following flags: PEND, PERF, PERM, TEMP, UNKN, and INFO. These flags are described in the using system documentation. The PERM flag, however, is also described here, because the SSA definition of the flag is slightly different from the using system software definition. The PERM flag is used to log many SSA errors. The using system software defines the PERM flag as an error from which recovery is not possible. For SSA devices, the error, although possibly permanent, is not necessarily obvious to the customer. The PERM flag is used here to ensure that, when diagnostics are run in Problem Determination mode, the SSA error log analysis runs, and any problems that need service action are identified. Detail Data Formats. The Detail Data fields of SSA error logs use two data formats: SCSI Sense Data format; SSA Error Code format. (Chapter 11, SSA Error Logs, page 225) SCSI Sense Data Format. Errors that are logged with the following labels have SCSI sense data in the detail data field in the error log: DISK_ERR1, SSA_DISK_ERR2, DISK_ERR4, SSA_DISK_ERR3, SSA_DISK_ERR1, SSA_DISK_ERR4. SCSI sense data consists of 32 bytes of data. See find out how this data is used. SSA Error Code Format. Errors that are logged with the following labels have SSA error code data in the detail data field in the error log: SSA_ARRAY_ERROR SSA_HDW_E
24. ssa_make_copy P f data_fs2 ssa_make_copy f data_fs2 Step 1 Step 2 Step 3 Source Volume Group Source Volume Group Source Volume Group D hd_2 D hd_2 D hd_2 hd_1 hd_3 hd_1 hd_3 hd_1 hd_3 i A data_fs1 data_fs1 data_fs1 Iv_A data_fs2 Iv_A data_fs2 Iv_A amp data_fs2 Iv_C lv_C lv_C lv_B lv_B lv_B Copy Physical Volumes gt Copy by LV FS name hd_2 naa a D i d_3 hd_3 _3_cp D amp fsdata_fs2 lt fslv_C Figure 28 Copying One Logical Volume or Copying by FS Name 2 Figure 28 shows from left to right e The parent array that contains the source volume group e The empty RAID Copy array coupled to the parent array e The uncoupled RAID Copy array that now contains a copy of the logical volume Note that in the copy all names start with fs 168 User s Guide and Maintenance Information Example 4 Copying a Complete Volume Group and Recreating the Copy on Another Using System In this example you are copying a complete volume group then recreating the copy on another using system 1 On the original using system give the commands ssa_make_copy P v vgname ssa_make_copy U v vgname A message is displayed fo
25. SRN: 48950. Problem: Description: A disk drive has caused an array building operation to fail. Action: 1. Type smitty ssaraid and press Enter. 2. Select List all Defined SSA RAID Arrays. The hdisk that is causing this problem is listed as exposed or degraded. 3. Ask the user to make a backup of the data that is on this array. Some data might not be accessible. 4. Return to the SSA RAID Arrays menu and select List/Identify SSA Physical Disks. 5. Select List Disks in an SSA RAID Array. 6. Select the failing hdisk. 7. Note the pdisk numbers of the member disk drives of the failed array. 8. Ask the user to delete the array. 9. Return to the SSA RAID Arrays menu and select Change/Show Use of an SSA Physical Disk. 10. Run diagnostics in System Verification mode to all disk drives that are listed as rejected, if any are listed. 11. If the diagnostics run successfully, run the Certify service aid see to the disk drive that caused this SRN. 12. Run the Certify service aid see to all the disk drives that were members of the failed array. 13. Run the Certify service aid to all the disk drives that were members of the failed array. 14. If problems occur on any disk drive, exchange that disk drive for a new one. 15. Ask the user to recreate the array. Possible Causes: A disk drive has failed during an array building operation. Because the failure occurred before an initial array b
26. Go to step Eevee add the disk drive to the group of disk drives that are available for use by the RAID manager Note A disk drive that is listed as rejected is not necessarily failing For example the array might have rejected the disk drive because a power problem or an SSA link problem caused that drive to become temporarily unavailable Under such conditions the disk drive can be reused If you think that a disk drive has been rejected because it is failing check the error log history for that disk drive For example if you suspect pdisk3 type on the command line ssa_ela 1 pdisk3 h 5 This command causes the error log for pdisk3 to be analyzed for the previous five days If a problem is detected an SRN is generated J to verify the repair 8 from step m An attempt has been made to create a new array but the adapter already has the maximum number of arrays defined a Type smitty ssaraid and press Enter Select List Delete Old RAID Arrays in an SSA RAID Manager b c Delete any array names that are no longer used d to verify the repair Chapter 18 SSA Problem Determination Procedures 457 9 from step A Attention Part of the data that is on the array has been damaged and cannot be recovered Before any other action is taken the user must recover all the data that is not damaged and create a backup of that data a Type smitty ssaraid and press Enter b Select List Status Of All Defined SS
27. Identify Array Candidate Disks Identify System Disks SSA RAID Manager Move cursor to desired item and press Enter ssaQ Available 00 04 IBM SSA 160 SerialRAID Adapter 14109100 Fl Help F2 Refresh F3 Cancel F8 Image F1O Exit Enter Do Find n Find Next Ne a Select the adapter whose rejected disk drives you want to list Chapter 6 Using the RAID Array Configurator 91 92 3 D N A list of rejected disk drives is displayed a gt COMMAND STATUS Command OK stdout yes stderr no Before command completion additional instructions may appear below pdisk4 08005AEA030D00D member rejected 2 3G Physical disk Fl Help F2 Refresh F3 Cancel F6 Command F8 Image F9 Shel F10 Exit Find wa Next J Check the list of rejected disk drives against other error reports to find out why the disk drive was rejected from the array lf you know the physical location of the rejected disk drive go to step Kad Otherwise go to step Ato identify the rejected disk drive For fast path type smitty ifssaraid and press Enter Otherwise a Return to the List Identify SSA Physical Disks menu b Select Identify Rejected Array Disks The list of adapters that was displayed in step P on page 91 is displayed again User s Guide and Maintenance Information 8 10 11 12 Select the adapter that contains the rejected disk drive The following menu is displayed Identify Rejected Array Disks
28. Is one of the pdisks failing NO YES Go to step Kon page 449 c Goto repair 448 User s Guide and Maintenance Information 6 from step 5 link in one of the loops is broken If two rows of question marks are displayed two links are broken one in each loop In the example shown here pdisk2 is missing ae VERIFICATION 802386 SSA Link Verification for nunu ssaQ 00 04 IBM SSA 160 SerialRAID Adapter To Set or Reset Identify move cursor onto selection then press lt Enter gt Physical Serial Adapter Port Al A2 Bl B2 Status TOP nunu pdisk11 AC7AA09A Oh 5 Good nunu pdisk8 AC7AA2D6 1 4 Good 222 nunu pdisk3 AC7AAQB1 Bree Good nunu pdisk7 AC7AAQB5 4 1 Good nunu pdisk12 AC7AA052 5 0 Good nunu pdisk AC7AAOQB9 0 5 Good nunu pdisk1 AC7AAQB3 1 4 Good nunu pdisk10 AC7AAQB4 2 3 Good MORE 4 Nesta F1O Exit Is a link broken between two pdisks NO YES No trouble found a Find the devices that are on each side of the broken link The Identify function which is available on this display helps you to find the locations of pdisks See if you need more information aboni finding the disk drive j The information that is provided there can help you solve the problem For information on how to identify and exchange the FRU see the service information for the enclosure that contains the device Chapter 18 SSA Problem Determination Procedures 449 MAP 2323 SSA Intermittent Lin
29. List Delete Old RAID Arrays Recorded in an SSA RAID Manager List Status of Hot Spare Pools List Status of Hot Spare Protection for an SSA RAID Array List Components in a Hot Spare Pool SSA RAID Array Move cursor to desired item and press Enter hdisk2 095231779F0737K good 3 4G RAID 5 array hdisk3 09523173A02137K good 3 4G RAID 5 array Fl Help F2 Refresh F3 Cancel F8 Image F1Q Exit Enter Do Find n Find Next S J Select the array whose attributes you want to see or change Chapter 6 Using the RAID Array Configurator 135 4 A list of attributes is displayed eo gt Change Show Attributes of an SSA RAID Array Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields SSA RAID Manager ssaQ SSA RAID Array hdisk3 Connection Address Array Name 00243199986267K RAID Array Type raid_5 State good Member Disks pdiskl pdisk3 pdisk4 p gt Size of Array 3 46 Percentage Rebuilt Not Rebuilding Enable Use of Hot Spares yes Allow Page Splits yes Current Use System Disk E Fl Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image ee F1Q Exit Enter Do Move the cursor to the attribute that you want to change and press the List key 5 A list of options for that attribute is displayed Select the option that you want 6 If you want to change another attribute move the cursor to that attribute and press the List key Again choose from the list of d
30. Provide an interface to allow SSA device drivers to access SSA physical disks. /dev/hdisk0, /dev/hdisk1, ..., /dev/hdiskn: Provide an interface to allow SSA device drivers to access SSA logical disks. SSA Disk Concurrent Mode of Operation Interface: The SSA subsystem provides support for the broadcast of one-byte message codes from one using system to all other using systems that are connected to the same disk drive. This ability to pass messages can be used to synchronize access to the disk drive. The operating system has a concurrent mode interface to handle the sending and receiving of messages. The concurrent mode of operation requires that a top kernel extension run on all the using systems that are sharing a disk drive. The top kernel extensions use the concurrent mode interface of the SSA disk device driver to communicate with each other through the SSA subsystem. The interface allows a top kernel extension to send and receive messages between using systems. The concurrent mode interface consists of an entry point in the SSA disk device driver and an entry point in the top kernel extension. Two ioctls register and unregister the top kernel extension with the SSA disk device driver. The SSA Disk Device Driver entry point provides the method of sending messages and of locking, unlocking, and testing the disk drive. The top kernel extension entry point processes interrupts, which might include th
31. Run diagnostics in System Verification mode to all the disk drives that are listed as rejected b Run the Certify service aid see to all the disk drives that are listed as rejected Chapter 18 SSA Problem Determination Procedures 467 c If problems occur on any disk drive exchange that disk drive for a new disk drive see EEX continue from step 4d in this procedure d A disk drive that is listed as rejected is not necessarily failing For example the array might have rejected the disk drive because a power problem or an SSA link problem caused that drive to become temporarily unavailable Under such conditions the disk drive can be reused If you think that a disk drive has been rejected because it is failing check the error log history for that disk drive For example if you suspect pdisk3 type on the command line ssa_ela 1 pdisk3 h 5 This command causes the error log for pdisk3 to be analyzed for the previous five days If a problem is detected an SRN is generated e Type smitty ssaraid and press Enter f Select Change Show Use of an SSA Physical Disk and for all disk drives that you have tested or exchanged change the Current Use parameter to Array Candidate Disk g Select Change Member Disks in an SSA RAID Array h Select Swap Members of an SSA RAID Array i Select the hdisk that is in the Degraded copy state that is the hdisk that you noted in step bad on nage 467 j Referring to the displ
32. The SSA adapter is missing from the expected configuration Possible FRUs SSA adapter card 100 Action Verify that the SSA adapter card is installed in the expected slot of the using system If it is in the expected slot exchange the FRU for a new FRU If it is not in the expected slot give the diag a command and answer the questions that are displayed 60200 Description SRNs in this range are not adapter SRNs Not applicable 60210 Action For SRNs in this range see the documentation for your SSA enclosure or SSA subsystem 60240 Description A configuration problem has occurred A device cannot be Software error configured for some unknown reason SSA loop configuration Action Go to the START MAP for the a in which the device is problem installed If no problem is found go to 7XXXX Description An SSA device is missing from the expected configuration of An SSA enclosure diagnostic the SSA loop has detected a missing disk drive Action Go to the service information for the enclosure in which the missing device should be installed Note In this SRN an X represents a digit 0 through F 8XXXX Description SRNs in this range are used by the SSA enclosure Not applicable subsystem Action Go to the service information for your SSA enclosure Note In this SRN an X represents a digit 0 through F D0000 Description SRNs in this range are not adapter SRNs Not applicable to DOFFF Action For SRNs in this range
33. Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields SSA RAID Manager ssa0 Rejected Array Disks Flash Disk Identification Lights yes F1l Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image F9 Shel1 F1O Exit Enter Do Y Select yes in the Flash Disk Identification Lights field Press the List key to list the disk drives From the displayed list select the disk drives that you want to identify The Check light flashes on each disk drive that you have selected If the disk drive was rejected from the array because the disk drive itself has failed go to step aga pana od If the disk drive was rejected from the array because some other part has failed for example a power supply unit or an SSA cable a Correct the problem or call your service representative b Add the disk drive to the array see c Run system diagnostics to verify that the repair is successful Alternatively a Change the use of the original disk drive so that it becomes a hot spare disk drive see b Install a replacement disk drive see c Run system diagnostics to verify that the repair is successful Chapter 6 Using the RAID Array Configurator 93 13 Physically remove the failing disk drive for a new one see the Operator Guide or Service Guide for the unit 14 If you are going to install a replacement disk drive go to Finstalling a Replacemeni 94 Use
34. diagnostic aids adapter POSTs power on self tests 317 SRNs service request numbers 411 direct call entry point 265 description 265 purpose 265 return values 265 disk device driver configuration issues 266 configuring SSA disk drive devices 268 logical and physical disks and RAID arrays 266 multiple adapters 267 description 266 device attributes 270 device dependent subroutines 272 error conditions 274 IOCINFO ioctl operation 277 description 277 files 277 purpose 277 open read write and close subroutines 272 purpose 266 readx and writex subroutines 274 responsibilities 255 special files 276 SSA disk concurrent mode of operation interface 287 device driver entry point 287 top kernel extension entry point 288 SSADISK_ISAL CMD ioctl operation 278 description 278 files 280 purpose 278 return values 279 SSADISK_ISALMgr_CMD ioctl operation description 281 SSADISK_ISALMgr CMD ioctl operation 281 disk device driver continued files 282 purpose 281 return values 282 SSADISK_LIST_PDISKS ioctl operation 285 description 285 files 286 purpose 285 return values 286 SSADISK_SCSI_CMD ioctl operation 283 description 283 files 284 purpose 283 return values 284 syntax 266 disk drive microcode maintenance 315 disk drives failed identifying correcting and removing 91 finding the physical location 409 formatted on different types of machine 387 identification 19 not in arrays 30 reservation of 27 unique IDs UIDs 21 disk f
35. drive and a hot spare disk drive. The attributes of the disks are all set to their default values. Type the command: > ssaraid -C -l ssa0 -t spare_pool -n pool_B1 -s pdisk0 pdisk3 where: -C specifies that this operation is a create operation; -l ssa0 specifies that RAID Manager ssa0 is to be used; -t spare_pool specifies that a spare pool is to be created; -n pool_B1 specifies the name of the hot spare pool that is to be created (the format of the name must match the pool_XY format, where X defines the SSA loop and has a value A or B, and Y is a number 1 through 31); -s pdisk specifies the free pdisk that is to become the member disk of the hot spare pool. Valid members are RAID array member disk drives and hot spare disk drives. Note: Not all RAID managers provide support for the hot spare pool function. Example 6: To List All Defined SSA Objects. This example shows how to list, in summary format, all the defined SSA objects that are at present connected to a particular RAID manager. Type the command: > ssaraid -I -z -l ssa0 where: -I specifies that this operation is a list operation; -z specifies that the output is to be presented in summary format; -t raid_5 specifies that a RAID 5 array object is to be created; -l ssa0 specifies that all SSA objects that are connected to RAID Manager ssa0 are to be listed. A result similar to that shown here is displayed: pdisk0 0004AC506C4000D member n/a 4.5GB Physical disk pdisk1
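The pool_XY naming rule quoted in this excerpt (X is A or B, Y is a number 1 through 31) can be checked before a name is passed to the create operation. The following Python fragment is only a sketch of that rule: the function name parse_pool_name is hypothetical, and rejecting leading zeros (for example pool_A01) is an assumption, because the excerpt does not say how such names are treated.

```python
import re

# Pattern for the pool_XY rule: X is the SSA loop (A or B),
# Y is one or two digits with no leading zero (assumption).
_POOL_NAME = re.compile(r"pool_([AB])([1-9][0-9]?)")


def parse_pool_name(name):
    """Return (loop, number) for a valid hot spare pool name, else None."""
    match = _POOL_NAME.fullmatch(name)
    if match is None:
        return None
    loop = match.group(1)
    number = int(match.group(2))
    # The excerpt limits Y to the range 1 through 31.
    if not 1 <= number <= 31:
        return None
    return loop, number


if __name__ == "__main__":
    for candidate in ("pool_B1", "pool_A31", "pool_A32", "pool_C1"):
        print(candidate, parse_pool_name(candidate))
```

A check of this kind only validates the spelling of the name; whether the RAID manager supports hot spare pools at all still has to be confirmed on the system itself.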
36. The number of retried operations reached the limit that is specified in TM_MAXRETRY without success on an error that cannot be reproduced. The target mode device of the remote node is not initialized or open. Do the appropriate error recovery routine. ETIMEDOUT: The command has timed out. Do the appropriate error recovery routine. ioctl Subroutine: The following ioctl operations are provided by the target mode device driver. Some are specific to either the target mode device or the initiator mode device. All require the respective device instance to be open for the operation to run. IOCINFO: Returns a structure defined in the /usr/include/sys/devinfo.h file. TMCHGIMPARM: Allows the caller to change some parameters that are used by the target mode device driver for a particular device instance. TMIOSTAT: Allows the caller to get status information about the previously run write operation. Possible return values for the errno global variable include: EFAULT: The kernel service failed when it tried to access the caller buffers. EINVAL: The device is not open or not configured. The operation is not applicable to the mode of this device. A parameter that is not valid was passed to the device driver. select Entry Point: The select entry point allows the caller to know when a specified event has occurred on one or more target mode devices. The event input parameter allows the caller to specify about
37. e The uncoupled RAID Copy array that now contains a copy of the complete new volume group Note that in the copy all names start with fs 164 User s Guide and Maintenance Information Example 2 Copying One Logical Volume In this example you are copying only one logical volume lv B from the parent array to the RAID Copy array To copy one logical volume give the commands ssa_make_copy P 1 lv_B ssa_make_copy 1 lv_B Step 1 Step 2 Step 3 Source Volume Group Source Volume Group Source Volume Group Ced eee eA hd_1 hd_3 hd_1 hd_3 hd_1 hd_3 data_fs1 data_fs1 data_fs1 Iv_A a data_fs2 l Iv_A data_fs2 Iv_A data_fs2 lv_C l Iv_C lv_C lv_B i lv_B Iv_B Copy Physical Volumes Copy by LV FS name ic fslv_A fslv_B Figure 26 Copying One Logical Volume Figure 24 shows from left to right The parent array that contains the source volume group The empty RAID Copy array coupled to the parent array e The uncoupled RAID Copy array that now contains a copy of the logical volume Note that in the copy all names start with fs Although the purpose of this job is to copy the logical volume lv_B by default the ssa_make_copy command has copied the comp
38. group to allow the copy disk drives to be reused for another copy, and repeat from step Zi. Recoupling the copy disk drives to the parent starts a new copy process. The terms couple and uncouple, which are used in this section, have specific meanings. couple: To attach a copy disk drive to a RAID 1 array, or to attach an array of copy disk drives to a RAID 10 array, so that data is copied from the RAID 1 or RAID 10 array to the copy array. The metadata of the copy array is updated to indicate that it is part of the parent RAID 1 or RAID 10 array. The metadata of the parent array is updated to show that the copy array is to be used as a copy. The copy array is taken offline and is no longer directly accessible. uncouple: To separate the copy array from the parent RAID 1 or RAID 10 array. The metadata of the copy array is updated to indicate that it is no longer part of the parent RAID 1 or RAID 10 array. The metadata of the parent array is updated to show that it is no longer associated with the copy array. The copy array is brought online and appears as a free resource. Copying Data from an Array: Three methods are available by which you can create a RAID Copy array from a RAID 1 or RAID 10 array. Those methods are: using the ssaraid command from the command line; using SMIT; using the ssa_make_copy command (the recommended method). Using the ssaraid Command to Create a RAID
39. independently of any system I/O activity. For example, if an SSA cable is unexpectedly disconnected, an Open Serial Link error is logged immediately. The SSA subsystem does not wait for a read or write command before it logs the error. Sometimes on the SSA network, the SSA adapter and SSA disk drives detect errors that were possibly caused by activities elsewhere on the network. Such activities might be the rebooting of another using system, a system upgrade, or maintenance. These errors do not need any service action, and should not cause any problem unless the automatic error log analysis determines that the error is critical. Because SSA subsystems are designed for high availability, most subsystem errors do not cause I/O operations to fail. Some errors, therefore, might not be obvious to the user. To ensure that the user knows about such errors, a health check is run to the adapter each hour. This health check is started by a cron table entry that instructs the run_ssa_healthcheck shell script to run once each hour. When an SSA adapter receives a health check, it logs any currently active errors and conditions that it knows exist on the SSA subsystem. Detailed Description: SSA error logs are grouped into types of errors. Each type of error is assigned to an Error Label and an Error ID. The Error Label specifies the text that appears when the error log is displayed. It also specifies the priority that is applied to each error type whe
40. pdisk3 0004AC9C00E700D free n/a 1.1GB Physical Disk F1=Help F2=Refresh F3=Cancel F7=Select F8=Image F10=Exit Enter=Do /=Find n=Find Next The disks selected must all be on the same loop. If a list of disk drives is displayed, and the list contains enough disk drives for the array you are creating, go to step. If no list is displayed, or the list does not contain enough disk drives, go to 44 for a description of how to assign disk drives as array candidates. When you have enough candidate disk drives, return to step STEVES in this section. 6. Select the disk drives that you want in the array. You must select a minimum of: two disk drives if you are creating a RAID 0 array; one primary and one secondary disk drive if you are creating a RAID 1 array; three disk drives if you are creating a RAID 5 array; two primary and two secondary disk drives if you are creating a RAID 10 array. Try to select disk drives of equal sizes. Although you can mix disk drives of various sizes, all the disk drives in a particular array are logically truncated to the size of the smallest disk drive in that array. For example, if you create an array from the four disk drives pdisk0, pdisk1, pdisk2, and pdisk3 that are shown on the screen in step, all four disk drives are assigned as 1.1 GB disk drives, because pdisk3 is a 1.1 GB disk drive. If you use disk drives of various sizes, therefore, you wast
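The two selection rules in this excerpt (a minimum member count per RAID level, and logical truncation of every member to the size of the smallest disk drive) can be worked through with a small Python sketch. The type names raid_0, raid_1, and raid_10 are assumed by analogy with raid_5, the function name check_selection is hypothetical, and the disk sizes in the usage example are invented, apart from the 1.1 GB drive that the text mentions.

```python
# Minimum number of member disk drives per array type, as listed in the text.
# RAID 1 needs one primary plus one secondary; RAID 10 needs two of each.
MIN_MEMBERS = {"raid_0": 2, "raid_1": 2, "raid_5": 3, "raid_10": 4}


def check_selection(raid_type, sizes_mb):
    """Return (size used per member, total unused space), both in MB.

    Raises ValueError if too few disk drives were selected.
    """
    minimum = MIN_MEMBERS[raid_type]
    if len(sizes_mb) < minimum:
        raise ValueError(
            f"{raid_type} needs at least {minimum} disk drives, "
            f"got {len(sizes_mb)}"
        )
    # Every member is logically truncated to the smallest member.
    used_per_member = min(sizes_mb)
    unused = sum(size - used_per_member for size in sizes_mb)
    return used_per_member, unused


if __name__ == "__main__":
    # Hypothetical sizes for pdisk0..pdisk3; only the 1.1 GB value
    # (1100 MB here) comes from the example in the text.
    print(check_selection("raid_5", [4500, 4500, 2200, 1100]))
```

With these invented sizes, each member is treated as an 1100 MB drive and 7900 MB of raw capacity goes unused, which is the waste that the text warns about when disk drives of various sizes are mixed.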
41. readv, readx, or readvx system call to start the receiving of data. The kernel mode caller issues an fp_read or fp_rwuio service call to start the receiving of data. The SSA target mode device driver then returns data that was received for the application program. Implementation Specifics: The SSA tmssa device driver provides further information about implementation specifics. The tmssa special file is part of Base Operating System (BOS) Runtime. This file is in the devices.ssa.tm.rte file set, which is in the devices.ssa.tm package. Related Information: The close subroutine, open subroutine, read or readx subroutine, and write or writex. IOCINFO (Device Information) tmssa Device Driver ioctl Operation. Purpose: To return information about the device in a structure that is defined in the /usr/include/sys/devinfo.h file. Description: This operation allows you to supply a pointer to the address of an area of type struct devinfo in the arg parameter to the IOCINFO operation. This structure is defined in the /usr/include/sys/devinfo.h file. The SCSI target mode union is used for this, as follows. Initiator Device: buf_size: Size of transmit buffer. num_bufs: Number of transmit buffers. max_transfer: Unused. Set to zero. adap_devno: Major or Minor devno of SSA adapter to be used for the next transmit operation. Use TM_GetDevinfoNodeNum to read the node number to which the data is sent
42. -l pdisk4 436537676 Failed. If the disk drive cannot perform automatic reassign operations: 1. Stop all operations to the disk drive. 2. Use the -a flag and rerun the Certify operation. If the certify operation is successful, the ssa_certify command returns no output. If the ssa_certify command is issued to a RAID 5 hdisk that contains LBAs that have been marked as unreadable, an LBA or an LBA count is printed to stdout. For example: > ssa_certify -l hdisk2 12288 failed > ssa_certify -l hdisk2 -c 4 pdisk: Specifies the physical disk drive (pdisk) that the user wants to certify. -a: Enables the reassign action on devices that cannot already perform automatic reassign operations. You must not use this flag when the ssa_certify command is running concurrently on an active using system. [Chapter 16 Using the SSA Command Line Utilities 345] -n MaxReadSize: Specifies the maximum size, in kilobytes, of each read command that is sent to the disk drive. Default size for this flag: 3072 kilobytes. Minimum value that can be specified for this flag: 64 kilobytes. Maximum value that can be specified for this flag: 10240 kilobytes. Larger values for the MaxReadSize can cause other requests to be delayed if the ssa_certify command is run concurrently on an active using system. -b StartLBA: Specifies the RAID 5 array LBA from where the certify operation should start. If this flag is not set, the certify operation starts from LBA 0. C S
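The limits given for the MaxReadSize flag (default 3072 kilobytes, minimum 64, maximum 10240) lend themselves to a simple range check in any wrapper script that builds the command line. The Python below is a sketch of that check only; the function and constant names are hypothetical and it does not call the ssa_certify command.

```python
# Limits for the MaxReadSize value, in kilobytes, as stated in the text.
DEFAULT_MAX_READ_KB = 3072
MIN_MAX_READ_KB = 64
MAX_MAX_READ_KB = 10240


def resolve_max_read_size(requested_kb=None):
    """Return the read size to use, applying the documented default.

    Raises ValueError if the requested value is outside the documented range.
    """
    if requested_kb is None:
        return DEFAULT_MAX_READ_KB
    if not MIN_MAX_READ_KB <= requested_kb <= MAX_MAX_READ_KB:
        raise ValueError(
            f"MaxReadSize must be between {MIN_MAX_READ_KB} and "
            f"{MAX_MAX_READ_KB} kilobytes, got {requested_kb}"
        )
    return requested_kb


if __name__ == "__main__":
    print(resolve_max_read_size())      # documented default
    print(resolve_max_read_size(1024))  # a smaller value, in range
```

Choosing a value toward the lower end of the range follows from the note in the text that larger values can delay other requests when the certify operation runs concurrently on an active using system.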
43. 338 power 211 removing 336 Fast Write Cache feature battery 211 bypassing the cache in a one way fast write network 217 configuring 211 dealing with problems 218 description 27 enabling or disabling Fast Write for multiple devices 215 enabling or disabling Fast Write for one disk drive 214 getting access to the Fast Write menus 213 Fast Write feature 5 Fast Write menus getting access 213 fencing 290 files adapter device driver 260 IOCINFO ioctl operation 261 277 SSA_GET_ENTRY_POINT ioctl operation 264 SSA_TRANSACTION ioctl operation 263 ssadisk SSA disk device driver 276 SSADISK_ISAL_CMD ioctl operation 280 SSADISK_ISALMgr_CMD ioctl operation 282 SSADISK_LIST_PDISKS ioctl operation 286 SSADISK_SCSI_CMD ioctl operation 284 finding the physical location of a device 409 flowchart for RAID 5 array states 35 Format Disk service aid 389 Index 499 FRU part numbers 340 full stride writes definition 272 G getting access to the Fast Write menus 213 getting access to the SSA Adapters SMIT menu 40 getting access to the SSA Disks SMIT menu 41 getting access to the SSA RAID Array SMIT menu 59 good housekeeping 233 Good state RAID 0 31 Good state RAID 10 36 Good state RAID 5 33 H hdisks and pdisks changing numbers 326 explanation of 19 reformatting a pdisk as an hdisk 387 head device driver adapter device driver interface 256 hot spare management 45 choosing how many hot spare disk drives to include in each pool 51 c
44. 5 4 12345 ssa0 _xx 10 32 12 456 234567890ABCDEF1 pdisk22 3 13 5 3 12346 ssal 961120 10 50 12 123 1234567890ABCDE7 pdisk22 7 1 5 4 12345 You can switch off the headings by using the h flag Where possible the ssa_getdump command translates the adapter UID into the adapter name for example ssa0 If the command cannot translate the adapter UID it leaves the ADAP field blank see the third line of output in the example You can limit the search to specific disk drives or adapters by adding various optional arguments to the command Attention The command uses space in the tmp file when it copies a file If the available space is not large enough the command fails Some dumps can be large Copy Mode In Copy mode the command copies data from a specified disk drive to a specified output location You must specify the disk drive and the output location Chapter 16 Using the SSA Command Line Utilities 361 Flags 362 The ssa_getdump command uses several types of flag e Required flags for both modes e Required flags for Copy mode e Optional flags for List mode e Optional flags for Copy mode Required Flags for Both Modes You must use one of these flags l Specifies that the program is to operate in List mode The program searches for dumps C Specifies that the program is to operate in Copy mode The program copies the dump if one is found from the specified location to the specified output point Required Fla
45. 7133 Model D40 or T40 reports that the ambient temperature is outside the specified limits. The SRN indicates the service procedures that must be performed. SSA_HDW_ERROR (Error ID 05F97A32): A hardware failure has occurred. Run diagnostics in Problem Determination mode to determine which FRUs to exchange for new FRUs. SSA_HDW_RECOVERED (Error ID B3FF2B19): A hardware error has occurred that has been recovered by the error recovery procedures. Run error log analysis to determine whether a FRU needs to be exchanged for a new FRU. Table 2. Error Labels (continued): Error Label, Error ID, Error Description. SSA_LINK_ERROR (Error ID ABECECFD): Link errors might be detected by any node in the SSA loop. The adapter is notified of these errors. It performs any necessary error recovery and logs the error. Link errors are normally associated with some other failure on the SSA loop. Link errors might be logged when other devices on the loop are switched on or off, or when cables or devices are disconnected during service activity. Intermittent link errors are not serious. If many link errors occur, however, one of the SSA links might be going to fail. Run error log analysis to determine whether any repair action is needed. SSA_LINK_OPEN (Error ID 625E6B9A): SSA devices are normally configured in a closed loop. The loop consists of a series of links, each link connecting two SSA devices. A device can be an adap
46. C Physical Disk Drive Move cursor onto selection then press lt Enter gt Set or Reset Identify Select this option to set or reset the Identify indicator on the disk drive gt Set or Reset Service Mode Select this option to set or reset Service Mode on the disk drive ENSURE THAT NO OTHER HOST SYSTEM IS USING THIS DISK DRIVE BEFORE SELECTING THIS OPTION F3 Cancel F1O Exit H Select Service Mode or the Identify function If the original disk drive is to remain in Service Mode you can select only the Identify function now Only one disk drive at a time can be in Service Mode The list of pdisks appears again The pdisk that is in Identify Mode is identified by a 8 D SET SERVICE MODE 802381 Move cursor onto selection then press lt Enter gt systemname pdisk0 AC50AE43 2GB SSA C Physical Disk Drive systemname pdisk1l AC706EA3 2GB SSA C Physical Disk Drive systemname pdisk2 AC1DBE11 2GB SSA C Physical Disk Drive gt systemname pdisk3 AC1IDBEF4 2GB SSA C Physical Disk Drive systemname pdisk4 AC5QAE58 2GB SSA C Physical Disk Drive systemname pdisk5 AC7C6E51 2GB SSA C Physical Disk Drive systemname pdisk6 AC7Q06E9A 2GB SSA C Physical Disk Drive systemname pdisk7 ACIDEEE2 2GB SSA C Physical Disk Drive systemname pdisk8 amp AC1DBE32 2GB SSA C Physical Disk Drive F3 Cancel F1O Exit Se S Identify other disk drives in the same way if required User s Guide and Maintenance Information Li
47. Copy array Valid values for status are Good All the array components are present and operational Offline One or more array members are missing or have failed Unknown A RAID Copy array has been created but has not been coupled to an array An hdisk cannot be created from this RAID Copy array This RAID Copy array can be only coupled to an array or deleted Parent Array The name of the array from which the data was copied Timestamp The date and time at which the copy was uncoupled from the parent array Chapter 7 Copying Data from Arrays and from Volume Groups 181 List All Uncoupled Volume Groups For fast path type smitty copy_ stunvg and press Enter Otherwise select List All Uncoupled Volume Groups from the Array Copy Services menu The following information is displayed C aN COMMAND STATUS Command OK stdout yes stderr no Before command completion additional instructions may appear below Copy Status Parent Array Timestamp fsmyvg01 hdisk7 good hdisk3 Fri May 12 13 23 49 2000 hdisk8 good hdisk4 Fri May 12 13 23 49 2000 fsmyvg02 hdisk9 good hdisk5 Fri May 12 14 11 18 2000 hdisk10 good hdisk6 Fri May 12 14 11 18 2000 Fl Help F2 Refresh F3 Cancel F6 Command F8 Image F9 Shel 1 F1O Exit Find n Find Next NS Y The columns of information displayed on the screen have the following meanings Copy The new volume group name that was created when the copy of the parent volume group was uncoupled Under
48. Disk drives 1 and 2 can communicate with the using system only through connector A1 of the SSA adapter. Disk drive number 8 can communicate with the using system only through connector A2 of the SSA adapter. Disk drives 4, 5, and 6 are isolated from the SSA adapter. [Figure 4. Simple Loop with Two Disk Drives Missing: using system with adapter connectors A1, A2, B1, and B2, and disk drives 1, 2, 4, 5, 6, and 8] One Loop with Two Adapters in One Using System: In Figure 5, the loop contains two SSA adapters that are both in the same using system. In this configuration, all the disk drives can still communicate with the using system if one SSA adapter fails. [Figure 5. One Loop with Two Adapters in One Using System: two adapters, each with connectors A1, A2, B1, and B2, and disk drives 1 through 16] [Chapter 2 Introducing SSA Loops 11] One Loop with Two Adapters in Each of Two Using Systems: If the loop contains four SSA adapters, with two adapters in each of two using systems, disk drives become isolated if they are connected between the two adapters of one using system, and both those adapters fail or are held reset but remain powered on. Bypass Note: Your SSA Disk Subsystem or SSA Disk Enclosure might contain bypass cards. Each bypass card can switch the internal
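The Figure 4 case at the start of this excerpt can be reproduced by walking the loop from each adapter connector until a gap is reached. The Python sketch below assumes that the disk drives are cabled in numeric order from connector A1 to connector A2 and that disk drives 3 and 7 are the two missing ones; both points are inferred from the figure caption and the disk numbers, not stated outright. The function name classify_disks is hypothetical.

```python
def classify_disks(loop_order, present):
    """Split the disks of one SSA loop by how they can reach the adapter.

    loop_order: disk numbers in cabling order from connector A1 to A2.
    present: set of disk numbers that are still in the loop.
    Returns (reachable via A1, reachable via A2, isolated).
    """
    order = list(loop_order)

    # Walk outward from A1 until the first missing disk breaks the path.
    via_a1 = []
    for disk in order:
        if disk not in present:
            break
        via_a1.append(disk)

    # Walk outward from A2 in the opposite direction.
    via_a2 = []
    for disk in reversed(order):
        if disk not in present:
            break
        via_a2.append(disk)

    reachable = set(via_a1) | set(via_a2)
    isolated = [d for d in order if d in present and d not in reachable]
    return via_a1, via_a2, isolated


if __name__ == "__main__":
    # Figure 4 as assumed here: eight positions, disks 3 and 7 missing.
    print(classify_disks(range(1, 9), {1, 2, 4, 5, 6, 8}))
```

Under those assumptions the result matches the text: disk drives 1 and 2 are reached through A1, disk drive 8 through A2, and disk drives 4, 5, and 6 are isolated. With no disk drive missing, every disk drive appears in both lists, which reflects the two paths that a closed loop provides.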
49. Disks: Disk drives to which the array data is being copied.
Copy State: The operational state of the array copy:
  Not Copying: No copy is being created for this array.
  Good: The coupled disk drives contain an exact copy of the data that is on the array. The copy must be in the Good state before it can be uncoupled from the array.
  Copying: Data is being copied to the coupled disk drives, but these coupled disk drives do not yet contain an exact copy of the data that is on the array.
  Degraded: A copy has been created, but one or more coupled disk drives are missing or have failed. If missing disk drives are replaced or exchanged for new disk drives, the copy operation continues.
List All Uncoupled Copies
For fast path, type smitty copy_lstcopies and press Enter. Otherwise, select List All Uncoupled Copies from the Array Copy Services menu. The following information is displayed:
  COMMAND STATUS
  Command: OK    stdout: yes    stderr: no
  Before command completion, additional instructions may appear below.
  Copy     Status   Parent Array   Timestamp
  hdisk9   good     hdisk2         Wed May 10 15:27:21 2000
  F1=Help  F2=Refresh  F3=Cancel  F6=Command  F8=Image  F9=Shell  F10=Exit  /=Find  n=Find Next
The columns of information displayed on the screen have the following meanings:
Copy: The name of the RAID Copy array.
Status: The status of the RAID
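A script that reads the List All Uncoupled Copies output might split each data row into the four documented columns (Copy, Status, Parent Array, Timestamp). The following is a minimal sketch under stated assumptions: the row is whitespace separated, the timestamp has the form "Wed May 10 15:27:21 2000" (the colons are assumed, because they are not visible in every rendering of the screen), and the accepted status values are the three listed for a RAID Copy array. The names UncoupledCopy and parse_copy_row are hypothetical.

```python
from datetime import datetime
from typing import NamedTuple

# Status values documented for a RAID Copy array.
VALID_STATUS = {"good", "offline", "unknown"}


class UncoupledCopy(NamedTuple):
    copy: str            # name of the RAID Copy array
    status: str          # good, offline, or unknown
    parent_array: str    # array from which the data was copied
    timestamp: datetime  # when the copy was uncoupled


def parse_copy_row(row: str) -> UncoupledCopy:
    """Parse one data row of the listing into its four columns."""
    fields = row.split(None, 3)  # the timestamp keeps its inner spaces
    if len(fields) != 4:
        raise ValueError(f"expected 4 columns, got {len(fields)}: {row!r}")
    copy, status, parent, stamp = fields
    status = status.lower()
    if status not in VALID_STATUS:
        raise ValueError(f"unexpected status {status!r}")
    # Assumed layout: weekday, month, day, HH:MM:SS, year (English names).
    when = datetime.strptime(stamp.strip(), "%a %b %d %H:%M:%S %Y")
    return UncoupledCopy(copy, status, parent, when)


entry = parse_copy_row("hdisk9 good hdisk2 Wed May 10 15:27:21 2000")
print(entry.copy, entry.parent_array, entry.timestamp.isoformat())
```

Keeping the status check strict means that a changed or unexpected screen layout fails loudly instead of producing a silently wrong record.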
50. Does any SSA disk drive have its Check light on?
NO   The disk drive might have been removed from the subsystem.
     a. Reinstall the removed drive, or select a new disk drive for addition to the array.
     b. Type smitty ssaraid and press Enter.
     c. Select Change Member Disks in an SSA RAID Array.
     d. Select Swap Members of an SSA RAID Array.
     e. Select the degraded hdisk.
     f. Referring to the displayed instructions, exchange the failed member for a new disk drive. The Disk to Remove is listed as BlankReserved; the Disk to Add is the disk drive that you reinstalled or selected in the earlier step. When failed disk drives have been exchanged for new disk drives, the data is rebuilt and the array changes to the Good state.
     g. Go to MAP 2410: SSA Repair Verification to verify the repair.
YES  a. Exchange the failed disk drive for a new one.
     b. Type smitty ssaraid and press Enter.
     c. Select Change Member Disks in an SSA RAID Array.
     d. Select Swap Members of an SSA RAID Array.
     e. Select the degraded hdisk.
     f. Referring to the displayed instructions, exchange the failed member for a new disk drive. The Disk to Remove is listed as BlankReserved; the Disk to Add is the disk drive that you reinstalled or selected in the earlier step. When failed disk drives have been exchanged for new disk drives, the data is rebuilt and the array changes to the Good state.
     g. Go to MAP 2410: SSA Repair Verification to verify the repair.
Chapter 18 SSA Problem Determination Procedures
51. Each disk drive that is coupled to the array must be at least this size. If larger disk drives are selected, the excess space remains unused.
Verify copy during creation: If you select yes, all data that is written to the copy is verified before the write operation completes. This action reduces the possibility that unrecoverable media errors are found when the copy is uncoupled and read, but increases the time required to perform the copy operation.
Hot spare selection:
  Default: Assigns the coupled disk drives to the pool to which those disk drives were previously assigned.
  Primary: Each coupled disk drive is assigned to the hot spare pool to which the Primary disk drive that it is copying is assigned.
  Secondary: Each coupled disk drive is assigned to the hot spare pool to which the Secondary disk drive that it is copying is assigned.
  To get more control over this process, use the ssaraid command directly from the command line or from inside a shell script.
RAID copy to be coupled: The name of the existing RAID copy to be used. Press the List key to list the uncoupled copies that you can use to create the new copy. Leave this field blank if you are going to select Disk drives to be coupled.
Disk drives to be coupled: A list of disk drives to be used for the copy. Press the List key to list the free disk drives that you can use to create the copy. The number of disk d
52. Enter AFTER making all desired changes.
                                          [Entry Fields]
  SSA RAID Manager                        ssa0
  SSA RAID Array                          hdisk3
  Connection Address / Array Name         09523173A02137K
  Disk To Remove
  Disk To Add
  F1=Help  F2=Refresh  F3=Cancel  F4=List  F5=Reset  F6=Command  F7=Edit  F8=Image  F9=Shell  F10=Exit  Enter=Do
Select Disk to Remove and press the List key. From the displayed list, select the disk drive that you want to remove and press Enter.
Note: If the disk drive that you are replacing has been removed, or the array has rejected it, it is listed as BlankReserved.
Select Disk to Add and press the List key. From the displayed list, select the name of the disk drive that you want to add and press Enter.
Press Enter to perform the swap operation.
Chapter 6 Using the RAID Array Configurator
Changing or Showing the Use of an SSA Disk Drive
This option allows you to change or see how particular disk drives are used.
1. For fast path, type smitty chgssadisk and press Enter. Otherwise, select Change/Show Use of an SSA Physical Disk from the SSA RAID Arrays menu.
2. A list of adapters is displayed in a window:
  List/Identify SSA Physical Disks
  Move cursor to desired item and press Enter.
    List Disks in an SSA RAID Array
    List Hot Spares
    List Rejected Array Disks
    List Array Candidate Disks
    List System Disks
    Identify Disks in an SSA RAID Array
    Identify Hot Spares
    Identify Rejected Array Disks
    Identify Array Candidate Disks
    Identify
53. F2=Refresh  F3=Cancel  F4=List  F5=Reset  F6=Command  F7=Edit  F8=Image  F9=Shell  F10=Exit  Enter=Do
If you want to enable the fast write function for the selected disk drives, set the Enable Fast Write option to yes for those disk drives. The state of the Force Delete option is ignored.
If you want to disable the fast write function for the selected disk drives, set the Enable Fast Write option to no and the Force Delete option to no for those disk drives.
Chapter 10 Using the Fast Write Cache Feature
Notes:
a. If you are running a two-way fast write operation and you enable or disable the fast write function, the hdisk on the second using system becomes unavailable. From the second using system, delete that hdisk and reconfigure as follows:
   1. Type rmdev -l hdiskname -d
   2. Run cfgmgr to reconfigure the new hdisk.
b. The fast write function is disabled only if no data for your selected devices is present in the fast write cache. If data for your selected devices is present in the fast write cache and you want to disable the fast write function, go to step 3.
3. If data for your selected devices is present in the fast write cache and you want to disable the fast write function, set the Enable Fast Write option to no and the Force Delete option to yes. The Force Delete screen is displayed:
  Enable/Disable Fast Write for Multiple Devices
  Force Delete
  Setting Force Delete to yes will allow the system to disable F
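The three cases described for the Enable Fast Write and Force Delete options can be summarized as a small decision function. This is a sketch of the documented rules only; it does not drive SMIT, and the function name fast_write_options and its parameters are hypothetical.

```python
def fast_write_options(enable, data_in_cache=False):
    """Return the SMIT option values to choose for the selected disk drives.

    enable        -- True to enable fast write, False to disable it
    data_in_cache -- True if data for the devices is still in the cache
    """
    if enable:
        # When enabling, the state of Force Delete is ignored.
        return {"Enable Fast Write": "yes"}
    # When disabling, Force Delete is needed only if cached data remains.
    return {
        "Enable Fast Write": "no",
        "Force Delete": "yes" if data_in_cache else "no",
    }


print(fast_write_options(enable=True))
print(fast_write_options(enable=False))
print(fast_write_options(enable=False, data_in_cache=True))
```

Setting Force Delete to yes discards the need for an empty cache, so a script built on this table would normally ask for confirmation before choosing the third case.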
54. FRUs Some of those FRUs have Power lights for example disk drives and fan and power supply assemblies If none of the FRUs that you exchanged has a Power light go to step Bl if some or all of the FRUs that you exchanged have Power lights check whether all those Power lights are on Do the FRUs that you have exchanged have their Power lights on NO a Exchange for a new one the FRU whose Power light is off b Go to step BL YES Go to step Bl 3 from step bh Are all Check lights off Note For FRUs that do not have a Check light answer YES NO Go to the START MAP for the enclosure in which the device that has its Check light on is installed YES a Run diagnostics in System Verification mode to the device that reported the problem Chapter 18 SSA Problem Determination Procedures 475 Notes 1 Do not run Advanced Diagnostics otherwise errors might be logged on other using systems that share the same loop 2 If you have just exchanged a disk drive or an SSA adapter you might need to run cfgmgr to restore the device to the system configuration If the original problem was not reported by a device run diagnostics to each SSA adapter in the using system b Go to step B 4 from step Bh Do you still have the same SRN NO Go to step B YES Go to step a 5 from step Blin MAP 2010 START and step Hlin this MAP Do you have any other SRN NO Go to step ee YES Go to 6 from step n Have you ha
55. Functions 3 ahh tt 0 a es OF Getting Access to the SSA RAID Array SMIT Menu a J Cals Blaeed 2 2 9S Listing All Defined SSA RAID Arrays 2 we ee 100 Listing All Supported SSA RAID Arrays 2 101 Listing All SSA RAID Arrays That Are Connected to a RAID Manager wistroee d 102 Listing the Status of All Defined SSA RAID Arrays 104 Listing or Identifying SSA Physical Disk Drives 108 Listing or Deleting Old RAID Arrays Recorded in an SSA RAID Manager a 130 Changing or Showing the Attributes of an SSA RAID Array 185 Changing Member Disks in an SSA RAID Array 187 Changing or Showing the Use of an SSA Disk Drive 144 Changing the Use of Multiple SSA Physical Disks 147 Copying RAID 1 or RAID 10 Arrays 2 2 wee eee 148 Chapter 7 Copying Data from Arrays and from Volume Groups 149 Copying Data from anArray 151 Using the ssaraid Command to Create a RAID Copy Array from a a RAID 1 or RAID 10 Array 151 Using SMIT to Create a RAID Copy Array from a RAID 1 or r RAID 10 Array 155 Using the ssa_make_copy Command to Create a RAID Copy from a RAID 1 or RAID 10 Array 2 a a a e aa L e a a a a e ah el a 189 ssa_make_copy Command 2 2 2 2 161 PurpoSe os iy oe an Ai ae OR ee a a oe es ee ge 2 T61 DYNAN es cs AL Ba Be eat ae ot 8 ey Se et Bas Elen 2S fe wale STG Descript
56. Guide for the using system.
2. Place the adapter card, with its components downward, onto a nonconducting surface.
3. Refer to Figure 46.
   [Figure 46. Removing the Mounting Screw from the Fast Write Cache Option Card]
4. Remove the mounting screw. The screw fastens the Fast Write Cache Option Card in position.
5. Refer to Figure 47.
   [Figure 47. Removing the Fast Write Cache Option Card]
6. Invert the adapter card so that its components are upward.
7. Carefully unplug the Fast Write Cache Option Card from the connector.
Installing the Fast Write Cache Option Card of an Advanced SerialRAID Adapter
Attention: The adapter card contains parts that are electrostatic-discharge (ESD) sensitive. Use the tools and procedures defined by your organization to protect such parts.
1. Remove the adapter from the using system, if not already removed (see the Installation and Service Guide for the using system).
2. Place the adapter card, with its components upward, onto a nonconducting surface.
3. Refer to Figure 48.
57. write subroutine
Support for the write entry point is provided only for the initiator-mode device driver. The write entry point generates one write operation in response to a calling program's write request. If the device is opened with the O_NDELAY flag set, and the write request is for a length that is greater than the total buffer size of the device, the write request fails and the errno global variable is set to EINVAL. The total buffer size for the device is determined by multiplying the value of the XmitBufferSize attribute by the value of the XmitBuffers attribute. These values are in the configuration database.
Support for data gathering is through the user-mode writev or writevx subroutine, or through the kernel-mode fp_rwuio service call. The write buffers are gathered so that they are transferred, in sequence, as one write operation. The returned errno global variable is set to EFAULT if an error occurs while the caller's data is being copied to the device buffers.
If the write operation is unsuccessful, the return value is set to -1 and the errno global variable is set to the value of the return value from the device driver. If the return value is other than -1, the write operation was successful and the return value indicates the number of bytes that were written. The caller should validate the number of bytes that are sent to check for any errors. Because the whole data transfer length is sent in a single wri
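The buffer-size rule for nonblocking writes is simple arithmetic, and a small model can make it concrete. The sketch below only mirrors the rule as described (total buffer size equals XmitBufferSize multiplied by XmitBuffers, and an O_NDELAY write longer than that fails with EINVAL); it does not call the device driver. The function names and the attribute values in the example are hypothetical.

```python
import errno


def total_buffer_size(xmit_buffer_size, xmit_buffers):
    """Total device buffer size in bytes: XmitBufferSize * XmitBuffers."""
    return xmit_buffer_size * xmit_buffers


def check_write_request(length, xmit_buffer_size, xmit_buffers, o_ndelay):
    """Return 0 if the request passes the size rule, else errno.EINVAL.

    Only the documented O_NDELAY size check is modeled here.
    """
    limit = total_buffer_size(xmit_buffer_size, xmit_buffers)
    if o_ndelay and length > limit:
        return errno.EINVAL
    return 0


# Hypothetical attribute values: 16 buffers of 4096 bytes each.
limit = total_buffer_size(4096, 16)
print(limit)
print(check_write_request(limit, 4096, 16, o_ndelay=True))
print(check_write_request(limit + 1, 4096, 16, o_ndelay=True) == errno.EINVAL)
```

A caller that needs to send more than the total buffer size with O_NDELAY set would have to split the data into several write requests, each within the limit.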
58. 9. The following information is displayed:
  Remove a Disk from an SSA RAID Array
  Type or select values in entry fields.
  Press Enter AFTER making all desired changes.
                                          [Entry Fields]
  SSA RAID Manager                        ssa0
  SSA RAID Array                          hdisk3
  Connection Address / Array Name         095231779F0737K
  Disk to Remove
  F1=Help  F2=Refresh  F3=Cancel  F4=List  F5=Reset  F6=Command  F7=Edit  F8=Image  F9=Shell  F10=Exit  Enter=Do
10. Press F4 to list the disk drives. A list of disk drives is displayed.
11. From the displayed list, select the disk drive that you want to remove.
12. Press Enter to remove the disk drive from the array.
13. If the Check light of the disk drive that you are removing is off, use the Set Service service aid to put that disk drive into Service Mode. If the Check light of the disk drive that you are removing is on, you do not need to select Service Mode before you remove that disk drive.
14. Physically remove the disk drive. See the service information for the device that contains the disk drive, then return to here.
15. Physically install the replacement disk drive. See the service information for the device that contains the disk drive, then return to here.
16. If the disk drive is in Service Mode, reset Service Mode. See "Set Service Mode", then return to here.
59. IOCINFO (Device Information) SSA Disk Device Driver ioctl Operation
Purpose
To return a structure that is defined in the /usr/include/sys/devinfo.h file.
Description
The IOCINFO operation returns a structure that is defined in the /usr/include/sys/devinfo.h header file. The caller supplies the address of an area of type struct devinfo in the arg parameter to the IOCINFO operation. The device-type field for this component is DD_SCDISK; the subtype is DS_PV. The information that is returned includes the block size in bytes and the total number of blocks on the disk drive.
Files
/dev/pdisk0, /dev/pdisk1, ..., /dev/pdiskn: Provide an interface that allows SSA device drivers to have access to SSA physical disk drives.
/dev/hdisk0, /dev/hdisk1, ..., /dev/hdiskn: Provide an interface that allows SSA device drivers to have access to SSA logical disk drives.
Chapter 13 Using the Programming Interface
SSADISK_ISAL_CMD (ISAL Command) SSA Disk Device Driver ioctl Operation
Purpose
To provide a method of sending Independent Network Storage Access Language (ISAL) commands to an SSA physical or logical disk drive.
Description
ISAL consists of a set of commands that allow a program to control and access a storage device. The ISAL command set is described in the Technical Reference for the adapter. The SSADISK_ISAL_CMD operation allows the caller to issue an ISAL command to a selected logical or physical disk drive.
60. If you select either the volume group name or any hdisk in that volume group, all the disk drives in the volume group are selected. The Action menu is displayed:
  Delete a Volume Group (Logical Volumes or Filesystems) Copy
  Type or select values in entry fields.
  Press Enter AFTER making all desired changes.
                                          [Entry Fields]
  Volume Group                            vg00
  Action                                  Delete and Detach
  Force                                   no
  F1=Help  F2=Refresh  F3=Cancel  F4=List  F5=Reset  F6=Command  F7=Edit  F8=Image  F9=Shell  F10=Exit  Enter=Do
The meanings of the fields are:
Volume Group: The volume group that you selected previously.
Action: The possible actions are:
  Delete and Detach: The volume group name is removed. Data on the volume group is no longer accessible. The RAID Copy arrays change to free and can be recoupled to parent arrays.
  Delete: The volume group name is removed, and pdisks that are in each array in the volume group change to free disk drives. Data on the volume group is no longer accessible.
  Delete and Recouple: The volume group name is removed. Each array that is in the volume group is recoupled to its original parent.
Force: Yes or No. If the specified volume group is varied off, yes forces the script to export the volume group. If the specified volume group is varied on and file systems are mounted, yes attempts to unmount the file systems.
61. On each using system to which this replacement disk drive is connected, you must now remove from the system configuration the reference to the pdisk that you have just removed.
Attention: If the disk drive is connected to more than one using system, the pdisk and hdisk numbers might be different on each system. If you are not sure what the pdisk and hdisk numbers are on either system, give the command:
  odmget -q "connwhere like *NNNNNNNN*" CuDv
where NNNNNNNN is the serial number on the front of the removed disk drive. The hdisk and pdisk configuration data for the disk drive serial number is displayed.
To remove the reference to the pdisk that you have just removed, type:
  rmdev -l pdisknumber -d
where pdisknumber is the pdisk number of the disk drive that you have just removed.
If the pdisk that you have just removed did not belong to a RAID array, you must also remove from the system configuration the reference to the hdisk. Type:
  rmdev -l hdisknumber -d
where hdisknumber is the hdisk number of the disk drive that you have just removed.
If you installed the disk drive under concurrent maintenance, give the cfgmgr command on each using system to which that disk drive is connected. The command configures the disk drive.
If you installed the disk drive while the using system was switched off, switch on the using system when you are ready to do so. When you switch on the using system, the disk drive is automatically configured.
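Because the same cleanup must be repeated on every using system, it can help to generate the command list for each system rather than type it by hand. The sketch below only builds argument lists; it does not run anything. It assumes the usual AIX flag spellings (odmget -q, rmdev -l and -d), which the flattened text of this section does not show in full, and the helper names are hypothetical. The device names in the example are placeholders.

```python
def odm_query_command(serial):
    """Command that looks up the hdisk and pdisk entries for a serial number."""
    # The wildcard form of the query is an assumption about the intended syntax.
    return ["odmget", "-q", f"connwhere like *{serial}*", "CuDv"]


def remove_device_command(name):
    """Command that deletes a device definition from the configuration."""
    return ["rmdev", "-l", name, "-d"]


def cleanup_plan(pdisk, hdisk=None, in_raid_array=False, concurrent=True):
    """Ordered commands for one using system after a disk drive exchange."""
    plan = [remove_device_command(pdisk)]
    # The hdisk reference is removed only if the pdisk was not in a RAID array.
    if not in_raid_array and hdisk is not None:
        plan.append(remove_device_command(hdisk))
    # Under concurrent maintenance, cfgmgr configures the replacement drive.
    if concurrent:
        plan.append(["cfgmgr"])
    return plan


for command in cleanup_plan("pdisk3", hdisk="hdisk5"):
    print(" ".join(command))
```

Returning argument lists instead of shell strings keeps the quoting of the odmget query unambiguous if the commands are later passed to a process launcher.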
62. Purpose Sa oh N a te ged oe a Se oat os eee Ss Description Return Values Files SSA Adapter Device Driver Direct Call Entry Point Purpose dio ts nce Sey yagn Ecce Bd oY 2 Description Return Values ssadisk SSA Disk Device Driver Purpose Syntax Configuration Issues Contents 242 242 243 243 244 244 248 249 249 251 252 252 253 255 255 255 256 256 257 257 257 257 257 258 259 259 260 261 261 261 261 262 262 262 263 263 264 264 264 264 264 265 265 265 265 266 266 266 266 vii viii Device Attributes Device Dependent Subroutines Error Conditions Special Files te en ee a ee IOCINFO Device Information SSA Disk Device Driver ioctl Operation Purpose bide tie ine i oe OS A aT cee a E Description Files SSADISK_ISAL_ CMD ISAL Command SSA Disk Device Driver ioct Operation Purpose amp FM ds She ete ah ak eu ea a Description Return Values Files SSADISK_ISALMgr_ CMD sat Manager Commana SSA Disk Device Driver ioctl Operation ha ae ee ots Bee as ee de oD Gh a A et amp Purpose Description Return Values Files SSADISK_SCSI CMD SCSI Command SSA Disk Device Driver ioctl Operation Purpose GaSe a Ee Ree ig Ee og hig ae ae hs Ge OB OE ey BR Description Return Values Files SSADISK_LIST_ PDISKS SSA Disk Device Driver ioc
63. RAID Arrays 100 List All SSA RAID Arrays Connected to a RAID Manager 102 List All Supported SSA RAID Arrays 101 List Array Candidate Disks 115 List Components in a Hot Spare Pool 80 List Disks in an SSA RAID Array 109 List Hot Spares 111 List Rejected Array Disks 113 List Status of All Defined SSA RAID Arrays 104 List Status of Hot Spare Pools 74 List Status of Hot Spare Protection for an SSA RAID Array 77 List System Disks 117 List Identify SSA Physical Disks 108 Remove a Disk from an SSA RAID Array 138 SSA Logical Disks 213 Swap Members in an SSA RAID Array 142 SMIT menu for SSA adapters getting access 40 SMIT menu for SSA disks getting access 41 SMIT menu getting access 59 90 98 SMIT menu using 38 SMIT menus for 3 way copy operations 172 SMITTY or SMIT commands ssadlog 213 ssaraid 59 90 98 software and microcode errors 441 solving hot spare pool problems 53 solving problems with SSA links 400 examples broken loop cable removed 403 broken loop disk drive removed 406 normal loops 401 Spare Tool using 209 special files tmssa 304 description 304 implementation specifics 304 purpose 304 special files disk device driver 276 split site management 193 configuration of RAID 1 and RAID 10 arrays 193 operation after a loss of member disks 194 split site management continued adapter not known to remaining half of the array 203 one half of the array is not present 195 split and join procedure not performed correctly 2
64. Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image eee F1Q Exit Enter Do For the meanings of the fields see page led 64 User s Guide and Maintenance Information If you select RAID 10 the following menu is displayed a Add an SSA RAID Array gt Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields SSA RAID Manager ssa0 RAID Array Type raid_10 Primary Disks Secondary Disks Strip Size KB 16 Split Array Resolution Primary Enable Use of Hot Spares yes Choose Hot Spare only from Preferred Pool no Allow Hot Spare Splits no Allow Page Splits yes a Initial Rebuild no Enable Fast Write no Fl Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image CEE F10 Exit Enter Do 3 For the meanings of the fields see page ka Chapter 6 Using the RAID Array Configurator 65 66 Meanings of the Fields SSA RAID Manager The name of an SSA RAID Manager SSA RAID Managers are devices that control SSA RAID arrays RAID Array Type The type of the SSA RAID array Member Disks For a RAID 0O or a RAID 5 array member disks are the disk drives that are to be added to the SSA RAID array The array must consist of disk drives that are in the same loop Primary Disk The primary disk drive of a RAID 1 array A RAID 1 array must consist of two disk drives one primary and one secondary that are in the same loop The data that is contai
65. SSA Physical Disks Move cursor to desired item and press Enter List Disks in an SSA RAID Array List Hot Spares List Rejected Array Disks List Array Candidate Disks List System Disks Identify Disks in an SSA RAID Array Identify Hot Spares Identify Rejected Array Disks Identify Array Candidate Disks Identify System Disks SSA RAID Manager Move cursor to desired item and press Enter ssaQ Available 00 04 IBM SSA 160 SerialRAID Adapter 14109100 F1 Help F2 Refresh F3 Cancel F8 Image F1Q Exit Enter Do Find n Find Next Select the adapter whose system disk drives you want to list Chapter 6 Using the RAID Array Configurator 117 3 A list of system disk drives is displayed Pee COMMAND STATUS Command OK stdout yes stderr no Before command completion additional instructions may appear below pdisk3 0004AC5119EQ00D system 1 1G Physical disk pdisk5 O8005AEAQ30D00D system 2 3G Physical disk F1 Help F2 Refresh F3 Cancel F6 Command F8 Image F9 Shel F1O Exit Find n Find Next Ne 118 User s Guide and Maintenance Information Identifying the Disk Drives in an SSA RAID Array This option allows you to identify the disk drives that are contained in a particular array 1 For fast path type smitty issaraid and press Enter Otherwise a Select List Identify SSA Physical Disks from the SSA RAID Arrays menu b Select Identify Disks in an SSA RAID Array 2 A list of arrays is displayed in a win
66. SSA RAID Arrays menu A list of disk drives and their usage is displayed in a window fe A SSA RAID Arrays Move cursor to desired item and press Enter List All Defined SSA RAID Arrays SSA Physical Disk Move cursor to desired item and press Enter Use arrow keys to scroll SSA physical disks which are members of arrays pdiskO 00022123DFHCOOD member n a 4 5G Physical d pdiskl 0004AC5119EQ00D member n a 1G Physical d pdisk2 0004AC5119EQ00D member n a 1G Physical d pdisk3 Q8005AEA003500D member n a 4 5G Physical d pdisk4 Q8005AEAQ30D00D member n a 2 3G Physical d pdisk5 Q8005AEAQ80100D member n a 4 5G Physical d pdisk7 08005AEA087A00D member n a 4 5G Physical d SSA physical disks which are hot spares pdisk6 08005AEA080800D spare n a 4 5G Physical d F1l Help F2 Refresh F3 Cance F8 Image F1O Exit Enter Do Find n Find Next N sf 2 Using the arrow keys scroll the information until you find the list of SSA physical disks that are not used 72 Users Guide and Maintenance Information 3 Select the disk drive that you want to designate as a hot spare The following screen is displayed for the disk drive that you have chosen SSA RAID Manager SSA physical disk CONNECTION address Current use Type or select values in entry fields Press Enter AFTER making all desired changes Change Show Attributes of an SSA Physical Disk Entry Fields ssa0 pdisk6 08005AEAQ80800D Hot Spare Disk F1
67. SSA device drivers to access logical SSA disks.
SSADISK_SCSI_CMD (SCSI Command) SSA Disk Device Driver ioctl Operation
Purpose
To provide a method of sending Serial Storage Architecture Small Computer Systems Interface (SSA-SCSI) commands to an SSA physical disk drive that has been opened with the SSADISK_SCSIMODE extension flag.
Description
The SSADISK_SCSI_CMD operation allows the caller to issue an SSA-SCSI command to a selected physical disk. The caller must be root, or have an effective user ID of root, to issue this ioctl. The arg parameter for the SSADISK_SCSI_CMD operation is the address of an ssadisk_ioctl_parms structure. This structure is defined in the /usr/include/sys/ssadisk.h file.
The SSADISK_SCSI_CMD operation uses the following fields of the ssadisk_ioctl_parms structure:
dsb: Contains the directive status byte that is returned for the command. The byte contains a value from the /usr/include/ipn/ipndef.h file. A non-zero value indicates an error. The structure also returns the IPN result word for the command; the word contains values from the /usr/include/ipn/ipntra.h file, and a non-zero value indicates an error.
u0.scsi.data_descriptor: Set by the caller to describe the buffer for any data that is transferred by the SCSI command. If no data is transferred, the length of the buffer should be set to 0.
u0.scsi.direction: Set by the caller to indicate the
68. SSA disk drives Attention Formatting a disk drive destroys all the data on that disk drive Use this procedure only when instructed to do so by the service procedures To use the Format Disk service aid 1 Select Format Disk from the SSA Service Aids menu see EStarting the SSA A list of pdisks is displayed FORMAT DISK systemname systemname pdisk8 systemname systemname systemname systemname systemname systemname systemname pdisk11 pdisk2 pdisk3 pdisk7 pdisk12 pdisk0 pdisk1 pdisk10 Move cursor onto selection then press lt Enter gt AC50AE43 AC706EA3 AC1DBE11 AC1DBEF4 AC50AE58 AC7C6E51 AC706E9A AC1DEEE2 AC1DBE32 802395 9 1GB SSA C Physical Disk Drive 9 1GB SSA C Physical Disk Drive 4GB SSA C Physical Disk Drive 4GB SSA C Physical Disk Drive 9 1GB SSA C Physical Disk Drive 9 1GB SSA C Physical Disk Drive 4GB SSA C 4GB SSA C 4GB SSA C Physical Disk Drive Physical Disk Drive Physical Disk Drive F3 Cancel F1O Exit Chapter 17 SSA Service Aids 389 2 Select the pdisk that you want to format The following instructions are displayed ee FORMAT DISK 802396 systemname pdisk2 AC1DBE11 4GB SSA C Physical Disk Drive Move cursor onto selection then press lt Enter gt Set or Reset Identify Select this option to set or reset the Identify indicator on the disk drive Format Select this option only if you are sure that you have selected th
69. Speed Service Aid ise dk de Peet ie Sak ac 4896 Service Aid Service Request Numbers SRNs TERR Dogon S ee non 400 Using the Service Aids for SSA Link Problem Determination WH ee ce ve 4 400 Example 1 Normal Loops fe See Ce nen age ke a oe AOT Example 2 Broken Loop Cable Removed 2 ge hoe 4h heh gs ak 45 4 2408 Example 3 Broken Loop Disk Drive Removed 406 Finding the Physical Location of a Device fe we oe ae a 409 Finding the Device When Service Aids Are Available si Ge te he amp ee a AOD Finding the Device When No Service Aids Are Available 409 Chapter 18 SSA Problem Determination Procedures 411 Service Request Numbers SRNs 1 eee ee eee AN The SRN Table 2 2 we ee ee 4 Using the SRN Table 2 eee ee ee 4A Software and Microcode Errors A aris Boa ate Ee oe OE Oo ae aAA SSA Loop Configurations that Are Not Valid ge Sr Set wks Bo we eon a a a aa SSA Maintenance Analysis Procedures MAPS 448 Howto Use the MAPS 1 ee ee ee 448 MAP 2010 START 2 2 eee eee 444 MAP 2320 SSA Link be dite oho m whee Gey oe Wedd MAP 2323 SSA Intermittent Link Error Skok de da he a Bl oe ano te a ADO MAP 2324 SSA RAID ide Se eiga oe ae aac Se Se a OE A eo ond 4 MAP 2410 SSA Repair Verification he fie is a ty Se at oe a ee Se ee TD SSA Link Errors D a u fe e Bs n a wha ak ae AS SSA Link E
70. Subsystem D n i SSA rive Pas A Connector Card Cable SSA Subsystem lt SSA Internal Disk Connector Connection Drive Card Figure 61 Seven Part Link in Two Subsystems Example 4 Eigure 62 the link is between a disk drive and an SSA adapter It has five parts SSA Subsystem gt Disk Internal SSA Cable PEE Drive Connection Connector Card Figure 62 Five Part Link between Disk Drive and Adapter Example 5 In Eigure 63 the link is between two SSA adapters It has five parts Note that it has fiber optic cables and optical connectors instead of normal SSA cables Fiber Optic Cables Adapter Optical Connector Optical Connector Figure 63 Five Part Link between Two Adapters 480 User s Guide and Maintenance Information Adapter Link Status Ready Lights If a fault occurs that prevents the operation of a particular link the link status lights of the various parts of the complete link show that the error has occurred You can find the failing link by looking for the flashing green status light at each end of the affected link Some configurations might have other indicators along the link for example SSA connector cards to help with FRU isolation The meanings of the disk drive and adapter lights are summarized here Sta
71. Target Device
  buf_size: Size of receive buffer.
  num_bufs: Number of receive buffers.
  max_transfer: Unused. Set to zero.
  adap_devno: Major or minor devno of the SSA adapter initially used by the paired initiator-mode device.
Use TM_GetDevinfoNodeNum to read the node number from which the data is received. The remainder of the structure is filled as follows:
  devtype: DD_TMSCSI
  flags: Set to zero.
  devsubtype: DS_TM
TMIOSTAT (Status) tmssa Device Driver ioctl Operation
Purpose
To allow the caller to put the status information for the current or previous write operation into a structure that is defined in the /usr/include/sys/tmscsi.h file.
Description
This operation returns information about the last write operation. Because a nonblocking write operation might still be running, you must ensure that the status information applies to a particular write operation. The tm_get_stat structure in the /usr/include/sys/tmscsi.h file is used to indicate the status as follows:
  status_validity: Bit 0 set means that scsi_status is valid.
  scsi_status:
    SC_BUSY_STATUS: Write operation in progress.
    SC_GOOD_STATUS: Write operation completed successfully.
    SC_CHECK_CONDITION: Write operation failed.
  general_card_status: Unused. Set to zero.
  b_error: errno for a failed write operation, or zero.
  b_resid: Updated uio_resid for the write operation.
  resvd1: Un
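The way a caller might interpret the tm_get_stat fields can be sketched as follows. This is a model of the documented meanings only and does not issue the TMIOSTAT ioctl. The numeric values given to the three status names are assumptions (they follow the standard SCSI status byte codes and are not taken from this guide), and the function name describe_write_status is hypothetical.

```python
# Assumed numeric values, following the standard SCSI status byte codes.
SC_GOOD_STATUS = 0x00
SC_CHECK_CONDITION = 0x02
SC_BUSY_STATUS = 0x08

SCSI_STATUS_VALID = 0x01  # bit 0 of status_validity


def describe_write_status(status_validity, scsi_status, b_error=0):
    """Translate tm_get_stat fields into a short description."""
    if not status_validity & SCSI_STATUS_VALID:
        return "scsi_status is not valid"
    if scsi_status == SC_BUSY_STATUS:
        return "write operation in progress"
    if scsi_status == SC_GOOD_STATUS:
        return "write operation completed successfully"
    if scsi_status == SC_CHECK_CONDITION:
        # b_error carries the errno for a failed write operation.
        return f"write operation failed (errno {b_error})"
    return "unrecognized scsi_status"


print(describe_write_status(0x01, SC_BUSY_STATUS))
print(describe_write_status(0x01, SC_CHECK_CONDITION, b_error=5))
```

Because a nonblocking write might still be running, a caller would typically repeat the status request while the result is "in progress" before it trusts b_error or b_resid.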
72. The caller must be root, or have an effective user ID of root, to issue this ioctl. The following ISAL commands (minor function codes), which are defined in the /usr/include/ipn/ipnsal.h file, can be issued:
  FN_ISAL_Read
  FN_ISAL_Write
  FN_ISAL_Format
  FN_ISAL_Progress
  FN_ISAL_Lock
  FN_ISAL_Unlock
  FN_ISAL_Test
  FN_ISAL_SCSI
  FN_ISAL_Download
  FN_ISAL_Fence
Notes:
1. Some of these commands are not valid for SSA hdisks but are valid for SSA pdisks; others are valid for SSA hdisks but are not valid for SSA pdisks. The adapter card, not the device driver, checks whether the commands are valid. If the caller attempts to send a command to a device for which that command is not valid, the adapter returns a non-zero result. The exception to this procedure occurs when any attempt is made to send an FN_ISAL_Fence command to an SSA physical disk. The device driver rejects any such attempt with EINVAL.
2. The adapter rejects the FN_ISAL_SCSI command with a non-zero result if that command is sent to a device that has not been opened with the SSADISK_SCSIMODE extension parameter.
The arg parameter for the SSADISK_ISAL_CMD ioctl is the address of an ssadisk_ioctl_parms structure. This structure is defined in the /usr/include/sys/ssadisk.h file.
The SSADISK_ISAL_CMD ioctl uses the following fields of the ssadisk_ioctl_parms structure:
dsb: Contains the directive status byte that is returned for th
73. The card has switched to Self Refresh mode Fast Write Cache Option Card 90 Action In the sequence shown exchange the FRUs for new FRUs 590 4252B Description The Fast Write Cache Option Card battery has reached the Possible FRUs end of its life The fast write cache is disabled Action Exchange the FRU for a new FRU Fast Write Cache Option Card battery 100 Chapter 18 SSA Problem Determination Procedures 421 SRN Problem Possible Causes 4252C Description The Fast Write Cache Option Card battery needs to be Possible FRUs exchanged for a new one Fast Write Cache Option Action Exchange the FRU for a new FRU 4252D Description Fast write caching is suspended stopped temporarily for An unexpected loss of power one or more devices has occurred on the using system Action i e The user has stopped an 1 Type smitty devices and press Enter uncouple operation 2 Select SSA Disks 3 Select SSA Logical Disks 4 Select Enable Disable Fast Write for Multiple Devices 5 Note the numbers of each hdisk that is listed under Fast Write is Suspended for these devices 6 If you do not know which RAID Manager SSA adapter is managing the hdisk give the following command from the command line ssaadap 1 hdiskn Where hdiskn is the hdisk number that you noted in step B 7 For each hdisk that you noted in step B give the following command from the command line
74. The separation of amounts of data in preparation for data transfer. The operating system splits data on page boundaries, where a page is 4 KB.
parameter: A variable that is given a constant value for a specified application.
PCI: Peripheral Component Interconnect.
pdisk: Physical disk.
physical disk: The actual hardware disk drive.
POST: Power-on self-test.
power-on self-test (POST): A series of diagnostic tests that are run automatically by a device when the power is switched on.
primary half: The term that distinguishes one half of a split array. The term secondary half distinguishes the other half of the split array.

R
RAID: Redundant array of independent disks.
RAID array: In RAID systems, a group of disks that is handled as one large disk by the operating system.
RAID manager: The software that manages the logical units of an array system.
Rebuilding state: The state that a RAID array enters after a missing member disk drive has been returned to the array or exchanged for a replacement disk drive. While the array is in this state, the data and parity are rebuilt on the returned or replacement disk drive.
Rejected disk: A failing disk drive that the array management software has removed from a RAID array.
Reserved status: The disk drive is used by another using system also.
router: A computer that determines the path of network traffic flow.

S
SCSI: Small computer system interface.
SDRAM: Synchro
75. This error can be caused by an adapter hardware failure or by excessive SSA adapter card 40 electrical interference on the SSA loop Action Exchange the FRUs for new FRUs in the given sequence External SSA cables 30 Device 30 fExchansind 50422 Description The SSA adapter has detected an SS_TIMEOUT error A Possible FRUs transaction has exceeded its time out This problem can be caused by SSA adapter card 70 disk drive errors A Action Run diagnostics in Problem Determination mode to all the disk Adapter on page 327 drives that are attached to the adapter If you find any problems solve those problems If you do not find any problems run diagnostics in System Verification mode to the adapter If the diagnostics run successfully go to before you exchange the adapter 50425 Description The SSA adapter has detected an Possible FRUs 436 User s Guide and Maintenance Information SRN Problem Possible Causes 504XX Description The SSA adapter microcode has hung Software error Action Run diagnostics in System Verification mode to the SSA adapter Possible FRUs If the diagnostics fail exchange the FRU for a new FRU If the SSA adapter card 100 diagnostics do not fail go to 54050 Description The adapter bus has been reset Possible FRUs Action Exchange the FRUs for new FRUs Using system board 40 using system service information 60000 Description
76. To do this action, go to the command line and type the command:
ssa_format -l SSA_Adapter -b
where SSA_Adapter is the name of the adapter onto which you have installed the Fast Write Cache Option Card, for example, ssa0.

Part Numbers
- Advanced SerialRAID Adapter card, without SDRAM module and without Fast Write Cache Option Card: 09L2090
- 64 MB module: 09L2104
- 128 MB module: 08J0663
- Fast Write Cache Option Card: 09L2105
- Fast write cache battery: 09L5609

Chapter 16 Using the SSA Command Line Utilities

The commands that are described here give you access, from the command line, to some of the functions that are available in the SSA service aids. The commands are very simple and are intended mainly for use from within shell scripts. They do not provide many error-checking routines or error messages. If you need such facilities, use the SSA service aids.

Under most conditions, a command prints a usage string if the syntax is incorrect. No message is printed, however, if the command fails. If the command runs without error, the return code is 0. If an error occurs, the return code is a value other than 0.

ssa_sesdid Command

Purpose
To download new microcode to an SES enclosure.

Syntax
ssa_sesdid [-d device] [-f codefile] [-u] [-t]

Description
This command can be used to
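Because these utilities print no message when they fail, a calling script has to rely on the return code alone. The following Python sketch illustrates that convention under stated assumptions: run_quiet_utility is a hypothetical helper name, and the demonstration uses the Python interpreter itself as a stand-in command, because the SSA utilities exist only on systems that have the adapter software installed.

```python
import subprocess
import sys


def run_quiet_utility(argv):
    """Run a command and judge success only by its return code (0 = success)."""
    completed = subprocess.run(argv, capture_output=True, text=True)
    ok = completed.returncode == 0
    return ok, completed.returncode, completed.stdout


if __name__ == "__main__":
    # Stand-in commands: the interpreter exits with a chosen return code.
    for code in (0, 3):
        ok, rc, _ = run_quiet_utility(
            [sys.executable, "-c", f"import sys; sys.exit({code})"]
        )
        print("ok" if ok else f"failed with return code {rc}")
```

A script that wraps the real utilities in this way should log the return code itself, since the command output gives no further detail on failure.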
77. UID is shown, the disk drive is reserved to a specific adapter. If a node number or using system name is shown, the disk drive is reserved to a specific node.

Examples
The following examples show typical output from the rescheck command. The Adapter In Use field shows which adapter path the using system is using.

ssa_rescheck -l hdisk1 produces this type of output:

Disk    Primary  Secondary  Adapter  Primary  Secondary  Reserved
        Adapter  Adapter    In Use   Access   Access     to
hdisk1 ssa0 ssa0 OK none

ssa_rescheck -l hdisk1 -h produces this type of output:

hdisk1 ssa0 ssa0 OK none

The next example shows the disk drive Open by adapter ssa1. The disk drive is reserved to ssa1, and adapter ssa0 has a Busy status. Because the two adapters are in the same using system, the Busy status indicates that the node number is not set.

Disk    Primary  Secondary  Adapter  Primary  Secondary  Reserved
        Adapter  Adapter    In Use   Access   Access     to
hdisk2  ssa1     ssa0       ssa1     Open     Busy       ssa1

The next example shows that the disk drive is reserved to a node, because the secondary access is OK, not Busy, and the Reserved To field shows the using system name.

Disk    Primary  Secondary  Adapter  Primary  Secondary  Reserved
        Adapter  Adapter    In Use   Access   Access     to
hdisk2  ssa1     ssa0       ssa1     Open     OK         abcd.location.com

Return Codes
0: The command has completed successfully
1: A system error has occurred
Any oth
78. an SSA RAID Manager and go to step Lon page i of Deleting an Old RAID Array 130 User s Guide and Maintenance Information Listing Old RAID Arrays Recorded in an SSA RAID Manager This option allows you to list the serial numbers of disconnected arrays whose records remain in the RAID manager 1 Select List Delete Old RAID Arrays in an SSA RAID Manager from the SSA RAID Arrays menu 2 Select List Old RAID Arrays Recorded in an SSA RAID Manager 3 A list of RAID managers is displayed in a window as K List Delete Old RAID Arrays in an SSA RAID Manager Move cursor to desired item and press Enter List Old RAID Arrays Recorded in an SSA RAID Manager Delete an Old RAID Array Recorded in an SSA RAID Manager SSA RAID Manager Move cursor to desired item and press Enter ssa0 Available 00 02 IBM SSA 160 SerialRAID Adapter 14109100 Fl Help F2 Refresh F3 Cancel F8 Image F1Q Exit Enter Do Find n Find Next 4 Select the RAID manager for which you want a list of old arrays Chapter 6 Using the RAID Array Configurator 131 132 5 If any old arrays are in the RAID manager a list of those arrays appears a COMMAND STATUS Command OK stdout yes stderr no Before command completion additional instructions may appear below TOP 0952314698B637K 09523146994837K 0952314699A437K 0952314699CE37K 095231469A9337K 095231469B6D37K 095231469C4537K 095231469CEE37K 095231469D7A37K 095
79. an SSA RAID Array Change Show Attributes of an SSA RAID Array Change Member Disks in an SSA RAID Array Change Show Use of an SSA Physical Disk Change Use of Multiple SSA Physical Disks Change Show Delete a Hot Spare Pool F1 Help F2 Refresh F3 Cancel F8 Image F9 Shel F1O Exit Enter Do XN D From the following list find the option that you want and go to the place that is indicated 98 User s Guide and Maintenance Information Chapter 6 Using the RAID Array Configurator 99 Listing All Defined SSA RAID Arrays This option lists all the arrays that are connected to the SSA adapter 1 For fast path type smitty Ilsdssaraid and press Enter Otherwise select List All Defined SSA RAID Arrays from the SSA RAID Arrays menu 2 A list of defined arrays is displayed is COMMAND STATUS Command OK stdout yes stderr no Before command completion additional instructions may appear below hdisk3 095231779F0737K good 3 4G RAID 5 array hdisk4 09523173A02137K good 3 4G RAID 5 array F1 Help F2 Refresh F3 Cancel F6 Command F8 Image F9 Shel F1O Exit Find n Find Next N 100 User s Guide and Maintenance Information Listing All Supported SSA RAID Arrays This option lists all the types of array that are supported by the installed SSA RAID managers 1 For fast path type smitty Isssaraid and press Enter Otherwise select List All Supported SSA RAID Arrays from the SSA RAID Arrays menu A list of
80. and data paths examples broken loop cable removed 403 broken loop disk drive removed 406 normal loops 401 one loop with two adapters in each of two using systems 12 one loop with two adapters in one using system 11 problem determination 400 rules 22 simple 8 simple one disk drive missing 9 simple two disk drives missing 10 two loops with one adapter 14 two loops with two adapters 15 SSA RAID Array SMIT menu getting access 59 SSA RAID arrays adding a disk drive to an SSA RAID array 140 adding a new hot spare pool 83 adding disks to a hot spare pool 86 adding to the configuration 60 canceling all SSA disk drive identifications 129 changing member disks in an SSA RAID array 137 changing or showing the attributes of an SSA RAID array 135 changing or showing the status of a hot spare pool 74 changing or showing the use of an SSA disk drive 144 changing the use of multiple SSA physical disks 147 creating a hot spare disk drive 72 deleting an old RAID array recorded in an SSA RAID manager 133 deleting from the configuration 70 identifying and correcting or removing failed disk drives 91 identifying array candidate disk drives 125 identifying hot spare disk drives 121 identifying rejected array disk drives 123 identifying system disk drives 127 identifying the disk drives in an SSA RAID array 119 installing a replacement disk drive 95 installing and configuring 58 listing all defined SSA RAID arrays 100 listing all SSA RAID arrays
81. and those adapters are installed in two or more using systems, load the adapter microcode, then run the cfgmgr command on each using system. If the level of the microcode that is stored in the using system is higher than the level of the microcode that is installed on the SSA adapter, the higher-level microcode is automatically downloaded to the adapter when the using system runs its configuration method.

Maintaining the Disk Drive Microcode
To download disk drive microcode, use the Display/Download Drive Microcode SSA service aid.

Chapter 14 SSA Adapter Information

Vital Product Data (VPD) for the SSA Adapter
The vital product data (VPD) for the SSA adapter can be displayed by using the using-system service aids. This section shows the types of information that are contained in the VPD.
Part number: The part number of the adapter card
FRU number: The part number of the adapter card field-replaceable unit (FRU)
Serial number: The serial number of the adapter card
Engineering change level: The engineering change level of the adapter card
Manufacturing location: Manufacturer and plant code
ROS level and ID: The version of read-only storage (ROS) code that is loaded on the adapter
Loadable microcode level: The version of loadable code that is needed for the satisfactory operation of this card
Device driver level: The minimum level of device driver that is needed for this level of card
Description
82. array returns to the Good state.

Degraded State
A RAID 10 array is in the Degraded state when one or more member disk drives are missing or deconfigured, and a write operation has occurred. Read and write operations can be performed on the array. The missing member disk drives are deconfigured so that they are permanently excluded from the array. If they become available again, they can be introduced only as new members.

A RAID 10 array is in the Degraded state also if the secondary half of the array operates while the primary half is deconfigured. Under this condition, the secondary half holds information about the members of the primary half to track recovery.

Rebuilding State
A RAID 10 array is in the Rebuilding state when a rebuilding operation is running on one or more member disk drives. Read and write operations can be performed on the array. When an array is created, it enters the Rebuilding state to synchronize the member disk drives. When the rebuilding operation is complete, the array returns to the Good state. If the medium error table fills during a rebuilding operation, the array remains in the Rebuilding state until space becomes available in the table.

Offline State
A RAID 10 array can be in the Offline state for any of the following reasons:
- No NVRAM is available to operate the array.
- The array is split across SSA loops.
- All these three conditions exist: In the secondary half of the array, the member disk
83. available and a write operation is performed on the array, the array remains in the Degraded state until you take action to return that array to the Good state.

Chapter 3 RAID Functions and Array States

While in Degraded state, an array is not protected. If another disk drive in the array fails, or the power fails during a write operation, data might be lost.

You can return the disk drive to the array, or install another disk drive, by using the procedure in MAP 2324: SSA RAID to logically add the device to the array. The array management software starts a rebuilding operation to synchronize the new disk drive with the data that is contained in the other disk drives of the array. This action returns the array to the Good state.

Rebuilding State
A RAID 5 array enters the Rebuilding state when:
- That array is first created
- A member disk drive is replaced
- An adapter is replaced, but a correct shutdown has not been performed

Initial Rebuilding Operation
When an array is first created, it enters the Rebuilding state while parity is rebuilt. If a disk drive fails during the initial rebuilding operation, no hot spare disk drive is exchanged for the failing disk drive.

Disk Drive Replacement
An array enters Rebuilding state after a missing disk drive has been returned to the array or exchanged for a replacement disk drive. When the array is in this state, all the member disk drives are pre
84. copy 190 removing an SDRAM module 329 removing and replacing an SSA adapter 327 removing disks from a hot spare pool 86 removing the battery assembly from the fast write cache card 336 removing the fast write cache card 332 reserving disk drives 27 responsibilities of the SSA adapter device driver 255 responsibilities of the SSA disk device driver 255 return codes command line interface RAID 5 253 return values direct call entry point 265 SSA_GET_ENTRY_POINT ioctl operation 264 SSA_TRANSACTION ioctl operation 263 SSADISK_ISAL_CMD ioctl operation 279 SSADISK_ISALMgr_CMD ioctl operation 282 SSADISK_LIST_PDISKS ioctl operation 286 SSADISK_SCSI_CMD ioctl operation 284 rmssaraid command 70 rules for SSA loops 22 relationship between disk drives and adapters 24 one pair of adapter connectors in the loop 24 pairs of adapter connectors in the loop mainly shared data 26 pairs of adapter connectors in the loop some shared data 25 rules for hot spare disk drive pools 52 run_ssa_ela_cron 232 run_ssa_healthcheck cron error logging 226 run_ssa_link_speed cron error logging 227 S SDRAM module installing 330 removing 329 select entry point tmssa device driver 301 serial storage architecture SSA 3 service aids 373 Certify Disk 391 Configuration Verification 387 504 User s Guide and Maintenance Information service aids continued Display Download Disk Drive Microcode 393 Format Disk 389 Identify function 375 Link Speed 396 Link
85. -d specifies that a system disk is to be attached to the new array. This command causes a new SSA logical disk, hdiskX, to be created and attached to the new array. X is the next available hdisk number, for example, hdisk5.

Example 4: To Create a RAID 10 Array
This example shows how to use four SSA physical disks to create a RAID 10 array. The attributes of the disks are all set to their default values. Type the command:
ssaraid -C -l ssa0 -t raid_10 -s pdisk0 pdisk1 pdisk3 pdisk2 -d
where:
-C specifies that this operation is a create operation.
-l ssa0 specifies that RAID Manager ssa0 is to be used.
-t raid_10 specifies that a RAID 10 array object is to be created.
-s pdisk specifies the free pdisk that is to become the member disk of the new array. RAID 10 arrays support only even numbers of member disk drives. The sequence in which the disk drives are specified determines which disk drives are primary and which are secondary members of the array. The first, third, fifth, and so on, disk drives are primary disk drives. The second, fourth, sixth, and so on, disk drives are secondary disk drives. In this example, pdisk0 and pdisk3 are the primary disk drives.
-d specifies that a system disk is to be attached to the new array.

Example 5: To Create a Hot Spare Pool
This example shows how to create a hot spare pool that contains an array member disk
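The ordering rule in Example 4 (odd positions are primary, even positions are secondary) can be checked before a command line is built. The following Python sketch only illustrates that rule; split_raid10_members is a hypothetical helper and is not part of the SSA software.

```python
def split_raid10_members(pdisks):
    """Split an ordered member list into (primary, secondary) disk drives.

    The 1st, 3rd, 5th, ... entries are primary; the 2nd, 4th, 6th, ... are
    secondary. RAID 10 arrays need an even number of member disk drives.
    """
    if len(pdisks) < 2 or len(pdisks) % 2 != 0:
        raise ValueError("a RAID 10 array needs an even number of member disks")
    primaries = list(pdisks[0::2])
    secondaries = list(pdisks[1::2])
    return primaries, secondaries


if __name__ == "__main__":
    # Same order as the command in Example 4.
    primaries, secondaries = split_raid10_members(
        ["pdisk0", "pdisk1", "pdisk3", "pdisk2"]
    )
    print("primary:", primaries)
    print("secondary:", secondaries)
```

With the order used in Example 4, the helper reports pdisk0 and pdisk3 as primary, which agrees with the text.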
86. direction of the transfer. Valid values are:
SSADISK_SCSI_DIRECTION_NONE: No data transfer is involved for the command.
SSADISK_SCSI_DIRECTION_READ: Data is transferred from the subsystem into the using-system memory.
SSADISK_SCSI_DIRECTION_WRITE: Data is transferred from the using-system memory into the subsystem.

u0.scsi.identifier: Identifies the SSA SCSI logical unit number to which the command should be sent. The format of this field is as defined for SSA_SCSI: bit 7 = 1 identifies the Target routine; bits 6-0 identify the Logical Unit routine.
u0.scsi.cdb: Set by the caller to define the SCSI Command Descriptor Block (CDB) for the command.
u0.scsi.cdb_length: Set by the caller to indicate the length of the CDB.
u0.scsi.scsi_status: Contains the SCSI status that is returned for the command.

The device driver does not know the contents of the CDB. The driver only passes on the CDB to the hardware. See the relevant hardware documentation to determine what CDBs are valid for a particular SSA physical disk.

Return Values
If the command was successfully sent to the adapter card, this operation returns a value of 0. Otherwise, it returns a value of -1 and sets the errno global variable to one of the following values:
EIO: Either an unrecoverable I/O error has occurred, or the hardware did not recognize the SCSI command as valid.
EINVAL: Either the u0.scsi.cdb_length fi
87. disks. Before you use the service aids, ensure that you are familiar with the principles of SSA loops and physical disk drives (pdisks). If you are not familiar with these principles, read about them first.

Note: The service aids refer to the Advanced SerialRAID Adapter as IBM SSA 160 SerialRAID Adapter (14109100). On some service screens, this name is shortened.

The Identify Function
The Identify function can be accessed from many of the service aid menus. This function enables you to determine the location of a particular disk drive that you want to identify, but do not want to remove. When set, the Identify function causes the Check light of the disk drive to flash for identification (two seconds on, two seconds off), but has no effect on the normal operation of the disk drive. It also causes the Subsystem Check light (if present) of the unit containing the selected disk drive to flash. You can use the Identify function on any number of disk drives at the same time.

Instructions displayed by the service aids tell you when you can select the Identify function.

The service aids display the serial numbers of the devices. By checking the serial number label on the device, you can verify that the correct device has its Check light flashing.

Note: Normally, you can reset the Identify function by selecting to switch it off from inside a service aid display, or by leaving that particular service aid displ
88. download the latest level of microcode to all available SES enclosures, or to download a specified microcode file to a specified enclosure. To enable the latest level of microcode to be downloaded to all enclosures, either the microcode file must be located in the /etc/microcode directory, or the microcode file name and path name must be specified with the -f flag. To enable the microcode to be downloaded to a specified enclosure, the microcode file name and its full path name must be specified with the -f flag.

-d device: Specifies the SES enclosure to which the microcode will be sent. This flag is used with the -f flag or the -u flag.
-f codefile: Specifies the name of the microcode file to be downloaded. This flag can be used with the -d flag or the -u flag.
-u: When this flag is used with no other flags, the latest level of enclosure microcode that is available in the /etc/microcode directory is downloaded to all available SES enclosures, if that latest level is higher than the version that exists in the enclosure. If this flag is used with the -f flag, the file that is specified by the -f flag is downloaded to all available SES enclosures, if the level of that file is higher than the level that exists in the enclosure.
-t: This optional flag allows new levels of microcode to be tested. The enclosure returns to the original level of microcode if the enclosure power is switched off, then switched on.

Examples
To install the latest level
89. drive for a RAID 1 array, or on several disk drives for a RAID 10 array. You cannot use the 3-Way Copy procedure for RAID 5 arrays, for RAID 0 arrays, or for non-RAID disk drives, whether or not they are configured for Logical Volume mirroring.

When the copy has been prepared and the copy operation is complete, the copy remains synchronized with the parent. It can be uncoupled later. The uncoupled copy is renamed from the original name so that it can be separately accessed after it is uncoupled.

If the RAID 1 arrays or RAID 10 arrays are used as raw hdisks, you control the 3-Way Copy function from SMIT menus or from ssaraid commands. Facilities are provided that allow you to identify suitable candidate disk drives for the copy, and to couple those disk drives to the parent array. By using SMIT menus or an ssaraid command, you can uncouple the copy from the parent array; the uncoupled array becomes a new hdisk. Before the copy is uncoupled, all fast-write data is destaged to the array. Data held in a system cache that has not yet been sent to the adapter must be flushed before the uncouple operation.

If the RAID 1 or RAID 10 array is used to hold logical volumes, you can use the ssa_make_copy command to make and uncouple a copy of the complete logical volume group, or of part of a logical volume group. This command:
- Selects disk drives for the copy operation
- Synchronizes data from system cache memory to the array
- Copies the required logical v
90. drives that you want to identify The Check light flashes on each disk drive that you have selected 120 User s Guide and Maintenance Information Identifying Hot Spare Disk Drives This option allows you to identify the hot spare disk drives that are available to a particular SSA RAID manager 1 For fast path type smitty ihssaraid and press Enter Otherwise a Select List Identify SSA Physical Disks from the SSA RAID Arrays menu b Select Identify Hot Spares 2 A list of arrays is displayed in a window ta List Identify SSA Physical Disks Move cursor to desired item and press Enter List Disks in an SSA RAID Array List Hot Spares List Rejected Array Disks List Array Candidate Disks List System Disks Identify Disks in an SSA RAID Array Identify Hot Spares Identify Rejected Array Disks Identify Array Candidate Disks SSA RAID Array Move cursor to desired item and press Enter hdisk3 095231779F0737K good 3 4G RAID 5 array hdisk4 09253173A02137K good 3 4G RAID 5 array Fl Help F2 Refresh F3 Cancel F8 Image F1Q Exit Enter Do Find n Find Next Select the RAID manager whose hot spare disk drives you want to identify Chapter 6 Using the RAID Array Configurator 121 3 The following information is displayed A i Identify Hot Spares Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields SSA RAID Manager ssa Hot Spare Disks Flash Disk I
91. - Run diagnostics from diskette or CD-ROM to isolate the problem. If you do not find a problem, the operating system might have failed.
Possible FRUs: None

SRN SSA02
Description: An unknown error has occurred.
Action: Do one of the actions described here:
- Run diagnostics in Problem Determination mode to the system unit. If you find any problems, solve them, then try to run the service aid again.
- If diagnostics fail, or if the same problem occurs when you try the service aid again, run diagnostics from diskette or CD-ROM to isolate the problem. If you do not find a problem, the operating system might have failed.
Possible FRUs: None

SRN SSA03
Description: The service aid was unable to open an hdisk. This problem might have occurred because a disk drive has failed or has been removed from the system.
Action: Do the actions described here:
1. Use the Configuration Verification service aid to determine the location code of the SSA adapter to which the hdisk is attached. For example, if the location code of the hdisk is 00-03-L, the location code of the SSA adapter is 00-03.
2. Run the Link Verification service aid to the SSA adapter.
3. If a link failure is indicated by the service aid, go to MAP 2320: SSA
4. If no link failures are indicated, run diagnostics in System Verification mode to each pdisk that is a
92. for example the addition of disk drives or changes to the SSA cabling If you did not intend to make such changes and you correct them the pool returns to its original state If you did intend to make the changes 1 Select Change Show Delete a ak Spare Pool from the smit Saruy 2 Select the reduced hot spare pool 3 Verify that the contents of the pool are as required 4 Press Enter An array in this pool has used a hot spare disk drive from another pool When replacement disk drives are installed in exchange for failed disk drives the replacement disk drives are assigned as hot spare disk drives or as free disk drives The hot spare pools however are no longer configured as intended To correct the configuration 1 a List eli lal in a Hot Spare Pool see 2 Select the mixed hot spare pool 3 From the displayed list note the number of the pdisk that has a status of wrong_pool 4 Note the number of the hdisk to which the pdisk belongs Select Swap Member Disks in an SSA RAID Arrey see Changing 6 Select the hdisk that you noted in step pi The Disk to Remove is the pdisk that you noted in step B The Disk to Add is the replacement disk drive that was installed in exchange for the failed disk drive Hot spare disk drives exist in a pool but they are not protecting any member disk drive This condition does not cause an error to be logged If required you can move hot spare disk drives from this pool
93. format
List the parent objects for the named object.
The name of the reference object for RAID Copy arrays.
The disk drives that are to become members of the array.
The type of the object to list or create.
Remove the device for the specified RAID object.
List exchange candidate disk drives for the named object.
Information is presented in summary format.

You can find the object types (the argument for the -t option) if you type the following command:
ssaraid -Yc -l ssa0
where ssa0 is the name of the RAID manager.

Instruct Types
You can give the following instruct types as an argument to the -i option when that option is used with the -A option:
exchange
couple
uncouple

Examples
For the examples given here, assume that the following items of hardware are available in the system:
- A RAID Manager named ssa0
- Four SSA physical disks named pdisk0, pdisk1, pdisk2, and pdisk3

Example 1: To Create a RAID 0 Array
This example shows how to use three SSA physical disks to create a RAID 0 array. The attributes of the disks are all set to their default values. Type the command:
ssaraid -C -l ssa0 -t raid_0 -s pdisk0 pdisk1 pdisk2 -d
where:
-C specifies that this operation is a create operation.
-l ssa0 specifies that RAID Manager ssa0 is to be used.
-t raid_0 specifies that a RAID 0 array object is to be created.
-s pdisk specifies the free pdisk that is to be
94. hdisk1, hdiskn) and SSA physical disks (pdisk0, pdisk1, pdiskn). The properties of each are described in the SSA Subsystem Overview.

Normally, the system boot process automatically configures all the disk drives that are connected to the using system. You do not need to take any action to configure them.

Because SSA devices might be added to the SSA network while the using system is running and online, you might need to configure SSA disks after the boot process has completed. Under these conditions, use the cfgmgr command to configure the devices. The exception is when you want to configure a specific device with a specific name; you can do this with the mkdev command.

Using mkdev to Configure a Physical Disk
To use mkdev to configure an SSA physical disk, specify the following information:
Parent: ssar
Class: pdisk
Subclass: ssar
Type: You can list the types by typing lsdev -P -c pdisk -s ssar
ConnectionLocation: 15-character unique identifier of the disk drive

You can determine the unique identifier in three ways:
- If the disk drive is already defined, you can use the lsdev command to determine the unique identity, as follows:
  1. Type lsdev -Cc pdisk -r connwhere and press Enter.
  2. Select the 15-character unique identifier (UID) for which characters 5 through 12 match the serial number that is on the front of the disk drive.
- Construct the 15 character unique identifier from the 12 ch
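The matching step described above (characters 5 through 12 of the UID equal the serial number on the drive) can be sketched as follows. This is an illustrative Python helper only: find_uid_for_serial is a hypothetical name, the UID strings in the demonstration are made up, and the candidate list is assumed to have been collected beforehand, for example from the lsdev output.

```python
def find_uid_for_serial(candidate_uids, serial):
    """Return the 15-character UIDs whose characters 5-12 match the serial.

    Positions are counted from 1, as in the manual, so the slice is [4:12].
    """
    wanted = serial.strip().upper()
    matches = []
    for uid in candidate_uids:
        uid = uid.strip()
        if len(uid) != 15:
            continue  # not a well-formed unique identifier
        if uid[4:12].upper() == wanted:
            matches.append(uid)
    return matches


if __name__ == "__main__":
    # Made-up identifiers, used only to show the character positions.
    uids = ["0004AC7C00E900D", "0004AC5119E000D"]
    print(find_uid_for_serial(uids, "AC5119E0"))
```

Returning a list rather than a single value makes it obvious when no UID, or more than one UID, matches the label on the drive.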
95. in order to meet FCC emission limits. Neither the provider nor the manufacturer is responsible for any radio or television interference caused by using other than recommended cables and connectors, or by unauthorized changes or modifications to this equipment. Unauthorized changes or modifications could void the user's authority to operate the equipment.

This device complies with Part 15 of FCC Rules. Operation is subject to the following two conditions: (1) this device may not cause harmful interference, and (2) this device must accept any interference received, including interference that may cause undesired operation.

Japanese Voluntary Control Council for Interference (VCCI) Statement
This product is a Class A Information Technology Equipment and conforms to the standards set by the Voluntary Control Council for Interference by Information Technology Equipment (VCCI). In a domestic environment, this product might cause radio interference, in which event the user might be required to take adequate measures.

Korean Government Ministry of Communication (MOC) Statement
Please note that this device has been approved for business purposes with regard to electromagnetic interference. If you find that this device is not suitable for your use, you can exchange it for one that is approved for non-business purposes.

New Zealand Compliance Statement
This is a Class A product. In a domestic environment, this product might cause radio interference
96. information is displayed:
Swap Members of an SSA RAID Array
Type or select values in entry fields.
Press Enter AFTER making all desired changes.
[Entry Fields]
SSA RAID Manager: ssa1
SSA RAID Array: hdisk4
RAID Array Type: raid_10
Connection Address / Array Name: 00703784C540C00
Disk To Remove:
Disk To Add:
F1=Help F2=Refresh F3=Cancel F4=List F5=Reset F6=Command F7=Edit F8=Image F10=Exit Enter=Do
6. Select Disk to Remove.
7. Press the List key to display the disk drive to remove. If the disk drive that you are replacing has been rejected by the array, that disk drive is listed as BlankReservedZ.
8. Select the required disk drive, and press Enter.
9. Select Disk to Add.
10. Press the List key to display the disk drive to add. Disk drives that are free or hot spare disk drives are listed.
11. Select the required disk drive, and press Enter.
12. Press Enter to perform the swap operation.
Using Other Configuration Functions
This part of the chapter describes the maintenance procedures that are available for your Advanced SerialRAID Adapter. You can use these procedures at any time. You can get to the required SMIT menu by using fast path commands or by working through other menus.
Notes:
1. Although this book always refers to the smitty commands, you can use either the smitty command or the smit command. The procedures that you follow remain the s
97. is nonblocking. In this mode, the device checks whether enough buffer space is available for the write operation. If enough buffer space is not available, the write operation fails, and the errno global variable is set to EAGAIN. If enough buffer space is available, the write operation immediately ends with all the data written successfully. The write operation now occurs asynchronously. If you want to track the progress of this write operation, use the TMIOSTAT operation. The driver keeps the status of the last write operation, which is then reported by the TMIOSTAT operation.
Possible return values for the errno global variable include:
EFAULT: The write operation was unsuccessful because of a kernel service error. This value is applicable only during data gathering.
EINTR: Interrupted by signal.
EINVAL: Attempted to execute a write operation for a device instance that is not configured, not open, or is not an initiator mode minor device number. For a nonblocking write operation, the transfer length is too long or the time out period is zero. If the transfer length is too long, try the operation again with a smaller transfer length. If the time out period is zero, use TMCHGIMPARM to set the time out value to another value.
EAGAIN: A nonblocking write operation could not proceed because not enough buffer space was available. Try the operation again later.
EIO: One of the following I/O errors occurred: An error that cannot be produced again
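The EAGAIN guidance above ("Try the operation again later") can be sketched as a small retry loop. This is an illustration only, written in Python rather than against the tmssa driver itself. The names write_with_retry, write_fn, attempts, and delay_s are hypothetical and do not come from this book; write_fn stands for whatever call performs the nonblocking write.

```python
import errno
import time


def write_with_retry(write_fn, data, attempts=5, delay_s=0.0):
    """Call write_fn(data); retry only when it fails with EAGAIN.

    write_fn is a hypothetical stand-in for a nonblocking write call.
    Any other errno value is passed straight back to the caller.
    """
    for attempt in range(attempts):
        try:
            return write_fn(data)
        except OSError as exc:
            if exc.errno != errno.EAGAIN:
                raise  # EFAULT, EINVAL, EIO and so on are not retried here
            if attempt + 1 < attempts:
                time.sleep(delay_s)  # wait for buffer space to free up
    raise TimeoutError("buffer space did not become available")
```

A caller might wrap its real write call in a small function and pass it as write_fn. Errors other than EAGAIN need the specific actions listed above, for example a smaller transfer length for EINVAL.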
98. is on the disk drive.
Chapter 13 Using the Programming Interface
DD_CONC_TEST: Issues a test disk command to verify that the SSA disk drive is still accessible to this using system.
The concurrent mode entry point returns a value of EINVAL if any of the following is true:
- The top kernel extension did not perform a DD_CONC_REGISTER operation.
- The conc_cmd pointer is null.
- The devno field in the conc_cmd structure is not valid.
- The cmd_op field of the conc_cmd structure is not one of the four valid values that were previously listed.
If the concurrent mode entry point accepts the conc_cmd structure, the entry point returns a value of 0. If the SSA disk device driver does not have resources to issue the command, the driver queues the command until resources are available. The concurrent commands that are queued in the SSA disk device driver are issued before any read or write operations that are queued by the strategy entry point of the device driver. The completion status of the concurrent mode commands is returned to the concurrent mode interrupt handler entry point of the top kernel extension.
Top Kernel Extension Entry Point
The top kernel extension must have a concurrent mode command interrupt handler entry point, which is called directly from the interrupt handler of the SSA disk device. This entry point function can take four arguments:
- conc_cmd pointer
- cmd_op field
- message_code field
- devno field
The conc
99. is reported each time that the health check is run. If an error is intermittent, it is logged each time that it occurs. Because a particular error need be logged only a defined number of times for the automatic error log analysis to determine that service activity is needed, the device driver stops the repeated logging of the same error. If error logging were not managed in this way, a repeated error could fill the error log and hide other errors that other components in the system might have logged. If error logging management is active for one type of error, a different type of error can still be sent to the error log. All types of error are, therefore, logged.
Detailed Description
Error logging management is performed for the following error types: DISK_ERR4, SSA_ENCL_ERR2, SSA_CACHE_BATTERY, SSA_HDW_ERROR, SSA_CACHE_ERROR, SSA_HDW_RECOVERED, SSA_DEGRADED_ERROR, SSA_LINK_ERROR, SSA_DETECTED_ERROR, SSA_LINK_OPEN, SSA_DEVICE_ERROR, SSA_LOGGING_ERROR, SSA_DISK_ERR1, SSA_REMOTE_ERROR, SSA_DISK_ERR4, SSA_SETUP_ERROR, SSA_ENCL_ERR1, SSA_SOFTWARE_ERROR.
If one of these error types is permanent on a particular device, it is reported each time that the health check is run. The SSA adapter sends the resulting error log entries to the device driver. The device driver error logger permits these error log entries to be sent on to the error log until the number of entries for that error reaches a predetermined threshold value. After that value is reached, no more entri
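The threshold behavior described above can be pictured with a small sketch. It is not the device driver's code: the class name ErrorLogThrottle, the method should_log, and the threshold of 3 used in the usage note are all hypothetical, because this excerpt does not state the real threshold value or how the count is ever reset.

```python
from collections import defaultdict


class ErrorLogThrottle:
    """Forward an error to the log only until a per-error threshold is reached.

    Counts are kept per (device, error type), so a repeated error on one
    device does not stop a different error type from being logged.
    """

    def __init__(self, threshold):
        self.threshold = threshold      # hypothetical value; not given in the text
        self.counts = defaultdict(int)  # (device, error_type) -> entries forwarded

    def should_log(self, device, error_type):
        key = (device, error_type)
        if self.counts[key] >= self.threshold:
            return False                # threshold reached: suppress the entry
        self.counts[key] += 1
        return True                     # still under the threshold: forward it
```

With a threshold of 3, the fourth and later SSA_LINK_ERROR entries for pdisk0 would be suppressed, while a first SSA_DISK_ERR4 entry for the same disk would still be forwarded.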
100. its power from a rechargeable battery. This battery can maintain data in the write cache for seven days after power has been removed from the adapter card. When an adapter is connected to the power, the Fast Write Cache Option Card performs a fast charge operation on the battery. This fast charge operation lasts from 5 to 60 minutes, as determined by the charge state of the battery. During the fast charge operation, you can enable and get access to fast write disk drives. The fast write function, however, remains inactive until the battery is fully charged.
Note: To determine whether the fast write cache is active or inactive, give the command: ssa_fw_status -a ssaX -c
The Fast Write Cache Option Card keeps a record of the age of its battery. When the battery approaches the age at which it might not be able to maintain the data in the fast write cache for seven days, an error is logged. This error recommends that the battery be exchanged for a new one. At this time, the fast write cache continues to work normally. If the battery is not exchanged within approximately three months from the time of the error, the fast write cache is disabled, and a new error is logged. This error indicates that the fast write cache is no longer active.
Configuring the Fast Write Cache Feature
This section describes how to use the system management tool (SMIT) to configure and install arrays and disks that have the fast write attribute. It also gives guidance abou
101. loop contains three or more SSA adapters that are installed in two or more using systems, you must ensure that all those using systems are switched on, and that all the disk drives in all those using systems are configured, as follows:
- If all the using systems are switched off (Micro Channel or PCI):
1. For each Micro Channel system:
a. Set Secure mode on each using system.
b. Switch on all the using systems.
c. Wait for 200 to be displayed on the operator panel of each system.
For each PCI system:
a. Switch on one using system only.
b. Wait for the first display logo to appear on the screen. Press F1 immediately. The using system goes into System Management Services mode.
2. When each using system is in the state described in the preceding steps:
For Micro Channel systems, set Normal mode to continue the boot process.
For PCI systems, press F10 (Exit) to continue the boot process.
- If one or more using systems are switched on (Micro Channel or PCI):
1. Switch on the remaining using systems.
2. On each using system:
a. Run the cfgmgr command to configure all the disk drives.
b. Manually vary on the volume groups and mount the file systems as required.
SSA Link Speed
Some SSA devices can run at 20 MB per second; others can run at 40 MB per second. Both types of devices can exist in a particular configuration, but for best performance, all links should run at the same speed. Two types of SSA cable are available:
- 20
102. managing dumps 259 open and close subroutines 258 purpose 257 SSA_GET_ENTRY_POINT ioctl operation 264 SSA_TRANSACTION ioctl operation 262 summary of SSA error conditions 259 syntax 257 disk configuration issues 266 description 266 device attributes 270 device dependent subroutines 272 error conditions 274 IOCINFO ioctl operation 277 open read write and close subroutines 272 purpose 266 readx and writex subroutines 274 special files 276 SSA disk concurrent mode of operation interface 287 SSADISK_ISAL_CMD ioctl operation 278 SSADISK_ISALMgr_CMD ioctl operation 281 SSADISK_LIST_PDISKS ioctl operation 285 SSADISK_SCSI_CMD ioctl operation 283 syntax 266 interface 256 PCI adapter ODM attributes 257 responsibilities adapter device driver 255 disk device driver 255 tmssa 295 configuration 296 description 295 device dependent subroutines 296 IOCINFO ioctl operation 305 purpose 295 syntax 295 TMCHGIMPARM change parameters 308 498 User s Guide and Maintenance Information device drivers continued tmssa continued TMIOSTAT status 307 trace formatting 256 device dependent routines tmssa device driver 296 close subroutine 297 ioctl subroutine 301 open subroutine 296 read subroutine 297 select entry point 301 write subroutine 299 device dependent subroutines adapter device driver 258 disk device driver open read write and close subroutines 272 readx and writex subroutines 274 devices finding the physical location 409
103. mode interface or tm (target mode interface). The caller uses the initiator mode interface to transmit data and the target mode interface to receive data. The least significant bit of the minor device number indicates to the device driver which mode interface is selected by the caller. When the least significant bit of the minor device number is set to 1, the target mode interface is selected. When the least significant bit is set to 0, the initiator mode interface is selected. For example, tmssa1.im should be defined as an even-numbered minor device number to select the initiator mode interface; tmssa1.tm should be defined as an odd-numbered minor device number to select the target mode interface.
When the caller opens the initiator mode special file, a logical path is set up. This path allows data to be transmitted. The user mode caller issues a write, writev, writex, or writevx system call to start data transmission. The kernel mode user issues an fp_write or fp_rwuio service call to start data transmission. The SSA target mode device driver then builds a send command to describe the transfer, and the data is sent to the device. The transfer can be done as a blocking write operation or as a nonblocking write operation. When the write entry point returns, the calling program can access the transmit buffer.
When the caller opens the target mode special file, a logical path is set up. This path allows data to be received. The user mode caller issues a read
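The rule for the least significant bit of the minor device number can be checked with a one-line test. The sketch below is an illustration in Python; the function name selected_interface and the sample minor numbers are hypothetical and are not taken from the device driver.

```python
def selected_interface(minor_number: int) -> str:
    """Return which tmssa interface a minor device number selects.

    Least significant bit 1 -> target mode ("tm"), used to receive data.
    Least significant bit 0 -> initiator mode ("im"), used to transmit data.
    """
    if minor_number < 0:
        raise ValueError("minor device numbers are not negative")
    return "tm" if minor_number & 1 else "im"
```

For example, an even minor number such as 2 selects the initiator mode interface, and the odd number 3 selects the target mode interface.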
104. old RAID arrays recorded in an SSA RAID manager 131 listing rejected array disk drives 113 listing system disk drives 117 listing the disk drives in an SSA RAID array 109 listing the disks that are in a hot spare pool 80 listing the status of all defined SSA RAID arrays 104 removing a disk drive from an SSA RAID array 138 removing disks from a hot spare pool 86 showing the disks that are protected by hot spares 77 swapping members of an SSA RAID array 142 RAID array problems 89 RAID Array SMIT menu getting access 59 RAID arrays adding a disk drive to an SSA RAID array 140 adding a new hot spare pool 83 adding disks to a hot spare pool 86 RAID arrays continued adding to the configuration 60 canceling all SSA disk drive identifications 129 changing member disks in an SSA RAID array 137 changing or showing the attributes of an SSA RAID array 135 changing or showing the status of a hot spare pool 74 changing or showing the use of an SSA disk drive 144 changing the use of multiple SSA physical disks 147 creating a hot spare disk drive 72 deleting an old RAID array recorded in an SSA RAID manager 133 deleting from the configuration 70 identifying and correcting or removing failed disk drives 91 identifying array candidate disk drives 125 identifying hot spare disk drives 121 identifying rejected array disk drives 123 identifying system disk drives 127 identifying the disk drives in an SSA RAID array 119 installing a replacement disk d
105. one of the following:
EINVAL: A request that is not valid was sent to the adapter device driver; for example, a request for the DUMPSTART option was sent before a DUMPINIT option had been run successfully.
EIO: The adapter device driver was unable to complete the command because the required resources were not available, or because an I/O error had occurred.
ETIMEDOUT: The adapter did not respond with status before the passed command time out value expired.
Files
/dev/ssa0, /dev/ssa1 ... /dev/ssan: Provide an interface to allow SSA head device drivers to access SSA devices or adapters.
IOCINFO (Device Information) SSA Adapter Device Driver ioctl Operation
Purpose: To return a structure that is defined in the /usr/include/sys/devinfo.h file.
Description: The IOCINFO ioctl operation returns a structure that is defined in the /usr/include/sys/devinfo.h header file. The caller supplies the address to an area that is of the type struct devinfo. This area is in the arg parameter to the IOCINFO operation. The device type field for this component is DD_BUS; the subtype is DS_SDA.
The IOCINFO operation is defined for all device drivers that use the ioctl subroutine, as follows: The operation returns a devinfo structure. The caller supplies the address of this structure in the argument to the IOCINFO operation. The device type in t
106. pass the ext parameter, which specifies request options. The options are constructed by logically ORing zero or more of the following values:
HWRELOC: Request for hardware relocation that is safe.
UNSAFEREL: Request for hardware relocation that is not safe.
WRITEV: Request for write verification.
Error Conditions
Possible errno values that occur for ioctl, open, read, and write subroutines when the SSA disk device driver is used include:
EBUSY: One of the following conditions has occurred:
- An attempt was made to open an SSA physical device that has already been opened by another process.
- The target device is reserved by another initiator.
EFAULT: Illegal user address.
EINVAL: One of the following circumstances has occurred:
- The read or write subroutine supplied an nbyte parameter that is not an even multiple of the block size.
- The data buffer length exceeded the maximum length that is defined in the devinfo structure for an ioctl subroutine operation.
- The openext subroutine supplied a combination of extension flags that has no support.
- An ioctl subroutine operation that has no support was attempted.
- An attempt was made to configure a device that is still open.
- An illegal configuration command has been given.
- The data buffer length exceeded the maximum length that is defined for a strategy operation.
EIO: One of the following conditions has occurred:
- The
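Constructing the ext parameter by logically ORing option values can be sketched as follows. The sketch is in Python for illustration only. The bit values 0x1, 0x2, and 0x4 are placeholders chosen for this example; the real values of HWRELOC, UNSAFEREL, and WRITEV come from the system header files and are not given in this excerpt. The names ExtOption and build_ext are also hypothetical.

```python
from enum import IntFlag


class ExtOption(IntFlag):
    # Placeholder bit values for illustration; the real values are
    # defined by the operating system headers, not by this sketch.
    HWRELOC = 0x1     # hardware relocation that is safe
    UNSAFEREL = 0x2   # hardware relocation that is not safe
    WRITEV = 0x4      # write verification


def build_ext(*options: ExtOption) -> ExtOption:
    """OR together zero or more options; no options gives a value of 0."""
    ext = ExtOption(0)
    for option in options:
        ext |= option
    return ext
```

Because each option occupies its own bit, a caller can test a combined value for one option with the in operator, for example ExtOption.WRITEV in build_ext(ExtOption.HWRELOC, ExtOption.WRITEV).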
107. pool 80 listing the status of all defined SSA RAID arrays 104 removing a disk drive from an SSA RAID array 138 removing disks from a hot spare pool 86 showing the disks that are protected by hot spares 77 swapping members of an SSA RAID array 142 Attention notices formatting disk drives 389 fragility of disk drives 443 service aids 373 attributes action new_member disk 251 old_member disk 251 array member disk drives spare_pool 249 couple action force yes no 252 pool_selection own primary secondary 252 raid_copy copy 252 disk device driver 270 adapter_a 271 adapter_b 271 connwhere_shad 271 location 271 max_coalesce 271 node_number 270 primary_adapter 271 pvid 271 queue_depth 271 reserve _lock 271 size_in_mb 271 write_queue_mod 272 hot spare disk drives change spare_pool 249 hot spare pool minimum_spares 249 ODM PCI 257 bus_intr_level 257 bus_io_ addr 257 bus_mem_start 257 attributes continued ODM PCI continued bus_mem_start2 257 bus_mem_start3 257 bus_mem_start4 257 daemon 258 dma_mem 258 host_address 258 intr_priority 257 poll_threshold 258 scat_gat_pages 258 ucode 257 physical disk change attributes 249 physical disk drive change fastwrite on off 250 force yes no 251 fw_end_block 250 fw_max_length 250 fw_start_block 250 use system spare free 250 RAID arrays change 248 force yes no 248 use system free 248 RAID arrays creation and change 244 allow_page_splits true false 244 bypass_cache
108. problem If the SSA service aids are not available note the value of PAA in this SRN and go to Then go to problem D8XXX Description SRNs in this range are used by the SSA enclosure Not applicable subsystem Action Go to the service information for your SSA enclosure Note In this SRN an X represents a digit 0 through F 438 User s Guide and Maintenance Information SRN Problem Possible Causes DFFFF Note The description and action for this SRN are valid only if you have run diagnostics to the SSA attachment If this SRN has occurred because you have run diagnostics to some other device see the service information for that device Description A command or parameter that has been sent or received is not valid This problem is caused either by the SSA adapter or by an error in the microcode Action Go to before exchanging the FRU Software error Possible FRUs SSA01 Description Not enough using system memory is available for this service aid to continue Action Do one of the actions described here e This problem might be caused by a failed application program Ask the user to end any failed application program then try to run the service aid again e Run diagnostics in Problem Determination mode to the system unit If you find any problems solve them then try to run the service aid again e Close down and reboot the using system then try to run the service aid again
109. running 1 Are the system service aids available NO Go to B YES Go to step ek 2 from step fi Are any Ready link status lights flashing on this SSA loop NO YES Go to 3 from step Hh Run the Link Verification service aid see and select the appropriate SSA adapter from the ceed Link Verification adapter menu If the service aid detects pdisks for the adapter you have selected a list of pdisks is displayed The diagram shows an example list ee gt LINK VERIFICATION 802386 SSA Link Verification for nunu ssaQ 00 04 IBM SSA 160 SerialRAID Adapter To Set or Reset Identify move cursor onto selection then press lt Enter gt Physical Serial Adapter Port Al A2 Bl B2 Status TOP nunu pdisk11 AC7AAO9A 0 5 Good nunu pdisk amp AC7AA2D6 1 4 Good nunu pdisk2 AC7AAOBD 23 Good nunu pdisk3 AC7AA0B1 I2 Good nunu pdisk7 AC7AA0B5 4 1 Good nunu pdisk12 AC7AA052 5 0 Good nunu pdiskO AC7AAOB9 6 5 Good nunu pdisk1l AC7AAQB3 1 4 Good nunu pdisk10 AC7AAQB4 2 3 Good MORE 4 F3 Cancel F10 Exit Chapter 18 SSA Problem Determination Procedures 445 If the service aid cannot detect any pdisks a message is displayed E gt LINK VERIFICATION 802385 Move cursor onto selection then press lt Enter gt nunu ssal 00 04 IBM SSA 160 SerialRAID Adapter nunu ssal 00 05 IBM SSA 160 SerialRAID Adapter nunu ssal 00 07 IBM SSA 160 SerialRAID Adapter No pdisks are in th
110. select Remove a Disk From an SSA RAID Array from the Change Member Disks in an SSA RAID Array menu When the List key is pressed for the Disk to Remove option the following pop up menu is displayed a aN Remove a Disk From an SSA RAID Array Type or select values in entry fields Pr 7 22 22 2 2 o oo nnn nn nen ene Disk to Remove Move cursor to desired item and press Enter PAAA RAA HH HH HH AEEA EAEE EEA EEEE EEEE Primary Disks pdisk0 AC7AA078 04 07 P present 9 1GB pdisk6 AC7AA8A4 04 07 P present 9 1GB He eH EH EH HH a AARAA RAEE REEERE AEAEE Secondary Disks pdisk9 AC7AD176 04 07 P present 9 1GB pdisk10 AC7AE3C9 04 07 P rebuilding 9 1GB He HE HH HH HH HH a a EEEE Coupled Disks pdisk11 AC7AE417 04 07 P present 9 1GB pdisk12 AC7AE41C 04 07 P present 9 1GB F1l Help F2 Refresh F3 Cancel F1 F8 Image F1Q Exit Enter Do F5 Find n Find Next FQ 2 2 2222 enn nn nnn nnn nnn nnn n nnn e X A The status values for coupled disk drives are present The disk drive is present and operational not_present The disk drive is missing or has failed 190 User s Guide and Maintenance Information Swap Members of an SSA RAID Array For fast path type smitty exssaraid and press Enter Otherwise select Swap Members of an SSA RAID Array from the Change Member Disks in an SSA RAID Array menu When the List key is pressed for the Disk to Re
111. service information Device 30 exchanging 9 414 User s Guide and Maintenance Information SRN Problem Possible Causes 34000 Description The adapter cannot initialize a device This problem might be Possible FRUs badly affecting the SSA loop If the device was a member of a RAID Device 90 Exchanging array you might not be able to list this disk drive by using the RAID Disk Drives on page 319 facilities External SSA cables 5 Action Exchange the FRUs for new FRUs Internal SSA connections 5 enclosure service information 35000 Description Loop problems are causing the adapter to try to reset disk Possible FRUs drives External SSA cables ae 30 Action Check the log for other error information including other 35000 errors that might indicate which section of the loop is causing the Internal SSA connections problems 30 enclosure service information Device 30 Exchangind 40000 Description The SSA adapter card has failed Action Exchange the FRU for a new FRU 40016 Description A 16 MB SDRAM in the adapter card module has failed Possible FRUs 16 MB SDRAM modul Action Exchange the FRUs for new FRUs 6 my 40064 Description A64 MB SDRAM in the adapter card module has failed Possible FRUs Action Exchange the FRUs for new FRUs 64 MB SDRAM module 99 Removing an SDRAM Module of a Advanced SerialRAIO Adapter on page
112. services will be offered in your country Any reference to a licensed program in this publication is not intended to state or imply that you can use only the licensed program indicated You can use any functionally equivalent program instead Copyright International Business Machines Corporation 1996 2000 All rights reserved Note to U S Government Users Documentation related to restricted rights Use duplication or disclosure is subject to restrictions set forth in the GSA ADP Schedule Contract Contents Safety Notices Definitions of Safety Notices Safety Notice for Installing Relocating or Servicing About This Book Who Should Use This Book What This Book Contains If You Need More Information Web Support Pages Numbering Convention Part 1 User Information Chapter 1 Introducing SSA and the Advanced SerialRAID Adapters Serial Storage Architecture SSA The Advanced SerialRAID Adapters Oe 4 P Fast Write Cache Feature 128 MB Memory Module Feature Lights of the Advanced SerialRAID Adapters Port Addresses of the Advanced SerialRAID Adapters SSA Adapter ID during Bring Up a 2 hs Chapter 2 Introducing SSA Loops Loops Links and Data Paths Simple Loop Simple Loop One Disk Drive Missing Simple Loop Two Disk Drives Missing One Loop with Two Adapters in One Using System One Loop with Two Adapters in Each of Two Using Systems Two Loops with One Adapter aca E Two Loops wi
113. spare disk drive that is large enough to replace the failing disk drive. The spare_exact attribute causes the RAID manager to use only a hot spare disk drive that has the exact size that the array requires. The size of the hot spare disk drive is determined by the size of the other member disk drives of the array. The sizes of all the member disks are logically truncated to the size of the smallest member disk drive. The hot spare disk drive must also have the size of the smallest member disk drive so that it can be exchanged for the failing disk drive.
Chapter 12 Using the SSA Command Line Interface for RAID Configurations
read_only_when_exposed=true/false (default=false)
With the attribute set to false: If the array enters the Exposed state, and write operations are made to the array:
- The first write operation causes the array to enter the Degraded state. The written data is not protected. If another disk drive in the array fails, or the power fails during a write operation, data might be lost. While the array is in the Degraded state, however, operations to the array continue.
- The rebuilding operation that runs on the replacement disk drive takes a long time to complete.
With the attribute set to true:
- If the array enters the Exposed state, and hot spare disk drives are not enabled, the array operates in read-only mode until the failing disk drive is exchanged for a replacement drive.
- If the array enters the Exposed sta
114. specified number of retries, it tries another SSA adapter. If this write operation has already tried all the SSA adapters, it fails. The calling program can retry the write operation or perform other appropriate error recovery. No other error conditions are retried; they are returned with the appropriate errno global variable.
The target mode device driver, by default, generates a time out value, which is the amount of time allowed for the write operation to end. If the write operation does not end before the time out value expires, the write operation fails. The time out value is related to the length of the requested transfer, in bytes, and is calculated as follows:
timeout_value = ((transfer_length / 65536) + 1) * 20
In the calculation, 20 is the default scaling factor that generates the time out value. The caller can customize the time out value through the TMCHGIMPARM operation. The actual period that elapses before a time out occurs can be up to 10 seconds longer than the calculated value, because it is related to the operation of the hardware at the time of the write operation. A time out value of zero means that no time out occurs. A value of zero is not allowed when the write operation is nonblocking, because a deadlock might occur. Under this condition, EINVAL is returned for the write operation.
If the caller opened the initiator mode device with the O_NDELAY flag set, the write operation
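The time out calculation can be worked through with a short sketch. It reads the calculation above as ((transfer_length / 65536) + 1) * 20 and assumes that the division discards the remainder, as integer division in the driver would; both points are assumptions about this excerpt rather than statements from the driver source. The function name write_timeout is hypothetical.

```python
DEFAULT_SCALING_FACTOR = 20  # default scaling factor named in the text
TRANSFER_UNIT = 65536        # divisor used in the calculation, in bytes


def write_timeout(transfer_length: int,
                  scaling_factor: int = DEFAULT_SCALING_FACTOR) -> int:
    """Return the default time out value for a transfer of the given length.

    Assumes integer division (remainder discarded), which is an
    interpretation of the formula, not a documented fact.
    """
    if transfer_length < 0:
        raise ValueError("transfer_length must not be negative")
    return (transfer_length // TRANSFER_UNIT + 1) * scaling_factor
```

Under these assumptions, a transfer of up to 65535 bytes gets a time out value of 20, and a 1 MB transfer (1048576 bytes) gets (16 + 1) * 20 = 340. The actual period before a time out occurs can be up to 10 seconds longer, as noted above.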
115. ssacand Command Purpose Syntax Description Flags ssa_certify Command i Purpose Syntax Description Flags ssaconn Command Purpose Syntax Description Flags ssa_diag Command Purpose Syntax Description Flags Output ssadisk Command Purpose Syntax Description Flags ssadload Command Purpose Syntax Description Flags Examples ssa_ela Command Purpose Syntax Description Flags Output ssaencl Command Purpose Syntax Description Flags Examples ssa_format Command Purpose Syntax Description Flags Output User s Guide and Maintenance Information 344 344 344 344 344 345 345 345 345 347 347 347 347 347 348 348 348 348 348 348 349 349 349 349 350 350 350 350 351 353 353 353 353 353 354 355 355 355 355 357 358 358 358 358 359 ssa_fw_status Command Purpose Syntax Description Flags Output Examples ssa_getdump Command Purpose Syntax Description Flags Output 3 ssaidentify Command Purpose Syntax Description Flags ssa_progress Command Purpose Syntax Description Flags Output Examples ssa_rescheck Command Purpose Syntax Description Flags Output Examples Return Codes ssa_servicemode Command Purpose Syntax Description Fla
116. ssaraid H 1 RaidManager n hdiskn a fw_suspended false Where RaidManager is the SSA adapter that is managing the fast write device and hdiskn is the hdisk number that you noted in step Al 42540 Description Two way fast write for a disk drive is configured to operate e Failure in another using only when both caches are available One cache however is now not system available e A user action has set Action PAn Cache when way 1 If the using system that contains the partner adapter is switched off y switch it on Configuration change 2 If the configuration has been changed review the configuration rules and restore a valid configuration 3 Run diagnostics on the partner adapter and correct all problems The partner adapter is the other adapter on the SSA loop that includes the adapter that reported the problem 4 If while the other adapter is not available the user wants to use the fast write function on disk drives that are attached to this adapter change the state of the Bypass Cache In 1 Way Fast Write Network 422 User s Guide and Maintenance Information SRN Problem Possible Causes 43PAA Description An SSA device on the link is preventing the completion of the Possible FRUs loop configuration Device 90 Exchanging Action If the SSA service aids are available run the Link Verification vs service aid see j to determine which device is preventing Eeee
117. strings of the subsystem or enclosure if it detects that neither of its connectors is connected to a powered-on SSA adapter or device. Therefore, if the two SSA adapters fail, or are held reset but remain powered on, the bypass card does not operate, and the disk drives become isolated. For more information about bypass cards, see the publications for your disk subsystem or enclosure.
In Figure 6, two SSA adapters are in using system 1, and two SSA adapters are in using system 2. In each using system, the two adapters are connected to each other. If the two SSA adapters of either using system fail, or are held reset but remain powered on, all the disk drives can still communicate with the other using system.
Figure 6. One Loop, Two Adapters in Each of Two Using Systems (using system 1 and using system 2, each with two adapters that have ports A1, A2, B1, and B2, joined in one loop with disks 1 through 16)
If, however, disk drives are connected into the link between two SSA adapters that are in the same using system, those disk drives become isolated if both SSA adapters fail or are held reset but remain powered on. For example, disk drives 13 through 16 might be connected between the SSA adapters in using system 1. If both adapters fail or
118. the adapter from the using system (see the Installation and Service Guide for the using system).
2. Refer to Figure 43.
3. Holding your thumbs against the connector, open the clips by pressing them in the directions shown by the arrows in the diagram. This action pulls the SDRAM module out of the connector.
4. Remove the SDRAM module.
Figure 43. Removing the SDRAM Module
Installing an SDRAM Module of an Advanced SerialRAID Adapter
Attention:
- The adapter assembly contains parts that are electrostatic discharge (ESD) sensitive. Use the tools and procedures defined by your organization to protect such parts.
- If you are exchanging a new SDRAM for a failed SDRAM, ensure that the new SDRAM is the same size as the old SDRAM. Refer to the label on each SDRAM.
1. Refer to Figure 44.
Figure 44. Check
119. the command: ssa_delete_copy -v newvgname
ssa_delete_copy Command
Purpose: To delete a RAID Copy array after it has been backed up.
Syntax: ssa_delete_copy -v vgname -p pvname -A -C -f
Flags:
-A Deletes a copy and its RAID Copy arrays.
-C Deletes a copy and couples the RAID Copy to its original parents.
-f If the specified volume group is varied off, this flag forces the volume group to be exported. If the specified volume group is varied on and file systems are mounted, this flag attempts to unmount the file systems.
Chapter 7 Copying Data from Arrays and from Volume Groups
SMIT Menus for 3-Way Copy Operations
This section describes the SMIT menus that are related to the 3-Way Copy function. These SMIT menus help you to develop your own shell scripts to manage the copy operations.
Getting Access to the Array Copy Services Menu
1. For fast path access to the SSA RAID Array SMIT menu, type smitty ssaraid and press Enter. Otherwise:
a. Type smitty and press Enter.
b. Select Devices. The Devices menu is displayed.
c. Select SSA RAID Arrays.
2. The SSA RAID Arrays menu is displayed:
SSA RAID Arrays
Move cursor to desired item and press Enter.
List All Defined SSA RAID Arrays
List All Supported SSA RAID Arrays
List All SSA RAID Arrays Connected to a RAID Manager
List Status Of All Defined SSA RAID Arrays
120. the hour.
run_ssa_link_speed cron
SSA links can run at 20 MB per second or at 40 MB per second. Normally, two SSA nodes that can communicate at 40 MB per second operate the link between them at 40 MB per second. Some faults, however, can cause a 40 MB per second link to run at only 20 MB per second. The run_ssa_link_speed program detects when a pair of high-speed nodes are running at low speed. The following entry is added to the cron table to invoke this program at 04:30 each day:
   30 4 * * * /usr/lpp/diagnostics/bin/run_ssa_link_speed 1>/dev/null 2>/dev/null
If the program detects a problem, it sends an error code to the system error log. You can also use the ssa_speed -x command to check whether any high-speed nodes are running at low speed (see [...]).
Duplicate Node Test
The node_number attribute of the ssar can be set to enable SSA disk fencing or SSA target mode operations. It is important, however, that duplicate node numbers do not exist on the subsystem. Each hour, therefore, the device driver performs a duplicate node number test. If this test finds a duplicate node number, it logs an error code under the SSA_SETUP_ERROR label. The device driver continues to log this error each hour until the problem is solved. This test runs separately from run_ssa_healthcheck. The test is run hourly, but not at any specific time in the hour.
Chapter 11. SSA Error Logs 227
Error Logging Management
Summary
If an error is permanent, it
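The check that run_ssa_link_speed performs can be pictured with a minimal Python sketch. This is only an illustration of the rule stated above (both nodes are capable of 40 MB per second, yet the link runs slower); the Link record, its field names, and the sample data are hypothetical and are not taken from the program itself.

```python
from dataclasses import dataclass

HIGH_SPEED_MB = 40  # MB per second, the high link speed named in the text


@dataclass
class Link:
    # Hypothetical record for one SSA link between two nodes.
    node_a: str
    node_b: str
    node_a_max_mb: int   # highest speed node A can run at
    node_b_max_mb: int   # highest speed node B can run at
    current_mb: int      # speed the link is running at now


def links_running_slow(links):
    """Return links whose two nodes can both run at 40 MB/s but that run slower."""
    slow = []
    for link in links:
        both_fast = min(link.node_a_max_mb, link.node_b_max_mb) >= HIGH_SPEED_MB
        if both_fast and link.current_mb < HIGH_SPEED_MB:
            slow.append(link)
    return slow


links = [
    Link("ssa0", "pdisk1", 40, 40, 40),    # healthy high-speed link
    Link("pdisk1", "pdisk2", 40, 40, 20),  # high-speed pair running slow
    Link("pdisk2", "pdisk3", 40, 20, 20),  # one node is low speed: expected
]
for link in links_running_slow(links):
    print(f"{link.node_a} <-> {link.node_b}: running at {link.current_mb} MB/s")
```

A link to a node that can only run at 20 MB per second is not reported, because that speed is normal for such a pair.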
121. then make the devices available again.
To disable the fast-write cache, type:
   ssaraid -l ssaX -H -n Y -a fastwrite=off -a force=yes -u
where X is the number of the adapter that has reported the failure, and Y is the name of the device. The name of the device can be either the logical disk name or the SSA serial number. A typical command line might be, therefore:
   ssaraid -l ssa3 -H -n pdisk5 -a fastwrite=off -a force=yes -u
or
   ssaraid -l ssa3 -H -n 2327340C423235K -a fastwrite=off -a force=yes -u
The force attribute ensures that all data is lost from the fast-write cache. You cannot recover that data. The force attribute also prevents the reattachment of the disk to the using system; no logical disk can, therefore, be created. The actions of the force attribute are important because the lost data might include file system metadata. If that data is damaged as a result of the fast-write cache failure, further data loss and system crashes might occur when you attempt to restart the file system.
When the fast-write cache has been disabled, you can attempt to recover the data on the device.
Attention: Ensure that the disk is not returned with its current use defined as System Disk until you are sure that the file system is safe.
• To reattach the disk and create a logical disk, type:
   ssaraid -l ssaX -H -n Y -a use=system -k Z -d
where X is the adapter number, Y is the 15-digit device seri
122. this ioctl.
The following ISAL commands (minor function codes that are defined in the /usr/include/ipn/ipnsal.h file) can be issued:
   FN_ISALMgr_Inquiry
   FN_ISALMgr_HardwareInquiry
   FN_ISALMgr_GetPhysicalResourceIDs
   FN_ISALMgrVPDInquiry
   FN_ISALMgr_Characteristics
   FN_ISALMgr_Statistics
   FN_ISALMgr_FlashIndicator
The arg parameter for the SSADISK_ISALMgr_CMD ioctl is the address of an ssadisk_ioctl_parms structure. This structure is defined in the /usr/include/sys/ssadisk.h file.
The SSADISK_ISALMgr_CMD ioctl uses the following fields of the ssadisk_ioctl_parms structure:
   dsb  Contains the directive status byte that is returned for the command. The byte contains a value from the /usr/include/ipn/ipndef.h file. A non-zero value indicates an error.
   result  Contains the IPN result word that is returned by IPN for the command. The word contains values from the /usr/include/ipn/ipntra.h file. A non-zero value indicates an error.
   u.isal.parameter_descriptor  Set by the caller to indicate the buffer for parameter data.
   u.isal.transmit_descriptor  Set by the caller to indicate the buffer for transmit data.
   u.isal.receive_descriptor  Set by the caller to indicate the buffer for received data.
   u0.isal.status_descriptor  Set by the caller to indicate the buffer for status data.
   u.isal.minor_function  Set by the caller to one of the ISAL Manager Commands that is defined in the /usr/includ
Chapter 13. Using the Programming Interface 281
123. this option to change the level of microcode that is installed on selected Available SSA disk drives.
   Download Microcode to all SSA Physical Disk Drives
   Select this option to load the latest level of microcode on all Available SSA disk drives.
   F3=Cancel F10=Exit
To display the levels of microcode that are installed on the SSA disk drives, select Display the Microcode levels of all SSA Physical Disk Drives. A list of pdisks is displayed:
   MICROCODE DOWNLOAD 802421
   To set or reset Identify, move cursor onto selection, then press <Enter>.
   Physical            Serial#    ROSid
   systemname:pdisk0   AC51DB47   8877
   systemname:pdisk1   AC9EDE7F   9292
   F3=Cancel F10=Exit
3. Attention: For several seconds during microcode download, new data is written to the disk drive EEPROM. If the power fails while that data is being written, the disk drive microcode might become corrupted. The microcode cannot be corrected. Normally, exchange the disk drive for a new one. If you need to try to save data, you might be able to exchange the electronics card assembly of the disk drive. For more details, see the Installation and Service Guide for the unit that contains the disk drive.
To download microcode to one specific disk drive, select Download Microcode to selected SSA Physical Disk Drives, and follow the instructions that are displayed. You normally select this option when you do not
124. to a pool that contains RAID arrays. Alternatively, you can change the use of hot spare disk drives that are in this pool.
Chapter 6. Using the RAID Array Configurator
This chapter describes how to use the system management interface tool (SMIT) to manage your SSA RAID arrays. The SMIT provides a set of menus from which you can select the various functions of the ssaraid command. The ssaraid command allows you to create, delete, and manage your RAID arrays. If you prefer to use the ssaraid command through the command line interface instead, see [...].
Note: If you select a List function from a SMIT menu, and no resource of the required type exists, the following error pop-up window might be displayed:
   Change Member Disks in an SSA RAID Array
   Move cursor to desired item and press Enter.
   Remove a Disk from an SSA RAID Array
   Add a Disk to an SSA RAID Array
   Swap Members of an SSA RAID Array
      ERROR MESSAGE
      Press Enter or Cancel to return to the application.
      1800-051 There are no items of this type.
      F1=Help F2=Refresh F3=Cancel F8=Image F10=Exit Enter=Do
For example, if you select Add a Disk to an SSA RAID Array, and no exposed or degraded arrays exist, the error pop-up window is displayed.
Installing and Configuring SSA RAID Arrays
You can get to the required SMIT menu by using fast path commands, or by w
125. to an SSA adapter. These names are related to devices that are in the customized device database and have the SSA adapter as their adapter_a or adapter_b attribute.
Flags
   -a AdapterName  Specifies the adapter to which the disk drives are connected.
   -P  Produce a list of physical disks.
   -L  Produce a list of logical disks.
Chapter 16. Using the SSA Command Line Utilities 349
ssadload Command
Purpose
   To download microcode to SSA physical disk drives.
Syntax
   ssadload -d PhysicalDiskName -f CodeFileName -p
   ssadload -u -d PhysicalDiskName -a adapter -p
   ssadload -s -d PhysicalDiskName -a adapter
Description
   The ssadload command performs microcode downloads to SSA physical disk drives. The command has three modes of operation:
   • Load a specific level of microcode into a specific SSA physical disk drive. Using the command in this mode, you can force-load any available level of microcode into any compatible SSA disk drive.
   • Ensure that either a specific SSA physical disk drive, or all the physical disk drives that are connected to the system, are using the latest levels of microcode that are available on the system. Using the command in this mode, you can ensure that the latest available level of microcode has been loaded onto all compatible SSA disk drives in the system.
   • Show the existing microcode level that is installed on either a specific SSA physical disk drive or on all the SSA physical disk dri
126. to the Good state.
• A write operation to the array occurs before the missing primary disk drive becomes available again. Under this condition:
   - If a hot spare disk drive is available, that hot spare disk drive automatically becomes the new primary disk drive, and the array goes into the Rebuilding state. If the missing primary disk drive then becomes available again, it is rejected.
   - If, after the write operation occurs, no hot spare disk drive is available, the array enters the Degraded state. If the missing primary disk drive then becomes available again, it is rejected.
Figure 29. Single-Host System with Primary Disk Drive Missing (System 1, SSA Adapter, Primary 1, Secondary 1)
Dual-Host System with Primary Disk Drive Missing
[...] shows a dual-host system that has just been switched on. The system contains a RAID-1 array whose primary disk drive is missing. The array remains online, but in the Exposed state, until one of the following occurs:
• The missing primary disk drive becomes available again before a write operation to the array occurs. Under this condition, the restored disk drive is returned as the array primary disk drive, and the array returns to the Good state.
• A write operation to the array occurs before the missing primary disk drive becomes available again. Under this condition:
   - If a hot spare disk drive is available, that hot spare disk drive automat
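The single-host behavior described above amounts to a small state machine. The following Python sketch is a reading aid only, under the assumption that the array starts in the Exposed state with its primary disk drive missing; the event names and function names are hypothetical.

```python
from enum import Enum


class ArrayState(Enum):
    EXPOSED = "Exposed"
    GOOD = "Good"
    REBUILDING = "Rebuilding"
    DEGRADED = "Degraded"


def next_state(event, hot_spare_available=False):
    """State that an Exposed array (primary missing) enters on the first event.

    event is one of two hypothetical names:
      "primary_returns" - the missing primary comes back before any write
      "write"           - a write occurs while the primary is still missing
    """
    if event == "primary_returns":
        return ArrayState.GOOD
    if event == "write":
        # A hot spare takes over as the new primary if one is available.
        if hot_spare_available:
            return ArrayState.REBUILDING
        return ArrayState.DEGRADED
    raise ValueError(f"unknown event: {event!r}")


def primary_accepted_on_return(write_occurred):
    """Once a write has happened, a returning primary is rejected."""
    return not write_occurred


print(next_state("primary_returns").value)
print(next_state("write", hot_spare_available=True).value)
print(next_state("write", hot_spare_available=False).value)
```

The key point the sketch captures is that the first write while the primary is missing is the turning point: after it, the old primary is never taken back automatically.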
127. turn each SSA adapter. If you are using pools other than A0 and B0, the hot spare pool to which the exchanged disk drive must now be added is listed with a status of reduced, critical, or empty. If multiple disk drives have been exchanged, and multiple hot spare pools exist, either ask the customer which disk drives are to be assigned to which hot spare pools, or see [...] for guidance.
MAP 2410: SSA Repair Verification
This MAP helps you to verify that FRUs that you have exchanged for new FRUs, or repair actions that you have done, have solved all the problems on the subsystem.
Attention: Unless the using system needs to be switched off for some other reason, do not switch off the using system when servicing an SSA link or an enclosure in which SSA devices are installed. Enclosure power cables and external SSA cables that connect devices to the using system can be disconnected while that system is running.
1. (from steps [...] in MAP 2320: SSA Link; step 8 in MAP 2323: SSA Intermittent; step [...] in MAP 2324: SSA RAID)
   Have you exchanged a FRU?
   NO
      a. Run diagnostics, in System Verification mode, to the device that reported the problem.
         Note: Do not run Advanced Diagnostics; otherwise, errors might be logged on other using systems that share the same loop.
      b. Go to step [...] on page 474.
   YES
      Go to step [...].
2. (from step [...])
   Before you arrived at this MAP, you exchanged one or more FRUs for new
128. unused
   pool_A1 3 2 2 1 full
   pool_B1 6 1 1 1 full
   F1=Help F2=Refresh F3=Cancel F6=Command F8=Image F9=Shell F10=Exit /=Find n=Find Next
The normal operating state for a hot spare pool is Full. Any other state indicates that a problem exists or some configuration actions are required. The possible states are:
   Full  The number of hot spare disk drives that are in the pool is equal to the number of hot spare disk drives that were in the pool when the pool was last configured.
   Empty  No hot spares are in the pool, or hot spares in the pool are not of a suitable size for one or more of the arrays in the pool. Hot spare disk drives must have a capacity equal to or greater than that of the smallest disk drive in the array, or, if hot spare exact has been selected for the array, the capacity of the hot spare disk drive must be exactly equal to that of the smallest disk drive in the array.
To add a hot spare disk drive to a pool:
1. Ensure that you have disk drives that:
   • Are assigned as hot spare disk drives or as array candidate disk drives
   • Have a size equal to or greater than the largest disk drive that they will be protecting
   • Are in pool 0 on the SSA loop on which the hot spare pool exists
2. If you are not sure whether your disk drives are assigned correctly:
   a. Ensure that the required number of disk drives of the correct size are assigned as hot spare disk drives or as array ca
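The capacity rule given under the Empty state can be written as a short check. This Python sketch only restates that rule; the function name, the parameter names, and the use of gigabytes as the unit are assumptions made for the illustration.

```python
def is_suitable_hot_spare(spare_gb, member_gbs, exact=False):
    """Apply the capacity rule for a hot spare against one array.

    spare_gb   - capacity of the candidate hot spare (unit is an assumption)
    member_gbs - capacities of the array's member disk drives
    exact      - True if "hot spare exact" has been selected for the array
    """
    smallest = min(member_gbs)
    if exact:
        # With hot spare exact, the capacities must match exactly.
        return spare_gb == smallest
    # Otherwise the spare must be at least as large as the smallest member.
    return spare_gb >= smallest


members = [4.5, 4.5, 9.1]
print(is_suitable_hot_spare(9.1, members))              # larger spare, no exact
print(is_suitable_hot_spare(9.1, members, exact=True))  # larger spare, exact
print(is_suitable_hot_spare(2.2, members))              # spare too small
```

A pool whose only spares fail this check for one of its arrays is reported as Empty even though it physically contains disk drives.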
129. use of member disk drives. You must first remove the disk drives from their array, either by deleting the array or by exchanging them out of the array with the -A and -i exchange options of the ssaraid command.
You can assign new uses to disk drives that have been rejected. You must, however, first check the disk drives to find the cause of the problem.
   fastwrite=on/off (default=off)
      This attribute enables and disables the fast-write cache. When using the fast-write cache, you can use the following attributes to control the operation of the cache:
   fw_start_block (default=0)
      See the definition for fw_end_block.
   fw_end_block (default=array size)
      This attribute and the fw_start_block attribute control the range of blocks for which the fast-write cache is enabled. Write operations that are outside the default range of 0 through array size write data directly to the disk, and do not use the fast-write cache.
   fw_max_length (default=128)
      This attribute sets the maximum size, in blocks, of write operations to the fast-write cache. Write operations that are larger than the specified value write data directly to the disk, and do not use the fast-write cache.
   force=yes/no
      If a disk is using a fast-write cache that is failing, you must specify this attribute as yes to allow the fast-write cache to be disabled.
Action Attributes (RAID-1, RAID-5, and RAID-10 Only)
You can specify the following attrib
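How the fastwrite, fw_start_block, fw_end_block, and fw_max_length attributes interact can be sketched in Python as follows. The attribute names and defaults come from the text above; everything else is an assumption made for the illustration, in particular that the block range is inclusive at both ends and that a write must lie entirely inside the range to use the cache.

```python
def uses_fast_write_cache(first_block, num_blocks, array_size,
                          fastwrite=True, fw_start_block=0,
                          fw_end_block=None, fw_max_length=128):
    """Decide whether a write goes through the fast-write cache.

    Assumptions (not stated in the source): the fw_start_block..fw_end_block
    range is inclusive, and a write that only partly overlaps the range
    bypasses the cache.
    """
    if not fastwrite:
        return False
    if fw_end_block is None:
        fw_end_block = array_size      # documented default
    if num_blocks > fw_max_length:
        return False                   # larger writes go directly to disk
    last_block = first_block + num_blocks - 1
    return fw_start_block <= first_block and last_block <= fw_end_block


print(uses_fast_write_cache(0, 64, array_size=1000))      # small write in range
print(uses_fast_write_cache(0, 129, array_size=1000))     # longer than 128 blocks
print(uses_fast_write_cache(900, 64, array_size=1000,
                            fw_end_block=500))            # outside the range
```

Narrowing the block range or lowering fw_max_length therefore reduces the share of writes that the cache absorbs, without turning the cache off.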
130. use the lsdev command to determine the unique identity, as follows:
   1. Type lsdev -Ccdisk -r connwhere and press Enter.
   2. Select the 15-character unique identifier (UID) that was given by the RAID configuration program when the array was created.
• Run the ssacand command, and specify the adapter to which the logical disk is connected. For example:
   ssacand -a ssa0 -L
Device Attributes
SSA logical disks, SSA physical disks, and the ssar router have several attributes. You can use the lsattr command to display these attributes.
Attributes of the SSA Router, ssar
   node_number
      This attribute must be set on systems that are using the SSA Fencing facility or the SSA Disk Concurrent Mode of Operation Interface facility. These facilities of the SSA disk device driver are used only in configurations where the SSA disk drives are connected to more than one using system. Therefore, in configurations where the SSA disk drives are connected to only one using system, the node_number attribute has no effect. For configurations that use SSA Fencing or the SSA Disk Concurrent Mode of Operation Interface, set the node_number to a different value on each using system that is in the configuration.
Attributes Common to SSA Logical and SSA Physical Disks
   adapter_a
      Specifies either the name of one adapter that is connected to the device, or none if no adapter is connected as adapter_a now.
   adapter_b
      Sp
131. using system to which the disk drives are connected
• Physical disk drive resource identifiers
• Serial numbers of the physical disk drives. The actual serial number of a disk drive is shown on a label on the disk drive.
• Descriptions of the disk drives
Chapter 17. SSA Service Aids 379
2. Select the pdisk that you want to identify or put into Service Mode (for example, pdisk3). The following display appears with details of the disk drive that you have just selected:
   SET SERVICE MODE 802382
   systemname:pdisk0 AC50AE43 4GB SSA C Physical Disk Drive
   Move cursor onto selection, then press <Enter>.
   Set or Reset Identify
      Select this option to set or reset the Identify indicator on the disk drive.
   > Set or Reset Service Mode
      Select this option to set or reset Service Mode on the disk drive.
      ENSURE THAT NO OTHER HOST SYSTEM IS USING THIS DISK DRIVE BEFORE SELECTING THIS OPTION.
   F3=Cancel F10=Exit
3. Select Service Mode or the Identify function. For this example, assume that you have selected Service Mode. The list of pdisks is displayed again, and the disk drive that you selected is marked by a >, which shows that the disk drive is in Service Mode:
   SET SERVICE MODE 802381
   Move cursor onto selection, then press <Enter>.
   systemname:pdisk0 AC50AE43 2GB SSA C Physical Disk Drive
   systemname:pdisk1 AC706EA3 2
132. want the microcode on the selected disk drive to be at the latest available level.
4. If you have a new level of microcode to install, or if you have replaced a disk drive and want to upgrade it to the present level, select Download Microcode to all SSA Physical Disk Drives. This option ensures that all disk drives have the latest level of microcode installed. It downloads microcode only to those disk drives whose level of microcode is lower than that in the microcode directory or on the microcode diskette.
   Note: Different types of SSA disk drive might need different versions of the microcode. Microcode download files are provided for each type of disk drive. Where a system contains more than one type of SSA disk drive, this service aid selects the correct microcode file for each of those types.
Link Speed Service Aid
The Link Speed service aid allows you to display the operating speed of each link on an SSA adapter. To use the Link Speed service aid:
1. Select Link Speed from the SSA Service Aids menu (see [...] on page 376). A list of adapters is displayed:
   LINK SPEED 802437
   Move cursor onto selection, then press <Enter>.
   systemname:ssa0 00-03 IBM SSA 160 SerialRAID Adapter
   systemname:ssa1 00-04 IBM SSA 160 SerialRAID Adapter
   systemname:ssa2 00-05 IBM SSA 160 SerialRAID Adapter
   F3=Cancel F10=Exit
2. Select the ad
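The selection rule behind Download Microcode to all SSA Physical Disk Drives (download only where the installed level is lower than the available level, using the file that matches the drive type) can be sketched in Python. The drive type names, the numeric comparison of levels, and the data layout are all assumptions made for this illustration; the service aid's real data formats are not described here.

```python
def drives_to_update(installed, available):
    """Return the names of the drives that would receive a download.

    installed - list of (pdisk_name, drive_type, installed_level) tuples
    available - dict mapping drive_type to the level held in the microcode
                directory or on the microcode diskette
    Levels are compared as plain integers, which is an assumption.
    """
    selected = []
    for name, drive_type, level in installed:
        newest = available.get(drive_type)
        # Skip drive types for which no microcode file is present.
        if newest is not None and level < newest:
            selected.append(name)
    return selected


installed = [
    ("pdisk0", "typeA", 8877),   # older than the available typeA level
    ("pdisk1", "typeB", 9292),   # already at the available typeB level
    ("pdisk2", "typeA", 9000),   # already at the available typeA level
]
available = {"typeA": 9000, "typeB": 9292}
print(drives_to_update(installed, available))
```

Keeping one available level per drive type mirrors the note above: each type of SSA disk drive is matched against its own microcode file.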
133. which of one or more conditions it wants to be notified by a bitwise OR of one or more flags. The target-mode device driver provides support for the following select events:
   POLLIN  Check whether received data is available.
   POLLSYNC  Return only events that are currently pending. No asynchronous notification occurs.
The additional events POLLOUT and POLLPRI are not applicable. The target-mode device driver does not, therefore, provide support for them.
The reventp output parameter points to the result of the conditional checks. The device driver can return a bitwise OR of the following flags:
   POLLIN  Received data is available.
The chan input parameter is used for specifying a channel number. This parameter is not applicable for nonmultiplexed device drivers. It should be set to 0 for the target-mode device driver.
The POLLIN event is indicated by the device driver when any data is received for this target instance. A nonblocking read subroutine, if subsequently issued by the caller, returns data. For a blocking read subroutine, the read does not return until either the requested length is received or the write operation ends, whichever comes first.
Asynchronous notification of the POLLIN event occurs when received data is available. This notification occurs only if the select event POLLSYNC was not set.
The initiator-mode device driver provides support for the following s
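The way the requested events and the returned events combine for the target-mode device driver can be modeled in a few lines of Python. This is a model of the rules stated above, not driver code: the flag values are placeholders chosen for the sketch (the real values come from the system header files), and the function name is hypothetical.

```python
from enum import IntFlag


class Poll(IntFlag):
    # Placeholder bit values for illustration only.
    POLLIN = 0x1
    POLLOUT = 0x2
    POLLPRI = 0x4
    POLLSYNC = 0x8


def target_mode_select(events, data_available):
    """Model the target-mode select rules.

    Returns (revents, async_notification):
      revents            - POLLIN if it was requested and data is waiting
      async_notification - True unless the caller set POLLSYNC
    POLLOUT and POLLPRI are ignored because they are not applicable.
    """
    revents = Poll(0)
    if (events & Poll.POLLIN) and data_available:
        revents |= Poll.POLLIN
    async_notification = not (events & Poll.POLLSYNC)
    return revents, async_notification


# The caller combines the conditions it is interested in with a bitwise OR.
requested = Poll.POLLIN | Poll.POLLSYNC
revents, wants_async = target_mode_select(requested, data_available=True)
print(bool(revents & Poll.POLLIN), wants_async)
```

Setting POLLSYNC turns the call into a pure status check: the caller learns what is pending now and is not notified later.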
134. write operation. Any protocol that is needed to manage the communication of data must be implemented in user-supplied programs. The only delays that can occur when data is being received are delays that are characteristics of the SSA system and of the environment in which it operates, and delays that are caused by full buffers.
SSA Target Mode can concurrently send data to, and receive data from, all attached nodes. Blocking read and blocking write operations do nothing until data is available to be read, or until the write operation is complete.
Execution of Target Mode Requests
The write operation transfers the data into the device buffers. When a buffer is full, the SSA adapter starts to transfer the data to the remote using system. At the same time, the user's application program continues to fill the device buffer with the remaining data that is being transferred. If the amount of data that is being written is larger than the available buffer space, the application program waits until more space becomes available in the device buffers.
As each buffer is sent, the tmssa device driver checks whether any more data is to be sent. If more data is to be sent, the device driver continues to send that data. If no more data is to be sent, and the write operation is in blocking mode, the device driver starts the waiting application program. If the write operation is in nonblocking mode, the write status is updated. If an unrecoverable error occurs, the wr
135. 0004AC506D6D00D member n/a 4.5GB Physical disk
   pdisk2 0004AC50A44200D member n/a 2.3GB Physical disk
   pdisk3 0004AC515EA400D free n/a 4.5GB Physical disk
   hdisk5 900335FE80C84CK good 9.0GB RAID-5 array
Chapter 12. Using the SSA Command Line Interface for RAID Configurations 241
Example 7: To Change an Attribute of an Object
This example shows how to change an attribute of an object; for example, to change the new array so that it does not automatically call a hot spare disk drive if one of its member disk drives goes into the Offline state.
Type the command:
   ssaraid -H -l ssa0 -n hdisk5 -a spare=false
where:
   -H  specifies that this operation is a change operation.
   -n hdisk5  specifies that the object hdisk5 is about to be changed.
   -a spare  specifies the new value for the spare attribute.
Example 8: To Exchange a Member Disk Drive of an Existing Array
This example shows how to exchange a member disk drive of an existing array with a free pdisk; for example, to exchange pdisk0 with pdisk3.
Type the command:
   ssaraid -A -l ssa0 -i exchange -n hdisk5 -a old_member=pdisk0 -a new_member=pdisk3
where:
   -A  specifies that this operation is an exchange operation.
   -n hdisk5  specifies the array that is about to be changed.
   -a old_member  specifies the name of the existing member disk drive.
   -a new_member  specifies the name of the new member disk drive.
If you give a List command in summary format, a result similar to that shown here is dis
136. 05 SRNs service request numbers 411 SSA adapter device driver head device driver interface 256 SSA adapter ID during bring up 6 SSA adapter removal and replacement 327 SSA Adapters SMIT menu 40 SSA Adapters SMIT menu getting access 40 SSA Command Line Interface for RAID 235 action attributes 251 command syntax 237 couple action attributes 252 hot spare pool creation and change attributes 249 instruct types 238 object types 238 options 238 physical disk change attributes 249 RAID arrays change attributes 248 RAID arrays creation and change attributes 244 return codes 253 SSARAID command attributes 244 uncouple action attributes 252 SSA disk concurrent mode of operation interface 287 device driver entry point 287 top kernel extension entry point 288 SSA disk fencing 290 SSA Disks SMIT menu 41 SSA Disks SMIT menu getting access 41 SSA error conditions summary 259 SSA error logs 221 error log analysis detailed description 230 summary 229 error logging detailed description 222 summary 221 error logging management detailed description 228 summary 228 SSA link error problem determination 478 link status ready lights 481 service aid 482 ssa link errors 478 SSA link speed 18 SSA Logical Disks option 213 SSA loops configuring devices 18 device unique IDs UIDs 21 disk drive identification 19 finding the physical location of a device 409 Index 505 SSA loops continued large configurations 16 links and data paths 7 loops
137.
   LINK VERIFICATION 802386
   SSA Link Verification for: nunu:ssa0 00-04 IBM SSA 160 SerialRAID Adapter
   To Set or Reset Identify, move cursor onto selection, then press <Enter>.
   Physical      Serial#    Adapter Port      Status
                            A1  A2  B1  B2
   [TOP]
   nunu:pdisk1   AC7AA09A   0   3             Good
   nunu:pdisk2   AC7AA2D6   1   2             Good
   nunu:pdisk3   AC7AA0BD   2   1             Good
   nunu:pdisk4   AC7AA0B1   3   0             Good
   F3=Cancel F10=Exit
Example 2: If the SRN is 24104, the device, in theory, is connected to adapter port 1 (shown as A2 on the screen). The device, however, has an SSA address of 04. That address is higher than the highest address that is displayed for adapter port 1. The devices are, therefore, the SSA adapter and pdisk1.
3. (from step [...])
   The problem is in the SSA link between the two devices that you identified in step [...]. Exchange, in the sequence shown, the following FRUs for new FRUs. Ensure that for each FRU exchange, you go to [...] to verify the repair.
   a. One of the two devices that are identified by the SRN (see [...]).
   b. The other of the two devices.
   c. The internal SSA connections of the enclosure or enclosures in which the devices are installed.
   d. The external SSA cable.
Chapter 18. SSA Problem Determination Procedures 453
MAP 2324: SSA RAID
This MAP helps you to solve problems that have occurred in SSA RAID arrays.
Attention: Unless the using system needs to be switched off for some othe
138. 1. If you are removing a disk drive under concurrent maintenance (see the service information for the device that contains the disk drive), you must first determine which hdisk is using the pdisk that you want to remove. To do this, you can use the Configuration Verification service aid (see [...]).
   • If no hdisk is using the pdisk, go to step [...].
   • If an hdisk is using the pdisk, you must now determine the type of the hdisk. The hdisk might be an SSA disk drive or a RAID disk. To determine whether the hdisk is a RAID disk, type smitty lsdssaraid. A list is displayed of all hdisks that are configured as RAID arrays.
      - If an hdisk that is a RAID-1, RAID-5, or RAID-10 array is using the pdisk, you can remove the disk drive from the array. Go to step [...].
      - If an hdisk that is a RAID-0 array is using the pdisk:
         a. Ask the customer to make a backup of the data, if possible, from the array, and to make the disk free.
         b. Ensure that you delete the array before you exchange the pdisk.
         c. Note the hdisk number.
         d. Go to step [...].
      - If the hdisk is not a RAID disk:
         a. Ask the customer to make a backup of the data, if possible, from the array, and to make the disk free.
         b. Ensure that the fast-write function is disabled (see [...]).
         c. Go to step [...].
Attention: You should be here only if you are working with a RAID-0 array.
For fast path, type smitty rmssaraid and press Enter. Otherwise, select Delete an SSA RAID Array from the SSA RAID Array
139. 214 215 217 218 218 220 220 221 221 221 222 228 228 228 229 229 230 233 235 237 238 238 238 239 239 239 240 240 241 241 Example 7 To Change an Attribute of an Object Example 8 To Exchange a Member Disk Drive of an Existing Array Example 9 To Make a New System Disk Example 10 To Delete an Array SSARAID Command Attributes RAID Arrays Creation and Change Attributes RAID Arrays Change Attributes Hot Spare Pool Creation and Change Attribute Physical Disk Drive Change Attributes Action Attributes RAID 1 RAID 5 and RAID 10 Only Couple Action Attributes RAID 1 and RAID 10 Only Uncouple Action Attributes RAID 1 and RAID 10 Only Return Codes Pil te My de ee od Chapter 13 Using the Programming Interface SSA Subsystem Overview Device Drivers Interface between the SSA Adapter Device Driver and Head Device Driver Trace Formatting SSA Adapter Device Driver Purpose rar Syntax Description PCI SSA Adapter ODM Attributes Device Dependent Subroutines Summary of SSA Error Conditions Managing Dumps Files IOCINFO Device Information SSA Adapter Device Driver ioctl Operation Purpose Description Files SSA_ TRANSACTION SSA Adapter Device Driver ioctl Operation Purpose ede atk aks cae Sp E BS A eee ee a ae Description Return Values Files SSA_GET_ ENTRY POINT SSA Adapter Device Driver ioctl Operation
140. 22. (from step [...] in MAP [...] and from steps [...] in this MAP)
   RAID Checkout
   You are now starting the RAID checkout procedure.
   a. Type smitty ssaraid and press Enter.
   b. Select Change/Show Use of an SSA Physical Disk from the SSA RAID Arrays menu.
   Are any disks listed as SSA physical disks that are rejected?
   NO
      Go to step [...] on page 470.
   YES
      a. Run diagnostics, in System Verification mode, to all the disk drives that are listed as rejected.
      b. Run the Certify service aid (see [...]) to all the rejected disk drives.
      c. Go to step [...] on page 470.
23. (from steps [...])
   An array is preparing a RAID array copy, but a coupled pdisk cannot be detected. All read and write operations to the array can complete normally. The array copy, however, cannot be uncoupled until the missing disk drive has been replaced and contains an exact copy of the data that is on the array.
   a. Type smitty ssaraid and press Enter.
   b. Select Array Copy Services.
   c. Select List All Copy Candidates.
   d. Note the hdisk that is in the Degraded copy state.
   e. Return to the SSA RAID Arrays menu.
   f. Select Change/Show Use of an SSA Physical Disk. The status of the disk drives that are connected to the using system is displayed.
   g. Go to step [...].
24. (from step [...])
   Are any disk drives listed as SSA physical disks that are rejected?
   NO
      A disk drive has not been detected by the adapter. Go to step [...].
   YES
      a.
141. 231469E2C37K
   095231469F7C37K
   09523146A42637K
   09523146A4B737K
   [MORE...15]
   F1=Help F2=Refresh F3=Cancel F6=Command F8=Image F9=Shell F10=Exit /=Find n=Find Next
6. If you want to delete any records, note the names of those records, and go to Deleting an Old RAID Array Recorded in an SSA RAID Manager.
Deleting an Old RAID Array Recorded in an SSA RAID Manager
This option allows you to delete the records of RAID arrays that have been disconnected, but whose records remain in the RAID manager.
1. Select List/Delete Old RAID Arrays in an SSA RAID Manager from the SSA RAID Arrays menu.
2. Select Delete an Old RAID Array Recorded in an SSA RAID Manager.
3. A list of RAID managers is displayed in a window:
   List/Delete Old RAID Arrays in an SSA RAID Manager
   Move cursor to desired item and press Enter.
   List Old RAID Arrays Recorded in an SSA RAID Manager
   Delete an Old RAID Array Recorded in an SSA RAID Manager
      SSA RAID Manager
      Move cursor to desired item and press Enter.
      ssa0 Available 00-02 IBM SSA 160 SerialRAID Adapter (14109100)
      F1=Help F2=Refresh F3=Cancel F8=Image F10=Exit Enter=Do /=Find n=Find Next
   Select the RAID manager from which you want to delete an old array.
Chapter 6. Using the RAID Array Configurator 133
4. The following information is displayed:
   Delete an Old RAID Array Recorded in an SSA RAID Manager
   Type or select values in entry fields. Press Ent
142. 329 SSA adapter card 1
SRN 40128
   Description: A 128 MB SDRAM in the adapter card module has failed.
   Possible FRUs: 128 MB SDRAM module
   Action: Exchange the FRUs for new FRUs.
SRN 42000
   Description: Either no SDRAM module is present on the adapter card, or the POST cannot determine the size of the existing SDRAM module.
   Possible FRUs: SDRAM module 99
   Action: Install an SDRAM module of the correct size, or exchange the existing SDRAM module for a new one of the correct size.
SRN 42200
   Description: Other adapters on the SSA loop are using levels of microcode that are not compatible.
   Possible causes: User or service action
   Action: Install the latest level of adapter microcode onto all the other adapters on this SSA loop.
SRN 42500
   Description: The Fast-Write Cache Option Card has failed.
   Possible FRUs: [...]
   Action:
   1. Exchange the cache card for a new one.
   2. Switch on power to the using system.
   3. If the original cache card contained data that was not moved to a disk drive, new error codes are produced. Run diagnostics, in System Verification mode, to the adapter. If an SRN is produced, do the actions for that SRN.
SRN 42510
   Description: Not enough SDRAM available to run the fast-write cache operation.
   Possible causes: User or service action
   Action:
   1. Start t
143. 455 6 from step B Either one or more disk drives have failed or an array that is not complete has been connected to the SSA adapter e f one or more disk drives have been added to this system and those disk drives were previously members of an array on this system or on another system do the following a b c d Type smitty ssaraid and press Enter Select Delete an SSA RAID Array Select the array that is in the Offline state and delete it All data that is on that RAID array is now lost You must now locate and repair any failed disk drives and make those disk drives available for the creation of a new array Go to ee e If no disk drives have been added to this system go to step Z on page 457 456 User s Guide and Maintenance Information 7 from step B The array data cannot be recovered The following steps clear the error condition and change the disk drives to a usable state a Type smitty ssaraid and press Enter b Select Change Show Use of an SSA Physical Disk Are any disk drives listed as SSA physical disks that are rejected NO YES e Go to MAP Ask the user to delete and recreate the array that is in the Offline state Run diagnostics in System Verification mode to all the disk drives that are listed as rejected Run the Certify service aid see to all the disk drives that are listed as rejected If any problems occur exchange the failed disk drives for new disk drives see
144. 59 90 98 ssavfynn 371 ssaxlate 372 swpssaraid 137 configuration information tmssa device driver 296 configuration of RAID 1 and RAID 10 arrays split site management 193 Configuration Verification service aid 387 configuring and installing SSA RAID arrays 58 adding a disk drive to an SSA RAID array 140 configuring and installing SSA RAID arrays continued adding a new hot spare pool 83 adding an SSA RAID array 60 adding disks to a hot spare pool 86 canceling all SSA disk drive identifications 129 changing member disks in an SSA RAID array 137 changing or showing the attributes of an SSA RAID array 135 changing or showing the status of a hot spare pool 74 changing or showing the use of an SSA disk drive 144 changing the use of multiple SSA physical disks 147 creating a hot spare disk drive 72 deleting an old RAID array recorded in an SSA RAID manager 133 deleting an SSA RAID array 70 getting access to the SMIT menu 59 identifying and correcting or removing failed disk drives 91 identifying array candidate disk drives 125 identifying hot spare disk drives 121 identifying rejected array disk drives 123 identifying system disk drives 127 identifying the disk drives in an SSA RAID array 119 installing a replacement disk drive 95 listing all defined SSA RAID arrays 100 listing all SSA RAID arrays that are connected to a RAID manager 102 listing all supported SSA RAID arrays 101 listing array candidate disk drives 115 listing hot
145. 86 canceling all SSA disk drive identifications 129 changing member disks in an SSA RAID array 137 changing or showing the attributes of an SSA RAID array 135 changing or showing the status of a hot spare pool 74 changing or showing the use of an SSA disk drive 144 changing the use of multiple SSA physical disks 147 creating a hot spare disk drive 72 deleting an old RAID array recorded in an SSA RAID manager 133 deleting an SSA RAID array 70 getting access to the SMIT menu 59 identifying and correcting or removing failed disk drives 91 identifying array candidate disk drives 125 identifying hot spare disk drives 121 identifying rejected array disk drives 123 identifying system disk drives 127 identifying the disk drives in an SSA RAID array 119 installing a replacement disk drive 95 listing all defined SSA RAID arrays 100 listing all SSA RAID arrays that are connected to a RAID manager 102 listing all supported SSA RAID arrays 101 listing array candidate disk drives 115 listing hot spare disk drives 111 listing old RAID arrays recorded in an SSA RAID manager 131 listing rejected array disk drives 113 listing system disk drives 117 listing the disk drives in an SSA RAID array 109 listing the disks that are in a hot spare pool 80 listing the status of all defined SSA RAID arrays 104 removing a disk drive from an SSA RAID array 138 removing disks from a hot spare pool 86 showing the disks that are protected by hot spares 77 swapping memb
146. A RAID Array Move cursor to desired item and press Enter hdisk3 095231779F0737K good 3 4G RAID 5 array hdisk4 09523173A02137K good 3 4G RAID 5 array F1l Help F2 Refresh F3 Cancel F8 Image F1O Exit Enter Do Find n Find Next Ne D 2 Select the array that you want to delete 70 Users Guide and Maintenance Information 3 A prompt is displayed in a window ie D SSA RAID Arrays Move cursor to desired item and press Enter List All Defined SSA RAID Arrays List All Supported SSA RAID Arrays List All SSA RAID Arrays Connected to a RAID Manager List Status Of All Defined SSA RAID Arrays List Identify SSA Physical Disks List Delete Old RAID Arrays Recorded in an SSA RAID Manager List Status of Hot Spare Pools List Status of Hot Spare Protection for an SSA RAID Array List Components in a Hot Spare Pool ARE YOU SURE Continuing may delete information you may want to keep This is your last chance to stop before continuing Press Enter to continue Press Cancel to return to the application Fl Help F2 Refresh F3 Cancel F8 Image F1Q Exit Enter Do Find n Find Next 4 At the prompt press Enter if you want to delete the array Press Cancel if you do not want to delete the array Chapter 6 Using the RAID Array Configurator 71 Creating a Hot Spare Disk Drive 1 For fast path type smitty chgssadisk and press Enter Otherwise select Change Show Use of an SSA Physical Disk from the
147. A RAID Arrays c Select in turn each array type that is used in your subsystem and press Enter Are any arrays listed as having an invalid data strip as shown in the following screen Note This example screen shows the status of RAID 5 arrays The screen shows similar information if you select a RAID 1 array or a RAID O array oa gt COMMAND STATUS Command OK stdout yes stderr no Before command completion additional instructions may appear below Unsynced Parity Strips Unbuilt Data Strips hdisk3 0 0 Invalid data strip hdisk4 0 0 F1 Help F2 Refresh F3 Cancel F6 Command F8 Image F9 Shel F1O Exit Find n Find Next ee J NO Review the symptoms then go to MAP start the problem determination procedure again YES a Note the hdisk number of the failing array b Go to step Hoon page 459 458 User s Guide and Maintenance Information 10 from step bh 11 a Type smitty ssaraid and press Enter b Select List Identify SSA Physical Disks c Select List Disks in an SSA RAID Array d Select the failing disk drive and note the pdisk numbers of the disk drives that are members of the array e Ask the user to create a backup of all the data from this array Some data might not be accessible f When the backup has been created ask the user to delete the array g Run diagnostics in System Verification mode to each of the pdisks that you noted previously Do the diagnostics fail when they are
148. A RAID Arrays Connected to a RAID Manager List Status Of All Defined SSA RAID Arrays List Identify SSA Physical Disks List Delete Old RAID Arrays Recorded in an SSA RAID Manager List Status of Hot Spare Pools List Status of Hot Spare Protection for an SSA RAID Array List Components in a Hot Spare Pool Add a Hot Spare Pool Add an SSA RAID Array Delete an SSA RAID Array Change Show Attributes of an SSA RAID Array Change Member Disks in an SSA RAID Array Change Show Use of an SSA Physical Disk Change Use of Multiple SSA Physical Disks Change Show Delete a Hot Spare Pool Fl Help F2 Refresh F3 Cancel F8 Image ai ie F10 Exit Enter Do P Chapter 4 Using the SSA SMIT Menus 43 44 Users Guide and Maintenance Information Chapter 5 Hot Spare Management With all levels of adapter code disk drives can be configured to be hot spare disk drives These hot spare disk drives can be used in any array that is on the same SSA loop If the adapter microcode level is at or higher than level 50 each hot spare disk drive can be configured to a particular hot spare pool The pdisks of arrays can also be configured to hot spare pools You can therefore control which hot spare disk drive is to replace a particular failed member of an array This chapter describes the ways in which you can use hot spare pools Deciding how to Configure Hot Spare Disk Drive Pools RAID 1 and RAID 10 arrays provide data protection by writing the s
149. Adapter PCI SSA Multi Initiator RAID EL Adapter Micro Channel SSA Multi Initiator RAID EL Adapter RAID 10 2 Advanced SerialRAID Adapter at microcode level above 5000 Fast Write 1 Advanced SerialRAID Adapter at microcode level below 5000 2 Advanced SerialRAID Adapter at microcode level above 5000
See the SSA Adapters: User's Guide and Maintenance Information manual, SA33-3272 (Version 01 or above), for more information about the level of code that is required in the Micro Channel SSA Multi-Initiator/RAID EL Adapter (type 4-M) or the PCI SSA Multi-Initiator/RAID EL Adapter (type 4-N).
Checking the Level of the Adapter Microcode
If you need to check the level of the adapter microcode:
1. Type on the command line: lscfg -vl ssan
where ssan is the name of the adapter whose microcode you are checking, for example, ssa0. A list of vital product data (VPD) is displayed.
2. Find ROS Level and ID, for example, 5000.
Rules for the Physical Relationship between Disk Drives and Adapters
The physical relationship between the disk drives and the adapters in an SSA loop can affect the performance of the subsystem. The following rules help you to get best performance from your subsystem.
One Pair of Adapter Connectors in the Loop
The following sequence enables you to determine the best relationship between the disk drives and t
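The microcode check described above can be scripted. The following is a minimal sketch, assuming that the VPD listing prints the label "ROS Level and ID" followed by a run of dots and then the value; that layout, the function name ros_level, and the sample text are assumptions made for illustration and were not captured from a real adapter.

```shell
# Read lscfg VPD output on stdin and print the value of "ROS Level and ID".
# Assumption: label and value are separated by a run of dots.
ros_level() {
    sed -n 's/^[[:space:]]*ROS Level and ID\.*[[:space:]]*\(.*\)$/\1/p'
}

# Illustrative sample only; field names other than "ROS Level and ID"
# and all values are placeholders.
sample_vpd='      Part Number.................EXAMPLE
      ROS Level and ID............5000
      Serial Number...............EXAMPLE'

printf '%s\n' "$sample_vpd" | ros_level

# On a using system with the adapter installed, the same filter could be
# applied to the real command:
#   lscfg -vl ssa0 | ros_level
```

Keeping the parsing in a separate filter means that it can be checked against saved output before it is run on the using system.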
150. Advanced SerialRAID Adapters User s Guide and Maintenance Information SA33 3285 02 Advanced SerialRAID Adapters User s Guide and Maintenance Information SA33 3285 02 Third Edition September 2000 This softcopy of 14 January 2002 is a minor revision to SA33 3285 02 It contains new technical changes that are not shown in the printed book Such changes are shown by a colon to the left of each change Changes that are also in the printed book are shown by a vertical line to the left of each change The following paragraph does not apply to any country where such provisions are inconsistent with local law THIS PUBLICATION IS PRINTED AS IS WITHOUT WARRANTY OF ANY KIND EITHER EXPRESS OR IMPLIED INCLUDING BUT NOT LIMITED TO THE IMPLIED WARRANTIES OF MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE Some states do not allow disclaimer of express or implied warranties in certain transactions therefore this statement may not apply to you This publication could contain technical inaccuracies or typographical errors Changes are periodically made to the information herein these changes will be incorporated in new editions of the publication It is possible that this publication may contain reference to or information about products machines and programs programming or services that are not announced in your country Such references or information must not be construed to mean that such products programming or
151. C and adapter D.
[Figure 38. Moving a RAID 10 Array: System 1 (SSA Adapter A), System 2 (SSA Adapter B), System 3 (SSA Adapter C), System 4 (SSA Adapter D); Primary 1, Primary 2, Secondary 1]
Under these conditions:
- The half of the array that is still connected to adapter A and adapter B is in the Exposed state.
- The half of the array that is connected to adapter C and adapter D is in the Offline state. System 3 and system 4 generate SRN 48755.
You can solve this problem if you do one of the following:
- Return the primary half of the array to its original adapters.
- Move the secondary half of the array to the new adapters.
- Change the state of the Split Array Resolution flag on the new adapters.
Attention: This method helps you to recover the system from severe errors. Use this method only if the other half of the array has failed completely and will never be used again.
Array is Offline because the Split and Join Procedure Was Not Performed Correctly
When this condition occurs, the host system generates SRN 48760. The condition can be caused if a RAID 1 or a RAID 10 array has been split exactly in half, and a separate write operation has been performed to each half of the array. This problem can occur in a dual host configuration if: Only one adapter can be detected on the SSA loop you change
152. Characteristics of an SSA Logical Disk from the SSA Logical Disks menu.
b. From the menu displayed, select the logical disk that you want to change. A list of options for the logical disk drives is displayed:
Change/Show Characteristics of an SSA Logical Disk
Type or select values in entry fields. Press Enter AFTER making all desired changes.
[MORE...6]                                         [Entry Fields]
Location Label
Parent                                             ssar
Size in Megabytes                                  4512
adapter_a                                          ssa0
adapter_b                                          ssa1
primary_adapter                                    adapter_a
Connection address                                 0004AC506C3600E
Physical volume IDENTIFIER                         00406 fdac2fb8203000000
ASSIGN physical volume identifier                  no
RESERVE disk on open                               yes
Queue depth                                        5
Maximum Coalesce                                   0x20000
Enable Fast Write                                  yes
Bypass Cache In 1 Way Fast Write Network           no
[BOTTOM]
F1=Help F2=Refresh F3=Cancel F4=List F5=Reset F6=Command F7=Edit F8=Image F9=Shell F10=Exit Enter=Do
If you want to enable the fast write function for a particular disk drive, set the Enable Fast Write option to yes for that disk drive. If you want to disable the fast write function for a particular disk drive, set the Enable Fast Write option to no for that disk drive.
Notes:
a. If you are running a two-way fast write operation and you enable or disable the fast write function, the hdisk on the second using system becomes unavailable. From the second using system, delete that hdisk and reconfigure as follows:
1. Type rmdev -l hdiskname -d
2. Run
153. Complete 100
- If the disk is format degraded, the following messages are displayed:
> ssa_progress -l pdisk
Failed 0
ssa_rescheck Command
Purpose
To report the reservation status of an hdisk.
Syntax
ssa_rescheck -l hdisk -h
Description
The ssa_rescheck command tests the access paths to the specified hdisk. It checks whether the disk is reserved. If the disk is reserved, the command attempts to determine why the disk is reserved.
Flags
-l hdisk   Specifies the hdisk that you want to test.
-h         Switches off the header output.
Output
The ssa_rescheck command sends error messages to stderr. It sends header information and status output to stdout. The messages can be:
OK     Access to the disk drive is possible.
Open   Another program has opened the disk drive.
Fail   Access to the disk drive is not possible.
Busy   The disk drive is reserved to another adapter or using system.
Notes:
1. For an SSA Enhanced Adapter, Busy means that another adapter has reserved the disk drive. If both adapters are in the same using system, the other adapter shows OK or Open.
2. For an SSA Enhanced RAID Adapter, Busy means that the disk drive is reserved. The Reserved To field provides more information:
N/A    The adapter cannot return reservation information. This occurs when the adapter is not an SSA Enhanced RAID Adapter.
None   The disk drive is not reserved.
If an adapter name or
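The four status words listed above lend themselves to a small lookup when the output of ssa_rescheck is processed in a script. The sketch below is illustrative only: the function name describe_rescheck_status is hypothetical, and the sketch assumes that the status word has already been isolated from the command output, because the exact column layout of that output is not shown here.

```shell
# Map one ssa_rescheck status word to the meaning given in the text.
# Returns non-zero for any word that the text does not define.
describe_rescheck_status() {
    case "$1" in
        OK)   echo "Access to the disk drive is possible" ;;
        Open) echo "Another program has opened the disk drive" ;;
        Fail) echo "Access to the disk drive is not possible" ;;
        Busy) echo "The disk drive is reserved to another adapter or using system" ;;
        *)    echo "Unrecognized status: $1"; return 1 ;;
    esac
}

describe_rescheck_status Busy
```

A script could treat Fail and Busy as conditions that need operator attention, and OK and Open as normal.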
154. Copy Array from a RAID 1 or RAID 10 Array
This section describes how to use the ssaraid command to create RAID Copy arrays without using the Logical Volume Manager (LVM).
Note: Data must be separately synchronized from system memory cache before the copy is uncoupled.
The procedure that is described here enables you to:
- Create a RAID Copy array from available candidate disk drives
- Couple the RAID Copy array to your selected parent RAID 1 or RAID 10 array
- Copy data from the parent array to the RAID Copy array
- Uncouple the RAID Copy array from the parent array
- Delete the RAID Copy array when the copy data is no longer required
The following examples show how you might create a RAID Copy of a RAID 1 array. For these examples, the RAID 1 array is called hdisk5, and the RAID manager for hdisk5 is ssa2.
Note: If you are not sure which adapter is the RAID manager for the hdisk that you want to copy, give the command: ssaadap -l hdiskx
where hdiskx is the name of the hdisk that you want to copy. For more information on the ssaraid command, see its description elsewhere in this book.
1. List all the suitable disk drives that can be used to create a RAID Copy array for the selected array. Type: ssaraid -l ssa2 -t raid_copy -c -r hdisk5
A list of suitable candidate disk drives is displayed, for example:
000629CD8A3900D
000629D03BDO00D
All the free disk drives that are listed are large enou
155. Degraded 37 Exposed 36 Good 36 multiple 38 Offline 37 Rebuilding 37 Unknown 38 RAID 5 action attributes 251 RAID 5 array states 33 Degraded 33 Exposed 33 read operations while in 33 write operations while in 33 flowchart 35 Good 33 initial rebuilding operation 34 Offline 34 Rebuilding 34 adapter replacement 34 disk drive replacement 34 RAID 5 physical disk change attributes 249 read subroutine tmssa device driver 297 readx and writex subroutines disk device driver 274 Rebuilding state RAID 10 37 Rebuilding state RAID 5 34 adapter replacement 34 disk drive replacement 34 initial rebuilding operation 34 redssaraid command 138 reformatting a pdisk as an hdisk 387 relationship of disk drives and adapters 24 one pair of adapter connectors in the loop 24 pairs of adapter connectors in the loop mainly shared data 26 pairs of adapter connectors in the loop some shared data 25 removal and replacement procedures changing pdisk and hdisk numbers 326 exchanging disk drives 319 Index 503 removal and replacement procedures continued installing a battery assembly into the fast write cache card 338 installing an SDRAM module 330 installing the fast write cache card 334 removing an SDRAM module 329 removing the battery assembly from the fast write cache card 336 removing the fast write cache card 332 SSA adapter 327 Remove a Disk from an SSA RAID Array option 138 Remove a Disk From an SSA RAID Array option effects of array
156. Displays also VPD information i Displays the enclosure identifier as shown in the operator panel if present r Displays RPO present TRUE FALSE RPO is the remote power on control b card Displays the status of the bypass cards If no parameters are given the status for all the bypass cards is displayed Valid values for the card parameter are 1 4 5 8 9 12 13 16 Chapter 16 Using the SSA Command Line Utilities 355 t threshold Displays all the temperature thresholds or only the specified temperature thresholds The valid values for the threshold parameter are lowarn The low temperature warning threshold locrit The low temperature critical threshold hiwarn The high temperature warning threshold hicrit The high temperature critical threshold a Displays the ambient temperature of the enclosure f fan Displays the status of all the fans or of only the specified fans Valid values for the fan parameter are 1 2 3 and so on d drive_bay Displays the status of all the disk drive bays slots or of only the specified disk drive bays Valid values for the drive_bay parameter are 1 2 3 and so on p PSU Displays the status of all the power supply assemblies or of only the specified power supply assemblies Valid values for the PSU parameter are 1 2 3 and so on 0 Displays the status of the operator panel C Displays the status of the controller card e Displays trace informa
157. Entries Microcode Maintenance Checking the ID and Level of the Microcode Package Maintaining the Adapter Microcode San g Maintaining the Disk Drive Microcode Vital Product Data VPD for the SSA Adapter Adapter Power On Self Tests POSTs Chapter 15 Removal and Replacement Procedures Exchanging Disk Drives Changing Pdisk and Hdisk Numbers Removing and Replacing an Advanced SerialRAID Adapter Removing an SDRAM Module of an Advanced SerialRAID Adapter Installing an SDRAM Module of an Advanced SerialRAID Adapter Removing the Fast Write Cache Option Card of an Advanced SerialRAID Adapter Installing the Fast Write Cache Option Card of an Advanced SerialRAID Adapter Removing the Battery Assembly from the Fast Write Cache Option Card of an Advanced SerialRAID Adapter Installing a Battery Assembly into the Fast Write Cache Option Card of an Advanced SerialRAID peer ee ee ny CY ae Part Numbers Chapter 16 Using the SSA Command Line Utilities ssa_sesdid Command 2 a i oe 2 Purpose Syntax Description Flags Examples ssaadap Command Purpose Syntax Description Flags Contents 304 305 305 305 307 307 307 308 308 308 311 313 313 313 314 314 315 315 316 317 319 319 326 327 329 330 332 334 336 338 340 341 341 341 341 341 341 342 343 343 343 343 343 ix x
158. F10 Exit F Enter Do Find n Find Next Select the hot spare pool whose disk drives you want to list The pool status is displayed COMMAND STATUS Command OK stdout yes stderr no Before command completion additional instructions may appear below ssal pool_B2 Component Location Size Status hdisk4 raid_10 pdisk13 04 02 REGY 06 P 18 2GB good pdisk3 04 02 REGY 05 P 9 2GB good pdisk7 04 02 REGY 01 P 9 2GB good Hot Spare Disks pdisk9 04 02 REGY 02 P 9 2GB Fl Help F2 Refresh F3 Cancel F6 Command F8 Image F9 Shel F1O Exit Find n Find Next J The columns of information displayed on the screen have the following meanings Chapter 6 Using the RAID Array Configurator 81 82 Component The array member disk drive of the hdisk that is listed on the screen or hot spare disk drives that are assigned to the pool Location The physical location of the array member disk drive Size The size of the array member disk drive This value is useful to know if you have assigned a hot spare disk drive to a pool but the array member disk drive is too large to be protected by the hot spare disk drive Status The status of the array member disk drive Valid values for status are good The disk drive is working not_present The disk drive cannot be detected It has been removed or it has failed too_large The member disk drive is too large to be protected by any hot spare disk drive in the pool Note The si
159. Feature
An optional 128 MB dual inline memory module (DIMM) feature is available. This feature is recommended for two-way fast write operations.
Lights of the Advanced SerialRAID Adapters
Each pair of connectors has a green light that indicates the operational status of its related loop.
Status of Light    Meaning
Off                Both SSA connectors are inactive. If disk drives or other SSA adapters are connected to these connectors, either those disk drives or adapters are failing, or their SSA links are not active.
Permanently on     Both SSA links are active (normal operating condition).
Slow Flash         Only one SSA link is active.
Port Addresses of the Advanced SerialRAID Adapters
The port addresses used in some SRNs that relate to these adapters can be numbers 0 through 3. They correspond to the port connectors on the SSA adapter:
0 = Connector A1
1 = Connector A2
2 = Connector B1
3 = Connector B2
SSA Adapter ID during Bring-Up
All adapters that can be used on RISC using systems generate a three-digit configuration program indicator number. During system bring-up, this indicator number appears on the three-digit display of the using system. The numbers are:
80C    Advanced SerialRAID Adapter (type 4-P) is being identified or configured.
Chapter 2 Introducing SSA Loops
This chapter describes the principles of SSA loops how
160. GB SSA C Physical Disk Drive
systemname pdisk2 ACIDBE11 2GB SSA C Physical Disk Drive
systemname pdisk3 ACIDBEF4 2GB SSA C Physical Disk Drive
systemname pdisk4 AC50AE58 2GB SSA C Physical Disk Drive
systemname pdisk5 AC7C6E51 2GB SSA C Physical Disk Drive
systemname pdisk6 AC706E9A 2GB SSA C Physical Disk Drive
systemname pdisk7 ACIDEEE2 2GB SSA C Physical Disk Drive
systemname pdisk8 AC1DBE32 2GB SSA C Physical Disk Drive
F10=Exit
Notes:
a. You can select only one disk drive at a time.
b. If you select Service Mode, and the selected disk drive is not in a closed loop or at the end of a string, your selection fails and an error message is displayed. Use the Link Verification service aid to identify any open-link problems before trying to reselect Service Mode.
c. If you select Service Mode, and a file system is mounted on the selected disk drive, your selection fails. Use the Configuration Verification service aid to determine which hdisk must have its file system unmounted before you can select Service Mode.
d. If the Check light of the disk drive that you have put into Service Mode does not come on, and you are not sure of the location of that disk drive, use the Identify function to help you find it.
Select a second disk drive, if required (for example, pdisk5). The following display appears again:
SET SERVICE MODE 802382
systemname pdisk5 AC7C6E51 4GB SSA
161. Groups 185 Effects of Array Copy on Other SMIT Menus This section shows SMIT menus that are described elsewhere in this book When Array Copy is used however additional fields or display panels are present Change Show Attributes of an SSA RAID Array The following information is displayed for RAID 1 or RAID 10 arrays E i D Change Show Attributes of an SSA RAID Array Type or select values in entry fields Press Enter AFTER making all desired changes TOP Entry Fields SSA RAID Manager ssal SSA RAID Array hdisk2 Connection Address Array Name 8A8E39195F60C40 RAID Array Type raid_10 State good Copy State copying Size of Array 18 2GB Primary Disks pdiskO pdisk6 Secondary Disks pdisk9 pdisk10 Copy Disks pdisk11 pdisk12 Percentage Rebuilt Not Rebuilding Percentage Copied 1 Split Array Resolution Primary Enable Use of Hot Spares yes fx MORE 4 F1l Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image F9 Shel F1O Exit Enter Do S ey The meanings of the additional fields are Copy State The operational state of the array copy Not Copying No copy is being created for this array Good The coupled disk drives contain an exact copy of the data that is on the array The copy must be in the Good state before it can be uncoupled from the array Copying Data is being copied to the coupled disk drives but these coupled disk drives do not yet contain an exact copy of the data that is on
162. ID Array hdisk3 Connection Address Array Name 09523177F0737K Disk to Add F1l Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image eee F1O Exit Enter Do Y Press the List key to list the disk drives From the displayed list select the name of the disk drive that you are adding Chapter 6 Using the RAID Array Configurator 141 142 Swapping Members of an SSA RAID Array This option allows you to swap a disk drive for a replacement disk drive 1 For fast path type smitty exssaraid and press Enter Otherwise a Select Change Member Disks in an SSA RAID Array from the SSA RAID Arrays menu b Select Swap Members of an SSA RAID Array 2 A list of arrays is displayed in a window Pa Move cursor to desired item and press Enter Remove a Disk from an SSA RAID Array Add a Disk to an SSA RAID Array Swap Members of an SSA RAID Array SSA RAID Array Move cursor to desired item and press Enter Change Member Disks of an SSA RAID Array Use arrow keys to scroll hdisk3 095231779F0737K rebuilding 3 4G RAID 5 array hdisk3 09523173A02137K good 3 4G RAID 5 array Fl Help F2 Refresh F3 Cancel F8 Image F1O Exit Enter Do Find n Find Next Select the array whose disk drives you want to swap User s Guide and Maintenance Information 3 The following information is displayed Ae Swap Members of an SSA RAID Array D Type or select values in entry fields Press
163. ID Arrays option 188 Remove a Disk From an SSA RAID Array option 190 Swap Members of an SSA RAID Array option 191 Index 507 three way copy continued SMIT menus for 172 ssa_delete_copy command 171 ssa_make_copy command 161 using SMIT menus to create a copy array 155 using ssa_make_copy command to create a copy array 159 using ssaraid commands to create a copy array 151 TMCHGIMPARM change parameters tmssa device driver ioctl operation 308 description 308 purpose 308 TMIOSTAT status tmssa device driver ioctl operation 307 description 307 purpose 307 tmssa device driver 295 description 295 IOCINFO ioctl operation 305 description 305 purpose 305 purpose 295 syntax 295 TMCHGIMPAR M ioctl operation 308 description 308 purpose 308 TMIOSTAT ioctl operation 307 description 307 purpose 307 tmssa special file 304 description 304 implementation specifics 304 purpose 304 top kernel extension entry point 288 trace formatting 256 two loops with one adapter 14 two loops with two adapters 15 U Uncouple a Volume Group Logical Volumes or File Systems Copy option 177 uncouple action attributes RAID 1 and RAID 10 252 force yes no 252 unique IDs SSA UIDs 21 Unknown state RAID 10 38 using mkdev to configure a logical disk 269 using mkdev to configure a physical disk 269 using SMIT menus to create a copy array 155 using SSA Target Mode 294 using ssa_make_copy command to create a copy array 159 using ssaraid comman
164. If You Need More Information
The Problem Solving Guide and Reference, SC23-2204, is the first book you should use if you have a problem with your system. Other books that you might need are:
- The operator guide for your system
- Diagnostic Information for Multiple Bus Systems, SA38-0509
- Technical Reference for your adapter
Web Support Pages
When you are installing an SSA device or subsystem, upgrading your SSA subsystem, or doing preventive maintenance on your SSA subsystem, refer to the web page shown here. This web page provides access to the latest SSA publications, microcode, and support information for the using system, SSA adapters, and SSA subsystem.
http://www.storage.ibm.com/hardsoft/products/ssa
Numbering Convention
In this book:
KB means 1 000 bytes.
MB means 1 000 000 bytes.
GB means 1 000 000 000 bytes.
Part 1. User Information
Chapter 1. Introducing SSA and the Advanced SerialRAID Adapters
This chapter describes:
- Serial storage architecture (SSA)
- The Advanced SerialRAID Adapter and the Advanced SerialRAID Plus Adapter
Physically, the two types of adapter are the same. The Advanced SerialRAID Plus Adapter, however, provides additional functions. In this book, the name Advanced SerialRAID Adapter is used both for the Advanced SerialRAID Adapter and for the Advanced SerialRAID Plus Adapter unle
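Because the numbering convention above is decimal (powers of 1 000, not 1 024), sizes quoted in this book convert to bytes by simple multiplication. The following sketch shows the arithmetic; the function name to_bytes is hypothetical, and the sketch assumes a shell with 64-bit integer arithmetic.

```shell
# Convert a size given in the book's decimal units to bytes.
# Usage: to_bytes <whole-number> <KB|MB|GB>
to_bytes() {
    value=$1
    unit=$2
    case "$unit" in
        KB) echo $((value * 1000)) ;;
        MB) echo $((value * 1000000)) ;;
        GB) echo $((value * 1000000000)) ;;
        *)  echo "unknown unit: $unit" >&2; return 1 ;;
    esac
}

# Example: a logical disk reported as 4512 megabytes.
to_bytes 4512 MB
```

Tools that report sizes in binary units (1 024-based) show a smaller number for the same disk drive, which is worth remembering when values from this book are compared with other listings.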
165. List Hot Spares
List Rejected Array Disks
List Array Candidate Disks
List System Disks
Identify Disks in an SSA RAID Array
Identify Hot Spares
Identify Rejected Array Disks
Identify Array Candidate Disks
Identify System Disks
SSA RAID Manager
Move cursor to desired item and press Enter.
ssa0 Available 00-04 IBM SSA 160 SerialRAID Adapter (14109100)
F1=Help F2=Refresh F3=Cancel F8=Image F10=Exit Enter=Do /=Find n=Find Next
Select the adapter whose hot spare disk drives you want to list.
3. A list of hot spare disk drives is displayed:
COMMAND STATUS
Command: OK    stdout: yes    stderr: no
Before command completion, additional instructions may appear below.
pdisk3 0004AC5119E000D spare n/a 1.1G Physical disk
pdisk5 08005AEA030D00D spare n/a 2.3G Physical disk
F1=Help F2=Refresh F3=Cancel F6=Command F8=Image F9=Shell F10=Exit /=Find n=Find Next
Listing Rejected Array Disk Drives
This option allows you to list disk drives that have been rejected (probably because of failure) from arrays.
1. For fast path, type smitty lfssaraid and press Enter. Otherwise:
a. Select List/Identify SSA Physical Disks from the SSA RAID Arrays menu.
b. Select List Rejected Array Disks.
A list of adapters is displayed in a window:
List/Identify SSA Physical Disks
Move cursor to desired item and press E
166. List Identify SSA Physical Disks List Delete Old RAID Arrays Recorded in an SSA RAID Manager List Status of Hot Spare Pools List Status of Hot Spare Protection for an SSA RAID Array List Components in a Hot Spare Pool Add a Hot Spare Pool Add an SSA RAID Array Delete an SSA RAID Array Change Show Attributes of an SSA RAID Array Change Member Disks in an SSA RAID Array Change Show Use of an SSA Physical Disk Change Use of Multiple SSA Physical Disks Change Show Delete a Hot Spare Pool Array Copy Services Fl Help F2 Refresh F3 Cancel F8 Image ischial F1O Exit Enter Do Select Array Copy Services 172 User s Guide and Maintenance Information Array Copy Services For fast path access to the Array Copy Services menu type smitty ssa_copy and press Enter Otherwise select Array Copy Services from the SSA RAID Arrays menu Note Array Copy Services are designed to be run from shell scripts These SMIT menus are intended as an aid to shell script development and as a problem determination tool a A Array Copy Services Move cursor to desired item and press Enter Prepare a RAID Array Copy Prepare Volume Group Logical Volumes or Filesystems Copy Uncouple a RAID Array Copy Uncouple a Volume Group Logical Volumes or Filesystems Copy List All Copy Candidates List All Uncoupled Copies List All Uncoupled Volume Groups Delete a RAID Array Copy Delete a Volume Gro
167. MB per second SSA cables (color coded black); 40 MB per second SSA cables (color coded blue). The speed at which a link runs is automatically agreed between its two nodes. Under some fault conditions, a link that normally runs at 40 MB per second might run at 20 MB per second. The automatic run_ssa_link_speed diagnostic searches for pairs of 40 MB per second nodes that are running at only 20 MB per second. This diagnostic is started by an entry in the cron table. If you are using 20 MB per second cables to connect 40 MB per second SSA nodes, delete the run_ssa_link_speed entry from the cron table. This action prevents the logging of errors that can be solved only by the installation of 40 MB per second cables. Identifying and Addressing SSA Devices: This section describes how SSA adapters and devices are known to the using-system programs. Location Code Format: Location codes identify the locations of adapters and devices in the using system and its attached subsystems and devices. These codes are displayed when the diagnostic programs isolate a problem. For information about the location codes that are used by the using system, see the Operator Guide for the using system. Location code layout: AB-CD-EF-GH. Field labels from the figure: Unused; Unused; Unused; P = Physical disk drive, L = Logical disk drive; Adapter position (number of the slot, 1 through 8, containing the SSA adapter); System I/O bus identifier; Expansion adapter position
168. Next Select the adapter whose hot spare pools you want to list 74 Users Guide and Maintenance Information 3 A list of hot spare pools and their status is displayed a ae COMMAND STATUS Command OK stdout yes stderr no Before command completion additional instructions may appear below ssal Pool Components Spares Configured Minimum Status pool_AQ 0 1 1 1 unused pool_Al 7 0 1 1 empty pool_Bl 6 2 2 1 full Fl Help F2 Refresh F3 Cancel F6 Command F8 Image F9 Shel1 F10 Exit Find n Find Next y P The columns of information displayed on the screen have the following meanings Pool The pool identifier Note Until you have defined hot spare pools see Adding a New Hol all disk drives are in pool_AO and pool_Bo Any RAID arrays that are in pool_AO and pool_BO cannot be restricted to make them select disk drives from only that pool Components The number of array member disk drives that the hot spare disk drives in the pool are protecting Spares The number of hot spare disk drives that are now in the pool Configured The number of hot spare disk drives that were in the pool when it was created or changed Minimum The value that is selected to be the minimum number of hot spare disk drives that can exist in a pool before an error condition is logged This number is normally set to be the same as the number of disk drives that were originally configured in the pool You can however set the mini
169. Ns are generated by the system error log analysis system configuration code diagnostics and customer problem determination procedures SRNs help you to identify the cause of a problem the failing field replaceable units FRUs and the service actions that might be needed to solve the problem The SRN Table The table in this section lists the SRNs and describes the actions you should do The table columns are SRN The service request number Problem A description of the problem and the action you must take Possible Causes The condition or FRUs that might be causing the problem and how likely it is by percentage that a particular FRU is causing the problem Abbreviations used in the table are DMA Direct memory access FRU Field replaceable unit IOCC Input output channel controller PAA P Adapter port number POS Programmable option select POS registers POST Power On Self Test SDRAM Synchronous dynamic random access memory Using the SRN Table Note You should have been sent here from either diagnostics or a START MAP Do not start problem determination from the SRN table always go to the START MAP for the enclosure in which the device is installed 1 Locate the SRN in the table If you cannot find the SRN refer to the documentation for the subsystem or device If you still cannot find the SRN you have a problem with the diagnostics the microcode or the documentation Call your support center for a
170. PVIDs of the new hdisks e Renames the volume group and assigns a new VGID Renames the logical volumes e Changes the root mount point for all the file systems in the new volume group e Mounts the file systems Prepares the volume group for a copy operation This command does not complete until the copy operations have completed Complete when copy ends This flag can be used only with the P flag When this flag is specified the script waits for the copy operation to end Chapter 7 Copying Data from Arrays and from Volume Groups 163 Example 1 Copying a Complete Volume Group In this example you are copying a complete volume group from the parent array to the RAID Copy array To copy a complete volume group give the commands ssa_make_copy P v vgname ssa_make_copy v vgname Step 1 Step 2 Step 3 Source Volume Group Source Volume Group Source Volume Group hdisk1 hdisk1 hdisk1 data_fs data_fs data_fs v_A vB i vA vB vA vB loglv loglv loglv Copy Physical Volumes New Volume Group aa fsdata_fs fslv_A fslv_B fsloglv Figure 25 Copying a Complete Volume Group Figure 25 shows from left to right e The parent array that contains the source volume group The empty RAID Copy array coupled to the parent array
171. RROR SSA_CACHE_BATTERY SSA_HDW_RECOVERED SSA_CACHE_ERROR SSA_LINK_ERROR SSA_DEGRADED_ERROR SSA_LINK_OPEN SSA_DETECTED_ERROR SSA_LOGGING_ERROR SSA_DEVICE_ERROR SSA_REMOTE_ERROR SSA_ENCL_ERR1 SSA_SETUP_ERROR SSA_ENCL_ERR2 SSA_SOFTWARE_ERROR. The SSA error code data format consists of three bytes of error code followed by up to 153 bytes of debug data. See 9 to find out how this data is used. run_ssa_healthcheck cron: The run_ssa_healthcheck program checks for SSA subsystem problems that do not cause I/O errors, but cause some loss of redundancy or functionality. It reports such errors each hour until the problem is solved. During SSA device driver installation, the following entry is added to the cron table:
0 * * * * /usr/lpp/diagnostics/bin/run_ssa_healthcheck 1>/dev/null 2>/dev/null
This run_ssa_healthcheck program sends a command to the adapter. The command causes the adapter to write a new error log entry for any problems that it can detect, although those problems might not be causing any failure in the user's applications. Such problems include:
Adapter hardware faults
Adapter configuration problems
RAID array problems
Fast write cache problems
Open serial link conditions
Link configuration faults
Disk drives that are returning Check status to an inquiry command
Redundant power failures in SSA enclosures
The test runs hourly at a specific time in
172. RUs Possible FRUs Chapter 18 SSA Problem Determination Procedures 413 Action Go to N SRN Problem Possible Causes 2A004 Description Async code 04 has been received Probably a software error Software error has occurred Possible FRUs Action Go to Device 50 Exchanging exchanging any FRUs Disk Drives on page 319 pois adepi card 50 2A005 Description SRNs in this range are not adapter SRNs Not applicable to 2A206 Action For SRNs in this range see the documentation for your SSA enclosure or SSA subsystem 2FFFF Description An async code that is not valid has been received Software error Action Go to g 300C0 Description SRNs in this range are not adapter SRNs Not applicable 301C0 Action For SRNs in this range see the documentation for your SSA enclosure or SSA subsystem 303FE Description A disk drive microcode error has been detected Software error Possible FRUs Device 100 303FF Description An SCSI status that is not valid has been received Possible FRUs Device 100 xchanging D 31000 Description SRNs in this range are not adapter SRNs Not applicable Action For SRNs in this range see the documentation for your SSA enclosure or SSA subsystem 33PAA Description Excessive link reconfigurations have been detected Possible FRUs External SSA cables 30 Internal SSA connections 30 enclosure
173. Recycle or discard the battery as instructed by local regulations and where recycling facilities exist XV Xv Users Guide and Maintenance Information About This Book Who Should Use This Book This book is for people who operate or service a RISC system that contains one or more Advanced SerialRAID Adapters To follow the instructions in this book you should be familiar with the basic operational procedures for a RISC system What This Book Contains Part 1 of this book is mainly for the user It describes The Advanced SerialRAID Adapters SSA loops The RAID facilities that are provided by the adapter How to use the SSA SMIT menus How to use the RAID configuration utility to configure arrays of SSA disk drives and how to deal with problems such as the failure of a disk drive in a RAID array How to use the SSA Spare Tool How to configure the Fast Write feature SSA error logs How to use the SSA Command Line Interface How to use the programming interface Part 2 of this book is mainly for service representatives It describes General technical topics about the Advanced SerialRAID Adapters Removal and replacement procedures How to use the SSA Command Line Utilities The SSA service aids Problem determination procedures including Service Request Numbers SRNs and Maintenance analysis procedures MAPs The appendix contains the communications statements for the adapter A glossary and an index are provided
174. SA RAID adapter handles such disk drives in the same way as a non-RAID SSA adapter does. It transfers data directly between the disk drives and the system, and uses no RAID functions. When first installed, all disk drives are, by default, defined as system disks; that is, they are not members of an array. Before they can be added to arrays, you must redefine them so that the system no longer has direct access to them. RAID 0 Array States: A RAID 0 array can be in either of two states. A knowledge of those states is useful when you are configuring your arrays. The states are described here. Good State: A RAID 0 array is in the Good state when all the member disk drives of that array are present. Offline State: A RAID 0 array enters the Offline state when one or more member disk drives become missing. Read and write operations are not allowed. RAID 1 Array States: RAID 1 array states are the same as RAID 10 array states. For details, see the description of RAID 10 array states. In RAID 1 arrays, the first member disk drive of the array is defined as the primary disk drive, and the second member disk drive is defined as the secondary disk drive. These definitions prevent operation on separate member disk drives of the array when the array becomes split but separate systems can still access one of the member disk drives. RAID 10 defines the first and third disk drives to be primary
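The two RAID 0 states described in this fragment reduce to one rule: the array is Good only while every member disk drive is present. A minimal Python sketch of that rule follows; the function name and the use of pdisk names as identifiers are assumptions made for illustration, not part of the adapter software.

```python
def raid0_state(members, present):
    """Return the RAID 0 array state for a given set of present disks.

    members: names of all member disk drives of the array.
    present: names of the disk drives that are currently present.
    """
    missing = set(members) - set(present)
    # One or more missing members puts the array Offline;
    # read and write operations are then not allowed.
    return "Offline" if missing else "Good"
```

Because RAID 0 holds no redundant data, the loss of a single member is enough to take the array Offline.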
175. SA physical disk drives You cannot specify this flag and the SSADISK_SCSIMODE flag together SSADISK_SCSIMODE Opens an SSA physical disk in SCSI passthrough mode This action allows SSADISK_IOCTL_SCSI ioctls to be issued to the physical disk This flag has support only for SSA physical disk drives You cannot specify this flag and the SSADISK_SERVICEMODE flag together SSADISK_NORETRY Opens a device in no retry mode When a device is opened in this mode commands are not retried if an error occurs SSADISK_FENCEMODE Opens an SSA logical disk drive in fence mode The open subroutine succeeds although the using system might be fenced out from access to the disk drive Only ioctls can be issued to the device while it is open in this mode Any attempt to read from or write to a device that is opened in this mode is rejected with an error This flag has support only for SSA logical disk drives You cannot specify this flag and the SSADISK_NO_RESERVE flag SSADISK_FORCED_OPEN flag or SSADISK_RETAIN_RESERVATION flag together You can find more specific information about the open operations in SSA Options to the openx Subroutine in the Kernel Extensions and Device Support Programming Concepts manuals for AIX versions 4 1 and upward readx and writex Subroutines The readx and writex subroutines provide additional parameters that affect the transfer of raw data that is data that has not been processed or reduced These subroutines
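The flag descriptions in this fragment state which ext flags cannot be combined. The Python sketch below checks a combination against those rules. The bit values are placeholders chosen for illustration; the real values are defined in the ssadisk.h header and are not given here. `OpenFlag` and `conflicts` are hypothetical names.

```python
from enum import IntFlag


class OpenFlag(IntFlag):
    # Placeholder bit values (assumption); see ssadisk.h for the real ones.
    SERVICEMODE = 0x01
    SCSIMODE = 0x02
    NORETRY = 0x04
    FENCEMODE = 0x08
    NO_RESERVE = 0x10
    FORCED_OPEN = 0x20
    RETAIN_RESERVATION = 0x40


# Pairs that, according to the flag descriptions, cannot be specified together.
EXCLUSIVE_PAIRS = [
    (OpenFlag.SERVICEMODE, OpenFlag.SCSIMODE),
    (OpenFlag.FENCEMODE, OpenFlag.NO_RESERVE),
    (OpenFlag.FENCEMODE, OpenFlag.FORCED_OPEN),
    (OpenFlag.FENCEMODE, OpenFlag.RETAIN_RESERVATION),
]


def conflicts(ext):
    """Return every forbidden pair that is present in the ORed ext value."""
    return [(a, b) for a, b in EXCLUSIVE_PAIRS if (ext & a) and (ext & b)]
```

A caller could run `conflicts()` on the ORed value before passing it to openx and refuse to proceed if the returned list is not empty.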
176. SSA RAID Arrays List All Supported SSA RAID Arrays List All SSA RAID Arrays Connected to a RAID Manager List Status Of All Defined SSA RAID Arrays List Identify SSA Physical Disks List Delete Old RAID Arrays Recorded in an SSA RAID Manager List Status of Hot Spare Pools List Status of Hot Spare Protection for an SSA RAID Array List Components in a Hot Spare Pool Add a Hot Spare Pool Add an SSA RAID Array Delete an SSA RAID Array Change Show Attributes of an SSA RAID Array Change Member Disks in an SSA RAID Array Change Show Use of an SSA Physical Disk Change Use of Multiple SSA Physical Disks Change Show Delete a Hot Spare Pool Array Copy Services F1 Help F2 Refresh F3 Cancel F8 Image earn F1O Exit Enter Do From the following list find the option that you want and go to the place that is indicated Chapter 6 Using the RAID Array Configurator 59 Adding an SSA RAID Array This option lets you add an array to the configuration 1 For fast path type smitty mkssaraid and press Enter Otherwise select Add an SSA RAID Array from the SSA RAID Arrays menu A list of adapters is displayed in a window Ye SSA RAID Arrays Move cursor to desired item and press Enter List All Defined SSA RAID Arrays List All Supported SSA RAID Arrays List All SSA RAID Arrays Connected to a RAID Manager List Status Of All Defined SSA RAID Arrays List Identify SSA Physical Disks List Delete Old RAID A
177. SSA devices are known to the system programs and the rules that you must observe when you configure your SSA loops Loops Links and Data Paths In the simplest SSA configuration SSA devices are connected through two or more SSA links to an SSA adapter that is located in a using system The devices SSA links and SSA adapter are configured in loops Each loop provides a data path that starts at one connector of the SSA adapter and passes through a link SSA cable to the devices The loop continues through the devices then returns through another link to a second connector on the SSA adapter The maximum permitted length for an external copper cable that connects two SSA nodes for example disk drives is 25 meters 82 feet The maximum permitted length for an external fiber optic cable that connects two SSA nodes for example disk drives is 10 kilometers 32800 feet Some devices however can operate only at shorter distances See your subsystem documentation for details Details of the rules for configuring SSA loops are given for each SSA adapter in Simple Loop shows a simple SSA loop The devices that are attached to the SSA adapter card are connected through SSA links H These SSA links are configured as a loop Data and commands to a particular device pass through all other devices on the link between the adapter and the target device Data can travel in either direction round the loop The adapter can therefore get acc
178. System Disks SSA RAID Manager Move cursor to desired item and press Enter ssa0 Available 00 04 IBM SSA 160 SerialRAID Adapter 14109100 Fl Help F2 Refresh F3 Cancel F8 Image F1O Exit Enter Do Find n Find Next Ne r4 Select the adapter whose disk drives you want to list 144 User s Guide and Maintenance Information 3 A list of disk drives and their usage is displayed in a window SSA RAID Arrays Move cursor to desired item and press Enter List All Defined SSA RAID Arrays SSA Physical Disk Move cursor to desired item and press Enter Use arrow keys to scroll SSA physical disks which are members of arrays pdiskO 00022123DFHCOOD member n a 4 5G Physical d pdiskl Q004AC5119E000D member n a 1 1G Physical d pdisk2 Q004AC5119E000D member n a 1 1G Physical d pdisk3 Q8005AEA003500D member n a 4 5G Physical d pdisk4 08005AEA030D00D member n a 2 3G Physical d pdisk5 O8005AEA080100D member n a 4 5G Physical d pdisk7 08005AEA087A00D member n a 4 5G Physical d SSA physical disks which are hot spares pdisk6 O8005AEAQ80800D spare n a 4 5G Physical d F1l Help F2 Refresh F3 Cancel F8 Image F1Q Exit Enter Do Find n Find Next BA Using the arrow keys scroll the information until you find the list of SSA physical disks that contains the disk drive that you want to change Chapter 6 Using the RAID Array Configurator 145 4 Select the disk drive that you want to change or show The following
179. That device is the one beyond the last configured device on an open SSA loop If the SSA service aids are not available note the value of PAA in this SRN and go to 44PAA Description An SSA device has a Failed status Possible FRUs Device 100 Action If the SSA service aids are available run the Link Verification e service aid see j to find the failing device If no device is listed with a status of Failed use the PAA part of the SRN to determine which device is failing Before you exchange the failing device run diagnostics in System Verification mode to that device to determine the cause of the problem If the SSA service aids are not available note the value of PAA in this SRN and go to Exchange the failing FRU for a new FRU 45PAA Description The SSA adapter has detected an open SSA loop Possible FRUs Device 40 fEachang ng Action If the SSA service aids are available run the Link Verification a service aid see j to determine which part of the SSA loop is failing If the SSA service aids are not available note the value of PAA in this SRN and go to Then go to External SSA cables Fibre Optic Extenders fiber optic cables or internal connections in the device enclosure 20 enclosure service information 46000 Description An array is in the Offline state because not enough disk drives are present in the array to maintain data availability Action If the SSA service aids are availab
180. The command returns a list that contains one SRN for each SSA device that is on the system. Each SRN is for the most significant error of its related SSA device. The ssa_ela -l device -h timeperiod command scans the error log and looks for all SSA errors. The command returns the SRN for the most significant error. The ssa_ela -l pdisk command scans the error log and looks for errors that are logged against the specified pdisk. The command returns the SRN for the most significant error. The ssa_ela -l hdisk command scans the error log and looks for errors that are logged against any hardware that provides support for the specified hdisk (pdisks and adapters). The command returns the SRN for the most significant error. The ssa_ela -l adapter command scans the error log and looks for errors that are logged against the specified adapter. The command returns the SRN for the most significant error.
-l Device: Specifies the device whose error log you want to analyze for the most significant error.
-h timeperiod: Instructs the program to start searching the error log from a previous time that is a multiple of 24 hours. For example, -h 1 (the default setting) starts a search through the previous 24 hours; -h 2 starts a search through the previous 48 hours.
Output: If an error occurs, the ssa_ela command sends an error message to stdout, such as: ssa0 SRN 42500 If no error occurs the comma
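A script that calls ssa_ela usually needs the SRN for each device. The Python sketch below parses output lines of the form shown in this fragment (a device name, the word SRN, then the number). The exact output layout is an assumption based on that single example, and `parse_ela_output` is a hypothetical helper, not part of the SSA utilities.

```python
import re

# One output line: device name, optional colon, the word SRN, then the SRN.
# SRNs can contain letters (for example 2A004), so digits and letters are accepted.
_LINE = re.compile(r"^\s*(\S+?):?\s+SRN\s+([0-9A-Za-z]+)\s*$")


def parse_ela_output(text):
    """Map each device name to the SRN reported for it."""
    result = {}
    for line in text.splitlines():
        match = _LINE.match(line)
        if match:
            result[match.group(1)] = match.group(2)
    return result
```

Lines that do not match the assumed layout are ignored, so an empty result means that no SRN was found in the text that was passed in.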
181. The ext parameter can contain any combination of the following flag values logically ORed together SSADISK_PRIMARY Opens the device by using the primary adapter as the path to the device As a result of hardware errors the device driver might automatically switch to the User s Guide and Maintenance Information secondary path if one exists You can prevent this switch by additionally specifying the SSADISK_NOSWITCH flag This flag has support both for SSA logical disk drives and for SSA physical disk drives You cannot specify this flag and the SSADISK_SECONDARY flag together SSADISK_SECONDARY Opens the device using the secondary adapter as the path to the device As a result of hardware errors the device driver might automatically switch to the primary path if one exists You can prevent this switch by additionally specifying the SSADISK_NOSWITCH flag This flag has support both for SSA logical disk drives and for SSA physical disk drives You cannot specify this flag and the SSADISK_PRIMARY flag together SSADISK_NOSWITCH If more than one adapter provides a path to the device the device driver normally switches from one adapter to the other as part of its error recovery This flag prevents the switch This flag has support both for SSA logical disk drives and for SSA physical disk drives SSADISK_FORCED_OPEN Forces the open whether another initiator has the device reserved or not If another initiator has the device reser
182. Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields SSA RAID Manager ssal Spares Pool pool_B2 Status full Components in Pool 3 Hot Spares Previously Configured 1 Hot Spares in Pool 1 Components to Add g Components to Remove g Hot Spares Minimum 1 Fl Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image F9 Shel F1Q Exit Enter Do Ne A To add disk drives to the pool 1 Select Components to Add and press the List key A list of valid hot spare pool candidates is displayed This list contains RAID disk drives hot spare disk drives and free disk drives that are in pool zero on the selected loop The pop up list is in the same format as the list that is used when a hot spare pool is created 2 Select the member disk drives to add to the pool and press Enter 3 If necessary change the Hot Spares Minimum field 4 Press Enter e To remove disk drives from the pool 1 Select Components to Remove and press the List key A list of disk drives that are now in the pool is displayed 2 Select the member disk drives that are to be removed from the pool and press Enter 3 If necessary change the Hot Spares Minimum field 4 Press Enter The member disk drives that have been removed from the pool are now moved to pool zero of the selected loop Notes 1 If all member disk drives are removed from the pool the pool is automatically deleted 2 When all me
183. Verification 383 Set Service Mode 378 SRNs 400 starting 376 Set Service Mode service aid 378 showing the disks that are protected by hot spares 77 simple loop 8 simple loop one disk drive missing 9 simple loop two disk drives missing 10 SMIT or SMITTY commands add_hsm_pool_adap 83 addssaraid 140 chg_hsm_pool_adap 86 chgssadisk 72 144 chgssadisks 147 chgssardsk 214 chssaraid 135 exssaraid 95 142 iassaraid 127 icssaraid 125 ifssaraid 92 123 ihssaraid 121 issaraid 119 lassaraid 117 Icssaraid 115 lfssaraid 91 113 Inssaraid 111 Is_hsm_array_components 80 Is_hsm_array_status 77 Is_hsm_status 74 Isdssaraid 100 Isidssaraid 108 Ismssaraid 102 Issaraid 109 Isssaraid 101 Istssaraid 104 mkssaraid 60 nvrssaraid 130 redssaraid 138 rmssaraid 70 smit smitty 98 ssa_identify_cancel 129 ssafastw 215 swpssaraid 137 SMIT or SMITTY options Add a Disk to an SSA RAID Array 140 Add a Hot Spare Pool 83 Add an SSA RAID Array 60 Cancel all SSA Disk Identifications 129 Change Member Disks in an SSA RAID Array 137 SMIT or SMITTY options continued Change Show Characteristics of an SSA Logical Disk 214 Change Show Use of an SSA Disk 144 Change Show Delete a Hot Spare Pool 86 Delete an SSA RAID Array 70 Enable Disable Fast Write for Multiple Devices 215 Identify Array Candidate Disks 125 Identify Disks in an SSA RAID Array 119 Identify Hot Spares 121 Identify Rejected Array Disks 123 Identify System Disks 127 List All Defined SSA
184. Write option see 2 Exchange the Fast Write Cache Option Card for a new one Ask the customer to re enable the Fast Write option for the devices that are attached to the new Fast Write Cache Option Card Chapter 18 SSA Problem Determination Procedures System configuration problem Possible FRUs Fast Write Cache Option 419 SRN Problem Possible Causes 42525 Description A fast write logical disk contains unsynchronized data but the Fast Write Option Card does not contain that data The failing disk drive is offline The data was not synchronized on the old adapter before the disk drive was moved to this adapter The wrong Fast Write Option Card was installed onto this Action e If the disk drive has just been moved from another adapter do either of the following actions adapter Return the disk drive to its original adapter The Fast Write Option Card Move the original Fast Write Cache Option Card to this adapter so lost the data because the that the data can be synchronized battery failed e If the wrong Fast Write Cache Option Card has been installed for example if the adapter has been exchanged but the original Possible FRUs Fast Write Cache Option Card is still on the old adapter card install Fast Write Cache Option the original Fast Write Cache Option Card onto this adapter Card 100 e If the adapter card has been switched off for more than seven days the
185. You can choose either or both of these flags h Prevents the output of progress messages from the program X Prevents the actions of the compress command and of the tar command The program copies the dump directly to the specified output point o Note You must ensure that the specified output point has enough free space to hold the dump The ssa_getdump command sends all error messages to stderr and the following to stdout e Header messages e List mode output e Copy progress messages The command generates these return codes The command has completed successfully Some parameters are not correct The disk name is not valid or the pdisk is not present The name of the SSA adapter is not correct or not valid The UID or slot number of the SSA adapter is not correct Cannot open the file or directory in the temporary file tmp oOo a fF WO N O amp O Not enough disk space is available or an error occurred during a write operation to the temporary file 7 Not enough memory is available Note When in Copy mode the command reads data from the disk in blocks of approximately 256 KB 8 An internal or object data manager ODM error has occurred 9 An error occurred during a read operation in Copy mode Chapter 16 Using the SSA Command Line Utilities 363 ssaidentify Command Purpose To set or clear Identify mode for a physical disk Syntax ssaidentify 1 PhysicalDiskName y ssaidentify 1 PhysicalDiskName n D
186. _cmd pointer points at a conc_cmd structure These arguments must be of the same type that is specified by the conc_intr_addr function pointer field of the dd_conc_register structure The following valid concurrent mode commands are defined in the usr include sys ddcon h file For each command the devno field specifies the appropriate SSA disk drive DD_CONC_SEND_REFRESH The DD_CONC_SEND_REFRESH device driver entry point has completed The error field in the conc_cmd structure contains the return code that is necessary for the completion of this command The possible values are defined in the usr include sys errno h file The conc_cmd pointer argument to the special interrupt handler entry point of the top kernel extension is non null The cmd_op message_code and devno fields are 0 DD_CONC_LOCK The DD_CONC_SEND_LOCK device driver entry point has completed The 288 User s Guide and Maintenance Information error field of the conc_cmd structure contains the return code that is necessary for the completion of this command The possible values are defined in the usr include sys errno h file The conc_cmd pointer argument to the special interrupt handler entry point of the top kernel extension is non null The cmd_op message_code and devno fields are zero DD_CONC_UNLOCK The DD_CONC_UNLOCK device driver entry point has completed The error field in the conc_cmd structure contains the return code that is necessary for the completion of th
187. _depth operations on the disk Read operations and write operations are queued together in this mode If write_queue_mod is set to a non zero value the SSA disk device driver maintains two separate seek ordered queues one for read operations and one for write operations In this mode the device driver issues up to queue_depth read commands and up to write_queue_mod write commands to the logical disk This facility is provided because in some environments it might be beneficial to hold back write commands in the device driver so that they can be coalesced into larger operations that can be handled as full stride writes by the RAID software in the adapter This facility is not likely to be useful unless a large percentage of the workload to a RAID 5 device consists of sequential write operations Device Dependent Subroutines 272 The open read write and close subroutines start typical physical volume operations open read write and close Subroutines The open subroutine is mainly for use by the diagnostic commands and utilities Correct authority is required for execution If an attempt is made to run the open subroutine without the correct authority the subroutine returns a value of 1 and sets the errno global variable to a value of EPERM The ext parameter that is passed to the openx subroutine selects the operation for the target device The usr include sys ssadisk h file defines possible values for the ext parameter
188. acement 34 disk drive replacement 34 arrays adding a disk drive to an SSA RAID array 140 adding a new hot spare pool 83 adding disks to a hot spare pool 86 adding to the configuration 60 canceling all SSA disk drive identifications 129 changing member disks in an SSA RAID array 137 changing or showing the attributes of an SSA RAID array 135 changing or showing the status of a hot spare pool 74 changing or showing the use of an SSA disk drive 144 changing the use of multiple SSA physical disks 147 creating a hot spare disk drive 72 deleting an old RAID array recorded in an SSA RAID manager 133 deleting from the configuration 70 identifying and correcting or removing failed disk drives 91 identifying array candidate disk drives 125 identifying hot spare disk drives 121 identifying rejected array disk drives 123 identifying system disk drives 127 identifying the disk drives in an SSA RAID array 119 installing a replacement disk drive 95 494 User s Guide and Maintenance Information arrays continued installing and configuring 58 listing all defined SSA RAID arrays 100 listing all SSA RAID arrays that are connected to a RAID manager 102 listing all supported SSA RAID arrays 101 listing hot spare disk drives 111 listing old RAID arrays recorded in an SSA RAID manager 131 listing rejected array disk drives 113 115 listing system disk drives 117 listing the disk drives in an SSA RAID array 109 listing the disks that are in a hot spare
189. address (AA) values that are provided by some service request numbers (SRNs). Examples of these SRNs are 43PAA, 44PAA, and 45PAA. The port (P) values are related to the port connectors of the adapter: Connector A1, Connector A2, Connector B1, and Connector B2. The AA value is the decimal SSA-address value. It indicates the position of the device that you are trying to find, counted along the SSA loop. Use the port value to locate the relevant connector on the SSA adapter, then follow the SSA cable to the first real device. Include other adapters as real devices if they are in the same SSA link. Do not include dummy devices or bypass cards. The first device that you reach represents SSA-address count 0. Continue to follow the SSA links from device to device, increasing the SSA-address count by 1 for each device, until you reach the device that is indicated in the SRN.
Note: If the SRN is 20xxx, 45xxx, or D6xxx, the failing link is between the device that is located at PAA and the device that is located at PAA minus 1.
Chapter 18. SSA Problem Determination Procedures
SSA problem determination procedures are provided by power-on self-tests (POSTs), service request numbers, and maintenance analysis procedures (MAPs). Some of these procedures use the service aids.
Service Request Numbers (SRNs)
Service request numbers SR
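The counting procedure above can be sketched as follows. This is a minimal Python illustration, not part of any SSA tool. The names decode_paa and find_device are hypothetical, the loop is represented as a plain list of (name, kind) pairs in cable order from the chosen connector, and the numbering of the port values (0 to 3 for A1, A2, B1, B2) is an assumption, because the numeric column did not survive in this fragment.

```python
# Assumed mapping of the P digit to adapter connectors (not confirmed here).
PORT_NAMES = {0: "A1", 1: "A2", 2: "B1", 3: "B2"}


def decode_paa(srn):
    """Split an SRN of the form xxPAA into (port value, decimal SSA address)."""
    if len(srn) != 5:
        raise ValueError("expected a five-character SRN such as 45203")
    return int(srn[2]), int(srn[3:5])


def find_device(chain, address):
    """Walk the loop from the connector and return the device at `address`.

    chain   -- list of (name, kind) pairs in cable order; kind is "disk",
               "adapter", "dummy", or "bypass"
    address -- decimal SSA address (the AA value); the first real device is 0
    """
    count = 0
    for name, kind in chain:
        if kind in ("dummy", "bypass"):
            continue  # dummy devices and bypass cards are not counted
        if count == address:
            return name
        count += 1
    return None  # the address lies beyond the end of the chain
```

For example, decode_paa("45203") yields port value 2 and SSA address 3. Under the assumed mapping, that means the device with address count 3, counted from connector B1.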
190. al number from the list function that you ran earlier, and Z is the name of a logical disk. For the logical disk, choose a name that is different from the names of existing logical disks. This action ensures that the logical disk that you have created is not automatically attached if the using system crashes and reboots. When this operation has completed, a message is displayed. This message tells you that the logical disk Z has been attached, and that the device /dev/Z can be accessed. For example:
ssaraid -l ssa3 -H -n 2327340C228635K -a use=system -k ZZDataRecovery -d
2327340C228635K attached ZZDataRecovery Available
where /dev/ZZDataRecovery is the device. You can now use standard commands (for example, fsck and fsdb) to attempt to repair any possible damage to the file system before you attempt data recovery.
SRN 42524
If a Fast-Write Cache Option Card fails or is removed from the adapter, the affected devices are all those that contain unsynchronized data when the cache card fails or is removed. To list these devices, type:
ssaraid -l ssaX -Iz -a state=no_cache
where X is the adapter number. Use the recovery procedure that is described for SRN 42521. You must recover all the devices that are listed.
SRN 42525
If a Fast-Write Cache Option Card fails or is removed from the adapter, the affected devices are all those that contain unsynchronized data when the cache card fa
191. also reserved to another using system, the reservation takes priority. The return code from the open subroutine is -1, and the global variable errno is set to EBUSY. If the using system attempts to break through the reservation by passing the ext parameter SSADISK_FORCED_OPEN to the openx subroutine, the reservation is broken, but the open fails with errno set to ENOCONNECT. To break through the fence, the SSA logical disk must be opened in SSADISK_FENCEMODE, and the SSADISK_ISALCMD ioctl operation used to issue the appropriate hardware command to break the fence condition.
SSA Target Mode
The SSA Target-Mode interface (TMSSA) provides node-to-node communication through the SSA interface. The interface uses two special files that provide a logical connection to another node. One of the special files (the initiator-mode device) is used for write operations; the other (the target-mode device) is used for read operations. Data that is sent to a node is written to the initiator. Data that is read from a node is read from the target. The special files are:
/dev/tmssaXX.im  The initiator-mode device, which has an even minor device number and is write only.
/dev/tmssaXX.tm  The target-mode device, which has an odd minor device number and is read only.
The device is tmssaXX, where XX is the node number of the using system with which these files communicate. You are not aware of which path connects the
192. ame whichever of the two commands you use. If you send the smit command from a graphics terminal, however, the menus are displayed slightly differently from those shown in this book. If you are not familiar with the selection of items from the graphics versions of the menus, use the smitty command. The menus will then appear as shown in this book.
2. If you use fast-path commands, you might need to go through intermediate steps that are not shown in this book. Also, some menus might be displayed slightly differently from those shown in this book.
Getting Access to the SSA RAID Array SMIT Menu
1. For fast-path access to the SSA RAID Array SMIT menus, type smitty ssaraid and press Enter. Otherwise:
a. Type smitty and press Enter. The System Management menu is displayed.
b. Select Devices. The Devices menu is displayed.
c. Select SSA RAID Arrays.
2. The SSA RAID Arrays menu is displayed:
SSA RAID Arrays
Move cursor to desired item and press Enter.
  List All Defined SSA RAID Arrays
  List All Supported SSA RAID Arrays
  List All SSA RAID Arrays Connected to a RAID Manager
  List Status Of All Defined SSA RAID Arrays
  List/Identify SSA Physical Disks
  List/Delete Old RAID Arrays Recorded in an SSA RAID Manager
  List Status of Hot Spare Pools
  List Status of Hot Spare Protection for an SSA RAID Array
  List Components in a Hot Spare Pool
  Add a Hot Spare Pool
  Add an SSA RAID Array
  Delete
193. ame data to two disk drives at the same time. You can provide more data protection if you put the two disk drives into separate physical domains. These physical domains can be separate SSA disk enclosures, separate power sources, or separate rooms or buildings. When you use separate physical domains, you provide some capability to recover from an unrecoverable loss of power. For a RAID-1 or RAID-10 array to be able to recover after the failure of a physical domain, at least one copy of the data must remain available. It is important, therefore, that the action of replacing a failing disk drive with a hot spare disk drive does not cause an array member to move to another physical domain.
Figure 16 shows an array that has its primary disk drives (pdisk2, pdisk3, pdisk10, and pdisk11) in building 1, and its secondary disk drives (pdisk5, pdisk6, pdisk7, and pdisk8) in building 2. Pdisk1 and pdisk4 have been assigned as hot spare disk drives, but no hot spare pool has been defined.
Figure 16. Primary Disks in Building 1, Secondary Disks in Building 2
If pdisk2 fails, the hot spare disk drive pdisk4 might replace pdisk2 as one of the primary disk drives in the array, as shown in Figure 17.
194. anager can use any available hot spare disk drive to replace a failing member disk drive dynamically. The failing disk drive is rejected from the array, and the hot spare disk drive is put into the available place.
If you select Exact, the replacement disk drive is chosen only from hot spare disk drives whose size exactly matches the size of the failing disk drive.
Choose Hot Spare only from Preferred Pool
If you select yes for this option, a hot spare disk drive is selected only from the hot spare pool that contains the failed member disk drive. If you select no for this option, a hot spare disk drive is selected, if available, from the hot spare pool that contains the failed member disk drive. If no hot spare disk drive is available in that pool, a hot spare disk drive is selected from the default hot spare pool for that SSA loop (Pool A0 or B0). If no hot spare disk drives are available in pool 0, a hot spare disk drive is selected from any other hot spare pool.
Allow Hot Spare Splits
If you select no for this option, the RAID manager does not attempt to use hot spare disk drives to replace missing members when a RAID-1 or RAID-10 array is split exactly in half, and all the primary or secondary member disk drives of that array are present. It is recommended that this option be set to no if the RAID-1 or RAID-10 array is configured to protect against the loss of a physical domain.
Allow Page Sp
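The pool search order described for the Choose Hot Spare only from Preferred Pool option can be sketched as follows. This Python fragment only models the order of the search. The function choose_hot_spare and its data layout are hypothetical, it ignores the separate size-matching option, and the order in which "any other" pools are tried is an assumption (here, dictionary order).

```python
def choose_hot_spare(spares_by_pool, member_pool, default_pool, preferred_only):
    """Return (pool, spare) for a failed member disk, or None if none is found.

    spares_by_pool -- dict mapping pool name to a list of available hot spares
    member_pool    -- pool that contains the failed member disk drive
    default_pool   -- default pool for that SSA loop (pool 0, e.g. "A0")
    preferred_only -- True models "yes" for the option, False models "no"
    """
    search_order = [member_pool]
    if not preferred_only:
        # With "no": fall back to pool 0, then to any other pool.
        search_order.append(default_pool)
        search_order.extend(
            pool for pool in spares_by_pool
            if pool not in (member_pool, default_pool)
        )
    for pool in search_order:
        spares = spares_by_pool.get(pool, [])
        if spares:
            return pool, spares[0]
    return None
```

In this model, a failed member of pool A1 with preferred_only set to True never receives a spare from pool A2. That matches the intent of keeping array members inside their own physical domain.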
195. and determine why that configuration is not valid.
3. Correct your configuration by reconfiguring the SSA cables or by removing the excess devices or adapters from the loop.
4. Switch on the using system.
Service Hint: Cables can easily become crossed. If you still have problems, disconnect all the cables from the SSA adapter, then reconnect them one at a time. For each cable that you reconnect, run the Link Verification service aid to check whether the configuration is as you expect.
If the SRN occurred because additional devices or adapters were added to a working SSA loop:
1. Remove the additional devices or adapters that are causing the problem, and put the loop back into its original, working configuration.
Note: It is important that you do these actions, because they enable the configuration code to reset itself from the effects of the error.
2. Review the configuration that you are trying to make, and determine why that configuration is not valid.
3. Correct your configuration by reconfiguring the SSA cables or by removing the excess devices or adapters from the loop.
SSA Maintenance Analysis Procedures (MAPs)
The maintenance analysis procedures (MAPs) describe how to analyze a failure that has occurred in an SSA loop.
How to Use the MAPs
Attention: Unless the using system needs to be switched off for some other reason, d
196. any disk drives have failed, first exchange those disk drives for new disk drives, and assign them to the use that has been specified by the user. If the problem remains, continue with the next step.
4. Type smitty ssaraid and press Enter.
5. Select List Status of Hot Spare Protection for an SSA RAID Array.
6. Select the adapter that logged the error.
7. Observe the displayed list. Note which SSA loop contains any array member disk drives that are listed with a protected state of no. Devices that are in pool_A0 are connected to loop A. Devices that are in pool_B are connected to loop B.
8. If the status field for the unprotected array member disk drive is too_large, the hot spare disk drive is not large enough to support that array component. Otherwise, no hot spare disk drive exists on the loop that you noted earlier.
9. Return to the SSA RAID Arrays menu and select Change/Show Use of an SSA Physical Disk.
10. Select the adapter that logged the error.
11. Find an appropriate disk drive and change its use to hot spare.
A RAID array is configured to use hot spare disk drives, but no disk drive can be found that is configured as a hot spare. This error code is used only when no hot spare pools have been assigned. All arrays and hot spare disk drives are in the default pool 0.
SRN 49510
Description: Hot spare configuration is not
197. apter assembly contains parts that are electrostatic-discharge (ESD) sensitive. Use the tools and procedures defined by your organization to protect such parts.
• The Fast-Write Cache Option card might contain customer data.
1. Remove the adapter from the using system (see the Installation and Service Guide for the using system).
Figure 50. Releasing the Battery Assembly
4. Attention: The battery falls when it is released from the Fast-Write Cache Option card. Hold the Fast-Write Cache Option card approximately 10 mm (0.5 in.) above a work surface.
5. Press the clip. The battery assembly falls through the hole in the Fast-Write Cache Option card.
6. Refer to Figure 51.
Figure 51. Disconnecting the Battery Assembly
7. Invert the Fast-Write Cache Option card.
8. Carefully unplug the battery assembly from the connector.
Installing a Battery Assembly into the Fast-Write Cache Option Card of an Advanced SerialRAID Adapter
Attention: The adapter assembly contains parts that are electrostatic-discharge (ESD) sensitive. Use the tools and procedures defined by your organization
198. apter card.
• The fast-write cache is not empty. Data is waiting to be written to another device. This message is displayed: Cannot be formatted because it is not empty
• The adapter card does not provide support for the Fast-Write Cache feature. This message is displayed: This adapter cannot be formatted
The ssa_progress -l pdisk command allows you to check the progress of the format operation that the ssa_format -l pdisk command started.
Flags
-l pdisk  Specifies the pdisk that you want to format.
-l SSA_Adapter  Specifies the adapter whose Fast-Write Cache Option Card you want to format.
-b  Specifies that the battery-age counter be reset. When this flag is used, the data on the Fast-Write Cache Option Card is not set to zero.
Important: Do not select the -b flag unless you have exchanged the battery. Otherwise, no error message will occur when the battery reaches the end of its life.
Output
The ssa_format command sends all error messages to stderr.
ssa_fw_status Command
Purpose
To show the status of the fast-write cache.
Syntax
ssa_fw_status -a Adapter -p -l -c
Description
The ssa_fw_status command displays the status of the fast-write cache of an SSA adapter.
Flags
-a Adapter  Shows all fast-write status for the specific adapter.
-l  Displays the expected life of the battery.
-p  Displays the number of hours for which the batt
199. apter that you want to inspect. A list of link speeds is displayed, as shown in this example screen:
LINK SPEED 802438
SSA Link Speed for systemname ssa0 00 03 IBM SSA 160 SerialRAID Adapter
To set or reset Identify, move cursor onto selection, then press <Enter>.
Source                 Speed   Destination
systemname ssa0 A1     40      systemname pdisk3
systemname pdisk3      40      systemname pdisk2
systemname pdisk2      20      systemname pdisk1
systemname pdisk1      40      systemname pdisk0
systemname pdisk0      40      systemname ssa0 A2
systemname ssa0 B1     40      systemname pdisk5
systemname pdisk5      40      systemname pdisk4
systemname pdisk4      40      systemname ssa0 B2
F3=Cancel  F10=Exit
The screen shows an adapter that has four disk drives connected between ports A1 and A2, and two disk drives connected between ports B1 and B2. The link between pdisk1 and pdisk2 is working at 20 MB per second. Each row in the list represents one SSA link.
The Source field represents the link end that is closest to the A1 or B1 port. The Destination field represents the link end that is closest to the A2 or B2 port.
Note: The Source and Destination fields have no other meaning. They do not indicate data flow.
The Speed field indicates the operating speed of the link, as follows:
00  The link is not operational.
20  20 MB per second
40  40 MB per second
Figure 54 gives a physical representation of the configuration that is shown on the
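As an illustration of how the Speed column can be read, the following Python sketch interprets rows like those in the example screen. It is not an SSA utility. The names SPEED_MEANING, describe_links, and links_needing_attention are hypothetical, and treating every link that is slower than the fastest operational link as worth checking is an assumption made for this example.

```python
# Meanings of the Speed field, as listed in the text above.
SPEED_MEANING = {
    "00": "not operational",
    "20": "20 MB per second",
    "40": "40 MB per second",
}


def describe_links(rows):
    """Translate (source, speed, destination) rows into readable tuples."""
    return [
        (src, dst, SPEED_MEANING.get(speed, "unknown speed code"))
        for src, speed, dst in rows
    ]


def links_needing_attention(rows):
    """Return rows that are not operational or slower than the fastest link."""
    operational = [int(speed) for _, speed, _ in rows if speed != "00"]
    fastest = max(operational, default=0)
    return [
        (src, speed, dst)
        for src, speed, dst in rows
        if speed == "00" or int(speed) < fastest
    ]
```

Applied to the loop A rows of the example screen, links_needing_attention returns only the pdisk2 to pdisk1 link, which runs at 20 MB per second.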
200. ar disk drives.
1. For fast path, type smitty lsidssaraid and press Enter. Otherwise, select List/Identify SSA Physical Disks from the SSA RAID Arrays menu.
2. The following information is displayed:
List/Identify SSA Physical Disks
Move cursor to desired item and press Enter.
  List Disks in an SSA RAID Array
  List Hot Spares
  List Rejected Array Disks
  List Array Candidate Disks
  List System Disks
  Identify Disks in an SSA RAID Array
  Identify Hot Spares
  Identify Rejected Array Disks
  Identify Array Candidate Disks
  Identify System Disks
  Cancel all SSA Disk Identifications
F1=Help  F2=Refresh  F3=Cancel  F8=Image  F9=Shell  F10=Exit  Enter=Do
Select the option that you want, and go to the instructions for that option.
Listing the Disk Drives in an SSA RAID Array
This option allows you to list the disk drives that are contained in a particular array.
1. For fast path, type smitty lssaraid and press Enter. Otherwise:
a. Select List/Identify SSA Physical Disks from the SSA RAID Arrays menu.
b. Select List Disks in an SSA RAID Array.
A list of arrays is displayed in a window:
List/Identify SSA Physical Disks
Move cursor to desired item and press Enter.
  List Disks in an SSA RAID Array
  List Hot Spares
  List Rejected Array Disks
  List Array Candidate Disks
  List System Disks
  Identify Disks in an SSA RAID Array
  Identify Hot Spares
  Identify Rejected Arra
201. aracter SSA UID that is shown on the label that is on the side of the disk drive. You can recognize the UID by its three-character suffix 00D.
• Run the ssacand command, and specify the adapter to which the physical disk is connected. For example: ssacand -a ssa0 -P
Using mkdev to Configure a Logical Disk
To use mkdev to configure an SSA logical disk, specify the following information:
Parent              ssar
Class               disk
Subclass            ssar
Type                hdisk
ConnectionLocation  15-character unique identifier of the logical disk
If the logical disk is a system disk, you can determine the unique identifier in three ways:
• If the logical disk is already defined, you can use the lsdev command to determine the unique identity, as follows:
1. Type lsdev -Ccdisk -r connwhere and press Enter.
2. Select the 15-character unique identifier (UID) for which characters 5 through 12 match the serial number that is on the front of the disk drive.
• Construct the 15-character unique identifier from the 12-character SSA UID that is shown on the label that is on the side of the disk drive. You can recognize the UID by its three-character suffix 00D.
• Run the ssacand command, and specify the adapter to which the logical disk is connected. For example: ssacand -a ssa0 -L
If the logical disk is an array, you can determine the unique identifier in two ways:
• If the logical disk is already defined, you can
202. are held reset, but remain powered on, disk drives 1 through 12 can still communicate with using system 2. Disk drives 13 through 16, however, cannot communicate with using system 2, because their data paths are through the adapters in using system 1. When using system 1 is rebooted, disk drives 13 through 16 remain unavailable for a long time.
Figure 7. Disk Drives Isolated by Failing Using System
Two Loops with One Adapter
If only one SSA adapter is contained in the SSA loops, the adapter can provide support for up to 96 disk drives (a maximum of 48 per loop). Figure 8 shows an example configuration that has two loops and one adapter.
Figure 8. Two Loops with One Adapter
Two Loops with Two Adapters
The two adapters can provide support for up to 96 SSA disk drives (a maximum of 48 per loop).
203. are not familiar with the selection of items from the graphics versions of the menus, use the smitty command. The menus will then appear as shown in this book.
2. If you use fast-path commands, you might need to go through intermediate steps that are not shown in this book. Also, some menus might be displayed slightly differently from those shown in this book.
Getting Access to the SSA RAID Array SMIT Menu
1. For fast-path access to the SSA RAID Array SMIT menus, type smitty ssaraid and press Enter. Otherwise:
a. Type smitty and press Enter. The System Management menu is displayed.
b. Select Devices. The Devices menu is displayed.
c. Select SSA RAID Arrays.
2. The SSA RAID Arrays menu is displayed:
SSA RAID Arrays
Move cursor to desired item and press Enter.
  List All Defined SSA RAID Arrays
  List All Supported SSA RAID Arrays
  List All SSA RAID Arrays Connected to a RAID Manager
  List Status Of All Defined SSA RAID Arrays
  List/Identify SSA Physical Disks
  List/Delete Old RAID Arrays Recorded in an SSA RAID Manager
  List Status of Hot Spare Pools
  List Status of Hot Spare Protection for an SSA RAID Array
  List Components in a Hot Spare Pool
  Add a Hot Spare Pool
  Add an SSA RAID Array
  Delete an SSA RAID Array
  Change/Show Attributes of an SSA RAID Array
  Change Member Disks in an SSA RAID Array
  Change/Show Use of an SSA Physical Disk
  Change Use of Multiple SSA Physical Disk
204. ast Write for this SSA Logical Disk, even if this involves discarding data in an inaccessible Fast Write Cache card. The data in the Fast Write Cache card is the most recent copy of some portions of the data on the SSA Logical Disk. Discarding this data may destroy the integrity of the data on the disk, resulting in system crashes, data corruption, and loss of system integrity. It is suggested that you try selecting no for this option first. Force Delete is applicable only if you are setting Enable Fast Write to no.
  yes
  no
F1=Help  F2=Refresh  F3=Cancel  F8=Image  F10=Exit  Enter=Do  /=Find  n=Find Next
Bypassing the Cache in a One Way Fast Write Network
If you are using the two-way fast-write function (that is, two adapters have access to a fast-write disk drive), you can choose whether you want fast-write operations to continue if the partner adapter fails or becomes not accessible. By default, fast-write operations continue.
1. For fast-path access to the Change/Show Characteristics of an SSA Logical Disk menu:
a. Type smitty chgssardsk and press Enter.
b. From the menu displayed, select the logical disk that you want to change.
Otherwise:
a. Select Change/Show Characteristics of an SSA Logical Disk from the SSA Logical Disks menu.
b. From the menu displayed, select the logical disk that you want to change.
A list of options for the logical disk drives is displa
205. at command 358 ssa_fw_status command 360 ssa_getdump command 361 ssa_progress command 365 ssa_rescheck command 366 ssa_servicemode command 368 ssa_sesdid command 341 ssa_speed command 369 ssaadap command 343 ssacand command 344 ssaconn command 347 ssadisk command 349 ssadload command 350 ssaencl command 355 ssaidentify command 364 ssavfynn command 371 ssaxlate command 372 command syntax of the RAID Command Line Interface 237 commands add_hsm_pool_adap 83 addssaraid 140 chg_hsm_pool_adap 86 chgssadisk 72 144 chgssadisks 147 chgssardsk 214 chssaraid 135 exssaraid 95 142 iassaraid 127 icssaraid 125 ifssaraid 92 123 496 User s Guide and Maintenance Information commands continued inssaraid 121 issaraid 119 lassaraid 117 Icssaraid 115 lfssaraid 91 113 Inssaraid 111 Is_hsm_array_components 80 Is_hsm_array_status 77 Is _hsm_status 74 Isdssaraid 100 Isidssaraid 108 Ismssaraid 102 Issaraid 109 Isssaraid 101 Istssaraid 104 mkssaraid 60 nvrssaraid 130 redssaraid 138 rmssaraid 70 smit smitty 59 90 98 213 ssa_certify 345 ssa_delete_copy 171 ssa_diag 348 ssa_ela 353 ssa_format 358 ssa_fw_status 360 ssa_getdump 361 ssa_identify_cancel 129 ssa_make_copy 161 ssa_progress 365 ssa_rescheck 366 ssa_servicemode 368 ssa_sesdid 341 ssa_speed 369 ssaadap 343 ssacand 344 ssaconn 347 ssadisk 349 ssadiload 350 ssadlog 213 ssaencl 355 ssafastw 215 ssaidentify 364 ssaraid
206. ating system, the state in which a process runs in kernel mode. Contrast with user mode.
kilobyte (KB). 1000 bytes.
L
LBA. Logical block address.
logical disk. An hdisk. See hdisk.
LUN. Logical unit. See also hdisk.
M
maintenance analysis procedure (MAP). A service procedure for isolating a problem.
MAP. See maintenance analysis procedure.
MB. Megabyte.
megabyte (MB). 1000000 bytes.
member disk. A disk drive that is part of a RAID array.
metadata. Data that describes data objects.
microcode. One or more microinstructions used in a product as an alternative to hard-wired circuitry to implement functions of a processor or other system component.
mirrored pair. Two disk drives that contain the same data and are referred to as one entity by the using system.
mirroring. The process of writing the same data to two disk drives at the same time. The two disk drives become a mirrored pair. The system can therefore continue to operate correctly when one of the mirrored disk drives fails.
N
node. In a network, a point at which one or more functional units connect channels or data circuits. For example, in an SSA subsystem, a disk drive or an adapter.
O
object data manager (ODM). In the operating system, a data manager intended for the storage of system data.
ODM. Object data manager.
Offline state. The state that a RAID array enters when two or more member disk drives become missing.
P
page split
207. available for a short period. Run the error log analysis to determine whether the disk drive should be exchanged for a new one. When the error is logged against the adapter, it indicates that the adapter has received a report of a status that is not valid. The adapter cannot, however, determine which disk drive sent the bad data. Run diagnostics to all SSA disk drives. If no failure is found, the log might have been caused by a link error.
SSA_DISK_ERR1 (1D2E2C3B)  An SSA disk drive has received a command or parameter that is not valid. This error might be caused by:
• A software error in the adapter
• A software error in the disk drive
• A hardware error
SSA_DISK_ERR2 (928F5165)  The disk drive has performed an internal error-recovery operation. No action is needed.
SSA_DISK_ERR3 (8BDD5B42)  The disk drive has performed internal media maintenance. No action is needed.
SSA_DISK_ERR4 (F7863CFE)  One of the following has occurred:
• The disk drive has had an unrecovered hardware error.
• The disk drive has had a hardware error that is now recovered, but the disk drive is reporting that it might be going to fail.
SSA_ENCL_ERR1 (BD797922)  Errors of this type are logged when an enclosure (for example, a 7133 Model D40 or T40) reports a failure. The SRN indicates the service procedures that must be performed.
SSA_ENCL_ERR2 (ASBEDOBC)  Errors of this type are logged when an enclosure (for example, a
208. ay. If, however, you remove or switch off a disk drive that has the Identify function set, the function remains set on that disk drive. When you reinstall or switch on the disk drive, the disk drive Check light continues to flash. Under these conditions, you can reset the Identify function by either of the following methods:
• Type on the command line: ssaidentify -l pdiskName -n
• Select the Link Verification service aid for that disk drive.
Starting the SSA Service Aids
To start the SSA service aids:
1. Start the using-system diagnostics (see the Diagnostic Information for Multiple Bus Systems manual), and go to the Diagnostic Operating Instructions.
2. Follow the instructions to select Function Selection.
3. Select Task Selection from the Function Selection menu.
4. Select SSA Service Aids from the Tasks Selection list. The SSA Service Aids menu is displayed:
SSA SERVICE AIDS 802380
Move cursor onto selection, then press Enter.
  Set Service Mode
  Link Verification
  Configuration Verification
  Format Disk
  Certify Disk
  Display/Download Disk Drive Microcode
  Link Speed
  Physical Link Configuration
  Enclosure Configuration
  Enclosure Environment
  Enclosure Settings
  SMIT SSA RAID Arrays
  SMIT SSA Disks
F3=Cancel  F10=Exit
Notes:
a. In some configurations of the using-system console: Esc and 0 = Exit; Esc and 3 = Cancel. In such configurations, however, the displayed instructions f
209. ay (hdisk) is in a different SSA enclosure. The hot spare disk drives are also in a different enclosure. Pools A1 and A2 each contain an hdisk and a hot spare disk drive. The pools ensure that, if any one SSA enclosure fails completely, three disk drives are always available for each hdisk.
Figure 19. Pools and Hdisks across Enclosures
Figure 20 shows an alternative method of protecting RAID-5 arrays against the complete failure of an SSA enclosure. This method uses a different hot spare disk drive to protect each member of the array.
Figure 20. Pools along and Hdisks across Enclosures
Figure 21 shows how a RAID-10 array can be protected against the complete failure of an SSA enclosure.
Figure 21. Pools and H
210. ay member disk drive. Valid values for status are:
good  The disk drive is working.
not_present  The disk drive cannot be detected. It has been removed or it has failed.
too_large  The member disk drive is too large to be protected by one of the hot spare disk drives that is in the pool.
Note: The size of the member disk drive is not the physical size of the disk drive, but the size that the array manager assigns to it. For example, if a RAID-10 array is created from three 9 GB disk drives and one 18 GB disk drive, the size that is assigned to each array member disk drive is 9 GB. The 18 GB disk drive can still be protected by a 9 GB hot spare disk drive.
wrong_pool  This member disk drive of the array has been replaced with a hot spare disk drive from another pool. This action has occurred because no hot spare disk drive was available in this pool when the array member disk drive failed. When all failed disk drives have been replaced, this array member disk drive should be exchanged with a disk drive that is in the same physical domain as are the other disk drives in the pool.
Listing the Disks That Are in a Hot Spare Pool
This option shows you all the disk drives that are in a hot spare pool, and shows the status of each disk drive.
1. For fast path, type smitty ls_hsm_array_components and press Enter. Otherwise, select List Components in a Hot Spare Pool from the SSA RAID Ar
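The note about assigned sizes can be made concrete with a small calculation. This Python sketch assumes that the array manager assigns every member the size of the smallest member disk drive. That is consistent with the 9 GB and 18 GB example above, but it is not stated as a general rule in this fragment. The function names are hypothetical.

```python
def assigned_member_size(member_sizes_gb):
    """Size assigned to each member (assumed: the smallest member size)."""
    return min(member_sizes_gb)


def spare_is_large_enough(member_sizes_gb, spare_size_gb):
    """True if a hot spare of this size could protect any member of the array."""
    return spare_size_gb >= assigned_member_size(member_sizes_gb)
```

For the array in the note (three 9 GB disk drives and one 18 GB disk drive), assigned_member_size returns 9, so a 9 GB hot spare is large enough. An array of two 18 GB members with only a 9 GB spare corresponds to the too_large status.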
211. ayed instructions select a disk drive to remove from the array and the disk drive to add to the array The disk drive to remove is listed as not_present the disk drive to add is the disk drive that you tested or exchanged in the previous steps When the exchange is complete the array starts to copy its data to the coupled disk drive to verify the repair 25 from step bab Does the Link Verification service aid indicate an open loop NO Go to step bal YES 26 from step Bs Does any SSA disk drive have its Check light on NO The disk drive might have been removed from the subsystem a Reinstall the removed drive or select a new disk drive for addition to the array b Type smitty ssaraid and press Enter Select Change Show Use of an SSA Physical Disk The pdisk that has been exchanged is listed under SSA Physical Disks that are system disks 468 User s Guide and Maintenance Information YES 2 Select the pdisk from the list and change the Current Use parameter to Array Candidate Disk Select Change Member Disks in an SSA RAID Array Select Swap Members of an SSA RAID Array Select the hdisk that is in the Degraded copy state that is the hdisk that you noted in step padon a Referring to the displayed instructions select a disk drive to remove from the array and the disk drive to add to the array The disk drive to remove is listed as not_present the disk drive to add is the disk drive that you r
212. battery is discharged, and data has been lost. If the adapter card has been switched off for fewer than seven days, exchange the Fast-Write Cache Option Card for a new one, then do the following:
1. Ask the customer to disable the Fast-Write option for:
• Each device for which the Fast-Write option is offline
• All other devices that are connected to the failing adapter and have the Fast-Write option enabled
2. If the Fast-Write option has been disabled for a RAID-5 array, the hdisk for that array can no longer be configured. Delete the RAID-5 array, then recreate it.
3. Ask the customer to re-enable the Fast-Write option for the devices that are attached to the Fast-Write Cache Option Card.
SRN 42527
Description: A dormant fast-write cache entry exists. The fast-write cache contains unsynchronized data for a disk drive that is no longer available.
Possible causes: User or service action.
Action: If possible, reconnect the disk drive to the adapter to enable the data to be synchronized. If you cannot reconnect the disk drive (for example, because the disk drive has failed), the user should delete the
SRN 42528
Description: A fast-write disk drive has been detected that was previously unsynchronized, but has since been configure
213. ber disk drives that hold configuration sectors in the primary half of the array is available. Multiple States: Different member disk drives of a RAID 10 array can be in different states. For example, one mirrored pair might be rebuilding while, in a different pair, one member disk drive is missing and the remaining member is in the Degraded state. A priority of array states of the different members is used when the state of an array is reported (highest priority first): Unknown, Offline, Exposed, Degraded, Rebuilding, [...]. An error is logged whenever the state of an array changes, unless the state changes to Good or Rebuilding. Chapter 4. Using the SSA SMIT Menus. This chapter describes how to use the system management interface tool (SMIT) to display and change characteristics of the SSA devices, and to access various service functions. Three SSA menus are available through the SMIT Devices menu: SSA Adapter, SSA Disks, and SSA RAID Arrays. Notes: 1. Although this book always refers to the smitty commands, you can use either the smitty command or the smit command. The procedures that you follow remain the same whichever of the two commands you use. If you send the smit command from a graphics terminal, however, the menus are displayed slightly differently from those shown in this book. If you are not familiar with the selection of items from the graphics versions of
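The state-priority rule in excerpt 213 can be illustrated with a short Python sketch. It is not part of the adapter software: the function name is hypothetical, and only the five states that are legible in the excerpt are ranked, because the last entry of the priority list cannot be read in the source.

```python
# Priority order as listed in excerpt 213, highest priority first.
# The final entry of the list is illegible in the source, so it is omitted.
STATE_PRIORITY = ["Unknown", "Offline", "Exposed", "Degraded", "Rebuilding"]


def reported_state(member_states):
    """Return the highest-priority state among the states supplied.

    Hypothetical helper: states that are not in STATE_PRIORITY are ignored,
    and a ValueError is raised if none of the supplied states is recognised.
    """
    known = [s for s in member_states if s in STATE_PRIORITY]
    if not known:
        raise ValueError("no recognised state supplied")
    # The state with the smallest index has the highest priority.
    return min(known, key=STATE_PRIORITY.index)
```

For the example in the text, where one mirrored pair is rebuilding and another pair has a member in the Degraded state, reported_state(["Rebuilding", "Degraded"]) gives "Degraded".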
214. bles you to e Create a RAID Copy array from available candidate disk drives e Couple the RAID Copy array to your selected parent RAID 1 or RAID 10 array Copy data from the parent array to the RAID Copy array e Delete the RAID Copy array when the copy data is no longer required Select Array Copy Services from the SSA RAID Arrays menu 2 Select Prepare a RAID Array Copy A list of the RAID 1 and RAID 10 arrays that are not coupled to any copy arrays is displayed for example A i N Array Copy Services l e Uncouple the RAID Copy array from the parent array l l l Move cursor to desired item and press Enter Prepare a RAID Array Copy Prepare Volume Group Logical Volumes or Filesystems Copy Uncouple a RAID Array Copy Uncouple a Volume Group Logical Volumes or Filesystems Copy List All Copy Candidates List All Uncoupled Copies List All Uncoupled Volume Groups Delete a RAID Array Copy Delete a Volume Group Logical Volumes or Filesystems Copy Array to be Copied Move cursor to desired item and press Enter hdisk4 RAID 1 array hdisk5 RAID 1 array F1l Help F2 Refresh F3 Cancel F8 Image F1Q Exit Enter Do F1l Find n Find Next SS Sic a a PT ng J Chapter 7 Copying Data from Arrays and from Volume Groups 155 3 Select the RAID 1 or RAID 10 array hdiskx that you want to copy and press Enter The Prepare a Copy menu is displayed a Type or select values in entry field
215. bsystem but that are not causing application programs to fail. The third entry instructs the run_ssa_encl_healthcheck shell script to run at 30 minutes past each hour. This shell script searches for SSA enclosures that provide support for SCSI Enclosure Services (SES). If any of those enclosures detects errors, the shell script causes entries to be made in the error log. The fourth entry instructs the run_ssa_link_speed shell script to run at 04:30 each day. This shell script searches for SSA links that are not running at the best speed. If one is found, an entry is made in the system error log. Note: You can use 20 MB-per-second (black) SSA cables to connect 40 MB-per-second SSA nodes. If you do, however, the run_ssa_link_speed shell script might cause errors to be logged that can be solved only by exchanging the black cables for 40 MB-per-second (blue) SSA cables. If you are using 20 MB-per-second SSA cables to connect 40 MB-per-second SSA nodes, therefore, delete the run_ssa_link_speed cron entry. Microcode Maintenance: This section describes how to: check the ID and the level of the microcode package; maintain the adapter microcode; maintain the disk drive microcode. Checking the ID and Level of the Microcode Package: For some problems, the service request number (SRN) might ask you to check the microcode package ID or the microcode level before you exchange any field-replaceable units (FRUs). To determine the microcode pac
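The two schedules named in excerpt 215 can be written as standard five-field cron time specifications. The Python sketch below shows only those time fields and a minimal matcher for them. The installed paths of the shell scripts are not given in this excerpt, so none are shown, and the matcher is an illustration that handles only plain numbers and "*", not a full cron parser.

```python
from datetime import datetime

# Standard cron time fields: minute hour day-of-month month weekday.
SCHEDULES = {
    "run_ssa_encl_healthcheck": "30 * * * *",  # 30 minutes past each hour
    "run_ssa_link_speed": "30 4 * * *",        # 04:30 each day
}


def matches(spec, when):
    """Return True if `when` matches a spec made only of numbers and '*'."""
    fields = spec.split()
    # cron counts Sunday as 0; datetime.isoweekday() counts Sunday as 7.
    actual = [when.minute, when.hour, when.day, when.month,
              when.isoweekday() % 7]
    return all(f == "*" or int(f) == a for f, a in zip(fields, actual))
```

With these fields, the enclosure health check matches 05:30 and 06:30 alike, while the link-speed check matches only 04:30.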
216. call does not use it. Summary of SSA Error Conditions: If an open or ioctl subroutine that has been issued to an SSA adapter fails, the subroutine returns -1, and the global variable errno is set to a value from the file /usr/include/sys/errno.h. Possible errno values for the SSA adapter device driver are: EINVAL: An unknown ioctl was attempted, or the parameters supplied were not valid. EIO: An I/O error occurred. ENOMEM: The command could not be completed because not enough real memory or paging space was available. ENXIO: The requested device does not exist. Managing Dumps: The SSA adapter device driver is a target for the system dump facility. The DUMPQUERY option returns a minimum transfer size of 0 bytes and a maximum transfer size that is appropriate for the SSA adapter. To be processed, calls to the SSA adapter device driver DUMPWRITE option should use the arg parameter as a pointer to the SSA_Ioreq_t structure, which is defined in /usr/include/sys/ssa.h. Using this interface, commands for which the adapter provides support can be run on a previously started (opened) target device. The SSA adapter device driver ignores the uiop parameter. Note: Only the SsaMCB.MCB_Result field of the SSA_Ioreq_t structure is set at completion of the DUMPWRITE. During the dump, no support is provided for error logging. If the dddump entry point completes successfully, it returns a 0. If the entry point does not complete successfully, it returns
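The four errno values listed in excerpt 216 can be turned into readable messages with Python's standard errno module, as in the sketch below. The table and the function name are illustrative only. The wording is taken from the excerpt, and any other errno value falls back to the operating system's own message.

```python
import errno
import os

# Meanings given in excerpt 216 for the SSA adapter device driver.
SSA_ADAPTER_ERRNO = {
    errno.EINVAL: "An unknown ioctl was attempted, or the parameters "
                  "supplied were not valid.",
    errno.EIO: "An I/O error occurred.",
    errno.ENOMEM: "Not enough real memory or paging space was available.",
    errno.ENXIO: "The requested device does not exist.",
}


def explain(exc):
    """Map an OSError from open() or ioctl() to the manual's description.

    Hypothetical helper: values outside the table use os.strerror().
    """
    return SSA_ADAPTER_ERRNO.get(exc.errno, os.strerror(exc.errno))
```

A caller would wrap the open or ioctl call in try/except OSError and pass the exception to explain().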
217. caution notice indicates the presence of a hazard that has the potential of causing moderate or minor personal injury. This book contains two caution notices. Those caution notices are in this safety section. An attention notice indicates an action that could cause damage to a program, device, system, or data. Safety Notice for Installing, Relocating, or Servicing: Before connecting or removing any cables to or from connectors at the using system, be sure to follow the steps in the installation or relocation checklist specified in the Installation and Service Guide for your using system. For safety checks when servicing, refer to that manual and to the Installation and Service Guide for your subsystem. CAUTION: A lithium battery can cause fire, explosion, or a severe burn. Do not recharge, disassemble, heat above 100°C (212°F), solder directly to the cell, incinerate, or expose cell contents to water. Keep away from children. Replace only with the part number specified with your system. Use of another battery might present a risk of fire or explosion. The battery connector is polarized; do not try to reverse the polarity. Dispose of the battery according to local regulations. Each Advanced SerialRAID Adapter card contains a lithium battery. CAUTION: The Fast Write Cache Option Card contains a nickel-cadmium (NiCad) battery. To avoid possible explosion, do not incinerate the battery. Exchange it only with a manufacturer-approved part
218. cfgmgr to reconfigure the new hdisk b You can disable the fast write function from this menu only if no data for your selected device is present in the fast write cache If data for your selected 214 User s Guide and Maintenance Information device is present in the fast write cache and you want to disable the fast write function go to Enabling or Disabling Fast Write for Multiple Devices This option allows you to enable or disable the fast write function on multiple devices You can select multiple devices from the list that this option displays The displayed list contains also offline and broken cache items so that you can delete them 1 For fast path access to the Enable Disable Fast Write for Multiple Devices menu a Type smitty ssafastw and press Enter b From the menu displayed select all the logical disk drives for which you are enabling or disabling the fast write function Otherwise a Select Enable Disable Fast Write for Multiple Devices from the SSA Logical Disks menu b From the menu displayed select all the logical disk drives for which you are enabling or disabling the fast write function 2 The Enable Disable Fast Write for Multiple Devices menu appears a l D Enable Disable Fast Write for Multiple Devices Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields List of Devices hdisk1 Enable Fast Write no Force Delete no BOTTOM Fl Help
219. come the member disk of the new array. -d specifies that a system disk is to be attached to the new array. Example 2: To Create a RAID 1 Array. This example shows how to use two SSA physical disks to create a RAID 1 array. The attributes of the disks are all set to their default values. Type the command: ssaraid -C -l ssa0 -t raid_1 -s pdisk0 pdisk1 -d where: -C specifies that this operation is a create operation. -l ssa0 specifies that RAID Manager ssa0 is to be used. -t raid_1 specifies that a RAID 1 array object is to be created. -s pdisk specifies the free pdisk that is to become the member disk of the new array. RAID 1 arrays provide support for only two member disk drives. The first disk drive that is specified becomes the primary member disk drive. -d specifies that a system disk is to be attached to the new array. Example 3: To Create a RAID 5 Array. This example shows how to use three SSA physical disks to create a RAID 5 array. The attributes of the disks are all set to their default values. Type the command: ssaraid -C -l ssa0 -t raid_5 -s pdisk0 pdisk1 pdisk2 -d where: -C specifies that this operation is a create operation. -l ssa0 specifies that RAID Manager ssa0 is to be used. -t raid_5 specifies that a RAID 5 array object is to be created. -s pdisk specifies the free pdisk that is to become the member disk of the new array
220. ctors B1 and B2 of the same SSA adapter Disk drives 13 through 16 are connected to connectors A1 and A2 of a different SSA adapter H Although the missing disk drive is reported as an error all the remaining disk drives can still communicate with the using system Disk drives 1 and 2 can communicate through connector A1 of the SSA adapter J Disk drives 4 through 8 can communicate through connector A2 of the SSA adapter Disk drives 9 through 12 can communicate through connectors B1 and B2 of the same SSA adapter normal loop disk drives 13 through 16 can communicate through connectors A1 and A2 of the SSA adapter H 406 User s Guide and Maintenance Information Rene system A1 A2 At a2 EB e2 B1 B2 Using system Ai a2 Bi B2 Figure 58 Broken Loop Disk Drive Removed Disk Disk Disk Disk Disk Disk Disk Disk 16 15 14 13 12 11 10 9 Disk Disk Disk Disk Disk Disk Disk 1 2 4 5 6 7 8 Chapter 17 SSA Service Aids 407 For this example the Link Verification service aid displays the following information ie LINK VERIFICATION 802386 SSA Link Verification for systemname ssaQ 00 04 IBM SSA 160 SerialRAID Adapter To Set or Reset Identify move cursor onto selection then press lt Enter gt Physical Serial Adapter Port Al A2 Bl B2 Status TOP systemname pdisk11 AC50AE43 0 Good systemname pdisk8 AC706EA3 1 Good 2222 systemname pdisk3 AC1DBEF4 4 Good
221. d and the array goes into the Degraded state (see [...] for more information). Command line parameters are available that allow you to prevent such write operations. a. Type smitty ssaraid and press Enter. b. Select Change/Show Use of an SSA Physical Disk. The status of the disk drives that are connected to the using system is displayed. c. Go to step 18. 18 (from step 17): Are any disk drives listed as SSA physical disks that are rejected? NO: A disk drive has not been detected by the adapter. Go to step [...]. YES: a. Run diagnostics in System Verification mode to all the disk drives that are listed as rejected. b. Run the Certify service aid (see [...]) to all the disk drives that are listed as rejected. c. If problems occur on any disk drive, exchange that disk drive for a new disk drive (see [...]); continue from step [...] in this procedure. d. A disk drive that is listed as rejected is not necessarily failing. For example, the array might have rejected the disk drive because a power problem or an SSA link problem caused that drive to become temporarily unavailable. Under such conditions, the disk drive can be reused. If you think that a disk drive has been rejected because it is failing, check the error log history for that disk drive. For example, if you suspect pdisk3, type on the command line: ssa_ela -l pdisk3 -h 5 This command causes the error log fo
222. d changes Entry Fields SSA RAID Manager ssal Spares Pool pool_B3 Components to Add g Hot Spares Minimum 1 Fl Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image T F10 Exit Enter Do This menu automatically assigns the next available pool number to the new pool 84 Users Guide and Maintenance Information The Hot Spares Minimum field contains a default value of 1 This value defines the minimum number of spares that can exist in the pool before an error condition is logged You should normally set this field to the number of hot spare disk drives that you intend to assign to this pool You can however set it to a lower number if you do not want to be alerted when a single hot has been used see Select Components to Add and press the List key A list of valid hot spare pool candidates is displayed This list contains RAID disk drives hot spare disk drives and free disk drives that are now in pool zero on the selected SSA loop a N Add a Hot Spare Pool 2 ee Pr Components to Add Move cursor to desired item and press F7 ONE OR MORE items can be selected Press Enter AFTER making all selections SSA physical disks that are members of an array hdisk4 raid_10 pdisk13 AC7AAB76 4 02 REGY 06 P 18 2GB good pdisk3 AC7AAOB2 04 02 REGY 05 P 9 2GB good pdisk7 AC7AAOBD 04 02 REGY 01 P 9 2GB good SSA physical disks that are hot spare disks pdisk9 AC7AA2D6 04 02 REGY 02 P 9 2GB
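The Hot Spares Minimum rule in excerpt 222 can be sketched as a one-line check in Python. This is an interpretation rather than the adapter's documented logic: it assumes that an error condition is logged when the number of hot spare disk drives left in the pool is lower than the configured minimum. The function name is hypothetical.

```python
def spare_alarm(available_spares, hot_spares_minimum=1):
    """Return True when the pool holds fewer spares than the minimum.

    ASSUMPTION: "before an error condition is logged" is read as
    available_spares < hot_spares_minimum. The default of 1 is the
    default value that the excerpt gives for the Hot Spares Minimum field.
    """
    return available_spares < hot_spares_minimum
```

Under this reading, a pool created with two hot spares and a minimum of 2 raises the alarm as soon as one spare is used, whereas a minimum of 1 delays the alarm until both have been used.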
223. d command with the -C option to create a RAID 5 array: fw_max_length (default = 128 x (n - 1), where n is the number of member disk drives in the array). This attribute sets the maximum size, in blocks, of write operations to the cache. Write operations that are larger than the specified value write data directly to the array and do not use the fast write cache. Note: You can set the maximum length to be less than, but not greater than, the default length. Any length that is greater than the default length is ignored, and the default is used. strip_size (default = 64). This attribute is used only when the array is created. The strip size is the maximum amount of contiguous data that is mapped to one member disk drive. Valid values are 64 (32 KB, or 64 x 512-byte blocks) and 128 (64 KB, or 128 x 512-byte blocks). Creation and Change Attribute for RAID 10 Arrays Only: You can specify the following attribute with the -a option when you are using the ssaraid command with the -C option to create a RAID 10 array: strip_size (default = 32). This attribute is used only when the array is created. The strip size is the maximum amount of contiguous data that is mapped to one member disk drive. Valid values are 32 (16 KB, or 32 x 512-byte blocks), 64 (32 KB, or 64 x 512-byte blocks), and 128 (64 KB, or 128 x 512-byte blocks). RAID Arrays Change Attributes: This section describes the creation and change attributes that you can use for: All RAID arrays, Hot spa
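The block arithmetic in excerpt 223 can be checked with a few lines of Python. The strip-size conversion follows directly from the 512-byte block size stated in the text. The fw_max_length default is an assumption: the extracted formula reads "128 x n 1", which is taken here to mean 128 x (n - 1). All function names are illustrative.

```python
BLOCK_BYTES = 512  # block size stated in the excerpt


def blocks_to_kb(blocks):
    """Convert a size in 512-byte blocks to kilobytes."""
    return blocks * BLOCK_BYTES // 1024


def default_fw_max_length(n_members):
    """Default fw_max_length in blocks for an array of n member drives.

    ASSUMPTION: the garbled source formula "128 x n 1" means 128 x (n - 1).
    """
    if n_members < 2:
        raise ValueError("an array needs at least two member disk drives")
    return 128 * (n_members - 1)


def effective_fw_max_length(requested, n_members):
    """Apply the rule that a value above the default is ignored."""
    default = default_fw_max_length(n_members)
    return requested if requested <= default else default
```

For example, blocks_to_kb(64) gives 32, which matches the "64 (32 KB)" strip size in the excerpt.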
224. d on a different adapter Action If this disk drive contains data that should be kept return the disk drive to the adapter to which it was previously connected If the disk drive does not contain data that should be kept 1 Physically remove the disk drive from the system configuration ine items see es on page jj 2 Ask the user to delete all offl 3 When the items have been deleted reinstall the disk drive that you have just removed User or service action 42529 Description The fast write cache is inactive The battery is in a The using system has recently fast charge operation It remains in the fast charge operation for up to one been switched on The battery hour after the adapter has been connected to the power During this time is still charging the fast write function remains inactive Inactive means that although fast write disk drives can be enabled and accessed they are not Possible FRUs using the fast write function Fast Write Cache Option f Card battery 100 Action If the using system has been switched on for less than one hour wait for the battery to complete charging If the using system has been switched on for more than one hour run Adapter on page 334 diagnostics to the adapter in System Verification mode If the same SRN is generated exchange the FRU for a new FRU 4252A Description The supply voltage to the Fast Write Cache Option Card is Possible FRUs low
225. d the SSA subsystem, ensure that the latest versions of microcode and software have been installed. If the system is still operational, and you have any hot spare disk drives attached to the adapter, an automatic dump might have been performed. Run ssa_getdump -l to see if any dump data is present. Software errors can result from hardware failures. Always solve hardware problems, therefore, before looking for software errors. Disk drive errors on SSA subsystems are logged against the physical disk drive (pdisk) rather than the logical disk drive (hdisk). If you are looking for the cause of a problem where the failing hdisk is known, you can use either of the following methods to find that cause: Use the Configuration Verification service aid, or give the ssaxlate -l hdisk command, to determine which pdisks are associated with the hdisk. Give the ssa_ela -l hdisk command to run error log analysis. When ssa_ela is run to an hdisk, it performs an error log analysis for all the devices that support that hdisk. Those devices are one or more adapters and one or more pdisks. The following example shows a part of an SSA error log. See the using-system documentation for a detailed description of all the fields that appear in the error log display. LABEL: SSA_LINK_OPEN IDENTIFIER: 625E6B9A Date/Time: Tue 23 Sep 03:00:00 Sequence Number: 640 Machine Id: 00400076C400 Node Id: identity Class: H
226. [Figure 27 diagram: source volume group (hd_1, hd_3, data_fs1, lv_A, data_fs2, lv_C, lv_B); copy physical volumes (hd_1_cp, hd_2_cp); copy by LV/FS name (fsdata_fs1, fslv_A, fslv_B).] Figure 27. Copying One Logical Volume or Copying by FS Name (1). Figure 27 shows, from left to right: the parent array that contains the source volume group; the empty RAID Copy array coupled to the parent array; the uncoupled RAID Copy array that now contains a copy of the logical volume. Note that, in the copy, all names start with fs. Example 4: Copying One Logical Volume by Logical Volume Name or by FS Name (2). In this example, you are copying one logical volume (lv_C) from the parent array to the RAID Copy array. You can use either the logical volume name (lv_C) or the FS name (data_fs2). To copy one logical volume by logical volume name, give the commands: ssa_make_copy -P -l lv_C ssa_make_copy -l lv_C To copy one logical volume by fs name, type
227. dentification Lights yes F1 Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image F9 Shel F1O Exit Enter Do Ne 4 Select yes in the Flash Disk Identification Lights field 5 Press the List key to list the disk drives 6 From the displayed list select the disk drives that you want to identify The Check light flashes on each disk drive that you have selected 122 User s Guide and Maintenance Information Identifying Rejected Array Disk Drives This option allows you to identify disk drives that have been rejected probably because of failure from arrays 1 For fast path type smitty ifssaraid and press Enter Otherwise a Select List Identify SSA Physical Disks from the SSA RAID Arrays menu b Select Identify Rejected Array Disks 2 A list of arrays is displayed in a window ia gt List Identify SSA Physical Disks Move cursor to desired item and press Enter List Disks in an SSA RAID Array List Hot Spares List Rejected Array Disks List Array Candidate Disks List System Disks Identify Disks in an SSA RAID Array Identify Hot Spares Identify Rejected Array Disks Identify Array Candidate Disks SSA RAID Array Move cursor to desired item and press Enter hdisk3 095231779F0737K good 3 4G RAID 5 array hdisk4 09253173A02137K good 3 4G RAID 5 array Fl Help F2 Refresh F3 Cancel F8 Image F1Q Exit Enter Do Find n Find Next KS A Select the array whose rejected disk
228. /dev/ssan SSA_GET_ENTRY_POINT (SSA Adapter Device Driver ioctl Operation). Purpose: To allow another kernel extension, typically an SSA head device driver, to determine the direct call entry point for the SSA adapter device driver. This operation is the entry point through which the head device driver communicates with the adapter device driver. The address that is supplied is valid only while the calling kernel extension holds an open file descriptor for the SSA adapter device driver. This operation is not valid for a user process. Description: The arg parameter specifies the address of a SSA_GetEntryPointParms_t structure in kernel address space. The SSA_GetEntryPointParms_t structure is defined in the /usr/include/sys/ssa.h file. On completion of the operation, the fields in the SSA_GetEntryPointParms_t structure are modified as follows: EntryPoint: Address of the direct call entry point for the SSA adapter device driver, which is used to submit operations from a head device driver. InterruptPriority: The off-level interrupt priority at which the calling kernel extension is called back for completion of commands that are started by calling the direct call entry point. Return Values: When completed successfully, this operation returns a value of 0. Otherwise, a value of -1 is returned, and the errno global variable is set to the following value: EINVAL: Indicates that the caller was not in ker
229. disk and stops all I/O operations. Now the copy can be uncoupled from the parent volume group. The uncouple process makes a RAID Copy array from the copy disk drives and clears the PVID of the copy disk drive. The uncouple process occurs on all hdisks that have been copied; it can occur only when all I/O operations have stopped. The timing is very important here: I/O operations must be stopped for the minimum time possible. The copy disk drives are now unattached RAID Copy arrays. I/O is restarted to the volume group. The fast write cache is reenabled. RAID Copy hdisks are configured from the RAID Copy arrays. The recreatevg command is run against the new hdisks. This command: changes the physical volume ID (PVID) references of the VGDA to suit the PVIDs of the new hdisks; renames the copy volume group and assigns a new VGID; renames the copy logical volumes (default prefix fs can be used); changes the root mount point for the file systems in the new volume group (default prefix fs can be used); mounts the file systems. Mounts the file systems when the copy operation completes. The default is not to mount the file systems. Causes file systems that have been mounted to be read-only file systems. Synchronizes the file systems, that is, flushes the data from the using-system memory onto the disk drive. Determines the name of the copy file system. If this flag is not provided
230. disks along Enclosures The primary disk drives of the array are in enclosure 1 the secondary disk drives are in enclosure 2 The secondary disk drives contain the same data as do the primary disk drives Pool A1 contains all the primary disk drives of the arrays and a hot spare disk drive pool A2 contains all the secondary disk drives and a hot spare disk drive If one enclosure fails completely the other enclosure can still recover from a disk drive failure because its disk drives and the hot spare disk drive are in the same pool Choosing How Many Hot Spare Disk Drives to Include in Each Pool The number of hot spare disk drives that can be included in a hot spare pool is limited only by the number of disk drives that are permitted on a single SSA loop When choosing how many disk drives to include in a hot spare pool think about how many disk drives the hot spare is protecting and how much time might elapse before a failed disk drive can be replaced Choosing the Error Threshold Alarm Level for a Hot Spare Pool Normally a hot spare pool reports an error when any hot spare disk drive has been used For some conditions such as a disk drive failure at an unattended site you might prefer to delay service activities until more than one disk drive in a hot spare pool has failed You can specify this requirement when you create a hot spare pool When you create the hot spare pool see g g set the Hot Spare Minimum parameter to be equal to t
231. dow a List Identify SSA Physical Disks D Move cursor to desired item and press Enter List Disks in an SSA RAID Array List Hot Spares List Rejected Array Disks List Array Candidate Disks List System Disks Identify Disks in an SSA RAID Array Identify Hot Spares Identify Rejected Array Disks Identify Array Candidate Disks SSA RAID Array Move cursor to desired item and press Enter hdisk3 095231779F0737K good 3 4G RAID 5 array hdisk4 09253173A02137K good 3 4G RAID 5 array Fl Help F2 Refresh F3 Cancel F8 Image F1Q Exit Enter Do Find n Find Next S 7 Select the array whose disk drives you want to identify Note The menu shown here is displayed when you select a RAID 5 array If you select a RAID 1 or a RAID 10 array two lists of disk drives are available One list is for primary disk drives the other is for secondary disk drives Chapter 6 Using the RAID Array Configurator 119 3 The following information is displayed ihe Identify Disks in an SSA RAID Array Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields SSA RAID Manager ssaQ SSA RAID Array hdisk2 Member Disks Flash Disk Identification Lights yes F1 Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image ieee F1O Exit Enter Do 4 Select yes in the Flash Disk Identification Lights field 5 Press the List key to list the disk drives 6 From the displayed list select the disk
232. dress using system address for this adapter. scat_gat_pages: Specifies the number of 4-kilobyte pages that the device driver has kept for the management of scatter/gather lists. If many large transfer operations are to be performed, think about increasing the value of this attribute if the I/O does not reach the expected rate. dma_mem: Specifies the size of the DMA memory. On systems that are using AIX Version 4.3 or above, the memory should be big enough to hold the biggest expected I/O load. If you specify too small a size, the system cannot reach the best possible I/O rate. poll_threshold: Specifies how many command completions must occur in a 10-millisecond period to cause the adapter device driver to switch so that it is driven by polling command completions rather than by interrupts. Polling might reduce the load on the system processor, but it can also cause longer I/O response times. Device-Dependent Subroutines: The SSA adapter device driver provides support only for the open, close, and ioctl subroutines. It does not provide support for the read and write subroutines. open and close Subroutines: The open and openx subroutines must be called by any application program that wants to send ioctl calls to the device driver. You can use the open or the openx subroutine call to open the SSA adapter device driver. If you use the openx subroutine call, set the ext parameter to 0, because the
233. drive that contains the configuration sector is present. In the primary half of the array, no members that hold configuration sectors are present. The Split Array Resolution flag is not set. All these three conditions exist: In the primary half of the array, disk drive members that hold configuration sectors are present. In the secondary half of the array, the member disk drive that contains the configuration sector is not present. The Split Array Resolution flag is set. In the primary and secondary halves of the array, member disk drives that hold configuration sectors are present, and the Split Array Resolution flag is set on the secondary half. The array, however, was not initialized correctly. Two failures in a configuration update (configuration sectors, fence sector, label sector, medium error table, unsync table). Both member disk drives of a mirrored pair are missing, deconfigured, or rebuilding. Unknown State: A RAID 10 array is in the Unknown state when not enough array members are present for the array configuration to be determined, that is, fewer than two of the first three members are present. Unless the Split Array Resolution flag is set, the array enters the Offline state, to allow split arrays to operate, if: the member disk drive that holds the configuration sector in the secondary half of the array is available, and neither of the mem
234. drives you want to identify Chapter 6 Using the RAID Array Configurator 123 3 The following information is displayed ihe Identify Rejected Array Disks Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields SSA RAID Manager ssaQ Rejected Array Disks Flash Disk Identification Lights yes Fl Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image F9 Shel F1O Exit Enter Do S 4 Select yes in the Flash Disk Identification Lights field 5 Press the List key to list the disk drives 6 From the displayed list select the disk drives that you want to identify The Check light flashes on each disk drive that you have selected 124 User s Guide and Maintenance Information Identifying Array Candidate Disk Drives This option allows you to identify disk drives that are available for adding to an array 1 For fast path type smitty icssaraid and press Enter Otherwise a Select List Identify SSA Physical Disks from the SSA RAID Arrays menu b Select Identify Array Candidate Disks 2 A list of adapters is displayed in a window r List Identify SSA Physical Disks Move cursor to desired item and press Enter List Disks in an SSA RAID Array List Hot Spares List Rejected Array Disks List Array Candidate Disks List System Disks Identify Disks in an SSA RAID Array Identify Hot Spares Identify Rejected Array Disks Identify Array Candidate Di
235. ds to create a copy array 151 using the SSA Command Line Interface for RAID 235 using the SSA command line utilities 341 using the SSA SMIT menus 38 508 User s Guide and Maintenance Information using the SSA Spare Tool 209 using the ssaraid command instead of SMIT 235 V vital product data 316 VPD vital product data 316 W write subroutine tmssa device driver 299 Part Number 27H0678 Printed in the U S A SA33 3285 02 1P P N 27H0678 Spine information Advanced SerialRAID Adapters User s Guide and Maintenance Information
236. dssaraid and press Enter Otherwise a Select Change Member Disks in an SSA RAID Array from the SSA RAID Arrays menu b Select Remove a Disk from an SSA RAID Array 2 Alist of arrays is displayed in a window A Change Member Disks in an SSA RAID Array Move cursor to desired item and press Enter Remove a Disk from an SSA RAID Array Add a Disk to an SSA RAID Array Swap Members of an SSA RAID Array SSA RAID Array Move cursor to desired item and press Enter hdisk3 095231779F0737K good 3 4G RAID 5 array hdisk4 09523173A02137K good 3 4G RAID 5 array Fl Help F2 Refresh F3 Cancel F8 Image F1O Exit Enter Do Find n Find Next XS D Select the array from which you want to remove a disk drive 138 User s Guide and Maintenance Information 3 The following information is displayed N Remove a Disk from an SSA RAID Array Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields SSA RAID Manager ssa0 SSA RAID Array hdisk3 Connection Address Array Name 095231779F0737K Disk to Remove Fl Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image eae F10 Exit Enter Do J Press the List key to list the disk drives 4 From the displayed list select the disk drive that you want to remove 5 Physically remove the disk drive from the subsystem see the Operator Guide or Service Guide for the unit Chapter 6 Using the RAID Array Co
237. e Available state. If you are running the diagnostics in Concurrent Mode, run cfgmgr to ensure that all pdisks are configured before selecting this option. If pdisks cannot be configured, go to the START page in the SSA Subsystem Service Guide.
F3=Cancel F10=Exit Enter
Are any pdisks listed for the selected SSA adapter?
NO: One of the following conditions exists. Take the action described.
• No physical disks are connected to this SSA adapter.
a. Ensure that the external SSA cables are correctly connected to the enclosures in which the devices are installed, and to the SSA adapter.
b. Go to the repair verification procedure to verify the repair.
• All the disk drives are switched off. Go to the START MAP for the enclosure in which the SSA devices are installed.
• The SSA adapter is failing.
a. Exchange the SSA adapter for a new one.
b. Go to the repair verification procedure to verify the repair.
YES: Go to step 4.
4. Observe the Status column on the screen. If the status of any pdisk is Power, that pdisk has detected a loss of redundant power or cooling. In the example shown here, pdisk2 has detected such a loss.
LINK VERIFICATION 802386
SSA Link Verification for: nunu:ssa0 00-04 IBM SSA 160 SerialRAID Adapter
To Set or Reset Identify, move cursor onto selection th
238. e Fast Write Option Card has been switched off for more than seven days, or if the Fast Write Option Card battery has been disconnected, clear the contents of the fast write cache: type the command ssa_format 1 AdapterName, where AdapterName is the name of the adapter that contains the fast write cache (for example, ssa0). If the Fast Write Option Card has not been switched off, exchange it for a new one.
4. Ask the customer to re-enable the Fast Write option for the devices that are attached to the reformatted or new Fast Write Option Card.
SRN 42523
Description: The Fast Write Cache Option Card has a bad version number.
Action: Install the correct adapter microcode for this cache card.
SRN 42524
Description: A fast write disk drive (or drives) contains unsynchronized data, but the Fast Write Cache Option Card cannot be detected. The disk drive (or drives) is offline.
Action: If the Fast Write Cache Option Card has been removed, reinstall it and test the disk drive subsystem. If the Fast Write Cache Option Card has failed:
1. Ask the customer to disable the Fast Write option for:
• Each device for which the Fast Write option is offline
• All other devices that are connected to the failing adapter and have the Fast Write option enabled
For instructions on how to disable the Fast
239. e Offline state to that adapter.
Note: In the ssaraid command, the Split Array Resolution flag is known as split_resolution.
When the Split Array Resolution flag is set and the secondary configuration disk drive can be detected, access to the array is permitted. If the Split Array Resolution flag is set and the secondary configuration disk drive cannot be detected, the array goes into the Offline state.
The Split Array Resolution flag affects initialization of the array:
• If an array appears with the Split Array Resolution flag set, only member disk drives from the secondary half of the array are accepted. If member disk drives from the primary half of the array appear, they are exchanged into the array and a rebuilding process begins. These actions are logged for each primary disk drive.
• If all the member disk drives for the primary half of the array appear, and they are rebuilding or have completed rebuilding, the Split Array Resolution flag is reset.
If, in a RAID 10 array, the first two primary disk drives and the first secondary disk drive are all in separate power domains, you can configure that array so that it can always continue to operate after a power failure of a single power domain. After the loss of any power domain in this configuration, a using system continues to detect either all the primary configuration disk drives, or one primary and one secondary configuration disk drive, and continues to access the array withou
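As a rough illustration of the flag rules in this passage, the following Python sketch models how an adapter might decide whether an array is accessible. The function name, the parameter names, and the Access enumeration are hypothetical; they are not part of the ssaraid command or of any SSA interface. Combinations that the passage does not describe are reported as not covered rather than guessed.

```python
from enum import Enum


class Access(Enum):
    PERMITTED = "access permitted"
    OFFLINE = "array Offline to this adapter"
    NOT_COVERED = "case not described in this passage"


def split_array_access(flag_set: bool,
                       primary_config_visible: bool,
                       secondary_config_visible: bool) -> Access:
    """Model the stated Split Array Resolution rules (illustrative only)."""
    if flag_set:
        # Flag set: the passage keys the outcome on the secondary
        # configuration disk drive alone.
        if secondary_config_visible:
            return Access.PERMITTED
        return Access.OFFLINE
    # Flag reset: secondary visible but no primary visible means Offline.
    if secondary_config_visible and not primary_config_visible:
        return Access.OFFLINE
    # Anything else is outside what the passage states.
    return Access.NOT_COVERED


if __name__ == "__main__":
    print(split_array_access(True, False, True).value)
```

The model deliberately returns NOT_COVERED for the remaining combinations instead of assuming the normal-operation behaviour.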
240. e RAID manager for which you want a list of connected arrays.
A list of arrays is displayed:
COMMAND STATUS
Command: OK     stdout: yes     stderr: no
Before command completion, additional instructions may appear below.
hdisk4 09523173A02137K good 3.4G RAID 5 array
hdisk3 095231779F0737K good 3.4G RAID 5 array
F1=Help F2=Refresh F3=Cancel F6=Command F8=Image F9=Shell F10=Exit /=Find n=Find Next
Listing the Status of All Defined SSA RAID Arrays
This option lists the status of each defined array.
1. For fast path, type smitty lstssaraid and press Enter. Otherwise, select List Status of All Defined SSA RAID Arrays from the SSA RAID Arrays menu.
2. The following information is displayed:
SSA RAID Arrays
Move cursor to desired item and press Enter.
List All Defined SSA RAID Arrays
List All Supported SSA RAID Arrays
List All SSA RAID Arrays Connected to a RAID Manager
List Status Of All Defined SSA RAID Arrays
List/Identify SSA Physical Disks
List/Delete Old RAID Arrays Recorded in an SSA RAID Manager
List Status of Hot Spare Pools
List Status of Hot Spare Protection for an SSA RAID Array
RAID Array Type
Move cursor to desired item and press Enter.
raid_1
raid_5
raid_10
F1=Help F2=Refresh F3=Cancel F8=Image F10=Exit Enter=Do /=Find n=Find Next
Select the type of RAID array for which you wa
241. e SSA Command Line Interface for RAID Configurations
Chapter 13. Using the Programming Interface
SSA Subsystem Overview
Device Drivers
Two types of device driver provide support for all SSA subsystems:
• The SSA adapter device driver, which deals with the SSA adapter
• The SSA head device drivers, which deal with devices that are attached to the SSA adapter. The SSA disk device driver is an example of an SSA head device driver.
For subsystems that use the Micro Channel SSA Multi-Initiator/RAID EL Adapter, the PCI SSA Multi-Initiator/RAID EL Adapter, or the Advanced SerialRAID Adapter, the Target Mode SSA (TMSSA) device driver is also available. This device driver provides support for communications from using system to using system. For information about SSA Target Mode and the TMSSA device driver, see the corresponding section of this guide.
Responsibilities of the SSA Adapter Device Driver
The SSA adapter device driver provides a consistent interface to all SSA head device drivers, of which the SSA disk device driver is an example. The SSA adapter device driver sends commands for SSA devices to the adapter that is related to those devices. When the SSA adapter device driver detects that the commands have completed, it informs the originator of the command.
Responsibilities of the SSA Disk Device Driver
The SSA disk device driver provides support for the SSA disk drives that are connected to an SSA adapter. That su
242. e chdev command. The ssavfynn command can be used to verify that no duplicate node numbers exist.
If a reservation is challenged (that is, a node that does not hold the reservation attempts to access a reserved SSA logical disk), the adapter verifies that a valid path still exists to the node that is holding the reservation. If no path exists, the reservation is removed and the new node is allowed access to the disk drive. Therefore, if an adapter is used to reserve a disk drive and is then disconnected or powered off, that disk drive becomes effectively unreserved.
Fast Write Cache
SSA adapters that are on SSA loops of up to two SSA adapters can use the Fast Write cache feature. For a two-way configuration, the two adapters can be in the same or in different using systems. Fast write functions can be enabled for single disk drives or for RAID arrays. Performance improvements are related to the type of disk drive (single disk or RAID array type) and the workload. If two-way fast write functions are used in a two-host system configuration, the performance improvements are greater when concurrent access is not permitted to disk drives. When concurrent access is permitted, performance improvements are less.
Chapter 3. RAID Functions and Array States
This chapter describes the RAID functions and the states of RAID arrays.
RAID Functions
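The reservation-challenge behaviour described earlier in this passage can be sketched as a small Python model. This is only an illustration: the LogicalDisk class, the challenge function, and the use of integers as node numbers are hypothetical, and the assumption that access is refused while a valid path to the reserving node still exists is inferred from the text rather than stated in it.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LogicalDisk:
    # Node number of the using system that holds the reservation, if any.
    reserved_by: Optional[int] = None


def challenge(disk: LogicalDisk, challenger_node: int,
              path_to_holder_exists: bool) -> bool:
    """Return True if the challenging node is allowed access (model only)."""
    if disk.reserved_by is None or disk.reserved_by == challenger_node:
        return True
    if path_to_holder_exists:
        # Assumption: the reservation stands while its holder is reachable.
        return False
    # No valid path to the holder: the reservation is removed and the
    # new node is allowed access.
    disk.reserved_by = None
    return True


if __name__ == "__main__":
    disk = LogicalDisk(reserved_by=1)
    print(challenge(disk, challenger_node=2, path_to_holder_exists=False))
```

The model shows why disconnecting or powering off the reserving adapter leaves the disk drive effectively unreserved: the next challenge finds no path and clears the reservation.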
243. e command. The byte contains a value from the /usr/include/ipn/ipndef.h file. A non-zero value indicates an error.
result
Contains the Independent Packet Network (IPN) result word that is returned by IPN for the command. The word contains values from the /usr/include/ipn/ipntra.h file. A non-zero value indicates an error.
u0.isal.parameter_descriptor
Set by the caller to indicate the buffer for parameter data.
u0.isal.transmit_descriptor
Set by the caller to indicate the buffer for transmit data.
u0.isal.receive_descriptor
Set by the caller to indicate the buffer for received data.
u0.isal.status_descriptor
Set by the caller to indicate the buffer for status data.
u0.isal.minor_function
Set by the caller to one of the ISAL commands that is defined in the /usr/include/ipn/ipnsal.h file and listed at the start of the description of this operation.
Note: Structures that are provided in the /usr/include/ipn/ipnsal.h file can be used to format the contents of the parameter buffer for the various commands. The device driver always overwrites, with the correct handle, the handle that is located in the first four bytes of the parameter buffer.
Return Values
If the command was successfully sent to the adapter card, this operation returns a value of 0. Otherwise, a value of -1 is returned and the errno global variable is set to one of the following values:
EIO An unrecoverable I/O error has occurred.
EINVAL Either the caller has specified a
244. e correct disk drive. FORMATTING DESTROYS ALL DATA ON THE DISK DRIVE.
F3=Cancel F10=Exit
3. If you are not sure of the identification (pdisk number) of the disk drive that you want to format, use the Identify function to get a positive physical identification of the disk drive. You can further ensure that you have selected the correct disk drive by verifying that the serial number on the front of the disk drive is the same as the serial number that is displayed on the screen.
4. When you are sure that you have selected the correct disk drive, select Format.
Certify Disk Service Aid
The Certify service aid verifies that all the data on a disk drive can be read correctly. Other maintenance procedures tell you when you need to run this service aid.
To use the Certify Disk service aid:
1. Select Certify Disk from the SSA Service Aids menu. A list of pdisks is displayed:
CERTIFY DISK 802404
Move cursor onto selection, then press <Enter>.
systemname:pdisk11 AC50AE43 9.1GB SSA C Physical Disk Drive
systemname:pdisk8 AC706EA3 9.1GB SSA C Physical Disk Drive
systemname:pdisk2 AC1DBE11 4GB SSA C Physical Disk Drive
systemname:pdisk3 AC1DBEF4 4GB SSA C Physical Disk Drive
systemname:pdisk7 AC50AE58 9.1GB SSA C Physical Disk Drive
systemname:pdisk12 AC7C6E51 9.1GB SSA C Physical Disk Drive
systemname:pdisk0 AC706E9A 4GB SSA C Physical Disk D
245. e displayed instructions, exchange the failed member for a new disk drive. The Disk to Remove is listed as BlankReserved; the Disk to Add is the disk drive that you reinstalled or selected in the earlier step. When failed disk drives have been exchanged for new disk drives, the data is rebuilt and the array changes its state to the Good state.
Exchange the failed disk drive for a new one.
Type smitty ssaraid and press Enter.
Select Change Member Disks in an SSA RAID Array.
Select Swap Members of an SSA RAID Array.
Select the degraded hdisk.
Referring to the displayed instructions, exchange the failed member for a new disk drive. The Disk to Remove is listed as BlankReserved; the Disk to Add is the disk drive that you reinstalled or selected in the earlier step. When failed disk drives have been exchanged for new disk drives, the data is rebuilt and the array changes its state to the Good state.
17. A RAID 5 array is in the Exposed state when one member disk drive of the array is not available to the array. A RAID 1 or RAID 10 array is in the Exposed state when one or more mirrors are exposed. A mirror is exposed when one disk drive in the mirror pair is not available to the array. If the missing disk drives are returned to the array, the array enters the Good state. No data rebuilding is necessary. If data is written to an array that is in the Exposed state, that data is not protecte
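The Exposed-state conditions in step 17 can be expressed as simple predicates. The Python sketch below is a hypothetical model, not adapter code: the function names are invented for illustration, each disk drive is represented by a boolean that is True when the drive is available to the array, and only the conditions stated in the text are tested (for example, a mirror pair with both drives missing is not treated as exposed, because the passage does not describe that case).

```python
from typing import List, Tuple


def raid5_is_exposed(member_available: List[bool]) -> bool:
    # RAID 5: Exposed when exactly one member disk drive is not available.
    return member_available.count(False) == 1


def mirror_is_exposed(pair: Tuple[bool, bool]) -> bool:
    # A mirror is exposed when one disk drive of the pair is not available.
    return pair.count(False) == 1


def mirrored_array_is_exposed(pairs: List[Tuple[bool, bool]]) -> bool:
    # RAID 1 or RAID 10: Exposed when one or more mirrors are exposed.
    return any(mirror_is_exposed(pair) for pair in pairs)


if __name__ == "__main__":
    print(raid5_is_exposed([True, True, False, True]))
    print(mirrored_array_is_exposed([(True, True), (True, False)]))
```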
246. e/ipn/ipnsal.h file and listed at the start of the description of this operation.
Note: Structures are provided in the /usr/include/ipn/ipnsal.h file. This file can be used to format the contents of the parameter buffer for the various commands. The resource ID that is located in the first four bytes of the parameter buffer is always overwritten by the device driver with the correct resource ID for the device.
Return Values
If the command was successfully sent to the adapter card, this operation returns a value of 0. Otherwise, a value of -1 is returned and the errno global variable is set to one of the following values:
EIO Indicates an unrecoverable I/O error.
EINVAL Indicates that the caller has specified an ISAL manager command that is not in the list of supported ISAL manager commands. The commands are listed at the start of the description of this operation.
EPERM Indicates that the caller did not have an effective user ID (EUID) of 0.
ENOMEM Indicates that the device driver was unable to allocate or pin enough memory to complete the operation.
If the return code is 0, the result field of the ssadisk_ioctl_parms structure is valid. The return code indicates whether the adapter was able to process the command successfully.
Files
/dev/pdisk0, /dev/pdisk1, ..., /dev/pdiskn
Provide an interface to allow SSA device drivers to access physical SSA disks.
/dev/hdisk0, /dev/hdisk1, ..., /dev/hdiskn
Provide an interface to allow
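To make the Return Values list easier to apply, the following Python sketch maps the documented errno values to the meanings given above. It does not call the SSA device driver; the function name and the message wording are illustrative assumptions, and the errno constants come from Python's standard errno module rather than from the SSA header files.

```python
import errno
import os

# Meanings paraphrased from the Return Values list for this operation.
ISAL_ERRNO_MEANINGS = {
    errno.EIO: "unrecoverable I/O error",
    errno.EINVAL: "ISAL manager command is not in the supported list",
    errno.EPERM: "caller did not have an effective user ID (EUID) of 0",
    errno.ENOMEM: "driver could not allocate or pin enough memory",
}


def explain_ioctl_outcome(return_code: int, errno_value: int = 0) -> str:
    """Describe an ioctl outcome using the documented values (sketch)."""
    if return_code == 0:
        # Only now is the result field of the parameter structure valid.
        return "command sent to the adapter; the result field is valid"
    # Fall back to the system's own text for values not listed above.
    reason = ISAL_ERRNO_MEANINGS.get(errno_value, os.strerror(errno_value))
    return "command not sent: " + reason


if __name__ == "__main__":
    print(explain_ioctl_outcome(-1, errno.EPERM))
```

A caller would check the return code first and read the result field only when that code is 0.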
247. e receiving of messages from other using systems.
Note: To ensure that the concurrent mode interface works, set the node_number attribute of the ssar router to a different non-zero value for each using system that is sharing a disk drive.
Device Driver Entry Point
The SSA disk device driver concurrent mode entry point sends commands from the top kernel extension that is related to a specified SSA disk drive. The top kernel extension calls this entry point directly. The DD_CONC_REGISTER ioctl operation registers entry points.
This entry point function takes one argument that is defined in the /usr/include/sys/ddconc.h file. The argument is a pointer to a conc_cmd structure. The conc_cmd structures must be allocated by the top kernel extension. The concurrent mode command operation is specified by the cmd_op field in the conc_cmd structure. For each operation, the devno field of the conc_cmd structure specifies the appropriate SSA disk drive. The concurrent mode command operation can have the following values:
DD_CONC_SEND_REFRESH
Broadcasts the one-byte message code that is specified by the message field of the conc_cmd structure. The code is sent to all using systems that are connected to the SSA disk drive.
DD_CONC_LOCK
Locks the specified SSA disk drive for this using system only. No other using systems can modify data that is on the disk drive.
DD_CONC_UNLOCK
Unlocks the SSA disk drive. Other using systems can lock and modify data that
248. e some storage size.
Attention: When the array has been created, you can use it. You might, however, prefer to wait until the array state changes from Rebuilding to Good, because hot spare disk drives are not available until the array is in the Good state. If a disk drive fails before the array is in the Good state, you might no longer be able to write to the array.
7. Change other array attributes as required. For more information about each attribute, move the cursor to the attribute and press the Help key.
Deleting an SSA RAID Array
This option allows you to delete arrays that you have created through the Add an SSA RAID Array option. The deleted array is broken into its member disk drives. You cannot delete arrays that do not have a corresponding hdisk.
1. For fast path, type smitty rmssaraid and press Enter. Otherwise, select Delete an SSA RAID Array from the SSA RAID Arrays menu. A list of arrays is displayed in a window:
SSA RAID Arrays
Move cursor to desired item and press Enter.
List All Defined SSA RAID Arrays
List All Supported SSA RAID Arrays
List All SSA RAID Arrays Connected to a RAID Manager
List Status Of All Defined SSA RAID Arrays
List/Identify SSA Physical Disks
List/Delete Old RAID Arrays Recorded in an SSA RAID Manager
List Status of Hot Spare Pools
List Status of Hot Spare Protection for an SSA RAID Array
List Components in a Hot Spare Pool
SS
249. e that disk drive a specific hdisk number.
Rules for SSA Loops
For SSA loops that include an Advanced SerialRAID Adapter (type 4-P), the following rules apply:
• Each SSA loop must be connected to a valid pair of connectors on the SSA adapter (that is, either connectors A1 and A2, or connectors B1 and B2).
• A maximum of one pair of adapter connectors can be connected in a particular SSA loop.
• All member disk drives of an array must be on the same SSA loop.
• A maximum of 48 SSA disk drives can be connected in a particular SSA loop.
• If an SSA adapter that is in a two-way configuration is connected to two SSA loops, and a second adapter is connected to each loop, both loops must be connected to the same second adapter.
• Each SSA loop can be connected to no more than two adapters on any one using system.
• The number of adapters that are supported in an SSA loop is determined by whether any disk drives are configured for RAID or fast-write operations, and by the type of adapter (see Table 1).
Table 1. Number of Adapters that are Supported in an SSA Loop
Array Type   Number of Adapters in Loop   Type of Adapters Allowed
Non-RAID     8                            Advanced SerialRAID Adapter; PCI SSA Multi-Initiator/RAID EL Adapter; Micro Channel SSA Multi-Initiator/RAID EL Adapter
RAID 0       1                            Advanced SerialRAID Adapter
RAID 1       2                            Advanced SerialRAID Adapter at microcode level above 5000
RAID 5       2                            Advanced SerialRAID
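Three of the loop rules above are numeric or structural and can be checked mechanically. The Python sketch below is a minimal, hypothetical checker: the function name, its parameters, and the idea of describing a loop by a connector pair, a disk drive count, and a per-system adapter count are assumptions made for illustration, and the rules about array membership and two-way configurations are not modelled.

```python
from typing import Dict, List, Tuple

# Values taken from the rules listed above.
VALID_CONNECTOR_PAIRS = (frozenset({"A1", "A2"}), frozenset({"B1", "B2"}))
MAX_DISK_DRIVES_PER_LOOP = 48
MAX_ADAPTERS_PER_USING_SYSTEM = 2


def check_loop(connectors: Tuple[str, str],
               disk_drive_count: int,
               adapters_per_using_system: Dict[str, int]) -> List[str]:
    """Return a list of rule violations for one SSA loop (empty if none)."""
    problems = []
    if frozenset(connectors) not in VALID_CONNECTOR_PAIRS:
        problems.append("loop must use connectors A1 and A2, or B1 and B2")
    if disk_drive_count > MAX_DISK_DRIVES_PER_LOOP:
        problems.append("more than 48 SSA disk drives in the loop")
    for system, count in sorted(adapters_per_using_system.items()):
        if count > MAX_ADAPTERS_PER_USING_SYSTEM:
            problems.append("more than two adapters on using system " + system)
    return problems


if __name__ == "__main__":
    for line in check_loop(("A1", "B2"), 50, {"host1": 3}):
        print(line)
```

An empty list means only that these three rules are satisfied; the adapter counts in Table 1 still have to be checked against the array type.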
250. e two disk drives of a particular pair contain the same data. If either disk drive fails, the data is still available. This characteristic of these array types allows the mirrored copies of the data to be held on disk drives that are in different domains. For example, the disk drives can be in different disk subsystems or on different sites. If one domain loses power, the array continues to operate because one copy of the data is still available. The RAID manager ensures that the two copies of the data are synchronized.
It is possible for an array to become split in half, and for the adapters to become unable to communicate with each other. The design of the system must therefore include precautions that prevent both using systems from continuing to operate on the half of the array that each system can access. Such a condition causes the data on each mirrored pair to be inconsistent. The RAID manager allows only one half of the array to remain available. This action prevents two using systems from writing different data to the separate halves of the array.
The Split Array Resolution flag determines whether the primary or secondary side of the array can operate when some of the configuration disks are not available. Normally, the Split Array Resolution flag is reset. When the Split Array Resolution flag is reset, and the secondary configuration disk can be accessed but no primary configuration disk drive can be accessed by an adapter, the array goes into th
251. ecial file.
SSA logical disks have the following properties. They:
• Are configured as hdisk0, hdisk1, ..., hdiskn
• Provide support for a character special file (/dev/rhdisk0, /dev/rhdisk1, ..., /dev/rhdiskn)
• Provide support for a block special file (/dev/hdisk0, /dev/hdisk1, ..., /dev/hdiskn)
• Provide support for the ioctl subroutine call for nonservice and diagnostics functions only
• Accept the read and write subroutine calls to the special files
• Can be members of volume groups, and have file systems mounted on them
Multiple Adapters
Some SSA subsystems allow a disk drive to be controlled by up to two adapters in a particular using system. The disk drive therefore has two paths to each using system, and the SSA subsystem can continue to function if an adapter fails. If an adapter fails, or the disk drive cannot be accessed from the original adapter, the SSA disk device driver switches to the alternative adapter without returning an error to any working application.
When a disk drive has been successfully opened, takeover by the alternative adapter does not occur simply because a drive becomes reserved or fenced out. However, during an open of an SSA logical disk, the device driver does attempt to access the disk drive through the alternative adapter if the path through the original adapter experiences reservation conflict or fenced-out status.
A medium error on the disk drive doe
252. ecifies either the name of one adapter that is connected to the device, or none if no adapter is connected as adapter_b now.
primary_adapter
Specifies whether adapter_a or adapter_b is to be the primary adapter for this device. You can use the chdev command to modify this attribute to one of the values adapter_a, adapter_b, or assign. If you set the value to assign, static load balancing is performed when this device is made available, and the system sets the value to either adapter_a or adapter_b.
connwhere_shad
Holds a copy of the value of the connwhere parameter for this disk drive. SSA disk drives cannot be identified by the location field that the lsdev command gives, because they are connected in a loop and do not have the hardware-selectable addresses of SCSI devices. The serial numbers of the disk drives are the only method of identification. The serial number of a particular disk drive is written in the connwhere field of the CuDv entry for that disk drive. The connwhere_shad attribute, which shadows the connwhere value, allows you to display the connwhere value for an SSA device (a pdisk or an hdisk).
location
Describes, in text, the disk drives and their locations (for example, drawer number 1, slot number 1). The user enters the information for this attribute.
Attributes for SSA Logical Disks Only
pvid
Holds the ODM copy of the PVID for this disk drive (for an hdisk).
queue_depth
Specifies the maximum numb
253. ed, the automatic error log analysis:
1. Sends an operator message (OPMSG) to the error log
2. Displays an error message on /dev/console
3. Sends a mail message to ssa_adm
The name ssa_adm is an alias (alternative address) that is set up in /etc/aliases. By default, ssa_adm is set to root. You can, however, change this alias to any valid mail address for your using system. See your using-system documentation for information about how to change alias addresses.
Error log analysis also runs automatically each time that diagnostics run in Problem Determination mode. In this mode, the error log analysis runs before any diagnostic test is run to the SSA devices. Diagnostics in Problem Determination mode therefore generate an SRN if any SSA error logs show that service activity is needed.
If you run the ssa_ela command from the command line, you can also run error log analysis to all SSA devices that are attached to a system.
If SD 6000 is installed on the system, it runs error log analysis whenever a hardware error is logged, and raises an incident if problems are found that need service activity.
Detailed Description
Error log analysis determines whether the data that is in the error log indicates that service activity is needed on the subsystem. The analysis uses the detailed data that is logged with each error. If service activity is needed, an SRN is produced. This SRN provides an entry point into the mai
254. einstalled or selected in the earlier step. When the exchange is complete, the array starts to copy its data to the coupled disk drive. Go to the procedure on page 475 to verify the repair.
Exchange the failed disk drive for a new one (see Exchanging Disk Drives).
Type smitty ssaraid and press Enter.
Select Change/Show Use of an SSA Physical Disk. The pdisk that has been exchanged is listed under SSA Physical Disks that are system disks.
Select the pdisk from the list, and change the Current Use parameter to Array Candidate Disk.
Select Change Member Disks in an SSA RAID Array.
Select Swap Members of an SSA RAID Array.
Select the hdisk that is in the Degraded copy state (that is, the hdisk that you noted in the earlier step).
Referring to the displayed instructions, select a disk drive to remove from the array and the disk drive to add to the array. The disk drive to remove is listed as not_present; the disk drive to add is the disk drive that you exchanged in the previous steps. When the exchange is complete, the array starts to copy its data to the coupled disk drive.
Go to the repair verification procedure to verify the repair.
27. Is any disk drive failing?
NO: A disk drive that is listed as rejected is not necessarily failing. For example, the array might have rejected the disk drive because a power problem or an SSA link problem caused that drive to become temporarily unavailable. Under such conditions, the dis
255. el SSA Multi-Initiator/RAID EL Adapter, or the Advanced SerialRAID Adapter, the Target Mode SSA device driver can make entries in the kernel trace buffer; its hook ID is 3B4.
SSA Adapter Device Driver
Purpose
To provide support for the SSA adapter.
Syntax
#include </usr/include/sys/ssa.h>
#include </usr/include/sys/devinfo.h>
Description
The /dev/ssan special files provide an interface that allows client application programs to access SSA adapters and the SSA devices that are connected to those adapters. Multiple head device drivers and application programs can all access a particular SSA adapter and its connected devices at the same time.
Configuring Devices
All the SSA adapters that are connected to the using system are normally configured automatically during the system boot sequence.
PCI SSA Adapter ODM Attributes
The PCI SSA adapter has a number of object data manager (ODM) attributes that you can display by using the lsattr command:
ucode
Holds the file name of the microcode package file that supplies the adapter microcode that is present in an SSA adapter.
bus_io_addr
Holds the value of the bus I/O base address of the adapter registers that the SSA adapter device driver for this adapter will use.
bus_mem_start
Holds the value of the bus memory start address that the SSA adapter device driver for this adapter will use.
bus_mem_start2
Holds the value of the bus memory start addre
256. eld in the ssadisk_ioctl_parms structure was set to a length that is not valid, or the u0.scsi.direction field in the ssadisk_ioctl_parms structure was set to a value that is not valid.
EPERM The caller did not have an effective user ID (EUID) of 0.
ENOMEM The device driver was unable to allocate or pin enough memory to complete the operation.
If the return code is 0, the result field of the ssadisk_ioctl_parms structure is valid. The return code indicates whether the adapter was able to process the command successfully.
Files
/dev/pdisk0, /dev/pdisk1, ..., /dev/pdiskn
Provide an interface to allow SSA device drivers to access physical SSA disks.
/dev/hdisk0, /dev/hdisk1, ..., /dev/hdiskn
Provide an interface to allow SSA device drivers to access logical SSA disks.
SSADISK_LIST_PDISKS (SSA Disk Device Driver ioctl Operation)
Purpose
To provide a method of determining which SSA physical disk drives make up an SSA logical disk drive.
Description
The SSADISK_LIST_PDISKS operation can be issued by any user to an SSA logical disk (hdisk). The operation returns a list of the SSA physical disks (pdisks) that make up the specified logical disk drive.
The arg parameter for the SSADISK_LIST_PDISKS operation is the address of an ssadisk_ioctl_parms structure. This structure is defined in the /usr/include/sys/ssadisk.h file. The SSADISK_LIST_PDISKS operation uses the following fields of the ssadisk_
257. elect events:
POLLOUT Check whether output is possible.
POLLPRI Check whether an error occurred with the write operation.
POLLSYNC Return only events that are currently pending. No asynchronous notification occurs.
An additional event, POLLIN, is not applicable and has no support from the initiator mode device driver.
The reventp output parameter points to the result of the conditional checks. The device driver can return a bitwise OR of the following flags:
POLLOUT If the initiator device is opened with the O_NDELAY flag, some buffer space is not being used now. Otherwise, this event is always set for the initiator mode device.
POLLPRI An error occurred with the latest write operation.
Asynchronous notification of the POLLOUT event occurs when buffer space is made available for further write operations. Asynchronous notification of the POLLPRI event occurs if an error occurs with a write operation. Note that the error might be recovered successfully by the device driver.
Possible return values for the errno global variable include:
EINVAL A specified event has no support, or the device instance is not configured or not open.
Errors that are detected by the target mode device driver can be one of the following:
• A hardware error that occurred while receiving data and cannot be reproduced
• A hardware error that occurred during an adapter command and cannot be reproduced
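The returned event flags described earlier in this passage form a bit mask, so a caller tests each flag with a bitwise AND. The Python sketch below decodes such a mask using the POLLOUT and POLLPRI constants from Python's standard select module (available on Unix-like platforms); it does not open or poll a tmssa device, the function name is hypothetical, and POLLSYNC is omitted because the select module does not define it.

```python
import select
from typing import List


def describe_revents(revents: int) -> List[str]:
    """Decode a returned-events bit mask using the meanings given above."""
    notes = []
    if revents & select.POLLOUT:
        notes.append("POLLOUT: buffer space is available for a write")
    if revents & select.POLLPRI:
        notes.append("POLLPRI: an error occurred with the latest write")
    return notes


if __name__ == "__main__":
    for note in describe_revents(select.POLLOUT | select.POLLPRI):
        print(note)
```

Because the driver can recover some write errors itself, a POLLPRI note is a prompt to check the outcome of the write rather than proof of lost data.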
258. em needs to be switched off for some other reason, do not switch off the using system when servicing the SSA loop. Enclosure power cables and external SSA cables that connect the devices to the using system can be disconnected while that system is running.
1. Do you have an SSA-subsystem 5-character SRN?
NO: Go to step 2.
YES: Go to the referenced section for that SRN.
2. (from step 1)
• If the system diagnostics are available, go to step 3.
• If the system diagnostics are not available, but the stand-alone diagnostics are available:
a. Load the stand-alone diagnostics.
b. Go to step 3.
• If neither the system diagnostics nor the stand-alone diagnostics are available, go to the problem determination procedures for the enclosure that contains the disk drives.
3. (from step 2) Run the diagnostics in Problem Determination mode. Did the diagnostics produce an SRN?
NO: Go to step [illegible].
YES: Go to step [illegible].
MAP 2320: SSA Link
This MAP helps you to isolate FRUs that are causing an SSA loop problem between a device and the SSA adapter, or between two devices. If you are not familiar with SSA, read the section that explains SSA links, strings, and loops before using this MAP.
Attention: Unless the using system needs to be switched off for some other reason, do not switch off the using system when servicing the SSA loop. Enclosure power cables and external SSA cables that connect the devices to the using system can be disconnected while that system is
259. ems Copy
Delete a RAID Array Copy
Good 100 Good
F3=Cancel Enter=Do
If you select a raid_1 or a raid_10 array from the pop-up menu and press Enter, the coupled pdisks change to free disk drives. The RAID array remains fully accessible. If you select a RAID Copy array, the array is deleted and the pdisks change to free disk drives. Data on the RAID Copy array is no longer accessible.
Delete a Volume Group, Logical Volumes, or Filesystems Copy
For fast path, type smitty copy_delvglvfs and press Enter. Otherwise, select Delete a Volume Group, Logical Volumes, or Filesystems Copy from the Array Copy Services menu. The following information is displayed:
Array Copy Services
Move cursor to desired item and press Enter.
Prepare a RAID Array Copy
Prepare Volume Group, Logical Volumes, or Filesystems Copy
Uncouple a RAID Array Copy
Uncouple a Volume Group, Logical Volumes, or Filesystems Copy
Delete a Volume Group, Logical Volumes, or Filesystems Copy
Move cursor to desired item and press Enter. Use arrow keys to scroll.
fsmyvg01 hdisk7 good hdisk3 Fri May 12 13 2
         hdisk8 good hdisk4 Fri May 12 13 2
fsmyvg01 hdisk9 good hdisk5 Fri May 12 14 1
         hdisk10 good hdisk6 Fri May 12 14 1
F1=Help F2=Refresh F3=Cancel F8=Image F10=Exit Enter=Do /=Find n=Find Next
260. en press <Enter>.
  Physical        Serial     Adapter Port    Status
                             A1  A2  B1  B2
  [TOP]
  nunu pdisk11    AC7AA09A    0   5          Good
  nunu pdisk8     AC7AA2D6    1   4          Good
  nunu pdisk2     AC7AA0BD    2   3          Power
  nunu pdisk3     AC7AA0B1    3   2          Good
  nunu pdisk7     AC7AA0B5    4   1          Good
  nunu pdisk12    AC7AA052    5   0          Good
  nunu pdisk0     AC7AA0B9            0   5  Good
  nunu pdisk1     AC7AA0B3            1   4  Good
  nunu pdisk10    AC7AA0B4            2   3  Good
  [MORE...4]
  F3 Cancel, F10 Exit
Does one of the pdisks have a Power status? NO: Go to step 5. YES: Go to the START MAP for the enclosure in which the SSA device is installed.
5. (from step) Observe the Status column on the screen. If the status of any pdisk is Failed, that pdisk is failing. In the example shown here, pdisk2 is failing.
  LINK VERIFICATION 802386
  SSA Link Verification for nunu ssa0 00 04 IBM SSA 160 SerialRAID Adapter
  To Set or Reset Identify, move cursor onto selection, then press <Enter>.
  Physical        Serial     Adapter Port    Status
                             A1  A2  B1  B2
  [TOP]
  nunu pdisk11    AC7AA09A    0   5          Good
  nunu pdisk8     AC7AA2D6    1   4          Good
  nunu pdisk2     AC7AA0BD    2   3          Failed
  nunu pdisk3     AC7AA0B1    3   2          Good
  nunu pdisk7     AC7AA0B5    4   1          Good
  nunu pdisk12    AC7AA052    5   0          Good
  nunu pdisk0     AC7AA0B9            0   5  Good
  nunu pdisk1     AC7AA0B3            1   4  Good
  nunu pdisk10    AC7AA0B4            2   3  Good
  [MORE...4]
  F3 Cancel, F10 Exit
261. encing, 290; Display/Download Disk Drive Microcode service aid, 393; dumps, managing, 259; duplicate node test, error logging, 227. E: effects of array copy on other SMIT menus, 186; Change/Show Attributes of an SSA RAID Array option, 186; Identify Disks in an SSA RAID Array option, 189; List Status Of All Defined SSA RAID Arrays option, 188; Remove a Disk From an SSA RAID Array option, 190; Swap Members of an SSA RAID Array option, 191; enabling or disabling Fast Write for multiple devices, 215; enabling or disabling Fast Write for one disk drive, 214; error codes for service aids, 400; error conditions, disk device driver, 274; error log analysis: detailed description, 230; command line error log analysis, 232; error log analysis routine, 230; run_ssa_ela cron, 232; summary, 229; error log analysis routine, 230; error logging: detailed description, 222; detail data formats, 225; duplicate node test, 227; run_ssa_healthcheck cron, 226; run_ssa_link_speed cron, 227; summary, 221; tmssa device driver, 302; error logging management: detailed description, 228; summary, 228; exchanging adapters and Rebuilding state (RAID 5), 34; exchanging disk drives, 319; execution of Target Mode requests, 294; Exposed state (RAID 10), 36; Exposed state (RAID 5), 33; read operations while in, 33; write operations while in, 33; exssaraid command, 95, 142. F: fast write cache card: installing, 334; removing, 332; fast write cache card battery assembly, installing
262. ep. This is your last chance to stop before continuing. Press Enter to continue. Press Cancel to return to the application.
  F1 Help, F2 Refresh, F3 Cancel, F8 Image, F10 Exit, Enter Do, / Find, n Find Next
5. Attention: When an array is deleted, all the data that is contained in that array is lost. At the prompt, press Enter if you want to delete the array. Press Cancel if you do not want to delete the array.
6. When you have deleted the array, go to step E3 on page 324.
7. Attention: You should be here only if you are working with a RAID 1, RAID 5, or RAID 10 array. For fast path, type smitty redssaraid and press Enter. Otherwise: a. Select Change Member Disks in an SSA RAID Array from the SSA RAID Array menu. b. Select Remove a Disk from an SSA RAID Array.
8. A list of arrays is displayed in a window:
  Change Member Disks in an SSA RAID Array
  Move cursor to desired item and press Enter.
    Remove a Disk from an SSA RAID Array
    Add a Disk to an SSA RAID Array
    Swap Members of an SSA RAID Array
  SSA RAID Array
  Move cursor to desired item and press Enter.
    hdisk3  095231779F0737K  good  3.4G RAID 5 array
    hdisk4  09523173A02137K  good  3.4G RAID 5 array
  F1 Help, F2 Refresh, F3 Cancel, F8 Image, F10 Exit, Enter Do, / Find, n Find Next
Select the SSA RAID array from which you are removing the disk drive.
263. er are not yet configured as system disk drives.
Flags: -a AdapterName Specifies the adapter whose connection locations are to be listed. -P Produce a list of possible connection locations for physical disks. -L Produce a list of possible connection locations for logical disks.
ssa_certify Command
Purpose: To certify the physical disk drive so that data can be read from or written to the disk drive without problems. To certify a RAID 5 array to determine whether any array logical block addresses (LBAs) have been marked as unreadable.
Syntax: ssa_certify -l pdisk -n MaxReadSize -a
        ssa_certify -l hdisk -b StartLBA -c
Description: If the ssa_certify command is issued to a pdisk, it uses the ISAL_Read, ISAL_Write, or ISAL_Characteristics command to certify the disk drive. It returns 0 unless a nonmedia-related problem occurs. If such a problem occurs, the ssa_certify command prints a message to stderr. If a media-related problem occurs and the disk drive can perform automatic reassign operations, the command attempts to reassign soft-error blocks. If the attempt to reassign the soft-error blocks fails, if the block has a hard media error, or if the disk drive cannot perform automatic reassign operations, the ssa_certify command returns 0 but prints to stdout the LBA of the failing block, in decimal, followed by the word Failed. For example: > ssa_certify
264. er AFTER making all desired changes.
  [Entry Fields]
  SSA RAID Manager                       ssa0
  Old SSA RAID Array Record to Delete
  F1 Help, F2 Refresh, F3 Cancel, F4 List, F5 Reset, F6 Command, F7 Edit, F8 Image, F9 Shell, F10 Exit, Enter Do
Press the List key to list the records.
5. From the displayed list, select the record that you want to delete, and follow the instructions that are given on the screen.
Changing or Showing the Attributes of an SSA RAID Array
Each array type has several attributes associated with it. This option allows you to see, and possibly change, those attributes.
1. For fast path, type smitty chssaraid and press Enter. Otherwise, select Change/Show Attributes of an SSA RAID Array from the SSA RAID Arrays menu. A list of RAID managers is displayed in a window.
2. Select the required adapter from the list of RAID managers. Note: If you are not sure which adapter to select, select Change/Show Characteristics of an SSA Logical Disk from the SSA Logical Disks menu (see Getting Acce o the A on page 40), and note the adapter that is listed as adapter_a. A list of arrays is displayed in a window:
  SSA RAID Arrays
  Move cursor to desired item and press Enter.
    List All Defined SSA RAID Arrays
    List All Supported SSA RAID Arrays
    List All SSA RAID Arrays Connected to a RAID Manager
    List Status Of All Defined SSA RAID Arrays
    List/Identify SSA Physical Disks
265. er Disks
  One Half of the Array Is Not Present
  Array is Offline because Adapter Is Not Known to the Remaining Half of the Array
  Array is Offline because the Split and Join Procedure Was Not Performed Correctly
Chapter 9. Using the SSA Spare Tool
Chapter 10. Using the Fast Write Cache Feature
  Fast Write Cache Card Battery
  Configuring the Fast Write Cache Feature
    Getting Access to the Fast Write Menus
    Enabling or Disabling Fast Write for One Disk Drive
    Enabling or Disabling Fast Write for Multiple Devices
  Bypassing the Cache in a One-Way Fast Write Network
  Dealing with Fast Write Problems
    SRN 42521
    SRN 42524
    SRN 42525
Chapter 11. SSA Error Logs
  Error Logging
    Summary
    Detailed Description
  Error Logging Management
    Summary
    Detailed Description
  Error Log Analysis
    Summary
    Detailed Description
  Good Housekeeping
Chapter 12. Using the SSA Command Line Interface for RAID Configurations
  Command Syntax
  Options
  Object Types
  Instruct Types
  Examples
    Example 1. To Create a RAID 0 Array
    Example 2. To Create a RAID 1 Array
    Example 3. To Create a RAID 5 Array
    Example 4. To Create a RAID 10 Array
    Example 5. To Create a Hot Spare Pool
    Example 6. To List All Defined SSA Objects
266. er of commands that the SSA disk device driver dispatches to a logical disk. The default value is correct for normal operating conditions. You can use the chdev command to modify this attribute. Valid entry values are 0 through 200. A value of 0 resets the queue_depth to its default value.
reserve_lock: Specifies whether the SSA disk device driver locks the device with a reservation when it is opened for an hdisk.
size_in_mb: Specifies the size of the logical disk, in megabytes.
max_coalesce: The maximum number of bytes that the SSA disk device driver attempts to transfer to or from an SSA logical disk in one operation. The default value is appropriate for most environments. For applications that perform very long sequential write operations, performance improves when data is written in blocks of 64 KB multiplied by (n - 1), where n is the number of disks in the array. For example, if the array contains six member disks, the data would be written in blocks of 64 KB x 5. These operations are known as full-stride writes. To use full-stride writes, increase the value of this attribute to 64 KB x (n - 1), or to some multiple of this number.
write_queue_mod: Alters the way in which write commands are queued to SSA logical disks. The default value is 0 for all SSA logical disks that do not use the fast-write cache; with this setting, the SSA disk device driver maintains a single seek-ordered queue of queue
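The full-stride arithmetic described for max_coalesce can be checked with a short calculation. The following Python sketch is illustrative only: the function name full_stride_bytes and its multiple parameter are hypothetical, and the sketch assumes nothing beyond the 64 KB x (n - 1) rule stated in the text. It does not set the attribute; it only computes a candidate value.

```python
STRIP_BYTES = 64 * 1024  # 64 KB per data disk, as stated in the text


def full_stride_bytes(member_disks: int, multiple: int = 1) -> int:
    """Return a candidate max_coalesce value: 64 KB x (n - 1) x multiple.

    Hypothetical helper for illustration; it is not part of any SSA tool.
    """
    if member_disks < 2:
        raise ValueError("an array needs at least two member disks")
    if multiple < 1:
        raise ValueError("multiple must be a positive integer")
    return STRIP_BYTES * (member_disks - 1) * multiple


if __name__ == "__main__":
    # Six member disks, as in the example in the text: 64 KB x 5.
    print(full_stride_bytes(6))
```

For the six-disk example in the text, the result is 64 KB x 5, that is, 320 KB per full-stride write.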
267. a. Observe the SRN that sent you to this MAP. The last three characters are in the format PAA, where P is the number of the SSA adapter port, and AA is the SSA address of the device. Note the value of PAA in the SRN. For example: if the SRN is 24002, PAA = 002; if the SRN is 24104, PAA = 104.
b. Observe the Link Verification screen, and identify the physical device that is represented by PAA in the SRN. This device is the first of the two devices that are connected by the failing link. If the SRN is in the series 21xxx through 29xxx, the second device of the two is located at PAA - 1. If the SRN is 33xxx, the second device of the two is located at PAA + 1. Note: If the SSA address (AA) in the SRN is higher than the highest SSA address that is displayed for the adapter port (P), that address is the address of the SSA adapter.
Read through the following examples if you need help in identifying the device, then go to nn paga dad. Otherwise, go directly to Bon page 4nd.
Example 1: If the SRN is 24002, the device is connected to adapter port 0 (shown as A1 on the screen), and has an SSA address of 02 (shown as 2 on the screen). In the example screen, that device is pdisk3, which is the first device of the two. The second device is pdisk2 (PAA - 1). If the SRN is 33002, the first device of the two is again pdisk3. The second device is pdisk4 PAA
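The PAA decoding described above can be sketched in a few lines of Python. This is an illustration only, under stated assumptions: the function name decode_srn is hypothetical, the AA field is assumed to be decimal, and only the mapping of port 0 to A1 comes from the text; the labels for ports 1 through 3 are assumed to follow the A1, A2, B1, B2 column order of the Link Verification screen.

```python
# Port 0 -> "A1" is stated in the text; the other three labels are an
# assumption based on the column order of the Link Verification screen.
PORT_LABELS = {0: "A1", 1: "A2", 2: "B1", 3: "B2"}


def decode_srn(srn: str):
    """Split a 5-character SRN into (port number, port label, SSA address).

    The last three characters are PAA: P is the adapter port number and
    AA is the SSA address of the device (assumed here to be decimal).
    """
    if len(srn) != 5 or not srn.isdigit():
        raise ValueError("expected a 5-digit SRN such as 24002")
    port = int(srn[2])
    address = int(srn[3:])
    if port not in PORT_LABELS:
        raise ValueError(f"unexpected adapter port number: {port}")
    return port, PORT_LABELS[port], address


if __name__ == "__main__":
    for srn in ("24002", "24104"):
        print(srn, decode_srn(srn))
```

With the two SRNs used in the text, 24002 decodes to port 0 and address 2, and 24104 decodes to port 1 and address 4.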
268. er sets the appropriate bit in the chg_option field. The caller can change either the retry parameter or the time-out parameter, or it can change both parameters. To change the delay between send-command retries, the caller sets the TM_CHG_RETRY_DELAY flag in the chg_option field, and puts the required delay value (in seconds) into the new_delay field of the structure. With this command, the retry delay can be changed to any value 0 through 255, where 0 instructs the device driver to use as little delay as possible between retries. The default value is approximately two seconds. To change the send-command time-out value, the caller sets the TM_CHG_SEND_TIMEOUT flag in the chg_option field, sets the desired flag in the timeout_type field, and puts the desired time-out value into the new_timeout field of the structure. One flag must be set in the timeout_type field to indicate the required form of the time-out. If the TM_FIXED_TIMEOUT flag is set in the timeout_type field, the value that is put into the new_timeout field is a fixed time-out value for all send commands. If the TM_SCALED_TIMEOUT flag is set in the timeout_type field, the value that is put into the new_timeout field is a scaling factor used in the calculation for time-outs, as shown under the description of the write entry point. The default send-command time-out value is a scaled time-out with a scaling factor of 20. Regardless of the value o
269. er value: An error that is more serious than 0 or 1 has occurred.
ssa_servicemode Command
Purpose: To put the disk drive into Service Mode (set Service Mode) or to remove the disk drive from Service Mode (reset Service Mode).
Syntax: ssa_servicemode -l pdisk -a AdapterName -y | -n
Description: The ssa_servicemode command opens the adapter special file and sends the appropriate IACL command to put the disk drive into, or remove it from, Service Mode. When the Service Mode has been successfully set or reset, the IACL command closes the adapter special file. If Service Mode cannot be set or reset for any reason, the command prints the appropriate error message.
Flags: -l pdisk Specifies the pdisk that you want to put into, or remove from, Service Mode. -a AdapterName Specifies the adapter to which the pdisk is connected. -y Puts the pdisk into Service Mode (set Service Mode). -n Removes the pdisk from Service Mode (reset Service Mode).
Output: The ssa_servicemode command sends all error messages to stderr.
ssa_speed Command
Purpose: To determine the operating speed of SSA links.
Syntax: ssa_speed -l pdisk -s
        ssa_speed -a AdapterName -p Loop -n Network -s
        ssa_speed -x -e
Description: The ssa_speed command either tests the existing link speeds of the selected pdisk or adapter, or searches for link speed exceptio
270. ers of an SSA RAID array, 142; installing the fast write cache card, 334; installing the SSA adapter, 313; instruct types, Command Line Interface, 238; interface, adapter device driver/head device driver, 256; IOCINFO ioctl operation, 261, 305; description, 261, 305; disk device driver, 277; description, 277; files, 277; purpose, 277; files, 261; purpose, 261, 305; ioctl subroutine, tmssa device driver, 301; issaraid command, 119. L: large configurations, 16; switching off using systems, 17; switching on using systems, 17; lassaraid command, 117; lcssaraid command, 115; lfssaraid command, 91, 113; lhssaraid command, 111; lights, Advanced SerialRAID Adapter (type 4-P), 6; link error problem determination, 478; link errors, 478; link speed, 18; Link Speed service aid, 396; link status (ready) lights, 481; Link Verification service aid, 383; List All Copy Candidates option, 179; List All Defined SSA RAID Arrays option, 100; List All SSA RAID Arrays Connected to a RAID Manager option, 102; List All Supported SSA RAID Arrays option, 101; List all Uncoupled Copies option, 181; List All Uncoupled Volume Groups option, 182; List Array Candidate Disks option, 115; List Components in a Hot Spare Pool option, 80; List Disks in an SSA RAID Array option, 109; List Hot Spares option, 111; List Rejected Array Disks option, 113; List Status of All Defined SSA RAID Arrays option, 104; List Status Of All Defined SSA RAID Arrays option, effects of array copy, 188; List Status of Hot Spare Pools option, 74; List Status of Hot Spa
271. ery has been powered on. -c Shows the state of the fast write cache.
Output: The ssa_fw_status command sends all error messages to stderr and output to stdout.
Examples: To show the status of the Fast Write cache on an SSA adapter, give the command: ssa_fw_status -a ssa0. To show the life of the battery, give the command: ssa_fw_status a ssa0. To show the number of hours for which the battery has been powered on, give the command: ssa_fw_status -a ssa0 -p. To show whether the fast write cache is active, give the command: ssa_fw_status -a ssa0 -c.
ssa_getdump Command
Purpose: To display SSA adapter dump locations and to save the dump to a specified location.
Syntax: For the List version of the command: ssa_getdump -l -h -d pdiskxx -a AdapterName -n AdapterUID -s SlotNumber
        For the Copy version of the command: ssa_getdump -c -h -d pdiskxx -a AdapterName -n AdapterUID -s SlotNumber -x -o OutputFile
Description: The ssa_getdump command has two modes of operation: List mode and Copy mode.
List Mode: In List mode, the command searches for adapter dumps on unused SSA disk drives. It searches the disk drives sequentially and provides information about all the dumps that it finds. An example of the output from List mode is shown here:
  ADAPTER DUMPS DATE TIME ADAPTER UID DISK SLOT SIZE STATUS SEQ ADAP
  961031 10 31 12 123 1234567890ABCDEF pdisk22 12 1
272. es
Change Attribute for Hot Spare Disk Drives Only: You can specify the following attribute with the -a option only when you are using the ssaraid command with the -H option to change a hot spare disk drive.
spare_pool (default: pool_A0 or pool_B0, as determined by the network ID): If you set this attribute, the hot spare disk drive is assigned to the specific hot spare pool.
Change Attribute for Array Member Disk Drives Only: You can specify the following attribute with the -a option only when you are using the ssaraid command with the -H option to change an array member disk drive.
spare_pool (default: pool_A0 or pool_B0, as determined by the network ID): If you set this attribute, the array member disk drive is assigned to the specific hot spare pool. If the spare_preferred array attribute is set to true for the array to which this disk drive belongs, and this disk drive fails, a hot spare disk drive (if available) is selected only from the hot spare pool to which this disk drive is assigned. If the spare_preferred array attribute is set to false for the array to which this disk drive belongs, and this disk drive fails, a hot spare disk drive (if available) is selected from the hot spare pool to which this disk drive is assigned. If no hot spare disk drive is available in that pool, a hot spare disk drive is selected from the default hot spare pool for that SSA loop (Pool A0 or B0). If no hot spare disk drive is available in the default hot spare p
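The selection order that the spare_pool and spare_preferred attributes describe can be sketched as follows. This Python fragment is a simplified model, not the adapter's actual algorithm: the function name choose_hot_spare, the dictionary of available spares, and the pdisk names are hypothetical. Because the source text is cut off after the default-pool step, the sketch stops at that step as well.

```python
def choose_hot_spare(assigned_pool, default_pool, spare_preferred, available):
    """Pick a hot spare for a failed array member, or return None.

    available maps a pool name to a list of free hot spare names.
    Hypothetical model of the order described in the text:
      1. the pool to which the failed member is assigned;
      2. if spare_preferred is false, the default pool for the loop.
    Any later fallback is not modelled, because the source is truncated.
    """
    in_assigned = available.get(assigned_pool, [])
    if in_assigned:
        return in_assigned[0]
    if spare_preferred:
        # Only the assigned pool may be used.
        return None
    in_default = available.get(default_pool, [])
    return in_default[0] if in_default else None


if __name__ == "__main__":
    pools = {"pool_A1": [], "pool_A0": ["pdisk9"]}
    print(choose_hot_spare("pool_A1", "pool_A0", True, pools))
    print(choose_hot_spare("pool_A1", "pool_A0", False, pools))
```

In this model, setting spare_preferred to true trades availability for locality: the array is never given a spare from outside the pool of the failed member.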
273. es. The Devices menu is displayed. c. Select SSA Disks. The SSA Disks menu is displayed:
  SSA Disks
  Move cursor to desired item and press Enter.
    SSA Logical Disks
    SSA Physical Disks
  F1 Help, F2 Refresh, F3 Cancel, F8 Image, F9 Shell, F10 Exit, Enter Do
Select the type of SSA disk on which you want to work. SSA logical disks are configured into the using system as hdisks. SSA hdisks can be single disk drives or SSA RAID arrays. SSA physical disks are configured into the using system as pdisks. SSA pdisks are used for service and configuration operations. If you select SSA Logical Disks, go to step 4 on page 42. If you select SSA Physical Disks, go to step B on page 44.
4. The SSA Logical Disks menu is displayed:
  SSA Logical Disks
  Move cursor to desired item and press Enter.
    List All Defined SSA Logical Disks
    List All Supported SSA Logical Disks
    Add an SSA Logical Disk
    Change/Show Characteristics of an SSA Logical Disk
    Remove an SSA Logical Disk
    Configure a Defined SSA Logical Disk
    Generate Error Report
    Trace an SSA Logical Disk
    Show Logical to Physical SSA Disk Relationship
    List Adapters Connected to an SSA Logical Disk
    List SSA Logical Disks Connected to an SSA Adapter
    Identify an SSA Logical Disk
    Cancel all SSA Disk Identifications
    Enable/Disable Fast Write for Multiple Devices
  F1 Help, F2 Refresh, F3 Cancel, F8 Image, F9 Shell, F10 Exit, Enter Do
274. es of that type are made for that device until the first error has been in the log for at least six hours. The example in Figure 41 shows an open-link error occurring. This type of error has a logging threshold value of three. The error is logged when the link is first broken (in this example, at about 04:30). The error is then logged each hour as a result of the health check.
[Figure 41. Example of an Open Link Error: a timeline from 04:00 to 13:00 showing the LINK_OPEN logs that the adapter sends to the device driver (SSA Error Logger) and the LINK_OPEN logs that the device driver sends to the system error log.]
The example also shows that, during any six-hour period, no more than three errors of this type are sent to the error log. If other types of error occur for this device, or errors occur for another device, they are sent immediately to the error log. The actual threshold values that are used for any given error type are regularly reviewed, and might change with any new version of the device driver. They always permit, however, enough errors to be logged to ensure that the error log analysis produces an SRN when any service action is required.
Error Log Analysis
Summary: The error log is analyzed automatically every 24 hours. This automatic error log analysis is started by the run_ssa_ela cron job. If the results of the analysis show that any service activity is need
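The thresholding behavior described above (at most a fixed number of entries of one error type, for one device, in any six-hour period) can be modelled with a sliding window. The Python sketch below is an approximation for illustration only: the class name ThresholdLogger and its offer method are hypothetical, times are expressed in minutes, and only the LINK_OPEN threshold of three comes from the text. The real device driver may implement the rule differently.

```python
from collections import defaultdict, deque

WINDOW_MINUTES = 6 * 60  # the six-hour period named in the text


class ThresholdLogger:
    """Hypothetical model of per-device, per-type error log thresholding."""

    def __init__(self, thresholds, default_threshold=1):
        self.thresholds = thresholds          # error type -> max entries per window
        self.default_threshold = default_threshold
        self._recent = defaultdict(deque)     # (device, type) -> logged times

    def offer(self, device, error_type, minute):
        """Return True if the error is passed on to the system error log."""
        recent = self._recent[(device, error_type)]
        # Forget entries that have been in the log for at least six hours.
        while recent and minute - recent[0] >= WINDOW_MINUTES:
            recent.popleft()
        limit = self.thresholds.get(error_type, self.default_threshold)
        if len(recent) >= limit:
            return False
        recent.append(minute)
        return True


if __name__ == "__main__":
    logger = ThresholdLogger({"LINK_OPEN": 3})
    # Link breaks at 04:30; the health check then reports it every hour.
    times = [4 * 60 + 30] + [h * 60 for h in range(5, 14)]
    for t in times:
        print(f"{t // 60:02d}:{t % 60:02d}", logger.offer("pdisk3", "LINK_OPEN", t))
```

Keying the window on the pair (device, error type) reflects the statement that other error types, and errors for other devices, are not held back by the threshold.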
275. escription: If the -y parameter is specified, the disk is set into Identify mode. While the disk is in Identify mode, its amber Ready light flashes at approximately one-second intervals. The -n flag switches off Identify mode.
Flags: -l PhysicalDiskName Specifies the device to place into Identify mode. -y Switches on Identify mode. -n Switches off Identify mode.
ssa_progress Command
Purpose: To show how much (by percentage) of a format operation has been completed, and to show the status of the format operation. The status can be Complete, Formatting, or Failed.
Syntax: ssa_progress -l pdisk
Description: The ssa_progress command opens the pdisk special file and uses the ISAL Progress command to determine the percentage of the formatting operation that is complete. The ssa_progress -l pdisk command allows you to check the progress of the format operation that the ssa_format -l pdisk command started.
Flags: -l pdisk Specifies the pdisk of whose format operation you want to check progress and status.
Output: The ssa_progress command sends error messages to stderr and progress messages to stdout.
Examples: If the disk has been 30% formatted, the following messages are displayed: > ssa_progress -l pdisk Formatting 30. If the disk is not formatting and is not format degraded, the following messages are displayed: > ssa_progress -l pdisk
276. ess than the specified minimum number.
Possible FRUs: Device (100%); see Exchanging Disk Drives.
Action:
1. Type smitty ssaraid and press Enter.
2. Select Change/Show Use of an SSA Physical Disk.
3. Note all the disk drives that are listed as rejected.
4. Exchange all the rejected disk drives for new disk drives.
5. Select Change/Show Use of Multiple SSA Physical Disks.
6. Change the Current Use parameter of the exchanged disk drives to Hot Spare Disks.
7. Select List Status of Hot Spare Pools.
8. Select the adapter that logged the error. If the adapter is not known, select all adapters. The hot spare pool that is listed as empty or critical is the pool to which the hot spare disk drives must now be added. Attention: If more than one hot spare pool is listed as empty, critical, or reduced, refer to the user's pool-assignment records to determine the correct pool to which the hot spare disk drives must now be assigned. If the user has no such record, see e to determine the
9. Select Change/Show/Delete a Hot Spare Pool.
10. Add to the correct hot spare pool the hot spare disk drives that you created in step A.
49540: Hot spare disk drives have been assigned to pools other than pool zero, but other SSA ...
Description: Adapters that do not provide support for hot spare pools have been detected. This problem has occurred because the cabling has been changed, or because an SSA adapter that does not provide support for hot spare pools has been added t
277. ess to the devices (disk drives in this example) through two data paths. The adapter always, however, uses the path that has the fewest interconnecting devices between the adapter and the destination device. The using system cannot detect which data path is being used.
[Figure 2. Simple Loop: a using system whose SSA adapter connectors A1 and A2 are joined through disk drives 1 through 8.]
Simple Loop, One Disk Drive Missing
If a disk drive fails or is switched off, the loop is broken, and one of the data paths to a particular disk drive is no longer available. The disk drives on the remainder of the loop continue to work, but an error is reported to the system. The adapter now uses the alternative path to some of the devices. In Figure 3, disk drive number 3 has failed. Disk drives 1 and 2 can communicate with the using system only through connector A1 of the SSA adapter. Disk drives 4 through 8 can communicate only through connector A2 of the SSA adapter.
[Figure 3. Simple Loop with One Disk Drive Missing: the same loop with disk drive 3 absent.]
Simple Loop, Two Disk Drives Missing
If two or more disk drives are switched off, fail, or are removed from the loop, some disk drives might become isolated from the SSA adapter. In Figure 4, disk drives 3 and 7 have been removed
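The path behavior described for a simple loop can be modelled in a few lines. The Python sketch below is illustrative only and assumes the layout of Figure 2: disk drives numbered 1 through n in a single chain between connectors A1 and A2. The function names are hypothetical, and the tie-break in favor of A1 is an arbitrary assumption; the text does not say how the adapter resolves equal path lengths.

```python
def reachable_ports(disk, loop_size, missing=frozenset()):
    """Return {connector: intervening devices} for usable paths to a disk.

    Disks are numbered 1..loop_size from connector A1 towards A2.
    A path is usable only if no missing disk lies between the adapter
    connector and the destination disk.
    """
    if disk in missing or not 1 <= disk <= loop_size:
        return {}
    ports = {}
    if not any(d in missing for d in range(1, disk)):
        ports["A1"] = disk - 1
    if not any(d in missing for d in range(disk + 1, loop_size + 1)):
        ports["A2"] = loop_size - disk
    return ports


def preferred_port(disk, loop_size, missing=frozenset()):
    """Pick the path with the fewest interconnecting devices, or None."""
    ports = reachable_ports(disk, loop_size, missing)
    if not ports:
        return None  # the disk is isolated from the adapter
    # Ties are broken by connector name (an assumption of this sketch).
    return min(ports, key=lambda port: (ports[port], port))


if __name__ == "__main__":
    print(preferred_port(2, 8))              # intact loop
    print(preferred_port(5, 8, {3}))         # disk 3 missing
    print(preferred_port(5, 8, {3, 7}))      # disks 3 and 7 missing
```

With disk drive 3 missing, the model reaches drives 1 and 2 only through A1 and drives 4 through 8 only through A2, which matches the description of Figure 3. With drives 3 and 7 missing, the drives between them have no usable path.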
278. etermination
Instead of using the normal MAPs to solve a link error problem, you can refer directly to the link status lights to isolate the failing FRU, as described here. In an SSA loop, SSA devices are connected through two or more SSA links to an SSA adapter. Each SSA link is the connection between two SSA nodes (devices or adapters); for example, disk drive to disk drive, adapter to disk drive, or adapter to adapter. An SSA link can contain several parts. When doing problem determination, think of the link and all its parts as one complete item. Here are some examples of SSA links. Each link contains more than one part.
Example 1: In Figure 59, the link is between two disk drives that are in the same subsystem. It has three parts.
[Figure 59. Three-Part Link in One Subsystem: Disk Drive 1, internal connection, Disk Drive 2.]
Example 2: In Figure 60, the link is between two disk drives that are in the same subsystem. It has five parts.
[Figure 60. Five-Part Link in One Subsystem: Disk Drive 1, internal connection, intermediate part, internal connection, Disk Drive 2.]
Example 3: In Figure 61, the link is between two disk drives that are not in the same subsystem. It has seven parts. SSA
279. f the volume group. For a copy of a volume group to be uncoupled, I/O operations must be temporarily stopped. I/O operations cannot be stopped for active paging space. c. If the dump logical volume and the boot logical volume are copied, a warning is displayed. You must delete the copies of these logical volumes after the copy volume group has been created. d. If the logical volume that you want to copy has a file system loglv that is located in a different volume group, you cannot copy the logical volume.
The command searches for suitable RAID Copy arrays that you have previously used SMIT or ssaraid commands to create, or it searches for appropriate free disk drives. It then either checks your RAID Copy arrays or automatically creates RAID Copy arrays from the free disk drives.
The command either automatically prepares suitable RAID Copy disk drives or checks the RAID Copy arrays that you have used SMIT or ssaraid commands to create. The copy disk drives are coupled to the arrays, and the adapter copies the data to the copy disk drives.
When the copy operation is complete, the command waits for an external trigger.
The command temporarily disables the fast-write cache (if enabled) and flushes the data to disk drive.
The command then forces the parent volume group to synchronize. The synchronize operation flushes all data from memory to
280. f the timeout_type field, if the new_timeout field is set to a value of 0, the caller specifies no time-out for the send command, allowing the command to take an indefinite amount of time. If the calling program wants to end a write operation, it generates a signal. This option is allowed only for blocking-type write operations.
Part 2. Maintenance Information
Chapter 14. SSA Adapter Information
Installing the SSA Adapter
1. Install the adapter and disk drive microcode from the media that are supplied with the adapter. A README sheet of installation instructions is also supplied.
2. Install the adapter into a slot in the using system (see the Installation and Service Guide for the using system).
3. Switch on the using system. This action ensures that the latest version of the microcode that is available on the using system is downloaded into the adapter before the disk drives are connected.
4. Connect the SSA cables to the adapter and to the devices that are to be attached to the adapter. For information about how the cables are to be attached, see the configuration plan that was created when the subsystem was ordered. If the configuration plan is not available, use the example configuration information that is given in the service information for the devices on page 7i for general
281. f this uncouple operation fails to complete successfully (for example, because of a power failure), the fast-write function remains in suspended mode after the using system becomes online again. Use the fw_suspended=false attribute to reactivate the fast-write function on this disk drive.
Creation and Change Attributes for RAID 1, RAID 5, and RAID 10 Arrays Only
You can specify the following attributes with the -a option when you are using the ssaraid command with the -C or -H option to create or change a RAID 1, RAID 5, or RAID 10 array.
spare=true/false (default: true): Normally, if the array is in the Exposed state and a write operation to that array is attempted, the array enters the Degraded state. You can prevent the array from entering the Degraded state if you enable the spare attribute of the array and provide a suitable hot spare disk drive to the RAID manager that is controlling the array. When the spare attribute is enabled, and a write operation is attempted to an array that is in the Exposed state, the RAID manager searches for an available hot spare disk drive to exchange for the failing disk drive. This action prevents the array from entering the Degraded state.
spare_exact=true/false (default: false): When used with the spare attribute, the spare_exact attribute modifies the action of the RAID manager when the RAID manager attempts to exchange a hot spare disk drive for a failing disk drive. Normally, the RAID manager uses any hot
282. (from step th) No spare disk drives are available for an array that is configured for hot spare disk drives. a. If the subsystem contains disk drives that have failed, repair those disk drives or exchange them for new disk drives (see). b. Type smitty ssaraid and press Enter. c. Select Change/Show Use of an SSA Physical Disk.
Are any disks listed as SSA Physical disks that are hot spares?
NO: Review with the user the requirement for hot spare disk drives. If the customer wants hot spare disk drives, one or more disk drives must have their use changed to Hot Spare Disk. If the customer does not want hot spare disk drives: a. Return to the SSA RAID Arrays menu. b. Select Change/Show Attributes of an SSA RAID Array. c. Change the Enable Use of Hot Spares attribute to No.
YES: The following conditions must be met to make a hot spare disk drive available for use by an array: a. The hot spare disk drive and the array must be on the same SSA loop (see). b. If the spare_exact parameter is set to true, the size of the hot spare disk drive must be the same as the size of the smallest member disk; determine the sizes of the disk drives that are in the array. c. If the spare_exact parameter is set to false, the size of the hot spare disk drive must be at least that of the smallest member disk in the array. Ensure that these conditions have been met. Go to the MAP for repair verification to verify the repair.
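The three conditions listed above can be expressed as a single check. The following Python sketch is a minimal illustration, assuming that disk sizes are compared as plain numbers in a common unit; the function name spare_is_usable and its parameters are hypothetical and do not correspond to any SSA command or API.

```python
def spare_is_usable(spare_size, member_sizes, spare_exact, same_loop):
    """Return True if a hot spare meets the conditions listed in the text.

    spare_size and member_sizes use one common unit (for example, MB).
    same_loop says whether the spare and the array are on the same SSA loop.
    """
    if not same_loop or not member_sizes:
        return False
    smallest = min(member_sizes)
    if spare_exact:
        # spare_exact true: the spare must match the smallest member exactly.
        return spare_size == smallest
    # spare_exact false: the spare must be at least as large as the smallest member.
    return spare_size >= smallest


if __name__ == "__main__":
    members = [4500, 4500, 9100]
    print(spare_is_usable(9100, members, spare_exact=False, same_loop=True))
    print(spare_is_usable(9100, members, spare_exact=True, same_loop=True))
```

A larger spare is accepted only when spare_exact is false, which is why a pool of mixed-size spares can leave an array unprotected when spare_exact is true.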
283. ...with the German EMVG, to carry the EC conformity mark (CE). Responsible for the declaration of conformity under Paragraph 5 of the EMVG is IBM Deutschland Informationssysteme GmbH, 70548 Stuttgart. Information with regard to EMVG Paragraph 3, Section 2: The device meets the protection requirements of EN 50082-1 and EN 55022 Class A. EN 55022 Class A devices must carry the following warning: Warning: This is a Class A device. This device can cause radio interference in residential areas; in this case the operator can be required to take appropriate measures and to bear the cost of them. EN 50082-1 note: If this device is operated in an industrial environment (as defined in EN 50082-2), it might be subject to interference. In such a case, increase the distance from, or the shielding against, the industrial source of interference. Note: To ensure compliance with the EMVG, the devices must be installed and operated as specified in the manuals. Taiwan Class A Compliance Statement. Glossary: This glossary explains terms and abbreviations that are used in the manual. The glossary contains terms and defi
284. Referring to the displayed instructions, exchange the failed member for a new disk drive. The Disk to Remove is listed as BlankReserved; the Disk to Add is the disk drive that you tested or exchanged in an earlier step. When failed disk drives have been exchanged for new disk drives, the data is rebuilt, and the array changes its state to the Good state. Note: The array can be used during the rebuilding operation. Inform the user, however, that while the rebuilding operation is running, the data is not protected against another disk drive failure. The rebuilding operation runs more slowly if the array is being used. k. When the rebuilding operation is complete, ask the user to run diagnostics in System Verification mode to the SSA adapters to ensure that the rebuilding operation has not found any more problems. Go to the repair verification MAP to verify the repair. 15. (From an earlier step.) Does the Link Verification service aid indicate an open loop? NO: Go to step 6. YES: ... 16. (From an earlier step.) Does any SSA disk drive have its Check light on? NO: The disk drive might have been removed from the subsystem. a. Reinstall the removed drive, or select a new disk drive for addition to the array. b. Type smitty ssaraid and press Enter. c. Select Change Member Disks in an SSA RAID Array. d. Select Swap Members of an SSA RAID Array. e. Select the degraded hdisk. f. Referring to th
285. ...enough and have the correct format to become part of the RAID Copy array for hdisk5. With a disk drive from the list (for example, 000629CD8A3900D), create your RAID Copy array. Type: ssaraid -C -l ssa2 -t raid_copy -s 000629CD8A3900D -r hdisk5. A message is displayed, for example: 185439188B4F4CT created, 000629CD8A3900D changed. Your RAID Copy array is 185439188B4F4CT. Couple the RAID Copy array to the parent RAID array. Type: ssaraid -A -l ssa2 -i couple -n hdisk5 -a raid_copy=185439188B4F4CT -a copy_verify_writes=true. A message is displayed, for example: 185439188B4F4CT changed, 185439188B4F4CT coupled. The copy operation starts as soon as the array is coupled. If the RAID Copy array already contains a copy of data from another array, you must use the force flag to reuse the array: add -a force=yes to the end of the command that is shown at the start of this step. 4. The time needed for the copy operation is related to the amount of data that you are copying. A large amount of data might take more than one hour to copy. To check whether the copy operation has completed, type: ssaraid -I -l ssa2 -n hdisk5. While the copy operation is running, the following information is included in the displayed data: array_coupled true, copy_status copying, copy_percentage 90, uncopied_strips 1234, copy_rate 50, copy_verify_writes true. When copy_status changes from copying to good, the c
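A script that waits for the copy to finish only needs to read the copy_status field from the displayed data. The sketch below assumes the status text has already been captured as alternating name and value tokens, as in the sample above; the real command output has more fields and a different layout, and both function names are hypothetical.

```python
def parse_status(text):
    """Turn 'name value name value ...' text into a dictionary."""
    tokens = text.split()
    # Pair every even-positioned token (name) with the token after it (value).
    return dict(zip(tokens[0::2], tokens[1::2]))


def copy_finished(status):
    """The copy is complete when copy_status has changed to 'good'."""
    return status.get("copy_status") == "good"


sample = (
    "array_coupled true copy_status copying copy_percentage 90 "
    "uncopied_strips 1234 copy_rate 50 copy_verify_writes true"
)
status = parse_status(sample)
```

For the sample text, copy_finished(status) is false because copy_status is still copying.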
286. ...global variable is set to the return value from the device driver. If the return value is something other than -1, the read operation was successful, and the return code indicates the number of bytes that were read. The caller should verify the number of bytes that were read. File offsets are not applicable, and are ignored, for target mode read operations. The adapter write operations provide the boundary that determines how read requests are controlled. If more data is received than is requested in the current read operation, the requested data is passed to the caller, and the remaining data is retained and returned for the next read operation for this target device. If less data is received in the send command than is requested, the received data is passed for the read request, and the return value indicates how many bytes were read. If a write operation has not been completely received when a read request is made, the request blocks and waits for data. However, if the target device is opened with the O_NDELAY flag set, the read does not block; it returns immediately. If no data is available for the read request, the read is not successful, and the errno global variable is set to EAGAIN. If data is available, it is returned. The return value indicates the number of bytes that were received, whether the write operation for this data has ended or not. Note: If the O_NDELAY flag is not set, the read subroutine can block for an undefined time while it wa
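The non-blocking behaviour described above (EAGAIN when no data is available, and retained data returned by the next read) can be tried out without SSA hardware. The following sketch uses an ordinary pipe as a stand-in for the target mode device, and os.set_blocking in place of opening the device with O_NDELAY; the helper name read_nowait is hypothetical.

```python
import os


def read_nowait(fd, nbytes):
    """Return the bytes read, or None when no data is available (EAGAIN)."""
    try:
        return os.read(fd, nbytes)
    except BlockingIOError:
        # A non-blocking read with nothing to return fails with EAGAIN.
        return None


# A pipe stands in for the target device; it is NOT an SSA device.
reader, writer = os.pipe()
os.set_blocking(reader, False)

first = read_nowait(reader, 16)   # nothing has been written yet
os.write(writer, b"hello world")
part = read_nowait(reader, 5)     # fewer bytes requested than were received
rest = read_nowait(reader, 16)    # the retained data comes back next

os.close(reader)
os.close(writer)
```

As with the device driver, the caller has to check how many bytes each read actually returned.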
287. ...Flags; Output. ssa_speed Command: Purpose; Syntax; Description; Flags; Output; Examples. ssavfynn Command: Purpose; Syntax; Description; Flags; Output. ssaxlate Command: Purpose; Syntax; Description; Flags. Chapter 17. SSA Service Aids: The Identify Function; Starting the SSA Service Aids; Set Service Mode Service Aid; Link Verification Service Aid; Configuration Verification Service Aid; Format Disk Service Aid; Certify Disk Service Aid; Display/Download Disk Drive Microcode Service Aid; Link
288. ...Flags for Copy Mode. You must use both of these flags: -d pdiskxx specifies the disk drive from which the data is to be copied (for example, pdisk2); -o OutputFile specifies where the tar command is to write its output. You must use at least one of these flags: -a AdapterName specifies the adapter name for which the program must search (for example, ssa1); the adapter must be known to the searching machine. -n AdapterUID specifies the adapter UID for which the program must search; the adapter need not be known to the searching machine. -s SlotNumber specifies the slot that contains the disk drive, as shown in the List Output. Optional Flags for List Mode. You can choose either or both of these flags: -h prevents the heading lines from being displayed; this option is useful for scripts. -d pdiskxx allows you to specify which disk drive is to be searched; by specifying the disk drive, you reduce the range of the search. You can choose either, but not both, of these flags: -a AdapterName specifies the adapter name for which the program must search (for example, ssa1); the adapter must be known to the searching machine. -n AdapterUID specifies the adapter UID for which the program must search; the adapter need not be known to the searching machine. -s SlotNumber specifies the slot that contains the disk drive, as shown in the List Output. Optional Flags for Copy Mode
289. ...configuration disk drives continues to operate. If the using system that has access to the primary configuration disk drives fails, and the primary configuration disk drives also fail, only the site that contains the secondary configuration disk drives remains operational. Normally, the using system that is at the secondary site is not allowed access to the array. To allow this using system access to the array, the user must use the RAID array configurator to set a flag that allows the using system to operate on only the secondary disk drives. A RAID 10 array can be in one of several states. A knowledge of those states is useful when you are configuring your arrays. The states are described here. A RAID 10 array is in the Good state when: all the member disk drives of that array are present; no member disk drive is deconfigured; read and write operations can be done on the array; no rebuilding operations need to be done. The array is fully protected from the loss of multiple member disk drives if one copy of the mirrored data is still available. Some unsynchronized records might still be under repair. Exposed State: A RAID 10 array is in the Exposed state when member disk drives are missing but still configured. Read and write operations can be performed on the array, although write operations put the array into the Degraded state. When the missing member disk drives are reintroduced, the
290. That choice should be: Hot Spare Disk, if the use of hot spares is enabled for the RAID arrays on the subsystem; Array Candidate Disk, if the use of hot spares is disabled for the RAID arrays on the subsystem. Changing Pdisk and Hdisk Numbers: Pdisk and hdisk numbers are assigned automatically when the using system is configured. To help in system administration, it is sometimes useful to change these numbers. If you want to change the numbers, use the following procedure: 1. Find the pdisk or hdisk number that you want to change, and select a new, unused name. 2. Give the command: lsdev -Cl disknumber -F connwhere, where disknumber is the pdisk or hdisk number (for example, pdisk0) that you want to change. The command returns a ConnectionLocation number (for example, 004AC5119E000D). Make a note of this number; you will need it later in this procedure. 3. Give the command: lsdev -Cl disknumber -F type, where disknumber is the pdisk or hdisk number (for example, pdisk0) that you want to change. The command returns the device Type for the disk. Make a note of this device type; you will need it later in this procedure. 4. Remove the existing pdisk or hdisk number from the configuration. Give the command: rmdev -l disknumber -d, where disknumber is the pdisk or hdisk number (for example, pdisk0) that you want to remove. 5. Give the command: mkdev -p ssar -t Type -c C
291. ...the adapter on an SSA loop that contains only one pair of adapter connectors: 1. Determine which data is accessed most frequently. 2. Assign that data to those disk drives that are farthest round the loop from the adapter connectors. By doing this action, you prevent the activity of the busiest disk drive from obstructing the data path to the other disk drives. For example, the loop that is shown in Figure 12 contains 16 disk drives, and the adapter connectors are between disk drives 1 and 16. The most frequently accessed data, therefore, should be on disk drives 8 and 9. [Figure 12. One Pair of Connectors in the Loop: disk drives 1 through 16 in a single loop, with the adapter connectors between disk drives 1 and 16.] Pairs of Adapter Connectors in the Loop, Some Shared Data: The following sequence enables you to determine the best relationship between the disk drives and the adapter on an SSA loop that contains two or more pairs of adapter connectors. Some of the disk drives share data access with other disk drives. 1. For each pair of connectors, identify all the data that the loop is to access. 2. For each pair of connectors, identify the data that the loop is to access mo
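For the single-pair case, the 16-disk example can be checked with a few lines of arithmetic. The sketch below assumes a loop of n disk drives numbered in order round the loop, with the one pair of adapter connectors between disk drive 1 and disk drive n; the function name is hypothetical.

```python
def farthest_disks(n):
    """Return the disk numbers that are farthest round the loop from the adapter."""
    # Disk k is k hops from one connector and n + 1 - k hops from the other;
    # its distance from the adapter is the shorter of the two paths.
    hops = {k: min(k, n + 1 - k) for k in range(1, n + 1)}
    most = max(hops.values())
    return [k for k, h in hops.items() if h == most]


busiest = farthest_disks(16)
```

For 16 disk drives the result is disk drives 8 and 9, which matches the example in the text; a loop with an odd number of disk drives has a single farthest disk drive.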
292. ...the condition. To use the Configuration Verification service aid: 1. Select Configuration Verification from the SSA Service Aids menu. A list of pdisks and hdisks is displayed. [Screen: CONFIGURATION VERIFICATION 802390. Move cursor onto selection, then press Enter. systemname pdisk0 AC51DB47 4GB SSA C Physical Disk Drive; systemname pdisk1 AC9EDE7F 9.1GB SSA C Physical Disk Drive; systemname hdisk2 AC51DB47 SSA Logical Disk Drive; systemname hdisk3 AC9EDE7F SSA Logical Disk Drive. F3=Cancel F10=Exit.] 2. Select the hdisk or pdisk that you want to verify. If you select an hdisk, a list of pdisks is displayed. [Screen: CONFIGURATION VERIFICATION 802391. systemname hdisk2 AC51DB47 SSA Logical Disk Drive, Good. To set or reset Identify, move cursor onto selection, then press Enter. Columns: Physical, Serial#, Adapter, Port, SSA_Addr, Status. systemname pdisk0 AC51DB47: adapter 00-02, port A1, SSA address 0, Good; adapter 00-02, port A2, SSA address 1, Good. F3=Cancel F10=Exit.] If you select a pdisk, a list of hdisks is displayed. [Screen: CONFIGURATION VERIFICATION 802392. systemname pdisk0 AC51DB47 4GB SSA C Physical Disk Drive. Move cursor onto selection, then press Enter. systemname hdisk2 AC51DB47 SSA Logical Disk Drive, Good. F3=Cancel F10=Exit.] Note: If you select the hdisk from this screen, the hdisk configuration is displayed. Format Disk Service Aid: The Format Disk service aid formats
293. ...the minimum number of hot spare disk drives that is needed to protect the array in the selected pool. No error log entry is made until the number of hot spare disk drives that remains in the pool is less than the Hot Spare Minimum parameter. Rules for Hot Spare Disk Drive Pools: By default, all hot spare disk drives are in pool zero. Pool zero is called A0 for hot spare disk drives that are on SSA loop A, and B0 for hot spare disk drives that are on SSA loop B. Hot spare pool numbers are in the range A0 through A31 and B0 through B31. The pool number is automatically assigned when a hot spare pool is created. Arrays in pool zero can never use hot spare disk drives that are assigned to other hot spare pools. Each pdisk can be assigned to a hot spare pool. Each member disk drive of a RAID array can be assigned to a different hot spare pool. Disk drives can be added to a new pool only if they are in pool zero. If a disk drive is removed from a hot spare pool, that disk drive moves to pool zero. A hot spare pool can exist only on an SSA loop. For example, hot spare pool B1 on adapter ssa0 has no physical or logical connection to hot spare pool B1 on adapter ssa1. A hot spare pool can contain any number of hot spare disk drives. If the Choose Hot Spare Only from Preferred Pool option is set to yes, hot spare disk drives are selected only from the hot spare pool that co
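Some of the pool rules above lend themselves to a quick check in code. The following is a sketch only: it models the pool name range (A0 through A31, B0 through B31) and the rule that a disk drive enters a pool from pool zero and returns to pool zero when it is removed. The function names are hypothetical, and the sketch does not talk to the RAID manager.

```python
import re

# Loop letter A or B followed by a pool number without leading zeros.
POOL_NAME = re.compile(r"^([AB])(0|[1-9]\d?)$")


def parse_pool(name):
    """Split a pool name such as 'A1' into (loop, number), or raise ValueError."""
    match = POOL_NAME.match(name)
    if not match or int(match.group(2)) > 31:
        raise ValueError(f"not a valid hot spare pool name: {name!r}")
    return match.group(1), int(match.group(2))


def can_move(current, target):
    """A disk moves only within its own loop, and only to or from pool zero."""
    cur_loop, cur_num = parse_pool(current)
    tgt_loop, tgt_num = parse_pool(target)
    return cur_loop == tgt_loop and (cur_num == 0 or tgt_num == 0)
```

Under these assumptions, moving a disk drive from A0 to A3 is allowed, while moving it directly from A3 to A5, or from A0 to B1, is not.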
294. ...the using system service aids. 2. Select Display or Change Configuration or Vital Product Data (VPD). 3. Select Display Vital Product Data. 4. Find the VPD for the SSA adapter that is logging the error. 5. Note the SDRAM and cache sizes (Device Specifics Z0 and Z1). 6. For fast write operations, the size of the available SDRAM must be greater than the size of the fast write cache. If you cannot determine the correct size of SDRAM to use, contact your support center. SRN 42515. Description: A fast write disk is installed, but no Fast Write Cache Option Card has been detected. Possible cause: The cache card is not ... Action: 1. If you have not already done so, run diagnostics in System Verification mode to the adapter. If a different SRN is generated, solve that problem first. 2. Do the following actions, as appropriate: If the cache card is not installed correctly, remove it from the adapter, then reinstall it correctly. If the cache card is installed correctly, it might have failed. Exchange, for new FRUs, the FRUs that are shown in the list of possible FRUs for this SRN. If the Fast Write feature is not installed, and you want to delete the fast write configuration for one or more disk drives that have been added to this subsystem: a. Verify with the customer that the fast write configuration can be deleted for the disk drives. b. Type smitty devices and press Enter. c. Select SSA Disks. d. Select SSA Logical Disks. e. Select Enable/Disable Fast Wr
295. ...this structure is DD_BUS, and the subtype is DS_SDA. The flags field is set to DF_FIXED. Files: /dev/ssa0, /dev/ssa1, ..., /dev/ssan. SSA_TRANSACTION (SSA Adapter Device Driver ioctl Operation). Purpose: To send an SSA transaction to an SSA adapter. Description: The SSA_TRANSACTION operation allows the caller to issue an IPN (Independent Packet Network) transaction to a selected SSA adapter. IPN is the language that is used to communicate with the SSA adapter. The caller must be root, or have an effective user ID of root, to issue this operation. IPN is described in the Technical Reference for the adapter. The arg parameter for the SSA_TRANSACTION operation specifies the address of an SSA_TransactionParms_t structure. This structure is defined in the /usr/include/sys/ssa.h file. The SSA_TRANSACTION operation uses the following fields of the SSA_TransactionParms_t structure: DestinationNode contains the target node for the transaction. DestinationService contains the target service on that node. MajorNumber is the major number of the transaction. MinorNumber is the minor number of the transaction. DirectiveStatusByte contains the directive status byte for the transaction; this contains a value that is defined in the /usr/include/ipn/ipndef.h file. A nonzero value indicates an error. TransactionResult contains the IPN result word that is returned by IPN for the transaction. This contains values
296. ...choosing the error threshold (alarm) level for a hot spare pool 51; deciding how to configure hot spare disk drive pools 45; rules for hot spare disk drive pools 52; solving hot spare pool problems 53; hot spare pool creation and change attributes 249; hot spare pool, adding a new one 83; hot spare pool, adding disks to or removing disks from 86; hot spare pool, changing or showing the status 74; hot spare pool, listing the disks 80; hot spare pool, showing the disks 77; housekeeping 233; iassaraid command 127; icssaraid command 125; identification of disk drives 19; Identify Array Candidate Disks option 125; Identify Disks in an SSA RAID Array option 119; Identify Disks in an SSA RAID Array option, effects of array copy 189; Identify function 375; Identify Hot Spares option 121; Identify Rejected Array Disks option 123; Identify System Disks option 127; identifying and correcting or removing failed disk drives 91; identifying SSA devices: location code format 19, pdisks and hdisks 19, unique IDs (UIDs) 21; IEEE SSA unique ID (UID) 21; ifssaraid command 92, 123; inssaraid command 121; indicators, Advanced SerialRAID Adapter (type 4-P) 6; installing a battery assembly into the fast write cache card 338; installing an SDRAM module 330; installing and configuring SSA RAID arrays 58: adding a disk drive to an SSA RAID array 140, adding a new hot spare pool 83, adding an SSA RAID array 60, adding disks to a hot spare pool
297. ...threshold value for that error, an SRN is generated. This SRN is generated from the next 5 characters of the detail data. Examples: The following is logged for ssa0: 0400 0000 0000 00.. Error log analysis produces SRN 40000. The following is logged for ssa0: 2450 1000 0000 00.. Error log analysis produces SRN 45010 only if this error has occurred three times for ssa0 during the previous 24 hours. If more than one type of error exists in the error log for a device, the error log analysis determines which error code has the highest priority, and returns that error code as the result of the analysis. Usually, the action of correcting the highest-priority error also corrects the lower-priority problems. Command Line Error Log Analysis: A command line utility has been provided that allows you to run SSA error log analysis from a manually entered command or from shell scripts. The utility is ssa_ela. It can perform SSA error log analysis on: all SSA devices; a selected hdisk; a selected pdisk; a selected adapter; any of the above items for a history period of up to seven days. For details of how to use the utility, see the description of the ssa_ela command. run_ssa_ela cron: During installation of the SSA device drivers, the following entry is added to the cron table: 01 5 * * * /usr/lpp/diagnostics/bin/run_ssa_ela 1>/dev/null 2>/dev/null. This cron entry instructs the run_ssa_ela
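The two examples above suggest how the detail data maps to an SRN: once the spaces are removed, the first character selects the reporting rule and the next 5 characters are the SRN. The sketch below is built only from those two examples. Treating "0" as "report always" and "2" as "three occurrences in 24 hours" is an inference, other leading characters are not covered, and the function names are hypothetical.

```python
from datetime import datetime, timedelta


def srn_from_detail(detail):
    """Split detail data into (rule character, 5-character SRN)."""
    digits = detail.replace(" ", "")
    return digits[0], digits[1:6]


def should_report(rule, occurrences, now):
    """Decide whether an SRN is produced, for the two rules seen in the examples."""
    if rule == "0":
        return True  # assumed: no threshold applies
    if rule == "2":
        # Assumed threshold: three occurrences during the previous 24 hours.
        recent = [t for t in occurrences if now - t <= timedelta(hours=24)]
        return len(recent) >= 3
    raise ValueError(f"rule {rule!r} is not covered by this sketch")


now = datetime(2000, 1, 2, 12, 0)
hits = [now - timedelta(hours=h) for h in (1, 5, 30)]
```

With these sample timestamps only two occurrences fall inside the previous 24 hours, so the second example would not yet produce SRN 45010.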
298. ...Physical Disks; List/Delete Old RAID Arrays Recorded in an SSA RAID Manager; List Status of Hot Spare Pools; List Status of Hot Spare Protection for an SSA RAID Array; List Components in a Hot Spare Pool; Add a Hot Spare Pool. [Pop-up: SSA RAID Manager. Move cursor to desired item and press Enter. ssa0 Available 00-04 IBM SSA 160 SerialRAID Adapter (14109100). F1=Help F2=Refresh F3=Cancel F8=Image F10=Exit Enter=Do /=Find n=Find Next.] Select the adapter to which you want to add the hot spare pool. 3. A list that shows Loop A or Loop B is displayed. [Screen: SSA RAID Arrays. Move cursor to desired item and press Enter. List All Defined SSA RAID Arrays; List All Supported SSA RAID Arrays; List All SSA RAID Arrays Connected to a RAID Manager; List Status Of All Defined SSA RAID Arrays; List/Identify SSA Physical Disks; List/Delete Old RAID Arrays Recorded in an SSA RAID Manager; List Status of Hot Spare Pools; List Status of Hot Spare Protection for an SSA RAID Array; List Components in a Hot Spare Pool. Pop-up: SSA Loops. Move cursor to desired item and press Enter. Loop A; Loop B. F1=Help F2=Refresh F3=Cancel F7=Select F8=Image F10=Exit Enter=Do /=Find n=Find Next.] Select the loop on which you want to create the hot spare pool. 4. The Hot Spare Pool Creation menu is displayed. [Screen: Add a Hot Spare Pool. Type or select values in entry fields. Press Enter AFTER making all desire
299. ...drive might or might not be configured on this system. Possible FRUs: Device (100%). Action: Run diagnostics in System Verification mode to all pdisks. If the diagnostics fail, exchange the pdisk for a new disk drive. If the diagnostics do not detect a failing pdisk, use the Link Verification service aid to search for disk drives that are not configured. Such disk drives are listed as ... Note: Other adapters in the SSA loop might also be listed as ... Exchange for new disk drives all pdisks that are not configured. SRN 4BPAA. Description: A disk drive at PAA cannot be configured, because its UID cannot be read. Possible FRUs: Device (100%). Action: If the SSA service aids are available: 1. Run the Link Verification service aid to find the failing device. The service aid lists the device as ... 2. Exchange the FRU for a new FRU. If the service aids are not available: 1. Note the value of PAA in this SRN, then go to ... 2. Exchange the FRU for a new FRU. SRN 50000. Description: The SSA adapter failed to respond to the device driver. Possible FRUs: SSA adapter card (100%). Action: Exchange the FRU for a new FRU. SRN 50001. Description: A data parity error has occurred. Possible FRUs: SSA adapter card (100%). Action: Exchange the FRU for a new FRU. SRN 50002. Description: An SSA adapter DMA error has
300. ...automatically becomes the new primary disk drive, and the array goes into the Rebuilding state. If the missing primary disk drive then becomes available again, it is rejected. If, after the write operation occurs, no hot spare disk drive is available, the array enters the Degraded state. If the missing primary disk drive then becomes available again, it is rejected. [Figure 30. Dual Host System with Primary Disk Drive Missing: Split Array Resolution set to Primary; System 1 and System 2, each with an SSA adapter; disk drives Primary 1 and Secondary 1.] Dual Host System with Only One System Switched On: Figure 31 shows a dual host system that has just been switched on. The system contains a RAID 1 array, but the primary disk drive is missing because system 1 is not switched on. The array remains in the Offline state on both systems until either of the following occurs: the primary disk drive becomes available; the Split Array Resolution flag is set to Secondary. [Figure 31. Dual Host System with Only One System Switched On: Split Array Resolution set to Primary; System 2 with an SSA adapter; disk drive Secondary 1.] Split Systems: The system can become split because communications have failed between the two systems. System 1 Split from System 2: The accompanying figure shows system 1 split from system 2. [Figure labels: Split Array Resolution Primary; Split Array Resolution Primary;
301. ...if_oneway 245; fastwrite on/off 244; fw_end_block 244; fw_start_block 244; fw_suspended 245; RAID 1 and 10 creation and change: copy_rate 247, copy_verify_writes 247, fw_max_length 247, hot_spare_splits 247, split_resolution 247; RAID 1, 5, and 10 creation and change: read_only_when_exposed true/false 245, spare_exact true/false 245, spare_preferred 246, spare true/false 245; RAID 10 creation and change: strip_size 248; RAID 5 creation and change 249: fw_max_length 248, strip_size 248; uncouple action: force yes/no 252; attributes common to logical and physical disks 271; attributes for logical disks only 271; attributes of the SSA router, ssar 270. B: battery assembly, fast write cache card: installing 338, power 211, removing 336; blue and black cables 314; bring-up, SSA adapter ID 6; broken loop (SSA link) 403, 406; buffer management, SSA Target mode 293; bypassing the cache in a one-way fast write network 217. C: cables, blue and black 314; Cancel all SSA Disk Identifications option 129; Certify Disk service aid 391; change and creation attributes, hot spare pool 249; change and creation attributes, RAID arrays 244; change attributes, physical disk drives: fastwrite on/off 250, force yes/no 251, fw_end_block 250, fw_max_length 250, fw_start_block 250, use system/spare/free 250; change attributes, hot spare pool: minimum_spares 249; change attributes, RAID arrays 248: array member disk drives s
302. ...fails or is removed. To list these devices, type: ssaraid -l ssaX -Iz -a state=wrong_cache, where X is the adapter number. Use the recovery procedure that is described for SRN 42521. You must recover all the devices that are listed. Chapter 11. SSA Error Logs: This chapter describes: error logging; error logging management; error log analysis; good housekeeping. Each topic is discussed as a summary, then as a detailed description. The summaries provide all the information that you need for routine service operations on SSA subsystems. For these operations, you have no need to inspect the system error log or to attempt to analyze the contents of the log. The detailed descriptions help you understand the meaning of the error log data, so that you can further analyze the error log. For example, you might decide to fail over an HACMP system when particular critical failures are logged. Error Logging Summary: Hardware errors can be detected by an SSA disk drive, an SSA adapter, or the SSA device driver. The SSA adapter performs error recovery for disk drive errors; the SSA device driver performs error recovery for the SSA adapter. When a problem is detected that needs to be logged, all the relevant data is sent to the error logging service in the device driver. The error logging service then sends the data to the system error logger. SSA errors are logged asynchronously; that is,
303. ...in which event the user might be required to take adequate measures. International Electrotechnical Commission (IEC) Statement: This product has been designed and built to comply with IEC Standard 950. Notice of Compliance with Industry Canada Regulations (French-language statement): This Class A digital apparatus complies with Canadian standard NMB-003. Industry Canada Compliance Statement: This Class A digital apparatus complies with ICES-003. United Kingdom Telecommunications Requirements: This apparatus is manufactured to the International Safety Standard EN60950 and, as such, is approved in the U.K. under approval number NS/G/1234/J/100003 for indirect connection to public telecommunications systems in the United Kingdom. European Union (EU) Statement: This product is in conformity with the protection requirements of EU Council Directive 89/336/EEC on the approximation of the laws of the Member States relating to electromagnetic compatibility. Neither the provider nor the manufacturer can accept responsibility for any failure to satisfy the protection requirements resulting from a non-recommended modification of the product, including the fitting of option cards not supplied by the manufacturer. Radio Protection for Germany: Certificate of approval under the German Act on the Electromagnetic Compatibility of Devices (EMVG) of 30 August 1995 (or the EC EMC Directive 89/336). This device is authorized, in accordance
304. ...information. See also the chapter about SSA loops and links. Note: If, for any reason, an adapter is exchanged for a replacement adapter, all associated arrays that were not synchronized when the adapter failed are rebuilt. Cron Table Entries: During the installation of the SSA software, the following four entries are made in the system cron table: 01 5 * * * /usr/lpp/diagnostics/bin/run_ssa_ela 1>/dev/null 2>/dev/null; 0 * * * * /usr/lpp/diagnostics/bin/run_ssa_healthcheck 1>/dev/null 2>/dev/null; 30 * * * * /usr/lpp/diagnostics/bin/run_ssa_encl_healthcheck 1>/dev/null 2>/dev/null; 30 4 * * * /usr/lpp/diagnostics/bin/run_ssa_link_speed 1>/dev/null 2>/dev/null. The first entry instructs the run_ssa_ela shell script to run at 05:01 each day. This shell script analyzes the error log. If it finds any problems, the script warns the user in the following ways. It sends: an error message to /dev/console (this message is displayed on the system console); an OPMSG to the error log (this message indicates the source of the error); a mail message to ssa_adm. Note: ssa_adm is an alias mail address that is set up in /etc/aliases. By default, the address is set to root, but you can set it to any valid mail address for the using system. The second entry instructs the run_ssa_healthcheck shell script to run once each hour. This shell script causes the SSA adapter to log any errors that might exist in the SSA su
305. ...ing the Clips, Slots, and Guides. 2. Ensure that the clips are fully open. 3. Hold the SDRAM module so that its slots align with the guides of the connector. 4. Refer to Figure 45. [Figure 45. Installing the SDRAM Module.] 5. Carefully insert the SDRAM module into the connector. 6. Press strongly on the module until it is fully home. 7. Close the clips by pivoting them in the directions shown by the arrows in the diagram. You hear a click when the clips are fully closed. If you cannot close the clips, the module is not fully home. Press the module fully home, then close the clips. 8. Reinstall the adapter into the using system (see the Installation and Service Guide for the using system). Removing the Fast Write Cache Option Card of an Advanced SerialRAID Adapter. Attention: The adapter assembly contains parts that are electrostatic-discharge (ESD) sensitive. Use the tools and procedures defined by your organization to protect such parts. The Fast Write Cache Option Card might contain customer data. 1. Remove the adapter from the using system (see the Installation and Service
306. ...ioctl_parms structure: u0.list_pdisks.name_array is a pointer to the array of ssadisk_name_desc_t structures that is in the caller memory. On return from the ioctl, this array is filled with the names of the hdisks. u0.list_pdisks.name_array_elements is set by the caller to indicate the number of elements that are in the array at which the u0.list_pdisks.name_array parameter is pointing. u0.list_pdisks.name_count: on return from the ioctl, this field indicates the number of names that are in the name array at which the u0.list_pdisks.name_array parameter is pointing. u0.list_pdisks.resource_count: on return from the ioctl, this field indicates the number of physical disk drives that make up the logical disk drive. This number might be less than u0.list_pdisks.name_count if, in the user memory, not enough elements were allocated in the name array to hold all the pdisk names, or if one or more physical disk drives that make up the logical disk have not been configured as physical disk drives. Return Values: If the command was successfully sent to the adapter card, this operation returns a value of 0. Otherwise, a value of -1 is returned, and the errno global variable is set to one of the following values: EIO, an unrecoverable I/O error has occurred; ENOMEM, the device driver was unable to allocate or pin enough memory to complete the operation. Files: /dev/pdisk0, /dev/pdisk1, ..., /dev/pdiskn
307. ion. Flags, 162. Example 1: Copying a Complete Volume Group. Example 2: Copying One Logical Volume, 165. Example 3: Copying One Logical Volume by Logical Volume Name (c) or by FS Name, 166. Example 4: Copying One Logical Volume by Logical Volume Name (c) or by FS Name, 168. Example 4: Copying a Complete Volume Group and Recreating the Copy on Another Using System, 169. Example 5: Running an Automatic Copy of a Volume Group. ssa_delete_copy Command, 171. Purpose. Flags. SMIT Menus for 3-Way Copy Operations. Getting Access to the Array Copy Services Menu, 172. Array Copy Services, 178. Effects of Array Copy on Other SMIT Menus, 186. Change/Show Attributes of an SSA RAID Array, 186. List Status Of All Defined SSA RAID Arrays, 188. Identify Disks in an SSA RAID Array, 189. Remove a Disk From an SSA RAID Array. Swap Members of an SSA RAID Array. Chapter 8 Split Site Management. Configuration of RAID 1 and RAID 10 Arrays. Operation after a Loss of Memb
308. ions. For fast path, type smitty ssa_identify_cancel and press Enter. Otherwise: 1. Select List/Identify SSA Physical Disks from the SSA RAID Arrays menu. 2. Select Cancel all SSA Disk Identifications. The Check lights of all identified disk drives stop flashing. Chapter 6 Using the RAID Array Configurator 129. Listing or Deleting Old RAID Arrays Recorded in an SSA RAID Manager. If an array becomes disconnected from a RAID manager by some method other than the method described in Deleting an O, a record of that array remains in the RAID manager. The record must be deleted manually. This option allows you to list the serial numbers of such arrays and to delete the records of those arrays from the SSA RAID manager. 1. For fast path, type smitty nvrssaraid and press Enter. Otherwise, select List/Delete Old RAID Arrays in an SSA RAID Manager from the SSA RAID Arrays menu. The following menu is displayed: List/Delete Old RAID Arrays in an SSA RAID Manager. Move cursor to desired item and press Enter. List Old RAID Arrays Recorded in an SSA RAID Manager. Delete an Old RAID Array Recorded in an SSA RAID Manager. F1 Help, F2 Refresh, F3 Cancel, F8 Image, F9 Shell, F10 Exit, Enter Do. If you want to list the arrays, select List Old RAID Arrays Recorded in an SSA RAID Manager and go to the indicated step of Listing Old RAID Arrays Recorded. If you want to delete the arrays, select Delete an Old RAID Array Recorded in
309. is command. The possible values are defined in the /usr/include/sys/errno.h file. The conc_cmd pointer argument to the special interrupt handler entry point of the top kernel extension is non-null. The cmd_op, message_code, and devno fields are zero.
DD_CONC_TEST: The DD_CONC_TEST device driver entry point has completed. The error field in the conc_cmd structure contains the return code that is necessary for the completion of this command. The possible values are defined in the /usr/include/sys/errno.h file. The conc_cmd pointer argument to the special interrupt handler entry point of the top kernel extension is non-null. The cmd_op, message_code, and devno fields are zero.
DD_CONC_RECV_REFRESH: A message with message_code was received for the SSA disk drive that is specified by the devno argument. The conc_cmd argument is null for this operation.
DD_CONC_RESET: The SSA disk drive that is specified by the devno argument was reset, and all pending messages or commands have been flushed. The conc_cmd argument is null for this operation.
• The concurrent command interrupt handler routine must have a short path length, because it runs on the SSA disk device driver interrupt level. If much command processing is needed, this routine should schedule an off-level interrupt to its own off-level interrupt handler.
• The top kernel extension must have an interrupt priority that is no higher than the interrupt priority of the SSA disk device driver.
• The c
310. isplayed options. Repeat steps 4 and H for each attribute that you want to change. Changing Member Disks in an SSA RAID Array. This option allows you to remove a disk drive from an array and install a replacement disk drive. All the data that is on the original disk drive is automatically written to the replacement disk drive. 1. For fast path, type smitty swpssaraid and press Enter. Otherwise, select Change Member Disks in an SSA RAID Array from the SSA RAID Arrays menu. 2. The following menu is displayed: Change Member Disks in an SSA RAID Array. Move cursor to desired item and press Enter. Remove a Disk From an SSA RAID Array. Add a Disk to an SSA RAID Array. Swap Members of an SSA RAID Array. F1 Help, F2 Refresh, F3 Cancel, F8 Image, F9 Shell, F10 Exit, Enter Do. If you have an available disk drive, select Swap Members of an SSA RAID Array and go to step on page 14 of Swapping Members of an SSA RAID Array. If you do not have an available disk drive, select Remove a Disk from an SSA RAID Array and go to step B on page 134. Removing a Disk Drive from an SSA RAID Array. This option allows you to remove a disk drive from an array so that you can install a replacement disk drive. Use this option when you do not have either an available online disk drive or a spare slot for a replacement disk drive. 1. For fast path, type smitty re
311. ite for Multiple Devices. Select all the pdisks against which the message Fast Write is enabled for these devices appears. Press Enter. h. Select no in the Enable Fast Write field. i. Select yes in the Force Delete field. j. Press Enter.
Possible causes (continued): installed correctly. • The Fast Write feature is not installed on this machine, but a disk drive that is configured for fast-write operations has been added to the subsystem. Possible FRUs: Fast Write Cache Option.
Chapter 18 SSA Problem Determination Procedures 417.
SRN 42521. Description: A Fast Write Cache Option Card has failed. Data has been written to the cache card and cannot now be recovered. The disk drives that have lost the data cannot be identified. All unsynchronized fast-write disk drives that are attached to this adapter are offline. Possible FRUs: Fast Write Cache Option.
Action: 1. Ask the customer to refer to De to determine: • Which disk drives are affected by this error • How much data has been lost • Which data recovery procedures can be done. 2. Ask the customer to disable the Fast Write option for: • Each device for which the Fast Write option is offline • All other devices that are connected to the failing adapter and have the Fast Write option enabled. For instructions on how to disable the Fast Write option, see the referenced section. 3. If this error has occurred because th
312. ite operation is ended and the remaining buffers are discarded. The read operation transfers received data from the device buffers to your application program. When the read operation ends, or the write operation stops sending data, the read operation returns the number of bytes read. SSA tmssa Device Driver. Purpose: To provide support for using-system to using-system communications through the SSA target-mode device driver. Syntax: #include </usr/include/sys/devinfo.h> #include </usr/include/sys/tmscsi.h> #include </usr/include/sys/scsi.h> #include </usr/include/sys/tmssa.h>. Description: The Serial Storage Architecture (SSA) target-mode device driver provides an interface to allow using-system to using-system data transfer by using an SSA interface. You can access the data transfer functions through character special files that are named /dev/tmssann.xx, where nn is the node number of the node with which you are communicating. The xx can be either im (initiator-mode interface) or tm (target-mode interface). The caller uses the initiator-mode interface to transmit data and the target-mode interface to receive data. When the caller opens the initiator-mode special file, a logical path is set up. This path allows data to be transmitted. The user-mode caller issues a write, writev, writex, or writevx system call to start sending data. The kernel-mode user issues an fp_write or fp_rwuio service call to sta
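The naming rule for the tmssa special files can be sketched as a small Python helper. This is an illustration only, not part of the driver interface: `tmssa_special_file` is a hypothetical name, and the sketch assumes that the files are named in the form /dev/tmssa<nn>.<xx>.

```python
def tmssa_special_file(node_number, mode):
    """Build the name of a tmssa character special file.

    Assumption: names take the form /dev/tmssa<nn>.<xx>, where <nn> is the
    node number of the remote node and <xx> is "im" (initiator mode, used
    to transmit) or "tm" (target mode, used to receive).
    """
    if mode not in ("im", "tm"):
        raise ValueError("mode must be 'im' or 'tm'")
    if node_number < 0:
        raise ValueError("node number must not be negative")
    return "/dev/tmssa{}.{}".format(node_number, mode)


# A sender would open the initiator-mode file for the remote node,
# and a receiver would open the target-mode file.
send_path = tmssa_special_file(3, "im")
recv_path = tmssa_special_file(3, "tm")
```

Keeping the mode check in one place avoids opening the wrong interface by mistake, because the two files differ only in their suffix.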
313. item and press Enter. List Disks in an SSA RAID Array. List Hot Spares. List Rejected Array Disks. List Array Candidate Disks. List System Disks. Identify Disks in an SSA RAID Array. Identify Hot Spares. Identify Rejected Array Disks. Identify Array Candidate Disks. Identify System Disks. SSA RAID Manager: Move cursor to desired item and press Enter. ssa0 Available 00-04 IBM SSA 160 SerialRAID Adapter (14109100). F1 Help, F2 Refresh, F3 Cancel, F8 Image, F10 Exit, Enter Do, Find, n Find Next. Select the adapter whose candidate disk drives you want to list. 3. A list of candidate disk drives is displayed: COMMAND STATUS. Command: OK, stdout: yes, stderr: no. Before command completion, additional instructions may appear below. pdisk3 0004AC5119E000D free 1.1G Physical disk. pdisk5 08005AEA030D00D free 2.3G Physical disk. F1 Help, F2 Refresh, F3 Cancel, F6 Command, F8 Image, F9 Shell, F10 Exit, Find, n Find Next. Listing System Disk Drives. This option allows you to list disk drives that are used by the using system. These disk drives are not member disk drives of any array. 1. For fast path, type smitty lassaraid and press Enter. Otherwise: a. Select List/Identify SSA Physical Disks from the SSA RAID Arrays menu. b. Select List System Disks. A list of adapters is displayed in a window: List/Identify
314. its for data. Because in a read operation the data can come at any time, the device driver does not maintain an internal timer to interrupt the read. Therefore, if a time-out function is required, it must be started by the calling program. If the calling program wants to break a blocked read subroutine, the program can generate a signal. The target-mode device driver receives the signal and ends the current read subroutine. If no bytes were read, the errno global variable is set to EINTR; otherwise, the return value indicates the amount of data that was read before the interrupt occurred. The read operation returns with whatever data has been received, whether the write operation has completed or not. If the remaining data for the write operation is received, it is put into a queue, where it waits for either another read request or a close command. When the target receives the signal and the current read is returned, another read operation can be started, or the target can be closed. If the read request that the calling program wants to break ends before the signal is generated, the read operation ends normally and the signal is ignored. The target-mode device driver attempts to queue received data in front of requests from the application program. A read-ahead buffer area is used to store the queued data. The length of this read-ahead buffer is determined by multiplying the value of the RecvBufferS
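Because the driver keeps no timer of its own, a caller that needs a time-out has to arrange one itself. The following Python sketch shows one way to do that on a POSIX system: an interval timer raises SIGALRM, and the signal handler breaks the blocked read. It is demonstrated on an ordinary pipe; `read_with_timeout` and `ReadTimeout` are hypothetical names, and the behavior of a real tmssa device is not modeled here.

```python
import os
import signal


class ReadTimeout(Exception):
    """Raised by the signal handler to break a blocked read."""


def _on_alarm(signum, frame):
    raise ReadTimeout()


def read_with_timeout(fd, nbytes, seconds):
    """Read up to nbytes from fd, giving up after the given time.

    Returns the bytes that were read, or None if the timer expired first.
    Must be called from the main thread, because it installs a signal handler.
    """
    previous = signal.signal(signal.SIGALRM, _on_alarm)
    signal.setitimer(signal.ITIMER_REAL, seconds)  # caller-started timer
    try:
        return os.read(fd, nbytes)
    except ReadTimeout:
        return None
    finally:
        signal.setitimer(signal.ITIMER_REAL, 0)    # cancel the timer
        signal.signal(signal.SIGALRM, previous)    # restore the old handler


# Demonstration on a pipe: the first read times out, the second succeeds.
read_end, write_end = os.pipe()
first = read_with_timeout(read_end, 16, 0.2)
os.write(write_end, b"abc")
second = read_with_timeout(read_end, 16, 0.2)
```

The handler raises an exception rather than returning, because Python otherwise retries a system call that was interrupted by a signal.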
315. ive. You must now make it a system disk drive and configure it on the using system. a. Using the RAID Copy array serial number that was created in the step on page 154, type: ssaraid -H -l ssa2 -n 185439188B4F4CT -a use=system. The RAID Copy array is now a system disk drive. b. If you now give the cfgmgr command, the RAID Copy array is configured and is given the next available hdisk name, for example, hdisk6. If you want to configure the RAID Copy array and use a specific hdisk name, type: mkdev -t hdisk -p ssar -w 185439188B4F4CT -l hdiskname, where 185439188B4F4CT is the RAID Copy serial number, and hdiskname is any hdisk name that you select, for example, hdisk5copy. 9. When you no longer need the data that is on the RAID Copy array, you can recouple the RAID Copy array to the parent array, or delete it. To delete the RAID Copy array, type: ssaraid -D -l ssa2 -n hdiskx -u, where hdiskx is the hdisk name that was assigned to the RAID Copy array when that array was configured in the earlier step. Using SMIT to Create a RAID Copy Array from a RAID 1 or RAID 10 Array. This section describes how to use the SMIT menus to create RAID Copy arrays. These are particularly suitable if you are not using Logical Volume Manager (LVM). For more details about the Note: Data must be separately synchronized from system memory cache before the copy is uncoupled. The procedure that is described here ena
316. ive. systemname pdisk1l AC1DEEE2 4GB SSA C Physical Disk Drive. systemname pdisk10 AC1DBE32 4GB SSA C Physical Disk Drive. F10 Exit. Chapter 17 SSA Service Aids 391. 2. Select the pdisk that you want to certify. The following instructions are displayed: CERTIFY DISK 802405. systemname pdisk0 AC7Q6E9A 4GB SSA C Physical Disk Drive. Move cursor onto selection, then press <Enter>. Set or Reset Identify: Select this option to set or reset the Identify indicator on the disk drive. Certify: Select this option to start the Certify operation. F3 Cancel, F10 Exit. 3. If you are not sure of the identification (pdisk number) of the disk drive that you want to certify, use the Identify function to get a positive physical identification of the disk drive. You can further ensure that you have selected the correct disk drive by verifying that the serial number on the front of the disk drive is the same as the serial number that is displayed on the screen. 4. When you are sure that you have selected the correct disk drive, select Certify. Display/Download Disk Drive Microcode Service Aid. The Display/Download Disk Drive Microcode service aid allows you to: • Display the level of microcode that is installed on all available disk drives • Change the level of microcode for a specific available disk drive to any level that is available in the using
317. ive the command: ssaencl -l enclosure -b. • To display the status and VPD of the controller card in enclosure0, give the command: ssaencl -l enclosure0 -c -v. • To modify the ID for enclosure0 to R2D2, give the command: ssaencl -l enclosure0 -I R2D2. • To display the contents of disk bay slot 8 in enclosure0, give the command: ssaencl -l enclosure0 -d 8. Chapter 16 Using the SSA Command Line Utilities 357. ssa_format Command. Purpose: To format the specified device. Syntax: ssa_format -l pdisk or ssa_format -l SSA_Adapter -b. Description: The ssa_format -l pdisk command opens the pdisk special file and uses the ISAL Format command to format the device. You can close the device while the format operation is running. If the command cannot format the device, it prints an error message. The ssa_format -l SSA_Adapter command attempts to format the Fast Write Cache Option Card, if present. If the data that is on the Fast Write Cache Option Card has been moved onto a disk drive (destaged), the formatting operation sets all the data on the Fast Write Cache Option Card to zero for security reasons. The ssa_format -l SSA_Adapter -b command resets the battery age counter on the Fast Write Cache Option Card. Use this option only when you have exchanged the Fast Write Cache Option Card battery. Possible failure conditions for the ssa_format command include: • No Fast Write Cache Option Card is present on the ad
318. iver. Purpose: To provide support for Serial Storage Architecture (SSA) disk drives. Syntax: #include <sys/devinfo.h> #include <sys/ssa.h> #include <sys/ssadisk.h>. Configuration Issues. SSA Logical Disks, SSA Physical Disks, and SSA RAID Arrays. Serial Storage Architecture (SSA) disk drives are represented in the operating system as SSA logical disks (hdisk0, hdisk1, ..., hdiskN) and SSA physical disks (pdisk0, pdisk1, ..., pdiskN). SSA RAID arrays are represented as SSA logical disks (hdisk0, hdisk1, ..., hdiskN). SSA logical disks represent the logical properties of the disk drive or array, and can have volume groups and file systems mounted on them. SSA physical disks represent the physical properties of the disk drive. By default: • One pdisk is always configured for each physical disk drive. • One hdisk is configured either for each disk drive that is connected to the using system, or for each array. By default, all disk drives are configured as system disk drives. The array management software deletes hdisks to create arrays. SSA physical disks have the following properties. They: • Are configured as pdisk0, pdisk1, ..., pdiskn • Have errors logged against them in the system error log • Provide support for a character special file (/dev/pdisk0, /dev/pdisk1, ..., /dev/pdiskn) • Provide support for the ioctl subroutine for servicing and diagnostics functions • Do not accept read or write subroutine calls for the character sp
319. ize attribute by the value of the RecvBuffers attribute. These values are in the configuration database. While the application program runs read subroutines, the queued data is copied to the application data buffer, and the read-ahead buffer space is again made available for received data. If an error occurs while the data is being copied to the caller data buffer, the read operation fails and the errno global variable is set to EFAULT. If the read subroutines are not run quickly enough, and almost all the read-ahead buffers for the device become full, data reception is delayed until the application program runs a read subroutine again. When enough area is freed, data reception capability is restored from the device. Data might be delayed, but it is not lost or ignored. The target-mode device driver controls only received data into its read entry point. The read entry point can optionally be used with the select entry point to provide a means of asynchronous notification of received data on one or more target devices. Possible return values for the errno global variable include: EAGAIN: Indicates that a nonblocking read request would have blocked because no data is available. EFAULT: An error occurred while copying data to the caller buffer. EINTR: Interrupted by a signal. EINVAL: Attempted to run a read operation for a device instance that is not configured, not open, or is not a target-mode minor device number. EIO: An I/O error occurred.
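The sizing rule and the error codes above can be summarized in a short Python sketch. It assumes only what the text states: the read-ahead buffer length is the product of the RecvBufferSize and RecvBuffers attributes, and a failed read sets errno to one of the listed values. The function names and the attribute values used in the example (4096 and 16) are hypothetical.

```python
import errno

# Meanings of the errno values that a read on a target-mode device can set,
# as listed in the text above.
READ_ERRORS = {
    errno.EAGAIN: "Nonblocking read would have blocked: no data is available",
    errno.EFAULT: "Error while copying data to the caller buffer",
    errno.EINTR: "Interrupted by a signal",
    errno.EINVAL: "Device instance not configured, not open, or not a target-mode device",
    errno.EIO: "An I/O error occurred",
}


def read_ahead_buffer_length(recv_buffer_size, recv_buffers):
    """Return the read-ahead buffer length in bytes.

    The length is the RecvBufferSize attribute multiplied by the
    RecvBuffers attribute.
    """
    if recv_buffer_size <= 0 or recv_buffers <= 0:
        raise ValueError("both attributes must be positive")
    return recv_buffer_size * recv_buffers


def describe_read_error(code):
    """Translate an errno value from a failed read into a short message."""
    return READ_ERRORS.get(code, "Unexpected error: " + os_strerror(code))


def os_strerror(code):
    # Small wrapper so that the fallback text comes from the operating system.
    import os
    return os.strerror(code)


# Hypothetical attribute values: sixteen buffers of 4096 bytes each.
total = read_ahead_buffer_length(4096, 16)
```

A calling program could use the computed length to judge how much data can queue up before reception is delayed.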
320. k Error. This MAP helps you to isolate FRUs that are causing an intermittent SSA link problem. You are here because you have an SRN from the series 21xxx through 29xxx, or you have SRN 33xxx. Related topic: SSA links, strings, and loops. Attention: Unless the using system needs to be switched off for some other reason, do not switch off the using system when servicing an SSA loop. Power cables and external SSA cables can be disconnected while that system is running. 1. a. Run the Link Verification service aid to the SSA adapter for which this error has been logged. A list of pdisks, similar to the example given here, is displayed: LINK VERIFICATION 802386. SSA Link Verification for: nunu:ssa0 00-04 IBM SSA 160 SerialRAID Adapter. To Set or Reset Identify, move cursor onto selection, then press <Enter>.
Physical, Serial#, Adapter Port (A1 A2 B1 B2), Status:
nunu:pdisk1, AC7AA09A, 0 3, Good
nunu:pdisk2, AC7AA2D6, 1 2, Good
nunu:pdisk3, AC7AA0BD, 2 1, Good
nunu:pdisk4, AC7AA0B1, 3 0, Good
F3 Cancel, F10 Exit.
Note: On the Link Verification screen, each adapter port is identified by the number of its related connector on the adapter card: • Adapter port 0 is identified as A1 • Adapter port 1 is identified as A2 • Adapter port 2 is identified as B1 • Adapter port 3 is identified as B2. SRNs 21xxx through 29xxx and SRN 33xxx include the adapter port number (0 through 3). b. Go to step 2. 2. (from step 1)
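The port numbering in the note above is a fixed four-entry mapping. A minimal Python sketch of that lookup follows; `connector_for_port` is a hypothetical helper name, and the mapping itself is taken directly from the note.

```python
# Adapter port number (as reported in the SRN) to the connector label
# that is shown on the Link Verification screen.
PORT_TO_CONNECTOR = {0: "A1", 1: "A2", 2: "B1", 3: "B2"}


def connector_for_port(port):
    """Return the connector label (A1, A2, B1, or B2) for port 0 through 3."""
    try:
        return PORT_TO_CONNECTOR[port]
    except KeyError:
        raise ValueError("adapter port must be 0, 1, 2, or 3") from None


# Ports 0 and 1 share loop A; ports 2 and 3 share loop B.
label = connector_for_port(2)
```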
321. k drive (pdisk). If a disk drive that has been formatted on a machine of a particular type (for example, a personal computer) is later installed into a using system that is of a different type (for example, a large host system), that disk drive is configured only as a pdisk during the configuration of the using system. SSA Unique IDs. Each SSA device has a specific identifier that is not used by any other SSA device in the whole world. This identifier is called the IEEE SSA Unique ID (UID) of the device. It is written into the device during manufacture. The full UID consists of 16 characters. The label on the side of a disk drive shows the full UID. The label on the front of a disk drive shows the serial number of the disk drive. The serial number is actually part of the UID. The Connection Address also contains part of the UID, followed by the LUN name and the device-type identifier. The software uses this information to access the device.
Full UID: 0000XXXXXXNNNNNN
Disk drive serial number: XXNNNNNN
Connection Address: XXXXXXNNNNNNLLD
where:
XXXXXX = IEEE Organization Identifier (manufacturer)
NNNNNN = Product ID (assigned unique number)
LL = LUN (always 00 for a LUN device)
D = Device type: D for an SSA physical disk drive, E for a fast-write logical disk, F for a RAID 0 array, K for a RAID 5 array
You might need to know the UID of a disk drive if you want to use the mkdev command to giv
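The relationship between the three identifiers can be worked through in a few lines of Python. The sketch below assumes the layout given above: a 16-character UID whose last 12 characters are XXXXXXNNNNNN, a serial number equal to the last 8 characters, and a Connection Address formed by appending the LUN and the device-type letter. The function name `split_uid` and the sample UID are hypothetical.

```python
# Device-type letters and their meanings, as listed in the text.
DEVICE_TYPES = {
    "D": "SSA physical disk drive",
    "E": "fast-write logical disk",
    "F": "RAID 0 array",
    "K": "RAID 5 array",
}


def split_uid(uid, device_type="D", lun="00"):
    """Derive the serial number and Connection Address from a full UID.

    uid         -- the 16-character UID from the label on the side of the drive
    device_type -- one of the letters in DEVICE_TYPES
    lun         -- two-character LUN; the text gives 00 for a LUN device
    """
    if len(uid) != 16:
        raise ValueError("a full UID has 16 characters")
    if device_type not in DEVICE_TYPES:
        raise ValueError("unknown device type: " + device_type)
    if len(lun) != 2:
        raise ValueError("the LUN field has 2 characters")
    org_and_product = uid[4:]          # XXXXXXNNNNNN
    return {
        "serial_number": uid[-8:],     # XXNNNNNN
        "connection_address": org_and_product + lun + device_type,
    }


# Made-up UID, used only to show which characters end up where.
ids = split_uid("0000ABCDEF123456")
```

For this made-up UID, the serial number is the last eight characters (EF123456) and the Connection Address is ABCDEF12345600D.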
322. k drive can be reused. If you think that a disk drive has been rejected because it is failing, check the error log history for that disk drive. For example, if you suspect pdisk3, type on the command line: ssa_ela -l pdisk3 -h 5. This command causes the error log for pdisk3 to be analyzed for the previous five days. If a problem is detected, an SRN is generated. Go to step bal. YES: a. Exchange the failing disk drive for a new one (see Exchanging Dis). b. Go to step 87 on page 479 to add the disk drive to the group of disk drives that are available for use by the RAID manager. 28 (from steps bd and b7) a. Type smitty ssaraid and press Enter. b. Select List All SSA RAID Arrays Connected to a RAID Manager. c. List the arrays that are connected to each SSA adapter. Are any arrays listed with a status other than Good or Rebuilding? NO: Go to step ba. YES: Go to step Lon page 454. 29 (from steps 28 and B3) a. Type smitty ssaraid and press Enter. b. Select List Status Of All Defined SSA RAID Arrays. c. Select, in turn, each type of array that is used on your subsystem, and press Enter. Do any listed arrays have Invalid Data Strips? NO: Go to step BO on page 471. YES: Go to step Lon page 454. 30 (from step bg) Do any RAID 5 arrays have Unsynced Parity Strips or Unbuilt Data Strips? NO: Go to step Bil. YES: The rebuilding operation is running or has stopped bef
323. k drive or a hot spare disk drive to add to the pool. If no free or hot spare disk drives exist in the list, review the configuration with the customer, or see the referenced section for guidance. g. Go to step B6 on page 473. (from step Bab) a. Type smitty ssaraid and press Enter. b. Select Array Copy Services. c. Select List all Copy Candidates. Are any hdisks listed that are in the Degraded copy state? NO: Go to step B6 on page 473. YES: Go to step B3 on page 4671. 36 (from step B5) You have solved all the problems. a. Run the repair verification or repair completion procedures that are defined by your using system. b. If you have previously created a backup, reload that data now. 37 (from steps i, id, ka, and 27) Has a failed disk drive been exchanged for a new disk drive? NO: If you have repaired a power or cabling fault that caused the disk drive to be missing from the system, the drive might now be in a rejected state. You must change that disk drive into a usable disk drive. a. Type smitty ssaraid and press Enter. b. Select Change/Show Use of an SSA Physical Disk. The disk drive that has been restored to the system is listed under SSA Physical Disks that are rejected. c. Select the disk drive that has been restored to the system. Change the Current Use parameter to Hot Spare Disk or to Array Candidate Disk. Note: It is the user who should make the choice of Cu
324. k drives to be coupled, or press the List key and select the disk drives from the displayed list of available candidate disk drives. Note: The number of disks that you need to select is listed as Number of disk drives required. c. Press Enter. The new RAID Copy array is created and coupled to the parent array. The copy operation starts. 5. The time needed for the copy operation is related to the amount of data that you are copying. A large amount of data can take more than one hour to copy. To check whether the copy operation has completed: a. Select Array Copy Services. b. Select List all copy candidates. The Copy State and percentage copied of all RAID 1 and RAID 10 arrays is displayed: COMMAND STATUS. Command: OK, stdout: yes, stderr: no. Before command completion, additional instructions may appear below.
Array, Array Type, Coupled Disks, Copy State:
hdisk3, raid_10, None, Not Copying 0
hdisk4, raid_10, pdisk8 pdisk9 pdisk10, Copying 73
F1 Help, F2 Refresh, F3 Cancel, F6 Command, F8 Image, F9 Shell, F10 Exit, Find, n Find Next.
c. When the percentage copied for the particular array reaches 100, the Copy State of that array changes to Good. You can now uncouple the RAID Copy from the parent array. Chapter 7 Copying Data from Arrays and from Volume Groups 157. 6. To uncouple the RAID Copy Array: a. Select Array Copy Services from the SSA RAID Arrays menu. b. Select Unco
325. kage ID, give the following command: lsattr -E -l adapter -a ucode, where adapter is the ID of the adapter that you want to check, for example, ssa0. An example of a response to this command is: ucode 14109100.05.50 Name of adapter code download file False. The microcode package ID is 14109100.05.50. Note: The word False shows that this attribute cannot be changed. To determine the adapter microcode level, use the Display or Change Configuration or Vital Product Data (VPD) service aid to display the VPD for the adapter (see the Diagnostic Information for Multiple Bus Systems manual). The microcode level is shown in the ROS level field. Alternatively, give the following command: lscfg -vl adapter, where adapter is the ID of the adapter that you want to check. You can determine the disk drive microcode level by using the Display/Download Disk Drive Microcode SSA service aid (see Display/Download Disk Drive Microcode Service Aid on page 393). Maintaining the Adapter Microcode. Updates to microcode can be delivered on several types of media, or you can get them from the web page http://www.storage.ibm.com/hardsoft/products/ssa. 1. Install the microcode as described in the instructions that are provided with the installation media. 2. Run the cfgmgr command, if you have not already done so, to download the new microcode into the SSA adapter. 3. If the SSA loops that are connected to this adapter contain two or more SSA adapters
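A script that needs the microcode package ID can pick it out of the one-line response shown above. The following Python sketch is an illustration under stated assumptions: the response is taken to be a single line of whitespace-separated columns (attribute, value, description, user-settable flag), the value is taken to contain no spaces, and the sample line uses the dotted form 14109100.05.50. The function name `parse_lsattr_line` is hypothetical.

```python
def parse_lsattr_line(line):
    """Split one line of attribute output into its four columns.

    Assumed column order: attribute name, value, description, and a final
    True/False flag that says whether the user can change the attribute.
    """
    fields = line.split()
    if len(fields) < 3:
        raise ValueError("expected at least: attribute, value, flag")
    return {
        "attribute": fields[0],
        "value": fields[1],
        "description": " ".join(fields[2:-1]),
        "user_settable": fields[-1] == "True",
    }


# Sample response line, in the form assumed for this sketch.
sample = "ucode 14109100.05.50 Name of adapter code download file False"
info = parse_lsattr_line(sample)
package_id = info["value"]
```

In this sample, the parsed flag is False, which matches the note that the attribute cannot be changed.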
326. ks are not children of an adapter but of an SSA router. This router is called ssar. It does not represent any actual hardware, but exists only to be the parent device for the SSA logical disks and SSA physical disks. Note: When the SSA disk device driver switches from using one adapter to using the other adapter to communicate with a disk, it issues a command that breaks any SSA SCSI reserve condition that might exist on that disk. The reservation break is performed only if this using system has successfully reserved the disk drive through the original adapter. This check is to prevent adapter takeover from breaking reservations that are held by other using systems. If multiple using systems are connected to the SSA disks, SSA SCSI reserve should not, therefore, be used as the only method for controlling access to the SSA disks. Fencing is provided as an alternative method for controlling access to disks that are connected to multiple using systems. The PCI SSA Multi-Initiator/RAID EL Adapter, the Micro Channel SSA Multi-Initiator/RAID EL Adapter, and the Advanced SerialRAID Adapter can reserve to a node number rather than to an adapter. It is highly recommended that you make use of this ability by setting the SSA router node_number attribute if multiple adapters are to be configured as described here. Configuring SSA Disk Drive Devices. SSA disk drives are represented as SSA logical disks (hdisk0
327. l Help, F2 Refresh, F3 Cancel, F4 List, F5 Reset, F6 Command, F7 Edit, F8 Image, F9 Shell, F10 Exit, Enter Do. Move the cursor to Current Use and press the List key. Note: If the Current Use field shows that the disk drive is owned by an array, you cannot change that use. 4. Select Hot Spare Disk in the Current Use field. 5. Press Enter. Changing or Showing the Status of a Hot Spare Pool. This option shows you the existing configuration of the arrays and the status of each hot spare pool. 1. For fast path, type smitty ls_hsm_status and press Enter. Otherwise, select List Status of Hot Spare Pools from the SSA RAID Arrays menu. 2. A list of adapters is displayed in a window: SSA RAID Arrays. Move cursor to desired item and press Enter. List All Defined SSA RAID Arrays. List All Supported SSA RAID Arrays. List All SSA RAID Arrays Connected to a RAID Manager. List Status Of All Defined SSA RAID Arrays. List/Identify SSA Physical Disks. List/Delete Old RAID Arrays Recorded in an SSA RAID Manager. List Status of Hot Spare Pools. SSA RAID Manager: Move cursor to desired item and press F7. ONE OR MORE items can be selected. Press Enter AFTER making all selections. ssa0 Available 04-06 IBM SSA 160 SerialRAID Adapter (14109100). ssa1 Available 04-07 IBM SSA 160 SerialRAID Adapter (14109100). F1 Help, F2 Refresh, F3 Cancel, F7 Select, F8 Image, F10 Exit, Enter Do, Find n Find
328. lass -s ssar -w ConnectionLocation -l NewDiskName, where: Type is the disk type that you noted in step H; Class is pdisk for a pdisk, or disk for an hdisk; ConnectionLocation is the number that you noted in step; NewDiskName is the pdisk or hdisk number (for example, pdisk0) that you want for the disk. For example: mkdev -p ssar -t scsd -c pdisk -s ssar -w 004AC5119E000D -l pdisk50. Removing and Replacing an Advanced SerialRAID Adapter. Attention: The adapter assembly contains parts that are electrostatic-discharge (ESD) sensitive. Use the tools and procedures defined by your organization to protect such parts. 1. Remove the adapter from the using system (see the Installation and Service Guide for the using system). 2. If you are exchanging this adapter for another, remove from it the SDRAM module and, if present, the Fast Write Cache Option Card. You must keep these items for the replacement adapter card. Note: The Fast Write Cache Option Card, if present, might contain customer data. 3. Install the SDRAM module and, if present, the Fast Write Cache Option Card onto the adapter card. If the network into which you are installing this adapter is not a two-way network, or it does not contain RAID 1 or RAID 10 arrays with coupled disk drives, go to step 5. Attention: If the network into which you are installing this adapter is a two-way network that contain
329. layed for the disk drives that you have chosen: Change Use of Multiple SSA Physical Disks. Type or select values in entry fields. Press Enter AFTER making all desired changes. Entry fields: SSA RAID Manager: ssa. SSA physical disk: pdisk6 pdisk7 pdisk8. New use: System Disks. F1 Help, F2 Refresh, F3 Cancel, F4 List, F5 Reset, F6 Command, F7 Edit, F8 Image, F9 Shell, F10 Exit, Enter Do. 5. If you want to select other uses for other disk drives, repeat this procedure for each different use. Copying RAID 1 or RAID 10 Arrays. For details about this, see the referenced section. Chapter 7 Copying Data from Arrays and from Volume Groups. When disk drives are configured through the Logical Volume Manager for mirroring, you can use the Split Copy function to create a copy of a logical volume. You can then save that copy on a tape or other medium for backup purposes. When disk drives are configured as members of a RAID 1 or RAID 10 array, a similar function, called 3-Way Copy, is available. This chapter describes the 3-Way Copy function and how it can be used. It does not describe the Split Copy function of the Logical Volume Manager. Note: The 3-Way Copy function is available only with adapter microcode level A000 or above. The 3-Way Copy procedure allows you to create a separate copy of an hdisk, of a complete volume group, or of part of a volume group. The copy is prepared on a separate disk
330. le run the Link Verification service aid see 0 j to find power faults or broken SSA links that might be causing this problem If the SSA service aids are not available or the problem remains go to e Fora RAID 0 ora RAID Copy array at least one member disk drive of the array is missing e For a RAID 5 array at least two member disk drives cannot be accessed e Fora RAID 10 array at least one mirrored pair of disk drives cannot be accessed e Power problem System configuration problem Chapter 18 SSA Problem Determination Procedures 423 valid Action See configuration SRN Problem Possible Causes 46500 Description A member disk drive is missing from an array or the original At least one disk drive missing SSA adapter is not available The array is in the Offline state from an array and that array is not now connected to the SSA Action adapter that opened it The If the missing disk drive is still operational reconnect it to the SSA array remains in the Offline loop state because its data integrity e If the original SSA adapter is operational reconnect the offline array to cannot b vented This problem might have occurred that adapter f because oe adapter and the disk drive have failed delete the array An SSA adapter card and a i disk drive both failed e The using system configuration was changed while an array was still open
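The per-RAID-type conditions listed in this excerpt (RAID 0 or RAID Copy: at least one member missing; RAID 5: at least two members inaccessible; RAID 10: at least one mirrored pair inaccessible) can be sketched as a small check. This is only an illustrative sketch: the function name, the type strings, and the way mirrored pairs are represented are assumptions made for the example, not part of the SSA software.

```python
from typing import Iterable, Tuple


def offline_condition_met(
    raid_type: str,
    inaccessible_members: int = 0,
    mirrored_pairs: Iterable[Tuple[bool, bool]] = (),
) -> bool:
    """Return True if the condition listed for this array type is met.

    raid_type            -- hypothetical labels: "raid_0", "raid_copy", "raid_5", "raid_10"
    inaccessible_members -- number of member disk drives that are missing or inaccessible
    mirrored_pairs       -- for RAID 10: (primary_ok, secondary_ok) for each mirrored pair
    """
    if raid_type in ("raid_0", "raid_copy"):
        # At least one member disk drive of the array is missing.
        return inaccessible_members >= 1
    if raid_type == "raid_5":
        # At least two member disk drives cannot be accessed.
        return inaccessible_members >= 2
    if raid_type == "raid_10":
        # At least one mirrored pair cannot be accessed (both halves unavailable).
        return any(not primary_ok and not secondary_ok
                   for primary_ok, secondary_ok in mirrored_pairs)
    raise ValueError(f"unsupported array type in this sketch: {raid_type}")


if __name__ == "__main__":
    print(offline_condition_met("raid_5", inaccessible_members=1))
    print(offline_condition_met("raid_10", mirrored_pairs=[(True, False), (False, False)]))
```

A RAID 10 array that has lost one disk drive from each of several different pairs does not meet the condition in this sketch, because no complete pair is inaccessible.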
331. lem determination for SSA links 400 POSTs adapter power on self tests 317 procedures 411 SRNs service request numbers 411 SSA link error 478 problems RAID array 89 502 User s Guide and Maintenance Information R RAID array configurator adding a disk drive to an SSA RAID array 140 adding a new hot spare pool 83 adding an SSA RAID array 60 adding disks to a hot spare pool 86 canceling all SSA disk drive identifications 129 changing member disks in an SSA RAID array 137 changing or showing the attributes of an SSA RAID array 135 changing or showing the status of a hot spare pool 74 changing or showing the use of an SSA disk drive 144 changing the use of multiple SSA physical disks 147 creating a hot spare disk drive 72 deleting an old RAID array recorded in an SSA RAID manager 133 deleting an SSA RAID array 70 getting access to the SMIT menu 59 identifying and correcting or removing failed disk drives 91 identifying array candidate disk drives 125 identifying hot spare disk drives 121 identifying rejected array disk drives 123 identifying system disk drives 127 identifying the disk drives in an SSA RAID array 119 installing a replacement disk drive 95 installing and configuring SSA RAID arrays 58 listing all defined SSA RAID arrays 100 listing all SSA RAID arrays that are connected to a RAID manager 102 listing all supported SSA RAID arrays 101 listing array candidate disk drives 115 listing hot spare disk drives 111 listing
332. lete contents of hdisk hd_1 which includes the logical volume Iv_A The logical volume fslv_A cannot be accessed however because its loglv is on hdisk hd_2 and has not been copied If a logical volume has its loglv on a separate hdisk the copy of that logical volume can be accessed only if the hdisk that contains the loglv is also copied Chapter 7 Copying Data from Arrays and from Volume Groups 165 Example 3 Copying One Logical Volume by Logical Volume Name or by FS Name In this example you are copying one logical volume Iv_A from the parent array to the RAID Copy array You can use either the logical volume name Iv_A or the FS name data fs_1 To copy one logical volume by logical volume name give either of the following sets of commands ssa_make_copy P v vgname 1 lv_A ssa_make_copy v vgname 1 lv_A or ssa_make_copy P v vgname 1 lv_A 1 lv_B ssa_make_copy v vgname 1 lv_A 1 lv_B To copy one logical volume by fs name give the commands ssa_make_copy P v vgname f data_fsl ssa_make_copy v vgname f data_fsl Although the logical volume lv_B is not shown in two of the commands it is copied because it is stored on the same physical volume as lv_A is When you specify a logical volume or a file system that you want to copy the vgname is not required because it can be resolved 166 User s Guide and Maintenance Information Step 1 Step 2 Step 3 Source Volume Group hd_2 h
333. level 23 adapter POSTs power on self tests 317 adapter takeover 267 adapters Advanced SerialRAID Adapter type 4 P 128 MB Memory Module feature 5 description 4 Fast Write feature 5 lights 6 port addresses 6 ID during bring up 6 installing 313 Add a Disk to an SSA RAID Array option 140 Add a Hot Spare Pool option 83 Add an SSA RAID Array option 60 add_hsm_pool_adap command 83 adding a new hot spare pool 83 adding disks to a hot spare pool 86 addresses port Advanced SerialRAID Adapter type 4 P 6 addressing SSA Devices location code format 19 unique IDs UIDs 21 addssaraid command 140 Advanced SerialRAID Adapter type 4 P description 4 lights 6 port addresses 6 array copy services 173 Delete a RAID Array Copy option 183 Delete a Volume Group Logical Volumes or Filesystems Copy option 184 List All Copy Candidates option 179 List All Uncoupled Copies option 181 List All Uncoupled Volume Groups option 182 493 array copy services continued Prepare a Copy option 175 Uncouple a Volume Group Logical Volumes or File Systems Copy option 177 array states RAID O 31 Good 31 Offline 31 array states RAID 1 32 array states RAID 10 36 Degraded 37 Exposed 36 Good 36 multiple 38 Offline 37 Rebuilding 37 Unknown 38 array states RAID 5 33 Degraded 33 Exposed 33 read operations while in 33 write operations while in 33 flowchart 35 Good 33 initial rebuilding operation 34 Offline 34 Rebuilding 34 adapter repl
334. lits If you enable page splits, data that is being written to the array is split into 4096-byte pages. The pages are then written in parallel to the member disk drives of the array. These actions increase the general speed of write operations to the array, although the pages are written in a random sequence. If you disable this option, the data is written sequentially, but the general speed of write operations is decreased. The sequence in which data is written to the array might be critical to the program that is using the data if an error occurs during the write operation. Initial Rebuild: When a RAID 1 or a RAID 10 array is first created, the data that is contained in the primary member disk drive of the array is different from the data that is contained in the secondary disk drive of the array. When data is written to the array, it is written both to the primary disk drive and to the secondary disk drive. The data that is on the secondary disk drive is therefore a mirrored copy of the data that is on the primary disk drive. If, however, you use a program that attempts to read data from the array before it has written any data to that array, the data that it reads might not be consistent. The data can be from either the primary disk drive or the secondary disk drive, which at this time are not mirrored copies of each other. If you use such a program, use the Initial Rebuild option. If you select no for this option, any data that your prog
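The page-split behavior described in this excerpt, in which write data is divided into 4096-byte pages, can be pictured with a short sketch. The function name is hypothetical and the code models only the splitting step. It does not model how the adapter schedules the pages across member disk drives.

```python
from typing import List

PAGE_SIZE = 4096  # page size in bytes, as stated in the text


def split_into_pages(data: bytes, page_size: int = PAGE_SIZE) -> List[bytes]:
    """Split a write buffer into fixed-size pages; the final page may be shorter."""
    return [data[i:i + page_size] for i in range(0, len(data), page_size)]


if __name__ == "__main__":
    pages = split_into_pages(bytes(10000))
    # 10000 bytes give two full pages and one partial page.
    print([len(page) for page in pages])
```

Because each page is an independent unit, the pages can be issued in any order, which is why the text warns that the write sequence is not preserved when page splits are enabled.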
335. m disk must be detached. Chapter 12. Using the SSA Command Line Interface for RAID Configurations. SSARAID Command Attributes. When using the ssaraid command, you can specify the following types of attribute: RAID array Creation and Change attributes; RAID array Change attributes; Physical Disk Drive Change attributes; Action attributes. RAID Arrays Creation and Change Attributes. This section describes the creation and change attributes that you can use for: all RAID arrays; RAID 1, RAID 5, and RAID 10 arrays only; RAID 1 and RAID 10 arrays only; RAID 5 arrays only; RAID 10 arrays only. Creation and Change Attributes for All RAID Arrays. You can specify the following attributes with the -a option when you are using the ssaraid command with the -C or -H option to create or change a RAID array. allow_page_splits=true/false (default: true). With the attribute set to true: When large blocks of data are sent to an array, those blocks can be internally split into smaller 4096-byte blocks that can then be written in parallel to the member disk drives of the array. This action greatly improves the performance of write operations to the array, although the blocks are not written sequentially to the member disk drives. With the attribute set to false: The blocks of data are written sequentially to the member disk drives of the array. This action can have a negative effect on the performance of write operations to the arra
336. m the SSA loop without taking any special actions If a disk drive does not have its Check light on the SSA loop that passes through it might still be active although the disk drive itself might not be working You must put that disk drive into Service Mode before you remove it from the SSA loop If you leave the Set Service Mode service aid Service Mode is reset 378 User s Guide and Maintenance Information To use the Set Service Mode service aid 1 Select Set Service Mode from the SSA Service Aids menu see FStarting the SSA Bervice Aids on page 374 A list of physical disk drives pdisks is displayed r SET SERVICE MODE 802381 N Move cursor onto selection then press lt Enter gt systemname pdisk0 AC50AE43 2GB SSA C Physical Disk Drive systemname pdisk1l AC706EA3 2GB SSA C Physical Disk Drive systemname pdisk2 AC1DBE11 2GB SSA C Physical Disk Drive systemname pdisk3 ACIDBEF4 2GB SSA C Physical Disk Drive systemname pdisk4 AC50AE58 2GB SSA C Physical Disk Drive systemname pdisk5 AC7C6E51 2GB SSA C Physical Disk Drive systemname pdisk6 AC706E9A 2GB SSA C Physical Disk Drive systemname pdisk7 AC1DEEE2 2GB SSA C Physical Disk Drive systemname pdisk8 amp AC1DBE32 2GB SSA C Physical Disk Drive S F3 Cancel F1O Exit E The columns of information displayed on the screen have the following meanings systemname pdiskO through pdisk8 AC50AE43 through AC1DBE32 2 GB SSA C Physical Disk Drive Name of the
337. mber adding and deleting tasks have been completed if the pool status is anything other than full go to e 88 User s Guide and Maintenance Information Dealing with RAID Array Problems This part of the chapter describes how to solve problems that might occur on your SSA RAID arrays You can get to the required SMIT menu by using fast path commands or by working through other menus During problem determination you can use any of the maintenance procedures described in A hot spare disk drive automatically replaces a failed or missing disk drive in a RAID array if The Enable Use of Hot Spares attribute is set to yes e A hot spare disk drive is available When a hot spare disk drive starts operating its Current Use attribute is changed from Hot Spare Disk to Member of an SSA RAID Array If a member disk drive of an array fails but access to that disk drive is still possible its Current Use attribute is changed from Member of an SSA RAID Array to Rejected For all other changes to the use of a disk drive you must use either the ssaraid commands or the SMIT menus Notes 1 Although this book always refers to the smitty commands you can use either the smitty command or the smit command The procedures that you follow remain the same whichever of the two commands you use If you send the smit command from a graphics terminal however the menus are displayed slightly differently from those shown in this book If you
338. member disk drives RAID 1 defines the first disk drive to be the equivalent of the first and third disk drives together. If the first disk drive is missing in RAID 1, it is equivalent to the first and third disk drives being missing in RAID 10. RAID 5 Array States. A RAID 5 array can be in one of several states. A knowledge of those states is useful when you are configuring your arrays. The states are described here. A flowchart for the RAID 5 array states is shown in ETE. Good State: A RAID 5 array is in the Good state when all the member disk drives of that array are present. Exposed State: A RAID 5 array enters the Exposed state when a member disk drive becomes missing (logically or physically) from that array. When an array is in the Exposed state, you can reintroduce the missing disk drive or exchange it for a new one. If the missing disk drive is reintroduced, the array returns to the Good state; the array management software does not need to rebuild the data. If a new disk drive is exchanged for the missing disk drive, the array management software rebuilds the data that was on the original disk drive before it became missing, then writes that rebuilt data to the replacement disk drive. When the data is correct, the array management software returns the array to the Good state. Read Operations while in the Exposed State: When a read operation is performed on an array that is in the Exposed sta
339. member objects of another object for example a RAID array List the parent object of a member object List all candidate objects that can be used to exchange an existing member of a RAID array object List all candidate objects that can be used to create new RAID array objects Give information about an object such as size and current usage Give information in colon separated format Give information in a summary format Give information for a specified device its members its parents its exchange candidates or its object candidates Give information for all objects of a particular type Limit the list to objects that have particular attribute values Create an object Create a particular type of object that is built from the specified members Assign values for attributes of the created object Create customized device objects for the new object and if required use the option that allows you to specify the device name Delete an object Delete the named RAID object Use the option that allows you to delete the device that is associated with the deleted RAID object Change an object by specifying new values for attributes of that object Perform an action on an object Swap remove or add disk drives in a RAID array Copy data from a RAID 1 or RAID 10 array onto a RAID Copy array 235 e List the objects that have support from a particular RAID manager
340. menu A list of arrays is displayed in a window 319 A SSA RAID Arrays Move cursor to desired item and press Enter List All Defined SSA RAID Arrays List All Supported SSA RAID Arrays List All SSA RAID Arrays Connected to a RAID Manager List Status Of All Defined SSA RAID Arrays List Identify SSA Physical Disks List Delete Old RAID Arrays Recorded in an SSA RAID Manager List Status of Hot Spare Pools List Status of Hot Spare Protection for an SSA RAID Array List Components in a Hot Spare Pool SSA RAID Array Move cursor to desired item and press Enter hdisk3 095231779F0737K good 3 46 RAID 0 array l hdisk4 09523173A02137K good 3 4G RAID 0 array Fl Help F2 Refresh F3 Cancel F8 Image F1Q Exit Enter Do Find n Find Next 3 Select the array that you want to delete 320 User s Guide and Maintenance Information 4 A prompt is displayed in a window fa SSA RAID Arrays Move cursor to desired item and press Enter List All Defined SSA RAID Arrays List All Supported SSA RAID Arrays List All SSA RAID Arrays Connected to a RAID Manager List Status Of All Defined SSA RAID Arrays List Identify SSA Physical Disks List Delete Old RAID Arrays Recorded in an SSA RAID Manager List Status of Hot Spare Pools List Status of Hot Spare Protection for an SSA RAID Array List Components in a Hot Spare Pool ARE YOU SURE Continuing may delete information you may want to ke
341. might occur SSA01 SSA02 SSA03 These SRNs are explained in the main SRN table see FService Request Numbers SRNs on page mr Using the Service Aids for SSA Link Problem Determination If you have a oo with an SSA loop use the Link Verification service aid see Ling The following examples show various loops and the associated information that is displayed by the Link Verification service aid 400 User s Guide and Maintenance Information Example 1 Normal Loops In Figure 56 disk drives 1 through 8 are connected to connectors A1 and A2 of the SSA adapter J Disk drives 9 through 12 are connected to connectors B1 and B2 of the same SSA adapter Disk drives 13 through 16 are connected to connectors A1 and A2 of a different SSA adapter H Using Far Using fo A1 at a2 B1 B2 B2 A1 at a2 e1 e2 B2 Disk Disk Disk Disk 16 15 14 Disk Disk Disk Disk Disk Disk Disk Disk 1 2 3 4 5 6 7 8 Figure 56 Normal Loop Chapter 17 SSA Service Aids 401 402 For this example the Link Verification service aid displays the following information A LINK VERIFICATION 802386 SSA Link Verification for systemname ssaQ 00 04 IBM SSA 160 SerialRAID Adapter To Set or Reset Identify move cursor onto selection then press lt Enter gt Physical Serial Adapter Port Al A2 B1 B2 Status TOP systemname pdisk11 AC50AE43 0 5 Good systemname pdisk8 AC706EA3 ESA Go
342. mmand displays a warning, then stops. If the hdisk is of the correct type, the ssa_make_copy command finds a suitable copy disk drive (for RAID 1) or suitable copy disk drives (for RAID 10). It then couples the copy disk drive or drives to the parent volume group and runs the copy operation (see Figure 23). Figure 23. A Copy Disk Drive Coupled to the Parent hdisk. When the copy operation has completed and the copy array has been uncoupled, the fast write cache, if present, is temporarily disabled and data is flushed to the physical disk drives. The ssa_make_copy command then forces the parent volume group to synchronize. The synchronize operation flushes all data from memory to disk and stops all I/O operations. The command uncouples the copy disk drive from the parent hdisk, clears the PVID of the copy disk drive, restarts the I/O operations, and reenables the fast write cache. The recreatevg command is run on the RAID Copy array. It creates the new volume group, renames the file systems, and mounts them to new mount points (see Figure 24). Figure 24. The RAID Copy Array Uncoupled from the Parent hdisk. 160 Use
343. mount of data that is available at the time of the operation. The amount of returned data is not necessarily the same as the amount that you requested. The tmssa device driver provides support for multiple concurrent read and write operations for different devices. It does not provide support for multiple read or write operations on the same device; the device driver blocks the operation until the device is free. Read and write operations can run concurrently on a particular device. If a working path exists between two nodes, communication works. The path must be stable long enough for the driver to transmit the data. The maximum time taken to fail a write operation is A × R × T, where A is the number of adapters in the using system, R is the number of retries as defined by TM_MAXRETRY in the /usr/include/sys/tmscsi.h file, and T is the retry time-out period. The minimum time taken to fail a write operation is the write time-out period. You can adjust the write time-out period and the retry time-out period (see the TMCHGIMPARM (Change Parameters) ioctl operation). You can use the select and poll routines to check for read and write capability, and can also be notified of the possibility of a read or write operation. The amount of data that can be sent by one write operation in blocking mode has no limit, but the driver and adapter interface has been optimized for transfers of 512 bytes or less. In nonblocking mode, enough buffer space must be available for the write operation
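As a worked example of the write-failure bounds in this excerpt, the sketch below computes the maximum and minimum times, assuming that the expression "A R T" denotes the product of the three quantities. The function names and the sample values are hypothetical; real values of TM_MAXRETRY and of the time-out periods come from the system, not from this example.

```python
def max_write_fail_time(adapters: int, retries: int, retry_timeout_s: float) -> float:
    """Upper bound on the time to fail a write: A x R x T (assumed to be a product)."""
    return adapters * retries * retry_timeout_s


def min_write_fail_time(write_timeout_s: float) -> float:
    """Lower bound on the time to fail a write: the write time-out period."""
    return write_timeout_s


if __name__ == "__main__":
    # Hypothetical values: 2 adapters, 3 retries, 10-second retry time-out.
    print(max_write_fail_time(adapters=2, retries=3, retry_timeout_s=10.0))
    # Hypothetical 5-second write time-out.
    print(min_write_fail_time(write_timeout_s=5.0))
```

With these illustrative numbers, a caller would wait between 5 and 60 seconds before a failing write operation returns.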
344. move option the following pop up menu is displayed i D Swap Members of an SSA RAID Array Type or select values in entry fields Pr Disk to Remove Move cursor to desired item and press Enter EEE HEE HH EE Primary Disks pdisk0 AC7AAQ78 04 07 P presen 9 1GB pdisk6 AC7AA8A4 04 07 P presen 9 1GB EEE HE HE Secondary Disks pdisk9 AC7AD176 04 07 P presen 9 1GB pdisk10 AC7AE3C9 04 07 P presen 9 1GB EEE HE HE Coupled Disks pdisk11 AC7AE417 04 07 P presen 9 1GB pdisk12 AC7AE41C 04 07 P presen 9 1GB Fl Help F2 Refresh F3 Cancel F1 F8 Image F1O Exit Enter Do F5 Find n Find Next FQ naan nn nnn nnn a nnn nn nnn nnn nn nnn nnn nnn nen enn nn nn ee nn een nee e nee n nee n eee J The status values for coupled disks are present The disk drive is present and operational not_present The disk drive is missing or has failed Chapter 7 Copying Data from Arrays and from Volume Groups 191 192 User s Guide and Maintenance Information Chapter 8 Split Site Management This chapter describes how to configure and manage a system in which the computing and disk drive resources are split between two or more sites so that the system can continue to operate if one site is lost Configuration of RAID 1 and RAID 10 Arrays RAID 1 and RAID 10 arrays hold data on mirrored pairs of disk drives that is th
345. mum to a lower number if you do not want to be alerted when a single hot spare disk drive has been used. Status: The status of the hot spare pool. Valid values for status are: full: The number of hot spare disk drives that are in the pool equals the number of hot spare disk drives that are configured for the pool. [Chapter 6. Using the RAID Array Configurator] empty: The pool contains no hot spare disk drives, or the hot spare disk drives that are in the pool are not suitable as member disk drives of the pool. reduced: The number of hot spare disk drives that are in the pool is less than the number of hot spare disk drives that were originally configured, but greater than the configured minimum number. critical: The number of hot spare disk drives that are in the pool is less than the specified minimum number of hot spare disk drives for that pool. inconsistent: Configuration data for the hot spare pool is saved on all hot spare disk drives. The hot spare disk drives that are in the pool, however, do not all contain the same configuration data. mixed: An array that is in this pool has used a hot spare disk drive that is from another pool. unused: Hot spare disk drives exist in a pool, but they are protecting no member disk drives. Showing the Disks That Are Protected by Hot Spares. This option shows you which member disk drives of an array are protected by hot spare disk drives. 1. For fast path t
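The count-based status values in this excerpt (full, reduced, critical, empty) can be sketched as a simple classification. This is an illustration only: the function and parameter names are hypothetical, the inconsistent, mixed, and unused states are not modeled, and the text does not say how a pool that holds exactly the minimum number is reported, so the sketch assumes that it is treated as reduced.

```python
def pool_status(present: int, configured: int, minimum: int) -> str:
    """Classify a hot spare pool from its disk counts.

    present    -- hot spare disk drives currently in the pool
    configured -- hot spare disk drives originally configured for the pool
    minimum    -- configured minimum number of hot spare disk drives
    """
    if present == 0:
        return "empty"
    if present < minimum:
        return "critical"
    if present < configured:
        # Assumption: present == minimum also lands here.
        return "reduced"
    return "full"


if __name__ == "__main__":
    for counts in [(3, 3, 1), (2, 3, 1), (1, 3, 2), (0, 3, 1)]:
        print(counts, pool_status(*counts))
```

Lowering the configured minimum, as the start of this excerpt suggests, widens the range of counts that are reported as reduced rather than critical.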
346. n ISAL command that is not in the list of supported ISAL commands, or the caller has attempted to send an FN_ISAL_FENCE command to an SSA physical disk. EPERM: The caller did not have an effective user ID (EUID) of 0. ENOMEM: The device driver was unable to allocate or pin enough memory to complete the operation. If the return code is 0, the result field of the ssadisk_ioctl_parms structure is valid. This indicates whether the adapter was able to process the command successfully. Chapter 13. Using the Programming Interface. Files: /dev/pdisk0, /dev/pdisk1, ..., /dev/pdiskn provide an interface to allow SSA device drivers to access SSA physical disk drives. /dev/hdisk0, /dev/hdisk1, ..., /dev/hdiskn provide an interface to allow SSA device drivers to access SSA logical disk drives. SSADISK_ISALMgr_CMD (ISAL Manager Command) SSA Disk Device Driver ioctl Operation. Purpose: To provide a method of sending Independent Network Storage Access Language (ISAL) Manager commands to an SSA physical or logical disk drive. Description: ISAL consists of a set of commands that allow a program to control and access a storage device. The ISAL command set is described in the Technical Reference for the adapter. The SSADISK_ISALMgr_CMD operation allows the caller to issue an ISAL command to a selected logical or physical disk. The caller must be root or have an effective user ID of root to issue
347. n adapter. This flag has no effect if the device is a disk drive. -u Forces a disk reservation to be broken if the device that is being tested is a disk drive. This flag has no effect if the device is an adapter, and is not valid for SSA Enhanced RAID Adapters or Advanced SerialRAID Adapters. -s Requests the output of the hardware status of a disk drive. This flag can be used only with a disk drive. It cannot be used with the -a flag or with the -u flag. Disk drive status output is: 0 Good: the adapter has initialized the pdisk. 1 Power: the pdisk has detected a loss of redundant power or cooling. Note: If the pdisk is in a 7133 Model D40 or Model T40, this status indicates that the enclosure has detected a degraded environment. Such an environment might be caused by a power, cooling, or temperature problem, or by an enclosure hardware failure. 2 Failed: the adapter cannot initialize the pdisk. If an error occurs, the ssa_diag command generates an error message (for example, ssa0 SRN 42500) and sends it to stdout. If no error occurs, the command sends no message to stdout. A non-zero return code indicates an error. The command sends an error message to stderr. ssadisk Command. Purpose: To display the names of disk drives that are connected to an SSA adapter. Syntax: ssadisk -a AdapterName -P -L. Description: The ssadisk command lists the names of disk drives that are connected
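A script that consumes the disk drive status values listed in this excerpt (0, 1, and 2) might translate them as in the sketch below. The mapping comes from the text; the function name and the handling of unlisted values are assumptions, and the sketch does not invoke ssa_diag or assume anything about its output format beyond the numeric code.

```python
DISK_STATUS = {
    0: "Good: the adapter has initialized the pdisk",
    1: "Power: the pdisk has detected a loss of redundant power or cooling",
    2: "Failed: the adapter cannot initialize the pdisk",
}


def describe_disk_status(code: int) -> str:
    """Return the documented meaning of a disk drive status code."""
    try:
        return DISK_STATUS[code]
    except KeyError:
        # Values other than 0, 1, and 2 are not described in the excerpt.
        return f"Unknown status code {code}"


if __name__ == "__main__":
    for code in (0, 1, 2, 7):
        print(code, describe_disk_status(code))
```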
348. n subroutine does not work. Before the initiator mode device can be successfully opened, its special file must be opened for write operations only. Before the target mode device can be successfully opened, its special file must be opened for read operations only. Possible return values for the errno global variable include: EBUSY: Attempted to run an open subroutine for a device instance that is already open. EINVAL: Attempted to run an open subroutine for a device instance, but either a wrong open flag was used or the device is not yet configured. EIO: An I/O error occurred. ENOMEM: The SSA device does not have enough memory resources. close Subroutine: The close subroutine deallocates resources that are local to the target device driver for the target or initiator device. No commands are sent to the device as a result of running the close subroutine. Possible return values for the errno global variable include: EINVAL: Attempted to run a close subroutine for a device instance that is not configured or not opened. EIO: An I/O error occurred. EBUSY: The device is busy. read Subroutine: Support for the read subroutine is provided only for the target mode device. Support for data scattering is provided through the user mode readv or readvx subroutine, or through the kernel mode fp_rwuio service call. If the read subroutine is not successful, the return value is set to -1 and the errno
349. n the cause of a problem is determined. The Error ID is a numeric identifier for the Error Label. Table 2 shows the error labels that SSA subsystems use. Table 2. Error Labels (columns: Error Label, Error ID, Error Description). DISK_ERR1, 21F54B38: An unrecovered media error has been detected. The problem will be solved automatically when data is next written to the failing block. If you are using RAID 5, no application has failed. If you are not using RAID 5, an application might have had a media error. Run error log analysis to determine whether the disk drive has become unreliable and should be exchanged for a new one. DISK_ERR4, 1581762B: A recovered media error has been detected. An occasional recovered media error is not serious. Multiple media errors per day on one disk drive, however, might indicate that the disk drive is failing. Run error log analysis to determine whether the disk drive should be exchanged for a new one. SSA_ARRAY_ERROR, B4C00618: A RAID array failure has been detected, and the array is not fully operational. Usually, the data on the array is safe, but ensure that you follow the service procedures exactly so that you do not lose any data. SSA_CACHE_ERROR, SSA_CACHE_BATTERY; BC31DEA7, 26CA120B: These errors indicate that the fast write cache has detected a problem. Usually, the problem has been caused by user or service actions, such as moving a Fast Write Cache Option card from one adapter to a
350. ncouple a Volume Group Logical Volumes or File Systems Copy option 177 copying data from an array 151 description 149 effects of array copy on other SMIT menus 186 Change Show Attributes of an SSA RAID Array option 186 Identify Disks in an SSA RAID Arrays option 189 List Status Of All Defined SSA RAID Arrays option 188 Remove a Disk From an SSA RAID Array option 190 Swap Members of an SSA RAID Array option 191 SMIT menus for 172 ssa_delete_copy command 171 ssa_make_copy command 161 using SMIT menus to create a copy array 155 using ssa_make_copy command to create a copy array 159 using ssaraid commands to create a copy array 151 A action attributes RAID 1 RAID 5 and RAID 10 251 new_member disk 251 old_member disk 251 adapter PCI ODM attributes 257 adapter device driver description 257 device dependent subroutines 258 direct call entry point 265 description 265 purpose 265 return values 265 files 260 head device driver interface 256 IOCINFO ioctl operation 261 description 261 files 261 purpose 261 adapter device driver continued managing dumps 259 open and close subroutines 258 PCI ODM attributes 257 responsibilities 255 SSA_GET_ENTRY_POINT ioctl operation 264 description 264 files 264 purpose 264 return values 264 SSA_TRANSACTION ioctl operation 262 description 262 files 263 purpose 262 return value 263 summary of SSA error conditions 259 adapter microcode maintenance 315 adapter microcode checking the
351. nd the failing device run diagnostics to the devices that are connected to this SSA adapter Disk drive error during system configuration Chapter 18 SSA Problem Determination Procedures 435 SS_LINK_CONFIG_FAILED error SSA devices cannot be configured because one device in the SSA loop is causing link responses that are not valid Action Isolate the failing device 1 If only one SSA loop is connected to the adapter go to step A If two SSA loops are connected to the adapter disconnect one loop and run diagnostics in System Verification mode to the adapter to determine which loop contains the failing device Then go to step 2 Disconnect the first device on the SSA loop that contains the failing device and run the diagnostics in System Verification mode to the adapter 3 If the diagnostics show that the failing device is still in the SSA loop reconnect the device and disconnect the next device in sequence Run the diagnostics again 5 Repeat steps Bl and H until you isolate the failing device SRN Problem Possible Causes 50200 Description A duplicate node number has been detected This problem is SSA loop configuration a user error problem Action See n You can use the ssaviynn command line utility ase Essavfynd to determine which node has the duplicate node number 50411 Description The SSA adapter has detected an SS_SIC_CLASS1 error Possible FRUs
352. nd sends no message to stdout A non zero return code indicates an error The command sends an error message to stderr 354 User s Guide and Maintenance Information ssaencl Command Purpose Syntax Description Flags To allow the monitoring and changing of the status for SSA SES disk enclosures subsystems To display enclosure component settings ssaencl 1 name s v i r b card t threshold a f fan d drive_bay p PSU o c e To modify enclosure component settings ssaencl 1 name I ID U B mode card mode S d drive_bay b card p PSU r c o T threshold value To display a usage message ssaencl h ssaencl The ssaencl command can be used to observe the existing settings of an SSA SES disk enclosure subsystem or to modify the settings of that enclosure The command can be used only to observe or modify settings it cannot be used to observe and modify settings at the same time When the command displays enclosure settings it displays them in tables that the user can read easily If the s flag is specified however the command displays the settings in colon separated format as used by commands such as the SMIT commands name Specifies the name of an SSA SES enclosure or pdisk If a pdisk name is used that pdisk must be in an SSA SES enclosure S Displays output in colon separated format V
353. ndidate disk drives Reduced The number of hot spare disk drives that are in the pool is less than the number of hot spare disk drives that were in the pool when the pool was last configured but greater than the minimum number that is specified for this pool This condition does not cause an error to be logged If you removed a disk drive from the configuration on purpose 1 Select Change Show Delete a Spare Pool from the smit ae menu 2 Select the reduced hot spare pool 3 Verify that the contents of the pool are as required 4 Press Enter If you have exchanged a failed disk drive you might now want to add the Critical The number of hot spare disk drives that are in the pool is less than the specified minimum number for that pool If you removed a disk drive from the configuration on purpose 1 Select Change Show Delete a Hot Spare Pool from the smit ss raid menu a A 2 Select the critical hot spare pool 3 Verify that the contents of the pool are as required 4 Press Enter If you have exchanged a failed disk drive you must now add the exchanged disk drive to this pool see Spare Pool on er a 54 User s Guide and Maintenance Information Inconsistent Mixed Unused The member disk drives in the pool do not agree about the size of the hot spare disk drives or about the minimum number of hot spare disk drives that is required This state is probably caused by changes to the SSA loop
354. ned on the primary disk drive is a mirror copy of the data that is contained on the secondary disk drive.
Secondary Disk: The secondary disk drive of a RAID 1 array. A RAID 1 array must consist of two disk drives, one primary and one secondary, that are in the same loop. The data that is contained on the secondary disk drive is a mirror copy of the data that is contained on the primary disk drive.
Primary Disks: The primary disk drives of a RAID 10 array. A RAID 10 array consists of an even-numbered quantity (4 through 16) of disk drives that are in the same loop. The minimum RAID 10 array consists of two primary and two secondary disk drives. The data that is contained on the primary disk drives is a mirror copy of the data that is contained on the secondary disk drives. Ensure that you choose a quantity of primary disk drives that matches the quantity of secondary disk drives.
Secondary Disks: The secondary disk drives of a RAID 10 array. A RAID 10 array consists of an even-numbered quantity (4 through 16) of disk drives that are in the same loop. The data that is contained on the secondary disk drives is a mirror copy of the data that is contained on the primary disk drives. Ensure that you choose a quantity of secondary disk drives that matches the quantity of primary disk drives.
Strip Size: The maximum amount of contiguous data that is mapped to a single member disk drive.
Enable Use of Hot Spares: If you enable this option the SSA RAID M
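The RAID 10 membership rules in the excerpt above (4 through 16 disk drives, equal numbers of primary and secondary disk drives) can be checked before an array is created. The following is a minimal sketch under stated assumptions: the function name and its list-of-pdisk-names arguments are hypothetical, and the requirement that all members be in the same loop is not modeled.

```python
def check_raid10_members(primary, secondary):
    """Return a list of rule violations for a proposed RAID 10 array.

    primary, secondary: lists of pdisk names (hypothetical input format).
    An empty result means the counts satisfy the rules quoted above.
    """
    problems = []
    total = len(primary) + len(secondary)
    # The quantity of primary disks must match the quantity of secondary disks,
    # which also guarantees that the total is even.
    if len(primary) != len(secondary):
        problems.append("primary and secondary counts differ")
    # The array must contain 4 through 16 disk drives.
    if total < 4 or total > 16:
        problems.append("total disk count is outside 4 through 16")
    return problems


if __name__ == "__main__":
    print(check_raid10_members(["pdisk1", "pdisk2"], ["pdisk8", "pdisk9"]))
    print(check_raid10_members(["pdisk1"], ["pdisk8"]))
```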
355. nel mode Files dev ssa0 dev ssa1 dev ssan 264 User s Guide and Maintenance Information SSA Adapter Device Driver Direct Call Entry Point Purpose To allow another kernel extension to send transactions to the SSA adapter device driver This function is not valid for a user process When the function completes its run an off level interrupt notifies the caller See SSA_GET_ENTRY_POINT SSA adapter ioctl operation Description The entry point address is the address that is returned in EntryPoint by the SSA_GET_ENTRY_POINT ioctl operation The function takes a single parameter of type SSA_loreq_t which is defined in the usr include sys ssa h file The fields of the SSA_loreq_t structure are used as follows SsaDPB_ An array of size SSA_DPB_SIZE which is used by the SSA adapter device driver and should be initialized to all NULLs SsaNotify The address of the function in the SSA head device driver that the SSA adapter device driver calls when the directive has completed ud The transaction to be executed Valid transactions are described in the Technical Reference for the adapter Return Values This function does not return errors You can determine success or failure of the directive by examining the directive status byte and transaction result fields which are set up in the SSA MCB For details see the Technical Reference for the adapter Chapter 13 Using the Programming Interface 265 ssadisk SSA Disk Device Dr
356. nfigurator 139 Adding a Disk Drive to an SSA RAID Array This option allows you to install a replacement disk drive into a RAID 5 array that is running in the Exposed or Degraded state because a disk drive has been rejected or removed from the array You cannot use this procedure to add a disk drive to a RAID 1 or RAID 10 array When you install the replacement disk drive all the data that was contained on the original disk drive is automatically written to the replacement disk drive 1 For fast path type smitty addssaraid and press Enter Otherwise a Select Change Member Disks of an SSA RAID Array from the SSA RAID Arrays menu b Select Add a Disk to an SSA RAID Array 2 Alist of arrays is displayed in a window fe Change Member Disks in an SSA RAID Array Move cursor to desired item and press Enter Remove a Disk from an SSA RAID Array Add a Disk to an SSA RAID Array Swap Members of an SSA RAID Array SSA RAID Array Move cursor to desired item and press Enter hdisk2 095231779F0737K degraded 3 4G RAID 5 array Fl Help F2 Refresh F3 Cancel F8 Image F1O Exit Enter Do Find n Find Next Select the array to which you are adding the disk drive 140 User s Guide and Maintenance Information The following information is displayed E D Add a Disk to an SSA RAID Array Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields SSA RAID Manager ssa0 SSA RA
357. nformation 3 from step bh Do you have any other SRN NO Go to step P2 on page 4671 YES a Solve the problems that caused the SRN b Return to step Lon page 454 from step B Find your SRN in the following table then do the appropriate actions Note If you still do not have any of these SRNs you are in the wrong MAP SRN Cause Action 46000 An array is in the Offline state Go to step fl 47000 You have more than the maximum Go to step Bon page 457 number of arrays allowed 47500 A partial loss of data has occurred Go to step Lan page 458 49000 An array is in the Degraded state Go to step 13 on page 460 49100 An array is in the Exposed state Go to stepiZ on page 463 49950 An array copy disk drive is missing Go to step 23 on page 467 5 from step n An array is in the Offline state if it contains at least one member disk drive but does not contain enough member disk drives to maintain data availability Are any disk drives missing or without power or have any disk drives been recabled not necessarily by you NO Go to step 6 on page 454 YES Restore the original configuration a Type smitty ssaraid and press Enter b Select List All SSA RAID Arrays Connected to a RAID Manager The status of the array changes to Good when the adapter can find all the member disk drives of the array j to verify the repair Chapter 18 SSA Problem Determination Procedures
358. ng System C Using System A Adapter Adapter Adapter Adapter Using System B Using System D Adapter Adapter Figure 11 A Large Configuration of Thirty Two Disk Drives Connected to Eight SSA Adapters in Four Using Systems 16 Users Guide and Maintenance Information Switching Off Using Systems Be careful if you want to switch off one or more using systems in a large configuration If any disk subsystem in the configuration does not use bypass cards some using systems might lose access to disk drives if you e Switch off more than one using system at a time e Switch off a using system when a disk drive has failed Note For more information about bypass cards see the publications for your disk subsystems or enclosures Switching On Using Systems When you switch on using systems of a large configuration ensure that each using system configures all the disk drives in the SSA loop You can switch on each using system and give the cfgmgr command to ensure that all the disk drives are configured If however you need your pdisk assignments to be constant between using systems follow the procedures given in Chapter 2 Introducing SSA Loops 17 Configuring Devices on an SSA Loop If an SSA
359. ng sequence enables you to determine the best relationship between the disk drives and the adapter or adapters on an SSA loop that contains two or more pairs of adapter connectors Most of the disk drives share data access with each other 1 Determine which data is shared between the pairs of adapter connectors 2 Assign this data to the disk drives that are equally spaced between the sharing pairs of adapter connectors For example the loop that is shown in Figure 14 contains 16 disks and four adapters In this loop e The pairs of adapter connectors should be spaced between the disk drives e Data that is shared by adapters A and B should be put onto disk drives 1 through 4 Data that is shared by adapters B and C should be put onto disk drives 5 through 8 AdapterB Disk _ Disk _ Disk _ Disk Disk _ Disk _ Disk _ Disk _ 1 2 3 4 5 6 7 8 a lt O g g a Qa o oO To gej lt lt E Disk _ Disk __ Disk _ Disk Disk _ Disk _ Disk _ Disk E 16 15 14 13 12 11 10 9 Adapter D Figure 14 Pairs of Connectors in the Loop Mainly Shared Data Note For configurations such as that shown here we recommend that the adapters be installed in separate using systems Otherwise disk drives can become isolated should both adapters fail or be held reset in one of the u
360. nitions from the IBM Dictionary of Computing If you do not find the term or abbreviation for which you are looking try the index or refer to the IBM Dictionary of Computing at URL http www networking ibm com nsg nsgmain htm A array Two or more disk drives that are interconnected to increase security performance or reliability l attribute A named property of an entity for example the attributes of a RAID array include state current use and size of array B boot To prepare a computer system for operation by loading an operating system buffer A routine or storage that is used to compensate for a difference in rate of flow of data or time of occurrence of events when transferring data from one device to the other C candidate disk Disk drives that are available for use in an array component The components of a RAID array are the member disk drives that are configured for that array The component of a fast write hdisk is the array or disk drive that is configured for fast write operations contiguous Touching or joining at a common edge or boundary for example an unbroken consecutive series of storage locations couple To attach a copy array to a RAID 1 or RAID 10 array and copy data from the RAID 1 or RAID 10 array to that copy array The metadata of the copy array is updated to indicate that it is part of the parent RAID array The metadata of the parent array is updated to sh
361. nk Verification Service Aid The Link Verification service aid helps you determine e Which devices are connected to the SSA loop e Where an SSA loop has been broken e The status of the disk drives on that SSA loop e The location of enclosure faults that have been detected by the disk drives on that SSA loop To use the Link Verification service aid 1 Select Link Verification from the SSA Service Aids menu see FStarting the SSA J The Link Verification adapter menu is displayed pein VERIFICATION 802385 D Move cursor onto selection then press lt Enter gt nunu ssa0 04 02 IBM SSA 160 SerialRAID Adapter nunu ssal 04 04 IBM SSA 160 SerialRAID Adapter nunu ssa2 04 07 IBM SSA 160 SerialRAID Adapter iced F1O Exit j 2 Select the adapter that you want to test The columns of information displayed on the screen have the following meanings e The first column shows the adapter name The format of the adapter name is systemname adaptername for example nunu ssa0 where systemname is the name of the using system that contains the SSA adapter adaptername is the adapter resource identifier The second column shows the adapter location code for example 04 02 e The third column shows the description of the adapter for example IBM SSA 160 SerialRAID Adapter Note If the adapter name is longer than the description field the name is shortened as shown in the screen above Chapter 17 SSA Service Aids 383
362. nnected to a RAID Manager List Status Of All Defined SSA RAID Arrays List Identify SSA Physical Disks List Delete Old RAID Arrays Recorded in an SSA RAID Manager List Status of Hot Spare Pools List Status of Hot Spare Protection for an SSA RAID Array List Components in a Hot Spare Pool SSA RAID Manager Move cursor to desired item and press Enter ssa0 Available 04 03 IBM SSA 160 SerialRAID Adapter 14109100 ssal Available 04 02 IBM SSA 160 SerialRAID Adapter 14109100 F1l Help F2 Refresh F3 Cancel F8 Image F1O Exit Enter Do F Find n Find Next e EA Select the adapter on which you want to change the hot spare pool 86 User s Guide and Maintenance Information 3 A list of hot spare pools is displayed C D SSA RAID Arrays Move cursor to desired item and press Enter List All Defined SSA RAID Arrays List All Supported SSA RAID Arrays List All SSA RAID Arrays Connected to a RAID Manager SSA Hot Spare Pools Move cursor to desired item and press Enter fH EE EEE A EH EEE EE EE EE A aE Hot Spare Pools in Loop A are pool_Al pool_A2 HEE EEE EE TEE EE EEE EEE AE Hot Spare Pools in Loop B are pool Bl pool_B2 Fl Help F2 Refresh F3 Cancel F7 Select F8 Image F10 Exit F Enter Do Find n Find Next Select the hot spare pool that you want to change Chapter 6 Using the RAID Array Configurator 87 4 The hot spare pool is displayed a gt SSA Hot Spare Pools
363. not determined
To display the link speed of pdisk0, give the command: ssa_speed -l pdisk0
To display the adapter link speeds of ports A1 and A2 on ssa0, give the command: ssa_speed -a ssa0 -p A
To display the adapter link speeds of all the devices in network B on ssa0, give the command: ssa_speed -a ssa0 -n B
To find link speed exception conditions, give the command: ssa_speed -x
ssavfynn Command
Purpose: To check for duplicated node numbers.
Note: It is recommended that this command be used only when all the adapters on the network are SSA RAID adapters.
Syntax: ssavfynn -w
Description: The ssavfynn command is in the /usr/lpp/diagnostics/bin directory. It has no flags. If the ssavfynn command runs and finds no duplicate node numbers on the SSA network, it returns no message. If the command finds duplicate node numbers, it returns a message that is similar to that shown here:
SSA User Configuration Error: Node Number 1 is set on both Local Host abc.somewhere.ibm.com and Remote Host xyz
This message says that a problem exists between your machine abc and another machine xyz that is connected through the SSA network. The names shown are the DNS names of the machines.
-w Turns on the network-wide check. It checks local node numbers against remote node numbers, and remote node numbers against other remote node numbers. The normal check is local n
364. nother or moving a disk drive between adapters before the data in the cache card has been synchronized with the data on the disk drive Take care when moving cache cards or adapters that contain cache cards because they might contain data that needs to be synchronized Always follow the service procedures for the SRN carefully to ensure that you do not lose any data This error recommends that the battery of the Fast Write Cache Option card be exchanged for a new one SSA_DEGRADED_ERROR 3DB7729E An error or condition has occurred that might cause some of the SSA functions to be unavailable or to be working with reduced performance SSA_DETECTED_ERROR EC9903DF Errors of this type are logged by the adapter when a device failure has been reported via SSA asynchronous messages Because the system name of the device or devices that are sending these messages is not known the error is logged against the adapter The SRN indicates the service procedures that must be performed 222 User s Guide and Maintenance Information Table 2 Error Labels continued Error Label Error ID Error Description SSA_DEVICE_ERROR FE9E9357 This error can be logged against the adapter or disk drive resource When the error is logged against a disk drive it indicates that the adapter has detected a failure on the disk drive It is possible however that the failure was detected because the disk drive was un
365. nous dynamic random access memory secondary half The term that distinguishes one half of a split array The term primary half distinguishes the other half of the split array Serial Storage Architecture An industry standard interface that provides high performance fault tolerant attachment of I O storage devices Glossary 491 service request number A number that helps you to identify the cause of a problem the failing field replaceable units FRUs and the service actions that might be needed to solve the problem Service request numbers are generated by the system error log analysis system configuration code and customer problem determination procedures SMIT System management interface tool SRN Service request number SSA Serial Storage Architecture SSA unique ID The specific identifier for a particular SSA device Each SSA device has a specific identifier that is not used by any other SSA device in the whole world stretch A set of stripes that is used to perform a particular level of array management strip The maximum amount of contiguous user data that is mapped to one component stripe A set of strips with their mirrors that have corresponding LBAs on each component system disk A disk that is owned by the using system that is it does not belong to an array and it is not a hot spare disk U unconfigure a To take a device from the available configured state to the defined state
366. ns on all SSA adapters in the using system.
pdisk Specifies the pdisk that you want to test.
-a AdapterName Specifies the adapter that you want to test. If you select the -a flag, you must also select the -p or the -n flag.
-s Specifies the supported link speed.
-p Loop Causes the ssa_speed command to show the operating speed of both ports. Valid Loop parameters are A, a, B, b; for example, -p A, -p b.
-n Network Causes the ssa_speed command to show the operating speed of all ports on the selected network. Valid Network parameters are A, a, B, b; for example, -n A, -n b.
-x Causes the ssa_speed command to test each adapter on the using system for link speed exceptions on all nodes. To do this, the command inspects the Supported Speed and the Current Speed for each pair of neighboring nodes in each network. If the Current Speed of a particular pair of nodes is less than the Supported Speed of the slowest device in that pair of nodes, the command returns the message <adapter_name> <port_hop>, where <port_hop> is the lowest <port_hop> count of the tested pair.
-e When used with the -x flag, the -e flag generates an error log entry if a link speed exception condition is detected.
Output: The ssa_speed command sends all error messages to stderr and output to stdout. Link speeds can be: 20 = 20 MB; 40 = 40 MB; 00 = not operational; 2
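The exception test that the x flag performs, as described in the excerpt above, compares the current speed of each link with the supported speed of the slower of its two nodes. The sketch below illustrates that comparison only; it does not call ssa_speed, and the function name, the dictionary keys, and the port_hop string format are all hypothetical.

```python
def find_speed_exceptions(links):
    """Return (adapter, port_hop) for each link running below its supported speed.

    Each link is a hypothetical dict describing one pair of neighboring nodes:
      adapter   - adapter name, for example "ssa0"
      port_hop  - lowest port/hop count of the pair (format assumed)
      current   - current link speed in MB
      supported - supported speeds in MB of the two nodes in the pair
    """
    exceptions = []
    for link in links:
        # The link can run no faster than the slowest device in the pair.
        slowest_supported = min(link["supported"])
        # An exception exists only when the link runs below that speed.
        if link["current"] < slowest_supported:
            exceptions.append((link["adapter"], link["port_hop"]))
    return exceptions


if __name__ == "__main__":
    sample = [
        {"adapter": "ssa0", "port_hop": "A1 2", "current": 20, "supported": (40, 40)},
        {"adapter": "ssa0", "port_hop": "A1 3", "current": 20, "supported": (40, 20)},
    ]
    print(find_speed_exceptions(sample))
```

In the sample data, the second link is not an exception: one of its nodes supports only 20 MB, so 20 MB is the best speed that the pair can reach.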
367. nt to display the status Notes a No additional status information is available for RAID O arrays b The menu that is shown here does not show RAID Copy arrays See Effects of 86 for more information 104 User s Guide and Maintenance Information The following information is displayed for RAID 5 arrays is D COMMAND STATUS Command OK stdout yes stderr no Before command completion additional instructions may appear below Unsynced Parity Strips Unbuilt Data Strips hdisk3 0 0 hdisk4 0 0 Fl Help F2 Refresh F3 Cancel F6 Command F8 Image F9 Shel1 F10 Exit Find Cii Next Y Chapter 6 Using the RAID Array Configurator 105 The following information is displayed for RAID 1 and RAID 10 arrays i gt COMMAND STATUS Command OK stdout yes stderr no Before command completion additional instructions may appear below Components Primary Secondary Status hdisk4 degraded pdisk13 pdisk15 good pdisk3 pdisk6 good pdisk7 BlankReserved5Z degraded F1 Help F2 Refresh F3 Cancel F6 Command F8 Image F9 Shel F1O Exit Find n Find Next XX Status data is given for the array and for each mirrored pair of disk drives in the array The status values for the array are good The status of all mirrored pairs is good exposed The status of one or more mirrored pairs is exposed No mirrored pair is degraded degraded The status of one or more mirrored pairs is degraded rebuilding The stat
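The way the array status follows from the status of its mirrored pairs, as listed in the excerpt above, can be sketched as follows. This is an illustration under stated assumptions: the function name is hypothetical, only the good, exposed, and degraded values are modeled (the definition of rebuilding is cut off in the excerpt), and any other combination is reported as "unknown".

```python
def array_status(pair_statuses):
    """Derive a RAID 1 or RAID 10 array status from its mirrored-pair statuses."""
    # degraded: the status of one or more mirrored pairs is degraded.
    if "degraded" in pair_statuses:
        return "degraded"
    # exposed: one or more pairs are exposed, and no pair is degraded.
    if "exposed" in pair_statuses:
        return "exposed"
    # good: the status of all mirrored pairs is good.
    if all(status == "good" for status in pair_statuses):
        return "good"
    # Statuses not covered by this sketch, for example rebuilding.
    return "unknown"


if __name__ == "__main__":
    # Pair statuses taken from the hdisk4 example screen above.
    print(array_status(["good", "good", "degraded"]))
```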
368. ntains the failing member disk drive.
If the Choose Hot Spare Only from Preferred Pool option is set to no, the selected hot spare disk drive can be:
- A hot spare disk drive that is in the hot spare pool that contains the failing member disk drive
- A hot spare disk drive that is in hot spare pool zero
- A hot spare disk drive that is in any other hot spare pool
If more than one hot spare disk drive is available in a pool, and the hot spare disk drives are of different sizes, the smallest appropriate disk drive is selected.
Solving Hot Spare Pool Problems
Hot spare pool problems are indicated by the state of the pool and by error codes in the system error log. When configuring or reconfiguring hot spare pools, it is recommended that you use the state of the hot spare pool to help guide your actions. If hot spare pool problems occur during normal operations, use the Service Request Number (SRN) that is generated by the diagnostics to guide your actions.
To display the operating state of a hot spare pool:
1. Type smitty ssaraid and press Enter.
2. Select List Status of Hot Spare Pools.
3. Select the SSA adapter that you want to inspect.
The status of the hot spare pool is displayed:
COMMAND STATUS
Command: OK stdout: yes stderr: no
Before command completion, additional instructions may appear below.
ssa1
Pool Components Spares Configured Minimum Status
pool_A0 0 i 1 1
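The selection rules in the excerpt above can be sketched as a search over the available hot spares. Several points are assumptions rather than statements from the manual: the three pool choices are tried in the order in which they are listed; "appropriate" is taken to mean at least as large as the required size; when the preferred-pool option is yes, only the pool of the failing member is searched; and the function name and the dictionary keys are hypothetical.

```python
def select_hot_spare(spares, failing_pool, required_size, preferred_only):
    """Pick a hot spare for a failing member disk drive (illustrative only).

    spares: list of dicts with hypothetical keys "name", "pool", and "size".
    Returns the chosen spare, or None if no appropriate spare exists.
    """
    def smallest_fit(pool_matches):
        # Keep spares in the wanted pools that are large enough (assumed meaning
        # of "appropriate"), then take the smallest of them.
        fits = [s for s in spares
                if pool_matches(s["pool"]) and s["size"] >= required_size]
        return min(fits, key=lambda s: s["size"]) if fits else None

    # First choice: the pool that contains the failing member disk drive.
    choice = smallest_fit(lambda pool: pool == failing_pool)
    if choice is not None or preferred_only:
        return choice
    # Next: hot spare pool zero (search order is an assumption).
    choice = smallest_fit(lambda pool: pool == 0)
    if choice is not None:
        return choice
    # Last: any other hot spare pool.
    return smallest_fit(lambda pool: pool not in (failing_pool, 0))


if __name__ == "__main__":
    pool_spares = [
        {"name": "pdisk1", "pool": 1, "size": 9},
        {"name": "pdisk2", "pool": 1, "size": 4},
        {"name": "pdisk3", "pool": 0, "size": 4},
    ]
    print(select_hot_spare(pool_spares, 1, 4, True))
```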
369. ntenance Information 14 from step E3 Are any disk drives listed as SSA physical disks that are rejected NO YES A disk drive has not been detected by the adapter Go to step 115 onl Run diagnostics in System Verification mode to all the disk drives that are listed as rejected Run the Certify service aid see to all the disk drives that are listed as rejected If problems occur on any disk drive exchange that disk drive for a new disk drive see continue from step ladin this procedure A disk drive that is listed as rejected is not necessarily failing For example the array might have rejected the disk drive because a power problem or an SSA link problem caused that drive to become temporarily unavailable Under such conditions the disk drive can be reused If you think that a disk drive has been rejected because it is failing check the error log history for that disk drive For example if you suspect pdisk3 type on the command line ssa_ela 1 pdisk3 h 5 This command causes the error log for pdisk3 to be analyzed for the previous five days If a problem is detected an SRN is generated Type smitty ssaraid and press Enter Select Change Show Use of an SSA Physical Disk and for all disks that you have tested or exchanged change the Current Use to Array Candidate Disk Select Change Member Disks in an SSA RAID Array Select Swap Members of an SSA RAID Array Select the degraded hdisk Referrin
370. ntenance procedures that are given either in this book or in the Service Guide for the SSA subsystem See saris Bag eet Need for more information about SRNs Error log analysis can be started in several ways e If you run diagnostics in Problem Determination mode to an SSA device one of the following procedures occurs An error log analysis is performed for all SSA devices if any SSA device has a permanent PERM error in the error log An error log analysis is performed for that device before the physical device is tested If errors are found no test is performed on the hardware Error log analysis is performed every 24 hours by the run_ssa_ela cron see e You can use the diag command to run error log analysis On the command line enter diag ecd device Error log analysis runs for the selected device If the analysis determines that service action is needed a message is displayed This message indicates that a problem was detected and requests that diagnostics be run to that device e You can run error log analysis to all SSA devices On the command line enter ssa_ela A list of SRNs for all SSA devices that need service action is displayed You can run error log analysis for selected SSA devices On the command line enter ssa_ela device The device that is selected can be an SSA adapter a pdisk or an hdisk If an hdisk is selected the error log analysis runs for the adapters that control
371. nter List Disks in an SSA RAID Array List Hot Spares List Rejected Array Disks List Array Candidate Disks List System Disks Identify Disks in an SSA RAID Array Identify Hot Spares Identify Rejected Array Disks Identify Array Candidate Disks Identify System Disks SSA RAID Manager Move cursor to desired item and press Enter ssaQ Available 00 04 IBM SSA 160 SerialRAID Adapter 14109100 F1 Help F2 Refresh F3 Cancel F8 Image F1Q Exit Enter Do Find n Find Next Select the adapter whose rejected disk drives you want to list Chapter 6 Using the RAID Array Configurator 113 3 A list of rejected disk drives is displayed 2 COMMAND STATUS Command OK stdout yes stderr no Before command completion additional instructions may appear below pdisk3 0004AC5119E000D rejected n a 1 1G Physical disk pdisk5 08005AEA030D00D rejected n a 2 3G Physical disk Fl Help F2 Refresh F3 Cancel F6 Command F8 Image F9 Shel F1O Exit Find n Find Next Ne 114 Users Guide and Maintenance Information Listing Array Candidate Disk Drives This option allows you to list disk drives that are available for adding to an array E For fast path type smitty Icssaraid and press Enter Otherwise a Select List Identify SSA Physical Disks from the SSA RAID Arrays menu b Select List Array Candidate Disks A list of adapters is displayed in a window r List Identify SSA Physical Disks Move cursor to desired
372. nto pdisk0
- Using the u flag: ssadload -u
With this flag, the command identifies the latest level of SSA disk drive microcode that is available in the /etc/microcode directory. It then ensures that all the disk drives are using microcode that is at that level or at a higher level. If it finds a disk drive that is using a lower level of microcode, the command downloads the latest level of microcode to that disk drive.
- Using the s flag: ssadload -s
With this flag, the command lists the existing levels of microcode of the available disk drives.
- Using the s and a adapter flags: ssadload -s -a ssa0
With these flags, the command lists the existing levels of microcode of all the available disk drives that are connected to adapter ssa0.
- Using the u, a adapter, and p flags: ssadload -u -a ssa0 -p & ssadload -u -a ssa1 -p
With these flags, the command causes the update mechanism to run in parallel and update concurrently the microcode on all the available disk drives that are connected to adapters ssa0 and ssa1.
ssa_ela Command
Purpose: To look for the most significant error in the error log.
Syntax:
ssa_ela
ssa_ela -l Device -h timeperiod
ssa_ela -l pdisk
ssa_ela -l hdisk
ssa_ela -l adapter
Description: The ssa_ela command with no flags scans the error log and looks for all SSA errors
373. o a value of 3, type: chdev -l ssar -a node_number=3
Note: The command fails if any target-mode SSA devices are active.
By default, the value of node_number is 0. This value has particular importance because it is not possible to exclude a using system with node number 0 from access to the disk drive. Therefore, if a disk drive is moved from a machine that has been using fencing to a machine that has not been using fencing, the new machine can communicate with the disk drive.
If a using system attempts to use the open subroutine to open a disk drive to which it is not allowed access, the return code is -1, and the global variable errno is set to the value ENOCONNECT. Similarly, if an application already has an SSA logical disk open, but that logical disk has been fenced out since the open, calls to the read or write subroutine fail with errno set to ENOCONNECT.
The hardware fencing commands provide a method by which you can break through a fence. You can use the SSADISK_ISALCMD ioctl operation to give the command, but you must first open the disk drive. To open a disk drive from which the using system has been excluded, use the openx subroutine and specify the SSADISK_FENCEMODE extension flag, as described in the section on SSA disk device driver device-dependent subroutines. While the disk drive is open in this mode, no read or write operations are permitted.
If fencing has excluded a using system from access to a disk drive but that disk drive is
374. o not switch off the using system when servicing an SSA loop. Enclosure power cables and external SSA cables that connect the devices to the using system can be disconnected while that system is running.
- To isolate the FRUs, do the actions and answer the questions given in the MAPs.
- When instructed to exchange two or more FRUs in sequence:
1. Exchange the first FRU in the list for a new one.
2. Verify that the problem is solved. For some problems, verification means running the diagnostic programs (see the using-system service procedures).
3. If the problem remains:
a. Reinstall the original FRU.
b. Exchange the next FRU in the list for a new one.
4. Repeat steps 2 and 3 until either the problem is solved, or all the related FRUs have been exchanged.
5. Do the next action indicated by the MAP.
Attention: Disk drives are fragile. Handle them with care, and keep them well away from strong magnetic fields.
MAP 2010: START
This MAP is the entry point to the MAPs for the adapter. If you are not familiar with these MAPs, read …
You might have been sent here because:
- The system problem determination procedures sent you here.
- A problem occurred during the installation of a disk subsystem or a disk drive.
- Another MAP sent you here.
- A customer observed a problem that was not detected by the system problem determination procedures.
Attention: Unless the using syst
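The procedure for exchanging FRUs in sequence, given in the excerpt above, is in effect a loop over the FRU list that stops when verification succeeds. The sketch below models that loop as a dry run that records the service actions; the function name, the log wording, and the problem_solved_with callback (which stands in for the verification step) are hypothetical.

```python
def exchange_frus_in_sequence(frus, problem_solved_with):
    """Return the ordered list of service actions for a FRU list.

    frus: FRU names in the order given by the MAP.
    problem_solved_with: callable that takes a FRU name and returns True if
    verification succeeds after that FRU has been exchanged (stand-in for
    running the diagnostic programs).
    """
    actions = []
    for fru in frus:
        # Exchange the current FRU in the list for a new one.
        actions.append(f"exchange {fru}")
        # Verify that the problem is solved.
        if problem_solved_with(fru):
            actions.append("problem solved")
            break
        # The problem remains: reinstall the original FRU, then try the next one.
        actions.append(f"reinstall original {fru}")
    # In every case, finish with the next action indicated by the MAP.
    actions.append("do next action indicated by the MAP")
    return actions


if __name__ == "__main__":
    for line in exchange_frus_in_sequence(
            ["SSA adapter card", "external SSA cable"],
            lambda fru: fru == "external SSA cable"):
        print(line)
```

Reinstalling the original FRU before moving on keeps known-good parts in the system, so that at most one replaced part is in place when the fault is finally isolated.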
375. o the SSA RAID Arrays SMIT Menu
Chapter 5 Hot Spare Management
  Deciding How to Configure Hot Spare Disk Drive Pools
  Choosing How Many Hot Spare Disk Drives to Include in Each Pool
  Choosing the Error Threshold Alarm Level for a Hot Spare Pool
  Rules for Hot Spare Disk Drive Pools
  Solving Hot Spare Pool Problems
Chapter 6 Using the RAID Array Configurator
  Installing and Configuring SSA RAID Arrays
  Getting Access to the SSA RAID Arrays SMIT Menu
  Adding an SSA RAID Array
  Deleting an SSA RAID Array
  Creating a Hot Spare Disk Drive
  Changing or Showing the Status of a Hot Spare Pool
  Showing the Disks That Are Protected by Hot Spares
  Listing the Disks That Are in a Hot Spare Pool
  Adding a New Hot Spare Pool
  Adding Disks to or Removing Disks from a Hot Spare Pool
  Dealing with RAID Array Problems
  Getting Access to the SSA RAID Array SMIT Menu
  Identifying and Correcting or Removing Failed Disk Drives
  Installing a Replacement Disk Drive
  Using Other Configuration
376. o the SSA loop adapters on the SSA loop are using versions of microcode Action that do not provide support for 1 If the changes were not planned restore the system to its original spare pools configuration If the changes were planned do either of the following actions e Update the microcode to the latest level if the SSA adapter is an Advanced SerialRAID Adapter e For all hot spare pools that are connected to this RAID Manger a Type smitty ssaraid and press Enter b Select Change Show Delete a Hot Spare Pool c Remove all member disk drives from the pool 49800 Description A different adapter has been detected on each loop SSA loop configuration Action Go to g and observe the configuration rules for this adapter Correct the configuration problem Chapter 18 SSA Problem Determination Procedures 433 SRN Problem Possible Causes 49950 Description An array copy disk drive is missing The array copy is Possible FRUs degraded If the missing disk drive is restored to the network or Device 100 exchanged for a new disk drive the copy rebuilds A disk drive might not be available for one of the following reasons e The disk drive has failed e The disk drive has been removed from the subsystem e An SSA link has failed e A power failure has occurred Action Go to MAP 8 4A100 Description The adapter cannot initialize a disk drive The failing disk Poss
377. occurred Action Exchange the FRU for a new FRU Possible FRUs 50007 Description The OCC detected an internal error Action Exchange the FRU for a new FRU 50008 Description Unable to read or write the POS registers or PCI configuration space Action Exchange the FRU for a new FRU Possible FRUs SSA adapter card 100 50010 Description An SSA adapter or device drive protocol error has occurred Action Go to before exchanging the FRU 50012 Description The SSA adapter microcode has hung Action Run diagnostics in System Verification mode to the SSA adapter If the diagnostics fail exchange the FRU for a new FRU If the diagnostics do not fail go to aaas exchanging the FRU 50013 Description The SSA adapter card has failed Action Exchange the FRU for a new FRU Possible FRUs SSA adapter card 100 50100 Description An attempt was made to log an error against a pdisk that is not available to the using system Action This problem has occurred for one of the following reasons e A user has deleted a pdisk from the system configuration In such an instance the hdisk that is using the pdisk continues to operate normally If the disk drive tries to log an error however this SRN 50100 is produced Give the cfgmgr command to return the pdisk to the system configuration e A device has tried to log an error during system configuration To fi
378. od systemname pdisk2 AC1DBE11 Ze 23 Good systemname pdisk3 AC1DBEF4 3a 2 Good systemname pdisk7 AC50AE58 Ae Good systemname pdisk12 AC7C6E51 Sas 0 Good systemname pdisk0 AC706E9A O85 Good systemname pdisk1l AC1DEEE2 TY 4 Good systemname pdisk10 AC1DBE32 2 23 Good MORE 4 F3 Cancel F10 Exit Ne aN Note Scroll the display to see all the connected disk drives User s Guide and Maintenance Information Example 2 Broken Loop Cable Removed Each disk drive normally communicates with the adapter through one data path Because data can pass round the loop in either direction the adapter automatically reconfigures the loop to enable communication to continue to each disk drive if the loop becomes broken In Figure 57 on page 404 disk drives 1 through 8 should be connected to connectors A1 and A2 of the SSA adapter J but the loop is broken because the SSA cable has been disconnected from connector A2 Disk drives 9 through 12 are connected to connectors B1 and B2 of the same SSA adapter Disk drives 13 through 16 are connected to connectors A1 and A2 of a different SSA adapter H Although the broken loop is reported as an error all the disk drives can still communicate with the using system Disk drives 1 through 8 can communicate through connector A1 of the SSA adapter J Disk drives 9 through 12 can communicate through connectors B1 and B2 of the same SSA adapter normal loop disk drives 13 through 16 can communica
379. od state Split Array Resolution Primary System 2 SSA Adapter Secondary 1 Secondary 2 Figure 36 RAID 10 Failure of a Host System and a Primary Disk Drive 202 User s Guide and Maintenance Information Array is Offline because Adapter Is Not Known to the Remaining Half of the Array When this condition exists the host system generates SRN 48755 To maintain data consistency SSA RAID 1 and RAID 10 arrays keep a record of the adapters to which they are connected If exactly one half of the array is connected to a different adapter the array remains in the Offline state unless you take specific action to make the array available This type of problem can be caused by Incorrect reconfiguration of an array Simultaneous failure of an SSA adapter and a disk drive Failure of an Adapter and a Disk Drive In Eigure 37 an SSA adapter and a disk drive have failed When a new adapter is installed the array remains in the Offline state until the state of the Split Array Resolution flag is changed back to Primary System 1 SSA Adapter Primary 1 Secondary 1 Uncasbsicsecsotarcecoceunecced Figure 37 Failure of an Adapter and a Disk Drive Chapter 8 Split Site Management 203 Moving an Array between Systems In Figure ad exactly half of a RAID 10 array is being moved from its original connections adapter A and adapter B to a new configuration where it is connected to adapter
380. ode numbers against remote node numbers. The ssavfynn command sends all error messages to stderr. It sends all configuration problem messages to stdout. Chapter 16 Using the SSA Command Line Utilities 371. ssaxlate Command. Purpose: To translate between logical disks (hdisks) and physical disks (pdisks). Syntax: ssaxlate -l LogicalDiskName or ssaxlate -l PhysicalDiskName. Description: If the parameter is a logical disk, the output is a list of names of the physical disks that provide support for that logical disk. If the parameter is a physical disk, the output is a list of names of the logical disks that use that physical disk. Flags: -l DiskName specifies the logical or physical disk. Chapter 17 SSA Service Aids. Note: For some problems, you can use the SSA command line utilities instead of the SSA service aids. For information about the command line utilities, see Chapter 16, Using the SSA Command Line Utilities. SSA service aids are resident in the using system. They help you to service SSA subsystems. This section describes those service aids and tells how to use them. Attention: Do not run the service aids from more than one using system at a time; otherwise, unexpected results might occur. The SSA service aids are: Set Service Mode: This service aid enables you to determine the location of a particular disk drive on the SSA loop and to remove that disk drive from the loop. Link Verification: This service aid tells
381. of function A message that can be displayed Device specific Z0 The size in megabytes of the installed synchronous dynamic random access memory SDRAM modules Device specific Z1 If the adapter contains a pluggable fast write cache module Z1 indicates the cache size in megabytes Device specific Z2 The SSA unique ID that is used to identify this adapter 316 User s Guide and Maintenance Information Adapter Power On Self Tests POSTs Power on self tests POSTs are resident in the SSA adapter These tests ensure that the adapter does not run the functional code until the hardware that uses the code has been tested The hardware consists of only the adapter card memory module and if installed Fast Write Cache Option Card and battery Some POST failures cause the adapter to become unavailable to the using system Other POST failures allow the adapter to be available although some function might not be enabled The particular tests that are run are related to the type of SSA adapter that is being used If a POST fails and prevents the adapter from becoming available exchange the adapter card for a new one If a POST fails but does not prevent the adapter from becoming available an error is logged That error indicates which FRUs must be exchanged for new FRUs Chapter 14 SSA Adapter Information 317 318 User s Guide and Maintenance Information Chapter 15 Removal and Replacement Procedures Exchanging Disk Drives
382. of microcode on all available SES enclosures, give the command: ssa_sesdld -u. To install microcode file coral1014.hex on enclosure0, give the command: ssa_sesdld -d enclosure0 -f /etc/microcode/coral1014.hex. To install microcode file coral1014.hex on all available SES enclosures whose existing level of microcode is lower than coral1014, give the command: ssa_sesdld -f /etc/microcode/coral1014.hex -u. To install microcode file coral1014.hex on enclosure0 only if the existing level of microcode on enclosure0 is lower than coral1014, give the command: ssa_sesdld -d enclosure0 -f /etc/microcode/coral1014.hex -u. ssaadap Command. Purpose: To list the adapters to which a logical disk or physical disk is connected. Syntax: ssaadap -l LogicalDiskName or ssaadap -l PhysicalDiskName. Description: The output is the list of SSA adapters to which the logical or physical disk is connected. If the list contains more than one adapter, the first adapter in the list is the primary adapter. Flags: -l DiskName specifies the logical or physical disk. ssacand Command. Purpose: To display the unused connection locations for an SSA adapter. Syntax: ssacand -a AdapterName {-P | -L}. Description: The ssacand command lists the available connection locations of an SSA adapter. These connection locations are related to disk drives that, although connected to the adapt
383. oint that is to be used for the new file systems If no mount point is assigned the default value is used Mount new file systems If you select no the file systems are not mounted when the copy is uncoupled Chapter 7 Copying Data from Arrays and from Volume Groups 177 If you select yes the file systems are mounted when the copy is uncoupled If you select read only the file systems are mounted for read only when the copy is uncoupled Synchronize the file systems Select this option to schedule a synchronization operation before the copy is uncoupled 178 User s Guide and Maintenance Information List All Copy Candidates For fast path type smitty copy_lstcopycand and press Enter Otherwise select List All Copy Candidates from the Array Copy Services menu The following information is displayed 4 _ COMMAND STATUS Command OK stdout yes stderr no Before command completion additional instructions may appear below Array Array Type Coupled Disks Copy State hdisk3 raid_10 None Not Copying 0 hdisk4 raid_10 pdisk8 amp Copying 73 pdisk9 pdisk10 Fl Help F2 Refresh F3 Cancel F6 Command F8 Image F9 Shel1 F1O Exit Find n Find Next J The columns of information displayed on the screen have the following meanings Array The hdisk name of the RAID array This field contains the array serial number if the array is not configured Array Type The type of array listed This is raid_1 or raid_10 Coupled
384. olume group, logical volume, or file system to the copy disk drives; uncouples the copy disk drives from the parent array; and renames the volume group and logical volumes. The complete procedure for synchronizing the data, copying to separate disk drives, and uncoupling the copy can therefore be done from a single command. You can use SMIT menus or ssaraid commands to copy and uncouple RAID 1 and RAID 10 arrays that hold logical volumes, as you do for raw hdisks. SMIT commands or ssaraid commands, however, operate only on whole hdisks. If you use these commands, you must ensure, in a separate operation, that all cached data in system memory has been synchronized to the array before you run the copy operation. The recommended procedure for copying logical volumes is to use the ssa_make_copy command rather than SMIT or ssaraid commands. The ssa_delete_copy command is provided to enable you to delete the copy after it has been backed up. You can, for example, use the ssa_make_copy and ssa_delete_copy commands for nightly backups when an automated closed circle of operations is required to: 1. Prepare disk drives to copy the volume group that is held on a RAID 1 or RAID 10 array. 2. Uncouple the copy of the array. 3. Recreate the copy of the volume group. 4. Do either of the following: remove the copy volume group, delete the copy array, and repeat from step 1; or remove the copy volume group, recouple the copy array to the parent volume
385. oncurrent command interrupt handler routine might need to disable interrupts at INTCLASSO if it is expected to use concurrent mode on SSA disk drives and on other types of disk drives The other types of disk drives need their own device drivers to provide support for concurrent mode A kernel extension that uses the DD_CONC_REGISTER ioctl must issue a DD_CONC_UNREGISTER ioctl before it closes the SSA disk drive Chapter 13 Using the Programming Interface 289 SSA Disk Fencing SSA disk fencing is a facility that is provided in the SSA subsystem It allows multiple using systems to control access to a common set of disks Using the fencing commands that are provided by the hardware you can prevent particular using systems from accessing a particular disk drive Each disk drive has an access list that is independent of the access lists for the other disk drives Fencing is a function that is provided by the hardware and manipulated by hardware commands The device driver also has some effect The SSA disk device driver provides support for fencing by allowing the SSADISK_ISALCMD ioctl operation to issue the FN_ISAL_FENCE command to SSA logical disk drives The FN_ISAL_FENCE command is defined in the Technical Reference for the adapter To use fencing set the node_number attribute of the ssar router to a different value on each using system that is included in fencing Use the chdev command to do this For example to set the node_number t
386. one or more of its components have failed. RAID 0 provides data availability equivalent to that of a standard disk drive, but with better performance for long data transfer operations. RAID 1 provides good data availability because data is mirrored on two member disk drives, as it is with RAID 10. RAID 1 arrays, however, have only two member disk drives. Member disk drives of a RAID 1 array can be configured to be in separate domains. Separate domains ensure that data remains available if, for example, a complete domain fails through loss of power. RAID 5 provides good data availability with good performance for workloads that include many read and write operations. RAID 10 provides good data availability, and performance that is better than that provided by RAID 5, especially when a member disk drive has failed. For long data transfer operations, performance is better than that provided by RAID 1 because data is striped across member disk drives. For short data transfer operations, performance is better because operations are distributed across the member disk drives and the effect of skew is reduced. Member disk drives of a RAID 10 array can be configured to be in separate domains. Separate domains ensure that data remains available if, for example, a complete domain fails through loss of power. Disk Drives That Are Not in Arrays. Disk drives that are connected to an SSA RAID adapter do not need to be members of an array. The S
387. onfiguration problem Chapter 18 SSA Problem Determination Procedures 431 SRN Problem Possible Causes 49520 10 11 12 13 14 15 16 17 18 19 20 2i Description Hot spare tuning has been lost Action Te 2 3 Type smitty ssaraid and press Enter Select List Status of Hot Spare Pools Select the adapter that logged the error If the adapter is not known select all adapters Note the RAID Manager adapter and pool number of spare pools that have a status of mixed Select List Components in a Hot Spare Pool Select the RAID Manager and pool that you noted earlier Note the array name hdisk and member disk drive pdisk that have a status of wrong_pool This array member disk drive pdisk is exchanged later after all rejected disk drives have been exchanged Select Change Show Use of an SSA Physical Disk and select the RAID manager that you noted earlier Note all the disk drives that are listed as rejected Exchange all failed and rejected disk drives for new disk drives Select Change Show Use of Multiple SSA Physical Disks Change the Current Use parameter of the exchanged disk drives to Hot Spare Disks If the user has a record of how disk drives are assigned to the hot spare pools use that information to assign the hot spare disk drives to the correct pools If the user has no such record see De g g ge_45 to determine the best hot spare
388. ool a hot spare disk drive is selected from any other hot spare pool that contains a hot spare disk drive Chapter 12 Using the SSA Command Line Interface for RAID Configurations 249 Other Change Attributes for Physical Disk Drives You can specify the following attributes with the a option when you are using the ssaraid command with the H option to change a physical disk drive use system spare free With the attribute set to system The physical disk drive can be used directly by the operating system If you specify also the d option a corresponding hdisk device is created for the physical disk drive With the attribute set to spare The physical disk drive becomes a hot spare disk drive It is therefore available for addition to any arrays on the RAID manager that are in the Exposed state Specify also the u option to ensure that no corresponding hdisk device exists for the physical disk drive With the attribute set to free The physical disk drive has no use assigned to it It is therefore available for any new arrays that are to be created Specify also the u option If you use the ssaraid command with the I option to display information about a physical disk drive the following values for the use attribute can also be displayed member The disk drive is a member of an array rejected The disk drive was a member of an array It was rejected from the array because it reported a problem You cannot change the
389. operation is to be performed. See "Object Types". Attribute: The name of the attribute. Value: The value of the attribute. DeviceName: The user-preferred name of the newly created device. InstructType: The swap, remove, or add action that is to be performed on the object. See "Instruct Types".
Options. You can use the following options with the ssaraid command:
Option  Description
-?   Print a short usage message.
-A   Perform an action on an object.
-C   Create an object.
-D   Delete an object.
-H   Change an object.
-I   Report information on an object.
-M   List all the available SSA RAID managers that are on the system.
-S   Show the hot spare pool information for an SSA RAID array.
-Ya  List all array types.
-Yc  List all create types.
-Yo  List all objects.
-a   An attribute and its desired value.
-c   List the candidates for an object type.
-d   Create the device for the specified RAID object.
-e   List copy disk drives that are coupled with the specified RAID object.
-h   Print column headers when showing object information in summary format.
-i   The instruct action to perform.
-k   The device name to use.
-l   The name of the SSA RAID manager to use.
-m   List the member objects for the named object.
-n   The name of an object, for example, a RAID array or member disk drive.
-o   Information is presented in colon separated
390. opied from another RAID array use the force yes flag to force this RAID Copy to be coupled with the array Note that the data that is stored on the RAID Copy array is lost Uncouple Action Attributes RAID 1 and RAID 10 Only You can specify the following attribute with the a option when you are using the ssaraid command with the A and i uncouple options to do actions on a RAID 1 or RAID 10 array force yes no If you use force yes when you uncouple a coupled RAID array the RAID Copy array is destroyed The RAID Copy disk drives become free disk drives and the copied data is lost 252 User s Guide and Maintenance Information Return Codes Q o OANODURWON CO Description Successful Some changes made but finally not successful General problem accessing the object data manager ODM Specified object file record ODM object not found Heap allocation failed Open ioctl failure for RAID manager Bad Transaction result Array already known to cfgmgr System call failed Internal logic error Method not found not executable or not correct Problem communicating with back end method Problem with environment variable message catalog and so on Problem with self defining structure for RDVs The argument in the command line is not valid and given to back end Problem with FC_CandidateList transaction Problem with FC_ResrcList transaction Problem with FC_ResrcView transaction Chapter 12 Using th
391. opy operation has completed. Alternatively, you can check the status of the copy operation by listing the coupled arrays that are being copied, for example: ssaraid -I -l ssa2 -t raid_1 -a copy_status=copying or ssaraid -I -l ssa2 -t raid_1 -a copy_status=good. If the fast write function was enabled on the parent array, it is recommended that, to increase the speed of the uncouple operation, you suspend fast write caching and flush any cached data to the array. Type: ssaraid -H -l ssa2 -n hdisk5 -a fw_suspended=true. If copy_status is good, you can now uncouple the copy array from the parent array. Type: ssaraid -A -l ssa2 -i uncouple -n hdisk5. If the fast write function is enabled for the parent array, the uncouple operation checks whether fast write caching has been suspended. If you did not suspend fast write caching beforehand, the uncouple operation now suspends the caching for the parent array and flushes the fast write cache data to disk. When the copy array has been uncoupled, the uncouple operation restarts caching for the parent array. A message is displayed, for example: 185439188B4F4CT created, 185439188B4F4CT changed, 8A8E39746BD3C4G uncoupled. 185439188B4F4CT is the new RAID Copy array serial number. If you suspended caching yourself, restart it. Type: ssaraid -H -l ssa2 -n hdisk5 -a fw_suspended=false. 8. The newly created RAID Copy array is created as a free disk dr
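The ordering described in this excerpt (check the copy status, optionally suspend fast write caching, uncouple, then restart caching only if you suspended it yourself) can be summarized in a small planning function. This is a minimal sketch in Python, not part of the manual: the function name `uncouple_plan` and the step wording are hypothetical, and only the `copy_status` and `fw_suspended` attribute names come from the text.

```python
from typing import List


def uncouple_plan(copy_status: str,
                  fast_write_enabled: bool,
                  suspend_first: bool = True) -> List[str]:
    """Return the ordered manual steps for uncoupling a copy array.

    Hypothetical helper: it only encodes the ordering described in the
    text and does not call any SSA command.
    """
    # The copy must be complete (copy_status good) before uncoupling.
    if copy_status != "good":
        return ["wait for the copy operation to complete"]

    steps = []
    # Suspending fast write caching first is recommended, not required.
    manual_suspend = fast_write_enabled and suspend_first
    if manual_suspend:
        steps.append("set fw_suspended to true on the parent array")
    steps.append("uncouple the copy array from the parent array")
    # If the uncouple operation suspended caching itself, it also restarts
    # caching itself; a manual restart is needed only after a manual suspend.
    if manual_suspend:
        steps.append("set fw_suspended to false on the parent array")
    return steps
```

For example, `uncouple_plan("good", fast_write_enabled=True)` yields three steps, while passing `suspend_first=False` yields only the uncouple step, because the uncouple operation then handles the cache on its own.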
392. or the function keys remain the same as those shown in the screen above b For some versions of AIX and for stand alone diagnostics the format of the service aid displays might be slightly different from that shown in this chapter Functionally however the displays remain the same 376 User s Guide and Maintenance Information 5 Select the service aid that you require then go to the relevant instructions in this chapter Chapter 17 SSA Service Aids 377 Set Service Mode Service Aid The Set Service Mode service aid enables you to determine the location of a particular disk drive and to remove that disk drive from the unit in which it is installed It causes the Check light of that disk drive to come on for identification and stops all SSA loop activity through the disk drive It also causes the Subsystem Check light if present of the unit containing the selected disk drive to come on Only one disk drive at a time can be in Service Mode Before using this service aid you must make the selected disk drive unavailable to the using system otherwise an error occurs SSA devices can be maintained concurrently that is they can be removed installed and tested on an SSA loop while the other devices on the loop continue to work normally If a disk drive has its Check light on the pdisk might not be configured If the pdisk is not configured you cannot select Service Mode Under these conditions you can remove the disk drive fro
393. ore completion a Note the number of unsynced parity strips and unbuilt data strips b Press the Cancel key to leave the status display c Wait for a few moments then reselect the status display d Again note the number of unsynced parity strips and unbuilt data strips If the numbers are lower than those that you noted earlier the rebuilding operation is running Wait for the rebuilding operation to complete before you continue If the numbers have not changed the rebuilding operation has stopped e If this is the first time you have been through this step while solving this particular problem return to step Fon paoe aad Otherwise 1 Delete the array see CDeleting an 2 Run the Certify service aid see to each member disk drive 4 Go to MAP the repair from step Bah Have disk drives been going into the rejected state with no other failure indications NO Go to step Bd YES This problem can occur if an array is accessed before all the member disk drives are available Verify that using system procedures ensure that the power system switches on power to all the disk drives before or when it switches on the power to the using system from step Bi Was SRN 46000 logged but no error found when diagnostics were run in System Verification mode NO YES Go to step B3 on page 474 An array was in the Offline state but is now available Verify that using system procedures ensure that the power
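The check in steps a through d of this excerpt compares two readings of the unsynced parity strip and unbuilt data strip counts. A minimal Python sketch of that comparison follows; the names `StripCounts` and `rebuild_state` are hypothetical, and the handling of a reading in which a count rises is an assumption, because the text covers only the "lower" and "unchanged" cases.

```python
from typing import NamedTuple


class StripCounts(NamedTuple):
    """One reading taken from the array status display (hypothetical type)."""
    unsynced_parity: int
    unbuilt_data: int


def rebuild_state(earlier: StripCounts, later: StripCounts) -> str:
    """Classify the rebuilding operation from two successive readings."""
    if later == earlier:
        # The numbers have not changed: the rebuilding operation has stopped.
        return "stopped"
    if (later.unsynced_parity <= earlier.unsynced_parity
            and later.unbuilt_data <= earlier.unbuilt_data):
        # At least one number is lower and none is higher: still running.
        return "running"
    # Assumption: a rising count is not described in the text.
    return "unknown"
```

A reading of `StripCounts(120, 40)` followed by `StripCounts(90, 40)` is classified as running, so you would wait for the rebuilding operation to complete before continuing.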
394. orking through other menus. In this chapter, the fast path command for a particular option is given at the start of the description of that option. Notes: 1. Although this book always refers to the smitty commands, you can use either the smitty command or the smit command. The procedures that you follow remain the same whichever of the two commands you use. If you send the smit command from a graphics terminal, however, the menus are displayed slightly differently from those shown in this book. If you are not familiar with the selection of items from the graphics versions of the menus, use the smitty command. The menus will then appear as shown in this book. 2. Different microcode levels might cause slightly different versions of the menus to be displayed. 3. If you use fast path commands, you might need to go through intermediate steps that are not shown in this book. Also, some menus might be displayed slightly differently from those shown in this book. Getting Access to the SSA RAID Arrays SMIT Menu. 1. For fast path access to the SSA RAID Array SMIT menus, type smitty ssaraid and press Enter. Otherwise: a. Type smitty and press Enter. The System Management menu is displayed. b. Select Devices. The Devices menu is displayed. c. Select SSA RAID Arrays. The SSA RAID Arrays menu is displayed: SSA RAID Arrays. Move cursor to desired item and press Enter. List All Defined
395. ormed correctly The data on the array is not RAID 1 array or a RAID 10 constant array has been split exactly in half and a write operation has Action See K before you been performed independently attempt to recover the array to both halves of the array 48800 Description The Invalid strip table is full Because of failures on multiple Possible FRUs member disk drives of an array at least 128 blocks of data are not Device 100 accessible Other data on the array might still be readable Action 1 Type smitty ssaraid and press Enter 2 Select List Status of All Defined SSA RAID Arrays 3 The failed hdisk is listed with Invalid data strips Make a note of the hdisk number 4 Ask the customer to make a backup of all data that is still readable and then to delete the failed array 5 When the array has been deleted run the following to each disk drive that was a member of the failed array Diagnostics in System Verification mode e Certify service aid 6 If in the previous step you found any disk drive failures correct those failures 7 Tell the customer that the array can now be recreated 48900 Description An array is not available multiple devices have failed None Action Run diagnostics and the Certify service aid to all the disk drives that were used to create the array If problems occur correct those problems before you attempt to recreate the array Chapter 18 SSA Problem Determination Procedures
396. ow Page Splits Enable Fast Write F1 Help F2 Refresh F3 Cancel F5 Reset F6 Command F7 Edit F9 Shel F1O Exit Enter Do N Add an SSA RAID Array Entry Fields ssaQ raid_0 yes no F4 List F8 Image For the meanings of the fields see page led User s Guide and Maintenance Information If you select RAID 1 the following menu is displayed a SSA RAID Manager RAID Array Type Primary Disk Secondary Disk Split Array Resolution Enable Use of Hot Spares Allow Hot Spare Splits Allow Page Splits Initial Rebuild Enable Fast Write Fl Help F2 Refresh F5 Reset F6 Command F9 Shel F1O Exit X Add an SSA RAID Array Type or select values in entry fields Press Enter AFTER making all desired changes Choose Hot Spare only from Preferred Pool F3 Cancel F7 Edit Enter Do Entry Fields ssa0 raid_l Primary yes no no yes no no F4 List F8 Image teeter eet For the meanings of the fields see page led Chapter 6 Using the RAID Array Configurator 63 If you select RAID 5 the following menu is displayed r Add an SSA RAID Array Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields SSA RAID Manager ssa RAID Array Type raid_5 Member Disks Strip Size KB 64 z Enable Use of Hot Spares yes a Choose Hot Spare only from Preferred Pool no Allow Page Splits yes et Enable Fast Write no F F1l Help F2
397. ow that the copy array is to be used as a third copy The copy array is made offline and is no longer accessible D daemon In the operating system a program that runs unattended to perform a standard service Some daemons are triggered automatically to perform their task others operate periodically Synonymous with demon deconfigure To remove a disk drive from being a member of an array Degraded state The state that a RAID array enters if while in the Exposed state it receives a write command See also Exposed state descriptor In the object data manager ODM a named and typed variable that defines one characteristic of an object device driver 1 A file that contains the code needed to use an attached device 2 A program that enables a computer to communicate with a specific peripheral device 3 A collection of subroutines that control the interface between I O device adapters and the processor domain That part of a computer network in which the data processing resources are under common control DMA Direct memory access E EEPROM Electrically erasable read only memory enclosure A device or unit that contains disk drives A disk subsystem for example a 7133 489 Exposed state The state that a RAID array enters if a member disk drive becomes missing logically or physically from that array F Failed status The disk drive is not working fencing SSA disk fencing is a facility
398. pare_pool 249 force yes no 248 hot spare disk drives spare_pool 249 use system free 248 Change Member Disks in an SSA RAID Array option 137 Change Show Attributes of an SSA RAID Array 135 Change Show Attributes of an SSA RAID Array option 135 Change Show Attributes of an SSA RAID Array option effects of array copy 186 Change Show Characteristics of an SSA Logical Disk option 214 Change Show Use of an SSA Disk option 144 Change Show Delete a Hot Spare Pool option 86 changing or showing the status of a hot spare pool 74 changing pdisk and hdisk numbers 326 checking the level of adapter microcode 23 chg_hsm_pool_adap command 86 chgssadisk command 72 144 chgssadisks command 147 chgssardsk command 214 choosing how many hot spare disk drives to include in each pool 51 Index 495 choosing the error threshold alarm level for a hot spare pool 51 chssaraid command 135 close subroutine tmssa device driver 297 command line error log analysis 232 Command Line Interface for RAID 235 action attributes 251 command syntax 237 couple action attributes 252 hot spare pool creation and change attributes 249 instruct types 238 object types 238 options 238 physical disk change attributes 249 RAID arrays change attributes 248 RAID arrays creation and change attributes 244 return codes 253 SSARAID command attributes 244 uncouple action attributes 252 command line utilities ssa_certify command 345 ssa_diag command 348 ssa_elacommand 353 ssa_form
399. pdisk6 04 02 REGY 03 P 9 2GB pool_B1 yes good pdisk7 04 02 REGY 01 P 9 2GB pool_B2 yes good pdisk15 04 02 REGY 07 P 18 2GB pool_B1 yes good Fl Help F2 Refresh F3 Cancel F6 Command F8 Image F9 Shel F1O Exit Find n Find Next NS The columns of information displayed on the screen have the following meanings Component The array member disk drive of the hdisk that is listed on the screen Location The physical location code of the array member disk drive Size The size of the array member disk drive This value is useful to know if you have assigned a hot spare disk drive to a pool but the array member disk drive is too large to be protected by the hot spare disk drive Pool The pool to which the array member disk drive is assigned Protected If set to yes this field indicates that if the array member disk drive fails a hot spare disk drive is available to replace it That hot spare disk drive is selected from the listed hot spare pool or if no hot spare disk drives are available in that pool and Choose Hot Spare Only from Preferred Pool is set to no ae ging o the hot spare disk drive is selected from another pool If set to no this field indicates that the array member disk drive is not protected No suitable hot spare disk exists in the listed pool and if Choose Hot Spare Only from Preferred Pool is set to no see J NO Suitable 78 Users Guide and Maintenance Information Status The status of the arr
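The Protected field described in this excerpt depends on the member's pool, the hot spares that are available, and the Choose Hot Spare Only from Preferred Pool setting. The following Python sketch shows one way to express that rule. It is an illustration only: the names `HotSpare` and `is_protected` are hypothetical, and treating a hot spare as suitable when it is at least as large as the member disk drive is an assumption drawn from the note on the Size column.

```python
from typing import Iterable, NamedTuple


class HotSpare(NamedTuple):
    """A hot spare disk drive (hypothetical record, not an SSA structure)."""
    name: str
    pool: str
    size_gb: float


def is_protected(member_pool: str,
                 member_size_gb: float,
                 spares: Iterable[HotSpare],
                 preferred_pool_only: bool) -> bool:
    """Return True if a hot spare could replace the member disk drive."""
    # Assumption: a spare can protect a member only if it is not smaller.
    suitable = [s for s in spares if s.size_gb >= member_size_gb]
    # First choice: a suitable spare in the member's own pool.
    if any(s.pool == member_pool for s in suitable):
        return True
    # Other pools are used only when the preferred-pool restriction is off.
    if preferred_pool_only:
        return False
    return len(suitable) > 0
```

With a single 9.2 GB spare in pool_B1, a 9.2 GB member of pool_B1 is protected, but an 18.2 GB member of the same pool is not, which matches the reason the manual gives for showing the Size column.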
400. pecifies that a count of unreadable blocks be printed. ssaconn Command. Purpose: To display the SSA connection details for the physical disk. Syntax: ssaconn -l PhysicalDiskName -a AdapterName. Description: The ssaconn command performs a function that is similar to the Link Verification service aid. The output from this command is: PhysicalDiskName AdapterName hopcount1 hopcount2 hopcount3 hopcount4. The four hop counts represent the number of SSA devices that are between the physical disk and the A1, A2, B1, and B2 ports of the adapter, respectively. For example, if hop count 1 is 0, no devices are between the physical disk and the A1 port of the adapter. If hop count 4 is 5, five devices are between the physical disk and the B2 port of the adapter. If the disk is not connected to a particular adapter port, the hop count is replaced by a dash character (-). Flags: -l PhysicalDiskName specifies the physical disk whose connection details are to be listed. -a AdapterName specifies the adapter to whose ports the connection details are related. ssa_diag Command. Purpose: To run diagnostic tests to a specified device. Syntax: ssa_diag -l pdiskX or ssa_diag -l ssaX. Description: The ssa_diag command is in /usr/lpp/diagnostics/bin. Flags: -a Causes the adapter to be reset if the device that is being tested is a
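The ssaconn output format described in this excerpt (disk name, adapter name, then four hop counts for ports A1, A2, B1, and B2, with a dash for an unconnected port) is simple to parse. The Python sketch below assumes whitespace-separated fields; the function names and the sample line used in the usage note are hypothetical and are not taken from the manual.

```python
from typing import Dict, NamedTuple, Optional

# Port order of the four hop counts, as given in the command description.
PORTS = ("A1", "A2", "B1", "B2")


class Connection(NamedTuple):
    pdisk: str
    adapter: str
    hops: Dict[str, Optional[int]]  # None means "not connected to this port"


def parse_ssaconn_line(line: str) -> Connection:
    """Parse one line of ssaconn output into a Connection."""
    fields = line.split()
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}: {line!r}")
    pdisk, adapter = fields[0], fields[1]
    hops = {port: (None if count == "-" else int(count))
            for port, count in zip(PORTS, fields[2:])}
    return Connection(pdisk, adapter, hops)


def nearest_port(conn: Connection) -> Optional[str]:
    """Return the connected port with the fewest intervening devices."""
    connected = {p: h for p, h in conn.hops.items() if h is not None}
    if not connected:
        return None
    return min(connected, key=connected.get)
```

For a hypothetical line such as `pdisk3 ssa0 0 5 - -`, the parser reports no devices between the disk and port A1, five devices before port A2, and no connection to ports B1 and B2.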
401. played
pdisk0 0004AC506C4000D free n/a 4.5GB Physical disk
pdisk1 0004AC506D6D00D member n/a 4.5GB Physical disk
pdisk2 0004AC50A44200D member n/a 2.3GB Physical disk
pdisk3 0004AC515EA400D member n/a 4.5GB Physical disk
hdisk5 900335FE80C84CK rebuilding 9.0GB RAID 5 array

Example 9: To Make a New System Disk

This example shows how to use a single SSA disk to make a new system disk. Type the command:

> ssaraid -H -l ssa0 -n pdisk0 -a use=system -d

where:
-H specifies that this operation is a change operation
-l ssa0 specifies the RAID manager that is to be used
-n pdisk0 specifies the pdisk that is to be changed to a system disk
-a use=system specifies the new use of the pdisk
-d specifies that a system disk is to be attached to the new array

The result, in summary format, is similar to that shown here:

pdisk0 0004AC506C4000D system n/a 4.5GB Physical disk
pdisk1 0004AC506D6D00D member n/a 4.5GB Physical disk
pdisk2 0004AC50A44200D member n/a 2.3GB Physical disk
pdisk3 0004AC515EA400D member n/a 4.5GB Physical disk
hdisk5 900335FE80C84CK rebuilding 9.0GB RAID 5 array

Example 10: To Delete an Array

This example shows how to delete an array. Type the command:

> ssaraid -D -l ssa0 -n hdisk5 -u

where:
-D specifies that this operation is a delete operation
-l ssa0 specifies the RAID manager that is to be used
-n hdisk5 specifies the hdisk that is to be deleted
-u specifies the syste
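When the Example 9 command is issued from a script, it can help to assemble the argument list in one place. The Python sketch below does only that; it does not run the command. The helper name `ssaraid_change_use` is hypothetical, and the flag spelling (leading dashes, lowercase -l, and use=value) is an assumption reconstructed from the example, because the extracted text has lost its punctuation.

```python
import shlex


def ssaraid_change_use(manager, disk, use, create_hdisk=False):
    """Build the argument list for an ssaraid change operation.

    Assumed flag spelling: -H (change), -l (RAID manager),
    -n (disk name), -a use=<value>, and optional -d.
    """
    argv = ["ssaraid", "-H", "-l", manager, "-n", disk, "-a", f"use={use}"]
    if create_hdisk:
        argv.append("-d")
    return argv


if __name__ == "__main__":
    argv = ssaraid_change_use("ssa0", "pdisk0", "system", create_hdisk=True)
    # Print the command for review; pass argv to subprocess.run on a
    # system where ssaraid is installed only after checking it.
    print(shlex.join(argv))
```

Keeping the arguments as a list avoids shell quoting problems if the list is later handed to a process-launching call.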
402. pport consists of:
• Standard block I/O to SSA logical disks, which are represented as hdisks
• Character mode I/O to SSA logical disks, which are represented as rhdisks
• Error reporting from SSA physical disks, which are represented as pdisks
• Diagnostics and service interface to SSA physical disks that are represented as pdisks
• Reissue of commands in the event of an adapter reset

Interface between the SSA Adapter Device Driver and Head Device Driver

To communicate with the SSA adapter device driver, the SSA head device driver:
1. Uses the fp_open kernel service to open the required instance of the SSA adapter device driver.
2. Calls the fp_ioctl kernel service to issue the SSA_GET_ENTRY_POINT operation to the opened adapter.
3. Calls the function SSA_Ipn_Directive, whose address was returned by the ioctl operation. These calls to SSA_Ipn_Directive are used for all communication with the SSA device.
4. Uses the fp_close kernel service to close the adapter.

Note: After fp_close is called, SSA_Ipn_Directive cannot be called.

Trace Formatting

The SSA adapter device driver and the SSA disk device driver can both make entries in the kernel trace buffer. The hook ID for the SSA adapter device driver is 45A. The hook ID for the SSA disk device driver is 45B. For information on how to use the kernel trace feature, refer to the trace command for the kernel debug program. With the PCI SSA Multi Initiator RAID EL Adapter the Micro Chann
403. ption 3 ssaadap command 343 ssacand command 344 ssaconn command 347 ssadisk command 349 ssadisk SSA disk device driver 266 configuration issues 266 configuring SSA disk drive devices 268 logical and physical disks and RAID arrays 266 multiple adapters 267 device attributes 270 ssadisk SSA disk device driver continued attributes common to logical and physical disks 271 attributes for logical disks only 271 attributes of the SSA router ssar 270 error conditions 274 special files 276 SSADISK_ISAL_CMD ioctl operation disk device driver 278 description 278 files 280 purpose 278 return values 279 SSADISK_ISALMgr_CMD ioctl operation disk device driver 281 description 281 files 282 purpose 281 return values 282 SSADISK_LIST_PDISKS ioctl operation disk device driver 285 description 285 files 286 purpose 285 return values 286 SSADISK_SCSI_CMD ioctl operation disk device driver 283 description 283 files 284 purpose 283 return values 284 ssadload command 350 ssadlog command 213 ssaencl command 355 ssafastw command 215 ssaidentify command 364 ssaraid command 59 90 98 SSARAID command attributes 244 action attributes 251 couple action attributes 252 hot spare pool creation and change attributes 249 physical disk change attributes 249 RAID arrays change attributes 248 RAID arrays creation and change attributes 244 uncouple action attributes 252 ssavfynn command 371 ssaxlate command 372 starting the service aids 376 states of a
404. py array has been created but has not been coupled to an array An hdisk cannot be created from this RAID Copy array This RAID Copy array can be only coupled to an array or deleted The status value for the disk drives is Good The disk drive is present and operational 188 User s Guide and Maintenance Information Identify Disks in an SSA RAID Array For fast path type smitty issaraid and press Enter Otherwise select Identify Disks in an SSA RAID Array from the List Identify SSA Physical Disks menu The following information is displayed for RAID arrays that have coupled disk drives ea PEAR D Identify Disks in an SSA RAID Array Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields SSA RAID Manager ssal SSA RAID Array hdisk2 Primary Disks g Secondary Disks o Coupled Disks o Flash Disk Identification Lights yes F1l Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image F9 Shel F10 Exit Enter Do Xe A 1 For the primary secondary and coupled disk drives press the List key to list the disk drives that you want to identify 2 From the menu shown select each disk drive that you want to identify and press Enter 3 Select yes in the Flash Disk Identification Lights field Chapter 7 Copying Data from Arrays and from Volume Groups 189 Remove a Disk From an SSA RAID Array For fast path type smitty redssaraid and press Enter Otherwise
405. r a description of what happens when a coupled copy or an uncoupled copy is deleted.

Delete a Volume Group, Logical Volumes, or Filesystems Copy: Select this option if you want to delete a volume group copy. The copy might be coupled to the array or uncoupled from the array. The copy can be in any state. The deletion of the copy causes all the data on the copy to be lost. The delete options are described elsewhere in this book.

Prepare a Copy

For fast path, type smitty copy_pre_array and press Enter. Otherwise, select Prepare a Copy from the Array Copy Services menu. The Prepare a Copy menu is displayed:

Prepare a Copy
Type or select values in entry fields. Press Enter AFTER making all desired changes.
[Entry Fields]
Array to be copied: hdisk3
Number of disk drives required: 2
Minimum disk drive size required: 9.1GB
Verify copy during creation: no
Hot spare selection: Default
RAID copy to be coupled:
OR
Disk drives to be coupled:
F1=Help F2=Refresh F3=Cancel F4=List F5=Reset F6=Command F7=Edit F8=Image F9=Shell F10=Exit Enter=Do

The meanings of the fields are:

Array to be copied: The array that you selected as the source data for the copy.
Number of disk drives required: The number of disk drives that must be coupled to the array to permit a copy to be performed.
Minimum disk drive size required: The effective size of each member disk drive of the array
406. r example Copy Array hdiskx 014639AB30DCOT Uncouple completed successfully 2 On the using system where you want to recreate the copy a b Cc Run the cfgmgr command to configure the new hdisks Give the command ssaraid I ssan n serial_number z where serial_number is the serial number of the RAID Copy array hdisk for example 014639AB30DCOT A message is displayed for example hdiskx 0146392AB30DCOT good 4 5GB RAID Copy array To recreate the copy give the command ssa_make_copy R d hdiskx If you need multiple hdisks to recreate the copy give the command ssa_make_copy R d hdiskx d hdisky Chapter 7 Copying Data from Arrays and from Volume Groups 169 Example 5 Running an Automatic Copy of a Volume Group Prepare a volume group for copying a Give the command ssa_make_copy P v vgname b Wait for the copy to complete c Wait for the external trigger to be created Uncouple the volume group Give the command ssa_make_copy U v vgname Recreate the volume group Give the command ssa_make_copy v vgname Either delete the copy volume group and destroy the RAID Copy array a Give the command ssa_delete_copy v newvgname A b Repeat from step Hl or delete the copy volume group and couple the children back to the parents a Give the command ssa_delete_copy v newvgname C b Repeat from step Bl or delete the volume group and detach the RAID Copy array a Repeat from step til b Give
407. r pdisk3 to be analyzed for the previous five days If a problem is detected an SRN is generated e Type smitty ssaraid and press Enter f Select Change Show Use of an SSA Physical Disk and for all disks that you have tested or exchanged change the Current Use to Array Candidate Disk g Select Change Member Disks in an SSA RAID Array h Select Swap Members of an SSA RAID Array i Select the degraded hdisk j Referring to the displayed instructions exchange the failed member for a new disk drive The Disk to Remove is listed as BlankReserved the Disk to Add is the disk drive that you tested or exchanged in step fai When failed disk drives have been exchanged for new disk drives the data is rebuilt and the array changes its state to the Good state Note The array can be used during the rebuilding operation Inform the user however that while the rebuilding operation is running the data is not protected against another disk drive failure The rebuilding operation runs more slowly if the array is being used 464 User s Guide and Maintenance Information k When the rebuilding operation is complete ask the user to run diagnostics in System Verification mode to the SSA adapters to ensure that the rebuilding operation has not found any more problems Go to EMAP repair J to verify the 19 from step ia Does the Link Verification service aid indicate an open loop NO Go to step 20 YES from step
408. r reason do not switch off the using system when servicing an SSA link or an enclosure in which SSA devices are installed Enclosure power cables and external SSA cables that connect devices to the using system can be disconnected while that system is running Before starting this MAP ensure that all the disk drives are working correctly 1 Run diagnostics in Problem Determination mode to identify any disk drive problems that have occurred 2 Run the Link Verification service aid see to find all power problems SSA link problems and i SSA disk drives that have a Failed status 3 Correct all those problems before you start this procedure Attention Some of the steps in this MAP need you to change the configuration of the array or to change the use of an SSA disk drive Do not do those steps unless you have the user s permission 1 from steps A bal and bah You have been sent to this step either from another step in this MAP or because you have one of the following Service Request Numbers SRNs 46000 47000 47500 49000 49100 49500 49950 Do you have SRN 49500 NO a Run diagnostics in System Verification mode to the SSA adapters b Go to step A YES No hot spare disk drives are available Go to step Pian page 464 2 from step th Did the diagnostics produce SRN 46000 47000 47500 49000 49100 or 49950 NO Go to step Bon page 455 YES Go to step 4 on page 455 454 User s Guide and Maintenance I
409. r s Guide and Maintenance Information Installing a Replacement Disk Drive 1 Physically install the replacement disk drive see the Operator Guide or Service Guide for the unit 2 If the failed disk drive has been exchanged for a hot spare disk drive change the drive is working correctly If no hot spare disk drive was available when the original disk drive failed the array is now in the Exposed state or in the Degraded state Change the use of the disk drive to the array j 3 For fast path type smitty exssaraid and press Enter Otherwise a Select Change Member Disks of an SSA RAID Array from the SSA RAID Arrays menu b Select Swap Members of an SSA RAID Array 4 A list of arrays is displayed A N Change Member Disks of an SSA RAID Array Move cursor to desired item and press Enter Remove a Disk from an SSA RAID Array Add a Disk to an SSA RAID Array Swap Members of an SSA RAID Array SSA RAID Array Move cursor to desired item and press Enter hdisk3 00703795D3F7C0G system good 9 2GB raid_1 hdisk4 00703784C540C00 system degraded 27 5GB raid 10 hdisk5 007037943540C00 system good 27 5GB raid 10 Fl Help F2 Refresh F3 Cancel F8 Image F1O Exit Enter Do Find n Find Next Y J Select the array into which you are installing the replacement disk drive This array is listed as exposed or degraded Chapter 6 Using the RAID Array Configurator 95 5 The following
410. r s Guide and Maintenance Information ssa_make_copy Command Purpose Syntax Description To create a RAID Copy array from a RAID 1 or RAID 10 array ssa_make_copy v vgname 1 lvname f fsname V newvgname L newlvprename F newfsprempoint s m r e E filename ssa_make_copy U v vgname 1 lvname f fsname s e Usage E filename ssa_make_copy P v vgname 1 lvname f fsname w ssa_make_copy R d pvname V newvgname L newlvprename F newfsprempoint m Ir The ssa_make_copy command creates copy volume groups via the following process 1 The command checks whether you can copy all or part of the volume group If you want to copy a whole volume group the command checks whether all the hdisks in the volume group are RAID 1 or RAID 10 arrays If any hdisk is of the wrong type the command displays a warning then stops If you want to copy logical volumes or file systems the command checks whether all the logical volumes are in the same volume group and whether all the hdisks that make up the volume group are RAID 1 or RAID 10 arrays If any logical volume is not in the required volume group or if any hdisk is of the wrong type the command displays a warning then stops Notes a If any hdisk in that is the volume group does not contain a logical volume that hdisk is not copied If active paging space exists on the volume group you cannot copy any part o
411. ram writes to the array is mirrored Any data that your program reads from the array however might not be consistent because it might not have been written previously by this particular program Chapter 6 Using the RAID Array Configurator 67 If you select yes for this option the array enters the Rebuilding state The data that is on the primary disk drives is copied to the secondary disk drives This operation might take several hours to complete during which time performance is affected Enable Fast Write Switches the fast write cache on or off This facility is not available on adapter cards that do not or cannot have a fast write cache installed 4 Move the cursor to the appropriate disk field that is Member Primary or Secondary and press the List key to display a list of candidate disk drives 5 If candidate disk drives are available a list of those disk drives is displayed in a window Add an SSA RAID Array Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields SSA RAID Manager ssaQ RAID Array Type raid_5 Member Disks Enable Use of Hot Spares yes Member Disks Move cursor to desired item and press F7 ONE OR MORE items can be selected Press Enter AFTER making all selections Disks in Loop B are pdiskO 0004AC506C2900D free n a 4 5GB Physical Disk pdisk1 0004AC5119E000D free n a 4 5GB Physical Disk pdisk2 Q004AC7COOE800D free n a 4 5GB Physical Disk
412. ray unless The SSA adapter can detect you are sure that no updates to the other unavailable half of the array the primary half of the array will be performed See but the Split Array Resolution before you perform this operation flag is set to Secondary To force access to the available half of the array change the setting of the This problem can be caused by Split Array Resolution flag a power failure on one half of f the array or by a broken SSA tT tt d Enter ype smitty ssaraid and press is loop between the two halves of 2 Select Change Show Attributes of an SSA RAID Array the array 3 Change the setting of the Split Array Resolution flag 4 If possible change the setting of the Split Array Resolution flag on the unavailable half of the array Attention If the unavailable half of the array is reconnected to the available half before the setting of the Split Array Resolution flag has been changed neither SSA adapter can get access to the array 48755 Description One of the following conditions exists An SSA adapter and a pdisk have failed A RAID member disk drive is connected to another SSA adapter A split site configuration has only one SSA adapter configured and the adapter and all the primary or secondary disk drives fail SRN Problem Possible Causes 48760 Description An array is in the Offline state because the split join This problem can occur if a procedure was not perf
413. rays menu 2 A list of adapters is displayed in a window a SSA RAID Arrays Move cursor to desired item and press Enter List All Defined SSA RAID Arrays List All Supported SSA RAID Arrays List All SSA RAID Arrays Connected to a RAID Manager List Status Of All Defined SSA RAID Arrays List Identify SSA Physical Disks List Delete Old RAID Arrays Recorded in an SSA RAID Manager List Status of Hot Spare Pools List Status of Hot Spare Protection for an SSA RAID Array List Components in a Hot Spare Pool Add a Hot Spare Pool SSA RAID Manager Move cursor to desired item and press Enter ssa0 Available 00 04 IBM SSA 160 SerialRAID Adapter 14109100 Fl Help F2 Refresh F3 Cancel F8 Image F1O Exit Enter Do Find n Find Next Select the adapter whose hot spare pools you want to list 80 Users Guide and Maintenance Information A list of hot spare pools is displayed ioe D SSA RAID Arrays Move cursor to desired item and press Enter List All Defined SSA RAID Arrays List All Supported SSA RAID Arrays List All SSA RAID Arrays Connected to a RAID Manager SSA Hot Spare Pools Move cursor to desired item and press F7 ONE OR MORE items can be selected Press Enter AFTER making all selections fH EEE ETH EE THE EEE EEE EH EE Hot Spare Pools in Loop A are pool_Al EEE EEE EE TEE EE EE EE A AE Hot Spare Pools in Loop B are pool Bl F1 Help F2 Refresh F3 Cancel F7 Select F8 Image
414. re Protection for an SSA RAID Array option 77 List System Disks option 117 List Identify SSA Physical Disks option 108 listing the disks that are in a hot spare pool 80 location code format 19 loops and data paths examples broken loop cable removed 403 broken loop disk drive removed 406 normal loops 401 loops links and data paths large configurations 16 one loop with two adapters in each of two using systems 12 one loop with two adapters in one using system 11 simple loop 8 simple loop one disk drive missing 9 10 two loops with one adapter 14 two loops with one adapter in each of two using systems 15 loops links and data paths 7 configuring devices on a loop 18 Is_hsm_array_components command 80 Is_hsm_array_status command 77 Is_hsm_status command 74 Isdssaraid command 100 Isidssaraid command 108 Ismssaraid command 102 Issaraid command 109 Isssaraid command 101 Istssaraid command 104 maintenance analysis procedures MAPs 443 managing dumps 259 MAP 2010 444 MAP 2320 445 MAP 2323 450 MAP 2324 454 MAP 2410 475 microcode adapter checking the level 23 microcode and software errors 441 microcode maintenance 314 adapter 315 checking microcode package ID and level 314 disk drive 315 microcode package ID and level 314 mkdev configuring a logical disk 269 mkdev configuring a physical disk 269 mkssaraid command 60 multiple states RAID 10 38 node_number locking 27 normal loops SSA link 401 nvrssaraid command 130
415. re disk drives only
Array member disk drives only

Change Attributes for All RAID Arrays

You can specify the following attributes with the -a option only when you are using the ssaraid command with the -H option to change a RAID array.

use=system/free
With the attribute set to system, the array is made usable by the operating system. If you also specify the -d option, a corresponding hdisk device is created for the array. With the attribute set to free, the array has no use assigned to it, and the operating system cannot use it as an hdisk. If you specify the -u option, you ensure that no corresponding device exists for the array.

force=yes/no
If an array is using a fast write cache that is failing, you must specify this attribute as yes to allow the fast write cache to be disabled.

Hot Spare Pool Creation and Change Attribute

You can specify the following attribute with the -a option when you are using the ssaraid command with the -C or -H option to create or change a hot spare pool.

minimum_spares (default 0)
This attribute determines the minimum number of hot spare disk drives that must be in this hot spare pool to prevent a hot spare pool error from being logged.

Physical Disk Drive Change Attributes

This section describes:
• Change attribute for hot spare disk drives only
• Change attribute for array member disk drives only
• Other change attributes for physical disk driv
416. rive 95 installing and configuring 58 listing all defined SSA RAID arrays 100 listing all SSA RAID arrays that are connected to a RAID manager 102 listing all supported SSA RAID arrays 101 listing array candidate disk drives 115 listing hot spare disk drives 111 listing old RAID arrays recorded in an SSA RAID manager 131 listing rejected array disk drives 113 listing system disk drives 117 listing the disk drives in an SSA RAID array 109 listing the disks that are in a hot spare pool 80 listing the status of all defined SSA RAID arrays 104 removing a disk drive from an SSA RAID array 138 removing disks from a hot spare pool 86 showing the disks that are protected by hot spares 77 swapping members of an SSARAID array 142 RAID arrays change attributes 248 RAID arrays creation and change attributes 244 RAID Command Line Interface 235 action attributes 251 command syntax 237 couple action attributes 252 hot spare pool creation and change attributes 249 instruct types 238 object types 238 options 238 physical disk change attributes 249 RAID Command Line Interface continued RAID arrays change attributes 248 RAID arrays creation and change attributes 244 return codes 253 SSARAID command attributes 244 uncouple action attributes 252 RAID functions 29 RAID O array states 31 Good 31 Offline 31 RAID 1 and RAID 10 couple action attributes 252 RAID 1 and RAID 10 uncouple action attributes 252 RAID 1 array states 32 RAID 10 array states 36
417. rives selected must equal Number of disk drives required Leave this field blank if you are going to select RAID copy to be coupled 176 User s Guide and Maintenance Information Uncouple a Volume Group Logical Volumes or Filesystems Copy For fast path type smitty copy_unvglvfs and press Enter Otherwise select Uncouple a Volume Group Logical Volumes or Filesystems Copy from the Array Copy Services menu The Uncouple a Volume Group Logical Volumes or Filesystems Copy menu is displayed a D Uncouple a Volume Group Logical Volumes or Filesystems Copy Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields Volume Group myvg01 New Volume Group name O Logical Volume prefix o Mount point o Mount new file systems no Synchronize the file systems no F1l Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image F9 Shel1 F10 Exit Enter Do S Y The meanings of the fields are Volume Group The name of the volume group from which this copy was taken New Volume Group name The name of this copy of the volume group In this field enter the name of the new volume group If this field remains blank a default volume group name is used Logical Volume prefix The prefix that is added to the parent logical volume names when the new logical volumes are created during the uncouple operation If no prefix is assigned a default value is used Mount point The mount p
418. rrays RAID O 31 Good 31 Offline 31 states of arrays RAID 1 32 states of arrays RAID 10 36 Degraded 37 Exposed 36 states of arrays RAID 10 continued Good 36 multiple 38 Offline 37 Rebuilding 37 Unknown 38 states of arrays RAID 5 33 Degraded 33 Exposed 33 read operations while in 33 write operations while in 33 flowchart 35 Good 33 initial rebuilding operation 34 Offline 34 Rebuilding 34 adapter replacement 34 disk drive replacement 34 summary of SSA error conditions 259 Swap Members in an SSA RAID Array option 142 Swap Members of an SSA RAID Array option effects of array copy 191 switching off using systems in a large configuration 17 switching on using systems in a large configuration 17 swpssaraid command 137 T takeover adapter 267 Target Mode 291 target mode data pacing 293 the Fast Write menus 213 the SMIT menu 59 three way copy array copy services 173 Delete a RAID Array Copy option 183 Delete a Volume Group Logical Volumes or Filesystems Copy option 184 List All Copy Candidates option 179 List All Uncoupled Copies option 181 List All Uncoupled Volume Groups option 182 Prepare a Copy option 175 Uncouple a Volume Group Logical Volumes or File Systems Copy option 177 copying data from an array 151 description 149 effects of array copy on other SMIT menus 186 Change Show Attributes of an SSA RAID Array option 186 Identify Disks in an SSA RAID Arrays option 189 List Status Of All Defined SSA RA
419. rrays Recorded in an SSA RAID Manager List Status of Hot Spare Pools List Status of Hot Spare Protection for an SSA RAID Array List Components in a Hot Spare Pool Add a Hot Spare Pool SSA RAID Manager Move cursor to desired item and press Enter ssa0 Available 00 04 IBM SSA 160 SerialRAID Adapter 14109100 F1l Help F2 Refresh F3 Cancel F8 Image F1O Exit Enter Do Find n Find Next 60 Users Guide and Maintenance Information 2 Select the adapter to which you want to add the array A list of array types is displayed in a window SSA RAID Arrays Move cursor to desired item and press Enter List All Defined SSA RAID Arrays List All Supported SSA RAID Arrays List All SSA RAID Arrays Connected to a RAID Manager List Status Of All Defined SSA RAID Arrays List Identify SSA Physical Disks List Delete Old RAID Arrays Recorded in an SSA RAID Manager List Status of Hot Spare Pools RAID Array Type Move cursor to desired item and press Enter raid_0 RAID 0 array raid_1 RAID 1 array raid_5 RAID 5 array raid_10 RAID 10 array F1l Help F2 Refresh F3 Cancel F8 Image F1Q Exit Enter Do Find n Find Next Chapter 6 Using the RAID Array Configurator 61 62 3 Select the type of array that you want to create If you select RAID O the following menu is displayed Type or select values in entry fields Press Enter AFTER making all desired changes SSA RAID Manager RAID Array Type Member Disks All
420. rrent Use parameter That choice should be Hot Spare Disk if the use of hot spares is enabled for the arrays on the subsystem Array Candidate Disk if the use of hot spares is disabled for the arrays on the subsystem e Go to step 8 on page 474 YES When a new disk drive has been added to the system it is configured as a system disk drive If the disk drive replacement procedure that you used has not already instructed you to do so change the Current Use parameter of the disk drive to Hot Spare Disk or to Array Candidate Disk a Type smitty ssaraid and press Enter b Select Change Show Use of an SSA Physical Disk The pdisk that has been exchanged is listed under SSA Physical Disks that are system disks c Select the pdisk from the list Change the Current Use parameter to Hot Spare Disk or to Array Candidate Disk Note It is the user who should make the choice of Current Use parameter That choice should be Hot Spare Disk if the use of hot spares is enabled for the arrays on the subsystem Chapter 18 SSA Problem Determination Procedures 473 Array Candidate Disk if the use of hot spares is disabled for the arrays on the subsystem e Go to step Bal 38 from step BZ You have changed the use of a disk drive You must now ensure that the hot spare pools are correctly configured To do this action a b c d Type smitty ssaraid and press Enter Select List Status of Hot Spare Pools Select in
421. rror Problem Determination ke da a Se ee a oak ve Ge eo ALS Link Status Ready ae bo to ie de 8 OR oe we pea Ne ot de 4 1 ABT Service Aid Co Ate amp amp Arcs oh But Abs Sa Bok a 482 Repair Actions 2 2 ee eee 482 Part 3 Appendixes 2 2 8 2 4 2 2 2 ee 483 Appendix Communications Statements 485 Xii User s Guide and Maintenance Information Federal Communications Commission FCC Statement Japanese Voluntary Control Council for Interference VCCI Statement Korean Government Ministry of Communication MOC Statement New Zealand Compliance Statement bos Be International Electrotechnical Commission IEC Statement Avis de conformit a la r glementation d Industrie Canada Industry Canada Compliance Statement United Kingdom Telecommunications Requirements European Union EU Statement Radio Protection for Germany Taiwan Class A Compliance Statement Glossary Index Contents 485 485 485 485 486 486 486 486 486 486 487 489 493 xiii xiv Users Guide and Maintenance Information Safety Notices For a translation of the danger and caution notices contained in this book see the Safety Information manual SA23 2652 Definitions of Safety Notices A danger notice indicates the presence of a hazard that has the potential of causing death or serious personal injury This book contains no danger notices A
422. rs that are arranged in two pairs. Connectors A1 and A2 are one pair; connectors B1 and B2 are the other pair. The SSA links must be configured as loops. Each loop is connected to a pair of connectors at the SSA adapter card. These connectors must be a valid pair (that is, A1 and A2, or B1 and B2); otherwise, the disk drives on the loop are not fully configured, and the diagnostics fail. Operations to all the disk drives on a particular loop can continue if that loop breaks at any one point.

This adapter also contains array management software that provides RAID functions to control the arrays of the RAID subsystem. An array can contain several member disk drives. Each array is handled as one disk by the operating system. The array management software translates requests to this disk into requests to the member disk drives. Although this adapter is a RAID adapter, it can be configured so that all, some, or none of the disk drives that are attached to it are member disk drives of arrays.

The Advanced SerialRAID Adapter can be connected, by way of one or two SSA loops, to other SSA adapters. These adapters can be either in the same using system or in separate using systems. Valid configurations are described elsewhere in this book.

Fast Write Cache Feature

An optional 32 MB Fast Write Cache feature is available for the Advanced SerialRAID Adapter. This feature improves performance for jobs that include many write operations.

128 MB Memory Module
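The connector-pair rule stated above (a loop must start and end on A1 and A2, or on B1 and B2) can be checked before cabling is recorded in a configuration plan. The Python sketch below is a minimal illustration; the name `is_valid_loop_pair` is hypothetical and is not part of any SSA utility.

```python
# The two valid connector pairs on the adapter card.
VALID_PAIRS = (frozenset({"A1", "A2"}), frozenset({"B1", "B2"}))


def is_valid_loop_pair(first, second):
    """Return True if the two connectors form a valid pair for one loop.

    The order of the connectors does not matter. Using the same
    connector twice, or mixing an A connector with a B connector,
    is rejected.
    """
    return frozenset({first.upper(), second.upper()}) in VALID_PAIRS


if __name__ == "__main__":
    for pair in (("A1", "A2"), ("B2", "B1"), ("A1", "B1")):
        print(pair, is_valid_loop_pair(*pair))
```

A loop cabled between A1 and B1 fails the check, which corresponds to the case in which the disk drives on the loop are not fully configured and the diagnostics fail.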
423. rt sending data The SSA target mode device driver then builds a send command to describe the transfer and the data is sent to the device The data can be sent as a blocking write operation or as a nonblocking write operation When the write entry point returns the calling program can access the transmit buffer When the caller opens the target mode special file a logical path is set up This path allows data to be received The user mode caller issues a read readv readx or readvx system call to start receiving data The kernel mode caller issues an fp_read or fp_rwuio service call to start receiving data The SSA target mode device driver then returns data that has been received for the application program The SSA target mode device driver allows an initiator mode device to get access to the data transfer functions through the write entry point it allows a target mode device to get access through the read entry point The only rules that the SSA target mode device driver observes to manage the sending and receiving of data are e Separate write operations need separate read operations e Receive buffers that are full delay the send operation when it tries to resend after a delay Chapter 13 Using the Programming Interface 295 The calling program must observe any other rules that are needed to maintain or otherwise manage the communication of data Delays that occur when data is received or sent through the target mode device dri
424. run to a particular disk drive NO Go to step H YES Exchange the failing disk drive for a new one see FExchanging Disk b Go to step BZan page 474 to add the disk drive to the group of disk drives that are available for use by the RAID manager from step fio Run the Certify service aid see the pdisks that you noted previously Did the Certify service aid fail when it was run to a particular disk drive NO a to each of a Ask the user to recreate the array b Go to step P2 on page 467 YES a Run the Format service aid see to the disk drive b Run the Certify service aid again to the disk drive Go to step E2 on page 460 Chapter 18 SSA Problem Determination Procedures 459 12 from step ft Did the Certify service aid fail again NO a Ask the user to recreate the array b Go to step P2 on page 4671 YES a Exchange the failing disk drive for a new one see Exchanging Disk b Go to step B7 on page 473 to add the disk drive to the group of disk drives that are available for use by the RAID manager 13 from step n An array is in the Degraded state if one member disk drive of the array is missing or has failed and a write command has been sent to that array When an array is in the Degraded state its data is not protected a Type smitty ssaraid and press Enter b Select Change Show Use of an SSA Physical Disk c Go to step h4 on page 461 460 User s Guide and Mai
425. s Array to be copied Number of disk drives required Minimum disk drive size required Verify copy during creation Hot spare selection RAID copy to be coupled OR Disk drives to be coupled Fl Help F2 Refresh F5 Reset F6 Command F9 Shel F10 Exit NS Prepare a Copy Press Enter AFTER making all desired changes Entry Fields hdisk3 1 9 1GB no Default O F3 Cancel F4 List F7 Edit F8 Image Enter Do pa S oH 4 If you want to verify that all the data written to the copy is readable select Verify copy during creation and change the entry to yes Note When this option is set to yes the time that is required to prepare the copy is increased If you are operating your RAID array in a split site configuration you might want to select which hot spare pools are to be used for copy disks If you do select Hot spare selection and select your hot spare preference Press the Help key if you need more information about this entry If you want to couple an existing RAID Copy array a Select RAID Copy to be coupled b Either type the name of the RAID Copy array or press the List key and select a RAID Copy array from the list of available candidates c Press Enter The RAID Copy array is coupled to the parent array The copy operation starts If you want use free disk drives to make a copy of a RAID array a Select Disk drives to be coupled b Either type the names of the dis
426. s Change Show Delete a Hot Spare Pool F1 Help F2 Refresh F3 Cancel F8 Image F9 Shel F1O Exit Enter Do XN D From the following list find the option that you want and go to the place that is indicated 90 User s Guide and Maintenance Information Identifying and Correcting or Removing Failed Disk Drives When a disk drive fails the array rejects it If access to the array is still possible the Current Use attribute of the disk drive is changed from Member of an SSA RAID Array to Rejected The disk drive is listed in the SMIT menus as a rejected disk drive If the disk drive cannot be accessed however it cannot be listed as a rejected disk A disk drive cannot be accessed if its Check light is on or its Power light is off to determine the cause of the failure To find rejected disk drives that cannot be accessed go to To find rejected disk drives that can be accessed do the following procedure 1 For fast path type smitty 1fssaraid and press Enter Otherwise a Select List Identify SSA Physical Disks from the SSA RAID Arrays menu b Select List Rejected Array Disks 2 A list of adapters is displayed in a window a List Identify SSA Physical Disks Move cursor to desired item and press Enter ist Disks in an SSA RAID Array ist Hot Spares ist Rejected Array Disks ist Array Candidate Disks List System Disks Identify Disks in an SSA RAID Array Identify Hot Spares Identify Rejected Array Disks
427. s RAID 1 or RAID 10 arrays with coupled disk drives, you must do the following procedure. If you do not, the coupled disks might return to 0% copied or, if data sectors that are not valid exist in the array, the array might become unavailable to the using system. Before you connect the adapter to the SSA loop, ensure that the latest version of adapter microcode is loaded onto the adapter, as follows: a. Do not connect the external SSA cables to the adapter yet. b. Switch on power to the using system; wait for the adapter to be configured. The adapter microcode is automatically downloaded. c. Connect the external SSA cables to the adapter. Run the cfgmgr command. If you cannot do this procedure because, for example, the boot disks are on the SSA network, do the following: a. Run diagnostics to the SSA adapter that is on the partner system to ensure that no array problems exist. b. Wait until all coupled disk drives can be uncoupled before you connect the new adapter to the network. 6. Install the adapter into the using system (see the Installation and Service Guide for the using system). Removing an SDRAM Module of an Advanced SerialRAID Adapter. Attention: The adapter assembly contains parts that are electrostatic discharge (ESD) sensitive. Use the tools and procedures defined by your organization to protect such parts. 1. Remove
428. s been set to Secondary to permit I/O operations to be performed on the secondary half of the array. Only user actions can set this condition. Separate write operations can be performed to each half of the array. If these write operations are performed, however, the two halves of the array must not be reconnected. RAID 10 Array with Unsynchronized Data. Figure 40 shows a RAID 10 array whose halves have been reconnected. [Figure 40. Reconnecting an Unsynchronized Array] Each half of the array contains different data, but no indication is given to which data is valid. Under these conditions, the array is in the Offline state, and both systems generate SRN 48760. Chapter 9. Using the SSA Spare Tool. The SSA Spare Tool helps you to manage your SSA networks when Logical Volume Manager (LVM) mirrored volume groups are used. It works with the LVM to identify stale partitions or missing physical volumes in LVM mirrored volume groups. If the SSA Spare Tool finds any stale partitions or missing physical volumes: It automa
429. s determined by the size of the device buffers. Understanding Target Mode Data Pacing. An initiator mode device can send data faster than the associated target mode device application can read it. This condition occurs when: (1) the previous write operation is complete, but all the device buffers are in use and no space is available for the next write operation; (2) the write operation is not yet completed and the device has no available buffers. In both these instances, the target mode device driver stops the write operation temporarily and uses the retry mechanism to try again later. These actions can cause the write operation to fail. As a result, the initiator mode device is unable to send any data to the target mode device for the whole of the retry period. Alternatively, the write operation might time out. Think about these possibilities when you set the buffer sizes and the number of buffers for the devices. Determine carefully the retry period, the total write time-out period, and the amount of data that is being sent. For example, to write 64 KB of data with no retry operations, you need 64 KB read and write buffers. If you allow one retry operation, you need only 32 KB buffers. Using SSA Target Mode. SSA Target Mode does not attempt to manage the data transfer between devices. It does, however, take action if buffers become full, and it ensures that read operations can read data from only one
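The buffer sizing example in the data pacing text (64 KB with no retry operations, 32 KB with one retry operation) is consistent with dividing the amount of data by the number of attempts. The sketch below generalizes that pattern; the generalization to more than one retry operation is an assumption made here, not a rule stated in the guide, and `min_buffer_kb` is a hypothetical helper name.

```python
import math


def min_buffer_kb(data_kb: int, retries: int) -> int:
    """Estimate the read/write buffer size (in KB) for one write operation.

    Assumption: the data is spread evenly over (retries + 1) attempts,
    which reproduces the two cases given in the text.
    """
    if data_kb <= 0 or retries < 0:
        raise ValueError("data_kb must be positive and retries non-negative")
    return math.ceil(data_kb / (retries + 1))


if __name__ == "__main__":
    print(min_buffer_kb(64, 0))  # no retry operations
    print(min_buffer_kb(64, 1))  # one retry operation allowed
```

Treat the result as a starting point only: the retry period and the total write time-out period must still be chosen so that the write operation does not time out.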
430. s menu is displayed d Select SSA Logical Disks 2 The SSA Logical Disks menu is displayed SSA Logical Disks Move cursor to desired item and press Enter List All Defined SSA Logical Disks List All Supported SSA Logical Disks Add an SSA Logical Disk Change Show Characteristics of an SSA Logical Disk Remove an SSA Logical Disk Configure a Defined SSA Logical Disk Generate an Error Report Trace an SSA Logical Disk Show Logical to Physical SSA Disk Relationship List Adapters Connected to an SSA Logical Disk List SSA Logical Disks Connected to an SSA Adapter Identify an SSA Logical Disk Cancel all SSA Disk Identifications Enable Disable Fast Write for Multiple Devices Fl Help F2 Refresh F3 Cancel F8 Image ae F10 Exit Enter Do Y If you want to enable or disable a fast write attribute for one logical disk drive see If you use the two way fast write function and you want to disable the fast write attribute for a disk drive if the partner adapter becomes not accessible see Chapter 10 Using the Fast Write Cache Feature 213 Enabling or Disabling Fast Write for One Disk Drive This option lets you enable or disable the fast write function for one disk drive 1 2 For fast path access to the Change Show Characteristics of an SSA Logical Disk menu a Type smitty chgssardsk and press Enter b From the menu displayed select the logical disk that you want to change Otherwise a Select Change Show
431. s not cause takeover to occur. Takeover occurs only after extensive error recovery activity within the adapter and several retries by the device driver. Intermittent errors that last for only approximately one second usually do not cause adapter takeover. When takeover has successfully occurred and the device driver has accessed the disk drive through the alternative adapter, the original adapter becomes the standby adapter. Takeover can, therefore, occur repeatedly from one adapter to another, so long as one takeover event is completed before the next one starts. Completion of a takeover event is considered to have occurred when the device driver successfully accesses the disk drive through the alternative adapter. When takeover has occurred, the device driver continues to use the alternative adapter to access the disk drive until either the system is rebooted or takeover occurs back to the original adapter. Each time the SSA disks are configured, the SSA disk device driver is informed which path or paths are available to each disk drive, and which adapter is to be used as the primary path. By default, primary paths to disk drives are shared equally among the adapters to balance the load. This static load balancing is performed once, when the devices are configured for the first time. You can use the chdev command to modify the primary path. Because of the dynamic nature of the relationship between SSA adapters and disk drives, SSA pdisks and hdis
432. screen is displayed for the disk drive that you have chosen Change Show Use of an SSA Physical Disk Type or select values in entry fields Press Enter AFTER making all desired changes SSA RAID Manager SSA physical disk CONNECTION address Current use Entry Fields ssa0 pdisk6 08005AEA080800D Hot Spare Disk Fl Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image F9 Shel F1O Exit Enter Do Ne A If you are only checking the use of the disk drive and do not want to change it go no further with these instructions Otherwise go to step 5 Note If the Current Use field shows that the disk drive is owned by an array you cannot change that use Move the cursor to Current Use and press the List key 6 A list of uses is displayed Make your selection and press Enter 146 User s Guide and Maintenance Information Changing the Use of Multiple SSA Physical Disks 1 For fast path type smitty chgssadisks and press Enter Otherwise select Change Use of Multiple SSA Physical Disks from the SSA RAID Arrays menu 2 A list of adapters is displayed in a window SSA RAID Arrays Move cursor to desired item and press Enter List All Defined SSA RAID Arrays List All Supported SSA RAID Arrays List All SSA RAID Arrays Connected to a RAID Manager List Status Of All Defined SSA RAID Arrays List Identify SSA Physical Disks List Delete Old RAID Arrays Recorded in an SSA RAID Manager List S
433. see the documentation for your SSA enclosure or SSA subsystem Chapter 18 SSA Problem Determination Procedures 437 SRN _ Problem Possible Causes D4000 Description The diagnostics cannot configure the SSA adapter Possible FRUs Action Exchange the FRU for a new FRU D4050 Description The Enhanced Error Handling test has failed Action Exchange the FRUs for new FRUs Host System Board 40 using system nstallation and Service Guide D4100 Description The diagnostics cannot open the SSA adapter Possible FRUs Action Exchange the FRU for a new FRU D4300 Description The diagnostics have detected an SSA adapter POST failure Possible FRUs Action Exchange the FRU for a new FRU D44XX Description The diagnostics have detected that the SSA adapter has corrupted the microcode but cannot download a new version of the microcode Action Exchange the FRU for a new FRU Note In this SRN an X represents a digit 0 through F D6PAA Description A high speed SSA link is running at low speed This error is Possible FRUs logged when the run_ssa_link_speed cron runs the ssa_speed utility External SSA cables 30 Action Internal connections in 1 Read before you exchange any the device enclosure FRUS 30 enclosure service 2 If the SSA service aids are available run the Link Speed service aid information see k to determine the cause of the
434. sent, but the data and parity are being rebuilt on the returned or replacement disk drive. The array management software allows read and write operations on a disk drive that is in Rebuilding state. If the power fails before the rebuilding is complete, the array management software restarts the complete rebuilding operation when the power returns. Adapter Replacement. If for any reason an adapter is exchanged for a replacement adapter and a correct shutdown has not been performed, the parity is rebuilt on all the associated arrays when the replacement adapter powers on. A RAID 5 array enters Offline state when two or more member disk drives become missing. Read and write operations are not allowed. RAID 5 Array State Flowchart. [Figure 15. RAID 5 Array State Flowchart. States shown: Array Good, Array Exposed, Array Degraded, Rebuilding, Array Offline.] RAID 10 Array States. Good State
435. shell script to run at 05:01 each day for all SSA devices that are configured in the using system. The shell script analyzes the error log. If it finds any problems, the script warns the user in the following ways. It sends: a message to /dev/console (this message is displayed on the system console); an OPMSG to the error log (this message indicates the source of the error); a mail message to ssa_adm. Note: ssa_adm is an alias address that is set up in /etc/aliases. By default, this address is set to root, but you can change it to any valid mail address for the using system. Good Housekeeping. The items described here can help you ensure that your SSA subsystem works correctly. When you are installing your SSA subsystem, ensure that ssa_adm is set to an address that is suitable for your installation. Regularly view the mail messages or OPMSGs that are in the error log to determine whether the automatic error log analysis has detected any errors. If the automatic error log analysis has detected errors, but the diagnostics do not generate an SRN, run an error log analysis with the history option set. Type ssa_ela -l Device -h timeperiod, where timeperiod is the number of 24-hour periods. Set timeperiod to include at least the 24 hours that preceded the error. For example, if at 09:00 on Monday you find that the error log analysis has reported an error on pdisk3 at 05:01 on Sunda
436. sing systems See with NO Agante information If two using systems are switched off disk drives can become isolated if the SSA subsystem does not have bypass cards see Bypa more than one using system is rebooted at the same time disk drives can become isolated while the boot is running 26 User s Guide and Maintenance Information Reserving Disk Drives The Advanced SerialRAID Adapter the Micro Channel SSA Multi Initiator RAID EL Adapter and the PCI SSA Multi Initiator RAID EL Adapter implement reservation by using commands that are sent directly from adapter to adapter They do not use the SCSI reservation command The advantages of this method are System software can read the Physical Volume ID PVID from a reserved disk drive It is possible to use the ssa_rescheck command to determine which adapter is holding a reservation to a disk drive The diagnostics can detect particular failure conditions on reserved disk drives that they cannot detect with the other reservation method e Fencing can be used on a reserved disk drive Node_number locking is supported With node_number locking the disk drive is not locked to an adapter but rather to a using system To enable a disk drive to be locked to a using system each using system in an SSA network must have a unique node number The node number is stored as the node_number attribute of ssar It can be queried with the Isattr command and set by using th
437. sks Identify System Disks SSA RAID Manager Move cursor to desired item and press Enter Fl Help F2 Refresh F3 Cancel F8 Image F1Q Exit Enter Do Find n Find Next ssaQ Available 00 04 IBM SSA 160 SerialRAID Adapter 14109100 Select the adapter whose candidate disk drives you want to identify Chapter 6 Using the RAID Array Configurator 125 3 The following information is displayed ihe Identify Array Candidate Disks Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields SSA RAID Manager ssaQ Array Candidate Disks Flash Disk Identification Lights yes F1 Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image F9 Shel F1O Exit Enter Do Ne 4 Select yes in the Flash Disk Identification Lights field 5 Press the List key to list the disk drives 6 From the displayed list select the disk drives that you want to identify The Check light flashes on each disk drive that you have selected 126 User s Guide and Maintenance Information Identifying System Disk Drives This option allows you to identify disk drives that are used by the using system These disk drives are not member disk drives of any array 1 For fast path type smitty iassaraid and press Enter Otherwise a Select List Identify SSA Physical Disks from the SSA RAID Arrays menu b Select Identify System Disks 2 Alis t of adapters is displayed in a windo
438. spare disk drives 111 listing old RAID arrays recorded in an SSA RAID manager 131 listing rejected array disk drives 113 listing system disk drives 117 listing the disk drives in an SSA RAID array 109 listing the disks that are in a hot spare pool 80 listing the status of all defined SSA RAID arrays 104 removing a disk drive from an SSA RAID array 138 removing disks from a hot spare pool 86 showing the disks that are protected by hot spares 77 swapping members of an SSA RAID array 142 configuring devices on an SSA loop 18 configuring devices adapter device driver 257 configuring SSA disk drive devices 268 269 using mkdev to configure a logical disk 269 using mkdev to configure a physical disk 269 configuring the Fast Write Cache feature 211 configuring the SSA Target mode 292 copying data from an array 151 copying data from arrays and from volume groups 149 correcting or removing failed disk drives 91 couple action attributes RAID 1 and RAID 10 252 force yes no 252 pool_selection own primary secondary 252 creating a hot spare 72 creation and change attributes all RAID arrays 244 allow_page_splits true false 244 bypass_cache if_oneway 245 fastwrite on off 244 fw_end_block 244 fw_start_block 244 fw_suspended 245 creation and change attributes hot spare pool 249 creation and change attributes RAID arrays 244 creation and change attributes RAID 1 and 10 arrays 247 copy_rate 247 copy_verify_writes 247 fw_max_length 247 hot_spare_
439. splits 247 split_resolution 247 creation and change attributes RAID 1 5 and 10 arrays 245 read_only_when_exposeds true false 245 spare_exact true false 245 spare_preferred 246 spare true false 245 creation and change attributes RAID 10 arrays 248 strip_size 248 creation and change attributes RAID 5 arrays 248 fw_max_length 248 strip_size 248 cron table entries 313 D data paths and loops examples broken loop cable removed 403 broken loop disk drive removed 406 normal loops 401 data paths loops and links configuring devices 18 data paths loops and links 7 large configurations 16 one loop with two adapters in each of two using systems 12 one loop with two adapters in one using system 11 simple loop 8 simple loop one disk drive missing 9 simple loop two disk drives missing 10 two loops with one adapter 14 two loops with two adapters 15 dealing with fast write problems 218 dealing with RAID array problems 89 deciding how to configure hot spare disk drive pools 45 Index 497 Degraded state RAID 10 37 Degraded state RAID 5 33 Delete a RAID Array Copy option 183 Delete a Volume Group Logical Volumes or Filesystems Copy option 184 Delete an SSA RAID Array option 70 detail data formats error logging 225 device attributes 270 device driver entry point 287 device drivers 255 adapter configuring devices 257 description 257 device dependent subroutines 258 direct call entry point 265 IOCINFO ioctl operation 261
440. ss otherwise stated. Serial Storage Architecture (SSA). Serial Storage Architecture (SSA) is an industry-standard interface that provides high-performance, fault-tolerant attachment of I/O storage devices. In SSA subsystems, transmissions to several destinations are multiplexed; the effective bandwidth is further increased by spatial reuse of the individual links. Commands are forwarded automatically from device to device along a loop until the target device is reached. Multiple commands can be travelling around the loop simultaneously. SSA retains the SCSI-2 commands, queuing model, and status and sense bytes. The Advanced SerialRAID Adapters (type 4-P). The Advanced SerialRAID Adapters (see Figure 1) are 40 MB per second Serial Storage Architecture (SSA) Peripheral Component Interconnect (PCI) adapters that serve as the interface between systems that use PCI architecture and devices that use SSA. These adapters provide support for two SSA loops. Each loop can contain a maximum of eight pairs of adapter connectors and a maximum of 48 disk drives. [Figure 1. An Advanced SerialRAID Adapter Card (Type 4-P). Labeled parts: Connector B2, two green lights, Connector A1, Connector B1, type-number label, Connector A2.] Note: In the SSA service aids, this adapter is called IBM SSA 160 SerialRAID Adapter 14109100. The adapter card has four SSA connecto
441. ss that the SSA adapter device driver for this adapter will use. bus_mem_start3: Holds the value of the bus memory start address that the SSA adapter device driver for this adapter will use. bus_mem_start4: Holds the value of the bus memory start address that the SSA adapter device driver for this adapter will use. bus_intr_level: Holds the value of the bus interrupt level that the SSA adapter device driver for this adapter will use. intr_priority: Holds the value of the interrupt priority that the SSA adapter device driver for this adapter will use. daemon: Specifies whether to start the SSA adapter daemon. If the attribute is set to TRUE, the daemon is started when the adapter is configured. The daemon holds the adapter device driver open, although the operating system might not be using that adapter device driver at the time. This action allows the adapter device driver to reset the adapter card if the software that is running on it finds an unrecoverable problem. It also allows the adapter device driver to log errors against the adapter. The ability of the device driver to log errors against the adapter is especially useful if the adapter is in an SSA loop that is used by another adapter, because failure of this adapter can affect the availability of the SSA loop to the other adapter. You can use the chdev command to change the value of this attribute. host_address: Holds the host ad
442. ssistance 411 2 Read carefully the Action you must do for the problem Do not exchange FRUs unless you are instructed to do so 4 When exchanging an adapter always use the instructions that are aiaia with the system unit 412 User s Guide and Maintenance Information SRN Problem Possible Causes 1xxxx Description SRNs in this range are not adapter SRNs Not applicable Action For SRNs in this range see the documentation for your SSA enclosure or SSA subsystem 20PAA Description An open SSA link has been detected Possible FRUs Action Run the Link Verification service aid to isolate the failure see NK e auon e e If the SSA service aids are not available go to the service information for the enclosure in which the device is installed ker aAa External SSA cables 6 Internal SSA connections 4 enclosure service information 21PAA Description An SSA Threshold exceeded link error has been detected Possible FRUs to 29PAA Action Go to MAP Adapter on page 327 External SSA cables 6 Internal SSA connections 4 enclosure service information 2A002 Description Async code 02 has been received Probably a software error Possible FRUs has occurred Action Go to exchanging any FRUs 2A003 Description Async code 03 has been received Probably a software error Software error has occurred Action Go to before exchanging any F
443. st frequently. 3. Assign the data for each pair of adapter connectors to the disk drives that are connected immediately next to the pair of connectors in the loop. Assign the most frequently accessed data to those disk drives that are farthest from the adapter connectors. By doing this action, you prevent the activity of the busiest disk drive from obstructing the data path to the other disk drives. For example, the loop that is shown in Figure 13 contains 16 disk drives. The connectors of adapter A are between disk drives 1 and 16, and the connectors of adapter B are between disk drives 8 and 9. Therefore: Adapter A should access disk drives 1 through 4 and disk drives 13 through 16; the most frequently accessed data should be on disk drives 4 and 13. Adapter B should access disk drives 5 through 8 and disk drives 9 through 12; the most frequently accessed data should be on disk drives 5 and 12. [Figure 13. Pairs of Connectors in the Loop, Some Shared Data] Pairs Of Adapter Connectors in the Loop, Mainly Shared Data. The followi
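The 16-disk example from Figure 13 can be reproduced with a short calculation. This is a sketch under stated assumptions: the loop is modeled as a simple ring of nodes, each disk drive is assigned to the adapter that is fewest hops away, and the names `ring_distance` and `plan_loop` are hypothetical helpers invented here. It illustrates the guideline only; it is not how the adapter or any SSA utility assigns disk drives.

```python
def ring_distance(ring, a, b):
    """Number of hops between two nodes on a closed loop (shorter direction)."""
    i, j = ring.index(a), ring.index(b)
    d = abs(i - j)
    return min(d, len(ring) - d)


def plan_loop(ring, adapters):
    """Assign each disk to its nearest adapter and pick the farthest disks.

    Returns (plan, busiest): plan maps an adapter to the disks it should
    access; busiest maps an adapter to the disks farthest from it, which
    the guideline suggests for the most frequently accessed data.
    """
    plan = {name: [] for name in adapters}
    for node in ring:
        if node in adapters:
            continue
        nearest = min(adapters, key=lambda name: ring_distance(ring, name, node))
        plan[nearest].append(node)
    busiest = {}
    for name, disks in plan.items():
        if not disks:
            busiest[name] = []
            continue
        far = max(ring_distance(ring, name, d) for d in disks)
        busiest[name] = [d for d in disks if ring_distance(ring, name, d) == far]
    return plan, busiest


# Loop from the example: adapter A between disks 16 and 1, adapter B between 8 and 9.
ring = ["A"] + list(range(1, 9)) + ["B"] + list(range(9, 17))
plan, busiest = plan_loop(ring, ["A", "B"])

if __name__ == "__main__":
    for name in ("A", "B"):
        print(name, "accesses", plan[name], "busiest data on", busiest[name])
```

With this model, adapter A is assigned disk drives 1 through 4 and 13 through 16, with drives 4 and 13 farthest from it, and adapter B is assigned drives 5 through 12, with drives 5 and 12 farthest from it, matching the example in the text.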
444. stem crash occurs if block special files are used to access devices that provide support for paging, logical volumes, or mounted file systems. Block special files are provided for logical volumes and for disk devices. They must be used only by the using system for managing file systems, for paging devices, and for logical volumes. These files should not be used for other purposes. The special files that the ssadisk device driver uses include the following, listed by type of device. SSA logical disk drives: /dev/hdisk0, /dev/hdisk1, ..., /dev/hdiskn provide an interface that allows SSA device drivers to have block I/O access to logical SSA disk drives; /dev/rhdisk0, /dev/rhdisk1, ..., /dev/rhdiskn provide an interface that allows SSA device drivers to have character access (raw I/O access and control functions) to logical SSA disk drives. SSA physical disk drives: /dev/pdisk0, /dev/pdisk1, ..., /dev/pdiskn provide an interface that allows SSA device drivers to have character access (control functions only) to physical SSA disk drives. Note: The prefix r on a special file name indicates that the drive is accessed as a raw device rather than as a block device. To perform raw I/O with an SSA logical disk, all data transfers must be in multiples of the device block size. Also, all lseek subroutines that are made to the raw device driver must result in a file pointer value that is a multiple of the device block size.
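The raw I/O rule in the note above (transfer lengths and file pointer values must be multiples of the device block size) can be checked before a request is issued. The following is a minimal sketch; the default block size of 512 bytes is an assumption chosen for illustration, and `check_raw_io` is a hypothetical helper, not part of the ssadisk device driver interface.

```python
def check_raw_io(offset: int, length: int, block_size: int = 512) -> list:
    """Return a list of problems with a planned raw I/O request.

    An empty list means that the file pointer value (offset) and the
    transfer length are both multiples of the device block size.
    The 512-byte default is an assumption; use the real device block size.
    """
    if block_size <= 0:
        raise ValueError("block_size must be positive")
    problems = []
    if offset % block_size != 0:
        problems.append("file pointer %d is not a multiple of %d" % (offset, block_size))
    if length % block_size != 0:
        problems.append("transfer length %d is not a multiple of %d" % (length, block_size))
    return problems


if __name__ == "__main__":
    print(check_raw_io(4096, 8192))  # aligned request
    print(check_raw_io(100, 8192))   # misaligned file pointer
```

A caller might run such a check before seeking on, or reading from, a raw special file, so that a misaligned request is rejected in the application instead of at the device driver.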
445. supported arrays is displayed ie N COMMAND STATUS Command OK stdout yes stderr no Before command completion additional instructions may appear below raid 0 RAID 0 array raid_5 RAID 5 array Fl Help F2 Refresh F3 Cancel F6 Command F8 Image F9 Shel F1O Exit Find ind Next E Chapter 6 Using the RAID Array Configurator 101 Listing All SSA RAID Arrays That Are Connected to a RAID Manager 102 This option lists all the SSA RAID disk drives that are connected to a particular RAID manager 1 For fast path type smitty Ismssaraid and press Enter Otherwise select List All SSA RAID Arrays Connected to a RAID Manager from the SSA RAID Arrays menu A list of RAID managers is displayed in a window we SSA RAID Arrays Move cursor to desired item and press Enter List All Defined SSA RAID Arrays List All Supported SSA RAID Arrays List All SSA RAID Arrays Connected to a RAID Manager List Status Of All Defined SSA RAID Arrays List Identify SSA Physical Disks List Delete Old RAID Arrays Recorded in an SSA RAID Manager List Status of Hot Spare Pools List Status of Hot Spare Protection for an SSA RAID Array List Components in a Hot Spare Pool Add a Hot Spare Pool SSA RAID Manager Move cursor to desired item and press Enter ssa0 Available 00 04 IBM SSA 160 SerialRAID Adapter 14109100 Fl Help F2 Refresh F3 Cancel F8 Image F1O Exit Enter Do Find n Find Next S Select th
446. synchronized This problem has occurred because the cabling has changed or because new hot spare disks have been added to the SSA loop Action 1 If the changes were not planned restore the system to its original configuration If the changes were planned go to step Type smitty ssaraid and press Enter Select List Status of Hot Spare Pools Select the adapter that logged the error If the adapter is not known select all adapters Note the RAID Manager and pool number of spare pools that have a status of Inconsistent Select Change Show Delete a Hot Spare Pool Select the RAID Manager and pool that you noted in step A For each inconsistent spare pool a Check whether the number of hot spare disk drives that are in the pool matches the user s requirements b If necessary select Components to Add or Components to Remove to modify the hot spare disk drives that are in the pool c Set Hot Spares Minimum to the alarm value that the user has specified This number is normally one less than the number of hot spare disk drives that are configured It can however be a lower number d Press Enter Reselect Change Show Delete a Hot Spare Pool This action refreshes the display If the Status of the hot spare pool has changed to Full the repair is complete If any other status value is displayed run diagnostics in System Verification mode to all SSA Adapters to determine the new SRN SSA loop c
447. system switches on power to all the disk drives before or when it switches on the power to the using system Chapter 18 SSA Problem Determination Procedures 471 472 33 34 35 from step B2 Was SRN 49100 logged but no error found when diagnostics were run in System Verification mode NO Go to step Bd YES An array was in the Exposed state but is now in the Good state This problem might have occurred because a disk drive was temporarily removed from the system Verify that using system procedures ensure that the power system switches on power to all the disk drives before or when it switches on the power to the using system from step B3 a Type smitty ssaraid and press Enter b Select List Status of Hot Spare Pools c Display the pool status for each installed SSA adapter Are any pools listed with a status of reduced NO Go to step Ba YES The number of hot spare disk drives that are in the pool is less than the number of hot spare disk drives that were originally assigned to that pool If you have exchanged a failing disk drive and changed its use to Hot Spare Disk or Array Candidate Disk add that disk drive to the reduced hot spare pool To do this action a Type smitty ssaraid and press Enter Select Change Show Delete a Hot Spare Pool Select the adapter and press Enter Select the hot spare pool and press Enter Select Components to Add and press the List key Select a free dis
448. system microcode directory or on diskette. Change the level of microcode for all available disk drives to the latest level that is available in the using system microcode directory or on diskette. Attention: Usually, you can download the microcode to disk drives that are in use. By doing so, however, you might cause a temporary delay in the operating system or in the user's application program. Do not download microcode to a disk drive that is in use unless you have the user's permission. Always refer to the download instructions that are supplied with the microcode, and check for any special restrictions that might be applicable. If you are not sure, do not download to disk drives that are in use. When you download new microcode to a disk drive, the new level of microcode is not shown by the Display the Microcode Levels option until the disk drives have been reconfigured. Run the cfgmgr command before you verify that the new level of microcode is correctly installed. Chapter 17. SSA Service Aids 393 To use the Display/Download Disk Drive Microcode service aid: 1. Select Displa see The following menu is displayed: MICROCODE DOWNLOAD 802420 Move cursor onto selection, then press Enter. Display the Microcode levels of all SSA Physical Disk Drives: Select this option to display the microcode levels installed on all Available SSA disk drives. Download Microcode to selected SSA Physical Disk Drives: Select
449. systemname pdisk7 AC50AE58 3 Good systemname pdisk12 AC7C6E51 2 Good systemname pdisk0 AC706E9A 1 Good systemname pdisk1l AC1DEEE2 0 Good systemname pdisk10 AC1DBE32 0 7 Good MORE 7 F3 Cancel F10 Exit Ne ay Note that the missing disk drive pdisk8 is represented by a line of question marks 408 User s Guide and Maintenance Information Finding the Physical Location of a Device The physical location of a device for example a disk drive or an SSA adapter cannot be reported directly by the using system because of the way in which the SSA interface works The address of an SSA device is related to the position of that device on the SSA loop The address can therefore change if the configuration is changed Finding the Device When Service Aids Are Available To help you to find the correct physical disk drive the SSA service aids include an Identify function This function when selected causes the Check light of the selected disk drive to flash It also causes the Subsystem Check light if present of the unit containing the selected disk drive to flash Some devices for example adapters do not have Check lights To find such a device you can either use the Identify function to identify devices that are next to the SSA adapter on the SSA link or use the procedure described in Finding the Device When No Service Aids Are Available When no service aids are available you must find the device by using the port P and SSA
450. t are used by each device. To set the transmit buffer sizes, use the chdev command to adjust the XmitBuffers and XmitBufferSize attributes in the configuration database. To set the receive buffer size, use the chdev command to adjust the RecvBuffers and RecvBufferSize attributes in the configuration database. The buffer sizes must be multiples of 128 bytes. The maximum buffer size is 512 bytes. A device can have as many buffers as it needs. Data can be written into the buffers for the initiator-mode device at any time, whether or not nonblocking write operations are also transferring data from these buffers. The buffers for the target-mode device can be read at any time, even if a write operation to those buffers is occurring at the same time. It is not important if the sizes of the initiator-mode device buffers are different from the sizes of the target-mode device buffers to which the data is being sent. The total buffer space for the target-mode device, however, must be equal to or greater than the size of the initiator-mode device buffer. The SSA interface for target-mode transfers has been tuned for 512-byte transfers. Each write operation can send as much data as is required, unless that write operation is nonblocking. In a nonblocking write operation, the data that is being written must be completely transferred to the device buffers. Therefore, the maximum amount of data that can be written during a nonblocking write operation i
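As a rough illustration of the buffer rules in the excerpt above (sizes in multiples of 128 bytes, a 512-byte maximum, and total target-mode buffer space at least as large as one initiator-mode buffer), the following Python sketch checks a proposed configuration. The function names are hypothetical and are not part of any AIX or SSA interface; the code only restates the stated constraints.

```python
BUFFER_GRANULE = 128    # buffer sizes must be multiples of 128 bytes (from the text)
MAX_BUFFER_SIZE = 512   # stated maximum size of one buffer (from the text)


def is_valid_buffer_size(size: int) -> bool:
    """Return True if size is a positive multiple of 128 and at most 512."""
    return 0 < size <= MAX_BUFFER_SIZE and size % BUFFER_GRANULE == 0


def target_space_is_sufficient(initiator_buffer_size: int,
                               target_buffers: int,
                               target_buffer_size: int) -> bool:
    """Check that total target buffer space covers one initiator buffer.

    Hypothetical helper: both sizes must first pass the size rule.
    """
    if not (is_valid_buffer_size(initiator_buffer_size)
            and is_valid_buffer_size(target_buffer_size)):
        return False
    return target_buffers * target_buffer_size >= initiator_buffer_size
```

A check like this could be run before the XmitBufferSize and RecvBufferSize attributes are changed, so that an invalid combination is caught early.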
451. t introduced. Similarly, hot spare disk drives are not introduced if the Split Array Resolution flag is set to Secondary (secondary disk drives only are being used), the HotSpareSplits flag is set to off, and all the primary disk drives and the other adapter are not visible. Three types of array management problems can cause an array to be in the Offline state to the using system: • One half of the array is not present. • The adapter is not known to the remaining half of the array. • The split and join procedure was not performed correctly. This section describes those problems, gives examples of possible causes, and recommends recovery actions. One Half of the Array Is Not Present: This condition produces SRN 48750. This type of problem can be caused by: • Disk drive failure. • Power failure in the using system. • Communication problems between two systems on the SSA network. Chapter 8. Split Site Management 195 Single Host System with Primary Disk Drive Missing shows a single host system that has just been switched on. The system contains a RAID 1 array whose primary disk drive is missing. The array remains online, but in the Exposed state, until one of the following occurs: The missing primary disk drive becomes available again before a write operation to the array occurs. Under this condition, the restored disk drive is returned as the array primary disk drive, and the array returns
452. t the need to set the Split Array Resolution flag. Operation after a Loss of Member Disks: An array goes into the Offline state if, after any component failure or change in configuration, it becomes possible for the disk drives of mirrored pairs each to receive different data from different systems. Under such conditions, the disk drives of a mirrored pair might no longer contain matching data; that is, they are not synchronized. An array can continue to operate although one of the following conditions might exist: • The number of managing adapters is increased or decreased when all the primary and secondary configuration disk drives are working. • A primary disk drive is missing, cannot be read, or cannot be written, but the other primary and secondary disk drives are working. • The secondary configuration disk drive is missing, cannot be read, or cannot be written, but both primary configuration disk drives are working. • Both primary configuration disk drives are missing, but the secondary configuration disk drive is working and all the adapters in the management list can be detected. The adapter microcode automatically sets the Split Array Resolution flag. • Both primary disk drives can be detected, but they cannot be read or written, and the secondary configuration disk drive is working. The adapter microcode automatically sets the Split Array Resolution flag. The array goes into the Offline state for the following errors or reconfig
453. t the preferred setting for two-way fast-write operation. If you prefer to use the ssaraid command through the command line interface instead of through the menus. You can get access to the SMIT panels by using fast path commands or by working through the menus. In this chapter, the fast path command for a particular option is given at the start of the description of that option. Notes: 1. Although this book always refers to the smitty commands, you can use either the smitty command or the smit command. The procedures that you follow remain the same, whichever of the two commands you use. If you send the smit command from a graphics terminal, however, the menus are displayed slightly differently from those shown in this book. If you are not familiar with the selection of items from the graphics versions of the menus, use the smitty command. The menus will then appear as shown in this book. 2. If you use fast path commands, you might need to go through intermediate steps that are not shown in this book. Also, some menus might be displayed slightly differently from those shown in this book. Getting Access to the Fast Write Menus: 1. For fast path access to the Fast Write SMIT menus, type smitty ssadlog and press Enter. Otherwise: a. Type smitty and press Enter. The System Management menu is displayed. b. Select Devices. The Devices menu is displayed. c. Select SSA Disks. The SSA Disk
454. target device cannot be located or is not responding. • The target device has indicated an unrecovered hardware error. ESOFT: The target device has reported a recoverable media error. EMEDIA: The target device has found an unrecovered media error. ENODEV: One of the following conditions has occurred: • An attempt was made to access a device that is not defined. • An attempt was made to close a device that is not defined. ENOTREADY: An attempt was made to open an SSA physical device in Service mode while an SSA logical device that uses it was in use. ENXIO: One of the following conditions has occurred: • The ioctl subroutine supplied a parameter that is not valid. • The openext subroutine supplied extension flags that selected a nonexistent or nonfunctional adapter path. • A read or write operation was attempted beyond the end of the fixed disk drive. EPERM: The attempted subroutine requires appropriate authority. ENOCONNECT: The using system has been fenced out from access to this device. ENOMEM: The system does not have enough real memory or enough paging space to complete the operation. ENOLCK: An attempt was made to open a device in Service mode, and the device is in an SSA network that is not a loop. Chapter 13. Using the Programming Interface 275 Special Files: The ssadisk device driver uses raw and block special files to perform its functions. Attention: Corruption of data, loss of data, or loss of system integrity sy
455. tatus of Hot Spare Pools List Status of Hot Spare Protection for an SSA RAID Array List Components in a Hot Spare Pool Add a Hot Spare Pool SSA RAID Manager Move cursor to desired item and press Enter ssaQ Available 00 04 IBM SSA 160 SerialRAID Adapter 14109100 Fl Help F2 Refresh F8 Image F1O Exit Find n Find Next F3 Cancel Enter Do 3 Select the adapter A list is displayed of the disk drives that are attached to the adapter SSA RAID Arrays SSA Physical Disks Move cursor to desired item and press F7 Use arrow keys to scroll ONE OR MORE items can be selected Press Enter AFTER making all selections SSA physical disks that are free pdisk7 0004AC51848900D free n a pdisk8 amp 0004AC51965300D free n a pdisk10 0004AC51BD8F00D free n a SSA physical disks that are hot spares pdisk9 0004AC51BD8000D spare n a SSA physical disks that are system disks pdiskO Q004AC50A30300D system n a F1l Help F2 Refresh F7 Select F8 Image Enter Do Find F3 Cancel F1O Exit n Find Next Physical Physical Physical Physical Physical Chapter 6 Using the RAID Array Configurator 4 Use the Select key to select the disk drives whose use you want to change Select only those disk drives that are to have the same use For example select only disk drives that are to become hot spare disk drives or select only disk drives that are to become system disks The following screen is disp
456. te and hot spare disk drives are enabled, a hot spare disk drive is added to the array when the first write operation to that array is attempted. If no suitable hot spare disk drive is available, the array operates in read-only mode. spare_preferred (default: true): This attribute is used with the spare attribute. With the attribute set to true, a hot spare disk drive is selected only from the hot spare pool that contains the failed member disk drive. With the attribute set to false, a hot spare disk drive, if available, is selected from the hot spare pool that contains the failed member disk drive. If no hot spare disk drive is available in that pool, a hot spare disk drive is selected from the default hot spare pool for that SSA loop (Pool A0 or B0). If no hot spare disk drive is available in the default pool, a hot spare disk drive is selected from any other hot spare pool that contains a hot spare disk drive. Creation and Change Attributes for RAID 1 and RAID 10 Arrays Only: You can specify the following attributes with the -a option when you are using the ssaraid command with the -C or -H option to create or change a RAID 1 or RAID 10 array. split_resolution: primary or secondary (default: primary). This attribute selects the copy of the data that is to remain available if the primary disk drive is completely separated from the secondary disk drive. hot_spare_splits: no or yes (default: no). With the a
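The pool selection order described for the spare_preferred attribute can be sketched as follows. This is a minimal illustration in Python, assuming the semantics exactly as the excerpt states them; the function, its parameters, and the dictionary of spare counts are hypothetical and do not correspond to any adapter or ssaraid interface. The order in which "any other" pool is searched is not specified in the text, so the sketch simply uses dictionary order.

```python
from typing import Dict, Optional


def choose_spare_pool(failed_pool: str,
                      default_pool: str,
                      spares: Dict[str, int],
                      spare_preferred: bool) -> Optional[str]:
    """Return the pool that supplies the hot spare, or None if none qualifies.

    spares maps a pool name to the number of hot spares it currently holds.
    """
    # The pool that contains the failed member is always tried first.
    if spares.get(failed_pool, 0) > 0:
        return failed_pool
    # With the attribute set to true, no other pool may be used.
    if spare_preferred:
        return None
    # Otherwise fall back to the default pool for the loop (A0 or B0).
    if spares.get(default_pool, 0) > 0:
        return default_pool
    # Finally, any other pool that still holds a hot spare (assumed order).
    for pool, count in spares.items():
        if pool not in (failed_pool, default_pool) and count > 0:
            return pool
    return None
```

For example, with an empty pool A1, one spare in A0, and the attribute set to false, the sketch selects A0; with the attribute set to true it selects nothing.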
457. te, the array management software recreates the data that was contained on the missing disk drive. On the Advanced SerialRAID Adapter, the array management software immediately exchanges a hot spare disk drive for the missing disk drive if a hot spare disk drive is enabled and available when the read command is sent. Write Operations while in the Exposed State: When a write command is sent to an array that is in the Exposed state, the array management software does the following: • If a hot spare disk drive is enabled and available when the write command is sent, the array management software immediately exchanges the hot spare disk drive for the missing disk drive, and returns the array to the Rebuilding state. • If no hot spare disk drive is enabled and available, the first write operation causes the array to enter the Degraded state. The written data is not protected. If the power fails during a write operation, data might be lost (64 KB) unless the array is configured to allow read-only operations while in the Exposed state. Most application programs, however, cannot be run when write operations are not allowed. Degraded State: A RAID 5 array enters the Degraded state if, while in the Exposed state, it receives a write command. If a hot spare disk drive is available, the array management software immediately exchanges the hot spare disk drive for the missing disk drive, and returns the array to the Rebuilding state. If no hot spare disk drive is
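The effect of a write command on a RAID 5 array in the Exposed state can be summarized in a small Python sketch. It follows the two cases given in the excerpt. The third case, in which an array configured for read-only operation stays in the Exposed state and rejects the write, is an assumption drawn from the remark that write operations are then not allowed. The function and parameter names are hypothetical.

```python
from typing import Tuple


def write_in_exposed_state(hot_spare_available: bool,
                           read_only_when_exposed: bool = False) -> Tuple[str, bool]:
    """Return (new_array_state, write_accepted) for a write to an Exposed array."""
    if hot_spare_available:
        # The hot spare replaces the missing drive and a rebuild starts.
        return ("Rebuilding", True)
    if read_only_when_exposed:
        # Assumption: the write is refused and the array stays Exposed.
        return ("Exposed", False)
    # No spare and writes allowed: the first write degrades the array.
    return ("Degraded", True)
```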
458. te operation, you should suspect that a return code that is not equal to the expected total length is an error. File offsets are not applicable, and are ignored for target-mode write operations. If the calling program needs to break a blocked write operation, a signal is generated. The target-mode device driver receives that signal and ends the current write operation. The write operation that is in progress fails, and the errno global variable is set to EINTR. The write operation returns the number of bytes that were already sent before the signal was generated. The calling program can then continue by issuing another write operation or an ioctl operation, or it can close the device. If the write operation that the caller attempts to break completes before the signal is generated, the write operation ends normally and the signal is ignored. If the buffers of remote using systems are full, or no device response status is received for the write operation, the target-mode device driver automatically retries the write operation. It retries the operation up to the number of times that is specified by the value TM_MAXRETRY. This value is defined in the /usr/include/sys/tmscsi.h file. By default, the target-mode device driver delays each retry attempt by approximately two seconds to allow the target device to respond successfully. The caller can change the time delayed through the TMCHGIMPARM operation. If the write operation is still unsuccessful after the
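The retry behavior described above (one initial attempt, then up to TM_MAXRETRY retries, each delayed by about two seconds) can be sketched in Python. This is an illustration of the pattern only, not the device driver's code: the `send` callback, the `max_retry` parameter, and the injectable `sleep` function are hypothetical, and the actual value of TM_MAXRETRY is not given in the excerpt.

```python
import time
from typing import Callable


def write_with_retry(send: Callable[[bytes], bool],
                     data: bytes,
                     max_retry: int,
                     delay_seconds: float = 2.0,
                     sleep: Callable[[float], None] = time.sleep) -> bool:
    """Try once, then retry up to max_retry times with a delay between tries.

    send returns True when the target accepted the data.
    """
    for attempt in range(max_retry + 1):
        if send(data):
            return True
        # Wait before the next retry, but not after the final failure.
        if attempt < max_retry:
            sleep(delay_seconds)
    return False
```

Passing `sleep` as a parameter keeps the sketch testable without real delays; a caller that wants the default behavior simply omits it.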
459. te through connectors A1 and A2 of the SSA adapter H Chapter 17 SSA Service Aids 403 Using system Using system Ai A2 Bi B2 a1 A2 Bi B2 Disk Disk Disk Disk Disk Disk Disk Disk 16 15 14 13 12 11 10 9 Disk Disk Disk Disk Disk Disk Disk Disk 1 2 3 4 5 6 7 8 Figure 57 Broken Loop Cable Removed 404 User s Guide and Maintenance Information For this example the Link Verification service aid displays the following information LINK VERIFICATION 802386 SSA Link Verification for systemname ssaQ 00 04 IBM SSA 160 SerialRAID Adapter To Set or Reset Identify move cursor onto selection then press lt Enter gt Physical Serial Adapter Port Al A2 B1 B2 Status TOP systemname pdisk11 AC50AE43 0 Good systemname pdisk8 AC706EA3 1 Good systemname pdisk2 AC1DBE11 2 Good systemname pdisk3 AC1DBEF4 3 Good systemname pdisk7 AC50AE58 4 Good systemname pdisk12 AC7C6E51 5 Good systemname pdiskO AC706E9A 6 Good systemname pdisk1 AC1DEEE2 7 Good 22222 MORE 8 F3 Cancel F1O Exit amp A Note that the column for adapter connector A2 shows no connections Chapter 17 SSA Service Aids 405 Example 3 Broken Loop Disk Drive Removed In igure 58 on page 407 disk drives 1 through 8 are connected to connectors A1 and A2 of the SSA adapter F but the loop is broken because disk drive number 3 has been removed Disk drives 9 through 12 are connected to conne
460. ted, the old_member disk drive is replaced by the new_member disk drive in one operation. 2. You can remove disk drives only from arrays that are not in the Exposed state. When you remove a disk drive, the array enters the Exposed state, and remains in that state until you add the new disk drive. 3. RAID 5 arrays cannot operate if they lose more than one disk drive at a time. 4. To generate a list of suitable exchange candidates, use the -x flag with the list command. Chapter 12. Using the SSA Command Line Interface for RAID Configurations 251 Couple Action Attributes (RAID 1 and RAID 10 Only): You can specify the following attributes with the -a option when you are using the ssaraid command with the -A and -i couple options to do actions on a RAID 1 or RAID 10 array. raid_copy: copy. This attribute specifies the name that you have given to the existing RAID copy that you are going to use. The RAID copy must not be attached to any logical disk. pool_selection: own, primary, or secondary (default: own). own: Assigns the coupled disk drives to the pool to which those disk drives were previously assigned. primary: Each coupled disk drive is assigned to the hot spare pool to which the primary disk drive that it is copying is assigned. secondary: Each coupled disk drive is assigned to the hot spare pool to which the secondary disk drive that it is copying is assigned. force: yes or no. If the specified RAID Copy array contains data that has been c
461. ter card or a disk drive. If this loop becomes broken, the alternative signal path round the loop is automatically used. A link might be broken if: • A device is removed from the loop. • A device on the loop is reset or switched off, or it fails. • An SSA cable is removed, or it fails. Each SSA device has a Ready light that indicates the operational status of the SSA loop to which that device is attached. The light is permanently on when the device can communicate with the two SSA devices that are logically next to it on the SSA loop. The light flashes if the device can communicate with only one of those two devices. The light is off if the device cannot communicate with either of the two SSA devices. Usually, an SSA device is present at each side of the point where the SSA loop is broken. Each of those devices has its Ready light flashing. SSA_LOGGING_ERROR (610BDA5SE): The adapter has passed error log data for a disk drive to the device driver error logger, but the disk drive to which the data is related is not configured into the using system. This problem usually occurs because the disk drive was not available to the adapter when the cfgmgr command was previously run. SSA_SETUP_ERROR (48489B00): A user procedure has not been performed correctly. Use the SRN to determine the procedure that has caused the problem. SSA_SOFTWARE_ERROR (91FBD5DB): The software has detected an unexpected condition. If you have just installe
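The Ready light rules above reduce to a lookup on the number of adjacent SSA devices that a device can reach. The Python sketch below restates that mapping; the function name and the string labels are hypothetical and chosen only for illustration.

```python
def ready_light(reachable_neighbors: int) -> str:
    """Map the number of reachable adjacent SSA devices (0, 1, or 2) to the light state."""
    states = {
        2: "on",        # both logical neighbors reachable: loop intact at this device
        1: "flashing",  # only one neighbor reachable: loop broken on one side
        0: "off",       # neither neighbor reachable
    }
    if reachable_neighbors not in states:
        raise ValueError("an SSA device has exactly two logical neighbors")
    return states[reachable_neighbors]
```

Two adjacent devices that both report "flashing" therefore mark the two sides of a break in the loop.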
462. th Two Adapters; Large Configurations; Switching Off Using Systems; Switching On Using Systems; Configuring Devices on an SSA Loop; SSA Link Speed; Identifying and Addressing SSA Devices; Location Code Format; Pdisks, Hdisks, and Disk Drive Identification; SSA Unique IDs; Rules for SSA Loops; Checking the Level of the Adapter Microcode; Rules for the Physical Relationship between Disk Drives and Adapters; One Pair of Adapter Connectors in the Loop; Pairs of Adapter Connectors in the Loop, Some Shared Data; Pairs of Adapter Connectors in the Loop, Mainly Shared Data; Reserving Disk Drives; Fast Write Cache; Chapter 3. RAID Functions and Array States; RAID Functions; Availability; Disk Drives That Are Not in Arrays; RAID 0 Array States; Good State; Offline State; RAID 1 Array States; RAID 5 Array States; Good State; Exposed State; Degraded State; Rebuilding State; Offline State; RAID 5 Array State Flowchart; RAID 10 Array States; Good State; Exposed State; Degraded State; Rebuilding State; Offline State; Unknown State; Multiple States; Chapter 4. Using the SSA SMIT Menus; Getting Access to the SSA Adapters SMIT Menu; Getting Access to the SSA Disks SMIT Menu; Getting Access t
463. that are connected to a RAID manager 102 listing all supported SSA RAID arrays 101 listing array candidate disk drives 115 listing hot spare disk drives 111 listing old RAID arrays recorded in an SSA RAID manager 131 listing rejected array disk drives 113 506 User s Guide and Maintenance Information SSA RAID arrays continued listing system disk drives 117 listing the disk drives in an SSA RAID array 109 listing the disks that are in a hot spare pool 80 listing the status of all defined SSA RAID arrays 104 removing a disk drive from an SSA RAID array 138 removing disks from a hot spare pool 86 showing the disks that are protected by hot spares 77 swapping members of an SSA RAID array 142 SSA SMIT menus using 38 SSA Target mode buffer management 293 configuring 292 execution of Target Mode requests 294 target mode data pacing 293 using 294 SSA Target Mode 291 SSA unique IDs UIDs 21 ssa_certify command 345 ssa_delete_copy command 171 ssa_diag command 348 ssa_elacommand 353 ssa_format command 358 ssa_fw_status command 360 SSA_GET_ENTRY_POINT ioctl operation 264 description 264 files 264 purpose 264 return values 264 ssa_getdump command 361 ssa_identify_cancel command 129 ssa_make_copy command 161 ssa_progress command 365 ssa_rescheck command 366 ssa_servicemode command 368 ssa_sesdid command 341 ssa_speed command 369 SSA_TRANSACTION ioctl operation 262 description 262 files 263 purpose 262 return values 263 SSA descri
464. that are defined in the /usr/include/ipn/ipntra.h file. A nonzero value indicates an error. ParameterDDR: Set by the caller to indicate the buffer for parameter data. TransmitDDR: Set by the caller to indicate the buffer for transmit data. ReceiveDDR: Set by the caller to indicate the buffer for received data. StatusDDR: Set by the caller to indicate the buffer for status data. TimeOutPeriod: Number of seconds after which the transaction is considered to have failed. A value of 0 indicates no time limit. Note: If an operation takes longer to complete than the specified time-out, the adapter is reset to purge the command. Attention: This is a very low-level interface. It is for use only by configuration methods and diagnostics software. Use of this interface might result in system hangs, system crashes, system corruption, or undetected data loss. Return Values: When completed successfully, this operation returns a value of 0. Otherwise, a value of -1 is returned, and the errno global variable is set to one of the following values: EIO: Indicates an unrecoverable I/O error. ENXIO: Indicates an unknown device. EINVAL: Indicates an unknown command. Indicates a bad buffer type. EACCESS: Indicates user does not have root privilege. ENOMEM: Indicates not enough memory. ENOSPC: Indicates not enough file blocks. EFAULT: Indicates bad user address. Files: /dev/ssa0, /dev/ssa1
465. that is provided in the SSA subsystem. It allows multiple using systems to control access to a common set of disk drives. flag: A character that shows that a particular condition exists. FRU: Field-replaceable unit. G. GB: Gigabyte. gigabyte (GB): 1000000000 bytes. Good state: The state of a RAID array when all its member disk drives are present. H. hdisk: A logical unit that can consist of one or more physical disk drives (pdisks). An hdisk in an SSA subsystem might, therefore, consist of one pdisk or several pdisks. An hdisk is also known as a LUN. hot spare disk drive: A spare disk drive that is automatically added to a RAID array to logically replace a member disk drive that has failed. hot spare pool: A configured group of disk drives that contains pdisks and a hot spare disk drive. The pool ensures that a hot spare disk drive is available if any pdisk in the group fails. hot sparing: The process by which a spare disk drive is automatically added to a RAID array to logically replace a member disk drive that has failed. interface: Hardware, software, or both, that links systems, programs, or devices. IOCC: Input/output channel controller. IPN: Independent Packet Network. ISAL: Independent Network Storage Access Language. K. KB: Kilobyte. kernel: The part of the operating system that contains functions that are needed frequently. kernel mode: In the oper
466. that the adapter is connected through ports B1 and B2. • The Serial column lists the serial numbers of the pdisks. This column is blank if the device is an adapter. • The Adapter Port column shows the address of each adapter port to which a particular device is connected. The device is connected to two ports, except when the SSA loop is broken, under which condition the device is connected to only one port. • The Status column shows the existing status of the physical disk drive as known by the adapter. Status conditions are: Good: The disk drive is working correctly. Failed: The disk drive has failed. Power: The disk drive has detected an enclosure fault. Format: The disk drive is being formatted. An SSA link must be configured in a loop around which data can travel in either direction. The loop is broken if a cable fails or is removed, or if a disk drive fails. Because each disk drive on the loop can be accessed from either direction, the broken loop does not prevent access to any data, unless that data is on the failed disk drive. If the loop is broken between two disk drives, the Ready lights on those disk drives flash to show that only one SSA path is active. Also, the Link Verification service aid shows that only one path is available to each disk drive on the broken loop. You can find the physical location of any disk drive on the loop by using the Identify function see The Iden Fl
467. the array Degraded A copy has been created but one or more coupled disk drives are missing or have failed If missing disk drives are replaced or exchanged for new disk drives the copy operation continues Coupled Disks The disk drives that are coupled to the array and contain the copy of the array data These disk drives are present only if an array copy has been created 186 User s Guide and Maintenance Information Percentage Copied The percentage of the array data that has been copied onto the coupled disk drives When this reaches 100 the coupled disk drives contain an exact copy of the data that is on the array and can be uncoupled at any time from the array For fields that are not defined here see The following information is displayed for RAID Copy arrays r Change Show Attributes of an SSA RAID Array Type or select values in entry fields Press Enter AFTER making all desired changes Entry Fields SSA RAID Manager ssal SSA RAID Array hdisk9 Connection Address Array Name 8A8E39197D02C4T RAID Array Type raid_copy State good Size of Array 18 2GB Member Disks pdisk11 pdisk12 Parent Array hdisk2 Copy Uncoupled Thu May 11 09 31 11 20 gt Allow Page Splits yes Current Use System Disk F1l Help F2 Refresh F3 Cancel F4 List F5 Reset F6 Command F7 Edit F8 Image F9 Shel F10 Exit Enter Do ie A The meanings of the additional fields are State The status of the RAID Copy array Valid val
468. the default naming convention applies For example the copy of the original file system data is named fs data Determines the name of the copy logical volume If this flag is not provided the default naming convention applies For example the copy of the original logical volume Iv001 is named fslv001 Provides information e E filename P Calls the external trigger You can use the external trigger when you want to automate the copy process In the automated script run the ssa_make_copy v vgname E external_file command This command causes the script ssa_make_copy script to wait until the file external_file has been created before it flushes the cache and stops I O You have already determined how long the copy operation takes to run After that period the file external_file is created If the file exists before the copy operation has completed the script an error is reported The external trigger provides a good way to automatically stopping I O at the correct moment If the e flag is used the default filename applies tmp ssa_copy_svs_ trigger Uncouples the copy This command is run only when you want to recreate the copy volume group on another using system When a copy array is uncoupled a new hdisk is created This hdisk contains a copy of the data that was on the parent when the copy was uncoupled Recreates a copy of the volume group on the copy disk drives e Changes PVID references of the VGDA to suit the
469. the menus use the smitty command The menus will then appear as shown in this book Different microcode levels might cause slightly different versions of the menus to be displayed If you use fast path commands you might need to go through intermediate steps that are not shown in this book Also some menus might be displayed slightly differently from those shown in this book 39 Getting Access to the SSA Adapters SMIT Menu 1 For fast path access to the SSA RAID Array SMIT menus type smitty ssaa and press Enter Otherwise a Type smitty and press Enter The System Management menu is displayed b Select Devices The Devices menu is displayed c Select SSA Adapters 2 The SSA Adapters menu is displayed Via SSA Adapters Move cursor to desired item and press Enter List All SSA Adapters Change Show Characteristics of an SSA Adapter Generate Error Report Trace an SSA Adapter Change Show the SSA Node Number for this System List All SSA Adapter Dumps Copy an SSA Adapter Dump F1 Help F2 Refresh F3 Cancel F8 Image F9 Shel F1O Exit Enter Do If you need help with an item move the cursor to that item and press F1 Help 40 Users Guide and Maintenance Information Getting Access to the SSA Disks SMIT Menu ie For fast path access to the SSA RAID Array SMIT menus type smitty ssad and press Enter Otherwise a Type smitty and press Enter The System Management menu is displayed b Select Devic
470. the selected hdisk and the pdisk, or pdisks if it is a RAID array, that makes up the hdisk. • If SD 6000 is installed on the using system and a hardware error is logged, the SD 6000 runs error log analysis and reports an incident if problems are found that need service activity. Error Log Analysis Routine: The purpose of the SSA error log analysis routine that is contained in the diagnostics is to generate an SRN for any logged errors that need service action. Normally, the error log analysis is related to the previous 24-hour period. If you want to perform an error log analysis that is related to a period longer than 24 hours, use the ssa_ela command. If the detail data field for the error record contains SCSI sense data: • SSA_DISK_ERR2 or SSA_DISK_ERR3 type errors do not generate an SRN. • DISK_ERR1 or DISK_ERR4 type errors (media errors) generate an SRN if more than a predetermined number of these errors exist in the log. The SRN is 1XXXX, where XXXX is the contents of bytes 20 and 21 of the detail data. • SSA_DISK_ERR1 or SSA_DISK_ERR4 type errors generate the SRN 1XXXX, where XXXX is the contents of bytes 20 and 21 of the detail data. If the detail data field contains SSA error code data, the first character of the data is used as an error log analysis threshold value. If the number of times that a particular error has been logged during the previous 24 hours is greater than the t
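The 1XXXX rule for SCSI sense data can be illustrated with a short Python sketch. It assumes that the detail data is available as a byte string, that "bytes 20 and 21" are counted from zero, and that XXXX is their hexadecimal representation; none of these details is confirmed by the excerpt, and the function name is hypothetical.

```python
def srn_from_detail_data(detail: bytes) -> str:
    """Build an SRN of the form 1XXXX from bytes 20 and 21 of the detail data.

    Assumes zero-based byte offsets and a hexadecimal rendering of XXXX.
    """
    if len(detail) < 22:
        raise ValueError("detail data must contain at least 22 bytes")
    return "1" + detail[20:22].hex().upper()
```

Under these assumptions, detail data whose bytes 20 and 21 hold 0x04 and 0x4A would produce the string 1044A.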
471. the state of the Split Array Resolution flag, and a separate write operation is performed to each half of the array.
Both adapters can be detected on the SSA loop, but one half of the RAID array cannot be detected, and the following sequence of events occurs:
1. A write operation is performed.
2. The half of the array that could originally be detected can no longer be detected.
3. The half of the array that could not originally be detected can now be detected, and data is written to it.
4. The half of the array that could originally be detected can now be detected again.
The problem can also occur in a single-host system if only one half of the RAID array can be detected, and the following sequence of events occurs:
A write operation is performed.
The half of the array that could not originally be detected can now be detected.
The half of the array that could originally be detected can no longer be detected, and data is written to it.
The half of the array that could originally be detected can now be detected again.
RAID 10 Array Split
The following figure shows a RAID 10 array that has been split into halves.
Figure 39. Forcing Access on a RAID 10 Array
The Split Array Resolution flag ha
472. the volume group name is listed the RAID arrays that were copied as part of that volume group.
Status
The status of the RAID Copy array. Valid values for status are:
Good: All the array components are present and operational.
Offline: One or more array members are missing or have failed.
Unknown: A RAID Copy array has been created, but has not been coupled to an array. An hdisk cannot be created from this RAID Copy array. This RAID Copy array can only be coupled to an array or deleted.
Parent Array
The name of the array from which the data was copied.
Timestamp
The date and time at which the copy was uncoupled from the parent array.
Delete a RAID Array Copy
For fast path, type smitty copy_delcopy and press Enter. Otherwise, select Delete a RAID Array Copy from the Array Copy Services menu. The following information is displayed:
Array Copy Services
Move cursor to desired item and press Enter.
Prepare a RAID Array Copy
Prepare Volume Group Logical Volumes or Filesystems Copy
Uncouple a RAID Array Copy
List All Copy Candidates
List All Uncoupled Copies
List All Uncoupled Volume Groups
Delete a RAID Array Copy
Move cursor to desired item and press Enter.
hdisk3 raid_1
hdisk9 raid_copy
F1=Help F2=Refresh F8=Image F10=Exit /=Find n=Find Next
Uncouple a Volume Group Logical Volumes or Filesystems Copy
Delete a Volume Group Logical Volumes or Filesyst
473. tically resynchronizes the stale partition if no replacement hardware is needed.
- It sends an email message that describes the actions taken and indicates whether a disk drive must be exchanged for a new one.
- It logically exchanges the failed disk drive in the volume group with a preassigned spare disk drive. This process is called hot sparing.
- It synchronizes the new disk drive with the remaining disk drives in the volume group.
- It notifies you when it has logically exchanged a failed disk drive and has synchronized the replacement disk drive with the other disk drives in the volume group.
When a failed disk drive has been physically exchanged and the data has been resynchronized, the SSA Spare Tool can use various scripts to:
- Move the data from the temporary spare disk drive to the replacement disk drive.
- Prepare the spare disk drive so that it can again be used if another disk drive fails.
To obtain the SSA Spare Tool, go to the following uniform resource locator (URL) address and follow the instructions that are given there:
http://www.storage.ibm.com/hardsoft/products/ssa
Chapter 10. Using the Fast Write Cache Feature
This chapter describes how to configure the Fast Write Cache feature, and how to deal with any fast write problems that might occur during fast write operations.
Fast Write Cache Card Battery
The Fast Write Cache Option Card receives
474. tion for service use only I ID U Modifies the enclosure ID The ID must be a four character alphanumeric string The optional U flag causes the object data manager ODM to be updated to show the change B mode Modifies all bypass cards to the given mode B card mode Modifies the specified bypass cards to be in the given modes The valid values for the mode parameter are Automatic Bypass e Inline e Open Valid values for the card parameter are 1 4 5 8 9 12 13 16 S Resets the exchanged flag of the selected FRU S d drive_bay Resets the flag for the selected disk drive bay slot Valid values for drive_bay are 1 2 3 and so on 356 User s Guide and Maintenance Information Examples S b card Resets the flag for the bypass card Valid values for card are 1 4 5 8 9 12 13 16 S p PSU Sr Sc S0 Resets the flag for the selected PSU power supply assembly Valid values for PSU are 1 2 Resets the flag for remote power on RPO Resets the flag for the controller card Resets the flag for the operator panel T threshold value Modifies the specified temperature thresholds to the given values The valid values for the threshold parameter are lowarn The low temperature warning threshold hiwarn The high temperature warning threshold The value parameter is a temperature in degrees C e To display the status of all bypass cards on enclosure0 g
475. tion about the service aids.
Repair Actions
When you have determined the parts of the failing link, check whether the cause of the problem is obvious (for example, a loose cable connection or a disk drive with its Check light on). If the cause is obvious, correct it. If the cause is not obvious, exchange the parts of the link, one at a time, until the problem is solved.
Part 3. Appendixes
Appendix. Communications Statements
The following statements apply to this product. The statements for other products intended for use with this product appear in their accompanying manuals.
Federal Communications Commission (FCC) Statement
This equipment has been tested and found to comply with the limits for a Class A digital device, pursuant to Part 15 of the FCC Rules. These limits are designed to provide reasonable protection against harmful interference when the equipment is operated in a commercial environment. This equipment generates, uses, and can radiate radio frequency energy and, if not installed and used in accordance with the instruction manual, may cause interference to radio communications. Operation of this equipment in a residential area is likely to cause harmful interference, in which case the user will be required to correct the interference at his own expense. Properly shielded and grounded cables and connectors must be used
476. tl Operation
Purpose
Description
Return Values
Files
SSA Disk Concurrent Mode of Operation Interface
Device Driver Entry Point
Top Kernel Extension Entry Point
SSA Disk Fencing
SSA Target Mode
Configuring the SSA Target Mode
Buffer Management
Understanding Target Mode Data Pacing
Using SSA Target Mode
Execution of Target Mode Requests
SSA tmssa Device Driver
Purpose
Syntax
Description
Configuration Information
Device Dependent Subroutines
Errors
tmssa Special File
Purpose
Description
Implementation Specifics
Part 2. Maintenance Information
Related Information
IOCINFO (Device Information) tmssa Device Driver ioctl Operation
Purpose
Description
TMIOSTAT (Status) tmssa Device Driver ioctl Operation
Purpose
Description
TMCHGIMPARM (Change Parameters) tmssa Device Driver ioctl Operation
Purpose
Description
Chapter 14. SSA Adapter Information
Installing the SSA Adapter
Cron Table
477. to protect such parts. The Fast Write Cache Option Card might contain customer data.
1. Refer to Figure 52.
Figure 52. Connecting the Battery Assembly
2. Hold the Fast Write Cache Option Card so that the battery assembly connector is upward.
3. Connect the battery assembly to the connector.
4. Refer to Figure 53.
Figure 53. Installing the Battery Assembly
5. Locate the battery assembly so that the two lugs are under the edges of the small slots and the clip is over the larger slot. Carefully press the battery assembly downward until the clip latches under the edge of the slot for the using system.
Note: The battery on the Fast Write Cache Option Card might be discharged. If it is, the diagnostics fail and generate SRN 42529 (see "Service Request Numbers (SRNs)"). While the battery is discharged, the adapter can be used; the Fast Write Cache feature, however, remains disabled until the battery is charged. The battery becomes fully charged approximately one hour after the adapter is connected to the power. You must now reset the battery age counter
478. ttached to the SSA adapter. None.
Software and Microcode Errors
Some SRNs indicate that a problem might have been caused by a software error or by a microcode error. If you have one of these SRNs, do the following actions:
1. Make a note of the contents of the error log for the device that has the problem.
2. For AIX Versions 4.2 and above, run the snap -b command to collect system configuration data and to dump data. For AIX versions below 4.2, go to the using-system service aids and select Display Vital Product Data to display the VPD of the failing system. Make a note of the VPD for all the SSA adapters and disk drives.
3. Report the problem to your support center. The center can tell you whether you have a known problem, and can, if necessary, provide you with a correction for the software or microcode. If the support center has no known correction for the SRN, exchange, for new FRUs, the FRUs that are listed in the SRN.
SSA Loop Configurations that Are Not Valid
Note: This section is related to SRN 48000.
SRN 48000 shows that the SSA loop contains more devices or adapters than are allowed. The maximum numbers allowed depend on the adapter; see the details that are given for each adapter.
If the SRN occurred when you, or the customer, switched on the using system:
1. Switch off the using system.
2. Review the configuration that you are trying to make
479. ttribute set to no, the RAID manager does not attempt to use hot spare disk drives to replace missing member disk drives if a RAID 1 or RAID 10 array is split exactly in half, or all the secondary member disk drives are present. It is recommended that this attribute be set to no if the RAID 1 or RAID 10 array is configured to provide protection against the loss of a physical domain.
copy_rate (default 50)
This attribute controls the speed of the copy operation when an I/O operation is in progress. It can be set to any integer value 1 through 100. Higher values increase the speed of the copy operation, but decrease the speed of the I/O operation.
copy_verify_writes (default no)
With the attribute set to yes, all data that is written to the RAID Copy array is verified before the write operation completes. The results of this action are:
- Unrecoverable media errors are less likely to be found when the RAID Copy array is uncoupled and read.
- More time is needed to perform the copy operation.
fw_max_length (default 128)
This attribute sets the maximum size, in blocks, of write operations to the cache. Write operations that are larger than the specified value write data directly to the array and do not use the fast write cache.
Creation and Change Attribute for RAID 5 Arrays Only
You can specify the following attribute with the -a option when you are using the ssarai
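The limits described for copy_rate and fw_max_length can be expressed as a small Python sketch. It only models the rules stated above (an integer from 1 through 100, and a size threshold in blocks); it does not call the ssaraid command, and both function names are hypothetical.

```python
def validate_copy_rate(value: int) -> int:
    """Accept only integer values 1 through 100, as described for copy_rate."""
    if isinstance(value, bool) or not isinstance(value, int):
        raise TypeError("copy_rate must be an integer")
    if not 1 <= value <= 100:
        raise ValueError("copy_rate must be in the range 1 through 100")
    return value


def uses_fast_write_cache(write_blocks: int, fw_max_length: int = 128) -> bool:
    """Return True if a write of write_blocks blocks goes through the cache.

    Writes larger than fw_max_length bypass the cache and go directly
    to the array; writes of exactly fw_max_length blocks still use it.
    """
    return write_blocks <= fw_max_length


print(validate_copy_rate(50))        # the documented default is accepted
print(uses_fast_write_cache(64))     # smaller than the default threshold
print(uses_fast_write_cache(256))    # larger than the default threshold
```

With the default fw_max_length of 128, a 64-block write uses the cache and a 256-block write does not.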
480. tus of Light: Meaning
Off: Both SSA links are inactive.
Permanently on: Both SSA links are active.
Slow flash (two seconds on, two seconds off): Only one SSA link is active.
If your subsystem has other link status lights, see the subsystem service information for the subsystem for more details.
Service Aid
If service aids are available, you can use the Link Verification service aid to show that the SSA loop is broken.
LINK VERIFICATION 802386
SSA Link Verification for: systemname ssa0 00-04 SSA Enhanced RAID Adapter
To Set or Reset Identify, move cursor onto selection, then press <Enter>.
Physical / Serial / Status
systemname pdisk11 / AC50AE43 / Good
systemname pdisk8 / AC706EA3 / Good
systemname pdisk2 / AC1DBE11 / Failed
systemname pdisk3 / AC1DBEF4 / Good
systemname pdisk7 / AC50AE58 / Good
systemname pdisk12 / AC7C6E51 / Good
systemname pdisk0 / AC706E9A / Good
systemname pdisk10 / AC1DBE32 / Good
F3=Cancel F10=Exit
This example screen shows a break in the SSA loop between pdisk0 and pdisk10. In the condition shown by the display, the Ready lights on pdisk0 and pdisk10 are both flashing. To help locate these disk drives, select the pdisk and press Enter. The Check light on the selected disk drive flashes. This action does not affect the customer's operations. For more informa
481. two nodes. The path can change if, for example, SSA loops are changed, nodes are switched off, or any other physical change is made to the connected SSA loops. The TMSSA device driver can use any available path to the other node, but does not tell you which path is being used. Each node must have, in its device configuration database, a unique node number that is defined by the node_number attribute of the ssar device.
Figure 42. An Example of Node-to-Node Communications (Node 1 and Node 2; adapters ssa0, ssa1, ssa2, ssa3, ssa4, and ssa5)
Figure 42 shows an example configuration of two nodes. In this example, tmssa is at first using adapter ssa0 on node 1 and adapter ssa5 on node 2. Suddenly, the link between the adapters fails. The tmssa device driver automatically switches to adapters ssa1 and ssa3, or adapters ssa1 and ssa4. The connections between nodes can be modified while they are in use, and the target mode interface tries to recover.
The TMSSA uses either of two methods to read and write data:
- The blocking method, which waits until the I/O is complete, or an error occurs, before it returns control to you.
- The nonblocking method, which returns control to you immediately. With this method, the write operation occurs at a later time. The read operation returns the a
482. ues for status are:
Good: All the array components are present and operational.
Offline: One or more array members are missing or have failed.
Unknown: A RAID Copy array has been created, but has not been coupled to an array. An hdisk cannot be created from this RAID Copy array. This RAID Copy array can only be coupled to an array or deleted.
Parent Array
The name of the array from which the data was copied.
Copy Uncoupled
The date and time when the copy was uncoupled from the parent array.
List Status Of All Defined SSA RAID Arrays
For fast path, type smitty lstssaraid and press Enter. Otherwise, select List Status Of All Defined SSA RAID Arrays from the SSA RAID Arrays menu. The following information is displayed for RAID Copy arrays:
COMMAND STATUS
Command: OK stdout: yes stderr: no
Before command completion, additional instructions may appear below.
Components Status
hdisk9 good
pdisk11 good
pdisk12 good
hdisk11 good
pdisk18 good
pdisk21 good
pdisk17 good
F1=Help F2=Refresh F3=Cancel F6=Command F8=Image F9=Shell F10=Exit /=Find n=Find Next
Status data is given for the array and each disk drive in the array. The status values for the array are:
Good: All the array components are present and operational.
Offline: One or more array members are missing or have failed.
Unknown: A RAID Co
483. uilding operation had completed, a hot spare disk drive has not replaced the failed disk drive.
Possible FRUs: Device (100%)
SRN 49000
Description: An array is in the Degraded state.
A RAID 5 array causes this SRN if a disk drive is not available to the array and a write command is sent to the array. A RAID 1 or RAID 10 array causes this error if the array has one or more degraded mirrors. A RAID 1 or RAID 10 mirror becomes degraded when one disk drive in the mirror pair is not available and a write command is sent to the array.
A disk drive might not be available for one of the following reasons:
- The disk drive has failed.
- The disk drive has been removed from the subsystem.
- An SSA link has failed.
- A power failure has occurred.
Action: If the SSA service aids are available, run the Link Verification service aid to find any failed disk drives, failed SSA links, or power failures that might have caused the problem.
If you find any faults, go to the Start MAP (or equivalent) in the enclosure service information to isolate the problem, then return the array to the Good state.
If the SSA service aids are not available, or the Link Verification service aid does not find any faults, go to the MAP to isolate the problem.
SRN 49100
Description: An array is in the Exposed state. A disk drive can become
484. up Logical Volumes or Filesystems Copy
F1=Help F2=Refresh F3=Cancel F8=Image F9=Shell F10=Exit Enter=Do
The following Array Copy Services are available:
Prepare a RAID Array Copy
Select this option if you want to make a copy of a RAID array. If the RAID array is a member of an active volume group, select Prepare a Volume Group Logical Volumes or Filesystems Copy. When this option is selected, an array selection menu is displayed, and is followed by the Prepare a Copy menu.
Prepare Volume Group Logical Volumes or Filesystems Copy
Select this option if you want to copy a volume group or a part of a volume group. You must select this option if you want to copy any part of an active volume group. When this option is selected, a volume group selection menu is displayed. When the volume group is selected, the ssa_make_copy script selects disk drives to be used for the copy, and couples those disk drives to each array in the volume group. To get more control over this process, use the ssa_make_copy command directly from the command line or from within a shell script. For fast path, type smitty copy_pre_vglvfs and press Enter.
Uncouple a RAID Array Copy
Select this option if you want to uncouple a copy from an array. The copy must be in the Good state before it can be uncoupled. When the copy is uncoupled, a ne
485. uple a RAID Array Copy. A list is displayed of all the RAID 1 and RAID 10 arrays that have been copied.
c. Select the array that you want to uncouple (for example, hdisk3) and press Enter. When the array is uncoupled, a new hdisk is created. This hdisk contains a copy of the data that was on the parent array when the RAID Copy array was uncoupled. The displayed output from the uncouple command includes the hdisk name that is assigned to the RAID Copy array.
7. When you no longer need the data that is on the RAID Copy array, you can recouple the RAID Copy array to the parent array, or delete it. To delete the RAID Copy array:
a. Select Array Copy Services from the SSA RAID Arrays menu.
b. Select Delete a RAID Array Copy.
c. Select the RAID Copy array to be deleted and press Enter.
The hdisk is deleted from the system configuration. The RAID Copy array is deleted, and its member disk drives are changed to free disk drives.
Using the ssa_make_copy Command to Create a RAID Copy from a RAID 1 or RAID 10 Array
The following figure shows a volume group that contains one physical volume, hdisk1. This hdisk is a RAID 1 array.
Figure 22. A Volume Group that Contains One Physical Volume (volume group with RAID 1 array hdisk1, made up of pdisk1 and pdisk2)
The ssa_make_copy command checks whether the hdisk is either a RAID 1 or a RAID 10 array. If the array is neither of these types, the co
486. uration.
If the problem has resulted from the failure of an SSA adapter and the failure of half the members of an array, do the following actions to attach the array to the adapter:
1. Type smitty ssaraid and press Enter.
2. Select Change/Show Attributes of an SSA RAID Array.
3. Change the setting of the Split Array Resolution flag. The array becomes available, but remains in the Exposed state or the Degraded state until the missing array members are added.
4. Add the missing members of the array:
a. Select Change Member Disks in an SSA RAID Array from the SSA RAID Arrays menu.
b. Select Add a Disk to an SSA RAID Array once for each disk drive that you need to add to the array.
SRN 48750
Description: An array is in the Offline state because the primary or the secondary half of the array is not present.
The adapter cannot get access to the other adapter in the loop because one of the following conditions exists:
- The SSA adapter can detect the secondary half of the array, but the Split Array Resolution flag is set to Primary.
Action: Switch on the power to the other half of the array, or repair the broken loop that is preventing access to the other disk drives.
If the other half of the array has become permanently unavailable because an unrecoverable error has occurred, you can force access to the available half of the array.
Attention: Do not force access to the available half of the ar
487. uration.
- Both primary configuration disks are missing, and fewer than all the managing adapters on the list are visible. Split Array Resolution needs to be set before operations to the array can continue.
- The secondary disk drives are missing, and all the managing adapters are missing. The primary side of the array initializes, read or write operations are performed on the array, and the Split Array Resolution flag is not set on the primary configuration disk drives. Later, the secondary configuration disk drive appears. Its Split Array Resolution flag is set. Under these conditions, both the primary side and the secondary side might have been written independently, and the data might not be consistent on the mirrored pair members. You must determine whether the correct data is on the primary side or on the secondary side. Then you must reinitialize by changing the value of the Split Array Resolution flag on the side that does not have the correct data.
The HotSpareSplits parameter can be used to control whether hot spare disk drives are to be introduced when exactly half the member disk drives of an array are missing. In a split-site configuration, when one site loses access to the other, it might be desirable that hot spare disk drives are not introduced while half the disk drives are no longer visible. When the HotSpareSplits flag is set to off, and all the secondary disk drives and the other adapter are not visible, hot spare disk drives are no
488. ures need to be performed. These problems are indicated by any of the following Service Request Numbers (SRNs): 42521, 42524, 42525.
If any of these SRNs occurs, do the following actions:
1. Go to "Service Request Numbers (SRNs)" and find the SRN.
2. Do the actions that are given for the SRN, and come to this section when the actions tell you to do so.
Important: Do not come to this section until the SRN actions tell you to do so.
3. Find the SRN in this section and do the actions given.
SRN 42521
You can use the ssaraid command to list the devices that are affected by this failure. The ssaraid command is in /usr/sbin. To list all devices that are affected by this cache failure, type:
ssaraid -l ssaX -Iz -a state=cache_data_error
where X is the number of the adapter that has reported the failure in the error log (for example, ssa3). The output from the command produces one line of information for each device, as follows:
2327340C228635K 2327340C228635K cache_data_error 9.1GB RAID 5 array
hdisk3 2327340C423235K cache_data_error 36.4GB RAID 5 array
pdisk5 08005AEA045E00D free n/a 9.1GB Physical disk
For non-RAID disk drives, the pdisk is listed with the pdisk state. No array resource state exists. For RAID arrays, the hdisk is listed with the array resource state.
The location of the corrupted data is not known, and no simple data recovery procedure is possible. To attempt data recovery, you must disable the fast write cache
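A listing of this kind can be filtered by a script. The following Python sketch is illustrative only and is not part of the SSA software. It assumes that each output line is whitespace-separated with the device name first, the serial number second, and a state third; the sample text is adapted from the example listing in this section, and the function and class names are hypothetical.

```python
from typing import List, NamedTuple


class DeviceLine(NamedTuple):
    name: str
    serial: str
    state: str


def parse_listing(text: str) -> List[DeviceLine]:
    """Split each non-empty line into name, serial, and state (assumed layout)."""
    devices = []
    for line in text.splitlines():
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        devices.append(DeviceLine(fields[0], fields[1], fields[2]))
    return devices


def affected_devices(text: str) -> List[str]:
    """Return the names of devices whose state is cache_data_error."""
    return [d.name for d in parse_listing(text) if d.state == "cache_data_error"]


SAMPLE = """\
2327340C228635K 2327340C228635K cache_data_error 9.1GB RAID 5 array
hdisk3 2327340C423235K cache_data_error 36.4GB RAID 5 array
pdisk5 08005AEA045E00D free n/a 9.1GB Physical disk
"""

print(affected_devices(SAMPLE))
```

For the sample text, only the first two devices are reported; pdisk5 is skipped because its state field is free.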
489. us of one or more mirrored pairs is rebuilding. No mirrored pair is exposed or degraded.
offline
One of the following conditions exists:
- The status of one or more mirrored pairs is offline.
- The first two member disk drives of the primary half are missing, and the Split Array Resolution flag is set to Primary.
- The first member disk drive of the secondary half is missing, and the Split Array Resolution flag is set to Secondary.
unknown
Not enough configuration data is available to determine the state of the array. Under these conditions, ignore the primary and secondary designations and the array pair status.
The status values for mirrored pairs are:
good: Both member disk drives of the mirrored pair are present and working.
exposed: One member disk drive of a mirrored pair is missing or has failed, but no write operation has been issued to the pair.
degraded: One member disk drive of a mirrored pair is missing or has failed, and a write operation has been issued to the working member disk drive.
rebuilding: A failed member disk drive in a mirrored pair has been exchanged, and data is rebuilding.
offline: Both member disk drives of a mirrored pair are missing or have failed.
Listing or Identifying SSA Physical Disk Drives
This option allows you to list the disk drives that are being used by a particular array, and to identify particul
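The status values for mirrored pairs can be summarized as a small decision function. The Python sketch below is a reading of the definitions above, not the adapter's actual logic. The function name and parameters are hypothetical, and the order in which the conditions are tested (offline first, then rebuilding) is an assumption.

```python
def mirrored_pair_status(members_working: int,
                         write_issued: bool = False,
                         rebuilding: bool = False) -> str:
    """Classify a mirrored pair from the number of working member drives.

    members_working: 0, 1, or 2 member disk drives present and working.
    write_issued:    a write has been issued while one member was missing.
    rebuilding:      a failed member has been exchanged and data is rebuilding.
    """
    if members_working not in (0, 1, 2):
        raise ValueError("a mirrored pair has exactly two member disk drives")
    if members_working == 0:
        return "offline"
    if rebuilding:
        return "rebuilding"
    if members_working == 2:
        return "good"
    return "degraded" if write_issued else "exposed"


for args in [(2,), (1,), (1, True), (1, False, True), (0,)]:
    print(args, mirrored_pair_status(*args))
```

The difference between exposed and degraded is only whether a write has reached the surviving member, which is why write_issued is a separate input.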
490. used. Set to zero.
resvd2: Unused. Set to zero.
Note: The tm_get_stat structure works only for the initiator device.
TMCHGIMPARM (Change Parameters) tmssa Device Driver ioctl Operation
Purpose
To allow the caller to change the retry parameter and the time-out parameter that are used by the target mode device driver.
Description
This operation allows the caller to change the default setup of the device. It is allowed only for the initiator mode device. The arg parameter to the TMCHGIMPARM operation contains the address of the tm_chg_im_parm structure that is defined in the /usr/include/sys/tmscsi.h file.
Default values that are used by the device driver for the retry parameter and for the time-out parameter usually do not require change. For some calling programs, however, default values can be changed to fine-tune timing parameters that are related to error recovery.
When a parameter is changed, it remains changed until another TMCHGIMPARM operation occurs, or until the device is closed. When the device is opened, the parameters are set to the default values.
Parameters that can be changed with this operation are:
- The delay, in seconds, between device-driver-initiated retries of send commands (the retry parameter)
- The time allowed before the write operation times out (the time-out parameter)
To indicate which of the possible two parameters the caller is changing, the call
491. utes with the -a option when you are using the ssaraid command with the -A and -i exchange options to do maintenance on an array:
new_member=disk
This attribute specifies the disk drive that is to be added to the array, either in exchange for a failing disk drive that has caused the array to enter the Exposed state, or in exchange for a disk drive that the old_member attribute has specified.
old_member=disk
This attribute specifies the member disk drive that is to be removed from the array. You can use the attribute on its own or with the new_member attribute. Use the old_member attribute on its own if you want only to remove the disk drive from the array. Use the old_member attribute and the new_member attribute together if you want to exchange the disk drives in one action and the subsystem has a spare slot available for the new disk drive. If no spare slot is available, use the following method to exchange the disk drives:
1. Logically remove the failing disk drive. For this action, use the ssaraid command with only the old_member attribute specified.
2. Physically remove the disk drive from the slot.
3. Install the new disk drive into the slot that contained the old disk drive.
4. Logically add the new disk drive to the array. For this action, use the ssaraid command with only the new_member attribute specified.
Notes:
1. If you specify the new_member attribute and the old_member attribute together, an in-place exchange is attemp
492. ve exchanged all the FRUs that were originally reported by the SRN?
NO: Exchange the next FRU that is listed for the SRN.
YES:
a. Run diagnostics, in System Verification mode, to all the adapters that are in this SSA loop.
b. Run diagnostics, in System Verification mode, to all the disk drives that are in this SSA loop.
c. Run the Certify service aid to all the disk drives that are in this SSA loop.
d. Correct all errors that are reported by the diagnostics.
e. Go to step 7.
7. (from step 8 in MAP 2010: START, and from earlier steps in this MAP)
Does your subsystem contain RAID arrays?
If you are not sure whether your subsystem contains RAID arrays:
a. Type smitty ssaraid and press Enter.
b. Select List All Defined SSA RAID Arrays.
NO: Run the repair verification or repair completion procedures that are defined by your using system.
YES: Go to the RAID Checkout at step 22 of MAP 2324: SSA RAID.
SSA Link Errors
SSA link errors can be caused if:
- Power is removed from an SSA device.
- An SSA device is failing.
- An SSA device is removed.
- A cable is disconnected.
Such errors might be indicated by:
- SRN 45PAA
- A flashing link status (Ready) light on the SSA device at each end of the failing link
- The indication of an open link by the Link Verification service aid
SSA Link Error Problem D
493. ved, the reservation is broken. Otherwise, the open operation runs normally. This flag has support only for SSA logical disks. You cannot specify this flag and the SSADISK_FENCEMODE flag together.
SSADISK_RETAIN_RESERVATION
Retains the reservation of the device after a close operation by not issuing the release. This flag prevents other initiators from using the device unless they break the using-system reservation.
Note: This flag does not cause the device to be explicitly reserved during the close if it was not reserved while it was open.
This flag has support only for SSA logical disk drives. You cannot specify this flag and the SSADISK_FENCEMODE flag together.
SSADISK_NO_RESERVE
Prevents the reservation of a device during an openx subroutine call to that device. This operation is provided so a device can be controlled by two processors that synchronize their activity by their own software procedures. This flag overrides the setting of the attribute reserve_lock if the value of the attribute is yes. This flag has support only for SSA logical disk drives. You cannot specify this flag and the SSADISK_FENCEMODE flag together.
SSADISK_SERVICEMODE
Opens an SSA physical disk in service mode. This flag wraps the SSA links on each side of the indicated physical disk so that the disk can be removed from the loop for service and no errors are caused on the loops. This flag has support only for S
494. ver are that are characteristics of the hardware and software driver environment.
Configuration Information
When tmssan is configured (where n is the remote node number), the tmssan.im and tmssan.tm special files are both created. An initiator mode pair or a target mode pair must exist for each device, whether either or both modes are being used. The target mode node number for an attached device must be the same as the initiator mode node number.
Each time that you use the cfgmgr command to configure the node, the target mode device driver finds the remote nodes that are already connected, and automatically configures them. Each node is expected to be identified by a unique node number.
The target mode device driver configuration entry point must be called only for the initiator mode device number. The device driver configuration routine automatically creates the configuration data for the target mode device minor number. This data is related to the initiator mode data.
Device Dependent Subroutines
The target mode device driver provides support for the following subroutines:
- open
- close
- read
- write
- ioctl
- select
open Subroutine
The open subroutine allocates and initializes target or initiator device-dependent structures. No commands are sent to the device as a result of running the open subroutine.
The initiator mode device or target mode device must be configured, but not already opened for that mode; otherwise, the ope
495. ves that are connected to the system. Using the command in this mode, you can check the level of microcode on the SSA disk drives that are connected to the system. Notes: 1. The microcode files that this command can download have names of the pattern ssadisk ros XXXX, where XXXX identifies the microcode level (also known as the ROS id) that the file contains. Such microcode files are different from those with names of the pattern ssadisk XXXXXXX YY. These files contain a different type of disk microcode and are automatically downloaded by the system configuration software as necessary. They cannot work with the ssadload command. 2. Microcode can be downloaded to SSA disk drives while the system is using those disk drives. You need not vary off any volume groups that contain the disk drives. If the microcode is downloaded to a disk drive that the system is using, the system might be delayed slightly while the microcode is downloading. 3. The microcode images are stored in the /etc/microcode directory. Attention: Usually, you can download the microcode to disk drives that are in use. By doing so, however, you might cause a temporary delay in the operating system or in the user's application program. Do not download microcode to a disk drive that is in use unless you have the user's permission. Always refer to the download instructions that are supplied with the microcode, and check for any special restrictions that might be applicable. If
496. List/Identify SSA Physical Disks. Move cursor to desired item and press Enter. List Disks in an SSA RAID Array; List Hot Spares; List Rejected Array Disks; List Array Candidate Disks; List System Disks; Identify Disks in an SSA RAID Array; Identify Hot Spares; Identify Rejected Array Disks; Identify Array Candidate Disks; Identify System Disks. SSA RAID Manager. Move cursor to desired item and press Enter. ssa0 Available 00 04 IBM SSA 160 SerialRAID Adapter 14109100. F1 Help, F2 Refresh, F3 Cancel, F8 Image, F10 Exit, Enter Do, Find, n Find Next. Select the adapter whose system disk drives you want to identify. 3. The following information is displayed: Identify System Disks. Type or select values in entry fields. Press Enter AFTER making all desired changes. Entry Fields: SSA RAID Manager ssa; System Disks; Flash Disk Identification Lights yes. F1 Help, F2 Refresh, F3 Cancel, F4 List, F5 Reset, F6 Command, F7 Edit, F8 Image, F10 Exit, Enter Do. 4. Select yes in the Flash Disk Identification Lights field. 5. Press the List key to list the disk drives. 6. From the displayed list, select the disk drives that you want to identify. The Check light flashes on each disk drive that you have selected. Canceling all SSA Disk Drive Identifications: This option allows you to cancel all disk drive identificat
497. w hdisk is created. This hdisk contains the copied data. When this option is selected, a RAID array selection menu is displayed. If you select a RAID array that has a copy status of good, the copy is uncoupled from its parent and a new array is created. The new array uses the next available hdisk name. For fast path, type smitty copy_unarray and press Enter. Uncouple a Volume Group Logical Volumes or Filesystems Copy: Select this option if you want to uncouple a volume group. When this option is selected, a volume group selection menu is displayed. This menu is followed by the Uncouple a Volume Group Logical Volumes or Filesystems Copy menu (see page LZ). List all Copy Candidates: Select this option to list all SSA RAID arrays that can support disk copy. See gq for a description of the output data. List all Uncoupled Copies: Select this option to find the parent array and the creation date of each array of type raid_copy. See description of the output data. List all Uncoupled Volume Groups: Select this option to find the parent array and the creation date of each array in a volume group that contains arrays of type raid_copy. See 82 for a description of the output data. Delete a RAID Array Copy: Select this option if you want to delete a RAID array copy. The copy might be coupled to the array or uncoupled from the array. The copy can be in any state. The deletion of the copy causes all the data on the copy to be lost. See fo
498. ween pdisk2 and pdisk3, and that an unconfigured device is present between pdisk6 and pdisk0. b. If you have just made changes to, or have just switched on, the unit in which the disk drive is installed, you might need to wait for up to 30 seconds before detailed information about the SSA network becomes available to the service aids. 4. When you have solved a problem, press the Cancel key to leave the display, then press Enter to reselect it. The display now shows the new status of the SSA links. g k on page 400 provides more examples of link problems and how to use this service aid to solve them. Configuration Verification Service Aid: The Configuration Verification service aid enables you to determine the relationship between SSA logical units (hdisks) and SSA physical disk drives (pdisks). It also displays the connection information and operational status of the disk drive. Notes: 1. User applications communicate with the hdisks; error data is logged against the pdisks. 2. If a disk drive that has been formatted on a machine of a particular type (for example, a personal computer) is later installed into a using system that is of a different type (for example, a large host system), that disk drive is configured only as a pdisk during the configuration of the using system. In such an instance, use the Format service aid to reformat the disk drive, then give the cfgmgr command to correct t
499. y The sequence in which the data is written to the array might be critical to the application program that is using the data if an error occurs during the write operation. fastwrite (on or off; default off): This attribute enables and disables the fast write cache. When using the fast write cache, you can use the following attributes to control the operation of the cache. fw_start_block (default 0): See the definition for fw_end_block. fw_end_block (default array size): This attribute and the fw_start_block attribute control the range of blocks for which the fast write cache is enabled. Write operations that are outside the default range of 0 through array size write data directly to the array and do not use the fast write cache. bypass_cache_if_oneway (true or false; default false): With the attribute set to true, if the partner fast write cache becomes unavailable, the fast write cache operations to this disk drive are disabled and an entry is made in the error log. With the attribute set to false, if the partner fast write cache becomes unavailable, fast write operations to this disk drive continue. If the SSA adapter fails during these operations, some data might not be flushed to the disk drive until the adapter has been repaired. fw_suspended (default false): During a RAID Copy uncouple operation, the fast write function on the specified disk drive is suspended (stopped temporarily). I
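How the fast write attributes in the excerpt above interact can be sketched as a small decision function. This is a simplified model in Python, not the adapter's logic: it assumes a single-block write, treats the fw_start_block through fw_end_block range as inclusive at both ends (the text does not say), and the function name `write_path` and the parameter `partner_available` are hypothetical.

```python
def write_path(block, fastwrite, fw_start_block, fw_end_block,
               bypass_cache_if_oneway=False, partner_available=True):
    """Return "cache" if a write to `block` would use the fast write cache,
    or "direct" if it would go straight to the array, under the
    assumptions stated above."""
    if not fastwrite:
        return "direct"  # fastwrite is off, so the cache is disabled
    if not (fw_start_block <= block <= fw_end_block):
        return "direct"  # outside the configured block range
    if bypass_cache_if_oneway and not partner_available:
        return "direct"  # partner cache lost and bypass was requested
    return "cache"
```

With bypass_cache_if_oneway left at false, the function keeps returning "cache" when the partner is unavailable, which mirrors the case in which the text warns that some data might not be flushed until the adapter is repaired.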
500. y type ssa_ela -l pdisk3 -h 3, where 3 is the number of 24-hour periods. An SRN for the error is generated. Note: The error occurred on Sunday. When running the error log analysis, you need to include at least the 24 hours that preceded the error, that is, Saturday. In this example, therefore, timeperiod includes Saturday, Sunday, and Monday. If application programs fail, run diagnostics in Problem Determination mode to find the SRN. Have no concerns about events that occur in the error log unless an application program fails or error log analysis generates an SRN. Chapter 12. Using the SSA Command Line Interface for RAID Configurations: You can use the ssaraid command from the command line, instead of from the SMIT panels, to configure and manage your arrays. The Command Line Interface includes a README file that explains the syntax for the ssaraid command. The README file is located at /usr/lpp/devices.ssa.IBM_raid/ssaraid.README. Using the ssaraid command from the Command Line Interface, you can do the following: List all the SSA adapters that provide support for RAID arrays in a system (those adapters are called RAID Managers). List SSA objects that are connected to a particular RAID Manager. List all objects of a given type, for example, RAID 5 arrays. List the preferred name (serial number) of an object. List all
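The time period arithmetic in the note above (an error on Sunday, analysed on Monday, needs Saturday, Sunday, and Monday, so three 24-hour periods) can be written out as a short calculation. This is a sketch in Python under one assumption: that the period count is whole calendar days from the day before the error through the day the analysis is run. The function name `ela_time_periods` is hypothetical and is not part of ssa_ela.

```python
from datetime import date


def ela_time_periods(error_day, run_day):
    """Number of 24-hour periods to request so that the analysis covers
    the day before the error through the day the analysis is run."""
    if run_day < error_day:
        raise ValueError("run_day must not be earlier than error_day")
    # Days from the error to the run, plus the error day itself,
    # plus the preceding day that the note says must be included.
    return (run_day - error_day).days + 2


# The example from the text: error on a Sunday, analysis on the Monday.
periods = ela_time_periods(date(2024, 1, 7), date(2024, 1, 8))
```

Here `periods` is 3, which matches the value used in the command example.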
501. y Disks; Identify Array Candidate Disks. SSA RAID Array. Move cursor to desired item and press Enter. hdisk3 095231779F0737K good 3 4G RAID 5 array; hdisk4 09253173A02137K good 3 4G RAID 5 array. F1 Help, F2 Refresh, F3 Cancel, F8 Image, F10 Exit, Enter Do, Find, n Find Next. Select the array whose disk drives you want to list. 3. A list of disk drives is displayed. COMMAND STATUS. Command: OK; stdout: yes; stderr: no. Before command completion, additional instructions may appear below. pdisk1 0004AC5119E000D member present 1 16 Physical disk; pdisk4 08005AEA030D00D member present 2 3G Physical disk; pdisk7 08005AEA087A00D member present 4 5G Physical disk; pdisk8 08005AEA098100D member not_present n/a Physical disk. F1 Help, F2 Refresh, F3 Cancel, F6 Command, F8 Image, F9 Shell, F10 Exit, Find, n Find Next. Listing Hot Spare Disk Drives: This option allows you to list the hot spare disk drives that are available to a particular array. 1. For fast path, type smitty Ihssaraid and press Enter. Otherwise: a. Select List/Identify SSA Physical Disks from the SSA RAID Arrays menu. b. Select List Hot Spares. A list of adapters is displayed in a window. List/Identify SSA Physical Disks. Move cursor to desired item and press Enter. List Disks in an SSA RAID Array
502. y devices and press Enter. 2. Select SSA RAID Arrays. 3. Select List/Identify SSA Physical Disks. 4. Select List Disks in an SSA RAID Array. 5. Select the hdisk that is in the Exposed state and note all the pdisks. If necessary, use the Identify function to identify the disk drive. 6. Move all the member disk drives to the same SSA loop. 48700 Description: All the member disk drives of an array are not on the same SSA loop configuration SSA loop. The array is in the Offline state. Action: All the member disk drives of an array must be on the same SSA loop. Find all the members of the array: 1. Type smitty devices and press Enter. 2. Select SSA RAID Arrays. 3. Select List/Identify SSA Physical Disks. 4. Select List Disks in an SSA RAID Array. 5. Select the hdisk that is in the Offline state and note all the pdisks. If necessary, use the Identify function to identify the disk drives. 6. Move all the member disk drives to the same SSA loop. problem SRN Problem Possible Causes: One member of a RAID 1 array has been detected, but the adapter is unknown to that array member. All the primary or all the secondary disk drives of a RAID 10 array are present, but the adapter is unknown to the array. Action: If the problem has resulted from an unplanned change to the using system configuration, return the using system to its original config
503. yed. Change/Show Characteristics of an SSA Logical Disk. Type or select values in entry fields. Press Enter AFTER making all desired changes. MORE 6. Entry Fields: Location Label; Parent ssar; Size in Megabytes 4512; adapter_a ssa0; adapter_b ssa1; primary_adapter adapter_a; Connection address 0004AC506C3600E; Physical volume IDENTIFIER 00406 fdac2fb8203000000; ASSIGN physical volume identifier no; RESERVE disk on open yes; Queue depth 5; Maximum Coalesce 0x20000; Enable Fast Write yes; Bypass Cache In 1 Way Fast Write Network no. BOTTOM. F1 Help, F2 Refresh, F3 Cancel, F4 List, F5 Reset, F6 Command, F7 Edit, F8 Image, F10 Exit, Enter Do. 5. If you do not want the fast write function to continue on a particular disk drive if the partner adapter becomes inaccessible, set the Bypass Cache in 1 Way Fast Write Network flag to yes for that disk drive. With the flag set to yes, an error is logged when the cache is bypassed. If you want the fast write function to continue on a particular disk drive if the partner adapter becomes inaccessible, set the Bypass Cache in 1 Way Fast Write Network flag to no for that disk drive. Dealing with Fast Write Problems: This section describes how to recover from problems that might occur during fast write operations. The type of problems are those in which data loss might have occurred or customer data recovery proced
504. you are not sure do not download to disk drives that are in use 350 User s Guide and Maintenance Information Flags Examples d PhysicalDiskName Specifies the physical disk drive that is to receive the microcode f CodeFileName S a P Specifies the microcode file to be downloaded Ensures that all SSA physical disk drives are loaded with the latest level of microcode that is available on the system By default this flag ensures that all SSA physical disk drives are loaded with the latest microcode To update a specific disk drive use this flag and the d flag Shows the existing levels of microcode that are installed on SSA physical disk drives By default this flag shows the existing level of microcode that is installed on all SSA physical disk drives To list a specific disk drive use this flag and the d flag adapter Specifies the adapter on which the operation is to run so that you can update all disk drives that are connected to that adapter Enables more than one copy of ssadload to run at the same time Parallel download operations are therefore possible Attention When this flag is used a download operation to two disk drives that are in the same loop or array can occur if the using system contains two SSA adapters The user must ensure that such operations do not occur Using the f flag ssadload d pdisk0 f ssadisk ros 7899 With this flag the command loads microcode file ssadisk ros 7899 o
505. you the operational status of the links that make an SSA loop. Configuration Verification: This service aid lets you determine the relationship between physical and logical disk drives. Format Disk: This service aid formats an SSA disk drive. Certify Disk: This service aid verifies that all the data on a disk drive can be read correctly. Display/Download Disk Drive Microcode: This service aid allows you to observe and modify the microcode level on all the SSA disk drives. Link Speed: This service aid allows you to observe the operating speed of each link on an SSA loop. Physical Link Configuration: This service aid provides support for SES-enabled SSA enclosures (subsystems). See your enclosure service information for a description of this service aid. Enclosure Configuration: This service aid provides support for SES-enabled SSA enclosures (subsystems). See your enclosure service information for a description of this service aid. Enclosure Environment: This service aid provides support for SES-enabled SSA enclosures (subsystems). See your enclosure service information for a description of this service aid. Enclosure Settings: This service aid provides support for SES-enabled SSA enclosures (subsystems). See your enclosure service information for a description of this service aid. The selection menu for the SSA service aids allows you direct access to the SMIT menus for fast write and RAID functions on SSA
506. ype smitty ls_hsm_array_status and press Enter. Otherwise, select List Status of Hot Spare Protection for an SSA RAID Array from the SSA RAID Arrays menu. 2. A list of adapters is displayed in a window. SSA RAID Arrays. Move cursor to desired item and press Enter. List All Defined SSA RAID Arrays; List All Supported SSA RAID Arrays; List All SSA RAID Arrays Connected to a RAID Manager; List Status Of All Defined SSA RAID Arrays; List/Identify SSA Physical Disks; List/Delete Old RAID Arrays Recorded in an SSA RAID Manager; List Status of Hot Spare Pools. SSA RAID Manager. Move cursor to desired item and press F7. ONE OR MORE items can be selected. Press Enter AFTER making all selections. ssa0 Available 04 06 IBM SSA 160 SerialRAID Adapter 14109100; ssa1 Available 04 07 IBM SSA 160 SerialRAID Adapter 14109100. F1 Help, F2 Refresh, F3 Cancel, F7 Select, F8 Image, F10 Exit, Enter Do, Find, n Find Next. Select the adapter whose protected member disk drives you want to list. 3. A list of protected member disk drives is displayed. COMMAND STATUS. Command: OK; stdout: yes; stderr: no. Before command completion, additional instructions may appear below. ssa1: Component, Location, Size, Pool, Protected, Status. hdisk4 raid_10; pdisk13 04 02 REGY 06 P 18 2GB pool_B2 yes good; pdisk11 04 02 REGY 08 P 9 2GB pool_B1 yes good; pdisk3 04 02 REGY 05 P 9 2GB pool_B2 yes good
507. ze of the member disk drive is not the physical size of the disk drive, but the size that the array manager assigns to it. For example, if a RAID 10 array is created from three 9 GB disk drives and one 18 GB disk drive, the size that is assigned to each array member disk drive is 9 GB. The 18 GB disk drive can still be protected by a 9 GB hot spare disk drive. wrong_pool: This member disk drive of the array has been replaced with a hot spare disk drive from another pool. This action has occurred because no hot spare disk drive was available in this pool when the array member disk drive failed. When all failed disk drives have been replaced, this array member disk drive should be exchanged with a disk drive that is in the same physical domain as are the other disk drives in the pool (see Chapter 5, Hot Spare Management). Adding a New Hot Spare Pool: This option allows you to add a new hot spare pool. If you are not sure how to configure hot spare pools, read you proceed. 1. For fast path, type smitty add_hsm_pool_adap and press Enter. Otherwise, select Add a Hot Spare Pool from the SSA RAID Arrays menu. 2. A list of adapters is displayed in a window. SSA RAID Arrays. Move cursor to desired item and press Enter. List All Defined SSA RAID Arrays; List All Supported SSA RAID Arrays; List All SSA RAID Arrays Connected to a RAID Manager; List Status Of All Defined SSA RAID Arrays; List/Identify SSA P
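The sizing rule at the start of the excerpt above (three 9 GB drives plus one 18 GB drive give every member an assigned size of 9 GB, so a 9 GB hot spare can still protect the 18 GB drive) can be checked with a few lines of Python. This is a sketch under one assumption: that the assigned size equals the size of the smallest member, which agrees with the example but may not be exactly how the array manager computes it. Both function names are hypothetical.

```python
def assigned_member_size_gb(member_sizes_gb):
    """Size assigned to every member, assumed here to be the smallest member."""
    return min(member_sizes_gb)


def spare_can_protect(spare_size_gb, member_sizes_gb):
    """True if a hot spare of the given size is at least the assigned member size."""
    return spare_size_gb >= assigned_member_size_gb(member_sizes_gb)


# The RAID 10 example from the text: three 9 GB drives and one 18 GB drive.
members = [9, 9, 9, 18]
assigned = assigned_member_size_gb(members)
protected = spare_can_protect(9, members)
```

Under this assumption the check compares the spare with the assigned size, not with the physical size of the largest drive, which is why the 18 GB member is still covered.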