RIOIP01

Use this procedure to isolate a failure in a high-speed link (HSL) loop using i5/OS service tools.

About this task

Follow the steps in RIOIP01 Section-1: Main task. The steps direct you to the appropriate subtasks.
Note: During this procedure, you will be disconnecting and reconnecting cables. If errors concerning missing resources (such as disk units and HSL failures) occur, ignore them. Missing resources will report in again when the loop reinitializes.

RIOIP01 Section-1: Main task

Procedure

  1. Were you directed here while working on a B700xxxx reference code?
    • Yes, go to step 4.
    • No, continue with the next step.
  2. Are you working on a 520 or 570?
    • Yes, go to step 4.
    • No, continue with the next step.
  3. Were you sent here from a B600xxxx reference code?
    • Yes, use the serviceable event view and the system service documentation to search for a B700xxxx reference code with the same last four characters reported at approximately the same time. If you find one, perform service on that reference code first, and when you close that problem, close this one as well. If you do not find one, continue with the next step.
    • No, continue with the next step.
  4. Before powering down any system unit or expansion unit, work with the customer to end all subsystems in all of the partitions using each partition's console.
  5. From the partition control panel, IPL the system or partition to Dedicated Service Tools (DST).
    Attention: Do not use function 21!
  6. Are all system and expansion units on the loop powered on?
    • Yes, go to step 8.
    • No, continue with the next step.
  7. Perform the following:
    1. Power on all system and expansion units on the loop. If a frame cannot be powered on, perform the RIOIP01 Section-5: Cannot power on unit subtask below, and then continue with step 8.
    2. Was the HSL link error cleared up when the frames were powered back on?
      • Yes, go to Verifying the repair. This ends the procedure.
      • No, continue with the next step.
  8. Perform the following:
    1. Access the Service Action Log (SAL) entry for this error; the field replaceable units (FRUs) should be listed there. Look for part numbers and descriptions for the FRUs containing the HSL port for two frames. There should also be a FRU for the cable between them. The location information for the FRUs identifies the failed ports on the failed link.
    2. Record the loop number from the SAL (if it is displayed there in one of the FRU descriptions) or from the first four characters of word 7 of the reference code. Go to Converting the loop number to NIC port location labels to determine which HSL/RIO cables on the system you are working with.
      Is this information in the SAL?
  9. Is the cable connecting the failed ports an optical cable?
    • Yes, continue with the next step.
    • No, go to step 11.
  10. Perform the following:
    1. Clean the HSL cable connectors and ports using the tools and procedures in symbolic FRU OPT_CLN.
    2. To determine if cleaning the connectors and ports solved the problem, perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point. Did the ports you were working on have a status of "failed"?
      • Yes, continue with the next step.
      • No, the problem is fixed. Go to Verifying the repair. This ends the procedure.
  11. There are now three cases to consider. Continue with the appropriate subtask of this procedure:
    • The ports on both ends of the failed link are in different system units on the loop.
    • The port on one end of the failed link is in a system unit and the port on the other end is in an I/O unit.
    • The ports on both ends of the failed link are in an I/O unit.
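
The three cases above can be expressed as a small lookup. The following is a minimal sketch, not part of the service tools: the names `UnitKind` and `select_subtask` are hypothetical and exist only for illustration. It assumes you have already identified the kind of unit at each end of the failed link.

```python
from enum import Enum


class UnitKind(Enum):
    """Kind of unit that holds one end of the failed link (hypothetical type)."""
    SYSTEM = "system unit"
    IO = "I/O unit"


def select_subtask(end_a: UnitKind, end_b: UnitKind) -> str:
    """Return the RIOIP01 subtask that matches the two ends of the failed link."""
    kinds = {end_a, end_b}
    if kinds == {UnitKind.SYSTEM}:
        # Both ports are in (different) system units on the loop.
        return "RIOIP01 Section-2"
    if kinds == {UnitKind.SYSTEM, UnitKind.IO}:
        # One port is in a system unit, the other in an I/O unit.
        return "RIOIP01 Section-3"
    # Both ports are in I/O units.
    return "RIOIP01 Section-4"
```

The order of the two arguments does not matter, because the function compares the set of unit kinds rather than their positions.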

RIOIP01 Section-2: The ports on both ends of the failed link are in different system units on the loop

Procedure

  1. There might be failed hardware that will report a different error on the other system units. Perform the following:
    1. Work any other HSL/RIO problems in the serviceable event view on the other system units.
    2. Perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point. Did the ports you were working on have a status of "failed"?
      • Yes, continue with the next step.
      • No, the problem is fixed. Go to Verifying the repair. This ends the procedure.
  2. Is the cable an optical HSL/RIO cable?
    • Yes, go to step 4.
    • No, continue with the next step.
  3. Perform the following:
    1. Check the thumb screws on the cable connectors at both ends of the cable to ensure they are tight. For any thumb screw that was loose, disconnect the cable at that end, wait a maximum of 30 seconds, reconnect the cable, and tighten the thumb screws. You must tighten both thumb screws within 30 seconds of when the cable makes contact with the port.
    2. If you disconnected and reconnected the cable at either end, perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point. Did the ports you were working on have a status of "failed"?
      • Yes, continue with the next step.
      • No, the problem is fixed. Go to Verifying the repair. This ends the procedure.
  4. Replace the cable between the two system unit ports on the failed link. To determine if replacing the cable resolved the problem, perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point. Did the ports you were working on have a status of "failed"?
    • Yes, continue with the next step.
    • No, the problem is fixed. Go to Verifying the repair. This ends the procedure.
  5. Exchange the FRU with the HSL/RIO port in one of the system units. If you are working with a serviceable event view and the HSL FRUs are listed, exchange the FRU corresponding to the first HSL/RIO cable port listed. Otherwise, exchange the FRU that is quickest and easiest to replace. To determine if replacing the FRU resolved the problem, perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point. Did the ports you were working on have a status of "failed"?
    • Yes, continue with the next step.
    • No, the problem is fixed. Go to Verifying the repair. This ends the procedure.
  6. Exchange the remaining FRU with the HSL/RIO port on the other system unit. To determine if replacing the FRU resolved the problem, perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point. Did the ports you were working on have a status of "failed"?
    • Yes, continue with the next step.
    • No, the problem is fixed. Go to Verifying the repair. This ends the procedure.
  7. Use symbolic FRU HSL_LNK to determine if there are any additional HSL/RIO cable-related FRUs, such as interposer cards and internal ribbon cables, that might be on either unit. Did you exchange any additional HSL/RIO FRUs?
    • Yes, continue with the next step.
    • No, call your next level of support for further instruction. This ends the procedure.
  8. To determine if replacing the FRU resolved the problem, perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point. Did the ports you were working on have a status of "failed"?
    • Yes, call your next level of support for further instruction. This ends the procedure.
    • No, the problem is fixed. Go to Verifying the repair. This ends the procedure.

RIOIP01 Section-3: The port on one end of the failed link is in a system unit and the port on the other end is in an I/O unit

Procedure

  1. Switch the two HSL/RIO cables on the I/O unit with the failed port, so that each cable is connected to the port where the other cable was previously connected. Disconnect both cables at the same time, wait a maximum of 30 seconds, and then reconnect the cables one at a time.
    Attention: For copper cables, you must fully connect the cable and tighten the connector's screws within 30 seconds of when the cable contacts the port. Otherwise, the link fails and you must disconnect and reconnect again. Also, if the connector screws are not tightened, errors occur on the link and it fails.
  2. Refresh the port status for the first failing resource by performing Refresh the port status below. Then continue with the next step.
  3. Is the port on the system unit that had failed now working?
    • Yes, perform symbolic FRU SIIOADP to exchange the HSL I/O bridge FRU in the I/O unit. Go to Verifying the repair. This ends the procedure.
    • No, continue with the next step.
  4. Switch the cables back to their original positions by disconnecting both cables at the same time, waiting 30 seconds, and then reconnecting the cables one at a time.
    Attention: For copper cables, you must fully connect the cable and tighten the connector's screws within 30 seconds of when the cable contacts the port. Otherwise, the link fails and you must disconnect and reconnect again. Also, if the connector screws are not tightened, errors occur on the link and it fails.
  5. Refresh the port status for the first failing resource by performing Refresh the port status below. Then continue with the next step.
  6. Exchange the cable between the two ports on the failed link. To determine if replacing the cable resolved the problem, perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point. Did the ports you were working on have a status of "failed"?
    • Yes, continue with the next step.
    • No, the problem is fixed. Go to Verifying the repair. This ends the procedure.
  7. Exchange the HSL/RIO FRU that contains the failing port in the system unit. To determine if replacing the FRU resolved the problem, perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point. Did the ports you were working on have a status of "failed"?
    • Yes, continue with the next step.
    • No, the problem is fixed. Go to Verifying the repair. This ends the procedure.
  8. Use symbolic FRU HSL_LNK to determine if there are any additional HSL/RIO cable-related FRUs, such as interposer cards and internal ribbon cables, that may be on either unit. Did you exchange any additional HSL/RIO FRUs?
    • Yes, continue with the next step.
    • No, call your next level of support for further instruction. This ends the procedure.
  9. To determine if replacing the FRU resolved the problem, perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point. Did the ports you were working on have a status of "failed"?
    • Yes, call your next level of support for further instruction. This ends the procedure.
    • No, the problem is fixed. Go to Verifying the repair. This ends the procedure.

RIOIP01 Section-4: The ports on both ends of the failed link are in an I/O unit

Procedure

  1. Switch the two HSL/RIO cables on the first (or "From") cable's I/O unit with the failed port so that each cable is connected to the port where the other cable was previously connected.
    Attention: For copper cables, you must fully connect the cable and tighten the connector's screws within 30 seconds of when the cable contacts the port. Otherwise, the link fails and you must disconnect and reconnect again. Also, if the connector screws are not tightened, errors occur on the link and it fails.
  2. Refresh the port status for the first failing resource by performing Refresh the port status below. Then continue with the next step.
  3. Is the port on the I/O unit on which you did not switch the cables now working?
    • Yes, use symbolic FRU SIIOADP to exchange the HSL/RIO I/O bridge card in the I/O unit where you just switched the cables. Then continue with the next step.
    • No, go to step 5.
  4. To determine if replacing the FRU resolved the problem, perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point. Did the ports you were working on have a status of "failed"?
    • Yes, continue with the next step.
    • No, the problem is fixed. Go to Verifying the repair. This ends the procedure.
  5. Switch the cables back to their original positions.
    Attention: For copper cables, you must fully connect the cable and tighten the connector's screws within 30 seconds of when the cable contacts the port. Otherwise, the link fails and you must disconnect and reconnect again. Also, if the connector screws are not tightened, errors occur on the link and it fails.
  6. Switch the two HSL/RIO cables on the second (or "To") I/O unit with the failed port so that each cable is connected to the port where the other cable was previously connected.
    Attention: For copper cables, you must fully connect the cable and tighten the connector's screws within 30 seconds of when the cable contacts the port. Otherwise, the link fails and you must disconnect and reconnect again. Also, if the connector screws are not tightened, errors occur on the link and it fails.
  7. Refresh the port status for the first failing resource by performing Refresh the port status below. Then continue with the next step.
  8. Is the port on the I/O unit on which you did not switch cables now working?
    • Yes, use symbolic FRU SIIOADP to exchange the HSL/RIO I/O bridge card in the I/O unit where you just switched the cables. Then continue with the next step.
    • No, go to step 10.
  9. To determine if replacing the FRU resolved the problem, perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point. Did the ports you were working on have a status of "failed"?
    • Yes, continue with the next step.
    • No, the problem is fixed. Go to Verifying the repair. This ends the procedure.
  10. Switch the cables back to their original positions.
    Attention: For copper cables, you must fully connect the cable and tighten the connector's screws within 30 seconds of when the cable contacts the port. Otherwise, the link fails and you must disconnect and reconnect again. Also, if the connector screws are not tightened, errors occur on the link and it fails.
  11. Exchange the HSL/RIO cable between the two ports on the failed link. To determine if replacing the cable resolved the problem, perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point.
    Did the ports you were working on have a status of "failed"?
    • Yes, continue with the next step.
    • No, the problem is fixed. Go to Verifying the repair. This ends the procedure.
  12. Use symbolic FRU HSL_LNK to determine if there are any additional HSL/RIO cable-related FRUs, such as interposer cards and internal ribbon cables, that may be on either unit. Did you exchange any additional HSL/RIO FRUs?
    • Yes, continue with the next step.
    • No, call your next level of support for further instruction. This ends the procedure.
  13. To determine if replacing the FRU resolved the problem, perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point. Did the ports you were working on have a status of "failed"?
    • Yes, call your next level of support for further instruction. This ends the procedure.
    • No, the problem is fixed. Go to Verifying the repair. This ends the procedure.

RIOIP01 Section-5: Cannot power on unit

Procedure

  1. Work the errors related to powering on the units, and then continue with the next step. If a unit still cannot be powered on, re-cable the HSL/RIO loop without the I/O units and system units that cannot be powered on, so that the loop is complete (no disconnected cables).
  2. To determine if re-cabling the loop resolved the problem, perform RIOIP01 Section-6: Manually detecting the failed link below and return to this point.

RIOIP01 Section-6: Manually detecting the failed link

Procedure

  1. Get the loop number from the reference code if you do not already have it. The loop number is a hexadecimal number in word 7 of the reference code.
    • If you are working from the Product Activity Log (PAL), then the loop number is the 4 leftmost characters of the DSA in word 7 (BBBB). Use the DSA translation to convert the hexadecimal loop number to decimal format before continuing with this procedure.
    • If you are working from the Service Action Log (SAL), the loop number should be displayed in the FRU description area in decimal format.
  2. Sign on to SST or DST (if you have not already done so). Select Start a service tool > Hardware service manager > Logical hardware resources > High-speed link (HSL) resources.
  3. Select Resources associated with loop for the HSL loop with the failed link. The HSL bridges will be displayed under the loop.
  4. Select Display detail for the loop with the failed link.
  5. Record the name of the NIC/RIO controller resource you are starting from on the display. You will need this name to determine if you have followed the loop around and back to this resource.
  6. If the leading port does not have a status of "failed", select Follow leading port until a leading port with a "failed" status is found, or the display is showing information for the starting NIC/RIO resource you recorded. Did you find a leading port with a status of "failed"?
    • Yes, record the resource name at the leading port with a "failed" status, and the type, model, and serial number for the resource with the failed status. Continue with the next step.
    • No, the loop is functioning properly. Return to the subtask that sent you here.
  7. Select Follow leading port one more time and note all the information for the resource name with a failed trailing port.
  8. Select Display system information and note the power controlling system's type, model, and serial number (and name, if available). This information may be needed for FRU replacement at a later time.
  9. Select Cancel twice to return to the previous screen.
  10. Select Associated packaging resource(s) from each resource name. This gives the description of the failing item and the frame ID.
  11. Select Display detail to find the part number and location associated with the possible failing item. Then return to the step that sent you here.
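
Two parts of this subtask can be illustrated with a short sketch: the loop number conversion in step 1 and the walk around the loop in steps 5 through 6. The sketch is illustrative only and is not part of the service tools. It assumes that the conversion is a plain hexadecimal-to-decimal conversion of the four leftmost characters of word 7 (the DSA translation remains the authoritative reference). The names `Resource`, `loop_number_from_word7`, and `find_failed_leading_port`, the resource names, and the "operational" status string are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, Optional


def loop_number_from_word7(word7: str) -> int:
    """Decimal loop number from the 4 leftmost hex characters of word 7.

    Assumption: a plain base-16 to base-10 conversion.
    """
    digits = word7.strip()[:4]
    if len(digits) < 4:
        raise ValueError("word 7 must contain at least 4 characters")
    return int(digits, 16)


@dataclass
class Resource:
    """One resource on the loop (hypothetical model of the display)."""
    name: str
    leading_port_status: str  # "failed" or any other status
    next_name: str            # resource reached by Follow leading port


def find_failed_leading_port(resources: Dict[str, Resource],
                             start: str) -> Optional[str]:
    """Follow leading ports from start until a failed one is found.

    Returns the resource name with the failed leading port, or None if
    the walk returns to the starting resource without finding one.
    """
    current = start
    # Bound the walk so a mis-recorded loop cannot cycle forever.
    for _ in range(len(resources)):
        resource = resources[current]
        if resource.leading_port_status == "failed":
            return resource.name
        current = resource.next_name
        if current == start:
            break
    return None


# Hypothetical three-resource loop: NIC01 -> BR01 -> BR02 -> NIC01.
example_loop = {
    "NIC01": Resource("NIC01", "operational", "BR01"),
    "BR01": Resource("BR01", "failed", "BR02"),
    "BR02": Resource("BR02", "operational", "NIC01"),
}
```

With the hypothetical word 7 value "01000000", `loop_number_from_word7` returns 256, and `find_failed_leading_port(example_loop, "NIC01")` returns "BR01". Recording the starting resource name, as step 5 instructs, is what lets the walk stop once it has gone all the way around the loop.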

RIOIP01 Section-7: Refresh the port status

Procedure

  1. Wait a maximum of one minute, and then sign on to SST or DST (if you have not already done so).
  2. Select Start a service tool > Hardware service manager > Logical hardware resources > High-speed link (HSL) resources.
  3. Move the cursor to the HSL loop that you want to examine and select Display detail > Include non-reporting resources.
  4. If the display is not showing the ports for one of the units you are working on, select Follow leading port. Continue to select Follow leading port until the display is showing the ports for one of the units you are working on. Note the status of the port you were working on. Select Follow leading port until the display is showing the ports for the other unit you are working on, and note the status of the port you were working on.
  5. Select Cancel > Refresh > Display detail for the failing resource you are checking. Note any change in the status for the resource. Then return to the step that sent you here.