MAP4F10 Recovery from I/O enclosure repair with one CEC enclosure fenced

This procedure addresses a situation in which a failure in an I/O enclosure causes a CEC to become fenced and, in some configurations, adapters in another I/O enclosure to become unavailable. Use this procedure after the original I/O enclosure failure has been successfully repaired. The procedure returns the CEC to dual-cluster operation and, if needed, recovers the other I/O enclosure whose adapters might be unavailable.

MAP4F10 Section-1

About this task

  • You completed replacing one of the following I/O enclosure FRUs:
    • I/O enclosure backplane
    • RIO card
    • RIO cable
  • The serviceable event that you just repaired for the I/O enclosure has been automatically closed.
  • A new serviceable event was created with SRC BE400100 (CEC in fenced state during I/O enclosure repair). A CEC enclosure is fenced, quiesced, or powered off.

Procedure

  1. Are there any open serviceable events with SRCs BE1E2167, BE1E2543, or BE1E2551?
    • Yes, close these serviceable events, and then go to the next step. These events are expected when an I/O enclosure has serviceable events and a CEC enclosure is fenced.
    • No, go to the next step.
  2. Are there any open serviceable events with CEC FRUs?
    • Yes, repair them and then go to step 4.
    • No, go to the next step.
  3. Use this special pseudo repair procedure to reset the fenced CEC enclosure, which will quiesce, power off, power on, and resume the CEC enclosure:
    1. Use the View Storage Facility State (End of Call) task to determine which CEC enclosure is fenced.
      1. From the navigation area, click Storage Facility Management > storage facility.
      2. From the bottom Task area, click Service Utilities > View Storage Facility State. The View Storage Facility State (end of call) window opens.
      3. Click the Fenced Resources option at the bottom of the list. Then click Details to display the fenced LPAR information.
        For example:
        Server     Not Good                      
        lparName   SF75FW820ESS11                
        state=4(Fenced),PartitionState=Running
               
        Note: SFsssssssssESS0x (x = 1 or 2) is in CEC0 (upper) 
              SFsssssssssESS1x (x = 1 or 2) is in CEC1 (lower) 
      4. Return to the Task area, and click Service Utilities > View Hardware Topology. Use the topology to identify the location code of the fenced CEC enclosure.
        For example:
        Current Hardware Topology
        CEC 0 MTMS = 9117-MMA*10D5242
        CEC 0 Unit ID = U787D.001.DQD53K3
        CEC 1 MTMS = 9117-MMA*10D5272
        CEC 1 Unit ID = U787D.001.DQD17BM
    2. Use the Exchange Parts procedure to select the CEC enclosure that was displayed as fenced:
      1. From the navigation area, click Storage Facility Management > storage facility.
      2. From the bottom Task area, click Exchange Parts > Exchange CEC Components.... The Exchange CEC Components window opens.
      3. Select a CEC enclosure and click Show FRUs. The Show CEC FRUs window opens.
      4. Select the FRU Location Code, and then click Exchange FRU.
    3. Select the System Processor Card FRU and continue the guided repair:
      1. From the Show CEC FRUs window, select System Processor Card for any Processor Card Slot (LEDs not used) and click Exchange FRU.
        Notes:
        1. If the System Processor Card is not displayed, you may need to maximize the window and manually scroll down the list.
        2. Do not disconnect the black power cables to the CEC enclosure power supplies when directed.
        3. Do not replace the system processor card when directed. Leave it installed and continue the repair.
    4. After the pseudo repair of the CEC is complete, go to the next step.
  4. Determine if another I/O enclosure needs to be recovered. Are there any open serviceable events with the following SRCs?
    • BE340012 (Device Area Resource Manager detected device adapter card objects missing in the current harvest)
    • BE360012 (IBM.EssSAHARM detected host adapter card objects missing in the current harvest)
    • Yes, go to the next step.
    • No, go to step 6.
  5. Identify the I/O enclosure for the serviceable events with the SRCs BE340012 (Device adapter card missing) or BE360012 (Host adapter card missing):
    1. Use the location codes in the FRU list to determine the I/O enclosure.
      Note: Each device adapter card and host adapter card in the I/O enclosure will have an open serviceable event.
    2. Do not repair any of these open serviceable events.
    3. Use the Exchange Parts procedure to perform a pseudo repair of the I/O enclosure RIO card, which makes the I/O enclosure adapters available.
    4. After the I/O enclosure pseudo repair is successful, manually close each serviceable event with SRC BE340012 or SRC BE360012.
    5. Go to the next step.
  6. Display and repair any open serviceable events containing I/O enclosure FRUs.
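
The decision flow in this procedure can be summarized in a short Python sketch. It is illustrative only and is not part of any service tooling. The event dictionaries, the fru_type field, the action names, and the LPAR name pattern are assumptions made for this example. Only the SRCs and the CEC0/CEC1 naming rule come from the procedure above.

```python
import re

# SRCs taken from the procedure text.
EXPECTED_SRCS = {"BE1E2167", "BE1E2543", "BE1E2551"}  # step 1: close, do not repair
ADAPTER_MISSING_SRCS = {"BE340012", "BE360012"}       # steps 4 and 5

# Assumed pattern for names such as SF75FW820ESS11 (see the note in step 3):
# ...ESS0x is in CEC0 (upper), ...ESS1x is in CEC1 (lower), where x is 1 or 2.
_LPAR_NAME = re.compile(r"^SF\w+ESS([01])[12]$")


def cec_for_lpar(lpar_name):
    """Return 0 for CEC0 (upper) or 1 for CEC1 (lower)."""
    match = _LPAR_NAME.match(lpar_name)
    if match is None:
        raise ValueError(f"unrecognized LPAR name: {lpar_name!r}")
    return int(match.group(1))


def plan_recovery(open_events):
    """Return the ordered actions that the procedure calls for.

    open_events is a hypothetical list of dicts with the keys
    'src' (SRC string) and 'fru_type' ('CEC', 'IO', or 'OTHER').
    """
    actions = []
    srcs = {event["src"] for event in open_events}

    # Step 1: close the events that are expected while a CEC is fenced.
    if srcs & EXPECTED_SRCS:
        actions.append("close_expected_events")
    remaining = [e for e in open_events if e["src"] not in EXPECTED_SRCS]

    # Step 2: real CEC FRU events are repaired; otherwise step 3 applies
    # and the fenced CEC enclosure is reset with a pseudo repair.
    if any(e["fru_type"] == "CEC" for e in remaining):
        actions.append("repair_cec_events")
    else:
        actions.append("pseudo_repair_fenced_cec")

    # Steps 4 and 5: recover the other I/O enclosure, then close its events.
    if srcs & ADAPTER_MISSING_SRCS:
        actions.append("pseudo_repair_io_enclosure_rio_card")
        actions.append("close_adapter_missing_events")

    # Step 6: always finish by repairing open I/O enclosure FRU events.
    actions.append("repair_io_enclosure_events")
    return actions
```

For example, cec_for_lpar("SF75FW820ESS11") identifies CEC1 (lower) for the fenced LPAR shown in step 3, which is the enclosure to select in the Exchange CEC Components window.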