MAP4760 Recovering from corrupted files or functions

A cluster file (dataset) or function is corrupted. If this has affected customer operations, a separate serviceable event should have been created. In many cases, customer operations are not affected; only the processes or files used by the RAS (maintenance package) processes might be affected.

About this task

There are three recommended actions, attempted in order until the failure clears:
  • The cluster can be quiesced, powered off and on, then resumed. This reloads the code into the cluster, which might clear a hung process. If the failure is still present, then the next action is needed.
  • The code is reloaded onto the cluster hard disk drives. An important part of this process is the saving and restoring of the configuration and customization files. This allows the cluster to restore access to the customer data after the process is complete. If the failure is still present, then the next action is needed.
  • The next level of support is contacted. Support personnel can access the system remotely and perform functions similar to those of an AIX system administrator.
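
The escalation order of the three actions above can be sketched as a small lookup. This is an illustrative sketch only and not part of any service tooling: the names `Action`, `ESCALATION_ORDER`, and `next_action` are hypothetical and were chosen for this example.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    """The three recommended actions (hypothetical names for illustration)."""
    POWER_CYCLE = "quiesce, power off and on, then resume the cluster"
    RELOAD_CODE = "reload the code onto the cluster hard disk drives"
    ESCALATE = "contact the next level of support"


# Ordered from least to most invasive, matching the list above.
ESCALATION_ORDER = [Action.POWER_CYCLE, Action.RELOAD_CODE, Action.ESCALATE]


def next_action(last_failed: Optional[Action]) -> Optional[Action]:
    """Return the action to attempt next.

    last_failed is the most recent action after which the failure was
    still present, or None if nothing has been attempted yet.
    Returns None once every action has been tried.
    """
    if last_failed is None:
        return ESCALATION_ORDER[0]
    idx = ESCALATION_ORDER.index(last_failed)
    if idx + 1 < len(ESCALATION_ORDER):
        return ESCALATION_ORDER[idx + 1]
    return None
```

For example, `next_action(Action.POWER_CYCLE)` returns `Action.RELOAD_CODE`, reflecting that a code reload is attempted only when the failure persists after the power cycle.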

MAP4760 Section-1

Procedure

  1. Does the serviceable event SRC = BE1E2120?
    • Yes, contact your next level of support. This SRC indicates a mismatch of the code levels on the CEC enclosures (clusters) during the LIC update.
    • No, continue with the next step.
  2. Read the description section above.
  3. Reload the CEC enclosure code by quiescing, powering off, powering on, and then resuming the CEC enclosure. To do this, use the Exchange Parts option to perform a pseudo repair of the System Processor Card FRU in the CEC enclosure. In a pseudo repair, you follow the guided FRU replacement process, but when directed to replace the FRU, you do not physically replace it.
    1. From the navigation area, click Storage Facility Management > storage facility.
    2. From the bottom Task area, click Exchange Parts. Select the exchange option that contains the FRU to be exchanged. A window opens and displays the enclosures.
    3. Follow the guided procedures, but do not physically remove the FRU.
    4. Complete the pseudo FRU replace.
  4. Display open serviceable events to determine whether the CEC code is still failing. Either the original serviceable event will have updated timestamps in the details view, or a new serviceable event will have been created. Does the CEC code appear to still be failing?