MAP5240 Customer data checks

A data check failure caused customer data to be unreadable.

Before you begin

Note: Customer disruption might occur if microcode and power boundaries are not in the proper conditions for this service action. Verify that you start all service activities in Start here.

About this task

You are here to resolve a data check failure that was logged with one of the SRC BE50xxxx values listed below. An action to repair hardware or microcode is necessary. This action is to repair another open serviceable event.

The customer must restore the data after the hardware or microcode repair action is complete.

This MAP isolates for the following SRC BE50s:
  • SRC BE504900, Failing track log available for the next level of support.
  • SRC BE504901, Data loss events reported on a system that is configured in a copy services relationship as a secondary. Data is recovered on the next establish-pair operation.
  • SRC BE504902, Data loss events reported on a system that is configured in a copy services relationship as a secondary. Data is not recovered on the next establish-pair operation.
  • SRC BE504903, Data loss events reported on a system that is configured in a copy services relationship as a secondary. Data is not recovered on the next establish-pair operation.
  • SRC BE504910, Customer data check, DDM media error, single LBA.
  • SRC BE504911, Data check (medium error) occurred on one sector on a system that is configured in a copy services relationship as a secondary.
  • SRC BE504920, Customer data check, DDM media error, multiple LBAs.
  • SRC BE504921, Data check (medium error) occurred on multiple sectors on a system that is configured in a copy services relationship as a secondary.
  • SRC BE504930, Customer data check, data LRC, single LBA.
  • SRC BE504940, Customer data check, data LRC, multiple LBAs.
  • SRC BE5049F1, Error information record LRC error, possible data check.
  • SRC BE5049F2, Data check error identified in Cache with a zHyperLink write.
  • SRC BE5049F3, Data check error that is identified in NVS with a zHyperLink write.

MAP5240 Section-1

Procedure

  1. Refer to Table 1 for the SRC BE50 that requires problem resolution. Determine the necessary hardware or microcode repair action.

    If no hardware repair action is available for this failure, the failure might be intermittent. If the data failure continues, contact your next level of support for assistance in isolating and repairing the problem.

    Table 1. Customer data checks failure SRC BE50 repairs
    SRC BE50 Description Recommended Action
    BE504900
    Description: A failing track log was created.
    Recommended action: The following actions can occur at the same time:
    • Contact your next level of support. They use remote support to access the failing track log file. They then use a tool to create a list of volumes and tracks for the customer to use for data recovery.
    • Repair any related open serviceable events with SRCs BE504930 or BE504940. The FRU replacements do not recover the data but do restore hardware functions.
    BE504901
    Description: Data loss events reported on a system that is configured in a copy services relationship as a secondary. Data is recovered on the next establish-pair operation.
    Recommended action: A track log file exists on the system. However, because the system is configured in a copy services relationship as a secondary, all data is recovered on the next establish-pair operation.

    If the copy services relationship is Global Mirror, no further action is needed (that is, the establish-pair operation is triggered automatically). Otherwise, the customer needs to reestablish the copy services relationship with their preferred client management software.

    BE504902
    Description: Data loss events reported on a system that is configured in a copy services relationship as a secondary. Data is not recovered on the next establish-pair operation.
    Recommended action: Contact your next level of support. They use remote support to access the failing track log file. They then use a tool to create a list of volumes and tracks for the customer to use for data recovery.
    BE504903
    Description: Data loss events reported on a system that is configured in a copy services relationship as a secondary. Data is not recovered on the next establish-pair operation.
    Recommended action: Contact your next level of support. A failing track log file is available; however, the number of errors surpassed the capacity of the track log file.
    BE504910
    BE504920
    Description: Customer data checks affecting one or more logical block addresses (LBAs) on the target volume. BE504910 indicates one LBA; BE504920 indicates more than one LBA.

    The device adapter card or Fibre Channel interface card (FCIC) reported a media error during a data transfer from the DDM to cache memory.

    Recommended action: Locate and repair the problem with SRC BE3XXXXX or BE8XXXXX that contains a repair action for the DDM, device adapter, or FCIC card that is associated with this data check.
    BE504911
    BE504921
    Description: Customer data checks affecting one or more logical block addresses (LBAs) on a system that is configured in a copy services relationship as a secondary. BE504911 indicates one LBA; BE504921 indicates more than one LBA.

    The device adapter card or Fibre Channel interface card (FCIC) reported a media error during a data transfer from the DDM to cache memory.

    Recommended action: Locate and repair the problem with SRC BE3XXXXX or BE8XXXXX that contains a repair action for the DDM, device adapter, or FCIC card that is associated with this data check.
    BE504930
    BE504940
    Description: Customer data checks affecting one or more LBAs on the target volume. BE504930 indicates one LBA; BE504940 indicates more than one LBA.

    An LRC check, sequence number check, or physical address check that was detected during a data transfer was not recovered. Data has been marked defective on the DDM. Subsequent attempts to read this data fail.

    Recommended action: Locate and repair any problems with SRC BE3XXXXX or BE8XXXXX.

    If no problems with SRC BE3XXXXX or BE8XXXXX are found, the host might have recovered by rewriting the data. Proceed with the actions in Table 2 to determine if any data is still affected.

    BE5049F1
    Description: Error information record LRC error, possible data check.

    An LRC check was detected in an error information record, and the record is unreadable. A customer data check is possible.

    Recommended action: Contact your next level of support to determine which data might be affected.
    BE5049F2
    BE5049F3
    Description: A data check error was identified in association with a zHyperLink write.
    Recommended action: Contact your next level of support to determine whether an underlying hardware condition needs to be repaired.
  2. After the underlying hardware has been repaired, a customer repair action described in Table 2 is required to restore the track.
    Table 2. Customer repair actions
    Option: Description
    Fixed block: Refer to the additional FRUs in the problem or other related problems for the failed volumes and first failing LBA on track information. Ask the customer to use appropriate customer media maintenance tools to scan all data on the applicable volumes. Restore this data from backup.
    CKD: A media SIM for Media Maintenance Procedure 2 was sent to the host. Ask the customer to follow this procedure to return the track to a usable condition, then restore the customer data from backup. Media Maintenance Procedure 2 is described in MAP5240 Section-2 (Analyzing a Media SIM).
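
The Table 1 triage can be summarized as a simple lookup. The following Python sketch is illustrative only: the function name `recommended_action` and the shortened action strings are assumptions made for this example and are not part of any IBM tool. Table 1 remains the authoritative wording.

```python
# Hypothetical triage helper. The action strings are abbreviated summaries of
# Table 1; names and wording here are illustrative assumptions.
ESCALATE = "Contact next level of support"
REPAIR_RELATED = "Repair the related SRC BE3XXXXX or BE8XXXXX problem"

ACTIONS = {
    "BE504900": ESCALATE + "; also repair related open BE504930/BE504940 events",
    "BE504901": "Reestablish the copy services pair (automatic for Global Mirror)",
    "BE504902": ESCALATE,
    "BE504903": ESCALATE,
    "BE504910": REPAIR_RELATED,
    "BE504920": REPAIR_RELATED,
    "BE504911": REPAIR_RELATED,
    "BE504921": REPAIR_RELATED,
    "BE504930": REPAIR_RELATED + "; if none is found, continue with Table 2",
    "BE504940": REPAIR_RELATED + "; if none is found, continue with Table 2",
    "BE5049F1": ESCALATE,
    "BE5049F2": ESCALATE,
    "BE5049F3": ESCALATE,
}


def recommended_action(src: str) -> str:
    """Return the summarized Table 1 action for an SRC handled by this MAP."""
    key = src.strip().upper()
    if key not in ACTIONS:
        raise ValueError(f"{src!r} is not an SRC isolated by MAP5240")
    return ACTIONS[key]


if __name__ == "__main__":
    print(recommended_action("be504901"))
```

An unknown SRC raises an error rather than returning a default, because an SRC outside this list is handled by a different MAP.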

MAP5240 Section-2 (Analyzing a Media SIM)

About this task

For information about correcting a failure that causes a media SIM, refer to Maintaining IBM® Storage Subsystem Media.
Note: Before the customer does a media maintenance procedure, the customer might need to determine the address of the cylinder and head that are involved in the failure. Use the SIM portion of an EREP system execution report to obtain the address (cccchh).

Instruct the customer to complete the media maintenance procedure indicated in MAP5240 Section-3 (Media SIM Maintenance Procedure 2). In addition, refer to the examples in MAP5240 Section-4 (Example of Media SIM Maintenance Procedure 2).

MAP5240 Section-3 (Media SIM Maintenance Procedure 2)

About this task

The first part of this procedure finds all tracks with unrecoverable data and supplies information on the allocation of the user data (for example, data set names).

The second part of this procedure returns the indicated track to a usable condition. Data on this track is no longer readable. All subsystem attempts at media maintenance are unsuccessful. All attempts to recover the data are unsuccessful.

Procedure

  1. Using ICKDSF Release 16 or higher, enter the following commands:
    IODELAY SET MSEC(100)
    ANALYZE <UNIT() | DDNAME()> NODRIVE SCAN

    IODELAY adjusts ICKDSF to run concurrently with customer operations. ANALYZE scans the volume for data that is not readable or usable.

  2. See MAP5240 Section-4 (Example of Media SIM Maintenance Procedure 2) for the location of the BE5049XX and addresses of the failing track and head (cccchh) in the Analyze sense information.
  3. For each track that reports an SRC BE5049XX, enter the following command (all on the same line):
    INSPECT <UNIT() | DDNAME()> <VFY()|NOVFY>
    ASSIGN NOCHECK NOPRESERVE TRACK(cccc,hh)
    Attention: The above ICKDSF INSPECT command results in the loss of all customer data on that track.

    The NOPRESERVE parameter must be specified for the DS8000®. The PRESERVE parameter is not valid for the DS8000. All previous attempts by the subsystem to recover the data were unsuccessful. Although the track is returned to a usable state, all customer data on the specified track is lost when the INSPECT command is run.
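
Step 3 can be illustrated with a short Python sketch that only formats the command text for one failing track. It does not run ICKDSF. The function name `inspect_command` is hypothetical, the choice of UNIT() with NOVFY is one of the alternatives shown in the template above, and the `X'...'` hexadecimal notation for the cylinder and head values is an assumption that must be verified against the ICKDSF documentation for the installed release before the command is used.

```python
def inspect_command(unit: str, cccc: str, hh: str) -> str:
    """Format the step 3 INSPECT command for one track (text only, nothing is run).

    unit: device address as entered in UNIT(), for example "1290"
    cccc: cylinder address as hexadecimal digits, for example "03A4"
    hh:   head address as hexadecimal digits, for example "01"
    """
    # Reject anything that is not hexadecimal before building the command.
    int(cccc, 16)
    int(hh, 16)
    # ASSUMPTION: hexadecimal values are written as X'...' in the TRACK parameter.
    track = f"TRACK(X'{cccc.upper()}',X'{hh.upper()}')"
    # The whole command is kept on one line, as the procedure requires.
    return f"INSPECT UNIT({unit}) NOVFY ASSIGN NOCHECK NOPRESERVE {track}"


if __name__ == "__main__":
    print(inspect_command("1290", "03A4", "01"))
```

Because the INSPECT command destroys all customer data on the track, generated text such as this is a drafting aid only and needs to be reviewed by the customer before it is entered.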

MAP5240 Section-4 (Example of Media SIM Maintenance Procedure 2)

About this task

To locate all tracks with unrecoverable data, obtain information on the allocation of user data, and restore such tracks to a usable condition, run the ICKDSF command sequence that is described in MAP5240 Section-3 (Media SIM Maintenance Procedure 2). ICKDSF must be at Release 16 or higher. The following example shows output from the ANALYZE command; the values of interest are identified in the note that follows.
ENTER INPUT COMMAND:
analyze unit(1290) nodrive scan
 ANALYZE UNIT(1290) NODRIVE SCAN

ICK00700I DEVICE INFORMATION FOR 1290 IS CURRENTLY AS FOLLOWS:
      PHYSICAL DEVICE = XXXX
      STORAGE CONTROLLER = XXXX
      STORAGE CONTROL DESCRIPTOR = CC
      DEVICE DESCRIPTOR = 06
ICK04000I DEVICE IS IN SIMPLEX STATE
ICK01400I 1290 ANALYZE STARTED
ICK01408I 1290 DATA VERIFICATION TEST STARTED
ICK21776I DATAVER TEST: ERROR DURING DATA VERIFICATION
CSW = D07C88 0200FFFF CCW = DE000000 3000FFFF FILEMASK = 1E
SENSE = 80000000 9000010B 00000034 80000004 02007667 FB200F0B 000040E2 0003A401
ICK21401I 1290 SUSPECTED DRIVE PROBLEM
ICK01406I 1290 ANALYZE ENDED
ICK00001I FUNCTION COMPLETED, HIGHEST CONDITION CODE WAS 8
Note: In this example, the SRC BE35 is 0F0B and the failing track and head address (cccchh) is 03A401. The cccc is 03A4 and the hh is 01.
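
The decoding in the note can be sketched in Python. This is a minimal illustration, not a supported tool: the function name `parse_sense` is hypothetical, and the byte offsets (bytes 22-23 for the 0F0B value, bytes 29-31 for cccchh) are assumptions inferred only from the single example above. Confirm the offsets against the sense data documentation before relying on them.

```python
import re

# Matches a line such as "SENSE = 80000000 9000010B ... 0003A401"
# (eight words of eight hexadecimal digits, 32 sense bytes in total).
SENSE_RE = re.compile(r"SENSE\s*=\s*((?:[0-9A-Fa-f]{8}\s*){8})")


def parse_sense(line: str) -> dict:
    """Extract the values named in the note from one ICKDSF SENSE line."""
    match = SENSE_RE.search(line)
    if match is None:
        raise ValueError("no 32-byte SENSE field found")
    data = bytes.fromhex("".join(match.group(1).split()))
    return {
        # ASSUMPTION: offsets inferred from the example in this section only.
        "code": data[22:24].hex().upper(),  # 0F0B in the example
        "cccc": data[29:31].hex().upper(),  # cylinder, 03A4 in the example
        "hh": data[31:32].hex().upper(),    # head, 01 in the example
    }


if __name__ == "__main__":
    example = ("SENSE = 80000000 9000010B 00000034 80000004 "
               "02007667 FB200F0B 000040E2 0003A401")
    print(parse_sense(example))
```

Running the sketch on the SENSE line from the example yields the same values as the note: 0F0B, cylinder 03A4, and head 01.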

MAP5240 Section-5 (Common ICKDSF Messages)

About this task

ICK31054I
Device not supported for specific function.

Ensure that the parameters that are specified in the media maintenance procedure are correct and rerun the ICKDSF media maintenance procedure.

ICK12155I
Parameter ignored for device type (parameter).

The parameter identified is not valid for the DS8000. This parameter is ignored and processing continues. No action is needed.