Repairing a disk in RAID 1

SMART output:

SMART Error Log Version: 1
ATA Error Count: 1658 (device log contains only the most recent five errors)
        CR = Command Register [HEX]
        FR = Features Register [HEX]
        SC = Sector Count Register [HEX]
        SN = Sector Number Register [HEX]
        CL = Cylinder Low Register [HEX]
        CH = Cylinder High Register [HEX]
        DH = Device/Head Register [HEX]
        DC = Device Command Register [HEX]
        ER = Error register [HEX]
        ST = Status register [HEX]
Powered_Up_Time is measured from power on, and printed as
DDd+hh:mm:SS.sss where DD=days, hh=hours, mm=minutes,
SS=sec, and sss=millisec. It "wraps" after 49.710 days.

Error 1658 occurred at disk power-on lifetime: 41239 hours (1718 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 41 20 40 bc 93 e5  Error: ABRT 32 sectors at LBA = 0x0593bc40 = 93568064

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 20 40 bc 93 e5 00  32d+06:02:21.127  READ DMA
  ca 00 08 80 a7 4f e6 00  32d+06:02:21.121  WRITE DMA
  ef 10 02 00 00 00 a0 00  32d+06:02:21.117  SET FEATURES [Reserved for Serial ATA]

Error 1657 occurred at disk power-on lifetime: 41239 hours (1718 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 41 80 00 aa 0c e7  Error: ABRT 128 sectors at LBA = 0x070caa00 = 118270464

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 80 00 aa 0c e7 00  32d+06:01:22.836  READ DMA
  ef 10 02 00 00 00 a0 00  32d+06:01:22.835  SET FEATURES [Reserved for Serial ATA]
  ec 00 00 00 00 00 a0 00  32d+06:01:22.834  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00  32d+06:01:22.834  SET FEATURES [Set transfer mode]

Error 1656 occurred at disk power-on lifetime: 41239 hours (1718 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 41 80 00 aa 0c e7  Error: ABRT 128 sectors at LBA = 0x070caa00 = 118270464

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 80 00 aa 0c e7 00  32d+06:00:58.122  READ DMA
  ef 10 02 00 00 00 a0 00  32d+06:00:58.122  SET FEATURES [Reserved for Serial ATA]
  ec 00 00 00 00 00 a0 00  32d+06:00:58.120  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00  32d+06:00:58.120  SET FEATURES [Set transfer mode]

Error 1655 occurred at disk power-on lifetime: 41239 hours (1718 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 41 80 00 aa 0c e7  Error: ABRT 128 sectors at LBA = 0x070caa00 = 118270464

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 80 00 aa 0c e7 00  32d+06:00:33.468  READ DMA
  ef 10 02 00 00 00 a0 00  32d+06:00:33.468  SET FEATURES [Reserved for Serial ATA]
  ec 00 00 00 00 00 a0 00  32d+06:00:33.467  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00  32d+06:00:33.467  SET FEATURES [Set transfer mode]

Error 1654 occurred at disk power-on lifetime: 41239 hours (1718 days + 7 hours)
  When the command that caused the error occurred, the device was active or idle.

  After command completion occurred, registers were:
  ER ST SC SN CL CH DH
  -- -- -- -- -- -- --
  04 41 80 00 aa 0c e7  Error: ABRT 128 sectors at LBA = 0x070caa00 = 118270464

  Commands leading to the command that caused the error were:
  CR FR SC SN CL CH DH DC   Powered_Up_Time  Command/Feature_Name
  -- -- -- -- -- -- -- --  ----------------  --------------------
  c8 00 80 00 aa 0c e7 00  32d+06:00:08.756  READ DMA
  ef 10 02 00 00 00 a0 00  32d+06:00:08.755  SET FEATURES [Reserved for Serial ATA]
  ec 00 00 00 00 00 a0 00  32d+06:00:08.754  IDENTIFY DEVICE
  ef 03 46 00 00 00 a0 00  32d+06:00:08.754  SET FEATURES [Set transfer mode]

SMART Self-test log structure revision number 1
Num  Test_Description    Status                  Remaining  LifeTime(hours)  LBA_of_first_error
# 1  Short offline       Completed without error       00%     41252         -

SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
  After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.

What do you think?
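For readers wondering where the LBA values in the log come from: each one can be rebuilt from the register bytes printed on the same line. The sketch below assumes 28-bit LBA addressing, which matches the READ DMA commands in this log; the function name `lba28_from_registers` is made up for illustration.

```shell
#!/bin/sh
# Rebuild a 28-bit LBA from the ATA registers that smartctl prints.
# Arguments: SN CL CH DH as hex digits without the 0x prefix.
# (lba28_from_registers is a hypothetical helper name.)
lba28_from_registers() {
    sn=$1; cl=$2; ch=$3; dh=$4
    # SN holds LBA bits 0-7, CL bits 8-15, CH bits 16-23.
    # Only the low nibble of DH belongs to the address (bits 24-27);
    # its high nibble carries flag bits and is masked off.
    echo $(( ((0x$dh & 0x0f) << 24) | (0x$ch << 16) | (0x$cl << 8) | 0x$sn ))
}

lba28_from_registers 40 bc 93 e5   # error 1658, prints 93568064
lba28_from_registers 00 aa 0c e7   # errors 1654-1657, prints 118270464
```

Both results agree with the decimal LBAs that smartctl reports in the error lines above.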

29.1.2017 20:34 lertimir | score: 64 | blog: Par_slov
Re: Repairing a disk in RAID 1

A short test does not read the whole disk, but the test result is beside the point. The disk has logged errors, so it is a poor choice for any important data.
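If a full surface scan is wanted anyway, an extended self-test could be started roughly as follows. This is only a sketch: `/dev/sdX` is a placeholder, the function names and the `SMART_EXECUTE` switch are invented for this example, and by default the script only prints the smartctl commands instead of running them.

```shell
#!/bin/sh
# Print the smartctl commands by default; set SMART_EXECUTE=1 to really run them.
# (SMART_EXECUTE and the function names are hypothetical, not part of smartctl.)
SMART_EXECUTE=${SMART_EXECUTE:-0}

run() {
    if [ "$SMART_EXECUTE" = 1 ]; then
        "$@"
    else
        echo "$*"
    fi
}

# Start an extended (long) self-test, which reads the whole disk surface.
start_long_test() {
    run smartctl -t long "$1"
}

# Show the self-test log, where the result appears once the test has finished.
show_selftest_log() {
    run smartctl -l selftest "$1"
}

start_long_test /dev/sdX
show_selftest_log /dev/sdX
```

The long test runs in the background on the drive itself, so the log has to be checked again later.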

29.1.2017 22:52 Max | score: 73 | blog: Max_Devaine
Re: Repairing a disk in RAID 1

Why don't you post the full output (smartctl -a /dev/sdx)?
Cheers, Max

I had a dream ... :(

It may be an isolated defect and the disk will keep working without problems for many more years, but if you can, it is of course better to replace the disk.

This rewrites the unreadable sectors:

echo repair > /sys/block/md0/md/sync_action

Afterwards you can read in /sys/block/md0/md/mismatch_cnt how many mismatched sectors it found. It makes sense to run echo check > sync_action from cron (some distributions even do this automatically every month).
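The two steps could be wrapped in small helpers like the ones below. This is a sketch, not a tested maintenance script: the function names are invented here, and the md sysfs directory is passed in as an argument (for the array in this thread that would be /sys/block/md0/md).

```shell
#!/bin/sh
# Start a scrub on an md array. $1 is the array's md sysfs directory,
# $2 is "check" (read and count only) or "repair" (also rewrite).
# (md_start_scrub and md_scrub_status are hypothetical helper names.)
md_start_scrub() {
    case ${2:-check} in
        check|repair) ;;
        *) echo "unsupported action: $2" >&2; return 1 ;;
    esac
    echo "${2:-check}" > "$1/sync_action"
}

# Print the current action and the mismatch counter.
# sync_action reads back "idle" once the kernel has finished the pass.
md_scrub_status() {
    printf 'sync_action=%s mismatch_cnt=%s\n' \
        "$(cat "$1/sync_action")" "$(cat "$1/mismatch_cnt")"
}
```

Typical use as root would be `md_start_scrub /sys/block/md0/md check`, then `md_scrub_status /sys/block/md0/md` once the pass is done.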

Apart from that, important data of course needs to be backed up somewhere else (always)…

30.1.2017 08:23 lertimir | score: 64 | blog: Par_slov
Re: Repairing a disk in RAID 1

It may be an isolated defect and the disk will keep working without problems for many more years, but if you can, it is of course better to replace the disk.

I would not give it that much hope. Even this excerpt of a few errors shows 2 different LBAs, and there are over 1500 errors in total. It is fine for unimportant data downloaded from the net, but do not use it for anything of your own that is important and irreplaceable.
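The count of distinct failing LBAs can be pulled out of the error log with a short filter. The sketch below reads smartctl error log text on stdin; the function name `count_error_lbas` is made up, and since the device log keeps only the five most recent errors, it cannot say anything about the other 1653.

```shell
#!/bin/sh
# Count how often each LBA appears in "Error: ... at LBA = 0x..." lines.
# Reads smartctl error log text on stdin, prints "<count> <lba>", most frequent first.
# (count_error_lbas is a hypothetical helper name.)
count_error_lbas() {
    awk '/Error: .* at LBA = / {
             # The hex LBA is the second field after the word "LBA".
             for (i = 1; i <= NF; i++) if ($i == "LBA") n[$(i + 2)]++
         }
         END { for (lba in n) print n[lba], lba }' | sort -rn
}

# Two of the error lines from the log posted above.
count_error_lbas <<'EOF'
  04 41 20 40 bc 93 e5  Error: ABRT 32 sectors at LBA = 0x0593bc40 = 93568064
  04 41 80 00 aa 0c e7  Error: ABRT 128 sectors at LBA = 0x070caa00 = 118270464
EOF
```

On a live system the input could come from `smartctl -l error /dev/sdX`, with `/dev/sdX` as a placeholder for the real device.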
