AbcLinuxu portal, 14 May 2025, 01:15
[ 8342.286722] ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 8342.286757] ata2.01: failed command: READ DMA EXT
[ 8342.286790] ata2.01: cmd 25/00:00:00:07:01/00:04:3d:00:00/f0 tag 0 dma 524288 in
[ 8342.286791] res 51/40:cf:2e:07:01/40:03:3d:00:00/f0 Emask 0x9 (media error)
[ 8342.286867] ata2.01: status: { DRDY ERR }
[ 8342.286888] ata2.01: error: { UNC }
[ 8342.287638] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 8342.287671] ata1.00: BMDMA stat 0x65
[ 8342.287697] ata1.00: failed command: READ DMA EXT
[ 8342.287727] ata1.00: cmd 25/00:00:00:06:01/00:01:3d:00:00/e0 tag 0 dma 131072 in
[ 8342.287728] res 51/40:3f:b0:06:01/40:00:3d:00:00/e0 Emask 0x9 (media error)
[ 8342.287804] ata1.00: status: { DRDY ERR }
[ 8342.287825] ata1.00: error: { UNC }
[ 8342.301438] ata2.00: configured for UDMA/133
[ 8342.311412] ata1.00: configured for UDMA/133
[ 8342.311586] ata2.01: configured for UDMA/133
[ 8342.311738] ata2: EH complete
[ 8342.322683] ata1.01: configured for UDMA/133
[ 8342.322756] ata1: EH complete
[ 8344.068295] ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 8344.068330] ata2.01: failed command: READ DMA EXT
[ 8344.068362] ata2.01: cmd 25/00:00:00:07:01/00:04:3d:00:00/f0 tag 0 dma 524288 in
[ 8344.068363] res 51/40:cf:2e:07:01/40:03:3d:00:00/f0 Emask 0x9 (media error)
[ 8344.068439] ata2.01: status: { DRDY ERR }
[ 8344.068460] ata2.01: error: { UNC }
[ 8344.093995] ata2.00: configured for UDMA/133
[ 8344.110243] ata2.01: configured for UDMA/133
[ 8344.110389] ata2: EH complete
[ 8344.128133] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 8344.128166] ata1.00: BMDMA stat 0x65
[ 8344.128192] ata1.00: failed command: READ DMA EXT
[ 8344.128221] ata1.00: cmd 25/00:00:00:06:01/00:01:3d:00:00/e0 tag 0 dma 131072 in
[ 8344.128222] res 51/40:3f:b0:06:01/40:00:3d:00:00/e0 Emask 0x9 (media error)
[ 8344.128298] ata1.00: status: { DRDY ERR }
[ 8344.128320] ata1.00: error: { UNC }
[ 8344.154050] ata1.00: configured for UDMA/133
[ 8344.162191] ata1.01: configured for UDMA/133
[ 8344.162242] ata1: EH complete
[ 8345.883860] ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 8345.883895] ata2.01: failed command: READ DMA EXT
[ 8345.883929] ata2.01: cmd 25/00:00:00:07:01/00:04:3d:00:00/f0 tag 0 dma 524288 in
[ 8345.883930] res 51/40:cf:2e:07:01/40:03:3d:00:00/f0 Emask 0x9 (media error)
[ 8345.884006] ata2.01: status: { DRDY ERR }
[ 8345.884028] ata2.01: error: { UNC }
[ 8345.899530] ata2.00: configured for UDMA/133
[ 8345.914672] ata2.01: configured for UDMA/133
[ 8345.914801] ata2: EH complete
[ 8345.976766] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 8345.976799] ata1.00: BMDMA stat 0x65
[ 8345.976826] ata1.00: failed command: READ DMA EXT
[ 8345.976857] ata1.00: cmd 25/00:00:00:06:01/00:01:3d:00:00/e0 tag 0 dma 131072 in
[ 8345.976858] res 51/40:3f:b0:06:01/40:00:3d:00:00/e0 Emask 0x9 (media error)
[ 8345.976934] ata1.00: status: { DRDY ERR }
[ 8345.976955] ata1.00: error: { UNC }
[ 8345.998597] ata1.00: configured for UDMA/133
[ 8346.008825] ata1.01: configured for UDMA/133
[ 8346.008899] ata1: EH complete
[ 8347.707417] ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 8347.707453] ata2.01: failed command: READ DMA EXT
[ 8347.707485] ata2.01: cmd 25/00:00:00:07:01/00:04:3d:00:00/f0 tag 0 dma 524288 in
[ 8347.707486] res 51/40:cf:2e:07:01/40:03:3d:00:00/f0 Emask 0x9 (media error)
[ 8347.707562] ata2.01: status: { DRDY ERR }
[ 8347.707583] ata2.01: error: { UNC }
[ 8347.724125] ata2.00: configured for UDMA/133
[ 8347.731229] ata2.01: configured for UDMA/133
[ 8347.731358] ata2: EH complete
[ 8347.825403] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 8347.825436] ata1.00: BMDMA stat 0x65
[ 8347.825457] ata1.00: failed command: READ DMA EXT
[ 8347.825481] ata1.00: cmd 25/00:00:00:06:01/00:01:3d:00:00/e0 tag 0 dma 131072 in
[ 8347.825482] res 51/40:3f:b0:06:01/40:00:3d:00:00/e0 Emask 0x9 (media error)
[ 8347.825566] ata1.00: status: { DRDY ERR }
[ 8347.825588] ata1.00: error: { UNC }
[ 8347.851206] ata1.00: configured for UDMA/133
[ 8347.859317] ata1.01: configured for UDMA/133
[ 8347.859370] ata1: EH complete
[ 8349.522992] ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 8349.523028] ata2.01: failed command: READ DMA EXT
[ 8349.523063] ata2.01: cmd 25/00:00:00:07:01/00:04:3d:00:00/f0 tag 0 dma 524288 in
[ 8349.523064] res 51/40:cf:2e:07:01/40:03:3d:00:00/f0 Emask 0x9 (media error)
[ 8349.523140] ata2.01: status: { DRDY ERR }
[ 8349.523162] ata2.01: error: { UNC }
[ 8349.536657] ata2.00: configured for UDMA/133
[ 8349.544929] ata2.01: configured for UDMA/133
[ 8349.545079] ata2: EH complete
[ 8349.673923] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 8349.673956] ata1.00: BMDMA stat 0x65
[ 8349.678896] ata1.00: failed command: READ DMA EXT
[ 8349.678923] ata1.00: cmd 25/00:00:00:06:01/00:01:3d:00:00/e0 tag 0 dma 131072 in
[ 8349.678924] res 51/40:3f:b0:06:01/40:00:3d:00:00/e0 Emask 0x9 (media error)
[ 8349.679001] ata1.00: status: { DRDY ERR }
[ 8349.679022] ata1.00: error: { UNC }
[ 8349.700833] ata1.00: configured for UDMA/133
[ 8349.709016] ata1.01: configured for UDMA/133
[ 8349.709086] ata1: EH complete
[ 8351.321496] ata2.01: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 8351.321531] ata2.01: failed command: READ DMA EXT
[ 8351.321564] ata2.01: cmd 25/00:00:00:07:01/00:04:3d:00:00/f0 tag 0 dma 524288 in
[ 8351.321565] res 51/40:cf:2e:07:01/40:03:3d:00:00/f0 Emask 0x9 (media error)
[ 8351.321641] ata2.01: status: { DRDY ERR }
[ 8351.321662] ata2.01: error: { UNC }
[ 8351.345124] ata2.00: configured for UDMA/133
[ 8351.361398] ata2.01: configured for UDMA/133
[ 8351.361540] sd 1:0:1:0: [sdd] Unhandled sense code
[ 8351.361563] sd 1:0:1:0: [sdd] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 8351.361591] sd 1:0:1:0: [sdd] Sense Key : Medium Error [current] [descriptor]
[ 8351.361626] Descriptor sense data with sense descriptors (in hex):
[ 8351.361652] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 8351.361711] 3d 01 07 2e
[ 8351.361741] sd 1:0:1:0: [sdd] Add. Sense: Unrecovered read error - auto reallocate failed
[ 8351.361786] sd 1:0:1:0: [sdd] CDB: Read(10): 28 00 3d 01 07 00 00 04 00 00
[ 8351.361839] end_request: I/O error, dev sdd, sector 1023477550
[ 8351.361906] ata2: EH complete
[ 8351.522382] ata1.00: exception Emask 0x0 SAct 0x0 SErr 0x0 action 0x0
[ 8351.522408] ata1.00: BMDMA stat 0x65
[ 8351.522428] ata1.00: failed command: READ DMA EXT
[ 8351.522453] ata1.00: cmd 25/00:00:00:06:01/00:01:3d:00:00/e0 tag 0 dma 131072 in
[ 8351.522454] res 51/40:3f:b0:06:01/40:00:3d:00:00/e0 Emask 0x9 (media error)
[ 8351.522530] ata1.00: status: { DRDY ERR }
[ 8351.522551] ata1.00: error: { UNC }
[ 8351.539284] ata1.00: configured for UDMA/133
[ 8351.547507] ata1.01: configured for UDMA/133
[ 8351.547566] sd 0:0:0:0: [sda] Unhandled sense code
[ 8351.547590] sd 0:0:0:0: [sda] Result: hostbyte=DID_OK driverbyte=DRIVER_SENSE
[ 8351.547619] sd 0:0:0:0: [sda] Sense Key : Medium Error [current] [descriptor]
[ 8351.547654] Descriptor sense data with sense descriptors (in hex):
[ 8351.547681] 72 03 11 04 00 00 00 0c 00 0a 80 00 00 00 00 00
[ 8351.547745] 3d 01 06 b0
[ 8351.547775] sd 0:0:0:0: [sda] Add. Sense: Unrecovered read error - auto reallocate failed
[ 8351.547820] sd 0:0:0:0: [sda] CDB: Read(10): 28 00 3d 01 06 00 00 01 00 00
[ 8351.547873] end_request: I/O error, dev sda, sector 1023477424
[ 8351.547904] ata1: EH complete
[ 8351.563692] raid5:md1: read error corrected (8 sectors at 1023280936 on sdd5)
[ 8351.564159] raid5:md1: read error corrected (8 sectors at 1023280944 on sdd5)
[ 8351.564187] raid5:md1: read error corrected (8 sectors at 1023280952 on sdd5)
[ 8351.564214] raid5:md1: read error corrected (8 sectors at 1023280960 on sdd5)
[ 8351.564240] raid5:md1: read error corrected (8 sectors at 1023280968 on sdd5)
[ 8351.564266] raid5:md1: read error corrected (8 sectors at 1023280976 on sdd5)
[ 8351.564293] raid5:md1: read error corrected (8 sectors at 1023280984 on sdd5)
[ 8351.564319] raid5:md1: read error corrected (8 sectors at 1023280992 on sdd5)
[ 8351.564346] raid5:md1: read error corrected (8 sectors at 1023281000 on sdd5)
[ 8351.564372] raid5:md1: read error corrected (8 sectors at 1023281008 on sdd5)
[15996.739851] md: md1: resync done.
[15996.868645] RAID5 conf printout:
[15996.868672] --- rd:4 wd:4
[15996.868696] disk 0, o:1, dev:sda5
[15996.868716] disk 1, o:1, dev:sdb5
[15996.868736] disk 2, o:1, dev:sdc5
[15996.868756] disk 3, o:1, dev:sdd5
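The failing sector can be read directly from the `res` line of each libata error. A minimal sketch, assuming the usual libata taskfile order for a 48-bit command (status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device); the function name `lba_from_res` is made up for illustration:

```shell
#!/bin/sh
# Decode the 48-bit LBA from a libata "res" taskfile string.
# Assumed field order:
#   status/error:nsect:lbal:lbam:lbah/hob_feature:hob_nsect:hob_lbal:hob_lbam:hob_lbah/device
lba_from_res() {
    # Split the string on '/' and ':' into positional parameters.
    old_ifs=$IFS
    IFS='/:'
    set -- $1
    IFS=$old_ifs
    # $4=lbal $5=lbam $6=lbah $9=hob_lbal ${10}=hob_lbam ${11}=hob_lbah
    # The LBA is the six bytes concatenated, most significant first.
    printf '%d\n' "0x${11}${10}${9}${6}${5}${4}"
}

lba_from_res "51/40:cf:2e:07:01/40:03:3d:00:00/f0"   # ata2.01 (sdd): prints 1023477550
lba_from_res "51/40:3f:b0:06:01/40:00:3d:00:00/e0"   # ata1.00 (sda): prints 1023477424
```

Both results agree with the `end_request: I/O error` sector numbers the kernel reports for sdd and sda in the log above.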
cat /proc/mdstat
Personalities : [raid1] [raid6] [raid5] [raid4]
md3 : active raid1 sde2[0] sdf2[1]
      31462206 blocks super 1.2 [2/2] [UU]

md2 : active raid1 sdf1[0] sde1[2]
      31462175 blocks super 1.2 [2/2] [UU]

md1 : active raid5 sda5[0] sdd5[3] sdc5[2] sdb5[1]
      2929986048 blocks super 1.2 level 5, 512k chunk, algorithm 2 [4/4] [UUUU]

md0 : active raid1 sda1[0] sdd1[3] sdc1[2] sdb1[1]
      96244 blocks super 1.2 [4/4] [UUUU]
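In this output every array shows all members up (`[UU]`, `[UUUU]`). A minimal sketch for checking that automatically, assuming the mdstat layout shown here, where an underscore in the bracketed status field marks a missing or failed member; the function name `degraded_arrays` is made up for illustration:

```shell
#!/bin/sh
# Print the name of every md array whose status field (e.g. [UUU_]) contains
# an underscore. Reads /proc/mdstat-style text on stdin.
degraded_arrays() {
    awk '
        /^md[0-9]+ :/ { name = $1 }        # remember the array this block belongs to
        match($0, /\[[U_]+\]/) {           # status field such as [UUUU] or [UU_U]
            if (index(substr($0, RSTART, RLENGTH), "_")) print name
        }
    '
}

# Typical use on a live system:
#   degraded_arrays < /proc/mdstat
```

Fed the output above, the function prints nothing, because no status field contains an underscore.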
smartctl -A /dev/sda
smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   182   179   021    Pre-fail  Always       -       3883
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       26
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       4741
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       24
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       20
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       5
194 Temperature_Celsius     0x0022   113   111   000    Old_age   Always       -       34
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
smartctl -A /dev/sdb
smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   182   179   021    Pre-fail  Always       -       3883
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       26
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       4741
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       24
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       20
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       5
194 Temperature_Celsius     0x0022   111   108   000    Old_age   Always       -       36
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
smartctl -A /dev/sdc
smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   177   174   021    Pre-fail  Always       -       4116
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       25
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       4742
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       23
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       19
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       5
194 Temperature_Celsius     0x0022   110   107   000    Old_age   Always       -       37
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
smartctl -A /dev/sdd
smartctl 5.40 2010-07-12 r3124 [x86_64-unknown-linux-gnu] (local build)
Copyright (C) 2002-10 by Bruce Allen, http://smartmontools.sourceforge.net
=== START OF READ SMART DATA SECTION ===
SMART Attributes Data Structure revision number: 16
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME          FLAG     VALUE WORST THRESH TYPE      UPDATED  WHEN_FAILED RAW_VALUE
  1 Raw_Read_Error_Rate     0x002f   200   200   051    Pre-fail  Always       -       0
  3 Spin_Up_Time            0x0027   181   177   021    Pre-fail  Always       -       3950
  4 Start_Stop_Count        0x0032   100   100   000    Old_age   Always       -       26
  5 Reallocated_Sector_Ct   0x0033   200   200   140    Pre-fail  Always       -       0
  7 Seek_Error_Rate         0x002e   200   200   000    Old_age   Always       -       0
  9 Power_On_Hours          0x0032   094   094   000    Old_age   Always       -       4740
 10 Spin_Retry_Count        0x0032   100   253   000    Old_age   Always       -       0
 11 Calibration_Retry_Count 0x0032   100   253   000    Old_age   Always       -       0
 12 Power_Cycle_Count       0x0032   100   100   000    Old_age   Always       -       24
192 Power-Off_Retract_Count 0x0032   200   200   000    Old_age   Always       -       20
193 Load_Cycle_Count        0x0032   200   200   000    Old_age   Always       -       5
194 Temperature_Celsius     0x0022   112   109   000    Old_age   Always       -       35
196 Reallocated_Event_Count 0x0032   200   200   000    Old_age   Always       -       0
197 Current_Pending_Sector  0x0032   200   200   000    Old_age   Always       -       0
198 Offline_Uncorrectable   0x0030   200   200   000    Old_age   Offline      -       0
199 UDMA_CRC_Error_Count    0x0032   200   200   000    Old_age   Always       -       0
200 Multi_Zone_Error_Rate   0x0008   200   200   000    Old_age   Offline      -       0
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
No self-tests have been logged. [To run self-tests, use: smartctl -t]
SMART Selective self-test log data structure revision number 1
 SPAN  MIN_LBA  MAX_LBA  CURRENT_TEST_STATUS
    1        0        0  Not_testing
    2        0        0  Not_testing
    3        0        0  Not_testing
    4        0        0  Not_testing
    5        0        0  Not_testing
Selective self-test flags (0x0):
After scanning selected spans, do NOT read-scan remainder of disk.
If Selective self-test is pending on power-up, resume after 0 minute delay.
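In all four attribute dumps above, the raw values of Reallocated_Sector_Ct, Reallocated_Event_Count, Current_Pending_Sector and Offline_Uncorrectable are 0, even though the kernel logged UNC read errors on sda and sdd, and the self-test log is empty. A minimal sketch for pulling just those four attributes out of `smartctl -A` output so they are easy to compare across disks; the function name `sector_health` is made up, and the choice of attributes 5, 196, 197 and 198 is an assumption about which counters are worth watching here:

```shell
#!/bin/sh
# Print "ATTRIBUTE_NAME RAW_VALUE" for SMART attributes 5, 196, 197 and 198.
# Reads `smartctl -A` output on stdin. Rows with fewer than 10 fields are
# skipped so that other numbered tables (e.g. the selective self-test spans)
# are not mistaken for attribute rows.
sector_health() {
    awk 'NF >= 10 && ($1 == 5 || $1 == 196 || $1 == 197 || $1 == 198) { print $2, $NF }'
}

# Typical use (needs root and smartmontools):
#   smartctl -A /dev/sdd | sector_health
# A long self-test, as the smartctl hint above suggests, can be started with:
#   smartctl -t long /dev/sdd
# and its result read later with:
#   smartctl -l selftest /dev/sdd
```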
Apart from that, I am now scanning each disk:
badblocks -v /dev/sdd
Checking blocks 0 to 976762583
Checking for bad blocks (read-only test): 25.13% done, 51:06 elapsed
And if a disk were faulty, shouldn't the RAID have kicked it out of the array?
badblocks -v /dev/sdb
Checking blocks 0 to 976762583
Checking for bad blocks (read-only test): done
Pass completed, 0 bad blocks found.
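On the question of whether md should have dropped the disk: the kernel log earlier in the thread shows md1 reporting `read error corrected` for sdd5, and `/proc/mdstat` still lists all four members as up. That suggests md rebuilt the unreadable blocks from the other members and wrote them back instead of failing the device. A minimal sketch for tallying such messages per member, assuming the message format seen in this thread's log; the function name `corrected_per_member` is made up for illustration:

```shell
#!/bin/sh
# Count "read error corrected" messages per array member.
# Reads kernel log text (e.g. from dmesg) on stdin; prints "<device> <count>".
corrected_per_member() {
    awk '/read error corrected/ {
             dev = $NF                 # last field, e.g. "sdd5)"
             sub(/\)$/, "", dev)       # strip the trailing parenthesis
             count[dev]++
         }
         END { for (d in count) print d, count[d] }'
}

# Typical use:
#   dmesg | corrected_per_member
```

A count that keeps growing on the same member between resyncs would be a reason to look at that disk more closely, even while SMART still reports nothing.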
dd if=/dev/zero of=/data/home2/test.img bs=128k count=800000
I have one SATA disk. When I keep swap on it, the machine crashes; when it holds only data, i.e. I access ONLY sdbX with reiserfs, the computer stays up. I also see some DRDY messages and the like from time to time.
"Basically, I have to keep one spare card in reserve, or a whole second server with the same hardware."
That is what I was saying: HW RAID is relatively expensive, if only because to get the same recovery options as with SW RAID at all, you need spare hardware or a service contract, plus a good deal of trust in the contract partner.
2.6.32-5-amd64 #1 SMP Tue Jun 14 09:42:28 UTC 2011 x86_64 GNU/Linux
description: Motherboard
product: P55A-UD3
vendor: Gigabyte Technology Co., Ltd.
The problem is that the server has not crashed for about a week now, even though it was copying for a good 10 hours. The RAID has now been rebuilding for an hour and still nothing.
"SW RAID is useless; buy a controller, something from 3ware for example. I have them in two servers, they have been running for several years without problems, and I have peace of mind."
You will get replies to that from a whole bunch of people who have been using SW RAID for years. In itself it does not really have many problems, and it also has some advantages over HW RAID.
"But there I am talking about a data flow of, say, a gigabit from many parallel reads across 20 disks. If we are talking about 2-4 disks and a modest load, it really makes no difference, or rather, SW RAID is more flexible."
Exactly. And I see far more cases around me where investing in HW RAID is pointless and, on top of that, causes extra complications whenever there is any problem with the controller. I would guess the original poster is in the same situation.
ISSN 1214-1267, (c) 1999-2007 Stickfish s.r.o.