Fixing bad blocks
After getting this warning:
WARNING: Your hard drive is failing
Device: /dev/sdb [SAT], 1 Offline uncorrectable sectors
I ran:
$ sudo smartctl -a /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-514.26.2.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: KingDian S200 60GB
Serial Number: 2017022100551
LU WWN Device Id: 0 000000 000000000
Firmware Version: P0707F1
User Capacity: 60,022,480,896 bytes [60.0 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2 T13/2015-D revision 3
SATA Version is: SATA >3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Tue Oct 3 10:56:08 2017 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 120) seconds.
Offline data collection
capabilities: (0x11) SMART execute Offline immediate.
No Auto Offline data collection support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
No Selective Self-test supported.
SMART capabilities: (0x0002) Does not save SMART data before
entering power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 10) minutes.
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x0032 100 100 050 Old_age Always - 0
5 Reallocated_Sector_Ct 0x0032 100 100 050 Old_age Always - 3
9 Power_On_Hours 0x0032 100 100 050 Old_age Always - 4486
12 Power_Cycle_Count 0x0032 100 100 050 Old_age Always - 13
160 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 1
161 Unknown_Attribute 0x0033 100 100 050 Pre-fail Always - 98
163 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 0
164 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 9724
165 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 9
166 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 1
167 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 5
168 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 1500
169 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 100
175 Program_Fail_Count_Chip 0x0032 100 100 050 Old_age Always - 0
176 Erase_Fail_Count_Chip 0x0032 100 100 050 Old_age Always - 0
177 Wear_Leveling_Count 0x0032 100 100 050 Old_age Always - 9602
178 Used_Rsvd_Blk_Cnt_Chip 0x0032 100 100 050 Old_age Always - 3
181 Program_Fail_Cnt_Total 0x0032 100 100 050 Old_age Always - 0
182 Erase_Fail_Count_Total 0x0032 100 100 050 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 050 Old_age Always - 13
194 Temperature_Celsius 0x0022 100 100 050 Old_age Always - 28
195 Hardware_ECC_Recovered 0x0032 100 100 050 Old_age Always - 3994818
196 Reallocated_Event_Count 0x0032 100 100 050 Old_age Always - 2414
197 Current_Pending_Sector 0x0032 100 100 050 Old_age Always - 3
198 Offline_Uncorrectable 0x0032 100 100 050 Old_age Always - 1
199 UDMA_CRC_Error_Count 0x0032 100 100 050 Old_age Always - 0
232 Available_Reservd_Space 0x0032 100 100 050 Old_age Always - 98
241 Total_LBAs_Written 0x0030 100 100 050 Old_age Offline - 36124
242 Total_LBAs_Read 0x0030 100 100 050 Old_age Offline - 10259
245 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 9799
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 4486 -
Selective Self-tests/Logging not supported
The detailed smartctl output shows:
$ sudo smartctl -x /dev/sdb
smartctl 6.2 2017-02-27 r4394 [x86_64-linux-3.10.0-514.26.2.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: KingDian S200 60GB
Serial Number: 2017022100551
LU WWN Device Id: 0 000000 000000000
Firmware Version: P0707F1
User Capacity: 60,022,480,896 bytes [60.0 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2 T13/2015-D revision 3
SATA Version is: SATA >3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Tue Oct 3 15:49:27 2017 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM level is: 128 (minimum power consumption without standby)
Rd look-ahead is: Enabled
Write cache is: Enabled
ATA Security is: Disabled, frozen [SEC2]
Wt Cache Reorder: Unavailable
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 120) seconds.
Offline data collection
capabilities: (0x11) SMART execute Offline immediate.
No Auto Offline data collection support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
No Selective Self-test supported.
SMART capabilities: (0x0002) Does not save SMART data before
entering power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 10) minutes.
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate -O--CK 100 100 050 - 0
5 Reallocated_Sector_Ct -O--CK 100 100 050 - 3
9 Power_On_Hours -O--CK 100 100 050 - 4491
12 Power_Cycle_Count -O--CK 100 100 050 - 13
160 Unknown_Attribute -O--CK 100 100 050 - 1
161 Unknown_Attribute PO--CK 100 100 050 - 98
163 Unknown_Attribute -O--CK 100 100 050 - 0
164 Unknown_Attribute -O--CK 100 100 050 - 10068
165 Unknown_Attribute -O--CK 100 100 050 - 9
166 Unknown_Attribute -O--CK 100 100 050 - 1
167 Unknown_Attribute -O--CK 100 100 050 - 5
168 Unknown_Attribute -O--CK 100 100 050 - 1500
169 Unknown_Attribute -O--CK 100 100 050 - 100
175 Program_Fail_Count_Chip -O--CK 100 100 050 - 0
176 Erase_Fail_Count_Chip -O--CK 100 100 050 - 0
177 Wear_Leveling_Count -O--CK 100 100 050 - 9687
178 Used_Rsvd_Blk_Cnt_Chip -O--CK 100 100 050 - 3
181 Program_Fail_Cnt_Total -O--CK 100 100 050 - 0
182 Erase_Fail_Count_Total -O--CK 100 100 050 - 0
192 Power-Off_Retract_Count -O--CK 100 100 050 - 13
194 Temperature_Celsius -O---K 100 100 050 - 28
195 Hardware_ECC_Recovered -O--CK 100 100 050 - 4314392
196 Reallocated_Event_Count -O--CK 100 100 050 - 2667
197 Current_Pending_Sector -O--CK 100 100 050 - 3
198 Offline_Uncorrectable -O--CK 100 100 050 - 1
199 UDMA_CRC_Error_Count -O--CK 100 100 050 - 0
232 Available_Reservd_Space -O--CK 100 100 050 - 98
241 Total_LBAs_Written ----CK 100 100 050 - 36474
242 Total_LBAs_Read ----CK 100 100 050 - 10529
245 Unknown_Attribute -O--CK 100 100 050 - 10146
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
General Purpose Log Directory Version 1
SMART Log Directory Version 1 [multi-sector log support]
Address Access R/W Size Description
0x00 GPL,SL R/O 1 Log Directory
0x01 SL R/O 1 Summary SMART error log
0x02 SL R/O 1 Comprehensive SMART error log
0x03 GPL R/O 1 Ext. Comprehensive SMART error log
0x04 GPL,SL R/O 8 Device Statistics log
0x06 SL R/O 1 SMART self-test log
0x07 GPL R/O 1 Extended self-test log
0x10 GPL R/O 1 NCQ Command Error log
0x11 GPL R/O 1 SATA Phy Event Counters
0x30 GPL,SL R/O 9 IDENTIFY DEVICE data log
0x80-0x9f GPL,SL R/W 16 Host vendor specific log
0xde GPL VS 8 Device vendor specific log
SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
No Errors Logged
SMART Extended Self-test Log Version: 1 (1 sectors)
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 4488 -
# 2 Extended offline Completed without error 00% 4487 -
# 3 Extended offline Completed without error 00% 4486 -
Selective Self-tests/Logging not supported
SCT Commands not supported
Device Statistics (GP Log 0x04)
Page Offset Size Value Description
1 ===== = = == General Statistics (rev 1) ==
1 0x008 4 13 Lifetime Power-On Resets
1 0x010 4 4491 Power-on Hours
1 0x018 6 2390408669 Logical Sectors Written
1 0x020 6 69617191 Number of Write Commands
1 0x028 6 690041929 Logical Sectors Read
1 0x030 6 6959725 Number of Read Commands
7 ===== = = == Solid State Device Statistics (rev 1) ==
7 0x008 1 0 Percentage Used Endurance Indicator
SATA Phy Event Counters (GP Log 0x11)
ID Size Value Description
0x0001 4 0 Command failed due to ICRC error
0x0002 4 0 R_ERR response for data FIS
0x0005 4 1 R_ERR response for non-data FIS
0x000a 4 17 Device-to-host register FISes sent due to a COMRESET
Can anyone help me a) understand where the problem is and b) work out how to fix it?
hard-disk disk-usage disk ssd fdisk
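For readers skimming the dumps above: the warning corresponds to attribute 198 (Offline_Uncorrectable), and attributes 5 and 197 also count sector problems. The following is only a sketch of how those rows could be pulled out, assuming the ten-column attribute layout shown in the smartctl -a output. The smart_rows helper is hypothetical and replays a few rows copied from that output; on a live system it would be replaced by sudo smartctl -A /dev/sdb.

```shell
#!/bin/sh
# Hypothetical helper: replays a few rows copied from the attribute table above.
# On a live system, swap it for: sudo smartctl -A /dev/sdb
smart_rows() {
cat <<'EOF'
  5 Reallocated_Sector_Ct 0x0032 100 100 050 Old_age Always - 3
  9 Power_On_Hours 0x0032 100 100 050 Old_age Always - 4486
197 Current_Pending_Sector 0x0032 100 100 050 Old_age Always - 3
198 Offline_Uncorrectable 0x0032 100 100 050 Old_age Always - 1
199 UDMA_CRC_Error_Count 0x0032 100 100 050 Old_age Always - 0
EOF
}

# Keep only the sector-health attributes (IDs 5, 197, 198) whose raw value,
# the last column, is non-zero, and print them as NAME=RAW.
report=$(smart_rows | awk '($1 == 5 || $1 == 197 || $1 == 198) && $NF > 0 { print $2 "=" $NF }')
printf '%s\n' "$report"
```

Any row this prints is worth watching between runs. A raw value that keeps climbing is a stronger signal than the overall PASSED verdict.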
As no one has put it in huge flashing letters yet: IF YOU DON'T HAVE A BACKUP, MAKE ONE NOW.
– djsmiley2k, Oct 3 '17 at 15:17
You can force the drive to remap the bad sector by writing zeroes to it. This guide might help: BadBlockHowto. If you are feeling confident, you may skip the debugfs part and overwrite the block directly with dd if=/dev/zero of=/dev/sdb bs=512 seek=<decimal LBA_of_first_error> count=1 oflag=direct,sync
– Nazar554, Oct 3 '17 at 16:07
@Nazar554 Please refrain from posting answers in comments. For all we know, that might be a command that overwrites the entire drive with zeroes, and because comments cannot be downvoted, nobody could vote it down if it were.
– wizzwizz4, Oct 3 '17 at 21:47
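Given the concern raised in the last comment, the single-sector overwrite can be rehearsed on a scratch file before it goes anywhere near /dev/sdb. This is a minimal sketch, not a recovery procedure: the temporary image, its 16-sector size and the sector number 5 are all made up for illustration. It shows that dd's seek= option positions the write in the output, whereas skip= only discards input blocks.

```shell
#!/bin/sh
set -eu
export LC_ALL=C   # treat bytes as bytes so tr handles 0xFF portably

# Scratch image standing in for the disk: 16 sectors of 512 bytes, all 0xFF.
img=$(mktemp)
trap 'rm -f "$img"' EXIT
head -c 8192 /dev/zero | tr '\000' '\377' > "$img"

lba=5   # hypothetical sector; on a real drive it would come from the self-test log

# seek= moves the OUTPUT position to sector $lba before writing one block.
# conv=notrunc stops dd from truncating the regular file after that block.
dd if=/dev/zero of="$img" bs=512 seek="$lba" count=1 conv=notrunc 2>/dev/null

# Count zero bytes in the target sector and in the whole image.
zeros_in_sector=$(dd if="$img" bs=512 skip="$lba" count=1 2>/dev/null | tr -d '\377' | wc -c | tr -d ' ')
zeros_total=$(tr -d '\377' < "$img" | wc -c | tr -d ' ')
size=$(wc -c < "$img" | tr -d ' ')
echo "zero bytes in sector $lba: $zeros_in_sector; in whole image: $zeros_total; image size: $size"
```

If the two zero-byte counts match and the image size is unchanged, only the intended sector was touched. On a real block device conv=notrunc is unnecessary, and the oflag=direct,sync flags from the comment bypass the page cache so the write reaches the drive immediately.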
add a comment |Â
up vote
8
down vote
favorite
up vote
8
down vote
favorite
After getting
WARNING: Your hard drive is failing
Device: /dev/sdb [SAT], 1 Offline uncorrectable sectors
I run
$ sudo smartctl -a /dev/sdb
smartctl 6.2 2013-07-26 r3841 [x86_64-linux-3.10.0-514.26.2.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: KingDian S200 60GB
Serial Number: 2017022100551
LU WWN Device Id: 0 000000 000000000
Firmware Version: P0707F1
User Capacity: 60,022,480,896 bytes [60.0 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2 T13/2015-D revision 3
SATA Version is: SATA >3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Tue Oct 3 10:56:08 2017 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 120) seconds.
Offline data collection
capabilities: (0x11) SMART execute Offline immediate.
No Auto Offline data collection support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
No Selective Self-test supported.
SMART capabilities: (0x0002) Does not save SMART data before
entering power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 10) minutes.
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAG VALUE WORST THRESH TYPE UPDATED WHEN_FAILED RAW_VALUE
1 Raw_Read_Error_Rate 0x0032 100 100 050 Old_age Always - 0
5 Reallocated_Sector_Ct 0x0032 100 100 050 Old_age Always - 3
9 Power_On_Hours 0x0032 100 100 050 Old_age Always - 4486
12 Power_Cycle_Count 0x0032 100 100 050 Old_age Always - 13
160 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 1
161 Unknown_Attribute 0x0033 100 100 050 Pre-fail Always - 98
163 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 0
164 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 9724
165 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 9
166 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 1
167 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 5
168 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 1500
169 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 100
175 Program_Fail_Count_Chip 0x0032 100 100 050 Old_age Always - 0
176 Erase_Fail_Count_Chip 0x0032 100 100 050 Old_age Always - 0
177 Wear_Leveling_Count 0x0032 100 100 050 Old_age Always - 9602
178 Used_Rsvd_Blk_Cnt_Chip 0x0032 100 100 050 Old_age Always - 3
181 Program_Fail_Cnt_Total 0x0032 100 100 050 Old_age Always - 0
182 Erase_Fail_Count_Total 0x0032 100 100 050 Old_age Always - 0
192 Power-Off_Retract_Count 0x0032 100 100 050 Old_age Always - 13
194 Temperature_Celsius 0x0022 100 100 050 Old_age Always - 28
195 Hardware_ECC_Recovered 0x0032 100 100 050 Old_age Always - 3994818
196 Reallocated_Event_Count 0x0032 100 100 050 Old_age Always - 2414
197 Current_Pending_Sector 0x0032 100 100 050 Old_age Always - 3
198 Offline_Uncorrectable 0x0032 100 100 050 Old_age Always - 1
199 UDMA_CRC_Error_Count 0x0032 100 100 050 Old_age Always - 0
232 Available_Reservd_Space 0x0032 100 100 050 Old_age Always - 98
241 Total_LBAs_Written 0x0030 100 100 050 Old_age Offline - 36124
242 Total_LBAs_Read 0x0030 100 100 050 Old_age Offline - 10259
245 Unknown_Attribute 0x0032 100 100 050 Old_age Always - 9799
SMART Error Log Version: 1
No Errors Logged
SMART Self-test log structure revision number 1
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 4486 -
Selective Self-tests/Logging not supported
The detailed smartctl
output shows:
$ sudo smartctl -x /dev/sdb
smartctl 6.2 2017-02-27 r4394 [x86_64-linux-3.10.0-514.26.2.el7.x86_64] (local build)
Copyright (C) 2002-13, Bruce Allen, Christian Franke, www.smartmontools.org
=== START OF INFORMATION SECTION ===
Device Model: KingDian S200 60GB
Serial Number: 2017022100551
LU WWN Device Id: 0 000000 000000000
Firmware Version: P0707F1
User Capacity: 60,022,480,896 bytes [60.0 GB]
Sector Size: 512 bytes logical/physical
Rotation Rate: Solid State Device
Device is: Not in smartctl database [for details use: -P showall]
ATA Version is: ACS-2 T13/2015-D revision 3
SATA Version is: SATA >3.1, 6.0 Gb/s (current: 3.0 Gb/s)
Local Time is: Tue Oct 3 15:49:27 2017 BST
SMART support is: Available - device has SMART capability.
SMART support is: Enabled
AAM feature is: Unavailable
APM level is: 128 (minimum power consumption without standby)
Rd look-ahead is: Enabled
Write cache is: Enabled
ATA Security is: Disabled, frozen [SEC2]
Wt Cache Reorder: Unavailable
=== START OF READ SMART DATA SECTION ===
SMART overall-health self-assessment test result: PASSED
General SMART Values:
Offline data collection status: (0x00) Offline data collection activity
was never started.
Auto Offline Data Collection: Disabled.
Self-test execution status: ( 0) The previous self-test routine completed
without error or no self-test has ever
been run.
Total time to complete Offline
data collection: ( 120) seconds.
Offline data collection
capabilities: (0x11) SMART execute Offline immediate.
No Auto Offline data collection support.
Suspend Offline collection upon new
command.
No Offline surface scan supported.
Self-test supported.
No Conveyance Self-test supported.
No Selective Self-test supported.
SMART capabilities: (0x0002) Does not save SMART data before
entering power-saving mode.
Supports SMART auto save timer.
Error logging capability: (0x01) Error logging supported.
General Purpose Logging supported.
Short self-test routine
recommended polling time: ( 2) minutes.
Extended self-test routine
recommended polling time: ( 10) minutes.
SMART Attributes Data Structure revision number: 1
Vendor Specific SMART Attributes with Thresholds:
ID# ATTRIBUTE_NAME FLAGS VALUE WORST THRESH FAIL RAW_VALUE
1 Raw_Read_Error_Rate -O--CK 100 100 050 - 0
5 Reallocated_Sector_Ct -O--CK 100 100 050 - 3
9 Power_On_Hours -O--CK 100 100 050 - 4491
12 Power_Cycle_Count -O--CK 100 100 050 - 13
160 Unknown_Attribute -O--CK 100 100 050 - 1
161 Unknown_Attribute PO--CK 100 100 050 - 98
163 Unknown_Attribute -O--CK 100 100 050 - 0
164 Unknown_Attribute -O--CK 100 100 050 - 10068
165 Unknown_Attribute -O--CK 100 100 050 - 9
166 Unknown_Attribute -O--CK 100 100 050 - 1
167 Unknown_Attribute -O--CK 100 100 050 - 5
168 Unknown_Attribute -O--CK 100 100 050 - 1500
169 Unknown_Attribute -O--CK 100 100 050 - 100
175 Program_Fail_Count_Chip -O--CK 100 100 050 - 0
176 Erase_Fail_Count_Chip -O--CK 100 100 050 - 0
177 Wear_Leveling_Count -O--CK 100 100 050 - 9687
178 Used_Rsvd_Blk_Cnt_Chip -O--CK 100 100 050 - 3
181 Program_Fail_Cnt_Total -O--CK 100 100 050 - 0
182 Erase_Fail_Count_Total -O--CK 100 100 050 - 0
192 Power-Off_Retract_Count -O--CK 100 100 050 - 13
194 Temperature_Celsius -O---K 100 100 050 - 28
195 Hardware_ECC_Recovered -O--CK 100 100 050 - 4314392
196 Reallocated_Event_Count -O--CK 100 100 050 - 2667
197 Current_Pending_Sector -O--CK 100 100 050 - 3
198 Offline_Uncorrectable -O--CK 100 100 050 - 1
199 UDMA_CRC_Error_Count -O--CK 100 100 050 - 0
232 Available_Reservd_Space -O--CK 100 100 050 - 98
241 Total_LBAs_Written ----CK 100 100 050 - 36474
242 Total_LBAs_Read ----CK 100 100 050 - 10529
245 Unknown_Attribute -O--CK 100 100 050 - 10146
||||||_ K auto-keep
|||||__ C event count
||||___ R error rate
|||____ S speed/performance
||_____ O updated online
|______ P prefailure warning
General Purpose Log Directory Version 1
SMART Log Directory Version 1 [multi-sector log support]
Address Access R/W Size Description
0x00 GPL,SL R/O 1 Log Directory
0x01 SL R/O 1 Summary SMART error log
0x02 SL R/O 1 Comprehensive SMART error log
0x03 GPL R/O 1 Ext. Comprehensive SMART error log
0x04 GPL,SL R/O 8 Device Statistics log
0x06 SL R/O 1 SMART self-test log
0x07 GPL R/O 1 Extended self-test log
0x10 GPL R/O 1 NCQ Command Error log
0x11 GPL R/O 1 SATA Phy Event Counters
0x30 GPL,SL R/O 9 IDENTIFY DEVICE data log
0x80-0x9f GPL,SL R/W 16 Host vendor specific log
0xde GPL VS 8 Device vendor specific log
SMART Extended Comprehensive Error Log Version: 1 (1 sectors)
No Errors Logged
SMART Extended Self-test Log Version: 1 (1 sectors)
Num Test_Description Status Remaining LifeTime(hours) LBA_of_first_error
# 1 Extended offline Completed without error 00% 4488 -
# 2 Extended offline Completed without error 00% 4487 -
# 3 Extended offline Completed without error 00% 4486 -
Selective Self-tests/Logging not supported
SCT Commands not supported
Device Statistics (GP Log 0x04)
Page Offset Size Value Description
1 ===== = = == General Statistics (rev 1) ==
1 0x008 4 13 Lifetime Power-On Resets
1 0x010 4 4491 Power-on Hours
1 0x018 6 2390408669 Logical Sectors Written
1 0x020 6 69617191 Number of Write Commands
1 0x028 6 690041929 Logical Sectors Read
1 0x030 6 6959725 Number of Read Commands
7 ===== = = == Solid State Device Statistics (rev 1) ==
7 0x008 1 0 Percentage Used Endurance Indicator
SATA Phy Event Counters (GP Log 0x11)
ID Size Value Description
0x0001 4 0 Command failed due to ICRC error
0x0002 4 0 R_ERR response for data FIS
0x0005 4 1 R_ERR response for non-data FIS
0x000a 4 17 Device-to-host register FISes sent due to a COMRESET
Can anyone help me to a) understand where the problem is b) how to fix it?
hard-disk disk-usage disk ssd fdisk
edited Oct 3 '17 at 15:44
Stephen Kitt
asked Oct 3 '17 at 10:18
Manolete
3
As no one's put it in huge flashing letters yet: IF YOU DON'T HAVE A BACKUP, MAKE ONE NOW.
– djsmiley2k
Oct 3 '17 at 15:17
1
You can force the drive to remap the bad sector by writing zeroes to it. This guide might help: BadBlockHowto. If you are feeling confident, you may skip the debugfs part and overwrite the block directly with dd if=/dev/zero of=/dev/sdb bs=512 seek=<decimal LBA_of_first_error> count=1 oflag=direct,sync
– Nazar554
Oct 3 '17 at 16:07
@Nazar554 Please refrain from posting answers in comments. For all we know, that might be a command that overwrites the entire drive with zeroes, and since comments can't be downvoted, it wouldn't be downvoted if it were.
– wizzwizz4
Oct 3 '17 at 21:47
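Since the comments above worry about what such a dd command really touches, here is a minimal sketch that rehearses the single-sector overwrite on a scratch image file instead of /dev/sdb. The function names, the image file and the sector number are hypothetical and only serve the illustration. The point it shows is that dd's seek= sets the offset in the output, while skip= only sets the offset in the input.

```shell
# Sketch only: operates on a scratch image file, never on a real device.
export LC_ALL=C   # make tr treat the data as raw bytes

# make_image FILE SECTORS: create FILE filled with 0xff bytes,
# so that a zeroed sector stands out afterwards.
make_image() {
    dd if=/dev/zero bs=512 count="$2" 2>/dev/null | tr '\000' '\377' > "$1"
}

# zero_sector FILE LBA: overwrite exactly one 512-byte sector with zeroes.
# seek= moves the write position in the OUTPUT; conv=notrunc keeps the
# rest of a regular file in place.
zero_sector() {
    dd if=/dev/zero of="$1" bs=512 seek="$2" count=1 conv=notrunc 2>/dev/null
}

# zero_bytes_in_sector FILE LBA: print how many zero bytes that sector holds.
# Here skip= is the right option, because the offset applies to the INPUT.
zero_bytes_in_sector() {
    dd if="$1" bs=512 skip="$2" count=1 2>/dev/null | tr -d '\377' | wc -c | tr -d ' '
}
```

For example, after `make_image /tmp/scratch.img 8` and `zero_sector /tmp/scratch.img 5`, only sector 5 of the image should contain zeroes and the file size should be unchanged. On a real drive the same write destroys whatever the sector held, so it is worth rehearsing the offsets on a file first.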
2 Answers
up vote
10
down vote
I have had this issue in the past. IIRC, "Offline uncorrectable sectors" means that the disk controller (the one inside the disk, not the SATA/SCSI controller in your PC) has repeatedly failed to read one sector and has decided that it is definitely not usable.
So, must I declare that sector as bad to the filesystem that uses it?
No. Fortunately, today's disks automatically replace bad sectors with good ones taken from a pool of spare sectors, so you don't have to declare those bad sectors to your filesystem to keep it from using them. Of course, the size of that pool is limited (Available_Reservd_Space sectors, I guess), and once all spare sectors have been used, bad sectors will remain unusable and you will have to declare them as such to your FS.
So, everything is fine, and this is a harmless message?
Not really. Your drive has tried several times to read the bad sector and failed every time, so the sector has been queued for replacement, but the drive can't do that on its own (it keeps hoping it will eventually be able to read it). Until the sector is overwritten with new data, it will remain "uncorrectable"; once it is overwritten, or if the drive somehow manages to read it, it will be remapped and replaced with a spare sector (in the smartctl output, Offline_Uncorrectable will be decremented by 1 and Reallocated_Sector_Ct will be incremented by 1).
What can I do?
In such an event, I usually force my RAID 1 array to resync (good disk -> faulty disk) so that the new sector gets the right content. In any case, run fsck and, if you have a backup of that partition (you should), compare the backup to your actual content.
edited Oct 3 '17 at 17:11
answered Oct 3 '17 at 11:08
xhienne
One important thing to keep in mind is that the appearance of an "offline uncorrectable" is a warning sign that the drive may fail completely in the near future.
– Mark
Oct 3 '17 at 21:01
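To watch the counters named in the answer above (Offline_Uncorrectable going down, Reallocated_Sector_Ct going up), the attribute table can be filtered with a few lines of shell. This is a minimal sketch: the function names are made up for illustration, and it assumes the raw value is the last column of each attribute line, as it is in the output posted in the question. Other drives may print extra text after the raw value.

```shell
# smart_raw NAME: read smartctl attribute-table text on stdin and print
# the RAW_VALUE (assumed to be the last column) of the attribute NAME.
smart_raw() {
    awk -v name="$1" '$2 == name { print $NF }'
}

# bad_sector_summary: read the same text on stdin and print the three
# counters that change when a bad sector is remapped.
bad_sector_summary() {
    report=$(cat)
    for attr in Reallocated_Sector_Ct Current_Pending_Sector Offline_Uncorrectable; do
        printf '%s=%s\n' "$attr" "$(printf '%s\n' "$report" | smart_raw "$attr")"
    done
}
```

A possible use is `sudo smartctl -A /dev/sdb | bad_sector_summary`, run once before and once after the sector has been rewritten, so the two snapshots can be compared.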
up vote
4
down vote
Run a long smartctl test. If it reports errors, you have a problem; if not, the hard drive should be fine to use.
smartctl -t long /dev/sdb
Note: Remember to back up the data on your hard drive before testing it; depending on the condition of your drive, the stress from the test could damage your disk further.
If the smartctl test reports errors, use Diskscan to try to fix the uncorrectable sectors.
edited Oct 4 '17 at 15:11
answered Oct 3 '17 at 10:23
Hunter.S.Thompson
2
But first back up your valuable data. Testing puts additional stress on your disk, and if it's in bad condition, the problem may get much worse quickly.
– Jaroslav Kucera
Oct 3 '17 at 10:49
@JaroslavKucera I don't seem to get errors, only the warning messages.
– Manolete
Oct 3 '17 at 14:52
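Whether a long test "comes back positive for errors" can be read from the self-test log once the test has finished. The sketch below counts log entries whose status is anything other than "Completed without error". The function name is hypothetical, and it assumes log lines begin with "#" followed by the entry number, as in the output posted in the question.

```shell
# selftest_failure_count: read smartctl self-test log text on stdin and
# print how many entries did not finish with "Completed without error".
selftest_failure_count() {
    awk '/^# *[0-9]+/ && !/Completed without error/ { n++ } END { print n + 0 }'
}
```

After starting the test with `sudo smartctl -t long /dev/sdb` and waiting for the recommended polling time (10 minutes according to the drive's own report), something like `sudo smartctl -l selftest /dev/sdb | selftest_failure_count` should print 0 for a clean log. As this thread shows, a clean self-test log can coexist with a non-zero Offline_Uncorrectable count, so it is not proof that the drive is healthy.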