Understanding smartctl and hard-drive errors
I have a raidz2 ZFS pool. Two of its disks started returning I/O errors, and ZFS then marked them as faulted (dmesg log linked).
I removed the disks and ran some tests on them. smartctl reports:
DISK 1 (full log linked): SMART Health Status: DATA CHANNEL IMPENDING FAILURE DATA ERROR RATE TOO HIGH [asc=5d, ascq=32]
DISK 2 (full log linked): SMART Health Status: HARDWARE IMPENDING FAILURE GENERAL HARD DRIVE FAILURE [asc=5d, ascq=10]
I created a new pool on DISK 1 and started a fio test, but I did not see any I/O errors on the disk, nothing like the earlier ones. The disk is working normally. I also created a pool with 4 disks, and disk utilization was normal there too.
I ran this test for 4 days without encountering an error. The disk currently behaves like the others.
fio --randrepeat=0 --ioengine=libaio --name=test --filename=/disktest/fiofile \
--bs=1024k --iodepth=64 --size=5T --readwrite=readwrite --rwmixread=60 --numjobs=20
I have a few questions:
1- Why does the disk not report errors anymore?
2- If the disk is working normally, why did it cause I/O errors in the first pool?
3- What is the best way to tell whether a hard drive is faulty or not?
4- How can we reset the hard-drive error counters?
5- Is the disk garbage or not?
The disks are attached as follows: Controller -> LSI3008HBA -> 2x SAS-cable -> "SC946ED-R2KJBOD" 2xExpander -> Multipath SAS disks.
linux hard-disk zfs smartctl
edited Aug 14 at 9:25
asked Aug 13 at 12:29
Morphinz
The first thing I'd do is look at the SMART attributes, but -d scsi prevents them from being shown. You didn't say how your disks are attached, so if possible, try again without -d scsi. VALUE is normalized to 100; lower is worse.
– dirkt
Aug 13 at 18:39
@dirkt Controller -> LSI3008HBA -> 2x SAS-cable -> "SC946ED-R2KJBOD" 2xExpander -> Multipath SAS disks. I use -d scsi because it does not work any other way, or at least I could not get it to. :) What is your advice?
– Morphinz
Aug 14 at 9:19
For LSI controllers, try -d megaraid,N with a suitable N; see here.
– dirkt
Aug 14 at 10:25
@dirkt It's an HBA, not a RAID card. Your method works on RAID cards.
– Morphinz
Sep 5 at 12:13
Even if it has "raid" in the name, it might also work on other LSI controllers, so it's worth a try. More specifically, it will work on any hardware that supports this particular access method. If your card doesn't support it, then it doesn't; in that case there will probably be no way to get at this information unless you dig up a datasheet that describes how to send SMART commands for your controller.
– dirkt
Sep 5 at 14:08
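For what it's worth, the health line quoted in the question can be picked apart with a few lines of shell. This is only a minimal sketch: it assumes the smartctl output contains a "SMART Health Status:" line shaped like the two in the question (or a plain "OK"), and the helper names parse_health and health_is_ok are made up for illustration, not part of smartmontools.

```shell
#!/bin/sh
# Sketch: extract the status text and the asc/ascq codes from a
# "SMART Health Status:" line like the ones quoted in the question.
# parse_health and health_is_ok are hypothetical helper names.

parse_health() {
    # Reads smartctl output on stdin and prints "status|asc|ascq".
    # asc and ascq are left empty when the line carries no [asc=.., ascq=..].
    sed -n \
        -e 's/^.*SMART Health Status: *\(.*[^ ]\) *\[asc=\([0-9a-fA-F]*\), *ascq=\([0-9a-fA-F]*\)].*$/\1|\2|\3/p' \
        -e t \
        -e 's/^.*SMART Health Status: *\(.*[^ ]\) *$/\1||/p'
}

health_is_ok() {
    # Exit status 0 only when the reported status text is exactly "OK".
    [ "$(parse_health | cut -d'|' -f1)" = "OK" ]
}
```

Something like `smartctl -H -d scsi /dev/sdX | parse_health` would feed it, where /dev/sdX is a placeholder for one path to the disk.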
1 Answer
- Some faults can come and go. Nothing guarantees that you will be warned before a disk dies, but if SMART starts reporting impending-failure errors, it's better not to risk it and just replace the drive.
- Errors can come and go because the disk sometimes keeps retrying a problem region until it succeeds (at which point it will generally try to avoid using that region again if it can).
- You could run a long SMART self-test and/or read/write every LBA in use (ZFS has a scrub process that can be initiated; it reads and verifies all data in the pool, whereas resilvering is the related rebuild that runs after a device is replaced). Watch out though: these might make the disk fail for good...
- You can't.
- Hard to say, but let's put it another way: is the money saved by not replacing it unnecessarily worth the risk of having it suddenly fail?
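The self-test and scrub suggestion could look roughly like the sketch below. It is an illustration, not a tested procedure for this hardware: /dev/sdX and the pool name tank are placeholders, the function names are made up, and DRY_RUN defaults to 1 so the commands are only printed until you choose otherwise.

```shell
#!/bin/sh
# Sketch: start a long SMART self-test and a ZFS scrub, then look at results.
# DRY_RUN defaults to 1, so commands are printed instead of executed.
DRY_RUN="${DRY_RUN:-1}"

run() {
    # Print the command in dry-run mode, otherwise execute it.
    if [ "$DRY_RUN" = "1" ]; then
        echo "would run: $*"
    else
        "$@"
    fi
}

start_checks() {
    disk="$1"   # placeholder, e.g. /dev/sdX
    pool="$2"   # placeholder pool name, e.g. tank
    run smartctl -t long "$disk"   # start the long (extended) self-test
    run zpool scrub "$pool"        # read and verify all data in the pool
}

show_results() {
    disk="$1"
    pool="$2"
    run smartctl -l selftest "$disk"   # self-test log
    run smartctl -l error "$disk"      # error log / error counters
    run zpool status -v "$pool"        # scrub progress and per-device errors
}
```

Call `start_checks /dev/sdX tank`, wait for both to finish (a long self-test on a large disk can take many hours), then call `show_results /dev/sdX tank`. Set DRY_RUN=0 to actually execute the commands.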
answered Aug 14 at 6:46
Anon
@Morphinz did this answer your questions?
– Anon
Sep 6 at 3:06
Thank you for your answer, but it tells me that I won't have any control over my disks. My problem is getting information about my disks and being able to control them. Because of one faulted disk I'm facing a suspend issue. When a disk starts causing trouble I need to identify it and stop it, or the kernel, multipathd, or ZFS should do this for me. I don't want to reboot my server because of one disk. For example, I can watch dmesg, and when I see "attempting task abort, I/O errors" I can set the disk offline. Or I can flush the multipath map and drop the disk via "/sys/block". That way the disk will no longer be a problem.
– Morphinz
Sep 11 at 8:39
Even if I try "zpool import mypool", the faulted disk causes an HBA reset while ZFS is searching the disks. That makes "zpool import" hang for 1-2 minutes, and afterwards its output reports every disk as "Unavail, or faulted". This issue is really important, and I can't understand why nobody cares about it. Maybe my Broadcom 3008 HBA (LSI) cards are causing this, and that is why other users do not have the problem. I really, really need to stop this HBA reset issue. If I could disable that code in the kernel I would... Or if I need to replace these HBA cards I will...
– Morphinz
Sep 11 at 8:45
You might be better off asking a new ZFS-specific question, as your comments show you need a different answer from those to your original 5 questions...
– Anon
Sep 12 at 18:16
(FYI: grox.net/sysadm/unix/linux_disk_hotplug_helpful_commands talks about using echo offline > /sys/block/<blockdev>/device/state to offline a disk.)
– Anon
Sep 12 at 18:19
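The sysfs write mentioned in the last comment can be wrapped in a small function. This is a minimal sketch under a few assumptions: the function name offline_disk and the SYSFS_BLOCK override are made up for illustration, sdq is a placeholder device name, and on a real system the write needs root.

```shell
#!/bin/sh
# Sketch: take a SCSI disk offline by writing to its sysfs "state" file.
# SYSFS_BLOCK can be overridden to try the function on a scratch directory;
# on a real Linux system it is /sys/block.
SYSFS_BLOCK="${SYSFS_BLOCK:-/sys/block}"

offline_disk() {
    dev="$1"   # kernel block device name, e.g. sdq (placeholder)
    state="$SYSFS_BLOCK/$dev/device/state"
    if [ ! -w "$state" ]; then
        echo "cannot write $state" >&2
        return 1
    fi
    echo offline > "$state"
    cat "$state"   # print the state file back for confirmation
}
```

Usage would be `offline_disk sdq`. With multipath, each path to the disk shows up as its own sdX device, so every path would need the same treatment.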