Why isn't a disconnected and then reconnected mdadm RAID1 disk automatically re-attached and re-synced?
I'm doing the following test with Debian 9 under Hyper-V: I have two (virtual) disks, each with one partition, and `/dev/md0` defined as a RAID1 of `/dev/sda1` and `/dev/sdb1`. The root EXT4 file system on `md0` works well.

Then I reboot the machine after removing the `sdb` disk by deleting the virtual hardware. Everything works fine, and I get an email saying that the array is now clean but degraded:

```
md0 : active raid1 sda1[0]
      4877312 blocks super 1.2 [2/1] [U_]
```
OK, as expected. I re-attach `sdb` by creating a new virtual disk that uses the same disk file, and I reboot the machine, but the system doesn't seem to re-detect the disk. Nothing has changed: the array still has one drive and is in the clean, degraded state, and `mdadm --detail /dev/md0` still reports the disk as removed.

I expected the disk to be re-detected, re-attached, and re-synced automatically on the next boot, since the UUID and the disk name match.

When I re-added the disk manually with `mdadm --manage /dev/md0 --add /dev/sdb1`, the system synced it and the array went back to clean.
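For reference, a minimal sketch of the manual recovery, assuming the device names above (the write-intent-bitmap part at the end is an optional optimization I did not have configured, not part of the original setup):

```
# Re-add the returning member and watch the resync
sudo mdadm --manage /dev/md0 --add /dev/sdb1
cat /proc/mdstat                 # shows recovery progress
sudo mdadm --detail /dev/md0     # State: clean, degraded, recovering -> clean

# With a write-intent bitmap, a briefly absent disk can be re-added
# with --re-add so that only blocks written while it was gone are synced:
sudo mdadm --grow --bitmap=internal /dev/md0
sudo mdadm --manage /dev/md0 --re-add /dev/sdb1
```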
Is this the way the system is supposed to work?
PS: I also see a significant number of `mdadm: Found some drive for an array that is already active` and `mdadm: giving up.` messages at boot.
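(A hedged guess at one common cause of such repeated assembly attempts on Debian, not a diagnosis: stale or duplicate `ARRAY` lines baked into `mdadm.conf` and the initramfs. A quick way to check:)

```
# Compare the configured arrays against what the kernel actually sees
grep ^ARRAY /etc/mdadm/mdadm.conf
sudo mdadm --detail --scan       # canonical ARRAY lines for the running arrays
# After correcting /etc/mdadm/mdadm.conf, refresh the copy in the initramfs:
sudo update-initramfs -u
```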
debian mdadm
asked Feb 5 at 17:59 by ragazzojp (edited Feb 6 at 8:05)
1 Answer
If a RAID disk just vanishes suddenly (as far as the OS is concerned), causing the RAID array to become degraded, and then comes back, is it because the system administrator intentionally pulled the disk and then reinserted it? Or is it because of an intermittent connection somewhere, such as a bad cable or a loose connector?

If the system had a way of knowing that the removal and restoration were intentional, it could pick the disk back up automatically. But a software RAID has no such knowledge, so it assumes the worst and acts as if the disk or its connection has become unreliable, until told otherwise.
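There is an opt-in way to give mdadm that knowledge ahead of time: a `POLICY` line in `mdadm.conf`, which udev's hotplug handling (`mdadm --incremental`) consults when a disk appears. A sketch only; the keywords come from `mdadm.conf(5)`, and the domain/path values here are hypothetical placeholders, so verify them against your own system before relying on this:

```
# /etc/mdadm/mdadm.conf -- hypothetical sketch, see mdadm.conf(5)
# Declare that disks appearing on any path may be automatically
# re-added to an array they were previously a member of:
POLICY domain=default path=* action=re-add
```

With such a policy in place, udev-triggered incremental assembly may re-add a returning member on its own; without it, mdadm keeps the conservative default described above.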
A hardware RAID controller with hot-pluggable disks may have extra circuitry to detect when a disk in a particular slot has been physically removed and replaced, and may work with an extra assumption that all the disks in slots associated with that RAID controller are always supposed to be RAID disks.
So when a disk vanishes and a hot-plug monitor circuit indicates the disk was physically removed, the controller can check any disk subsequently plugged into that same slot for its own type of RAID metadata. If there is no metadata present, it's presumably a new disk fresh from the factory, and can be freely overwritten as soon as the hot-plug monitor circuit indicates the disk is fully slotted in. Likewise, if the metadata indicates it's the same disk being re-inserted, the RAID set can be automatically recovered.
If the metadata is present and indicates that the disk used to belong to a different RAID set, the system administrator might be setting up to rescue data from a different server that uses the same type of RAID controller, and so it will be best to wait for further instructions: the administrator will decide whether to overwrite the disk or to import its RAID set as another RAID volume with its existing data.
But if a disk vanishes while the hot-plug monitor circuit indicates it's still physically present, the hardware RAID controller will have a very good case for declaring it faulty, even if it later re-appears on its own.
(An important side lesson: If you are moving hardware RAID disks from a failed server to another similar server to salvage the failed server's data, make sure the recipient server does not have any of its own RAID sets in a degraded state before plugging in the disk containing the only copy of some critical data.)
answered Feb 6 at 11:41 by telcoM

Ok, so you're saying that this is the intended behavior, and even the weekly cron job won't reattach it. – ragazzojp, Feb 6 at 12:46

Yes. It's the difference between having the right one bit of information, or not having it. – telcoM, Feb 6 at 15:02