Linux VM's disk went read-only = no choice but reboot?
I have several Linux VMs on VMware + SAN.
What happened
A problem occurred on the SAN (a failed path), so for some time there were I/O errors on the Linux VMs' drives. By the time the path failover completed, it was too late: every Linux machine had decided that most of its drives were no longer "trustworthy" and had set them read-only. The drives holding the root filesystem were affected too.
What I tried
- mount -o rw,remount /, without success
- echo running > /sys/block/sda/device/state, without success
- digging into /sys to find a solution, without success
What I may not have tried
blockdev --setrw /dev/sda
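For reference, here is the whole recovery attempt collected into one hedged sketch. Device and mount-point names are examples, and none of these commands is guaranteed to clear the read-only state once the kernel has given up on the device:

    # See what the kernel thinks happened
    dmesg | tail -n 50

    # Is the block device itself flagged read-only? (1 = yes)
    blockdev --getro /dev/sda
    cat /sys/block/sda/ro

    # Is the SCSI device still in the "running" state?
    cat /sys/block/sda/device/state

    # Try to bring the device back and clear the read-only flag
    echo running > /sys/block/sda/device/state
    blockdev --setrw /dev/sda

    # Finally, try to remount the filesystem read-write
    mount -o remount,rw /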
Finally...
I had to reboot all my Linux VMs. The Windows VMs were fine...
Some more info from VMware...
The problem is described here. VMware suggests increasing the Linux SCSI timeout to prevent this problem from happening.
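For completeness, the timeout VMware refers to is the per-device SCSI timeout exposed in sysfs. A hedged sketch of checking and raising it (the 180-second value and device names are examples, and the exact udev rule syntax varies between distributions and udev versions):

    # Current timeout, in seconds, for every SCSI disk
    for t in /sys/block/sd*/device/timeout; do echo "$t: $(cat "$t")"; done

    # Raise it at runtime for one disk
    echo 180 > /sys/block/sda/device/timeout

    # Rough shape of a udev rule to make it persistent (sketch only)
    # ACTION=="add", SUBSYSTEMS=="scsi", ATTRS{vendor}=="VMware*", \
    #   RUN+="/bin/sh -c 'echo 180 > /sys$DEVPATH/device/timeout'"

VMware Tools / open-vm-tools typically ships a similar rule, so it may already be in place.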
The question!
However, when the problem does eventually happen, is there a way to get the drives back to read-write mode? (once the SAN is back to normal)
linux vmware block-device reboot readonly
I have been able to get a disk back to read-write with mount -o remount /mountpoint on a real system. Perhaps that would work inside a VM too.
– Michael Suelmann
Nov 2 '13 at 17:08
Thanks, but I already tried that, and it didn't work... I've edited my question accordingly.
– Totor
Nov 2 '13 at 17:35
4 Answers
We have had this problem here a couple of times, usually because the network went down for an extended period. The problem is not that the filesystem is read-only but that the disk device itself is marked read-only. No option here other than a reboot. Increasing the SCSI timeout will cover transient glitches such as a path failover, but it won't help with a 15-minute network outage.
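A quick way to tell the two cases apart (device names are examples): if the block device itself reports read-only, a plain remount of the filesystem cannot succeed.

    # 1 means the whole block device is flagged read-only
    blockdev --getro /dev/sda

    # Mount options actually in effect for the root filesystem
    findmnt -o TARGET,SOURCE,OPTIONS /

    # Kernel messages usually show when and why the device was set read-only
    dmesg | grep -iE 'i/o error|read-only'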
Hmm, this is a Linux kernel limitation then. It deserves a bug report...
– Totor
Mar 20 '14 at 14:34
Did you try blockdev --setrw /dev/sda? I edited my question accordingly.
– Totor
Apr 8 '14 at 17:37
That's a new one to me. Hopefully I'll never see the problem again but I'll try this when it does happen. Thanks.
– Doug O'Neal
Apr 10 '14 at 12:34
From the mount(8) man page:
errors={continue|remount-ro|panic}
    Define the behavior when an error is encountered. (Either ignore errors and just mark the filesystem erroneous and continue, or remount the filesystem read-only, or panic and halt the system.) The default is set in the filesystem superblock, and can be changed using tune2fs(8).
So you should mount your VM's filesystems with the errors=continue option instead of errors=remount-ro, e.g.:
mount -o remount,errors=continue /
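As the man-page excerpt notes, the default lives in the ext2/3/4 superblock, so it can also be changed there instead of on every mount. A hedged sketch (device and mount point are examples; note that errors=continue trades safety for availability, since the filesystem keeps writing to a possibly broken device):

    # Show the current default error behavior stored in the superblock
    tune2fs -l /dev/sda1 | grep -i 'errors behavior'

    # Change the default to "continue" (or "remount-ro" / "panic")
    tune2fs -e continue /dev/sda1

    # Or set it per mount in /etc/fstab (example entry)
    # /dev/sda1  /  ext4  defaults,errors=continue  0  1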
I've had this happen on a RHEL system when rebooting/reconfiguring the attached SAN. What worked for me was to deactivate the volume group and its logical volumes, and then reactivate them.
vgchange -a n vg_group_name
lvchange -a n vg_group_name/lv_name
Then you must reactivate them.
vgchange -a y vg_group_name
lvchange -a y vg_group_name/lv_name
Then just try to remount everything with mount -a.
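Before trying this, it can help to confirm what state LVM thinks the volumes are in, and to ask the kernel to re-read the device now that the SAN is healthy again. A hedged sketch with hypothetical volume group and device names:

    # Activation state: the 5th character of lv_attr is a (active) or s (suspended)
    vgs
    lvs -o lv_name,vg_name,lv_attr,devices

    # Re-read the underlying SCSI device (device name is an example)
    echo 1 > /sys/block/sdb/device/rescan

    # Deactivate and reactivate a hypothetical data VG, then remount
    vgchange -a n datavg
    vgchange -a y datavg
    mount -a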
Probably doesn't work for the root / filesystem, which was the problematic fs in my case...
– Totor
Aug 10 '17 at 13:55
Not sure. Hopefully this will help someone fighting with production SAN issues like I have been.
– G_Style
Aug 10 '17 at 20:18
Having run test cases using a test VM on an NFS datastore that I intentionally disabled, I haven't found anything that worked. The blockdev command didn't work, and the vg/lv commands refuse to work on a mounted root filesystem.
At this point, the best option seems to be to set errors=panic in /etc/fstab so the VM just hard-fails.
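A hedged sketch of what that looks like (the device and the 30-second value are examples); combined with the kernel.panic sysctl, the guest reboots itself instead of sitting there read-only:

    # /etc/fstab: example root entry that panics on the first filesystem error
    /dev/sda1  /  ext4  defaults,errors=panic  0  1

    # /etc/sysctl.conf (or a drop-in): reboot automatically 30 s after a panic
    kernel.panic = 30

Whether an automatic reboot is preferable to a hung, read-only guest depends on the workload, so this is a policy choice rather than a fix.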