Linux VM's disk went read-only = no choice but reboot?

I have several Linux VMs on VMware + SAN.



What happened



A problem occurred on the SAN (a failed path), so for some time there were I/O errors on the Linux VMs' drives. By the time path failover completed, it was too late: every Linux machine had decided most of its drives were no longer "trustworthy" and had set them read-only. The drives holding the root filesystem were affected as well.



What I tried




  • mount -o rw,remount / without success,


  • echo running > /sys/block/sda/device/state without success,

  • dug into /sys to find a solution without success.

What I may not have tried

  • blockdev --setrw /dev/sda (see the consolidated sketch below)
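
For reference, here is the full sequence of attempts as commands, assuming the affected disk is /dev/sda and the root filesystem is the one stuck read-only; none of them is guaranteed to work once the kernel has flagged the device:

    mount -o remount,rw /                       # remount the root filesystem read-write
    echo running > /sys/block/sda/device/state  # set the SCSI device state back to "running"
    blockdev --setrw /dev/sda                   # clear the read-only flag on the block device itself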

Finally...



I had to reboot all my Linux VMs. The Windows VMs were fine...



Some more info from VMware...



The problem is described here: VMware suggests increasing the Linux SCSI timeout to prevent this problem from happening.
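
For example, the per-disk timeout is exposed in sysfs and can be raised at runtime; 180 seconds is the value commonly used for VMware guests, but treat the exact number, and the need for a udev rule (or VMware Tools) to persist it, as assumptions:

    # raise the SCSI command timeout (in seconds) for every sd* disk;
    # this does not survive a reboot by itself
    for t in /sys/block/sd*/device/timeout; do
        echo 180 > "$t"
    done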



The question!



However, when the problem does eventually happen, is there a way to get the drives back to read-write mode once the SAN is back to normal?


Tags: linux vmware block-device reboot readonly

asked Nov 2 '13 at 16:18, edited Apr 8 '14 at 16:08 – Totor

  • I have been able to get a disk back to read-write with mount -o remount /mountpoint on a real system. Perhaps that would work inside a VM too. – Michael Suelmann, Nov 2 '13 at 17:08

  • Thanks, but I already tried that, and it didn't work... I've edited my question accordingly. – Totor, Nov 2 '13 at 17:35

4 Answers

We have had this problem here a couple of times, usually because the network went down for an extended period. The problem is not that the filesystem is read-only but that the disk device itself is marked read-only, and there was no option here other than a reboot. Increasing the SCSI timeout works for transient glitches such as a path failover; it won't help with a 15-minute network outage.
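
The two states can be told apart from inside the guest (assuming the disk is /dev/sda): a read-only filesystem shows up in the mount options, while a read-only device is flagged on the block device itself:

    findmnt -no OPTIONS /        # "ro" here means the filesystem was remounted read-only
    blockdev --getro /dev/sda    # prints 1 if the kernel marked the whole device read-only
    cat /sys/block/sda/ro        # the same flag, via sysfs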






answered Mar 19 '14 at 12:51 – Doug O'Neal

  • Hmm, this is a Linux kernel limitation then. It deserves a bug report... – Totor, Mar 20 '14 at 14:34

  • Did you try blockdev --setrw /dev/sda? I edited my question accordingly. – Totor, Apr 8 '14 at 17:37

  • That's a new one to me. Hopefully I'll never see the problem again, but I'll try this when it does happen. Thanks. – Doug O'Neal, Apr 10 '14 at 12:34



From the mount man page:

    errors={continue|remount-ro|panic}
           Define the behavior when an error is encountered. (Either
           ignore errors and just mark the filesystem erroneous and
           continue, or remount the filesystem read-only, or panic and
           halt the system.) The default is set in the filesystem
           superblock, and can be changed using tune2fs(8).

So you should mount your VM's filesystems with the continue option instead of the default remount-ro:

    mount -o remount,errors=continue /
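
To make that choice persistent, the same error policy can be set in the ext superblock with tune2fs or in /etc/fstab; the device name below is an assumption, and note that errors=continue trades availability for a risk of further filesystem damage after I/O errors:

    tune2fs -e continue /dev/sda1      # store the default error behaviour in the superblock

    # or, in /etc/fstab:
    # /dev/sda1  /  ext4  defaults,errors=continue  0  1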





answered Jul 6 '17 at 9:19 – Pierre.Sassoulas

I've had this happen on a RHEL system when rebooting/re-configuring the attached SAN. What worked for me was to deactivate the volume group and its logical volumes, then reactivate them.

    vgchange -a n vg_group_name
    lvchange -a n vg_group_name/lv_name

Then reactivate them:

    vgchange -a y vg_group_name
    lvchange -a y vg_group_name/lv_name

Then just try to remount everything with mount -a.
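
For a non-root filesystem the whole cycle might look like this (vg_data and /data are placeholder names):

    umount /data               # the logical volume must not be in use
    vgchange -a n vg_data      # deactivate every LV in the volume group
    vgchange -a y vg_data      # reactivate once the SAN paths are healthy again
    mount -a                   # remount everything listed in /etc/fstab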






answered Aug 4 '17 at 15:58 – G_Style

  • Probably doesn't work for the root / filesystem, which was the problematic fs in my case... – Totor, Aug 10 '17 at 13:55

  • Not sure. Hopefully this will help someone fighting with production SAN issues like I have been. – G_Style, Aug 10 '17 at 20:18


Having run test cases with a test VM on an NFS datastore that I intentionally disabled, I haven't found anything that worked. The blockdev command didn't work, and the vg/lv commands refuse to work on a mounted root filesystem.

At this point, the best option seems to be to set errors=panic in /etc/fstab so the VM just hard-fails.
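
A minimal illustration of that fstab entry (device and filesystem type are assumptions); combined with the panic=<seconds> kernel parameter, the guest then reboots on its own instead of lingering with a read-only disk:

    # /etc/fstab
    /dev/sda1  /  ext4  defaults,errors=panic  0  1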






answered Nov 19 at 16:20 – gerardw