Why does RAID-5 require an additional disk for parity blocks?
I know that RAID-5 consists of block-level striping across multiple disks, with an additional parity-check block on each disk, and that at least two disks are required for striping.
And it's obvious that each parity block is specific to each disk it belongs to (and so there is no need for allocating an additional disk).
Image from Wikipedia.
However, I've been unable to understand why an additional disk is in fact required for parity checks, as stated in this article:
"The minimum number of disks in a RAID 5 set is three (two for data and one for parity)."
Any idea?
storage fault-tolerance space-partitioning
"Striping" (arranging in stripes), not "stripping" (removing). – David Richerby, 1 hour ago
asked 4 hours ago by Kais; edited 1 hour ago by David Richerby
1 Answer
I think you've misunderstood what the parity data is. They're not parity checks, so it's not true that "each parity block is specific to each disc it belongs to." The parity data is to allow recovery from a failed disc.
Let's go back to RAID-4 for a second, and assume we have three discs: discs $0$ and $1$ are data and disc $2$ is parity. "Parity" means that the $b$th block of disc $2$ is the xor of the $b$th blocks of discs $0$ and $1$. The point is that, if any single disc fails, we can recover its data because the $b$th block of any disc is the xor of the $b$th blocks of the other two discs. For this to work, it's crucial that the parity data is on a separate disc. If you only had two discs and put the parity data on those discs (e.g., each disc was two-thirds data blocks and one-third parity blocks) then the failure of a single drive would destroy some blocks and their corresponding parity data, so you'd be unable to recover the data using just what was left on the remaining disc.
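The xor relationship can be checked with a few lines of Python. This is only a sketch of the three-disc RAID-4 arrangement described above: the block contents and the helper names `xor_blocks` and `recover` are made up for illustration, and real arrays work on fixed-size sectors rather than four-byte strings.

```python
def xor_blocks(a: bytes, b: bytes) -> bytes:
    """Bytewise xor of two equal-length blocks."""
    if len(a) != len(b):
        raise ValueError("blocks must have the same length")
    return bytes(x ^ y for x, y in zip(a, b))


def recover(survivor_a, survivor_b):
    """Rebuild a lost disc block by block from the two surviving discs."""
    return [xor_blocks(x, y) for x, y in zip(survivor_a, survivor_b)]


# Discs 0 and 1 hold data; disc 2 holds the parity of each pair of blocks.
disc0 = [b"ABCD", b"EFGH"]
disc1 = [b"1234", b"5678"]
disc2 = [xor_blocks(d0, d1) for d0, d1 in zip(disc0, disc1)]

# Whichever single disc is lost, the other two are enough to rebuild it.
recovered_disc0 = recover(disc1, disc2)
recovered_disc1 = recover(disc0, disc2)
recovered_disc2 = recover(disc0, disc1)

print(recovered_disc0 == disc0, recovered_disc1 == disc1, recovered_disc2 == disc2)
```

Recovery always needs the two other blocks of the same stripe, which is why a stripe's parity cannot share a disc with the data it protects.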
RAID-5 is the same idea except that, instead of putting all the parity data on the last disc, it's spread across all the discs. So, for a three-disc set-up, a third of the blocks would have parity data on disc $2$, a third on disc $1$ and a third on disc $0$.
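As a rough illustration of the rotation, here is one possible placement rule in Python. The formula in `parity_disc` and the `D`/`P` labels are assumptions chosen to match the three-disc description above; real RAID-5 implementations use several different rotation schemes.

```python
def parity_disc(stripe: int, num_discs: int) -> int:
    """Disc holding the parity block of a stripe (one possible rotation)."""
    return (num_discs - 1) - (stripe % num_discs)


def layout(num_stripes: int, num_discs: int):
    """Return one row per stripe, labelling each disc's block as data or parity."""
    rows = []
    for s in range(num_stripes):
        p = parity_disc(s, num_discs)
        row = []
        next_data = 0
        for d in range(num_discs):
            if d == p:
                row.append(f"P{s}")
            else:
                row.append(f"D{s}.{next_data}")
                next_data += 1
        rows.append(row)
    return rows


# Columns are discs 0, 1, 2; rows are stripes.
for row in layout(6, 3):
    print("  ".join(row))
```

Over any three consecutive stripes, each disc holds the parity block exactly once, and no stripe keeps its parity on the same disc as its own data.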
The point of using RAID-5 rather than RAID-4 is that every time you write data, the corresponding parity block must be updated. If all parity data is on the same disc, that disc will be written to much more than the other discs (roughly $k-1$ times as much as each data disc in a $k$-disc system, if writes are spread evenly over the data), so it will fail faster. Spreading the parity data across the discs evens out the wear on them.
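The difference in wear can be seen by counting writes in a deliberately simplified model. The sketch below assumes that every data block is written exactly once and that each such write also rewrites its stripe's parity block (no full-stripe write optimisation); `count_writes` and the two placement rules are hypothetical names for illustration.

```python
from collections import Counter


def count_writes(num_discs, num_stripes, parity_disc_for):
    """Count block writes per disc when every data block is written once."""
    writes = Counter()
    for stripe in range(num_stripes):
        p = parity_disc_for(stripe)
        for disc in range(num_discs):
            if disc != p:
                writes[disc] += 1  # the data block itself
                writes[p] += 1     # the stripe's parity block is updated too
    return writes


k = 4          # number of discs
stripes = 12   # a multiple of k, so the rotation comes out even

# RAID-4: parity always lives on the last disc.
raid4 = count_writes(k, stripes, lambda s: k - 1)
# RAID-5: parity rotates across the discs (one possible rotation).
raid5 = count_writes(k, stripes, lambda s: (k - 1) - (s % k))

print("RAID-4:", dict(sorted(raid4.items())))
print("RAID-5:", dict(sorted(raid5.items())))
```

Under these assumptions the dedicated parity disc absorbs half of all writes, while the rotating layout gives every disc the same share.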
answered 1 hour ago by David Richerby