ZFS: re-compress existing files after change in compression algorithm
I have a pool that was created in 2011, using lzjb compression, and it wasn't until a couple of years later that an upgrade allowed me to set the compression to lz4. I estimate that at least 20% of the content (by space) on the array was created prior to 2013, which means it's still compressed using lzjb.
I can think of a couple of options to fix this and regain (some) space:
1. Back up and restore to a new pool. Not really practical, as I do not have sufficient redundant storage to hold the temporary copy. The restore would also require the pool to be offline for several hours.
2. Write a script to re-copy any file with a timestamp older than 2013 (roughly along the lines of the sketch below). Potentially risky, especially if it chokes on spaces or other special characters and ends up mangling the original name.
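For the second option, a sensible first step might be to enumerate the candidate files and see how much space they occupy. A minimal sketch, assuming GNU find and coreutils and a dataset mounted at /tank/data (the path and the cutoff date are illustrative):

    # List files last modified before 2013 and total their disk usage.
    # -print0 and --files0-from=- keep spaces and special characters intact.
    find /tank/data -type f ! -newermt '2013-01-01' -print0 |
        du -ch --files0-from=- | tail -n 1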
So... is there some way to get ZFS to recompress any legacy blocks using the current compression algorithm? Kind of like a scrub, but healing the compression.
A related question: is there some way to see the usage of each type of compression algorithm? zdb just shows overall compression stats, rather than breaking them down into individual algorithms.
Thanks.
zfs

asked Oct 1 at 2:04, last edited Oct 1 at 2:33
rowan194 (new contributor)
I'm pretty sure you named the only two options. See also the discussion in issue 3013 for why this functionality doesn't exist and you might not want to do this at all.
– Michael Hampton♦
Oct 1 at 2:14
lz4 is supposedly at most 10% better on compressing than lzjb. If 20% of your data can be compressed 10% better you'll get at most 2% more free space. Is it worth it?
– pipe
Oct 1 at 12:01
If you write a shell script to do the copy, add export LC_ALL=C to the beginning of the script, and all non-ASCII special characters in filenames will be kept intact. Keeping whitespace and dashes intact is trickier: use double quotes and --, e.g. cp -- "$SOURCE" "$TARGET".
– pts
Oct 1 at 12:57
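Building on that advice, a rewrite-in-place loop might look roughly like the sketch below. It is only a sketch: it assumes bash and GNU find/cp, a dataset mounted at /tank/data, and that nothing else is writing to these files while it runs; the path and cutoff date are illustrative, and space will not actually be freed for blocks still referenced by snapshots.

    #!/bin/bash
    export LC_ALL=C    # keep non-ASCII filenames intact, per the comment above

    # Rewrite every file last modified before 2013 so its blocks are written
    # again, and therefore compressed with the dataset's current setting.
    find /tank/data -type f ! -newermt '2013-01-01' -print0 |
    while IFS= read -r -d '' f; do
        tmp="$f.recompress.$$"              # temporary copy in the same dataset
        cp -p -- "$f" "$tmp" || continue    # -p preserves mode, owner, timestamps
        mv -- "$tmp" "$f"                   # rename over the original; the name never changes
    done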
@pipe Space is one (very) small advantage, but I'm more interested in decompression speed. From the FreeBSD zpool-features manpage: "Typically, lz4 compression is approximately 50% faster on compressible data and 200% faster on incompressible data than lzjb. It is also approximately 80% faster on decompression, while giving approximately 10% better compression ratio."
– rowan194
Oct 1 at 13:13
@pts I wouldn't call obeying fundamental shell programming rules (double quotes around variables or using --) "trickier". That's as important as avoiding SQL injection, for example.
– glglgl
Oct 1 at 14:52
1 Answer
You'd have to re-copy the data (in full or in part), or zfs send/receive the data to a new pool or ZFS filesystem.
There aren't any other options.
answered Oct 1 at 2:28
ewwhite
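For the send/receive route the answer mentions, a minimal sketch, assuming the data lives in tank/data and a destination pool newpool already exists (all names are illustrative). A plain zfs send transmits logically uncompressed records, so the receiving side recompresses them with whatever compression property the destination carries; avoid zfs send -c (--compressed) here, since that would carry the existing lzjb blocks over unchanged.

    # Make sure the destination will compress with lz4, then replicate.
    zfs set compression=lz4 newpool
    zfs snapshot tank/data@migrate
    zfs send tank/data@migrate | zfs receive newpool/data

After verifying the copy, the old dataset could be destroyed and the new one renamed or mounted in its place.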