ZFS: re-compress existing files after change in compression algorithm

I have a pool that was created in 2011, using lzjb compression, and it wasn't until a couple of years later that an upgrade allowed me to set the compression to lz4. I estimate that at least 20% of the content (by space) on the array was created prior to 2013, which means it's still compressed using lzjb.



I can think of a couple of options to fix this and regain (some) space:



  1. Back up and restore to a new pool. Not really practical, as I do not have sufficient redundant storage to hold the temporary copy. The restore would also require the pool to be offline for several hours.


  2. Write a script to re-copy any file with a timestamp older than 2013. Potentially risky, especially if it chokes on spaces or other special characters and ends up mangling the original name.
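Option 2 can be made reasonably safe against odd filenames by using null-delimited paths and `--`. A minimal sketch (assumes GNU find for `-newermt` and bash for `read -d ''`; `/pool/data` is a placeholder for the dataset mountpoint):

```shell
#!/bin/bash
# Hedged sketch: rewrite every file last modified before 2013 so ZFS
# writes fresh blocks with the current (lz4) compression setting.
# ROOT is a placeholder; pass the dataset mountpoint as the first argument.
# Caveat: existing snapshots will keep the old lzjb blocks referenced.
ROOT=${1:-/pool/data}

find "$ROOT" -type f ! -newermt '2013-01-01' -print0 |
while IFS= read -r -d '' f; do             # null-delimited: safe for spaces/newlines
    tmp=$f.recompress.$$                   # temp copy alongside the original
    cp -p -- "$f" "$tmp" && mv -- "$tmp" "$f"   # copy, then replace in one rename
done
```

`cp -p` keeps the original timestamps and mode; the `mv` back over the original is a single rename, so the name is never mangled even if the script is interrupted.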


So... is there some way to get ZFS to recompress any legacy blocks using the current compression algorithm? Kind of like a scrub, but healing the compression.



A related question: is there some way to see the usage of each type of compression algorithm? zdb just shows overall compression stats, rather than breaking them down into individual algorithms.



Thanks.










  • I'm pretty sure you named the only two options. See also the discussion in issue 3013 for why this functionality doesn't exist and why you might not want to do this at all. – Michael Hampton♦ Oct 1 at 2:14

  • lz4 is supposedly at most 10% better at compressing than lzjb. If 20% of your data can be compressed 10% better, you'll get at most 2% more free space. Is it worth it? – pipe Oct 1 at 12:01

  • If you write a shell script to do the copy, add export LC_ALL=C at the beginning of the script, and all non-ASCII characters in filenames will be kept intact. Keeping whitespace and dashes intact is trickier: use double quotes and --, e.g. cp -- "$SOURCE" "$TARGET". – pts Oct 1 at 12:57

  • @pipe Space is one (very) small advantage, but I'm more interested in decompression speed. From the FreeBSD zpool-features manpage: "Typically, lz4 compression is approximately 50% faster on compressible data and 200% faster on incompressible data than lzjb. It is also approximately 80% faster on decompression, while giving approximately 10% better compression ratio." – rowan194 Oct 1 at 13:13

  • @pts I wouldn't call obeying fundamental shell programming rules (double quotes around variables, or using --) "trickier". That's as important as avoiding SQL injection, for example. – glglgl Oct 1 at 14:52














Tagged: zfs

edited Oct 1 at 2:33 · asked Oct 1 at 2:04
rowan194
1 Answer
You'd have to re-copy the data (in full or in part), or zfs send/receive the data to a new pool or ZFS filesystem.



There aren't any other options.
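The send/receive route might look like the following sketch; "tank/data" and "newtank" are placeholder names, and this assumes enough space exists on the destination pool:

```shell
# Hedged sketch: migrate a dataset so its blocks are rewritten with lz4.
# A plain (uncompressed) send is required here: a compressed send
# (zfs send -c, where supported) would carry the old lzjb blocks across.
zfs set compression=lz4 newtank            # received blocks inherit lz4
zfs snapshot tank/data@migrate
zfs send tank/data@migrate | zfs receive newtank/data
```

Because zfs receive writes the stream through the destination's normal write path, every block lands on disk compressed with the destination's current compression property.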






answered Oct 1 at 2:28
ewwhite