How to evaluate if it's worth using deduplication?

I have a partition on which I am considering using deduplication.

Given the profile of its data, I think it would be a good choice. Still, before doing so, I would like to evaluate the impact more systematically than by "feeling".

Is there a tool that evaluates the impact of deduplication on a partition (at either file level or block level)?

For now I have Ubuntu and ext4, but if deduplication proves valuable in this situation I am considering opendedup or lessfs. Any other suggestions are welcome, even if they mean using a different distribution or another free *nix.










  • Please add which filesystems you're considering this for, and also which distros.

    – slm
    May 15 '14 at 13:02

  • Unless you have peculiar data patterns, deduplication is worth it if and only if you make snapshots.

    – Gilles
    May 15 '14 at 23:38

  • @Gilles I think I have a good pattern for it. I have several users, many of whom share files with each other, very often as duplicates. Still, it's hard to tell how much space I would gain.

    – nsn
    May 16 '14 at 12:39

















Tags: filesystems, deduplication






asked May 15 '14 at 12:12 by nsn
edited Mar 9 at 14:12 by Rui F Ribeiro












1 Answer














You didn't specify which filesystem. If you're talking about ZFS, you can use the zdb command to see what effect turning on dedup would have had:



    # zdb -S tank
    Simulated DDT histogram:

    bucket              allocated                       referenced
    ______   ______________________________   ______________________________
    refcnt   blocks   LSIZE   PSIZE   DSIZE   blocks   LSIZE   PSIZE   DSIZE
    ------   ------   -----   -----   -----   ------   -----   -----   -----
         1      775   96.8M   96.8M   96.8M      775   96.8M   96.8M   96.8M
         2        2    256K    256K    256K        6    768K    768K    768K
         4        3    384K    384K    384K       13   1.62M   1.62M   1.62M
       128        1    128K    128K    128K      158   19.8M   19.8M   19.8M
     Total      781   97.5M   97.5M   97.5M      952    119M    119M    119M

    dedup = 1.22, compress = 1.00, copies = 1.00, dedup * compress / copies = 1.22
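
If the data currently lives on ext4, where there is no zdb to ask, a rough block-level estimate can be scripted. The following is only a minimal sketch under stated assumptions, not part of ZFS, opendedup, or lessfs: the function name `estimate_dedup_ratio` and its `block_size` parameter are made up for illustration, files are cut into fixed-size blocks starting at offset 0, and two blocks count as duplicates when their SHA-256 digests match. The ratio it returns is total bytes over unique bytes, which is analogous to the referenced/allocated ratio in the output above (119M / 97.5M ≈ 1.22). It ignores compression, metadata, and the memory cost of a dedup table, so treat the number as an indication rather than a prediction.

```python
import hashlib
import os


def estimate_dedup_ratio(root, block_size=128 * 1024):
    """Estimate fixed-block deduplication for regular files under `root`.

    Returns (total_bytes, unique_bytes, ratio), where ratio is
    total_bytes / unique_bytes (1.0 means nothing would be saved).
    Hypothetical helper for illustration only.
    """
    seen = set()        # digests of blocks already encountered
    total_bytes = 0     # bytes read across all files
    unique_bytes = 0    # bytes belonging to first-seen blocks

    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            path = os.path.join(dirpath, name)
            # Skip symlinks and anything that is not a regular file.
            if os.path.islink(path) or not os.path.isfile(path):
                continue
            try:
                with open(path, "rb") as fh:
                    while True:
                        block = fh.read(block_size)
                        if not block:
                            break
                        total_bytes += len(block)
                        digest = hashlib.sha256(block).digest()
                        if digest not in seen:
                            seen.add(digest)
                            unique_bytes += len(block)
            except OSError:
                # Unreadable files are skipped rather than aborting the scan.
                continue

    ratio = total_bytes / unique_bytes if unique_bytes else 1.0
    return total_bytes, unique_bytes, ratio
```

Calling, for example, `estimate_dedup_ratio("/home", block_size=128 * 1024)` scans a tree with 128K blocks, which matches the default ZFS recordsize; a smaller block size usually finds more duplicates but needs more memory for the set of digests. Because blocks are aligned to the start of each file, data that is shifted by a few bytes between two copies will not be detected as duplicate.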





answered May 15 '14 at 12:49 by mmusante


























