Is find -iregex faster than using multiple -o's?

We have several find functions defined in our bash environment to exclude folders (often large or auto-generated) before grepping. An example of one of those is this:



function grepsrc()
c


Would using multiple -o -iname be faster than the -iregex?



function grepsrc()
{
    # Prune repository metadata directories, then grep only the listed source types.
    find . -type d \( -name .repo -o -name .git \) -prune -o \
        -type f \( -iname '*.h' -o -iname '*.c' -o -iname '*.cc' -o \
                   -iname '*.cpp' -o -iname '*.S' -o -iname '*.java' -o \
                   -iname '*.xml' -o -iname '*.sh' -o -iname '*.mk' -o \
                   -iname '*.aidl' -o -iname '*.vts' \) \
        -exec grep --color=auto -n "$@" {} +
}



On my own tests, the former has an average time of



real 0m3.175s
user 0m3.021s
sys 0m0.145s


while the latter has an average of



real 0m3.170s
user 0m3.024s
sys 0m0.137s


So no real significant difference on my dataset, but I may be missing something.







  • Do you loop on grepsrc more than 200 times per second?
    – Emmanuel
    Nov 6 '17 at 16:40














asked Nov 6 '17 at 16:24 by OnlineCop

1 Answer

There is no significant difference.



find is I/O bound, not CPU bound. Any string operation such as globbing or regexp matching will be dwarfed by disk operations. So your result above is to be expected.
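One way to check this on your own machine is to confirm that the two forms select the same files and then time each of them. The following is only a sketch: it assumes GNU find (`-iregex` is not POSIX) with its default Emacs-style regex syntax, and the temporary tree, the shortened extension list, and the helper names `by_regex` and `by_iname` are made up for illustration.

```shell
#!/bin/sh
# Sketch: compare -iregex against chained -iname tests on a throwaway tree.
# Assumes GNU find; BSD find uses a different default regex syntax.
set -eu

tree=$(mktemp -d)
trap 'rm -rf "$tree"' EXIT

# Hypothetical layout: a few source files, one non-source file,
# and a .git directory that should be pruned.
mkdir -p "$tree/src" "$tree/.git"
touch "$tree/src/a.c" "$tree/src/b.H" "$tree/src/c.cpp" \
      "$tree/src/d.txt" "$tree/.git/e.c"

# -iregex matches against the whole path, hence the leading '.*'.
by_regex() {
    find "$tree" -type d -name .git -prune -o \
        -type f -iregex '.*\.\(c\|h\|cpp\)' -print
}

# The same selection expressed as chained -iname tests.
by_iname() {
    find "$tree" -type d -name .git -prune -o \
        -type f \( -iname '*.c' -o -iname '*.h' -o -iname '*.cpp' \) -print
}

regex_out=$(by_regex | sort)
iname_out=$(by_iname | sort)

if [ "$regex_out" = "$iname_out" ]; then
    echo "same selection"
else
    echo "selections differ"
fi
```

To compare speed, run each function several times on a real source tree, for example with `time by_regex >/dev/null` in bash, and ignore the first run so that the filesystem cache is warm for both.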



What can (and typically does) affect find performance is the order of tests. For example, if you're looking for directories, moving -type d before, say, -name tests can speed things up by telling find it doesn't need to look at files. But changes that affect only name matching don't have any significant effect on speed.
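As a small illustration of reordering, the sketch below runs the same two tests in both orders on a made-up temporary tree and checks that the result is identical, so only the amount of work per entry can differ. The directory layout is hypothetical. GNU find documents that its default optimisation level may move name-based tests ahead of others on its own, so any measured difference may be small in practice.

```shell
#!/bin/sh
# Sketch: the same two tests in either order select the same directories.
set -eu

tree=$(mktemp -d)
trap 'rm -rf "$tree"' EXIT

# Hypothetical layout: two directories named "tests" and one regular file
# with the same name, which -type d must reject.
mkdir -p "$tree/pkg/tests" "$tree/lib/tests" "$tree/doc"
touch "$tree/doc/tests"

type_first=$(find "$tree" -type d -name tests | sort)
name_first=$(find "$tree" -name tests -type d | sort)

if [ "$type_first" = "$name_first" ]; then
    echo "same result either way"
fi
printf '%s\n' "$type_first"
```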






answered Nov 6 '17 at 16:53 by Satō Katsura, edited Nov 7 '17 at 8:20