Is find -iregex faster than using multiple -o's?

We have several find functions defined in our bash environment to exclude folders (often large or auto-generated) before grepping. An example of one of those is this:
function grepsrc()
c
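The body of the -iregex variant is not preserved above. As a rough sketch only, assuming GNU find (whose -iregex uses Emacs-style regular expressions by default and matches against the whole path) and assuming the same pruned directories and extension list as the -iname version further down, it might look something like this. The regex itself is a guess, not the asker's original:

```shell
# Hypothetical reconstruction of the -iregex variant (assumes GNU find).
# The extension list is copied from the -iname version of the function.
grepsrc() {
    # Skip .repo and .git directories entirely, then match regular files
    # whose full path ends in one of the listed extensions (case-insensitive).
    find . -type d \( -name .repo -o -name .git \) -prune -o \
        -type f -iregex '.*\.\(h\|c\|cc\|cpp\|S\|java\|xml\|sh\|mk\|aidl\|vts\)' \
        -exec grep --color=auto -n "$@" {} +
}
```

It would be called the same way as the -iname version, for example `grepsrc -i 'some_pattern'`.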
Would using multiple -o -iname be faster than the -iregex?
    function grepsrc()
    {
        # Prune .repo and .git, then grep only files with the listed extensions.
        find . -type d \( -name .repo -o -name .git \) -prune -o \
            -type f \( -iname '*.h' -o -iname '*.c' -o -iname '*.cc' -o \
            -iname '*.cpp' -o -iname '*.S' -o -iname '*.java' -o \
            -iname '*.xml' -o -iname '*.sh' -o -iname '*.mk' -o \
            -iname '*.aidl' -o -iname '*.vts' \) \
            -exec grep --color=auto -n "$@" {} +
    }
On my own tests, the former has an average time of
real 0m3.175s
user 0m3.021s
sys 0m0.145s
while the latter has an average of
real 0m3.170s
user 0m3.024s
sys 0m0.137s
So no real significant difference on my dataset, but I may be missing something.
grep find
asked Nov 6 '17 at 16:24
OnlineCop
Do you loop on grepsrc more than 200 times per second?
– Emmanuel
Nov 6 '17 at 16:40
1 Answer
There is no significant difference.
find is I/O bound, not CPU bound. Any string operation such as globbing or regexp matching will be dwarfed by disk operations. So your result above is to be expected.
What can (and typically does) affect find performance is the order of tests. For example, if you're looking for directories, moving -type d before, say, -name tests can speed things up by telling find it doesn't need to look at files. But changes that affect only name matching don't have any significant effect on speed.
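To make the ordering point concrete, here is a minimal sketch of the two orderings side by side. The function names `type_first` and `name_first` are made up for illustration, and `tests` is just the example directory name used above:

```shell
# Both functions select the same directories; only the order of the tests differs.
# The optional first argument is the tree to search (defaults to the current directory).
type_first() { find "${1:-.}" -type d -name tests; }
name_first() { find "${1:-.}" -name tests -type d; }
```

Timing each one on a real tree, for example with bash's `time type_first . > /dev/null`, should show whether the order matters on a given system. Note that the GNU find manual describes an optimiser (the `-O` levels) that may reorder tests on its own, so the measured difference can be small.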
edited Nov 7 '17 at 8:20
answered Nov 6 '17 at 16:53
Satō Katsura