Why is GNU find so fast in comparison with graphical file search utilities?
I'm trying to find a file that doesn't exist in my home directory and all subdirectories.

find ~/ -name "bogus"

gives me that information after a few seconds, yet KDE's Dolphin file manager needed almost 3 minutes to do the same. This matches my previous experience with GNOME's Beagle.

How does find manage to do the same thing so quickly, while graphical search (which is more intuitive to use than command-line parameters) lags behind?

Tags: find, performance, dolphin
I don't know what "Dolphin" is, but does it maybe also look inside files?
– Kusalananda, May 9 at 8:48

It's a graphical file manager from KDE: kde.org/applications/system/dolphin. It has the ability to search inside files, but I didn't enable that option during this short test.
– Red, May 9 at 8:51

Did you search more than once in Dolphin? It might be "indexing" the first time. And find is slow too. Try locate if the file is older than the last time the database for locate was indexed ;-)
– Rinzwind, May 9 at 8:56

I use locate more often than find, and it's faster in a huge folder.
– phuclv, May 9 at 13:43

While locate is really great for finding files, this is a bit off-topic, because it uses a completely different approach: find and GUI tools like Dolphin traverse the file tree on demand, while locate uses a previously created index structure.
– Michael Schaefers, May 9 at 13:52
asked May 9 at 8:33 by Red; edited May 9 at 15:14 by Peter Cordes
1 Answer
(accepted answer; score 68)
Looking at Dolphin with Baloo specifically, it seems to look up the metadata of every file in its search domain, even if you're doing a simple file name search. When I trace the file.so process, I see calls to lstat, getxattr and getxattr again for every file, and even for .. entries. These system calls retrieve metadata about the file, which is stored in a different location from the file name (the file name is stored in the directory contents, but the metadata is in the inode). Querying the metadata of a file multiple times is cheap, since the data will be in the disk cache, but there can be a significant difference between querying the metadata and not querying it at all.
find is much more clever. It tries to avoid unnecessary system calls. It won't call getxattr, because it doesn't search based on extended attributes. When it's traversing a directory, it may need to call lstat on non-matching file names, because they might be subdirectories to search recursively (lstat is the system call that returns file metadata, including the file type: regular/directory/symlink/…). However, find has an optimization: it knows how many subdirectories a directory has from its link count, and it stops calling lstat once it knows it has traversed all of them. In particular, in a leaf directory (a directory with no subdirectories), find only checks the names, not the metadata. Furthermore, some filesystems keep a copy of the file type in the directory entry, so that find doesn't even need to call lstat if that's the only information it needs.
If you run find with options that require checking the metadata, it'll make more lstat calls, but it still won't make an lstat call on a file if it doesn't need the information (for example, because the file is excluded by a previous condition matching on the name).
I suspect that other GUI search tools that reinvent the find wheel are similarly less clever than a command-line utility that has undergone decades of optimization. Dolphin, at least, is clever enough to use the locate database if you search "everywhere" (with the limitation, which isn't clear in the UI, that the results may be out of date).
GNU find is so "clever" that it misses some files on some filesystem types. The well-known bug in GNU find is that it makes the illegal assumption that the link count of a directory is 2 + number of subdirectories. This works for filesystems that implement the design bug from the UNIX V7 filesystem, but not for all filesystems, since this is not a POSIX requirement. If you would like to get a useful performance number for GNU find, you need to specify -noleaf in order to tell GNU find to behave correctly.
– schily, May 9 at 10:40

@schily, GNU find may have had that bug a long time ago, but I doubt you'll find a case where you need to specify -noleaf by hand nowadays. AFAICT, on Linux at least, getdents() (and readdir()) tells which files are directory files on UDF, ISO-9660 and btrfs, which don't have real . or .. entries, and find behaves OK there. Do you know of one case where GNU find exhibits the problem?
– Stéphane Chazelas, May 9 at 15:21

Just use this rotten genisoimage from Debian to create a Rock Ridge filesystem using "graft-points", and the link count in a directory is a random value. Since Rock Ridge implements a link count and ./.., GNU find will usually not find all files on such a filesystem.
– schily, May 9 at 15:34

@StéphaneChazelas: Last time I checked (for my master's thesis), the bug was fixed by asserting that exactly 2 meant a known leaf, rather than <= 2. The filesystems that don't implement the 2+ counter all return 1 for the directory link count, so everything's good. Now if some day somebody made a filesystem that did hard links to directories and didn't have this property, somebody's going to have a bad day.
– Joshua, May 9 at 19:09

@schily, I was not able to get random link counts with graft-points and RR with genisoimage 1.1.11 on Debian, and even if I binary-edit the ISO image to change link counts to random values, I still don't see any problem with GNU find. And in any case, strace -v shows that getdents() correctly returns d_type=DT_DIR for directories, so GNU find doesn't have to use the link-count trick.
– Stéphane Chazelas, May 9 at 21:06
answered May 9 at 10:31 by Gilles; edited May 9 at 11:15