Why is GNU find so fast in comparison with graphical file search utilities?

I'm trying to find a file that doesn't exist anywhere in my home directory and its subdirectories.

find ~/ -name "bogus" gives me that information within a few seconds, yet KDE's Dolphin file manager needed almost three minutes to do the same. This matches my previous experience with GNOME's Beagle.

How does find manage to do the same so quickly, while graphical search tools (which are more intuitive to use than command-line parameters) lag far behind?
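For comparison, the core of such a name-only search fits in a few lines. This Python sketch is a simplification (no symlink-loop protection, none of find's other predicates); like find, it reads only directory contents and never opens a file:

```python
import os
import tempfile

def name_search(root, name, matches=None):
    """Recursive name-only search, roughly what `find -name` does."""
    if matches is None:
        matches = []
    try:
        entries = os.scandir(root)
    except (PermissionError, NotADirectoryError):
        return matches
    with entries:
        for entry in entries:
            if entry.name == name:
                matches.append(entry.path)
            # is_dir(follow_symlinks=False) can often be answered from the
            # directory entry itself, without an extra stat system call
            if entry.is_dir(follow_symlinks=False):
                name_search(entry.path, name, matches)
    return matches

# Demo on a small throwaway tree (the layout is made up)
root = tempfile.mkdtemp()
os.makedirs(os.path.join(root, "a", "b"))
open(os.path.join(root, "a", "b", "bogus"), "w").close()
print(name_search(root, "bogus"))  # one match, under a/b
```

Pointing name_search at a home directory gives timings in the same ballpark as find, which suggests the GUI tools are doing extra work per entry rather than anything inherent to being graphical.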







  • I don't know what "Dolphin" is, but does it maybe also look inside files?
    – Kusalananda
    May 9 at 8:48






  • It's a graphical file manager from KDE: kde.org/applications/system/dolphin It has the ability to search inside files, but I didn't enable that option during this short test.
    – Red
    May 9 at 8:51






  • Did you search more than once in Dolphin? It might be "indexing" the first time. And find is slow too. Try locate if the file is older than the last time the locate database was indexed ;-)
    – Rinzwind
    May 9 at 8:56











  • I use locate more often than find and it's faster in a huge folder
    – phuclv
    May 9 at 13:43






  • While locate is really great for finding files, this is a bit off-topic, because it uses a completely different approach: find and GUI tools like Dolphin traverse the file tree on demand, while locate uses a previously created index structure.
    – Michael Schaefers
    May 9 at 13:52














edited May 9 at 15:14
Peter Cordes

asked May 9 at 8:33
Red

1 Answer


























Looking at Dolphin with Baloo specifically, it seems to look up the metadata of every file in its search domain, even if you're doing a simple file name search. When I trace the file.so process, I see calls to lstat, getxattr and getxattr again for every file, and even for .. entries. These system calls retrieve metadata about the file, which is stored in a different location from the file name (the file name is stored in the directory contents, but the metadata are in the inode). Querying the metadata of a file multiple times is cheap, since the data will be in the disk cache, but there is a significant difference between querying the metadata and not querying it at all.
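The difference can be sketched by counting metadata queries on a small synthetic tree (the tree layout below is made up; real Baloo also issues getxattr calls per entry, which this portable sketch leaves out):

```python
import os
import tempfile

def make_tree(root, dirs=3, files_per_dir=10):
    """Build a small synthetic tree (sizes are arbitrary)."""
    for d in range(dirs):
        sub = os.path.join(root, f"dir{d}")
        os.mkdir(sub)
        for f in range(files_per_dir):
            open(os.path.join(sub, f"file{f}"), "w").close()

def metadata_walk(root):
    """Baloo-style: one lstat per entry (plus getxattr calls on Linux,
    omitted here). Returns how many lstat calls were made."""
    calls = 0
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            os.lstat(os.path.join(dirpath, name))
            calls += 1
    return calls

def name_walk(root, wanted):
    """find-style name search: compares names straight from the
    directory contents and issues no per-entry lstat of its own."""
    return [os.path.join(dirpath, name)
            for dirpath, _dirnames, filenames in os.walk(root)
            for name in filenames if name == wanted]

root = tempfile.mkdtemp()
make_tree(root)
print(metadata_walk(root))            # 33: every entry paid a metadata query
print(len(name_walk(root, "file0")))  # 3 matches, no explicit lstat at all
```

On a real home directory with hundreds of thousands of entries, those per-entry system calls (and the disk seeks for inodes not yet cached) are where the minutes go.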



find is much more clever. It tries to avoid unnecessary system calls. It won't call getxattr, because it doesn't search based on extended attributes. When it's traversing a directory, it may need to call lstat on non-matching file names, because any of them may be a subdirectory to search recursively (lstat is the system call that returns file metadata, including the file type such as regular/directory/symlink/…). However, find has an optimization: it knows how many subdirectories a directory has from its link count, and it stops calling lstat once it knows that it has traversed all of them. In particular, in a leaf directory (a directory with no subdirectories), find only checks the names, not the metadata. Furthermore, some filesystems keep a copy of the file type in the directory entry, so that find doesn't even need to call lstat if that's the only information it needs.
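The link-count optimization can be sketched in Python. This is an illustrative reimplementation, not find's actual code, and it adds a defensive fallback for filesystems whose directory link counts don't follow the classic "2 + number of subdirectories" convention:

```python
import os
import tempfile

def find_with_leaf_optimization(root, wanted, matches=None):
    """Sketch of find's link-count trick: on classic filesystems a
    directory's link count is 2 + its number of subdirectories, so once
    that many subdirectories have been seen, the remaining entries
    cannot be directories and need no type check at all."""
    if matches is None:
        matches = []
    nlink = os.lstat(root).st_nlink
    # Filesystems without the classic convention typically report a
    # link count of 1; in that case check every entry (like -noleaf).
    subdirs_left = nlink - 2 if nlink >= 2 else float("inf")
    with os.scandir(root) as it:
        for entry in it:
            if entry.name == wanted:
                matches.append(entry.path)
            # In a leaf directory subdirs_left starts at 0, so only
            # names are compared: no per-entry metadata is touched.
            if subdirs_left > 0 and entry.is_dir(follow_symlinks=False):
                subdirs_left -= 1
                find_with_leaf_optimization(entry.path, wanted, matches)
    return matches
```

If the link count ever undercounts the subdirectories, this sketch misses files, which is exactly the hazard the comments on this answer debate.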



If you run find with options that require checking the metadata, it'll make more lstat calls, but it still won't make an lstat call on a file if it doesn't need the information (for example because the file is excluded by a previous condition matching on the name).
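That short-circuiting amounts to evaluating predicates left to right, with the free name test guarding the lstat-based one. A sketch (the pattern and size threshold are arbitrary illustrations, not find internals):

```python
import fnmatch
import os
import tempfile

def search(root, pattern, min_size=0):
    """Evaluate predicates in order, as find does: the name test is free
    (the name came with the directory entry), and lstat runs only for
    entries that survive it. Returns (matches, lstat call count)."""
    hits, lstat_calls = [], 0
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            if not fnmatch.fnmatch(name, pattern):
                continue                  # rejected without any lstat
            path = os.path.join(dirpath, name)
            lstat_calls += 1              # metadata queried only when required
            if os.lstat(path).st_size >= min_size:
                hits.append(path)
    return hits, lstat_calls

root = tempfile.mkdtemp()
for name in ("a.log", "b.log", "c.txt"):
    with open(os.path.join(root, name), "w") as f:
        f.write("x" * 10)
hits, calls = search(root, "*.log", min_size=5)
print(len(hits), calls)  # 2 2: c.txt never cost an lstat
```

The same reasoning explains why putting cheap tests first on a find command line can matter on large trees.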



I suspect that other GUI search tools that reinvent the find wheel are similarly less clever than the command-line utility, which has undergone decades of optimization. Dolphin, at least, is clever enough to use the locate database if you search “everywhere” (with the limitation, not made clear in the UI, that the results may be out of date).
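The locate shortcut trades freshness for speed: one traversal up front builds an index, after which queries never touch the tree. A toy sketch of that trade-off (function names here are illustrative, not the real updatedb/locate tools):

```python
import os
import tempfile

def build_index(root):
    """One-time traversal, like updatedb building the locate database."""
    return [os.path.join(dirpath, name)
            for dirpath, dirnames, filenames in os.walk(root)
            for name in dirnames + filenames]

def locate(index, needle):
    """Match against the stored names; no filesystem access at all, so
    files created after build_index() ran are invisible (stale results)."""
    return [path for path in index if needle in os.path.basename(path)]

root = tempfile.mkdtemp()
open(os.path.join(root, "report.txt"), "w").close()
index = build_index(root)
open(os.path.join(root, "report2.txt"), "w").close()  # created after indexing
print([os.path.basename(p) for p in locate(index, "report")])  # ['report.txt']
```

This is the "completely different approach" the comments on the question point out: the index makes each query nearly free, at the cost of results only being as fresh as the last indexing run.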






  • GNU find is so "clever" that it misses some files on some filesystem types. The well-known bug in GNU find is that it makes the illegal assumption that the link count of a directory is 2 + the number of subdirectories. This works for filesystems that implement the design bug from the UNIX V7 filesystem, but not for all filesystems, since this is not a POSIX requirement. If you want a useful performance number from GNU find, you need to specify -noleaf in order to tell GNU find to behave correctly.
    – schily
    May 9 at 10:40







  • @schily, GNU find may have had that bug a long time ago, but I doubt you'll find a case where you need to specify -noleaf by hand nowadays. AFAICT, on Linux at least, getdents() (and readdir()) tells which files are directories on UDF, ISO-9660 and btrfs, which don't have real . or .. entries, and find behaves OK there. Do you know of one case where GNU find exhibits the problem?
    – Stéphane Chazelas
    May 9 at 15:21






  • Just use this rotten genisoimage from Debian to create a Rock Ridge filesystem using "graft-points" and the link count in a directory is a random value. Since Rock Ridge implements a link count and ./.., GNU find will usually not find all files on such a filesystem.
    – schily
    May 9 at 15:34






  • @StéphaneChazelas: Last time I checked (for my master's thesis), the bug was fixed by asserting that exactly 2 meant a known leaf, rather than <= 2. The filesystems that don't implement the 2+ counter all return 1 for the directory link count, so everything's good. Now if some day somebody made a filesystem that did hard links to directories and didn't have this property, somebody's going to have a bad day.
    – Joshua
    May 9 at 19:09






  • @schily, I was not able to get random link counts with graft-points and RR with genisoimage 1.1.11 on Debian, and even if I binary-edit the ISO image to change link counts to random values, I still don't see any problem with GNU find. In any case, strace -v shows that getdents() correctly returns d_type=DT_DIR for directories, so GNU find doesn't have to use the link-count trick.
    – Stéphane Chazelas
    May 9 at 21:06











1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes








up vote
68
down vote



accepted










Looking at Dolphin with Baloo specifically, it seems to look up the metadata of every file in its search domain, even if you're doing a simple file name search. When I trace the file.so process, I see calls to lstat, getxattr and getxattr again for every file, and even for .. entries. These system calls retrieve metadata about the file which is stored in a different location from the file name (the file name is stored in the directory contents, but the metadata are in the inode). Querying the metadata of a file multiple times is cheap since the data would be in the disk cache, but there can be a significant difference between querying the metadata and not querying the metadata.



find is much more clever. It tries to avoid unnecessary system calls. It won't call getxattr because it doesn't search based on extended attributes. When it's traversing a directory, it may need to call lstat on non-matching file names because that may be a subdirectory to search recursively (lstat is the system call that returns file metadata including the file type such as regular/directory/symlink/…). However find has an optimization: it knows how many subdirectories a directory has from its link count, and it stops calling lstat once it knows that it's traversed all the subdirectories. In particular, in a leaf directory (a directory with no subdirectories), find only checks the names, not the metadata. Furthermore some filesystems keep a copy of the file type in the directory entry so that find doesn't even need to call lstat if that's the only information it needs.



If you run find with options that require checking the metadata, it'll make more lstat calls, but it still won't make an lstat call on a file if it doesn't need the information (for example because the file is excluded by a previous condition matching on the name).



I suspect that other GUI search tools that reinvent the find wheel are similarly less clever than the command line utility which has undergone decades of optimization. Dolphin, at least, is clever enough to use the locate database if you search “everywhere” (with the limitation which isn't clear in the UI that the results may be out of date).






share|improve this answer



















  • 22




    GNU find is so "clever" that it misses some files on some filesystem types. The well known bug in GNU find is that it makes the illegal assumption that the link count of a directory is 2 + number of sub-directories. This works for filesystems that implement the design bug from the UNIX V7 filesystem, but not for all filesystems, since this is not a POSIX requirement. If you like to get a useful performance number for GNU make, you need to specify -noleaf on order to tell GNU make to behave correctly.
    – schily
    May 9 at 10:40







  • 12




    @schily, GNU find may have had that bug a long time ago, but I doubt you'll find a case where you need to specify -noleaf by hand nowadays. AFAICT, on Linux at least getdents() (and readdir()) tells which files are directory files on UDF, ISO-9660, btrfs which don't have real . or .. entries and find behaves OK there. Do you know of one case where GNU find exhibit the problem?
    – Stéphane Chazelas
    May 9 at 15:21






  • 4




    Just use this rotten genisoimage from debian to create a Rock Ridge filesystem using "graft-points" and the link count in a directory is a random value. Since Rock Ridge implements a link count and ./.., GNU find will usually not find all files on such a filesystem.
    – schily
    May 9 at 15:34






  • 4




    @StéphaneChazelas: Last time I checked (for my master's thesis), the bug was fixed by asserting exactly 2 meant known leaf rather than <= 2. The filesystems that don't implement the 2+ counter all return 1 for directory link count so everything's good. Now if some day somebody made a filesystem that did hard links to directories that didn't have this property, somebody's going to have a bad day.
    – Joshua
    May 9 at 19:09






  • 14




    @schily, I was not able to get random link counts with graft-points and RR with genisoimage 1.1.11 on Debian and even if I binary-edit the iso image to change link counts to random values, I still don't see any problem with GNU find. And in any case, strace -v shows that getdents() correctly returns d_type=DT_DIR for directories, so GNU find doesn't have to use the link count trick.
    – Stéphane Chazelas
    May 9 at 21:06














up vote
68
down vote



accepted










Looking at Dolphin with Baloo specifically, it seems to look up the metadata of every file in its search domain, even if you're doing a simple file name search. When I trace the file.so process, I see calls to lstat, getxattr and getxattr again for every file, and even for .. entries. These system calls retrieve metadata about the file which is stored in a different location from the file name (the file name is stored in the directory contents, but the metadata are in the inode). Querying the metadata of a file multiple times is cheap since the data would be in the disk cache, but there can be a significant difference between querying the metadata and not querying the metadata.



find is much more clever. It tries to avoid unnecessary system calls. It won't call getxattr because it doesn't search based on extended attributes. When it's traversing a directory, it may need to call lstat on non-matching file names because that may be a subdirectory to search recursively (lstat is the system call that returns file metadata including the file type such as regular/directory/symlink/…). However find has an optimization: it knows how many subdirectories a directory has from its link count, and it stops calling lstat once it knows that it's traversed all the subdirectories. In particular, in a leaf directory (a directory with no subdirectories), find only checks the names, not the metadata. Furthermore some filesystems keep a copy of the file type in the directory entry so that find doesn't even need to call lstat if that's the only information it needs.



If you run find with options that require checking the metadata, it'll make more lstat calls, but it still won't make an lstat call on a file if it doesn't need the information (for example because the file is excluded by a previous condition matching on the name).



I suspect that other GUI search tools that reinvent the find wheel are similarly less clever than the command line utility which has undergone decades of optimization. Dolphin, at least, is clever enough to use the locate database if you search “everywhere” (with the limitation which isn't clear in the UI that the results may be out of date).






share|improve this answer



















  • 22




    GNU find is so "clever" that it misses some files on some filesystem types. The well known bug in GNU find is that it makes the illegal assumption that the link count of a directory is 2 + number of sub-directories. This works for filesystems that implement the design bug from the UNIX V7 filesystem, but not for all filesystems, since this is not a POSIX requirement. If you like to get a useful performance number for GNU make, you need to specify -noleaf on order to tell GNU make to behave correctly.
    – schily
    May 9 at 10:40







  • 12




    @schily, GNU find may have had that bug a long time ago, but I doubt you'll find a case where you need to specify -noleaf by hand nowadays. AFAICT, on Linux at least getdents() (and readdir()) tells which files are directory files on UDF, ISO-9660, btrfs which don't have real . or .. entries and find behaves OK there. Do you know of one case where GNU find exhibit the problem?
    – Stéphane Chazelas
    May 9 at 15:21






  • 4




    Just use this rotten genisoimage from debian to create a Rock Ridge filesystem using "graft-points" and the link count in a directory is a random value. Since Rock Ridge implements a link count and ./.., GNU find will usually not find all files on such a filesystem.
    – schily
    May 9 at 15:34






  • 4




    @StéphaneChazelas: Last time I checked (for my master's thesis), the bug was fixed by asserting exactly 2 meant known leaf rather than <= 2. The filesystems that don't implement the 2+ counter all return 1 for directory link count so everything's good. Now if some day somebody made a filesystem that did hard links to directories that didn't have this property, somebody's going to have a bad day.
    – Joshua
    May 9 at 19:09






  • 14




    @schily, I was not able to get random link counts with graft-points and RR with genisoimage 1.1.11 on Debian and even if I binary-edit the iso image to change link counts to random values, I still don't see any problem with GNU find. And in any case, strace -v shows that getdents() correctly returns d_type=DT_DIR for directories, so GNU find doesn't have to use the link count trick.
    – Stéphane Chazelas
    May 9 at 21:06












up vote
68
down vote



accepted







up vote
68
down vote



accepted






Looking at Dolphin with Baloo specifically, it seems to look up the metadata of every file in its search domain, even if you're doing a simple file name search. When I trace the file.so process, I see calls to lstat, getxattr and getxattr again for every file, and even for .. entries. These system calls retrieve metadata about the file which is stored in a different location from the file name (the file name is stored in the directory contents, but the metadata are in the inode). Querying the metadata of a file multiple times is cheap since the data would be in the disk cache, but there can be a significant difference between querying the metadata and not querying the metadata.



find is much more clever. It tries to avoid unnecessary system calls. It won't call getxattr because it doesn't search based on extended attributes. When it's traversing a directory, it may need to call lstat on non-matching file names because that may be a subdirectory to search recursively (lstat is the system call that returns file metadata including the file type such as regular/directory/symlink/…). However find has an optimization: it knows how many subdirectories a directory has from its link count, and it stops calling lstat once it knows that it's traversed all the subdirectories. In particular, in a leaf directory (a directory with no subdirectories), find only checks the names, not the metadata. Furthermore some filesystems keep a copy of the file type in the directory entry so that find doesn't even need to call lstat if that's the only information it needs.



If you run find with options that require checking the metadata, it'll make more lstat calls, but it still won't make an lstat call on a file if it doesn't need the information (for example because the file is excluded by a previous condition matching on the name).



I suspect that other GUI search tools that reinvent the find wheel are similarly less clever than the command line utility which has undergone decades of optimization. Dolphin, at least, is clever enough to use the locate database if you search “everywhere” (with the limitation which isn't clear in the UI that the results may be out of date).






share|improve this answer















Looking at Dolphin with Baloo specifically, it seems to look up the metadata of every file in its search domain, even if you're doing a simple file name search. When I trace the file.so process, I see calls to lstat, getxattr and getxattr again for every file, and even for .. entries. These system calls retrieve metadata about the file which is stored in a different location from the file name (the file name is stored in the directory contents, but the metadata are in the inode). Querying the metadata of a file multiple times is cheap since the data would be in the disk cache, but there can be a significant difference between querying the metadata and not querying the metadata.



find is much more clever. It tries to avoid unnecessary system calls. It won't call getxattr because it doesn't search based on extended attributes. When it's traversing a directory, it may need to call lstat on non-matching file names because that may be a subdirectory to search recursively (lstat is the system call that returns file metadata including the file type such as regular/directory/symlink/…). However find has an optimization: it knows how many subdirectories a directory has from its link count, and it stops calling lstat once it knows that it's traversed all the subdirectories. In particular, in a leaf directory (a directory with no subdirectories), find only checks the names, not the metadata. Furthermore some filesystems keep a copy of the file type in the directory entry so that find doesn't even need to call lstat if that's the only information it needs.



If you run find with options that require checking the metadata, it'll make more lstat calls, but it still won't make an lstat call on a file if it doesn't need the information (for example because the file is excluded by a previous condition matching on the name).



I suspect that other GUI search tools that reinvent the find wheel are similarly less clever than the command line utility which has undergone decades of optimization. Dolphin, at least, is clever enough to use the locate database if you search “everywhere” (with the limitation which isn't clear in the UI that the results may be out of date).







share|improve this answer















share|improve this answer



share|improve this answer








edited May 9 at 11:15


























answered May 9 at 10:31









Gilles

503k1189951522




503k1189951522







  • 22




    GNU find is so "clever" that it misses some files on some filesystem types. The well known bug in GNU find is that it makes the illegal assumption that the link count of a directory is 2 + number of sub-directories. This works for filesystems that implement the design bug from the UNIX V7 filesystem, but not for all filesystems, since this is not a POSIX requirement. If you like to get a useful performance number for GNU make, you need to specify -noleaf on order to tell GNU make to behave correctly.
    – schily
    May 9 at 10:40







  • 12




    @schily, GNU find may have had that bug a long time ago, but I doubt you'll find a case where you need to specify -noleaf by hand nowadays. AFAICT, on Linux at least getdents() (and readdir()) tells which files are directory files on UDF, ISO-9660, btrfs which don't have real . or .. entries and find behaves OK there. Do you know of one case where GNU find exhibit the problem?
    – Stéphane Chazelas
    May 9 at 15:21






  • 4




    Just use this rotten genisoimage from debian to create a Rock Ridge filesystem using "graft-points" and the link count in a directory is a random value. Since Rock Ridge implements a link count and ./.., GNU find will usually not find all files on such a filesystem.
    – schily
    May 9 at 15:34






  • 4




    @StéphaneChazelas: Last time I checked (for my master's thesis), the bug was fixed by asserting exactly 2 meant known leaf rather than <= 2. The filesystems that don't implement the 2+ counter all return 1 for directory link count so everything's good. Now if some day somebody made a filesystem that did hard links to directories that didn't have this property, somebody's going to have a bad day.
    – Joshua
    May 9 at 19:09






  • 14




    @schily, I was not able to get random link counts with graft-points and RR with genisoimage 1.1.11 on Debian and even if I binary-edit the iso image to change link counts to random values, I still don't see any problem with GNU find. And in any case, strace -v shows that getdents() correctly returns d_type=DT_DIR for directories, so GNU find doesn't have to use the link count trick.
    – Stéphane Chazelas
    May 9 at 21:06












  • 22




    GNU find is so "clever" that it misses some files on some filesystem types. The well known bug in GNU find is that it makes the illegal assumption that the link count of a directory is 2 + number of sub-directories. This works for filesystems that implement the design bug from the UNIX V7 filesystem, but not for all filesystems, since this is not a POSIX requirement. If you like to get a useful performance number for GNU make, you need to specify -noleaf on order to tell GNU make to behave correctly.
    – schily
    May 9 at 10:40







  • 12




    @schily, GNU find may have had that bug a long time ago, but I doubt you'll find a case where you need to specify -noleaf by hand nowadays. AFAICT, on Linux at least getdents() (and readdir()) tells which files are directory files on UDF, ISO-9660, btrfs which don't have real . or .. entries and find behaves OK there. Do you know of one case where GNU find exhibit the problem?
    – Stéphane Chazelas
    May 9 at 15:21






  • 4




    Just use this rotten genisoimage from debian to create a Rock Ridge filesystem using "graft-points" and the link count in a directory is a random value. Since Rock Ridge implements a link count and ./.., GNU find will usually not find all files on such a filesystem.
    – schily
    May 9 at 15:34






  • 4




    @StéphaneChazelas: Last time I checked (for my master's thesis), the bug was fixed by asserting exactly 2 meant known leaf rather than <= 2. The filesystems that don't implement the 2+ counter all return 1 for directory link count so everything's good. Now if some day somebody made a filesystem that did hard links to directories that didn't have this property, somebody's going to have a bad day.
    – Joshua
    May 9 at 19:09






  • 14




    @schily, I was not able to get random link counts with graft-points and RR with genisoimage 1.1.11 on Debian and even if I binary-edit the iso image to change link counts to random values, I still don't see any problem with GNU find. And in any case, strace -v shows that getdents() correctly returns d_type=DT_DIR for directories, so GNU find doesn't have to use the link count trick.
    – Stéphane Chazelas
    May 9 at 21:06
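
    The getdents()/d_type mechanism referenced in these comments can be sketched in Python, whose os.scandir() surfaces the same per-entry type information. A minimal illustration (the count_dirs helper is hypothetical, not from find's source): a tree walker can recurse into subdirectories without stat()-ing every entry and without relying on the fragile "link count = 2 + subdirs" heuristic.

    ```python
    import os
    import tempfile

    def count_dirs(path):
        # os.scandir() exposes the entry type reported by the kernel
        # (d_type from getdents() on Linux), so is_dir() usually needs
        # no extra stat() call per entry.
        with os.scandir(path) as it:
            return sum(1 for e in it if e.is_dir(follow_symlinks=False))

    # Build a small sample tree: three subdirectories and one file.
    base = tempfile.mkdtemp()
    for name in ("a", "b", "c"):
        os.mkdir(os.path.join(base, name))
    open(os.path.join(base, "file.txt"), "w").close()

    print(count_dirs(base))  # prints 3
    ```

    On filesystems where the kernel cannot fill in d_type (it returns DT_UNKNOWN), is_dir() transparently falls back to stat(), which is also what lets modern find behave correctly without -noleaf.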






