What's the most resource efficient way to count how many files are in a directory?

CentOS 5.9



I came across an issue the other day where a directory had a lot of files. To count them, I ran ls -l /foo/foo2/ | wc -l



Turns out that there were over 1 million files in a single directory (long story -- the root cause is getting fixed).



My question is: is there a faster way to do the count? What would be the most efficient way to get the count?

Tags: bash shell directory ls

asked Sep 10 '13 at 19:33 by Mike B

Comments:

  • ls -l | wc -l would be off by one due to the total blocks line at the top of ls -l output. – Thomas Nyman, Sep 10 '13 at 21:18

  • @ThomasNyman It would actually be off by several because of the dot and dotdot pseudo entries, but those can be avoided by using the -A flag. -l is also problematic because it reads file metadata in order to generate the extended list format. Forcing NOT -l by using \ls is a much better option (-1 is assumed when piping output). See Gilles's answer for the best solution here. – Caleb, Sep 11 '13 at 9:29

  • @Caleb ls -l doesn't output any hidden files nor the . and .. entries. ls -a output includes hidden files, including . and .., while ls -A output includes hidden files but excludes the . and .. entries. In Gilles's answer the bash dotglob shell option causes the expansion to include hidden files while excluding the . and .. entries. – Thomas Nyman, Sep 11 '13 at 9:45

13 Answers


















Answer by Gilles (score 56)














Short answer:



ls -afq | wc -l


(This includes . and .., so subtract 2.)




When you list the files in a directory, three common things might happen:



  1. Enumerating the file names in the directory. This is inescapable: there is no way to count the files in a directory without enumerating them.

  2. Sorting the file names. Shell wildcards and the ls command do that.

  3. Calling stat to retrieve metadata about each directory entry, such as whether it is a directory.

#3 is the most expensive by far, because it requires loading an inode for each file. In comparison all the file names needed for #1 are compactly stored in a few blocks. #2 wastes some CPU time but it is often not a deal breaker.
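
For example, a quick way to see how many stat-family calls a given listing makes is to run it under strace (Linux-specific; the exact syscall names in the summary, such as stat, lstat or newfstatat, depend on the libc):

strace -c ls -f /foo/foo2/ > /dev/null    # unsorted; should show few or no per-entry stat calls
strace -c ls -l /foo/foo2/ > /dev/null    # stats every entry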



If there are no newlines in file names, a simple ls -A | wc -l tells you how many files there are in the directory. Beware that if you have an alias for ls, this may trigger a call to stat (e.g. ls --color or ls -F need to know the file type, which requires a call to stat), so from the command line, call command ls -A | wc -l or \ls -A | wc -l to avoid the alias.



If there are newlines in the file name, whether newlines are listed or not depends on the Unix variant. GNU coreutils and BusyBox default to displaying ? for a newline, so they're safe.



Call ls -f to list the entries without sorting them (#2). This automatically turns on -a (at least on modern systems). The -f option is in POSIX but with optional status; most implementations support it, but not BusyBox. The option -q replaces non-printable characters including newlines by ?; it's POSIX but isn't supported by BusyBox, so omit it if you need BusyBox support at the expense of overcounting files whose name contains a newline character.



If the directory has no subdirectories, then most versions of find will not call stat on its entries (leaf directory optimization: a directory that has a link count of 2 cannot have subdirectories, so find doesn't need to look up the metadata of the entries unless a condition such as -type requires it). So find . | wc -l is a portable, fast way to count files in a directory provided that the directory has no subdirectories and that no file name contains a newline.



If the directory has no subdirectories but file names may contain newlines, try one of these (the second one should be faster if it's supported, but may not be noticeably so).



find -print0 | tr -dc '\0' | wc -c
find -printf a | wc -c


On the other hand, don't use find if the directory has subdirectories: even find . -maxdepth 1 calls stat on every entry (at least with GNU find and BusyBox find). You avoid sorting (#2) but you pay the price of an inode lookup (#3) which kills performance.



In the shell without external tools, you can count the files in the current directory with set -- *; echo $#. This misses dot files (files whose name begins with .) and reports 1 instead of 0 in an empty directory. This is the fastest way to count files in small directories because it doesn't require starting an external program, but (except in zsh) wastes time for larger directories due to the sorting step (#2).




  • In bash, this is a reliable way to count the files in the current directory:



    shopt -s dotglob nullglob
    a=(*)
    echo ${#a[@]}



  • In ksh93, this is a reliable way to count the files in the current directory:



    FIGNORE='@(.|..)'
    a=(~(N)*)
    echo ${#a[@]}



  • In zsh, this is a reliable way to count the files in the current directory:



    a=(*(DNoN))
    echo $#a


    If you have the mark_dirs option set, make sure to turn it off: a=(*(DNoN^M)).




  • In any POSIX shell, this is a reliable way to count the files in the current directory:



    total=0
    set -- *
    if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
    set -- .[!.]*
    if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
    set -- ..?*
    if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
    echo "$total"


All of these methods sort the file names, except for the zsh one.
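
To compare approaches on a particular directory, the simplest check is to time them directly. A rough sketch (bash with GNU coreutils and findutils assumed; cache state and filesystem type heavily influence the numbers, as discussed in the comments below):

dir=/foo/foo2/
time ls -afq "$dir" | wc -l                       # no sorting, no stat; counts . and .. too
time find "$dir" -mindepth 1 -maxdepth 1 | wc -l  # no sorting; may stat each entry (see above)
time ls -l "$dir" | wc -l                         # sorts and stats everything; the question's baseline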






Comments:

  • My empirical testing on >1 million files shows that find -maxdepth 1 easily keeps pace with ls -U as long as you don't add anything like a -type declaration that has to do further checks. Are you sure GNU find actually calls stat? Even the slowdown on find -type is nothing compared to how much ls -l bogs down if you make it return file details. On the other hand the clear speed winner is zsh using the non-sorting glob. (Sorted globs are 2x slower than ls while the non-sorting one is 2x faster.) I wonder if file system types would significantly affect these results. – Caleb, Sep 11 '13 at 9:44

  • @Caleb I ran strace. This is only true if the directory has subdirectories: otherwise find's leaf directory optimization kicks in (even without -maxdepth 1); I should have mentioned that. A lot of things can affect the result, including the filesystem type (calling stat is a lot more expensive on filesystems that represent directories as linear lists than on filesystems that represent directories as trees), whether the inodes were all created together and are thus close by on the disk, cold or hot cache, etc. – Gilles, Sep 11 '13 at 9:55

  • Historically, ls -f has been the reliable way to prevent calling stat - this is often simply described today as "output is not sorted" (which it also causes), and it does include the . and .. entries. -A and -U are not standard options. – Random832, Sep 11 '13 at 12:59

  • If you specifically want to count files with a common extension (or other string), inserting that into the command eliminates the extra 2. Here is an example: ls -afq *[0-9].pdb | wc -l – Steven C. Howell, Jun 12 '15 at 13:18

  • FYI, with ksh93 version sh (AT&T Research) 93u+ 2012-08-01 on my Debian-based system, FIGNORE doesn't seem to work; the . and .. entries are included in the resulting array. – Sergiy Kolodyazhnyy, Jan 4 at 8:52







Answer by Joel Taylor (score 15)














find /foo/foo2/ -maxdepth 1 | wc -l


This is considerably faster on my machine, but the local . directory is added to the count.
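
A variant that leaves the directory itself out of the count, as suggested in the comments below:

find /foo/foo2/ -mindepth 1 -maxdepth 1 | wc -l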






Comments:

  • Thanks. I'm compelled to ask a silly question though: why is it faster? Because it's not bothering to look up file attributes? – Mike B, Sep 10 '13 at 20:42

  • Yes, that's my understanding. As long as you're not using the -type parameter, find should be faster than ls. – Joel Taylor, Sep 10 '13 at 21:02

  • Hmmm.... if I'm understanding the documentation of find well, this should actually be better than my answer. Anyone with more experience can verify? – Luis Machuca, Sep 11 '13 at 2:38

  • Add a -mindepth 1 to omit the directory itself. – Stéphane Chazelas, Jan 4 at 9:53


















Answer (score 8)














ls -1U before the pipe should use just a bit fewer resources, since it makes no attempt to sort the file entries: it just reads them in the order they are stored in the directory on disk. It also produces less output, meaning slightly less work for wc.



You could also use ls -f which is more or less a shortcut for ls -1aU.



I don't know if there is a resource-efficient way to do it via a command without piping though.
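
Concretely, either spelling can be piped straight into wc -l (assuming no newlines in file names):

ls -1U /foo/foo2/ | wc -l   # unsorted, one name per line; dot files not included
ls -f  /foo/foo2/ | wc -l   # roughly ls -1aU; also lists . and .., so subtract 2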






Comments:

  • Btw, -1 is implied when the output goes to a pipe – enzotib, Sep 10 '13 at 21:04

  • @enzotib - it is? Wow... one learns something new every day! – Luis Machuca, Sep 10 '13 at 21:25


















Answer (score 6)














Another point of comparison: while not a shell one-liner, this C program doesn't do anything superfluous. Note that hidden files are ignored to match the output of ls | wc -l (ls -l | wc -l is off by one due to the total blocks line at the start of the output).



#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>
#include <error.h>
#include <errno.h>

int main(int argc, char *argv[])
{
    int file_count = 0;
    DIR *dirp;
    struct dirent *entry;

    if (argc < 2)
        error(EXIT_FAILURE, 0, "missing argument");

    if (!(dirp = opendir(argv[1])))
        error(EXIT_FAILURE, errno, "could not open '%s'", argv[1]);

    while ((entry = readdir(dirp)) != NULL) {
        if (entry->d_name[0] == '.')    /* ignore hidden files */
            continue;

        file_count++;
    }

    closedir(dirp);

    printf("%d\n", file_count);
    return 0;
}






Comments:

  • Using the readdir() stdio API does add some overhead and does not give you control over the size of the buffer passed to the underlying system call (getdents on Linux) – Stéphane Chazelas, Jan 4 at 9:41


















Answer (score 3)














You could try perl -e 'opendir($dh,".");$i=0;while(readdir $dh){$i++}print "$i\n";'



It'd be interesting to compare timings with your shell pipe.






Comments:

  • On my tests, this keeps pretty much exactly the same pace as the three other fastest solutions (find -maxdepth 1 | wc -l, ls -AU | wc -l and the zsh based non sorting glob and array count). In other words it beats out the options with various inefficiencies such as sorting or reading extraneous file properties. I would venture to say since it doesn't earn you anything either, it isn't worth using over a simpler solution unless you happen to be in perl already :) – Caleb, Sep 11 '13 at 9:53

  • Note that this will include the . and .. directory entries in the count, so you need to subtract two to get the actual number of files (and subdirectories). In modern Perl, perl -E 'opendir $dh, "."; $i++ while readdir $dh; say $i - 2' would do it. – Ilmari Karonen, Sep 11 '13 at 10:36


















Answer (score 2)














From this answer, I can think of this one as a possible solution.



/*
 * List directory entries using getdents() directly, because ls, find and
 * Python libraries use readdir(), which is slower (but uses getdents()
 * underneath).
 *
 * Compile with:
 *   gcc getdents.c -o getdents
 */
#define _GNU_SOURCE
#include <dirent.h>     /* Defines DT_* constants */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/syscall.h>

#define handle_error(msg) \
        do { perror(msg); exit(EXIT_FAILURE); } while (0)

struct linux_dirent {
    long           d_ino;
    off_t          d_off;
    unsigned short d_reclen;
    char           d_name[];
};

#define BUF_SIZE (1024*1024*5)

int
main(int argc, char *argv[])
{
    int fd, nread;
    char buf[BUF_SIZE];
    struct linux_dirent *d;
    int bpos;

    fd = open(argc > 1 ? argv[1] : ".", O_RDONLY | O_DIRECTORY);
    if (fd == -1)
        handle_error("open");

    /* Read batches of raw directory entries and print one name per line,
     * so the output can be piped to wc -l (the loop follows the
     * getdents(2) manual page example). */
    for (;;) {
        nread = syscall(SYS_getdents, fd, buf, BUF_SIZE);
        if (nread == -1)
            handle_error("getdents");
        if (nread == 0)
            break;

        for (bpos = 0; bpos < nread; bpos += d->d_reclen) {
            d = (struct linux_dirent *) (buf + bpos);
            if (d->d_ino != 0)          /* skip deleted entries */
                printf("%s\n", d->d_name);
        }
    }

    close(fd);
    exit(EXIT_SUCCESS);
}


Copy the C program above into the directory in which the files need to be listed. Then compile and run it with the commands below.



gcc getdents.c -o getdents
./getdents | wc -l





Comments:

  • A few things: 1) if you're willing to use a custom program for this, you might as well just count the files and print the count; 2) to compare with ls -f, don't filter on d_type at all, just on d->d_ino != 0; 3) subtract 2 for . and .. – Matei David, Jan 17 '17 at 16:01

  • See linked answer for a timings example where this is 40x faster than the accepted ls -f. – Matei David, Jan 17 '17 at 16:02



















Answer (score 1)














A bash-only solution, not requiring any external program, though I don't know how efficient it is:

list=(*)
echo "${#list[@]}"





Comments:

  • Glob expansion isn't necessarily the most resource-efficient way to do this. Besides the fact that most shells have an upper limit on the number of items they will even process (so this will probably bomb when dealing with a million-plus items), it also sorts the output. The solutions involving find or ls without sorting options will be faster. – Caleb, Sep 11 '13 at 6:37

  • @Caleb, only old versions of ksh had such limits (and didn't support that syntax) AFAIK. In almost all other shells, the limit is just the available memory. You've got a point that it's going to be very inefficient, especially in bash. – Stéphane Chazelas, Jan 4 at 9:45


















Answer (score 1)














Probably the most resource efficient way would involve no outside process invocations. So I'd wager on...



cglb() ( c=0; set --
        # tglb: "$2" only names something that exists if the glob really
        # matched (beyond the ".." sentinel); if so, add the matches to c.
        tglb() { [ -e "$2" ] || [ -L "$2" ] && c=$(($c+$#-1)); }
        for glb in '.?*' \*
        do      tglb $1 ${glb##.*} ${glb#\*}
                set -- ..
        done
        echo $c
)
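
Usage: run it from (or after cd-ing into) the directory whose entries you want counted:

cd /foo/foo2/ && cglb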





Comments:

  • Got relative numbers? for how many files? – smci, Nov 20 '17 at 23:44


















Answer (score 0)














After fixing the issue from @Joel's answer, where it added . as a file:



find /foo/foo2 -maxdepth 1 | tail -n +2 | wc -l



tail simply removes the first line, meaning that . isn't counted anymore.






Comments:

  • Adding a pair of pipes in order to omit one line of wc input is not very efficient as the overhead increases linearly with regard to input size. In this case, why not simply decrement the final count to compensate for it being off by one, which is a constant time operation: echo $(( $(find /foo/foo2 -maxdepth 1 | wc -l) - 1)) – Thomas Nyman, Sep 11 '13 at 6:32

  • Rather than feed that much data through another process, it would probably be better to just do some math on the final output. let count = $(find /foo/foo2 -maxdepth 1 | wc -l) - 2 – Caleb, Sep 11 '13 at 6:34



















Answer (score 0)














os.listdir() in Python can do the work for you. It gives a list of the contents of the directory, excluding the special '.' and '..' entries. Also, there is no need to worry about files with special characters such as '\n' in the name.



python -c 'import os;print len(os.listdir("."))'


The following is the time taken by the above Python command compared with the 'ls -Af' command:




~/test$ time ls -Af |wc -l
399144

real 0m0.300s
user 0m0.104s
sys 0m0.240s
~/test$ time python -c 'import os;print len(os.listdir("."))'
399142

real 0m0.249s
user 0m0.064s
sys 0m0.180s
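
On systems where python is Python 3, print is a function, so the equivalent one-liner is:

python3 -c 'import os; print(len(os.listdir(".")))'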



































Answer (score 0)














ls -1 | wc -l comes immediately to my mind. Whether ls -1U is faster than ls -1 is purely academic - the difference should be negligible except for very large directories.




































Answer (score 0)














I know this is old but I feel that awk has to be mentioned here. The suggestions that include the use of wc simply aren't correct in regard to the OP's question: "the most resource efficient way." I recently had a log file get way out of control (due to some bad software) and therefore stumbled onto this post. There were roughly 232 million entries! I first tried wc -l and waited 15 minutes - it was not even able to finish counting the lines. The following awk statement gave me an accurate line count in 3 minutes on that log file. I've learned over the years to never underestimate awk's ability to simulate standard shell programs in a much more efficient fashion. Hope it helps someone like me. Happy hacking!



awk 'BEGIN{i=0} {i++} END{print i}' /foo/foo2


And if you need to substitute a command like ls for counting files in a directory:

# Normal:
awk 'BEGIN{i=0} {i++} END{print i}' <(ls /foo/foo2/)
# Hidden:
awk 'BEGIN{i=0} {i++} END{print (i-2)}' <(ls -f /foo/foo2/)
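
A variant that avoids parsing ls output entirely, feeding awk from find instead (it still overcounts if a file name contains a newline):

find /foo/foo2/ -mindepth 1 -maxdepth 1 | awk 'END {print NR}'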





Comments:

  • Or simply, awk 'END{print NR}'. But in this particular situation, awk may be overkill because ls is the bottleneck, not wc. – Amit Naidu, May 29 '18 at 5:19


















Answer (score -2)














I would think echo * would be more efficient than any 'ls' command:

echo * | wc -w





Comments:

  • What about files with a space in their name? echo 'Hello World'|wc -w produces 2. – Joseph R., Sep 11 '13 at 20:52

  • @JosephR. Caveat Emptor – Dan Garthwaite, Sep 12 '13 at 0:59










      Your Answer








      StackExchange.ready(function()
      var channelOptions =
      tags: "".split(" "),
      id: "106"
      ;
      initTagRenderer("".split(" "), "".split(" "), channelOptions);

      StackExchange.using("externalEditor", function()
      // Have to fire editor after snippets, if snippets enabled
      if (StackExchange.settings.snippets.snippetsEnabled)
      StackExchange.using("snippets", function()
      createEditor();
      );

      else
      createEditor();

      );

      function createEditor()
      StackExchange.prepareEditor(
      heartbeatType: 'answer',
      autoActivateHeartbeat: false,
      convertImagesToLinks: false,
      noModals: true,
      showLowRepImageUploadWarning: true,
      reputationToPostImages: null,
      bindNavPrevention: true,
      postfix: "",
      imageUploader:
      brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
      contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
      allowUrls: true
      ,
      onDemand: true,
      discardSelector: ".discard-answer"
      ,immediatelyShowMarkdownHelp:true
      );



      );













      draft saved

      draft discarded


















      StackExchange.ready(
      function ()
      StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2funix.stackexchange.com%2fquestions%2f90106%2fwhats-the-most-resource-efficient-way-to-count-how-many-files-are-in-a-director%23new-answer', 'question_page');

      );

      Post as a guest















      Required, but never shown

























      13 Answers
      13






      active

      oldest

      votes








      13 Answers
      13






      active

      oldest

      votes









      active

      oldest

      votes






      active

      oldest

      votes









      56














      Short answer:



      ls -afq | wc -l


      (This includes . and .., so subtract 2.)




      When you list the files in a directory, three common things might happen:



      1. Enumerating the file names in the directory. This is inescapable: there is no way to count the files in a directory without enumerating them.

      2. Sorting the file names. Shell wildcards and the ls command do that.

      3. Calling stat to retrieve metadata about each directory entry, such as whether it is a directory.

      #3 is the most expensive by far, because it requires loading an inode for each file. In comparison all the file names needed for #1 are compactly stored in a few blocks. #2 wastes some CPU time but it is often not a deal breaker.



      If there are no newlines in file names, a simple ls -A | wc -l tells you how many files there are in the directory. Beware that if you have an alias for ls, this may trigger a call to stat (e.g. ls --color or ls -F need to know the file type, which requires a call to stat), so from the command line, call command ls -A | wc -l or ls -A | wc -l to avoid an alias.



      If there are newlines in the file name, whether newlines are listed or not depends on the Unix variant. GNU coreutils and BusyBox default to displaying ? for a newline, so they're safe.



      Call ls -f to list the entries without sorting them (#2). This automatically turns on -a (at least on modern systems). The -f option is in POSIX but with optional status; most implementations support it, but not BusyBox. The option -q replaces non-printable characters including newlines by ?; it's POSIX but isn't supported by BusyBox, so omit it if you need BusyBox support at the expense of overcounting files whose name contains a newline character.



      If the directory has no subdirectories, then most versions of find will not call stat on its entries (leaf directory optimization: a directory that has a link count of 2 cannot have subdirectories, so find doesn't need to look up the metadata of the entries unless a condition such as -type requires it). So find . | wc -l is a portable, fast way to count files in a directory provided that the directory has no subdirectories and that no file name contains a newline.



      If the directory has no subdirectories but file names may contain newlines, try one of these (the second one should be faster if it's supported, but may not be noticeably so).



      find -print0 | tr -dc \0 | wc -c
      find -printf a | wc -c


      On the other hand, don't use find if the directory has subdirectories: even find . -maxdepth 1 calls stat on every entry (at least with GNU find and BusyBox find). You avoid sorting (#2) but you pay the price of an inode lookup (#3) which kills performance.



      In the shell without external tools, you can run count the files in the current directory with set -- *; echo $#. This misses dot files (files whose name begins with .) and reports 1 instead of 0 in an empty directory. This is the fastest way to count files in small directories because it doesn't require starting an external program, but (except in zsh) wastes time for larger directories due to the sorting step (#2).




      • In bash, this is a reliable way to count the files in the current directory:



        shopt -s dotglob nullglob
        a=(*)
        echo $#a[@]



      • In ksh93, this is a reliable way to count the files in the current directory:



        FIGNORE='@(.|..)'
        a=(~(N)*)
        echo $#a[@]



      • In zsh, this is a reliable way to count the files in the current directory:



        a=(*(DNoN))
        echo $#a


        If you have the mark_dirs option set, make sure to turn it off: a=(*(DNoN^M)).




      • In any POSIX shell, this is a reliable way to count the files in the current directory:



        total=0
        set -- *
        if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
        set -- .[!.]*
        if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
        set -- ..?*
        if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
        echo "$total"


      All of these methods sort the file names, except for the zsh one.






      share|improve this answer




















      • 1





        My empirical testing on >1 million files shows that find -maxdepth 1 easily keeps pace with ls -U as long as you don't add anything like a -type declaration that has to do further checks. Are you sure GNU find actually calls stat? Even the slowdown on find -type is nothing compared to how much ls -l bogs if you make it return file details. On the other hand the clear speed winner is zsh using the non sorting glob. (sorted globs are 2x slower than ls while the non-sorting one is 2x faster). I wonder if file system types would significantly effect these results.

        – Caleb
        Sep 11 '13 at 9:44












      • @Caleb I ran strace. This is only true if the directory has subdirectories: otherwise find's leaf directory optimization kicks in (even without -maxdepth 1), I should have mentioned that. A lot of things can affect the result, including the filesystem type (calling stat is a lot more expensive on filesystems that represent directories as linear lists than on filesystems that represent directories as trees), whether the inodes were all created together and are thus close by on the disk, cold or hot cache, etc.

        – Gilles
        Sep 11 '13 at 9:55






      • 1





        Historically, ls -f has been the reliable way to prevent calling stat - this is often simply described today as "output is not sorted" (which it also causes), and does include . and ... -A and -U are not standard options.

        – Random832
        Sep 11 '13 at 12:59







      • 1





        If you specifically want to count file with a common extension (or other string), inserting that into the command eliminates the extra 2. Here is an example: ls -afq *[0-9].pdb | wc -l

        – Steven C. Howell
        Jun 12 '15 at 13:18












      • FYI, with ksh93 version sh (AT&T Research) 93u+ 2012-08-01 on my Debian-based system, FIGNORE doesn't seem to work. The . and .. entries are included into the resulting array

        – Sergiy Kolodyazhnyy
        Jan 4 at 8:52















      56














      Short answer:



      ls -afq | wc -l


      (This includes . and .., so subtract 2.)




      When you list the files in a directory, three common things might happen:



      1. Enumerating the file names in the directory. This is inescapable: there is no way to count the files in a directory without enumerating them.

      2. Sorting the file names. Shell wildcards and the ls command do that.

      3. Calling stat to retrieve metadata about each directory entry, such as whether it is a directory.

      #3 is the most expensive by far, because it requires loading an inode for each file. In comparison all the file names needed for #1 are compactly stored in a few blocks. #2 wastes some CPU time but it is often not a deal breaker.



      If there are no newlines in file names, a simple ls -A | wc -l tells you how many files there are in the directory. Beware that if you have an alias for ls, this may trigger a call to stat (e.g. ls --color or ls -F need to know the file type, which requires a call to stat), so from the command line, call command ls -A | wc -l or ls -A | wc -l to avoid an alias.



      If there are newlines in the file name, whether newlines are listed or not depends on the Unix variant. GNU coreutils and BusyBox default to displaying ? for a newline, so they're safe.



      Call ls -f to list the entries without sorting them (#2). This automatically turns on -a (at least on modern systems). The -f option is in POSIX but with optional status; most implementations support it, but not BusyBox. The option -q replaces non-printable characters including newlines by ?; it's POSIX but isn't supported by BusyBox, so omit it if you need BusyBox support at the expense of overcounting files whose name contains a newline character.



      If the directory has no subdirectories, then most versions of find will not call stat on its entries (leaf directory optimization: a directory that has a link count of 2 cannot have subdirectories, so find doesn't need to look up the metadata of the entries unless a condition such as -type requires it). So find . | wc -l is a portable, fast way to count files in a directory provided that the directory has no subdirectories and that no file name contains a newline.



      If the directory has no subdirectories but file names may contain newlines, try one of these (the second one should be faster if it's supported, but may not be noticeably so).



      find -print0 | tr -dc \0 | wc -c
      find -printf a | wc -c


      On the other hand, don't use find if the directory has subdirectories: even find . -maxdepth 1 calls stat on every entry (at least with GNU find and BusyBox find). You avoid sorting (#2) but you pay the price of an inode lookup (#3) which kills performance.



      In the shell without external tools, you can run count the files in the current directory with set -- *; echo $#. This misses dot files (files whose name begins with .) and reports 1 instead of 0 in an empty directory. This is the fastest way to count files in small directories because it doesn't require starting an external program, but (except in zsh) wastes time for larger directories due to the sorting step (#2).




      • In bash, this is a reliable way to count the files in the current directory:



        shopt -s dotglob nullglob
        a=(*)
        echo $#a[@]



      • In ksh93, this is a reliable way to count the files in the current directory:



        FIGNORE='@(.|..)'
        a=(~(N)*)
        echo $#a[@]



      • In zsh, this is a reliable way to count the files in the current directory:



        a=(*(DNoN))
        echo $#a


        If you have the mark_dirs option set, make sure to turn it off: a=(*(DNoN^M)).




      • In any POSIX shell, this is a reliable way to count the files in the current directory:



        total=0
        set -- *
        if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
        set -- .[!.]*
        if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
        set -- ..?*
        if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
        echo "$total"


      All of these methods sort the file names, except for the zsh one.






      share|improve this answer




















      • 1





        My empirical testing on >1 million files shows that find -maxdepth 1 easily keeps pace with ls -U as long as you don't add anything like a -type declaration that has to do further checks. Are you sure GNU find actually calls stat? Even the slowdown on find -type is nothing compared to how much ls -l bogs if you make it return file details. On the other hand the clear speed winner is zsh using the non sorting glob. (sorted globs are 2x slower than ls while the non-sorting one is 2x faster). I wonder if file system types would significantly effect these results.

        – Caleb
        Sep 11 '13 at 9:44












      • @Caleb I ran strace. This is only true if the directory has subdirectories: otherwise find's leaf directory optimization kicks in (even without -maxdepth 1), I should have mentioned that. A lot of things can affect the result, including the filesystem type (calling stat is a lot more expensive on filesystems that represent directories as linear lists than on filesystems that represent directories as trees), whether the inodes were all created together and are thus close by on the disk, cold or hot cache, etc.

        – Gilles
        Sep 11 '13 at 9:55






      • 1





        Historically, ls -f has been the reliable way to prevent calling stat - this is often simply described today as "output is not sorted" (which it also causes), and does include . and ... -A and -U are not standard options.

        – Random832
        Sep 11 '13 at 12:59







      • 1





        If you specifically want to count file with a common extension (or other string), inserting that into the command eliminates the extra 2. Here is an example: ls -afq *[0-9].pdb | wc -l

        – Steven C. Howell
        Jun 12 '15 at 13:18












      • FYI, with ksh93 version sh (AT&T Research) 93u+ 2012-08-01 on my Debian-based system, FIGNORE doesn't seem to work. The . and .. entries are included into the resulting array

        – Sergiy Kolodyazhnyy
        Jan 4 at 8:52













      56












      56








      56







      Short answer:



      ls -afq | wc -l


      (This includes . and .., so subtract 2.)




      When you list the files in a directory, three common things might happen:



      1. Enumerating the file names in the directory. This is inescapable: there is no way to count the files in a directory without enumerating them.

      2. Sorting the file names. Shell wildcards and the ls command do that.

      3. Calling stat to retrieve metadata about each directory entry, such as whether it is a directory.

      #3 is the most expensive by far, because it requires loading an inode for each file. In comparison all the file names needed for #1 are compactly stored in a few blocks. #2 wastes some CPU time but it is often not a deal breaker.



      If there are no newlines in file names, a simple ls -A | wc -l tells you how many files there are in the directory. Beware that if you have an alias for ls, this may trigger a call to stat (e.g. ls --color or ls -F need to know the file type, which requires a call to stat), so from the command line, call command ls -A | wc -l or ls -A | wc -l to avoid an alias.



      If there are newlines in the file name, whether newlines are listed or not depends on the Unix variant. GNU coreutils and BusyBox default to displaying ? for a newline, so they're safe.



      Call ls -f to list the entries without sorting them (#2). This automatically turns on -a (at least on modern systems). The -f option is in POSIX but with optional status; most implementations support it, but not BusyBox. The option -q replaces non-printable characters including newlines by ?; it's POSIX but isn't supported by BusyBox, so omit it if you need BusyBox support at the expense of overcounting files whose name contains a newline character.



      If the directory has no subdirectories, then most versions of find will not call stat on its entries (leaf directory optimization: a directory that has a link count of 2 cannot have subdirectories, so find doesn't need to look up the metadata of the entries unless a condition such as -type requires it). So find . | wc -l is a portable, fast way to count files in a directory provided that the directory has no subdirectories and that no file name contains a newline.



      If the directory has no subdirectories but file names may contain newlines, try one of these (the second one should be faster if it's supported, but may not be noticeably so).



      find -print0 | tr -dc \0 | wc -c
      find -printf a | wc -c


      On the other hand, don't use find if the directory has subdirectories: even find . -maxdepth 1 calls stat on every entry (at least with GNU find and BusyBox find). You avoid sorting (#2) but you pay the price of an inode lookup (#3) which kills performance.



      In the shell without external tools, you can run count the files in the current directory with set -- *; echo $#. This misses dot files (files whose name begins with .) and reports 1 instead of 0 in an empty directory. This is the fastest way to count files in small directories because it doesn't require starting an external program, but (except in zsh) wastes time for larger directories due to the sorting step (#2).




      • In bash, this is a reliable way to count the files in the current directory:



        shopt -s dotglob nullglob
        a=(*)
        echo $#a[@]



      • In ksh93, this is a reliable way to count the files in the current directory:



        FIGNORE='@(.|..)'
        a=(~(N)*)
        echo $#a[@]



      • In zsh, this is a reliable way to count the files in the current directory:



        a=(*(DNoN))
        echo $#a


        If you have the mark_dirs option set, make sure to turn it off: a=(*(DNoN^M)).




      • In any POSIX shell, this is a reliable way to count the files in the current directory:



        total=0
        set -- *
        if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
        set -- .[!.]*
        if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
        set -- ..?*
        if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
        echo "$total"


      All of these methods sort the file names, except for the zsh one.






      share|improve this answer















      Short answer:



      ls -afq | wc -l


      (This includes . and .., so subtract 2.)




      When you list the files in a directory, three common things might happen:



      1. Enumerating the file names in the directory. This is inescapable: there is no way to count the files in a directory without enumerating them.

      2. Sorting the file names. Shell wildcards and the ls command do that.

      3. Calling stat to retrieve metadata about each directory entry, such as whether it is a directory.

      #3 is the most expensive by far, because it requires loading an inode for each file. In comparison all the file names needed for #1 are compactly stored in a few blocks. #2 wastes some CPU time but it is often not a deal breaker.



      If there are no newlines in file names, a simple ls -A | wc -l tells you how many files there are in the directory. Beware that if you have an alias for ls, this may trigger a call to stat (e.g. ls --color or ls -F need to know the file type, which requires a call to stat), so from the command line, call command ls -A | wc -l or ls -A | wc -l to avoid an alias.



      If there are newlines in the file name, whether newlines are listed or not depends on the Unix variant. GNU coreutils and BusyBox default to displaying ? for a newline, so they're safe.



      Call ls -f to list the entries without sorting them (#2). This automatically turns on -a (at least on modern systems). The -f option is in POSIX but with optional status; most implementations support it, but not BusyBox. The option -q replaces non-printable characters including newlines by ?; it's POSIX but isn't supported by BusyBox, so omit it if you need BusyBox support at the expense of overcounting files whose name contains a newline character.



      If the directory has no subdirectories, then most versions of find will not call stat on its entries (leaf directory optimization: a directory that has a link count of 2 cannot have subdirectories, so find doesn't need to look up the metadata of the entries unless a condition such as -type requires it). So find . | wc -l is a portable, fast way to count files in a directory provided that the directory has no subdirectories and that no file name contains a newline.



      If the directory has no subdirectories but file names may contain newlines, try one of these (the second one should be faster if it's supported, but may not be noticeably so).



      find -print0 | tr -dc \0 | wc -c
      find -printf a | wc -c


      On the other hand, don't use find if the directory has subdirectories: even find . -maxdepth 1 calls stat on every entry (at least with GNU find and BusyBox find). You avoid sorting (#2) but you pay the price of an inode lookup (#3) which kills performance.



      In the shell without external tools, you can run count the files in the current directory with set -- *; echo $#. This misses dot files (files whose name begins with .) and reports 1 instead of 0 in an empty directory. This is the fastest way to count files in small directories because it doesn't require starting an external program, but (except in zsh) wastes time for larger directories due to the sorting step (#2).




      • In bash, this is a reliable way to count the files in the current directory:



        shopt -s dotglob nullglob
        a=(*)
        echo $#a[@]



      • In ksh93, this is a reliable way to count the files in the current directory:



        FIGNORE='@(.|..)'
        a=(~(N)*)
        echo $#a[@]



      • In zsh, this is a reliable way to count the files in the current directory:



        a=(*(DNoN))
        echo $#a


        If you have the mark_dirs option set, make sure to turn it off: a=(*(DNoN^M)).




      • In any POSIX shell, this is a reliable way to count the files in the current directory:



        total=0
        set -- *
        if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
        set -- .[!.]*
        if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
        set -- ..?*
        if [ $# -ne 1 ] || [ -e "$1" ] || [ -L "$1" ]; then total=$((total+$#)); fi
        echo "$total"


      All of these methods sort the file names, except for the zsh one.







      share|improve this answer














      share|improve this answer



      share|improve this answer








      edited Jan 4 at 9:35

























      answered Sep 11 '13 at 0:30









      GillesGilles

      532k12810651592




      532k12810651592







      • 1





        My empirical testing on >1 million files shows that find -maxdepth 1 easily keeps pace with ls -U as long as you don't add anything like a -type declaration that has to do further checks. Are you sure GNU find actually calls stat? Even the slowdown on find -type is nothing compared to how much ls -l bogs if you make it return file details. On the other hand the clear speed winner is zsh using the non sorting glob. (sorted globs are 2x slower than ls while the non-sorting one is 2x faster). I wonder if file system types would significantly effect these results.

        – Caleb
        Sep 11 '13 at 9:44












      • @Caleb I ran strace. This is only true if the directory has subdirectories: otherwise find's leaf directory optimization kicks in (even without -maxdepth 1), I should have mentioned that. A lot of things can affect the result, including the filesystem type (calling stat is a lot more expensive on filesystems that represent directories as linear lists than on filesystems that represent directories as trees), whether the inodes were all created together and are thus close by on the disk, cold or hot cache, etc.

        – Gilles
        Sep 11 '13 at 9:55






      • 1





        Historically, ls -f has been the reliable way to prevent calling stat - this is often simply described today as "output is not sorted" (which it also causes), and does include . and ... -A and -U are not standard options.

        – Random832
        Sep 11 '13 at 12:59







      • 1





        If you specifically want to count file with a common extension (or other string), inserting that into the command eliminates the extra 2. Here is an example: ls -afq *[0-9].pdb | wc -l

        – Steven C. Howell
        Jun 12 '15 at 13:18












      • FYI, with ksh93 version sh (AT&T Research) 93u+ 2012-08-01 on my Debian-based system, FIGNORE doesn't seem to work. The . and .. entries are included into the resulting array

        – Sergiy Kolodyazhnyy
        Jan 4 at 8:52












      • 1





        My empirical testing on >1 million files shows that find -maxdepth 1 easily keeps pace with ls -U as long as you don't add anything like a -type declaration that has to do further checks. Are you sure GNU find actually calls stat? Even the slowdown on find -type is nothing compared to how much ls -l bogs if you make it return file details. On the other hand the clear speed winner is zsh using the non sorting glob. (sorted globs are 2x slower than ls while the non-sorting one is 2x faster). I wonder if file system types would significantly effect these results.

        – Caleb
        Sep 11 '13 at 9:44












      • @Caleb I ran strace. This is only true if the directory has subdirectories: otherwise find's leaf directory optimization kicks in (even without -maxdepth 1), I should have mentioned that. A lot of things can affect the result, including the filesystem type (calling stat is a lot more expensive on filesystems that represent directories as linear lists than on filesystems that represent directories as trees), whether the inodes were all created together and are thus close by on the disk, cold or hot cache, etc.

        – Gilles
        Sep 11 '13 at 9:55






      • 1





        Historically, ls -f has been the reliable way to prevent calling stat - this is often simply described today as "output is not sorted" (which it also causes), and does include . and ... -A and -U are not standard options.

        – Random832
        Sep 11 '13 at 12:59







      • 1





        If you specifically want to count file with a common extension (or other string), inserting that into the command eliminates the extra 2. Here is an example: ls -afq *[0-9].pdb | wc -l

        – Steven C. Howell
        Jun 12 '15 at 13:18












      • FYI, with ksh93 version sh (AT&T Research) 93u+ 2012-08-01 on my Debian-based system, FIGNORE doesn't seem to work. The . and .. entries are included into the resulting array

        – Sergiy Kolodyazhnyy
        Jan 4 at 8:52







      1




      1





      My empirical testing on >1 million files shows that find -maxdepth 1 easily keeps pace with ls -U as long as you don't add anything like a -type declaration that has to do further checks. Are you sure GNU find actually calls stat? Even the slowdown on find -type is nothing compared to how much ls -l bogs if you make it return file details. On the other hand the clear speed winner is zsh using the non sorting glob. (sorted globs are 2x slower than ls while the non-sorting one is 2x faster). I wonder if file system types would significantly effect these results.

      – Caleb
      Sep 11 '13 at 9:44






      My empirical testing on >1 million files shows that find -maxdepth 1 easily keeps pace with ls -U as long as you don't add anything like a -type declaration that has to do further checks. Are you sure GNU find actually calls stat? Even the slowdown on find -type is nothing compared to how much ls -l bogs if you make it return file details. On the other hand the clear speed winner is zsh using the non sorting glob. (sorted globs are 2x slower than ls while the non-sorting one is 2x faster). I wonder if file system types would significantly effect these results.

      – Caleb
      Sep 11 '13 at 9:44














      @Caleb I ran strace. This is only true if the directory has subdirectories: otherwise find's leaf directory optimization kicks in (even without -maxdepth 1), I should have mentioned that. A lot of things can affect the result, including the filesystem type (calling stat is a lot more expensive on filesystems that represent directories as linear lists than on filesystems that represent directories as trees), whether the inodes were all created together and are thus close by on the disk, cold or hot cache, etc.

      – Gilles
      Sep 11 '13 at 9:55





      @Caleb I ran strace. This is only true if the directory has subdirectories: otherwise find's leaf directory optimization kicks in (even without -maxdepth 1), I should have mentioned that. A lot of things can affect the result, including the filesystem type (calling stat is a lot more expensive on filesystems that represent directories as linear lists than on filesystems that represent directories as trees), whether the inodes were all created together and are thus close by on the disk, cold or hot cache, etc.

      – Gilles
      Sep 11 '13 at 9:55




      1




      1





      Historically, ls -f has been the reliable way to prevent calling stat - this is often simply described today as "output is not sorted" (which it also causes), and does include . and ... -A and -U are not standard options.

      – Random832
      Sep 11 '13 at 12:59






      Historically, ls -f has been the reliable way to prevent calling stat - this is often simply described today as "output is not sorted" (which it also causes), and does include . and ... -A and -U are not standard options.

      – Random832
      Sep 11 '13 at 12:59





      1




      1





      If you specifically want to count file with a common extension (or other string), inserting that into the command eliminates the extra 2. Here is an example: ls -afq *[0-9].pdb | wc -l

      – Steven C. Howell
      Jun 12 '15 at 13:18






      If you specifically want to count file with a common extension (or other string), inserting that into the command eliminates the extra 2. Here is an example: ls -afq *[0-9].pdb | wc -l

      – Steven C. Howell
      Jun 12 '15 at 13:18














      FYI, with ksh93 version sh (AT&T Research) 93u+ 2012-08-01 on my Debian-based system, FIGNORE doesn't seem to work. The . and .. entries are included into the resulting array

      – Sergiy Kolodyazhnyy
      Jan 4 at 8:52





      FYI, with ksh93 version sh (AT&T Research) 93u+ 2012-08-01 on my Debian-based system, FIGNORE doesn't seem to work. The . and .. entries are included into the resulting array

      – Sergiy Kolodyazhnyy
      Jan 4 at 8:52













15

find /foo/foo2/ -maxdepth 1 | wc -l

Is considerably faster on my machine, but the local . directory is added to the count.

answered Sep 10 '13 at 20:40
Joel Taylor

  • 1

    Thanks. I'm compelled to ask a silly question though: why is it faster? Because it's not bothering to look up file attributes?

    – Mike B
    Sep 10 '13 at 20:42

  • 2

    Yes, that's my understanding. As long as you're not using the -type parameter, find should be faster than ls.

    – Joel Taylor
    Sep 10 '13 at 21:02

  • 1

    Hmmm.... if I'm understanding the documentation of find well, this should actually be better than my answer. Anyone with more experience can verify?

    – Luis Machuca
    Sep 11 '13 at 2:38

  • Add a -mindepth 1 to omit the directory itself.

    – Stéphane Chazelas
    Jan 4 at 9:53
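
Building on Stéphane Chazelas's comment above, a minimal sketch of the same approach that leaves the directory itself out of the count (assuming GNU find and the /foo/foo2 path from the question):

# Count directory entries without counting /foo/foo2 itself.
# Each entry still produces one output line, so a file name containing
# a newline would be counted more than once.
find /foo/foo2/ -mindepth 1 -maxdepth 1 | wc -l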











8

ls -1U before the pipe should spend just a bit less resources, as it makes no attempt to sort the file entries; it just reads them in the order they are stored in the folder on disk. It also produces less output, meaning slightly less work for wc.

You could also use ls -f, which is more or less a shortcut for ls -1aU.

I don't know if there is a resource-efficient way to do it via a command without piping though.

edited Oct 30 '16 at 16:46
Jeff Schaller
answered Sep 10 '13 at 19:42
Luis Machuca

  • 8

    Btw, -1 is implied when the output goes to a pipe

    – enzotib
    Sep 10 '13 at 21:04

  • @enzotib - it is? Wow... one learns something new every day!

    – Luis Machuca
    Sep 10 '13 at 21:25
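
To make the answer concrete, a small sketch against the question's /foo/foo2 directory (assuming GNU ls; as noted in the comments near the top of this page, -U and -A are not standard options):

# No sorting, one name per line (hidden files are not included)
ls -1U /foo/foo2/ | wc -l

# ls -f also skips sorting, but it additionally counts hidden files
# plus the . and .. entries
ls -f /foo/foo2/ | wc -l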











6

Another point of comparison. While not being a shell one-liner, this C program doesn't do anything superfluous. Note that hidden files are ignored to match the output of ls|wc -l (ls -l|wc -l is off by one due to the total blocks in the first line of output).

#include <stdio.h>
#include <stdlib.h>
#include <dirent.h>
#include <error.h>
#include <errno.h>

int main(int argc, char *argv[])
{
    int file_count = 0;
    DIR * dirp;
    struct dirent * entry;

    if (argc < 2)
        error(EXIT_FAILURE, 0, "missing argument");

    if (!(dirp = opendir(argv[1])))
        error(EXIT_FAILURE, errno, "could not open '%s'", argv[1]);

    while ((entry = readdir(dirp)) != NULL) {
        if (entry->d_name[0] == '.') /* ignore hidden files */
            continue;

        file_count++;
    }
    closedir(dirp);

    printf("%d\n", file_count);
}

edited Sep 10 '13 at 21:25
answered Sep 10 '13 at 20:50
Thomas Nyman

  • Using the readdir() stdio API does add some overhead and does not give you control over the size of the buffer passed to the underlying system call (getdents on Linux)

    – Stéphane Chazelas
    Jan 4 at 9:41
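
One way to build and run the program above, sketched with an assumed source file name of count_visible.c (any name works) and the question's directory:

# error() comes from <error.h>, a GNU extension, so this assumes glibc
gcc -O2 -o count_visible count_visible.c
./count_visible /foo/foo2/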











3

You could try perl -e 'opendir($dh,".");$i=0;while(readdir $dh){$i++}print "$i\n";'

It'd be interesting to compare timings with your shell pipe.

answered Sep 10 '13 at 20:00
Doug O'Neal

  • On my tests, this keeps pretty much exactly the same pace as the three other fastest solutions (find -maxdepth 1 | wc -l, ls -AU | wc -l and the zsh-based non-sorting glob and array count). In other words it beats out the options with various inefficiencies such as sorting or reading extraneous file properties. I would venture to say that since it doesn't earn you anything either, it isn't worth using over a simpler solution unless you happen to be in perl already :)

    – Caleb
    Sep 11 '13 at 9:53

  • Note that this will include the . and .. directory entries in the count, so you need to subtract two to get the actual number of files (and subdirectories). In modern Perl, perl -E 'opendir $dh, "."; $i++ while readdir $dh; say $i - 2' would do it.

    – Ilmari Karonen
    Sep 11 '13 at 10:36
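
Following up on the "compare timings" remark, a minimal sketch using the shell's time keyword; both commands count the . and .. entries as well, so the two figures are directly comparable:

cd /foo/foo2/
time ls -f | wc -l
time perl -e 'opendir($dh,".");$i=0;while(readdir $dh){$i++}print "$i\n";'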











2

From this answer, I can think of this one as a possible solution.

/*
 * List directories using getdents() because ls, find and Python libraries
 * use readdir(), which is slower (but uses getdents() underneath).
 *
 * Compile with
 * ]$ gcc getdents.c -o getdents
 */
#define _GNU_SOURCE
#include <dirent.h>     /* Defines DT_* constants */
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>
#include <stdlib.h>
#include <sys/stat.h>
#include <sys/syscall.h>

#define handle_error(msg) \
    do { perror(msg); exit(EXIT_FAILURE); } while (0)

struct linux_dirent {
    long           d_ino;
    off_t          d_off;
    unsigned short d_reclen;
    char           d_name[];
};

#define BUF_SIZE 1024*1024*5

int
main(int argc, char *argv[])
{
    int fd, nread;
    char buf[BUF_SIZE];
    struct linux_dirent *d;
    int bpos;
    char d_type;

    fd = open(argc > 1 ? argv[1] : ".", O_RDONLY | O_DIRECTORY);
    if (fd == -1)
        handle_error("open");

    /* The snippet in the original answer was truncated from this point on;
       the read loop below is reconstructed after the getdents(2) man page
       example: one output line per regular file. */
    for (;;) {
        nread = syscall(SYS_getdents, fd, buf, BUF_SIZE);
        if (nread == -1)
            handle_error("getdents");
        if (nread == 0)
            break;

        for (bpos = 0; bpos < nread;) {
            d = (struct linux_dirent *) (buf + bpos);
            d_type = *(buf + bpos + d->d_reclen - 1);
            if (d->d_ino != 0 && d_type == DT_REG)
                printf("%s\n", d->d_name);
            bpos += d->d_reclen;
        }
    }

    close(fd);
    exit(EXIT_SUCCESS);
}

Copy the C program above into the directory in which the files need to be listed. Then execute the below commands.

gcc getdents.c -o getdents
./getdents | wc -l

edited Apr 13 '17 at 12:36
Community
answered Aug 7 '14 at 23:02
Ramesh

  • 1

    A few things: 1) if you're willing to use a custom program for this, you might as well just count the files and print the count; 2) to compare with ls -f, don't filter on d_type at all, just on d->d_ino != 0; 3) subtract 2 for . and ...

    – Matei David
    Jan 17 '17 at 16:01

  • See linked answer for a timings example where this is 40x faster than the accepted ls -f.

    – Matei David
    Jan 17 '17 at 16:02












1

A bash-only solution, not requiring any external program, but I don't know how efficient it is:

list=(*)
echo "${#list[@]}"

answered Sep 10 '13 at 20:55
enzotib

  • Glob expansion isn't necessarily the most resource-efficient way to do this. Besides most shells having an upper limit to the number of items they will even process, so this will probably bomb when dealing with a million plus items, it also sorts the output. The solutions involving find or ls without sorting options will be faster.

    – Caleb
    Sep 11 '13 at 6:37

  • @Caleb, only old versions of ksh had such limits (and didn't support that syntax) AFAIK. In most other shells, the limit is just the available memory. You've got a point that it's going to be very inefficient, especially in bash.

    – Stéphane Chazelas
    Jan 4 at 9:45
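
A slightly more careful variant of the same idea, sketched for bash: nullglob keeps an empty directory from being reported as 1, and dotglob pulls in hidden files (globs still never match . and ..):

shopt -s nullglob dotglob   # empty dir counts as 0; include hidden files
list=(/foo/foo2/*)
echo "${#list[@]}"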











1

Probably the most resource efficient way would involve no outside process invocations. So I'd wager on...

cglb() ( c=0 ; set --
    tglb() { [ -L "$2" ] &&
        c=$(($c+$#-1))
    }
    for glb in '.?*' *
    do  tglb $1 ${glb##.*} ${glb#*}
        set -- ..
    done
    echo $c
)

edited Aug 8 '14 at 0:37
answered Aug 7 '14 at 22:42
mikeserv

  • 1

    Got relative numbers? for how many files?

    – smci
    Nov 20 '17 at 23:44











0

After fixing the issue from @Joel's answer, where it added . as a file:

find /foo/foo2 -maxdepth 1 | tail -n +2 | wc -l

tail simply removes the first line, meaning that . isn't counted anymore.

answered Sep 11 '13 at 4:23
haneefmubarak

  • 1

    Adding a pair of pipes in order to omit one line of wc input is not very efficient as the overhead increases linearly with regard to input size. In this case, why not simply decrement the final count to compensate for it being off by one, which is a constant time operation: echo $(( $(find /foo/foo2 -maxdepth 1 | wc -l) - 1))

    – Thomas Nyman
    Sep 11 '13 at 6:32

  • 1

    Rather than feed that much data through another process, it would probably be better to just do some math on the final output. let count = $(find /foo/foo2 -maxdepth 1 | wc -l) - 2

    – Caleb
    Sep 11 '13 at 6:34












0

os.listdir() in python can do the work for you. It gives an array of the contents of the directory, excluding the special '.' and '..' entries. Also, no need to worry about files with special characters like '\n' in the name.

python -c 'import os;print len(os.listdir("."))'

Following is the time taken by the above python command compared with the 'ls -Af' command.

~/test$ time ls -Af | wc -l
399144

real    0m0.300s
user    0m0.104s
sys     0m0.240s
~/test$ time python -c 'import os;print len(os.listdir("."))'
399142

real    0m0.249s
user    0m0.064s
sys     0m0.180s

answered Sep 16 '13 at 20:47
indrajeet
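
On Python 3, os.scandir() expresses the same idea with an iterator that yields directory entries lazily instead of materializing the whole list, which is friendlier to memory on directories this large; a sketch, assuming Python 3.5+ is available as python3 (like os.listdir(), it skips . and ..):

python3 -c 'import os; print(sum(1 for _ in os.scandir(".")))'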





















0

ls -1 | wc -l comes immediately to my mind. Whether ls -1U is faster than ls -1 is purely academic - the difference should be negligible but for very large directories.

answered Aug 7 '14 at 22:58
countermode





















0

I know this is old but I feel that awk has to be mentioned here. The suggestions that include the use of wc simply aren't correct in regards to OP's question: "the most resource efficient way." I recently had a log file get way out of control (due to some bad software) and therefore stumbled onto this post. There were roughly 232 million entries! I first tried wc -l and waited 15 minutes - it was not even able to finish counting the lines. The following awk statement gave me an accurate line count in 3 minutes on that log file. I've learned over the years to never underestimate awk's ability to simulate standard shell programs in a much more efficient fashion. Hope it helps someone like me. Happy hacking!

awk 'BEGIN{i=0}{i++}END{print i}' /foo/foo2

And if you need to substitute a command like ls for counting files in a directory:

`#Normal:` awk 'BEGIN{i=0}{i++}END{print i}' <(ls /foo/foo2/)
`#Hidden:` awk 'BEGIN{i=0}{i++}END{print (i-2)}' <(ls -f /foo/foo2/)

edited Jun 20 '16 at 8:35
Pierre.Vriens
answered Dec 23 '15 at 4:53
user.friendly

  • Or simply, awk 'END{print NR}'. But in this particular situation, awk may be overkill because ls is the bottleneck, not wc.

    – Amit Naidu
    May 29 '18 at 5:19
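
Putting the answer together with Amit Naidu's comment, a short sketch for the directory-counting case; the - 2 compensates for the . and .. entries that ls -f emits:

# unsorted listing, entry count minus . and ..
ls -f /foo/foo2/ | awk 'END { print NR - 2 }'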











-2

I would think echo * would be more efficient than any 'ls' command:

echo * | wc -w

answered Sep 11 '13 at 20:33
Dan Garthwaite

  • 4

    What about files with a space in their name? echo 'Hello World'|wc -w produces 2.

    – Joseph R.
    Sep 11 '13 at 20:52

  • @JosephR. Caveat Emptor

    – Dan Garthwaite
    Sep 12 '13 at 0:59
















