
fast count files

by gautamparimoo (Beadle)
on Jan 06, 2012 at 07:35 UTC (#946545=perlquestion)

gautamparimoo has asked for the wisdom of the Perl Monks concerning the following question:

I have tried File::Name and readdir to count the files on a drive, but both get slow as the disk size increases. Is there a way to get just the total count of files on a drive very quickly (e.g. 10 minutes for 200 GB)? Any help is appreciated.

Replies are listed 'Best First'.
Re: fast count files
by BrowserUk (Pope) on Jan 06, 2012 at 08:17 UTC

    What OS? Which filesystem?

      Windows OS, NTFS filesystem.

        If you literally just want to count the files on the entire disk, this is by far the fastest simple method I know of.

        It counts the 1.2 million files on my cold-cache, 640GB (400GB used) drive in a little under 7 minutes:

        $t = time; $n = `attrib /s c:\\* | wc -l`; printf "$n : %.f\n", time() - $t;
        1233597 : 394

        Try it and see how you fare. I vaguely remember finding a faster method years ago, and I'll try to remember enough to look it up.

        Note: Don't do this:

        my @files = `attrib /s c:\\*`; my $n = scalar @files;

        All the memory allocation slows things down horribly.
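
If wc is not available on the Windows box, the same "count as you go, never build a list" idea can be sketched in Perl alone by reading the output of attrib through a pipe one line at a time. This is only a minimal sketch and has not been timed against the one-liner above; the name count_lines and the use of a command-line argument for the drive pattern are assumptions made for illustration.

```perl
use strict;
use warnings;

# Count lines from an open filehandle one at a time, so memory use stays
# constant no matter how many files the listing contains.
# (count_lines is a hypothetical helper name, not part of any module.)
sub count_lines {
    my ($fh) = @_;
    my $n = 0;
    while (defined(my $line = <$fh>)) {
        $n++;
    }
    return $n;
}

# Assumed usage: pass a pattern such as c:\* as the first argument.
# Nothing is run when no argument is given.
if (@ARGV) {
    my $start = time;
    open(my $pipe, '-|', "attrib /s $ARGV[0]")
        or die "Cannot run attrib: $!";
    my $n = count_lines($pipe);
    close($pipe);
    printf "%d : %d\n", $n, time() - $start;
}
```

Each line is discarded as soon as it has been counted, so no array of a million-plus entries is ever allocated.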

        With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
        Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
        "Science is about questioning the status quo. Questioning authority".
        In the absence of evidence, opinion is indistinguishable from prejudice.

        The start of some sanity?

Re: fast count files
by umasuresh (Hermit) on Jan 09, 2012 at 12:53 UTC
    Try GNU parallel
    find /current_dir/ -type f | parallel -k -j150% -n 1000 -m stat -c %s > sizeslist && awk '{sum+=$1}END{print "Total Bytes:",sum}' sizeslist
Re: fast count files
by spx2 (Deacon) on Jan 09, 2012 at 11:19 UTC
    If your only concern is speed, I think you should rewrite it in C. (The first place I'd start looking is the source of du or df.)

Node Type: perlquestion [id://946545]
Approved by planetscape
Front-paged by toolic