PerlMonks  

Fastest way to calculate directory size in linux

by Anonymous Monk
on Feb 01, 2012 at 10:44 UTC ( #951173=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi

What is the fastest way to calculate directory sizes in Linux (that also works on Windows)? I have used the File::Find module, but it is very slow. There is also the Win32::OLE module, but it is Windows-only. Is there any other way to do a fast directory size calculation?

Re: Fastest way to calculate directory size in linux
by Anonymous Monk on Feb 01, 2012 at 10:51 UTC
    du from GNU coreutils.

    • http://gnuwin32.sf.net/packages/coreutils.htm
    • http://sf.net/projects/mingw/files/MSYS/Base/coreutils/coreutils-5.97-3/
    • http://cygwin.com/packages/coreutils/

      Can't that be done by a Perl script?

      I just did some testing with du on my Windows machine. It does "work", but it is not fast; I doubt it is any faster than File::Find. It visits every file and adds up the sizes, which takes a while! It's a lot slower than the "Properties" button in the Windows file manager. Satisfying both "fast" and "multi-platform" simultaneously is a pretty tall order.

      Maybe if we look at your File::Find code, we can see some improvements?

      Update: This is the fastest way I know of that also satisfies "multi-platform". This code runs on my Windows machine.

      #!/usr/bin/perl -w
      use strict;
      use File::Find;

      my $byte_total = 0;
      my $file_total = 0;

      find (\&sum_bytes, "C:/temp");

      sub sum_bytes
      {
          # -f performs the one stat call for this entry
          return() unless (-f $File::Find::name);
          $byte_total += -s _;  # not a typo!
                                # "_" different than "$_"!
                                # see [id://951317]
          $file_total++;
      }

      print "total bytes in C:/temp: $byte_total in $file_total files\n";

      # prints on my Windows system:
      # total bytes in C:/temp: 656201485 in 2554 files
      Only one "expensive" file operation is run for each file system entry: the -f test performs a "stat" query, and the results of that query are re-used by the subsequent -s test through the special "_" filehandle. This is multi-platform, but not completely optimal for Windows NTFS.

      You can get "fastest" and you can get "multi-platform", but not both together. There is a Windows NTFS way to get this number faster. But this is the "fastest" multi-platform solution of which I am aware.
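The stat re-use described above can also be wrapped in a reusable function. The following is only a sketch: the name dir_size is made up for illustration, and it uses lstat instead of the implicit stat in the post, so symbolic links are skipped rather than followed (an assumption about what you want, not something the original code does).

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;

# dir_size() is a hypothetical helper name, not part of File::Find.
# Returns (total bytes, file count) for all plain files under $dir.
sub dir_size {
    my ($dir) = @_;
    my ($bytes, $files) = (0, 0);
    find(
        sub {
            # One lstat per entry; Perl caches the result in "_".
            lstat($_) or return;
            return unless -f _;    # re-uses the cached result
            $bytes += -s _;        # re-uses it again, no extra query
            $files++;
        },
        $dir
    );
    return ($bytes, $files);
}

# Report every directory named on the command line (none: do nothing).
for my $dir (@ARGV) {
    my ($bytes, $files) = dir_size($dir);
    print "total bytes in $dir: $bytes in $files files\n";
}
```

Run it as, for example, perl dirsize.pl C:/temp (the script name is arbitrary). It still has to visit every entry, so do not expect it to beat du by much.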

        File::Find has to visit all files as well... And while I don't claim to have any Windows related knowledge, I do know that on Unix, you'll have to visit all files (or rather, their inodes) to calculate the "size of the directory".

        Thanks for the script, but it doesn't give the correct size of the directory (maybe an issue with system files, etc.).
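One possible reason for a mismatch (a guess, since the post does not say which number it was compared against) is that -s adds up apparent file lengths, while tools such as du report allocated disk blocks. A rough sketch for comparing the two follows. The name size_report is hypothetical, and the 512-byte block size is an assumption: perlfunc only says blocks are "often, but not always" 512 bytes, and on Windows the blocks field is empty, so the second number will be 0 there.

```perl
use strict;
use warnings;
use File::Find;

# size_report() is a hypothetical helper, named here for illustration.
# Returns (apparent bytes of plain files, allocated bytes of all entries).
sub size_report {
    my ($dir) = @_;
    my ($apparent, $allocated) = (0, 0);
    find(
        sub {
            my @st = lstat($_) or return;
            # Field 7 is the length in bytes; count plain files only.
            $apparent += $st[7] if -f _;
            # Field 12 is the allocated block count (empty on Windows).
            # ASSUMPTION: 512 bytes per block.
            $allocated += 512 * ($st[12] || 0);
        },
        $dir
    );
    return ($apparent, $allocated);
}

for my $dir (@ARGV) {
    my ($apparent, $allocated) = size_report($dir);
    print "$dir: $apparent bytes apparent, $allocated bytes allocated\n";
}
```

Unlike du, this sketch counts a hard-linked file once per link, so the allocated figure can still differ from what du prints.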

Re: Fastest way to calculate directory size in linux
by mr_magoo (Initiate) on Feb 01, 2012 at 20:16 UTC
    Let Google be your friend!
    Found this (did not test it - will leave that to you) with
    the following search: "calculating directory size perl"

    At the following URL:

    http://bytes.com/topic/perl/answers/603354-calculate-size-all-files-directory

    Take a look at "docsnyder's" post

      This script seems to be recursive or something, as it fails with "out of memory".

Re: Fastest way to calculate directory size in linux
by DrHyde (Prior) on Feb 02, 2012 at 10:57 UTC
    If all you care about is the size of the directory, just use stat:

    $ perl -e 'print +(stat("/tmp"))[7]'
    49152

    I have no idea whether stat() works on Windows or not, but it's not mentioned in perlport.
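Note that the one-liner above reports the size of the directory entry itself, not of the files inside it. A small sketch contrasting the two numbers follows; both helper names are made up for illustration.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: size field of the directory entry itself.
sub dir_entry_size {
    my ($dir) = @_;
    return (stat($dir))[7];
}

# Hypothetical helper: summed length of all plain files below $dir.
sub dir_contents_size {
    my ($dir) = @_;
    my $total = 0;
    # -f $_ does the stat; -s _ re-uses its cached result.
    find(sub { $total += -s _ if -f $_ }, $dir);
    return $total;
}

for my $dir (@ARGV) {
    printf "%s: entry %s bytes, contents %d bytes\n",
        $dir, dir_entry_size($dir) // 'unknown', dir_contents_size($dir);
}
```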
