PerlMonks  

Re: Fastest way to calculate directory size in linux

by Anonymous Monk
on Feb 01, 2012 at 10:51 UTC [id://951174]


in reply to Fastest way to calculate directory size in linux

du from GNU coreutils.

  • http://gnuwin32.sf.net/packages/coreutils.htm
  • http://sf.net/projects/mingw/files/MSYS/Base/coreutils/coreutils-5.97-3/
  • http://cygwin.com/packages/coreutils/
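
For illustration, here is a minimal sketch of getting a directory total from Perl by shelling out to du (this assumes a GNU du is on the PATH; the C:/temp path is only an example):

    use strict;
    use warnings;

    # Ask GNU du for a single summarized total in bytes.
    #   -s  print one line for the whole tree
    #   -b  report apparent size in bytes (GNU extension)
    my $dir = "C:/temp";                 # hypothetical directory to measure
    my $out = `du -sb "$dir"`;
    die "du failed: $?" if $? != 0;
    my ($bytes) = split /\t/, $out;      # du prints "SIZE<TAB>PATH"
    print "total bytes in $dir: $bytes\n";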

Re^2: Fastest way to calculate directory size in linux
by Marshall (Canon) on Feb 01, 2012 at 11:23 UTC
    I just did some testing with du on my Windows machine. It does "work", but it is not fast - I doubt it is any faster than File::Find. It visits every file and adds the sizes up, which takes a while! It's a lot slower than the "Properties" button in the Windows file manager. Satisfying both "fast" and "multi-platform" simultaneously is a pretty tall order.

    Maybe if we look at your File::Find code, we can see some improvements?

    Update: This is the fastest way I know of that also satisfies "multi-platform". This code runs on my Windows machine.

    #!/usr/bin/perl -w
    use strict;
    use File::Find;

    my $byte_total;
    my $file_total;

    find (\&sum_bytes, "C:/temp");

    sub sum_bytes
    {
        return() unless (-f $File::Find::name);
        $byte_total += -s _;   # not a typo!
                               # "_" different than "$_"!
                               # see [id://951317]
        $file_total++;
    }
    print "total bytes in C:/temp: $byte_total in $file_total files\n";

    # prints on my Windows system:
    # total bytes in C:/temp: 656201485 in 2554 files
    One "expensive" file operation is run for each file system entry. The results of that "file system query", "stat" operation are re-used in a subsequent file operation. This is multi-platform, but not completely optimal for Windows NTFS.

    You can get "fastest" and you can get "multi-platform", but not both together. There is a Windows NTFS way to get this number faster. But this is the "fastest" multi-platform solution of which I am aware.

      File::Find has to visit all files as well... And while I don't claim to have any Windows related knowledge, I do know that on Unix, you'll have to visit all files (or rather, their inodes) to calculate the "size of the directory".

        Indeed, if there were a faster, portable way to calculate the space taken up by a directory, I'm sure that the GNU folk would have coded "du" to use the faster method.

      Thanks for the script, but it doesn't give the correct size of the directory (maybe an issue with some system files, etc.).

Re^2: Fastest way to calculate directory size in linux
by Anonymous Monk on Feb 01, 2012 at 11:12 UTC

    Can't that be done by a Perl script?
