Using stat() to get the total size of files in a folder

by joshua (Pilgrim)
on May 24, 2002 at 04:18 UTC (node #168974)
joshua has asked for the wisdom of the Perl Monks concerning the following question:

I'm using stat() to get the total size of the files in a given folder. I've been using something like this.
    my $size = 0;
    opendir(FOO, "foo") or error_stuff();
    foreach (readdir(FOO)) {
        $size += (stat("foo/$_"))[7] unless /^\.\.?$/;
    }
    closedir(FOO);

This seems like a lot of code for a simple task. But if I use my $size = (stat("foo"))[7]; instead, it returns 4096, while the total size of the files in that directory is 30981 bytes.

Is there an easier way to do this, or is this my best bet?


Edit kudra, 2002-05-26 Changed title

Re: Using stat()
by samtregar (Abbot) on May 24, 2002 at 04:51 UTC
    How about:

    use File::Find; my $size = 0; find(sub { $size += -s if -f $_ }, "foo");

    That will recurse down into subdirectories. If that's not what you want, you could use:

    use File::Find; my $size = 0; find(sub { $size += -s if $File::Find::name eq "foo/$_" and -f $_ }, "foo");
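    The second snippet above only counts top-level files, but File::Find still walks every subdirectory to get there. A minimal sketch of an alternative, assuming the documented $File::Find::prune variable is acceptable here: setting it inside the callback tells find() not to descend into the directory it is currently visiting. The helper name top_level_size is hypothetical, not something from this thread.

```perl
use strict;
use warnings;
use File::Find;

# Hypothetical helper: sum the sizes of the plain files directly
# inside $dir without descending into any subdirectory.
sub top_level_size {
    my ($dir) = @_;
    my $size = 0;
    find(sub {
        if (-d $_) {
            # $_ is "." when File::Find visits the starting directory itself;
            # every other directory is pruned so find() never enters it.
            $File::Find::prune = 1 unless $_ eq '.';
            return;
        }
        # "_" reuses the stat() result from the -d test above.
        $size += -s _ if -f _;
    }, $dir);
    return $size;
}

# Usage: pass a directory on the command line, default to the current one.
print top_level_size(@ARGV ? $ARGV[0] : '.'), "\n";
```

    For a large tree this avoids visiting files that would be thrown away anyway; for a small folder the difference is unlikely to matter.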


      Sam has it right. File::Find is very fast and efficient. You can see a good example on my website

      Neil Watson
        Ok, I've decided to use File::Find. When I call the find() function, I get this error:

        Insecure $ENV{PATH} while running with -T switch at /usr/lib/perl5/5.6.1/ line 92.

        So I added $ENV{PATH} = '/bin:/usr/bin'; to my script. Now I'm getting this error:

        Insecure dependency in chdir while running with -T switch at /usr/lib/perl5/5.6.1/File/ line 467.

        I noticed that Neil's script doesn't use the -T switch. Will File::Find just not run in taint mode?
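        For the taint-mode question above, a hedged sketch rather than a confirmed fix: File::Find documents an untaint option (alongside untaint_pattern and untaint_skip) that untaints directory names against a pattern before it calls chdir. Whether the File::Find shipped with the 5.6.1 install in those error messages supports it is an assumption to check with perldoc File::Find, and this sketch has not been run under -T. The helper name tree_size is hypothetical; the environment cleanup follows the advice in perlsec.

```perl
use strict;
use warnings;
use File::Find;

# Taint mode refuses to run external commands with an untrusted environment.
# Run this script as "perl -T script.pl <dir>" to exercise taint mode.
$ENV{PATH} = '/bin:/usr/bin';
delete @ENV{qw(IFS CDPATH ENV BASH_ENV)};

# Hypothetical helper: total size of all plain files under $dir (recursive).
sub tree_size {
    my ($dir) = @_;
    my $size = 0;
    find({
        wanted  => sub { $size += -s _ if -f $_ },
        untaint => 1,   # untaint directory names before File::Find chdirs
    }, $dir);
    return $size;
}

# A directory name taken from @ARGV is tainted under -T, which is the
# case the untaint option is meant to cover.
print tree_size($ARGV[0]), "\n" if @ARGV;
```

        With untaint set, a directory whose name does not match the pattern makes find() die unless untaint_skip is also set, so unusual directory names may need a custom untaint_pattern.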


Re: Using stat()
by vladb (Vicar) on May 24, 2002 at 04:55 UTC
    According to The UNIX File System page, a directory is defined as follows:

    A directory is actually implemented as a file that has one line for each item contained within the directory. Each line in a directory file contains only the name of the item, and a numerical reference to the location of the item. The reference is called an i-number, and is an index to a table known as the i-list. The i-list is a complete list of all the storage space available to the file system.

    This is exactly the reason for your confusion:

    it returns 4096, while the total size of the files in that directory is 30981 bytes

    The 4096 you get is the size of that special directory 'file', which only lists the names of the items in the directory and their respective i-numbers.
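    A small sketch to make the difference visible, under the assumption that a scratch directory from File::Temp is a fair stand-in for "foo". The file names and sizes are made up for illustration. It prints what stat() reports for the directory itself next to the sum of the sizes of the files inside it; the first number depends on the filesystem, so no particular value is claimed here.

```perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# Scratch directory with two files of known size (illustrative values).
my $dir = tempdir(CLEANUP => 1);
for my $spec ([ 'a.txt', 1000 ], [ 'b.txt', 2500 ]) {
    my ($name, $bytes) = @$spec;
    open(my $fh, '>', "$dir/$name") or die "Cannot write $dir/$name: $!";
    print {$fh} 'x' x $bytes;
    close($fh) or die "Cannot close $dir/$name: $!";
}

# Field 7 of stat() on the directory: the size of the directory 'file' itself.
my $dir_entry_size = (stat($dir))[7];

# Sum of the sizes of the plain files listed in the directory.
my $content_size = 0;
opendir(my $dh, $dir) or die "Cannot open $dir: $!";
for my $entry (readdir $dh) {
    my $path = "$dir/$entry";
    $content_size += -s _ if -f $path;   # -f skips "." and ".."
}
closedir($dh);

print "stat on the directory: $dir_entry_size\n";
print "sum of the file sizes: $content_size\n";
```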

    I believe the 'du -k' command should get you what you want. Here's an example:

    my %stats = map { m/(\d+)\s+(.*)/; $2 => $1 } `du -k`;
    for (keys %stats) {
        print "$_ = $stats{$_}\n";
    }

    Which gives this output (in my 'test' directory):
    ./.ssh = 5
    ./temp = 273
    . = 280
    Tell me if this was of any use to you ;-)

Re: Using stat()
by sfink (Deacon) on May 24, 2002 at 19:08 UTC
    Module-free version. In brief:
    perl -le '$size += (-f $_ ? -s _ : 0) for (<*>); print $size'
    Unnecessarily cryptic, sorry. (-f $filename) checks whether something is a plain file, as opposed to a socket, directory, link, or some other weird beast. The above is in a for loop, so $_ is each filename coming out of <*>. (-s $filename) gives the size. Except that we've already called stat() on the file, so we don't want to take the time to do it again, which is what the magic underscore filehandle is good for.

    Finally, <*> does a pattern match on the current directory. You may prefer to say glob("*").

    Readable version:

    my $size = 0;
    for my $filename (glob("*")) {
        next unless -f $filename;
        $size += -s _;
    }

      If the input location is some other directory (not the current one), what should go inside glob?
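      One way to answer that, as a sketch: prefix the pattern with the directory, for example glob("$dir/*"). The version below uses bsd_glob from the core File::Glob module instead of plain glob, on the assumption that directory names may contain spaces (plain glob splits its pattern on whitespace). The helper name dir_size is hypothetical.

```perl
use strict;
use warnings;
use File::Glob qw(bsd_glob);

# Hypothetical helper: total size of the plain files directly inside $dir.
sub dir_size {
    my ($dir) = @_;
    my $size = 0;
    # bsd_glob treats the whole string as one pattern, spaces included.
    for my $filename (bsd_glob("$dir/*")) {
        next unless -f $filename;   # skip directories, sockets, and so on
        $size += -s _;              # "_" reuses the stat() done by -f
    }
    return $size;
}

# Usage: pass a directory on the command line, default to the current one.
print dir_size(@ARGV ? $ARGV[0] : '.'), "\n";
```

      Two caveats: "*" does not match dotfiles, and glob metacharacters in the directory name itself (such as [ or {) are still interpreted as part of the pattern.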
Re: Using stat()
by joshua (Pilgrim) on May 25, 2002 at 04:57 UTC
    Thanks everyone for your input. vladb, thanks for the explanation of how directories work. I'll try all of your ideas when I get a chance.

    Thanks again.
