http://www.perlmonks.org?node_id=1022720

mishikal has asked for the wisdom of the Perl Monks concerning the following question:

Does anyone know how I can obtain the actual usage size of a sparse file?

File::DiskUsage gives me the size of the sparse file. So does -s $file. stat gives me the total block size and blocks used, which isn't what I want either.

In my case, I have an 80GB sparse file with 13GB of actual data used:

zimbra@zre-ldap001:/tmp$ ls -l /opt/zimbra/data/ldap/mdb/db/data.mdb
-rw------- 1 zimbra zimbra 85899345920 Mar 8 19:11 /opt/zimbra/data/ldap/mdb/db/data.mdb
zimbra@zre-ldap001:/tmp$ du -c -h /opt/zimbra/data/ldap/mdb/db/data.mdb
13G /opt/zimbra/data/ldap/mdb/db/data.mdb
13G total

I'd like to be able to obtain the usage size via Perl. Thanks!

Re: Usage of a sparse file
by davido (Cardinal) on Mar 11, 2013 at 06:22 UTC

    Is the sparse file a documented format? I see mention in your path of "zimbra". Does the documentation for that OS project discuss how the sparse file is built, and how it's indexed? The operating system indeed only knows about the file's total size. It doesn't care or know about how the file is structured internally. There's also reference in your post to an ".mdb" file, which is probably a fairly well-known database. The database may even have housekeeping tools available to you once you know for sure which one it is. If those tools have a command line interface, Perl can drive it.

    Update: I stand corrected. Live! Learn! ;)


    Dave

      A sparse file is a basic file concept. You can read about it on Wikipedia.

      My question is only whether Perl has a way to query actual disk usage the way "du" can. The file type does not matter, as it has nothing to do with the answer. It could be *any* sparse file; I am simply using one I have at hand to ask the question. Thanks!

Re: Usage of a sparse file
by mishikal (Novice) on Mar 11, 2013 at 08:16 UTC
    I've found that this works:

    my $total = `du --block-size=1 $line`;
    my $junk;
    chomp($total);
    ($total, $junk) = split /\t/, $total, 2;

    But I'd like to do it in Perl rather than calling out to the shell.
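If du stays in the picture for now, the backtick call can at least bypass the shell, so that spaces or metacharacters in the file name are not interpreted. The following is only a sketch: it assumes a GNU du that accepts --block-size=1 is on the PATH, and the name du_bytes is made up for illustration.

```perl
use strict;
use warnings;

# Ask du for the allocated size of $path in bytes, without a shell.
# ASSUMPTION: GNU du (for --block-size=1) is installed and on PATH.
# du_bytes is a hypothetical helper name, not part of any module.
sub du_bytes {
    my ($path) = @_;

    # List-form pipe open: each argument reaches du unchanged.
    # The '--' keeps a leading '-' in $path from being read as an option.
    open my $du, '-|', 'du', '--block-size=1', '--', $path
        or return;
    my $line = <$du>;

    # close() returns false if du exited with a non-zero status.
    close $du or return;
    return unless defined $line;

    # du prints "<size>\t<name>"; only the size is of interest.
    my ($bytes) = split /\t/, $line, 2;
    return $bytes;
}

# Print the allocated size of every file named on the command line.
for my $file (@ARGV) {
    my $bytes = du_bytes($file);
    print defined $bytes ? "$bytes\t$file\n" : "du failed for $file\n";
}
```

This still starts an external process per file, so it does not answer the "pure Perl" part of the question.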

      Just a parenthetic note. If $junk is something you don't care about at all, then you don't need to bother capturing it:

      >perl -wMstrict -le "my $total = qq{123\tasdfghhj\n}; print qq{[$total]]}; ;; ($total) = split /\t/, $total, 2; print qq{'$total'}; "
      [123 asdfghhj
      ]]
      '123'
Re: Usage of a sparse file
by Anonymous Monk on Mar 11, 2013 at 12:54 UTC
    my ($blocks_used) = (stat $filename)[12];
      By the way, the block size is usually 512 bytes (at least, man 2 stat says so about the st_blocks field of struct stat).
      Sorry if my advice was wrong.
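For anyone who wants to try the stat suggestion, here is a minimal sketch. perldoc -f stat documents field 12 as the number of blocks actually allocated on disk, "often, but not always" 512 bytes each, so multiplying by 512 is an assumption that holds on Linux and should be compared against the du figure on the filesystem in question. The name allocated_bytes is made up for illustration.

```perl
use strict;
use warnings;

# Return the number of bytes allocated on disk for $path, or undef/empty
# list if stat fails or the platform does not report a block count.
# ASSUMPTION: the blocks field is counted in 512-byte units, as on Linux.
# allocated_bytes is a hypothetical helper name.
sub allocated_bytes {
    my ($path) = @_;
    my @st = stat $path or return;

    # Field 12 is the allocated block count; it is '' where unsupported.
    my $blocks = $st[12];
    return unless defined $blocks && length $blocks;
    return $blocks * 512;
}

# Compare apparent size (-s) with allocated size for each file given.
for my $file (@ARGV) {
    my $apparent  = -s $file;
    my $allocated = allocated_bytes($file);
    if (defined $apparent && defined $allocated) {
        printf "%s: apparent %d bytes, allocated %d bytes\n",
            $file, $apparent, $allocated;
    }
    else {
        print "$file: could not stat\n";
    }
}
```

For a sparse file the allocated figure would be expected to come out well below the apparent one; if the two match, the filesystem may not be reporting holes through stat.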
        Hi, as I noted in my original post, the information retrieved from stat is not usable, as it returns the blocks used by the full size of the sparse file, not the actual usage size.

        Thanks though!