Category: Utility Scripts
Author/Contact Info: jesse_becker@yahoo.com
Description:
I'm constantly having to clean out space on lots of computers, and looking at several screens of 'du' output hurts. So I wrote this little script to parse and format the output from 'du'. I know, I know, it's not strictly Perl, but monks should be aware that there are things that exist outside these cloistered walls.
N.B. Since this is meant to be used in a pipe, it's usually all on a single line, and without comments.
du -sk . * | perl -e '
$sum=<>;   # Get the total space used from the first line,
           # so we do not have to run "du" twice.
while (<>) {
    ($size, $inode) = split;
    $inode .= "/" if (-d $inode);
    printf("%30s | %5d | %5.2f%%\n", $inode, $size, $size/$sum*100);
}' | sort -rn -k 3 | head
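For convenience, the same pipeline collapsed onto a single line, as the N.B. above describes (no new behavior, just the comments dropped):

du -sk . * | perl -e '$sum=<>; while (<>) { ($size,$inode)=split; $inode .= "/" if -d $inode; printf("%30s | %5d | %5.2f%%\n",$inode,$size,$size/$sum*100); }' | sort -rn -k 3 | head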
Re: What's eating all your disk space?
by csh (Novice) on Jan 26, 2004 at 21:10 UTC
use strict;
use IO::File;

my $size;
my $inode;
my $sum = 0;
my @entries;
my $e;
my $percent = 0;
my $remsum  = 0;
my $counter = 0;
my $du      = IO::File->new;

if (@ARGV) {
    chdir $ARGV[0] or die "cannot change to [ $ARGV[0] ]\n";
}
$du->open("du -sk *|")
    or die "cannot open du program and pipe: $!\n";
while (<$du>) {
    ($size, $inode) = split;
    $inode .= "/" if (-d $inode);
    $sum += $size;
    push @entries, { size => $size, inode => $inode };
}
@entries = sort { $b->{size} <=> $a->{size} } @entries;
$du->close;

# Print the ten largest entries, then lump the rest together.
foreach $e (@entries) {
    $percent = $e->{size} / $sum * 100;
    if ($counter < 10) {
        printf(
            "%30s | %5d | %5.2f%%\n",
            $e->{inode},
            $e->{size},
            $percent);
    }
    else {
        $remsum += $e->{size};
    }
    $counter++;
}
if ($remsum > 0) {
    printf(
        "%30s | %5d | %5.2f%%\n",
        "REMAINING FILES",
        $remsum,
        $remsum / $sum * 100);
}
Edits:
* moved sort out of loop
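A hypothetical invocation (the script name here is assumed); the optional argument is the directory to report on, and the ten largest entries print before the REMAINING FILES summary line:

$ perl du-top.pl /var/log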
RE: What's eating all your disk space?
by ivey (Beadle) on Jul 12, 2000 at 22:09 UTC
I like this script a lot; very handy. It was taking too long on some of my larger directory trees, though, so I took the liberty of speeding it up. The following does the sorting in Perl, and also calculates the sum internally to eliminate the '.' from the du call. This saves du from having to walk the directory tree twice (once for '.' and once for the individual '*' arguments) and sped things up a lot for me.
#!/usr/bin/env perl
open(DU, "du -sk *|") || die "Can't exec du: $!\n";
while (<DU>) {
    ($size, $inode) = split;
    chop($size);
    $sum += $size;
    push @entries, { size => $size, inode => $inode };
}
close(DU);
@entries = sort { $b->{size} <=> $a->{size} } @entries;
foreach $e (@entries[0 .. 10]) {
    printf("%30s | %5d | %2.2f%%\n", $e->{inode}, $e->{size}, $e->{size}/$sum*1000);
}
Thanks for a cool script!
J. J. Horner
Linux, Perl, Apache, Stronghold, Unix
jhorner@knoxlug.org http://www.knoxlug.org/
I think that you meant $sum*100 in the end...
:)
Here is a small improvement. It adds a '/' to directory names:
.
.
.
($size, $inode)=split;
$inode .= "/" if (-d $inode);
chop($size);
.
.
.
Also, the printf should be changed. Instead of "%2.2f" you probably meant "%5.2f". The first number is the minimum total field width, including the decimal point and everything after it.
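A quick demonstration of the difference (the width is a minimum, so a too-small width simply expands rather than truncating):

printf("[%2.2f]\n", 3.14159);   # prints [3.14]  - already wider than 2
printf("[%5.2f]\n", 3.14159);   # prints [ 3.14] - padded to 5 characters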
I thought about building in the sorting but decided against it ("One tool does one thing"). I figure that I'll leave sorting to 'sort'. ;-)
As to 'du' walking the tree twice, I looked at that as well. In my tests, it looked like the results were cached somewhere, and thus 'du -sk . *' is quite fast. A prior version of the script did something horrible along the lines of du -sk * | perl -e '$sum=`du -sk .`; while (<>) {...}', so this is an improvement already.
This is pretty quick, and I use it on 40GB RAID arrays. :-)
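One way to see the caching for yourself (timings will vary with the system and filesystem):

$ time du -sk . * > /dev/null    # first run walks the tree
$ time du -sk . * > /dev/null    # repeat: usually much faster, served from cache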
A very handy little script, I will use it a lot.
Orthanc
Re: What's eating all your disk space? -- duke!
by gremio (Acolyte) on Jul 14, 2001 at 22:26 UTC
Hi Hawson,
I have to do the same kind of task routinely, and I try to farm as much of it off to the users themselves as possible, so I wrote this to help both of us out.
It does close to the same thing as du (though any help on figuring out how du actually comes up with its numbers would be appreciated!), and is pretty handy for not having to dig through directory trees doing du's over and over again in subdirectories. Though it's noticeably slower than du on large directories, I find it's actually faster overall because I have to run it only once, even for several levels of nested dirs.
It also displays age, which can be very useful for determining what needs killing, and I find novices have little problem understanding the output. YMMV, but I hope you like it.
I call it "duke" --Gremio
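The duke listing itself isn't reproduced above. As a rough idea of what gremio describes, here is a minimal sketch (not the original; the block size, output format, and per-directory-only totals are assumptions):

#!/usr/bin/perl
# duke-style sketch: per-directory sizes (rounded up to whole blocks,
# as du does) plus the age of the newest file in each directory.
use strict;
use warnings;
use File::Find;

my $blksize = 1024;    # assumed block size, in bytes
my %size;              # bytes per directory (this directory only, not cumulative)
my %newest;            # most recent mtime per directory

find(sub {
    return unless -f $_;
    my ($sz, $mtime) = (stat _)[7, 9];
    # Round up to a whole number of blocks, like du.
    $sz = $blksize * int(($sz + $blksize - 1) / $blksize);
    my $dir = $File::Find::dir;
    $size{$dir} += $sz;
    $newest{$dir} = $mtime if !$newest{$dir} || $mtime > $newest{$dir};
}, @ARGV ? @ARGV : '.');

for my $dir (sort { $size{$b} <=> $size{$a} } keys %size) {
    my $age_days = (time - $newest{$dir}) / 86400;
    printf "%10dK  %6.1f days  %s\n", $size{$dir} / 1024, $age_days, $dir;
}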
If your $size is an exact integer multiple of your $blksize, you'll overstate $size by a full $blksize in
$size = $blksize*(1+(int($size / $blksize)));
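One common fix is ceiling division, which rounds up without overshooting when $size is already an exact multiple (same variable names as above):

# round up to a whole number of blocks; exact multiples are left alone
$size = $blksize * int(($size + $blksize - 1) / $blksize);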
Re: What's eating all your disk space?
by chibiryuu (Beadle) on Apr 19, 2005 at 03:07 UTC
I once wrote a script much like the one above. du -b "$@" | sort -n is nice, but du -h is nice too, so I had a shell script for du -b "$@" | perl -pe's/ome/complicated/regex' | sort -n for a while. Recently, I rewrote it in pure Perl.
#!/usr/bin/perl -w
use strict;
use File::Find;
my %conf = (a => 0, c => 0, s => 0, x => 0);
my @dirs = ();
while (defined ($_ = shift)) {
if ($_ eq "--") {push @dirs, @ARGV; last}
elsif (/^-(.*)$/s) {
for (split //, $1) {
if ($_ eq "a" and !$conf{s}) {$conf{a} = 1}
elsif ($_ eq "c") {$conf{c} = 1}
elsif ($_ eq "s" and !$conf{a}) {$conf{s} = 1}
elsif ($_ eq "x") {$conf{x} = 1}
else {
print STDERR "$0 [-a] [-c] [-s] [-x] [--] ...\n";
exit 1;
}
}
}
else {push @dirs, $_}
}
s/\/*$//s for @dirs;
@dirs = qw(.) unless @dirs;
my %spec = (no_chdir => 1);
if ($conf{a}) {
$spec{wanted} = sub {
stat;
my $s = -f _ ? -s _ : 0;
$File::Find::name =~ /^\Q$dirs[0]\E\/?(.*)$/s;
my @a = split /\//, $1;
for (unshift @a, $dirs[0]; @a; pop @a) {
$_{join "/", @a} += $s;
}
};
}
elsif ($conf{s}) {
$spec{wanted} = sub {
stat;
$_{$dirs[0]} += -f _ ? -s _ : 0;
};
}
else {
$spec{wanted} = sub {
stat;
my $s = -f _ ? -s _ : 0;
$File::Find::name =~ /^\Q$dirs[0]\E\/?(.*)$/s;
my @a = split /\//, $1;
! -d _ and pop @a;
for (unshift @a, $dirs[0]; @a; pop @a) {
$_{join "/", @a} += $s;
}
};
}
if ($conf{x}) {
$spec{preprocess} = sub {
my $dev = (lstat $File::Find::dir)[0];
grep {$dev == (lstat "$File::Find::dir/$_")[0]} @_;
};
}
while (@dirs) {
find(\%spec, $dirs[0] eq "" ? "/" : $dirs[0]);
$_{""} += $_{$dirs[0]} if $conf{c};
shift @dirs;
}
$_{$_} < 1024 ** 1 ? printf "%s «%-6.6sB» %s\n", $_{$_}, sprintf("%6.6f", $_{$_} / 1024 ** 0), $_ :
$_{$_} < 1024 ** 2 ? printf "%s «%-6.6sK» %s\n", $_{$_}, sprintf("%6.6f", $_{$_} / 1024 ** 1), $_ :
$_{$_} < 1024 ** 3 ? printf "%s «%-6.6sM» %s\n", $_{$_}, sprintf("%6.6f", $_{$_} / 1024 ** 2), $_ :
$_{$_} < 1024 ** 4 ? printf "%s «%-6.6sG» %s\n", $_{$_}, sprintf("%6.6f", $_{$_} / 1024 ** 3), $_ :
$_{$_} < 1024 ** 5 ? printf "%s «%-6.6sT» %s\n", $_{$_}, sprintf("%6.6f", $_{$_} / 1024 ** 4), $_ :
$_{$_} < 1024 ** 6 ? printf "%s «%-6.6sP» %s\n", $_{$_}, sprintf("%6.6f", $_{$_} / 1024 ** 5), $_ :
$_{$_} < 1024 ** 7 ? printf "%s «%-6.6sE» %s\n", $_{$_}, sprintf("%6.6f", $_{$_} / 1024 ** 6), $_ :
$_{$_} < 1024 ** 8 ? printf "%s «%-6.6sZ» %s\n", $_{$_}, sprintf("%6.6f", $_{$_} / 1024 ** 7), $_ :
                     printf "%s «%-6.6sY» %s\n", $_{$_}, sprintf("%6.6f", $_{$_} / 1024 ** 8), $_
    for sort {$_{$a} <=> $_{$b} or $a eq "" ? 1 : $a cmp $b} keys %_;
Not so good:
- I don't implement qw(-D --exclude-from -l -L --max-depth -S -X) like the real du does.
- I probably shouldn't do my own argument parsing, and I hope there's a better way to do the printing at the end.
That being said, you like?
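For reference, a hypothetical invocation (the script name is assumed; flag behavior is as implemented above, so -c adds a grand total under an empty name and -x stays on one filesystem). The output is sorted smallest to largest, so tail shows the biggest entries:

$ ./du.pl -c -x /home /tmp | tail -5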
Re: What's eating all your disk space? (Treemaps)
by jimX11 (Friar) on Aug 19, 2005 at 01:02 UTC
Treemaps!! First heard about them at OSCON 05, when Tim O'Reilly mentioned them in a keynote.
The paragraph below is from Treemaps for space-constrained visualization of hierarchies:
During 1990, in response to the common problem of a filled hard disk, I became obsessed with the idea of producing a compact visualization of directory tree structures. Since the 80 Megabyte hard disk in the HCIL was shared by 14 users it was difficult to determine how and where space was used. Finding large files that could be deleted, or even determining which users consumed the largest shares of disk space were difficult tasks.
Just today I found treemap on the CPAN.
Wonder how the Perl source code could be treemapped? The Linux kernel is.
Re: What's eating all your disk space?
by chanio (Priest) on Aug 18, 2005 at 21:49 UTC
I want to thank you all for these valuable pieces of code. The whole page is very useful.
Perhaps it could help others if I show the way I used it (based on the main node).
##[ my_df.sh ]##
cd ~
du -sk . * .* | perl -e '          ## ADDED .files
$sum=<>;
while (<>) {
    ($size, $inode) = split;
    $inode .= "/" if (-d $inode);  ## /_SHOW SIZE IN Kb + GRAPHIC: _\
    printf("%25s | %4d Kb |%6.2f%% [%+11s]\n", $inode, int($size/1024),
        $size/$sum*100, "<" . ("=" x int($size/$sum*10))) unless ($inode =~ /\.\./);
    ## EVERY LINE LOOKS LIKE THIS:
    ##            Documents/ |  710 Kb | 52.20% [     <=====]
}' | sort -rn -k 3 | head | xmessage -center -file -
The output shows up in an xmessage window (come on, burn me). But it would work just as well with a Tk one, or with the full-Perl way of doing it all, who doubts that :).
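For the Tk variant mentioned above, a minimal sketch (assuming the formatted report is piped in on STDIN, in place of xmessage; the script name is made up, e.g. replace "| xmessage -center -file -" with "| perl tk_report.pl"):

#!/usr/bin/perl
# Show the piped-in du report in a Tk window instead of xmessage.
use strict;
use warnings;
use Tk;

my $report = do { local $/; <STDIN> };   # slurp the whole report
my $mw = MainWindow->new(-title => 'Disk usage');
$mw->Label(
    -text    => $report,
    -font    => 'fixed',                 # monospace keeps the columns lined up
    -justify => 'left',
)->pack(-padx => 10, -pady => 10);
$mw->Button(-text => 'Close', -command => sub { exit })->pack(-pady => 5);
MainLoop;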