PerlMonks  

Bloated File Detector for Unix Boxes

by bpoag (Monk)
on Mar 15, 2005 at 01:00 UTC ( #439514=CUFP )

I got tired of finding gigantic tumor-like growths left over from the prior sysadmin's lack of filesystem caretaking, so I wrote this little script. When invoked, it sniffs around for bloated files and emails me a list of the main offenders. In this case, it looks for the 100 largest files above 4MB (4096K, i.e. 8192 blocks of 512 bytes). Feel free to tweak to your heart's delight.
#!/usr/bin/perl
###
### BloatDetector.pl v0.1 written 031405:1749 by BJP
###
### Builds a report showing the top 100 biggest-sized files, excluding
### oracle stuff, backups, etc... That's what the back|archiv|ora exclusion stuff does. :)
###

use Mail::Sendmail;

$reportFile="/tmp/bloatdetector.tmp";
$recipient="youremailaddress\@goes.here.com";
chomp($hostName=`hostname`); # $HOSTNAME is non-ubiquitous. Ugh.
$sender=$ENV{"USER"}."\@".$hostName;
chomp($dateStamp=`date`);

print "BloatDetector: Recipient is [$recipient]\n";
print "BloatDetector: Sender is [$sender]\n";
print "BloatDetector: Scanning files..This may take a while.\n";

# "sort -k 7" is the obsolete "sort +6" key (skip six fields, sort on the
# size column of find -ls) in the syntax current sort implementations accept.
@bloatedFiles = `find / -depth -size +8192 -ls| grep -i -v -E \"back|archiv|ora?\"| sort -r -n -k 7 | cut -d" " -f2- | head -n100`;

open(FILE,"+>>$reportFile") or die ("BloatDetector: Can't open temporary logfile. My life was short, yet sweet -- grieve not for me, my friend.");

print "BloatDetector: Preparing report..\n";
print FILE "\n\n\nHere's the latest BloatDetector report from $hostName..\n\n";

foreach $line (@bloatedFiles)
{
    print FILE "$line";
}

chomp($endTime=`date`);
print FILE "\n\n";
print FILE "Time invoked : $dateStamp\n";
print FILE "Time completed : $endTime";
close(FILE);

# Read the finished report back in so it can be used as the mail body.
open(REPORT,$reportFile);
@report = <REPORT>;
chomp(@report);
close(REPORT);

%mail=( To => $recipient,
        From => $sender,
        Subject => "BloatDetector Results for ".$dateStamp." from ".$sender,
        Message => join("\n",@report));

print "BloatDetector: Sending report..\n";
sendmail(%mail); # a-la-peanut-butter-sandwiches!
print "BloatDetector: Exiting..\n";
unlink $reportFile;
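
For readers who want to see the selection logic without the shell pipeline, here is a minimal sketch using only core modules. It is not the script's own code: the helper names top_bloat and scan are made up for illustration, and unlike the grep in the pipeline (which filters whole find -ls lines, owner and group columns included) it applies the exclusion pattern to the path only. The cutoff follows from find's -size +8192 counting 512-byte blocks: 8192 * 512 = 4,194,304 bytes.

```perl
use strict;
use warnings;
use File::Find;

# find's "-size +8192" counts 512-byte blocks, so the cutoff in bytes is:
my $MIN_BYTES = 8192 * 512;    # 4,194,304 bytes (4 MiB)

# Hypothetical helper: keep entries strictly above the cutoff whose path does
# not match the exclusion pattern, largest first, at most $limit of them.
sub top_bloat {
    my ($entries, $exclude, $limit) = @_;
    my @keep = sort { $b->[1] <=> $a->[1] }
               grep { $_->[1] > $MIN_BYTES && $_->[0] !~ $exclude } @$entries;
    splice(@keep, $limit) if @keep > $limit;
    return @keep;
}

# Hypothetical helper: collect [path, size] for every plain file under $root.
sub scan {
    my ($root) = @_;
    my @entries;
    find({ no_chdir => 1, wanted => sub {
        return if -l $_;         # skip symlinks
        return unless -f $_;     # plain files only
        push @entries, [ $File::Find::name, (-s _) || 0 ];
    } }, $root);
    return \@entries;
}

# Only scan when a starting directory is given on the command line.
if (@ARGV) {
    printf "%12d  %s\n", $_->[1], $_->[0]
        for top_bloat(scan($ARGV[0]), qr/back|archiv|ora?/i, 100);
}
```

Saved under any name (say bloat_sketch.pl, a placeholder), it would be run as perl bloat_sketch.pl /some/dir. Keeping the filter separate from the directory walk makes it easy to try the threshold and exclusion pattern on a hand-made list first.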

Replies are listed 'Best First'.
Re: Bloated File Detector for Unix Boxes
by merlyn (Sage) on Mar 15, 2005 at 02:15 UTC
    I'm not sure what -depth is doing in your "find" line, but you can pull all that finding stuff inside your Perl code with:
    use File::Finder;
    my @bloatedFiles = File::Finder
      ->size('+8192')
      ->not->name(qr/back|archiv|ora?/)
      ->collect(sub {[$File::Find::name => -s _]}, '/');
    @bloatedFiles = map $_->[0], sort { $b->[1] <=> $a->[1] } @bloatedFiles;
    splice(@bloatedFiles, 100) if @bloatedFiles > 100;

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      Hi merlyn, the -depth flag is just a personal preference. It processes a directory's contents before the directory itself. There are also ways to fine-tune find and tell it how many directories deep you want it to go: -maxdepth and -mindepth, iirc.
        I'd understand it as a preference if it in fact did something useful for your application here. But it does nothing. It's like throwing "/o" on the end of a regular expression that contains no variables. It's pointless.

        That's why I wondered why you did it. You're doing nothing. It'd be like having an extra variable declared, and initializing it, only to never use it again in your program. You're going to extra work to do something that does nothing.

        -- Randal L. Schwartz, Perl hacker
        Be sure to read my standard disclaimer if this is a reply.
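
The point about -depth can be checked with a small experiment. The sketch below uses core File::Find, whose finddepth is the counterpart of find -depth, on a throwaway directory tree (the file names are invented for the demo). It records the visiting order both ways: the same entries are visited, only the order differs, and an order that gets re-sorted by size afterwards makes no difference to the report.

```perl
use strict;
use warnings;
use File::Find qw(find finddepth);
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Build a tiny throwaway tree: dir/a.txt and dir/sub/b.txt
my $root = tempdir(CLEANUP => 1);
make_path("$root/dir/sub");
for my $f ("$root/dir/a.txt", "$root/dir/sub/b.txt") {
    open my $fh, '>', $f or die "open $f: $!";
    close $fh;
}

# Record the order in which a walker (find or finddepth) visits entries,
# as paths relative to $root ('.' stands for $root itself).
sub visit_order {
    my ($walker) = @_;
    my @seen;
    $walker->({ no_chdir => 1, wanted => sub {
        (my $rel = $File::Find::name) =~ s/^\Q$root\E\/?//;
        push @seen, $rel eq '' ? '.' : $rel;
    } }, $root);
    return @seen;
}

my @pre  = visit_order(\&find);        # directory before its contents
my @post = visit_order(\&finddepth);   # contents before the directory

print "find:      @pre\n";
print "finddepth: @post\n";
```

The starting directory comes first in the find line and last in the finddepth line. The order of siblings depends on the filesystem, so the exact output can vary between machines.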

Re: Bloated File Detector for Unix Boxes
by blahblahblah (Priest) on Mar 15, 2005 at 04:32 UTC
    Nice script. I do the same kind of search once in a while when the disk that all of our developers share starts to fill up. I'm not that familiar with 'find' or File::Finder so I use this:
    du -a | egrep '^[0-9]{4,}' | sort -nr > ~/du.txt
    The nice thing about it is that it also shows me directories which contain many small files.
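
    The reason du catches directories full of small files is that every file's size is added to each directory above it. A minimal sketch of that roll-up in Perl, with a made-up helper name (dir_totals) and made-up sample paths; note that it sums byte sizes, whereas du reports disk blocks, so the numbers would not match du exactly:

```perl
use strict;
use warnings;
use File::Basename qw(dirname);

# Hypothetical helper: add each file's size to every directory above it,
# which is roughly how du arrives at its per-directory totals.
sub dir_totals {
    my (@files) = @_;          # each element: [ path, size_in_bytes ]
    my %total;
    for my $f (@files) {
        my ($path, $size) = @$f;
        my $dir = dirname($path);
        while (1) {
            $total{$dir} += $size;
            my $parent = dirname($dir);
            last if $parent eq $dir;   # reached "/" or "."
            $dir = $parent;
        }
    }
    return %total;
}

# Sample data, invented for the demo.
my %t = dir_totals(
    [ 'proj/logs/a.log', 300 ],
    [ 'proj/logs/b.log', 500 ],
    [ 'proj/src/main.c', 200 ],
);

# Largest totals first, like "sort -nr".
printf "%8d %s\n", $t{$_}, $_ for sort { $t{$b} <=> $t{$a} } keys %t;
```

    Here proj/logs ends up larger than any single file in it, which is exactly the case a per-file search misses.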
Re: Bloated File Detector for Unix Boxes
by elwarren (Curate) on Mar 15, 2005 at 21:46 UTC
    "You said nobody would ever find the garbage file!?!?"

    "But these are Hackers!"
