
Re: Optimizing performance for script to traverse on filesystem

by kejohm (Hermit)
on Feb 01, 2012 at 21:37 UTC (#951307)


in reply to Optimizing performance for script to traverse on filesystem

Rather than rolling your own code, you should probably use a module like File::Find. Here is an example that should do what you're trying to do (untested):

    #!perl
    use 5.012;
    use File::Find;
    use List::Util qw(min);

    my $workdir = ...;
    my $num_top_size_box = 10;    # assumed value; the original post never defines it
    my %top_size_mailbox;         # box size => msisdn
    my $box_size;
    my $all_mailbox_size  = 0;
    my $all_mailbox_count = 0;
    my $empty_mailbox     = 0;

    find(
        {
            wanted => sub {
                # wanted is an ordinary sub, so skip entries with return, not next
                return if m/^\.+$/;
                return if m/\.(dat|mdb|snapshot)$/;
                if ( -d and m/^\d+\@\w/ ) {
                    $all_mailbox_count++;
                    $box_size = 0;
                }
                elsif ( m/\.msg$/ ) {
                    # count every message as at least 4096 bytes
                    my $msg_size = -s _;
                    $box_size += $msg_size < 4096 ? 4096 : $msg_size;
                }
            },
            postprocess => sub {
                # runs once File::Find has finished with a directory
                if ( $File::Find::dir =~ m/(\d+)\@/ ) {
                    my $msisdn = $1;
                    $all_mailbox_size += $box_size;
                    if ( $box_size == 0 ) {
                        $empty_mailbox++;
                    }
                    else {
                        top_size_mailbox( $msisdn, $box_size );
                    }
                }
            },
        },
        $workdir,
    );

    # Keep only the $num_top_size_box largest mailboxes seen so far.
    sub top_size_mailbox {
        my ( $msisdn, $box_size ) = @_;
        if ( keys( %top_size_mailbox ) < $num_top_size_box ) {
            $top_size_mailbox{$box_size} = $msisdn;
        }
        else {
            my $min = min( keys %top_size_mailbox );
            if ( $box_size > $min ) {
                delete $top_size_mailbox{$min};
                $top_size_mailbox{$box_size} = $msisdn;
            }
        }
    }
    __END__

There are other similar modules like File::Find::Rule that you could also try.
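
For comparison, here is a rough sketch of what the File::Find::Rule approach might look like. It assumes File::Find::Rule is installed from CPAN, and total_msg_size is a hypothetical helper name of mine, not something from the original script. It only totals the .msg sizes under a directory, using the same 4096-byte minimum as above, and does not do the per-mailbox bookkeeping.

```perl
#!perl
use strict;
use warnings;
use File::Find::Rule;

# Hypothetical helper (not from the original script): total the sizes of
# all *.msg files below $dir, counting each as at least 4096 bytes.
sub total_msg_size {
    my ($dir) = @_;
    my $total = 0;

    # file() keeps plain files only; name() takes a shell-style glob.
    for my $file ( File::Find::Rule->file->name('*.msg')->in($dir) ) {
        my $size = ( -s $file ) || 0;    # -s is false for empty files
        $total += $size < 4096 ? 4096 : $size;
    }
    return $total;
}

# Only scan when a directory is given on the command line.
printf "%d bytes\n", total_msg_size( $ARGV[0] ) if @ARGV;
```

Note that in() builds the whole list of matching files before returning, so on a very large tree the callback style of File::Find may be lighter on memory.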

