PerlMonks  

Re: Optimizing performance for script to traverse on filesystem

by kejohm (Hermit)
on Feb 01, 2012 at 21:37 UTC (#951307)


in reply to Optimizing performance for script to traverse on filesystem

Rather than rolling your own code, you should probably use a module like File::Find. Here is an example that should do what you're trying to do (untested):

    #!perl
    use 5.012;
    use warnings;
    use File::Find;
    use List::Util qw(min);

    my $workdir          = ...;    # placeholder: directory to scan
    my $num_top_size_box = ...;    # placeholder: number of largest mailboxes to keep

    my %top_size_mailbox;
    my $box_size          = 0;
    my $all_mailbox_size  = 0;
    my $all_mailbox_count = 0;
    my $empty_mailbox     = 0;

    find(
        {
            wanted => sub {
                # "next" cannot leave a sub, so skip entries with "return"
                return if m/^\.+$/;
                return if m/\.(dat|mdb|snapshot)$/;
                if ( -d and m/^\d+\@\w/ ) {
                    $all_mailbox_count++;
                    $box_size = 0;
                }
                elsif ( m/\.msg$/ ) {
                    my $msg_size = -s _;    # reuses the stat done by -d
                    $box_size += $msg_size < 4096 ? 4096 : $msg_size;
                }
            },
            postprocess => sub {
                if ( $File::Find::dir =~ m/(\d+)\@/ ) {
                    my $msisdn = $1;
                    $all_mailbox_size += $box_size;
                    if ( $box_size == 0 ) {
                        $empty_mailbox++;
                    }
                    else {
                        top_size_mailbox( $msisdn, $box_size );
                    }
                }
            },
        },
        $workdir,
    );

    # Keeps the largest mailboxes, keyed by size (equal sizes overwrite each other)
    sub top_size_mailbox {
        my ( $msisdn, $box_size ) = @_;
        if ( keys(%top_size_mailbox) < $num_top_size_box ) {
            $top_size_mailbox{$box_size} = $msisdn;
        }
        else {
            my $min = min( keys %top_size_mailbox );
            if ( $box_size > $min ) {
                delete $top_size_mailbox{$min};
                $top_size_mailbox{$box_size} = $msisdn;
            }
        }
    }

    __END__
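If you want to see how the wanted and postprocess callbacks fire before pointing this at real data, here is a small self-contained sketch that builds a throwaway tree and tallies it. The mailbox names, file names and sizes are made up for the demo, and it keys the totals by directory name instead of using a shared counter, so treat it as an illustration of the File::Find callbacks rather than a drop-in replacement for the script above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Find;
use File::Temp qw(tempdir);
use File::Path qw(make_path);

# Build a small throwaway tree (hypothetical layout: one directory per mailbox).
my $root    = tempdir( CLEANUP => 1 );
my %fixture = (
    '1001@example' => { 'a.msg' => 10, 'b.msg' => 5000, 'index.dat' => 99 },
    '1002@example' => {},
);
for my $box ( keys %fixture ) {
    make_path("$root/$box");
    for my $name ( keys %{ $fixture{$box} } ) {
        open my $fh, '>', "$root/$box/$name" or die "open: $!";
        print {$fh} 'x' x $fixture{$box}{$name};
        close $fh or die "close: $!";
    }
}

my %size_of;    # mailbox directory name => accumulated size in bytes

find(
    {
        # Called for every entry; $_ is the basename because find() chdirs.
        wanted => sub {
            return unless /\.msg\z/ && -f $_;
            my $size = -s _;                    # reuse the stat from -f
            $size = 4096 if $size < 4096;       # count at least 4096 bytes
            my ($box) = $File::Find::dir =~ m{([^/]+)\z};
            $size_of{$box} += $size;
        },
        # Called once per directory, after all of its entries were visited.
        postprocess => sub {
            my ($box) = $File::Find::dir =~ m{(\d+\@[^/]*)\z} or return;
            $size_of{$box} //= 0;               # record empty mailboxes too
        },
    },
    $root,
);

printf "%s %d\n", $_, $size_of{$_} for sort keys %size_of;
```

With the fixture above this should print "1001@example 9096" and "1002@example 0": the 10-byte message is counted as 4096, the 5000-byte one as-is, and index.dat is ignored.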

There are other similar modules like File::Find::Rule that you could also try.

