Re: A very odd happening (at least. . . to me)

by maverick (Curate)
on Jun 24, 2002 at 16:09 UTC


in reply to A very odd happening (at least. . . to me)
in thread Processing large files many times over

From a quick glance at the code, one of the first questions that comes to mind is "how many files are in these directories?" I suspect part of your slowness comes from reading both the entire list of files and the entire contents of each file into memory. If you alter your reading structure like so:
    # opendir (not open) is required before readdir
    opendir(DIR, "$base_dir\\$dir") or die "$dir failed to open: $!";
    while (defined(my $file = readdir(DIR))) {
        next unless $file =~ /\.txt$/;
        # etc, etc.
        open(IN, $full_name) || die "can't open $full_name: $!";
        while (my $line = <IN>) {
            # processing
        }
        close(IN);
    }
    closedir(DIR);
you won't have the overhead of all that memory allocation. In your second example there's a system call to a secondary Perl script. That's going to be time-consuming too. Consider making the second Perl program a subroutine... that will avoid a fork, exec, and compile for every file you have.
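
To make the subroutine idea concrete, here's a rough sketch. I don't know what your second script actually does, so process_file below is a made-up stand-in that just counts non-blank lines, and process_dir and the temp directory are likewise only for illustration. The point is that the per-file work becomes a plain function call inside the readdir loop rather than a system() call.

```perl
use strict;
use warnings;
use File::Spec;
use File::Temp qw(tempdir);

# Hypothetical stand-in for the work the secondary script did per file:
# here it just counts non-blank lines, reading one line at a time.
sub process_file {
    my ($path) = @_;
    open(my $in, '<', $path) or die "can't open $path: $!";
    my $count = 0;
    while (my $line = <$in>) {
        $count++ if $line =~ /\S/;
    }
    close($in);
    return $count;
}

# Read one directory entry at a time and call the subroutine directly,
# instead of launching a separate perl process for every file.
sub process_dir {
    my ($dir) = @_;
    my %result;
    opendir(my $dh, $dir) or die "$dir failed to open: $!";
    while (defined(my $file = readdir($dh))) {
        next unless $file =~ /\.txt$/;
        $result{$file} = process_file(File::Spec->catfile($dir, $file));
    }
    closedir($dh);
    return \%result;
}

# Small demonstration on a throwaway directory (assumed data, not yours).
my $dir = tempdir(CLEANUP => 1);
for my $name ('a.txt', 'b.txt', 'skip.log') {
    open(my $out, '>', File::Spec->catfile($dir, $name))
        or die "can't write $name: $!";
    print {$out} "one\n\ntwo\n";
    close($out);
}
my $counts = process_dir($dir);
print "$_: $counts->{$_}\n" for sort keys %$counts;
```

If the second program has to stay a standalone script as well, one option is to move its guts into a module and have both scripts call the same subroutine.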

HTH

/\/\averick
OmG! They killed tilly! You *bleep*!!
