I've been messing around with the Benchmark module, experimenting to find the quickest approach to reading a file. I'm currently benchmarking the following code against a 2.5 MB wordlist:
use Benchmark qw/countit cmpthese/;

sub run($) { countit(1, @_) }

cmpthese {
    read_proc => run q{
        open(WORDS, "words.txt") or die("Wordlist unavailable.\n");
        my @words = <WORDS>;
        close(WORDS);
        foreach $word (@words) {
            chomp $word;
            if ($word =~ m/[aeiouyAEIOUY]{4,}/) {
                push(@hitwords, $word);
                $hitcounter++;
            }
            $counter++;
        }
    },
    for_proc => run q{
        open(WORDS, "words.txt") or die("Wordlist unavailable.\n");
        foreach $word (<WORDS>) {
            chomp $word;
            if ($word =~ m/[aeiouyAEIOUY]{4,}/) {
                push(@hitwords, $word);
                $hitcounter++;
            }
            $counter++;
        }
        close(WORDS);
    },
    while_proc => run q{
        open(WORDS, "words.txt") or die("Wordlist unavailable.\n");
        while ($word = <WORDS>) {
            chomp $word;
            if ($word =~ m/[aeiouyAEIOUY]{4,}/) {
                push(@hitwords, $word);
                $hitcounter++;
            }
            $counter++;
        }
        close(WORDS);
    }
};
I would expect while_proc to be the fastest by far, and the entire chunk of code should take only a few minutes to run. However, when I run it, it takes hours and then gives me this as a result:
             s/iter  for_proc read_proc while_proc
for_proc       5983        --       -6%      -100%
read_proc      5645        6%        --      -100%
while_proc     2.42   246713%   232775%         --
Now I know that can't be right. Each block of code runs fine on its own, and while_proc IS the fastest version of the code. But the Benchmark results don't reflect that at all. What am I doing wrong? I know I need to run more iterations for a reliable benchmark, but I can't really do that if a single run takes half the day.
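For reference, here is the conventional pattern from the Benchmark docs that I was trying to adapt: `cmpthese` is given coderefs plus an iteration count and drives the loop itself. This is only a sketch — it writes a tiny hypothetical wordlist to a temp file so it's self-contained, and it uses lexical variables instead of my global `@hitwords`/`$hitcounter`, so state doesn't carry over between repetitions:

```perl
use strict;
use warnings;
use Benchmark qw/cmpthese/;

# Hypothetical five-word list written to a temp file so this sketch is
# self-contained; substitute the real 2.5 MB words.txt.
my $file = 'bench_words.txt';
open my $out, '>', $file or die "Cannot write $file: $!\n";
print {$out} "$_\n" for qw/queueing aerial cat dog onomatopoeia/;
close $out;

# Conventional cmpthese usage: a count plus a hashref of coderefs.
# A negative count (-1) means "run each sub for at least 1 CPU second".
cmpthese(-1, {
    while_proc => sub {
        open my $fh, '<', $file or die "Wordlist unavailable.\n";
        my ($hits, $count) = (0, 0);
        while (my $word = <$fh>) {
            chomp $word;
            $hits++ if $word =~ /[aeiouyAEIOUY]{4,}/;
            $count++;
        }
        close $fh;
    },
});
unlink $file;
```

Run like this it finishes in a couple of seconds on the toy list, which is why the hours-long run above surprised me.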