in reply to Re: Loading 283600 records (Updated)
in thread Loading 283600 records (WordNet)
Thanks for the reply, BrowserUK.
I tried it, and below is the result.
            s/iter 02_split1 04_unpack 03_split2 01_substr
02_split1     6.34        --      -34%      -41%      -57%
04_unpack     4.17       52%        --      -11%      -35%
03_split2     3.71       71%       12%        --      -27%
01_substr     2.70      134%       54%       37%        --

And here is the test code. I hope there are no silly mistakes.
#!/usr/bin/perl
use strict;
use warnings;
use Time::HiRes;
use Benchmark qw/cmpthese/;

my $href;

sub test1 {
    $href = {};
    open( my $fh, "<", "04.txt" ) or die $!;
    while (<$fh>) {
        chomp;
        push @{ $href->{ substr( $_, 0, 10 ) } },
            [ substr( $_, 10, 10 ), substr( $_, 20 ) ];
    }
    close $fh;
}

sub test2 {
    my @rec;
    $href = {};
    open( my $fh, "<", "04.txt" ) or die $!;
    push @{ $href->{ $rec[0] } }, [ @rec[ 1, 2 ] ]
        while @rec = split '(?<=-[a-z])', <$fh>;
    close $fh;
}

sub test3 {    # 04-1.txt, with delimiter '|'
    my @rec;
    $href = {};
    open( my $fh, "<", "04-1.txt" ) or die $!;
    push @{ $href->{ $rec[0] } }, [ @rec[ 1, 2 ] ]
        while @rec = split /\|/, <$fh>;
    close $fh;
}

sub test4 {    # with unpack
    my @rec;
    $href = {};
    open( my $fh, "<", "04.txt" ) or die $!;
    @rec = unpack( 'a10a10a4', $_ ),
        push @{ $href->{ $rec[0] } }, [ @rec[ 1, 2 ] ]
        while <$fh>;
    close $fh;
}

my %tests = (
    '01_substr' => \&test1,
    '02_split1' => \&test2,
    '03_split2' => \&test3,
    '04_unpack' => \&test4,
);

cmpthese( -20,    # for 20 CPU secs
    \%tests );
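As a sanity check on the fixed-width parsing (assuming, as the substr offsets above suggest, a 10-byte key, a 10-byte field, and then the remainder), substr and unpack should agree on a sample line. The record content here is made up for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical 24-byte record: 10-byte key, 10-byte field, 4-byte tail.
my $line = 'key0000001' . 'fieldA0001' . 'ex01';

my @by_substr = ( substr( $line, 0, 10 ), substr( $line, 10, 10 ), substr( $line, 20 ) );
my @by_unpack = unpack 'a10a10a4', $line;

print "match\n" if "@by_substr" eq "@by_unpack";    # prints "match"
```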
In a large loop, assigning values to a variable adds some cost (this is what BrowserUK taught me in this thread). So I think unpack and split would become faster if I could avoid using @rec. Is there a good way? Something like:

open( my $fh, "<", "24length_packed.data" ) or die $!;
local $/ = undef;
map { push @{ $hash{ $_->[0] } }, [ $_->[1], $_->[2] ] }
    unpack( '(a10a10a4)*', <$fh> );
close $fh;
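One thing to watch out for: unpack with a '(a10a10a4)*' template returns a single flat list of fields, so inside the map each $_ is a plain scalar, not an array reference. A minimal self-contained sketch of one way around that (using made-up sample data, not the real WordNet file) is to consume the flat list three fields at a time with splice:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical in-memory data standing in for "24length_packed.data":
# three fixed-width records of a10 a10 a4 (24 bytes each).
my $data = join '',
    'key0000001' . 'fieldA0001' . 'ex01',
    'key0000001' . 'fieldA0002' . 'ex02',
    'key0000002' . 'fieldB0001' . 'ex03';

my %hash;

# unpack returns one flat list of fields, so take them three at a time.
my @fields = unpack '(a10a10a4)*', $data;
while ( my ( $key, $f1, $f2 ) = splice @fields, 0, 3 ) {
    push @{ $hash{$key} }, [ $f1, $f2 ];
}

print scalar @{ $hash{'key0000001'} }, "\n";    # prints 2
```

This still builds one big temporary list, so whether it beats the per-line loop would need benchmarking on the real data.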
In Section
Seekers of Perl Wisdom