Beefy Boxes and Bandwidth Generously Provided by pair Networks
The stupid question is the question not asked

Re: Speed up perl code

by bulk88 (Priest)
on Sep 21, 2012 at 08:54 UTC ( #994854=note: print w/replies, xml ) Need Help??

in reply to Speed up perl code

Use B::Concise. Look at how many opcodes each line takes. For example,
if(decode('cp1252',$oWkC->{Val}) eq "Confi +gurator mapping") { for (my $iR = $oWkS->{MinRow} +1; defi +ned $oWkS->{MaxRow} && $iR <= $oWkS->{MaxRow} ; $iR++)
MinRow and MaxRow hash slices should be copied to lexical my scalars. my scalars are faster than a hash reference, and you have a loop there.
if(defined $market_cel +l_name and defined $market_cell and length(decode('cp1252',$market_ce +ll->{Val})) gt 0 and length(decode('cp1252',$market_cell_name->{Val}) +) gt 0) { if(decode('cp1252' +,$market_cell->{Val}) eq '0') { $ignored_marke +t_string = $ignored_market_string . decode('cp1252',$market_cell_name +->{Val}) . ","; } else { $market_string + = $market_string . decode('cp1252',$market_cell_name->{Val}) . ","; } } }
You wrote "decode('cp1252',$market_cell_name->{Val})" three times (more memory for opcodes) and ran it twice. Why are you decoding just for '0'? '0' is the same in 1252 as in Latin 1. Dont do a decode just to check for '0', or do the decode ONCE and cache the decoded result. IDK if "eq ''" or your "length() > 0" is faster. Use NYTProf and find out. I've got a question, was this written by hand or is the output of a script generating tool?
my $market_Cmin = -1; my $market_Cmax = -1; my $title_row = -1; for (my $iC = $oWkS->{MinCol}; defined $oWkS->{Max +Col} && $iC <= $oWkS->{MaxCol} ; $iC++)
IDK the API of the modules you are using, but is "$oWkS->{MaxCol}" really going to stop being defined at some sudden point in the loop? undef converts to 0 in perl.
} elsif(decode('cp1252',$oWkC->{Val}) eq "Market"){ foreach my $area ( @{ $oWkS->{MergedAr +ea} } ) { if($area->[1] eq $iC and $area->[0 +] eq $oWkS->{MinRow}){ $market_Cmax = $area->[3]; } } $market_Cmin = $iC; } elsif(decode('cp1252',$oWkC->{Val}) eq " +Customer"){ foreach my $area ( @{ $oWkS->{MergedAr +ea} } ) { if($area->[1] eq $iC and $area->[0 +] eq $oWkS->{MinRow}){ $customer_Cmax = $area->[3]; } } $customer_Cmin = $iC; } elsif(decode('cp1252',$oWkC->{Val}) eq " +Brand"){ foreach my $area ( @{ $oWkS->{MergedAr +ea} } ) { if($area->[1] eq $iC and $area->[0 +] eq $oWkS->{MinRow}){ $brand_Cmax = $area->[3]; } } $brand_Cmin = $iC; }
You need to make functions and not call decode('cp1252',$oWkC->{Val}) a dozen times in a row. Also, if you know you will find $*****_Cmax only once (currently the *last* Cmax that matches the condition is selected, IDK if you require this), add last to all your foreach loops so they exit after the first match rather than continuing to iterate through all the remaining data.
for(my $iR = $title_row +1; defined $oWkS->{MaxRow} && $iR <= +$oWkS->{MaxRow} ; $iR++){ $internal_cell = $oWkS->{Cells}[$iR][$map_ +Cmin]; $active_cell = $oWkS->{Cells}[$iR][$active +_col]; $family_cell = $oWkS->{Cells}[$iR][$active +_col];
Copy the $oWkS->{Cells}$iR array reference to a my variable, then do map/active/family on that 1 my variable, not do 2 extra dereferences every line. You can also do ($internal_cell, $active_cell, $family_cell) = $oWkS->{Cells}[$iR][$map_Cmin, $active_col, $active_col]; and save more overhead/opcodes.
$oWkC = $oWkS->{Cells}[$oWkS->{MinRow}][$iC]; if(defined $oWkC) { if(decode('cp1252',$oWkC->{Val}) eq "Confi +gurator mapping") {
Combine the 2 ifs with &&, you saved 1-3 opcodes. The "after" look is visually messy but
my $map_Cmin = -1; # the first is internal my $map_Cmax = -1; my $artno_col = -1; my $active_col = -1; my $colorzones_col = -1; my $family_col = -1; my $tec_desc_col = -1; my $brand_Cmin = -1; my $brand_Cmax = -1; my $customer_Cmin = -1; my $customer_Cmax = -1; my $market_Cmin = -1; my $market_Cmax = -1; my $title_row = -1;
this can be improved in speed at expense of visual clarity. Put all that in one ";" terminated statement. Short example,  my ($map_Cmax,$artno_col, $active_col)  = (-1, -1, -1);. Use whitespace and tabs to lay out the variable names in columns to try and organize that one long my statement.

Someone in the past brought up your "string" gt 0 lines. IDK if they are right. For example,
#set Brands if($brand_Cmin gt 0 and $brand_Cma +x gt 0) { my $brand_string = ""; my $ignored_brand_string = ""; for(my $brandC = $brand_Cmin; +$brandC <= $brand_Cmax ; $brandC++)
$brand_Cmax has to be a number, not a string for this to work.
#set Brands if($brand_Cmin gt 0 and $brand_Cma +x gt 0) { my $brand_string = ""; my $ignored_brand_string = ""; for(my $brandC = $brand_Cmin; +$brandC <= $brand_Cmax ; $brandC++) { $brand_cell_name = $oWkS-> +{Cells}[$title_row][$brandC]; if(defined $brand_cell_nam +e and decode('cp1252',$brand_cell_name->{Val}) ne "Epta std") { $brand_cell = $oWkS->{ +Cells}[$iR][$brandC]; if(defined $brand_cell + and length(decode('cp1252',$brand_cell->{Val})) gt 0) { if(decode('cp1252' +,$brand_cell->{Val}) eq '0') { $ignored_brand +_string = $ignored_brand_string . decode('cp1252',$brand_cell_name->{ +Val}) . ","; } else { $brand_string += $brand_string . decode('cp1252',$brand_cell_name->{Val}) . ","; } } } }
considered adding a "last" to this numeric for loop?
if(length($brand_string) gt 0){ $materialmapping_item2->se +tAttribute('brand',$brand_string); } if(length($ignored_brand_strin +g) gt 0){ $materialmapping_item2->se +tAttribute('ignore_brand',$ignored_brand_string); }
I don't remember if a "something() if $condition;" is faster/less ops in Concise than new block opening ifs. I would switch then to no block ifs.

Edit: fixed module name

Replies are listed 'Best First'.
Re^2: Speed up perl code
by frozenwithjoy (Priest) on Sep 21, 2012 at 19:30 UTC
Re^2: Speed up perl code
by cibien (Novice) on Oct 09, 2012 at 15:09 UTC

    The script is writed by hand. thank you for your very esauxtive answer. I have used your adwice, but I understand that it speding a lot of time here:

     while( ($filename = readdir(DIR))){

    the module Spreadsheet::ParseExcel spend a lot of time to open the excel file... Is fast to open light .xls files, and very slow to open big (more than 4mb) .xls files I think is nothing to do. anyway thankyou very much

      Yes, it is spending much time in that loop, because 90% of your code is there.

      You need to divide your code into subroutines, preferably small ones, to get a reasonable analysis from a profiler.

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://994854]
and a soft breeze sighs...

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (8)
As of 2017-02-24 18:10 GMT
Find Nodes?
    Voting Booth?
    Before electricity was invented, what was the Electric Eel called?

    Results (360 votes). Check out past polls.