<?xml version="1.0" encoding="windows-1252"?>
<node id="994854" title="Re: Speed up perl code" created="2012-09-21 04:54:16" updated="2012-09-21 04:54:16">
<type id="11">
note</type>
<author id="857302">
bulk88</author>
<data>
<field name="doctext">
Use [mod://B::Concise]. Look at how many opcodes each line takes. For example,
&lt;code&gt;
                            if(decode('cp1252',$oWkC-&gt;{Val}) eq "Confi
+gurator mapping")
                            {
                                for (my $iR = $oWkS-&gt;{MinRow} +1; defi
+ned $oWkS-&gt;{MaxRow} &amp;&amp; $iR &lt;= $oWkS-&gt;{MaxRow} ; $iR++)

&lt;/code&gt;
MinRow and MaxRow hash slices should be copied to lexical my scalars. my scalars are faster than a hash reference, and you have a loop there.
&lt;code&gt;
                                                if(defined $market_cel
+l_name and defined $market_cell and length(decode('cp1252',$market_ce
+ll-&gt;{Val})) gt 0 and length(decode('cp1252',$market_cell_name-&gt;{Val})
+) gt 0)
                                                {
                                                    if(decode('cp1252'
+,$market_cell-&gt;{Val}) eq '0')
                                                    {
                                                        $ignored_marke
+t_string = $ignored_market_string . decode('cp1252',$market_cell_name
+-&gt;{Val}) . ",";
                                                    } else {    
                                                        $market_string
+ = $market_string . decode('cp1252',$market_cell_name-&gt;{Val}) . ",";
                                                    }
                                                }
                                        }
&lt;/code&gt;
You wrote "decode('cp1252',$market_cell_name-&gt;{Val})" three times (more memory for opcodes) and ran it twice. Why are you decoding just for '0'? '0' is the same in 1252 as in Latin 1. Dont do a decode just to check for '0', or do the decode ONCE and cache the decoded result. IDK if "eq ''" or your "length() &gt; 0" is faster. Use NYTProf and find out. I've got a question, was this written by hand or is the output of a script generating tool?
&lt;code&gt;
                    my $market_Cmin = -1;
                    my $market_Cmax = -1;
                    my $title_row = -1;
                    
                    for (my $iC = $oWkS-&gt;{MinCol}; defined $oWkS-&gt;{Max
+Col} &amp;&amp; $iC &lt;= $oWkS-&gt;{MaxCol} ; $iC++)
&lt;/code&gt;
IDK the API of the modules you are using, but is "$oWkS-&gt;{MaxCol}" really going to stop being defined at some sudden point in the loop? undef converts to 0 in perl.
&lt;code&gt;
		} elsif(decode('cp1252',$oWkC-&gt;{Val}) eq "Market"){
								foreach my $area ( @{ $oWkS-&gt;{MergedArea} } ) {
									if($area-&gt;[1] eq $iC and $area-&gt;[0] eq $oWkS-&gt;{MinRow}){
										$market_Cmax = $area-&gt;[3];
									}
								}
								$market_Cmin = $iC;
							} elsif(decode('cp1252',$oWkC-&gt;{Val}) eq "Customer"){
								foreach my $area ( @{ $oWkS-&gt;{MergedArea} } ) {
									if($area-&gt;[1] eq $iC and $area-&gt;[0] eq $oWkS-&gt;{MinRow}){
										$customer_Cmax = $area-&gt;[3];
									}
								}
								$customer_Cmin = $iC;
							} elsif(decode('cp1252',$oWkC-&gt;{Val}) eq "Brand"){
								foreach my $area ( @{ $oWkS-&gt;{MergedArea} } ) {
									if($area-&gt;[1] eq $iC and $area-&gt;[0] eq $oWkS-&gt;{MinRow}){
										$brand_Cmax = $area-&gt;[3];
									}
								}
								$brand_Cmin = $iC;
							}
&lt;/code&gt;
You need to make functions and not call decode('cp1252',$oWkC-&gt;{Val}) a dozen times in a row. Also, if you know you will find $*****_Cmax only once (currently the *last* Cmax that matches the condition is selected, IDK if you require this), add [doc://perlfunc#last] to all your foreach loops so they exit after the first match rather than continuing to iterate through all the remaining data.
&lt;code&gt;
		for(my $iR = $title_row +1; defined $oWkS-&gt;{MaxRow} &amp;&amp; $iR &lt;= $oWkS-&gt;{MaxRow} ; $iR++){
							$internal_cell = $oWkS-&gt;{Cells}[$iR][$map_Cmin];
							$active_cell = $oWkS-&gt;{Cells}[$iR][$active_col];
							$family_cell = $oWkS-&gt;{Cells}[$iR][$active_col];
&lt;/code&gt;
Copy the $oWkS-&gt;{Cells}[$iR] array reference to a my variable, then do map/active/family on that 1 my variable, not do 2 extra dereferences every line. You can also do &lt;code&gt;($internal_cell, $active_cell, $family_cell) = $oWkS-&gt;{Cells}[$iR][$map_Cmin, $active_col, $active_col];&lt;/code&gt; and save more overhead/opcodes.
&lt;code&gt;
		$oWkC = $oWkS-&gt;{Cells}[$oWkS-&gt;{MinRow}][$iC];
						if(defined $oWkC)
						{
							if(decode('cp1252',$oWkC-&gt;{Val}) eq "Configurator mapping")
							{
&lt;/code&gt;
Combine the 2 ifs with &amp;&amp;, you saved 1-3 opcodes.

The "after" look is visually messy but
&lt;code&gt;
	my $map_Cmin = -1; # the first is internal
					my $map_Cmax = -1;
					my $artno_col = -1;
					my $active_col = -1;
					my $colorzones_col = -1;
					my $family_col = -1;
					my $tec_desc_col = -1;
					my $brand_Cmin = -1;
					my $brand_Cmax = -1;
					my $customer_Cmin = -1;
					my $customer_Cmax = -1;
					my $market_Cmin = -1;
					my $market_Cmax = -1;
					my $title_row = -1;
&lt;/code&gt;
this can be improved in speed at expense of visual clarity. Put all that in one ";" terminated statement. Short example, &lt;code&gt; my ($map_Cmax,$artno_col, $active_col)  = (-1, -1, -1);&lt;/code&gt;. Use whitespace and tabs to lay out the variable names in columns to try and organize that one long my statement.
&lt;br&gt;&lt;br&gt;
Someone in [id://983556|the past] brought up your &lt;code&gt;"string" gt 0&lt;/code&gt; lines. IDK if they are right. For example,
&lt;code&gt;
		#set Brands
									if($brand_Cmin gt 0 and $brand_Cmax gt 0)
									{
										my $brand_string = "";
										my $ignored_brand_string = "";
										for(my $brandC = $brand_Cmin; $brandC &lt;= $brand_Cmax ; $brandC++)
&lt;/code&gt;
$brand_Cmax has to be a number, not a string for this to work.
&lt;code&gt;
		#set Brands
									if($brand_Cmin gt 0 and $brand_Cmax gt 0)
									{
										my $brand_string = "";
										my $ignored_brand_string = "";
										for(my $brandC = $brand_Cmin; $brandC &lt;= $brand_Cmax ; $brandC++)
										{
											$brand_cell_name = $oWkS-&gt;{Cells}[$title_row][$brandC];
											if(defined $brand_cell_name and decode('cp1252',$brand_cell_name-&gt;{Val}) ne "Epta std")
											{
												$brand_cell = $oWkS-&gt;{Cells}[$iR][$brandC];
												if(defined $brand_cell and length(decode('cp1252',$brand_cell-&gt;{Val})) gt 0)
												{
													if(decode('cp1252',$brand_cell-&gt;{Val}) eq '0')
													{
														$ignored_brand_string = $ignored_brand_string . decode('cp1252',$brand_cell_name-&gt;{Val}) . ",";
													} else {
														$brand_string = $brand_string . decode('cp1252',$brand_cell_name-&gt;{Val}) . ",";
													}
												}
											}
										}
&lt;/code&gt;
considered adding a "last" to this numeric for loop?

&lt;code&gt;
			if(length($brand_string) gt 0){
											$materialmapping_item2-&gt;setAttribute('brand',$brand_string);
										}
										if(length($ignored_brand_string) gt 0){
											$materialmapping_item2-&gt;setAttribute('ignore_brand',$ignored_brand_string);
										}
&lt;/code&gt;
I don't remember if a "something() if $condition;" is faster/less ops in Concise than new block opening ifs. I would switch then to no block ifs.
&lt;br&gt;&lt;br&gt;
Edit: fixed module name</field>
<field name="root_node">
994817</field>
<field name="parent_node">
994817</field>
</data>
</node>
