http://www.perlmonks.org?node_id=1051681


in reply to Searching for two elements in two different lines

#use warnings;

You shouldn't disable warnings.
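
As a rough illustration of what warnings buy you (a sketch with made-up data, not taken from your script), here is a record with fewer fields than expected. With warnings enabled, the use of the missing field is reported instead of passing silently. The `$SIG{__WARN__}` handler is only there so the message can be inspected; normally it goes straight to STDERR.

```perl
use strict;
use warnings;

# Hypothetical record that has fewer fields than the code expects.
my @fields = split ' ', 'alpha beta';

# Collect warnings in an array so they can be looked at afterwards.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# $fields[2] does not exist, so an undefined value is interpolated here.
my $line = "third field: $fields[2]";

# Each collected message is an "uninitialized value" warning.
print "caught: $_" for @warnings;
```

Without `use warnings` the same line runs silently and `$line` simply ends with an empty string.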



my $count = 0;

You never use this variable anywhere.



my ($el1,$el2,$output,$input);
my (@data,@file2_data,@array_2,@data_1,@array_4,@data_4,@array_5,@array_6,@data_7,@array_8);
my ($input_1,$input_2,$input_3,$input_4,$stage_1,$stage_2,$stage_3,$stage_4,$output_3,$final_join);
my ($marc_data,$marc_add,$V_output,$join_1,$join_2,$join_3,$join_4);
my ($data,$address,$stage_output,$element2,$marc_data_1,$marc_add_1,$join_for_2nd_stage);
my ($element_4,$file3_data,$file3_add,$dataout,$marc_array,$file3_data_1,$file3_add_1,$dataout_1,$marc_array_1,$join_1st_stage_file2_file3,$nothing,$nothing_1,$nothing_2,$nothing_3,$dataout_final,$marc_array_final);

Most of these variables are not needed, or at least not here at file scope.
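
A minimal sketch of what I mean by declaring variables where they are needed. The data and names below are invented for illustration; the point is only that `$input` and `$output` live inside the loop body rather than at file scope.

```perl
use strict;
use warnings;

# Hypothetical input lines standing in for the contents of a file.
my @lines = ( "in1 out1\n", "\n", "in2 out2\n" );

my @pairs;
for my $line (@lines) {
    next unless $line =~ /\S/;    # skip blank lines

    # These two variables exist only for this iteration of the loop.
    my ( $input, $output ) = split ' ', $line;
    push @pairs, join '=', $input, $output;
}

print "$_\n" for @pairs;
```

Only `@pairs` has to outlive the loop, so it is the only variable declared outside it.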



open my $fh2, '<', $file2 or die "Can't open $file2: $!"; #XX.file2 file
open my $fh3, '<', $file3 or die "Can't open $file3: $!"; #xx.file3 file
...
my @array_1 = <$fh2>;
...
my @array_4 = <$fh3>;
...
open my $fh5, '<', $file2 or die "Can't open $file2: $!"; #XX.file2 file
my @array_3 = <$fh5>;
...
open my $fh6, '<', $file3 or die "Can't open $file2: $!"; #XX.file2 file
my @array_7 = <$fh6>;

You already have the contents of $file2 stored in @array_1 and the contents of $file3 stored in @array_4, so there is no reason to reopen those files and store them again in @array_3 and @array_7. Also, the error message for $fh6 says $file2 when it should say $file3.
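
A small sketch of reading once and reusing the array. To keep it self-contained I use an in-memory filehandle with invented contents as a stand-in for $file2; the two `grep` passes are placeholders for whatever your two loops do.

```perl
use strict;
use warnings;

# Hypothetical stand-in for $file2: an in-memory file with three lines.
my $content = "a 1\nb 2\nc 3\n";
open my $fh, '<', \$content or die "Can't open in-memory file: $!";
my @lines = <$fh>;    # read the whole file exactly once
close $fh;

# First pass over the stored lines.
my @first = grep { /^a/ } @lines;

# Second pass reuses the same array; no second open or read is needed.
my @second = grep { /3$/ } @lines;

print "first: @first", "second: @second";
```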



$join_1 = join (" ", ("$input_1","$V_output","$marc_data","$marc_add","$stage_1"));
...
$join_2 = join (" ", ("$input_2","$V_output","$marc_data","$marc_add","$stage_2"));
...
$join_3 = join (" ", ("$input_3","$V_output","$marc_data","$marc_add","$stage_3"));
...
$join_4 = join (" ", ("$input_4","$V_output","$marc_data","$marc_add","$stage_4"));
...
$join_1st_stage_file2_file3 = join (" ", ("$input_2","$output_2","$data","$address","$stage_output_1","$file3_data","$file3_add","$dataout","$marc_array"));
...
$join_for_2nd_stage = join(" ",("$input_2","$output_2","$data","$address","$stage_output_1","$file3_data","$file3_add","$dataout","$marc_array","$nothing","$nothing_1"));
...
$join_for_2nd_stage = join(" ",("$input_2","$output_2","$data","$address","$stage_output_1","$file3_data","$file3_add","$dataout","$marc_array","$nothing","$nothing_1"));
...
$final_join = join (" ",("$input_2","$output_2","$data","$address","$stage_output_1","$file3_data","$file3_add","$dataout","$marc_array","$nothing","$nothing_1","$dataout_final","$marc_array_final"));
...
$final_join = join (" ",("$input_2","$output_2","$data","$address","$stage_output_1","$file3_data","$file3_add","$dataout","$marc_array","$nothing","$nothing_1","$dataout_final","$marc_array_final"));

You shouldn't quote variables when you don't need to. Writing "$var" builds a new string copy of each variable, and join works on the variables directly, so the quotes only add work.
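
A short sketch of the difference, using invented values. For plain strings the quoted and unquoted forms give the same result, so the quotes are just overhead. The second half shows a case where quoting actually changes the value: a reference gets turned into a plain string.

```perl
use strict;
use warnings;

# Hypothetical values; the names only echo the ones in your script.
my ( $input, $stage ) = ( 'in1', 'stage_1' );

# Same result either way; the quoted form makes needless copies first.
my $quoted   = join ' ', "$input", "$stage";
my $unquoted = join ' ', $input, $stage;

# Quoting is not always harmless: a quoted reference becomes a string.
my $fields = [ 'a', 'b' ];
my $copy   = "$fields";            # a string that looks like ARRAY(0x...)
my $is_ref = ref $copy ? 1 : 0;    # 0, because $copy is no longer a reference

print "$unquoted\n";
```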



just hope that you guys can give a good point on where to optimise it.

Here is how I would write it:

#!/usr/bin/perl
use strict;
use warnings;

use Getopt::Long;

my $output_file = 'OUTPUT_FILE';    ## output file name

GetOptions( 'file1=s' => \my $file1, 'file2=s' => \my $file2, 'file3=s' => \my $file3 );

open my $fh1, '<', $file1 or die "Can't open $file1: $!";    # file1 list by user output list
open my $fh2, '<', $file2 or die "Can't open $file2: $!";    # XX.file2 file

# can make this as an option 1st...
my @array_1 = <$fh2>;

my @array_2;
while ( <$fh1> ) {    ## my config file
    next unless /\S/;
    my ( $input, $output ) = split;
    for ( @array_1 ) {
        next unless /\S/;
        next unless /\Q$output/ && /\Q$input/;
        # print out all the matching output lines
        my @V_output = ( split )[ 7, 2, 3 ];
        my %stages   = ( split )[ 11, 13, 15, 17, 19, 21, 23, 25 ];
        for my $key ( grep /\Q$input/, keys %stages ) {
            push @array_2, "$key @V_output $stages{$key}";
            last;
        }
    }
}

open my $fh3, '<', $file3 or die "Can't open $file3: $!";    # xx.file3 file
my @array_3 = <$fh3>;

my @array_4;
for ( @array_2 ) {
    my @data_1 = split;
    for ( @array_3 ) {
        my @data_2 = ( split )[ 3, 4, 7, 9 ];
        if ( $data_2[ 0 ] == $data_1[ 2 ] && $data_2[ 0 ] == $data_1[ 3 ] ) {
            push @array_4, "@data_1 @data_2";
        }
    }
}

my @array_5;
for ( @array_4 ) {
    my @data = split;
    for ( @array_1 ) {
        my ( $marc_data, $marc_add, $output, $input, $stage ) = ( split )[ 2, 3, 7, 11, 13 ];
        if ( $output =~ /\Q$data[1]/ ) {
            if ( $input =~ /\Q$data[0]/ && $stage =~ /\Q$data[1]/ ) {
                push @array_5, "@data N/A N/A";
            }
            elsif ( $input =~ /\Q$data[4]/ && $stage =~ /\Q$data[1]/ ) {
                push @array_5, "@data $marc_data $marc_add";
            }
        }
    }
}

my @array_6;
for ( @array_5 ) {
    my @data = split;
    for ( @array_3 ) {
        my ( $file3_data, $file3_add, $dataout, $marc_array ) = ( split )[ 3, 4, 7, 9 ];
        if ( 'N/A' =~ /$data[9]/ && 'N/A' =~ /$data[10]/ ) {
            push @array_6, "@data N/A N/A";
            last;
        }
        elsif ( $file3_data =~ /$data[9]/ && $file3_add =~ /$data[10]/ ) {
            push @array_6, "@data $dataout $marc_array";
            last;
        }
    }
}

open my $fh4, '>', $output_file or die "Can't open $output_file: $!";    #output file

print $fh4 <<'HEADER';
STAGE_N STAGE_N+1
<<<<<<INPUT>>>>>>>>. <<<<<OUTPUt>>>>>>> <<<<MARC>>><<<<<<DATAOUT>>>>>>>>>>>>>>><<<<<<<<<<<<<<MARC_ARRAY>>>>>>>>>>>><<<<<MARC>>>>><<<<<DATAOUT>>>>>>>>>>>>>>>><<<<<<<<MARC_ARRAY>>>>>>>>>>>>
DATA ADD DATA ADD
HEADER

printf $fh4 "%-25s %12s %4s %6s %-30s %30s %2s %4s %20s %20s\n", ( split )[ 0 .. 3, 7 .. 12 ] for @array_6;
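
In case the `my %stages = ( split )[ ... ]` line looks odd: a list slice assigned to a hash is read as key/value pairs. Here is a small sketch with an invented line, assuming (as I did for your data) that each name is immediately followed by its stage value. The field positions are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical line: two leading fields, then name/value pairs.
my $line = "x0 x1 in_a stage_a in_b stage_b";

# Fields 2..5 become the pairs in_a => stage_a and in_b => stage_b.
my %stages = ( split ' ', $line )[ 2, 3, 4, 5 ];

# Sort the keys only to get a stable order for printing.
my @keys = sort keys %stages;
print "$_ => $stages{$_}\n" for @keys;
```

If the real file does not alternate names and values at those positions, the slice indices have to be adjusted to match.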

Re^2: Searching for two elements in two different lines
by noob_mas (Novice) on Sep 03, 2013 at 03:13 UTC

    I am more than happy :-) with the feedback you guys have given, and I will surely apply all the tips to optimize the script. I learned many dos and don'ts here. Thank you for spending time to identify and correct my mistakes. FYI: this is my 1st time (definitely not the last time) writing in this forum, so if there are any mistakes in my response please let me know. THANK YOU GUYS.