Re: Unix to Perl
by CountZero (Bishop) on Mar 26, 2010 at 07:19 UTC
|
use strict;
use warnings;
my $number;
while (my $line = <DATA>) {
    my @data = split ';', $line;
    if ($data[0] =~ m/^\d+$/) {
        $number = $data[0];
    }
    else {
        unshift @data, $number;
    }
    {
        local $, = ';';
        print @data;
    }
}
__DATA__
100;ABC
CDE
FGH
101;IJK
LMN
OPQ
Output:
100;ABC
100;CDE
100;FGH
101;IJK
101;LMN
101;OPQ
CountZero
"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
| [reply] [d/l] [select] |
|
Hi Count,
Really sorry for misusing the words.
When I said Unix, I actually meant shell scripting (using AWK, a while loop, etc.). Please find my shell code below. It takes approximately 15 minutes to process a file with 25k lines.
cat inputfile.txt | while read line
do
export LINE_DATA=$line
export No_field=`echo $line | awk -F ";" '{print NF}'`
if [ $No_field -eq 2 ]
then
export VAR_NAME=`echo $line | awk -F ";" '{print $1}'`
echo $line >> completed.txt
else if [ $No_field -eq 0 ]
then echo $line >> completed.txt
else
echo $VAR_NAME";"$line >> completed.txt
fi
fi
done
echo "Processing Completed.Please validate the output file completed.txt"
| [reply] [d/l] [select] |
|
You've already got some good tips for using Perl. However, your real problem is that you're not using awk and shell scripting correctly. Your script takes a long time because you're running about one hundred thousand programs to dig through your data (see note 1). The first improvement I'd suggest: since you're using awk anyway, just stay in awk to do the whole job. Your outer loop of cat / while read line can simply be eliminated, since awk already reads its input line by line and runs your script on each line. You can prefix a block in awk with a condition that must be true for the block to execute, so the code:
cat inputfile.txt | while read line
do
export LINE_DATA=$line
export No_field=`echo $line | awk -F ";" '{print NF}'`
if [ $No_field -eq 2 ]
then
echo $line
fi
done
echo "Processing Completed.Please validate the output file completed.txt"
could be replaced by
awk -F';' 'NF==2 { print }' inputfile.txt
Notes:
1: You're running echo twice per line, and awk nearly twice per line.
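Carrying the same idea through to the whole job, a minimal sketch (not from the original post) might look like the following. It assumes the rules of the original shell script: a line with two ';'-separated fields sets the current number, an empty line is passed through, and any other line gets the last number prepended. The file names sample_input.txt and sample_completed.txt and the variable name prefix are placeholders; substitute inputfile.txt and completed.txt for real use.

```shell
# Sample data copied from the thread; the file names are placeholders.
cat > sample_input.txt <<'EOF'
100;ABC
CDE
FGH
101;IJK
LMN
OPQ
EOF

# One awk process handles every line; no per-line echo or awk calls.
awk -F';' '
    NF == 2 { prefix = $1; print; next }   # "number;text": remember the number
    NF == 0 { print; next }                # empty line: pass through unchanged
            { print prefix ";" $0 }        # anything else: prepend the last number
' sample_input.txt > sample_completed.txt

cat sample_completed.txt
```

Because only a single awk process is started, the run time should no longer depend on spawning several programs per input line.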
...roboticus
| [reply] [d/l] [select] |
|
AWK is a programming language, a primordial Perl. Using it just to split off one field is a real waste; you could write the whole thing in AWK or in the shell. I am not surprised you have performance issues when you call AWK twice in each pass through the shell loop!
I do not understand why you are using cat(1). The shell syntax for reading lines in a loop and splitting on ';' is:
while IFS=';' read No_field TheRest
do
echo "No_field: $No_field"
done < inputfile.txt
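Extending that loop to the whole task, a rough sketch in plain shell (an illustration, not the poster's code) could look like this. It assumes each line contains at most one ';', and the names sample_input.txt, sample_completed.txt and prefix are placeholders. It uses only shell built-ins inside the loop, so no external program is started per line.

```shell
# Sample data copied from the thread; the file names are placeholders.
cat > sample_input.txt <<'EOF'
100;ABC
CDE
FGH
101;IJK
LMN
OPQ
EOF

prefix=""
while IFS= read -r line
do
    case $line in
        *\;*) prefix=${line%%;*}                  # "number;text": remember the number
              printf '%s\n' "$line" ;;
        "")   printf '\n' ;;                      # empty line: pass through unchanged
        *)    printf '%s;%s\n' "$prefix" "$line" ;;  # prepend the last number
    esac
done < sample_input.txt > sample_completed.txt

cat sample_completed.txt
```

Redirecting the whole loop once, rather than appending with >> on every line, also avoids reopening the output file for each record.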
You also seem to be exporting variables for no good reason. Sorry to be brutal, but a badly written Perl script won't necessarily work any better than a badly written shell script.
| [reply] [d/l] |
|
I'm not sure why your shell script runs so slowly. This and the other solutions I've seen in this thread will run in well under one second on 25K lines.
#!/usr/bin/perl -w
use strict;
my $curr_prefix = "";
my $letters = "";
while (<DATA>)
{
    chomp;
    my ($tok1, $tok2) = split(/;/, $_);
    if (defined $tok2)    # two fields: remember the new prefix
    {
        $curr_prefix = $tok1;
        $letters = $tok2;
    }
    else                  # one field: reuse the last prefix
    {
        $letters = $tok1;
    }
    print "$curr_prefix;$letters\n";
}
=prints
100;ABC
100;CDE
100;FGH
101;IJK
101;LMN
101;OPQ
=cut
__DATA__
100;ABC
CDE
FGH
101;IJK
LMN
OPQ
| [reply] [d/l] |
|
Dear Count,
Thanks a lot for your time and help. My input file name is inputfile.txt and the output file name is completed.txt. I am not sure how to use them in your Perl script. Could you please help me?
Thanks
Umesh
| [reply] |
|
| [reply] |
|
open my $INPUT, '<', 'inputfile.txt' or die "Cannot open inputfile.txt: $!";
open my $OUTPUT, '>', 'completed.txt' or die "Cannot open completed.txt: $!";
Rather than reading from the DATA filehandle, you now read from the $INPUT filehandle:
while (my $line = <$INPUT>) { ... }
Finally, you print to the $OUTPUT filehandle:
print $OUTPUT @data;
It is considered good practice to close the filehandles once you are done with them, i.e. after the while loop. They get closed automatically as soon as their lexical variables go out of scope or at the end of the program, so it does not really matter in this little example program:
close $INPUT;
close $OUTPUT;
CountZero
"A program should be light and agile, its subroutines connected like a string of pearls. The spirit and intent of the program should be retained throughout. There should be neither too little or too much, neither needless loops nor useless variables, neither lack of structure nor overwhelming rigidity." - The Tao of Programming, 4.1 - Geoffrey James
| [reply] [d/l] [select] |
|
Re: Unix to Perl
by planetscape (Chancellor) on Mar 26, 2010 at 07:06 UTC
|
| [reply] |
Re: Unix to Perl
by Boldra (Deacon) on Mar 26, 2010 at 09:22 UTC
|
dear all , I am struck with performace issue on road while trying drive to Memphis: I need to turn left somewhere, until i hit a cow. Then i need to move the cow to another place. The way was given: left, right, up, up more, white cow, red cow, left. could you please drive me to Memphis in your porsche as i heard its much fastre than road.
BTW, Really sorry for misusing the words. When i said road, I actually meant to say car ( tractor, wheels, etc). please find my road below. It takes approx 15 mins for me to drive to Memphis.
| [reply] |
|
| [reply] |
|
Humorous, but rather unfair.
The confusions of the OP are, unfortunately, commonly held ones. I have had people attend a "UNIX Programming" course expecting to be taught how to write shell scripts (if they had read the description they would have seen it was for C programmers). I had a dig about the use of AWK above, yet lines like that are very common in shell scripts.
At least give the OP credit for asking for help. This is an opportunity for education, not mockery.
| [reply] |