Hello Monks, I've been struggling with a script for the past few days. I have a report that I need to parse into a CSV file. I have it somewhat working, but I could use some help making it better. The text file that I need to parse has the following format, which is repeated on every page:
Per CountZero's suggestion, here is a mock-up of the text file.
Date 08/17/11 Report Page 1
Time 12:46
Important Text: 1
Misc Text: All
Misc Text: Sec
** Indicates
APPTEXT
PLINE SCODE PCODE FID SEC unsec fcs
-------------------------------------------------------------------------------------------------
TEST TT TT00 TT00.1 NO xxxx TTD
TEST TT TT00 **TT00.2 YES XXXXXX
TEST TT TT00 **TT00.3 YES XXX
TEST TT TT01 TT01.1 NO XXXXXXXXXXX TT
TEST TT **TT02 TT02.1 YES XXXXX
I need to combine "text1" (the "Important Text" value) with each of the column lines to form a CSV record. The most recent thing I found out is that the column lines vary in length, and the amount of whitespace between the columns varies too. My script so far is below, but I was wondering what I could do to account for that variability. Also, I'm not very knowledgeable about Perl. I've put this together from skimming some books and picking things up on the internet.
The output would then look something like this:
1,TEST,TT,TT00,TT00.1,NO,xxxx
1,TEST,TT,TT00,**TT00.2,YES,XXXXXX
#! /usr/bin/perl
$OutPut= '>secout.txt';
open(INFILE,'sec_rpt3.txt') or die "Can't open file.\n";
open(OUT, $OutPut) or die "Can't open output.\n";
sub rtrim($)
{
my $string = shift;
$string =~ s/\s+$//;
return $string;
}
sub trim($)
{
my $string = shift;
$string =~ s/^\s+//;
$string =~ s/\s+$//;
return $string;
}
sub ltrim{
my $string = shift;
$string =~ s/^\s*//;
return $string;
}
while (<INFILE>)
{
$ThisLine=ltrim($_);
chomp($ThisLine);
$LineLen=length($ThisLine);
if (index($ThisLine,'IMPORTANT TEXT') != -1)
{
$LenSec=int($LineLen)-17;
$SecClass=substr($ThisLine,17,$LenSec);
}
if (index($ThisLine,"TEST") != -1)
{
$pline = trim(substr($ThisLine,0,16));
$mod = trim(substr($ThisLine,18,6));
$tok = trim(substr($ThisLine,24,10));
$form = trim(substr($ThisLine,34,13));
$sec = trim(substr($ThisLine,47,7));
$unsec =substr($ThisLine,54,21);
$secfc = substr($ThisLine,76,21);
$rec = join(',',$SecClass,$pline,$mod,$tok,$form,$sec,$unsec,$secfc);
print OUT "$rec\n";
};
}
close(INFILE);
close(OUT);
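
Since the fixed substr offsets are what breaks when the spacing changes, one possible direction is to split each data line on runs of whitespace instead. This is only a minimal sketch under assumptions: no field contains embedded whitespace, the header line reads "Important Text: <value>" (matched case-insensitively), and the helper names parse_row and report_to_csv are hypothetical. A row with an empty unsec column but a populated fcs column would have its fcs value shifted one column to the left, which this sketch does not handle.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: break one report line into its fields.
# Assumes no field contains embedded whitespace.
sub parse_row {
    my ($line) = @_;
    # split ' ' ignores leading whitespace and splits on whitespace runs
    return split ' ', $line;
}

# Hypothetical helper: build CSV records from a list of report lines.
# $class holds the most recent "Important Text" value seen.
sub report_to_csv {
    my (@lines) = @_;
    my $class = '';
    my @records;
    for my $line (@lines) {
        if ($line =~ /^\s*Important Text:\s*(\S+)/i) {
            $class = $1;
        }
        elsif ($line =~ /^\s*TEST\b/) {
            push @records, join(',', $class, parse_row($line));
        }
    }
    return @records;
}

# Sample lines modelled on the mock-up, with uneven spacing
my @sample = (
    'Important Text: 1',
    'TEST TT TT00 TT00.1 NO xxxx TTD',
    'TEST   TT  TT00   **TT00.2   YES   XXXXXX',
);
print "$_\n" for report_to_csv(@sample);
```

With the sample lines above this prints 1,TEST,TT,TT00,TT00.1,NO,xxxx,TTD and 1,TEST,TT,TT00,**TT00.2,YES,XXXXXX. If fields in the real report can themselves contain commas or quotes, a CSV module such as Text::CSV would be a safer way to write the records than a plain join.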