PerlMonks  


Hello Monks, I've been struggling with a script for the past few days. I have a report that I need to parse into a CSV file. I have it somewhat working, but I could use some help making it better. The text file that I need to parse has the following format, which is repeated on every page:

Per CountZero's suggestion, here is a mock-up of the text file.
Date 08/17/11 Report Page 1
Time 12:46

Important Text: 1
Misc Text: All
Misc Text: Sec
** Indicates
APPTEXT


PLINE     SCODE    PCODE    FID    SEC    unsec     fcs
-------------------------------------------------------------------------------------------------
TEST     TT     TT00    TT00.1    NO    xxxx    TTD
TEST    TT    TT00    **TT00.2    YES    XXXXXX
TEST    TT    TT00    **TT00.3    YES    XXX
TEST    TT    TT01    TT01.1    NO    XXXXXXXXXXX    TT
TEST    TT    **TT02    TT02.1    YES    XXXXX

I need to combine the "Important Text" value (1 in the mock-up) with each line of the columns into a CSV record. The most recent thing I found out is that the width of each column varies from line to line, and the number of whitespace characters between the columns varies as well. Here is the script that I have, but I was wondering what I could do to take the variability of the lines into account. Also, I'm not very knowledgeable about Perl. I've put this together from skimming some books and picking up things on the internet.

The output then would be something like this:
1,TEST,TT,TT00,TT00.1,NO,xxxx
1,TEST,TT,TT00,**TT00.2,YES,XXXXXX

#!/usr/bin/perl

$OutPut = '>secout.txt';
open(INFILE, 'sec_rpt3.txt') or die "Can't open file.\n";
open(OUT, $OutPut) or die "Can't open output.\n";

# Strip trailing whitespace.
sub rtrim($) {
    my $string = shift;
    $string =~ s/\s+$//;
    return $string;
}

# Strip leading and trailing whitespace.
sub trim($) {
    my $string = shift;
    $string =~ s/^\s+//;
    $string =~ s/\s+$//;
    return $string;
}

# Strip leading whitespace (reads its argument rather than the global $_).
sub ltrim {
    my $string = shift;
    $string =~ s/^\s*//;
    return $string;
}

while (<INFILE>) {
    $ThisLine = ltrim($_);
    chomp($ThisLine);
    $LineLen = length($ThisLine);

    # index() is case-sensitive, so this must match the report's capitalization.
    if (index($ThisLine, 'IMPORTANT TEXT') != -1) {
        $LenSec   = int($LineLen) - 17;
        $SecClass = substr($ThisLine, 17, $LenSec);
    }

    # Data rows: cut each field out at a fixed column offset.
    if (index($ThisLine, "TEST") != -1) {
        $pline = trim(substr($ThisLine, 0, 16));
        $mod   = trim(substr($ThisLine, 18, 6));
        $tok   = trim(substr($ThisLine, 24, 10));
        $form  = trim(substr($ThisLine, 34, 13));
        $sec   = trim(substr($ThisLine, 47, 7));
        $unsec = substr($ThisLine, 54, 21);
        $secfc = substr($ThisLine, 76, 21);
        $rec   = join(',', $SecClass, $pline, $mod, $tok, $form, $sec, $unsec, $secfc);
        print OUT "$rec\n";
    }
}

close(INFILE);
close(OUT);
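
Since the fixed substr offsets are what break when the column widths vary, one possible direction is to split each data row on runs of whitespace instead. The following is only a minimal sketch under several assumptions: no field contains embedded spaces or commas, only trailing columns (such as fcs) can be empty, and data rows begin with TEST as in the mock-up. The helper name report_to_csv is hypothetical, and the sample lines are copied from the mock-up rather than from the real report.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: takes report lines, returns one CSV record per data row.
sub report_to_csv {
    my @lines     = @_;
    my $sec_class = '';    # value of the most recent "Important Text:" line
    my @records;

    for my $line (@lines) {
        # Remember the value after "Important Text:" (matched case-insensitively).
        if ($line =~ /^\s*Important Text:\s*(.*?)\s*$/i) {
            $sec_class = $1;
            next;
        }

        # Assumption: data rows start with "TEST", as in the mock-up.
        next unless $line =~ /^\s*TEST\s/;

        # split ' ' skips leading whitespace and splits on any run of whitespace,
        # so the width of each column no longer matters.
        my @fields = split ' ', $line;

        # Pad missing trailing columns so every record has seven report fields.
        push @fields, '' while @fields < 7;

        push @records, join(',', $sec_class, @fields);
    }
    return @records;
}

# Sample lines taken from the mock-up.
my @sample = (
    "Important Text: 1\n",
    "PLINE     SCODE    PCODE    FID    SEC    unsec     fcs\n",
    "TEST     TT     TT00    TT00.1    NO    xxxx    TTD\n",
    "TEST    TT    TT00    **TT00.2    YES    XXXXXX\n",
);

print "$_\n" for report_to_csv(@sample);
```

If a middle column can be blank, or a field can contain spaces, whitespace splitting will shift the fields, and the header's column positions (or unpack) would be needed instead. A module such as Text::CSV would also be the safer choice if any field can contain a comma or quote.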

In reply to Parsing text file to CSV by apok69
