PerlMonks  

Converting text file to CSV format

by pelp (Initiate)
on Mar 18, 2003 at 18:57 UTC ( [id://244122]=perlquestion )

pelp has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks :-)

I was given a project to convert 10,000+ log files to the CSV format.

Here's an example of the old format used and the new one.

Can anyone share a good way to approach this? My background in Perl is limited, and I need some direction.

Thanks,

pelp

Replies are listed 'Best First'.
Re: Converting text file to CSV format
by BrowserUk (Patriarch) on Mar 18, 2003 at 21:21 UTC

    Using perl's -n and -l switches on the shebang line of a short script, in conjunction with a few global variables and a BEGIN block, makes this kind of processing the epitome of the perl 'backronym' Practical Extraction & Reporting Language. Real bread&butter stuff.

    You will probably need to strengthen the regexes to suit your data, and you may need to adjust the format used for the printf somewhat, but this should get you some of the way there.

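    The only slightly tricky part is the $headerProcessed var and the use of eof, which isn't very well described. $headerProcessed records whether the header block of the current file has been read yet, and eof resets it at the last line of each file so that the next file starts in header mode again.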

    Usage: yourscriptname files* > unified.log

    If you're on a platform that expands wildcard arguments on the command line, then comment out the first line of the BEGIN block. The second line lists the expanded wildcards to STDERR, which may or may not be useful in practice.
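
    For what it's worth, the wildcard step can be pulled out into a small helper so it is easy to switch on or off. This is only a sketch: expand_args is a hypothetical name, not something from the script, and it relies on Perl's built-in glob, which splits patterns on whitespace, so file names containing spaces would need different handling.

```perl
use strict;
use warnings;

# Hypothetical helper: expand wildcard patterns ourselves, for shells that
# pass them through untouched. A pattern with no wildcard characters comes
# back unchanged, even if no such file exists.
sub expand_args {
    my @patterns = @_;
    return map { glob $_ } @patterns;
}
```

    One possible way to use it is to expand only where the shell will not have done so already, for example: @ARGV = expand_args(@ARGV) if $^O eq 'MSWin32';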

    #! perl -nlw
    use strict;
    use vars qw[$format $headerProcessed $author $MRnumber $releaseNo
                $featureNo $featureName $filename $date];

    BEGIN{
        # Expand wildcards ourselves for shells that do not do it for us.
        @ARGV = map{ glob } @ARGV;
        print STDERR "Processing files @ARGV";

        # Nine left-justified, 25-character columns per output line.
        $format = '%-25s ' x 9 . $/;
        $headerProcessed = 0;
        printf $format,
            'File Name,', 'Author(Core ID),', 'Date (MM/DD/YEAR),',
            'Release No.,', 'MR No.,', 'Feature No.,', 'Feature Name,',
            'Paragraph No.,', 'Requirement No.';
    }

    # At the last line of each file, drop back into header mode for the next
    # file. Note that this last line is then handled by the header branch.
    $headerProcessed = 0 if eof;

    unless ( $headerProcessed ) {
        # Header section: remember each field, with its trailing comma.
        $author    = $1 . ',' if m[^Author \(Core ID\) : (.*$)];
        $MRnumber  = $1 . ',' if m[^MR Number : (.*$)];
        $releaseNo = $1 . ',' if m[^Release Number : (.*$)];
        ($featureNo, $featureName) = ($1 . ',', $2 . ',')
            if m[^Feature : (.*?) : (.*$)];
        $filename  = $1 . ',' if m[^File Name : (.*$)];
        $date      = $1 . ',' if m[^Modification Date : (.*$)];

        # The column-heading line marks the start of the data lines.
        $headerProcessed = 1
            if m[^Paragraph Number Requirement Number Last Modified Release$];
    }
    else {
        # Data line: stored header fields plus the first two columns.
        printf $format,
            $filename, $author, $date, $releaseNo, $MRnumber,
            $featureNo, $featureName, (split/\s+/)[0,1];
    }
    __END__
    C:\test>244122 244122.dat?
    Processing files 244122.dat1 244122.dat2
    File Name,                Author(Core ID),          Date (MM/DD/YEAR),        Release No.,              MR No.,                   Feature No.,              Feature Name,             Paragraph No.,            Requirement No.
    cpsfs_sdu_setup.fm,       az1287,                   10/24/2001,               16.1,                     sc018498.14,              na,                       Add IP,                   SDUSETUP-SDU-249          CPSFS-SDUSETUP-224
    cpsfs_sdu_setup.fm,       bz1287,                   10/24/2001,               16.1,                     sc018498.14,              na,                       Add IP,                   SDUSETUP-SDU-249          CPSFS-SDUSETUP-224

    C:\test>
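
    Since the $headerProcessed / eof interplay is the tricky bit, here is a rough sketch of the same idea written as an explicit loop instead of relying on -n. Everything in it is made up for illustration: collect_rows is a hypothetical helper, and the 'Author : ' label and the 'DATA' marker line stand in for whatever the real logs contain. Unlike the script above, it tests eof after handling the line, so a data line that happens to be the last line of a file is still collected.

```perl
use strict;
use warnings;

# Hypothetical helper: returns one [file, author, first column] row per data
# line. The 'Author : ' label and the 'DATA' marker are invented stand-ins.
sub collect_rows {
    local @ARGV = @_;    # the files that the <> operator will read
    my ( $in_data, $author, @rows ) = ( 0, '' );

    while ( my $line = <> ) {
        chomp $line;
        if ( !$in_data ) {
            # Header mode: remember fields until the marker line shows up.
            $author  = $1 if $line =~ /^Author : (.*)$/;
            $in_data = 1  if $line eq 'DATA';
        }
        else {
            # Data mode: $ARGV holds the name of the file being read.
            push @rows, [ $ARGV, $author, ( split ' ', $line )[0] ];
        }

        # eof without parentheses is true at the end of each input file,
        # so the next file starts in header mode again.
        $in_data = 0 if eof;
    }
    return @rows;
}
```

    Calling collect_rows(@files) hands back array refs that can then be printed in whatever layout is needed.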

    Examine what is said, not who speaks.
    1) When a distinguished but elderly scientist states that something is possible, he is almost certainly right. When he states that something is impossible, he is very probably wrong.
    2) The only way of discovering the limits of the possible is to venture a little way past them into the impossible.
    3) Any sufficiently advanced technology is indistinguishable from magic.
    Arthur C. Clarke.
Re: Converting text file to CSV format
by Super Monkey (Beadle) on Mar 18, 2003 at 19:28 UTC
    The straightforward solution: open the old file using open() and read each line in a while loop. Assuming you know what order the old data is in, you will probably want to use split() to parse each line. You can open another file for output, or redirect the output to a file when you run the script. Either way, simply print the parsed data separated by commas. That should do it.
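
    A minimal sketch of that open / while / split / print approach, assuming every input line is just whitespace-separated columns (the real logs also have a header section, which this ignores). The names csv_field, line_to_csv and convert_file are hypothetical. Fields containing a comma or a double quote are quoted so the output stays valid CSV; for real data a CPAN module such as Text::CSV is probably the safer route.

```perl
use strict;
use warnings;

# Quote a field only when it contains a comma, double quote, or newline.
sub csv_field {
    my ($field) = @_;
    return $field unless $field =~ /[",\n]/;
    ( my $quoted = $field ) =~ s/"/""/g;    # double any embedded quotes
    return qq{"$quoted"};
}

# Turn one whitespace-separated line into one CSV line.
sub line_to_csv {
    my ($line) = @_;
    return join ',', map { csv_field($_) } split ' ', $line;
}

# Convert a whole file; both paths are supplied by the caller.
sub convert_file {
    my ( $in_path, $out_path ) = @_;
    open my $in,  '<', $in_path  or die "Cannot read $in_path: $!";
    open my $out, '>', $out_path or die "Cannot write $out_path: $!";
    while ( my $line = <$in> ) {
        chomp $line;
        next unless $line =~ /\S/;    # skip blank lines
        print {$out} line_to_csv($line), "\n";
    }
    close $in;
    close $out or die "Cannot close $out_path: $!";
    return;
}
```

    For example, line_to_csv('SDUSETUP-SDU-249 CPSFS-SDUSETUP-224') returns 'SDUSETUP-SDU-249,CPSFS-SDUSETUP-224'.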

Node Type: perlquestion [id://244122]
Approved by VSarkiss