PerlMonks  

Parsing Text from a File to HTML Table

by anupchandu (Initiate)
on Oct 27, 2013 at 12:07 UTC (#1059896=perlquestion)
anupchandu has asked for the wisdom of the Perl Monks concerning the following question:

Hi,
I have a log file that contains some information. I need to parse that log file and display it in an HTML table. Below is the log file format:
Status Company Name Start Date End Date
Pass My Company1 Start Date End Date
Pass My Company2 Start Date End Date
Pass My Company3 Start Date End Date
Pass My Company4 Start Date End Date
Pass My Company5 Start Date End Date

I tried the code below, but without success:
#!C:/Perl/bin/perl.exe
use strict;
use warnings;
use Win32;
use Cwd;
use CGI;
use HTML::Template;

main(@ARGV);

sub main {
    use Cwd qw(abs_path);
    my $pwd = abs_path();
    my $tempfile = "$pwd/temp/test.txt";
    open(FH, $tempfile) || die "Error: $!\n";
    my $line;
    while ($line=<FH>) {
        print "<tr>";
        my @cells= split /\s+/,$line;
        foreach my $cell (@cells) {
            print "<td>$cell</td>";
        }
        print "</tr>\n";
    }
    close FH;
}
Can someone help with this issue? Thanks in advance. -Anoop

Re: Parsing Text from a File to HTML Table
by Athanasius (Monsignor) on Oct 27, 2013 at 12:45 UTC

    Hello anupchandu, and welcome to the Monastery!

    The obvious problem is that you are splitting the input on whitespace, but individual log file records also contain embedded whitespace. The solution depends on the way in which the data is delimited within the log file. For example:
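
    As an illustration only (a sketch, not code taken from this reply), suppose the fields in the log are separated by tab characters. Splitting on tabs instead of on all whitespace keeps a name such as "My Company1" together in one cell. The tab delimiter, the sample records, and the helper name row_from_line are all assumptions made for the example.

```perl
use strict;
use warnings;

# Assumption: fields are separated by a single tab character.
# row_from_line is a hypothetical helper, not part of the original script.
sub row_from_line {
    my ($line) = @_;
    chomp $line;
    my @cells = split /\t/, $line;
    return '<tr>' . join('', map { "<td>$_</td>" } @cells) . '</tr>';
}

# Made-up sample records in the assumed tab-delimited layout.
my @sample = (
    "Status\tCompany Name\tStart Date\tEnd Date\n",
    "Pass\tMy Company1\tStart Date\tEnd Date\n",
);

print "<table>\n";
print row_from_line($_), "\n" for @sample;
print "</table>\n";
```

    If the real log uses a different delimiter (for example a comma or a pipe), only the pattern passed to split would need to change.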

    If this advice fails to address your problem, you will need to detail both the format of your log file and the way in which your current script is “not successful.”

    Hope that helps,

    Athanasius <°(((>< contra mundum
    Iustus alius egestas vitae, eros Piratica,

Re: Parsing Text from a File to HTML Table
by Not_a_Number (Parson) on Oct 27, 2013 at 13:30 UTC

    In addition to what Athanasius says, you need to add

    print "<table>";

    before your while loop, and

    print "</table>";

    after it.

    Unless, of course, you're not showing us the totality of your code (I notice that you have use HTML::Template; at the top of your snippet, but you never actually, er, use it).

    Also, what is use Win32; supposed to do?

Re: Parsing Text from a File to HTML Table
by marinersk (Chaplain) on Oct 27, 2013 at 15:03 UTC
    As already noted, splitting based on whitespace is a faulty assumption in your algorithm, assuming company names have whitespace in them.

    This, in my experience, is a common error for someone parsing a log for the first time, so don't feel bad. :-) I prefer to parse logs based on predictable components. The wilder the potential format, the more complicated the code gets, but for a relatively simple format like the one you are suggesting, I think it's fairly straightforward (assuming you have a basic understanding of Regular Expressions).

    You have to craft your Regular Expression to match the data you are expecting. A technique I have become fond of is the use of an if statement, which provides the additional feature of filtering out lines that don't match my preconceived format. I often capture those out to another file for occasional review to see if the parsing routine needs to compensate for previously unknown formats or conditions. I won't do that in this example so we can save space.
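
    The if-plus-regex technique described above might look something like the following. This is a minimal sketch and not the parselog.pl script run in the session below. It assumes the record layout of the sample log in this reply (a status word, a company name that may contain spaces, then two YYYY-MM-DD dates). The helper name parse_record and the non-matching sample line are made up for the example.

```perl
use strict;
use warnings;

# Assumed record layout: STATUS, company name (may contain spaces),
# then two dates in YYYY-MM-DD form, as in the sample log in this reply.
my $record_re = qr/^(\S+)\s+(.+?)\s+(\d{4}-\d{2}-\d{2})\s+(\d{4}-\d{2}-\d{2})\s*$/;

# parse_record is a hypothetical helper: it returns the four fields,
# or an empty list when the line does not fit the expected format.
sub parse_record {
    my ($line) = @_;
    if ($line =~ $record_re) {
        return ($1, $2, $3, $4);
    }
    return;
}

my @lines = (
    "GOOD Acme Toy Company 2010-01-01 2011-12-31\n",
    "this line does not match and is skipped\n",
    "UGLY Enron 2001-10-01 2011-09-11\n",
);

print "<table>\n";
for my $line (@lines) {
    my @fields = parse_record($line);
    next unless @fields;    # filter out lines in an unexpected format
    print '<tr>', (map { "<td>$_</td>" } @fields), "</tr>\n";
}
print "</table>\n";
```

    Anchoring the pattern on the two dates at the end of the line is what lets the lazy (.+?) group absorb a multi-word company name.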

    C:\Steve\Dev\PerlMonks\P-2013-10-27@0838-Log-Parse>type test1.log
    GOOD Acme Toy Company 2010-01-01 2011-12-31
    BAD XYZZY 1972-01-01 1972-06-18
    UGLY Enron 2001-10-01 2011-09-11

    C:\Steve\Dev\PerlMonks\P-2013-10-27@0838-Log-Parse>parselog.pl test1.log

    Status Company Name Start Date End Date
    GOOD Acme Toy Company 2010-01-01 2011-12-31
    BAD XYZZY 1972-01-01 1972-06-18
    UGLY Enron 2001-10-01 2011-09-11

Re: Parsing Text from a File to HTML Table
by kcott (Abbot) on Oct 28, 2013 at 02:56 UTC

    G'day anupchandu,

    If the status and dates contain no spaces, a quick and dirty fix might be:

    my $status = shift @cells;
    my $end = pop @cells;
    my $start = pop @cells;
    print "<td>$_</td>" for ($status, join(' ', @cells), $start, $end);
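
    To see the shift/pop idea run on its own, here is a small self-contained sketch. The helper name cells_from_line, the guard for short lines, and the sample record (including its date format) are assumptions added for illustration.

```perl
use strict;
use warnings;

# cells_from_line is a hypothetical helper wrapping the shift/pop idea:
# the first token is the status, the last two tokens are the dates, and
# whatever remains in the middle is rejoined as the company name.
sub cells_from_line {
    my ($line) = @_;
    my @cells = split ' ', $line;   # split ' ' also skips leading whitespace
    return if @cells < 4;           # too few tokens to be a record
    my $status = shift @cells;
    my $end    = pop @cells;
    my $start  = pop @cells;
    return ($status, join(' ', @cells), $start, $end);
}

# Made-up sample line; the date format is only an assumption.
my $line = "Pass My Company1 01-Jan-2013 31-Dec-2013\n";
print "<tr>";
print "<td>$_</td>" for cells_from_line($line);
print "</tr>\n";
```

    Note that rejoining with a single space collapses any run of whitespace inside the company name, which is usually harmless for HTML output.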

    A more robust solution would be to work on the basis that the formats of the status and dates are known: if you match them, whatever is left in the middle is the company name. Here's an example:

    # Do this once:
    my $status_re = qr{\w+};
    my $date_re = qr{\d{2}-\w{3}-\d{4}};
    my $log_re = qr{^($status_re)\s+(.*?)\s+($date_re)\s+($date_re)\R$};
    ...
    # Do this in the while loop
    print "<td>$_</td>" for $line =~ $log_re;
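
    Putting those fragments into a complete loop might look like the sketch below. The patterns are the ones above. The sample log text, the in-memory filehandle standing in for the real file, the table wrapper, and the choice to skip non-matching lines are assumptions for the example. \R needs Perl 5.10 or later.

```perl
use strict;
use warnings;

# Patterns from the reply above, compiled once.
my $status_re = qr{\w+};
my $date_re   = qr{\d{2}-\w{3}-\d{4}};
my $log_re    = qr{^($status_re)\s+(.*?)\s+($date_re)\s+($date_re)\R$};

# Made-up log text; an in-memory filehandle stands in for the real file.
my $log = "Pass My Company1 01-Jan-2013 31-Dec-2013\n"
        . "Pass My Company2 15-Feb-2013 30-Nov-2013\n";
open my $fh, '<', \$log or die "Error: $!\n";

my $html = "<table>\n";
while (my $line = <$fh>) {
    my @fields = $line =~ $log_re or next;   # skip lines that do not match
    $html .= '<tr>' . join('', map { "<td>$_</td>" } @fields) . "</tr>\n";
}
$html .= "</table>\n";
close $fh;

print $html;
```

    Because the pattern ends in \R$, the lines must not be chomped before matching.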

    As you can see from earlier responses, our ability to provide you with a definitive answer is hampered by the lack of information in your original post. A better question gets better answers: read the guidelines in "How do I post a question effectively?" to see how you could have improved on this.

    -- Ken
