PerlMonks  

Re: trouble parsing log file...

by inman (Curate)
on Nov 20, 2006 at 19:48 UTC (#585118)


in reply to trouble parsing log file...

The line

    @logarray=<LOG>;   # dumps all of $logfile into @logarray

reads all of the lines into an array. The test @logarray eq $error evaluates the array in scalar context, so it compares the number of lines in the file to the text. This will never succeed.
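A minimal sketch of the scalar-context point. The array contents and the value of $error below are made-up stand-ins, not the original poster's data.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for the slurped log lines and the search text.
my @logarray = ("all fine\n", "server DOWN\n", "all fine again\n");
my $error    = "DOWN";

# eq puts the array in scalar context, so this compares
# the line count ("3") with "DOWN".
my $whole_array_test = (@logarray eq $error) ? "match" : "no match";

# Testing each line with a regex does find the text.
my $per_line_test = (grep { /\Q$error\E/ } @logarray) ? "match" : "no match";

print "eq on the array: $whole_array_test\n";
print "regex per line:  $per_line_test\n";
```

The first test reports no match even though one of the lines contains DOWN; the second reports a match.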

Ignore the list building and just work through the file line by line and use a regular expression to test.

Take a look at the following as an example.

    use strict;

    # Set the button to green initially
    my $button = "perlgreenblink";

    # test the file line by line.
    # The line gets read into $_
    # I am testing on the DATA segment to illustrate the point
    while (<DATA>){
        # test with a regex and end the
        # while loop if there is a problem
        if (/DOWN/){
            $button = "perlredblink2";
            last;
        }
        if (/PROBLEM/){
            $button = "perlyellowblink";
            last;
        }
    }
    print "HTML for <img src=\"$button.gif\" />\n";

    __DATA__
    nothing here
    going smoothly
    Its all going DOWN
    no PROBLEM at all


Re^2: trouble parsing log file...
by coreolyn (Parson) on Nov 20, 2006 at 19:56 UTC

    Actually dumping the file to an array and parsing it element(line) by line is more efficient than parsing the file line by line.

    foreach my $element (@logarray) .

    You may want to close that open logfile after reading it into the array

      Dumping a multi-gigabyte log file into an array is going to get ugly quickly. A combination of the Perl internals and the IO buffering on the computer should take care of this situation line by line.
        Here's what I have:
        use strict;
        use warnings;

        my $logfile="log.txt";
        my $error="DOWN";
        my $warn="PROBLEM";
        my $redbutton="\<img src\=\'default_files/perlredblink\.gif'>";
        my $greenbutton="\<img src\=\'default_files/perlgreenblink\.gif'>";
        my $yellowbutton="\<img src\=\'default_files/perlyellowblink\.gif'>";

        open LOG, $logfile or die "Cannot open $logfile for read :$!";
        my $button = $greenbutton;

        while ($_ = <LOG>) {
            if ($_ =~ /$error/i) {
                $button = $redbutton;
                print "<!--Content-type: text/html-->\n\n";
                print "$button";
                last;
            }
            elsif ($_ =~ /$warn/i) {
                $button = $yellowbutton;
                print "<!--Content-type: text/html-->\n\n";
                print "$button";
                last;
            }
            else {
                print "<!--Content-type: text/html-->\n\n";
                print "$button";
                last;
            }
        }
        close LOG;
        Unfortunately, it does everything I want except go through the log line by line. How can I safely get this program to do that? Can you explain the alternative to using an array and/or a safe way to do this using an array? Thanks!

      I thought I'd reply rather than --ing your post just because I disagreed.

      I cannot think of any meaning of the phrase "more efficient" which would render your statement correct.

      All the reading I've ever done on the matter says that parsing a file line by line is extremely efficient. What happens is as follows. The operating system reads a chunk of the file into memory; this is then broken up on newlines (or whatever the value of $/ is); then we iterate over each line until we run out and the process repeats. We can parse a file line by line as follows:

      while ( <FILE> )

      If we choose to stop reading the file at any point (perhaps we've found what we want) and call last, then we end up reading only as much of the file as necessary. This means it's efficient time-wise, and because we're only holding one chunk of the file in memory at a time, it's efficient memory-wise.

      Alternately, my reading has said that "dumping the file to an array" and parsing it line by line is very inefficient. This is the case whether we do this like this:

      my @logarray = <FILE>;
      foreach my $element (@logarray)

      or like this:

      foreach my $element (<FILE>)

      This is because the file system still gives Perl the file on a chunk by chunk basis, and Perl still splits it up on $/, but Perl has to do this for the whole file even if we're only going to look at the first 10 lines. Worse, Perl now has to store the entire file in memory, rather than just a chunk. So this is the least efficient way to handle a file in Perl.
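A rough sketch of the difference, under made-up conditions: the log text is generated in memory (2000 lines, with DOWN on line 20) and read through an in-memory filehandle, so nothing here touches a real log file.

```perl
use strict;
use warnings;

# Hypothetical log contents: 2000 lines, with the match on line 20.
my $log_text = join "",
    map { $_ == 20 ? "line $_ DOWN\n" : "line $_ ok\n" } 1 .. 2000;

# Line by line: stop as soon as the match is found.
open my $fh, '<', \$log_text or die "Cannot open in-memory log: $!";
my $lines_read_by_while = 0;
while (my $line = <$fh>) {
    $lines_read_by_while++;
    last if $line =~ /DOWN/;
}
close $fh;

# Slurping: every line is read and stored before any test can run.
open $fh, '<', \$log_text or die "Cannot open in-memory log: $!";
my @logarray = <$fh>;
close $fh;
my $lines_held_by_array = @logarray;

print "while loop read $lines_read_by_while lines\n";
print "array holds $lines_held_by_array lines\n";
```

The while loop stops after 20 lines, while the array version holds all 2000 lines before the first comparison is made.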

      It is however very useful when we need random access to the whole file; for example when sorting it, or pulling out random quotes.
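For illustration only, here is the kind of random access that needs the whole file in an array. The @quotes list is a made-up stand-in for lines slurped from a file.

```perl
use strict;
use warnings;

# Hypothetical lines standing in for a slurped file.
my @quotes = ("beta\n", "alpha\n", "gamma\n");

# Sorting needs every line available at once.
my @sorted = sort @quotes;

# So does picking a line at random: rand @quotes gives a number
# below the line count, which is truncated to an index.
my $random_quote = $quotes[ rand @quotes ];

print "first after sorting: $sorted[0]";
print "random pick: $random_quote";
```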

      I'd love to hear why, if you think I'm mistaken in my understanding in this matter.

        Perl did some old tricks that reached a little bit too far inside the <stdio.h> macros to be completely portable but that allowed Perl line-at-a-time I/O to be about twice as fast as C line-at-a-time I/O... on sufficiently "standard enough" systems. That was back in the days of AT&T Unix, before Linux. Last time I checked (long enough ago that I hope things have improved but not long enough ago that I've heard that they have), Perl still did line-at-a-time I/O unnecessarily inefficiently when compiled on a system that isn't "standard enough" (which is nearly every system these days).

        This meant that Perl line-at-a-time I/O was 4 times slower than it really should be on Linux (for example). This actually made re-implementing line-at-a-time I/O in Perl code faster than using Perl's own line-at-a-time I/O implemented in C code (about twice as fast, which means that when Perl gets fixed, it would be about twice as slow, which would be expected).

        Yes, it makes little sense for Perl code to be faster than Perl's own C code. Unfortunately, that was certainly the case not too long ago.

        The command perl -V:d_stdstdio will tell you whether Perl thinks your platform is "standard enough".
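The same setting can be read from inside a script through the core Config module. This is only a sketch; the "undef" fallback is my own choice for display, since an unset value may come back undefined.

```perl
use strict;
use warnings;
use Config;

# %Config exposes the build-time values that perl -V:name reports.
# Fall back to the string "undef" for display if the value is not defined.
my $d_stdstdio = defined $Config{d_stdstdio} ? $Config{d_stdstdio} : "undef";

print "d_stdstdio='$d_stdstdio'\n";
```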

        But, yes, the speed difference between line-at-a-time I/O and "slurping" is usually small enough not to matter (even considering Perl's quirk here). The memory consumption difference can be hugely significant, of course.

        - tye        

        Hi, I tried to do this and couldn't get it to work correctly, can you show me what I'm doing wrong?
        use strict;
        use warnings;

        my $logfile="log.txt";
        my $error="DOWN";
        my $warn="PROBLEM";
        my $redbutton="\<img src\=\'default_files/perlredblink\.gif'>";
        my $greenbutton="\<img src\=\'default_files/perlgreenblink\.gif'>";
        my $yellowbutton="\<img src\=\'default_files/perlyellowblink\.gif'>";

        open LOG, $logfile or die "Cannot open $logfile for read :$!";
        my $button = $greenbutton;
        my @logfile=<LOG>; # throw logfile into an array

        while (<LOG>) {
            if ($_ =~ /$error/i) {
                $button = $redbutton;
                print "<!--Content-type: text/html-->\n\n";
                print "$button";
                last;
            }
            elsif ($_ =~ /$warn/i) {
                $button = $yellowbutton;
                print "<!--Content-type: text/html-->\n\n";
                print "$button";
                last;
            }
            else {
                print "<!--Content-type: text/html-->\n\n";
                print "$button";
                last;
            }
        }
        close LOG;
Re^2: trouble parsing log file...
by perl_geoff (Acolyte) on Nov 20, 2006 at 20:25 UTC
    Ok, I tried this, but could only get it to display the green button...also, I'm not sure I understand the logic of setting the button to green first. When I run your script it seems to ignore everything but the last line. Also, I don't believe any of my log files are above about 5 MB or so.
      The logic is that you are testing the log file for a specific condition. You create a starting condition that assumes that everything has gone well and would result in a green button. The idea is that if you get to the end of the file without hitting one of your two tests then everything was OK.

      You read the file one line at a time looking for either DOWN or PROBLEM. When one of these tests succeeds, you set the button response accordingly and use last to leave the while loop and do something with the outcome.

      You may only be reading 5 MB of files but this translates into a much larger use of memory. It also involves the computer reading the file line by line anyway as it puts it into memory. If your match is on line 20 of a 2000 line file, your script only needs to read 20 lines and you are done.
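The logic described above can be sketched roughly as follows. The sub name button_for_log and the sample log text are made up for illustration, and an in-memory filehandle stands in for log.txt so the sketch runs on its own. The button names are the ones from the example earlier in the thread.

```perl
use strict;
use warnings;

# Hypothetical helper: return the button name for an already opened log.
sub button_for_log {
    my ($fh) = @_;
    my $button = "perlgreenblink";      # start by assuming all is well
    while (my $line = <$fh>) {
        if ($line =~ /DOWN/)    { $button = "perlredblink2";   last; }
        if ($line =~ /PROBLEM/) { $button = "perlyellowblink"; last; }
    }
    return $button;                     # still green if nothing matched
}

# Made-up log text; swap in a real file with: open my $fh, '<', $logfile
my $log_text = "nothing here\ngoing smoothly\nno PROBLEM at all\nIts all going DOWN\n";
open my $fh, '<', \$log_text or die "Cannot open in-memory log: $!";
my $button = button_for_log($fh);
close $fh;

# Print once, after the loop has decided the outcome.
print "<img src=\"$button.gif\" />\n";
```

Note that there is no else branch inside the loop. A line that matches neither test is simply skipped, and the green default only survives if the loop reaches the end of the file. With this sample text the loop stops at the PROBLEM line, so the later DOWN line is never read.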

      If you continue to have trouble, post your code in the replies.

        Ok, now I understand. Here's what I tried, sorry I wasn't more specific. Unfortunately it only displays green:
        $logfile="log.txt";
        $error=(/DOWN/);
        $warn=(/PROBLEM/);
        $redbutton="\<img src\=\'default_files/perlredblink2\.gif'>";
        $greenbutton="\<img src\=\'default_files/perlgreenblink\.gif'>";
        $yellowbutton="\<img src\=\'default_files/perlyellowblink\.gif'>";

        open LOG, $logfile or die "Cannot open $logfile for read :$!";
        # @logarray=<LOG>;   # dumps all of $logfile into @logarray

        use strict;

        # Set the button to green initially
        my $button = "perlgreenblink";

        # test the file line by line.
        # The line gets read into $_
        # I am testing on the DATA segment to illustrate the point
        while (<DATA>){
            # test with a regex and end the
            # while loop if there is a problem
            if (/DOWN/){
                $button = "perlredblink2";
                last;
            }
            if (/PROBLEM/){
                $button = "perlyellowblink";
                last;
            }
        }
        print "HTML for <img src=\"$button.gif\">\n";
