Re: virus log parser

by Rhose (Priest)
on Jul 02, 2002 at 20:44 UTC


in reply to virus log parser

How about collecting the information, then printing the record when you get to one of the '-----' lines? (This assumes all records -- even the last one -- end with a '-----' line.)

The following code reads from __DATA__ and writes its tab-delimited records to the screen; you would probably want to open your log file for processing (open(LF,"$logFile")) and write to a results file (open(OF,">$outputFile")).

#!/usr/bin/perl -w
use strict;

my $gCurRec;

foreach(qw(name to file action virus)) {
    $gCurRec->{$_}='';
}

while(<DATA>) {
    $gCurRec->{name}=$1   if (/^From:\s*(.+?)\s*$/);
    $gCurRec->{to}=$1     if (/^To:\s*(.+?)\s*$/);
    $gCurRec->{file}=$1   if (/^File:\s*(.+?)\s*$/);
    $gCurRec->{action}=$1 if (/^Action:\s*(.+?)\s*$/);
    $gCurRec->{virus}=$1  if (/^Virus:\s*(.+?)\s*$/);

    if (/^-----/) {
        print $gCurRec->{name},"\t",
              $gCurRec->{to},"\t",
              $gCurRec->{file},"\t",
              $gCurRec->{action},"\t",
              $gCurRec->{virus},"\n";
        foreach(qw(name to file action virus)) {
            $gCurRec->{$_}='';
        }
    }
}
__DATA__
From: pminich@foo.com
To: esquared@foofoo.com
File: value.scr
Action: The uncleanable file is deleted.
Virus: WORM_KLEZ.H
----------------------------------
Date: 06/30/2002 00:01:21
From: mef@mememe.com
To: inet@microsoft.com
File: Nr.pif
Action: The uncleanable file is deleted.
Virus: WORM_KLEZ.H
----------------------------------
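If the last record in a real log might not be followed by a '-----' line, a loop that only prints on the separator would silently drop it. Here is a minimal sketch of one way to flush such a trailing record. It assumes the same five header fields and keys each value by its header name; @FIELDS, format_record and parse_records are hypothetical names for illustration, not part of the script above.

```perl
use strict;
use warnings;

# Assumed field order for the output columns (header names as they
# appear in the log).
my @FIELDS = qw(From To File Action Virus);

# Hypothetical helper: turn one collected record into a tab-delimited line.
sub format_record {
    my ($rec) = @_;
    return join("\t", map { defined $rec->{$_} ? $rec->{$_} : '' } @FIELDS);
}

# Hypothetical helper: takes log lines, returns one formatted line per record.
sub parse_records {
    my @lines = @_;
    my (%rec, @out);
    for my $line (@lines) {
        if ($line =~ /^-----/) {
            # Separator line: emit what has been collected, then reset.
            push @out, format_record(\%rec);
            %rec = ();
            next;
        }
        $rec{$1} = $2 if $line =~ /^(\w+):\s*(.+?)\s*$/;
    }
    # Flush a final record that was not followed by a separator line.
    push @out, format_record(\%rec) if %rec;
    return @out;
}

# Small demonstration: the second record has no closing separator.
my @demo = (
    "From: pminich\@foo.com\n",
    "Virus: WORM_KLEZ.H\n",
    "----------------------------------\n",
    "From: mef\@mememe.com\n",
    "File: Nr.pif\n",
);
print "$_\n" for parse_records(@demo);
```

With a real log you would pass the lines read from your file handle instead of @demo.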

Comment: One other thing I found I like is opening files with three parameters. For example, instead of:

open(OF,">$outputFile") || die;

I use:

open(OF,'>',$outputFile) || die;
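The reason the three-parameter form is safer: in the two-parameter form the mode is parsed out of the same string as the file name, so a name that happens to begin with '>' or end with '|' changes what open does. A minimal sketch of the idea, using lexical file handles and an error message that includes $!; write_lines and read_lines are hypothetical helper names, not something from the code above.

```perl
use strict;
use warnings;

# Hypothetical helper: write each line to $path using three-argument open.
# The mode ('>') is separate from the name, so odd characters in $path
# cannot change how the file is opened.
sub write_lines {
    my ($path, @lines) = @_;
    open(my $out, '>', $path)
        or die "Cannot open '$path' for writing: $!";
    print {$out} "$_\n" for @lines;
    close($out)
        or die "Cannot close '$path': $!";
    return scalar @lines;
}

# Hypothetical helper: read the file back, one chomped line per element.
sub read_lines {
    my ($path) = @_;
    open(my $in, '<', $path)
        or die "Cannot open '$path' for reading: $!";
    chomp(my @lines = <$in>);
    close($in);
    return @lines;
}

# Example use (the file name is an assumption):
#   write_lines('results.txt', "name\tto\tfile\taction\tvirus");
```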

I hope this helps! *Smiles*

Update:

Now that I have re-read my code, I should have made

qw(name to file action virus)

a constant so that it was defined in only one place, and I should have made the field separator a constant as well. This would simplify changes to the code. (Not that it is critical in such a small program, but it is a good practice... well, for me at least.)
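For what it is worth, here is a rough sketch of what that might look like with the constant pragma. FIELDS, SEPARATOR, blank_record and format_record are names made up for this illustration; the original script does not define them.

```perl
use strict;
use warnings;

# The field list and the separator each live in exactly one place.
use constant FIELDS    => qw(name to file action virus);
use constant SEPARATOR => "\t";

# Hypothetical helper: a record with every field set to the empty string.
sub blank_record {
    return map { ($_ => '') } (FIELDS);
}

# Hypothetical helper: one output line for a record (hash reference).
sub format_record {
    my ($rec) = @_;
    return join(SEPARATOR, map { $rec->{$_} } (FIELDS)) . "\n";
}

my %rec = blank_record();
$rec{name}  = 'pminich@foo.com';
$rec{virus} = 'WORM_KLEZ.H';
print format_record(\%rec);
```

Changing the column order or switching to a comma separator would then be a one-line edit.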

Replies are listed 'Best First'.
Re: Re: virus log parser
by rincew (Novice) on Jul 02, 2002 at 21:56 UTC
    I would like to add some random thoughts I had when I saw your code.

    First of all, the construct

    foreach(qw(name to file action virus)) { $gCurRec->{$_}=''; }
    can be expressed very succinctly using so called hash slices, i.e.
    my @columns = qw(name to file action virus);
    @{ $gCurRec }{ @columns } = ('') x @columns;
    See for example this for a good introduction.
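In case the idea is new, here is a small self-contained sketch of hash slices, both on a plain hash and through a hash reference. The variable names (%rec, $ref) are only for illustration.

```perl
use strict;
use warnings;

my @columns = qw(name to file action virus);

# Slice of a plain hash: one assignment sets every listed key.
# ('') x @columns builds a list with one empty string per column.
my %rec;
@rec{@columns} = ('') x @columns;

# The same slice through a hash reference.
my $ref = {};
@{$ref}{@columns} = ('') x @columns;

# A slice can also read several values at once, in the order of the keys.
$rec{name}  = 'pminich@foo.com';
$rec{virus} = 'WORM_KLEZ.H';
my ($name, $virus) = @rec{qw(name virus)};
print "$name\t$virus\n";
```

Note the leading @ on the slice: it is the list being assigned that decides the sigil, even though the variable itself is a hash.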

    Furthermore, why do you use a hash reference to store the data when a hash would be sufficient? (This is probably a matter of style.)

    Then, I usually consider multiple repeated lines with trivial differences like

    $gCurRec->{name}=$1 if (/^From:\s*(.+?)\s*$/);
    $gCurRec->{to}=$1 if (/^To:\s*(.+?)\s*$/);
    to be a sign that some kind of abstraction like a loop is needed. In this case, keying each datum by its header field
    /^(\w+):\s*(.+?)\s*$/ and $gCurRec->{$1} = $2;
    does so and furthermore removes the need to spell out the interesting header fields several times. This of course means that unknown fields like the Date: are ignored, but your code ignores them as well.

    So finally here is my attempt at implementing your algorithm:

    #!/usr/bin/perl -w
    use strict;

    my %gCurRec = ();

    while(<DATA>) {
        /^-+\s*$/ and do {
            print join("\t",
                map { exists $gCurRec{$_} ? $gCurRec{$_} : '' }
                    qw(from to file action virus)
            ) . "\n";
            %gCurRec = ();
            next;
        };
        /^(\w+):\s*(.+?)\s*$/ and $gCurRec{lc $1} = $2;
    }
    __DATA__
    From: pminich@foo.com
    To: esquared@foofoo.com
    File: value.scr
    Action: The uncleanable file is deleted.
    Virus: WORM_KLEZ.H
    ----------------------------------
    Date: 06/30/2002 00:01:21
    From: mef@mememe.com
    To: inet@microsoft.com
    File: Nr.pif
    Action: The uncleanable file is deleted.
    Virus: WORM_KLEZ.H
    ----------------------------------
