PerlMonks  

Re: Re: Re: Log parsing by timestamp dilema

by DaveH (Monk)
on Feb 01, 2003 at 18:29 UTC


in reply to Re: Re: Log parsing by timestamp dilema
in thread Log parsing by timestamp dilema

Hi.

Sorry, I couldn't resist rewriting your code. :-) The problem "got at me". It uses adrianh's solution; translated into your script, you would end up with something like the rewrite below.

First, I removed the whole while loop, around lines 86-90, and all the code in between was cut out and saved for later. A lot of the repeated code was moved into subroutines. I have tested it as best I can, and it works for me. I took advantage of the fact that you had already done the work of finding the files, which were stored in @Logs. This was used instead of @ARGV. I tried not to impose my coding style on the script, but it has been run through PerlTidy, which may have moved things around a bit.

The other main change was the way of handling specified date ranges. Whilst the 'if' logic remains, I generalised it into a subroutine, and made use of a new %Range hash to store the 'begin' and 'end' dates (which are updated if the '-t' option is specified). By defaulting appropriately, this allows the code to check whether a date is in the range the user wants in just one line of code. It also means that the complicated regexes that parse the command line args are only run once, rather than for every line of every file.
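The range handling described above can be sketched in isolation. This is only an illustration, not the code from the script: parse_range and in_range are hypothetical names, and the stamps are taken to be epoch seconds already so the example does not depend on the local time zone (the real script converts mm/dd/yy-hh:mm strings with Time::Local).

```perl
#!/usr/bin/perl
use strict;
use warnings;

use constant MIN => 0;             # earliest stamp accepted by default
use constant MAX => 2147483647;    # latest 32-bit stamp accepted by default

# Hypothetical helper: parse a '-t'-style spec ONCE into start/end bounds.
# Assumption for this sketch: stamps are plain epoch seconds.
sub parse_range {
    my ($spec) = @_;
    my %range = ( start => MIN, end => MAX );
    return %range unless defined $spec && length $spec;

    if    ( $spec =~ /^\+(\d+)$/ )       { $range{start} = $1 }
    elsif ( $spec =~ /^-(\d+)$/ )        { $range{end}   = $1 }
    elsif ( $spec =~ /^=(\d+)$/ )        { @range{qw(start end)} = ( $1, $1 ) }
    elsif ( $spec =~ /^(\d+)\+-(\d+)$/ ) { @range{qw(start end)} = ( $1, $2 ) }
    else  { die "unrecognised range spec: $spec\n" }

    return %range;
}

# The per-line check is then a single comparison against the stored bounds.
sub in_range {
    my ( $stamp, $range ) = @_;
    return $stamp >= $range->{start} && $stamp <= $range->{end};
}

my %Range = parse_range('1044000000+-1044100000');
for my $stamp ( 1043999999, 1044050000, 1044100001 ) {
    printf "%d %s\n", $stamp, in_range( $stamp, \%Range ) ? 'kept' : 'skipped';
}
```

Because an unspecified bound simply stays at MIN or MAX, the same comparison works whether the user gave no range, one bound, or both.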

#!/usr/bin/perl -w
use strict;
use Getopt::Std;
use Time::Local;
use IO::File;
use POSIX qw(strftime);
use Data::Dumper;

$|++;

my %Opt   = ();
my @Conns = ();
my %Range = ();

use constant MIN => 0;             # 32-bit "Thu Jan 1 00:00:00 1970"
use constant MAX => 2147483647;    # 32-bit "Tue Jan 19 03:14:07 2038"

&GetArgs();
&GetConns();
&GetLogs();

sub GetArgs {
    my $Usage = qq{Usage: $0 [options]
    -h : This help message.
    -c : Specific connector - default is to list all connectors.
    -d : Specific direction - default is to list all directions
    -n : Trap name - default is to list all names
    -t : Time in stamp format mm/dd/yy-hh:mm or mm/dd/yy
         +<stamp> - show entries created after specified stamp
                    If time is not given, defaults to 23:59
         -<stamp> - show entries created before specified stamp
                    If time is not given, default to 00:00
         =<stamp> - show entries created on specified stamp
                    If time is not given it is ignored (all day)
         <stamp>+-<stamp> - show entries created between specified stamps
                    If time is not given on first stamp, 00:00 is used
                    If time is not given on second stamp, 23:59 is used
                    Note: This includes the day(s) specified
    -s : Size of files caught in bytes
         +<size> - show entries with files larger than specified size
         -<size> - show entries with files smaller than specified size
         =<size> - show entries with files equal to specified size
         <size>+-<size> - show entries with files between specified sizes
} . "\n";

    getopts( 'hc:d:n:t:s:', \%Opt ) or die "$Usage";
    die "$Usage" if $Opt{h};
    if ( $Opt{d} ) {
        $Opt{d} = lc( $Opt{d} );
        die "$Usage"
          if ( $Opt{d} ne "in" && $Opt{d} ne "out" && $Opt{d} ne "both" );
    }
}

sub GetConns {
    open( CONNECTORS, "/var/wt400/conf/_wtd.cfg" )
      or die "Unable to open connector file!";
    while (<CONNECTORS>) {
        next unless ( $_ =~ /^unit="(.*)"/ );
        my $Conn = lc($1);
        next if ( $Conn eq "ins" || $Conn eq "ins2" || $Conn eq "_wtd" );
        push @Conns, $Conn;
    }
    close(CONNECTORS);
    if ( $Opt{c} ) {
        $Opt{c} = lc( $Opt{c} );
        if ( grep /\b$Opt{c}\b/, @Conns ) {
            @Conns = $Opt{c};
        }
        else {
            die "\nInvalid connector - $Opt{c} !\n";
        }
    }
}

sub GetLogs {
    my @Logs;
    foreach my $Conn (@Conns) {
        my @Directions;
        if ( $Opt{d} ) {
            @Directions = $Opt{d};
        }
        else {
            @Directions = (qw(in out both));
        }
        foreach my $Dir (@Directions) {
            push @Logs, "/var/spool/wt400/log/$Conn/trap_${Dir}.log"
              if ( -r "/var/spool/wt400/log/$Conn/trap_${Dir}.log" && -s _ );
        }
    }
    unless (@Logs) {
        die "\nUnable to find any logs!\n";
    }
    else {
        %Range = getRange();

        # MAKE A FILEHANDLE FOR EACH FILE WE WERE GIVEN
        my @files =
          map { IO::File->new($_) or die "could not open $_: $!" } @Logs;

        # READ IN A LINE FOR EACH FILE
        my @lines = map { scalar(<$_>) } @files;

        # GET THE DATES FOR EACH LINE;
        my @dates = map { (m/^\s*(\d+)\s+/) } @lines;
        my $found;
        do {

            # FIND THE LINE WITH THE EARLIEST DATE
            my $min = MAX;
            $found = undef;
            for ( my $i = 0 ; $i < @Logs ; $i++ ) {
                my $num = $dates[$i];
                if ( $num < $min ) {
                    $found = $i;
                    $min   = $num;
                }
            }
            if ( defined($found) ) {

                # IF WE FOUND A LINE, SHOW IT AND READ THE NEXT
                # LINE IN FOR THAT LOG FILE
                prettyPrint( $lines[$found], $Logs[$found] );
                my $io = $files[$found];
                if ( defined( $_ = <$io> ) ) {
                    $lines[$found] = $_;
                    ( $dates[$found] ) = (m/^\s*(\d+)\s+/);
                }
                else {

                    # Delete this log off the queues.
                    splice @files, $found, 1;
                    splice @lines, $found, 1;
                    splice @dates, $found, 1;
                    splice @Logs,  $found, 1;
                }
            }
        } while ( defined($found) );
    }
}

# Update the desired range defaults (from command line)
sub getRange {
    my %range = (
        start => MIN,
        end   => MAX,
    );
    if ( $Opt{t} ) {
        $Opt{t} =~ s/\s+//g;    # delete ws
        if ( $Opt{t} =~ /^\+(.*)/ ) {
            $range{start} = getStamp( "after", $1 );
        }
        elsif ( $Opt{t} =~ /^\-(.*)/ ) {
            $range{end} = getStamp( "before", $1 );
        }
        elsif ( $Opt{t} =~ /^\=(.*)/ ) {
            $range{start} = getStamp( "before", $1 );
            $range{end}   = getStamp( "after",  $1 );
        }
        elsif ( $Opt{t} =~ /^(.*)\+\-(.*)/ ) {
            $range{start} = getStamp( "before", $1 );
            $range{end}   = getStamp( "after",  $2 );
        }
    }
    if ( $range{start} < 0 || $range{end} < 0 ) {
        die "invalid range created";
    }
    return %range;
}

# Get a timestamp from a formatted date
sub getStamp {

    # '$before_after' decides what the default value
    # for $hour & $min will be: 00:00 or 23:59.
    my $before_after = $_[0];
    my ( $mon, $day, $year, $hour, $min ) = split m{[-/:]}, $_[1];
    if ( $before_after eq "before" ) {
        ( $hour ||= 00, $min ||= 00 );
    }
    elsif ( $before_after eq "after" ) {
        ( $hour ||= 23, $min ||= 59 );
    }
    else {
        return undef;    # error: undefined behaviour
    }
    return timelocal( 0, $min, $hour, $day, $mon - 1, $year + 100 );
}

# Print out the line in a nice way
sub prettyPrint {
    my $Line   = $_[0];
    my $File   = $_[1];
    my @Fields = split " ", $Line;

    # Check that the line is in date range
    return
      unless ( $Fields[0] >= $Range{start} && $Fields[0] <= $Range{end} );
    if ( $Opt{n} ) {
        return unless ( lc( $Opt{n} ) eq lc( $Fields[3] ) );
    }
    if ( $Opt{s} ) {
        $Opt{s} =~ s/\s+//g;    # delete ws
        if ( $Opt{s} =~ /^\+(.*)/ ) {
            return unless ( $Fields[2] > $1 );
        }
        elsif ( $Opt{s} =~ /^\-(.*)/ ) {
            return unless ( $Fields[2] < $1 );
        }
        elsif ( $Opt{s} =~ /^\=(.*)/ ) {
            return unless ( $Fields[2] == $1 );
        }
        elsif ( $Opt{s} =~ /^(.*)\+\-(.*)/ ) {
            return unless ( $Fields[2] >= $1 && $Fields[2] <= $2 );
        }
    }
    if ( $File =~ m{^.*/(.*)/trap_(.*)\.log} ) {
        my $Conn = $1;
        my $Dir  = $2;
        my $Time = strftime( "[%x-%X]", localtime( $Fields[0] ) );
        print "$Time $Conn $Dir $Fields[3] $Fields[1] $Fields[2]\n";
    }
}
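For anyone who wants to play with the merge step on its own, here is a minimal, self-contained sketch of the same idea: hold one pending line per log, emit the one with the earliest stamp, then refill from that log. The names merge_by_stamp and next_entry, the labels and the sample data are all invented for the illustration, and it reads from in-memory filehandles rather than the real log paths. Unlike the script above, it keeps each pending line, its stamp and its handle together in one record, and it skips lines that do not start with a numeric stamp; both are my assumptions about what is convenient, not something the original logs are known to need.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Merge several already-sorted logs into one list ordered by the leading
# epoch stamp. Each source is a [label, filehandle] pair (sketch only).
sub merge_by_stamp {
    my (@sources) = @_;
    my @merged;

    # Prime one pending entry per source; empty sources are dropped.
    my @queue;
    for my $src (@sources) {
        my ( $label, $fh ) = @$src;
        my $entry = next_entry( $label, $fh );
        push @queue, $entry if $entry;
    }

    while (@queue) {

        # Linear scan for the pending entry with the earliest stamp.
        my $min = 0;
        for my $i ( 1 .. $#queue ) {
            $min = $i if $queue[$i]{stamp} < $queue[$min]{stamp};
        }
        my $entry = $queue[$min];
        push @merged, "$entry->{label}: $entry->{line}";

        # Refill from the same source, or retire it at end of file.
        my $next = next_entry( $entry->{label}, $entry->{fh} );
        if ($next) { $queue[$min] = $next }
        else       { splice @queue, $min, 1 }
    }
    return @merged;
}

# Return the next line that starts with a numeric stamp, or nothing at EOF.
sub next_entry {
    my ( $label, $fh ) = @_;
    while ( defined( my $line = <$fh> ) ) {
        chomp $line;
        next unless $line =~ /^\s*(\d+)\s+/;
        return { label => $label, fh => $fh, stamp => $1, line => $line };
    }
    return;
}

# Sample data standing in for two trap logs.
my $in_log  = "100 a\n300 c\n";
my $out_log = "200 b\nnot a log line\n400 d\n";
open my $in_fh,  '<', \$in_log  or die "in-memory open failed: $!";
open my $out_fh, '<', \$out_log or die "in-memory open failed: $!";

print "$_\n" for merge_by_stamp( [ 'in', $in_fh ], [ 'out', $out_fh ] );
```

Run as is, this prints the four sample entries in stamp order (in: 100 a, out: 200 b, in: 300 c, out: 400 d). The linear scan is fine for a handful of logs; with many files a heap would probably be the next thing to try.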

The only manual changes I have made above from the tested version are the directory paths. All the other code has been tested with test data (IWFM).

Hope that helps. :-) Thanks for making my boring Saturday more interesting. Please post back if you see any issues.

Cheers,

-- Dave :-)


$q=[split+qr,,,q,~swmi,.$,],+s.$.Em~w^,,.,s,.,$&&$$q[pos],eg,print

Replies are listed 'Best First'.
Re: Re: Re: Re: Log parsing by timestamp dilema
by Limbic~Region (Chancellor) on Feb 01, 2003 at 20:51 UTC
    DaveH,
    Thanks!
    My logical interpretation of adrianh's solution was pretty much correct - I just couldn't see it in the code. This works as is, but I am going to test its speed against tall_man's suggestion as it runs considerably slower. I know that it is doing a lot more work, so this is expected, and with the $|++ the humans viewing it shouldn't really notice a difference. Nonetheless, I am going to code my own version of the logic to see if I can speed it up, in addition to benching it against a version using File::MergeSort. If I can't do any better than your integration of adrianh's solution, the only change I will make is to make it an option and not the default. This way it will not affect the overall speed if someone chooses to do a -c and only look at one connector log.

    Cheers - L~R

      I'd be interested to see what you come up with. :-) That was definitely a "quick" hack of your original code, so there was definitely room for improvement.

      Glad it helped.

      Cheers,

      -- Dave :-)


      $q=[split+qr,,,q,~swmi,.$,],+s.$.Em~w^,,.,s,.,$&&$$q[pos],eg,print
