So, you're expecting a user to come up with a set of threshold values to put on the command line? How likely is it, really, that a user will want to try a bunch of different variations of threshold values? (In fact, how likely is it that a user already knows what ranges of values are going to be useful?)
Any a priori assumptions that reduce the user's "cognitive load" would be worth building into the script -- e.g. maybe threshold values should always be evenly spaced over an appropriate range, and users would just say how many thresholds (histogram bins) they want on a given run.
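To make the "evenly spaced" idea concrete, here is a minimal sketch. The helper name even_thresholds and the 1000 to 10000 range are assumptions chosen for illustration, not anything from the original script:

```perl
use strict;
use warnings;

# Hypothetical helper: return $bins evenly spaced thresholds covering
# the range $min .. $max (both ends included).
sub even_thresholds {
    my ( $min, $max, $bins ) = @_;
    die "need at least 2 bins\n" if $bins < 2;
    my $step = ( $max - $min ) / ( $bins - 1 );
    return map { $min + $_ * $step } 0 .. $bins - 1;
}

my @thresh = even_thresholds( 1000, 10000, 4 );
print "@thresh\n";    # 1000 4000 7000 10000
```

With something like this, the user only has to supply the bin count, and the range could come from defaults.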
Regarding the code you posted, I'd offer a few "stylistic" points:
use Getopt::Long; # or Getopt::Std, which might be easier to grok.
That will make it easy to offer useful default values for things like number of bins, start-time and end-time. There could even be a default value for the name of the log file to read.
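A rough sketch of what that could look like with Getopt::Long. The option names, the default log file name, and the parse_args wrapper are all assumptions made up for this example; adjust them to whatever the script really needs:

```perl
use strict;
use warnings;
use Getopt::Long qw(GetOptionsFromArray);

# Parse options out of an argument list, falling back to defaults.
# The option names and default values here are illustrative assumptions.
sub parse_args {
    my @args = @_;
    my %opt  = (
        log   => 'feedhandler.log',    # hypothetical default log file
        start => '00:00:00',
        end   => '23:59:59',
        bins  => 4,
    );
    GetOptionsFromArray(
        \@args,
        'log=s'   => \$opt{log},
        'start=s' => \$opt{start},
        'end=s'   => \$opt{end},
        'bins=i'  => \$opt{bins},
    ) or die "usage: $0 [--log FILE] [--start HH:MM:SS] [--end HH:MM:SS] [--bins N]\n";
    return %opt;
}

my %opt = parse_args(@ARGV);
print "reading $opt{log} from $opt{start} to $opt{end} in $opt{bins} bins\n";
```

Any option the user leaves off the command line simply keeps its default, so the common case needs no arguments at all.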
Perl gives a warning about line 59 -- it's harmless, but worth fixing.
When there's an "if" block that always ends with "exit 1" (which should just be "die"), there's no need for an "else" block after that (you can eliminate a layer of nesting). Likewise, you don't need an "else" block that contains just a next statement, given that there's nothing after that block in the enclosing loop.
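Roughly, the two simplifications above might look like this. The sub names check_args and count_at_or_above are invented for the sketch and are not part of the posted script:

```perl
use strict;
use warnings;

# Hypothetical check_args(): validate the argument count up front and
# die on failure, so the main logic needs no "else" block.
sub check_args {
    my @args = @_;
    die "usage: $0 log-file start-time end-time threshold [date]\n"
        if @args < 4 or @args > 5;
    return @args;    # normal path continues at the same nesting level
}

# Inside a loop, "next unless" replaces an "if ... else { next }" pair.
sub count_at_or_above {
    my ( $limit, @values ) = @_;
    my $count = 0;
    for my $v (@values) {
        next unless $v >= $limit;
        $count++;
    }
    return $count;
}

print count_at_or_above( 5000, 87, 6200, 341, 5000 ), "\n";    # 2
```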
Assuming you have an array of threshold values, you just need to make sure the array values are sorted, and loop over them to work out which bin a given value should be counted in -- here's a simple example that leaves aside all your other issues about selecting/excluding log entries:
my @thresh = ( 1000, 4000, 7000, 10000 );
my @bins;
while (<LOG>) {
    my $val = ( split )[10];
    next unless ( $val =~ /^\d+$/ );
    my $i;
    for $i ( 0 .. $#thresh ) {
        last if ( $val < $thresh[$i] );
    }
    $bins[$i]++;
}
(UPDATED to give appropriate scope to $i -- thanks to wfsp for pointing that out.)
Geez! As GrandFather points out below, I really didn't get that right. Even after wfsp had told me it wouldn't work, I still had it wrong. What I should have suggested was something like this (thanks, GrandFather):
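my @thresh = ( 1000, 4000, 7000, 10000 );    # must be sorted ascending
my @bins;
while (<LOG>) {
    my $val = ( split )[10];
    next unless ( $val =~ /^\d+$/ );
    # advance to the first threshold that $val does not exceed;
    # values above the last threshold land in an extra bin at the end
    my $i = 0;
    while ( $i < @thresh and $val > $thresh[$i] ) {
        $i++;
    }
    $bins[$i]++;
}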
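For anyone who wants to try the binning logic without a log file, here is a self-contained sketch of the same while-loop approach wrapped in a sub. The name bin_index and the sample values are made up for illustration:

```perl
use strict;
use warnings;

# Return the index of the bin for $val: bin $i holds values up to and
# including $thresh->[$i]; one extra bin at the end catches anything
# larger than the last threshold. Assumes @$thresh is sorted ascending.
sub bin_index {
    my ( $val, $thresh ) = @_;
    my $i = 0;
    $i++ while $i < @$thresh and $val > $thresh->[$i];
    return $i;
}

my @thresh = ( 1000, 4000, 7000, 10000 );
my @bins   = (0) x ( @thresh + 1 );

# made-up sample values, not taken from the real log
$bins[ bin_index( $_, \@thresh ) ]++ for ( 87, 341, 1000, 5200, 12000 );

for my $i ( 0 .. $#bins ) {
    my $label = $i < @thresh ? "<= $thresh[$i]" : "> $thresh[-1]";
    printf "%-10s %d\n", $label, $bins[$i];
}
```

Pre-filling @bins with zeros avoids undefined entries for empty bins when printing.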
Thanks Graf, I'll give this a try tonight or tomorrow.
Realistically, the user will likely specify thresholds between 1000 and 10,000, spaced about 5000 apart. So 1000, 5000 and 10000 are probably the only thresholds this script will ever see, but I wanted to leave the user's options open.
You're probably right, and I should just hard-code a range of thresholds to be run by default and maybe provide an override to run the script using a single user-defined threshold.
Note the correction to my code snippet -- if $i were lexically scoped in the "for" statement (as originally posted), it would be unavailable after exiting that loop.
but I can provide code examples of my current script
Please do. Make it so that we can run it too. Provide a few lines of your input file, and provide the output you expect to eliminate any ambiguity. See also: histogram
Did you really mean 2000, not 2500?
#!/usr/bin/perl
# Description: Read SRLabs feed handler log file, search for and print "pending queues" exceeding a given threshold within a given time-frame.
# Usage: Run script without command line arguments for usage details
use strict;
my $hostname = `hostname -s`;
$hostname =~ s/^\s*(\S*(?:\s+\S+)*)\s*$/$1/;
my $total; # variable to be used for queue total for the given timeframe
my $count = 0;
# Check usage. Exit if incorrect, then display usage details
my $numArgs = $#ARGV + 1;
if (($numArgs <= 3) || ($numArgs > 5)) {
    print "\nusage:\n";
    print usage();
    exit 1;
}
else {
    # Get command line arguments, convert time to seconds
    my $logFile = $ARGV[0];
    my $sTime= $ARGV[1];
    my @sTime=split(/:/,$sTime); # split start time
    my $sSecs=$sTime[0] * 3600 + $sTime[1] * 60 + $sTime[2]; # convert start-time to seconds
    my $eTime = $ARGV[2];
    my @eTime=split(/:/,$eTime); # split end time
    my $eSecs=$eTime[0] * 3600 + $eTime[1] * 60 + $eTime[2]; # convert stop-time to seconds
    my $tHold = $ARGV[3];
    my $date = $ARGV[4];
    # Get today's date which will be used as the default. Will add option to enter date, manually, soon
    my($day, $month, $year) = (localtime)[3,4,5];
    $month = sprintf '%02d', $month+1;
    $day = sprintf '%02d', $day;
    $year = $year+1900;
    #my $ymd = "$year-$month-$day";
    my $ymd = "2011-09-12";
    open LOGFILE, "<", "$logFile" or die $!;
    while (<LOGFILE>){
        my $line=$_;
        chomp;
        my @data=split(/ /,$line); # split the line up
        unless (($data[10] =~ m/Pending=/)) { next; } # skip elements we don't want
        my $lineInfo = $data[9];
        $lineInfo =~ s/[Connection\[\].]//g;
        $data[10] =~ s/[A-Za-z=.]//g; # delete "Pending", "=" and "."
        $data[1] =~ s/\..*//g; # delete the millisecond element of the (current)cTime var
        my @cTime=split(/:/,$data[1]); # split the current time
        my $curSecs=$cTime[0] * 3600 + $cTime[1] * 60 + $cTime[2]; # convert current-time to seconds
        if (($data[0] eq $ymd) && ($data[10] >= $tHold) && ($curSecs >= $sSecs) && ($curSecs <= $eSecs))
        {
            $count+1;
            $total += $data[10];
            $count++;
        } else { next; }
    }
    print "$hostname,$ymd,$sTime-$eTime,$tHold,$count,$total\n";
    print "\n$hostname ($ymd)\n";
    print "$count pending queues meet or exceed $tHold\n";
    print "Aggregate ($sTime to $eTime): $total\n\n"; # print the Queue for the timeframe
}
close LOGFILE;
sub usage {
    print "\n\./readLog.pl [log-file] [start-time] [end-time] [threshold]\n\n";
    print " log-file Log file name\n";
    print " start-time Start time as HH:MM [e.g. \"06:00\"]\n";
    print " end-time End time as HH:MM [e.g. \"13:15\"]\n";
    print " max-pending Pending queue threshold [e.g. \"1000\"]\n\n";
}
-------------------------------------------------------------------------------------------------------------------
Example log file output
2011-09-12 10:32:16.285 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP41-44. Pending=0. Total Messages processed by ConnectionOPRA-GRP41-44=105700000
2011-09-12 10:32:16.499 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP17-20. Pending=0. Total Messages processed by ConnectionOPRA-GRP17-20=121930000
2011-09-12 10:32:16.876 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP33-36. Pending=0. Total Messages processed by ConnectionOPRA-GRP33-36=106670000
2011-09-12 10:32:16.935 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP45-48. Pending=0. Total Messages processed by ConnectionOPRA-GRP45-48=98140000
2011-09-12 10:32:16.966 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP5. Pending=0. Total Messages processed by ConnectionOPRA-GRP5=31930000
2011-09-12 10:32:17.073 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP17-20. Pending=0. Total Messages processed by ConnectionOPRA-GRP17-20=121940000
2011-09-12 10:32:17.123 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP29-32. Pending=0. Total Messages processed by ConnectionOPRA-GRP29-32=108610000
2011-09-12 10:32:17.172 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP37-38. Pending=0. Total Messages processed by ConnectionOPRA-GRP37-38=63700000
2011-09-12 10:32:17.196 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP9-12. Pending=0. Total Messages processed by ConnectionOPRA-GRP9-12=119390000
2011-09-12 10:32:17.236 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP13-16. Pending=87. Total Messages processed by ConnectionOPRA-GRP13-16=104110000
2011-09-12 10:32:17.248 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP39-40. Pending=6. Total Messages processed by ConnectionOPRA-GRP39-40=51620000
2011-09-12 10:32:17.301 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP41-44. Pending=341. Total Messages processed by ConnectionOPRA-GRP41-44=105710000
2011-09-12 10:32:17.330 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP1-4. Pending=2. Total Messages processed by ConnectionOPRA-GRP1-4=93230000
2011-09-12 10:32:17.374 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP7-8. Pending=0. Total Messages processed by ConnectionOPRA-GRP7-8=63680000
2011-09-12 10:32:17.390 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP25-28. Pending=0. Total Messages processed by ConnectionOPRA-GRP25-28=105010000
2011-09-12 10:32:17.392 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP21-24. Pending=2. Total Messages processed by ConnectionOPRA-GRP21-24=89610000
2011-09-12 10:32:17.418 pmmd-ltc-fsrlabs02 13996,14019: (WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP6. Pending=0. Total Messages processed by ConnectionOPRA-GRP6=26110000
-------------------------------------------------------------------------------------------------------------------
hostname,2011-09-14,8:30-15:00,5000,492,3704405
hostname (2011-09-14)
492 pending queues meet or exceed 5000
Aggregate (8:30 to 15:00): 3704405
I did mean 2500, but forgot to change the earlier thresholds. I only put 2500 in to make it clear that the thresholds won't be static and are decided entirely by the user.
Oops... that logfile excerpt needs to be in a code-tag, too. Can you please edit the post? We only need to see one or two of the lines.
Pardon the sloppy code; I'm sure there are much easier ways to write that script. I'm still learning and just needed to whip something up quickly.
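As one possibly simpler take on the parsing step, a single regex can pull out the fields instead of splitting on spaces and scrubbing each piece. This is only a sketch: the sub name parse_line is invented, and the pattern assumes every line of interest has the same layout as the sample lines in this thread:

```perl
use strict;
use warnings;

# Pull date, time-of-day in seconds, connection name, and pending count
# out of one log line; returns an empty list if the line does not match.
sub parse_line {
    my ($line) = @_;
    my ( $date, $h, $m, $s, $conn, $pending ) = $line =~ m{
        ^(\d{4}-\d\d-\d\d) \s+            # date
        (\d\d):(\d\d):(\d\d)\.\d+ \s+     # time; milliseconds dropped
        .*? Received \s data \s on \s Connection(\S+?)\. \s+
        Pending=(\d+)\.
    }x or return;
    return ( $date, $h * 3600 + $m * 60 + $s, $conn, $pending );
}

my $sample = '2011-09-12 10:32:17.236 pmmd-ltc-fsrlabs02 13996,14019: '
    . '(WALL, MDConnection.cpp:420) Received data on ConnectionOPRA-GRP13-16. '
    . 'Pending=87. Total Messages processed by ConnectionOPRA-GRP13-16=104110000';
my ( $date, $secs, $conn, $pending ) = parse_line($sample);
print "$date $secs $conn $pending\n";    # 2011-09-12 37937 OPRA-GRP13-16 87
```

Lines that do not match (for example, ones without a Pending= field) come back as an empty list and can be skipped with "next unless".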
Hey jb, I'm new to Perl and wanted to know if you can explain the variables in your script.
Thanks, AJ