PerlMonks  

Re: Counting concurrent event jobs

by gaal (Parson)
on Apr 24, 2006 at 17:06 UTC ( [id://545330] )


in reply to Counting concurrent event jobs

If this is the level of detail you need, it looks like you don't have to correlate specific start and end events. So all you need is one counter per command type (e.g., "split" and "filter").

Replies are listed 'Best First'.
Re^2: Counting concurrent event jobs
by mantadin (Beadle) on Apr 24, 2006 at 18:05 UTC

    In that case, your script would print out the results for each time interval and then go on to evaluate the next lines of the logfiles, like this:

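    while (<>) {
        # List context, so $timestring gets the captured text
        # rather than the true/false result of the match.
        ($timestring) = $_ =~ $some_regexp;
        $min = get_nr_of_mins($timestring);
        if (/start:/) {
            /split/  and ++$split;
            /filter/ and ++$filter;
        }
        elsif (/finish:/) {
            /split/  and --$split;
            /filter/ and --$filter;
        }
        # Prints for every log line that falls in a reporting minute.
        if ($min % $granularity == 0) {
            print "$timestring: $split, $filter\n";
        }
    }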

    (Note that this gives you the number of processes at the time of the last line read from the logfile, not an average value for the last interval.)
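If an average over each interval is wanted instead of a snapshot, one option is to weight each count by how long it was in effect. The following is only a rough sketch under assumptions that are not from this thread: events are already parsed into `[epoch_seconds, action, jobtype]` triples sorted by time, and the sub name `average_active` is made up for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: time-weighted average number of active jobs of one
# type over the window [$from, $to). Events are [epoch_secs, action, jobtype]
# triples, assumed to be sorted by time.
sub average_active {
    my ($events, $jobtype, $from, $to) = @_;
    my ($count, $area, $prev) = (0, 0, $from);
    for my $ev (@$events) {
        my ($t, $action, $type) = @$ev;
        next unless $type eq $jobtype;
        last if $t >= $to;
        if ($t > $from) {
            # Credit the time spent at the count that held until now.
            $area += $count * ($t - $prev);
            $prev  = $t;
        }
        $count += $action eq 'start' ? 1 : -1;
    }
    $area += $count * ($to - $prev);    # remainder of the window
    return $area / ($to - $from);
}

# Made-up sample data, times in seconds from the start of the window.
my @events = (
    [   0, 'start',  'split'  ],
    [  60, 'start',  'split'  ],
    [  90, 'start',  'filter' ],
    [ 120, 'finish', 'split'  ],
);

my $avg_split  = average_active(\@events, 'split',  0, 300);
my $avg_filter = average_active(\@events, 'filter', 0, 300);
print "split: $avg_split, filter: $avg_filter\n";
```

Events before the window only adjust the starting count, so jobs already running when the interval begins are counted for the whole time they overlap it.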

    OTOH, if you want to do more sophisticated analysis of the logfile, this approach might be too simple.

Re^2: Counting concurrent event jobs
by vagnerr (Prior) on Apr 24, 2006 at 18:11 UTC
    Unfortunately that is not the case. The processing of these log files goes on for hours and involves hundreds of logs, each taking between a few minutes and an hour or two to run. We need to be able to graph the data (hence the CSV output) and see that, for example, we do a lot of split jobs at one time of day and a lot of filter jobs at another. We need to know because some of the jobs use a lot of CPU, others may use a lot of network bandwidth, and we want to be able to tune things to share the resources we have.


    ___________
    Remember that amateurs built Noah's Ark. Professionals built the Titanic.

        For a more Perlish way, consider using the RRDs module from the RRDtool website, or RRD::Simple or RRDTool::OO from CPAN.

        :)))))
      That still doesn't seem to contradict gaal's approach, and only differs from mantadin's in choosing when and how to report. If I'm missing something, then tell me what is wrong with
      my $REPORT_INTERVAL = 300; # seconds
      my %active = ( 'split' => 0, 'filter' => 0 );
      my $next_report = date_to_timestamp("...start of day...");
      my $last_report = date_to_timestamp("...end of day...");

      while (<>) {
          # Parse out the fields
          my ($date, $action, $jobtype, $logfile) = /.../;

          # Output counts for all report lines between
          # the last printed report and the time of this
          # log line. Most of the time, this will be empty
          # because we won't have reached the next report
          # time yet. This runs before the counts are updated
          # so that each report reflects the state at its own
          # report time, not the event on this line.
          my $stamp = date_to_timestamp($date);
          while ($stamp > $next_report) {
              report_counts($next_report, \%active);
              $next_report += $REPORT_INTERVAL;
          }

          # Update current active job counts
          if ($action eq 'start') {
              ++$active{$jobtype};
          } elsif ($action eq 'finish') {
              --$active{$jobtype};
          } else {
              die "Huh? $_";
          }
      }

      # Finish off the report for the report periods
      # at the end of the reporting range.
      while ($next_report < $last_report) {
          report_counts($next_report, \%active);
          $next_report += $REPORT_INTERVAL;
      }
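The code above leaves `date_to_timestamp` and `report_counts` undefined. One possible shape for them is sketched below, purely as an illustration: the date format `YYYY-MM-DD HH:MM:SS` (read as UTC) is an assumption, since the thread does not show the real log format, and `format_counts` is a made-up helper that keeps the CSV row easy to check.

```perl
use strict;
use warnings;
use Time::Local qw(timegm);

# ASSUMPTION: dates look like "2006-04-24 17:06:00" and are in UTC.
sub date_to_timestamp {
    my ($date) = @_;
    my ($y, $mo, $d, $h, $mi, $s) =
        $date =~ /^(\d{4})-(\d\d)-(\d\d) (\d\d):(\d\d):(\d\d)$/
        or die "Unrecognized date: $date";
    # timegm expects a zero-based month.
    return timegm($s, $mi, $h, $d, $mo - 1, $y);
}

# Hypothetical helper: one CSV row with the timestamp followed by
# one count per job type, in sorted key order.
sub format_counts {
    my ($stamp, $active) = @_;
    return join(',', $stamp, map { $active->{$_} } sort keys %$active) . "\n";
}

sub report_counts {
    my ($stamp, $active) = @_;
    print format_counts($stamp, $active);
}

report_counts(date_to_timestamp("1970-01-01 00:05:00"),
              { 'split' => 2, 'filter' => 1 });
```

Sorting the keys keeps the column order stable from row to row, which matters if the output is fed to a graphing tool.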

      Based on your proposed solution, it seems like you think that you have to correlate a finish event with the start event for that job -- but if all you want is the counts, then as gaal said, the correlation is unnecessary.

      If for some reason you do need to correlate them, then you can always keep all active jobs' state in the %active hash:

      ...
      if ($action eq 'start') {
          $active{$jobtype}{$logfile} = 1;
      } elsif ($action eq 'finish') {
          delete $active{$jobtype}{$logfile};
      }
      ...
      my $split_count = keys %{ $active{'split'} };
      ...
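As a self-contained illustration of that nested-hash variant, here is a small sketch. The event list is made-up sample data standing in for parsed log lines, and the warning on an unmatched finish is an addition that was not part of the snippet above.

```perl
use strict;
use warnings;

my %active = ( 'split' => {}, 'filter' => {} );

# Hypothetical, already-parsed events: [action, jobtype, logfile].
my @events = (
    [ 'start',  'split',  'a.log' ],
    [ 'start',  'split',  'b.log' ],
    [ 'start',  'filter', 'a.log' ],
    [ 'finish', 'split',  'a.log' ],
);

for my $ev (@events) {
    my ($action, $jobtype, $logfile) = @$ev;
    if ($action eq 'start') {
        $active{$jobtype}{$logfile} = 1;
    } elsif ($action eq 'finish') {
        # delete returns the removed value, so a finish with no
        # matching start can be flagged.
        delete $active{$jobtype}{$logfile}
            or warn "finish without start: $jobtype $logfile\n";
    }
}

# keys in scalar context gives the number of active jobs of a type.
my $split_count    = keys %{ $active{'split'} };
my $filter_count   = keys %{ $active{'filter'} };
my @running_splits = sort keys %{ $active{'split'} };
print "split: $split_count (@running_splits), filter: $filter_count\n";
```

Keeping the logfile names as keys costs a little memory but lets you list which jobs are still running at any report time.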
