PerlMonks  

bspencer

by bspencer (Acolyte)
on Jan 28, 2017 at 12:16 UTC


Posts by bspencer
Multi-CPU when reading STDIN and small tasks in Seekers of Perl Wisdom
3 direct replies
by bspencer
on Jan 28, 2017 at 08:14

    Background: A central syslog server receives syslog messages from servers and passes messages of a certain type to a Perl script, which reads them on STDIN. The script's task is to flatten each multi-line message into a single line and write it to disk. Because related lines may not arrive together, I use a hash index to group the lines that belong to the same event. Some events end with an "end of event" marker, which triggers the write; others do not, and I time those out after 5 seconds.

    Issue: The script is currently single-threaded, and I expect that to be the first limit reached, since rsyslog is multithreaded. I've read a bit about threading in Perl, which could help, but what I found seems aimed at workloads that do substantial work per item, whereas this script spends most of its time on very small tasks: 1) looking for a couple of patterns in a single stream; 2) writing data out; 3) occasionally stepping through a couple of loops to write out expired data.

    Question: Given the nature of this script, would threading help, or do the single input stream and the need to relate the data before writing get in the way?
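One constraint behind the question: any split across workers would have to keep all lines of an event together. A minimal sketch of one way to do that, assuming lines are partitioned by node name. The helper `worker_for`, the worker count, and the node names are hypothetical and not part of the script.

```perl
use strict;
use warnings;

# Map a node name to a worker index in 0 .. $workers-1.
# unpack's "%32C*" template yields a 32-bit checksum of the bytes,
# so the same node name always lands on the same worker.
sub worker_for {
    my ($node, $workers) = @_;
    return unpack('%32C*', $node) % $workers;
}

# Hypothetical node names, invented for illustration.
for my $node (qw(web01 web02 db01)) {
    printf "%s -> worker %d\n", $node, worker_for($node, 4);
}
```

Since every line of an event carries the same node, a stable mapping like this would keep each event on a single worker. Whether the extra hand-off costs more than the small per-line work saves is exactly the open question.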

    The main code loop:

    while(<>){
        chop;
        if (/^node=(\S+).*audit\((\d+\....):(\d+)\)/){
            if (! $time{$2}{$1}{$3}){
                $time{$2}{$1}{$3}=1;
            }
            if (/^node=(\S+) type=EOE msg=audit\((\d+\....):(\d+)\)/){
                print_data($1,$2,$3);
                $totalevents++;
            }else{
                push(@{$data{$1}{"$2:$3"}},$_);
            }
        }
        $cnt++;
        if ($cnt > $agecheck){
            # see if entries have aged off and should be written out
            $date=&dateonly;
            while (my ($t)=each(%time)){
                if ($t < (time() - $age)){
                    foreach my $host (keys(%{$time{$t}})){
                        foreach my $event (keys(%{$time{$t}{$host}})){
                            logit("Aged: node=$host $t:$event");
                            print_data($host,$t,$event);
                            update_stats();
                        }
                    }
                }
            }
            $cnt=0;
        }
        $totallines++;
    }
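To make the keying concrete, here is a minimal sketch of what the first pattern in the loop captures from a line. The sub `parse_audit_line` and the sample lines are hypothetical, invented for illustration; only the regular expressions come from the loop above.

```perl
use strict;
use warnings;

# Split one audit line into (node, timestamp, event id, is_eoe).
# Returns an empty list when the line does not look like an audit record.
sub parse_audit_line {
    my ($line) = @_;
    return unless $line =~ /^node=(\S+).*audit\((\d+\....):(\d+)\)/;
    my ($node, $ts, $event) = ($1, $2, $3);
    # An EOE ("end of event") record marks the event as complete.
    my $is_eoe = $line =~ /^node=\S+ type=EOE / ? 1 : 0;
    return ($node, $ts, $event, $is_eoe);
}

# Hypothetical sample lines, invented for illustration.
my @sample = (
    'node=web01 type=SYSCALL msg=audit(1485590000.123:456): arch=c000003e',
    'node=web01 type=EOE msg=audit(1485590000.123:456):',
);
for my $line (@sample) {
    my ($node, $ts, $event, $is_eoe) = parse_audit_line($line);
    print "node=$node key=$ts:$event eoe=$is_eoe\n";
}
```

Both sample lines share the key `1485590000.123:456` under node `web01`, which is why they end up in the same `%data` slot. Note that `\d+\....` assumes exactly three characters after the decimal point in the timestamp.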

    sub print_data {
        ...
        # Dedup the data based on data in the string
        # Parent: http://www.perlmonks.org/bare/?node_id=104565
        # specific post: http://www.perlmonks.org/bare/?node_id=104602
        my $singleline=join(" ",@{$data{$host}{"$time:$event"}});
        $databefore+=length($singleline);
        $singleline=~s/((\S+)\s?)/$count{$2}++ ? '' : $1/eg;
        $dataafter+=length($singleline);
        print ${$fh} "$singleline\n";
        ...
    }
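For readers unfamiliar with the dedup substitution, here is a small standalone sketch of what it does to one joined line. The wrapper `dedup_tokens` and the input string are hypothetical. Resetting `%count` on every call is also an assumption, since the elided parts of `print_data` do not show how `%count` is managed.

```perl
use strict;
use warnings;

# Drop every whitespace-delimited token that has already been seen,
# keeping the first occurrence in place (same substitution as print_data).
sub dedup_tokens {
    my ($singleline) = @_;
    my %count;    # assumption: fresh per call so events do not affect each other
    $singleline =~ s/((\S+)\s?)/$count{$2}++ ? '' : $1/eg;
    return $singleline;
}

# Hypothetical input, invented for illustration.
print dedup_tokens('node=web01 type=SYSCALL node=web01 type=PATH'), "\n";
# prints "node=web01 type=SYSCALL type=PATH"
```

Each token is matched together with at most one trailing whitespace character, so when the last token of a line is a duplicate, the space before it is left behind.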
