
Syslog files revisited

by stevbutt (Novice)
on Aug 02, 2012 at 21:34 UTC (#985103)
stevbutt has asked for the wisdom of the Perl Monks concerning the following question:

I have been trying to load some syslog files into a DB, but the format has proved too variable, so I had another idea for how to deal with them.

My lines look like the following:

May  2 04:06:15 lon-pop.mail.mydom.com pop3login: LOGOUT, user=gonenow, ip=[::ffff:127.0.0.1], top=0, retr=0, rcvd=24, sent=5560, time=1

The first three parts are pretty standard, i.e. datetime (though with no year), sysloghost and progname, followed by the very variable message.

What I would like to do is convert this to a CSV line as follows, broken into at least four parts. If the string user=gonenow appears, then add gonenow (or whatever the user's name was) as an extra field; the same goes for ip=.

02/05/2012 04:06:15,lon-pop.mail.mydom.com,pop3login,"LOGOUT, user=gonenow, ip=[::ffff:127.0.0.1], top=0, retr=0, rcvd=24, sent=5560, time=1",gonenow,::ffff:127.0.0.1

I don't mind either way whether I end up with empty fields at the end (i.e. ,,) or the line just ends after the closing quote of the message, but I have been banging my head trying to figure this out for a few days.

Re: Syslog files revisited
by johngg (Abbot) on Aug 02, 2012 at 22:38 UTC

    split and sprintf will probably do what you want.

    knoppix@Microknoppix:~$ perl -Mstrict -Mwarnings -E '
    > my $line = q{May 2 04:06:15 lon-pop.mail.mydom.com pop3login: LOGOUT, user=gonenow, ip=[::ffff:127.0.0.1], top=0, retr=0, rcvd=24, sent=5560, time=1};
    > my ( $mon, $day, $time, $dom, $login, $remainder ) =
    >     split m{:?\s+}, $line, 6;
    > my %monthNos = do {
    >     my $no = 0;
    >     map { $_ => ++ $no }
    >         qw{ Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec };
    > };
    >
    > my $yr = q{2012};
    > my $csv = sprintf q{%02d/%02d/%s %s,%s,%s,%s},
    >     $day, $monthNos{ $mon }, $yr, $time, $dom, $login, $remainder;
    >
    > say $csv;'
    02/05/2012 04:06:15,lon-pop.mail.mydom.com,pop3login,LOGOUT, user=gonenow, ip=[::ffff:127.0.0.1], top=0, retr=0, rcvd=24, sent=5560, time=1
    knoppix@Microknoppix:~$

    I hope this is helpful.

    Update: After replying to your reply I realised I had totally missed the need to quote the variable message and to add the user and ip to the CSV line. Here is the revised code.

    knoppix@Microknoppix:~$ perl -Mstrict -Mwarnings -E '
    > my $line = q{May 2 04:06:15 lon-pop.mail.mydom.com pop3login: LOGOUT, user=gonenow, ip=[::ffff:127.0.0.1], top=0, retr=0, rcvd=24, sent=5560, time=1};
    > my ( $mon, $day, $time, $dom, $login, $remainder ) =
    >     split m{:?\s+}, $line, 6;
    > my %monthNos = do {
    >     my $no = 0;
    >     map { $_ => ++ $no }
    >         qw{ Jan Feb Mar Apr May Jun Jul Aug Sep Oct Nov Dec };
    > };
    >
    > my $yr = q{2012};
    > my ( $user, $ip ) =
    >     $remainder =~ m{user=([^,]+),\s+ip=\[([^\]]+)};
    > $remainder = qq{"$remainder"};
    > my $csv = sprintf q{%02d/%02d/%s %s,%s,%s,%s,%s,%s},
    >     $day, $monthNos{ $mon }, $yr, $time, $dom,
    >     $login, $remainder, $user, $ip;
    >
    > say $csv;'
    02/05/2012 04:06:15,lon-pop.mail.mydom.com,pop3login,"LOGOUT, user=gonenow, ip=[::ffff:127.0.0.1], top=0, retr=0, rcvd=24, sent=5560, time=1",gonenow,::ffff:127.0.0.1
    knoppix@Microknoppix:~$

    Cheers,

    JohnGG

      This is certainly a move in the right direction and the date part is superb! However, it gives me the problem that the variable message at the end gets split up into an indeterminate number of CSV values. If it were at least quoted, I could read it as one field, as the commas inside the quotes would be ignored (I think).
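On the quoting point: the usual CSV convention is that a field wrapped in double quotes may contain commas, and that a double quote inside such a field is written twice. A minimal sketch of that convention follows; `csv_quote` is a hypothetical helper name, not something from this thread, and a CSV module would be the sturdier choice for real data.

```perl
use strict;
use warnings;

# Hypothetical helper: quote one CSV field, doubling any embedded double
# quotes so that commas and quotes inside the field stay in one field.
sub csv_quote {
    my ($field) = @_;
    ( my $escaped = $field ) =~ s{"}{""}g;
    return qq{"$escaped"};
}

# Only the free-form message needs quoting; the other parts hold no commas.
my @fields = (
    q{02/05/2012 04:06:15},
    q{lon-pop.mail.mydom.com},
    q{pop3login},
    csv_quote(q{LOGOUT, user=gonenow, ip=[::ffff:127.0.0.1], top=0}),
);

print join( q{,}, @fields ), "\n";
```

Whether the commas inside the quotes are really ignored depends on the program that reads the file, so it is worth checking against the DB loader in use.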

        ... but it gives me the problem that the variable message at the end gets split up into an indeterminate number of csv values ...

        No, the third parameter to split limits the number of resultant fields so the $remainder scalar variable holds the entirety of your variable message.
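To see the effect of the limit in isolation, here is a small sketch that splits the same kind of line with and without it. The `split_syslog` wrapper is a hypothetical name used only for this illustration, and the sample line is shortened.

```perl
use strict;
use warnings;

# Hypothetical wrapper: split a syslog line into at most $limit fields.
# With a positive limit the last field keeps everything left over,
# commas and spaces included; a limit of 0 means "no limit".
sub split_syslog {
    my ( $line, $limit ) = @_;
    return split m{:?\s+}, $line, $limit;
}

my $line = q{May  2 04:06:15 lon-pop.mail.mydom.com pop3login: LOGOUT, user=gonenow, ip=[::ffff:127.0.0.1], top=0};

my @limited   = split_syslog( $line, 6 );
my @unlimited = split_syslog( $line, 0 );

print scalar(@limited),   " fields with the limit, last one: $limited[-1]\n";
print scalar(@unlimited), " fields without it\n";
```

With the limit in place the message arrives as one scalar, so the only place it can be broken up again is when the CSV line is read back, which is where the quoting matters.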

        Cheers,

        JohnGG

      Thanks John. Almost all lines are now being loaded correctly, but I have one further problem with occasional lines, specifically with the user= part.

      Sometimes it says this

      user=gonenow sent=34 recv=34

      Or

      user=<gonenow>, etc etc

      instead of the comma-separated list I expected.

      Is there a way to extract just my user value up to either a comma or a space, and to strip < > if they occur?

        The user=<gonenow>, etc etc case can probably be got around with a couple of minor changes to the code. The user=gonenow sent=34 recv=34 is more of a problem as it fails the premise on which the split approach was based. I'd probably wrap the original code in an if condition and deal with the non-comma-separated lines in an else clause.
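One possible way to cover both variants, sketched here rather than taken from the code above, is to pull user= and ip= out of the message with two independent patterns instead of one combined match. The helper names `extract_user` and `extract_ip` are hypothetical, and the patterns assume that a value never contains whitespace, a comma or a bracket.

```perl
use strict;
use warnings;

# Hypothetical helper: return the value after "user=", skipping an optional
# "<" and stopping at the first comma, whitespace or angle bracket.
# Returns an empty string when there is no user= part.
sub extract_user {
    my ($message) = @_;
    my ($user) = $message =~ m{\buser=<?([^\s,<>]+)};
    return defined $user ? $user : q{};
}

# Hypothetical helper: same idea for "ip=", with optional square brackets.
sub extract_ip {
    my ($message) = @_;
    my ($ip) = $message =~ m{\bip=\[?([^\s,\[\]]+)};
    return defined $ip ? $ip : q{};
}

for my $message (
    q{LOGOUT, user=gonenow, ip=[::ffff:127.0.0.1], top=0},
    q{user=gonenow sent=34 recv=34},
    q{user=<gonenow>, etc etc},
    )
{
    printf "%s,%s\n", extract_user($message), extract_ip($message);
}
```

Because each helper falls back to an empty string, lines without a user= or ip= part simply produce empty trailing fields, and no undefined values reach sprintf.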

        It just so happens that I am about to go on holiday so I will not be able to provide further help for a week or so. I suggest you create another question in SoPW, linking to this thread, and post some example data lines illustrating your problem so that others can perhaps help you build on the solution you have so far.

        Cheers,

        JohnGG
