PerlMonks  

Some assistance with splitting variables

by Anonymous Monk
on Apr 05, 2013 at 20:00 UTC (#1027211, perlquestion)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone - I have a log file which is in the following format:

19476 2013-04-05,12:10:51.909293 host:internal.machine44.company.net main INFO Running normally with ACTION=<processing> FAN_A=<OK> FAN_B=<OK> SEND=<Sent mail (221 2.0.0 Service closing transmission channel)> FAILURE=<2>

What I'd like to do is split this into a few different pieces. The first number is a line number, which I do not need, and I do not need the host: information either. So I'd like to get the data into this format:

    $date => 2013-04-05,12:10:51.909293
    $info[ACTION] => processing
    $info[FAN_A] => OK
    $info[FAN_B] => OK
    $info[SEND] => Sent mail (221 2.0.0 Service closing transmission channel)
    $info[FAILURE] => 2

I can do this by performing multiple greps, but can someone show me whether there is a more efficient way to read the information in, using names such as FAN_A and FAN_B as the array elements and the text between the <>s as their values?

Thanks in advance.

Re: Some assistance with splitting variables
by Skeeve (Vicar) on Apr 05, 2013 at 20:17 UTC

    Seems like you're just interested in stuff of the form:

    (\w+)=<(.*?)>

    So something like this should help:

        while (<$log>) {
            my %info;    # a hash, so that $info{$1} below has somewhere to store into
            $info{$1} = $2 while (/(\w+)=<(.*?)>/g);
        }
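A minimal, self-contained sketch of the same idea, assuming the pairs are collected into a hash named %info. The helper name parse_pairs and the shortened sample string are illustrative only and are not part of the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every KEY=<value> pair on one line into a hash.
sub parse_pairs {
    my ($line) = @_;
    my %info;
    # In scalar context, //g resumes after the previous match on each pass.
    $info{$1} = $2 while $line =~ /(\w+)=<(.*?)>/g;
    return %info;
}

# Shortened sample line (an assumption, not the full log format).
my $sample = 'ACTION=<processing> FAN_A=<OK> FAILURE=<2>';
my %info   = parse_pairs($sample);
print "$_ => $info{$_}\n" for sort keys %info;
```

Because the value pattern is non-greedy, a value such as the SEND text, which contains parentheses and spaces, is still captured whole up to the first closing >.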

    s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
    +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e
Re: Some assistance with splitting variables
by hdb (Prior) on Apr 05, 2013 at 20:28 UTC

    Or like this:

        use strict;
        use warnings;

        my $line="19476 2013-04-05,12:10:51.909293 host:internal.machine44.company.net main INFO Running normally with ACTION=<processing> FAN_A=<OK> FAN_B=<OK> SEND=<Sent mail (221 2.0.0 Service closing transmission channel)> FAILURE=<2>";

        $line =~ s/\d+ (.*?) /DATE=<$1> /; # to give you the date as well ;)
        my %info = $line =~ /(\w+)=<(.*?)>/g;

        print "$_:$info{$_}\n" for keys %info;

    UPDATE: Added the date field...

Re: Some assistance with splitting variables
by ww (Bishop) on Apr 05, 2013 at 21:53 UTC
    1. Sometimes, using multiple regexen is easier or clearer than trying to do all the work with one (more complex) regex.
    2. The code below also captures the first element the OP wanted (the date), and is perhaps somewhat easier to read.
        #!/usr/bin/perl
        use 5.016;
        # 1027211
        use Data::Dumper;

        my @info;
        my $info = '19476 2013-04-05,12:10:51.909293 host:internal.machine44.company.net main INFO Running normally with ACTION=<processing> FAN_A=<OK> FAN_B=<OK> SEND=<Sent mail (221 2.0.0 Service closing transmission channel)> FAILURE=<2> ';

        # The date regex is excessively detailed. A well written char_class
        # would be an improvement, as would using quantifiers.
        if ($info =~ /\d+\s(\d\d\d\d-\d\d-\d\d,\d\d:\d\d:\d\d\.\d+) / ) {
            push @info, $1;
        }

        while ( $info=~ /(\w+=<.*?)>/g) {
            push @info, $1;
        }
        say Dumper @info;
    output:
        $VAR1 = '2013-04-05,12:10:51.909293';
        $VAR2 = 'ACTION=<processing';
        $VAR3 = 'FAN_A=<OK';
        $VAR4 = 'FAN_B=<OK';
        $VAR5 = 'SEND=<Sent mail (221 2.0.0 Service closing transmission channel)';
        $VAR6 = 'FAILURE=<2';
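The captured strings in this output still contain the =< separator. One possible follow-up, assuming the same layout (date first, then KEY=<value strings), is to split each remaining element once on =< to get a hash. The name pairs_to_hash is hypothetical, and the sample list is a shortened copy of the output shown.

```perl
use strict;
use warnings;

# Hypothetical helper: turn 'KEY=<value' strings into a hash.
sub pairs_to_hash {
    my @items = @_;
    my %h;
    for my $item (@items) {
        # A limit of 2 keeps any later '=<' inside the value intact.
        my ($key, $value) = split /=</, $item, 2;
        $h{$key} = $value if defined $value;   # skip items without a separator
    }
    return %h;
}

# Shortened copy of the captured list (an assumption for illustration).
my @info = ('2013-04-05,12:10:51.909293', 'ACTION=<processing', 'FAN_A=<OK', 'FAILURE=<2');
my $date = shift @info;               # the first element is the timestamp
my %info = pairs_to_hash(@info);

print "$date\n";
print "$_ => $info{$_}\n" for sort keys %info;
```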

    If you didn't program your executable by toggling in binary, it wasn't really programming!

Re: Some assistance with splitting variables
by kcott (Abbot) on Apr 06, 2013 at 06:17 UTC

    This captures the data in the format you want:

        $ perl -Mstrict -Mwarnings -E '
            use constant {
                ACTION => 0,
                FAN_A => 1,
                FAN_B => 2,
                SEND => 3,
                FAILURE => 4,
            };

            my $re = qr{
                \A
                \d+ \s+
                (?<date> \S+ )
                .*?
                ACTION=< (?<ACTION> [^>]+ ) > \s+
                FAN_A=< (?<FAN_A> [^>]+ ) > \s+
                FAN_B=< (?<FAN_B> [^>]+ ) > \s+
                SEND=< (?<SEND> [^>]+ ) > \s+
                FAILURE=< (?<FAILURE> [^>]+ )
            }x;

            my $log_line = q{19476 2013-04-05,12:10:51.909293 host:internal.machine44.company.net main INFO Running normally with ACTION=<processing> FAN_A=<OK> FAN_B=<OK> SEND=<Sent mail (221 2.0.0 Service closing transmission channel)> FAILURE=<2>};

            $log_line =~ /$re/;

            my ($date, @info) = @+{qw{date ACTION FAN_A FAN_B SEND FAILURE}};

            say $date;
            say $info[ACTION];
            say $info[FAN_A];
            say $info[FAN_B];
            say $info[SEND];
            say $info[FAILURE];
        '
        2013-04-05,12:10:51.909293
        processing
        OK
        OK
        Sent mail (221 2.0.0 Service closing transmission channel)
        2

    This is just a commandline proof-of-concept. Your real-world application would probably look more like:

        ...
        use constant { ...
        my $re = qr{ ...

        open my $log_fh, '<', $logfile or die $!;

        while (my $log_line = <$log_fh>) {
            # skip lines that do not match, so %+ never holds stale captures
            next unless $log_line =~ /$re/;
            my ($date, @info) = @+{qw{date ACTION FAN_A FAN_B SEND FAILURE}};
            # do something with $date and @info here
        }

    See perlre - Extended Patterns for details of Named Capture Groups: (?<NAME>pattern)
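As a small illustration of named captures, here is a sketch that copies %+ into an ordinary hash, and does so only when the match succeeds. The pattern is deliberately shortened to two fields, and parse_line and the sample line are hypothetical, not taken from the code above.

```perl
use strict;
use warnings;

# Shortened pattern for illustration: only the date and ACTION are captured.
my $re = qr{
    \A \d+ \s+
    (?<date> \S+ )
    .*?
    ACTION=< (?<ACTION> [^>]+ ) >
}x;

# Hypothetical helper: returns a hash ref of the named captures,
# or nothing (undef in scalar context) when the line does not match.
sub parse_line {
    my ($line) = @_;
    return unless $line =~ $re;
    my %captures = %+;    # copy now; %+ changes on the next successful match
    return \%captures;
}

my $rec = parse_line('19476 2013-04-05,12:10:51.909293 host:x main INFO ACTION=<processing>');
print "$rec->{date} $rec->{ACTION}\n" if $rec;
```

Copying %+ straight away matters because it always reflects the most recent successful match, so a failed match on a later line would otherwise leave the previous line's values in place.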

    -- Ken
