PerlMonks  

Some assistance with splitting variables

by Anonymous Monk
on Apr 05, 2013 at 20:00 UTC (#1027211)
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello everyone - I have a log file which is in the following format:

19476 2013-04-05,12:10:51.909293 host:internal.machine44.company.net main INFO Running normally with ACTION=<processing> FAN_A=<OK> FAN_B=<OK> SEND=<Sent mail (221 2.0.0 Service closing transmission channel)> FAILURE=<2>

What I'd like to do is split this into a few different pieces. The first number is a line number, which I do not need; I do not need the host: information either. So I'd like to get the data into this format:

$date          => 2013-04-05,12:10:51.909293
$info[ACTION]  => processing
$info[FAN_A]   => OK
$info[FAN_B]   => OK
$info[SEND]    => Sent mail (221 2.0.0 Service closing transmission channel)
$info[FAILURE] => 2

I can do this by performing multiple greps, but can someone show me a more efficient way to read the information in, using items such as FAN_A and FAN_B as the array elements and the text between the <>s as their values?

Thanks in advance.

Re: Some assistance with splitting variables
by Skeeve (Vicar) on Apr 05, 2013 at 20:17 UTC

    Seems like you're just interested in stuff of the form:

    (\w+)=<(.*?)>

    So something like this should help:

    while (<$log>) {
        my %info;    # a hash, since it is filled via $info{...}
        $info{$1} = $2 while (/(\w+)=<(.*?)>/g);
    }
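For anyone who wants to try the idea on the sample line from the question, here is a minimal self-contained sketch. The helper name parse_line and the date pattern are assumptions added for illustration; they are not part of the reply above.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): parse one log line into
# a date string and a hash reference of KEY => value pairs.
sub parse_line {
    my ($line) = @_;

    # Skip the leading line number; the next whitespace-delimited
    # field is taken to be the timestamp.
    my ($date) = $line =~ /\A\d+\s+(\S+)/;

    # Collect every KEY=<value> pair found on the line.
    my %info;
    while ($line =~ /(\w+)=<(.*?)>/g) {
        $info{$1} = $2;
    }
    return ($date, \%info);
}

my $sample = '19476 2013-04-05,12:10:51.909293 host:internal.machine44.company.net main INFO Running normally with ACTION=<processing> FAN_A=<OK> FAN_B=<OK> SEND=<Sent mail (221 2.0.0 Service closing transmission channel)> FAILURE=<2>';

my ($date, $info) = parse_line($sample);
print "$date\n";
print "$_ => $info->{$_}\n" for sort keys %$info;
```

Returning a hash reference keeps the fields addressable by name ($info->{FAN_A}), which is probably closer to what the question is after than a numerically indexed array.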

    s$$([},&%#}/&/]+}%&{})*;#$&&s&&$^X.($'^"%]=\&(|?*{%
    +.+=%;.#_}\&"^"-+%*).}%:##%}={~=~:.")&e&&s""`$''`"e
Re: Some assistance with splitting variables
by hdb (Prior) on Apr 05, 2013 at 20:28 UTC

    Or like this:

    use strict;
    use warnings;

    my $line="19476 2013-04-05,12:10:51.909293 host:internal.machine44.company.net main INFO Running normally with ACTION=<processing> FAN_A=<OK> FAN_B=<OK> SEND=<Sent mail (221 2.0.0 Service closing transmission channel)> FAILURE=<2>";

    $line =~ s/\d+ (.*?) /DATE=<$1> /; # to give you the date as well ;)
    my %info = $line =~ /(\w+)=<(.*?)>/g;

    print "$_:$info{$_}\n" for keys %info;

    UPDATE: Added the date field...
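One thing to keep in mind with the hash approach is that keys %info comes back in no particular order. A small sketch of printing the fields in a fixed order with a hash slice follows; the @fields list is an assumption based on the sample line, not something the log format guarantees.

```perl
use strict;
use warnings;

my $line = '19476 2013-04-05,12:10:51.909293 host:internal.machine44.company.net main INFO Running normally with ACTION=<processing> FAN_A=<OK> FAN_B=<OK> SEND=<Sent mail (221 2.0.0 Service closing transmission channel)> FAILURE=<2>';

# In list context a /g match returns a flat list of all captures
# (key1, value1, key2, value2, ...), which initialises the hash directly.
my %info = $line =~ /(\w+)=<(.*?)>/g;

# Assumed field order, taken from the sample line in the question.
my @fields = qw(ACTION FAN_A FAN_B SEND FAILURE);

# A hash slice pulls the values out in exactly that order.
my @values = @info{@fields};

print "$fields[$_]: $values[$_]\n" for 0 .. $#fields;
```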

Re: Some assistance with splitting variables
by ww (Bishop) on Apr 05, 2013 at 21:53 UTC
    1. Sometimes, using multiple regexen is easier or clearer than trying to do all the work with one (more complex) regex
    2. Captures the first element OP wanted; also perhaps somewhat easier to read.
    #!/usr/bin/perl
    use 5.016;
    # 1027211
    use Data::Dumper;

    my @info;
    my $info = '19476 2013-04-05,12:10:51.909293 host:internal.machine44.company.net main INFO Running normally with ACTION=<processing> FAN_A=<OK> FAN_B=<OK> SEND=<Sent mail (221 2.0.0 Service closing transmission channel)> FAILURE=<2> ';

    # Excessively detailed. A well written char_class would be an
    # improvement, as would using quantifiers.
    if ($info =~ /\d+\s(\d\d\d\d-\d\d-\d\d,\d\d:\d\d:\d\d\.\d+) / ) {
        push @info, $1;
    }

    while ( $info=~ /(\w+=<.*?)>/g) {
        push @info, $1;
    }
    say Dumper @info;
    output:
    $VAR1 = '2013-04-05,12:10:51.909293';
    $VAR2 = 'ACTION=<processing';
    $VAR3 = 'FAN_A=<OK';
    $VAR4 = 'FAN_B=<OK';
    $VAR5 = 'SEND=<Sent mail (221 2.0.0 Service closing transmission channel)';
    $VAR6 = 'FAILURE=<2';
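The code comment above mentions that quantifiers would tidy up the date pattern. A rough sketch of what that might look like is below; the pattern held in $date_re is one possible tightening, not the poster's own, and the second regex is changed to capture key and value separately so the stray "<" disappears from the output.

```perl
use strict;
use warnings;

# Hypothetical tightened pattern: quantifiers replace the repeated \d runs.
my $date_re = qr/\A\d+\s+(\d{4}-\d{2}-\d{2},\d{2}:\d{2}:\d{2}\.\d+)\s/;

my $info = '19476 2013-04-05,12:10:51.909293 host:internal.machine44.company.net main INFO Running normally with ACTION=<processing> FAN_A=<OK> FAN_B=<OK> SEND=<Sent mail (221 2.0.0 Service closing transmission channel)> FAILURE=<2> ';

my @info;
push @info, $1 if $info =~ $date_re;

# [^>]* is a negated character class: everything up to the closing bracket.
while ($info =~ /(\w+)=<([^>]*)>/g) {
    push @info, "$1=$2";
}

print "$_\n" for @info;
```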

    If you didn't program your executable by toggling in binary, it wasn't really programming!

Re: Some assistance with splitting variables
by kcott (Abbot) on Apr 06, 2013 at 06:17 UTC

    This captures the data in the format you want:

    $ perl -Mstrict -Mwarnings -E '
        use constant {
            ACTION => 0, FAN_A => 1, FAN_B => 2, SEND => 3, FAILURE => 4,
        };

        my $re = qr{
            \A \d+ \s+ (?<date> \S+ ) .*?
            ACTION=< (?<ACTION> [^>]+ ) > \s+
            FAN_A=< (?<FAN_A> [^>]+ ) > \s+
            FAN_B=< (?<FAN_B> [^>]+ ) > \s+
            SEND=< (?<SEND> [^>]+ ) > \s+
            FAILURE=< (?<FAILURE> [^>]+ )
        }x;

        my $log_line = q{19476 2013-04-05,12:10:51.909293 host:internal.machine44.company.net main INFO Running normally with ACTION=<processing> FAN_A=<OK> FAN_B=<OK> SEND=<Sent mail (221 2.0.0 Service closing transmission channel)> FAILURE=<2>};

        $log_line =~ /$re/;
        my ($date, @info) = @+{qw{date ACTION FAN_A FAN_B SEND FAILURE}};

        say $date;
        say $info[ACTION];
        say $info[FAN_A];
        say $info[FAN_B];
        say $info[SEND];
        say $info[FAILURE];
    '
    2013-04-05,12:10:51.909293
    processing
    OK
    OK
    Sent mail (221 2.0.0 Service closing transmission channel)
    2

    This is just a commandline proof-of-concept. Your real-world application would probably look more like:

    ...
    use constant { ...
    my $re = qr{ ...

    open my $log_fh, '<', $logfile or die $!;
    while (my $log_line = <$log_fh>) {
        $log_line =~ /$re/;
        my ($date, @info) = @+{qw{date ACTION FAN_A FAN_B SEND FAILURE}};
        # do something with $date and @info here
    }

    See perlre - Extended Patterns for details of Named Capture Groups: (?<NAME>pattern)
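If a hash keyed by field name suits better than constants indexing an array, the named captures can be copied straight out of %+. The sketch below also checks whether the match succeeded before using the captures. The helper name parse_log_line is hypothetical, and the pattern assumes every line carries all five fields in this order.

```perl
use strict;
use warnings;

# Same shape as the pattern above, with a closing > after FAILURE.
my $re = qr{
    \A \d+ \s+ (?<date> \S+ ) .*?
    ACTION=<  (?<ACTION>  [^>]+ ) > \s+
    FAN_A=<   (?<FAN_A>   [^>]+ ) > \s+
    FAN_B=<   (?<FAN_B>   [^>]+ ) > \s+
    SEND=<    (?<SEND>    [^>]+ ) > \s+
    FAILURE=< (?<FAILURE> [^>]+ ) >
}x;

# Hypothetical helper: returns a hash ref of the named captures,
# or undef when the line does not match the expected layout.
sub parse_log_line {
    my ($line) = @_;
    return unless $line =~ $re;

    # Copy %+ now; the next successful match would overwrite it.
    my %captures = %+;
    return \%captures;
}

my $sample = '19476 2013-04-05,12:10:51.909293 host:internal.machine44.company.net main INFO Running normally with ACTION=<processing> FAN_A=<OK> FAN_B=<OK> SEND=<Sent mail (221 2.0.0 Service closing transmission channel)> FAILURE=<2>';

my $rec = parse_log_line($sample);
print "$rec->{date}: $rec->{SEND}\n" if $rec;
```

Lines that do not fit the layout come back as undef, so a reading loop can skip or report them instead of silently reusing captures from an earlier line.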

    -- Ken
