PerlMonks  

Using variable to hold regex expression

by salatconed (Initiate)
on Mar 11, 2013 at 23:07 UTC ( #1022892=perlquestion )
salatconed has asked for the wisdom of the Perl Monks concerning the following question:

I am trying to use variables to hold regex expressions to make the code easier to write, and I'm running into one issue while parsing firewall logs.

When I use the same regex variable multiple times, the first group returns the correct result, but the second one shows part of the first IP address.

-- Sample data Mar 10 07:42:38 DR-FW-1 : %ASA-6-305011: Built dynamic UDP translation from inside:172.28.17.130/3324 to outside(internet-traffic):69.176.102.83/24295

output:
re1 -> 172.28.17.130
re2 -> 17.
------------------------------------------
my $Raw_Log = "";
my $re_ipv4 = qr/(([2]([0-4][0-9]|[5][0-5])|[0-1]?[0-9]?[0-9])[.]){3}(([2]([0-4][0-9]|[5][0-5])|[0-1]?[0-9]?[0-9]))/;

# Open file to read lines
my $logfile = $ARGV[0];
my $linenum = 0;
open(LOGFILEHD, $logfile);
while( <LOGFILEHD>){
    $Raw_Log = $_;
    print "$Raw_Log\n";
    $Raw_Log =~ /($re_ipv4).*($re_ipv4)/;
    print "re1 -> $1\n";
    print "re2 -> $2\n";
    $linenum += 1;
}
close(LOGFILEHD);

Re: Using variable to hold regex expression
by choroba (Canon) on Mar 11, 2013 at 23:12 UTC
    $1 corresponds to the first opening capturing parenthesis, $2 corresponds to the second one, and so on. The second ($re_ipv4) opens with the eighth parenthesis, so you probably want to use $8 instead of $2. Let us count:
    ((([2]([0-4][0-9]|[5][0-5])|[0-1]?[0-9]?[0-9])[.]){3}(([2]([0-4][0-9]|[5][0-5])|[0-1]?[0-9]?[0-9]))).*(
    123   4                                              56   7                                           8
    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
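To check the counting above, here is a minimal sketch that runs the question's pattern against the question's sample line and reads the groups by list index. The line is shortened to the message text, which is an assumption made for brevity, and the names @greedy and @lazy are mine, not from the thread. It also shows something the count alone does not: because .* is greedy, the second address is matched as late as possible and loses its leading digit. A lazy .*? is one possible way around that.

```perl
use strict;
use warnings;

# The pattern from the question, unchanged: it contains six capturing groups.
my $re_ipv4 = qr/(([2]([0-4][0-9]|[5][0-5])|[0-1]?[0-9]?[0-9])[.]){3}(([2]([0-4][0-9]|[5][0-5])|[0-1]?[0-9]?[0-9]))/;

# Sample line from the question, shortened to the message text.
my $line = 'Built dynamic UDP translation from inside:172.28.17.130/3324 '
         . 'to outside(internet-traffic):69.176.102.83/24295';

# Greedy .* as in the question. In list context a match without /g
# returns ($1, $2, ...), so index 0 is $1, index 1 is $2, index 7 is $8.
my @greedy = $line =~ /($re_ipv4).*($re_ipv4)/;

# Lazy .*? lets the second address start at the first place it can match.
my @lazy = $line =~ /($re_ipv4).*?($re_ipv4)/;

print "\$1          -> $greedy[0]\n";
print "\$2          -> $greedy[1]\n";
print "\$8 (greedy) -> $greedy[7]\n";
print "\$8 (lazy)   -> $lazy[7]\n";
```

With this sample line, $2 comes out as 17. (the last repetition of the octet-plus-dot group), the greedy $8 as 9.176.102.83 and the lazy $8 as 69.176.102.83.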
Re: Using variable to hold regex expression
by Kenosis (Priest) on Mar 11, 2013 at 23:36 UTC

    Have you considered using Regexp::Common::net to capture those IPs?

    use strict;
    use warnings;
    use Regexp::Common qw/net/;

    while (<DATA>) {
        if ( my ( $firstIP, $secondIP ) = /($RE{net}{IPv4})/g ) {
            print "FirstIP: $firstIP\nSecondIP: $secondIP\n\n";
        }
    }

    __DATA__
    Sample data Mar 10 07:42:38 DR-FW-1 : %ASA-6-305011: Built dynamic UDP translation from inside:172.28.17.130/3324 to outside(internet-traffic):69.176.102.83/24295
    Sample data Mar 10 07:42:38 DR-FW-1 : %ASA-6-305011: Built dynamic UDP translation from inside:155.0.42.42/3324 to outside(internet-traffic):71.200.20.7/24295

    Output:

    FirstIP: 172.28.17.130
    SecondIP: 69.176.102.83

    FirstIP: 155.0.42.42
    SecondIP: 71.200.20.7
      Perhaps using IP addresses was not a good example. I'm trying to figure out how to parse a string that contains repetitive data, so that I can write the regex once and get multiple matches back if they exist, the same way your code got both IP addresses in one call.
        ... how to parse a string which has repetitive data ...

        As choroba pointed out, every (pattern) pair of parentheses in a regex captures something (possibly even undef) to its corresponding capture variable. One way to parse a string using nested regexes is to avoid using a gazillion capturing groups: use the non-capturing (?:pattern) for grouping instead. See perlre, perlrequick, perlretut. In the IP example (but this should generalize to any repetitive data you wish to extract):

        >perl -wMstrict -le
        "my $decimal_octet = qr{ 2 (?: [0-4] \d | 5 [0-5]) | [01]? \d? \d }xms;
         my $ip = qr{ (?<! \d) $decimal_octet (?: \. $decimal_octet){3} (?! \d) }xms;
         print $ip;
         ;;
         my $s = '123.45.6.234 xx yyy zz 000.12.34.255';
         my @ips = $s =~ m{ $ip }xmsg;
         printf qq{'$_' } for @ips;
        "
        (?^msx: (?<! \d) (?^msx: 2 (?: [0-4] \d | 5 [0-5]) | [01]? \d? \d ) (?: \. (?^msx: 2 (?: [0-4] \d | 5 [0-5]) | [01]? \d? \d )){3} (?! \d) )
        '123.45.6.234' '000.12.34.255'

        Note that neither (?:pattern) nor the (?<!pattern) and (?!pattern) look-around assertions capture. Indeed, nothing captures (to a capture variable), since the data is extracted in list context directly to an array.
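If you would rather keep the question's original shape, with $1 and $2 holding the two addresses, the same non-capturing idea applies. The following is a rough sketch, not code from the thread: $octet, $ip, $src and $dst are names I made up, and the sample line is shortened to the message text. Only the two explicit pairs of parentheses capture, so the numbering no longer depends on what is inside $ip.

```perl
use strict;
use warnings;

# One octet, grouped with (?:...) so that it adds no capture groups.
my $octet = qr/(?:25[0-5]|2[0-4][0-9]|[01]?[0-9]?[0-9])/;

# A dotted quad built from the octet. The look-arounds keep a match from
# starting or ending in the middle of a longer run of digits.
my $ip = qr/(?<![0-9])$octet(?:[.]$octet){3}(?![0-9])/;

# Sample line from the question, shortened to the message text.
my $line = 'Built dynamic UDP translation from inside:172.28.17.130/3324 '
         . 'to outside(internet-traffic):69.176.102.83/24295';

# $1 and $2 are now simply the first and second address.
my ( $src, $dst );
if ( $line =~ /($ip).*($ip)/ ) {
    ( $src, $dst ) = ( $1, $2 );
    print "re1 -> $src\n";
    print "re2 -> $dst\n";
}
```

The look-behind also matters for the greedy .* here: without it, the second match could begin one digit into the address.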

        If I'm understanding you correctly, the my ( $firstIP, $secondIP ) = /($RE{net}{IPv4})/g in the above code does what you've described.
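For repetitive data where each repetition has more than one field, a /g match in a while loop may be easier to work with than one big list assignment. This is only a sketch under assumptions: the $endpoint pattern and the ip/port hash keys are hypothetical, and the pattern is deliberately loose (it does not range-check the octets). The point is that the regex is written once and applied as many times as the line allows.

```perl
use strict;
use warnings;

# Sample line from the question, shortened to the message text.
my $line = 'Built dynamic UDP translation from inside:172.28.17.130/3324 '
         . 'to outside(internet-traffic):69.176.102.83/24295';

# Hypothetical, loose pattern: a dotted quad of 1-3 digit numbers,
# a slash, then a port number. Two capture groups: address and port.
my $endpoint = qr{((?:\d{1,3}\.){3}\d{1,3})/(\d+)};

# In scalar context, each /g match resumes where the previous one ended,
# so the loop body runs once per address/port pair found in the line.
my @endpoints;
while ( $line =~ /$endpoint/g ) {
    push @endpoints, { ip => $1, port => $2 };
}

print "$_->{ip} port $_->{port}\n" for @endpoints;
```

Each pass through the loop sees a fresh $1 and $2, so the capture numbering stays the same however many repetitions the line contains.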
