PerlMonks  

Regex to catch IPV4 and IPV6 whenever ip appears within brackets

by theravadamonk (Scribe)
on Jul 10, 2018 at 04:13 UTC (#1218209=perlquestion)
theravadamonk has asked for the wisdom of the Perl Monks concerning the following question:

My maillog has IPV4 and IPV6 addresses within brackets. I am trying to write a regex to catch ONLY IPV4 and IPV6 addresses when they appear within brackets. (An IPV6 address may be compressed or uncompressed.)

Please keep in mind that my maillog begins with a timestamp like this:

2018 Jun 26 09:05:15 (it contains ":", and IPV6 addresses contain it too)

This is how IPs appear in the maillog file:

(209.85.208.68)

(172.217.194.27 < 209.85.208.68)

(172.217.194.27 < 2001:4860:4860:0:0:0:0:8888)

(2001:4860:4860:0:0:0:0:8888)

(2001:4860:4860:0:0:0:0:8888 < 2001:4860:4860::8844)

(2001:4860:4860:0:0:0:0:8888 < 172.217.194.27)

Sometimes, it appears in this way too. It may have IPV6 as well.

(172.217.194.27 < 172.217.194.27 < 209.85.208.68)

Anyway, I started with the regex below. It doesn't do the job; it can only catch IPV4.

 [\(\d+\.\d+\.\d+\.\d+ \<\)]+

Shall we try?

This may be a tiny task for Perl monks. I have been working on it since yesterday. Hope to hear from you.

Re: Regex to catch IPV4 and IPV6 whenever ip appears within brackets
by AnomalousMonk (Chancellor) on Jul 10, 2018 at 04:47 UTC

    The latest version of Regexp::Common::net is said to support IPv4 and IPv6 IP addresses, but I cannot say how complete the support is for IPv6. See Regexp::Common for instructions on how to use the  net (and other) module extension(s); it's slightly roundabout.


    Give a man a fish:  <%-{-{-{-<

Re: Regex to catch IPV4 and IPV6 whenever ip appears within brackets
by NetWallah (Canon) on Jul 10, 2018 at 04:49 UTC
    Somewhat crude, but passes your test cases without false positives:
    use strict;
    use warnings;
    use feature "say";

    while (<DATA>){
        my @ips = m/\(([\d\.:]+)[\s<>]*([\d\.:]+)?[\s<>]*([\d\.:]+)?\)/;
        $_ and say "$. >$_<" for @ips;
    }
    __DATA__
    email
    (209.85.208.68)
    (172.217.194.27 < 209.85.208.68)
    (172.217.194.27 < 2001:4860:4860:0:0:0:0:8888)
    (2001:4860:4860:0:0:0:0:8888)
    (2001:4860:4860:0:0:0:0:8888 < 2001:4860:4860::8844)
    (2001:4860:4860:0:0:0:0:8888 < 172.217.194.27)
    (2001:4860:4860:0:0:0:0:9999 < 172.217.194.29 Not terminated by close paren
    Sometimes, it appears in this way too. It may have IPV6 as well.
    (172.217.194.27 < 172.217.194.27 < 209.85.208.68)
    Anyway, I stared with below. It won't fulfill. It can catch ipv

    Output:

    2 >209.85.208.68<
    3 >172.217.194.27<
    3 >209.85.208.68<
    4 >172.217.194.27<
    4 >2001:4860:4860:0:0:0:0:8888<
    5 >2001:4860:4860:0:0:0:0:8888<
    6 >2001:4860:4860:0:0:0:0:8888<
    6 >2001:4860:4860::8844<
    7 >2001:4860:4860:0:0:0:0:8888<
    7 >172.217.194.27<
    10 >172.217.194.27<
    10 >172.217.194.27<
    10 >209.85.208.68<

                    Memory fault   --   brain fried

      Thanks for your reply.

      IPV6 addresses contain letters as well. See the examples below:

      2404:6800:4003:c04::5e

      2600:3c00::f03c:91ff:fedf:f016

      So, in your code, \d can be replaced with \w:

       my @ips = m/\(([\w\.:]+)[\s<>]*([\w\.:]+)?[\s<>]*([\w\.:]+)?\)/;

      \w matches any letter, digit or underscore. For ASCII text it is equivalent to [a-zA-Z0-9_].

      It's greedy, isn't it? Anyway, thanks for your effort.
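A possible tightening of the pattern above, offered only as a sketch: IPv6 groups use hexadecimal digits, so a class of hex digits, dots and colons is narrower than \w, and a repeated "< address" group with /g handles any number of addresses rather than at most three. The sub name ips_in_parens and the sample line are made up for illustration. The pattern selects candidates only; it does not check that a candidate is a valid address, so strings such as (123) or (dead) would also be returned.

```perl
use strict;
use warnings;

# Candidate address: hex digits, dots and colons only. This is narrower
# than \w, which would also accept g-z and the underscore.
my $cand = qr/[0-9A-Fa-f.:]+/;

# Hypothetical helper: return every candidate found inside parentheses.
sub ips_in_parens {
    my ($line) = @_;
    my @ips;
    # One parenthesized group: a candidate, then any number of "< candidate".
    while ($line =~ /\(\s*($cand(?:\s*<\s*$cand)*)\s*\)/g) {
        push @ips, split /\s*<\s*/, $1;
    }
    return @ips;
}

my $line = '2018 Jun 26 09:05:15 host '
         . '(2404:6800:4003:c04::5e < 172.217.194.27 < 209.85.208.68)';
print "$_\n" for ips_in_parens($line);
```

The timestamp at the start of the line is ignored because it is not enclosed in parentheses, even though it contains colons.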

Re: Regex to catch IPV4 and IPV6 whenever ip appears within brackets
by AnomalousMonk (Chancellor) on Jul 10, 2018 at 06:50 UTC
    I stared with below. It won't fulfill. It can catch ipv4
     [\(\d+\.\d+\.\d+\.\d+ \<\)]+

    One big character class like this is not the way to match structured substrings. This class is equivalent to  [()\d. +<]+ which will match '12345' or '.+.+.+.' or '()(()((()' as well as any dotted decimal IPv4 address, valid or otherwise. Please see perlre and perlretut on the general topic of "character class(es)" (and also perlrecharclass).
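    The point about the character class can be checked with a few lines of Perl. This is only a sketch; the sub name whole_match and the variable names are made up for illustration. It anchors both the original class and the simplified one to the whole string and shows that they accept the same junk strings.

```perl
use strict;
use warnings;

my $original   = qr/[\(\d+\.\d+\.\d+\.\d+ \<\)]+/;   # the class from the question
my $simplified = qr/[()\d. +<]+/;                    # the equivalent, shorter class

# Hypothetical helper: 1 if the whole string consists of class characters.
sub whole_match {
    my ($rx, $str) = @_;
    return $str =~ /\A$rx\z/ ? 1 : 0;
}

for my $str ('12345', '.+.+.+.', '()(()((()', '(209.85.208.68)', 'abc') {
    printf "%-18s original=%d simplified=%d\n",
        "'$str'", whole_match($original, $str), whole_match($simplified, $str);
}
```

    Every string except 'abc' is accepted by both classes, which is why a class alone cannot describe the structure of an address.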

    The following works (or seems to) for IPv4 dotted decimal addresses only, but it should be fairly easy to extend the  $rx_IPv4_dd regex to include IPv6 using the latest Regexp::Common::net. Needs Perl version 5.10+ for the  (?|pattern) "branch reset" extension. Needs many more test cases.

    c:\@Work\Perl\monks>perl -wMstrict -le
    "use 5.010;
     ;;
     use Test::More 'no_plan';
     use Test::NoWarnings;
     ;;
     use Regexp::Common qw(net);
     ;;
     my $rx_IPv4_dd = qr{ (?<! \d) $RE{net}{IPv4} (?! \d) }xms;
     my $rx_arrow = qr{ \s+ < \s+ }xms;
     ;;
     VECTOR:
     for my $ar_vector (
       'all these contain one or more IPv4 addresses',
       [ 'x (209.85.208.68) x (172.217.194.27 < 123.23.34.45) x',
         '209.85.208.68', '172.217.194.27', '123.23.34.45',
       ],
       [ ' < 209.85.208.68) x (172.217.194.27 < 123.23.34.45) x',
         '172.217.194.27', '123.23.34.45',
       ],
       [ 'x (1.2.3.4) x (9.8.7.6 < 5.6.7.8) x',
         '1.2.3.4', '9.8.7.6', '5.6.7.8',
       ],
       'none of these should match',
       [ '', ],
       [ 'x', ],
       [ '123', ],
       [ '209.85.208.68 (999.12.23.999) (12.23.34.45 98.76.54.32)', ],
       ) {
       if (not ref $ar_vector) {
         note $ar_vector;
         next VECTOR;
         }
       ;;
       my ($str, @expected) = @$ar_vector;
       ;;
       my @got = $str =~ m{
           (?| (?: [(] ($rx_IPv4_dd) (?= (?: $rx_arrow $rx_IPv4_dd)? [)] ) )
             | (?: \G (?! \A) $rx_arrow ($rx_IPv4_dd) [)] )
           )
           }xmsg;
       is_deeply \@got, \@expected, qq{'$str' -> (@expected)};
       }
     ;;
     done_testing;
     ;;
     exit;
    "
    # all these contain one or more IPv4 addresses
    ok 1 - 'x (209.85.208.68) x (172.217.194.27 < 123.23.34.45) x' -> (209.85.208.68 172.217.194.27 123.23.34.45)
    ok 2 - ' < 209.85.208.68) x (172.217.194.27 < 123.23.34.45) x' -> (172.217.194.27 123.23.34.45)
    ok 3 - 'x (1.2.3.4) x (9.8.7.6 < 5.6.7.8) x' -> (1.2.3.4 9.8.7.6 5.6.7.8)
    # none of these should match
    ok 4 - '' -> ()
    ok 5 - 'x' -> ()
    ok 6 - '123' -> ()
    ok 7 - '209.85.208.68 (999.12.23.999) (12.23.34.45 98.76.54.32)' -> ()
    1..7
    ok 8 - no warnings
    1..8


    Give a man a fish:  <%-{-{-{-<

Re: Regex to catch IPV4 and IPV6 whenever ip appears within brackets
by jwkrahn (Monsignor) on Jul 10, 2018 at 04:54 UTC

      Hmm, many thanks for your input. Really useful for me.

      Here I test it with the code below:

      #!/usr/bin/perl
      use strict;
      use warnings;
      use Regexp::IPv4 qw($IPv4_re);
      use Regexp::IPv6 qw($IPv6_re);

      my $ipv4_address = "192.168.0.10";
      my $ipv6_address = "2600:3c00::f03c:91ff:fedf:f016";

      $ipv4_address =~ /^$IPv4_re$/ and print "IPv4 address\n";
      $ipv6_address =~ /^$IPv6_re$/ and print "IPv6 address\n";

      Hmm. It works.

Re: Regex to catch IPV4 and IPV6 whenever ip appears within brackets
by Anonymous Monk on Jul 10, 2018 at 15:12 UTC
    If you specifically want to catch IP addresses in brackets, I would first look for bracketed strings that contain digits, e.g. /(\(.*?[0-9].*?\))/g, specifying "non-greedy" so that you get the shortest possible string, and applying this pattern as often as necessary to each input line. For each captured string, apply a Regexp::Common pattern to search for IP addresses, also applying the pattern multiple times per string. The check for digits occurring within the brackets is to reduce the number of false positives on the first match loop.
      If a group could contain an inner parenthesized group then you may wish to use greedy patterns instead.
        If this is an issue, I would prefer to use Regexp::Common::Balanced.
        Bill
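A minimal sketch of the two-stage idea from this sub-thread, under some loudly stated assumptions: the second stage here uses inet_pton from the core Socket module (Perl 5.14 or later) in place of a Regexp::Common pattern, fields are assumed to be separated by "<" as in the original log samples, and the sub name bracketed_ips is made up for illustration. The first stage uses [^()]* rather than .*? so that a match cannot run from one parenthesized group into the next; it does not handle nested parentheses.

```perl
use strict;
use warnings;
use Socket qw(inet_pton AF_INET AF_INET6);

# Hypothetical helper: return the valid addresses found inside parentheses.
sub bracketed_ips {
    my ($line) = @_;
    my @ips;
    # Stage 1: parenthesized strings that contain at least one digit.
    while ($line =~ /\(([^()]*[0-9][^()]*)\)/g) {
        # Stage 2: split on "<" and keep the fields that parse as an address.
        for my $field (split /\s*<\s*/, $1) {
            $field =~ s/\A\s+|\s+\z//g;    # trim surrounding whitespace
            push @ips, $field
                if defined inet_pton(AF_INET,  $field)
                or defined inet_pton(AF_INET6, $field);
        }
    }
    return @ips;
}

my $line = '2018 Jun 26 09:05:15 (note) (172.217.194.27 < 2001:4860:4860::8844)';
print "$_\n" for bracketed_ips($line);
```

Because inet_pton does the validation, an out-of-range candidate such as (999.12.23.999) passes the first stage but is dropped in the second.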

      This "sundial" guy has been here for how many years? and he still doesn't know that <tt>...</tt> is not the way to mark off code. Dumbshit.

