http://www.perlmonks.org?node_id=722908

ewhitt has asked for the wisdom of the Perl Monks concerning the following question:

I need some help with RegEx. The first two columns display unique IP addresses. I need to grab the number after "1000" and disregard the following numbers. How can I ignore all the unwanted information?
*> 4.23.88.0/23 64.135.0.1 0 1000 234 46164 i
*> 4.23.89.0/24 64.135.0.1 0 1000 20148 46164 i
*> 4.23.92.0/23 64.135.0.1 0 1000 20138 46164 i
*> 4.23.92.0/22 64.135.0.1 0 1000 3018 46164 i
*> 4.23.94.0/23 64.135.0.1 0 1000 40418 46164 i
*> 4.23.112.0/24 64.135.0.1 0 1000 1018 174 21889 i
*> 4.23.113.0/24 64.135.0.1 0 1000 2018 174 21889 i
*> 4.23.114.0/24 64.135.0.1 0 1000 18 174 21889 i
*> 4.36.118.0/24 64.135.0.1 0 1000 7018 174 21889 i

Replies are listed 'Best First'.
Re: Need help with Regex
by toolic (Bishop) on Nov 11, 2008 at 17:15 UTC
    I'm not entirely sure what you want for output because you did not show an example. But, if you just want to grab the 6th column from your input, you could use split instead of a regex:
    use strict;
    use warnings;

    while (<DATA>) {
        my $num = (split)[5];
        print "$num\n";
    }
    __DATA__
    *> 4.23.88.0/23 64.135.0.1 0 1000 234 46164 i
    *> 4.23.89.0/24 64.135.0.1 0 1000 20148 46164 i
    *> 4.23.92.0/23 64.135.0.1 0 1000 20138 46164 i
    *> 4.23.92.0/22 64.135.0.1 0 1000 3018 46164 i
    *> 4.23.94.0/23 64.135.0.1 0 1000 40418 46164 i
    *> 4.23.112.0/24 64.135.0.1 0 1000 1018 174 21889 i
    *> 4.23.113.0/24 64.135.0.1 0 1000 2018 174 21889 i
    *> 4.23.114.0/24 64.135.0.1 0 1000 18 174 21889 i
    *> 4.36.118.0/24 64.135.0.1 0 1000 7018 174 21889 i

    prints:

    234
    20148
    20138
    3018
    40418
    1018
    2018
    18
    7018
      Interesting. I have never seen "split" used like that before. How could I pass it a variable? Say I wanted to print "four" ?
      $output = "one two three four five";
      $output = (split $line)[4];
      Thanks!
        The parentheses around split put it in list context, so it returns a list. The square brackets then select a single element from that list (a list slice). The index can be a constant numeric value, such as 3, or a scalar variable, such as $col. Keep in mind that Perl indexes start at 0, not 1, so "one" is element [0] and "four" is element [3]:
        use strict;
        use warnings;

        my $col = 3;
        my $str = "one two three four five";
        my $output = (split /\s+/, $str)[$col];
        print "output=$output\n";
        __END__
        output=four
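The same list slice can be wrapped in a small sub so the column is passed as an argument. This is only a sketch: the name nth_field is hypothetical and does not come from the thread. It also shows that a negative index counts from the end of the list.

```perl
use strict;
use warnings;

# Hypothetical helper: return the field at zero-based index $col
# from a whitespace-separated string.
sub nth_field {
    my ($str, $col) = @_;
    # split ' ' splits on runs of whitespace and skips leading whitespace.
    return (split ' ', $str)[$col];
}

my $str = "one two three four five";
print nth_field($str, 3),  "\n";   # four
print nth_field($str, -1), "\n";   # five
```

Using split ' ' rather than split /\s+/ avoids an empty first field when the string starts with whitespace.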
Re: Need help with Regex
by swampyankee (Parson) on Nov 11, 2008 at 17:58 UTC

    ...and if toolic's suggestion won't always work, i.e., the number of columns changes, you could try something like this:

    #!/usr/bin/perl
    use strict;
    use warnings;

    while (my $line = <DATA>) {
        # Split once on the " 1000 " field; everything after it lands in $keep.
        (my $junk, my $keep) = split(/\s1000\s/, $line, 2);
        my @datum = split(/\s+/, $keep);
        print shift(@datum), "\n";
    }
    __DATA__
    *> 4.23.88.0/23 64.135.0.1 0 1000 234 46164 i
    *> 4.23.89.0/24 64.135.0.1 0 1000 20148 46164 i
    *> 4.23.92.0/23 64.135.0.1 0 1000 20138 46164 i
    *> 4.23.92.0/22 64.135.0.1 0 1000 3018 46164 i
    *> 4.23.94.0/23 64.135.0.1 0 1000 40418 46164 i
    *> 4.23.112.0/24 64.135.0.1 0 1000 1018 174 21889 i
    *> 4.23.113.0/24 64.135.0.1 0 1000 2018 174 21889 i
    *> 4.23.114.0/24 64.135.0.1 0 1000 18 174 21889 i
    *> 4.36.118.0/24 64.135.0.1 0 1000 7018 174 21889 i
    I know that the second split and the print can be wrapped into one statement. I'm at work, so I don't want to spend too much time on this.
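For what it's worth, one way the second split and the temporary array might be folded together is a list slice. The sketch below assumes a hypothetical sub name, after_1000, and returns nothing for lines that have no " 1000 " field.

```perl
use strict;
use warnings;

# Hypothetical helper: return the first field after the " 1000 " column.
sub after_1000 {
    my ($line) = @_;
    # Two-way split: discard what precedes " 1000 ", keep the remainder.
    my (undef, $keep) = split /\s1000\s/, $line, 2;
    return unless defined $keep;     # no " 1000 " field on this line
    return (split ' ', $keep)[0];    # list slice replaces the temporary array
}

print after_1000('*> 4.23.88.0/23 64.135.0.1 0 1000 234 46164 i'), "\n";   # 234
```

Inside the original loop the same slice can be printed directly with print +(split ' ', $keep)[0], "\n"; the unary plus keeps print from treating the parentheses as its own argument list.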



Re: Need help with Regex
by JavaFan (Canon) on Nov 11, 2008 at 19:27 UTC
    print $1 if /1000\s+([0-9]+)/;
    Be careful: it might grab an unintended number if there's another 1000 preceding the 1000 you intend. But the regexp above grabs the "number after '1000'".
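One possible way to guard against an earlier 1000 is to anchor the pattern to the column layout. The sketch below assumes every line looks like the sample data (a "*>" marker, the prefix, the next hop, one more field, then 1000); the sub name number_after_1000 is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: capture the number that follows the 1000 column,
# assuming the layout: "*>" prefix next-hop one-field 1000 number ...
sub number_after_1000 {
    my ($line) = @_;
    # Anchoring at the start and counting fields means a 1000 that
    # appears in an earlier column cannot be mistaken for the marker.
    return $line =~ /^\*>\s+\S+\s+\S+\s+\S+\s+1000\s+([0-9]+)/ ? $1 : undef;
}

# The fourth field is itself 1000 here; the unanchored pattern
# /1000\s+([0-9]+)/ would capture 1000 instead of 234.
print number_after_1000('*> 4.23.88.0/23 64.135.0.1 1000 1000 234 46164 i'), "\n";   # 234
```

The trade-off is that this version depends on the number of leading columns staying fixed, which the unanchored pattern does not.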
Re: Need help with Regex
by Grey Fox (Chaplain) on Nov 11, 2008 at 17:46 UTC