http://www.perlmonks.org?node_id=1040230

vyeddula has asked for the wisdom of the Perl Monks concerning the following question:

Pattern matching. Input file:

10.128.99.190 10.128.100.100

1.1.1.1 2.2.2.2

3.3.3.3 4.4.4.4

100.100.100.100 200.200.200.200

My program:

    #!/usr/bin/perl -w
    use strict;

    my $source = shift @ARGV;
    open(FH, '<', $source) or die "I can't open the file $!\n";
    while (<FH>) {
        s/^(\d*).(\d*)/X.X/g;
        print;
    }
    close FH;

output:

X.X.99.190 10.128.100.100

X.X.1.1 2.2.2.2

X.X.3.3 4.4.4.4

X.X.100.100 200.200.200.200

I want the first 2 octets of the other address on each line to be replaced with X and X as well. How do I do that?

Replies are listed 'Best First'.
Re: Patternmatching IPaddresses
by BrowserUk (Patriarch) on Jun 21, 2013 at 22:47 UTC
    I want the first 2 octets of the other address on each line to be replaced with X and X as well. How do I do that?

    Your first step would be to remove the ^ that anchors your regex to the start of the line. But then every pair of numbers separated by a dot would be replaced (actually, given your regex, separated by any character; you should escape the '.'). That is all of them, so everything would become 'X.X.X.X'.

    So you need to ensure that only the first pair of each quad gets modified. Try:

        s/(?<![0-9.])\d+\.\d+/X.X/g;
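
    As a minimal sketch of the difference the lookbehind makes, the script below runs the substitution with and without it on two of the sample lines. The helper names mask_leading_octets and mask_unguarded are made up for illustration and are not from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: mask the first two octets of every dotted quad.
sub mask_leading_octets {
    my ($line) = @_;
    # (?<![0-9.]) rejects a match that starts right after a digit or a dot,
    # so a match can only begin where an address begins.
    $line =~ s/(?<![0-9.])\d+\.\d+/X.X/g;
    return $line;
}

# For contrast: the same pattern with the dot escaped but no lookbehind.
sub mask_unguarded {
    my ($line) = @_;
    $line =~ s/\d+\.\d+/X.X/g;
    return $line;
}

for my $line ('10.128.99.190 10.128.100.100', '1.1.1.1 2.2.2.2') {
    printf "%-30s unguarded: %-16s guarded: %s\n",
        $line, mask_unguarded($line), mask_leading_octets($line);
}
```

    Without the lookbehind both lines come out as X.X.X.X X.X.X.X; with it, the first line becomes X.X.99.190 X.X.100.100.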


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Patternmatching IPaddresses
by choroba (Cardinal) on Jun 21, 2013 at 22:48 UTC
    The other string is not at the beginning of a line. Remove the ^. Also, a dot is a special character in regexes. Backslash it to match literally. You do not need the replaced numbers - no need for capturing parentheses. You do not want the third and fourth numbers to be replaced - you can achieve this by requiring a dot after the numbers to be replaced:
    s/\d*\.\d*\./X.X./g
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
Re: Patternmatching IPaddresses
by 2teez (Vicar) on Jun 21, 2013 at 22:53 UTC

    Something like:

        while (<DATA>) {
            s/[0-9]+\.[0-9]+\./X.X./g;
            print $_;
        }
        __DATA__
        10.128.99.190 10.128.100.100
        1.1.1.1 2.2.2.2
        3.3.3.3 4.4.4.4
        100.100.100.100 200.200.200.200

    If you tell me, I'll forget.
    If you show me, I'll remember.
    If you involve me, I'll understand.
    --- Author unknown to me
Re: Patternmatching IPaddresses
by johngg (Canon) on Jun 22, 2013 at 12:34 UTC

    A slightly different approach, just as an exercise: replace each digit of the masked octets with an 'x', rather than replacing the whole octet with a single 'X'.

    $ perl -Mstrict -Mwarnings -E '
        open my $inFH, q{<}, \ <<EOD or die $!;
    10.128.99.190 10.128.100.100
    1.1.1.1 2.2.2.2
    3.3.3.3 4.4.4.4
    100.100.100.100 200.200.200.200
    EOD
        print for map {
            s{ (\d+\.\d+) (?= (?:\.\d+){2} ) }
             { do{ my $c = $1; $c =~ tr{0-9}{x}; $c } }xeg;
            $_;
        } <$inFH>;'
    xx.xxx.99.190 xx.xxx.100.100
    x.x.1.1 x.x.2.2
    x.x.3.3 x.x.4.4
    xxx.xxx.100.100 xxx.xxx.200.200
    $

    I hope this is of interest.

    Cheers,

    JohnGG

      Replace each digit with an x? Hehe, that reminds me of the winning entry (bottom of page) of the 2008 Underhanded C Contest. Sure, blacking out each digit makes sense if you're doing this on paper with a black pen, or with a scanned image you do not want to OCR, but it does not make much sense in a string replacement.

Re: Patternmatching IPaddresses
by rjt (Curate) on Jun 22, 2013 at 23:42 UTC

    You might try explicitly matching either the beginning of the string or a space with (^|\s):

        use 5.014; # For /r regex modifier
        print s/(^|\s)\d+\.\d+/$1X.X/gr for <DATA>;
        __DATA__
        10.128.99.190 10.128.100.100
        1.1.1.1 2.2.2.2
        3.3.3.3 4.4.4.4
        100.100.100.100 200.200.200.200

    Note that if your Perl is older than 5.014, and hence you cannot use the /r modifier, you can replace the print() statement with this:

        print map { s/(^|\s)\d+\.\d+/$1X.X/g; $_ } <DATA>;

    If you need stricter validation of your input data, the following regexp will only match lines that have two IP addresses and nothing else:

        s/^\d+\.\d+\.(\d+\.\d+) \d+\.\d+\.(\d+\.\d+)$/X.X.$1 X.X.$2/;

    Note that the /g modifier is not necessary in this case as the regexp covers the entire string.
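
    Because s/// returns false when nothing was substituted, the strict pattern can double as a validity check. A minimal sketch, assuming malformed lines should be reported and skipped (that policy and the helper name mask_strict are assumptions, not part of the reply above):

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: returns the masked line, or undef when the line is
# not exactly two dotted quads separated by a single space.
# The pattern checks the shape only; it does not verify that each octet
# is in the range 0 to 255.
sub mask_strict {
    my ($line) = @_;
    chomp $line;
    return $line =~ s/^\d+\.\d+\.(\d+\.\d+) \d+\.\d+\.(\d+\.\d+)$/X.X.$1 X.X.$2/
        ? $line
        : undef;
}

for my $line ("10.128.99.190 10.128.100.100\n", "not an address pair\n") {
    my $masked = mask_strict($line);
    if (defined $masked) {
        print "$masked\n";
    }
    else {
        chomp(my $shown = $line);
        warn "skipping malformed line: $shown\n";
    }
}
```

    The first line prints as X.X.99.190 X.X.100.100; the second produces a warning on STDERR instead of passing through unchanged.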

Re: Patternmatching IPaddresses
by sundialsvc4 (Abbot) on Jun 24, 2013 at 13:00 UTC

    Another very useful CPAN package to get to know is Regexp::Common, which is a kitchen sink collection of “canned,” known good regular-expression patterns, including ones for IP-addresses of all kinds.   Especially useful when you need to validate the contents of a file, or to “future proof” your logic.

    (To fully see what I mean, click on that page, then click on the link to the right of the author’s name in the gray-shadowed line above “Module Version.”   This shows you all of the packages that are part of this one.   Then, if you search for the string “Regexp::Common,” you’ll see about 150 more equally large ones.)
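
    A minimal sketch of how Regexp::Common could be applied to the original problem, assuming the module is installed from CPAN (it is not part of core Perl). $RE{net}{IPv4} and the -keep flag are part of its documented interface; the helper name mask_with_regexp_common is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Regexp::Common qw(net);   # CPAN module, must be installed separately

# With -keep, $1 is the whole address and $2..$5 are the four octets.
my $ipv4 = $RE{net}{IPv4}{-keep};

# Hypothetical helper: keep the last two octets of each address and
# replace the first two with X.
sub mask_with_regexp_common {
    my ($line) = @_;
    $line =~ s/$ipv4/X.X.$4.$5/g;
    return $line;
}

print mask_with_regexp_common('10.128.99.190 10.128.100.100'), "\n";
```

    The module's pattern limits each octet to 0 through 255, which the hand-written \d+ patterns do not; it is still unanchored here, so it can match inside a longer run of digits and dots.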