
Inconsistent behaviour of pos() in s/ ... / ... /

by johngg (Abbot)
on May 10, 2006 at 22:48 UTC (#548583=perlquestion)
johngg has asked for the wisdom of the Perl Monks concerning the following question:

Prompted by the responses in Newbie Q:How do I compare items within a string?, in particular those of Zaxo and TedPride, I had a go at expanding my attempt with some of their ideas. This led to some confusion about the use of pos(), as shown here. To explore this further I have written a script that attempts to find occurrences of one string within another and to annotate each occurrence with its zero-based offset within the string:

    use strict;
    use warnings;

    #                        1         2
    #              012345678901234567890123456789
    my $string = q(The cat scattered caterpillars\n);
    #                  ^    ^        ^
    #                  4    9        18

    print "\n Match against string - $string\n";
    print "Start matching\n\n";

    my $match = 1;
    while($string =~ /(cat)/g)
    {
        print "Match @{[$match ++]}\n";
        print " found - $1\n";
        print " Value of \$-[1] - $-[1]\n";
        print " Value of \$+[1] - $+[1]\n";
        print " Value of \$+[0] - $+[0]\n";
        print " Value of \$#+ - $#+\n";
        print "Value of pos(\$string) - @{[pos($string)]}\n";
    }

    print "\nStart annotation\n\n";

    $string =~ s
        {
            (cat)(?{print "Value of pos(\$string) - @{[pos($string)]}\n"})
        }
        {
            $1 . "[@{[pos($string)]}]"
        }xeg;

    print "\n Annotated string - $string\n";

which when run produces

     Match against string - The cat scattered caterpillars
    Start matching

    Match 1
     found - cat
     Value of $-[1] - 4
     Value of $+[1] - 7
     Value of $+[0] - 7
     Value of $#+ - 1
    Value of pos($string) - 7
    Match 2
     found - cat
     Value of $-[1] - 9
     Value of $+[1] - 12
     Value of $+[0] - 12
     Value of $#+ - 1
    Value of pos($string) - 12
    Match 3
     found - cat
     Value of $-[1] - 18
     Value of $+[1] - 21
     Value of $+[0] - 21
     Value of $#+ - 1
    Value of pos($string) - 21

    Start annotation

    Value of pos($string) - 7
    Value of pos($string) - 12
    Value of pos($string) - 21

     Annotated string - The cat[4] scat[9]tered cat[18]erpillars

As per the documentation, during the matching phase the value returned by pos() corresponds with the value of $+[1], pointing to just after the match. As Zaxo pointed out, $-[1] points to the start of the match.
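The relationship between pos(), @- and @+ during plain matching can be checked in isolation. The following is only a minimal sketch and not part of the original script; the helper name match_offsets is hypothetical. It records $-[0], $+[0] and pos() for every successful /g match in scalar context.

```perl
use strict;
use warnings;

# Hypothetical helper: collect [start, end, pos] for each /g match of
# $pattern in $text. Assumes scalar-context //g matching, as in the
# while loop of the original script.
sub match_offsets {
    my ($text, $pattern) = @_;
    my @offsets;
    while ($text =~ /$pattern/g) {
        # $-[0] is the start of the whole match, $+[0] is one past its end;
        # pos() reports where the next //g search on $text will begin.
        push @offsets, [ $-[0], $+[0], pos($text) ];
    }
    return @offsets;
}

for my $o (match_offsets('The cat scattered caterpillars', qr/cat/)) {
    printf "start %2d  end %2d  pos %2d\n", @$o;
}
```

For the sample string this should report the same start and end offsets as the output above (4/7, 9/12, 18/21), with pos() equal to the end offset each time.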

In the annotation stage, pos() inside the (?{ ... }) block again points to just after the match, which is where, I think the documentation says, the last /.../g match left off. However, when I use pos() in the replacement part of the substitution it seems to return values corresponding to $-[1] and not $+[1].

My question is, what could be causing this apparent change in behaviour?
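Whatever the reason for the difference, the annotation itself does not have to rely on pos(). The sketch below is an illustration rather than a fix endorsed anywhere in this thread; it reads the documented @- array in the replacement instead, and the helper name annotate_offsets is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: append "[offset]" after every occurrence of $word,
# where offset is the zero-based start of that occurrence in the original text.
sub annotate_offsets {
    my ($text, $word) = @_;
    # \Q...\E makes $word match literally rather than as a pattern.
    # $-[1] is the start offset of capture group 1 for the current match.
    $text =~ s{(\Q$word\E)}{ $1 . '[' . $-[1] . ']' }ge;
    return $text;
}

print annotate_offsets('The cat scattered caterpillars', 'cat'), "\n";
```

Because the helper works on its own copy of the text, the caller's string is left unchanged, and the result does not depend on what pos() happens to report while the replacement is being evaluated.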



Re: Inconsistent behaviour of pos() in s/ ... / ... /
by Errto (Vicar) on May 10, 2006 at 23:10 UTC
    I just did a quick test to confirm the behavior. I don't see it mentioned in the docs for pos, but I guess the reason would be that you're doing a substitution, so the matched text is essentially considered removed from the string at that point. Therefore pos will return the position where the substituted value will be inserted once you've computed it.
      That is an interesting suggestion and it makes perfect sense if it is in fact the designed behaviour. I'm kicking myself for not thinking of it. I couldn't see anything in the documentation either. If this is the way pos() should work it would be nice to have it documented perhaps.

      Thank you for the suggestion.
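Since the behaviour does not appear to be documented, one way to probe the suggestion on a given Perl is to record pos() next to $-[1] and $+[1] while the replacement is being evaluated. This is only a sketch: the variable names are illustrative, and no particular value of pos() is assumed, because it may depend on the Perl version in use.

```perl
use strict;
use warnings;

my $string = 'The cat scattered caterpillars';
my $copy   = $string;    # substitute on a copy so $string stays intact
my @seen;

# Record what each accessor reports inside the replacement, then put the
# matched text back unchanged so the copy ends up identical to the original.
$copy =~ s{(cat)}{
    push @seen, { start => $-[1], end => $+[1], pos => pos($copy) };
    $1;
}ge;

for my $r (@seen) {
    # pos() may be undefined, so guard against that before printing.
    my $p = defined $r->{pos} ? $r->{pos} : 'undef';
    print "start $r->{start}  end $r->{end}  pos() $p\n";
}
```

If the pos() column follows the start column, that agrees with the values seen in the replacement part of the original script; if it follows the end column, it agrees with the values seen in the (?{ ... }) block.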


