http://www.perlmonks.org?node_id=149134

dragonchild has asked for the wisdom of the Perl Monks concerning the following question:

I want to capture every pair of adjacent letters in a string, but I can't seem to get it to work.
$_ = "blah"; my @matches = /(.(?=.))/g; print "@matches\n";
I want it to print out "bl la ah", but to capture the second letter I need to put another set of capturing parentheses inside the look-ahead. That puts the second character in $2, not in $1 where I want it.

Help!

------
We are the carpenters and bricklayers of the Information Age.

Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.

Re: Capturing with look-ahead
by japhy (Canon) on Mar 04, 2002 at 15:14 UTC
    Sorry, but you're out of luck. Your only chance is to use (?=(..)) instead. Putting capturing parentheses outside of a look-ahead is useless, because the look-ahead doesn't advance in the string when it's done -- it stays put. You can cheat by putting capturing parentheses in the look-ahead itself, though, as you have seen.

    _____________________________________________________
    Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a (from-home) job
    s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;
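
A minimal sketch of the point above, using the "blah" string from the question. The variable names @outside and @inside are only for illustration: the first pattern is the original attempt, the second is the (?=(..)) form suggested in this reply.

```perl
use strict;
use warnings;

my $s = 'blah';

# Capturing parentheses outside the look-ahead only ever contain the one
# character that was actually consumed.
my @outside = $s =~ /(.(?=.))/g;
print "@outside\n";    # b l a

# Capturing parentheses inside the look-ahead see both characters,
# even though the look-ahead itself consumes nothing.
my @inside = $s =~ /(?=(..))/g;
print "@inside\n";     # bl la ah
```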

Grabbing Adjacent Chars In A String - Re: Capturing with look-ahead
by metadoktor (Hermit) on Mar 04, 2002 at 15:57 UTC
    I suspect that this question is related to the current Perl Review golfing challenge, but here is a solution to your problem. I could give you a regex-based solution but it would be more complicated and probably less efficient.

    #!/usr/local/bin/perl -w
    use strict;

    my @x;
    $_ = "blah";
    for my $i (0 .. length($_) - 2) {
        $x[$i] = substr($_, $i, 1) . substr($_, $i + 1, 1);
    }
    print "@x\n";
    produces...
    bl la ah
    

    metadoktor

    "The doktor is in."

Re: Capturing with look-ahead
by danger (Priest) on Mar 04, 2002 at 17:27 UTC

    Capture two chars inside the lookahead and throw in another dot afterwards to advance one character in the string:

    $_ = 'blah'; my @matches = /(?=(..))./g; print "@matches\n";
      The extra . isn't necessary, although I have a feeling it makes the regex more efficient.

      _____________________________________________________
      Jeff[japhy]Pinyan: Perl, regex, and perl hacker, who'd like a (from-home) job
      s++=END;++y(;-P)}y js++=;shajsj<++y(p-q)}?print:??;

        Why is that? (?=) is zero-width, so why does the next match start one character further along?
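
One way to see what happens is to record pos() after every match. The sketch below assumes a small helper, trace (a name made up for this example), that logs each captured pair together with the position where the match ended. Under /g, Perl does not accept a second empty match at the spot where the previous empty match ended, so the engine moves forward by itself even without the trailing dot.

```perl
use strict;
use warnings;

# Hypothetical helper: log each captured pair and pos() after the match.
sub trace {
    my ($s, $re) = @_;
    my @log;
    while ($s =~ /$re/g) {
        push @log, "$1\@" . pos($s);
    }
    return @log;
}

# Zero-width pattern: pos() stays where the match started, yet the next
# match begins one character later.
print join(' ', trace('blah', qr/(?=(..))/)),  "\n";   # bl@0 la@1 ah@2

# With the extra dot, each match consumes one character explicitly.
print join(' ', trace('blah', qr/(?=(..))./)), "\n";   # bl@1 la@2 ah@3
```

Both patterns return the same pairs. Only the recorded positions differ.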
Re: Capturing with look-ahead
by Ido (Hermit) on Mar 04, 2002 at 19:31 UTC
    That thread/question pretty much spoiled it all. Yes, it shortened my trial golf. But now I can't submit it at all. I would rather submit the long one...
      Your choice. *shrugs* I was asking a general question whose answer I plan on using in a number of other applications. There are many situations where one would want to parse something using a regex as groups of N characters, advancing M characters at a time. It just so happens that this is the first reason I've wanted to.

      ------
      We are the carpenters and bricklayers of the Information Age.

      Don't go borrowing trouble. For programmers, this means Worry only about what you need to implement.
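
The "groups of N characters, advancing M characters at a time" idea can be sketched by generalising the look-ahead pattern from this thread. The helper name windows and its argument order are assumptions made for this example, not something from the original posts. It uses .{1,$m} rather than .{$m} so that a final window is still returned when fewer than M characters remain to be consumed.

```perl
use strict;
use warnings;

# Hypothetical helper: return windows of $n characters, advancing up to
# $m characters after each one.
sub windows {
    my ($s, $n, $m) = @_;
    die "window size and step must both be at least 1\n" if $n < 1 || $m < 1;
    # The look-ahead captures $n characters without consuming them;
    # .{1,$m} then moves the match position forward by the step.
    # The /s flag lets the dot match newlines as well.
    return $s =~ /(?=(.{$n})).{1,$m}/gs;
}

print join(' ', windows('blah',     2, 1)), "\n";   # bl la ah
print join(' ', windows('abcdefgh', 3, 2)), "\n";   # abc cde efg
print join(' ', windows('abcdefgh', 2, 3)), "\n";   # ab de gh
```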

        "Who me?"

        Your timing is pretty lame. You should have known you'd be spoiling the golf tournament for some people. And contrary to your assertion, I do not think that capturing groups of characters with a regex while only advancing partially is actually something you'd want to do in general. Using substr would be more efficient, as metadoktor's reply wisely noted. This is primarily useful for shortening your golf, so don't act all innocent.
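
Whether substr really is faster is easy to measure rather than argue about. The sketch below is one way to set that up with the core Benchmark module. The names pairs_regex, pairs_substr and compare_speed are invented for this example, and no timings are claimed here, since they depend on the perl build and the input.

```perl
use strict;
use warnings;
use Benchmark qw(cmpthese);

# Overlapping pairs via the look-ahead regex from this thread.
sub pairs_regex {
    my ($s) = @_;
    return $s =~ /(?=(..))./gs;
}

# Overlapping pairs via substr, one two-character slice per offset.
sub pairs_substr {
    my ($s) = @_;
    return map { substr($s, $_, 2) } 0 .. length($s) - 2;
}

# Hypothetical helper: run each approach for about one CPU second
# on the same input and print a comparison table.
sub compare_speed {
    my ($s) = @_;
    cmpthese(-1, {
        regex  => sub { my @p = pairs_regex($s) },
        substr => sub { my @p = pairs_substr($s) },
    });
}
```

Calling compare_speed('blah' x 1000) prints the rate of each approach for that input.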