http://www.perlmonks.org?node_id=937893


in reply to Re: RegEx related line split
in thread RegEx related line split

Would you mind if I ask a question? I don't understand the "$" alternative here. I looked at the Extended Patterns section of perlre, but I could not figure out what this is:
(?=$|MARK)
"(?=" is a zero-width lookahead assertion, but I wonder what "$|" is. Usually I would do this with a character class:
@lines = $RefLine =~ /(\([a-z]*\)[^\(]*)/g;
This fails if $RefLine contains any other parentheses, for example:
my $RefLine = "(a) This is first line(once all 4 lines were one line). (b) This is second line; ( c) This is different line 32. (d) Here is the last line.";
But yours works fine and is more robust. I would be glad of a pointer or a clue. Regards.
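To make the difference concrete, here is a minimal sketch that runs both approaches on the sample string. The marker pattern `$mark` is my own assumption (it allows a space, as in "( c)"); it is not necessarily the MARK used in the parent reply.

```perl
use strict;
use warnings;

my $RefLine = "(a) This is first line(once all 4 lines were one line). "
            . "(b) This is second line; ( c) This is different line 32. "
            . "(d) Here is the last line.";

# Character-class version from the post: [^\(]* stops at ANY "(",
# so text after a stray "(" is lost.
my @by_class = $RefLine =~ /(\([a-z]*\)[^\(]*)/g;

# Lookahead version. $mark is an assumed marker pattern: "(", optional
# whitespace, one lowercase letter, ")".
my $mark = qr/\(\s*[a-z]\)/;
my @by_lookahead = $RefLine =~ /($mark.*?)(?=$|$mark)/g;

print scalar(@by_class), " pieces with the character class:\n";
print "  [$_]\n" for @by_class;

print scalar(@by_lookahead), " pieces with the lookahead:\n";
print "  [$_]\n" for @by_lookahead;
```

The character class version returns three pieces and silently drops "(once all 4 lines were one line)." and the "( c)" item, while the lookahead version returns all four items with the parenthesised remark kept inside the first one.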

Re^3: RegEx related line split
by JavaFan (Canon) on Nov 14, 2011 at 11:41 UTC
    Do you understand foo|bar? Do you understand $? Do you understand (?= )? Combine all three and you get (?=$|MARK).
      Let me try to explain it myself.

      foo|bar means foo or bar. If it is grouped as (foo|bar), $1 will be set to whichever matched, "foo" or "bar".

      In this case it is not a non-capturing group, (?:foo|bar), because it starts with '(?=', which is a zero-width lookahead assertion. A zero-width lookahead assertion works like a placeholder: it has to match at that position, but it does not consume any text, so it does not advance pos($expr).
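A small sketch of the difference between a group that consumes text and a lookahead that does not. The string "foobar" and the variable names are only illustrative assumptions.

```perl
use strict;
use warnings;

my $text = "foobar";
my ($m_group, $p_group, $m_look, $p_look);

# Non-capturing group: "bar" is part of the match, so pos() lands after it.
if ($text =~ /foo(?:bar)/g) {
    ($m_group, $p_group) = ($&, pos($text));
}

pos($text) = 0;    # rewind before the second attempt

# Lookahead: "bar" must follow, but it is not part of the match,
# so pos() stays right after "foo".
if ($text =~ /foo(?=bar)/g) {
    ($m_look, $p_look) = ($&, pos($text));
}

print "group:     matched '$m_group', pos=$p_group\n";
print "lookahead: matched '$m_look', pos=$p_look\n";
```

Because the lookahead leaves pos() in front of the marker, the next /g iteration can start its match at that same marker, which is what the line split relies on.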

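      $ is the end of the line, as far as I know: without /m it matches at the end of the string (or just before a trailing newline), and with /m it matches at the end of every line.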
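A short sketch of how $ behaves with and without /m, using a made-up two-line string. \z is shown for comparison as the "really the end of the string" anchor.

```perl
use strict;
use warnings;

my $text = "first\nsecond\n";

# Without /m, $ matches only at the very end of the string,
# or just before a final newline.
my @plain = $text =~ /(\w+)$/g;

# With /m, $ also matches before every embedded newline.
my @multi = $text =~ /(\w+)$/mg;

# \z matches only at the absolute end, so the trailing newline gets in the way.
my $dollar_ok = ($text =~ /second$/)  ? 1 : 0;
my $z_ok      = ($text =~ /second\z/) ? 1 : 0;

print "without /m: @plain\n";
print "with /m:    @multi\n";
print "second\$ matches: $dollar_ok, second\\z matches: $z_ok\n";
```

Inside a pattern Perl does not interpolate "$|" as the autoflush variable, so in (?=$|MARK) the $ is the end anchor and | is plain alternation.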

      So it says: look ahead for "end of line" or MARK, and treat whichever is found as a placeholder that is checked but not consumed. I think I understand this!

      #!/usr/bin/perl
      use strict;
      use warnings;

      # The rest of this string is cut off in the original post.
      my $RefLine = "(a) This is first line(once all 4 was one line). (b) This is second line; ( ";

      print "original -----\n";
      print "$RefLine\n";
      print "original -----\n\n";

      print "\n## without 'end of line or' condition. last line fails\n";
      while ( $RefLine =~ /(\([a-z]\).*?)(?=\([a-z]\))/g ) {
          my $p = pos $RefLine;
          print "$-[0], $p,matched=$&\n";
          print "---\n";
      }

      print "\n## without lookahead assertion... \n";
      while ( $RefLine =~ /(\([a-z]\).*?)($|\([a-z]\))/g ) {
          my $p = pos $RefLine;
          print "$-[0], $p,matched=$&\n";
          print "---\n";
      }

      print "\n## with 'end of line or' condition and zero width place holder\n";
      while ( $RefLine =~ /(\([a-z]\).*?)(?=$|\([a-z]\))/g ) {
          my $p = pos $RefLine;
          print "$-[0], $p,matched=$&\n";
          print "---\n";
      }

      Thank you very much JavaFan.