PerlMonks  

Re^2: RegEx related line split

by remiah (Hermit)
on Nov 14, 2011 at 08:54 UTC (#937893)


in reply to Re: RegEx related line split
in thread RegEx related line split

Would you mind if I ask a question? I don't understand the "$ option". I looked at the Extended Patterns section of perlre, but I could not figure out what this is.

(?=$|MARK)
"(?=" is a zero-width lookahead assertion, but I wonder what "$|" is. Usually I would do this with a character class:
@lines = $RefLine =~ /(\([a-z]*\)[^\(]*)/g;
This fails if $RefLine contains another pair of parentheses:
my $RefLine = "(a) This is first line(once all 4 lines were one line). (b) This is second line; ( c) This is different line 32. (d) Here is the last line.";
But yours works fine and is more robust. I would be glad of a pointer or a clue. Regards.
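The difference between the two approaches can be sketched as follows. This is only an illustration: the sample string is made up for the example, and the lookahead pattern uses a literal `\([a-z]\)` label in place of MARK, which is an assumption about what the marker looks like.

```perl
use strict;
use warnings;

# Hypothetical sample: labelled items, one of which contains its own parentheses.
my $text = "(a) first (with a note). (b) second; (c) third.";

# Character-class version: each piece stops at the next "(", whatever follows it,
# so the text from "(with a note)" up to "(b)" is lost.
my @by_class = $text =~ /(\([a-z]*\)[^\(]*)/g;

# Lookahead version: each piece stops only before the next "(x)" label
# or at the end of the string.
my @by_lookahead = $text =~ /(\([a-z]\).*?)(?=$|\([a-z]\))/g;

print "class:     [$_]\n" for @by_class;
print "lookahead: [$_]\n" for @by_lookahead;
```

With the lookahead version the pieces join back into the original string, because nothing between two labels is skipped.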


Re^3: RegEx related line split
by JavaFan (Canon) on Nov 14, 2011 at 11:41 UTC
    Do you understand foo|bar? Do you understand $? Do you understand (?= )? Combine all three and you get (?=$|MARK).
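A small sketch of the combination, assuming a literal string "MARK" as the delimiter and digits as the thing being matched (both chosen only for illustration). Inside a pattern Perl does not interpolate `$|` as the special variable, so it reads as `$` followed by `|`.

```perl
use strict;
use warnings;

# Digits that are followed either by the end of the string or by "MARK".
# The lookahead only checks what follows; it is not part of the match.
my $pattern = qr/(\d+)(?=$|MARK)/;

my %found;
for my $s ("12", "34MARK", "56other") {
    $found{$s} = $s =~ $pattern ? $1 : undef;
}

for my $s (sort keys %found) {
    my $shown = defined $found{$s} ? $found{$s} : "(no match)";
    print "$s => $shown\n";
}
```

"12" matches because of the `$` branch, "34MARK" matches because of the MARK branch, and "56other" satisfies neither branch.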
      Let me try to explain it myself.

      foo|bar is foo or bar. If it is grouped as (foo|bar), $1 will be set to "foo" or "bar", whichever matched.

      In this case it is not a non-capturing group, (?:foo|bar), because '(?=' starts a zero-width lookahead assertion. A zero-width lookahead assertion works like a placeholder: it does not consume any characters, so it does not advance pos() during matching.
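The three kinds of group can be compared side by side in a minimal sketch. The string "foobar" and the variable names are made up for the example.

```perl
use strict;
use warnings;

my $str = "foobar";

# Capturing group: the alternative that matched is returned as $1.
my ($captured) = $str =~ /(foo|bar)/;

# Non-capturing group: it groups the alternation but captures nothing,
# so a successful match in list context returns just (1).
my @no_capture = $str =~ /(?:foo|bar)/;

# Lookahead: "bar" must follow, but it is not consumed,
# so pos() stops right after "foo".
my $pos_lookahead;
if ( $str =~ /foo(?=bar)/g ) {
    $pos_lookahead = pos $str;
}

# Same text matched with a consuming group: pos() ends up after "bar".
pos($str) = undef;    # reset the /g position before matching again
my $pos_consumed;
if ( $str =~ /foo(?:bar)/g ) {
    $pos_consumed = pos $str;
}

print "captured=$captured lookahead=$pos_lookahead consumed=$pos_consumed\n";
```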

      $ is the end of the line, as far as I know (without /m, that means the end of the string or just before a final newline).
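What `$` accepts can be checked with a few short matches. The test strings and hash keys here are illustrative only.

```perl
use strict;
use warnings;

my %result = (
    end_of_string     => ( "abc"    =~ /c$/  ? 1 : 0 ),   # $ at the very end
    before_final_nl   => ( "abc\n"  =~ /c$/  ? 1 : 0 ),   # $ also allows one trailing newline
    z_is_strict       => ( "abc\n"  =~ /c\z/ ? 1 : 0 ),   # \z is only the absolute end
    inner_line_plain  => ( "ab\ncd" =~ /b$/  ? 1 : 0 ),   # without /m, not at an inner newline
    inner_line_with_m => ( "ab\ncd" =~ /b$/m ? 1 : 0 ),   # with /m, the end of every line
);

print "$_ => $result{$_}\n" for sort keys %result;
```

So for a single line without embedded newlines, "end of line" and "end of string" come to the same thing.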

      Well, it says: look ahead for "end of line" or MARK, and match there only as a 'placeholder'. I think I understand this!

      #!/usr/bin/perl
      use strict;
      use warnings;

      # The end of this string was cut off in the post; the tail after "( " is
      # taken from the sample line quoted earlier in the thread.
      my $RefLine = "(a) This is first line(once all 4 was one line). (b) This is second line; ( c) This is different line 32. (d) Here is the last line.";

      print "original -----\n";
      print "$RefLine\n";
      print "original -----\n\n";

      print "\n## without 'end of line or' condition. last line fails\n";
      while ( $RefLine =~ /(\([a-z]\).*?)(?=\([a-z]\))/g ) {
          my $p = pos $RefLine;
          print "$-[0], $p,matched=$&\n";
          print "---\n";
      }

      print "\n## without lookahead assertion... \n";
      while ( $RefLine =~ /(\([a-z]\).*?)($|\([a-z]\))/g ) {
          my $p = pos $RefLine;
          print "$-[0], $p,matched=$&\n";
          print "---\n";
      }

      print "\n## with 'end of line or' condition and zero width place holder\n";
      while ( $RefLine =~ /(\([a-z]\).*?)(?=$|\([a-z]\))/g ) {
          my $p = pos $RefLine;
          print "$-[0], $p,matched=$&\n";
          print "---\n";
      }

      Thank you very much JavaFan.
