
Re^2: Why multiline regex doesn't work?

by nbd (Novice)
on Jun 09, 2015 at 00:19 UTC


in reply to Re: Why multiline regex doesn't work?
in thread Why multiline regex doesn't work?

I was guided by this part of perldoc:

- m modifier (//m): Treat string as a set of multiple lines. '.' matches any character except "\n" . ^ and $ are able to match at the start or end of any line within the string.

- both s and m modifiers (//sm): Treat string as a single long line, but detect multiple lines. '.' matches any character, even "\n" . ^ and $ , however, are able to match at the start or end of any line within the string.

Does the correction in the code you made mean that Perl processes the multiline string line by line and not as a single string?

UPDATE: I see that Perl processes the string as a whole. At first sight the code just looked like awk-style line-by-line pattern matching. Thanks.
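To make the distinction in the UPDATE concrete, here is a minimal sketch (not part of the original post) that contrasts a single match against the whole string with an awk-style loop over lines. The sample data and the variable names are illustrative assumptions.

```perl
use strict;
use warnings;

# Illustrative sample data: three "key : value" lines in ONE string.
my $s = "aaa : AAA\nbbb : BBB\nccc : CCC\n";

# Whole-string approach: a single match operation. The /m modifier lets
# ^ and $ match at the line boundaries embedded in the string.
my ($whole) = $s =~ /^bbb *: (.*)$/m;

# awk-style approach: split the string into lines and test each one.
# No /m is needed because every $line holds exactly one line.
my $by_line;
for my $line (split /\n/, $s) {
    if ($line =~ /^bbb *: (.*)$/) {
        $by_line = $1;
        last;
    }
}

print "whole string: $whole\n";
print "line by line: $by_line\n";
```

Both approaches find the same value here; the difference is only in whether the regex engine sees the newlines or the loop removes them first.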


Re^3: Why multiline regex doesn't work?
by AnomalousMonk (Archbishop) on Jun 09, 2015 at 01:14 UTC

    You should also enable warnings (and strictures; see strict), especially if you are a Perl novice. Consider your first regex with warnings enabled:

    c:\@Work\Perl\monks>perl -le
      "use warnings; use strict;
       ;;
       my $s = qq{aaa : AAA\n} . qq{bbb : BBB\n} . qq{ccc : CCC\n} ;
       print qq{[[$s]]};
       ;;
       my $m = 'bbb';
       ;;
       my $t = $s =~ s/.*^$m *: (.*?)$.*/$1/rsm ;
       ;;
       print qq{[[$t]]};
      "
    [[aaa : AAA
    bbb : BBB
    ccc : CCC
    ]]
    Use of uninitialized value $. in regexp compilation at -e line 1.
    [[BBB
    ccc : CCC
    ]]
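    The Use of uninitialized value $. in regexp compilation... message gives you a clue about what is happening: the $. in the pattern is interpolated as the Perl variable $. (the current input line number) instead of being read as the anchor $ followed by the metacharacter dot.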

    If the $ is unambiguously a regex metacharacter:

    c:\@Work\Perl\monks>perl -le
      "use warnings; use strict;
       ;;
       my $s = qq{aaa : AAA\n} . qq{bbb : BBB\n} . qq{ccc : CCC\n} ;
       print qq{[[$s]]};
       ;;
       my $m = 'bbb';
       ;;
       my $t = $s =~ s/.*^$m *: (.*?)$(?:.*)/$1/rsm ;
       ;;
       print qq{[[$t]]};
      "
    [[aaa : AAA
    bbb : BBB
    ccc : CCC
    ]]
    [[BBB]]
    You have your intended output for this regex.
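The same idea can be written in a few equivalent ways. The sketch below is an illustration rather than part of the original reply: it repeats the `$(?:.*)` form and adds two alternatives that also keep `$` from pairing with the following `.`. The variable names `$grouped`, `$wrapped` and `$spaced` are assumptions made for this example, and a plain copy-then-substitute is used instead of the `/r` flag so that it also runs on Perls older than 5.14.

```perl
use strict;
use warnings;

my $s = "aaa : AAA\nbbb : BBB\nccc : CCC\n";
my $m = 'bbb';

# Form used in the reply: "$" is followed by "(", so it is not read as $.
(my $grouped = $s) =~ s/.*^$m *: (.*?)$(?:.*)/$1/sm;

# Alternative: wrap the anchor itself in a non-capturing group.
(my $wrapped = $s) =~ s/.*^$m *: (.*?)(?:$).*/$1/sm;

# Alternative: under /x, whitespace separates "$" from "." in the pattern.
# Literal spaces must then be written as [ ] because /x ignores bare ones.
(my $spaced = $s) =~ s/ .* ^ ${m} [ ]* : [ ] (.*?) $ .* /$1/xsm;

print "[[$_]]\n" for $grouped, $wrapped, $spaced;
```

All three leave only the captured value, and none of them triggers the uninitialized-value warning, because the pattern never contains the two-character sequence `$.`.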


    Give a man a fish:  <%-(-(-(-<

Re^3: Why multiline regex doesn't work?
by jeffa (Bishop) on Jun 09, 2015 at 00:40 UTC

    You really should try to work with simpler examples before you make things complicated:

    use strict;
    use warnings;
    use Data::Dumper;

    my ($str,@match);

    $str = "\nfoo\nbar\nbaz\n";
    @match = $str =~ /(foo.*bar)/;     # nope!
    print Dumper \@match;
    @match = $str =~ /(foo.*bar)/m;    # nope!
    print Dumper \@match;
    @match = $str =~ /(foo.*bar)/s;    # this one!
    print Dumper \@match;

    $str = "\nfoo bar\nfoo baz\n";
    @match = $str =~ /^(foo bar)/;     # nope!
    print Dumper \@match;
    @match = $str =~ /^(foo bar)/s;    # nope!
    print Dumper \@match;
    @match = $str =~ /^(foo bar)/m;    # this one!
    print Dumper \@match;

    The first set of matches illustrates a case when the 's' modifier gets the match and the second set of matches illustrates a case when the 'm' modifier gets the match. Hope this helps!

    jeffa

    L-LL-L--L-LL-L--L-LL-L--
    -R--R-RR-R--R-RR-R--R-RR
    B--B--B--B--B--B--B--B--
    H---H---H---H---H---H---
    (the triplet paradiddle with high-hat)
    
Re^3: Why multiline regex doesn't work? ( ^$\n? anchors are "string location assertion")
by Anonymous Monk on Jun 09, 2015 at 01:11 UTC

    See perlvar#$.
    rxrx and http://perldoc.perl.org/re.html#%27debug%27-mode and other regex tools
    The "anchor" misnomer in regexes (string location assertion)
    Why \n matches but not $^?
    Disabling regexp optimizations?

    ^ matches after newline (or beginning of string). $ matches before newline (or end of string)
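As a rough illustration of those two rules (a sketch added for clarity, assuming the `/m` modifier and the same `"a\n\nb"` sample string used in this reply), the following collects every offset at which each assertion succeeds. The array names are made up for the example.

```perl
use strict;
use warnings;

my $str = "a\n\nb";    # offsets: a=0, \n=1, \n=2, b=3, end=4

# Offsets where ^ succeeds under /m: string start and just after each newline.
my @starts;
while ($str =~ /^/mg) {
    push @starts, pos($str);
}

# Offsets where $ succeeds under /m: just before each newline and at string end.
my @ends;
while ($str =~ /$/mg) {
    push @ends, pos($str);
}

print "^ matches at: @starts\n";    # 0 2 3
print "\$ matches at: @ends\n";     # 1 2 4
```

Offset 2 is the only position in both lists, which is why a pattern consisting of both assertions can only match on the empty line.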

    $ perl -MData::Dump -Mre=debug -le " dd( $_=qq{a\n\nb} ); s{^$}{boop}m; dd( $_ ); "
    Compiling REx "^$"
    Final program:
       1: MBOL (2)
       2: MEOL (3)
       3: END (0)
    anchored ""$ at 0 anchored(MBOL) minlen 0
    "a\n\nb"
    Matching REx "^$" against "a%n%nb"
       0 <> <a%n%nb>             |  1:MBOL(2)
       0 <> <a%n%nb>             |  2:MEOL(3)
                                      failed...
       2 <a%n> <%nb>             |  1:MBOL(2)
       2 <a%n> <%nb>             |  2:MEOL(3)
       2 <a%n> <%nb>             |  3:END(0)
    Match successful!
    "a\nboop\nb"
    Freeing REx: "^$"

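    Trying to match a newline after end of line with $\n won't work: as the compiled REx "^%nn" shows, $\ is interpolated as the output record separator variable (a newline under -l), followed by a literal n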

    $ perl -MData::Dump -Mre=debug -le " dd( $_=qq{a\n\nb} ); s{^$\n}{boop}m; dd( $_ ); "
    "a\n\nb"
    Compiling REx "^%nn"
    Final program:
       1: MBOL (2)
       2: EXACT <\nn> (4)
       4: END (0)
    anchored "%nn" at 0 (checking anchored) anchored(MBOL) minlen 2
    Guessing start of match in sv for REx "^%nn" against "a%n%nb"
    Did not find anchored substr "%nn"...
    Match rejected by optimizer
    "a\n\nb"
    Freeing REx: "^%nn"

    But matching an OPTIONAL newline works (the compiled REx "^%nn?" is a newline at the start of a line followed by an optional literal n)

    $ perl -MData::Dump -Mre=debug -le " dd( $_=qq{a\n\nb} ); s{^$\n?}{boop}ms; dd( $_ ); "
    "a\n\nb"
    Compiling REx "^%nn?"
    Final program:
       1: MBOL (2)
       2: EXACT <\n> (4)
       4: CURLY {0,1} (8)
       6:   EXACT <n> (0)
       8: END (0)
    anchored "%n" at 0 (checking anchored) anchored(MBOL) minlen 1
    Guessing start of match in sv for REx "^%nn?" against "a%n%nb"
    Found anchored substr "%n" at offset 1...
    Found /^/m, restarting lookup for check-string at offset 2...
    Found anchored substr "%n" at offset 2...
    Position at offset 2 does not contradict /^/m...
    Guessed: match at offset 2
    Matching REx "^%nn?" against "%nb"
       2 <a%n> <%nb>             |  1:MBOL(2)
       2 <a%n> <%nb>             |  2:EXACT <\n>(4)
       3 <a%n%n> <b>             |  4:CURLY {0,1}(8)
                                      EXACT <n> can match 0 times out of 1...
       3 <a%n%n> <b>             |  8:  END(0)
    Match successful!
    "a\nboopb"
    Freeing REx: "^%nn?"
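The compiled programs in the last two traces read `^%nn` and `^%nn?`, which suggests that `$\` was interpolated as a variable holding a newline (the `-l` switch sets the output record separator) and that the `n` was left as a literal. The sketch below is an illustration under that assumption, not part of the original reply: it reproduces the interpolation in a script and then shows two ways to write "empty line plus its newline" without a `$\` sequence in the pattern. All variable names are invented for the example.

```perl
use strict;
use warnings;

my $str = "a\n\nb";

# Mimic what -l sets up: $\ holds a newline while the pattern is compiled.
# The pattern then means: line start, newline, literal "n".
my $interpolated = do { local $\ = "\n"; qr{^$\n}m };
my $hits_literal_n = "\nn" =~ $interpolated ? 1 : 0;   # newline followed by "n"
my $hits_sample    = $str  =~ $interpolated ? 1 : 0;   # no "\nn" in the sample

# Unambiguous form 1: group the anchor so "$" is followed by ")", not "\".
(my $grouped = $str) =~ s{^(?:$)\n}{boop}m;

# Unambiguous form 2: drop the anchor; ^ followed by \n already means
# "an empty line and its newline".
(my $plain = $str) =~ s{^\n}{boop}m;

for my $result ($grouped, $plain) {
    (my $shown = $result) =~ s/\n/\\n/g;   # make newlines visible when printing
    print "$shown\n";
}
```

Both substitutions turn the sample into `a\nboopb`, the same result the optional-`n` pattern produced in the last trace.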
