PerlMonks  

regex compilation

by morgon (Priest)
on Feb 12, 2019 at 17:33 UTC (#1229823, perlquestion)

morgon has asked for the wisdom of the Perl Monks concerning the following question:

Hi

Could someone please explain this behaviour to me? It is not what I expected:

my $regex1 = qr/(?:(?|(?:\")([^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))/;
my $regex2 = qq/(?:(?|(?:\")([^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))/;

my ($match1) = q|"hubba \"bubba\""| =~ $regex1;
my ($match2) = q|"hubba \"bubba\""| =~ /$regex2/;

print "$match1\n$match2\n";
__END__
output:
hubba \"bubba\"
hubba \

In the first case I compile a regex immediately; in the second I define a string, and it only gets compiled in the line where I match.

I would have expected that the result would be the same - but it isn't...

Why?

Many thanks!

Replies are listed 'Best First'.
Re: regex compilation
by davido (Cardinal) on Feb 12, 2019 at 17:57 UTC

    /$regex2/ will have the same interpolation as qr/$regex2/, so we can look more closely at what you are getting by examining the resulting regex2 object:

    my $regex1 = qr/(?:(?|(?:\")([^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))/;
    my $regex2 = qq/(?:(?|(?:\")([^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))/;
    print $regex1, "\n", qr/$regex2/, "\n";

    The output:

    (?^:(?:(?|(?:\")([^\\\"]*(?:\\.[^\\\"]*)*)(?:\"))))
    (?^:(?:(?|(?:")([^\"]*(?:\.[^\"]*)*)(?:"))))

    The double-quoted construct (qq//) is gobbling the escapes: it turns \\ into \ and \" into " before the regex engine ever sees the string.
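To see why the second match stops at `hubba \`, it may help to write out by hand the pattern the regex engine ends up with after the escapes are gone. This is only a minimal sketch: the names `$degraded` and `$intended` are illustrative and not from the thread, and the non-capturing wrappers of the original pattern are dropped for brevity.

```perl
use strict;
use warnings;

my $subject = q|"hubba \"bubba\""|;

# Roughly what the engine receives once qq// has processed the escapes:
# the character class no longer excludes the backslash, and \\. has
# become \. (a literal dot), so nothing steps over an escaped quote.
my $degraded = qr/"([^"]*(?:\.[^"]*)*)"/;

# What was intended: skip anything that is neither backslash nor quote,
# and consume a backslash together with the character that follows it.
my $intended = qr/"([^\\"]*(?:\\.[^\\"]*)*)"/;

my ($got_degraded) = $subject =~ $degraded;
my ($got_intended) = $subject =~ $intended;

print "degraded: $got_degraded\n";
print "intended: $got_intended\n";
```

With the degraded pattern the character class happily eats the backslash and the very next quote closes the match, which is consistent with the output shown in the question.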


    Dave

      FWIW, single-quoted string notation (q//) gets you closer (update: than double-quoting) to the desired final form, but still not all the way, and there are some pitfalls. Better, IMHO, to stick to qr//. (BTW, morgon: it's not necessary to escape a " (double-quote) in a qr// expression.)

      File rx_qr_qq_q_1.pl:

      use warnings;
      use strict;

      my $rx_qr = qr/(?:(?|(?:\")([^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))/;
      my $rx_qq = qq/(?:(?|(?:\")([^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))/;
      my $rx__q = q/(?:(?|(?:\")([^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))/;

      print qq{raw qr: $rx_qr \n};
      print qq{qr/qq/: }, qr/$rx_qq/, "\n";
      print qq{qr/q/ : }, qr/$rx__q/, "\n";

      Output:

      c:\@Work\Perl\monks\morgon>perl rx_qr_qq_q_1.pl
      raw qr: (?-xism:(?:(?|(?:\")([^\\\"]*(?:\\.[^\\\"]*)*)(?:\"))))
      qr/qq/: (?-xism:(?:(?|(?:")([^\"]*(?:\.[^\"]*)*)(?:"))))
      qr/q/ : (?-xism:(?:(?|(?:\")([^\\"]*(?:\.[^\\"]*)*)(?:\"))))


      Give a man a fish:  <%-{-{-{-<

      Neat trick, thanks a lot.

      So I need to double all the backslashes.

        So I need to double all the backslashes.

        I would say that you need to use qr// to build Regexp objects. These objects behave essentially like strings in the situations in which you seem to be interested in using them.


        Give a man a fish:  <%-{-{-{-<

        So I need to double all the backslashes.
        Probably not. I think that you need to use qr//.
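For comparison, here is a sketch of what "doubling all the backslashes" in qq// would look like next to the qr// form. The variable names follow the earlier reply, the redundant \" escapes are dropped, and the subject string is the one from the question. It assumes Perl 5.10 or later, since the pattern uses (?|...).

```perl
use strict;
use warnings;

# qr// passes backslashes through to the regex engine unchanged.
my $rx_qr = qr/(?:(?|(?:")([^\\"]*(?:\\.[^\\"]*)*)(?:")))/;

# qq// halves every doubled backslash first, so each backslash that
# should reach the engine has to be written twice.
my $rx_qq = qq/(?:(?|(?:")([^\\\\"]*(?:\\\\.[^\\\\"]*)*)(?:")))/;

my $subject = q|"hubba \"bubba\""|;

my ($match_qr) = $subject =~ $rx_qr;
my ($match_qq) = $subject =~ /$rx_qq/;

print "qr//        : $match_qr\n";
print "doubled qq//: $match_qq\n";
```

Both forms should capture the same text here, but the qq// version is noticeably harder to read and to keep correct, which is the practical argument for qr//.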
Re: regex compilation
by Corion (Pope) on Feb 12, 2019 at 17:37 UTC

    Have you printed the two values you get? They are different because qq treats the backslash differently from what qr does.

      The output is above in the end block.

      What I need is a string that behaves just as the compiled regex does, as I want to feed it into a Perl 5-compatible regex engine for Go.

        Have you printed the two values you get?
        The output is above in the end block.

        I'm pretty sure that by "the two values", Corion meant $regex1 and $regex2.

        By the way, the only quoting construct that doesn't interpolate at all is a here-doc with a single-quoted delimiter:

        chomp( my $regex2 = <<'ENDREGEX' );
        (?:(?|(?:\")([^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))
        ENDREGEX
        print $regex2, "\n";
        my ($match2) = q|"hubba \"bubba\""| =~ /$regex2/;
        print $match2, "\n";
        __END__
        (?:(?|(?:\")([^\\\"]*(?:\\.[^\\\"]*)*)(?:\")))
        hubba \"bubba\"

        ... string that behaves just as the compiled regex ...

        That's what qr// is for; see perlop. The Regexp object returned by qr// is almost a string: it interpolates as a string in m//, s/// and qr// expressions, when printed, or when interpolated into a string. It is not, however, the same as what is produced by the qq// and q// operators.
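If the goal is a plain pattern string to hand to another engine, one possible route is to build the regex with qr// and then pull the pattern text back out. The sketch below uses `regexp_pattern` from the core `re` module (available since Perl 5.10). Whether a given non-Perl engine accepts every construct in the pattern, such as (?|...), is not something this example can confirm.

```perl
use strict;
use warnings;
use re qw(regexp_pattern);

my $regex = qr/(?:(?|(?:")([^\\"]*(?:\\.[^\\"]*)*)(?:")))/;

# Stringifying a Regexp object wraps the pattern in a flags group,
# e.g. (?^:...) on newer perls or (?-xism:...) on older ones.
my $stringified = "$regex";

# In list context regexp_pattern returns the bare pattern text and the
# modifiers separately, without that wrapper.
my ($pattern, $modifiers) = regexp_pattern($regex);

print "stringified: $stringified\n";
print "pattern    : $pattern\n";
print "modifiers  : '$modifiers'\n";
```

The bare pattern keeps every backslash exactly as written inside qr//, so no manual doubling is needed before passing it on.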


        Give a man a fish:  <%-{-{-{-<

Node Type: perlquestion [id://1229823]
Front-paged by haukex