PerlMonks  

Printing regular expression variable

by MattLG (Sexton)
on May 27, 2013 at 21:00 UTC ( #1035462=perlquestion )
MattLG has asked for the wisdom of the Perl Monks concerning the following question:

I've written my own CGI untainting library which uses regular expression variables like qr/^.*$/ to validate incoming data. I'd like to extend this library to also provide automated frontend JavaScript field validation, preferably by using the same regular expression.

When I print a regular expression variable like the one above, I get (?^:^.*$). This looks to me like the original regular expression wrapped with (?^:.....). Is this right?

In other words, is it safe just to remove the (?^: and the ) and send the middle bit through to the frontend JavaScript, or does this wrapper actually mean something that I should try to understand?

MattLG

Re: Printing regular expression variable
by choroba (Abbot) on May 27, 2013 at 21:34 UTC
    This "wrapper" has a meaning. Moreover, the "wrapper" may be different in a different version of Perl. Try wrapping regular expressions with flags:
    my $regex = qr/^.*$/m;
    print "$regex\n";    # (?^um:^.*$) in 5.16.0

    See perlre - Perl regular expressions for details.

    لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

      I've not been able to find any details about the regexp output format in those perl docs. Is this documented anywhere?

      And is there an alternative method to get the original regexp out of a variable?

      MattLG

        See Extended Patterns in the linked document.
        لսႽ ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
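For the follow-up question about getting the original pattern back out of the variable, one option is the regexp_pattern function exported by the core re module (Perl 5.10 and later). A minimal sketch, with an example pattern chosen only for illustration:

```perl
use strict;
use warnings;
use re 'regexp_pattern';    # core module, Perl 5.10 and later

my $qr = qr/^xyz$/i;

# In list context, regexp_pattern returns the pattern source and the
# modifier letters as two separate strings.
my ($pattern, $modifiers) = regexp_pattern($qr);

print "pattern:   $pattern\n";
print "modifiers: $modifiers\n";
```

Because the pattern and the modifiers come back separately, there is no wrapper to parse and no dependence on how a particular Perl version stringifies the regex.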
Re: Printing regular expression variable
by ww (Bishop) on May 27, 2013 at 23:51 UTC

    Tangential observation: Pro forma untainting may be worse than none:

    #!/usr/bin/perl -T
    use 5.016;

    my $regex = qr/^.*$/;    # match anything, including an empty string

    my @strings = (
        'delete everything',
        'overclock till cpu smokes',
        'we ownz you exec(nasty code here)',
        ' ',
        '',
    );

    untaint(@strings);

    # No prototype here: the sub receives its strings through @_.
    sub untaint {
        for my $elem (@_) {
            if ( $elem =~ /$regex/ ) {
                say "Thank you, sucker. You are borked, really bad!";
            }
            else {
                say "Oh look, untainting did something more than merely allow any-old-badstuff to pass untaint. string untainted was -|$elem|-";
            }
        }
    }

    Execution produces:

    C:\>untaint-bad.pl
    Thank you, sucker. You are borked, really bad!
    |delete everything|' passed.
    Thank you, sucker. You are borked, really bad!
    |overclock till cpu smokes|' passed.
    Thank you, sucker. You are borked, really bad!
    |we ownz you exec(nasty code here)|' passed.
    Thank you, sucker. You are borked, really bad!
    | |' passed.
    Thank you, sucker. You are borked, really bad!
    ||' passed.

    If you didn't program your executable by toggling in binary, it wasn't really programming!

      Err, sorry, what?

      MattLG

        "...my own CGI untainting library which uses regular expression variables like qr/^.*$/ to validate incoming data."
        Err, your example regex validates nothing and untaints nothing. If it's merely a simple example that occurred to you for the purposes of your question, fine; if you believe it's doing something useful, you err.

        If you didn't program your executable by toggling in binary, it wasn't really programming!
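As a contrast to the match-anything pattern discussed above, here is a minimal sketch of a validator that actually restricts its input. The helper name untaint_name and the allowed character set are assumptions for illustration, not part of MattLG's library. Under -T it is the captured $1 that comes back untainted, so the capture group is what does the work.

```perl
use strict;
use warnings;

# Hypothetical rule: 1 to 32 word characters, dots or hyphens.
# The allowed set is an assumption made for this example.
my $name_re = qr/\A([\w.-]{1,32})\z/;

# Returns the validated value, or undef when the input is rejected.
sub untaint_name {
    my ($input) = @_;
    return undef unless defined $input;
    return $input =~ $name_re ? $1 : undef;
}

for my $input ('report-2013.txt', 'we ownz you exec(nasty code here)', '') {
    my $clean = untaint_name($input);
    printf "%-40s %s\n", "|$input|", defined $clean ? 'accepted' : 'rejected';
}
```

Unlike qr/^.*$/, this pattern rejects the empty string and anything containing spaces or parentheses.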

Re: Printing regular expression variable
by AnomalousMonk (Monsignor) on May 27, 2013 at 23:51 UTC

    Further to choroba's reply...

    ... is it safe just to remove the (?^: and the ) and send the middle bit through to the frontend javascript, or does this wrapper actually mean something ...

    It actually means something.

    Consider the two regexes /^xyz$/ and /^xyz$/i for case-sensitive versus case-insensitive matching.

    >perl -wMstrict -le
      "my $qr1 = qr/^xyz$/;  print $qr1;
       ;;
       my $qr2 = qr/^xyz$/i;  print $qr2;
      "
    (?^:^xyz$)
    (?^i:^xyz$)

    Clearly, the two regexes have very different effects. This difference is lost if you simply strip away the wrapper. The same goes for the other modifiers you could use on a regex.
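A small sketch of what gets lost. The substitution below is a naive, hypothetical way of stripping the wrapper, and the comparison shows the case-insensitive match disappearing along with the i flag:

```perl
use strict;
use warnings;

my $qr = qr/^xyz$/i;

# Naive stripping: drop the leading "(?^flags:" and the trailing ")".
(my $stripped = "$qr") =~ s/\A\(\?\^?[a-z-]*:(.*)\)\z/$1/s;

for my $input ('xyz', 'XYZ') {
    printf "%-3s  original: %-8s  stripped: %s\n",
        $input,
        ($input =~ $qr         ? 'match' : 'no match'),
        ($input =~ /$stripped/ ? 'match' : 'no match');
}
```

The original regex matches both inputs, while the stripped pattern matches only the lower-case one.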

    Take a look at the Extended Patterns section in perlre for the "(?:pattern)" extended pattern, usually the third one in the section, near the beginning. The flags used may be slightly different in your version of Perl, but this grouping pattern is essentially the 'wrapper' you're talking about.

    I don't know enough about JavaScript regexes to be able to advise you about the path forward. Jeffrey Friedl has written the excellent (and expensive) Mastering Regular Expressions that could probably tell you all about this sort of conversion, but beyond that, I'm not sure what to say.
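One possible path forward, sketched under assumptions: read the pattern and modifiers with regexp_pattern from the core re module and hand them to the frontend separately, for use with JavaScript's new RegExp(source, flags). The helper name qr_to_js is hypothetical. Only the i and m modifiers are passed through, and the pattern source is not rewritten, so Perl-only constructs such as \A or \z would still need separate handling.

```perl
use strict;
use warnings;
use re 'regexp_pattern';    # core module, Perl 5.10 and later

# Hypothetical helper: split a qr// object into a pattern source and a
# flag string. Modifiers other than i and m are refused rather than
# silently dropped, because dropping them would change what matches.
sub qr_to_js {
    my ($qr) = @_;
    my ($source, $mods) = regexp_pattern($qr);
    my $flags = '';
    for my $mod (split //, $mods) {
        # Assumption: Perl's character-set modifiers are skipped here.
        next if $mod =~ /[dual]/;
        die "modifier '$mod' is not translated\n" unless $mod =~ /[im]/;
        $flags .= $mod;
    }
    return ($source, $flags);
}

my ($source, $flags) = qr_to_js(qr/^xyz$/i);
print "source=$source flags=$flags\n";
```

Refusing unknown modifiers keeps the server-side check and the frontend check from drifting apart unnoticed. The server-side validation remains the authoritative one either way.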

Re: Printing regular expression variable
by kcott (Abbot) on May 28, 2013 at 01:27 UTC
      ....qr{...} does not return a string!

      Yes, I'm aware of that, otherwise it wouldn't work.

      MattLG

Re: Printing regular expression variable
by Anonymous Monk on May 28, 2013 at 04:33 UTC
