PerlMonks  

Re: Parse::RandGen::Regexp

by Tanktalus (Canon)
on Aug 06, 2005 at 16:46 UTC (#481516)


in reply to Parse::RandGen::Regexp

There are a number of aspects here. First, your initial problem: the \s. Try escaping the \:

my $regexp = "/^STOR\\s[^\n]{100}/smi";
However, you still need to use the qr operator:
my $r = Parse::RandGen::Regexp->new(qr/$regexp/);
But even that won't work the way you think it will because the leading and trailing delimiters will be treated literally - so the regular expression will match something that has a slash, then a beginning-of-line zero-width assertion, and has a trailing "/smi", literally. You could do something like:
my $r = Parse::RandGen::Regexp->new(eval "qr$regexp");
but that's unsafe if you get your regexp from an unsafe source. Then again, if you're getting regexps from unsafe sources, I'm not sure how easy it is to strip out the unsafe aspects of regular expressions, which could execute arbitrary perl code during a match.
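
To see the difference for yourself, here is a minimal sketch that leaves Parse::RandGen out of it and only compares the two ways of compiling the string. The test input $line is made up for illustration.

```perl
use strict;
use warnings;

# The string as it sits in perl code: \\s survives as \s, and \n
# becomes a real newline inside the character class.
my $regexp = "/^STOR\\s[^\n]{100}/smi";

# Made-up input: "stor", one space, then 100 non-newline characters.
my $line = "stor " . ("x" x 100);

# Interpolating into qr// keeps the slashes and "/smi" as literal text.
my $literal = qr/$regexp/;

# String eval lets perl parse the delimiters and the flags itself.
# Only do this with strings you trust.
my $parsed = eval "qr$regexp" or die "bad regexp: $@";

print "interpolated: ", ($line =~ $literal ? "yes" : "no"), "\n";
print "eval'd:       ", ($line =~ $parsed  ? "yes" : "no"), "\n";
```

The interpolated version should report "no" here, since it wants a literal slash in front of the beginning-of-string assertion, while the eval'd version should report "yes".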

To remove the eval, you would also have to restrict your input so that it does not include the leading and trailing delimiters. The trailing smi flags could not be supplied either (or you would hard-code them, which makes them mandatory). However, even then, the input can still set these flags from inside the regular expression:

(?smi)^STOR\s[^\n]{100}
Note how the \'s aren't escaped here. Because this is data, and not interpreted by perl (until we get to the regular expression handler), we don't need to escape here. Nothing needs escaping because perl treats nothing in the data as special. Once you've read this in, you can go back to using qr/$regexp/ in your call to the P::RG::R constructor.
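
As a rough sketch of that approach: the snippet below uses an in-memory filehandle as a stand-in for whatever external source you really have (the names $source, $fh and $sample are just for illustration). The single-quoted heredoc leaves the backslashes untouched, the same way a real file would.

```perl
use strict;
use warnings;

# Stand-in for an external file: one pattern per line, nothing escaped.
my $source = <<'END';
(?smi)^STOR\s[^\n]{100}
END

open my $fh, '<', \$source or die "cannot open in-memory file: $!";
chomp(my $string = <$fh>);   # drop the trailing newline
close $fh;

# No eval needed: the inline (?smi) carries the flags.
my $re = qr/$string/;

# Made-up input: "stor", a tab, then 100 non-newline characters.
my $sample = "stor\t" . ("a" x 100);
print "matched\n" if $sample =~ $re;
```

The resulting $re is what you would then hand to the constructor.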

Hope this helps.

Replies are listed 'Best First'.
Re^2: Parse::RandGen::Regexp
by paulski (Beadle) on Aug 07, 2005 at 03:30 UTC
    Your first suggestions are close to what I need, but I don't want to have to escape characters. What other characters would I have to escape besides '\s'? This gets hard to manage.

    So I still have the problem, how do I convert a string

    (?smi)^STOR\s[^\n]{100}

    OR

    /^STOR\s[^\n]{100}/smi

    into a regexp that P::RG::R will handle?

      To convert the first string, (?smi)^STOR\s[^\n]{100}, into a compiled regular expression for P::RG::R, just use qr/$string/, assuming $string contains the string as read from the external source (don't forget to chomp it if that's the case!).

      To convert the second string, /^STOR\s[^\n]{100}/smi, again, assuming $string is read from an external source (not perl code - perl DATA is fine), just use eval "qr$string".
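
If the string eval is a worry, one alternative (not from the posts above, just a sketch) is to split the delimiters and flags off yourself and fold the flags into an inline (?flags:...) group. The helper name compile_slashed and the decision to accept only the flags i, m, s and x are assumptions made for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: turn "/pattern/flags" into a compiled regexp
# without string eval. Only the flags i, m, s and x are accepted.
sub compile_slashed {
    my ($string) = @_;
    my ($pattern, $flags) = $string =~ m{\A/(.*)/([imsx]*)\z}s
        or die "expected /pattern/flags, got: $string\n";

    # (?flags:...) applies the flags to the enclosed pattern only.
    my $wrapped = length $flags
        ? '(?' . $flags . ':' . $pattern . ')'
        : $pattern;
    return qr/$wrapped/;
}

# Single quotes here so the backslashes are kept, as they would be
# in a string read from outside the program.
my $string = '/^STOR\s[^\n]{100}/smi';
my $re     = compile_slashed($string);

print "matched\n" if ("stor " . ("x" x 100)) =~ $re;
```

This keeps perl from ever parsing the input as code, at the cost of supporting only slash delimiters and a fixed set of flags.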

      No matter what, if the string starts out as a double-quoted string inside perl code, you have to escape the \'s that have special meaning to the regular expression parser but not to perl, such as \s, \w, \S, \W, \d, \D, ... so that they survive to the parser. (In a single-quoted string perl leaves those \'s alone.) This is not needed if the string is stored outside of code because then the perl compiler won't see the \'s.
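
A quick way to check what survives is to look at the strings themselves. This is only a sketch; the variable names are made up, and the misc warning is switched off for one line so the unescaped version compiles quietly.

```perl
use strict;
use warnings;

my $double = "^STOR\\s";   # backslash escaped for perl: the string is ^STOR\s
my $single = '^STOR\s';    # single quotes leave \s alone: same string

# Unescaped in double quotes: perl does not know \s as a string escape,
# so the backslash is dropped (and a warning is issued if enabled).
my $dropped = do { no warnings 'misc'; "^STOR\s" };

print "double and single are the same\n" if $double eq $single;
print "unescaped double-quoted string is '$dropped'\n";
```

The last line shows the backslash is gone before the regular expression parser ever sees the string.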

        OK that makes sense to me now. :-) I tried this and got it working. I'm not sure what you mean by "stored outside of code" though. Could you please clarify?

        Thanks,

        Paul
