Putting the stringified Regexp object into code

by sedusedan (Monk)
on Nov 06, 2012 at 17:54 UTC ( #1002548=perlquestion )
sedusedan has asked for the wisdom of the Perl Monks concerning the following question:

Recently, in one of my projects, I needed to put a stringified Regexp object into code. For example, I have:

$re = qr(/);

which stringifies into (in Perl 5.14):

(?^:/)

I then need to generate some Perl code (later to be eval()-ed) to test data against this regex:

$code = "if (\$data =~ qr/$re/) { blah() }";

Currently I just do this:

$re = "$re"; $re =~ s!/!\\/!g;

I'm wondering if this is generally safe for the various regexes, or are there edge cases which I missed?
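
One edge case worth checking is a regex whose slash is already escaped, such as qr(\/). The sketch below is not from the thread: naive_escape reproduces the substitution above, while escape_slashes and try_match are hypothetical helpers added for illustration. It builds the generated code for both escaping strategies and reports whether the result compiles and matches.

```perl
use strict;
use warnings;

# The substitution from the question: put a backslash before every slash.
sub naive_escape {
    my ($re) = @_;
    (my $str = "$re") =~ s!/!\\/!g;
    return $str;
}

# Hypothetical alternative: step over existing backslash pairs untouched
# and escape only the slashes that are not already escaped.
sub escape_slashes {
    my ($re) = @_;
    (my $str = "$re") =~ s{(\\.)|/}{ defined $1 ? $1 : '\\/' }gse;
    return $str;
}

# Build the generated code and eval it.
# Returns 1 or 0 on success, undef if the generated code does not compile.
sub try_match {
    my ($pattern, $data) = @_;
    return eval "\$data =~ qr/$pattern/ ? 1 : 0";
}

my @strategies = (
    [ naive  => \&naive_escape ],
    [ paired => \&escape_slashes ],
);

for my $re (qr(/), qr(\/)) {
    for my $strategy (@strategies) {
        my ($name, $escape) = @$strategy;
        my $res = try_match($escape->($re), 'a/b');
        print "$name, $re: ", (defined $res ? "result $res" : 'compile error'), "\n";
    }
}
```

With the naive substitution, an existing \/ becomes \\/, so the slash is no longer escaped and ends the qr// literal early. Stepping over backslash pairs avoids that, though this sketch makes no attempt to handle constructs such as embedded code blocks.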

Replies are listed 'Best First'.
Re: Putting the stringified Regexp object into code
by kennethk (Abbot) on Nov 06, 2012 at 18:40 UTC
    Is there a reason you are avoiding using a closure? Doing something like
    my $re = qr(/);
    my $code = sub { if ($_[0] =~ $re) { blah() } };
    ...
    $code->($data);
    will probably save you some maintenance headaches and keep errors localized to where they were coded. There is nothing wrong per se with what you've done, but it doesn't take advantage of many of the strengths modern Perl has to offer. In my own work, I have found that a string eval correlates strongly with me being too clever for my own good.

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      Yup, I am specifically generating Perl code (as string). The resulting code can be eval()-ed or embedded in other source code.

        How are you embedding? This sounds like something you should be solving using modules if you need to share the result among multiple scripts. But as I said, the code should stand as is. I would point out that the qr in $code = "if (\$data =~ qr/$re/) { blah() }"; is unnecessary.

        #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re: Putting the stringified Regexp object into code
by AnomalousMonk (Canon) on Nov 06, 2012 at 18:53 UTC

    Do you really need to stringize the entire if-statement? If you can get away with it, it's much simpler to use the stringized regex object on its own. (When a match such as  m// or  s/// is compiled, the compiler looks for terminating delimiters before interpolating scalars, so embedded '/' characters in the scalars should not be a problem.) (Also: The  =~ operator will bind to a plain old string.)

    >perl -wMstrict -le
    "my $rx = qr{ foo/bar }xms;
     print 'ref: ', ref $rx;
     $rx = qq{$rx};
     print 'de-refized: ', ref $rx;
     print $rx;
     ;;
     my $str = 'xx foo/bar yy';
     print 'match' if $str =~ $rx;
     print qq{captured '$1'} if $str =~ m/($rx)/;
    "
    ref: Regexp
    de-refized:
    (?^msx: foo/bar )
    match
    captured 'foo/bar'

      Yes, basically I want to generate Perl code (as a string) in which an input Regexp object must appear as a literal.

      BTW, the project is Data::Sah and here's an example of the code generation in action:

      $ TRACE=1 LOG_SAH_VALIDATOR_CODE=1 perl \
          -MLog::Any::App -MData::Sah=gen_validator \
          -E'$v = gen_validator([str => {match=>qr!/!}]); for ("a", "a/") { say "$_: ", $v->($_) }'
      ...
      [98] validator code:
       1|sub {
       2|    my ($data) = @_;
       3|    my $res =
       4|        # skip if undef
       5|        (!defined($data) ? 1 :
       6|        (# check type 'str'
       7|        (!ref($data))
       8|        &&
       9|        (# clause: match
      10|        ($data =~ qr/(?^u:\/)/))));
      11|    return $res;
      12|}
      a:
      a/: 1


        The reason kennethk is wondering why you're avoiding closures is that much template-generated code can be easily replaced with a closure. Generally a string eval can be replaced with something better.

        Munging strings to generate functions is certainly possible, and many have done it. But sometimes it's just easier to generate the function itself. An example (very loosely) based on your sample code:

        $ cat
        #!/usr/bin/perl
        use strict;
        use warnings;

        # Function to accept only 4 digit numbers
        my $v_4dig = gen_validator('^\d{4}$');
        # Function to accept only 3 digit numbers
        my $v_3dig = gen_validator('^\d{3}$');
        # Function to accept only lower-case alphabetic strings
        my $v_alpha = gen_validator('^[a-z]+$');

        for my $t ('apple', 123, 456, 7890) {
            print "$t:\t", $v_4dig->($t), "\t", $v_3dig->($t), "\t", $v_alpha->($t), "\n";
        }

        sub gen_validator {
            my $regex = shift;
            # Create function to validate against current regex
            return sub {
                my $data = shift;
                return 1 if ! defined $data;
                if (ref $data eq '') {
                    return 1 if $data =~ /$regex/;
                }
                return 0;
            }
        }

        $ perl
        apple:  0       0       1
        123:    0       1       0
        456:    0       1       0
        7890:   1       0       0

        Update: Added the third validator function example, just for variety. Also added a couple of comments.

        Update: Re-read thread and properly attributed suggestion.


        When your only tool is a hammer, all problems look like your thumb.

Re: Putting the stringified Regexp object into code
by tobyink (Abbot) on Nov 06, 2012 at 20:35 UTC

    There is only one edge case that springs to mind: where there are pragmata that affect how the regexp works. Certain pragmata, like use re '/u', will affect the stringification of the regexp reference, so you don't need to worry about them. But others, like re::engine::Lua, affect the behaviour of the regexp without affecting the stringification.

    If you can wrap the regexp in a coderef, then you can use B::Deparse to get a string of Perl code, and that will preserve pragmata...

    sub regexp2code (&) {
        require B::Deparse;
        my $re = shift;
        '(do ' . B::Deparse->new->coderef2text($re) . ')';
    }

    LUA_STYLE: {
        use re::engine::Lua;
        CORE::say regexp2code { qr(%a) };
    }

    PERL_STYLE: {
        CORE::say regexp2code { qr(%a) };
    }

    perl -E'sub Monkey::do{say$_,for@_,do{($monkey=[caller(0)]->[3])=~s{::}{ }and$monkey}}"Monkey say"->Monkey::do'
      Thanks, Toby. I'm bookmarking this in case I need it.
Re: Putting the stringified Regexp object into code
by AnomalousMonk (Canon) on Nov 06, 2012 at 21:39 UTC

    The deferred interpolation of scalars in a regex can still be relied on, provided the deferral of all the various interpolations involved is synchronized properly.

    (However, I share the suspicions of others about this approach to what seems like a templating problem.)

    >perl -wMstrict -le
    "my $rx = qr{ foo/bar }xms;
     print $rx;
     ;;
     my $str = 'xx foo/bar yy';
     eval qq{
       print q{match} if \$str =~ \$rx;
       my \$ry = q{$rx};
       print qq{captured '\$1'} if \$str =~ m/(\$ry)/;
     };
     $@ and die qq{eval err: $@};
    "
    (?^msx: foo/bar )
    match
    captured 'foo/bar'

    Update: Slightly altered example code to more closely match my previous post.
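
    For generated code that has to stand alone, the same idea (carry the pattern as a plain string and let it be compiled when the match runs) can be sketched as below. This is an illustration under assumptions, not code from the thread: perl_single_quote and gen_match_code are hypothetical helper names. The stringified regex is emitted as a single-quoted literal, so only backslashes and single quotes need escaping and no regex delimiter is involved.

```perl
use strict;
use warnings;

# Hypothetical helper: render any string as a single-quoted Perl literal.
# Inside single quotes only backslash and the quote itself need escaping.
sub perl_single_quote {
    my ($str) = @_;
    $str =~ s/([\\'])/\\$1/g;
    return "'$str'";
}

# Hypothetical generator: returns Perl source for a validator sub that
# matches its argument against the stringified regex.
sub gen_match_code {
    my ($re) = @_;
    my $literal = perl_single_quote("$re");
    return "sub { my (\$data) = \@_; return \$data =~ $literal ? 1 : 0 }";
}

for my $re (qr(/), qr(\/), qr{it's}) {
    my $code = gen_match_code($re);
    my $sub  = eval $code or die "eval err: $@";
    print "$code\n";
    print "  a/b:  ", $sub->('a/b'), "\n";
    print "  it's: ", $sub->("it's"), "\n";
}
```

    The trade-off is that the pattern is handed to the match as a string rather than as a precompiled qr// object, and, as noted elsewhere in the thread, any pragma that changes regex behaviour without changing the stringification is still lost.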

Node Type: perlquestion [id://1002548]
Approved by marto