PerlMonks  

How can I tell if an object is based on a regex?

by kyle (Abbot)
on May 31, 2007 at 16:42 UTC ( #618533=perlquestion )
kyle has asked for the wisdom of the Perl Monks concerning the following question:

In Object Oriented Perl, Chapter 5, TheDamian blesses a regular expression made with qr (see perlop). I never gave much thought to this, but I recently noticed the following:

  • ref for a regular expression returns "Regexp".
  • Scalar::Util::reftype for a regular expression says it's "SCALAR".
  • Scalar::Util::blessed says it's already blessed.
  • Its functionality as a regexp is not tied to the fact that it's blessed as 'Regexp'. You can bless it into some other class, and it's still a regexp.

What this seems to mean is that once I bless a regular expression into some other class, I can't tell that it's really a regular expression under the hood anymore, but it is. I also can't tell if some scalar reference is really a regexp or just some scalar that some joker decided to call a 'Regexp'.

The code below demonstrates what I'm talking about. In it, a real regular expression is examined and used both before and after being blessed. Next, a blessed scalar ref is examined the same way. It looks the same, but it doesn't work the same.

use Test::More 'tests' => 25;
use Scalar::Util qw( reftype blessed );

ok( 'foo' =~ //, '"foo" matches empty expression' );

# a real regexp, not blessed, works like this
my $rx = qr/f(oo)/;
is( ref $rx, 'Regexp', 'ref $rx eq "Regexp"' );
is( blessed $rx, 'Regexp', '$rx is blessed as "Regexp"' );
is( reftype $rx, 'SCALAR', 'reftype $rx eq "SCALAR"' );
ok( ! defined $$rx, '$$rx is not defined' );
ok( 'foo' =~ $rx, '$rx matches "foo"' );
is( $&, 'foo', 'matched text is "foo"' );
is( $1, 'oo', 'first capture is "oo"' );

# a blessed regexp works the same way,
# but what other evidence can tell me it's a regexp?
bless $rx, 'NotRegexp';
is( ref $rx, 'NotRegexp', 'ref $rx eq "NotRegexp"' );
is( blessed $rx, 'NotRegexp', '$rx is blessed as "NotRegexp"' );
is( reftype $rx, 'SCALAR', 'reftype $rx eq "SCALAR"' );
ok( ! defined $$rx, '$$rx is not defined' );
ok( 'foo' =~ $rx, '$rx matches "foo"' );
is( $&, 'foo', 'matched text is "foo"' );
is( $1, 'oo', 'first capture is "oo"' );

# this looks just like the blessed regexp, but it doesn't function
my $o;
my $notrx = \$o;
bless $notrx, 'NotRegexp';
is( ref $notrx, 'NotRegexp', 'ref $notrx eq "NotRegexp"' );
is( blessed $notrx, 'NotRegexp', '$notrx is blessed as "NotRegexp"' );
is( reftype $notrx, 'SCALAR', 'reftype $notrx eq "SCALAR"' );
ok( ! defined $$notrx, '$$notrx is not defined' );
ok( !('foo' =~ $notrx), '$notrx does not match "foo"' );

# this looks just like the real regexp, but it doesn't function
bless $notrx, 'Regexp';
is( ref $notrx, 'Regexp', 'ref $notrx eq "Regexp"' );
is( blessed $notrx, 'Regexp', '$notrx is blessed as "Regexp"' );
is( reftype $notrx, 'SCALAR', 'reftype $notrx eq "SCALAR"' );
ok( ! defined $$notrx, '$$notrx is not defined' );
ok( !('foo' =~ $notrx), '$notrx does not match "foo"' );

My question is, how can I tell if some scalar reference I receive really is a regular expression? Obviously, if it produces a match when used, I know it is a regular expression (a scalar reference to a string won't act as a regexp), but a non-match doesn't tell me much.
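
A note for readers on newer perls: as far as I know, from Perl 5.12 onward a compiled regexp is its own internal type, and Scalar::Util::reftype reports "REGEXP" for it instead of "SCALAR", even after it has been reblessed. The following is only a sketch under that assumption; the helper name looks_like_regexp is made up for illustration, and on a perl of the vintage used in this thread it would return false for everything.

```perl
use strict;
use warnings;
use Scalar::Util qw( reftype );

# Hypothetical helper: true if the referent's underlying type is REGEXP.
# Assumes Perl 5.12+; older perls report 'SCALAR' for qr// objects.
sub looks_like_regexp {
    my ($thing) = @_;
    my $type = reftype $thing;    # undef for non-references
    return ( defined $type && $type eq 'REGEXP' ) ? 1 : 0;
}

my $rx = bless qr/f(oo)/, 'NotRegexp';    # a real regexp in another class
my $o;
my $fake = bless \$o, 'Regexp';           # a plain scalar ref posing as one

print looks_like_regexp($rx)   ? "regexp\n" : "not a regexp\n";
print looks_like_regexp($fake) ? "regexp\n" : "not a regexp\n";
```

The class name plays no part in the check, so blessing into or out of 'Regexp' should not change the answer.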

Re: How can I tell if an object is based on a regex?
by chromatic (Archbishop) on May 31, 2007 at 17:57 UTC

    The only reliable way I know is to look for regexp magic on the SV:

    $ perl -MDevel::Peek
    my $r = qr/foo/;
    my $n = 'bar';
    Dump( $r );
    Dump( $n );
    SV = RV(0x817fb18) at 0x8152cdc
      REFCNT = 1
      FLAGS = (PADBUSY,PADMY,ROK)
      RV = 0x81535dc
      SV = PVMG(0x816dac8) at 0x81535dc
        REFCNT = 1
        FLAGS = (OBJECT,SMG)
        IV = 0
        NV = 0
        PV = 0
        MAGIC = 0x817c1f8
          MG_VIRTUAL = 0x814ee88
          MG_TYPE = PERL_MAGIC_qr(r)
          MG_OBJ = 0x8175270
        STASH = 0x81530f0 "Regexp"
    SV = PV(0x8153b00) at 0x815360c
      REFCNT = 1
      FLAGS = (PADBUSY,PADMY,POK,pPOK)
      PV = 0x816ba88 "bar"\0
      CUR = 3
      LEN = 4

    Unfortunately, I don't have an easy way to do that from pure-Perl code. I've also never had the need to do so.
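
One possible way to look for that magic without writing XS is the core B module, which exposes the internals that Devel::Peek prints. This is a rough sketch, not a tested recipe for every perl: the helper name has_regexp_nature is invented here, the MAGIC branch targets perls like the one in the dump above (where a qr// referent is a PVMG carrying 'r' magic), and the B::REGEXP branch assumes Perl 5.12 or later.

```perl
use strict;
use warnings;
use B ();

# Hypothetical helper: inspect the referent's internals via the core B module.
sub has_regexp_nature {
    my ($ref) = @_;
    return 0 unless ref $ref;

    my $sv = B::svref_2object($ref);

    # Assumption: on Perl 5.12+ the referent is a first-class REGEXP SV.
    return 1 if $sv->isa('B::REGEXP');

    # Older perls: look for PERL_MAGIC_qr, whose type character is 'r'.
    if ( $sv->can('MAGIC') ) {
        for my $mg ( $sv->MAGIC ) {
            return 1 if $mg->TYPE eq 'r';
        }
    }
    return 0;
}

my $r = bless qr/foo/, 'NotRegexp';
my $n = 'bar';
print has_regexp_nature($r)   ? "regexp\n" : "not a regexp\n";
print has_regexp_nature( \$n ) ? "regexp\n" : "not a regexp\n";
```

Because the check looks at the referent itself and ignores the package it is blessed into, a scalar reference blessed into 'Regexp' should not fool it.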

Re: How can I tell if an object is based on a regex?
by blazar (Canon) on May 31, 2007 at 22:34 UTC
    In Object Oriented Perl, Chapter 5, TheDamian blesses a regular expression made with qr (see perlop).

    For those without the book, the question is: why does he do that when, as you correctly point out, the return value of qr is already a reference blessed into the Regexp package? Especially since that fact on its own makes for interesting tricks, as a recent poor attempt of mine shows...

Re: How can I tell if an object is based on a regex?
by diotalevi (Canon) on Jun 01, 2007 at 00:29 UTC

    The Data::Dump::Streamer and re modules provide the hooks to examine an object and see if it has regexp nature.

    ⠤⠤ ⠙⠊⠕⠞⠁⠇⠑⠧⠊

Re: How can I tell if an object is based on a regex?
by demerphq (Chancellor) on Jun 01, 2007 at 17:12 UTC

    Long ago I had this question, and the general answer was as chromatic replied, so I wrote the code that diotalevi points out, first in Data::Dump::Streamer, and later moved it to re.pm in core. However, the latter is only in bleadperl and is not yet available in a production release of Perl.

    Had chromatic not vetoed my UNIVERSAL::DOES patch it would have been possible to do UNIVERSAL::DOES($rx,'qr//') but he elected to make said function/method essentially useless in the name of OO purity instead. Sad but true. Such is life in open source. :-(

    ---
    $world=~s/war/peace/g
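
The re.pm hooks mentioned above shipped, as far as I know, in Perl 5.10 as re::is_regexp and re::regexp_pattern. A minimal sketch of how they might be used on the two objects from the original question, assuming a perl at least that recent (the variable names are only for illustration):

```perl
use strict;
use warnings;
use re qw( is_regexp regexp_pattern );

my $rx = bless qr/f(oo)/i, 'NotRegexp';    # a real regexp in another class
my $o;
my $fake = bless \$o, 'Regexp';            # a plain scalar ref posing as one

for my $candidate ( $rx, $fake ) {
    if ( is_regexp($candidate) ) {
        # In list context regexp_pattern returns the pattern and its modifiers.
        my ( $pattern, $flags ) = regexp_pattern($candidate);
        print "regexp: pattern='$pattern' flags='$flags'\n";
    }
    else {
        print "not a regexp\n";
    }
}
```

The documentation for is_regexp states that it is not confused by overloading or blessing, which is exactly the situation the question describes.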
