http://www.perlmonks.org?node_id=1012724

smontsar has asked for the wisdom of the Perl Monks concerning the following question:

Is there some quick and dirty way to detect if a string is a pure string, or if it is a regular expression?

Given:

$a="a.*b"; $b="acdb";

I am looking for a technique or method such that (in pseudocode):

methodIsItRegEx($a) == TRUE
methodIsItRegEx($b) == FALSE

Re: Detecting if a string is a regular expression
by MidLifeXis (Monsignor) on Jan 10, 2013 at 19:45 UTC

    Wouldn't $b also be a regular expression, and $a also be a valid string? In other words, what are your criteria?

    --MidLifeXis

Re: Detecting if a string is a regular expression
by davido (Cardinal) on Jan 10, 2013 at 19:48 UTC

    What problem are you trying to solve?


    Dave

Re: Detecting if a string is a regular expression
by LanX (Saint) on Jan 10, 2013 at 19:58 UTC
    > Is there some quick and dirty way to detect if a string is a pure string, or if it is a regular expression?

    no, IMHO any $string can be used as a valid regex like in m/$string/ w/o syntax error.

    but if you need a regex datatype to distinguish $scalars use qr// to generate them.

    DB<106> $s='abc'
     => "abc"
    DB<107> $r=qr/abc/
     => qr/abc/
    DB<108> ref $r eq 'Regexp'
     => 1
    DB<109> ref $s eq 'Regexp'
     => ""
    DB<110> 'xabcx'=~m/$r/
     => 1
    DB<111> 'xabcx'=~m/$s/
     => 1
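The qr// check above can be wrapped in a small predicate. This is a minimal sketch; the helper name is_regex_object is hypothetical and not from the thread. It only tells compiled qr// objects apart from plain strings and says nothing about what a plain string contains.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): true only for values
# that were created with qr//, false for ordinary strings.
sub is_regex_object {
    my ($value) = @_;
    return ref($value) eq 'Regexp';
}

my $s = 'abc';      # plain string
my $r = qr/abc/;    # compiled regex object

print is_regex_object($r) ? "regex object\n" : "plain string\n";
print is_regex_object($s) ? "regex object\n" : "plain string\n";
```

This only helps if the code that produces the values uses qr// for the patterns in the first place.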

    Cheers Rolf

      ... any $string can be used as a valid regex like in m/$string/ ...

      ... and you don't even need the m// or qr// incantation. The =~ binding operator can do the trick:

      >perl -wMstrict -le
      "my $s = 'abc';
       ;;
       print 'match' if 'xabcx' =~ $s;
       print 'match' if 'xabcx' =~ 'abc';
      "
      match
      match

        yeah I noticed, but normally I prefer the more obvious syntax. (the m is superfluous too =)

        Cheers Rolf

        PS: I know someone will now show us benchmarks demonstrating that at least three and a half processor cycles are wasted this way ... ;)

Re: Detecting if a string is a regular expression
by BrowserUk (Patriarch) on Jan 11, 2013 at 00:29 UTC

    To test if a string contains regex metachars, quotemeta it and compare it to the original:

    print "$_: does ", ( $_ eq quotemeta() ? 'not ' : '' ), "contain metachars\n"
        for 'a.c', '[abc]', 'abcd';

    a.c: does contain metachars
    [abc]: does contain metachars
    abcd: does not contain metachars

    Of course, that doesn't test if the metachars make sense as a regex.
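The quotemeta comparison can be packaged as a reusable check. This is a minimal sketch; the helper name has_metachars is hypothetical. Note that quotemeta escapes every ASCII character that is not a letter, digit, or underscore, so a string with a space or a hyphen also counts as "containing metachars" under this test.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): true if quotemeta()
# would change the string, i.e. it contains at least one character
# outside [A-Za-z0-9_].
sub has_metachars {
    my ($str) = @_;
    return $str ne quotemeta($str);
}

for my $candidate ('a.c', '[abc]', 'abcd', 'a b') {
    printf "%-6s %s\n", $candidate,
        has_metachars($candidate) ? 'has metachars' : 'no metachars';
}
```

Because of the space case, this is better read as "contains something other than word characters" than as "is a regular expression".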


      for completeness:

      > Of course, that doesn't test if the metachars make sense as a regex

      let Perl parse Perl =)

      $ perl -ple ' $_ = eval "sub { /$_/ }" ? "OK\n" : "$@" '
      a
      OK
      a(
      Unmatched ( in regex; marked by <-- HERE in m/a( <-- HERE / at (eval 2) line 1, <> line 2.
      a[
      Unmatched [ in regex; marked by <-- HERE in m/a[ <-- HERE / at (eval 3) line 1, <> line 3.
      a()
      OK

      Cheers Rolf

      UPDATE

      unfortunately not always correct:

      $ perl -ple ' $_ = eval "sub { /$_/ }" ? "OK\n" : "$@" '
      a[$]
      OK

      but

      $ perl -e ' "abc" =~ /a[$]/ '
      Unmatched [ in regex; marked by <-- HERE in m/a[ <-- HERE 5.010000/ at -e line 1.

      UPDATE

      interesting, this happens because it's a run-time error ... (why?)

      $ perl -e 'sub {"aaa" =~ /a[$]/}'
      $ perl -e '"aaa" =~ /a[$]/'
      Unmatched [ in regex; marked by <-- HERE in m/a[ <-- HERE 5.010000/ at -e line 1.
      $ perl -ce '"aaa" =~ /a[$]/'
      -e syntax OK

      looks like a parser problem to me!

      UPDATE

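      variable interpolation is part of the problem: in /a[$]/ Perl interpolates $] (the interpreter version, hence the 5.010000 in the message), which leaves an unclosed [. a[$] is a valid pattern, as long as it's not interpolated: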

      perl -E 'say q(a$) =~ q(a[$]) '
      1

      UPDATE

      this is very reliable

      perl -ple ' $_ = eval "sub { 'a' =~ q\0$_\0 }" ? "OK\n" : "$@" '

      to avoid problems with \0-delim consider using here-docs instead, or even a pack/unpack combination
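A different way to sidestep the delimiter question is to avoid pasting the candidate into source code at all. The sketch below swaps the string eval for a block eval around qr//, so the pattern is compiled at run time from a variable and any compile error is caught. The helper name is_valid_regex is hypothetical, and this is one possible approach rather than the method used above.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): true if $pattern
# compiles as a regex. The string is interpolated once into qr//,
# so its contents are never re-parsed as Perl source and sequences
# such as "$]" inside it are not expanded as variables.
sub is_valid_regex {
    my ($pattern) = @_;
    return defined eval { qr/$pattern/ };
}

for my $candidate ('a', 'a(', 'a[', 'a()', 'a[$]') {
    printf "%-5s %s\n", $candidate,
        is_valid_regex($candidate) ? 'OK' : 'does not compile';
}
```

As with the string-eval version, this only reports whether the pattern compiles. A plain string such as 'acdb' also compiles, so it does not separate "strings" from "regexes".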

Re: Detecting if a string is a regular expression
by Kenosis (Priest) on Jan 10, 2013 at 19:59 UTC

    You could create a regex that checks the string for the characters you want to allow, e.g.:

    sub isStrOK { return $_[0] =~ /\A[\s\w]+\z/; }

    The above's not intended to be exhaustive for allowable characters, but only an example.