
Making sure user input is a valid regexp

by gri6507 (Deacon)
on Jun 08, 2010 at 12:48 UTC (#843657, perlquestion)
gri6507 has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

I have a bit of a dilemma. I need to take user input and use it as the regexp pattern in a search against some data. Here's some example code:

    use warnings;
    use strict;

    print "Input regexp: \n";
    my $user_input = <STDIN>;
    my $string = "abcdef";
    if ($string =~ /$user_input/x) {
        print "$string - matches\n";
    }
    else {
        print "$string - no match found\n";
    }

When you run this and type in something valid, like "a" or ".", you get a proper match. However, when you pass in an invalid regexp like "*a", the script dies with an error:
Quantifier follows nothing in regex; marked by <-- HERE in m/* <-- HERE a
I understand why this error happens, but I would like to trap it in a more user friendly way. Is there a method for ensuring that a regexp is valid before passing it to the m// operator?

Thank you as always!

Replies are listed 'Best First'.
Re: Making sure user input is a valid regexp
by moritz (Cardinal) on Jun 08, 2010 at 12:54 UTC
    chomp $user_input;
    my $regex = eval { qr{$user_input} };
    my $error = $@;
    unless (defined $regex) {
        print "Your regex seems to be invalid. Try `perldoc perlretut` for an explanation of valid regexes.\n";
        print "Regex compilation error was: $error";
    }

    See also: eval, perlop.

    Update: unconfused variable names, as pointed out by choroba++ and ReturnOfThelonious++

      Good, but $! should be $@.
      And in the print, you would use $error rather than just replacing $! with $@ :)
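The correction in this sub-thread comes down to which variable holds what: `$@` carries the message from the most recent failed `eval`, while `$!` reports the last system-call error and has nothing to do with regex compilation. A minimal sketch of that point, using a hypothetical helper name `regex_error` that is not part of the original posts:

```perl
use strict;
use warnings;

# Hypothetical helper: returns the compile error for a pattern,
# or an empty string if the pattern compiles.
sub regex_error {
    my ($pattern) = @_;
    # qr// dies on an invalid pattern; eval traps that and returns undef.
    my $regex = eval { qr/$pattern/ };
    # Read $@ right away, before any later eval can overwrite it.
    return defined $regex ? '' : $@;
}

print "error: ", regex_error('*a');
print "a.c compiles\n" if regex_error('a.c') eq '';
```

Note that the pattern is passed in as a string: an invalid literal such as `qr/*a/` written directly in the source would fail when the script itself is compiled, before any `eval` block runs.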
Re: Making sure user input is a valid regexp
by ww (Archbishop) on Jun 09, 2010 at 01:29 UTC

    You must have astoundingly high confidence that none of your users will ever - by intent, accident or ignorance - abuse this direct input. Shouting now: TAINT! UNTAINT!

    perldoc -q taint
    ...that source refers the reader to
    "Laundering and Detecting Tainted Data" in perlsec.

    Among the questions that pop to (my) mind (even knowing and allowing -- or trying to do so -- for the fact you've given us a brief sample that likely bears no resemblance to your actual code):

    • How are you going to untaint all the possible regexen?
    • How is $string generated? Assuming a user actually inputs a (safe) regex, your code must have use psi enabled for the user to have any chance of matching anything.
    • Why do you need to let users use regexen? Why wouldn't a decent search engine (Kino... etc) do the job more securely (and perhaps better)?
    • What basis do you have for expecting your users to know how to write a proper regex (you clearly have an expectation that some won't have that knowledge)?
    • And, repeated for emphasis, why a regex?
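If users only need to find literal text, the "why a regex?" question has a simple answer: skip user-supplied patterns altogether. A minimal sketch, assuming two hypothetical helpers (`contains_literal` and `literal_pattern`) that do not appear in the thread. `index` performs a plain substring search, and `quotemeta` (the function behind `\Q...\E`) escapes metacharacters when the input still has to end up inside a pattern:

```perl
use strict;
use warnings;

# Hypothetical helper: plain substring search, no regex involved.
sub contains_literal {
    my ($string, $needle) = @_;
    return index($string, $needle) >= 0;
}

# Hypothetical helper: build a pattern that matches the input literally.
sub literal_pattern {
    my ($needle) = @_;
    my $quoted = quotemeta $needle;   # "*d" becomes "\*d"
    return qr/$quoted/;
}

my $string = 'abc*def';
print "found with index\n"   if contains_literal($string, '*d');
print "found with pattern\n" if $string =~ literal_pattern('*d');
```

This only removes the compilation-error and metacharacter problems; it does not by itself address the tainting concerns raised above.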
Re: Making sure user input is a valid regexp
by rjt (Deacon) on Jun 08, 2010 at 17:43 UTC

    My preference would be to explicitly test for compilation failure with $@.

    use warnings;
    use strict;

    $| = 1;
    print "Input regexp: ";
    my $user_input = <STDIN>;
    my $regex = eval { qr/$user_input/x };
    if ($@) {
        print "Your regexp would not compile. The error was:\n$@\n";
        exit 1;
    }

    my $string = 'abcdef';
    print "$string - " . (($string =~ $regex) ? "matches\n" : "no match found\n");
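One way to package this approach for reuse is a small function that returns either the compiled regex or a cleaned-up message. This is a sketch under assumptions, not the poster's code: the name `compile_user_regex` is hypothetical, the input is chomped instead of relying on `/x` to ignore the trailing newline, and the substitution that strips the " at FILE line N" suffix assumes Perl's usual error format.

```perl
use strict;
use warnings;

# Hypothetical helper: returns (compiled regex, undef) on success
# or (undef, error message) on failure.
sub compile_user_regex {
    my ($input) = @_;
    chomp $input;                      # drop the newline left by <STDIN>
    my $regex = eval { qr/$input/ };
    return ($regex, undef) if defined $regex;

    my $error = $@;
    # Strip the trailing " at FILE line N..." location, which tells an
    # end user nothing. The greedy (.*) keeps everything before the
    # last such suffix.
    $error =~ s/^(.*) at .+? line \d+.*\z/$1/s;
    return (undef, $error);
}

for my $input ('a.c', '*a') {
    my ($regex, $error) = compile_user_regex($input);
    if (defined $regex) {
        my $verdict = 'abcdef' =~ $regex ? 'matches' : 'no match found';
        print "$input: abcdef - $verdict\n";
    }
    else {
        print "$input: invalid pattern: $error\n";
    }
}
```

Returning the error instead of printing it lets the caller decide how to present it, for example by re-prompting rather than exiting.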

Node Type: perlquestion [id://843657]
Approved by moritz