Making sure user input is a valid regexp

by gri6507 (Deacon)
on Jun 08, 2010 at 12:48 UTC (#843657=perlquestion)
gri6507 has asked for the wisdom of the Perl Monks concerning the following question:

Fellow monks,

I have a bit of a dilemma. I need to take user input and use it as the regexp pattern in a search against some data. Here's some example code:

use warnings;
use strict;

print "Input regexp: \n";
my $user_input = <STDIN>;
my $string = "abcdef";
if ($string =~ /$user_input/x) {
    print "$string - matches\n";
}
else {
    print "$string - no match found\n";
}

When you run this and type in something valid like "a" or ".", you get a proper match. However, when you pass in an invalid regexp like "*a", the code dies with an error:
Quantifier follows nothing in regex; marked by <-- HERE in m/* <-- HERE a
I understand why this error happens, but I would like to trap it in a more user friendly way. Is there a method for ensuring that a regexp is valid before passing it to the m// operator?

Thank you as always!

Replies are listed 'Best First'.
Re: Making sure user input is a valid regexp
by moritz (Cardinal) on Jun 08, 2010 at 12:54 UTC
    chomp $user_input;
    my $regex = eval { qr{$user_input} };
    my $error = $@;
    unless (defined $regex) {
        print "Your regex seems to be invalid. Try `perldoc perlretut` for an explanation of valid regexes.\n";
        print "Regex compilation error was: $error";
    }

    See also: eval, perlop.

    Update: unconfused variable names, as pointed out by choroba++ and ReturnOfThelonious++

      Good, but $! should be $@.
      And in the second print, you should use $error rather than $! (or $@) :)
Re: Making sure user input is a valid regexp
by ww (Bishop) on Jun 09, 2010 at 01:29 UTC

    You must have astoundingly high confidence that none of your users will ever - by intent, accident or ignorance - abuse this direct input. Shouting now: TAINT! UNTAINT!

    perldoc -q taint
    ...that source refers the reader to
    "Laundering and Detecting Tainted Data" in perlsec.

    Among the questions that pop to (my) mind (even knowing and allowing -- or trying to do so -- for the fact you've given us a brief sample that likely bears no resemblance to your actual code):

    • How are you going to untaint all the possible regexen?
    • How is $string generated? Assuming a user actually inputs a (safe) regex, your code must have use psi enabled for the user to have any chance of matching anything.
    • Why do you need to let users use regexen? Why wouldn't a decent search engine (Kino... etc) do the job more securely (and perhaps better)?
    • What basis do you have for expecting your users to know how to write a proper regex (you clearly have an expectation that some won't have that knowledge)?
    • And, repeated for emphasis, why a regex?
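On the "why a regex?" question: if users only need to search for a piece of text, one option is to treat their input as a literal string rather than a pattern. The sketch below is only an illustration of that idea, not something proposed in the thread. The helper name matches_literally is hypothetical, and it assumes the input still carries the trailing newline from <STDIN>. It relies on Perl's built-in quotemeta to escape regex metacharacters.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): treat the user's text as a
# literal substring instead of a regex, so no pattern can be invalid.
sub matches_literally {
    my ($string, $user_text) = @_;
    chomp $user_text;                      # drop the newline left by <STDIN>
    return 1 if $user_text eq '';          # empty text occurs in any string
    my $literal = quotemeta $user_text;    # escape metacharacters such as * and .
    return $string =~ /$literal/ ? 1 : 0;
}

print matches_literally("abcdef", "cd\n") ? "match\n" : "no match\n";
print matches_literally("abcdef", "*a\n") ? "match\n" : "no match\n";
print matches_literally("a.c",    ".\n")  ? "match\n" : "no match\n";
```

With this approach "*a" is simply text that does not occur in "abcdef", and "." matches only a literal dot. It gives up regex power entirely, so it only fits if literal search is really all the users need.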
Re: Making sure user input is a valid regexp
by rjt (Deacon) on Jun 08, 2010 at 17:43 UTC

    My preference would be to explicitly test for compilation failure with $@.

    use warnings;
    use strict;
    $| = 1;

    print "Input regexp: ";
    my $user_input = <STDIN>;
    my $regex = eval { qr/$user_input/x };
    if ($@) {
        print "Your regexp would not compile. The error was:\n$@\n";
        exit 1;
    }

    my $string = 'abcdef';
    print "$string - " . (($string =~ $regex) ? "matches\n" : "no match found\n");
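The eval/qr approach from the replies above can be wrapped in a small reusable function. This is a minimal sketch under stated assumptions: the name compile_user_regex and its two-value return convention are hypothetical, not from the thread, and it drops the /x flag so that whitespace in the user's pattern is significant.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): returns (regex, undef) on
# success, or (undef, error message) if the pattern does not compile.
sub compile_user_regex {
    my ($pattern) = @_;
    chomp $pattern;                        # drop the newline left by <STDIN>
    my $regex = eval { qr/$pattern/ };     # a compile failure dies inside eval
    return (undef, $@) unless defined $regex;
    return ($regex, undef);
}

for my $input ("a\n", "*a\n") {
    my ($regex, $error) = compile_user_regex($input);
    if (defined $regex) {
        print "abcdef ", ("abcdef" =~ $regex ? "matches" : "does not match"), "\n";
    }
    else {
        print "Invalid regexp: $error";
    }
}
```

Checking the return value with defined, as moritz's reply does, avoids depending on the state of $@ after the eval. Note that this only catches patterns that fail to compile; it does nothing about valid but expensive or otherwise hostile patterns.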
