PerlMonks  

Re^2: Site Search perlscript and security

by steelrose (Scribe)
on Nov 29, 2005 at 21:06 UTC (#512770)


in reply to Re: Site Search perlscript and security
in thread Site Search perlscript and security

So, my solution:

$string =~ s/((?![\w,\s])|(?=[_,\,])).//g;

Then do the match using the cleaned string. And of course, print a disclaimer on the page next to the text box, telling users that special characters will be ignored ;)
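
For illustration, here is a minimal sketch of what the substitution above does. The helper names `clean_query` and `clean_query_simple` and the sample input are assumptions, not part of the original post. The second helper uses a plain negated character class that should behave the same way: it drops anything that is not a word character or whitespace, and also drops the underscore.

```perl
use strict;
use warnings;

# Hypothetical helper: applies the substitution from the post unchanged.
sub clean_query {
    my ($string) = @_;
    $string =~ s/((?![\w,\s])|(?=[_,\,])).//g;
    return $string;
}

# Assumed simpler equivalent: remove every character that is not a word
# character or whitespace, plus the underscore (which \w would otherwise keep).
sub clean_query_simple {
    my ($string) = @_;
    $string =~ s/[^\w\s]|_//g;
    return $string;
}

my $input = q{foo_bar, baz; "quux" (1+2)};
print clean_query($input), "\n";          # foobar baz quux 12
print clean_query_simple($input), "\n";   # foobar baz quux 12
```

Commas are removed in both versions: the comma inside the first lookahead only defers the decision to the second alternative, which then matches it.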

If you give a man a fish he will eat for a day.
If you teach a man to fish he will buy an ugly hat.
If you talk about fish to a starving man, you're a consultant.


Re^3: Site Search perlscript and security
by hardburn (Abbot) on Nov 29, 2005 at 21:25 UTC

    IMHO, \w and \s are too liberal in what they accept. Chances are that your search will not need Unicode, and \w in particular is going to accept that if your perl has Unicode support. Unless you know you need Unicode, it's probably better to use the explicit character class [A-Za-z0-9].
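
The difference can be sketched as follows. The helper name `clean_query_ascii` and the sample word are assumptions made for illustration, and keeping a literal space instead of `\s` is one possible choice rather than something the post specifies. The sample string contains a code point above 255, so Perl treats it as a Unicode character string.

```perl
use strict;
use warnings;

# Hypothetical helper: keep only ASCII letters, digits, and the space character.
sub clean_query_ascii {
    my ($string) = @_;
    $string =~ s/[^A-Za-z0-9 ]//g;
    return $string;
}

# "zloty" with U+0142 (l with stroke), written as an escape so the
# source file itself stays plain ASCII.
my $word = "z\x{142}oty";

# \w accepts the non-ASCII letter ...
print "\\w accepts it\n" if $word =~ /^\w+$/;

# ... while the explicit ASCII class does not.
print "explicit class rejects it\n" unless $word =~ /^[A-Za-z0-9]+$/;

# The non-ASCII letter and the punctuation are stripped.
print clean_query_ascii("$word 42!"), "\n";   # zoty 42
```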

    "There is no shame in being self-taught, only in not trying to learn in the first place." -- Atrus, Myst: The Book of D'ni.

      A good point. Since the data I plan to search contains very few characters outside A-Z, a-z and 0-9 that would need to be searchable, I can just add those characters to the character class (like é, the e with an acute accent).
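
A minimal sketch of that idea, assuming the input has already been decoded into a Perl character string (not raw bytes). The helper name `clean_query_accented` and the sample input are hypothetical, and é is written as the escape `\x{e9}` so the example does not depend on the source file's encoding.

```perl
use strict;
use warnings;

# Hypothetical whitelist: ASCII letters, digits, space, plus e-acute (U+00E9).
# Further characters would be added to the class in the same way.
sub clean_query_accented {
    my ($string) = @_;
    $string =~ s/[^A-Za-z0-9 \x{e9}]//g;
    return $string;
}

# Encode output as UTF-8 so the accented letter prints cleanly.
binmode(STDOUT, ':encoding(UTF-8)');
print clean_query_accented("caf\x{e9}; menu_2"), "\n";
```

If the search data arrives as UTF-8 bytes, it would need to be decoded first (for example with the core `Encode` module) for the class to see é as a single character.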

