
Making a regex case insensitive

by Win (Novice)
on Mar 06, 2007 at 18:13 UTC (#603475=perlquestion)
Win has asked for the wisdom of the Perl Monks concerning the following question:

Please could someone guide me as to how best to write the regex below. The issue I have with it is that someone could write 'dRop' and the regex would not pick it up.
    if ( $_ =~ /$(--|;|DELETE\s|DROP\s|UPDATE\s|EXEC\s|INSERT\s|CREATE\s|delete\s|drop\s|update\s|exec\s|insert\s|create\s)/ ) {
        print "There is a possible injection attack here";
        ## Security function here
        die;
    }
Update: It appears the answer is:
    if ( $_ =~ /$(--|;|DELETE\s|DROP\s|UPDATE\s|EXEC\s|INSERT\s|CREATE\s)/i ) {
        die;
    }
But how do I stop the case insensitivity applying to '--' and ';'?

Replies are listed 'Best First'.
Re: Making a regex case insensitive
by ikegami (Pope) on Mar 06, 2007 at 18:18 UTC

    There's a modifier for regexp that makes it case-insensitive.

    By the way, if you're trying to prevent the users from making changes to the database, why don't you revoke the relevant rights from your database user?

      Because copies of the database exist on multiple machines and I believe that the system is best protected on two different levels. I wish to port the system to different applications as well. I can't believe that I forgot the /i thing. I really must go back to those Perl books I have at home and read them again.
        Instead of checking for bad tokens you should just use bound parameters whenever possible, and DBI's quote method when it isn't possible. You'll save yourself a lot of pain that way.

        Perfect paranoia is perfect awareness when it comes to preventing SQL injection attacks. Make sure you are binding or quoting everything that will touch the database. It's a semi-common mistake to include $ENV{HTTP_REFERER} or $ENV{HTTP_USER_AGENT} in the SQL unquoted.
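        As a minimal sketch of the binding and quoting advice above: the example below assumes DBD::SQLite is installed and uses an in-memory database with a hypothetical users table, so none of the names come from the original question. It shows a hostile-looking value passed once as a bound parameter and once through DBI's quote method.

```perl
use strict;
use warnings;
use DBI;

# Assumption: DBD::SQLite is available; ':memory:' keeps the sketch self-contained.
my $dbh = DBI->connect( 'dbi:SQLite:dbname=:memory:', '', '',
    { RaiseError => 1, AutoCommit => 1 } );

# Hypothetical table, for illustration only.
$dbh->do('CREATE TABLE users (name TEXT)');

# Input that would be dangerous if pasted directly into SQL text.
my $input = q{x'; DROP TABLE users; --};

# Bound parameter: the value travels separately from the SQL statement.
my $sth = $dbh->prepare('INSERT INTO users (name) VALUES (?)');
$sth->execute($input);

my ($stored) = $dbh->selectrow_array(
    'SELECT name FROM users WHERE name = ?', undef, $input );

# When a placeholder is not possible, quote() escapes the value for this driver.
my $quoted = $dbh->quote($input);
my ($count) = $dbh->selectrow_array(
    "SELECT COUNT(*) FROM users WHERE name = $quoted");

print "stored: $stored\n";
print "rows matching quoted value: $count\n";
```

        In both cases the value is stored and compared as plain data, so the table survives and no keyword filter is needed for this path.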

        I really must go back to those Perl books I have at home and read them again.
        I'll say.
      How do I stop the case insensitivity applying to '--' and ';'?

      Update: ok I should start to try and think like a computer.
        While the case of '--' and ';' seems to be a matter of debate and jocose comment, you might like to know that you can switch case sensitivity on and off in different parts of a regular expression. You would use constructs like (?i), (?-i), (?i:text) and (?-i:text). The first two switch case insensitivity on and off respectively. The second two just apply their effect, insensitive or sensitive respectively, to the text inside the parentheses. Here is a contrived example that uses a precompiled (qr{ ...}) regular expression that also uses extended syntax, the x, to allow comments and white space inside the expression.

            use strict;
            use warnings;

            my @strings = (
                q{catFiSHcake},
                q{DogFISHcAkE},
                q{cATfishCake},
                q{caTFISHcaKE});

            my $rxMixed = qr
                {(?xi)            # use extended syntax and
                                  # make case insensitive
                    (?:cat|dog)   # non-capture alternation
                                  # of cat or dog
                    (?-i:FISH)    # FISH, case sensitive
                                  # inside parentheses
                    cake          # cake, case insensitive
                                  # again
                };

            foreach my $string ( @strings )
            {
                print
                    qq{$string: },
                    $string =~ $rxMixed
                        ? qq{Match\n}
                        : qq{No match\n};
            }

        Here's the output.

            catFiSHcake: No match
            DogFISHcAkE: Match
            cATfishCake: No match
            caTFISHcaKE: Match

        I hope this is of interest.
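        Applied to the pattern from the question, the scoped (?i:text) construct might look like the sketch below. The helper name is_suspicious is hypothetical, and the leading $ from the posted pattern is left out on the assumption that it was not meant as an end-of-string anchor.

```perl
use strict;
use warnings;

# '--' and ';' sit outside the (?i:...) group, so only the keywords
# are matched case-insensitively.
my $suspicious = qr{
    --                      # SQL comment marker
  | ;                       # statement separator
  | (?i: (?:DELETE|DROP|UPDATE|EXEC|INSERT|CREATE) \s )
}x;

# Hypothetical helper: true if the text contains any flagged token.
sub is_suspicious {
    my ($text) = @_;
    return $text =~ $suspicious ? 1 : 0;
}

for my $input ( 'dRop table users', 'select 1; select 2', 'harmless text' ) {
    printf "%-20s %s\n", $input, is_suspicious($input) ? 'flagged' : 'ok';
}
```

        Since '--' and ';' have no upper or lower case, a plain /i on the whole pattern matches exactly the same strings. The scoped form only makes the intent explicit.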



        ok I should start to try and think like a computer

        Thinking like a programmer would be a good start.

        Update: Yes, I know it's all very tempting to downvote this. I'm sure I could have been a lot more tactful. But before you reach for the downvote button, please take a couple of minutes to review Win's posting history.

        I didn't know there was a lowercase '--' or ';'...
        eh?? What's a lowercase and uppercase ;?
        These characters will not be modified (nor will numbers, if you had any).

        Please Win,

        How do I stop the case insensitivity applying to '--' and ';'?

        Update: ok I should start to try and think like a computer.

        When you say things like this, it really sounds like you're just trolling...

        Where do you want *them* to go today?

        Regardless of how you think, you could at least test stuff.

        for (0..255) { print "$_=" . chr($_) . "\n" if chr($_) =~ /;/i };

        That should print 59=; out showing that /;/i is only matching ; and not lowercase ;, whatever that would be.

        print uc(';'), lc(';'); #outputs ;;
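        The same kind of test can be run over the whole ASCII range. This is only a sketch, restricted to codes 0 to 127, that lists every character whose lc and uc forms differ:

```perl
use strict;
use warnings;

# Collect every ASCII character whose lowercase and uppercase forms differ.
my @cased = grep { lc($_) ne uc($_) } map { chr($_) } 0 .. 127;

print scalar(@cased), " cased characters: ", join( '', @cased ), "\n";
print "';' is ", ( ( grep { $_ eq ';' } @cased ) ? 'cased' : 'not cased' ), "\n";
```

        Only the 52 letters A-Z and a-z show up, so /i has nothing to change for ';' or '-'.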

        FYI I like to use perl -dex (as recommended by tye I think) which gives you a nice way to run perl code and see its results.

        Eric Hodges
Re: Making a regex case insensitive
by Moron (Curate) on Mar 07, 2007 at 13:49 UTC
    Once upon a time this was a sensible question. Character sets such as EBCDIC and ASCII were designed so that everything could be shifted or unshifted by unsetting or setting a bit in the binary code. For example, in the original ASCII, unsetting the 32 bit not only shifts from 'a' to 'A', but also from '{' to '['. Unfortunately it is unclear why this was lost: whether because the ASCII convention itself contains not even a single technical provision (just a long featureless vomit of impregnable legalese), or because the whole idea got overlooked when a plethora of nastily unpredictable keyboards started arriving in front of us, intended to support different languages or whatever (just an excuse for incompetence - there was no need to botch ASCII). Nor is it clear whether the technical standards behind ASCII (probably an IEEE standard actually) were just Microshafted and/or IBuMmered as well. Suffice to say the concept of a shift key got severely corrupted, with total loss of consistency, and Perl, which came to be after the Great IBM/Microsoft Disaster, had only the bare bones left to pick through; hence the whole concept of case for Perl being limited to alpha only.

    Update: actually in a way the ASCII ctrl key survived more than the shift key! (for example CTRL-C is "C" with the 64 bit unset = ascii 3) and that is more consistently implemented on keyboards, probably because the hordes of typists from all over the world didn't actually use it to type a letter on IBM word processors, so it remained unIBuMerred until it was later Microshafted: I.e. in another sense it didn't, for example, CTRL-C (ascii-cancel) while surviving to some extent on Unix and Linux won't work the same way under Windows. Unfortunately, the ASCII ctrl key is partially impaired on *Nix owing to the attempt to make e.g. Open Office transparent for the ex-Windows user. (ALT-C would have been a better idea for "copy" in regard to preserving the existing standards for keyboards).
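    The bit arithmetic described above can be checked directly against standard ASCII codes. In this small sketch the helper clear_bit is hypothetical, and the character codes are printed as numbers where the result is not printable.

```perl
use strict;
use warnings;

# Clear one bit in a character's ASCII code and return the resulting code.
sub clear_bit {
    my ( $char, $bit ) = @_;
    return ord($char) & ~$bit;
}

printf "a -> %s\n", chr( clear_bit( 'a', 32 ) );    # A
printf "{ -> %s\n", chr( clear_bit( '{', 32 ) );    # [
printf "; -> code %d\n", clear_bit( ';', 32 );      # 27 (ESC), not a printable character
printf "C -> code %d\n", clear_bit( 'C', 64 );      # 3, the code sent by CTRL-C
```

    In this encoding the 32 bit pairs 'a' with 'A' and '{' with '[', while clearing it on ';' (code 59) lands on a control code.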


    Free your mind

Node Type: perlquestion [id://603475]
Approved by ikegami