CGI - hazardous characters

by rpike (Scribe)
on Jul 19, 2010 at 20:36 UTC (#850323)
rpike has asked for the wisdom of the Perl Monks concerning the following question:

Hi all, I was wondering what the easiest and most effective way is to remove special characters coming from a form. If I use a CGI object (from the CGI module), will that remove any characters automatically or do any sort of checks on the incoming data? The incoming data will vary, and so will what is acceptable (e.g. @ is needed for e-mail, text fields for comments may expect a few types of characters, etc.). How do you handle data such as this? I'm trying to write the code for a mock website for buying and searching for goods, etc. Any help would be greatly appreciated. Thanks. AG

Replies are listed 'Best First'.
Re: CGI - hazardous characters
by fullermd (Priest) on Jul 20, 2010 at 00:35 UTC

    As a general rule, removing "special" (whatever that may mean in a particular context) characters is a much more dangerous and fragile solution than removing everything but "normal" characters. Figure out what you want to allow, and then remove (or throw errors for) anything else.
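    The allow-what-you-want approach can be sketched roughly as below. The field names and patterns are hypothetical, chosen only for illustration (the e-mail pattern in particular is deliberately loose and not a full address validator); the point is that each field declares what it accepts and everything else is rejected.

```perl
use strict;
use warnings;

# Hypothetical per-field allowlists: each pattern describes what is permitted,
# and any value that does not match in full is rejected.
my %allowed = (
    username => qr/\A[A-Za-z0-9_]{1,32}\z/,
    quantity => qr/\A[0-9]{1,4}\z/,
    email    => qr/\A[A-Za-z0-9._%+-]+\@[A-Za-z0-9.-]+\.[A-Za-z]{2,}\z/,
    comment  => qr/\A[A-Za-z0-9 .,!?'()\n-]{0,500}\z/,
);

# Returns the value if it matches the field's allowlist, or undef otherwise.
sub check_field {
    my ($field, $value) = @_;
    my $pattern = $allowed{$field} or die "no allowlist for field '$field'";
    return undef unless defined $value;
    return $value =~ $pattern ? $value : undef;
}

for my $case ( [ username => 'rpike_42' ], [ username => 'rpike; rm -rf /' ] ) {
    my ($field, $value) = @$case;
    my $ok = check_field($field, $value);
    printf "%-8s %-20s %s\n", $field, "'$value'", defined $ok ? 'accepted' : 'rejected';
}
```

    Rejecting with an error, rather than silently stripping characters, keeps the user informed and avoids turning one bad string into a different, unexpected one.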

Re: CGI - hazardous characters
by ikegami (Pope) on Jul 20, 2010 at 01:23 UTC
    No, CGI doesn't remove anything. It couldn't possibly know what would constitute a bad character pattern, because that varies by context. For example, the field I'm typing into right now expects a large HTML subset, so "<" is not a bad character for this field, but it wouldn't be acceptable for a person's name.
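    Because "bad" depends on where the value ends up, one common way to act on this is to keep the submitted value as-is and escape it for the specific output context. The sketch below hand-rolls a small HTML escaper so it runs with core Perl only; the name escape_html is made up for this example, and in real code a maintained routine such as HTML::Entities' encode_entities would be the more usual choice.

```perl
use strict;
use warnings;

# Minimal HTML escaper for text placed in element content or quoted attributes.
# Hypothetical helper for illustration only.
sub escape_html {
    my ($text) = @_;
    my %entity = (
        '&' => '&amp;',
        '<' => '&lt;',
        '>' => '&gt;',
        '"' => '&quot;',
        "'" => '&#39;',
    );
    $text =~ s/([&<>"'])/$entity{$1}/g;
    return $text;
}

# A "name" that would be fine in a plain-text context but not in HTML output.
my $name = q{<script>alert("hi")</script>};
print "<p>Hello, ", escape_html($name), "</p>\n";
```

    The same stored value would need different handling in other contexts (SQL placeholders, shell arguments, URLs), which is why no single filter applied at input time can cover every case.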
Re: CGI - hazardous characters
by Khen1950fx (Canon) on Jul 20, 2010 at 00:21 UTC
    It all depends on what your definition of "special characters" is. Punctuation? There are some regex examples here. Since you're working with CGI, I'd recommend WWW::Form.
Re: CGI - hazardous characters
by ahmad (Hermit) on Jul 19, 2010 at 20:56 UTC

    You'll have to do the checking yourself, or use a form validator module (Search CPAN for the module that suits your needs)

Re: CGI - hazardous characters
by bradcathey (Prior) on Jul 20, 2010 at 02:19 UTC

    Not sure exactly what you are looking for, but here's some Perl that grabs the form value and then tests it for unwanted characters and untaints in the same step. I have a bunch of validation methods depending on what I'm testing for.

    Calling script:

    ($sql{'name'}, $error) = $self->val_text( 1, 64, $self->query->param('name') );
    if ( $error->{msg} ) {
        push @error_list, { "name" => $error->{msg} };
    }

    Validation script

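    use HTML::TagFilter;    # CPAN module

    sub val_alphanum {
        my $self = shift;
        my ($mand, $len, $value) = @_;

        if (!$value && $mand) {
            return (undef, { msg => 'cannot be blank' });
        }
        elsif ($len && (length($value) > $len) ) {
            return (undef, { msg => 'is limited to '.$len.' characters.' });
        }
        elsif ($value && $value !~ /^(\w*)$/) {
            return (undef, { msg => 'can only use letters, numbers and _' });
        }
        else {
            # An empty optional value never reaches the regex above, so there
            # is no fresh capture in $1 to return.
            return ($value) unless $value;

            # $1 holds the matched (and therefore untainted) value.
            my $tf = HTML::TagFilter->new;
            return ($tf->filter($1));
        }
    }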

    I've put a lot of work into figuring out this CGI stuff; you can see more complete examples at Using Perl, jQuery, and JSON for Web development and A Tutorial for CGI::Application.

    "The important work of moving the world forward does not wait to be done by perfect men." George Eliot
Re: CGI - hazardous characters
by astroboy (Chaplain) on Jul 20, 2010 at 03:40 UTC
    I'm not sure from your question what you're after. Maybe HTML::Strip or HTML::Scrubber? I use the latter to remove some HTML tags and not others, while also removing any JavaScript.
Re: CGI - hazardous characters
by rpike (Scribe) on Jul 20, 2010 at 13:38 UTC
    Thanks guys. A lot of great suggestions and info. I'll try them in sequence and maybe in combination, and hopefully I'll have good end results (I'm guessing I will). :-)
Re: CGI - hazardous characters
by pemungkah (Priest) on Jul 20, 2010 at 21:49 UTC
    I can highly recommend testing your code with the sample XSS attacks from - you will find a lot of potential cross-site scripting attacks that way.

    Also, avoid dumping raw text from user input into comments: all the user has to do is figure out you're doing that and preface any of the XSS exploits on the page with '-->' to close your comment early...
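    If user text really must appear inside an HTML comment, one possible mitigation is to break up every run of hyphens so the text can never contain "--" or "-->". This is only a sketch under that assumption; comment_safe is a made-up helper name, and leaving user input out of comments altogether is the simpler option.

```perl
use strict;
use warnings;

# Hypothetical helper: make text safe to embed inside <!-- ... --> by
# separating adjacent hyphens, so the result never contains "--".
sub comment_safe {
    my ($text) = @_;
    $text =~ s/-(?=-)/- /g;   # "--" becomes "- -"
    $text =~ s/-\z/- /;       # a trailing "-" must not join the closing "-->"
    return $text;
}

my $input = '--> <script>alert(1)</script>';
print "<!-- debug: ", comment_safe($input), " -->\n";
```

    With the hyphens separated, the script tag stays inside the comment instead of being parsed as markup.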
