
Form Security

by virtualweb (Sexton)
on Jun 10, 2009 at 00:10 UTC
virtualweb has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks:

I would like to add some security to my forms.

I found a few regexes that might help, like:

$string =~ s/\</\&lt\;/g;
$string =~ s/\>/\&gt\;/g;
$string =~ s/[\"\'\}\{\)\(\+]//g;
$string =~ s/<!(?:--[\s\S]*?--\s*)?>\s*//g;
$string =~ s/[\~\^]//g;
$string =~ s/~!/ ~!/g;
$string =~ s/<*(javascript)[^>]+>//gi;
$string =~ s/(<[\s\/]*)(script\b[^>]*>)/$1x$2/gi;
$string =~ s/<*(iframe)[^>]+>//gi;
$string =~ s/<*(script)[^>]+>//gi;

Except I have a trillion different form field names in different forms all over my server.

Is there a way to loop over all field names generically and test them, rather than specifying each field name? I'm thinking of adding a routine to a separate library (let's say security.lib) and just adding require "security.lib"; to the forms I want to secure.

I'm using the following syntax to obtain input:

use CGI;
$q = new CGI;
$string = $q->param('string');

Thanks for your help




ikegami: Thank you for your input. As I said, I found these regexes; I have no idea if they are properly written or if they help at all. I haven't used them, I only listed them here as an example of what I'm trying to do. If you know better regexes to keep possible intruders from doing any damage, please suggest them.

hangon: Thank you for that loop; you are the one who best understood what I'm trying to do. I will do some testing on your snippet.

Your Mother: Thank you for suggesting HTML::Scrubber and HTML::Strip. My concern is not that people may input HTML tags, but malicious code that may delete or steal password files, download the CGI code that makes up my script, change folder names, run shell commands, etc. At the very least I think I should use a blacklist so people won't be able to enter commands like system, exec, open, eval, rand, etc.

leocharre: Thanks for the suggestion of printing out the CGI documentation. If you know where to find some that deals with form security, I promise to read it.

Replies are listed 'Best First'.
Re: Form Security
by ikegami (Pope) on Jun 10, 2009 at 00:14 UTC
    $string =~ s/\</\&lt\;/g;
    $string =~ s/\>/\&gt\;/g;
    $string =~ s/[\"\'\}\{\)\(\+]//g;                  # Why???
    $string =~ s/<!(?:--[\s\S]*?--\s*)?>\s*//g;        # Never matches
    $string =~ s/[\~\^]//g;                            # Why???
    $string =~ s/~!/ ~!/g;                             # Why???
    $string =~ s/<*(javascript)[^>]+>//gi;             # Never matches
    $string =~ s/(<[\s\/]*)(script\b[^>]*>)/$1x$2/gi;  # Never matches
    $string =~ s/<*(iframe)[^>]+>//gi;                 # Never matches
    $string =~ s/<*(script)[^>]+>//gi;                 # Never matches

    Those that never match don't match because < and > have been replaced. I didn't look at how useful they are by themselves.
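    A minimal sketch of that ordering problem, using a made-up input string and only two of the patterns from the question: once the angle brackets are escaped, a later pattern that needs a literal > has nothing to match.

```perl
use strict;
use warnings;

# Made-up input for illustration only.
my $string = '<script>alert(1)</script>';

# Step 1: the first two substitutions escape the angle brackets.
$string =~ s/</&lt;/g;
$string =~ s/>/&gt;/g;

# Step 2: this later pattern requires a literal ">", and none is left.
my $count = () = $string =~ /<*(script)[^>]+>/gi;

print "after escaping: $string\n";
print "script-tag matches: $count\n";
```

    Running the tag-stripping patterns before the escaping step would at least give them a chance to match; whether they are useful at that point is a separate question.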

    You escape only two of the three HTML chars that differentiate text from HTML (<, > and &).
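    As a rough sketch, escaping all three characters could look like the following. The helper name escape_html is hypothetical, not something from the question, and it only covers text placed between tags. The ampersand has to be handled first, otherwise the & in the freshly produced &lt; and &gt; would be escaped again.

```perl
use strict;
use warnings;

# Hypothetical helper: escape the three characters that separate
# text from HTML. "&" goes first so that the "&" in "&lt;" and
# "&gt;" is not escaped a second time.
sub escape_html {
    my ($text) = @_;
    $text =~ s/&/&amp;/g;
    $text =~ s/</&lt;/g;
    $text =~ s/>/&gt;/g;
    return $text;
}

print escape_html('if (a < b && b > c)'), "\n";
# prints: if (a &lt; b &amp;&amp; b &gt; c)
```

    If installing a CPAN module is an option, HTML::Entities provides encode_entities for the same job.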

    You convert text to HTML, but you remove all formatting (by not adding <pre> or some alternative). It may make for unreadable text.

    \Y\o\u \a\l\s\o \h\a\v\e \w\a\a\a\a\a\y \t\o\o \m\a\n\y \"\\\"\!

Re: Form Security
by Your Mother (Bishop) on Jun 10, 2009 at 02:45 UTC

    You should look closely at HTML::Scrubber and HTML::Strip. Solving this with regexes yourself is difficult and error-prone. It's also fairly easy to build your own on the back of something like XML::LibXML if you want to. Here's something to play with:

    use warnings;
    use strict;
    use XML::LibXML;

    my @strip = @ARGV;
    @strip || die "Give a list of tags to strip.\n";

    my $parser = XML::LibXML->new();
    $parser->recover(1);
    $parser->keep_blanks(1);
    $parser->line_numbers(1);

    my $raw = join '', <DATA>;
    my $doc = $parser->parse_html_string($raw);
    my $root = $doc->documentElement();

    for my $strip ( @strip ) {
        for my $node ( $root->findnodes("//$strip") ) {
            my $fragment = $doc->createDocumentFragment();
            $fragment->appendChild($_) for $node->childNodes;
            $node->replaceNode($fragment);
        }
    }

    # entire HTML doc: print $doc->serialize(1);
    print $_->serialize(1) for $doc->findnodes("//body/*");

    __END__
    <div>
      <h1>Bang!<sup>1</sup></h1>
      <p>Did <i>italic</i> and <a href="/uri">link with <b>bold</b> inside it</a>.</p>
      <script type="whatever/whatnot">doSomethingTerrible()</script>
      <a href="/top-level">naked link</a>
      <p><i>The</i> <b>content</b> of the body <sup>element</sup> is displayed in your <span>browser</span>.</p>
    </div>

    The nice thing about this snippet is that it only removes the disallowed tags, not the content within the tags.

Re: Form Security
by hangon (Deacon) on Jun 10, 2009 at 02:10 UTC

    Calling $q->param with no arguments will give you a list of all field names from the CGI object. Something like the following untested code should apply your regexes to each field value.

    my @fields = $q->param;
    for my $name ( @fields ){
        my @vals = $q->param($name);
        for my $value ( @vals ){
            # apply regexes here
            $value =~ s/...
            $value =~ s/...
            $value =~ s/...
        }
        # CGI.pm only treats arguments as named when they start with a dash
        if (@vals > 1){
            $q->param(-name => $name, -values => [ @vals ]);
        }else{
            $q->param(-name => $name, -value => $vals[0] );
        }
    }
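    The same catch-all idea can be sketched without a CGI object, which makes it easy to try out on its own. Everything here is an assumption for illustration: clean_fields is a hypothetical helper (the kind of routine that could live in the security.lib mentioned in the question), the field data is made up, and the filter is just a placeholder for whatever substitutions you settle on.

```perl
use strict;
use warnings;

# Hypothetical helper: run one filter over every value of every field.
# $fields maps a field name to an array ref of its values, roughly
# what calling $q->param($name) for each name would give you.
sub clean_fields {
    my ($fields, $filter) = @_;
    my %clean;
    for my $name (keys %$fields) {
        $clean{$name} = [ map { $filter->($_) } @{ $fields->{$name} } ];
    }
    return \%clean;
}

# Placeholder filter; swap in your own substitutions.
my $filter = sub {
    my ($value) = @_;
    $value =~ s/&/&amp;/g;
    $value =~ s/</&lt;/g;
    $value =~ s/>/&gt;/g;
    return $value;
};

# Made-up form data: one single-valued and one multi-valued field.
my $clean = clean_fields(
    { name => ['<b>Bob</b>'], colors => ['red', 'a&b'] },
    $filter,
);

print "$_: @{ $clean->{$_} }\n" for sort keys %$clean;
```

    Passing the filter in as a code reference keeps the loop independent of any particular set of regexes, so each form can reuse the loop with its own rules.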

    Update: Applied correction and clarification per post below. Thanks Mom, I'll try not to be in such a hurry next time. ;-}

      Quick clarification: there is no param object. It's just a method (or a function, depending on how it's called). And if you're already skipping the lexical variable in the loop:

      $_ =~ s/...
      # is equivalent to
      s/...
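
      A small sketch of that equivalence, with made-up values: in a loop without a lexical variable each element is aliased to $_, and s/// works on $_ when no target is given, so both forms change the array in place.

```perl
use strict;
use warnings;

# Made-up values for illustration.
my @vals = ('a<b', 'c<d');
my @more = ('e<f');

# Explicit form: bind the substitution to $_.
$_ =~ s/</&lt;/g for @vals;

# Implicit form: with no target given, s/// operates on $_.
s/</&lt;/g for @more;

print "@vals @more\n";
# prints: a&lt;b c&lt;d e&lt;f
```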
Re: Form Security
by leocharre (Priest) on Jun 10, 2009 at 14:01 UTC

    You are using CGI.

    I understand the excitement about finding a cool module such as PDF::API2, or CGI::Application, and you just want to use it.. skim the docs..

    But here's the thing. Print out the CGI documentation. Read it. No, really, read it. Don't memorize it, just read it. Maybe staple the sheets in a corner and put it under your bed when you sleep at night.

    In a week or a month when you're trying to solve a problem.. Your brain will knock on your door and whisper.. Hey.. you know what.. I think I may have seen something about that in such and such docs, yeah.. remember how at the time it seemed useless and we giggled to each other about "who the heck would use such a weird method call?" - well.. here we are..
