It's a way of (hopefully) stopping you from making silly mistakes.
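Every piece of data that comes into the script from outside (form parameters, environment variables, file input and so on) is considered tainted, and can't be used to affect anything outside the script, until you explicitly extract it with a regular expression capture (I think; there may be other ways to untaint, though).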
Why is this useful? Let's say you had a script that uploaded a domain from a web page and you wanted to ping that domain.
use CGI;
my $q = CGI->new();
my $domain = $q->param('domain');
my $result = `ping $domain`;    # refused under -T: $domain came straight from the request
Under taint, this would die because you're trying to pass tainted data to an external program. Imagine what would happen if some malicious user uploaded "localhost; rm -rf /" as the domain name!
So, under taint, you would need to explicitly grab the domain from the variable:
my $domain='';
$q->param('domain') =~ /^([a-zA-Z0-9\.]+)$/
and $domain = $1;
That's just a rough expression to grab the domain. The point is that you know that there won't be anything malicious in $domain when it's assigned.
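If you want something a bit tighter than the rough expression above, here is a minimal sketch of the same idea wrapped in a sub. The name untaint_domain is made up for illustration, and the pattern is still my own assumption about what a "safe enough" hostname looks like: it adds the hyphen, insists on a leading letter or digit so nothing can be mistaken for a command-line switch, and uses \z instead of $ so a trailing newline is refused.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of CGI.pm): returns the captured, and
# therefore untainted, domain, or undef if the input looks suspicious.
sub untaint_domain {
    my ($input) = @_;
    return unless defined $input;

    # Letters, digits, dots and hyphens only, starting with a letter or
    # digit. \A and \z anchor to the true start and end of the string.
    if ( $input =~ /\A([a-zA-Z0-9][a-zA-Z0-9.-]*)\z/ ) {
        return $1;    # a regex capture is what removes the taint
    }
    return;
}

for my $candidate ( 'example.com', 'localhost; rm -rf /' ) {
    my $domain = untaint_domain($candidate);
    print defined $domain
        ? "ok: $domain\n"
        : "rejected: $candidate\n";
}
```

In a real script you'd pass it $q->param('domain') and bail out if you get undef back, rather than quietly carrying on with an empty string.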
But, untainting data in itself does not protect you. You could, if you wished, untaint it like this:
$q->param('domain') =~ /^(.+)$/
and $domain = $1;
but you won't have added to your security understanding if you do :) There are times though, when you don't care what a value contains and, in those instances, it would be perfectly acceptable to untaint like that. Just as long as you know for sure!
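To see for yourself that the catch-all pattern clears the taint flag without checking anything, here's a small sketch using tainted() from Scalar::Util. The variable names and the 'example.com' fallback are just placeholders of mine. Run it as perl -T script.pl 'some value' to see the difference; without -T nothing is ever reported as tainted.

```perl
use strict;
use warnings;
use Scalar::Util qw(tainted);

# Command-line arguments are tainted under -T, much like CGI parameters.
my $raw = defined $ARGV[0] ? $ARGV[0] : 'example.com';

# The "accept anything" untaint: the capture is clean as far as Perl
# is concerned, but it has not been validated in any way.
my ($clean) = $raw =~ /\A(.+)\z/s;

printf "taint mode:    %s\n", ${^TAINT}       ? 'on'  : 'off';
printf "raw tainted:   %s\n", tainted($raw)   ? 'yes' : 'no';
printf "clean tainted: %s\n", tainted($clean) ? 'yes' : 'no';
```

Note that $clean holds exactly the same string as $raw, nasty characters and all. All that changed is that Perl has stopped watching it.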
I wrote a little article on it here if you're interested.
.02
cLive ;-)