These have all been mentioned numerous times, but many programmers still don't
understand the risks of using power tools. Because I think everyone knows why
it is important to program with security in mind, I'll just begin without any
further introduction.
Know your environment
Perl is written in C and doesn't prevent you from shooting yourself in the
foot. This has some very dangerous implications that not every Perl programmer
is aware of. Even though the language you use is Perl, you still need some
basic knowledge of C to create secure programs.
The platform perl runs on is also important. Linux asks for different security
measures than Win32. Even the filesystem that is used can be important: are
FooBar and foobar the same file, or not? If you program for
one specific platform, make it work only on that one.
Read documentation!
Your program will be used by others
Or maybe not. But always assume the worst. If not because use by others is a
probable future, then to keep yourself focussed on the important issues.
Security comes first
Your boss may tell you that the first priority is that everything works, but it
is your job to tell him he's wrong. If something is wrong, make sure the
program dies before more goes wrong. It's better to have a program that does
nothing at all than to have a program that does everything that is expected
from it and provides a backdoor for evil-doers as a free bonus feature.
Examples
Encode and escape!
Knowing your environment includes knowing all the protocols and file formats
used. If you are a web programmer, then you should know HTTP, HTML and probably
CSS and JavaScript too.
Of every string that comes in, you should know the character set and, more
importantly, its encoding. Before doing anything with the data, it's best to
convert it to Perl's internal format.
For example, if your input is %-encoded UTF-8, use:
use Encode qw(decode);
use URI::Escape qw(uri_unescape);
my $string = decode 'utf-8' => uri_unescape $input;
If you don't know exactly how to unescape the incoming data, use a module, like
CGI (or its faster equivalent, CGI::Simple) and let that
handle it for you.
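The reverse is also true. Before outputting a string, make sure it is in the
right format. For example, to output the $string we just decoded in a
UTF-8 encoded HTML document, we can use:
use Encode qw(encode);
use HTML::Entities qw(encode_entities);
# Escape the characters first, then encode to octets. Doing it the other way
# round would turn each octet of a multi-byte character into its own entity.
my $output = encode 'utf-8' => encode_entities($string, q{<>&"'});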
So even though input and output are utf-8, we still explicitly decode and
encode it from and to Perl's internal format. If the output was part of a URL,
we'd also unescape and then re-escape the data. This is to make sure no strange
octet (byte) slips through. Another benefit is that in between, you have a
string that normal Perl functions can manipulate without needing to have
special facilities to handle a certain character set or encoding.
(Note: Perl's internal format happens to also be utf-8, but you should never
assume this. Always explicitly decode and encode!)
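A minimal sketch of the decode-early, encode-late round trip, using only the core Encode module. The helper name shout_utf8 and the sample octets are made up for illustration; they are not part of any module.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper: takes UTF-8 octets and returns UTF-8 octets,
# but does all string work on decoded characters in between.
sub shout_utf8 {
    my ($octets) = @_;
    # FB_CROAK makes decode die on malformed input instead of guessing;
    # LEAVE_SRC keeps decode from modifying $octets in place.
    my $chars = decode('UTF-8', $octets, Encode::FB_CROAK | Encode::LEAVE_SRC);
    return encode('UTF-8', uc $chars);
}

my $octets = "caf\xc3\xa9";              # "cafe" with e-acute, as UTF-8 octets
my $chars  = decode('UTF-8', $octets);

print "octets: ", length($octets), "\n"; # length counts octets here
print "chars:  ", length($chars),  "\n"; # and characters here
```

Because uc runs on the decoded string, it sees one accented character rather than two unrelated octets.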
If you don't escape properly, your program is prone to injection attacks. These
include, but are not limited to:
- SQL injection
- HTML/JavaScript injection (cross-site scripting, XSS)
- open injection (to avoid, use 3-arg open, not 2-arg)
- Shell command line injection
- SMTP injection (don't let others abuse your machine as a spam gateway!)
Every output format requires its own escaping. Even better than escaping data,
though, is preventing interpolation when possible by using placeholders
(DBI) or the list variant of a function (system, exec, open). This
skips escaping and unescaping by using a more direct mode of communication.
If internally it is still implemented as escaping+unescaping
(DBD::mysql), at least you know knowledgeable people take care of it.
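As a rough illustration of the list variant, here is a sketch that passes an untrusted string to a child process through the list form of open. The helper name echo_via_child is invented for this example, the child is simply the running perl ($^X), and a Unix-like system is assumed. Placeholders in DBI follow the same principle but are not shown here.

```perl
use strict;
use warnings;

# Hypothetical helper: run a child process with one untrusted argument.
# The list form of open never invokes a shell, so shell metacharacters
# in $arg reach the child as plain data.
sub echo_via_child {
    my ($arg) = @_;
    # $^X is the path of the running perl; the child only prints its argument.
    # The '--' keeps perl from reading $arg as one of its own switches.
    open(my $fh, '-|', $^X, '-e', 'print $ARGV[0]', '--', $arg)
        or die "cannot start child: $!";
    local $/;                     # slurp mode
    my $out = <$fh>;
    close($fh) or die "child failed: $?";
    return $out;
}

print echo_via_child('x; echo INJECTED'), "\n";
```

With a single-string command the same input would have been handed to the shell, which would treat the semicolon as a command separator.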
Null bytes are scary
Several control characters are scary, because they often have special meaning
in certain string formats, but the null byte is the most scary of all. In C, a
null byte (\0) indicates where a string ends. However, in Perl, it's
just a normal character. This has advantages and disadvantages. The
disadvantages are more important to be aware of. Many of Perl's functions are
implemented using C functions, and in general, you can (and SHOULD!) assume
they're not removing the null bytes for you.
Suppose you have written a CGI-script that does nothing more than display a
page from the current directory. Storing data in the working directory is often
a mistake in itself, but for this contrived example, let's ignore that.
#!/usr/bin/perl -w
# this is page.cgi
use strict;
use CGI::Simple;
use File::Slurp qw(read_file);
my $cgi = CGI::Simple->new;
my $page = $cgi->param('page');
die if $page =~ m[/]; # Disallow pages from other folders
print "Content-Type: text/html\n\n";
print read_file "$page.html";
You disallow anything that has a slash in it, and ".html" is used in the
read_file call, so only .html-files from the current directory can be read,
right?
Wrong. Just poisoning the data with a null byte is enough to evade the .html restriction. URI-encoded, a null byte is %00.
http://example.com/page.cgi?page=page.cgi%00blah!
The underlying function is a C function. It thinks the string ends where the
null byte is. So it opens page.cgi and ignores the "\0blah!.html" part. But
wasn't File::Slurp a pure Perl module? Yes, it is. But it uses sysopen
internally! Don't let the "sys" part fool you: open uses the same internal C
function.
Instead of reading through every module and Perl's source to find out what it
uses, just remove all null bytes unless you have a good reason to keep them
around. While you're at it, remove other control characters as well.
$string =~ tr/\x00-\x09\x0b\x0c\x0e-\x1f//d;
I skipped 0x0a and 0x0d because they are LF (line feed) and CR (carriage
return), used for line endings. Depending on the application you write, you may
need to spare more characters from removal, like horizontal and vertical tabs
and form feeds.
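A small sketch of that cleanup wrapped in a function, applied to the poisoned parameter from the example. The name strip_controls is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper wrapping the tr/// above: returns a copy of the
# string without C0 control characters, keeping only LF and CR.
sub strip_controls {
    my ($string) = @_;
    $string =~ tr/\x00-\x09\x0b\x0c\x0e-\x1f//d;
    return $string;
}

my $page = strip_controls("page.cgi\0blah!");
print "$page.html\n";   # the file name no longer hides a null byte
```

After stripping, the ".html" suffix is part of the name the C layer sees, so the request can no longer select page.cgi itself.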
Taint mode
A good way to make sure you test each string before using it externally is to
use Perl's taint mode. It is invoked with -T. The previous example would only
need one small change.
#!/usr/bin/perl -wT
...
my ($page) = $cgi->param('page') =~ /^(\w+)\z/ or die;
print "Content-Type: text/html\n\n";
print read_file "$page.html";
Note that you should NEVER blindly use . or [^...] in your untaint regex.
Whitelisting is much safer than blacklisting, and should have preference. For
example, ^(.+)\z and ^([^/.]+)\z still allow the dangerous null byte. I use \z
instead of $, because $ allows \n (newline) just before the end. Know your
tools, so learn to use regexes properly!
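The difference between these patterns can be checked directly. This is only a sketch; untaint_page is a hypothetical helper name and the sample strings are invented.

```perl
use strict;
use warnings;

# Hypothetical helper: return the validated page name, or undef.
sub untaint_page {
    my ($input) = @_;
    return undef unless defined $input;
    my ($page) = $input =~ /^(\w+)\z/;   # whitelist, anchored with \z
    return $page;
}

my $newline = "index\n";
my $null    = "page\0blah";

print "\$ lets a trailing newline through\n" if $newline =~ /^\w+$/;
print "[^/.] lets a null byte through\n"     if $null    =~ /^[^\/.]+\z/;
print "whitelist rejects both\n"
    if !defined(untaint_page($newline)) && !defined(untaint_page($null));
```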
Conclusion
Please add your own generic security-related advice below, preferably with
examples of how easy it is to get wrong. There is much more than I have just
mentioned. If you know relevant PM nodes or external URLs, link to them. Let's
have all the important information in one place.
But realise that knowing what you're doing, and thus reading documentation, is
much better than reading only about the risks involved.
Considered by demerphq: "Section titles are too big"
Unconsidered by davido: No consensus in vote: (keep/edit/delete) = (26/35/0)
Considered by kutsu: "Edit: Move to tutorials" Vote: 4/10/0
Unconsidered by davido: Juerd knows where to post tutorials. He chose to post this as a Meditation. Let's respect the author's decision. Juerd's reaction: the original idea was to consider it for a move, or to post a new node, after having received lots of additional sections. However, I expected much more response than I got. Making this node a tutorial in its current state may give some people the impression that all important security issues are discussed, which is far from true.
Re: Security techniques every programmer should know
by dws (Chancellor) on Dec 27, 2004 at 04:14 UTC
|
Re: Security techniques every programmer should know
by Aristotle (Chancellor) on Dec 27, 2004 at 03:10 UTC
|
Dealing with nuls, my preference would be to consider them an end-of-string marker.
s/\0.*//s;
After all, that's what the underlying system calls will get to see.
Makeshifts last the longest.
Re: Security techniques every programmer should know
by Jaap (Curate) on Dec 27, 2004 at 09:47 UTC
|
Instead of blacklisting with
$string =~ tr/\x00-\x09\x0b\x0c\x0e-\x1f//d;
one should whitelist, allowing certain characters and forbidding the rest:
if ($string =~ m/^([a-zA-Z0-9_]+)\z/)
{
    my $safeString = $1; ### also untainted now
}
Edit:
OK, you say that in the Taint part, but I would add it to the "Null bytes are scary" part.
|
$string =~ s/[^\w\s]+//g; ##add other allowed chars as needed
That will sanitize all strings to contain only letters, digits, the underscore and whitespace. A more complete regex (which would still not include unicode or international chars) would be:
$string =~ s/[^\w\s\!\@\#\$\%\^\&\*\(\)\\\`\~\-\+\=\,\.]+//g;
(Yes, there's more escaping there than strictly necessary.) Suddenly, that transliteration is looking a lot easier to maintain. If your allowed set is "everything but nulls and control chars", then you're better off explicitly excluding the known control-char set.
Denying all, then allowing is a good general rule of thumb. But, in this case, the "dangerous" items are a fixed set while the "safe" items are much more variable -- so it makes sense to simply remove that which is dangerous.
Update=> Aristotle reminded me that, as \s includes \n, these regexes will not strip newlines; that means strings sanitized with these will be unsafe if executed with a shell (e.g. system("$string");). This further shows that inclusion-matching isn't as good, in this case, as merely stripping "bad" data out.
Anima Legato .oO all things connect through the motion of the mind
|
\w matches different things depending on your locale. If you have a German locale, for instance, it will match ß.
The danger of using perl's shortcut character classes, as was pointed out to me by DrHyde.
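A quick sketch of the difference between \w and an explicit ASCII class. The helper name is_ascii_word is hypothetical, and a Greek letter is used here instead of ß because it is a word character under Unicode rules regardless of locale. On Perl 5.14 and later, the /a modifier is another way to restrict \w to ASCII.

```perl
use strict;
use warnings;

# Hypothetical helper: true only for non-empty strings made of
# ASCII letters, digits and underscores.
sub is_ascii_word {
    my ($string) = @_;
    return $string =~ /^[A-Za-z0-9_]+\z/ ? 1 : 0;
}

my $alpha = "\x{3b1}";   # GREEK SMALL LETTER ALPHA, a non-ASCII word character

print "\\w accepts it\n"             if $alpha =~ /^\w+\z/;
print "explicit class rejects it\n"  unless is_ascii_word($alpha);
```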
"Cogito cogito ergo cogito sum - I think that I think, therefore I think that I am." Ambrose Bierce
Re: Security techniques every programmer should know
by Anonymous Monk on Dec 27, 2004 at 09:12 UTC
|
Re: Security techniques every programmer should know
by ihb (Deacon) on Dec 27, 2004 at 23:45 UTC
|
Taint mode does not help against null bytes (or any other bytes) in your read_file "$page.html" example. Reads are not checked for tainted data. Writes are though, so write_file "$page.html" would've been stopped by the -T switch.
In short, I'd like to add this: Don't think -T will do the job for you! Just think it may help you if you slipped up.
ihb
See perltoc if you don't know which perldoc to read!
Read argumentation in its context!
Re: Security techniques every programmer should know
by eyepopslikeamosquito (Chancellor) on Dec 29, 2004 at 02:06 UTC
|
Writing secure programs. Wow, that's a huge topic.
Where to start? :-)
I suppose with some basic Perl references.
The Camel: Chapter 23, "Security", provides an excellent (and much more detailed than perlsec) overview of fundamental Perl security issues. This chapter is broken into: Handling Insecure Data, Cleaning Up Your Environment, Accessing Commands and Files Under Reduced Privileges, Handling Timing Glitches (Unix Kernel Security Bugs, Race Conditions, Temporary Files), Handling Insecure Code (Safe module, Code Masquerading as Data).
The Perl Cookbook has recipes: 8.17 (Testing a File for Trustworthiness), 19.4 (Writing a Safe CGI Program), 19.5 (Executing Commands Without Shell Escapes).
Can anyone comment on how safe the Safe module is?
Sorry, I've not used it, though it is described in the Camel. Update: apparently it's not safe, according to the node "Safe.pm considered unsafe?".
The venerable suidperl has apparently had all known insecurities
plugged by Paul Szabo in Perl 5.8.4. However,
"For new projects the core perl team would strongly recommend that you use
dedicated, single purpose security tools such as sudo in preference to
suidperl" (perl584delta).
Which leads me to an important general piece of security advice (simplifying outrageously):
Keep up-to-date with the latest version of perl. Well, that's a bit over the top; keep an eye on security alerts and perldelta security bug fixes and upgrade your perl judiciously.
Apart from Paul's heroic suidperl fixes, security bugs are being
squashed all the time.
For example, perl 5.8 introduced hash randomisation and ensured
that sort never goes O(n-squared).
Despite these two important denial-of-service (DoS) improvements,
Perl regular expressions remain a concern for DoS attacks,
it being easy to write (and hard to detect) a regular expression that finishes
after the heat death of the universe.
Re: Security techniques every programmer should know
by fizbin (Chaplain) on Dec 30, 2004 at 15:41 UTC
|
This is more secure shell programming than secure perl programming, per se, but when passing arguments to an external command, in addition to the advice above about general control-character cleaning and proper escaping, be wary of cases where the passed argument might be interpreted as an option. For example, consider this code that might be part of a man2html gateway:
# $page and $section are parameters from the user that have been
# cleaned of 0 bytes and obvious control characters
my $mantext = ''; my $status;
my $pid = open(KID_STDOUT, "-|");
if (not defined $pid) {
    die "cannot fork: $!; bailing out";
}
if ($pid) { ## parent
    while (<KID_STDOUT>) { $mantext .= $_; }
    close(KID_STDOUT); # reaps the child; $? is only set after this
    $status = $?;
} else { ## child
    close(STDIN);
    open(STDERR, '>&STDOUT');
    if ($section) { exec('/usr/bin/man', $section, $page); }
    else          { exec('/usr/bin/man', $page); }
}
# now reformat $mantext and display it.
Now, there are some nice security plusses in this code - the use of the many-arg form of exec, for example, avoids a whole host of shell-escaping issues. However, this gives a potential attacker shell access on any system whose man command allows the -P option. (quid vide) All an attacker needs to do is pass in
section=-P/usr/bin/whatever%20command%20I%20want&page=cat
as part of the url, and their command will be executed. (And fed the "cat" manpage as input, but that's immaterial)
The general lesson here is that options change the behavior of external commands in ways you don't expect; don't allow the user to send options to external commands. Fortunately, with almost every unix command passing a '--' will prevent subsequent arguments from being interpreted as options, so a fixed version of the above code could read:
# $page and $section are parameters from the user that have been
# cleaned of 0 bytes and obvious control characters
my $mantext = ''; my $status;
my $pid = open(KID_STDOUT, "-|");
if (not defined $pid) {
    die "cannot fork: $!; bailing out";
}
if ($pid) { ## parent
    while (<KID_STDOUT>) { $mantext .= $_; }
    close(KID_STDOUT); # reaps the child; $? is only set after this
    $status = $?;
} else { ## child
    close(STDIN);
    open(STDERR, '>&STDOUT');
    # '--' ends option processing, so user input is never read as an option
    if ($section) { exec('/usr/bin/man', '--', $section, $page); }
    else          { exec('/usr/bin/man', '--', $page); }
}
# now reformat $mantext and display it.
As an aside, note that the following code contains the same hole as the initial code:
my $qpage = quotemeta($page);
my $qsect = quotemeta($section || '');
exec("/usr/bin/man $qsect $qpage");
The issue is not shell escaping - the issue is that when calling external commands, be aware that many commands use arguments beginning with "-" to mean "radically alter your behavior in some fashion". This leads to behavior you can't predict ahead of time, which means that guarding against it is almost impossible if you allow options to be passed along.
Note that on an MS Windows platform (and, I suppose, on VMS too) some external commands may treat arguments beginning with '/' as options. Unfortunately, I don't know of any standard way to prevent that, comparable to the '--' common on Unix; on those platforms you'll just have to be careful to strip leading / characters in cases where the variables are being used in a way that could pass unwanted options to an external command.
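Where '--' is not available, one possible fallback is to refuse option-like values outright. The sketch below is an assumption about how such a check might look, not a standard facility; looks_like_option is a made-up name, and on Unix a leading '/' is often a legitimate path, so the character class would need adjusting per platform.

```perl
use strict;
use warnings;

# Hypothetical check: flag user-supplied values that an external
# command could read as an option ("-x" on Unix, "/x" on Windows).
sub looks_like_option {
    my ($arg) = @_;
    return $arg =~ m{^[-/]} ? 1 : 0;
}

for my $arg ('cat', '-P/usr/bin/id', '/all') {
    printf "%-15s %s\n", $arg, looks_like_option($arg) ? 'rejected' : 'ok';
}
```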
--
@/=map{[/./g]}qw/.h_nJ Xapou cets krht ele_ r_ra/;
map{y/X_/\n /;print}map{pop@$_}@/for@/
Re: Security techniques every programmer should know
by Juerd (Abbot) on Dec 28, 2004 at 22:28 UTC
|