agn has asked for the wisdom of the Perl Monks concerning the following question:
Hi,
Am new to perlmonks.org and to Perl. Currently going through 'Intermediate Perl'. To keep me motivated I started a project - a simple/basic web server. I know that re-inventing the wheel is not a good idea but I figured it could be a good learning process. So please give me suggestions on how to improve my code. It's not complete yet, but I want to know if am in the right direction.
Here is the code.
PS: Am posting a link as am not sure if its allowed to post around 100+ lines of code in a post.
#!/usr/local/bin/perl -w
#Caution: Big mess ahead
#TODO:
# get .. working in dir listing
# implement Getopt::Long
# usage() function
# log to syslog
# daemonize
# persistent connections
use strict;
use IO::Socket;
my (@files, $req, $client);
my $DOCROOT = '/home/arun/site/';
my %error_page = (
403 => $DOCROOT.'403.html', # forbidden
404 => $DOCROOT.'404.html', # not found
406 => $DOCROOT.'406.html', # not acceptable
501 => $DOCROOT.'501.html' # not implemented
);
$SIG{'INT'} = \&cleanup;
my $socket = new IO::Socket::INET (
LocalAddr => '172.17.1.50',
LocalPort => (shift || 4321),
Proto => 'tcp',
Listen => 5,
ReuseAddr => 1
) or die "$! \n";
$socket->listen();
&log("Listening on ".$socket->sockhost().":".$socket->sockport."\n");
while ($client = $socket->accept()) {
&log("Connection from ".$client->peerhost().":".$client->peerport(
+)."\n");
# get http request - first line
$req = <$client>;
&log($client->peerhost()." ".$req);
&respond_to( &handle_req($req) );
close $client;
}
sub cleanup { close $socket; die "Interrupted. Exiting...\n"; }
sub log {
my $msg = shift;
print scalar localtime," ", $msg;
}
sub getfiles {
my $dir = shift;
opendir DIR, $dir or die "open:$!\n";
# remove . and .. from list of files
@files = grep { !/^\.(\.)?$/ } readdir DIR;
closedir DIR;
return \@files;
}
sub handle_req {
my ($method, $uri) = split / +/, shift;
if ($method !~ /^GET/) {
&log("501 Not Implemented\nr");
return 501;
}
$uri =~ s/\/(.*)/$1/; # strip the f
+irst slash
if (-e $DOCROOT.$uri) {
if (-f $DOCROOT.$uri) {
&log("200 HTTP OK\n");
return 200;
} elsif (-d $DOCROOT.$uri) {
&log("200 HTTP OK\n");
return 200;
} else {
&log("406 Not Acceptable\n");
return 406;
}
} else {
&log("404 Not Found\n");
return 404;
}
}
sub respond_to {
my $status_code = shift;
unless ($status_code == 200) {
&display($error_page{$status_code}) if (-f $error_page{$status
+_code});
return;
}
my $uri = (split / +/, $req)[1];
$uri =~ s/\/(.*)/$1/;
my $path = $DOCROOT.$uri;
if (-f $path) {
&display($path) ;
return;
}
if (-d $path) {
if (-f $path.'index.html') {
&display($path.'index.html');
} else {
&gen_dir_list($uri, &getfiles($path));
}
}
}
sub display {
my $file = shift;
open RES, $file or die "open: $file: $!";
&log("Sending $file\n");
print $client $_ while (<RES>);
close RES;
}
sub gen_dir_list {
my ($uri, $files) = @_;
&log("[info] dir listing request\n");
# print html header
print $client <<HEADER;
<html>
<head><title>dir listing for: /$uri</title></head>
<body>
<table cellpadding=5>
HEADER
my ($count, $modification_time);
foreach my $f (@$files) {
# open $f to get its modification time
if (-f $DOCROOT.$uri.'/'.$f) {
open my $handle, $DOCROOT.$uri.'/'.$f or &log("open: $!\n"
+);
$modification_time = scalar localtime((stat $handle)[9]);
} else {
opendir my $handle, $DOCROOT.$uri.'/'.$f or &log("opendir:
+ $!\n");
$modification_time = scalar localtime((stat $handle)[9]);
}
printf $client "%s<td><a href=\"%s\">%s</a></td><td>%s</td></t
+r>",
# different colours for alternate rows
(++$count % 2 ? '<tr bgcolor="#e0ffd6">' : '<tr bgcolo
+r="#ffdcd6">'),
# generate href links
(-d $DOCROOT.$uri.'/'.$f ? '/'.$uri.$f.'/' : '/'.$
+uri.$f),
# append a '/' to the end of dir names
(-d $DOCROOT.$uri.'/'.$f ? $f.'/' : $f),
$modification_time;
}
#print html footer
print $client <<FOOTER;
</table>
<p>-- httpserv v0.1 --</p>
</body>
</html>
FOOTER
}
Re: how to improve my script
by betterworld (Curate) on Jun 15, 2009 at 13:27 UTC
|
From looking at your code, it appears that it is vulnerable to directory traversal attacks. If someone requests URLs like "/../../../../../../etc/passwd", you happily deliver the system's user database.
Also I'd guess that you'll get a lot of warnings if a client terminates the connection without sending a request.
However I may be wrong; I did not run your code, I only had a look at it.
I realize that you are doing this only as an exercise, but I'd strongly recommend thinking about malicious input when writing networking software.
| [reply] |
|
I'm missing taint mode and any kind of input validation. I see code written in Perl4 style (&log(...)), I see repeated (hidden) stat calls (-e $DOCROOT.$uri, then -d $DOCROOT.$uri, then -f $DOCROOT.$uri), I see open where stat is sufficient (# open $f to get its modification time). And I'm sure perlcritic would find a lot more than this.
Alexander
--
Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
| [reply] [d/l] [select] |
Re: how to improve my script
by rovf (Priest) on Jun 15, 2009 at 13:04 UTC
|
I suggest that you also test your code against http://perlcritic.com/, maybe with critic level set to "stern" for a start.
Also, I don't think it is a good idea to start your sub calls with &. Though it doesn't harm in your case, if you would call a sub having a prototype, you would bypass the conversion. Though I use & only if I really need to.
--
Ronald Fischer <ynnor@mm.st>
| [reply] [d/l] [select] |
Re: how to improve my script
by toolic (Bishop) on Jun 15, 2009 at 18:48 UTC
|
PS: Am posting a link as am not sure if its allowed to post around 100+ lines of code in a post.
It would be preferable in this case if you included the code in "readmore" tags (as described in Writeup Formatting Tips), rather than as a link. You may update your original post to do so.
The way you handle your @files variable is a little confusing to me. You declare it in your main code as my @files;, and you fill it your getfiles sub.
But then that sub does something unexpected by returning it as a reference to an array. You should decide whether you want to use it as an ordinary array or as an array reference, not both. I would be surprised if this does not give you problems.
| [reply] [d/l] [select] |
Re: how to improve my script
by agn (Initiate) on Jun 15, 2009 at 18:32 UTC
|
Thanks for the suggestions. Directory traversals with .. don't work. I checked it when I wrote the code. Patched up the code while keeping the other suggestions in mind.
Pasting the updated code:
#!/usr/local/bin/perl -w
#Caution: Big mess ahead
#TODO:
# get .. working in dir listing
# implement Getopt::Long
# usage() function
# log to syslog
# fork() children
# daemonize
# persistent connections
use strict;
use IO::Socket;
my ($socket, @files, $req, $client);
my $DOCROOT = '/home/arun/docs/';
my %error_page = (
403 => $DOCROOT.'403.html', # forbidden
404 => $DOCROOT.'404.html', # not found
406 => $DOCROOT.'406.html', # not acceptable
501 => $DOCROOT.'501.html' # not implemented
);
#$SIG{'INT'} = \&cleanup;
$socket = new IO::Socket::INET (
LocalAddr => '127.0.0.1',
LocalPort => (shift || 4321),
Proto => 'tcp',
Listen => 5,
ReuseAddr => 1
) or die "$! \n";
$socket->listen();
&log("Listening on ".$socket->sockhost().":".$socket->sockport."\n");
while ($client = $socket->accept()) {
&log("Connection from ".$client->peerhost().":".$client->peerport(
+)."\n");
# get http request - first line
$req = <$client>;
&log($client->peerhost()." ".$req);
&respond_to( &handle_req($req) );
close $client;
}
sub cleanup { close $socket; die "Interrupted. Exiting...\n"; }
sub log {
my $msg = shift;
print scalar localtime," ", $msg;
}
sub getfiles {
my $dir = shift;
opendir DIR, $dir or die "open:$!\n";
# remove . and .. from list of files
@files = grep { !/^\.(\.)?$/ } readdir DIR;
closedir DIR;
return \@files;
}
sub handle_req {
my ($method, $uri) = split / +/, shift;
if ($method !~ /^GET/) {
&log("501 Not Implemented\nr");
return 501;
}
$uri =~ s/\/(.*)/$1/; # strip the fi
+rst slash
if (-e $DOCROOT.$uri) {
if (-f $DOCROOT.$uri) {
&log("200 HTTP OK\n");
return 200;
} elsif (-d $DOCROOT.$uri) {
&log("200 HTTP OK\n");
return 200;
} else {
&log("406 Not Acceptable\n");
return 406;
}
}
&log("404 Not Found\n");
return 404;
}
sub respond_to {
my $status_code = shift;
unless ($status_code == 200) {
&display($error_page{$status_code}) if (-f $error_page{$status
+_code});
return;
}
my $uri = (split / +/, $req)[1];
$uri =~ s/\/(.*)/$1/;
chomp($uri);
my $path = $DOCROOT.$uri;
if (-f $path) {
&display($path) ;
return;
}
if (-d $path) {
if (-f $path.'index.html') {
&display($path.'index.html');
} else {
&gen_dir_list($uri, &getfiles($path));
}
}
return;
}
sub display {
my $file = shift;
open RES, $file or die "open: $file: $!";
&log("Sending $file\n");
print $client $_ while (<RES>);
close RES;
}
sub gen_dir_list {
my ($uri, $files) = @_;
&log("[info] dir listing request\n");
# print html header
print $client <<HEADER;
<html>
<head><title>dir listing for: /$uri</title></head>
<body>
<table cellpadding=5>
HEADER
my $count;
foreach my $f (@$files) {
printf $client "%s<td><a href=\"%s\">%s</a></td><td>%s</td></t
+r>",
# different colours for alternate rows
(++$count % 2 ? '<tr bgcolor="#e0ffd6">' : '<tr bgcolo
+r="#ffdcd6">'),
# genereate href links
(-d $DOCROOT.$uri.'/'.$f ? '/'.$uri.$f.'/' : '/'.$uri
+.$f),
# append a '/' to the end of dirs
(-d $DOCROOT.$uri.'/'.$f ? $f.'/' : $f),
scalar localtime((stat $DOCROOT.$uri.'/'.$f)[9]);
}
#print html footer
print $client <<FOOTER;
</table>
<p>-- httpserv v0.1 --</p>
</body>
</html>
FOOTER
}
Thanks for the percritic link. It did find many things. Am still working on it. I will also remove & from function calls.
| [reply] [d/l] |
|
| [reply] [d/l] [select] |
|
skx@gold:~$ telnet 127.0.0.1 4321
Trying 127.0.0.1...
Connected to 127.0.0.1.
Escape character is '^]'.
GET /../../../../../../etc/passwd HTTP/1.0
root:x:0:0:root:/root:/bin/bash
..
I notice too that you never return a content-type in your response which is surprising to say the least! (I was surprised that firefox displayed the resulting pages correctly.)
| [reply] [d/l] |
Re: how to improve my script
by agn (Initiate) on Jun 16, 2009 at 05:22 UTC
|
skx, .. traversals didn't work when I checked it in firefox. Maybe it was firefox that removed those ..'s . It works while using telnet. :( Thanks for the heads up. That'll be my first priority.
And thanks, afoken, for the suggestions. I definitely have my hands full with your suggestions. For a novice like me, without any prior programming experience, it'll take a bit of time to correct all these. I'll be back after I make the recommended changes.
I also want to make it clear that I don't plan to write a full fledged web server. Just want to support GET requests and write good code.
Thank you all once again. | [reply] |
|
|