Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl: the Markov chain saw

Web Browser File Upload Progress Bar in Perl w/AJAX

by skazat (Chaplain)
on Dec 11, 2006 at 02:07 UTC ( #588964=perlmeditation: print w/replies, xml ) Need Help??


I work mostly in a web-based environment and a good part of the problems that need to be solved is actually working within this environment.

One of my major problems is attempting to have long-running processes initiated via a web browser and all the things that go with it: making sure the process works, not using too many server resources at one time, working within a basically stateless environment, session issues, knowing how to fork() correctly, giving feedback to the user on the other end, etc. The below little cgi script is my first swing at making a progress bar for file uploads in Perl for the CGI env.

I looked around when doing research and found a few things:

  • most of the attempts were written using php. Some of the attempts required actually patching php itself - I couldn't really figure that one out, but that's pretty crazy

  • The few examples I found in Perl seem to be very verbosely coded - either they spanned multiple files, weren't put into logical modules, were a mixture (I'm not kidding) of perl and some other backend scripting language or - the one example where the feature was in a single script - over 6,000 lines long. And none of them had a compatible license for what I needed (GPL).

So, it looked like wheel re-inventing time. Something told me the problem wasn't as hard as I thought it was and I'm glad to see with a bit of hacking, it wasn't, so below is my first stab at making a little script that has a progress bar meter for file uploads.

I consulted CPAN and did find this module: CGI::ProgressBar , but still decided to make one myself. I like having code separated from design and this module seemed to keep on the paradigm of how has methods for HTML tags that I've been trying to avoid (even though it does look sort of neat)

The heavy lifting, I realized, could come from the old workhorse, and its hook callback. I hadn't known about it before - it seems to have had a bad history (from my research) of not working in some instances, of not working AT ALL, and other such things. I don't know the current status of it and looking through the test suite for the module there... isn't any tests that I found that deal with this callback. Odd.

Along with CGI, I also used the handy module called CGI::Ajax. A handy module in of itself, it works well with my current program that used HTML::Template templates. Basically, you can feed it a string that has your HTML document and it will parse in the various Javascript bits that you need to have to then use its Javascript function to Perl function bridge. Pretty cool.

Basically, the way the script works is that when a file is uploaded, the hook callback in keeps sort of a state file of how much data has been uploaded to the server.

The AJAX calls periodically check on this state file and report that back to the browser in the form of a status bar/percent done thingy. As you probably know, AJAX can be used to update a specific <div> witin an HTML doc - so as that part of the doc is being updated, the upload itself is still going on. When the upload is done, the screen refreshed automatically. And that's it.

This script is a bit of a proof-of-concept - there are some things I'm not sure of:

  • How well the CGI::Ajax module works in the Real World. I've used it in limited ways before, but this is the first time it's done something *really* cool. I'm pretty pleased with it. The documentation for the module is very well written - kudos to the developers

  • How well the hook callback in is faring. I haven't ever used it before, so I don't know if there are any ways you *shouldn't* use it. The documentation for *it* is relatively lacking. It seems that the current documentation on how to use it (from the docs), doesn't really makes sense, it states:

    You can set up a callback that will be called whenever a file upload is being read during the form processing. This is much like the UPLOAD_HOOK facility available in Apache::Request, with the exception that the first argument to the callback is an Apache::Upload object, here it's the remote filename.

    $q = CGI->new(\&hook [,$data [,$use_tempfile]]); sub hook { my ($filename, $buffer, $bytes_read, $data) = @_; print "Read $bytes_read bytes of $filename\n"; }
    is this: $q = CGI->new(\&hook [,$data [,$use_tempfile]]); valid Perl? I looked and see if it was a pod parsing error but it seems like that's the way it actually is in source. Perhaps it's simply a shorthand schematic of what is available...? In any case, it's confusing. I had to actually go to the Apache::Request docs, grok it's similar method and then do some research on how people were using *that* method and do some tests on how to use's. If I get a conclusion to how to safely use this callback, I'll annotate the current docs.
  • I doubt that the progress bar will stop the problem that sometimes on slow/large uploads, something will timeout and you'll be left with an unfinished upload. I don't know what initiates the timeout - if it's the server, or the browser, or the what. I also don't know what the workaround is - it may be a pref in the server itself or something you can't workaround in a CGI script - perhaps your program has to be for mod_perl - I don't know (but any insight is welcome)

  • My current script is not secure in any way, so please don't use it for anything :) It's just used to illustrate - I do know about its blatant security problems ;)

I'd like to RFC this script, if possible and shape it up, so I can release it as an example on how to do this particular idea in Perl for other people. As I stated, there isn't a very clear way to do it in Perl, but there seems to be lots of ways to do it in other backend scripting languages.

Code's below the



#!/usr/bin/perl -w # Where is the upload file going? my $Upload_Dir = '/tmp'; ### That's all for user-serviceable parts. $|++; use strict; use CGI::Carp qw(fatalsToBrowser); use CGI; use CGI::Ajax; my $pjx = new CGI::Ajax('check_status' => \&check_status); my $q = CGI->new(\&hook); sub hook { my ($filename, $buffer, $bytes_read, $data) = @_; $bytes_read ||= 0; open(COUNTER, ">" . $Upload_Dir . '/' . $filename . '-meta.txt'); my $per = 0; if($ENV{CONTENT_LENGTH} > 0){ # This *should* stop us from dividi +ng by 0, right? $per = int(($bytes_read * 100) / $ENV{CONTENT_LENGTH}); } print COUNTER $per; close(COUNTER); } my $init_rand_string = 0; if(!$q->param('process')){ $init_rand_string = generate_rand_string(); } my $d = <<EOF <html> <head> </head> <body> <form name="default_form" enctype="multipart/form-data" method +="post"> <p> <input type="file" name="uploadedfile" /> </p> <input type="hidden" name="yes_upload" value="1" /> <input type="hidden" name="process" value="1" /> <input type="hidden" name="rand_string" id="rand_string" v +alue="$init_rand_string" /> <p> <input type="submit" value="upload." /> </p> <script language="Javascript"> setInterval("check_status(['check_upload__1', 'rand_string', 'uplo +adedfile'], ['statusbar']);",'1000') </script> <div id="statusbar"> </div> </body> </html> EOF ; my $outfile = $Upload_Dir . '/' . $q->param('rand_string') . '-' . $q- +>param('uploadedfile'); my $p = <<EOF <html> <head> </head> <body> <h1> Done!: </h1> <hr /> <p> $outfile </p> </body> </html> EOF ; main(); sub main { if($q->param('process')){ if($q->param('yes_upload')) { upload_that_file($q); } print $q->header(); print $p; dump_meta_file(); } else { print $pjx->build_html( $q, $d); } } sub upload_that_file { my $q = shift; my $fh = $q->upload('uploadedfile'); my $filename = $q->param('uploadedfile'); return '' if ! $filename; my $outfile = $Upload_Dir . '/' . $q->param('rand_string') . '-' . + $q->param('uploadedfile'); open (OUTFILE, '>' . $outfile) or die("can't write to " . $outfile . ": $!"); while (my $bytesread = read($fh, my $buffer, 1024)) { print OUTFILE $buffer; } close (OUTFILE); chmod(0666, $outfile); } sub check_status { my $filename = $q->param('uploadedfile'); $filename =~ s{^(.*)\/}{}; return '' if ! -f $Upload_Dir . '/' . $filename . '-meta.txt'; open my $META, '<', $Upload_Dir . '/' . $filename . '-meta.txt' o +r die $!; my $s = do { local $/; <$META> }; close ($META); my $small = 500 - ($s * 5); my $big = $s * 5; my $r = '<h1>' . $s . '%</h1>'; $r .= '<div style="width:' . $big . 'px;height:25px;background- +color:#6f0;float:left"></div>'; $r .= '<div style="width:' . $small . 'px;height:25px;backgroun +d-color:f33;float:left"></div>'; return $r; } sub dump_meta_file { my $filename = $q->param('uploadedfile'); $filename =~ s{^(.*)\/}{}; unlink($Upload_Dir . '/' . $filename . '-meta.txt') or warn "delet +ing meta file didn't work..."; } sub generate_rand_string { my $chars = shift || 'aAeEiIoOuUyYabcdefghijkmnopqrstuvwxyzABCDEFG +HJKMNPQRSTUVWXYZ23456789'; my $num = shift || 1024; require Digest::MD5; my @chars = split '', $chars; my $ran; for(1..$num){ $ran .= $chars[rand @chars]; } return Digest::MD5::md5_hex($ran); } =pod =head1 COPYRIGHT Copyright (c) 2006 Justin Simoni All rights reserved. This program is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version. This program is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details. You should have received a copy of the GNU General Public License along with this program; if not, write to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, Boston, MA 02111-1307, + USA. =cut

Replies are listed 'Best First'.
Re: Web Browser File Upload Progress Bar in Perl w/AJAX
by skx (Parson) on Dec 11, 2006 at 06:25 UTC
    is this: $q = CGI->new(\&hook [,$data [,$use_tempfile]]); valid Perl?


    Although it is not literal perl. The first argument "\&hook" is a reference to the subroutine "hook".

    The second argument is optional, and thus enclosed in [ and ] characters, and similarly the last argument is optional.

    So you could call it in three ways:

    # just one argument. $q = CGI->new(\&hook); # two arguments. $q = CGI->new(\&hook ,$data ); # all three - party time! $q = CGI->new(\&hook ,$data, $use_tempfile);

    Under Unix-like systems there is a convention that optional arguments are enclosed in "[" and "]" characters. For a simple example of that try readng "man ls" which has the following output:

    ls [OPTION]... [FILE]...
      I see - and sort of what I thought it may be - still, it's very odd, since it's the only example in the entire documentation of the module that has that style. Had to do a mental style switch in the noggin' there (although the docs right after it explain it well enough)

      Thanks for the clarification.


      -justin simoni
      skazat me

Re: Web Browser File Upload Progress Bar in Perl w/AJAX
by cees (Curate) on Dec 11, 2006 at 14:01 UTC

    You might want to have a look at Apache2::UploadProgress. It requires that you run mode_perl2, but it makes things a whole lot easier than what you have. In fact, you don't have to make any changes to your code to use it. You just have to load a couple of JavaScript files in your HTML page, and then add an onSubmit handler to the form and you are done.

    All the magic is done with an apache filter, so your code doesn't even have to know that you are using an upload meter on the frontend.

    I gave a demo of this at YAPC in Chicago and I have put my example code online, including the upload meter example.

      That does look interesting and I'm confident (although if you could give clarification) that the hooks using mod_perl2 would get around the timeout problem with uploading hyoooge files.

      I'm currently still playing in CGI script-land, since mod_perl, not even mentioning mod_perl2 is not a environment that's in the mainstream, as far as shared hosting accounts go (which is what I usually attempt to target, le sigh). Although making a really really mature CGI app dually work as a mod_perl/cgi script is something I'll eventually try to tackle.

      Thanks for the tip, though;


      -justin simoni
      skazat me

        Most timeout problems are apache configuration, and I'd recommend not uploading files over a certain size via the web anyways. The issue becomes more "User" and less "programming" after awhile. People do not wait. You can flash a giant blinking orb with sounds saying "PLEASE WAIT WHILE THE FILE UPLOADS" and you will still get people canceling the upload because they are sick of waiting

        BTW CGI::Ajax is a nice module to use if you haven't checked it out. Also if they are uploading images, once thing you can look at is a Java or ActiveX control that allows them to paste the clipboard to the web portal. This gives you the option to set how much compression the image gets and cut down the upload option. (I've used an ActiveX control called Online Image Editor for years and it sends data to the server using Base64).

        If this does not help you at all, then hopefully it will give someone else some ideas. --

        Even smart people are dumb in most things...
Re: Web Browser File Upload Progress Bar in Perl w/AJAX
by un-chomp (Scribe) on Dec 11, 2006 at 09:26 UTC

    I doubt that the progress bar will stop the problem that sometimes on slow/large uploads, something will timeout and you'll be left with an unfinished upload. I don't know what initiates the timeout - if it's the server, or the browser, or the what. I also don't know what the workaround is - it may be a pref in the server itself or something you can't workaround in a CGI script

    You might find this link helpful which discusses a common solution to this problem.

      I don't know if that technique applies to file uploads - since your giving a request with a POST - there's no mechanism that I know of that says, "hey, upload 100k of a file, then refresh, and automatically ask for the next 100k chunk". That's not something that server push/client pull can do.

      The screen won't refresh on you, until you're done with the POST, which basically will take as long as it takes to upload that file. I'm guessing the way that sites like google video allow you to upload large video files that could literally take hours on a slow connection is a server setting and isn't something you can work around in Perl. I think. (again, I don't know)

      I think the technique mentioned in the article is similar to what a lot AJAXy stuff does, the example I just posted included. This article is somewhat of an oldschool way of doing it. The only difference really is that in the AJAX way, it's only part of the screen that gets refreshed, and instead of using a Meta tag, it uses a little Javascript that does a similar thing.

      Forking a process is still not the end-all, be-all solution. Processes can still be stopped, killed, haulted, whatever for odd reasons. The only way to get around that is to chunk whatever task you have to do in pieces that you know will run in the time allotted, have a whole lot of meta information/state information around so if you need to restart it at a specific spot, you can. It's a lot of work ;)


      -justin simoni
      skazat me

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlmeditation [id://588964]
Approved by grep
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others musing on the Monastery: (2)
As of 2019-06-16 04:46 GMT
Find Nodes?
    Voting Booth?
    Is there a future for codeless software?

    Results (76 votes). Check out past polls.