PerlMonks  

Using parent and child processes in perl

by perl-nun (Initiate)
on Jul 14, 2013 at 08:41 UTC ( #1044211 )
perl-nun has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have a CGI script that calls a service to fetch around 1000 records from the database. But the CGI script times out and I get this error message:

(70007)The timeout specified has expired: ap_content_length_filter: apr_bucket_read() failed, referer: <website-address>

So, I searched this forum and found that I can use a parent process to temporarily tell the user that the result is being fetched, and a child process to carry out the actual computation. Still, I get this error. Kindly help me out. Here's the code snippet:

my $pid = fork;
if ($pid >= 0) {
    if ($pid > 0) {
        print "</form>\n";
        print "<FORM action=\"/cgi-bin/reload.cgi\" name=reloadform id=reloadform target=\"_blank\">\n";
        print "<script type=text/javascript>\n";
        print "window.onload=function(){ window.setTimeout(function(){redirect();},15000); };\n";
        print "function redirect() { document.reloadform.submit(); }";
        print "</script>";
        print "</FORM>\n";
    }
    else {
        $s2ui->downLoadResults() if param('action') eq 'Download Results';
        exit 0;
    }
}
UPDATE: Could anyone also tell me if there are ways other than forking to handle the above type of error?

Re: Using parent and child processes in perl
by Anonymous Monk on Jul 14, 2013 at 09:11 UTC

    What kind of operating system? Follow this pattern in Watching long processes through CGI (Aug 02), but create more functions besides show_form to replace all those long lines in that long chain of if/elsif/...

    Hi, I have <big>

    You sure do

Re: Using parent and child processes in perl
by rjt (Deacon) on Jul 14, 2013 at 10:19 UTC
    (70007)The timeout specified has expired: ap_content_length_filter: apr_bucket_read() failed, referer: <website-address>

    That particular error tends to come up for long-running CGI requests that also happen to return a ton of data. I don't know to what extent each condition may contribute, as they are pretty highly correlated in the wild. Still, I think I can help a little:

    I can use a parent process to temporarily tell the user that the result is being fetched and a child process to carry out the actual result computation.

    How long does the actual computation take? (Can you run the same thing on the command line?) How much information is returned (and how much is sent to the browser in the initial connection)?

    The de facto standard way of handling this sort of thing used to be to do something like this (at least on POSIX-ish systems, although much of this can be simulated on Win32 and others):

    use File::Temp qw/tempfile/;
    use CGI;

    my $q = CGI->new;

    # Get temp file before we fork
    my (undef, $tempname) = tempfile(OPEN => 0);

    my $pid = fork;
    if (not defined $pid) {
        print $q->header(-status => '500 Server cutlery malfunction');
        exit;
    }

    if ($pid) {
        # Parent. Redirect to display request.
        # XXX WARNING XXX - This implementation is
        # very insecure: better off creating a random
        # string and keying that in your database.
        print $q->redirect($BASE . '?display_id=' . $tempname);
        exit;
    }

    # ELSE, we're in the child. Run the long-lived command,
    # saving the result in $tempname. Do not attempt to
    # send any information directly to the browser.

    Then, you have another CGI (or use a CGI parameter, if you prefer to roll it into the same one), that does something like this:

    my $display_id = $q->param('display_id')
        // my_fatal_error_handler();    # Error if $display_id is undef

    open my $fh, '<', $display_id or my_fatal_error_handler();

    print $q->header;
    print '<html><head><meta http-equiv="refresh" content="5"></head><body>';

    # Page output here

    # Do not include the <meta...refresh> tag once you
    # have detected the command is finished, or redirect
    # to another page/script

    close $fh;
    unlink $display_id;

    That's the basic template, anyway, and is easy to implement with pure HTML. You can either dump the contents of $tempfile directly, or store a serialized copy of the results there instead. If your implementation is heavily database-driven already, it may be preferable to store the intermediate result in a temporary/memory table instead of a file. If you already have session ID mechanics built in to your CGI app, use them.

    More advanced/modern systems use Javascript and usually something like JSON to display the page, and then have the browser make JSON requests and stick the updated text right in the DOM, so that a complete page refresh isn't necessary. That's a bit beyond the scope for SoPW, but there are many sites out there that will explain it better than most of us can. The point is, the server-side logic is actually quite similar; you're just sending JSON data back instead of HTML, and you might set it up to send only new/changed data to save time/bandwidth.

      The actual computation takes about 4-5 minutes and I am using Red Hat Linux. I return some 1000 rows of data. Some 1000 rows of "The quick brown fox jumped over a lazy dog, and the lazy dog did nothing but bark", you can assume. So you are saying: in the child I store the result in a global variable ($temp), and in the parent I will have only one line that redirects the URL to another CGI script with this $temp as a parameter? i.e.

      my $temp;    # Global variable

      if ($pid) {
          print redirect();
          exit;
      }
      else {
          # call the computation function in the child
      }


      But how will I know when $temp is populated completely? :(

        The actual computation takes about 4-5 minutes and I am using a RedHat Linux OS. I return some 1000 rows of data.

        Unless the rows are huge, 1000 rows is not a lot of data, so you should be OK, there. The original issue is almost certainly the computation time. Redhat definitely supports fork, which is good.

        So you are saying, in the child I store the result in a global variable($temp) and in the parent,I will have only one line that redirects the url to another cgi script with this $temp as parameter. i.e

        No. Once the fork() takes place, the parent and child are separate processes. Updates to $temp (or any other memory, for that matter) in one process will not affect the other process whatsoever. Then, the parent just prints the redirect header with the display_id and exits immediately. In five seconds, when the browser sends a request for the display_id, the server creates another entirely new process to service the request.
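A minimal demo of that point (the variable name and strings are illustrative): after fork(), the parent and child each hold their own copy of $temp, so the child's assignment is invisible to the parent.

```perl
# Demo: memory is not shared across fork().
use strict;
use warnings;

my $temp = 'before fork';

my $pid = fork;
die "fork failed: $!" unless defined $pid;

if ($pid == 0) {
    $temp = 'set in child';    # modifies only the child's copy
    exit 0;
}

waitpid($pid, 0);                      # wait for the child to finish
print "parent still sees: $temp\n";    # prints "before fork"
```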

        That's why I suggested a temporary file (or temporary database storage); you need some way to keep persistent state (and know where to find that state), because the client connection is created and destroyed every few seconds when the refresh hits. It's a simple form of IPC.

        But how will I know when $temp is populated completely? :(

        That's up to you, but one simple method is to designate an "end of transmission" (EOT) marker that will never appear in the normal output. In the child, print the EOT to the end of the same temp file once the operation finishes. In the other CGI that reads the display_id file, if you see that EOT, you know the job has finished, and can take whatever action is appropriate.
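A hedged sketch of that EOT idea (the marker string, temp-file handling, and flow are illustrative, not from the thread): the child appends a sentinel line when the job finishes, and the display CGI checks whether the last line of the file is that sentinel.

```perl
# Sketch: child writes an end-of-transmission marker; reader checks for it.
use strict;
use warnings;
use File::Temp qw(tempfile);

my $EOT = '__JOB_DONE__';    # assumed marker that never appears in real output

# Temp file standing in for the display_id file.
my ($fh, $file) = tempfile(UNLINK => 1);
print {$fh} "row 1\nrow 2\n";    # pretend output from the long job

# --- child side, once the computation completes ---
print {$fh} "$EOT\n";
close $fh;

# --- reader side (the display CGI) ---
open my $in, '<', $file or die "cannot read $file: $!";
my $last;
$last = $_ while <$in>;
close $in;

my $finished = defined $last && $last eq "$EOT\n";
print $finished ? "job finished\n" : "still running\n";    # prints "job finished"
```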

Re: Using parent and child processes in perl
by Anonymous Monk on Jul 14, 2013 at 17:19 UTC
    I believe your original problem was that you aren't writing any output when this service call is happening (and it takes quite a bit of time -- half a minute or more), which causes your web server or client to give up its wait for data.

    One fix to this is to output something while processing -- small HTML comments, once per second, are a good option. How are you calling the service?
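A sketch of that keep-alive approach (the loop body and helper name are stand-ins for the real service call): turn on autoflush so each print reaches the client immediately, then emit a harmless HTML comment between chunks of work so the connection never goes silent for long.

```perl
# Sketch: unbuffered keep-alive output while slow work runs.
use strict;
use warnings;

$| = 1;    # enable autoflush so each print is sent immediately

print "Content-type: text/html\n\n";
print "<html><body>Fetching results...\n";

my $printed = 0;
for my $step (1 .. 5) {    # stand-in for chunks of the real service call
    # do_one_slow_chunk($step);    # hypothetical helper; real work goes here
    print "<!-- still working ($step) -->\n";    # invisible to the user
    $printed++;
}

print "Done.</body></html>\n";
```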

    Alternatively, you could try to fix the root cause (why is the service call slow?)

      My service call is not slow, the time it takes to return the data is long! How do I output lines while processing? Could you elaborate on that please?
        My service call is not slow, the time it takes to return the data is long!

        Aren't those the same thing? Or does "it" refer to the processing of your CGI script?

        How do I output lines while processing? Could you elaborate on that please?

        It really depends on what this "processing" of yours means. Where does it take time? What sort of structure does it have? Do post code if possible.

        Anyway, you should be looking into chopping the processing into smaller pieces by e.g. converting it to an iterator, and doing the intermediate output between those stages.

        If all your processing time is taken inside a single subroutine call you can't modify, there isn't much you can do; some things may be possible to solve with alarm(), but it's usually not a good fit for this.

        However, since you mentioned that the request takes four or five minutes, it is a good idea to not do the processing on the web server at all, but do it in the background with a separate process: you'll just need to communicate with it (e.g. through a file or database) to queue a job with it. Needs a bit more infrastructure but is probably the easiest approach!

Re: Using parent and child processes in perl
by sundialsvc4 (Abbot) on Jul 14, 2013 at 20:47 UTC

    Another “big picture” consideration is ... what do you do if the user impatiently mashes Reload?   If you are not careful, very soon you have a bunch of forked-processes all getting in one another’s way.

    One simple-but-effective way to deal with this sort of thing is to define a database table as a work-to-do queue.   Your web-page simply adds a request record to this table, then provides the user with a way to monitor until (the database entry now says that) the work has been completed.

    How does the work get completed?   How about a simple cron task that goes off once-a-minute?   The process that is fired works like this:

    1. Check for the presence of a lockfile.   If one exists, exit.   (You’re a duplicate.)
    2. Create a lockfile.   (There are CPAN modules for this ...)
    3. Query the database for work to do.   If no (more) entries exist, remove the lockfile and exit.   Otherwise, select an entry, mark it “in progress,” do it, and then mark it “done.”
    4. goto 3

    You can get fancier than this, of course, but the bottom line is that there are entirely separate processes, periodically fired or always-running, which do the work, while the web-page monitors it (and provides the means to return results to the user).
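The steps above can be sketched as follows. This uses flock() rather than a bare "does the lockfile exist" check, since flock avoids stale-lockfile problems when a worker crashes; the @queue array stands in for the database work-to-do table, and the paths and job names are illustrative.

```perl
# Sketch of the once-a-minute cron worker: lockfile guard, then drain queue.
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

my ($lock, $lockfile) = tempfile(UNLINK => 1);    # illustrative lockfile
unless (flock($lock, LOCK_EX | LOCK_NB)) {
    exit 0;    # step 1: another worker holds the lock; we're a duplicate
}

my @queue = ('job-1', 'job-2', 'job-3');    # step 3: pretend database query
my $done  = 0;
while (my $job = shift @queue) {
    # mark $job "in progress", do the actual work, then mark it "done"
    $done++;    # step 4: loop back for more work
}

close $lock;    # releases the lock
print "processed $done jobs\n";
```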

Approved by Happy-the-monk