PerlMonks  

How do I start a long process with a short visit to a URL?

by Cody Fendant (Pilgrim)
on Oct 06, 2009 at 22:36 UTC (#799606)
Cody Fendant has asked for the wisdom of the Perl Monks concerning the following question:

I'm using NearlyFreeSpeech for hosting, and the one problem is, they don't do cron.

The preferred solution is to use a web cron service, which is fine by me, except that I have some processes which take a while and I don't want the web cron service to wait for them.

How do I make a quick hit on a URL at mydomain.com/dostuff.cgi, start my processes, send back "Content-type: text/plain\n\nOK" and have the processes continue after the HTTP interaction is over? I have a dim idea that there are such things as threads and child processes but I've never used them.

Re: How do I start a long process with a short visit to a URL?
by Your Mother (Canon) on Oct 06, 2009 at 22:46 UTC

    You could definitely do this and it might even be fun and certainly a learning experience. You could also get a $10ish a month host that does cron, fastcgi, and any number of other goodies which would free up at least $120 worth of your time a year I'd guess. :)

Re: How do I start a long process with a short visit to a URL?
by ikegami (Pope) on Oct 06, 2009 at 22:55 UTC
    #!/usr/bin/perl
    use strict;
    use warnings;
    use Net::Server::Daemonize qw( daemonize );

    print("Content-type: text/plain\n\nOK");
    daemonize($>, $), undef);
    ...
Re: How do I start a long process with a short visit to a URL?
by ambrus (Abbot) on Oct 07, 2009 at 09:26 UTC

    See the easier half of merlyn's Linux Magazine Column 39 (Watching long processes through CGI).

    Someone fix merlyn's secret bot, because it hasn't replied to this question yet.

      Isn't that column the exact opposite of what I want? I want the CGI to not follow the process. I want it to start the process, report back that it's started, and close the HTTP connection to the agent which started it. Then the process continues by itself.
        The text of that column talks about what has to happen to have a child process be forked from a CGI process and live on. So no, it's quite relevant... you just can't cut-n-paste the code.

        -- Randal L. Schwartz, Perl hacker

        The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT", "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this document are to be interpreted as described in RFC 2119.

Re: How do I start a long process with a short visit to a URL?
by ELISHEVA (Prior) on Oct 07, 2009 at 10:17 UTC

    Please be very careful with any URL that can trigger memory or CPU intensive processes. The web-based cron that I've seen comes in two flavors:

    • a provider that lets you submit a schedule and sends a URL request at a specified time.
    • a script on your own website that checks a scheduling file each time someone makes a web request. If it finds any tasks scheduled before the present moment, it runs them if they haven't been run already.

    If you go with the first solution you must be very sure that only that web-cron provider can trigger the script. Ideally this should be done both at the web server level (via well-configured .htaccess files) and through checks internal to your script. If you are not very careful, you can open yourself up to DoS attacks. You don't need to be a known target to be vulnerable. There are not-so-nice crawlers and script kiddies out there that will canvass random websites looking for vulnerable URLs, and when they have found them, they "play" until your site croaks.
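
    A check internal to the script might look something like the following. This is only a minimal sketch under stated assumptions, not a complete defence: the address 203.0.113.10, the "token" query parameter, and the names is_authorized, @ALLOWED_IPS and $SECRET_TOKEN are all made up for illustration, and the web-server-level restriction is still needed alongside it.

```perl
use strict;
use warnings;

# Hypothetical values: replace with your web-cron provider's address
# and a long random secret of your own.
my @ALLOWED_IPS  = ('203.0.113.10');
my $SECRET_TOKEN = 'change-me-to-a-long-random-string';

# Return 1 only if the request comes from an allowed address AND
# carries the expected token in its query string (?token=...).
# $env is a hash reference shaped like %ENV in a CGI script.
sub is_authorized {
    my ($env) = @_;
    my $ip    = $env->{REMOTE_ADDR}  // '';
    my $query = $env->{QUERY_STRING} // '';

    # Reject callers that are not on the allow list.
    return 0 unless grep { $_ eq $ip } @ALLOWED_IPS;

    # Pull the value of the "token" parameter out of the query string.
    my ($token) = $query =~ /(?:^|[&;])token=([^&;]*)/;
    return 0 unless defined $token;

    return $token eq $SECRET_TOKEN ? 1 : 0;
}
```

    In a CGI script one would call is_authorized(\%ENV) first and answer with a 403 status, before doing anything expensive, whenever it returns false.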

    The on-site cron approach tends to be less risky because it knows its own schedule and won't rerun a job after it has already been run. The main down-side of that approach is that timing is never precise. If you schedule a process for 2AM but nobody visits your site until 7:30AM the process will run at 7:30AM, not 2AM. This is obviously a problem if you need something to run at exactly 2AM.

    Depending on your traffic patterns you may also experience load balancing problems. Presumably one schedules a resource intensive task at 2AM because it is a low traffic period. If nobody visits in the middle of the night and you tend to get a burst of traffic in the morning, the task scheduled at 2AM may end up running at a peak traffic period rather than the low traffic period you intended.

    Perhaps the best solution is a combination: use method 2 (a carefully secured and unpublished website-based cron script) to check schedules and trigger tasks. Use the external cron service to make a totally innocent URL request (e.g. http://example.com/index.html) at a specific time. The request just happens to trigger the cron script, which in turn triggers the expensive process if it hasn't run yet.
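
    The core of such an on-site scheduler, the "run it if it is due and hasn't run already" rule, could be sketched roughly as follows. The data layout (task name mapped to run_at and done) and the function names due_tasks and run_due_tasks are assumptions for illustration. A real version would keep the schedule in a file and lock it while updating, which this sketch leaves out.

```perl
use strict;
use warnings;

# $schedule maps a task name to { run_at => epoch seconds, done => 0|1 }.
# Return the names of tasks that are due at time $now and have not run
# yet, oldest first.
sub due_tasks {
    my ($schedule, $now) = @_;
    my @due = sort { $schedule->{$a}{run_at} <=> $schedule->{$b}{run_at} }
              grep { !$schedule->{$_}{done} && $schedule->{$_}{run_at} <= $now }
              keys %$schedule;
    return @due;
}

# Start every due task via the $runner callback. Each task is marked
# done *before* it is started, so a second request arriving moments
# later does not start it again.
sub run_due_tasks {
    my ($schedule, $now, $runner) = @_;
    my @started;
    for my $name (due_tasks($schedule, $now)) {
        $schedule->{$name}{done} = 1;
        $runner->($name);
        push @started, $name;
    }
    return @started;
}
```

    Each page request would then call something like run_due_tasks($schedule, time, \&start_task), where start_task is whatever hands the job off to a background process.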

    WordPress has a fairly mature plug-in (WP-Cron) that you might want to look at to give you ideas about how to write a scheduler that is triggered by HTTP requests. It is written in PHP, of course, but studying it might be useful for ideas about handling security issues, corner cases and design details.

    Best, beth

      Thanks for that. Very useful.
Re: How do I start a long process with a short visit to a URL?
by Tanktalus (Canon) on Oct 07, 2009 at 23:03 UTC

    Something like this:

    if (fork() == 0) {
        close STDIN;
        close STDOUT;
        close STDERR;
        do_long_running_thing();
        exit 0;
    }
    print_out_web_response();
    Now, granted, I missed a bunch of error conditions around fork, as well as omitting the Highlander Maneuver, but that's the basic idea.

    Note that you can reopen stdout and/or stderr and redirect them to a file, or you can pipe them into something that'll send you an email, whatever you want. If nothing else, you may want to reopen to /dev/null, just to be sure you don't confuse subprocesses if you call any.
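
    A somewhat fuller sketch of the same idea, with the fork error check, the reopen to /dev/null, and a simple one-instance-at-a-time lock, might look like this. It assumes the Highlander Maneuver refers to the usual "there can be only one" lock-file trick. The function name spawn_detached and the lock file path are made up for illustration.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use POSIX qw(setsid);

# Run $job (a code reference) in a detached child process and return
# the child's PID to the parent. $lock_path is a hypothetical lock file
# used so that only one job runs at a time.
# Call this before printing anything (or set $| = 1), so the child does
# not inherit unflushed output.
sub spawn_detached {
    my ($job, $lock_path) = @_;

    my $pid = fork();
    die "fork failed: $!" unless defined $pid;
    return $pid if $pid;    # parent: go on and print the HTTP response

    # Child: let go of the web server's file handles and start a new
    # session, so the server does not wait for the job to finish.
    open STDIN,  '<', '/dev/null' or die "reopen STDIN: $!";
    open STDOUT, '>', '/dev/null' or die "reopen STDOUT: $!";
    open STDERR, '>', '/dev/null' or die "reopen STDERR: $!";
    setsid();

    # Only one instance at a time: give up quietly if the lock is held.
    open my $lock, '>>', $lock_path or exit 1;
    flock($lock, LOCK_EX | LOCK_NB) or exit 0;

    $job->();
    exit 0;
}
```

    The CGI script would call spawn_detached(\&do_long_running_thing, $some_lock_file) and then print its short response as usual. The lock is released automatically when the child exits.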

      The details you say you don't handle? The module I mentioned does. It does the STDOUT/STDERR redirecting too if you wish.
