PerlMonks  

How to fork in PSGI/Dancer2

by morgon (Priest)
on Jan 30, 2015 at 12:04 UTC ( [id://1115052]=perlquestion )

morgon has asked for the wisdom of the Perl Monks concerning the following question:

Hi

I am currently playing around with Dancer2 and wonder about the proper way to start a long-running child-process.

But even though I currently use Dancer2 I guess my question is more about PSGI in general.

I don't know that much about it, but my understanding is that PSGI/Plack abstracts away the deployment details, so in theory your PSGI application should not be affected by whether the app is later deployed as CGI, run under Starman, or whatever. But I cannot see how you can abstract away all the mechanics of forking (e.g. which part of the system reaps the children).

So what is the best practice here?

Not forking at all and using some job-queue mechanism, or dealing with it on a case-by-case basis because there is no abstraction for it?

Please enlighten me...

Replies are listed 'Best First'.
Re: How to fork in PSGI/Dancer2 (daemonize)
by tye (Sage) on Jan 30, 2015 at 15:19 UTC

    We have a PSGI/Plack service (no Dancer or Mojo involved, just Starman) and it daemonizes a child in order to finish some processing that we don't want to delay the response. We had to daemonize because just fork()ing and exec()ing still left the parent process waiting before finishing the response. I didn't take the time to investigate why.

    But testing showed that standard daemonization was sufficient. Note that we chose to exec() from the beginning, because not exec()ing risks leaving objects alive that might cause problems, especially when the child finishes (and the objects try to finish their work).
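To illustrate the risk described above, here is a minimal sketch (not tye's code) of what can happen when a forked child does not exec(): objects inherited from the parent have their destructors run again when the child exits normally. The `Resource` class and the `count_destroys` helper are hypothetical names made up for this demonstration, and the example assumes a Unix-like system where fork() is a real process fork.

```perl
use strict;
use warnings;
use File::Temp ();
use POSIX ();

# Hypothetical class whose destructor "finishes its work" by logging a line.
package Resource {
    sub new { my ($class, $log) = @_; return bless { log => $log }, $class }
    sub DESTROY {
        my ($self) = @_;
        open my $fh, '>>', $self->{log} or return;
        print {$fh} "DESTROY in pid $$\n";
        close $fh;
    }
}

# Fork while a Resource is alive, end the child via $child_exit, and
# return how many times the destructor ran (parent and child combined).
sub count_destroys {
    my ($child_exit) = @_;
    my ($tmp_fh, $log) = File::Temp::tempfile();
    close $tmp_fh;
    {
        my $res = Resource->new($log);
        defined(my $pid = fork()) or die "fork failed: $!";
        if ($pid == 0) {
            $child_exit->();
            POSIX::_exit(0);    # safety net; not reached if $child_exit exits
        }
        waitpid($pid, 0);
    }   # $res leaves scope here, so the parent's DESTROY runs
    open my $in, '<', $log or die "cannot read $log: $!";
    my @lines = <$in>;
    close $in;
    unlink $log;
    return scalar @lines;
}
```

Calling `count_destroys(sub { exit 0 })` lets the child run the inherited destructor as well as the parent, while `count_destroys(sub { POSIX::_exit(0) })` ends the child without running destructors or END blocks. exec()ing a fresh program, as tye does, avoids the question entirely because the new program starts without the parent's objects.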

    We send the data needed for the finalization work as JSON written through a pipe to the child's STDIN:

    open( my $worker, '|-', $ASYNC_NOTIFIER ) or die ...;

    And the child daemonizes after it finishes reading STDIN.

    - tye        
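The approach tye describes might be sketched roughly as follows. This is not his actual code: `hand_off`, `daemonize`, and `read_job_then_daemonize` are hypothetical names, the worker command is passed in as a list rather than taken from `$ASYNC_NOTIFIER`, and the daemonization shown is the conventional Unix double fork with setsid(). One documented detail worth keeping in mind: close() on a piped handle waits for the process that open() started, so the response is only free to finish once that first worker process has exited.

```perl
use strict;
use warnings;
use JSON::PP ();    # in the Perl core since 5.14
use POSIX ();

# Parent side (inside the web app): send the job as JSON to the worker's STDIN.
sub hand_off {
    my ($payload, @worker_cmd) = @_;
    open(my $worker, '|-', @worker_cmd) or die "cannot start worker: $!";
    print {$worker} JSON::PP->new->canonical->encode($payload), "\n"
        or die "cannot write to worker: $!";
    # close() waits for the process started by open() to exit, so the
    # worker should detach promptly once it has read its input.
    close($worker)
        or die "worker failed: " . ($! ? "close error: $!" : "exit status $?");
    return 1;
}

# Worker side: conventional double-fork daemonization (Unix only).
sub daemonize {
    defined(my $pid = fork()) or die "fork failed: $!";
    POSIX::_exit(0) if $pid;    # first process exits; the parent's close() returns
    POSIX::setsid() != -1 or die "setsid failed: $!";
    defined($pid = fork()) or die "second fork failed: $!";
    POSIX::_exit(0) if $pid;    # session leader exits; the grandchild carries on
    chdir '/' or die "chdir failed: $!";
    open(STDIN,  '<', '/dev/null') or die "reopen STDIN: $!";
    open(STDOUT, '>', '/dev/null') or die "reopen STDOUT: $!";
    open(STDERR, '>', '/dev/null') or die "reopen STDERR: $!";
    return;
}

# Worker side: read all of STDIN first, then detach, then do the slow work.
sub read_job_then_daemonize {
    my ($work) = @_;
    my $json = do { local $/; <STDIN> };
    my $job  = JSON::PP->new->decode($json);
    daemonize();
    return $work->($job);
}
```

In this sketch the worker script would call `read_job_then_daemonize(sub { ... })` as its main routine, and the web app would call `hand_off($data, $path_to_worker)` just before returning its response. Because the daemon is re-parented once the intermediate processes exit, the web server never has to reap it.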

      That sounds like a fine and practical solution. The general problem with “daemonizing,” of course, is that the number of processes/threads will equal the number of active web requests, more or less. If the requests are few and infrequent, that’s all well and good. But if the work to be done is “beefier” and a large number of requests to do such work might come in, that is specifically when I suggest separating the two concerns. Let the web be strictly a user interface to a separate workload-handling (background) activity which can be throttled separately. Even if 1,000 requests to do some work came in all at once, the system would still be able to apportion the work so that no more than some n of those requests would be in process at one time, on some number of servers. The known worst-case completion times would not become “worse yet,” even though the waiting lines might briefly have stretched out into the hallway. The system would be briefly swamped, but not drowning.

      And none of this actually has any bearing on PSGI / FCGI / CGI / mod_perl or any other website implementation concern.
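The "no more than some n in process at one time" idea can be sketched with plain fork() and waitpid(). This is only an illustration of the throttling principle within one runner process, not a recommendation of a particular framework. `run_throttled` and its arguments are hypothetical names, and the sketch assumes a Unix-like system.

```perl
use strict;
use warnings;
use POSIX ();

# Run $handler->($job) for each job in a child process, with never more
# than $max_children running at once. Returns the number of jobs whose
# child exited with status 0.
sub run_throttled {
    my ($max_children, $handler, @jobs) = @_;
    die "max_children must be at least 1" if $max_children < 1;

    my %running;    # pid => job currently being processed
    my $ok = 0;

    my $reap_one = sub {
        my $pid = waitpid(-1, 0);    # block until some child exits
        die "no children left to wait for" if $pid == -1;
        return unless exists $running{$pid};    # not one of ours
        delete $running{$pid};
        $ok++ if $? == 0;
    };

    for my $job (@jobs) {
        # The queue may be long, but at most $max_children are in process.
        $reap_one->() while keys(%running) >= $max_children;

        defined(my $pid = fork()) or die "fork failed: $!";
        if ($pid == 0) {
            my $success = eval { $handler->($job); 1 };
            # _exit skips inherited destructors and END blocks; note that it
            # also skips flushing of buffered output in the child.
            POSIX::_exit($success ? 0 : 1);
        }
        $running{$pid} = $job;
    }

    $reap_one->() while %running;    # drain the remaining children
    return $ok;
}
```

For example, `run_throttled(4, \&process_job, @pending)` would work through `@pending` four jobs at a time (`process_job` being whatever routine does the real work). A burst of 1,000 jobs lengthens the queue but not the number of concurrent processes.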

        That sounds like a fine and practical solution. The general problem with “daemonizing,” of course, is that the number of processes/threads will equal the number of active web requests, more or less

        Sure it isn't -- why would you daemonize for every request? That sounds like PEBKAC

Re: How to fork in PSGI/Dancer2
by sundialsvc4 (Abbot) on Jan 30, 2015 at 14:53 UTC

    My advice is that a web-service process, no matter how it is implemented, should not directly own or control “a long-running child process.” That is, in fact, a batch job, and it should be treated as such by a separate service-runner of some kind, perhaps using a database as a queue ... of which there are already many good examples and frameworks on CPAN and elsewhere.

    The web service should then provide a user interface to this activity, enabling users to queue requests and to monitor the completion of their requests ... without waiting for them, or owning them itself.

    I have seen strategies that are as simple as cron jobs which are launched, say, twice a minute.   I have seen batch-runners which are constructed just like web-servers (or rather, RPC = Remote Procedure Call servers), using the same “plumbing” that has already been developed for that purpose.   There are lots of ways to do it, but the principle remains:   the web service provides an interface to the batch process, but does not itself own the processes that are doing the work.
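As a rough sketch of that separation, the example below uses a spool directory as the queue instead of a database, purely to stay within core modules. The web app would call `enqueue` and `status`, and a separate runner (for instance one started from cron) would call `run_pending`. All function names and the `tmp`/`new`/`work`/`done` directory layout are assumptions made for this illustration, not an existing CPAN API.

```perl
use strict;
use warnings;
use File::Basename ();
use File::Spec ();
use File::Temp ();
use JSON::PP ();

my $JSON = JSON::PP->new->canonical;

# Create the spool layout assumed by this sketch.
sub init_spool {
    my ($spool) = @_;
    for my $state (qw(tmp new work done)) {
        my $dir = File::Spec->catdir($spool, $state);
        -d $dir or mkdir $dir or die "mkdir $dir: $!";
    }
    return;
}

# Web side: write the job to tmp/, then rename it into new/ so that a
# runner never sees a half-written file. Returns the job id.
sub enqueue {
    my ($spool, $payload) = @_;
    my ($fh, $path) = File::Temp::tempfile(
        DIR => File::Spec->catdir($spool, 'tmp'), SUFFIX => '.json');
    print {$fh} $JSON->encode($payload) or die "write $path: $!";
    close $fh or die "close $path: $!";
    my $id = File::Basename::basename($path, '.json');
    rename($path, File::Spec->catfile($spool, 'new', "$id.json"))
        or die "rename $path: $!";
    return $id;
}

# Web side: report where a job currently is. Jobs only move forward, so
# checking the states in forward order cannot miss a job that is moving.
sub status {
    my ($spool, $id) = @_;
    for my $state (qw(new work done)) {
        return $state if -e File::Spec->catfile($spool, $state, "$id.json");
    }
    return 'unknown';
}

# Runner side: process at most $max_jobs queued jobs, then return the count.
sub run_pending {
    my ($spool, $max_jobs, $handler) = @_;
    opendir(my $dh, File::Spec->catdir($spool, 'new')) or die "opendir: $!";
    my @names = sort grep { /\.json\z/ } readdir $dh;
    closedir $dh;

    my $count = 0;
    for my $name (@names) {
        last if $count >= $max_jobs;
        my $claimed = File::Spec->catfile($spool, 'work', $name);
        # rename() within one filesystem is atomic, so if two runners race
        # for the same job only one of them wins the claim.
        rename(File::Spec->catfile($spool, 'new', $name), $claimed) or next;
        open(my $fh, '<', $claimed) or die "open $claimed: $!";
        my $payload = $JSON->decode(do { local $/; <$fh> });
        close $fh;
        $handler->($payload);    # if this dies, the job stays in work/
        rename($claimed, File::Spec->catfile($spool, 'done', $name))
            or die "rename $claimed: $!";
        $count++;
    }
    return $count;
}
```

The web process never forks here. It only writes a small file and later reads a status, while the `$max_jobs` limit in the runner bounds how much work is taken on per run.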

Node Type: perlquestion [id://1115052]
Approved by marto