PerlMonks
Designing an enqueing application proxy

by bronto (Priest)
on Jan 03, 2008 at 22:07 UTC ( #660327=perlquestion )
bronto has asked for the wisdom of the Perl Monks concerning the following question:

Wisest brothers

A few days ago I was presented with a problem.

The problem is as follows. We have a web service that we'll call WS hereafter, and an application that uses that service, which we'll call PE. PE is a batch job that runs every 30 minutes.

Now: WS can't handle more than 10 connections per second, and apparently this limit can't easily be changed. PE could possibly be throttled, but trying to do that triggers a bug that makes it loop and duplicate requests, making the problem even worse.

After discussing it with one of the people on the PE development team, we thought a solution would be a proxy that takes the incoming requests from PE, dispatches them at a maximum rate of 10 per second, and queues the excess. Unfortunately, no solution has been found that does the job out of the box.

Being a System Administrator and not (officially) a programmer, I wasn't charged with solving it; nevertheless it intrigued me, and I started thinking about how it could be solved with Perl. I'll probably never have the chance to write the real program, but I designed it in my mind and would kindly ask the opinion of the many of you who are real programmers. So, here we go.

I thought that the best thing would be to create a multithreaded application.

The application would have 10 threads that I'll call "Connectors", which interface to WS. On the other side there would be 10*N threads that I'll call "Listeners", which receive incoming requests from PE. I think that five listeners per connector would suffice.

Each listener and connector has an associated incoming queue. For the listeners, this queue pool could be a shared hash that maps each listener's thread id to its queue (more on this later).

A further queue is shared by all the listeners and will be used to enqueue the incoming requests.

Each listener would create a listen socket. When a request comes in, its content is extracted and sent into the common queue, along with the thread identifier that will be used to route the response to the right socket.

A timed subroutine (maybe a separate thread would be better?) runs every second, dequeues up to 10 requests, and distributes them, one to each connector. (Question: since, as far as I remember, alarm() is not considered a good solution for this kind of problem, how else could one do the job?)
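For what it's worth, one alarm()-free option is a dedicated dispatcher thread that simply sleeps between ticks via Time::HiRes. A rough sketch, with the queue names invented for illustration:

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use Time::HiRes qw(sleep);

my $common      = Thread::Queue->new;                  # listeners enqueue here
my @connector_q = map { Thread::Queue->new } 1 .. 10;  # one queue per connector

# Move up to 10 requests out of the common queue, one per connector.
sub dispatch_once {
    my $moved = 0;
    for my $q (@connector_q) {
        my $req = $common->dequeue_nb;    # non-blocking; undef when empty
        last unless defined $req;
        $q->enqueue($req);
        $moved++;
    }
    return $moved;
}

# Dispatcher thread: a plain loop with a sleep instead of alarm().
threads->create(sub {
    while (1) {
        dispatch_once();
        sleep 1;    # Time::HiRes::sleep, so fractional ticks work too
    }
})->detach;
```

Since Thread::Queue serializes access internally, the dispatcher needs no extra locking of its own.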

The connector reads the request content and id from the queue, forwards the request to WS, reads the response and uses the id to enqueue it to the right listener queue. The listener reads from the queue, forwards the response to PE, closes the connection and goes back to listening (this could be optimized with a keepalive configuration).

Is this design correct?

Do you think it would be better to do it with POE instead? In that case, how would the POE application be designed? (I am POE-illiterate...)

Thanks in advance

Ciao!
--bronto


In theory, there is no difference between theory and practice. In practice, there is.

Re: Designing an enqueing application proxy
by kyle (Abbot) on Jan 03, 2008 at 22:44 UTC

    It sounds to me as if it would work, but I personally am not a big fan of threads. If I were writing this, I'd be tempted to do a lot of plumbing. Each 'connector' would be a child with a pipe running back to the queue maintainer that uses IO::Select to keep track of them all. This would probably be a good job for POE, but I don't know enough about that module to say so with any authority or detail.

Re: Designing an enqueing application proxy
by polettix (Vicar) on Jan 04, 2008 at 00:12 UTC
    I don't understand exactly where the problem is; there are at least two scenarios that come to mind. Both stem from:
    WS can't handle more than 10 connections per second, and apparently this limit can't be easily changed.
    Do you mean that your experiments show that it's physically impossible (more or less) to get more than 10 connections per second, or that if you try to do more Bad Things™ happen?

    In the first case, I'd say the real problem is that when bursts arrive, some requests get discarded. This could probably be addressed by tweaking the server's configuration: longer listen queue sizes, more children to serve requests, etc. This also seems indicated by the fact that you're talking about associating 5 listeners per connector (in this case I'd use one listener with a longer queue, e.g. see the Listen parameter in IO::Socket::INET).
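    To make the Listen suggestion concrete, a single listener with a deep accept backlog might look like this (the address and backlog value are invented for the example):

```perl
use strict;
use warnings;
use IO::Socket::INET;

# One listener with a deep accept backlog, instead of many listener threads.
# The kernel caps the backlog at SOMAXCONN.
my $server = IO::Socket::INET->new(
    LocalAddr => 'localhost',
    LocalPort => 54321,
    Proto     => 'tcp',
    Listen    => 128,    # pending connections the kernel will hold for us
    Reuse     => 1,
) or die "listen failed: $!";
```

    Bursts then sit in the kernel's queue until accept() gets to them, instead of being refused.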

    On the other hand, if Bad Things™ happen beyond the 10 req/s limit, then you have to put some throttling mechanism in place. But I'd ask: can we assume it would be safe if no more than 10 WS transactions are alive at the same time? Probably yes if the service time is more than 1/10th of a second, probably not if it is less. In any case, I'd try to find out how many outstanding requests are acceptable in your case and start from there: that eliminates the time variable.

    Regarding the implementation, I'd go with IO::Select and avoid all the thread stuff, but that's just me. In this case, you end up with one listener (whose listen queue is sufficiently long according to your needs in terms of rejection probability) and up to 10 * 2 concurrent "working sockets" (one pair for each connection: one towards the WS, the other towards the client). IO::Select over 21 file descriptors shouldn't be an issue.

    Hey! Up to Dec 16, 2007 I was named frodo72, take note of the change! Flavio
    perl -ple'$_=reverse' <<<ti.xittelop@oivalf

    I understood... but what did you say?
      Do you mean that your experiments show that it's physically impossible (more or less) to get more than 10 connections per second, or that if you try to do more Bad Things™ happen?

      The second one: if you try to do more Bad Things™ happen.

      WS is a web service developed on top of a vendor product by the vendor itself. What happens is that if more than 10 conn/sec come in, the first 10 are handled, and the excess are just accepted and then hang there until they time out. The vendor says they can't change it, and by the way 10 conn/sec is 600 conn/min, and that's a lot. I am not saying I agree or disagree here, just reporting it as they told me.

      That's why the final solution was to short-circuit the developers of PE with those of WS and let them talk directly. Nevertheless, the problem was so interesting to me that I came up with this design. And having very little experience with threads, I wanted to know if the design was OK. I am also interested in different approaches, like yours for example.

      By the way: thanks to all that answered. I am really curious on how this could be efficiently managed with POE (thanks emazep for talking about that).

      Update: how could my keyboard give birth to the monster error up there????? I am pretty sure I wrote "not saying" from the beginning...

      Ciao!
      --bronto


      In theory, there is no difference between theory and practice. In practice, there is.
Re: Designing an enqueing application proxy
by jbert (Priest) on Jan 04, 2008 at 00:35 UTC
    I'd say you're not far off. The biggest issue is the idea of the 'timed subroutine' to dispatch requests to the worker threads.

    More simply, the worker threads can just grab another request when they have finished what they are doing (the job queue will need a mutex so that multiple threads don't mess with it at the same time).
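    Worth noting: Perl's Thread::Queue already serializes access internally, so the workers can simply block on dequeue() without an explicit mutex. A minimal sketch of that pull model (handle_request is a stand-in for the real WS round trip):

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

my $jobs    = Thread::Queue->new;
my $results = Thread::Queue->new;

# Stand-in for "forward the request to WS and read the reply".
sub handle_request { my ($req) = @_; return "done:$req" }

# Ten workers, each grabbing the next job as soon as it is free.
my @workers = map {
    threads->create(sub {
        while (defined(my $req = $jobs->dequeue)) {
            $results->enqueue(handle_request($req));
        }
    });
} 1 .. 10;

$jobs->enqueue("req$_") for 1 .. 30;
$jobs->enqueue(undef) for @workers;    # undef tells each worker to quit
$_->join for @workers;
```

    With at most ten workers alive, at most ten WS transactions can ever be in flight, which addresses the concurrency reading of the limit.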

    You also only need one listener. The job of that thread is to accept every incoming connection and put the request on the 'to be processed' queue, which the worker threads consume as described above.

    Of course, some people prefer processes to threads, etc. And your problem is probably best solved by changes to the applications at either end. Still, I hope the above is useful.

Re: Designing an enqueing application proxy
by guaguanco (Acolyte) on Jan 04, 2008 at 00:47 UTC
    The main question isn't a language question but a design question: how robust a solution are you trying to create? If your only record of queued requests is a hashtable in memory shared between threads, any disruption in that process (tripping over a power cord, the perl executable segfaulting, etc.) will cause the loss of your queued requests.
    I'd start with trying to use apache/mod_perl as my platform to avoid reinventing a lot of wheels, but to write this kind of throttling proxy server in a bombproof manner is not a trivial exercise.
Re: Designing an enqueing application proxy
by emazep (Priest) on Jan 04, 2008 at 01:27 UTC
    Ave bronto!

    It seems that you are looking for some sort of job queue manager, an application pattern for which we fortunately have several implementations in Perl, some of which are widely used and considered very reliable: please have a look at TheSchwartz or POE::Component::JobQueue (there are also several others).

    Even if they don't solve your problem out of the box, they should at least make it easier.

    In (the likely) case I've completely missed your point, I'll accept your "e me lo dovevi di' tu!" ("and you should have been the one to tell me!") ;-)

    Ciao, Emanuele.

      EHLO,

      Yeah, POE::Component::JobQueue seems to fit well into this picture. First, POE::Wheel::SocketFactory would accept client connections and post requests to the job queue. Then one of the 10 job queue workers would pick up a request, open a new connection to the final destination, and send the data from the original source to it.

      This ensures that at most 10 connections are active at any given time. If the issue is instead that no more than 10 new connections should be created per second, then a 0.1-second delay is enough and one could bypass the job queue entirely: the socket factory would get the request and fire off a proxy session after a 0.1-second delay.
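      I'm hardly a POE authority either, but the delay-based throttle might look roughly like this (the fake backlog and the dispatch counter are stand-ins for the real proxy sessions):

```perl
use strict;
use warnings;
use POE;

our $dispatched = 0;

# One self-rescheduling 'tick' event: each request is dispatched no
# sooner than 0.1s after the previous one, i.e. at most 10 per second.
POE::Session->create(
    inline_states => {
        _start => sub {
            $_[HEAP]{pending} = [ map { "req$_" } 1 .. 25 ];  # fake backlog
            $_[KERNEL]->yield('tick');
        },
        tick => sub {
            my $req = shift @{ $_[HEAP]{pending} };
            return unless defined $req;        # nothing left: session ends
            $dispatched++;     # stand-in for firing off the proxy session
            $_[KERNEL]->delay( tick => 0.1 );  # schedule the next dispatch
        },
    },
);
POE::Kernel->run;
print "dispatched $dispatched requests\n";
```

      In a real proxy, the socket factory's accept handler would push onto that pending list instead of prefilling it.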

Re: Designing an enqueing application proxy
by BrowserUk (Pope) on Jan 04, 2008 at 02:54 UTC

    While you wait for someone to post a POE solution (could take a while), you might get away with something as simple as this:

    #! perl -slw
    use strict;

    use IO::Socket;
    use IO::Select;
    use threads;
    use threads::shared;
    use Thread::Queue;

    my $done : shared = 0;
    my $Qin  = new Thread::Queue;
    my $Qout = new Thread::Queue;

    sub listener {
        my %conns;
        my $listener = IO::Socket::INET->new(
            LocalAddr => 'localhost:54321',
            Listen    => 10,
            Reuse     => 1,
        ) or die "Failed to create listener: $!";

        my $sel = new IO::Select( $listener );

        until( $done ) {
            if( $sel->can_read( 0.01 ) ) {
                my $client = $listener->accept;
                $conns{ $client } = $client;
                my $req = <$client>;    ## Assumes request is a single line
                $Qin->enqueue( "$client : $req" );
                warn "Accepted and queued request '$req'";
            }
            elsif( $Qout->pending ) {
                my( $client, $res ) = split ' : ', $Qout->dequeue, 2;
                print { $conns{ $client } } $res;
                close $conns{ $client };
                delete $conns{ $client };
            }
        }
        close $listener;
    }

    my $lt = threads->create( \&listener );

    while( my( $client, $req ) = split ' : ', $Qin->dequeue, 2 ) {
        async {
            my $serv = IO::Socket::INET->new( 'localhost:12345' );
            print $serv $req;
            warn "dequeue and forwarded '$req'";
            my $res = <$serv>;
            warn "Retrieved and queued '$res'";
            $Qout->enqueue( "$client : $res" );
            close $serv;
        }->detach;
        select '', '', '', 0.1;    ## Ensure no more than 10/s
    }

    This creates two threads. The listener thread accepts connections, maintains a (non-shared) hash of client sockets, and queues requests as they are received. The main thread then dequeues each request and spawns a thread that forwards it and awaits the reply before queueing that back to the listener thread. A 1/10-second delay in the main thread's processing loop ensures the request rate doesn't exceed 10/second. When the spawned threads receive their replies, they queue them back to the listener thread for dispatch to the appropriate socket.

    Of course, a thread pool solution for the upstream connections might be more efficient (as well as a tad more complicated), but whether that is necessary depends a lot upon the burst rate of your PE process.


    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
Re: Designing an enqueing application proxy
by talexb (Canon) on Jan 04, 2008 at 15:24 UTC

    I've written code that does this using IPC::Run to launch a program and communicate with it through a pipe.

    You might consider that as an alternative to POE, although it looks like you already have lots of guidance from the various POEple here. Just wanted to be sure to mention IPC::Run -- it worked great for me.
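    The start/pump/finish interface is the part that matters for talking to a long-running child over a pipe. A tiny sketch, with 'cat' standing in for the real worker program:

```perl
use strict;
use warnings;
use IPC::Run qw(start pump finish);

# Launch a child and talk to it through pipes; 'cat' is just a
# placeholder for the real worker program.
my ($in, $out) = ('', '');
my $h = start ['cat'], \$in, \$out;

$in .= "hello\n";
pump $h until $out =~ /\n/;    # wait until a full line has come back
finish $h;                     # close pipes and reap the child

print $out;
```

    Appending to $in feeds the child's stdin; pump() runs the event loop until the condition on $out holds.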

    Alex / talexb / Toronto

    "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

Node Type: perlquestion [id://660327]
Approved by Corion
Front-paged by Corion