thread/fork boss/worker framework recommendation

by learnedbyerror (Monk)
on Nov 30, 2011 at 12:55 UTC ( [id://940832] )

learnedbyerror has asked for the wisdom of the Perl Monks concerning the following question:

Oh Monks,

I come to you yet confused again. My learnings teach me that there is always more than one way to do things in Perl; however, my walk through search.cpan.org leads me to conclude that I have found many almost, but not quite, good and reliable choices :)

I am working on an app that makes calls to web service APIs on various social web sites. Depending upon the results, this could trigger 0 - n additional calls, which could do the same again. At this time, the call depth does not exceed 3.

I have a prototype working using threads that is marginally acceptable in performance; however, I know that I am losing performance because of the overhead associated with Thread::Queue and threads::shared. My current boss/worker framework, crafted by myself and evolved over a couple of months of work, depends on these two modules to manage the request pipeline and to temporarily store data containing state information as well as data that will ultimately be written to a database.
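
In outline, the framework looks something like the sketch below. This is a simplified stand-in rather than my actual code: the queue layout matches what I described, but fetch_api() and the seed URLs are placeholders.

    #! perl -slw
    # Simplified stand-in for the boss/worker pipeline described above.
    # fetch_api() and the seed URLs are placeholders, not real endpoints.
    use strict;
    use threads;
    use Thread::Queue;

    my $requests = Thread::Queue->new;   # boss -> workers
    my $results  = Thread::Queue->new;   # workers -> boss

    sub fetch_api {
        my $url = shift;
        # A real worker would call the web service here and return the
        # decoded response plus any follow-up calls it implies.
        return { url => $url, followups => [] };
    }

    my @workers = map {
        threads->create( sub {
            while ( defined( my $url = $requests->dequeue ) ) {
                $results->enqueue( fetch_api( $url ) );
            }
        } );
    } 1 .. 4;

    # Boss: seed the pipeline, then collect results and queue follow-up calls.
    my @seed = map "http://example.invalid/api?id=$_", 1 .. 10;
    my $outstanding = @seed;
    $requests->enqueue( @seed );

    while ( $outstanding ) {
        my $r = $results->dequeue;
        $outstanding--;
        # ... state tracking / rows destined for the database go here ...
        for my $next ( @{ $r->{followups} } ) {
            $outstanding++;
            $requests->enqueue( $next );
        }
    }

    $requests->enqueue( (undef) x @workers );   # tell the workers to exit
    $_->join for @workers;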

I am certain that others before me have developed a more streamlined and elegant boss/worker solution, but I have not been able to find one that works reliably. Now that I have the logic down, I am re-examining my tooling.

I would appreciate any recommendations that you may have on the following:

  1. Asynchronous Boss/Worker for threads
  2. Asynchronous Boss/Worker for fork (see the sketch just after this list for the shape I have in mind)
  3. Event Loop approach (I am currently leaning toward AnyEvent)
  4. Fast shared Perl data structures (references, hashes and arrays)
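
To give a feel for option 2, the shape I have in mind is roughly the sketch below. It is only an illustration: I have not settled on Parallel::ForkManager, process_item() is a placeholder for the real API call, and returning data from finish() needs a reasonably recent version of the module.

    #! perl -slw
    # Sketch of a fork-based boss/worker using Parallel::ForkManager.
    # process_item() is a placeholder for the real web API call.
    use strict;
    use Parallel::ForkManager;

    sub process_item {
        my $item = shift;
        # Placeholder: query the web service for $item and return a hashref.
        return { item => $item, status => 'ok' };
    }

    my $pm = Parallel::ForkManager->new( 4 );   # at most 4 children at once

    my %collected;
    $pm->run_on_finish( sub {
        my ( $pid, $exit_code, $ident, $signal, $core, $data ) = @_;
        $collected{ $ident } = $data if defined $data;   # ref passed to finish()
    } );

    for my $item ( 1 .. 20 ) {
        $pm->start( $item ) and next;   # parent keeps looping; child falls through
        my $result = process_item( $item );
        $pm->finish( 0, $result );      # child exits, shipping $result to the parent
    }
    $pm->wait_all_children;

    print "Collected results for ", scalar keys %collected, " items";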

Goals: reliability, speed

My current target platform is Linux and perl v5.10.1. I can upgrade perl if needed, and alternatively can move this to a Windows machine if that would be advantageous.

Thanks in advance for your help!

lbe

Updated to strike out the comment on the performance penalty of threads::shared and Thread::Queue, based upon feedback from BrowserUk


Replies are listed 'Best First'.
Re: thread/fork boss/worker framework recommendation
by BrowserUk (Patriarch) on Nov 30, 2011 at 13:43 UTC
    I have a prototype working using threads that is marginally acceptable in performance; however, I know that I am losing performance because of the overhead associated with Thread::Queue and threads::shared.

    I am sceptical that an IO-bound multi-processing solution is being limited by the performance of either Thread::Queue or threads::shared. Unless those modules are being used very badly.

    By way of demonstrating the basis of my scepticism about your claim, this sends messages back and forth between two threads via two queues, alternately. This is the very worst-case scenario, as every communication requires a context switch, with each thread doing almost nothing between context switches. I.e. this example defeats the purpose of using queues by forcing constant synchronisation between the threads. It would be very hard to use queues in a worse way than this.

    #! perl -slw
    use strict;
    use Time::HiRes qw[ time ];
    use threads;
    use Thread::Queue;

    $|++;

    my $Qin  = new Thread::Queue;
    my $Qout = new Thread::Queue;

    my $t1 = async {
        while( my $in = $Qin->dequeue ) {
            $Qout->enqueue( $in + 1 );
        }
    };

    my $start = time;
    $Qin->enqueue( 1 );
    while( my $in = $Qout->dequeue ) {
        $Qin->enqueue( $in + 1 );
        printf "\rMessages per second: %.3f", $in / ( time() - $start );
    }

    __END__
    [13:29:53.76] c:\test>junk37
    Messages per second: 21791.810Terminating on signal SIGINT(2)
    Messages per second: 21791.633

    As you can see, the result on my machine is that somewhat over 20,000 messages are exchanged every second (roughly 50 microseconds per round trip), which means that unless your web API calls are completing in less than half a millisecond, the queuing is not the source of your perceived performance limitations.

    The bottom line is, I believe you are misattributing the source of your performance problem.

    If you would post your code, I feel certain that I could resolve the issue for you quite quickly. If you don't wish to post proprietary code in public, I'd be willing to take a look in private.


    With the rise and rise of 'Social' network sites: 'Computers are making people easier to use everyday'
    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.

    The start of some sanity?

      BrowserUK, thanks for your offer, but I can't share the code at this time. Even if I could, it is too large to share en masse via this site.

      Your response got me thinking, or re-thinking, about the issues with my current code set. My concerns, at least today until I learn more :), are two: 1.) scalability (primarily memory), and 2.) program architecture (extendability and maintainability).

      From the scalability standpoint, my current code is broken into 7 discrete, or high level, job steps that are run sequentially. Within each job step, I use one or more pools of worker threads to parallelize the execution of portions of the job. My current threads-based approach stores everything needed for the run of each step in shared memory. I chose this because I thought it to be the fastest approach for each step; however, this consumes a lot of memory and includes the overhead of an additional perl interpreter for each thread. I do create the worker threads very early in each step, so I do minimize the amount of memory consumed by each instance. Ultimately, I want to assemble these 7 job steps in an asynchronous framework that will let data flow through each step as it is available, and have the merge points, which are currently the completion of a step, happen on a more granular basis to allow the application to run more like a peacefully running river than a tsunami.
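
      What I am picturing for the "river" version is chained queues along the lines of the thumbnail below. The two steps are placeholders, not my real job steps; it just shows data flowing from one pool to the next as it becomes available.

      #! perl -slw
      # Thumbnail of the pipelined version: each job step is a pool of threads
      # reading from one queue and writing to the next, so items flow through
      # as they become available. step_one()/step_two() are placeholder work.
      use strict;
      use threads;
      use Thread::Queue;

      sub step_one { my $item = shift; return "s1($item)" }
      sub step_two { my $item = shift; return "s2($item)" }

      my $q1 = Thread::Queue->new;   # feeds step one
      my $q2 = Thread::Queue->new;   # step one -> step two
      my $q3 = Thread::Queue->new;   # step two -> final collection

      sub pool {
          my ( $n, $in, $out, $work ) = @_;
          return map {
              threads->create( sub {
                  while ( defined( my $item = $in->dequeue ) ) {
                      $out->enqueue( $work->( $item ) );
                  }
              } );
          } 1 .. $n;
      }

      my @p1 = pool( 3, $q1, $q2, \&step_one );
      my @p2 = pool( 3, $q2, $q3, \&step_two );

      $q1->enqueue( 1 .. 10 );
      $q1->enqueue( (undef) x @p1 );   # no more input for step one
      $_->join for @p1;
      $q2->enqueue( (undef) x @p2 );   # step one drained, wind down step two
      $_->join for @p2;
      $q3->enqueue( undef );

      while ( defined( my $final = $q3->dequeue ) ) {
          print "final: $final";       # this is where the database write would go
      }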

      From a program architecture standpoint, I am having to do a lot of low level thread and thread pool management in my code. As I progressed from writing the first job step to the 7th, I have developed a module that encapsulates this pattern; however, it isn't work that I am proud of or want to support. I have been through a number of thread pool management modules on CPAN. Most simply don't work, don't fully work, or don't work well with the current versions of perl. When I looked at migrating from threads to forks, I started getting into the challenges of shared-memory IPC.

      So, I spent some time back in CPAN and mucking around with some small functional tests. I think I have come up with an approach using two well known modules and one relatively new module that will allow me to address both of my concerns.

      • Approach: Use a custom-written, pseudo-event main loop to initialize asynchronous processes (i.e. forks) to query the web APIs.
      • IPC::Lite will be used to construct and manage the global variables needed to store application state, information to be shared between processes, and the job step data queues.
      • Parallel::Forker will be used to construct and manage the level 1 job steps (i.e. manage steps to merge points).
      • Parallel::Fork::BossWorkerAsync will be used to manage multiple process pools for discrete job steps (querying APIs in parallel, downloading files in parallel).

      The use of forks and IPC::Lite should minimize my memory footprint and should, on the whole, be comparable in performance to threads. Parallel::Forker allows me to define a job step sequence that will let me create the merge points needed for process synchronization. Parallel::Fork::BossWorkerAsync will let me spin off pools of forks where I can parallelize as a pool.
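
      For the worker pools, my reading of the Parallel::Fork::BossWorkerAsync docs is that one discrete step would look roughly like the toy sketch below; the work handler is a placeholder, and the exact parameter names should be checked against the module's documentation.

      #! perl -slw
      # Toy sketch of one Parallel::Fork::BossWorkerAsync pool; the work
      # handler stands in for a parallelized API query or file download.
      use strict;
      use Parallel::Fork::BossWorkerAsync;

      my $bw = Parallel::Fork::BossWorkerAsync->new(
          work_handler => sub {
              my $job = shift;   # a hashref passed to add_work()
              # Placeholder: query the API / download the file for $job->{id}.
              return { id => $job->{id}, status => 'done' };
          },
          worker_count => 4,
      );

      $bw->add_work( map { { id => $_ } } 1 .. 20 );

      while ( $bw->pending ) {
          my $result = $bw->get_result;
          print "job $result->{id}: $result->{status}";
      }

      $bw->shut_down;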

      I believe that this approach will let me get my code running to where I can consider it production quality from a run and maintenance perspective. I remain interested in pursuing a full event-loop approach, as I think this will give me better extendability and further minimize the code footprint that I will have to maintain; however, either the documentation for something like AnyEvent is too fragmented for me to readily piece together how to use it, or I'm just not smart enough to pick it up and run with it. In any event, I need to spend some time working with it to get my confidence up sufficiently to give it a try.
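
      For what it is worth, the smallest event-loop version I have been able to piece together so far is the AnyEvent::HTTP fragment below, run against placeholder URLs; I have not taken it any further than this.

      #! perl -slw
      # Minimal AnyEvent sketch: fire several HTTP requests concurrently
      # and wait for them all. The URLs are placeholders.
      use strict;
      use AnyEvent;
      use AnyEvent::HTTP;

      my @urls = map "http://example.invalid/api?id=$_", 1 .. 5;

      my $cv = AnyEvent->condvar;

      for my $url ( @urls ) {
          $cv->begin;                  # one more outstanding request
          http_get $url, sub {
              my ( $body, $headers ) = @_;
              print "$url -> $headers->{Status}";
              # Follow-up calls would $cv->begin and http_get again here.
              $cv->end;                # this request is finished
          };
      }

      $cv->recv;                       # run the event loop until every end() balances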

      Again, thanks in advance!

      lbe

        it is too large to share en masse via this site.

        I do have email... but if you can't show me, I cannot help.

        From a program architecture standpoint, I am having to do a lot of low level thread and thread pool management in my code. As I progressed from writing the first job step to the 7th, I have developed a module that encapsulates this pattern; however, it isn't work that I am proud of or want to support. I have been through a number of thread pool management modules on CPAN. Most simply don't work, don't fully work, or don't work well with the current versions of perl.

        Re: the highlighted portion. This is a myth! (Or simply beginners' coding.)

        Thread pool management only becomes complex or laborious when people insist on trying to wrap it up in an API. It is the very process of that wrapping that creates complexity. There are so many ways to use thread pools that a wrapper has to cater for so many possibilities that it just creates complexity everywhere. Which is why I don't use such modules.

        Written the right way, thread pools don't need to be 'managed'; they manage themselves. Here is a complete working example in 30 lines, some of which are just for show. I've typed it so many times now that I can do it from memory and get it right first time, every time:

        #! perl -slw
        use strict;
        use Time::HiRes qw[ time ];
        use threads;
        use Thread::Queue;

        sub worker {
            my $tid = threads->tid;
            print "Thread: $tid started";
            my $Q = shift;
            while( my $workitem = $Q->dequeue ) {
                print "Thread: $tid processing workitem $workitem";
                sleep $workitem;
            }
            print "Thread: $tid ended";
        }

        our $WORKERS //= 10;
        our $ITEMS   //= 1000;
        our $MAXWORK //= 2;

        my $Q = new Thread::Queue;
        my @workers = map threads->create( \&worker, $Q ), 1 .. $WORKERS;

        for ( 1 .. $ITEMS ) {
            sleep 1 while $Q->pending > $WORKERS;
            $Q->enqueue( rand( $MAXWORK ) );
            print "main Q'd workitem $_";
        }

        print "Main: telling threads to die";
        $Q->enqueue( (undef) x $WORKERS );
        print "Main waiting for threads to die";
        $_->join for @workers;

        A run:

        [16:03:44.95] c:\test>ThreadPoolExample -WORKERS=4 -ITEMS=20 -MAXWORK=1
        Thread: 1 started
        Thread: 2 started
        Thread: 3 started
        main Q'd workitem 1
        main Q'd workitem 2
        main Q'd workitem 3
        main Q'd workitem 4
        main Q'd workitem 5
        main Q'd workitem 6
        main Q'd workitem 7
        main Q'd workitem 8
        Thread: 2 processing workitem 0.170745849609375
        Thread: 1 processing workitem 0.852569580078125
        Thread: 1 processing workitem 0.3243408203125
        Thread: 1 processing workitem 0.4161376953125
        Thread: 1 processing workitem 0.034759521484375
        Thread: 1 processing workitem 0.488800048828125
        Thread: 4 started
        Thread: 3 processing workitem 0.927337646484375
        Thread: 2 processing workitem 0.23193359375
        main Q'd workitem 9
        main Q'd workitem 10
        main Q'd workitem 11
        main Q'd workitem 12
        main Q'd workitem 13
        main Q'd workitem 14
        main Q'd workitem 15
        main Q'd workitem 16
        main Q'd workitem 17
        Thread: 3 processing workitem 0.3211669921875
        Thread: 3 processing workitem 0.806793212890625
        Thread: 3 processing workitem 0.1046142578125
        Thread: 3 processing workitem 0.812042236328125
        Thread: 3 processing workitem 0.38275146484375
        Thread: 3 processing workitem 0.60479736328125
        Thread: 1 processing workitem 0.225616455078125
        Thread: 2 processing workitem 0.750457763671875
        Thread: 4 processing workitem 0.132110595703125
        main Q'd workitem 18
        main Q'd workitem 19
        main Q'd workitem 20
        Main: telling threads to die
        Main waiting for threads to die
        Thread: 2 processing workitem 0.96856689453125
        Thread: 2 ended
        Thread: 3 processing workitem 0.59661865234375
        Thread: 3 ended
        Thread: 1 processing workitem 0.765777587890625
        Thread: 1 ended
        Thread: 4 ended

        I think I have come up with an approach

        Okay, it looks like you have decided, and there's nothing left for me to try to help you with. Good luck.

        I'll finish by saying this though. I bet that if you'd give me the specs for your problem so that I could write the thread-pool version, it would be quicker and simpler to write, easier to maintain, and scale better than your event-driven approach.

        (That's a bold claim given I have no knowledge of what your code does. But hey, you gotta live a little :)



        Hey lbe,

        I'm the author of Parallel::Fork::BossWorkerAsync. I created it for my own use, and I use it daily, but mine is a limited, single use case. Your project sounds meaty enough to be revealing in terms of what the module does, what it maybe should do, etc. I'd be very grateful if you could shoot me an email at moosie@gmail.com at some point, describing your experience with the module.

        Cheers,

        -joe
