Dear Perl monks,

I have sped up a time-consuming task through concurrent execution of several instances of the same code ("crew" thread model). It is quite effective: the time to complete the task drops from 2 hours 33 minutes to 56 minutes.

The question is: How can I protect against ALL threads erroring out?

Basically my code queues the jobs to be done with throttling control through a semaphore to prevent memory flooding with pending jobs:

    sub queueProcessRequest {
        my ($job) = @_;
        $throttle->down();
        $dispatcher->enqueue($job);
        return undef
    }

A worker thread increments the semaphore as soon as it removes a job from the queue.
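For readers who want to see both sides of the throttle, here is a minimal, self-contained sketch of the scheme described above. It is not the real program: the worker count, the queue capacity, the `$done` result queue and the "double the job" processing are all stand-ins invented for the illustration, and workers are stopped with one `undef` sentinel each.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use Thread::Semaphore;

my $queuelen   = 4;                                  # assumed queue capacity
my $dispatcher = Thread::Queue->new();               # pending jobs
my $throttle   = Thread::Semaphore->new($queuelen);  # counts free slots in the queue
my $done       = Thread::Queue->new();               # hypothetical: collects results

sub worker {
    # dequeue() blocks until a job arrives; an undef sentinel ends the loop
    while (defined(my $job = $dispatcher->dequeue())) {
        $throttle->up();             # free a slot as soon as the job leaves the queue
        $done->enqueue($job * 2);    # stand-in for the real processing
    }
}

sub queueProcessRequest {
    my ($job) = @_;
    $throttle->down();               # blocks while $queuelen jobs are pending
    $dispatcher->enqueue($job);
    return;
}

my @thread = map { threads->create(\&worker) } 1 .. 3;
queueProcessRequest($_) for 1 .. 20;
$dispatcher->enqueue(undef) for @thread;   # one stop sentinel per worker
$_->join() for @thread;

my @results = sort { $a <=> $b } map { $done->dequeue_nb() } 1 .. 20;
print scalar(@results), " jobs processed\n";
```

Note that `$throttle->down()` has no way of returning if no worker ever calls `up()` again, which is exactly the failure mode this question is about.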

As long as at least one worker thread remains alive, the queue can be emptied and the main thread does not stay blocked on the semaphore. Once it has finished queuing the jobs, it can check the thread states and wait for termination:

    sub syncIdle {
        # Check if any thread errored out
        my $abort = 0;
        for my $i (0..$#thread) {
            if ( !$thread[$i]->is_running() || $thread[$i]->is_joinable() ) {
                lock($screenaccess);
                print (STDERR 'ERROR: thread #'
                    , 1+$i
                    , ' encountered a problem while processing file'
                    , "\n"
                    , $log[1+$i]
                    , "\n"
                    , 'Check the cause and eventually report a bug.'
                    , "\n"
                    );
                $abort ||= 1;
            }
        }
        if ($abort) {
            endThreadedOperation();
            print (STDERR "${VTred}Flushing and aborting now ...${VTnorm}\n");
            print (STDERR 'The error message may have scrolled out due to asynchronous operation. Check.', "\n");
            exit(1);
        }
        while ($busy || 0 < $dispatcher->pending()) {
            threads->yield();
            # sleep(1);     # Retry later
        }
    }

However, if ALL worker threads have errored out (if a bug is present, it is likely to hit all threads since they share the same code base), the job queue eventually fills up, the main thread blocks on the semaphore and never gets the opportunity to call syncIdle().

I tried putting some threads->yield() calls in suitable places, but on my Fedora Linux yield() is just a no-op (as the manual warns).

I then modified the queueing sub as follows:

    sub queueProcessRequest {
        my ($job) = @_;
        # If the queue fills up, it may be caused by threads killed
        # by an error. In this case, we'll be blocked forever below.
        # Then let's have a look on the threads.
        if ($queuelen <= $dispatcher->pending()) {
            # threads->yield();
            sleep(1);       # Give a chance
            foreach my $t (@thread) {
                if ( !$t->is_running() || $t->is_joinable() ) {
                    syncIdle();     # Diagnose and abort
                }
            }
        }
        $throttle->down();
        $dispatcher->enqueue($job);
        return undef
    }

This works as intended, BUT ...

It looks like the queue gets filled first. I then see a pause (sleep(1) above) and worker threads get scheduled to do their job. The cycle restarts.

Note: debugging code is omitted from the snippets.

My analysis is that thread switching occurs only during the sleep() call.

I can't afford to leave such sleep(1) calls in the code, since they would add up to about 5000 seconds (nearly an hour and a half, compared to the 56-minute run) wasted waiting for scheduling.

I replaced sleep() with usleep() but the delay must also be long enough to give switching a chance (on the order of 15 ms). Unfortunately, the required minimum delay seems to depend on the number of worker threads, the global machine load, ... and is affected by some sort of jitter.

What I need is a way to force thread switching without causing delay so that my job queue can be emptied by the worker threads.

How can I do it?

My design may also be wrong. Is there an alternative approach?
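One direction that might avoid the sleep altogether, offered as a minimal sketch and not a tested fix for the real program: instead of blocking indefinitely in `down()`, wait on the semaphore with a timeout and check the workers only when the wait expires. This assumes a Thread::Semaphore recent enough to provide `down_timed()` (version 2.11 or later, as far as I know). The failing worker, the 1-second timeout and the 0/1 return value are invented for the demonstration.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;
use Thread::Semaphore;

my $queuelen   = 2;                                  # assumed queue capacity
my $dispatcher = Thread::Queue->new();
my $throttle   = Thread::Semaphore->new($queuelen);

# Hypothetical worker that dies on its first job, simulating a bug
# shared by every worker thread.
sub worker {
    while (defined(my $job = $dispatcher->dequeue())) {
        $throttle->up();
        die "simulated failure on job $job\n";
    }
}

my @thread = map { threads->create(\&worker) } 1 .. 2;

# Returns 1 if the job was queued, 0 if no worker is left to process it.
sub queueProcessRequest {
    my ($job) = @_;
    # down_timed() returns false when the timeout expires without a free slot;
    # only then do we pay for a health check of the workers.
    until ($throttle->down_timed(1)) {
        return 0 unless grep { $_->is_running() } @thread;
    }
    $dispatcher->enqueue($job);
    return 1;
}

my $accepted = 0;
for my $job (1 .. 10) {
    last unless queueProcessRequest($job);
    $accepted++;
}
$_->join() for @thread;    # reap the dead workers before exiting
print "queued $accepted jobs before noticing that all workers had died\n";
```

In the normal case a slot is free, or becomes free quickly, so `down_timed()` returns at once and no delay is added per job. The timeout only matters when the queue stays full, which is when a look at the threads is worthwhile anyway. In the real code, the `return 0` branch would be the place to call syncIdle().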

In reply to How can I force thread switching? by ajl52
