in reply to Re^2: Thread terminating abnormally COND_SIGNAL(6)
in thread Thread terminating abnormally COND_SIGNAL(6)
Are there any typical cases that could cause the semaphore to become invalid? Like too many Thread::Queue's?
I'm not sure. But, internally, the Windows system call WaitForMultipleObjects is used in various places, and this has a limit. Historically that was 64 handles, though it might have changed in later versions. Note: You can have (many) more waitable handles; you just can't wait on more than 64 of them at a time without using additional techniques. NOTE: This was just a guess on my part, having looked at your linked code.
I create 2 Thread::Queue's per JobNode, and in my testing I was creating roughly 10 JobNodes every 2 minutes. So that would be about 300 every hour? These queues though are basically used only once, in that after the JobQueue enqueues something, the JobNode no longer cares about that Thread::Queue. Is there something I should be doing to actually clean these up somehow?
Firstly, using a Queue to pass a single value is a nonsense.
- Do you not know you can pass arguments to a thread when you create it?
    my $thread = async( sub{ print "@_"; }, 123, 'fred', {'a'..'f'}, [ 0..9 ] )->join;
    123 fred HASH(0x3ea93f8) ARRAY(0x3ea9470)
- Have you heard of threads::shared?
If you need to pass a single value to a thread after it has been created rather than when you create it, then use a shared scalar:
    my $jobNo :shared = 0;
    ...
    sub job {
        sleep 1 until do{ lock $jobNo; $jobNo; };
        ## now use $jobNo.
        ...
    }
    ...
    my $thread = threads->create( \&job );
    ...
    { lock $jobNo; $jobNo = getJobNo(); }

Of course, you'll want a different job number for each thread, so use a shared hash:
    my %jobNos :shared;
    ...
    sub job {
        my $tid = threads->tid;
        my $jobNo;
        sleep 1 until do{ lock %jobNos; $jobNo = $jobNos{ $tid } };
        ## now use $jobNo.
        ...
    }

    my $thread = threads->create( \&job );
    ... ## some time later
    { lock %jobNos; $jobNos{ $thread->tid } = getJobNo() };
    ...

There are other (some would say better) ways of waiting for a shared variable than busy-looping over sleep -- cond_vars -- but this is easy to write, explain and -- most importantly -- debug.
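For comparison, here is a minimal sketch of the cond_var alternative, using cond_wait and cond_broadcast from threads::shared. The getJobNo() below is a hypothetical stand-in that just returns a fixed number; everything else follows the shared-hash version above. Treat it as an illustration of the technique rather than a drop-in replacement.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

my %jobNos :shared;

# Hypothetical stand-in for whatever really hands out job numbers.
sub getJobNo { return 42 }

sub job {
    my $tid = threads->tid;
    my $jobNo;
    {
        lock %jobNos;
        # cond_wait releases the lock while blocked and re-acquires it on wake-up.
        # The condition is re-checked each time, since a wake-up may be spurious
        # or intended for a different thread.
        cond_wait( %jobNos ) until defined $jobNos{ $tid };
        $jobNo = $jobNos{ $tid };
    }
    return "thread $tid got job $jobNo";
}

my $thread = threads->create( \&job );

{
    lock %jobNos;
    $jobNos{ $thread->tid } = getJobNo();
    cond_broadcast( %jobNos );    # wake all waiters; each re-checks its own slot
}

my $result = $thread->join;
print "$result\n";
```

The thread no longer wakes once a second to poll; it sleeps until someone signals the hash. The price is that a missed signal or a forgotten lock is harder to track down than a sleep loop.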
Secondly, yes, you should be cleaning up those queues. Each queue encapsulates various system resources -- including those semaphores -- and they are a finite resource. 300/hour for 24 hours means 7,200 semaphores. I can't tell without deep inspection of your code, but you could simply be running out of resources. I'd expect you to get a different error message from the one you have -- something like "Insufficient system resources exist to complete the requested service" -- at the point where the queue (or a resource it uses) is being created, but it is possible that an error return is not being checked at that point.
Remember also that for a Queue to be cleaned up, *all references* at both ends will need to be freed completely before the reference count will drop to 0 and it will get recycled.
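One way to sidestep the build-up entirely, assuming your jobs can share a queue (I can't tell from the linked code whether they can), is to create a queue once and reuse it for every job, rather than creating a fresh one per JobNode. A minimal sketch follows; the worker, the job numbers and the "done" strings are all made up for illustration.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Two queues, created once, carry every job and every result.
my $jobs    = Thread::Queue->new;
my $results = Thread::Queue->new;

sub worker {
    # dequeue blocks until an item arrives; undef is our "no more work" marker.
    while ( defined( my $jobNo = $jobs->dequeue ) ) {
        $results->enqueue( "done $jobNo" );
    }
    return;
}

my $thread = threads->create( \&worker );

$jobs->enqueue( $_ ) for 1 .. 5;    # hypothetical job numbers
$jobs->enqueue( undef );            # tell the worker to finish
$thread->join;

# Drain the results without blocking; dequeue_nb returns undef when empty.
my @done;
while ( defined( my $r = $results->dequeue_nb ) ) {
    push @done, $r;
}
print "$_\n" for @done;
```

However many jobs pass through, the number of queues (and of whatever system resources sit underneath them) stays fixed at two.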
This could be a bug in threads or Thread::Queue, or perl's internals, but having glanced briefly at your linked code, I suspect that it is much more likely that the problem is sourced in the way you are abusing those modules.
In essence, I think you are constructing a very complicated system around the use of threads and queues, but you do not really know enough about those modules to be doing so. I'd strongly advise that you create a few simple, stand-alone programs and play with threads, Thread::Queue (and threads::shared), and acclimatise yourself to them before using them within what appears to be a very complex library module -- presumably intended to be used by others.
I hope that does not sound patronising -- it certainly isn't intended to. I just know from deep experience that Perl's threading is quite different to other forms of threading and it takes everyone coming to them -- regardless of their threading background in other languages -- a while to become familiar with their particular strengths and weaknesses.
Often at this point, I offer to review the threaded code (here or via email), but given the presence of "IBM::CLIFARM::SERVER" in the title of your module, I doubt there would be any point. I don't have a server farm lying around -- IBM or otherwise :) And from looking at the bits you linked, this isn't something that could be debugged 'by inspection' (without running it).
To reiterate:
- I strongly suspect you are running your system out of some critical resource.
You might be able to verify this using the System Information panel of ProcessExplorer.exe and checking the "Totals->Handles" count whilst your code is running. If it keeps rising and rising -- and drops back significantly when you kill your process ....
- I think -- on the basis of the little I've seen -- that your code will need a substantial re-work to make it viable.
Your comments already show you are uncomfortable with using queues the way you are.
Doing so is simply wrong, and almost certainly completely unnecessary.
You just need to become familiar with the facilities and techniques available to you, at which point you'll see a better way to tackle the problem.
- There is not much I can do to help you with a project of this size and complexity.
I strongly advise that you try out the main components of your project in small, stand-alone throw-aways until you are convinced they work the way you want them to.
These will not only let you become familiar with the way Perl's threading works; but when you get problems, you have a ready-made test case you can post here for us to help you with.
I write all my projects this way -- small stand-alones to iron out the details of the algorithms (and my understanding) -- and then I design the main project in the light and knowledge of what I've learned. I strongly advocate the method to you (and anyone listening).