rmahin has asked for the wisdom of the Perl Monks concerning the following question:
Hey monks,
I have a large multithreaded program and have run into an issue where I see this message:
Thread 1 terminated abnormally: panic: COND_SIGNAL (6) at D:/Perl64/lib/Thread/Queue.pm line 31.
I am running on Windows Server 2003 R2 x64 SP1 with 4 GB RAM. This is perl 5, version 16, subversion 1 (v5.16.1) built for MSWin32-x64-multi-thread.
I have been unable to reliably reproduce this issue, but the problem seems to be in the enqueue() subroutine of the Thread::Queue module. Perhaps the error code is enough to identify the source of the problem, but I am not sure how to interpret it.
I realize that is getting printed from:
#define COND_SIGNAL(c) \
    STMT_START { \
        if ((c)->waiters > 0 && \
            ReleaseSemaphore((c)->sem,1,NULL) == 0) \
            croak("panic: COND_SIGNAL (%ld)",GetLastError()); \
    } STMT_END
But I can't find where that is being called.
Aside from that, I'm not really sure of the best approach for asking for help. The gist of the program is as follows:
1. The main script first creates a JobQueue and starts a thread using its manageQueue() subroutine.
my $jobQueue = IBM::CLIFARM::SERVER::UTIL::JobQueue->new(DBfile => $dbFile);
threads->create('IBM::CLIFARM::SERVER::UTIL::JobQueue::manageQueue');
2. The script then creates a pool of threads, using another subroutine, through which clients connect and issue commands.
3. When a client issues a command, a JobNode is created, which basically just contains information about the job. We then call the subroutine enqueueJob, passing the JobNode; it blocks, using the dequeue subroutine of Thread::Queue, until it is that job's turn to execute. In other words, the JobNode waits until the JobQueue enqueues a job number into the JobNode's own Thread::Queue (if that makes any sense). It was implemented that way because it seemed to be the simplest way to pass information between the different threads.
my $jobNode = IBM::CLIFARM::SERVER::UTIL::JobNode->new({process => $process, jobType => $recover_type, userName => $username, options => $options});
$jobNode->setPossibleResources(resourceGiven => $resources);
IBM::CLIFARM::SERVER::UTIL::JobQueue::enqueueJob({jobNode => $jobNode});
$logger->debug("Created new job node for $recover_type command with job number $jobNode->{JOBNUMBER}");
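For readers without access to the full modules, the handshake described in steps 1 to 3 can be sketched roughly as follows. This is only a minimal, self-contained illustration, not the real JobNode/JobQueue code: the names manage_queue, enqueue_job, $pending and $reply_queue are hypothetical, and it assumes a Perl (5.10 or later) where a Thread::Queue object can itself be passed through another Thread::Queue.

```perl
use strict;
use warnings;
use threads;
use Thread::Queue;

# Hypothetical stand-in for the JobQueue: waiting jobs are announced here.
my $pending = Thread::Queue->new();

# Manager thread (the role manageQueue() plays): assign job numbers in
# arrival order and wake each waiting client through its private queue.
sub manage_queue {
    my $next_job_number = 1;
    while (defined(my $reply_queue = $pending->dequeue())) {
        $reply_queue->enqueue($next_job_number++);
    }
    return $next_job_number - 1;    # how many jobs were released
}

# Client side (the role enqueueJob() plays): announce the job, then block
# until the manager enqueues a job number into this job's own queue.
sub enqueue_job {
    my $reply_queue = Thread::Queue->new();
    $pending->enqueue($reply_queue);
    return $reply_queue->dequeue();
}

my $manager = threads->create(\&manage_queue);
my @clients = map { threads->create(\&enqueue_job) } 1 .. 3;

# Each client thread returns the job number it was handed.
my @job_numbers = sort { $a <=> $b } map { $_->join() } @clients;

$pending->enqueue(undef);           # sentinel: tell the manager to stop
my $released = $manager->join();

print "job numbers: @job_numbers (released $released)\n";
```

In this sketch each waiting job owns a short-lived queue that the manager signals exactly once, which mirrors the JOBNUMBER_QUEUE idea above; the undef sentinel is just one simple way to shut the manager down and is not taken from the original program.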
If you need me to elaborate more on that, let me know. Here are links for the JobNode and JobQueue; they seemed a little long to post in their entirety here. I tried to strip out most of the stuff that is irrelevant to this problem. JobNode.pm JobQueue.pm
This error message popped up when the JobQueue tried to enqueue a jobNumber to the JobNode's JOBNUMBER_QUEUE, in the setJobNumber subroutine.
This ran fine for roughly 24 hours, and then all of a sudden the JobQueue thread died with the message above. I've tried to give as much information as I can, as I am at a complete loss as to where to start debugging this, especially since I'm having trouble recreating it. If you need any additional information, or have any suggestions at all, please let me know.
Thanks a bunch!
Replies are listed 'Best First'.
Re: Thread terminating abnormally COND_SIGNAL(6)
by BrowserUk (Patriarch) on Jul 15, 2013 at 23:23 UTC
by rmahin (Scribe) on Jul 15, 2013 at 23:34 UTC
by BrowserUk (Patriarch) on Jul 16, 2013 at 00:32 UTC
by rmahin (Scribe) on Jul 16, 2013 at 01:37 UTC
by rmahin (Scribe) on Jul 16, 2013 at 22:56 UTC
by BrowserUk (Patriarch) on Jul 17, 2013 at 01:53 UTC
by BrowserUk (Patriarch) on Jul 17, 2013 at 02:27 UTC