PerlMonks  

Re: panic: COND_DESTROY(6)

by sundialsvc4 (Monsignor)
on Jan 31, 2012 at 15:02 UTC (#950996)


in reply to panic: COND_DESTROY(6)

Addressing the original poster's question, and speaking to the original poster rather than to the esteemed monk BrowserUK: I suggest that the various (useful...) responses to this thread have turned up enough evidence to point the way toward sleuthing out the deficiency in the design of the original program. There is a timing hole somewhere in the application, and the objective is to solve the damn thing and make the application work properly. (The Perl Gods can do their magic on their own time; meanwhile, one must work with what one has.)

In my experience, the most probable cause of such a “once in a million” issue is the order in which locks and condition variables are acquired and released. In much the same way that the Linux kernel stipulates that you must obtain this lock before you may obtain that one, the mutual-exclusion controls within the application should be arranged in a definite hierarchy. Each of the alternative paths through the application that employs mutual exclusion must be examined by hand against that hierarchy. Furthermore, if you find yourself releasing one lock and grabbing another in the very next statements, that practice should probably be avoided: devise some exclusion control that covers both.
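The lock hierarchy described above can be sketched roughly as follows. This is only an illustration, not the original poster's code: `%config`, `@queue`, `enqueue_within_limit` and `grow_if_full` are hypothetical names, and the one rule the sketch follows is that every code path locks `%config` before `@queue`, never the other way round.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Two shared resources (hypothetical names, chosen for this sketch only).
my %config :shared = (limit => 10);
my @queue  :shared;

# Program-wide rule: lock %config first, then @queue. Never the reverse.
sub enqueue_within_limit {
    my ($item) = @_;
    lock(%config);    # rank 1 in the hierarchy
    lock(@queue);     # rank 2 in the hierarchy
    return 0 if @queue >= $config{limit};
    push @queue, $item;
    return 1;
}

sub grow_if_full {
    my ($extra) = @_;
    lock(%config);    # same order here, although @queue is only read
    lock(@queue);
    $config{limit} += $extra if @queue >= $config{limit};
    return $config{limit};
}

# Four workers exercise both code paths concurrently.
my @workers = map {
    my $id = $_;
    threads->create(sub {
        enqueue_within_limit("job-$id-$_") for 1 .. 5;
        grow_if_full(2);
    });
} 1 .. 4;
$_->join for @workers;

{
    lock(%config);
    lock(@queue);
    printf "queued %d items, limit %d\n", scalar(@queue), $config{limit};
}
```

Because both subs hold both locks for their whole body, there is no window in which one lock has been released and the other not yet taken. The exact counts printed depend on how the threads interleave.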

Mutual exclusion mechanisms serve two distinct and useful purposes: not only do they regulate simultaneous access to a single atomic resource, but they also, and more usefully, can be employed to compel code that needs to shuffle between several resources to do that “shuffling” only in certain selected code-paths, and therefore only in a known-in-advance timing sequence. If x code-paths are manipulating y resources, then you can wind up with x^y possible combinations between them, and that’s just too many possibilities to manage. Select a handful of reasonable sequences for the work, and oblige the threads that are doing it to grab some kind of semaphore to serialize their passage through it.
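A minimal sketch of that kind of serialization, assuming the core Thread::Semaphore module is acceptable for the job. The `%stock` hash, the `$gate` semaphore and `move_stock` are hypothetical names invented for the illustration; the point is only that the one code path touching both resources is passed through by one thread at a time.

```perl
use strict;
use warnings;
use threads;
use threads::shared;
use Thread::Semaphore;

# Two resources that are only ever changed together (hypothetical example).
my %stock :shared = (warehouse => 100, shop => 0);

# A semaphore with an initial count of 1: one thread in the section at a time.
my $gate = Thread::Semaphore->new(1);

sub move_stock {
    my ($from, $to, $n) = @_;
    $gate->down;                      # enter the serialized section
    my $ok = $stock{$from} >= $n;
    if ($ok) {
        $stock{$from} -= $n;          # both updates happen inside the
        $stock{$to}   += $n;          # section, so no other thread sees
    }                                 # a half-finished move
    $gate->up;                        # leave the section
    return $ok;
}

# Odd workers move stock out to the shop, even workers move it back.
my @workers = map {
    my ($from, $to) = $_ % 2 ? ('warehouse', 'shop') : ('shop', 'warehouse');
    threads->create(sub { move_stock($from, $to, 1) for 1 .. 25 });
} 1 .. 4;
$_->join for @workers;

# However the threads interleave, the total stays at 100.
print "total: ", $stock{warehouse} + $stock{shop}, "\n";
```

Note that nothing between `down` and `up` can die in this sketch. Real code that might throw inside the section would need to make sure `up` is still called.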

Even if the mutual-exclusion tools are “buried” within nice, safe, well-tested (as they certainly are...) perlguts, the essential principle remains: you have an application to write. Even if some bizarre, not-yet-found bug still exists in those “guts,” you have to devise this application so as to stay well clear of it. HTH.


Re^2: panic: COND_DESTROY(6)
by menth0l (Monk) on Feb 01, 2012 at 09:02 UTC
    Maybe you're right. I added tons of logging right next to each lock occurrence to see at which point my app fails. I found that each time it was crashing near a call to this function:
    sub UnshareHash {
        my $reference = shift;
        lock $reference if is_shared($reference);
        given (ref $reference) {
            when ('HASH')  { return { map UnshareHash($_), %{$reference} } }
            when ('ARRAY') { return [ map UnshareHash($_), @{$reference} ] }
            when ('REF')   { return \UnshareHash($$reference) }
            default        { return $reference }
        }
    }
    I have a configuration object shared between threads which sometimes need to clone/unshare some part of it using the function above. I've changed the function to lock only the top-level structure:
    sub UnshareHash {
        my $reference = shift;
        my $deep = shift;
        lock $reference if is_shared($reference) and not $deep;
        given (ref $reference) {
            when ('HASH')  { return { map UnshareHash($_, 1), %{$reference} } }
            when ('ARRAY') { return [ map UnshareHash($_, 1), @{$reference} ] }
            when ('REF')   { return \UnshareHash($$reference, 1) }
            default        { return $reference }
        }
    }
    For now it looks promising: my app has been running for about 40 hours straight. Before the change, the crash happened after a few hours at most, sometimes after a few minutes. But that may be just a coincidence; I'll have to wait some more time.
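For readers who want to try the "lock only the top level" idea in isolation, here is a rough sketch of the same approach split into two subs. It is not the code from this thread: `copy_plain` and `unshare_locked` are hypothetical names, plain `ref` comparisons stand in for the experimental `given`/`when`, and SCALAR references are copied as well as REF ones. Since locks in threads::shared are advisory, this only protects the copy if every writer takes the same top-level lock.

```perl
use strict;
use warnings;
use threads;
use threads::shared;

# Recursively copy a structure into plain, unshared data. No locking in here.
sub copy_plain {
    my ($node) = @_;
    my $type = ref $node;
    if ($type eq 'HASH') {
        my %copy = map { $_ => copy_plain($node->{$_}) } keys %$node;
        return \%copy;
    }
    if ($type eq 'ARRAY') {
        my @copy = map { copy_plain($_) } @$node;
        return \@copy;
    }
    if ($type eq 'REF' || $type eq 'SCALAR') {
        my $inner = copy_plain($$node);
        return \$inner;
    }
    return $node;    # plain value (or a reference type this sketch ignores)
}

# Take one lock on the top-level structure, then copy everything below it.
sub unshare_locked {
    my ($shared) = @_;
    lock($shared) if is_shared($shared);    # held until this sub returns
    return copy_plain($shared);
}

# Usage: build a shared structure, then take a private copy of it.
my $config = shared_clone({
    name  => 'app',
    hosts => [ 'a', 'b' ],
    opts  => { retries => 3 },
});
my $copy = unshare_locked($config);
print is_shared($copy) ? "still shared\n" : "plain copy\n";
```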

    But if it turns out to be true (i.e. UnshareHash() is the culprit), then I assume that recursive locking is the problem? That would be a bug in threads::shared, wouldn't it?
