Re^2: Writing Solid CPAN Modules
by eyepopslikeamosquito (Chancellor) on Jan 03, 2005 at 21:52 UTC
As should be obvious from his response, the BrowserUk is much more experienced with Perl threads than I am, and I thank him for his prompt and detailed corrections.
The only place where I disagree with him is described below.
AFAICT, it's still not fixed. As described here, Dave Mitchell proposed a patch which seems to have been rejected on performance grounds. My original sort test program, used to demonstrate the bug, still crashes with perl 5.8.6.
After I found this bug, I searched the whole Perl core source tree for use of this rare sort construct and the only place I found that used it was Test::More. Hence the simple fix to work around it. BTW, I found this bug when using LWP (which has also been patched in libwww-perl-5.801).
I should also consider the possibility that the resolution to your bug was a piece of "Don't do that!" advice.
It was. The crashing perl core construct is rarely used and easily worked around. IMHO, it's unacceptable for heavily used modules, like LWP and Test::More, to crash when run on earlier versions of Perl. Perl module authors working around core bugs is akin to application developers working around OS bugs to ensure their application does not crash (which most customers prefer to being told to upgrade their OS).
Update: To clarify the sort bug in question, if you write, for example:
your code is not currently thread-safe, yet if you write instead:
your code is thread-safe all the way back to Perl 5.8.0. It's the lightweight calling mechanism employed with sort subs that's at fault. You might say it's ridiculous that the Perl programmer needs to worry about such things, yet if you want your module to be thread-safe all the way back to perl 5.8.0 you must avoid this construct. Hopefully, this bug will be fixed soon. As for the root cause, I quote Dave Mitchell:
This is deeply, deeply, not thread-safe. It's supposed to turn the leavesub op temporarily into a null op: the sort sub is invoked via a lightweight mechanism that doesn't require a leavesub at the end; but since the op tree is shared between threads, the other thread may have already done the same and then restored it, leading to leavesub being erroneously called, and general corruption ensuing.
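For readers who want to see the two forms side by side, here is a minimal sketch. It is my reading of Dave's description, not necessarily the exact snippets referred to above: I am assuming the construct in question is a named sub passed directly to sort (sort SUBNAME LIST), and that the workaround is to use a sort block instead. The name by_length and the sample data are hypothetical, made up purely for illustration. The script is single-threaded, so it only shows that the forms sort identically; it does not reproduce the crash.

```perl
use strict;
use warnings;

my @words = qw(ccc a bb);

# Hypothetical comparison sub, named only for this sketch.
sub by_length { length($a) <=> length($b) }

# Form 1 (assumed to be the problem construct): a named sub handed
# straight to sort, which invokes it via the lightweight mechanism.
my @by_name = sort by_length @words;

# Form 2 (assumed workaround): an inline sort block.
my @by_block = sort { length($a) <=> length($b) } @words;

# Form 3 (assumed workaround): a block that makes an ordinary call
# to the sub; $a and $b are package globals, so the sub still sees them.
my @by_wrapper = sort { by_length() } @words;

print "@by_name\n";
print "@by_block\n";
print "@by_wrapper\n";
```

If you already have a comparison sub that is used in many places, the third form is the smallest change, at the cost of a full sub call per comparison.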
Update: I'm delighted to report that the closures crash described in Replacing closures (to work around threads crash) is now fixed in Perl 5.8.6 (see change 23499 by Steve Hay for more detail). (Further update: sorry, it seems this bug was fully fixed in 5.8.7, not 5.8.6.) While I'm happy to work around the rarely used sort construct described above, I am definitely not keen to work around the much more widely used closures. So, if you have a multiprocessor machine and are using threads and closures, you had better upgrade to perl 5.8.7. BTW, though I'm far behind the BrowserUk in general ithreads knowledge, I'm rapidly becoming an expert at debugging Perl threads crashes at the C level. ;-) Which is not all bad, no, really it's not: without that I would never have had a reason to learn about perl internals.