Threads and fork and CLONE, oh my!
by xdg (Monsignor) on Aug 12, 2005 at 01:24 UTC
So lately, I've had to learn a bit about threads in Perl. (Re: Outside-in objects...) In particular, I've learned that the "inside-out" object technique (cf. Yet Another Perl Object Model, Class::InsideOut, etc.) -- which typically uses stringified $self or else refaddr($self) as the key to storing object properties in a package-scoped lexical hash -- can be fatally flawed when used with threads. Because Perl ithreads clone all data into the new thread, the memory address of the blessed reference changes, while the property hash is still keyed by the old address. The object is thus dissociated from its stored values. Ugh.
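To make the failure concrete, here is a minimal sketch of my own (not code from the original samples). It assumes a perl built with ithreads; the names Demo::Person and %name_of are made up for illustration.

```perl
use strict;
use warnings;
use threads;
use Scalar::Util qw(refaddr);

# Inside-out style storage: a property hash keyed by the object's address.
my %name_of;

my $obj = bless \( my $anon ), 'Demo::Person';    # hypothetical class name
$name_of{ refaddr $obj } = 'Larry';

# In the parent thread the lookup works.
my $found_in_parent = exists $name_of{ refaddr $obj } ? 1 : 0;

# In a new thread, $obj and %name_of are both cloned. The clone of $obj
# lives at a new address, but the cloned hash still holds the old key.
my $thr = threads->create( sub {
    return exists $name_of{ refaddr $obj } ? 1 : 0;
} );
my $found_in_thread = $thr->join;

print "found in parent: $found_in_parent\n";
print "found in thread: $found_in_thread\n";
```

The parent should find its data and the thread should not, which is exactly the "objects have lost their data" symptom.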
After fooling around with ideas for using a UUID for each object that could be tracked across a thread boundary, I stumbled into rereading perlmod and its description of the CLONE method, which is called once per package right after a new thread is created (and from the context of the new thread). Using this method and a global registry of objects, I was able to migrate object data to be keyed off the new memory location in the new thread. While this doesn't allow sharing objects across threads, it at least preserves existing objects into newly created threads.
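The registry-plus-CLONE idea described above can be sketched roughly as follows. This is an illustrative reconstruction, not the code originally attached to this post: the class Demo::Counter, the hashes %count_of and %registry, and the use of weak references in the registry are all my assumptions. It needs a perl built with ithreads.

```perl
use strict;
use warnings;
use threads;

{
    package Demo::Counter;    # hypothetical class for illustration
    use Scalar::Util qw(refaddr weaken);

    my %count_of;    # inside-out property, keyed by refaddr
    my %registry;    # refaddr => weak reference to the live object

    sub new {
        my ( $class, $start ) = @_;
        my $self = bless \( my $anon ), $class;
        my $id   = refaddr $self;
        $count_of{$id} = $start;
        # Weak, so the registry does not keep the object alive.
        weaken( $registry{$id} = $self );
        return $self;
    }

    sub count { return $count_of{ refaddr $_[0] } }

    sub DESTROY {
        my $id = refaddr $_[0];
        delete $count_of{$id};
        delete $registry{$id};
    }

    # Perl calls this once per package, in the new thread, after cloning.
    # Keys are still the old addresses; values are the cloned objects.
    sub CLONE {
        for my $old_id ( keys %registry ) {
            my $object = $registry{$old_id};
            next unless defined $object;
            my $new_id = refaddr $object;
            # Move the property data and registry entry to the new key.
            $count_of{$new_id} = delete $count_of{$old_id};
            delete $registry{$old_id};
            weaken( $registry{$new_id} = $object );
        }
    }
}

my $counter = Demo::Counter->new(42);
my $thr     = threads->create( sub { return $counter->count } );
my $count_in_thread = $thr->join;
print "count seen in thread: $count_in_thread\n";
```

With more than one property hash, CLONE would need to loop over all of them for each registered object. Note that each thread ends up with its own independent copy of the data; nothing here is shared.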
Once I got that working, I began to wonder about fork-safety. In my next bit of exploration, I discovered that forking is platform-specific. (Hey, it was news to me, at least.) On a Unix-derived OS, fork is done using the system fork call, which creates a new process with memory allocated as "copy-on-write" (at least it works this way on Linux, with which I'm most familiar). (While I'm not deep on the internals of it, from what I understand, that means the child sees the same memory addresses as the parent, and the underlying memory is only physically copied once one of the processes writes to it -- experts please correct or expound if I'm off target.) That seems to work just fine for inside-out objects -- as the reference is preserved (and changing the reference is tantamount to changing the object, anyway).
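A quick way to check the claim about a real fork is to have the child report the address it sees. This is a sketch of my own, assuming a platform with a native fork; the pipe is just a convenient way to get the child's answer back to the parent.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

my $obj         = bless {}, 'Demo::Thing';    # hypothetical class name
my $parent_addr = refaddr $obj;

pipe( my $reader, my $writer ) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ( $pid == 0 ) {
    # Child: report the address of its copy of the object.
    close $reader;
    print {$writer} refaddr($obj), "\n";
    close $writer;
    exit 0;
}

# Parent: read the child's answer and compare.
close $writer;
chomp( my $child_addr = <$reader> );
close $reader;
waitpid( $pid, 0 );

print "parent sees: $parent_addr\n";
print "child sees:  $child_addr\n";
```

With a native fork the two numbers should match, so a refaddr-keyed property hash keeps working in the child. Under the Win32 fork emulation I would expect them to differ.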
On Win32, however, forking is faked using threads! (cf. perlfork) So fork-safety on Win32 means getting thread-safety as well. That makes thread-safety for inside-out objects rather important: unsuspecting users might fork their way into threads without even realizing it and find that all their objects have lost their data. (Unfortunately, this detail is completely glossed over in Conway's recent Perl Best Practices. For thread-safety of inside-out objects, he only mentions the need to declare the lexical hashes as shared and to ensure locking occurs on access.)
I've included below some code samples that show how to use a global registry of objects with CLONE, along with some test files that demonstrate how it works -- albeit only in a very simple case. I've tested it on WinXP (ActiveState) and Linux and it worked as expected. (Code is a bit pedantic for clarity.)
As expected, while the 02-fork.t tests pass on both Linux and Windows, on Windows we get the "# Notice: Cloning data..." warning, showing that the fork() is actually creating a new thread.
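For anyone who wants to see that difference in isolation, here is a small sketch (mine, not the 02-fork.t test itself) that counts CLONE calls as seen by a forked child. The package Demo::Notice and the counter variable are hypothetical; only the text of the warning is borrowed from the notice mentioned above.

```perl
use strict;
use warnings;

{
    package Demo::Notice;    # hypothetical package for illustration
    our $clone_calls = 0;

    # Only invoked when perl clones the interpreter for a new thread.
    sub CLONE {
        $clone_calls++;
        warn "# Notice: Cloning data...\n";
    }
}

pipe( my $reader, my $writer ) or die "pipe failed: $!";

my $pid = fork();
die "fork failed: $!" unless defined $pid;

if ( $pid == 0 ) {
    # Child: report how many times CLONE ran on the way here.
    close $reader;
    print {$writer} "$Demo::Notice::clone_calls\n";
    close $writer;
    exit 0;
}

close $writer;
chomp( my $calls_seen_by_child = <$reader> );
close $reader;
waitpid( $pid, 0 );

print "CLONE calls seen by child: $calls_seen_by_child\n";
```

With a native fork the child should report zero calls and no warning appears. Where fork is emulated with threads, the warning should show up and the count should be non-zero.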
I still think that the alternative of storing a UUID within a blessed scalar would be a reasonable approach, and might even facilitate sharing inside-out objects across threads (again, storing them in a registry and locking the UUID within the registry to control access). However, one of the nice features of inside-out objects keyed off of a memory address is that it's possible to transparently subclass other objects that use traditional blessed data structures to store their data. That capability would be lost using a blessed scalar to store a UUID. I'm not sure whether sharing objects across threads or flexible subclassing is the more important feature.
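For comparison, the blessed-scalar idea might look something like the sketch below. It is only an illustration under assumptions of mine: the class Demo::Tagged is made up, and _new_id is a crude stand-in built from the process ID, the time and a serial number rather than a real UUID generator. It needs a perl built with ithreads, and it does not attempt any cross-thread sharing or locking.

```perl
use strict;
use warnings;
use threads;

{
    package Demo::Tagged;    # hypothetical class for illustration

    my %name_of;             # property hash keyed by the object's ID string
    my $next_serial = 0;

    # Stand-in for a real UUID generator (an assumption, not a UUID).
    sub _new_id { return join '-', $$, time, ++$next_serial }

    sub new {
        my ( $class, $name ) = @_;
        my $id   = _new_id();
        my $self = bless \$id, $class;    # the object *contains* its key
        $name_of{$id} = $name;
        return $self;
    }

    sub name { return $name_of{ ${ $_[0] } } }

    sub DESTROY { delete $name_of{ ${ $_[0] } } }
}

my $obj = Demo::Tagged->new('Larry');

# No CLONE method needed: the ID string is cloned along with the object,
# so it still matches the key in the cloned property hash.
my $thr = threads->create( sub { return $obj->name } );
my $name_in_thread = $thr->join;

print "name seen in thread: $name_in_thread\n";
```

The cost is visible in new(): the object has to be a scalar holding the ID, so it can no longer be layered on top of someone else's blessed hash.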
Fellow monks, as I'm only starting down this path of inside-out objects and threads and forking, I'd appreciate your perspectives on this problem, the solution I've laid out above and, in particular, any other details that should be considered as this scales up beyond a simple test case.
Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.