 
PerlMonks  


So lately, I've had to learn a bit about threads in Perl. (Re: Outside-in objects...) In particular, I've learned that the "inside-out" object technique (cf. Yet Another Perl Object Model, Class::InsideOut, etc.) -- which typically uses the stringified $self or refaddr($self) as the key for storing object properties in a package-scoped lexical hash -- can be fatally flawed when used with threads. Because Perl ithreads clone all data into the new thread, the memory address of the blessed reference changes, dissociating it from the values stored in the property hash. Ugh.
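To make the failure concrete, here is a minimal sketch (not taken from any of the modules mentioned above) that stores one property under refaddr and then looks it up from a new thread. The class name Demo and the helper inspect_in_thread are made up for illustration, and threads is loaded at run time only so the script still runs on a perl built without ithreads.

```perl
use strict;
use warnings;
use Config;
use Scalar::Util qw( refaddr );

# Load threads at run time so the script still compiles on a perl built
# without ithreads (a real program would simply say "use threads;").
my $HAVE_THREADS = $Config{useithreads} && eval { require threads; 1 };

my %NAME;                              # inside-out storage keyed by refaddr
my $obj = bless {}, 'Demo';            # 'Demo' is a hypothetical class
$NAME{ refaddr($obj) } = 'Charlie';

# Run a new thread and report what it sees: [ address, data found? ].
# Returns undef when ithreads are unavailable.
sub inspect_in_thread {
    return unless $HAVE_THREADS;
    my $thr = threads->create( sub {
        my $addr = refaddr($obj);      # address of the *cloned* object
        return [ $addr, exists $NAME{$addr} ? 1 : 0 ];
    } );
    return $thr->join;
}

my $seen = inspect_in_thread();
if ($seen) {
    printf "parent address: %d\n", refaddr($obj);
    printf "thread address: %d (data %s)\n",
        $seen->[0], $seen->[1] ? 'found' : 'lost';
}
else {
    print "this perl was built without ithreads\n";
}
```

The parent object and its clone live in the same process at the same time, so the two addresses cannot match, and the cloned %NAME still holds only the old key.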

After fooling around with ideas for using a UUID for each object that could be tracked across a thread boundary, I stumbled into rereading perlmod and its description of the CLONE method, which is called once per package right after a new thread is created (and from the context of the new thread). Using this method and a global registry of objects, I was able to migrate object data to be keyed off the new memory location in the new thread. While this doesn't allow sharing objects across threads, it at least preserves existing objects into newly created threads.

Once I got that working, I began to wonder about fork-safety. In my next bit of exploration, I discovered that forking is platform-specific. (Hey, it was news to me, at least.) On a Unix-derived OS, fork is done using the system fork call, which creates a new process whose memory is allocated "copy-on-write" (at least it works this way on Linux, with which I'm most familiar). (I'm not deep on the internals, but as I understand it, parent and child share the same physical memory until one of them writes to it, and either way the child sees its variables at the same addresses as before -- experts, please correct or expound if I'm off target.) That seems to work just fine for inside-out objects, as the reference is preserved (and changing the reference is tantamount to changing the object, anyway).
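A quick way to check this on a Unix-like system is to fork and have the child report the address it sees. This is only a sketch: the class name Demo and the helper lookup_in_child are invented for the example, and on Win32 the emulated fork would behave like the threaded case instead.

```perl
use strict;
use warnings;
use POSIX qw( _exit );
use Scalar::Util qw( refaddr );

my %NAME;                              # inside-out storage keyed by refaddr
my $obj = bless {}, 'Demo';            # 'Demo' is a hypothetical class
$NAME{ refaddr($obj) } = 'Charlie';

# Fork, let the child look up the object's data, and return what the
# child saw as ( address, name ).
sub lookup_in_child {
    pipe( my $reader, my $writer ) or die "pipe failed: $!";
    my $pid = fork;
    die "fork failed: $!" unless defined $pid;

    if ( $pid == 0 ) {                 # child process
        close $reader;
        my $addr = refaddr($obj);
        my $name = defined $NAME{$addr} ? $NAME{$addr} : '(lost)';
        print {$writer} "$addr $name\n";
        close $writer;                 # flushes the pipe
        _exit(0);                      # leave without re-flushing STDOUT
    }

    close $writer;                     # parent process
    my $line = <$reader>;
    close $reader;
    waitpid $pid, 0;
    chomp $line;
    return split / /, $line, 2;
}

my ( $child_addr, $child_name ) = lookup_in_child();
printf "parent address: %d\n", refaddr($obj);
printf "child address:  %d (name: %s)\n", $child_addr, $child_name;
```

With a real fork the child should report the same address as the parent and still find the name, which is why no fix-up is needed there.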

On Win32, however, forking is faked using threads! (cf. perlfork) So fork-safety on Win32 requires thread-safety as well. That makes thread-safety for inside-out objects rather important: unsuspecting users might fork their way into threads without even realizing it and find that all their objects have lost their data. (Unfortunately, this detail is completely glossed over in Conway's recent Perl Best Practices. For thread-safety of inside-out objects, he only mentions the need to declare the lexical hashes as shared and to ensure that locking occurs on access.)

I've included below some code samples that show how to use a global registry of objects with CLONE, along with some test files that demonstrate how it works -- albeit only in a very simple case. I've tested it on Windows XP (ActiveState) and Linux, and it worked as expected. (The code is a bit pedantic for clarity.)

SafeObject.pm:

    # A thread-safe inside-out object class
    package SafeObject;
    use strict;
    use warnings;
    use Scalar::Util qw( refaddr weaken );

    our $VERSION = 0.001;

    # Global object tracking
    my %REGISTRY;

    # Object property storage and accessor
    my %NAME;

    sub name {
        my ($self, $value) = @_;
        my $id = refaddr $self;
        # store a value if one is provided
        if ( defined $value ) {
            $NAME{ $id } = $value;
        }
        return $NAME{ $id };
    }

    # Constructor and destructor
    sub new {
        my $class = shift;
        my $self = bless {}, $class;
        # store a weak reference in the registry
        my $id = refaddr $self;
        weaken ( $REGISTRY{ $id } = $self );
        return $self;
    }

    sub DESTROY {
        my $self = shift;
        my $id = refaddr $self;
        # clean up memory used for the object
        delete $NAME{ $id };
        delete $REGISTRY{ $id };
        return;
    }

    # Cloning routine called for new threads
    sub CLONE {
        # So we can see this called in a Windows fork()
        warn "# Notice: Cloning data in new thread\n";

        # fix up all object ids in the new thread
        # (note: %REGISTRY changes in the middle, so don't use "each")
        for my $old_id ( keys %REGISTRY ) {
            # look under old_id to find the new, cloned reference
            my $object = $REGISTRY{ $old_id };
            my $new_id = refaddr $object;

            # relocate data
            $NAME{ $new_id } = $NAME{ $old_id };
            delete $NAME{ $old_id };

            # update the weak reference to the new, cloned object
            weaken ( $REGISTRY{ $new_id } = $REGISTRY{ $old_id } );
            delete $REGISTRY{ $old_id };
        }
        return;
    }

    1; # package must return true

01-thread.t:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use 5.008; # CLONE is only supported in Perl 5.8 or later
    use threads;
    use Test::More tests => 7;

    require_ok( "SafeObject" );

    my $safe_obj = SafeObject->new;
    isa_ok( $safe_obj, "SafeObject" );

    is( $safe_obj->name( "Charlie" ), "Charlie", "mutator returns value" );
    is( $safe_obj->name()           , "Charlie", "accessor returns value" );

    my $thr = threads->new(
        sub {
            is( $safe_obj->name( ),        "Charlie", "got right name in thread" );
            is( $safe_obj->name( "Fred" ), "Fred"   , "changed name in thread" );
        }
    );
    $thr->join;

    is( $safe_obj->name(), "Charlie", "main thread still has original name" );

02-fork.t:

    #!/usr/bin/perl
    use strict;
    use warnings;
    use 5.008; # CLONE is only supported in Perl 5.8 or later
    use Test::More tests => 7;

    require_ok( "SafeObject" );

    my $obj = SafeObject->new;
    isa_ok( $obj, "SafeObject" );

    is( $obj->name( "Charlie" ), "Charlie", "mutator returns value" );
    is( $obj->name()           , "Charlie", "accessor returns value" );

    my $child_pid = fork;
    if ( !$child_pid ) {
        # we're in the child
        is( $obj->name( ),        "Charlie", "got right name in child" );
        is( $obj->name( "Fred" ), "Fred"   , "changed name in child" );
        exit;
    }

    # wait for child to finish
    waitpid $child_pid, 0;

    # Test counter is off due to the fork
    Test::More->builder->current_test( 6 );

    is( $obj->name(), "Charlie", "parent still has original name" );

As expected, while the 02-fork.t tests pass on both Linux and Windows, on Windows we get the "# Notice: Cloning data..." warning, showing that the fork() is actually creating a new thread.

I still think the alternative of storing a UUID within a blessed scalar would be a reasonable approach, and it might even facilitate sharing inside-out objects across threads (again, by storing them in a registry and locking the UUID within the registry to control access). However, one of the nice features of inside-out objects keyed off a memory address is that they can transparently subclass other classes that use traditional blessed data structures to store their data. That capability would be lost when using a blessed scalar to store a UUID. I'm not sure whether sharing objects across threads or flexible subclassing is the more important feature.
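For comparison, the blessed-scalar idea might look roughly like the sketch below. The class name UuidObject is hypothetical, and the time-plus-counter string is only a stand-in for a real UUID generator, which I haven't chosen. Because the ID lives inside the object, it is copied along with the object when a thread is created, so the cloned property hash should still match without any CLONE fix-up. This sketch does not attempt the shared registry or locking.

```perl
package UuidObject;                    # hypothetical class name
use strict;
use warnings;

my %NAME;                              # property storage keyed by object ID
my $next_id = 0;                       # counter used by the stand-in ID

# Constructor: the object is a blessed scalar holding its own ID.
sub new {
    my $class = shift;
    # Assumption: this string stands in for a real UUID.
    my $id = sprintf 'obj-%d-%d', time, ++$next_id;
    return bless \$id, $class;
}

# Accessor/mutator keyed by the stored ID rather than the address.
sub name {
    my ( $self, $value ) = @_;
    $NAME{ ${$self} } = $value if defined $value;
    return $NAME{ ${$self} };
}

# Clean up the property when the object goes away.
sub DESTROY {
    my $self = shift;
    delete $NAME{ ${$self} };
    return;
}

package main;

my $obj = UuidObject->new;
$obj->name('Charlie');
print $obj->name, "\n";                # prints "Charlie"
```

The cost is visible in new: the object has to be a scalar reference, so it can't double as the hash or array that a traditional parent class expects.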

Fellow monks, as I'm only starting down this path of inside-out objects and threads and forking, I'd appreciate your perspectives on this problem, the solution I've laid out above and, in particular, any other details that should be considered as this scales up beyond a simple test case.

Thanks,

-xdg

Code written by xdg and posted on PerlMonks is public domain. It is provided as is with no warranties, express or implied, of any kind. Posted code may not have been tested. Use of posted code is at your own risk.


In reply to Threads and fork and CLONE, oh my! by xdg
