http://www.perlmonks.org?node_id=1187394

1nickt has asked for the wisdom of the Perl Monks concerning the following question:

Learned brethren:

I am using MCE::Shared to create a shared hash for collecting results. I am using MCE::Loop to fork off workers and process a long list of tasks. (Note: I have tried the below code with Parallel::ForkManager instead, with the same results.)

The program works as expected: workers are forked, report their results to the shared hash, and the hash is printed from the END block by the parent process.

However, I would like to be able to interrupt the program and still have the hash printed (on manual interrupt, and also on an uncaught exception). In a single-process environment this works as expected: the hash is built up and, on interrupt, dumped from the END block in whatever state it has reached. But in a parallel-processing environment, I get unexpected results.
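For reference, the single-process behaviour described above can be sketched roughly as follows. This is only an illustration, not the real program: the names `record_result`, `run_tasks` and the `$interrupt_after` parameter are made up for the demo, the process sends itself SIGINT to stand in for a manual Ctrl-C, and the handler uses `die` (caught by `eval`) rather than `exit`, so that control returns to the caller before the END block runs.

```perl
use strict;
use warnings;

my %results;

# Record one finished task, keyed the same way as in the test program.
sub record_result {
    my ($task) = @_;
    $results{ sprintf '%.2d %s', $task, $$ } = time;
    return;
}

# Run tasks 0 .. $last_task, stopping early if SIGINT arrives.
# $interrupt_after is a hypothetical knob: after that many tasks the
# process signals itself, simulating a manual interrupt.
# Returns the number of tasks completed.
sub run_tasks {
    my ( $last_task, $interrupt_after ) = @_;
    my $done = 0;

    # Turn the signal into an exception so the loop can be unwound.
    local $SIG{INT} = sub { die "interrupted\n" };

    eval {
        for my $task ( 0 .. $last_task ) {
            record_result($task);
            $done++;
            kill 'INT', $$
                if defined($interrupt_after) && $done == $interrupt_after;
        }
        1;
    } or warn "Stopped early: $@";

    return $done;
}

run_tasks( 9, 3 );

# Runs at normal program exit and prints whatever has been collected.
END {
    print "Dumping: $_\n" for sort keys %results;
}
```

With a single process there is only one copy of the hash and one END block, so whatever was recorded before the interrupt is still there to be dumped.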

Here is my test program:

<c>
use strict; use warnings; use feature 'say';
use Data::Dumper; ++$Data::Dumper::Sortkeys;
use Time::HiRes qw/ usleep time /;
use MCE::Shared;
use MCE::Loop;

my $pid = $$;
say "PID $pid";

tie my %hash, 'MCE::Shared', ();

$SIG{'INT'}  = sub { kill 'TERM', -$$ };
$SIG{'TERM'} = sub { exit 0 };

MCE::Loop->init( max_workers => 6, chunk_size => 10 );

mce_loop {
    say "Forked child with $$";
    my ( $mce, $chunk_ref, $chunk_id ) = @_;
    for ( @{ $chunk_ref } ) {
        $hash{ sprintf '%.2d %s', $_, $$ } = time;
        sleep 1;
    }
} ( 0 .. 99 );

MCE::Loop->finish;

END {
    say sprintf '%s %s (%s) in END', $$, time, $$ == $pid ? 'Parent' : 'Child';
    if ( $$ == $pid ) {
        say 'Parent is ready to dump';
        say 'Dumping: ' . Dumper \%hash;
    }
}

__END__
</c>

Here's an example of the output I am getting from my test program:

<c>
PID 13209
Forked child with 13211
Forked child with 13212
Forked child with 13215
Forked child with 13214
Forked child with 13213
Forked child with 13216
^C13211 1491567754.05168 (Child) in END
13216 1491567754.05169 (Child) in END
13213 1491567754.05382 (Child) in END
13212 1491567754.05404 (Child) in END
13214 1491567754.05448 (Child) in END
13209 1491567754.05601 (Parent) in END
Parent is ready to dump
13215 1491567754.05627 (Child) in END
^C
</c>

Three things strike me as odd about this:

For comparison, here is a Parallel::ForkManager script that doesn't use a shared variable. It still shows child processes surviving longer than the parent, but it exits completely after a single interrupt signal:

<c>
use strict; use warnings; use feature 'say';
use Data::Dumper; ++$Data::Dumper::Sortkeys;
use Parallel::ForkManager;

my $pid = $$;
say "PID $pid";

$SIG{'INT'}  = sub { kill 'TERM', -$$ };
$SIG{'TERM'} = sub { exit 0 };

my $pm = Parallel::ForkManager->new(6);

for ( 0 .. 9 ) {
    my $start = 10 * $_;
    $pm->start and next;
    for ( $start .. $start + 9 ) {
        say sprintf '%.2d %s', $_, $$;
        sleep 1;
    }
    $pm->finish;
}

END {
    say sprintf '%s (%s) in END', $$, $$ == $pid ? 'Parent' : 'Child';
}

__END__
</c>

Output:

<c>
PID 14274
00 14275
10 14276
20 14277
30 14278
40 14279
50 14280
01 14275
11 14276
21 14277
31 14278
41 14279
51 14280
02 14275
12 14276
22 14277
32 14278
42 14279
52 14280
^C43 14279
33 14278
23 14277
13 14276
03 14275
53 14280
14279 (Child) in END
14280 (Child) in END
14277 (Child) in END
14275 (Child) in END
14276 (Child) in END
14274 (Parent) in END
14278 (Child) in END
</c>

I realize this may be more of a fork question than shared data, but the real problem is only manifesting when trying to use the shared data structure. Thanks for any pointers.


The way forward always starts with a minimal test.