
AnyEvent::ForkManager fails tests on Cygwin

by choroba (Cardinal)
on Apr 30, 2015 at 11:35 UTC

choroba has asked for the wisdom of the Perl Monks concerning the following question:

Hi fellow Monks,

I tried to install AnyEvent::ForkManager on Cygwin. Both its main dependencies, AnyEvent and Parallel::ForkManager, installed without problems, but the module itself hung right after the first test in 001_basic.t:

    ~/.cpan/build/AnyEvent-ForkManager-0.04-ZOUXbL$ ./Build test
    t/000_load.t ...... 1/1 # Testing AnyEvent::ForkManager/0.04
    t/000_load.t ...... ok
    t/001_basic.t ..... 1/63

I sprinkled the code with tracing warns to discover where exactly the code gets stuck. The following line never finished:

    $pm->start(
        cb => sub {
            my($pm, $exit_code) = @_;
            local $SIG{USR1} = sub {
                $started_all_process = 1;
            };
            isnt $$, $pm->manager_pid, 'called by child'; # <<== HERE
            until ($started_all_process) {}; # wait
            note "exit_code: $exit_code";
            $pm->finish($exit_code);
            fail 'finish failed';
        },
        args => [$exit_code]
    );

At first, I thought it was manager_pid that didn't return, but after replacing the line with

    my $mpid = $pm->manager_pid;
    isnt $$, $mpid, 'called by child';

it became obvious it's the isnt line that causes the issue. I delved more deeply and found out it comes from Test::SharedFork. It uses flock to lock a file that shares the information between forks. The Store::Locker is constructed with the following:

    sub new {
        my ($class, $store) = @_;
        $store->_reopen_if_needed;
        if ($store->{lock}++ == 0) {
            flock $store->{fh}, LOCK_EX or die $!; # <<== HERE
        }
        bless { store => $store }, $class;
    }

The code stops on the flock line and stays there forever (on Linux, it works correctly). I wanted to know more, so I prepended the following to the line:

    use Data::Dumper;
    $Data::Dumper::Deparse = 1;
    warn Dumper($store);

Not only was I able to see the structure, but all the tests passed. "A race condition," I thought, and replaced the line with

    use Time::HiRes qw{ usleep };
    usleep 200;

Result: PASS. When lowering the value, the tests sometimes hung again.
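A fixed usleep only hides the hang. One way to probe it further would be to replace the blocking flock with a non-blocking attempt that gives up after a deadline, so a stuck lock shows up as an error instead of a frozen test. The following is a minimal sketch of that idea; lock_with_timeout is a hypothetical helper, not part of Test::SharedFork, and the 10 ms retry interval is an arbitrary choice.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use Time::HiRes qw(usleep time);

# Hypothetical helper (not part of Test::SharedFork): try to take an
# exclusive lock on $fh, but give up after $timeout seconds instead of
# blocking forever. Returns 1 on success, 0 on timeout.
sub lock_with_timeout {
    my ($fh, $timeout) = @_;
    my $deadline = time() + $timeout;
    until (flock($fh, LOCK_EX | LOCK_NB)) {
        return 0 if time() >= $deadline;
        usleep(10_000);    # wait 10 ms before the next attempt
    }
    return 1;
}
```

In the constructor above, something like `lock_with_timeout($store->{fh}, 5) or die "flock timed out in pid $$"` would then report which process failed to get the lock.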

The questions

  1. Can someone with a MSWin machine (non-cygwin) try the same? Is the behaviour similar?
  2. Can someone explain how exactly the race condition happens in this case?
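For anyone trying this on another platform, a stand-alone script that exercises only fork plus flock, without AnyEvent or Test::SharedFork, might help to narrow things down. This is a rough sketch under my own assumptions: the helper names, worker count and line count are made up, and each process opens its own handle on the file rather than reusing one inherited across fork.

```perl
use strict;
use warnings;
use Fcntl qw(:flock);
use File::Temp qw(tempfile);

# Open a private handle on $path, take an exclusive lock, append one line.
sub locked_append {
    my ($path, $line) = @_;
    open my $fh, '>>', $path or die "open $path: $!";
    flock($fh, LOCK_EX) or die "flock: $!";
    print {$fh} "$line\n";
    close $fh or die "close: $!";    # close flushes and releases the lock
}

# Fork $workers children; each appends $lines lines under the lock.
sub run_workers {
    my ($path, $workers, $lines) = @_;
    my @pids;
    for my $w (1 .. $workers) {
        my $pid = fork;
        die "fork: $!" unless defined $pid;
        if ($pid == 0) {
            locked_append($path, "worker $w line $_") for 1 .. $lines;
            exit 0;
        }
        push @pids, $pid;
    }
    waitpid $_, 0 for @pids;
}

my ($tmp_fh, $path) = tempfile();
close $tmp_fh;
run_workers($path, 4, 50);

open my $in, '<', $path or die "open $path: $!";
my @got = <$in>;
close $in;
unlink $path;
print scalar(@got), " lines written\n";
```

If this script also hangs on a given system, the problem is probably below Test::SharedFork; if it completes every time, the interaction with the test's signal handling is the more likely suspect.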

Thanks. The issue lives outside of PM on GitHub, too.

لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

Replies are listed 'Best First'.
Re: AnyEvent::ForkManager fails tests on Cygwin
by MidLifeXis (Monsignor) on Apr 30, 2015 at 13:42 UTC

    I get a:

    t\001_basic.t ..... 1/63 No such signal: SIGUSR1 at t\001_basic.t line 74.
    No such signal: SIGUSR1 at t\001_basic.t line 74.
    t\001_basic.t ..... 22/63 Unrecognized signal name "USR1" at C:\....\AnyEvent-ForkManager-0.04\blib\lib/AnyEvent/ForkManager.pm line 154
    # Looks like you planned 63 tests but ran 22.
    # Looks like your test exited with -1 just after 22.

    and then a hang.

    Strawberry 5.18.2.1 portable.

    $ perl -v

    This is perl 5, version 18, subversion 2 (v5.18.2) built for MSWin32-x86-multi-thread-64int

    Copyright 1987-2013, Larry Wall

    Perl may be copied only under the terms of either the Artistic License or the
    GNU General Public License, which may be found in the Perl 5 source kit.

    Complete documentation for Perl, including FAQ lists, should be found on
    this system using "man perl" or "perldoc perl". If you have access to the
    Internet, point your browser at http://www.perl.org/, the Perl Home Page.

    --MidLifeXis

      Thank you for the report. Does the hang disappear when you add the usleep to Test::SharedFork::Store.pm?
      لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ

        Nope. I even tried replacing USR1 with something available on Windows (a couple of somethings, actually: TERM, INT, NUM17 (just for grins)), but to no avail.

        t\001_basic.t ..... 1..63
        # start on_start
        ok 1 - not working max
        ok 2 - called by manager
        # end on_start
        ok 3 - called by child
        # start on_start
        ok 4 - not working max
        ok 5 - called by manager
        # end on_start
        # start on_working_max
        ok 6 - working max
        ok 7 - called by manager
        # end on_working_max
        # start on_enqueue
        ok 8 - called by child
        ok 9 - working max
        ok 10 - called by manager
        # end on_start
        # start on_working_max
        ok 11 - working max
        ok 12 - called by manager
        # end on_working_max
        # start on_enqueue
        ok 13 - working max
        ok 14 - called by manager
        # end on_start
        # start on_working_max
        ok 15 - working max
        ok 16 - called by manager
        # end on_working_max
        # start on_enqueue
        ok 17 - working max
        ok 18 - called by manager
        # end on_start
        # start on_working_max
        ok 19 - working max
        ok 20 - called by manager
        # end on_working_max
        # start on_enqueue
        ok 21 - working max
        ok 22 - called by manager
        # end on_start
        # exit_code: 1
        # exit_code: 2

        Hangs right there.

        --MidLifeXis

Re: AnyEvent::ForkManager fails tests on Cygwin
by ikegami (Patriarch) on Apr 30, 2015 at 15:01 UTC

    Windows (as opposed to cygwin) doesn't have signals except for Ctrl-C and Ctrl-Break. The code will definitely fail there.

      The latest version on GitHub uses SIGINT instead of SIGUSR1. Would it help?
      لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
        Ctrl-C does call $SIG{INT}. I don't know if it interrupts system calls like unix signals do.
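Since the available signal names differ between builds (USR1 is reported missing on the Strawberry Perl above), a test could check for a signal before relying on it. A small sketch, using the sig_name entry from Config as described in perlipc; has_signal is a hypothetical helper name, not something AnyEvent::ForkManager provides.

```perl
use strict;
use warnings;
use Config;

# Hypothetical helper: true if the running perl lists a signal with this
# name (without the SIG prefix) in $Config{sig_name}.
sub has_signal {
    my ($name) = @_;
    my $sig_names = defined $Config{sig_name} ? $Config{sig_name} : '';
    my $matches = grep { $_ eq $name } split ' ', $sig_names;
    return $matches > 0;
}

for my $sig (qw(USR1 INT TERM)) {
    printf "%-5s %s\n", $sig, has_signal($sig) ? 'available' : 'not available';
}
```

A test file could use this with Test::More's `plan skip_all => ...` when the signal it needs is absent. Note that a name being listed says nothing about whether delivery between processes behaves as it does on unix.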
Re: AnyEvent::ForkManager fails tests on Cygwin
by choroba (Cardinal) on May 18, 2015 at 08:21 UTC
    I blogged about the issue.
    لսႽ† ᥲᥒ⚪⟊Ⴙᘓᖇ Ꮅᘓᖇ⎱ Ⴙᥲ𝇋ƙᘓᖇ
