PerlMonks  

Re: Crash with ForkManager on Windows

by marioroy (Vicar)
on Sep 22, 2017 at 08:56 UTC (#1199891)


in reply to Crash with ForkManager on Windows

Update: Modified script. Completes in 1 second with Cygwin Perl v5.22.4 and less than 1 second on Unix.

Update: Modified list of modules to pre-load. On the Windows platform, run with Perl v5.26 or later for best results.

Running in parallel is problematic and not fun on the Windows platform. To increase the chance of failure, I quadrupled the input size.

# Strawberry Perl on Windows 7 VM
# LWP::Simple included with Perl
# Parallel::ForkManager v1.19
# Testing involves running multiple times.
# Failing indicates the script crashed one or more times.

perl-5.10.1.2 - LWP::Simple v5.827 : pass, slow (> 14 seconds)
perl-5.12.3.0 - LWP::Simple v5.835 : pass, slow (> 14 seconds)
perl-5.14.4.1 - LWP::Simple v6.00  : fail
perl-5.16.3.1 - LWP::Simple v6.00  : fail
perl-5.18.4.1 - LWP::Simple v6.00  : fail
perl-5.20.3.3 - LWP::Simple v6.15  : fail
perl-5.22.3.1 - LWP::Simple v6.15  : fail
perl-5.24.2.1 - LWP::Simple v6.26  : fail
perl-5.26.0.2 - LWP::Simple v6.26  : pass, fast (~ 3 seconds)
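Since the testing above involves running the script multiple times and counting a crash as a failure, here is a minimal sketch of a helper that automates this. It is not the harness I actually used; the sub name count_failures and the run count of 10 are assumptions for illustration, and the script to test is taken from the command line.

```perl
use strict;
use warnings;

# Run a command $runs times and return how many runs ended abnormally
# (non-zero exit status, death by signal, or failure to launch).
sub count_failures {
    my ($runs, @cmd) = @_;
    my $failures = 0;
    for my $i (1 .. $runs) {
        my $status = system(@cmd);    # list form bypasses the shell
        $failures++ if $status != 0;
    }
    return $failures;
}

# Only runs when a script path is given on the command line.
my $script = shift @ARGV;
if (defined $script) {
    my $runs   = 10;                  # assumed number of repetitions
    my $failed = count_failures($runs, $^X, $script);
    print "$failed of $runs runs failed\n";
}
```

Save it under any name and pass the path of the script under test as the only argument. $^X makes sure the same perl binary is used for every run.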

A solution is to pre-load the essential modules that LWP::Simple would otherwise require at runtime, before running in parallel.

use strict;
use warnings;

use LWP::Simple;

# Pre-load essential modules for extra stability.
if ( $INC{'LWP/UserAgent.pm'} && !$INC{'Net/HTTP.pm'} ) {
    require IO::Handle;
    require Net::HTTP;
    require Net::HTTPS;
}

my @urls = (
    'http://hooboy.no-such-host.int/',
    'http://us.a1.yimg.com/us.yimg.com/i/ww/m5v9.gif',
    'http://www.guardian.co.uk/',
    'http://www.ora.com/ask_tim/graphics/asktim_header_main.gif',
    'http://www.pixunlimited.co.uk/siteheaders/Guardian.gif',
    'http://www.yahoo.com',
) x 4;

use Parallel::ForkManager;

my $pm = Parallel::ForkManager->new(8);

if ( $^O ne 'MSWin32' ) {
    $pm->set_waitpid_blocking_sleep(0);
}

foreach my $url ( @urls ) {
    $pm->start and next;

    my ($type, $length, $mod) = head($url);

    # if (!defined $type) {
    #     ...
    # }
    # elsif ($mod) {
    #     ...
    # }
    # else {
    #     ...
    # }

    print "$url is done\n";

    $pm->finish;
}

$pm->wait_all_children;
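To find out which modules a call pulls in lazily, and are therefore candidates for pre-loading, one approach is to compare %INC before and after the call. The following is a minimal sketch; modules_loaded_by is a hypothetical helper name, and the core module Text::Wrap stands in for the real workload so that the example runs without network access.

```perl
use strict;
use warnings;

# Return the sorted list of %INC keys (module files) that were loaded
# while running the given code reference.
sub modules_loaded_by {
    my ($code) = @_;
    my %before = map { $_ => 1 } keys %INC;
    $code->();
    my @added = grep { !$before{$_} } keys %INC;
    @added = sort @added;
    return @added;
}

# Stand-in workload (an assumption): lazily load a core module.
my @new = modules_loaded_by( sub { require Text::Wrap } );
print "$_\n" for @new;
```

For the script above, the stand-in would be replaced by something like sub { head($urls[0]) }, run once in a single process. Whatever is printed was loaded at runtime rather than at compile time.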

Results.

# Strawberry Perl on Windows 7 VM
# LWP::Simple included with Perl
# Parallel::ForkManager v1.19
# Testing involves running multiple times.

perl-5.10.1.2 - LWP::Simple v5.827 : pass, > 14 seconds
perl-5.12.3.0 - LWP::Simple v5.835 : pass, > 14 seconds
perl-5.14.4.1 - LWP::Simple v6.00  : pass, > 7 seconds
perl-5.16.3.1 - LWP::Simple v6.00  : pass, > 7 seconds
perl-5.18.4.1 - LWP::Simple v6.00  : pass, > 7 seconds
perl-5.20.3.3 - LWP::Simple v6.15  : pass, > 6 seconds
perl-5.22.3.1 - LWP::Simple v6.15  : pass, > 6 seconds
perl-5.24.2.1 - LWP::Simple v6.26  : pass, > 6 seconds
perl-5.26.0.2 - LWP::Simple v6.26  : pass, ~ 3 seconds

perl v5.22.4 on Cygwin - LWP::Simple v6.27 : pass, 1 second ;-)

Among the Strawberry Perl releases, Perl 5.26 provides the best performance, completing in about 3 seconds.

Regards, Mario

Re^2: Crash with ForkManager on Windows
by marioroy (Vicar) on Sep 22, 2017 at 09:17 UTC

    Update: Modified script. Completes in 1 second with Cygwin Perl v5.22.4 and less than 1 second on Unix.

    Update: Modified list of modules to pre-load. On the Windows platform, run with Perl v5.26 or later for best results.

    The following does the same thing using MCE::Hobo. Pre-loading essential modules is necessary, even for Perl v5.26. The posix_exit option applies to UNIX-like OSes and is helpful in the event an underlying module or dependency isn't multi-process safe.

    use strict;
    use warnings;

    use LWP::Simple;

    # Pre-load essential modules for extra stability.
    if ( $INC{'LWP/UserAgent.pm'} && !$INC{'Net/HTTP.pm'} ) {
        require IO::Handle;
        require Net::HTTP;
        require Net::HTTPS;
    }

    my @urls = (
        'http://hooboy.no-such-host.int/',
        'http://us.a1.yimg.com/us.yimg.com/i/ww/m5v9.gif',
        'http://www.guardian.co.uk/',
        'http://www.ora.com/ask_tim/graphics/asktim_header_main.gif',
        'http://www.pixunlimited.co.uk/siteheaders/Guardian.gif',
        'http://www.yahoo.com',
    ) x 4;

    use MCE::Hobo 1.831;

    MCE::Hobo->init(
        max_workers => 8,
        posix_exit  => 1,
    );

    foreach my $url ( @urls ) {
        mce_async {
            my ($type, $length, $mod) = head($url);
            print "$url is done\n";
        };
    }

    MCE::Hobo->waitall;

    Regards, Mario

      Update: Modified script. Completes in 1 second with Cygwin Perl v5.22.4 and less than 1 second on Unix.

      Update: Modified list of modules to pre-load. On the Windows platform, run with Perl v5.26 or later for best results.

      The following is a demonstration using MCE::Loop. The behaviour is similar to running a pool of workers: each worker asks the manager process for the next URL to process. The job_delay option gives each worker time to load any missing modules at runtime before another worker starts processing. Each worker incurs the delay only once.

      use strict;
      use warnings;

      use LWP::Simple;

      # Pre-load essential modules for extra stability.
      if ( $INC{'LWP/UserAgent.pm'} && !$INC{'Net/HTTP.pm'} ) {
          require IO::Handle;
          require Net::HTTP;
          require Net::HTTPS;
      }

      my @urls = (
          'http://hooboy.no-such-host.int/',
          'http://us.a1.yimg.com/us.yimg.com/i/ww/m5v9.gif',
          'http://www.guardian.co.uk/',
          'http://www.ora.com/ask_tim/graphics/asktim_header_main.gif',
          'http://www.pixunlimited.co.uk/siteheaders/Guardian.gif',
          'http://www.yahoo.com',
      ) x 4;

      use MCE::Loop;

      MCE::Loop->init(
          max_workers => 8,
          chunk_size  => 1,
          posix_exit  => 1,
      );

      mce_loop {
          my ($type, $length, $mod) = head( my $url = $_ );
          print "$url is done\n";
      } \@urls;

      MCE::Loop->finish;

      Regards, Mario
