PerlMonks  

What is this "Do you need to predeclare croak" about? [SOLVED]

by karlgoethebier (Prior)
on Jun 14, 2017 at 17:58 UTC (#1192821)
karlgoethebier has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I wrote this:

#!/usr/bin/env perl
use strict;
use warnings;
use threads;
use MCE::Loop;
use MCE::Shared;
use LWP::Simple;
use feature qw(say);
use constant AMOUNT => 0;

my $result = MCE::Shared->hash;
my @urls = qw(http://perlmonks.org http://www.whitehouse.org );

MCE::Loop::init { max_workers => 'auto', chunk_size => 1 };

my $fetch = sub { head(shift) };

mce_loop {
    my @data = $fetch->($_);
    sleep AMOUNT;
    $result->set( $_ => \@data );
} @urls;

{
    no warnings qw(uninitialized);
    my $iter = $result->iterator();
    while ( my ( $url, $data ) = $iter->() ) {
        say $url;
        say for @$data;
        say q(---);
    }
}

__END__

It works as expected, but sometimes I get this confusing error message:

String found where operator expected at /Users/karl/perl5/perlbrew/perls/perl-5.24.1threads/lib/5.24.1/darwin-thread-multi-2level/IO/Socket/INET.pm line 303, near "croak 'usage: $sock->peerhost()'"
        (Do you need to predeclare croak?)
# ... many more like this

This comes from...

sub peerhost {
    @_ == 1 or croak 'usage: $sock->peerhost()';
    my($sock) = @_;
    my $addr = $sock->peeraddr;
    $addr ? inet_ntoa($addr) : undef;
}

... and many more subs in INET.pm

Update: Replaced $_ with shift. Sorry.

Update2: Hm, it seems that when I don't use threads the phenomenon doesn't occur...

Update3: From time to time I get this:

Segmentation fault: 11
Deep recursion on subroutine "IO::Socket::new" at /Users/karl/perl5/perlbrew/perls/perl-5.24.1threads/lib/5.24.1/IO/Socket/IP.pm line 353, <__ANONIO__> line 2.

Update4: Switching to WWW::Curl::Easy made it work:

#!/usr/bin/env perl

# The sky may fall on your head tomorrow, but tomorrow never comes
# $Id: uagent.pl,v 1.5 2017/06/15 13:03:32 karl Exp karl $

use strict;
use warnings;
use threads;
use MCE::Loop;
use MCE::Shared;
use MCE::Mutex;    # who knows what happens ;-)
use WWW::Curl::Easy;
use feature qw(say);
use constant AMOUNT => 0.008;

my $result = MCE::Shared->hash;
my @urls = qw(http://perlmonks.org http://www.whitehouse.org );

MCE::Loop::init {
    max_workers => 'auto',
    chunk_size  => 1,
    interval    => AMOUNT,
    # posix_exit => 1, # useless when loading threads
};

my $fetch = sub {
    my $curl = WWW::Curl::Easy->new;
    my ( $header, $body );
    $curl->setopt( CURLOPT_URL,         shift );
    $curl->setopt( CURLOPT_WRITEHEADER, \$header );
    $curl->setopt( CURLOPT_WRITEDATA,   \$body );
    $curl->perform;
    $header;
};

my $mutex = MCE::Mutex->new;

mce_loop {
    MCE->yield;
    my $data = $fetch->($_);
    $data =~ s/\n+$/---/;
    # enter() takes a code ref and runs it while holding the lock
    $mutex->enter( sub { $result->set( $_ => $data ) } );
} @urls;

my $iter = $result->iterator();
while ( my ( $url, $data ) = $iter->() ) {
    say qq($url\n$data);
}

__END__
# for i in {1..100}; do ./uagent.pl; done

Thanks to all fellow monks for help.

Thanks for any hint and best regards, Karl

«The Crux of the Biscuit is the Apostrophe»

Furthermore I consider that Donald Trump must be impeached as soon as possible

Replies are listed 'Best First'.
Re: What is this "Do you need to predeclare croak" about?
by marioroy (Curate) on Jun 14, 2017 at 21:44 UTC

    Hello karlgoethebier,

    The demonstration looks fine. Unfortunately, some modules do not play well with threads. The LWP::Simple module has many dependencies, and one of them may be unsafe for use with threads. In that case, setting the use_threads option to 0 is necessary to have workers spawn via fork on the Windows platform, or whenever threads is loaded at the top of the script.

    Network-related tasks may benefit from MCE's interval option, which staggers the code that immediately follows a call to yield. For this use case, calling yield prevents workers from initiating remote connections at the same time. It is similar to sleep, but runs serially, not in parallel, for that duration of time: all participating workers wait their turn to sleep.

    The next MCE update, 1.830, will default the posix_exit option to 1. It is nearly impossible to maintain a list of which modules are thread-safe, or safe to run END blocks in multiple processes, for that matter.

    #!/usr/bin/env perl
    # http://www.perlmonks.org/?node_id=1192821

    use strict;
    use warnings;
    use MCE::Loop;
    use MCE::Shared;
    use LWP::Simple;
    use feature qw(say);

    my $result = MCE::Shared->hash;
    my @urls = qw(http://perlmonks.org http://www.whitehouse.org);

    MCE::Loop::init {
        max_workers => 'auto', chunk_size => 1, interval => 0.008,
        posix_exit => 1, use_threads => 0
    };

    my $fetch = sub {
        my @head = eval { head(shift) };
        warn $@ if $@;
        @head;    # return the header fields, not the value of the warn statement
    };

    mce_loop {
        MCE->yield;
        my @data = $fetch->( $_ );
        $result->set( $_ => \@data );
    } @urls;

    {
        no warnings qw(uninitialized);
        my $iter = $result->iterator();
        while ( my ( $url, $data ) = $iter->() ) {
            say $url;
            say for @$data;
            say q(---);
        }
    }

    MCE::Loop is wantarray-aware, so mce_loop can return whatever the workers gather. This allows one to use the gather method to send each key-value pair into a plain hash. For readers, this is how it was done before MCE::Shared came about.

    #!/usr/bin/env perl
    # http://www.perlmonks.org/?node_id=1192821

    use strict;
    use warnings;
    use MCE::Loop;
    use LWP::Simple;
    use feature qw(say);

    my @urls = qw(http://perlmonks.org http://www.whitehouse.org);

    MCE::Loop::init {
        max_workers => 'auto', chunk_size => 1, interval => 0.008,
        posix_exit => 1, use_threads => 0
    };

    my $fetch = sub {
        my @head = eval { head(shift) };
        warn $@ if $@;
        @head;    # return the header fields, not the value of the warn statement
    };

    my %result = mce_loop {
        MCE->yield;
        my @data = $fetch->( $_ );
        MCE->gather( $_ => \@data );
    } @urls;

    {
        no warnings qw(uninitialized);
        while ( my ( $url, $data ) = each %result ) {
            say $url;
            say for @$data;
            say q(---);
        }
    }

    Regards, Mario

      Having workers yield serially matters even more when an event loop is involved, for example to prevent 200 workers with a chunk size of 300 from initiating many connections simultaneously.

      # Imports needed by this excerpt
      use AnyEvent;
      use AnyEvent::HTTP;
      use MCE::Hobo;

      sub walk {
          my ( $job, $result, $failed ) = @_;

          # Yielding is critical when running an event loop in parallel.
          # Not doing so means that the app may reach contention points
          # with the firewall and likely impose unnecessary hardship at
          # the OS level. The idea here is not to have multiple workers
          # initiate HTTP requests to a batch of URLs at the same time.
          # Yielding in 1.827+ behaves more like scatter for the worker
          # to run solo in a fraction of time.

          MCE::Hobo->yield( 0.03 );    # MCE::Hobo 1.827

          my $cv = AnyEvent->condvar();

          # Populate the hash ref for URLs it could reach.
          # Do not mix AnyEvent timeout and Hobo timeout.
          # Choose to do the event timeout if available.

          foreach my $url ( @{ $job->{INPUT} } ) {
              $cv->begin();
              http_get $url, timeout => 2, sub {
                  my ( $data, $headers ) = @_;
                  $result->{$url} = $data;
                  $cv->end();
              };
          }

          $cv->recv();

          # Populate the array ref for URLs it could not reach.

          foreach my $url ( @{ $job->{INPUT} } ) {
              push @{ $failed }, $url unless (exists $result->{ $url });
          }

          return;
      }

      Regards, Mario

      Thank you very much Mario and best regards, Karl

Re: What is this "Do you need to predeclare croak" about?
by syphilis (Chancellor) on Jun 15, 2017 at 04:59 UTC
    Hi,

    In answer to the question asked in the subject line, this is just the standard response to an unknown symbol followed by a string:
    C:\>perl -le "rubbish 'foo';"
    String found where operator expected at -e line 1, near "rubbish 'foo'"
            (Do you need to predeclare rubbish?)
    syntax error at -e line 1, near "rubbish 'foo'"
    Execution of -e aborted due to compilation errors.
    Cheers,
    Rob
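
To make the parser behaviour described above concrete, here is a minimal sketch using only core Perl. It compiles small snippets with string eval and reports whether each one parses. The helper name compiles and the Sandbox package names are made up for illustration. It only shows that a bareword followed by a string parses once the name is known at compile time, whether through a forward declaration or an import such as "use Carp"; it does not explain why croak would be unknown inside IO::Socket::INET when threads are in play.

```perl
use strict;
use warnings;

my $pkg = 0;

# Return true if $code parses. The snippet is wrapped in an anonymous sub,
# so it is only compiled, never run. Each attempt gets a fresh package so
# that declarations from one attempt cannot leak into the next.
sub compiles {
    my ($code) = @_;
    local $@;
    local $SIG{__WARN__} = sub { };    # hide the "String found ..." diagnostic
    $pkg++;
    my $sub = eval "package Sandbox$pkg; sub { $code }";
    return ref($sub) eq 'CODE' ? 1 : 0;
}

my @snippets = (
    q{rubbish 'foo';},                 # unknown name followed by a string
    q{sub rubbish; rubbish 'foo';},    # forward declaration ("predeclare")
    q{croak 'usage';},                 # croak without importing Carp
    q{use Carp; croak 'usage';},       # Carp exports croak at compile time
);

for my $code (@snippets) {
    printf "%-32s %s\n", $code, compiles($code) ? 'compiles' : 'syntax error';
}
```

The second and fourth snippets parse because the sub name is already known when the call is compiled, which is exactly what the "Do you need to predeclare" hint is asking about.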

      Thanks. I understood this so far. I had no better idea for the subject line.

      Best regards, Karl

Re: What is this "Do you need to predeclare croak" about?
by 1nickt (Prior) on Jun 14, 2017 at 19:02 UTC

    Hi Karl, I am not clever enough to know how this could lead to the warning, but do you need to use a mutex since you are writing to a shared data structure?

    ...
    use MCE::Loop;
    use MCE::Shared;
    use MCE::Mutex;

    my $result = MCE::Shared->hash;
    my $mutex = MCE::Mutex->new;
    ...
    mce_loop {
        my @data = $fetch->($_);
        sleep AMOUNT;
        # enter() takes a code ref and runs it while holding the lock
        $mutex->enter( sub { $result->set( $_ => \@data ) } );
    } @urls;
    ...

    Hope this helps! (Mario would know better...)


    The way forward always starts with a minimal test.

      Hi 1nickt. Yes, a mutex is necessary when workers update the same element in a shared hash. For Karl's demonstration, a mutex is optional when workers update unique elements via the OO interface.
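
The distinction can be sketched as follows, assuming MCE, MCE::Shared and MCE::Mutex are installed. The key names ("item$n", "total"), the worker count and the input range are made up for illustration and are not taken from Karl's script. Writing a unique key is a single call on the shared hash, whereas reading and then writing the same key takes two calls, so another worker could slip in between them unless the pair is wrapped in a lock.

```perl
use strict;
use warnings;
use MCE::Loop;
use MCE::Shared;
use MCE::Mutex;

my $result = MCE::Shared->hash( total => 0 );
my $mutex  = MCE::Mutex->new;

MCE::Loop::init { max_workers => 4, chunk_size => 1 };

mce_loop {
    my $n = $_;

    # Unique key per input item: one set() call, no mutex required.
    $result->set( "item$n" => $n * 2 );

    # Same key for every worker: get followed by set is two separate
    # calls, so the pair is run while holding the lock.
    $mutex->synchronize( sub {
        my $total = $result->get('total');
        $result->set( total => $total + $n );
    } );
} 1 .. 20;

MCE::Loop::finish;

print "total: ", $result->get('total'), "\n";
```

For a plain counter, the shared hash's own incr and incrby methods are likely the simpler choice; the mutex is shown here only because the update spans more than one call.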

        Thanks for clarifying. I hadn't made the distinction, but I see it's an important one; I certainly don't want to block if it isn't needed.

      According to Mario I don't need a mutex, as far as I remember ;-)

      Update: I tried it with the mutex as you suggested. Doesn't help.

      Thanks and best regards, Karl
