I'm using Perl's threads module in a simple crawler I'm working on so that I can download pages in parallel. Occasionally, I get error messages like these:
Thread 7 terminated abnormally: read timeout at /usr/lib64/perl5/threads.pm line 101.
Thread 15 terminated abnormally: Can't connect to burgundywinecompany.com:80 (connect: timeout) at /usr/lib64/perl5/threads.pm line 101.
Thread 19 terminated abnormally: write failed: Connection reset by peer at /usr/lib64/perl5/threads.pm line 101.
When I run the script linearly without threads (or if I use fork() instead of threading), I do not encounter these errors. The messages look like they come from the LWP::UserAgent module that I am using, but they do not seem like they should cause the threads to exit abnormally; they look like errors that would be held in my HTTP::Response object. Is there some extra precaution I have to take while using Perl's threads, or is there something I'm missing? Thanks!
UPDATE
I have tracked down the source of these abnormal terminations: they seem to happen whenever I make a request using LWP::UserAgent. If I remove the method call that downloads the webpage, the errors stop.
Here is a test script that produces the error:
#!/usr/bin/perl
use threads;
use Thread::Queue;
use LWP::UserAgent;
my $THREADS=10; # Number of threads
#(if you care about them)
my $workq = Thread::Queue->new(); # Work to do
my @stufftodo = qw(http://www.collectorsarmoury.com/ http://burgundywinecompany.com/ http://beetreeminiatures.com/);
$workq->enqueue(@stufftodo); # Queue up some work to do
$workq->enqueue("EXIT") for(1..$THREADS); # And tell them when
threads->create("Handle_Work") for(1..$THREADS); # Spawn our workers
$_->join for threads->list;
sub Handle_Work {
    while(my $todo=$workq->dequeue()) {
        last if $todo eq 'EXIT'; # All done
        print "$todo\n";
        my $ua = LWP::UserAgent->new;
        my $RESP = $ua->get($todo);
    }
    threads->exit(0);
}
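
For comparison, here is a minimal sketch of how I would expect these failures to surface, and how I could at least trap them while debugging. The helper name fetch_status and the 10-second timeout are my own assumptions for illustration and are not part of the crawler. The idea is that eval catches any die() raised inside the request, and anything else should be readable from the HTTP::Response via is_success and status_line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;

# Hypothetical helper (not part of my crawler): fetch one URL and
# describe the outcome as a string instead of letting an exception escape.
sub fetch_status {
    my ($ua, $url) = @_;

    # eval traps any die() raised while the request is being made.
    my $resp = eval { $ua->get($url) };
    return "died: $@" unless defined $resp;

    # Otherwise the outcome should be recorded in the response object.
    return $resp->is_success
        ? "ok: " . $resp->status_line
        : "failed: " . $resp->status_line;
}

# Usage: perl fetch_status.pl URL [URL ...]
if (@ARGV) {
    my $ua = LWP::UserAgent->new(timeout => 10);   # timeout value is an assumption
    print "$_ => ", fetch_status($ua, $_), "\n" for @ARGV;
}
```

Calling something like this from Handle_Work in place of the bare $ua->get($todo) should at least show whether the request itself is dying inside the thread or whether the failure is being recorded in the response.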