PerlMonks  

LWP and WWW:Mechanize not working

by AI Cowboy (Sexton)
on Jun 04, 2013 at 00:40 UTC ( #1036855=perlquestion )
AI Cowboy has asked for the wisdom of the Perl Monks concerning the following question:

Greetings dear Monks,

For a work project, I have been trying to write a Perl program that automatically downloads all the packages linked from a single Google web page. Nothing nefarious, of course, but there are literally hundreds of files of 20-100 MB each, and I don't fancy doing this manually. I've actually been instructed to build such a program.

The program isn't the hard part per se; in theory, I've already completed the task in multiple ways. The problem is that when I run even the following test code (with the extra "use" statements left over from previous attempts):
#!/usr/bin/perl
use LWP::UserAgent;
use LWP::Simple;
use URI::URL;
use WWW::Mechanize;
use HTML::LinkExtor;

my $url = 'http://foo.bar.baz';
getprint('http://foo.bar.baz');
$user = LWP::UserAgent->new();
$user->get($url);

I get the following error in command prompt (I am using Windows 8, don't ask):
500 Status read failed: A non-blocking socket operation could not be completed immediately. <URL:http://foo.bar.baz>

What am I doing wrong with my approach? Is there a way to fix/bypass this? Could a different programming language get the job done? I've gotten the script to download the files successfully (I tried on one of them manually with the script using LWP::Simple to save the file on my disk), but the page that links the downloads is unreadable apparently.

UPDATE: I've tried wget, curl, and a few other things. Even the method I used yesterday to download a test file off the net with Perl doesn't work today.

Every time I use LWP to connect ANYWHERE on the net with Perl now, it gives me the "500 Status read failed" error: "A non-blocking socket operation could not be completed immediately". I'm completely baffled by this. I can open local HTML files with LWP, but not anything on the internet, and I have no firewalls up.

Replies are listed 'Best First'.
Re: LWP and WWW:Mechanize not working (you think)
by Anonymous Monk on Jun 04, 2013 at 03:06 UTC

    What am I doing wrong with my approach? Is there a way to fix/bypass this?

    Well, the code you posted wouldn't generate the error message you posted, so this is where you're going wrong. Try

    mech-dump --links http://www.google.com/googlebooks/uspto-patents-grants-text.html

    OTOH, why are you even bothering to write a program for this? wget/curl/httrack/lwp-rget... already do this.
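For readers who do want to do the link-gathering step in Perl, here is a minimal sketch of roughly what `mech-dump --links` reports, using HTML::LinkExtor (already loaded in the question's script). It parses an inline HTML string rather than a live page, so it runs without a network connection. The base URL, the sample markup, the `zip_links` helper name, and the assumption that the wanted files end in `.zip` are all hypothetical and not taken from the original post.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# Hypothetical base URL and page content, used only for illustration.
my $base = 'http://example.com/downloads/index.html';
my $html = <<'HTML';
<html><body>
<a href="files/a.zip">a</a>
<a href="/b.zip">b</a>
<a href="notes.html">notes</a>
</body></html>
HTML

# Return absolute URLs of <a href> links that end in .zip (an assumed filter).
sub zip_links {
    my ($html, $base) = @_;

    # No callback: links are collected internally; $base makes them absolute.
    my $parser = HTML::LinkExtor->new(undef, $base);
    $parser->parse($html);
    $parser->eof;

    my @urls;
    for my $link ($parser->links) {
        my ($tag, %attr) = @$link;
        next unless $tag eq 'a' && defined $attr{href};
        my $url = "$attr{href}";    # stringify the URI object
        push @urls, $url if $url =~ /\.zip\z/i;
    }
    return @urls;
}

print "$_\n" for zip_links($html, $base);
```

Each URL returned could then be handed to whatever download step already works, for example the LWP::Simple approach the question mentions having used successfully on a single file.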

      What do you mean the code I have wouldn't generate that output? I ran the code and copied the output directly - that is exactly what I got.

      I'll take a look at the links you provided.

        Yeah, see LWP and WWW:Mechanize not working :) So getprint gave you a diagnostic error message ( WSAEWOULDBLOCK 10035 Resource temporarily unavailable ), which means it seems to be working :)

      I tried mech dump, and it gave a nearly identical error message - here it is:

      Error GETing http://www.google.com/googlebooks/uspto-patents-grants-text.html: Status read failed: A non-blocking socket operation could not be completed immediately. at C:\Perl64\bin/mech-dump line 103.

      Also, I can't seem to locate a way to download curl, wget, or either of the others - can you help me out? Sorry for the kiddy question, I've gotten really used to easy downloads where the download is a big button on the page :P

      oh, didn't see the getprint in there :) OTOH

      $ perl -MLWP::Simple -e " getprint( shift ) " http://foo.bar.baz
      500 Can't connect to foo.bar.baz:80 (Bad hostname) <URL:http://foo.bar.baz>
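Both 500 messages in this thread ("Status read failed" and "Can't connect") look like errors LWP produces on the client side, not replies from a web server. As a rough sketch of how to tell the two apart: LWP marks responses it generated itself with a `Client-Warning: Internal response` header. The `describe_response` helper name and the 30-second timeout below are hypothetical choices for illustration; the script only touches the network if a URL is given on the command line.

```perl
use strict;
use warnings;
use LWP::UserAgent;

# Summarise an HTTP::Response, flagging errors that LWP produced itself.
# (describe_response is a hypothetical helper, not part of LWP.)
sub describe_response {
    my ($res) = @_;
    return 'OK: ' . $res->status_line if $res->is_success;

    # LWP sets this header on responses it builds locally after a failure.
    my $warning = $res->header('Client-Warning') // '';
    my $origin  = $warning eq 'Internal response'
        ? 'generated locally by LWP (no HTTP reply was read)'
        : 'sent by the remote server';

    return 'FAILED: ' . $res->status_line . " [$origin]";
}

# Only contact the network when a URL is passed as an argument.
if (@ARGV) {
    my $ua = LWP::UserAgent->new(timeout => 30);    # assumed timeout
    print describe_response($ua->get($ARGV[0])), "\n";
}
```

If the failure is reported as locally generated for every remote URL, as described in the question, that would point at the machine's network or socket setup rather than at any particular web page.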
