PerlMonks  

Re: suck on www::mechanize question

by leira (Monk)
on Mar 01, 2004 at 06:24 UTC ( [id://332800]=note )


in reply to can't get www::mechanize to work on a web site

It doesn't look like you were far off in your script, and I think the "Connection refused" error that Roger mentioned might be your problem. You should make sure that doing those actions in a normal browser does what you expect it to do, before you start blaming your script for unexpected results.

You could try using HTTP::Recorder to record your WWW::Mechanize script. I tried it, and it generated this script:

$mech->get("http://search.lyrics.astraweb.com/?word=hey+jude");
$mech->follow_link(text => "Hey Jude (lennon/mccartney)", n => "1");

The "Hey Jude" link (without authors) produced a 500 (connection refused) error, so I chose another one for the example.

Linda

Replies are listed 'Best First'.
Re: Re: suck on www::mechanize question
by smackdab (Pilgrim) on Mar 01, 2004 at 07:10 UTC
    Thanks for trying it!
    The connection refused error might be the problem, but I can get it to work in the browser, not www::mechanize. I'll look into HTTP::Recorder and see what it does...

    Did you get it to work??????

    I tried switching the artist/song to others w/o change. I also tried the ->follow_link having the artist and song as part of the search string, but no difference.

    The original $mech->get() has artist and song, if you know of any artist/song that works, maybe that would give me a hint...(or maybe not...)
      OK, further investigation suggests that it's not one bad link, but that the server just sometimes returns a 500 error. Other times it succeeds. If I run my script several times, it will sometimes succeed and sometimes fail -- and I get the same results if I try to follow the link several times in my browser.

      I was able to get around it like this:

      my $maxtries = 10;
      my $i = 1;
      while ($i <= $maxtries) {
          $mech->follow_link(text => "Hey Jude (lennon/mccartney)", n => "1");
          last if $mech->success;
          $mech->back();
          $i++;
      }
      $mech->success() or die "Can't find song page\n";

      Of course, you can set $maxtries to whatever you think is prudent, and you can put in a sleep() in the loop if you think that might help.
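As a rough sketch of the sleep() idea, the retry logic from the loop above could be pulled into a small helper. The name retry_with_delay and its arguments are made up for illustration and are not part of WWW::Mechanize. The commented usage assumes autocheck is off, so a failed follow_link() does not die.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of WWW::Mechanize): call $action up to
# $max_tries times, sleeping $delay seconds between failed attempts.
# $action must return true on success and false on failure.
# Returns the number of the attempt that succeeded, or 0 if none did.
sub retry_with_delay {
    my ($action, $max_tries, $delay) = @_;
    for my $attempt (1 .. $max_tries) {
        return $attempt if $action->($attempt);
        # No point sleeping after the final failed attempt.
        sleep $delay if $delay && $attempt < $max_tries;
    }
    return 0;
}

# Usage sketch with an existing $mech object (assumed, autocheck off):
#
#   my $ok = retry_with_delay(sub {
#       $mech->follow_link(text => "Hey Jude (lennon/mccartney)", n => 1);
#       return 1 if $mech->success;
#       $mech->back();    # return to the search results before retrying
#       return 0;
#   }, 10, 2);
#   $ok or die "Can't find song page\n";
```

The 10 tries and 2 second delay are arbitrary; pick whatever seems polite to the server.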

      Linda

        Thanks again for the help

        I tried your string 200 times w/o success and it works every time in the browser...Don't know what I could be doing wrong...

        I don't think it is the lyrics web site as the first ->get() works, it is just the 2nd one that fails...

        The link that fails in WWW::Mechanize is:

        http://display.lyrics.astraweb.com:2000/display.cgi?beatles..beatles_1..hey_jude

        This URL works in my browser always...
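One way to narrow this down might be to request that URL directly from the script and print the status line, rather than going through follow_link(). The helper name fetch_status below is hypothetical. It only relies on the get() and response() methods of the object it is given, and it assumes the object was created with autocheck off so that a failed get() does not die.

```perl
use strict;
use warnings;

# Hypothetical helper: fetch $url with a WWW::Mechanize-style object and
# return the HTTP status line of the response (e.g. "200 OK").
# Assumes autocheck is off, so get() returns instead of dying on failure.
sub fetch_status {
    my ($mech, $url) = @_;
    $mech->get($url);
    return $mech->response->status_line;
}

# Usage sketch (needs WWW::Mechanize and network access):
#
#   use WWW::Mechanize;
#   my $mech = WWW::Mechanize->new(autocheck => 0);
#   print fetch_status($mech,
#       "http://display.lyrics.astraweb.com:2000/display.cgi?beatles..beatles_1..hey_jude"),
#       "\n";
#
# To see which URL a link text resolves to before following it:
#
#   my $link = $mech->find_link(text => "Hey Jude");
#   print $link->url_abs, "\n" if $link;
```

If the direct get() fails with the same error, the search page and the link text are probably not the issue.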

        Here is my complete test program in case someone else wants to give it a try ;-)

        use WWW::Mechanize;
        use URI::URL;
        use strict;
        use warnings;

        my $artist = 'The Beatles';
        my $title = 'Hey Jude';

        my $mech = WWW::Mechanize->new(autocheck=>1);
        #$mech->agent_alias('Windows IE 6');

        my $search = join("+", split(/ /, $artist)) . "+" . join("+", split(/ /, $title));
        print "search=" . "http://search.lyrics.astraweb.com/?word=$search" . "\n";

        $mech->get("http://search.lyrics.astraweb.com/?word=$search");
        $mech->success() or die "Can't get the search page\n";

        $mech->follow_link(text=>"Hey Jude");
        #$mech->follow_link(n=>6);
        $mech->success() or die "Can't find lyrics page\n";
