PerlMonks  

Re: WWW::Mechanize - error on geting non-existing page

by Limbic~Region (Chancellor)
on Nov 05, 2011 at 18:20 UTC (#936180)


in reply to WWW::Mechanize - error on geting non-existing page

Anonymous Monk,
With current versions of WWW::Mechanize, autocheck is enabled by default, which causes $mech to die when a request fails. Here is how you might want to handle it:

    eval { $mech->get($url); };
    if ($@) {
        if ($mech->status == 404) {
            # handle page not found
        }
        else {
            die "Mech failed to fetch '$url': $@";
        }
    }

Cheers - L~R


Re^2: WWW::Mechanize - error on geting non-existing page
by cavac (Chaplain) on Nov 05, 2011 at 20:43 UTC
    You also might want to switch to WWW::Mechanize::GZip, since some versions of the original WWW::Mechanize seem to advertise support for HTTP compression without actually handling it. This might result in "unparseable" results returned from the webserver.
    Don't use '#ff0000':
    use Acme::AutoColor; my $redcolor = RED();
    All colors subject to change without notice.

      You also might want to switch to WWW::Mechanize::GZip, since some versions of the original WWW::Mechanize seem to advertise support for HTTP compression without actually handling it. This might result in "unparseable" results returned from the webserver.

      I doubt this ever happens :) but if it did, the solution would be to always install the latest WWW::Mechanize, LWP, and Compress::Zlib.

      If you were going to install WWW::Mechanize::GZip, you might as well install the latest WWW::Mechanize, LWP, and Compress::Zlib instead.

        Thanks for the info. I had the problem a few months ago on Windows (for a Windows service compiled with the ActiveState tools) after upgrading Maplat with the compression options. It took me a while to figure out the root of the problem, and it was affecting production in the meantime, so I grabbed at the first straw I could find. So far, it hasn't let me down.

        But I'm gonna try it your way. Sounds a bit saner than what I had come up with. Thanks.

        Anonymous Monk,
        I doubt this ever happens :)

        Until today, I would have agreed with you. See Re^2: What Tools Do You Use With WWW::Mechanize. Essentially, $mech->mirror() will not decompress the file but that doesn't mean that Mech can't handle compression (I provide an alternative way of downloading the file that works just fine in the node).
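For readers without the linked node at hand, here is a minimal sketch of one way to download a file so that compression is undone. It is not necessarily the approach used in that node. It assumes a plain get followed by decoded_content on the response object; save_decoded is a hypothetical helper name invented for this sketch.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# save_decoded() is a hypothetical helper, not part of WWW::Mechanize.
# It fetches $url with the given $mech and writes the decoded body to $file.
sub save_decoded {
    my ( $mech, $url, $file ) = @_;

    $mech->get($url);    # dies on failure while autocheck is on

    # Undo any Content-Encoding (gzip, deflate) but leave the charset
    # alone, so the bytes written match the uncompressed resource.
    my $bytes = $mech->response->decoded_content( charset => 'none' );
    die "Could not decode response from '$url'\n" unless defined $bytes;

    open my $fh, '>:raw', $file or die "Cannot write '$file': $!";
    print {$fh} $bytes;
    close $fh or die "Cannot close '$file': $!";

    return length $bytes;    # number of bytes written
}
```

Called as save_decoded($mech, $url, 'page.html'), this takes the place of $mech->mirror($url, 'page.html') at the cost of always re-fetching, since mirror's If-Modified-Since handling is not reproduced here.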

        Cheers - L~R

Re^2: WWW::Mechanize - error on geting non-existing page
by Anonymous Monk on Nov 05, 2011 at 23:09 UTC
    Thanks a bunch, problem solved!
Re^2: WWW::Mechanize - error on geting non-existing page
by Your Mother (Canon) on Nov 05, 2011 at 23:22 UTC

    You can also just set onerror => undef which is easier and less error prone than eval/try style code.
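A minimal sketch of the non-fatal style, for comparison with the eval version earlier in the thread. It uses the documented autocheck => 0 constructor option rather than onerror => undef; both are meant to stop get from dying on a failed request. fetch_status is a hypothetical helper name, and the file:// URL is only there so the example runs without a network.

```perl
use strict;
use warnings;
use WWW::Mechanize;

# fetch_status() is a hypothetical helper for this sketch.
# It classifies a fetch without letting Mechanize die on HTTP errors.
sub fetch_status {
    my ($url) = @_;

    # autocheck => 0 turns off the automatic die on failed requests.
    my $mech = WWW::Mechanize->new( autocheck => 0 );
    $mech->get($url);

    return 'ok'        if $mech->success;
    return 'not found' if $mech->status == 404;
    return 'failed: ' . $mech->response->status_line;
}

# A file:// URL that does not exist is answered with a 404 by LWP.
print fetch_status('file:///no/such/page.html'), "\n";
```

Every request is still checked; the check just happens through success and status instead of through $@.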

      Your Mother,
      I honestly can't think of any reason this might be true. Do you also establish a DBI connection without RaiseError set to true? It is akin to not checking the return code of open. If something goes wrong, I want to know about it very loudly instead of blindly assuming everything is ok.

      Can you give me an example of where setting it to false is better? I assure you I haven't adopted this strategy as a cargo cult practice and am genuinely interested in hearing your views on how this is better.

      Cheers - L~R

        …well, all examples really. Though it depends on how you see it. I never advocated ignoring return states.

        Not finding a page online is not an exception(al case). It’s entirely normal. Should your car blow up if a road sign is missing, or should the driver revisit the map? If you need to deal with any of the various 400s and 500s, it’s just easier on the hacker to not wrap it up in evals. The end code will have identical functionality: i.e., every request is checked for success or unexpected codes and then processed accordingly. All the exception handling buys you here is additional complexity (and the chance to fall into eval problems with disappearing $@ if you’re not careful).
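The "disappearing $@" problem can be sidestepped by testing the return value of eval instead of $@ itself, since $@ may be reset before you look at it (for example by a destructor that runs its own eval on older Perls). A minimal sketch, where try_fetch is a hypothetical stand-in for any call that may die, such as $mech->get($url) with autocheck enabled:

```perl
use strict;
use warnings;

# try_fetch() is a hypothetical stand-in for a call that may die.
sub try_fetch {
    my ($should_fail) = @_;
    die "404 Not Found\n" if $should_fail;
    return 'page content';
}

# Rely on eval's return value, not on $@ being true.
sub safe_fetch {
    my ($should_fail) = @_;
    my $content;
    my $ok = eval {
        $content = try_fetch($should_fail);
        1;    # reached only if nothing died
    };
    if ( !$ok ) {
        my $err = $@ || 'unknown error';    # copy $@ right away
        chomp $err;
        return "error: $err";
    }
    return $content;
}

print safe_fetch(0), "\n";    # prints "page content"
print safe_fetch(1), "\n";    # prints "error: 404 Not Found"
```

Modules such as Try::Tiny wrap the same idea, but the plain idiom above needs nothing beyond core Perl.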

        The change to make this stuff fatal by default is recent-ish. For most of the life of Mech, it wasn’t this way so it was a pretty big style break for those of us using it from the beginning, and I think most LWP hackers are already accustomed to checking $response->is_success habitually.

        There may be some value in the fatals for the newcomer who has incorrect expectations but for the experienced user of the kit it’s just an annoying hoop (and a bunch of broken scripts when the update came through but I digress… uh, and whine).
