LWP::Simple: a little more complicated than it sounds

by Zed_Lopez (Chaplain)
on Dec 05, 2004 at 23:12 UTC ( [id://412544]=perlmeditation )

I recently had a frustrating debugging session in which, for a program using LWP::Simple,

my $result = get($url);

was assigning undef to $result, yet

getprint($url);

was getting and printing $url's HTML.

Digging into the code, I found out why.

getprint() (and getstore() and head()) drives HTTP::Request. But ever since libwww-perl 5.15, from November 6, 1997, get() normally doesn't. LWP::Simple rolls its own super-lightweight stand-in for HTTP::Request, _trivial_http_get, and that's what get() normally uses.

And while getprint() (etc.) uses a user agent like "LWP::Simple/5.79" (the number is libwww-perl's version) and protocol HTTP 1.1, _trivial_http_get uses "lwp-trivial/1.40" (the number is LWP::Simple's version) and protocol HTTP 1.0. So a robots.txt that allows getprint() can forbid get().

There are two cases where get() goes through HTTP::Request anyway. If you're using a proxy (as determined by looking for the existence of an HTTP_PROXY environment variable), get() will use HTTP::Request. And if _trivial_http_get gets an HTTP redirect, it'll switch to using HTTP::Request.
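
To make the user agent point concrete, here is a minimal sketch using WWW::RobotRules to ask how a robots.txt parser would treat the two agent strings quoted above. The robots.txt text and the example.com URLs are made up for illustration. As far as I know LWP::Simple does not read robots.txt itself, so this only shows how a site keying on the User-Agent name could end up treating get() and getprint() differently.

```perl
use strict;
use warnings;
use WWW::RobotRules;

# Hypothetical robots.txt that singles out the lightweight client by name.
my $robots_txt = <<'END';
User-agent: lwp-trivial
Disallow: /

User-agent: *
Disallow:
END

# Assumed URLs, used only as keys for the rule lookup (nothing is fetched).
my $robots_url = 'http://www.example.com/robots.txt';
my $page_url   = 'http://www.example.com/index.html';

# Ask the same question once per user agent string from the post.
my %verdict;
for my $agent ('LWP::Simple/5.79', 'lwp-trivial/1.40') {
    my $rules = WWW::RobotRules->new($agent);
    $rules->parse($robots_url, $robots_txt);
    $verdict{$agent} = $rules->allowed($page_url) ? 'allowed' : 'forbidden';
    print "$agent: $verdict{$agent}\n";
}
```

With these made-up rules the full user agent should come back as allowed and lwp-trivial as forbidden, which mirrors the getprint()-works-but-get()-fails symptom.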

Or you can import $ua, the LWP::UserAgent object LWP::Simple uses, and, as a side effect, it'll guarantee that get() always drives HTTP::Request. Remember that if you're specifying a list to import, the module's @EXPORT list won't be exported by default -- it's now incumbent upon you to include all the names you want imported.

use LWP::Simple qw($ua get);
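
Once $ua is imported you can also configure it, and ask it directly why a fetch failed, since get() itself only ever returns undef on failure. The following is a sketch under a few assumptions: the agent name "MyFetcher/0.1" and the helper fetch_or_explain are hypothetical, and the script only touches the network if you pass a URL on the command line.

```perl
use strict;
use warnings;
use LWP::Simple qw($ua get);

# Importing $ua exposes the LWP::UserAgent object that LWP::Simple uses.
$ua->agent('MyFetcher/0.1');    # assumed name, pick your own
$ua->timeout(10);               # seconds

# Hypothetical helper: try get(), and on failure report the HTTP status.
sub fetch_or_explain {
    my ($url) = @_;
    my $content = get($url);
    return $content if defined $content;

    # get() only says "undef"; ask the user agent for the details.
    my $response = $ua->get($url);
    return $response->decoded_content if $response->is_success;
    warn "Could not fetch $url: ", $response->status_line, "\n";
    return;
}

# Only touch the network when a URL is given on the command line.
if (@ARGV) {
    my $html = fetch_or_explain($ARGV[0]);
    print length($html), " bytes fetched\n" if defined $html;
}
```

The second request through $ua->get is there purely for diagnosis; in real code you might skip get() entirely and use the LWP::UserAgent response from the start.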

I'm writing a doc patch to make some of this clearer; the maintainer, Gisle Aas, has verified that importing $ua is the only officially supported technique to force get() to use HTTP::Request.

Updated: linkified module names.

Replies are listed 'Best First'.
Re: LWP::Simple: a little more complicated than it sounds
by Roy Johnson (Monsignor) on Dec 06, 2004 at 16:17 UTC
    it's now incumbent upon you to include all the names you want imported.
    If I correctly understand the docs for import, you could do
    use LWP::Simple qw(:DEFAULT $ua get);
    and you'd have the @EXPORT list in addition to the extra critters.
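
    The Exporter rule being discussed can be checked without LWP at all. Below is a small sketch with a hypothetical stand-in module, My::Fetcher, that has two default exports and one optional one; the package names Explicit and WithDefault are likewise invented for the demonstration.

```perl
use strict;
use warnings;

# Hypothetical stand-in for a module with default and optional exports.
BEGIN {
    package My::Fetcher;
    use Exporter 'import';
    our @EXPORT    = qw(fetch fetch_print);    # exported by default
    our @EXPORT_OK = qw($agent);               # exported only on request
    our $agent     = 'my-fetcher/0.1';
    sub fetch       { return "fetched $_[0]" }
    sub fetch_print { print fetch($_[0]), "\n" }
    $INC{'My/Fetcher.pm'} = __FILE__;          # tell 'use' it is already loaded
}

# An explicit list replaces the defaults: only what is named gets imported.
package Explicit;
use My::Fetcher qw($agent fetch);

# :DEFAULT pulls in @EXPORT in addition to the extra names.
package WithDefault;
use My::Fetcher qw(:DEFAULT $agent);

package main;
for my $pkg (qw(Explicit WithDefault)) {
    my @have = grep { $pkg->can($_) } qw(fetch fetch_print);
    print "$pkg imported: @have\n";
}
```

    Run as is, this should report that Explicit has only fetch while WithDefault has both fetch and fetch_print.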

    Caution: Contents may have been coded under pressure.
Re: LWP::Simple: a little more complicated than it sounds
by Your Mother (Archbishop) on Dec 06, 2004 at 01:49 UTC

    Nice catch++. I ran into this exact same problem last week but it wasn't for doing anything important so I dropped it without investigating. Thanks for doing the legwork for me.

Node Type: perlmeditation [id://412544]
Approved by rob_au
Front-paged by broquaint