Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

LWP head replacement

by crazyinsomniac (Prior)
on Nov 21, 2001 at 13:01 UTC ( #126739=snippet: print w/ replies, xml ) Need Help??

Description: Inspired by LWP head mystery, where a server replies incorrectly to a HEAD request, which is apparently a common bug, this snippet makes a GET request, but closes the socket after reading only 1 character of the content, thus doing basically the same thing HEAD does, while not having to deal with the common problem of a broken HEAD reply.
#!/usr/bin/perl -w

$^W = 552 >> 3;

use strict;          # for sanity (ALWAYS!!!)
use LWP::UserAgent;
use HTTP::Request;
use HTTP::Response;

HEAD('http://123box.co.uk/');
HEAD('http://japhy.perlmonk.org/book/');

real_HEAD('http://123box.co.uk/');
real_HEAD('http://japhy.perlmonk.org/book/');

sub HEAD {
    my $req = HTTP::Request->new(GET => shift);
    my $UA = new LWP::UserAgent;

    my $res = $UA->request($req, sub { die }, 1);
    
    if($res->is_success) {
        print $res->as_string();
    } else {
        print "Error: " . $res->status_line . "\n";
    }
}


sub real_HEAD {
    my $req = HTTP::Request->new(HEAD => shift);
    my $UA = new LWP::UserAgent;

    my $res = $UA->request($req);
    
    if($res->is_success) {
        print $res->as_string();
    } else {
        print "Error: " . $res->status_line . "\n";
    }
}
__END__
F:\dev\snippets>perl fake.HEAD.pl
HTTP/1.1 200 OK
Connection: close
Date: Wed, 21 Nov 2001 07:49:38 GMT
Server: Apache/1.3.19 (Unix) mod_gzip/1.3.19.1a Resin/1.2.2
Content-Type: text/html
Client-Date: Wed, 21 Nov 2001 07:39:46 GMT
Client-Peer: 212.67.197.196:80
X-Died: Died at fake.HEAD.pl line 18.


HTTP/1.1 200 OK
Connection: close
Date: Wed, 21 Nov 2001 07:40:25 GMT
Accept-Ranges: bytes
Server: Apache/1.3.22 (Unix) mod_perl/1.26 PHP/4.0.6
Content-Length: 1766
Content-Type: text/html
ETag: "57532-6e6-3bf6bcac"
Last-Modified: Sat, 17 Nov 2001 19:38:20 GMT
Client-Date: Wed, 21 Nov 2001 07:39:55 GMT
Client-Peer: 66.92.212.9:80
X-Died: Died at fake.HEAD.pl line 18.


Error: 500 unexpected EOF before status line seen

HTTP/1.1 200 OK
Connection: close
Date: Wed, 21 Nov 2001 07:40:35 GMT
Accept-Ranges: bytes
Server: Apache/1.3.22 (Unix) mod_perl/1.26 PHP/4.0.6
Content-Length: 1766
Content-Type: text/html
ETag: "57532-6e6-3bf6bcac"
Last-Modified: Sat, 17 Nov 2001 19:38:20 GMT
Client-Date: Wed, 21 Nov 2001 07:40:03 GMT
Client-Peer: 66.92.212.9:80



If you wanna try to benchmark (which wouldn't be accurate cause it's network stuff) just add
print "\n>>>>>>>>>>>>>>>>>>>>and now for the benchmark\n";
use Benchmark;
timethese(50, { 'real_HEAD' => sub { real_HEAD('http://japhy.perlmonk.
+org/book/')},
                'HEAD' => sub { HEAD('http://japhy.perlmonk.org/book/'
+)},
              });
I got something resembling
 ........ output from the HEAD .....
      HEAD: 140 wallclock secs (139.73 usr +  0.00 sys = 139.73 CPU) @
+  0.36/s (n=50)
 ........ output from the HEAD .....
 real_HEAD: 208 wallclock secs (208.00 usr +  0.00 sys = 208.00 CPU) @
+  0.24/s (n=50)
Comment on LWP head replacement
Select or Download Code
Re: LWP head replacement
by tachyon (Chancellor) on Nov 21, 2001 at 17:46 UTC

    Neat trick. BTW you don't need to specifically use the HTTP entities with LWPUA. Golf:

    #!/usr/bin/perl -w $^W = 336 >> 3; use strict; use LWP::UserAgent; my $UA = new LWP::UserAgent; $UA->proxy('http', 'http://proxy.ahcl.com:8080/'); HEAD('http://123box.co.uk/'); HEAD('http://japhy.perlmonk.org/book/'); real_HEAD('http://123box.co.uk/'); real_HEAD('http://japhy.perlmonk.org/book/'); sub HEAD { my $req = HTTP::Request->new(GET => shift); my $res = $UA->request($req, sub { }, 1); print "HEAD\n\n", $res->as_string(), "\n"; } sub real_HEAD { my $req = HTTP::Request->new(HEAD => shift); my $res = $UA->request($req); print "real_HEAD\n\n", $res->as_string(), "\n"; } __END__ HEAD HTTP/1.1 200 OK Connection: close Date: Wed, 21 Nov 2001 12:27:24 GMT Via: HTTP/1.1 (IBM-PROXY-WTE), 1.0 NSW-PROXY Server: Apache/1.3.19 (Unix) mod_gzip/1.3.19.1a Resin/1.2.2 Content-Length: 8514 Content-Type: text/html Content-Type: text/html; Client-Date: Wed, 21 Nov 2001 12:33:16 GMT Client-Peer: 10.1.17.5:8080 Title: login HEAD HTTP/1.1 200 OK Connection: close Date: Wed, 21 Nov 2001 12:18:07 GMT Via: 1.1 NSW-PROXY Age: 940 Server: Apache/1.3.22 (Unix) mod_perl/1.26 PHP/4.0.6 Content-Length: 1766 Content-Type: text/html ETag: "57532-6e6-3bf6bcac" Last-Modified: Sat, 17 Nov 2001 19:38:20 GMT Client-Date: Wed, 21 Nov 2001 12:33:18 GMT Client-Peer: 10.1.17.5:8080 Title: Regular Expressions in Perl real_HEAD HTTP/1.1 504 (Gateway Timeout) Proxy Error: Remote host did not send a +ny data - URL "http://123box.co.uk/". Cache-Control: no-cache Connection: close Date: Wed, 21 Nov 2001 12:36:20 GMT Pragma: no-cache Via: HTTP/1.1 (IBM-PROXY-WTE), 1.0 NSW-PROXY Server: IBM-PROXY-WTE/3.0 Content-Type: text/html Expires: Wed, 21 Nov 2001 12:36:20 GMT Last-Modified: Wed, 21 Nov 2001 12:36:20 GMT Client-Date: Wed, 21 Nov 2001 12:33:18 GMT Client-Peer: 10.1.17.5:8080 real_HEAD HTTP/1.1 200 OK Connection: close Date: Wed, 21 Nov 2001 12:43:15 GMT Via: HTTP/1.1 (IBM-PROXY-WTE), 1.0 NSW-PROXY Server: Apache/1.3.22 (Unix) mod_perl/1.26 PHP/4.0.6 Content-Length: 1766 Content-Type: text/html ETag: "57532-6e6-3bf6bcac" Last-Modified: Sat, 17 Nov 2001 19:38:20 GMT Client-Date: Wed, 21 Nov 2001 12:33:19 GMT Client-Peer: 10.1.17.5:8080

    cheers

    tachyon

    s&&rsenoyhcatreve&&&s&n.+t&"$'$`$\"$\&"&ee&&y&srve&&d&&print

      Here's crazyinsomniac's version with a few extras that it took me way to long to figure out it needed to work for me (I still have to view source to see the output, but at least I don't get an internal service error), and two extra new-lines to clarify error messages.

      Also, I think one of the addresses in the script used to return a bad header, but someone apparently fixed it, so I changed an address to a nonexistant one so I could see the error handler at work.

      #!/usr/bin/perl -w ###################################################################### +######## print "Cache-Control: no-cache, must-revalidate\n"; print "Pragma: no-cache\n"; print "Content-type: text/html\n\n"; $^W = 552 >> 3; use strict; # for sanity (ALWAYS!!!) use LWP::UserAgent; use HTTP::Request; use HTTP::Response; HEAD('http://12box.co.uk/'); HEAD('http://japhy.perlmonk.org/book/'); real_HEAD('http://123box.co.uk/'); real_HEAD('http://japhy.perlmonk.org/book/'); sub HEAD { my $req = HTTP::Request->new(GET => shift); my $UA = new LWP::UserAgent; my $res = $UA->request($req, sub { die }, 1); if($res->is_success) { print $res->as_string(); } else { print "\nError: " . $res->status_line . "\n\n"; } } sub real_HEAD { my $req = HTTP::Request->new(HEAD => shift); my $UA = new LWP::UserAgent; my $res = $UA->request($req); if($res->is_success) { print $res->as_string(); } else { print "Error: " . $res->status_line . "\n\n"; } } __END__
•Re: LWP head replacement
by merlyn (Sage) on Nov 03, 2003 at 12:02 UTC
    It also suffices in recent LWP releases to simply say:
    my $ua = LWP::UserAgent->new(max_size => 1); ... for my $url (@list_of_urls_to_check) { my $res = $ua->head($url); unless ($res->is_success) { $res = $ua->get($url); } ... }
    which effectively does what your callback does in a lot less typing. {grin}

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

Back to Snippets Section

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: snippet [id://126739]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others rifling through the Monastery: (7)
As of 2014-08-29 05:51 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The best computer themed movie is:











    Results (275 votes), past polls