PerlMonks  

Infrequent LWP::UserAgent 500 connect: Invalid argument

by superfrink (Curate)
on Dec 29, 2006 at 22:53 UTC ( [id://592305]=perlquestion )

superfrink has asked for the wisdom of the Perl Monks concerning the following question:

I have a script that gets a file using HTTP with BasicAuth. Sometimes the GET fails with a code 500 even though the Apache server logs show a code 200.

I thought it was an ISP proxy issue, so I added code to retry the GET request when is_success() is not true. After that I found that, on rare occasions, even though LWP reported a code 200 for the download, the actual contents of the file would be:
500 Can't connect to miniwall.foo.com:80 (connect: Invalid argument)
My next step was to create a separate file on the web server that contained the MD5 of the file I was trying to download. I updated the script to GET both files, calculate the MD5 of the file I am interested in and compare them. If the checksum does not match then retry both GET requests.

This worked in general. Sometimes the MD5 file contained the "500 ..." error message but that was okay because the MD5 checksum calculated from the data file would never match the error message in the MD5 file.
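The checksum comparison described above could look roughly like the following. This is only a sketch: the helper name content_matches_md5 and the assumption that the MD5 file starts with a 32-digit hex digest are illustrative and are not taken from the original script.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: returns 1 when the MD5 of $content equals the digest
# found at the start of the downloaded MD5 file, and 0 otherwise.
# Assumes the MD5 file begins with a 32-digit hex digest.
sub content_matches_md5 {
    my ($content, $md5_file_text) = @_;
    return 0 unless defined $content && defined $md5_file_text;

    # An error body such as "500 Can't connect ..." does not start with
    # 32 hex digits, so it is rejected before any comparison is made.
    my ($expected) = $md5_file_text =~ /\A\s*([0-9a-fA-F]{32})\b/
        or return 0;

    return lc($expected) eq md5_hex($content) ? 1 : 0;
}
```

Rejecting anything that is not shaped like a digest means an error message stored in the MD5 file is treated the same as a mismatch.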

Still, there are times when the downloaded file is written to disk containing the "500 ..." message.

I added use LWP::Debug qw(+); and started running the script and grepping the downloaded files between runs. After about 5 runs I got a case where the file contents contain an error. (Each run downloads 12 files + 12 MD5 files => 24 files).

The output for a file that was downloaded without error looks like:
LWP::UserAgent::request: ()
LWP::UserAgent::send_request: GET http://miniwall.foo.com/update-4.0/net4801/etc-files/nrpe.cfg
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()
LWP::Protocol::collect: read 624 bytes
LWP::Protocol::collect: read 4096 bytes
LWP::Protocol::collect: read 1539 bytes
LWP::UserAgent::request: Simple response: OK

A download where an error occurred looks like:
LWP::UserAgent::request: ()
LWP::UserAgent::send_request: GET http://miniwall.foo.com/update-4.0/net4801/scripts/ping-gw-by-int.pl
LWP::UserAgent::_need_proxy: Not proxied
LWP::Protocol::http::request: ()
LWP::UserAgent::request: Simple response: Internal Server Error

In the case of an error debugging printed:
LWP::UserAgent::request: Simple response: Internal Server Error
but is_success() returned true.

The part of the code that does the downloading and checks for errors is:
# GOAL : download a copy of the remote file
my $file_url = $base_url . $remote_file;
my $ua = LWP::UserAgent->new;
my $req = GET $file_url;
my $downloaded = 0;   # 0 = not d/l , 1 = d/l
my $tries = 4;
while( (! $downloaded) and ($tries >= 0) ) {
    -- $tries;
    #print "tries: $tries\n";
    $req->authorization_basic('mwuser', 'mwpass');
    my $response = $ua->request($req);
    my $file_content = $ua->request($req)->content;
    # print $file_content;
    my $md5_is_good;
    check_file_md5($file_url, $file_content, \$md5_is_good);
    #print "md5_is_good: $md5_is_good\n";
    if ($response->is_success and $md5_is_good) {
        $downloaded = 1;
    }
    else {
        my $msg = "$0: unable to get file. '$file_url'"
            . " '" . $response->code. "'"
            . " '" . status_message($response->code). "'" ;
        unless ($md5_is_good) {
            $msg .= " MD5 sum mismatch error";
        }
        warn $msg;
        if ($tries < 0) {
            next CFG_KEY;   # next file to download
        }
        else {
            print "re-trying $file_url\n";
        }
    }
}
my $file_content = $ua->request($req)->content;
# print $file_content;
Does this code fail to correctly check for a failed request?

I am certain that the check_file_md5() routine works (at least sometimes) because I have seen the "MD5 sum mismatch error" string output from time to time. Somehow there are times where the MD5 checksum matched and yet the "500 ..." string is written to the output file. OH! Why do I do this line twice?
my $file_content = $ua->request($req)->content;
I wonder if the separate calls to $ua->request($req) are each making their own GET request to the server, so that content() is being read from different responses. I will remove the second call so the MD5 is calculated on the same data written to the output file and see if the problem goes away.
By the way I am running the script with:
Perl 5.8.8
LWP::UserAgent version 2.033
OpenBSD 4.0

I ran ktrace and see that the connect() system call really is failing but I am not sure why.
30481 perl CALL connect(0x3,0x85b8bae0,0x10)
30481 perl RET connect -1 errno 22 Invalid argument

Update: I made two changes to the code. I replaced
my $response = $ua->request($req);
my $file_content = $ua->request($req)->content;
with
my $response = $ua->request($req);
my $file_content = $response->content;
I also removed the second call to ->content(). After several more runs of the script I have not yet noticed the error message in an output file.
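The idea behind the update (one request per attempt, with the status check, the checksum and the saved data all taken from that same response) might be sketched as below. The names fetch_verified and $fetch are hypothetical. $fetch is assumed to be a code ref that performs exactly one request and returns an object with is_success, content and status_line methods, which an HTTP::Response returned by $ua->request($req) provides.

```perl
use strict;
use warnings;
use Digest::MD5 qw(md5_hex);

# Hypothetical helper: retries up to $max_tries times and returns the
# verified content, or nothing (undef in scalar context) if every try fails.
sub fetch_verified {
    my ($fetch, $expected_md5, $max_tries) = @_;

    for my $attempt (1 .. $max_tries) {
        my $response = $fetch->();           # exactly one request per attempt
        my $content  = $response->content;   # read from that same response

        unless ($response->is_success) {
            warn "attempt $attempt failed: " . $response->status_line . "\n";
            next;
        }
        unless (md5_hex($content) eq lc($expected_md5)) {
            warn "attempt $attempt: MD5 sum mismatch\n";
            next;
        }
        return $content;    # the bytes that were checked are the bytes returned
    }
    return;
}
```

With LWP this could be called as fetch_verified(sub { $ua->request($req) }, $md5, 5), where $md5 is the digest read from the separate MD5 file. The caller then writes the returned content to disk rather than issuing another request.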

Replies are listed 'Best First'.
Re: Infrequent LWP::UserAgent 500 connect: Invalid argument
by alpha (Scribe) on Dec 30, 2006 at 00:04 UTC
    Error #500 usually means an error in a server-side script. IMO you should check it first...
      The first thing I checked was the Apache logs. The files Apache is serving are plain text (no CGI). The Apache error log is empty and the access log shows the requests having been served with a status 200.

      I have not watched the wire with tcpdump yet. I might have to if the problems continue. I am hoping that retrying when there was an error or the file is corrupt gets around the problem. :(

      For what it is worth I have seen the problem from three different OpenBSD machines. They are located in different corners of the city. They were all downloading from the same server (running Apache on Slackware). I have not seen similar problems with that Apache server when using a graphical browser and it runs several sites (wiki, request tracker, etc).
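Since the Apache logs show 200 while the client sees 500, the 500 was most likely produced on the client side. The LWP::UserAgent documentation says that error responses generated internally by LWP carry a Client-Warning header with the value "Internal response". A minimal sketch of a check based on that follows. The helper name is_internal_response is hypothetical, and it is an assumption that the LWP version listed in the post behaves the same way.

```perl
use strict;
use warnings;

# Hypothetical helper: true when the response looks like one LWP generated
# itself (for example "500 Can't connect ...") rather than one sent by the
# web server. $response is expected to offer a header() method, as
# HTTP::Response does.
sub is_internal_response {
    my ($response) = @_;
    my $warning = $response->header('Client-Warning');
    return (defined $warning && $warning eq 'Internal response') ? 1 : 0;
}
```

Logging this flag next to the status code would show whether a given 500 ever reached Apache at all.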
