
LWP::UserAgent connection problem as CGI

by bassplayer (Monsignor)
on Mar 03, 2004 at 17:58 UTC

bassplayer has asked for the wisdom of the Perl Monks concerning the following question:

Greetings Wise Monks,

I am having a very strange problem. I have a script which uses LWP::UserAgent to connect to web pages. It has functioned properly for quite some time, but started failing last week, and LWP::UserAgent reports a 500 error. From this node, I learned that a 500 error does not necessarily come from the server, and that LWP will report 500 even if it does not reach the server. Following a great suggestion I got from this node, I Data::Dumpered the value returned by $ua->request(), discovering that I am getting the message:

Can't connect to <domain>:80 (Bad Hostname '<domain>')

This only happens when the code is run as a CGI.
It works properly from command line, whether as root or the apache user. This script is on several servers, and is functioning properly on one of them, failing on most. I have compared the code between the working server and the rest, and found no differences. LWP::UserAgent is the same on both (2.003) and so is HTTP::Request (1.30). Both machines are running FreeBSD 4.8, and /etc/resolv.conf is the same for both. Our sysadmins made no recent changes to these servers.

Update: Using an IP address instead of the domain name works. Also, forgot to mention that from command line, I can ping and ssh to other domains. Also, mod_perl is involved here.

The code does not seem to be the issue here, but here is the relevant snippet anyway (not mine):
my $qurl = 'legitimate url (tested -- fails for many)';
my $uasnd = new HTTP::Request ('GET', $qurl);
$uasnd->header('Accept' => 'text/html');
my $res = $ua->request($uasnd);
use Data::Dumper;
print STDERR "RES: ".Dumper( $res )."\n";
my $code= $res->code;       # RETURNS 500
my $content= $res->content; # NADA
Output is:
RES: $VAR1 = bless( {
    '_content' => '',
    '_rc' => 500,
    '_headers' => bless( {
        'client-date' => 'Wed, 03 Mar 2004 17:37:09 GMT'
    }, 'HTTP::Headers' ),
    '_msg' => 'Can\'t connect to domainname.net:80 (Bad hostname \'domainname.net\')',
    '_request' => bless( {
        '_content' => '',
        '_uri' => bless( do{\(my $o = 'http://domainname.net/?param1=1&param2=2')}, 'URI::http' ),
        '_headers' => bless( {
            'user-agent' => 'libwww-perl/5.69',
            'accept' => 'text/html'
        }, 'HTTP::Headers' ),
        '_method' => 'GET'
    }, 'HTTP::Request' )
}, 'HTTP::Response' );
Something tells me this is going to be something stupid that I have forgotten, but I have asked several others about this. Has anyone had a similar experience, or does anyone know what this could be? As always, your time and assistance are much appreciated.

bassplayer
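
For readers hitting the same symptom, here is a minimal sketch of separating a 500 that LWP generated locally from one the remote server actually sent. The helper name classify_500 is hypothetical. It takes the three values you would read from the response with $res->code, $res->message and $res->header('Client-Warning'). Newer libwww-perl releases mark locally generated responses with a "Client-Warning: Internal response" header; the 5.69 dump above carries no such header, so the sketch falls back to matching the message text, which is an assumption based on the wording shown in this thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: decide whether a 500 came from LWP itself or the server.
sub classify_500 {
    my ($code, $message, $client_warning) = @_;
    return 'not a 500' unless defined $code && $code == 500;

    # Newer LWP versions flag locally generated responses with this header.
    return 'generated by LWP (never reached the server)'
        if defined $client_warning && $client_warning =~ /Internal response/i;

    # Fallback for older versions: match the wording seen in this thread.
    # This pattern is an assumption, not a documented interface.
    return 'generated by LWP (never reached the server)'
        if defined $message && $message =~ /^Can't connect to /;

    return 'probably sent by the remote server';
}

# Values taken from the Data::Dumper output in the question.
print classify_500(
    500,
    "Can't connect to domainname.net:80 (Bad hostname 'domainname.net')",
    undef,
), "\n";
```

In a real script the three arguments would come straight from the HTTP::Response object returned by $ua->request().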

Replies are listed 'Best First'.
Re: LWP::UserAgent connection problem as CGI
by tachyon (Chancellor) on Mar 03, 2004 at 18:47 UTC

    The error message you see is generated initially by IO::Socket::INET's configure method when it has called its _get_addr() method which has in turn called gethostbyname() which is what will have choked in the first place. (Probably)

    To debug write a CGI like this:

    #!/usr/bin/perl
    print "Content-Type: text/html\n\n";
    my $host = gethostbyname($ENV{QUERY_STRING});
    print $host ? "Got host $host\n" : "Choked!";

    Call it like: http://yourdomain.com/cgi-bin/gethost?perlmonks.org

    Assuming you call it gethost, you should see it print 4 binary bytes as the host name (the packed IP address). If it says Choked!, you have an issue with gethostbyname under CGI. If so, it will most likely be perms, as your CGI will not be running as you.

    An alternative way to prove it is perms is

    su -m nobody -c ./script.pl

    i.e. change to the user nobody/apache or whatever the webserver runs as and see if it chokes. I am making the assumption that you are testing it off the command line *on the same box* that is running the CGI? If not, all bets are off!

    If gethostbyname is working your IO::Socket::INET has issues. The most probable reason for this is that you have 2 perls on your system. The webserver will probably do

    ./script.pl

    To exec the script and thus use the shebang. Probably. It could be running mod_perl or lots of other stuff, but that is the high probability. If you are executing it as perl script.pl then you could quite easily be running it with a different perl to the web server. We have 5.6.1, 5.6.2, 5.8.2 and 5.8.3 on our devel box. perl will currently give you 5.6.2 but... Let us know your results.

    cheers

    tachyon
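
A slightly more talkative variant of the test CGI above, as a sketch only. It prints the resolved address in dotted form instead of raw bytes, and on failure reports the real and effective user IDs and the interpreter path, which are the things this reply suspects. The helper name describe_lookup and the default host name 'localhost' are assumptions made for illustration. It sticks to -w and core modules so it should also run on older perls.

```perl
#!/usr/bin/perl -w
use strict;
use Socket qw(inet_ntoa);

# Hypothetical helper: turn the packed address from gethostbyname into text.
sub describe_lookup {
    my ($name, $packed) = @_;

    # No usable 4-byte IPv4 address: report who and what is running.
    return sprintf("Choked on '%s' (uid %d, euid %d, perl %s)\n",
                   $name, $<, $>, $^X)
        unless defined $packed && length($packed) == 4;

    return "Got host $name => " . inet_ntoa($packed) . "\n";
}

my $name = $ENV{QUERY_STRING};
$name = 'localhost' unless defined $name && length $name;   # assumed default

print "Content-Type: text/plain\n\n";
# In scalar context gethostbyname returns the packed address, or undef on failure.
print describe_lookup($name, scalar gethostbyname($name));
```

Run it once through the web server and once from the shell; a difference in the output between the two is the interesting part.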

      Thanks for the lengthy response. I tried your script and it failed on the problem server, and succeeded on the working server. Given that, you are saying that the issue is likely permissions. As stated in my original post, my script works at command line whether as root, or as the user that apache runs as. For the record, I am indeed testing on the same machine as the CGI is running. Same script, actually, with a use lib statement added to lead to our modules (a line usually provided by mod_perl).

      I also tried running your test script with the shebang line pointing to any perl install I could find, as well as without a shebang line, using all of the different perl installs before the script name. Is there another way that permissions could be affecting this that I am not seeing, or can you suggest further tests to help me narrow this down? I also added the following line to your script:

      print "\nVERSION:  ".`perl -v`."\n";

      The version was always displayed as 5.8.0, regardless of what I tried to make it use (via shebang or command line). The perl installs appear to be different (not just symbolic links to one). The @INC array did change with each different shebang line:
      #!/usr/bin/perl
      #!/usr/local/bin/perl
      #!/usr/local/bin/perl5.8.0
      produce:
      /usr/local/lib/perl5/5.8.0/i386-freebsd
      /usr/local/lib/perl5/5.8.0
      /usr/local/lib/perl5/site_perl/5.8.0/i386-freebsd
      /usr/local/lib/perl5/site_perl/5.8.0
      /usr/local/lib/perl5/site_perl

      #!/usr/bin/perl5
      #!/usr/bin/perl5.00503
      produce:
      /usr/local/lib/perl5/site_perl/5.005/i386-freebsd
      /usr/local/lib/perl5/site_perl/5.005
      /usr/libdata/perl/5.00503/mach
      /usr/libdata/perl/5.00503

      Update: Fixed shebang typos (second group).

      Not sure if any of that helps at all...

      bassplayer
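
A note on the version check above: backticks around perl -v start whichever perl comes first on the PATH, not the interpreter named on the shebang line, which would explain a constant 5.8.0 alongside a changing @INC. A minimal sketch that asks the running interpreter about itself, using only the built-in variables $^X, $] and @INC. The helper name interpreter_report is made up for illustration, and the script avoids "use warnings" so that it can also run under 5.005.

```perl
#!/usr/bin/perl -w
use strict;

# Hypothetical helper: describe the interpreter actually running this script.
sub interpreter_report {
    my @lines = (
        'Interpreter path: ' . $^X,   # binary executing this script
        'Version ($]): ' . $],        # numeric version of that binary
        'Library search path (@INC):',
    );
    push @lines, "  $_" for @INC;
    return join("\n", @lines) . "\n";
}

print "Content-Type: text/plain\n\n";
print interpreter_report();
```

Swapping the shebang line and reloading the page should then change the reported version as well as the search path.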

        When you say failed I presume you mean it printed choked. Here is the same script as a command line utility. Run it on the server that choked. As root, and as nobody/apache.

        #!/usr/bin/perl
        my $host = gethostbyname($ARGV[0]);
        print $host ? "Got host $host\n" : "Choked!";

        It fails because gethostbyname() is failing. Some systems do have a crap gethostbyname(). In fact FreeBSD (which I see is what you are using) is one such OS. Do a google for 'gethostbyname freebsd perl issue' or some such. From memory, the gethostbyname function is not thread safe up until the most recent of the FreeBSD releases.

        So.....

        • FreeBSD has known issues with gethostbyname. I use Linux so don't know the details, but there may be a patch for the core function.
        • You have now effectively isolated the bug to gethostbyname() failing under some circumstances and have a test case.
        • You are using 5.8.0, which was by all reports less than optimal. I would strongly consider recompiling perl using 5.8.3. I seem to remember there is the option to not use the system's native gethostbyname() - something like 'system has broken gethostbyname()' so let perl know.
        • Getting 5.8.3, compiling, installing it into the default location (/usr/local/lib/perl5/5.8.3) will not overwrite anything and will probably be the quickest fix if it works. It will take about 15 minutes or so to do.

        You may need to install libwww-perl (if you do not include the old 5.8.0 lib and site/lib in @INC during the configure). Anyway, if you do, it is just /usr/local/bin/perl5.8.3 Makefile.PL && make && make test && make install. This forces linking to 5.8.3 and installation into the 5.8.3 tree. Once it is in, all you need to do to the test case is put #!/usr/local/bin/perl5.8.3 at the top to run it against the new perl. If it works you have a cure. I would then just remove/rename the /usr/bin/perl (probably a symlink) and link it to the new 5.8.3 binary. You can always undo the link if random stuff happens (like none of the 5.8.0 modules being found because the lib paths were not compiled into your new perl).

        cheers

        tachyon

Re: LWP::UserAgent connection problem as CGI
by merlyn (Sage) on Mar 03, 2004 at 18:21 UTC
    Using an IP address instead of the domain name works.
    Perhaps the CGI box doesn't have DNS properly configured. Can you do hostname lookups at all? For example, what happens when you type "ping www.yahoo.com" on the command line on the CGI box?

    -- Randal L. Schwartz, Perl hacker
    Be sure to read my standard disclaimer if this is a reply.

      Thanks, but no problems pinging. I receive packets back from the IP address. Seems as though DNS is configured properly, but for some reason not for perl|apache|mod_perl?

      bassplayer

        I don't think merlyn meant for you to ping the IP address, but to test the Domain Name Server configuration by doing ping <hostname> as opposed to ping <IP address>.

        Plankton: 1% Evil, 99% Hot Gas.
Re: LWP::UserAgent connection problem as CGI
by The Mad Hatter (Priest) on Mar 03, 2004 at 18:10 UTC
    How are you getting the value of $qurl?
      $qurl is a valid URL, the value of which has been verified. The same URL works from command line, but fails as a CGI. Thanks.

      bassplayer

Re: LWP::UserAgent connection problem as CGI
by inman (Curate) on Mar 04, 2004 at 09:35 UTC
    Set up LWP debugging with the following line:
    use LWP::Debug qw(level); level('+');

    You can redirect the output to either the CGI generated page or a file. The trace produced for each call helps you root out many LWP based errors. You will want to comment this line out in production.

      A very useful line of code. Didn't help that much in this case, but I really dig the output it gives. Thanks.

      bassplayer

Re: LWP::UserAgent connection problem as CGI
by bassplayer (Monsignor) on Mar 04, 2004 at 20:28 UTC
    Well, it turns out that it was DNS and apache related. Our systems administrators changed our DNS server a few weeks ago, and updated the /etc/resolv.conf file accordingly. They did not, however, stop and start apache (which was caching the DNS server value), causing gethostbyname() to fail. One of those cases where a restart does not cut it. DNS seemed like a likely source of the problem, as merlyn pointed out, but the fact that it was limited to apache was throwing me. Anyway, many thanks to all (especially tachyon) who helped me troubleshoot this (apparently OT) issue.

    bassplayer
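
The failure mode described here (a long-running server process still using the resolver settings it read at startup) can be hinted at from inside the process. A minimal sketch, assuming the resolver configuration lives in /etc/resolv.conf and using Perl's $^T (the time the interpreter started) as a stand-in for process start. How closely $^T tracks the Apache start time under mod_perl is not something this thread establishes, so treat the result as a hint only. The helper name resolver_config_changed_since is hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: true if $path was modified after $since (epoch seconds).
sub resolver_config_changed_since {
    my ($path, $since) = @_;
    my $mtime = (stat($path))[9];      # element 9 of stat is the mtime
    return 0 unless defined $mtime;    # file missing or unreadable
    return $mtime > $since ? 1 : 0;
}

# $^T is when this interpreter started; compare it with the file's mtime.
if (resolver_config_changed_since('/etc/resolv.conf', $^T)) {
    warn "resolv.conf changed after this process started; "
       . "a full stop and start of the server may be needed\n";
}
```

Logging this warning from a long-lived handler would have pointed at the stale resolver settings without needing to compare servers by hand.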

      Random addition - I had a similar problem (getting bad hostname when executing within apache but not from command line). Turns out I had multiple httpd instances running as different users (root, nobody). When I straightened that out and restarted everything cleanly (no root instances of http), all worked fine.

Node Type: perlquestion [id://333632]
Approved by kvale