http://www.perlmonks.org?node_id=818559

sri1230 has asked for the wisdom of the Perl Monks concerning the following question:

Can someone tell me why the get in this code fails to retrieve anything?

    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::Simple;
    use HTML::LinkExtor;
    use Data::Dumper;

    my $content = get("http://www.yahoo.com/");   # Get web page in content
    die "get failed" if (!defined $content);
    my $parser = HTML::LinkExtor->new();          # create LinkExtor object with no callbacks
    $parser->parse($content);                     # parse content
    my @links = $parser->links;                   # get list of links
    print Dumper \@links;                         # print list of links out

Replies are listed 'Best First'.
Re: LWP question
by VinsWorldcom (Prior) on Jan 20, 2010 at 21:42 UTC

    As written below, it works for me:

    > cat test.pl
    #!/usr/bin/perl
    use strict;
    use warnings;
    use LWP::Simple;
    use HTML::LinkExtor;
    use Data::Dumper;

    my $content = get("http://www.yahoo.com/");   # Get web page in content
    die "get failed" if (!defined $content);
    my $parser = HTML::LinkExtor->new();          # create LinkExtor object with no callbacks
    $parser->parse($content);                     # parse content
    my @links = $parser->links;                   # get list of links
    print Dumper \@links;                         # print list of links out

    > perl -c test.pl
    test.pl syntax OK

    > perl test.pl
    $VAR1 = [
              [
                'link',
                'href',
                'http://l.yimg.com/a/lib/arc/core_1.0.5.css'
              ],
              [
                'link',
    ...OUTPUT TRUNCATED...
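
    One thing worth noting: when LWP::Simple's get() does fail, it gives no reason, since it simply returns undef. A minimal sketch (not from the thread) of the same fetch through LWP::UserAgent, the module LWP::Simple wraps, so that a failure reports its HTTP status line:

        #!/usr/bin/perl
        use strict;
        use warnings;
        use LWP::UserAgent;

        my $ua = LWP::UserAgent->new( timeout => 10 );

        my $response = $ua->get("http://www.yahoo.com/");
        if ( $response->is_success ) {
            print "Fetched ", length( $response->decoded_content ), " bytes\n";
        }
        else {
            # status_line gives something like "403 Forbidden" or
            # "500 Can't connect to www.yahoo.com:80", which is far more
            # useful for diagnosis than a bare undef
            die "get failed: ", $response->status_line, "\n";
        }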
Re: LWP question
by eric256 (Parson) on Jan 20, 2010 at 22:47 UTC

    I know in the past I've had issues with several of the search engines: they will block requests by IP if they see repeated requests from UserAgents that look like bots/spiders/etc. (Oh, and please read that whole section below your post about formatting.)
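
    If that is the problem, one workaround is to send a different User-Agent header. A minimal sketch with LWP::UserAgent, assuming the agent string below, which is an arbitrary example and not anything the thread prescribes:

        use strict;
        use warnings;
        use LWP::UserAgent;

        my $ua = LWP::UserAgent->new;
        # Replace the default "libwww-perl/x.xx" agent string, which some
        # sites filter on, with a custom one (example value only):
        $ua->agent("Mozilla/5.0 (compatible; LinkLister/0.1)");

        my $response = $ua->get("http://www.yahoo.com/");
        die "get failed: ", $response->status_line, "\n"
            unless $response->is_success;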


    ___________
    Eric Hodges
      Thanks a bunch, guys. It was a proxy issue: I was using my work machine, which was blocking it.
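
      For anyone hitting the same thing: a minimal sketch of routing the request through a proxy with LWP::UserAgent, assuming the proxy settings either live in the environment or are filled in by hand (the host and port below are placeholders):

          use strict;
          use warnings;
          use LWP::UserAgent;

          my $ua = LWP::UserAgent->new;

          # Pick up http_proxy/https_proxy/no_proxy from the environment...
          $ua->env_proxy;
          # ...or name the proxy explicitly (placeholder host:port):
          # $ua->proxy( 'http', 'http://proxy.example.com:8080/' );

          my $response = $ua->get("http://www.yahoo.com/");
          die "get failed: ", $response->status_line, "\n"
              unless $response->is_success;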
Re: LWP question
by fod (Friar) on Jan 20, 2010 at 21:45 UTC
    It works for me - I'd say you've got a connection/firewall/proxy problem

    output (long):