how to extract links from a part of web page ?

by saidinesh (Initiate)
on Jun 18, 2014 at 14:34 UTC ( #1090314=perlquestion )
saidinesh has asked for the wisdom of the Perl Monks concerning the following question:

#!/usr/bin/perl
use strict;
use warnings;
use LWP::UserAgent;    # was "use LWP::simple", which does not load on case-sensitive systems
use URI;

my $doc_url = "";
my $document;
my $browser;
init_browser();

# Get the page whose links we want to check:
my $response = $browser->get($doc_url);
#die "Couldn't get $doc_url: ", $response->status_line
#    unless $response->is_success;
$document = $response->content;
# $doc_url = $response->base;
# In case we need to resolve relative URLs later

# Match every href="..." attribute anywhere in the document
while ($document =~ m/href\s*=\s*"([^"\s]+)"/gi) {
    my $absolute_url = absolutize($1, $doc_url);
    check_url($absolute_url);
}

# Resolve a possibly relative URL against the base URL
sub absolutize {
    my ($url, $base) = @_;
    return URI->new_abs($url, $base)->canonical;
}

sub init_browser {
    $browser = LWP::UserAgent->new;
    # ...And any other initialization we might need to do...
    return $browser;
}

sub check_url {
    # A temporary placeholder...
    print "url's list $_[0]\n";
}

When I run this code, it shows every href in the HTML source of the page, but I only need the links from a middle part of the page (say, a middle module). How should I modify the code to do that?

Re: how to extract links from a part of web page ?
by marto (Bishop) on Jun 18, 2014 at 14:47 UTC

    has no "middle module", but you know this isn't right because the page you actually want to scrape is over at (did you check their terms of use?). As previously discussed, please read and understand PerlMonks for the Absolute Beginner (and don't ignore the formatting advice displayed when posting: How do I post a question effectively?).

    You were given lots of links previously for tools that make this job easy. The example above isn't based on that advice, but rather on old content from Perl & LWP.

      For the uninitiated, what's this previous posting you're referring to?

        I make no reference to a previous post, but rather to two long conversations in the chatterbox (see also Chatterbox FAQ) earlier today, where several people discussed the various issues with the OP.

        Update: fixed link
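
For the question as asked, one common way to limit link extraction to part of a page is to parse the HTML and search only inside the element that wraps that part, instead of running a regex over the whole document. The thread does not record which tools were suggested in the chatterbox, so the following is only a minimal sketch under stated assumptions: it uses HTML::TreeBuilder (from the HTML-Tree distribution on CPAN), the helper name links_in_section is made up for illustration, and the sample page, the id "middle", and the example.com base URL are all hypothetical stand-ins for the real page.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;
use URI;

# Hypothetical helper: return absolute URLs for every <a href> found
# inside the element whose id attribute equals $section_id.
sub links_in_section {
    my ($html, $section_id, $base_url) = @_;

    my $tree = HTML::TreeBuilder->new_from_content($html);

    # Scalar context: look_down returns the first matching element, or undef.
    my $section = $tree->look_down(id => $section_id);

    my @urls;
    if ($section) {
        # List context: all <a> elements below the section, in document order.
        for my $anchor ($section->look_down(_tag => 'a')) {
            my $href = $anchor->attr('href');
            next unless defined $href;
            push @urls, URI->new_abs($href, $base_url)->canonical->as_string;
        }
    }

    $tree->delete;    # free the parse tree
    return @urls;
}

# Assumed sample page; on a real site, inspect the source to find the
# id (or class) of the container you actually need.
my $html = <<'HTML';
<html><body>
<div id="top"><a href="/home">Home</a></div>
<div id="middle"><a href="/a.html">A</a> <a href="b.html">B</a></div>
<div id="bottom"><a href="/contact">Contact</a></div>
</body></html>
HTML

print "$_\n"
    for links_in_section($html, 'middle', 'http://example.com/dir/page.html');
```

With the sample page above, only the two links inside the "middle" div are printed, resolved against the base URL. In the original script, the same helper could be called with $response->decoded_content and $response->base in place of the sample values, assuming the target page really does wrap the wanted section in an identifiable element.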

Node Type: perlquestion [id://1090314]
Approved by Corion