PerlMonks  

Get Web Page

by kanish (Sexton)
on Oct 25, 2005 at 04:20 UTC (#502624=perlquestion)
kanish has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

I am new to Perl and also new to this forum.

I want to extract a web page. For example, I need to extract the perlmonks.com page.

Thanks a lot!

Kanishk

Replies are listed 'Best First'.
Re: Get Web Page
by Tanktalus (Canon) on Oct 25, 2005 at 04:25 UTC

    So ... what have you tried?

    (Hint: LWP::Simple.)

    Update: Please also get the gods' permission to hammer PM - if it's a few odd requests, it'll probably be OK, but don't send dozens of requests in a short timeframe without their permission. You'll be using their bandwidth and CPU time to the detriment of, well, actual users. This is the same reason why Google has an API for doing searches rather than allowing people to fetch its web pages programmatically.
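To make the hint and the warning above concrete, here is a minimal sketch using LWP::Simple with a pause between requests. The helper name `fetch_politely`, its optional `$fetch` argument, and the 5-second default delay are assumptions made for illustration; none of them come from LWP or from this thread.

```perl
use strict;
use warnings;
use LWP::Simple ();

# fetch_politely is a hypothetical helper, not part of LWP.
# It fetches each URL in turn, sleeping $delay seconds between requests,
# and returns a hash ref mapping URL => content (undef if a fetch failed).
sub fetch_politely {
    my ($urls, $delay, $fetch) = @_;
    $delay = 5 unless defined $delay;    # assumed default pause, in seconds
    $fetch ||= \&LWP::Simple::get;       # get() returns undef on failure

    my %page;
    for my $i (0 .. $#$urls) {
        # Pause before every request except the first
        sleep $delay if $i > 0 && $delay > 0;
        $page{ $urls->[$i] } = $fetch->( $urls->[$i] );
    }
    return \%page;
}
```

Calling `fetch_politely(['http://www.perlmonks.org/'])` would fetch the one page; with several URLs the requests are spaced out instead of sent back to back. Whatever delay is used, it should be one the site owners are happy with.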

Re: Get Web Page
by pg (Canon) on Oct 25, 2005 at 04:27 UTC

    You said that you were new to Perl, but I am not sure whether you are also new to programming. If so, one piece of advice: always check the return code from function calls, as I did below:

    use strict;
    use warnings;
    use LWP::UserAgent;

    my $ua  = LWP::UserAgent->new();
    my $res = $ua->get('http://www.perlmonks.org/');

    # Check that the request succeeded before using the content
    if ($res->is_success()) {
        print $res->content();
    }
    else {
        print "Failed, " . $res->status_line() . "\n";
    }
Re: Get Web Page
by GrandFather (Sage) on Oct 25, 2005 at 04:23 UTC

    Take a look at LWP


    Perl is Huffman encoded by design.
Re: Get Web Page
by monkfan (Curate) on Oct 25, 2005 at 04:25 UTC
    Can you be specific about what you want to extract?
    Perhaps you should have a look at these modules: LWP or WWW::Mechanize.

    Regards,
    Edward
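As one illustration of "what to extract", the sketch below pulls the link targets out of HTML that has already been fetched. It assumes the goal is a list of `<a href>` URLs, which the original question does not say, and it uses HTML::LinkExtor (a module not mentioned in this thread) because it works on a plain string. The helper name `extract_hrefs` is hypothetical.

```perl
use strict;
use warnings;
use HTML::LinkExtor;

# extract_hrefs is a hypothetical helper: given an HTML string and the URL
# it was fetched from, return the absolute URLs of all <a href="..."> links.
sub extract_hrefs {
    my ($html, $base) = @_;

    # With no callback, the parser collects links for later retrieval;
    # passing $base makes relative links absolute.
    my $parser = HTML::LinkExtor->new(undef, $base);
    $parser->parse($html);
    $parser->eof;

    my @hrefs;
    for my $link ($parser->links) {
        my ($tag, %attr) = @$link;    # e.g. ('a', href => URI object)
        push @hrefs, "$attr{href}" if $tag eq 'a' && defined $attr{href};
    }
    return @hrefs;
}
```

The HTML string could come from any of the fetching approaches in this thread, for example the `content()` of a successful LWP::UserAgent response.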
Re: Get Web Page
by gopalr (Priest) on Oct 25, 2005 at 04:27 UTC

    Hi Kanishk

    WWW::Mechanize will do this work.

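    use strict;
    use warnings;
    use WWW::Mechanize;

    my $mech = WWW::Mechanize->new();
    $mech->get('http://www.perlmonks.com');

    # Report the HTTP status on failure; a method call is not interpolated
    # inside a double-quoted string, so concatenate it instead
    $mech->success() or die "Failed: " . $mech->status() . "\n";
    print $mech->content();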

    Also take a look at this link: Extract Web Page

    Thanks,
    Gopal.R
