Re: web page source?

by Falkkin (Chaplain)
on Feb 22, 2001 at 06:48 UTC

in reply to web page source?

To get the source, I'd install LWP::Simple from CPAN. Fetching the page is then a simple two-liner:

    use LWP::Simple;
    my $source = get("");

You only need the "use" line once in your program; call the get() function every time you need the source of a page.
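One thing worth guarding against: get() returns undef when the request fails, so it pays to check before using the result. Here's a minimal sketch of that check. The name fetch_source is a hypothetical helper of mine, not part of LWP::Simple, and the URL is assumed to come from the command line:

```perl
#!/usr/bin/perl -w
use strict;
use LWP::Simple;

# Hypothetical helper (not part of LWP::Simple): fetch a page or die trying.
sub fetch_source {
    my ($url) = @_;
    my $source = get($url);    # get() returns undef on failure
    die "Could not fetch $url\n" unless defined $source;
    return $source;
}

# Assumption: the URL is passed as the first command-line argument.
if (@ARGV) {
    print fetch_source($ARGV[0]);
}
```

Dying is just one choice; in a longer script you might prefer to warn and skip to the next URL.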

Writing an HTML parser by hand is far from trivial. I'd look at HTML::Parser (also on CPAN) and see whether it makes your life easier. I haven't really used HTML::Parser before, but from reading the documentation and playing around for the last 15 minutes, it appears you'd want something like the following:

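    #!/usr/bin/perl -w
    use strict;
    use LWP::Simple;
    use HTML::Parser;

    my $source = get("");
    die "Couldn't fetch the page\n" unless defined $source;

    my $parser = HTML::Parser->new(api_version => 3);
    # 'tagname' is the lower-cased tag name; 'attr' is a hash ref of attributes
    $parser->handler(start => \&function, 'tagname, attr');
    $parser->parse($source);
    $parser->eof;    # flush anything still buffered

    # Called for every start tag; prints the href of each link
    sub function {
        my ($tag_name, $attr_ref) = @_;
        if ($tag_name eq 'a' && defined $attr_ref->{href}) {
            print $attr_ref->{href}, "\n";
        }
    }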
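If you'd rather experiment without touching the network, the same idea can be tried on an HTML string held in a variable. This is only a sketch: extract_links is a hypothetical name of mine, the sample markup is made up, and I'm collecting the links into an array instead of printing them from the handler.

```perl
#!/usr/bin/perl -w
use strict;
use HTML::Parser;

# Hypothetical helper: return the href of every <a> tag in an HTML string.
sub extract_links {
    my ($html) = @_;
    my @links;
    my $parser = HTML::Parser->new(
        api_version => 3,
        start_h     => [
            sub {
                my ($tag_name, $attr_ref) = @_;
                # 'tagname' arrives lower-cased, so <A> and <a> both match;
                # anchors without an href (e.g. <a name="...">) are skipped
                push @links, $attr_ref->{href}
                    if $tag_name eq 'a' && defined $attr_ref->{href};
            },
            'tagname, attr',
        ],
    );
    $parser->parse($html);
    $parser->eof;    # signal end of document
    return @links;
}

# Made-up sample markup, for illustration only.
my $html = '<p><a href="/one">One</a> <A HREF="/two">Two</A> <a name="x">x</a></p>';
print "$_\n" for extract_links($html);
```

Returning a list keeps the parsing separate from whatever you want to do with the links afterwards.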
