Re: Trivial HTML extractor utility

by brian_d_foy (Abbot)
on Nov 22, 2007 at 19:57 UTC (#652442)

in reply to Trivial HTML extractor utility

HTML::SimpleLinkExtor comes with linktractor which does the same thing as linkx. :)
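
For anyone who would rather call the module directly than run linktractor, here is a minimal sketch. It assumes the module's documented new, parse, and links methods; the sample HTML and its URLs are made up for illustration, and this is not meant to reproduce what linktractor or linkx actually do internally.

```perl
#!/usr/bin/perl
use strict;
use warnings;

use HTML::SimpleLinkExtor;

# Hypothetical sample markup, for illustration only.
my $html = <<'HTML';
<p>
<a href="http://www.example.com/">Example</a>
<img src="/images/logo.png">
</p>
HTML

my $extor = HTML::SimpleLinkExtor->new;
$extor->parse( $html );

# links() returns every link-bearing attribute value the parser saw
my @links = $extor->links;

print "$_\n" for @links;
```

To read from a file instead of a string, the module also documents a parse_file method that takes a filename.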

If you just want TITLE, here's the one that I use:

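#!/usr/bin/perl

require HTML::HeadParser;

local( $/ );    # slurp mode: read each file in one go

foreach ( @ARGV ) {
    # report which file failed to open, then move on to the next one
    open my( $fh ), "<", $_ or do { warn "$_: $!\n"; next };

    my $p = HTML::HeadParser->new;
    $p->parse( <$fh> );

    # print "filename: title" for each file
    print "$_: ", $p->header( 'title' ), "\n";
}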
brian d foy
Subscribe to The Perl Review

Re^2: Trivial HTML extractor utility
by Dominus (Parson) on Nov 23, 2007 at 03:06 UTC
    HTML::SimpleLinkExtor comes with linktractor which does the same thing as linkx.
    I had an idea a while back that every Perl module should come with at least one useful demo program. For example, Net::FTP should come with a command-line FTP client program. Text::Template would come with a program that fills a template with values specified on the command line, and prints the results. But the idea never got out of the wishful thinking stage.

    It's nice to know that someone wrote a replacement for HTML::LinkExtor that has an interface that (I presume) doesn't suck quite so hard. The amount of code I had to write for linkx was appalling.

    Thanks for the pointers.
