http://www.perlmonks.org?node_id=652442


in reply to Trivial HTML extractor utility

HTML::SimpleLinkExtor comes with linktractor which does the same thing as linkx. :)

If you just want TITLE, here's the one that I use:

#!/usr/bin/perl
require HTML::HeadParser;

local( $/ );

foreach ( @ARGV ) {
    open my( $fh ), "<", $_ or do { warn "$!"; next };
    my $p = HTML::HeadParser->new;
    $p->parse( <$fh> );
    print "$_: ", $p->header( 'title' ), "\n";
}
--
brian d foy <brian@stonehenge.com>
Subscribe to The Perl Review

Re^2: Trivial HTML extractor utility
by Dominus (Parson) on Nov 23, 2007 at 03:06 UTC
    HTML::SimpleLinkExtor comes with linktractor which does the same thing as linkx.
    I had an idea a while back that every Perl module should come with at least one useful demo program. For example, Net::FTP should come with a command-line FTP client program. Text::Template would come with a program that fills a template with values specified on the command line, and prints the results. But the idea never got out of the wishful-thinking stage.

    It's nice to know that someone wrote a replacement for HTML::LinkExtor that has an interface that (I presume) doesn't suck quite so hard. The amount of code I had to write for linkx was appalling.

    Thanks for the pointers.