PerlMonks  

Re: Trivial HTML extractor utility

by brian_d_foy (Abbot)
on Nov 22, 2007 at 19:57 UTC (#652442)


in reply to Trivial HTML extractor utility

HTML::SimpleLinkExtor comes with linktractor which does the same thing as linkx. :)
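
For readers who have not used the module, here is a minimal sketch of pulling links out of a page with HTML::SimpleLinkExtor. It is not the source of linktractor or linkx; the helper name `extract_links` and the sample HTML are made up for illustration, and the module has to be installed from CPAN.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::SimpleLinkExtor;

# Hypothetical helper (not part of the module): return every link
# found in a string of HTML.
sub extract_links {
    my ($html) = @_;
    my $extor = HTML::SimpleLinkExtor->new;
    $extor->parse($html);
    return $extor->links;
}

# Made-up sample input, just for the demonstration.
my $sample = <<'HTML';
<html><body>
<a href="http://www.example.com/">Example</a>
<img src="/images/logo.png">
</body></html>
HTML

print "$_\n" for extract_links($sample);
```

The object also has per-tag accessors such as `$extor->a` and `$extor->img` if only one kind of link is wanted.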

If you just want TITLE, here's the one that I use:

#!/usr/bin/perl
require HTML::HeadParser;

local( $/ );

foreach ( @ARGV ) {
    open my( $fh ), "<", $_ or do { warn "$!"; next };
    my $p = HTML::HeadParser->new;
    $p->parse( <$fh> );
    print "$_: ", $p->header( 'title' ), "\n";
}
--
brian d foy <brian@stonehenge.com>
Subscribe to The Perl Review


Re^2: Trivial HTML extractor utility
by Dominus (Parson) on Nov 23, 2007 at 03:06 UTC
    "HTML::SimpleLinkExtor comes with linktractor which does the same thing as linkx."

    I had an idea a while back that every Perl module should come with at least one useful demo program. For example, Net::FTP should come with a command-line FTP client program. Text::Template would come with a program that fills a template with values specified on the command line, and prints the results. But the idea never got out of the wishful thinking stage.
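
To make the Text::Template idea concrete, here is a rough sketch of what such a demo program might look like. No such program ships with the module as far as this thread says; the helper names `parse_pairs` and `fill_template` and the `name=value` argument convention are assumptions made for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use Text::Template;

# Hypothetical helper: turn "name=value" strings (as they might
# arrive in @ARGV) into a hash reference.
sub parse_pairs {
    my %vars;
    for my $pair (@_) {
        my ($name, $value) = split /=/, $pair, 2;
        $vars{$name} = defined $value ? $value : '';
    }
    return \%vars;
}

# Hypothetical helper: fill a template string with the given values.
sub fill_template {
    my ($text, $vars) = @_;
    my $template = Text::Template->new( TYPE => 'STRING', SOURCE => $text )
        or die "Could not build template: $Text::Template::ERROR";
    my $result = $template->fill_in( HASH => $vars );
    die "Could not fill template: $Text::Template::ERROR"
        unless defined $result;
    return $result;
}

# Stand-in for command-line arguments.
my $vars = parse_pairs( 'name=World', 'site=PerlMonks' );
print fill_template( 'Hello, {$name}, welcome to {$site}!', $vars ), "\n";
```

A real command-line version would read the template from a file and take the pairs from `@ARGV` instead of the hard-coded list.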

    It's nice to know that someone wrote a replacement for HTML::LinkExtor that has an interface that (I presume) doesn't suck quite so hard. The amount of code I had to write for linkx was appalling.
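
For comparison, this is roughly what the callback style of HTML::LinkExtor looks like. It is only a sketch, not the linkx code, and the helper name `extract_links_with_callback` is invented for the example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::LinkExtor;

# Hypothetical helper: collect link URLs through an HTML::LinkExtor
# callback. The callback receives the tag name followed by
# attribute => URL pairs for each link-bearing tag.
sub extract_links_with_callback {
    my ($html) = @_;
    my @urls;
    my $callback = sub {
        my ( $tag, %attr ) = @_;
        push @urls, map { $attr{$_} } sort keys %attr;
    };
    my $parser = HTML::LinkExtor->new($callback);
    $parser->parse($html);
    $parser->eof;    # tell the parser that the input is complete
    return @urls;
}

# Made-up sample input.
print "$_\n"
    for extract_links_with_callback(
        '<a href="http://www.example.com/">Example</a> <img src="/logo.png">'
    );
```

The caller has to supply the callback and unpack the tag and attribute pairs, which is the bookkeeping that the simpler interface hides.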

    Thanks for the pointers.
