PerlMonks  

Re: Formating a HTML document to show certain text.

by Anonymous Monk
on Mar 26, 2011 at 22:57 UTC (#895706)


in reply to Formating a HTML document to show certain text.

  1. $ lwp-download http://www.imreportcard.com/products/the-elevation-group
    Saving to 'the-elevation-group.htm'...
    35.2 KB received
     

  2. $ perl htmltreexpather.pl the-elevation-group.htm 2>NUL |grep -A3 "^Product Description$"
    Product Description
    /html/body/div/div[3]/div/div/div[6]
    //div[@id='leftColTop']/div[6]
    //div[@id='leftColTop']/div[@class='heading']
     

  3. use HTML::TreeBuilder::XPath;
    my $tree = HTML::TreeBuilder::XPath->new;
    $tree->parse_file( "the-elevation-group.htm" );
    for my $n ( $tree->findnodes( q#//div[@id='leftColTop']/div[@class='heading']# ) ) {
        print $n->getValue, "\n";
    }
    __END__
    Product Description
    Detailed Overview
    Reputation
    Domain "Whois"
  4. Repeat steps 2 and 3 for each other piece of text you want to show.
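Step 2 leans on htmltreexpather.pl, whose source isn't shown in this thread. As a rough idea of what such a helper could do, here is a minimal sketch (not the actual script) that prints each element's own text together with a positional XPath for it. The sample markup and the xpath_for helper are made up for illustration, and only the positional form of the path is produced, not the id- or class-based variants seen in the output above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper (not from the thread): build a positional XPath
# such as /html/body/div[2] by walking from an element up to the root.
sub xpath_for {
    my ($elem) = @_;
    my @steps;
    for ( my $e = $elem; $e; $e = $e->parent ) {
        my $tag  = $e->tag;
        my $step = $tag;
        if ( my $parent = $e->parent ) {
            # Element siblings with the same tag name, in document order.
            my @same = grep { ref($_) && $_->tag eq $tag } $parent->content_list;
            if ( @same > 1 ) {
                my ($idx) = grep { $same[$_] == $e } 0 .. $#same;
                $step .= '[' . ( $idx + 1 ) . ']';
            }
        }
        unshift @steps, $step;
    }
    return '/' . join( '/', @steps );
}

# Made-up sample markup; in the thread the input was the downloaded page.
my $html = <<'HTML';
<html><body>
<div id="leftColTop">
  <div class="heading">Product Description</div>
  <div class="heading">Detailed Overview</div>
</div>
</body></html>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# Print every element's direct text, followed by the path that reaches it.
for my $elem ( $tree->descendants ) {
    my $text = join ' ', grep { !ref($_) && /\S/ } $elem->content_list;
    next unless length $text;
    print "$text\n", xpath_for($elem), "\n\n";
}
```

Piping that output through grep -A1 "^Product Description$" gives you a path to paste into findnodes, much like step 2 does with the real script.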


Re^2: Formating a HTML document to show certain text.
by Anonymous Monk on Mar 28, 2011 at 06:52 UTC
    #!/usr/bin/perl --
    use strict;
    use warnings;
    use HTML::TreeBuilder::XPath;

    Main( @ARGV );
    exit( 0 );

    sub Main {
        my $tree = HTML::TreeBuilder::XPath->new;
        $tree->parse_file( "the-elevation-group.htm" );
        my $XpathXpr = join '|',
            q#//div[@id='leftColTop']/div[@class='heading']#,
            q#//div[@id='leftColTop']/div[@class='heading']/following-sibling::node()[1]#,
            ;
        for my $node ( $tree->findnodes_as_strings( $XpathXpr ) ) {
            print "$node\n\n";
        }
    }
    __END__
    Read http://w3schools.com/xpath/default.asp for a gentle introduction to XPath.
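
To see what the following-sibling::node()[1] step picks up, here is a small self-contained sketch that runs the two expressions separately against made-up inline HTML instead of the downloaded page. The markup and its text are assumptions for illustration only; the real page's structure may differ, so the node right after each heading there could be something other than a paragraph.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;

# Made-up markup standing in for the real page.
my $html = <<'HTML';
<html><body>
<div id="leftColTop">
  <div class="heading">Product Description</div>
  <p>Hypothetical description text.</p>
  <div class="heading">Detailed Overview</div>
  <p>Hypothetical overview text.</p>
</div>
</body></html>
HTML

my $tree = HTML::TreeBuilder::XPath->new_from_content($html);

# Same heading expression as in the thread.
my $heading = q#//div[@id='leftColTop']/div[@class='heading']#;

# The headings themselves.
my @headings = $tree->findnodes_as_strings($heading);

# For each heading, the first node that follows it under the same parent.
my @bodies = $tree->findnodes_as_strings(
    "$heading/following-sibling::node()[1]" );

print "heading: $_\n" for @headings;
print "follows: $_\n" for @bodies;
```

Joining the two expressions with | as in the script above asks for both sets in a single findnodes_as_strings call.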
