PerlMonks  

HTML::TreeBuilder, HTML::Element's look_down() method

by 7stud (Deacon)
on Oct 06, 2010 at 20:53 UTC (#863866=perlquestion)
7stud has asked for the wisdom of the Perl Monks concerning the following question:

Dear Monks,

When I parse a string of HTML that I made up, looking for span tags that have the attribute class="value", this code works:

use strict;
use warnings;
use 5.010;

use LWP::Simple;
use HTML::TreeBuilder;

my $html = <<'END_HTML';
<html>
<head><title></title></head>
<body>
<div><span class="value">Hi</span></div>
<p>Thanks</p>
<div><span class="value">Hello</span></div>
</body>
</html>
END_HTML

my $tree = HTML::TreeBuilder->new_from_content($html);
my @spans = $tree->look_down(class => 'value');

for my $span (@spans) {
    say $span->as_trimmed_text();
}

$tree->delete();

--output:--
Hi
Hello

But when I unleash my code in the wild, it doesn't find the span tags:

use strict;
use warnings;
use 5.010;

use LWP::Simple;
use HTML::TreeBuilder;

my $url = 'http://www.almanac.com/weather/history/zipcode/21218/2008-09-02';
my $html = get($url);

my $tree = HTML::TreeBuilder->new();
$tree->parse_file($html);

my $span_tag = $tree->look_down(
    class => 'value',
);

say $span_tag->as_trimmed_text();

$tree->delete();

--output:--
Can't call method "as_trimmed_text" on an undefined value at 1perl.pl line 41.

Can anyone see what the problem is?

Thanks

Re: HTML::TreeBuilder, HTML::Element's look_down() method
by afresh1 (Hermit) on Oct 06, 2010 at 21:11 UTC

    Pretty simple: $html holds the page content, not a filename, so $tree->parse_file won't work. Use $tree->parse instead, followed by $tree->eof to tell the parser the input is complete.

    With some additional error checking it would even have given you a clue: $tree->parse_file($html) or die $!;

    l8rZ,
    --
    andrew
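
The difference described above can be sketched as follows. This is a minimal example under stated assumptions: the inline markup is made up and stands in for whatever get() returned, so no network access is needed. It relies on parse_file() treating a plain string as a filename and returning undef with $! set when that file cannot be opened.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Made-up markup standing in for the string returned by LWP::Simple's get().
my $html = '<html><body><div><span class="value">Hi</span></div></body></html>';

# parse_file() expects a filename (or filehandle). Passing the markup itself
# means it tries to open a file with that name, which should fail.
my $wrong = HTML::TreeBuilder->new();
my $ok    = $wrong->parse_file($html);
warn "parse_file failed: $!\n" unless defined $ok;
$wrong->delete();

# parse() takes the markup directly; eof() signals that no more input follows.
my $tree = HTML::TreeBuilder->new();
$tree->parse($html);
$tree->eof();

# In list context look_down() returns every matching element.
my @texts = map { $_->as_trimmed_text() } $tree->look_down(class => 'value');
print "$_\n" for @texts;

$tree->delete();
```

With the string passed to parse() rather than parse_file(), look_down() has an actual tree to search, which is why the span is found.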

      Ah, thanks! And I could just use the same method as in the first example. The examples I looked at didn't use die(). I should have thought of that myself.

      Thanks again!

        If you had done the same as the first example, it should have worked as well. However, I would still recommend checking that it was successful.

        my $tree = HTML::TreeBuilder->new_from_content($html) or die $!;
        l8rZ,
        --
        andrew
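
As a complement to the check above, here is a minimal sketch that tests the fetched content before building the tree, and tests the result of look_down() before calling a method on it. It assumes the likelier failure is the download, since LWP::Simple's get() returns undef when the request fails. The helper name tree_from_content and the example.invalid URL are hypothetical, introduced only for illustration.

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical helper: build a tree from fetched content, or die with a
# readable message if nothing was fetched.
sub tree_from_content {
    my ($html, $source) = @_;
    die "No content received from $source\n"
        unless defined $html && length $html;
    return HTML::TreeBuilder->new_from_content($html);
}

# Made-up markup standing in for a successful download.
my $tree = tree_from_content('<p><span class="value">Hello</span></p>', 'inline sample');

# In scalar context look_down() returns the first match, or undef if none.
my $span = $tree->look_down(class => 'value');
my $text = $span ? $span->as_trimmed_text() : undef;
print defined $text ? "$text\n" : "no matching span\n";
$tree->delete();

# Simulate a failed download: get() would have returned undef.
my $error = eval { tree_from_content(undef, 'http://example.invalid/'); 1 } ? '' : $@;
print "caught: $error" if $error;
```

Guarding the look_down() result avoids the "Can't call method ... on an undefined value" error from the original question when no element matches.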
