Re^2: Problem extracting an HTML table with Perl

by Sosi (Sexton)
on Aug 11, 2014 at 17:05 UTC


in reply to Re: Problem extracting an HTML table with Perl
in thread Problem extracting an HTML table with Perl

Indeed, it got a bit better, but I am still getting a lot of information. I have now found that my search is completely independent of the "class" criterion in my $tree->find call. Either of the following alternatives gives the same result, which shows that the search is done on the tag only:

my $data =$tree->find( '_tag' =>'div' );

or even

my $data =$tree->find( '_tag' =>'div', class => 'somethingthatdoesnotexists1209841290r' );

Replies are listed 'Best First'.
Re^3: Problem extracting an HTML table with Perl
by kennethk (Abbot) on Aug 11, 2014 at 17:21 UTC
    That is not what I see, and I note that the OP used the look_down method rather than the find method you have in this post.
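A minimal sketch of why the two calls behave differently, assuming the documented HTML::Element behaviour that find is an alias for find_by_tag_name (so every argument is read as a tag name), while look_down reads its arguments as attribute/value criteria. The inline HTML below is made up for illustration and is not the NCBI page:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical two-div document, only to illustrate the difference.
my $html = <<'HTML';
<html><body>
<div class="other">first</div>
<div class="genome_descr">second</div>
</body></html>
HTML

my $tree = HTML::TreeBuilder->new_from_content($html);

# find(): all four strings are treated as tag names, so this matches
# every <div> (no element is named "_tag", "class" or "no_such_class").
my @by_find = $tree->find( '_tag' => 'div', class => 'no_such_class' );

# look_down(): arguments are criteria that must all hold.
my @by_look_down = $tree->look_down( '_tag' => 'div', class => 'genome_descr' );
my @no_match     = $tree->look_down( '_tag' => 'div', class => 'no_such_class' );

print scalar(@by_find),      "\n";   # 2
print scalar(@by_look_down), "\n";   # 1
print scalar(@no_match),     "\n";   # 0

$tree->delete;
```

This would explain why the class value passed to find made no difference to the result.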

    If I run

    #!/usr/local/bin/perl
    use strict;
    use warnings;
    use autodie;
    use Data::Dump;
    use HTML::Tree;
    use LWP::Simple qw(get);

    my $content = get('http://www.ncbi.nlm.nih.gov/genome/?term=Xylella_fastidiosa');
    my $tree = HTML::Tree->new();
    $tree->parse($content);
    my $data = $tree->look_down( '_tag' => 'div', class => 'genome_descr' );
    print $data->as_HTML;
    I get the output
    <div class="genome_descr"><p><b>Submitter: </b><a href="http://aeg.lbi.ic.unicamp.br/xf/" target="_blank">Sao Paulo state (Brazil) Consortium</a></div>
    If I run with
    my @data =$tree->look_down( '_tag' =>'div', class => 'genome_descr' );
    instead, I get 2 results. How does this compare for you?
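The difference between the two calls comes down to calling context: look_down returns the first match in scalar context and all matches in list context. A small sketch on a made-up fragment (the two-div HTML is an assumption, not the real page):

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical fragment with two matching divs.
my $tree = HTML::TreeBuilder->new_from_content(
    '<div class="genome_descr">one</div><div class="genome_descr">two</div>'
);

# Scalar context: only the first matching element is returned.
my $first = $tree->look_down( '_tag' => 'div', class => 'genome_descr' );

# List context: every matching element is returned.
my @all = $tree->look_down( '_tag' => 'div', class => 'genome_descr' );

print $first->as_text, "\n";   # one
print scalar(@all),    "\n";   # 2

$tree->delete;
```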

    #11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

      I had missed that about find. I have now realized that I chose the wrong tag, since what I want is the list that appears after this tag, but I'll get to that later.

      Indeed, I now see that I cannot dd $data, as it dumps everything; print $data->as_HTML solves the problem. One more question: in your second example, how did you print the data?

      Thanks so much!
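A likely reason a raw dump of $data shows "everything": each HTML::Element node keeps a reference to its parent, so a recursive dumper walks back up and through the whole tree. The sketch below, on a made-up fragment, only illustrates that link and the node-level accessors that avoid it:

```perl
use strict;
use warnings;
use HTML::TreeBuilder;

# Hypothetical fragment; the real page is much larger.
my $tree = HTML::TreeBuilder->new_from_content(
    '<p>unrelated</p><div class="genome_descr">wanted</div>'
);

my $div = $tree->look_down( '_tag' => 'div', class => 'genome_descr' );

# The node links back to its parent, which in turn holds the siblings,
# so dumping the node recursively reaches the rest of the document.
print $div->parent->tag, "\n";   # body

# as_HTML and as_text serialize only this node and its children.
print $div->as_HTML, "\n";
print $div->as_text, "\n";       # wanted

$tree->delete;
```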

        For the list context call, I actually only output
        print 0+ @data;
        to tell me the length of the array. If I wanted to print both terms, I'd either use a map and a join
        print join "\n", map $_->as_HTML, @data;
        or just use a foreach loop:
        for my $datum (@data) { print $datum->as_HTML, "\n"; }
