PerlMonks  

Re: How to fetch table element from a site into data

by moritz (Cardinal)
on Aug 11, 2012 at 19:26 UTC


in reply to How to fetch table element from a site into data

I found HTML::TableParser & HTML::TableExtract but they are not working for my case

Why not? What's the problem? Please show the code you've written to try them.

Also you need to show the HTML table you are trying to extract data from.


Re^2: How to fetch table element from a site into data
by Anonymous Monk on Aug 12, 2012 at 04:22 UTC
    My code is something like this:
    use WWW::Mechanize;
    use HTTP::Cookies;
    use HTML::TableParser;
    use HTML::TableExtract;

    my $mech = WWW::Mechanize->new();
    $mech->get('http://www.w3schools.com/sql/default.asp');
    my $a = $mech->content();

    $te = HTML::TableExtract->new( headers => [('Company', 'Country')] );
    $te->parse($html_string);

    # Examine all matching tables
    foreach $ts ($te->tables) {
        print "Table (", join(',', $ts->coords), "):\n";
        foreach $row ($ts->rows) {
            print join(',', @$row), "\n";
        }
    }

    # Shorthand...top level rows() method assumes the first table found in
    # the document if no arguments are supplied.
    foreach $row ($te->rows) {
        print join(',', @$row), "\n";
    }

      You almost got it! You captured the HTML content into $a, but then called $te->parse($html_string); so the parser was handed a variable that was never assigned.

      Try the following (based on the HTML::TableExtract scripting example):

      use Modern::Perl;
      use WWW::Mechanize;
      use HTML::TableExtract;

      my $mech = WWW::Mechanize->new();
      $mech->get('http://www.w3schools.com/sql/default.asp');
      my $html_string = $mech->content();

      my $te = HTML::TableExtract->new( headers => [ ( 'Company', 'Country' ) ] );
      $te->parse($html_string);

      foreach my $ts ( $te->tables ) {
          print "Table (", join( ',', $ts->coords ), "):\n";
          foreach my $row ( $ts->rows ) {
              print join( ',', @$row ), "\n";
          }
      }

      Output

      Table (0,0):
      Island Trading,UK
      Galería del gastrónomo,Spain
      Laughing Bacchus Wine Cellars,Canada
      Paris spécialités,France
      Simons bistro,Denmark
      Wolski Zajazd,Poland

      Hope this helps!

        Yes, it works! :) Can you tell me how I can put these into a hash, with each header as the key and that column's contents as the value? The output should look like this:
        $hash = {
            company => [ 'Island Trading', 'Galería del gastrónomo', 'Laughing Bacchus Wine Cellars', 'Paris spécialités', 'Simons bistro', 'Wolski Zajazd' ],
            country => [ 'UK', 'Spain', 'Canada', 'France', 'Denmark', 'Poland' ],
        }
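
One possible way to get that structure is to store each column as an array reference under its lowercased header name. This is only a minimal sketch: rows_to_columns is a hypothetical helper name, and @rows is a hard-coded stand-in for what $ts->rows returned in the script above, so the example runs without network access. It assumes each row arrives with its columns in the same order as the headers list, which matches the output shown earlier in this thread.

```perl
use strict;
use warnings;

# Header names, in the same order as the "headers" list given to
# HTML::TableExtract->new in the earlier script.
my @headers = ( 'Company', 'Country' );

# Stand-in data (assumption): in the real script these rows would
# come from $ts->rows instead of being hard-coded.
my @rows = (
    [ 'Island Trading', 'UK' ],
    [ 'Simons bistro',  'Denmark' ],
    [ 'Wolski Zajazd',  'Poland' ],
);

# Hypothetical helper: turn a list of rows into a hash of columns,
# keyed by the lowercased header name.
sub rows_to_columns {
    my ( $headers, $rows ) = @_;
    my %columns;
    $columns{ lc $_ } = [] for @$headers;    # one empty array per header
    for my $row (@$rows) {
        for my $i ( 0 .. $#$headers ) {
            # Append the i-th cell of this row to the i-th header's column.
            push @{ $columns{ lc $headers->[$i] } }, $row->[$i];
        }
    }
    return \%columns;
}

my $hash = rows_to_columns( \@headers, \@rows );

# Print each column on one line, keys in sorted order.
for my $key ( sort keys %$hash ) {
    print "$key: ", join( ', ', @{ $hash->{$key} } ), "\n";
}
```

In the earlier script, the same helper could be called once per table, for example as rows_to_columns( \@headers, [ $ts->rows ] ) inside the foreach over $te->tables. The values have to be array references (square brackets), because a Perl hash value can only hold a single scalar.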
