http://www.perlmonks.org?node_id=986919


in reply to How to fetch table element from a site into data

I found HTML::TableParser & HTML::TableExtract but they are not working for my case.

Why not? What's the problem? Please show the code you've written to try them.

Also you need to show the HTML table you are trying to extract data from.

Re^2: How to fetch table element from a site into data
by Anonymous Monk on Aug 12, 2012 at 04:22 UTC
    My code is something like this:
    use WWW::Mechanize;
    use HTTP::Cookies;
    use HTML::TableParser;
    use HTML::TableExtract;

    my $mech = WWW::Mechanize->new();
    $mech->get('http://www.w3schools.com/sql/default.asp');
    my $a = $mech->content();

    $te = HTML::TableExtract->new( headers => [('Company', 'Country')] );
    $te->parse($html_string);

    # Examine all matching tables
    foreach $ts ($te->tables) {
        print "Table (", join(',', $ts->coords), "):\n";
        foreach $row ($ts->rows) {
            print join(',', @$row), "\n";
        }
    }

    # Shorthand...top level rows() method assumes the first table found in
    # the document if no arguments are supplied.
    foreach $row ($te->rows) {
        print join(',', @$row), "\n";
    }

      You almost got it! You captured the HTML content in $a, but then called $te->parse($html_string); with $html_string, which is never assigned, so the parser receives nothing.

      Try the following (based on the HTML::TableExtract scripting example):

      use Modern::Perl;
      use WWW::Mechanize;
      use HTML::TableExtract;

      my $mech = WWW::Mechanize->new();
      $mech->get('http://www.w3schools.com/sql/default.asp');
      my $html_string = $mech->content();

      my $te = HTML::TableExtract->new( headers => [ 'Company', 'Country' ] );
      $te->parse($html_string);

      foreach my $ts ( $te->tables ) {
          print "Table (", join( ',', $ts->coords ), "):\n";
          foreach my $row ( $ts->rows ) {
              print join( ',', @$row ), "\n";
          }
      }

      Output

      Table (0,0):
      Island Trading,UK
      Galería del gastrónomo,Spain
      Laughing Bacchus Wine Cellars,Canada
      Paris spécialités,France
      Simons bistro,Denmark
      Wolski Zajazd,Poland

      Hope this helps!

        Yes, it works! :) Can you tell me how I can put these values into a hash, with each header as a key and that column's contents as the value? The output should look like this:
        $hash = {
            company => [ 'Island Trading', 'Galería del gastrónomo',
                         'Laughing Bacchus Wine Cellars', 'Paris spécialités',
                         'Simons bistro', 'Wolski Zajazd' ],
            country => [ 'UK', 'Spain', 'Canada', 'France', 'Denmark', 'Poland' ],
        };
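
          One way to build such a hash is to walk the rows and push each cell onto an array reference stored under its lowercased header name. The following is only a minimal sketch: rows_to_columns is a hypothetical helper name, and the hard-coded @rows stand in for what $ts->rows returns, assuming the columns come back in the same order as the headers passed to HTML::TableExtract->new.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a list of rows (array refs, the shape that
# $ts->rows yields) into a hash of array refs keyed by lowercased header.
# Assumes column order in each row matches the order of @$headers.
sub rows_to_columns {
    my ( $headers, @rows ) = @_;

    my %columns;
    $columns{ lc $_ } = [] for @$headers;    # one empty list per header

    for my $row (@rows) {
        for my $i ( 0 .. $#$headers ) {
            push @{ $columns{ lc( $headers->[$i] ) } }, $row->[$i];
        }
    }
    return \%columns;
}

# Sample data standing in for the parsed table (an assumption for this
# sketch, so that it runs without network access or extra modules).
my @headers = ( 'Company', 'Country' );
my @rows    = (
    [ 'Island Trading', 'UK' ],
    [ 'Simons bistro',  'Denmark' ],
    [ 'Wolski Zajazd',  'Poland' ],
);

my $hash = rows_to_columns( \@headers, @rows );

# Print each column's values, one header per line.
print "$_: ", join( ', ', @{ $hash->{$_} } ), "\n" for sort keys %$hash;
```

          In the script above, the same call would go inside the tables loop, for example rows_to_columns( [ 'Company', 'Country' ], $ts->rows ). Empty cells may come back undefined, so you may want to substitute a default before printing.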