PerlMonks  

Re^2: How do I reverse an extracted HTML table?

by Anonymous Monk
on Dec 05, 2011 at 22:15 UTC (#941927)


in reply to Re: How do I reverse an extracted HTML table?
in thread How do I reverse an extracted HTML table?

#!/usr/bin/perl --
use strict;
use warnings;
use CGI qw/ *table *Tr Td /;
use HTML::TableContentParser;

my $html = <<'HTML';
<table>
<tr><td>1</td><td>2</td><td>3</td></tr>
<tr><td>ro</td><td>sham</td><td>bo</td></tr>
<tr><td>&lt;ro&gt;</td><td>&lt;sham&gt;</td><td>&lt;bo&gt;</td></tr>
</table>
HTML

#~ my $p = HTML::TableContentParser->new();
#~ my $tables = $p->parse($html);
#~ use DDS; die Dump($tables);
my $tables = [
    {
        rows => [
            { cells => [ { data => 1 }, { data => 2 }, { data => 3 } ] },
            { cells => [ { data => 'ro' }, { data => 'sham' }, { data => 'bo' } ] },
            { cells => [ { data => '&lt;ro&gt;' }, { data => '&lt;sham&gt;' }, { data => '&lt;bo&gt;' } ] }
        ]
    }
];

for my $t (@$tables) {
    print start_table();
    for my $r ( @{ $t->{rows} } ) {
        print start_Tr();
        for my $c ( @{ $r->{cells} } ) {
            print Td( $c->{data} );
        }
        print end_Tr();
    }
    print end_table();
}
__END__
$ perl html.tablecontentparser.to.html.pl |xml_pp
<table>
  <tr>
    <td>1</td>
    <td>2</td>
    <td>3</td>
  </tr>
  <tr>
    <td>ro</td>
    <td>sham</td>
    <td>bo</td>
  </tr>
  <tr>
    <td>&lt;ro&gt;</td>
    <td>&lt;sham&gt;</td>
    <td>&lt;bo&gt;</td>
  </tr>
</table>
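The script above only re-emits the table as parsed. If "reverse" in the original question means printing the rows in the opposite order (an assumption on my part; cells could be reversed the same way), a minimal core-only sketch might look like the following. The helpers reverse_rows and table_to_html are hypothetical names for illustration, not part of CGI or HTML::TableContentParser, and the data structure is assumed to have the same tables/rows/cells/data shape as the dump above.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Same shape as the dumped structure: tables -> rows -> cells -> data.
my $tables = [
    {
        rows => [
            { cells => [ { data => 1 },    { data => 2 },      { data => 3 } ] },
            { cells => [ { data => 'ro' }, { data => 'sham' }, { data => 'bo' } ] },
        ],
    },
];

# Hypothetical helper: return a new table hash whose rows are in reverse order.
# The input table is not modified; the row hashes themselves are shared, not copied.
sub reverse_rows {
    my ($table) = @_;
    return +{ %$table, rows => [ reverse @{ $table->{rows} } ] };
}

# Hypothetical helper: serialize one table to HTML.
# Cell data is emitted as-is, on the assumption that it is already HTML-escaped.
sub table_to_html {
    my ($table) = @_;
    my $html = "<table>\n";
    for my $row ( @{ $table->{rows} } ) {
        $html .= '<tr>'
          . join( '', map { "<td>$_->{data}</td>" } @{ $row->{cells} } )
          . "</tr>\n";
    }
    return $html . "</table>\n";
}

print table_to_html( reverse_rows($_) ) for @$tables;
```

To reverse the cells inside each row instead, the same reverse can be applied to @{ $row->{cells} } in table_to_html.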

Instead of HTML::TableContentParser, I would standardize on HTML::Tree and XPath, see
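For illustration only, here is a rough sketch of the HTML::Tree plus XPath approach. It assumes the CPAN module HTML::TreeBuilder::XPath is installed (it is not core Perl), and the sample table and the tab-separated output are my own choices rather than anything from the thread.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use HTML::TreeBuilder::XPath;    # CPAN module (assumed installed), builds on HTML::Tree

my $html = <<'HTML';
<table>
<tr><td>1</td><td>2</td><td>3</td></tr>
<tr><td>ro</td><td>sham</td><td>bo</td></tr>
</table>
HTML

my $tree = HTML::TreeBuilder::XPath->new_from_content($html);

# Collect one array ref of cell texts per <tr>, in document order.
my @rows = map { [ map { $_->as_text } $_->findnodes('./td') ] }
  $tree->findnodes('//table//tr');

# Free the parse tree; @rows only holds plain strings at this point.
$tree->delete;

# Print the rows last to first, one tab-separated line per row.
print join( "\t", @$_ ), "\n" for reverse @rows;
```

Because the extraction is just two XPath expressions, switching to other cells (th, say) or to a specific table only means changing the expressions, not the traversal code.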

