Re: resolving HTML::TableExtract error

in reply to resolving HTML::TableExtract error

foreach $row ($te->rows) {

You'll have to walk through the parsed tables first:

foreach my $ts ( $te->table_states ) {
    foreach my $row ( $ts->rows ) {
        ...
    }
}

Please have a look at perldoc HTML::TableExtract and feel free to contact its author to provide better error messages for misuses like yours.
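To try the nested loop without a data file, here is a minimal sketch that parses an inline document. The sample HTML, the header strings, and the `@records` array are made up for illustration and are not from the original post; only `new`, `parse`, `eof`, `table_states` and `rows` are module or parser calls.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Made-up sample document: a complete <table> with a header row and two data rows.
my $html = <<'HTML';
<table>
  <tr><td><strong>Month </strong></td><td><strong>First </strong></td><td><strong>High </strong></td></tr>
  <tr><td>Aug</td><td>101.5</td><td>103.0</td></tr>
  <tr><td>Sep</td><td>102.0</td><td>104.25</td></tr>
</table>
HTML

# Header strings are matched as patterns against the header cell text.
my $te = HTML::TableExtract->new( headers => [ 'Month', 'First', 'High' ] );
$te->parse($html);
$te->eof;    # flush anything the underlying HTML::Parser still has buffered

# Outer loop: every table that matched the headers.
# Inner loop: the data rows of that table.
my @records;
foreach my $ts ( $te->table_states ) {
    foreach my $row ( $ts->rows ) {
        push @records, join( ',', @$row );
    }
}

print "$_\n" for @records;
```

Collecting the rows into an array first makes it easy to check whether anything was extracted before writing to a file.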

--Frank


Re^2: resolving HTML::TableExtract error by jaydon (Novice) on Jul 13, 2005 at 21:18 UTC
I kind of did. The Synopsis in that documentation is where I got that code from:

# Shorthand...top level rows() method assumes the first table found in
# the document if no arguments are supplied.
foreach $row ($te->rows) {
    print join(',', @$row), "\n";
}

I am probably misinterpreting what it says, but I took that to mean that I don't have to examine all matching tables with an (outer) foreach loop if I am only concerned with the first table found. Anyway, I took your advice and added the outer foreach loop, but my data file remains empty. Here is the amended code:

use HTML::TableExtract;

my $te = HTML::TableExtract->new(
    headers => [qr/Month\s/, qr/First\s/, qr/High\s/, qr/Low\s/, qr/Sett\s/, qr/Chg\s/, qr/Vol\s/, qr/GOWAVE\\s/]
);
$te->parse_file($sourcefile);

my $record;
open (DATFILE, ">> meg.dat") or die "Unable to open meg.dat: $!";
print DATFILE "Table:\n";
foreach my $ts ($te->table_states) {
    foreach my $row ($ts->rows) {
        $record = join(',', @$row);
        print $record . "\n";
        print DATFILE $record . "\n";
    }
}
close DATFILE;

And this is the HTML:

<tr align="center" valign="top">
    <td><strong>Month </strong></td>
    <td><strong>First </strong></td>
    <td><strong>High </strong></td>
    <td><strong>Low </strong></td>
    <td bgcolor="#f3f3f3"><strong>Sett </strong></td>
    <td bgcolor="#f3f3f3"><strong>Chg </strong></td>
    <td><strong>Vol</strong></td>
    <td><strong>GOWAVE</strong></td>
    <td width="1" style="border-bottom-style:none;"></td>
    <td><strong>Vol</strong></td>
    <td style="border-right:1px solid #C0C0C0;"><strong>Open Int</strong></td>
</tr>
Re^3: resolving HTML::TableExtract error by crashtest (Curate) on Jul 14, 2005 at 00:34 UTC
The HTML snippet you provided above is not conducive to testing your code. Besides not being enclosed in `<table>` tags, it only has one row (the header). Both (apparently) prevent the HTML from being parsed into a `table_state`. Once I fixed that, your code (with haoess's extra loop over the table states) started producing data.

One note of caution: according to the documentation, you should be passing regular expression strings to the constructor, not actual regular expressions. I.e., your constructor should look like:

my $te = HTML::TableExtract->new( headers => [ qw( Month\s* First\s* High\s* ... )] );

... although your constructor with the `qr//`'s was working as well.

I had no trouble using the `rows` method on the table extract object directly, as in your original post. That makes me wonder whether you grabbed an older version off CPAN. I'm guessing the shorthand `rows` method in the `HTML::TableExtract` class might have been added somewhere down the line. The version I have is 1.10.

Hope this helps...
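A quick way to check whether a given piece of HTML is recognised as a table at all is to count the matched table states before looping over rows. This is only a sketch: `matched_tables` is a hypothetical helper, the two header strings and the sample data row are made up, and the counts it prints may depend on the installed version of the module.

```perl
use strict;
use warnings;
use HTML::TableExtract;

# Hypothetical helper: how many tables in $html matched the given headers?
sub matched_tables {
    my ($html) = @_;
    my $te = HTML::TableExtract->new( headers => [ 'Month', 'First' ] );
    $te->parse($html);
    $te->eof;    # flush anything the underlying HTML::Parser still has buffered
    my @states = $te->table_states;
    return scalar @states;
}

# A bare header row, similar in shape to the snippet posted above.
my $fragment = '<tr><td><strong>Month </strong></td>'
             . '<td><strong>First </strong></td></tr>';

# The same header row wrapped in <table> tags, plus one made-up data row.
my $complete = "<table>$fragment<tr><td>Aug</td><td>101.5</td></tr></table>";

printf "fragment: %d matched table(s)\n", matched_tables($fragment);
printf "complete: %d matched table(s)\n", matched_tables($complete);
```

If the count is zero, the row loops never execute, which would explain an output file that stays empty without any error being raised.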
Re^4: resolving HTML::TableExtract error by jaydon (Novice) on Jul 14, 2005 at 15:43 UTC
That was great advice! I had pasted just the header from the HTML file because I thought I might not have constructed the TableExtract object correctly. But you were right: I was missing the `<table>` tag, as I had removed some lines from the HTML file. Once I processed the whole file, I got the results I wanted. Thank you!
