PerlMonks  

Re^3: HTML::TableExtract issues

by poj (Priest)
on Aug 24, 2013 at 16:17 UTC (#1050821)


in reply to Re^2: HTML::TableExtract issues
in thread HTML::TableExtract issues

"What I would like to do is populate the first cell in each row with the 58035.png (in this case)."

One way is to use tree mode and look_down, like this:

#!perl
use strict;
use warnings;
use HTML::TableExtract 'tree';
use Text::CSV;
use LWP::Simple;

# input
my $html = get('your url');
my $te = HTML::TableExtract->new();
$te->parse($html);

# output
my $csvfile = 'results.csv';
my $csv = Text::CSV->new ( { binary => 1, eol => "\n" } )
  or die "Cannot use CSV: ".Text::CSV->error_diag ();
open my $fh, '>:encoding(utf-8)', $csvfile
  or die "$csvfile : $!";

# process
my $count=0;
printf "%3s %4s %4s\n",'Tbl','Rows','Cols';
foreach my $ts ($te->tables){
  my $tree = $ts->tree();
  printf "%3d %4d %4d\n",++$count,$tree->maxrow,$tree->maxcol;
  foreach my $r (0..$tree->maxrow){
    my @cells=();
    # is col 1 an img ?
    my $x = $tree->cell($r,0)->look_down('src',qr/png$/);
    push @cells,(defined $x) ? $x->attr('src')
                             : $tree->cell($r,0)->as_text;
    for my $c (1..$tree->maxcol){
      my $val = $tree->cell($r,$c)->as_text;
      push @cells,$val;
    }
    $csv->print ($fh, \@cells);
  }
}
close $fh or die "$csvfile: $!";

Notice I have used Text::CSV rather than just adding commas between columns.
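
To show why that matters, here is a minimal sketch comparing a plain join with Text::CSV on one row. The sample row and the helper names naive_line and csv_line are made up for illustration; they are not part of the script above.

```perl
#!perl
use strict;
use warnings;
use Text::CSV;

# Hypothetical row: two of the fields contain a comma or double quotes.
my @row = ('58035.png', 'Smith, John', 'say "hi"');

# Plain join: the embedded comma and quotes are written out unchanged.
sub naive_line {
    return join ',', @_;
}

# Text::CSV: fields are quoted and embedded quotes are doubled where needed.
sub csv_line {
    my $csv = Text::CSV->new( { binary => 1 } )
        or die "Cannot use CSV: " . Text::CSV->error_diag();
    $csv->combine(@_)
        or die "combine failed: " . $csv->error_input();
    return $csv->string();
}

print naive_line(@row), "\n";
print csv_line(@row),   "\n";
```

With the plain join, a reader of the file cannot tell that "Smith, John" was meant to be a single field. The Text::CSV line still parses back into the original three fields.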

poj


Re^4: HTML::TableExtract issues
by Mr Bigglesworth (Initiate) on Aug 26, 2013 at 12:50 UTC

    Hi poj

    Thank you for your efforts, it is appreciated. It works great.

    I need to change it a little so that I can process the text inside the resulting CSV, but you have pushed me ahead quite a bit.

    I did notice the use of Text::CSV. That is definitely a much better and more reliable way to do it.

    Cheers

    Mr Bigglesworth
