Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl-Sensitive Sunglasses
 
PerlMonks  

Re^2: HTML Table to MYSQL DB

by Fiddler (Initiate)
on Nov 02, 2011 at 18:16 UTC ( #935463=note: print w/ replies, xml ) Need Help??


in reply to Re: HTML Table to MYSQL DB
in thread HTML Table to MYSQL DB

Thank you so much. i will add this to my code.. Will post the results as they come


Comment on Re^2: HTML Table to MYSQL DB
Re^3: HTML Table to MYSQL DB
by Fiddler (Initiate) on Nov 03, 2011 at 15:27 UTC
    OK i got the first table in my html document to process. Now here's the tricky part at the head of each table there's a time stamp that i need to store as a fifth column. Example
    00:00 - 00:30
    Channel Call Letters Count Percent Title
    1   IETV   3   0.81%   IETV
    3   CNC3   1   0.27%   CNC3 Overnight
    4   TV 4   1   0.27%   TV 4
    5   CCNTV6   16   4.35%   For Your Eyes Only
    6   C   2   0.54%   C Overnight
    7   GAYEL   2   0.54%   Macajuel Time
    8   IBN   1   0.27%   Press TV
    9   ACTS   2   0.54%   ACTS - Family Education
    10   EWTN   2   0.54%   Daily Mass: Our Lady
    12   WIN   1   0.27%   WIN

    00:30 - 01:00
    Channel Call Letters Count Percent Title
    1   IETV   1   0.37%   IETV
    3   CNC3   3   1.12%   CNC3 Overnight
    4   TV 4   1   0.37%   TV 4
    5   CCNTV6   15   5.63%   For Your Eyes Only
    6   C   4   1.50%   C Overnight
    7   GAYEL   1   0.37%   Macajuel Time
    9   ACTS   4   1.50%   ACTS - Family Education
    12   WIN   1   0.37%   WIN
    14   WI   3   1.12%   Local Sports
    15   SYN   1   0.37%   Synergy Nights
    or **********************************
    <BR><B> 00:00 - 00:30 </B> <TABLE BORDER> <TR> <TD> Channel </TD> <TD> Call Letters </TD> <TD> Count </TD> <TD> Percent </TD> <TD> Title </TD> </TR> <TR></TR> <TR> <TD align=right> 1 </TD> <TD > &nbsp; IETV &nbsp; </TD> <TD align=right> 3 </TD> <TD > &nbsp; 0.81% &nbsp; </TD> <TD > IETV </TD> </TR> <TR> <TD align=right> 3 </TD> <TD > &nbsp; CNC3 &nbsp; </TD> <TD align=right> 1 </TD> <TD > &nbsp; 0.27% &nbsp; </TD> <TD > CNC3 Overnight </TD> </TR> <TR> <TD align=right> 4 </TD> <TD > &nbsp; TV 4 &nbsp; </TD> <TD align=right> 1 </TD> <TD > &nbsp; 0.27% &nbsp; </TD> <TD > TV 4 </TD> </TR> <TR> <TD align=right> 5 </TD> <TD > &nbsp; CCNTV6 &nbsp; </TD> <TD align=right> 16 </TD> <TD > &nbsp; 4.35% &nbsp; </TD> <TD > For Your Eyes Only </TD> </TR> <TR> <TD align=right> 6 </TD> <TD > &nbsp; C &nbsp; </TD> <TD align=right> 2 </TD> <TD > &nbsp; 0.54% &nbsp; </TD> <TD > C Overnight </TD> </TR> <TR> <TD align=right> 7 </TD> <TD > &nbsp; GAYEL &nbsp; </TD> <TD align=right> 2 </TD> <TD > &nbsp; 0.54% &nbsp; </TD> <TD > Macajuel Time </TD> </TR> <TR> <TD align=right> 8 </TD> <TD > &nbsp; IBN &nbsp; </TD> <TD align=right> 1 </TD> <TD > &nbsp; 0.27% &nbsp; </TD> <TD > Press TV </TD> </TR> <TR> <TD align=right> 9 </TD> <TD > &nbsp; ACTS &nbsp; </TD> <TD align=right> 2 </TD> <TD > &nbsp; 0.54% &nbsp; </TD> <TD > ACTS - Family Education </TD> </TR> <TR> <TD align=right> 10 </TD> <TD > &nbsp; EWTN &nbsp; </TD> <TD align=right> 2 </TD> <TD > &nbsp; 0.54% &nbsp; </TD> <TD > Daily Mass: Our Lady </TD> </TR> <TR> <TD align=right> 12 </TD> <TD > &nbsp; WIN &nbsp; </TD> <TD align=right> 1 </TD> <TD > &nbsp; 0.27% &nbsp; </TD> <TD > WIN </TD> </TR> </TABLE> <BR><B> 00:30 - 01:00 </B> <TABLE BORDER> <TR> <TD> Channel </TD> <TD> Call Letters </TD> <TD> Count </TD> <TD> Percent </TD> <TD> Title </TD> </TR> <TR></TR> <TR> <TD align=right> 1 </TD> <TD > &nbsp; IETV &nbsp; </TD> <TD align=right> 1 </TD> <TD > &nbsp; 0.37% &nbsp; </TD> <TD > IETV </TD> </TR> <TR> <TD align=right> 3 </TD> <TD > &nbsp; CNC3 &nbsp; </TD> <TD align=right> 3 </TD> <TD > &nbsp; 1.12% &nbsp; </TD> <TD > CNC3 Overnight </TD> </TR> <TR> <TD align=right> 4 </TD> <TD > &nbsp; TV 4 &nbsp; </TD> <TD align=right> 1 </TD> <TD > &nbsp; 0.37% &nbsp; </TD> <TD > TV 4 </TD> </TR> <TR> <TD align=right> 5 </TD> <TD > &nbsp; CCNTV6 &nbsp; </TD> <TD align=right> 15 </TD> <TD > &nbsp; 5.63% &nbsp; </TD> <TD > For Your Eyes Only </TD> </TR> <TR> <TD align=right> 6 </TD> <TD > &nbsp; C &nbsp; </TD> <TD align=right> 4 </TD> <TD > &nbsp; 1.50% &nbsp; </TD> <TD > C Overnight </TD> </TR> <TR> <TD align=right> 7 </TD> <TD > &nbsp; GAYEL &nbsp; </TD> <TD align=right> 1 </TD> <TD > &nbsp; 0.37% &nbsp; </TD> <TD > Macajuel Time </TD> </TR> <TR> <TD align=right> 9 </TD> <TD > &nbsp; ACTS &nbsp; </TD> <TD align=right> 4 </TD> <TD > &nbsp; 1.50% &nbsp; </TD> <TD > ACTS - Family Education </TD> </TR> <TR> <TD align=right> 12 </TD> <TD > &nbsp; WIN &nbsp; </TD> <TD align=right> 1 </TD> <TD > &nbsp; 0.37% &nbsp; </TD> <TD > WIN </TD> </TR> <TR> <TD align=right> 14 </TD> <TD > &nbsp; WI &nbsp; </TD> <TD align=right> 3 </TD> <TD > &nbsp; 1.12% &nbsp; </TD> <TD > Local Sports </TD> </TR> <TR> <TD align=right> 15 </TD> <TD > &nbsp; SYN &nbsp; </TD> <TD align=right> 1 </TD> <TD > &nbsp; 0.37% &nbsp; </TD> <TD > Synergy Nights </TD> </TR> </TABLE>
    SO in reality the columns in my DB would be Stamp, Channel, Call Letters, Count, Percent, Title Any help on this will be greatly appreciated.
      Below is where i am with the code. The problem is that it stops at the first table.... If i get it to loop through the tables in the htm doc, then writing the rows to the DB should be fairly simple.
      use warnings; use strict; use HTML::TableExtract; use LWP::Simple; my $file ="/path/to/file/file.htm"; my $T = HTML::TableExtract->new(); my $table = $T->parse_file($file) ->first_table_found; my @rows = $table->rows; foreach my $row ($T->rows) { print join(',', @$row), "\n"; }
        Instead of first_table_found, use a loop over $T->tables as shown in the documentation of HTML::TableExtract. To get the timestamp, you will probably need something more powerful, as HTML::Parser.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://935463]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others cooling their heels in the Monastery: (9)
As of 2014-12-27 22:38 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    Is guessing a good strategy for surviving in the IT business?





    Results (177 votes), past polls