Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation
 
PerlMonks  

Printing extracted Excel data into a HTML table

by Anonymous Monk
on Dec 09, 2005 at 06:45 UTC ( #515467=perlquestion: print w/ replies, xml ) Need Help??
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I have an microsoft excel sheet that I want to parse data by using Spreadsheet::ParseExcel. Basically the following code, I specify what min/max row and min/max col to retreive data from and prints out the results. However, I do not know where to go from now to print the results into a html table.
use strict; use Spreadsheet::ParseExcel; my $BookObject = Spreadsheet::ParseExcel::Workbook->Parse("testbook.xl +s"); foreach my $WorksheetObject (@{$BookObject->{Worksheet}}) { for(my $RowIndex = '0'; # starting at minimu+m row value defined $WorksheetObject->{MaxRow} && $RowIndex <= '7'; # ending at maximum +row value $RowIndex++) { for (my $ColumnIndex = '0'; defined $WorksheetObject->{MaxCol} && $ColumnIndex <= '2'; $ColumnIndex++) { my $CellObject = $WorksheetObject->{Cells}[$RowIndex][$Colum +nIndex]; # get cell #print "($RowIndex, $ColumnIndex)", print $CellObject->Value, "\n" if ($CellObject); } } }

The above code prints out the data extracted.
Sales B3 B4 Low Power Components 5000000 460000 Virtual Wire Products +27000 23000 Filter Products 45000 3700000 Frequency Control Modules 1 +0000 7000 Technology Developments 10000 2000 Total Sales 113000 11500 +0


So now I need that data it prints out to be in a HTML table that looks like this:

Sales Q3 Q4
Low Power Components 3000000 4600000
Virtual Wire Products 2700000 2300000
Filter Products 4500000 3700000
Frequency Control Modules 1000000 700000
Technology Developments 100000 200000
Total Sales 11300000 11500000

Comment on Printing extracted Excel data into a HTML table
Select or Download Code
Re: Printing extracted Excel data into a HTML table
by kulls (Hermit) on Dec 09, 2005 at 07:09 UTC
    you can use HTML::Tree module for generating html file.
    -kulls
      I find DBD::ADO the easiest way to interact with Excel.

      non-perl pointers:
      http://support.microsoft.com/kb/q257819/

      Otherwise, use Win:32::OLE to save as HTML, then use HTML::TableExtract.

      I think that either approach would reduce it to a very simple problem (i.e. very little code).

Re: Printing extracted Excel data into a HTML table
by Ryszard (Priest) on Dec 09, 2005 at 08:55 UTC
    My personal favourite is HTML::Template you can do something very simple like this (untested):
    <table> <TMPL_LOOP NAME=FINANCIALS> <tr> <td><TMPL_VAR NAME=SALES></td> <td><TMPL_VAR NAME=Q1></td> <td><TMPL_VAR NAME=Q2></td> </tr> </TMPL_LOOP> </table> $template->param(FINANCIALS => [ { SALES => "Low Power Components", Q3 => 3000000, Q4 => 4600000 }, { SALES => "Virtual Wire Products", Q3 => 2700000, Q4 => 2300000 }, ] );
    Its then relativly easy to loop thru your Excel output and create the array of hashes needed to fill the template.

    Then again, there is always Text::Template to use as well.. ;-)

      Thanks for the very useful suggestion. Glad to know about that 2 modules!

      Can you give example of how to do it with Text::Template. Cause I can't see any example with HTML there in the documentation.

      Given the two choices. What would you choose between the two? When do you use one and not the other?
        I've personally not used Text::Template, but have heard many good things about it. I'm using HTML::Template mostly because of legacy code, and i can produce stuff very quickly.

        Having said that it looks a good deal more flexible than HTML::Template.

        Sorry, i dont have any examples, but as HTML is not different to text in this context, you just need to build your page, and add the parameters you want as described in the documentation.

Re: Printing extracted Excel data into a HTML table
by Siddartha (Curate) on Dec 09, 2005 at 13:01 UTC
    If it's just a table you want just generate the html table and then add whatever other html you want around it.
    foreach my $WorksheetObject (@{$BookObject->{Worksheet}}) { print "<table>\n"; for(my $RowIndex = '0'; # starting at minimu+m row value defined $WorksheetObject->{MaxRow} && $RowIndex <= '7'; # ending at maximum +row value $RowIndex++) { print "<tr>"; for (my $ColumnIndex = '0'; defined $WorksheetObject->{MaxCol} && $ColumnIndex <= '2'; $ColumnIndex++) { my $CellObject = $WorksheetObject->{Cells}[$RowIndex][$Colum +nIndex]; # get cell #print "($RowIndex, $ColumnIndex)", print "<td>" . $CellObject->Value . "</td>" if ($CellObject +); } print "</tr>\n"; } print "</table>\n"; }
      Siddartha's will work for a fully-populated spreadsheet though the hardcoding of $RowIndex and $ColumnIndex will have to be tweaked manually... or better, modified with code similar to that in bmann's and traveler's observations (which use Win32::OLE -- but IMO, that's a small price in this case).

      However, there are *potentially SERIOUS* problems if the spread is sparsely populated.

      By way of illustration, we'll identify rows and cells of the .html table using notation similar to that used by the spreadsheet... ie, A1 is top left cell, A2 is the leftmost cell of the second row, B2 is the second cell in the second row and so on. (In the diagram below, the first row and first col are solely to show the grid; they are not data)

      Trials with a sparsely populated .xls source:

         A    B   C   D   E   F      (r1 and c1 are labels only)
      1           C1  C2      F1
      2  A2       C2  D2  E2 
      3       B3          E3  F3
      

      The alphanumerics represent "locations" or "addresses" used as data in the spreadsheet cell. The html output (below) shows that empty cells are (as the code makes clear) ignored...

         A    B   C   D   E   F      (labels only)
      1  C1  C2   F1
      2  A2  C2  D2  E2 
      3  B3  E3  F3
      

      if positional relationships have significance, you just might care (in fact, you should care, a lot), and the cure will depend on what you're willing to accept in your html table. One notion that I would consider would be inserting a non-breaking space, &nbsp;, in the table cell identified with an empty spreadsheet cell.

      UNtested: I think the empty spread cells will be undef, so tweaking Siddartha's line 9, defined $WorksheetObject->{MaxCol} &&..., into an if...else construct would facilitate populating the .html table with non-breaking spaces when a spreadheet cell is empty.

      NB that the algo also produces empty <table>-</table> pairs at the end of its output.

        yes sorry, didn't think about that. This is all the original code with a few extra prints in. Didn't want to rewrite his code for him. But yes I did forget about empty rows. So instead of:
        print "<td>" . $CellObject->Value . "</td>" if ($CellObject); I would do: print ($CellObject ? ("<td>" . $CellObject->Value . "</td>") :"<td>&nb +sp;</td>");
        Updated: added &nbsp; as ww suggested. Otherwise it might render like crap.
Re: Printing extracted Excel data into a HTML table
by fmerges (Chaplain) on Dec 10, 2005 at 13:48 UTC

    Hi,

    Take a look at HTML::Table

    Regards,

    fmerges at irc.freenode.net

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://515467]
Approved by monkfan
Front-paged by monkfan
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others making s'mores by the fire in the courtyard of the Monastery: (16)
As of 2014-07-23 20:40 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My favorite superfluous repetitious redundant duplicative phrase is:









    Results (152 votes), past polls