Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

I have a script that works perfectly fine with .xls file but the same file when in xlsx doesn't work with my script. I use Spreadsheet::ParseXLSX(https://metacpan.org/pod/Spreadsheet::ParseXLSX) to parse the .xlsx file. It works fine until getting the row and col range but doesn't work on get_cell or value of the cell.
my @name; my @details = (); for my $row(0 ..$row_max) { my $ligand_details = {}; for my $col (@required_cols) { warn Dumper($worksheet); my $cell = $worksheet->get_cell($row,$col); warn $cell; my $value = $cell ? $cell->val :''; if($row == 0 ) { $name[$col] = $value; } else { $ligand_details->{$name[$col]} = $value; } } push @details , $ligand_details if (defined $row); } return \@details;
Sorry forgot to add the error message
$VAR1 = [ {}, { '' => '' }, { '' => '' }, { '' => '' }, {
Is there any difference between ParseExcel and ParseXLSX? any suggestions?

Replies are listed 'Best First'.
Re: Spreadsheet::ParseXLSX or Spreadsheet::ParseExcel doesn't work with .xlsx
by Tux (Abbot) on Feb 03, 2015 at 11:59 UTC
      To use Spreadsheet::Read, do I need to install Spreadsheet::ParseExcel as well? I have already installed Spreadsheet::Read but it seems that my script does not load any data from the excel file

        You need to have the real parser installed. Spreadsheet::Read is just a unifying frontend API to all the supported parsers. This is like using DBI without knowing what DBD is used: your programm does use DBI; and DBI->connect decides what DBD to load. Likewise for Spreadsheet::Read.

        So to parse .xslx you have to install Spreadsheet::ParseXLSX and then you do somthing like:

        use strict; use warnings; use Spreadsheet::Read; my $book = Spreadsheet::Read->new ("file.xlsx"); my $sheet = $book->sheet (1); my $a3 = $sheet->cell ("A3");

        Enjoy, Have FUN! H.Merijn
Re: Spreadsheet::ParseXLSX or Spreadsheet::ParseExcel doesn't work with .xlsx
by roboticus (Chancellor) on Feb 03, 2015 at 13:13 UTC

    Are you sure that the cell actually exists? I seem to recall having an issue like this last year, where some cells never had a value and didn't exist. If I recall correctly, merged cells had their neighbors missing, too. I don't have a copy of Excel to double-check it (or the final version of the program, it's on a different machine), but here's a fragment of an earlier version where I was reading from the spreadsheet:

    sub get_worksheet_data { my $worksheet = shift; my ($row_min, $row_max) = $worksheet->row_range(); my ($col_min, $col_max) = $worksheet->col_range(); my @data; # Read sheet into array for my $row ($row_min .. $row_max) { for my $col ($col_min .. $col_max) { my $cell = $worksheet->get_cell($row, $col); next unless defined $cell; # Try to detect dates my $v = $cell->value(); my $u = $cell->unformatted(); my $T = $cell->{Type}; my $result = $T;

    See that "next unless defined $cell;" statement? If memory serves, I had the same issue you seem to be encountering, where cells don't necessarily exist. In my project, I had a post-processing phase where I copied the value from the previous row into the cell whenever I found a missing cell. (It was a human-readable report, so that was the expected input: blanks meant "same as the row above".)

    The "Try to detect dates" section was heinous (omitted) as there were multiple date formats people used (and they weren't always actual dates. That was one of the most annoying parts of the project.

    The *most* annoying part of the project is that the XLSX parser is so much slower than the binary file parser. (It took somewhere between two and three *minutes* to parse the spreadsheet. It wasn't a terribly large spreadsheet.)

    ...roboticus

    When your only tool is a hammer, all problems look like your thumb.

Re: Spreadsheet::ParseXLSX or Spreadsheet::ParseExcel doesn't work with .xlsx
by poj (Abbot) on Feb 03, 2015 at 11:26 UTC

    What do you get if you save the .xlsx file as .xls and run the script using Spreadsheet::ParseExcel ?
    poj