http://www.perlmonks.org?node_id=948604


in reply to Re^3: XLSX read and dump
in thread XLSX read and dump

The test.xlsx file isn't in a zip file. I thought it was going to create a new file called test.xlsx, but didn't. So then I tried renaming one of the files to test.xlsx and placed it in the same directory as the script below, named parse.pl, thinking that it would read it. Either way, I still get the same error about a zip archive.
use strict; use warnings; use Spreadsheet::XLSX; my $excel = Spreadsheet::XLSX -> new ('test.xlsx'); foreach my $sheet (@{$excel -> {Worksheet}}) { printf("Sheet: %s\n", $sheet->{Sheet1}); $sheet -> {B45} ||= $sheet -> {B37}; foreach my $row ($sheet -> {B37} .. $sheet -> {B45}) { $sheet -> {R45} ||= $sheet -> {C45}; foreach my $col ($sheet -> {C45} .. $sheet -> {R45}) +{ my $cell = $sheet -> {Cells} [$row] [$col]; if ($cell) { printf("( %s , %s ) => %s\n", $row, $col, +$cell -> {Val}); } } } }

Replies are listed 'Best First'.
Re^5: XLSX read and dump
by toolic (Bishop) on Jan 18, 2012 at 19:07 UTC
    It looks like it just can't find your test.xlsx file. Try using the full path to your file, something like:
    my $excel = Spreadsheet::XLSX -> new ('/home/chirp84/test.xlsx');
      No dice. Returns same error but now includes full path to file.
Re^5: XLSX read and dump
by runrig (Abbot) on Jan 18, 2012 at 20:36 UTC
    Are you trying to parse an actual, valid, existing xlsx file, or are you trying to create one? If you are reading/parsing a spreadsheet, use Spreadsheet::XLSX. If you are trying to create one, use Excel::Writer::XLSX. Your error message indicates that either your xlsx file does not exist, or is not a valid zip archive (and xlsx files are zip archives).
      I had the wrong file extension. Changing to .xlsx allowed program to run. I am trying to read certain cells in 100+ separate .xlsx documents and then dump those values into a single excel document (or really any format, i.e. .xls, xlsx, .csv) that I can then analyze the values. Is there further documentation other than that readily viewable on the CPAN site? Maybe the issue is I just don't know enough perl to tackle this. But it would be so useful so I keep pressing on. It's like I can see all the modules and their usefulness but I can't tie them together.
Re^5: XLSX read and dump
by talexb (Chancellor) on Jan 18, 2012 at 19:48 UTC
      The test.xlsx file isn't in a zip file. I thought it was going to create a new file called test.xlsx, but didn't.

    Sorry -- confusion. The 'new' in this context doesn't mean it's going to *create* a file. It's creating a new context in order read one of your files that already exists.

    Next, I'm not sure

    my $excel = Spreadsheet::XLSX -> new ('test.xlsx');
    is correct. I would have expected
    my $excel = Spreadsheet::XLSX->new ('test.xlsx');
    to be better. Does your script produce any warnings when you run it?

    You need to get to the point where you're able to open a file in your current directory before we can proceed.

    Alex / talexb / Toronto

    "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds

      Next, I'm not sure

      my $excel = Spreadsheet::XLSX -> new ('test.xlsx');

      is correct.

      It's not a common way of laying out code, but adding arbitrary whitespace between tokens in Perl is rarely disallowed. Whitespace is only really needed between tokens if they'd look like another token if the whitespace was missing.

      For example, given:

      use 5.010; my @foo = split /\|/, q {foo|bar|bar}; foreach my $x (@foo) { say $x; }

      ... there are only actually two pieces of required whitespace: between "use" and "5.010", and between "foreach" and "my". The code runs perfectly well if you strip out the rest of the whitespace:

      use 5.010;my@foo=split/\|/,q{foo|bar|bar};foreach my$x(@foo){say$x}

      ... though most people would consider the former to be more readable.

      Okay, the data file I renamed had the wrong extension. I had .xls, after I changed to .xlsx the script ran. All the data from excel printed into my terminal window in the format below:
      ( 1 , 1 ) => -230 ( 1 , 2 ) => -201.25 ( 1 , 3 ) => -172.5 ( 1 , 4 ) => -143.75
      and so on... Also, I changed the code to reflect exactly what the Spreadsheet::XLSX synopsis shows because my actual row and column names were causing errors. Now the question is: How can I print just the cells I want to a separate file?
          How can I print just the cells I want to a separate file?

        What does the documentation say?

        Alex / talexb / Toronto

        "Groklaw is the open-source mentality applied to legal research" ~ Linus Torvalds