Beefy Boxes and Bandwidth Generously Provided by pair Networks
There's more than one way to do things
 
PerlMonks  

Opening bad Excel files

by Arik123 (Sexton)
on Feb 21, 2018 at 07:30 UTC ( #1209625=perlquestion: print w/replies, xml ) Need Help??

Arik123 has asked for the wisdom of the Perl Monks concerning the following question:

I have an excel file which I need to extract data from. When I open it (with Spreadsheet::XLSX) it reports an empty file - no worksheets. This is not correct. Opening it with Excel itself shows that there is data.

The problem might be that the file wasn't created by Excel itself - it may have been created by another program, which pretends to create 100% compatible Excel files.

Is there any way to convert such files to valid Excel files, or is there another Perl module that can open and process them? I need a permanent solution, to process many such files automatically.

The file can be found here: https://ufile.io/f06gn

Thank you very much!

Replies are listed 'Best First'.
Re: Opening bad Excel files
by Tux (Abbot) on Feb 21, 2018 at 10:51 UTC

    There is a lot of environmental settings missing in your question. As others already found: do not use Spreadsheet::XLSX, but Spreadsheet::ParseXLSX.

    When I fetch your file, I can verify the reason to change the parser

    use Spreadsheet::Read; my $ss = Spreadsheet::Read->new ("pm1209625.xlsx", debug => 3);

    $ perl pm1209625.pl Opening XLSX pm1209625.xlsx using Spreadsheet::ParseXLSX-0.27 4 sheets Sheet 2 'Cities' 215 x 1 Sheet 3 'Categories' 22 x 1 Sheet 4 'Items' 187 x 2 Sheet 5 'VehicleTypes' 14 x 1 $ env SPREADSHEET_READ_XLSX=Spreadsheet::XLSX perl pm1209625.pl Opening XLSX pm1209625.xlsx using Spreadsheet::XLSX-0.15 0 sheets

    A short digging revealed that the parser only accepts xml tags sheet, where your workbook has tags /x:workbook at that level (XLSX.pm line 53).

    It is IMHO a lost cause to file a ticket.


    Enjoy, Have FUN! H.Merijn
Re: Opening bad Excel files
by vr (Curate) on Feb 21, 2018 at 09:08 UTC
Re: Opening bad Excel files
by thanos1983 (Parson) on Feb 21, 2018 at 09:49 UTC

    Hello Arik123,

    I download your excel file 'aveidotvalues-3.xlsx' for testing purposes and this is what I found using Spreadsheet::XLSX:

    #!/usr/bin/perl use strict; use warnings; use Data::Dumper; use Text::Iconv; my $converter = Text::Iconv -> new ("utf-8", "windows-1251"); # Text::Iconv is not really required. # This can be any object with the convert method. Or nothing. use Spreadsheet::XLSX; my $excel = Spreadsheet::XLSX -> new ('aveidotvalues-3.xlsx', $convert +er); print Dumper $excel; __END__ $ perl test.pl $VAR1 = bless( { 'SheetCount' => 0, 'FmtClass' => bless( {}, 'Spreadsheet::XLSX::Fmt2007' + ), 'Flg1904' => 0, 'Worksheet' => [] }, 'Spreadsheet::XLSX' );

    I also test your excel file with Spreadsheet::Read and this is the output:

    #!/usr/bin/perl use strict; use warnings; use Data::Dumper; use Spreadsheet::Read qw(ReadData); my $book = ReadData ('aveidotvalues-3.xlsx'); print Dumper $book; __END__ $ perl test.pl $VAR1 = [ { 'parsers' => [ { 'version' => '0.15', 'type' => 'xlsx', 'parser' => 'Spreadsheet::XLSX' } ], 'error' => undef, 'sheet' => {}, 'version' => '0.15', 'sheets' => 0, 'type' => 'xlsx', 'parser' => 'Spreadsheet::XLSX' } ];

    I can not test any Windows spreadsheet modules because I am running a LinuxOS, but what I can see so far both modules they can not see your sheets. This is the reason that the data can not be populated. So what I would recommend, is either use a Perl module to produce your files or download Apache OpenOffice to create your spreadsheets. Remember openoffice is compatible with WindowsOS and also LinuxOS and so far all the files that I have populated it worked correctly with all the Perl modules.

    Hope this helps, BR.

    Seeking for Perl wisdom...on the process of learning...not there...yet!
Re: Opening bad Excel files
by Anonymous Monk on Feb 21, 2018 at 08:08 UTC

    Maybe you can use Win32::OLE module.

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://1209625]
Approved by Discipulus
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others drinking their drinks and smoking their pipes about the Monastery: (5)
As of 2020-12-04 21:14 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?
    How often do you use taint mode?





    Results (62 votes). Check out past polls.

    Notices?