Beefy Boxes and Bandwidth Generously Provided by pair Networks
Clear questions and runnable code
get the best and fastest answer
 
PerlMonks  

msft xlsx2txt help please!

by david_lyon (Sexton)
on Aug 22, 2010 at 01:08 UTC ( #856531=perlquestion: print w/ replies, xml ) Need Help??
david_lyon has asked for the wisdom of the Perl Monks concerning the following question:

Bows to the Perl Monks

I have been looking for a way to parse msft xlsx to tab delimited text file on a linux machine using perl.

I have tried numerous scripts on the web which to date none of them reliably parses out xlsx from what I could see. Can someone post some perl code that can convert xlsx and produce a tab delimted text file.

Thank you so much...Bows

Comment on msft xlsx2txt help please!
Replies are listed 'Best First'.
Re: msft xlsx2txt help please!
by sflitman (Hermit) on Aug 22, 2010 at 05:43 UTC
    I took this from Spreadsheet::XLSX's perldoc. Make sure modules and all dependencies are installed through CPAN, and I found perl 5.10 was needed due to a dependency on Weak References in Spreadsheet::ParseExcel.
    #!/usr/bin/perl # SSF 082110 - for perlmonks - convert XLSX to tab-delimited use warnings; use strict; use Spreadsheet::XLSX; use Text::Iconv; my ($file,$sheetindex)=@ARGV; die "Usage: $0 file.xlsx [sheetindex]\n" unless $file; $sheetindex||=0; my $converter=Text::Iconv->new("utf-8","windows-1251"); my $excel=Spreadsheet::XLSX->new($file,$converter); my @sheets=@{$excel->{Worksheet}}; die "Bad sheet index: $sheetindex\n" if $sheetindex<0 or $sheetindex>$ +#sheets; my $sheet=$sheets[$sheetindex]; $sheet->{MaxRow}||=$sheet->{MinRow}; for my $row ($sheet->{MinRow} .. $sheet->{MaxRow}) { $sheet->{MaxCol}||=$sheet->{MinCol}; for my $col ($sheet->{MinCol}..$sheet->{MaxCol}) { my $cell=$sheet->{Cells}[$row][$col]; if ($cell) { print $cell->{Val}; } print "\t" if $col<$sheet->{MaxCol}; } print "\n"; } exit;
    Note that it only outputs one sheet, defaulting to the first. I've noticed that Excel's blank book has three sheets, so it is already three-dimensional. This script could use more error checking, but it works on a Linux system with the caveats listed above.

    HTH,

    SSF

      Bows to The Monks....

      Thanks for everyones help

      Thank you sflitman for posting the code.....incredible, works beautifully

      Bows

Re: msft xlsx2txt help please!
by Anonymous Monk on Aug 22, 2010 at 01:12 UTC
Re: msft xlsx2txt help please!
by sundialsvc4 (Abbot) on Aug 23, 2010 at 13:14 UTC

    I would say, as a categorical statement, that you always want to find, and to use, a CPAN module of recent vintage to do these things.   Don’t pore through the module and cabbage code from it:   use the module exactly as it is.

    Of course, we know that Microsoft is easy to deal with.   We know that they never create file-formats that are secretive or difficult to understand.   (Oh, no-o-o, they would never do such a thing!)   Therefore, when you find a CPAN contribution that has undertaken the chore of understanding that format ... that is the way that you should do it.

    The same is true for output:   there are CPAN modules for tab-delimited text files, too.   Granted, tab-delimited is a comparatively trivial format to deal with, but not entirely.   IMHO, even a slight amount of hair-pulling to do such things is an unjustifiable abuse of precious follicles.

    Actum Ne Agas:   “Do Not Do A Thing Already Done.”

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://856531]
Approved by ww
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others chilling in the Monastery: (9)
As of 2015-07-08 08:34 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (97 votes), past polls