Beefy Boxes and Bandwidth Generously Provided by pair Networks
go ahead... be a heretic
 
PerlMonks  

ParseExcel:: wackiness

by misterperl (Sexton)
on Oct 05, 2010 at 15:04 UTC ( #863619=perlquestion: print w/ replies, xml ) Need Help??
misterperl has asked for the wisdom of the Perl Monks concerning the following question:

Hello PMs: I've been using ParseExcel forever on various forms. Suddenly a form shows up that appears garbled. Closer inspection shows that apparently somehow a NULL char is being inserted between EVERY character in the {Val} fields. For example a field has this: "My value" Perl instead sees: "\c@M\c@y\c@ \c@V\c@a\c@l\c@u\c@e" My first inclination was my client mucked with the form- but I pulled it into Excel, and went in and checked SHOW CONTROL CHARS and the fields are perfect- no control chars. I wrote a cleanup that zaps all the \c@ with a REGEX, but I'd PREFER to have a clean input. The other strangeness is that there is another hash field called _Value (instead of Val), which looks correct all the time. So I considered switching the code over to just use that instead. Insights and wisdom on this matter are most appreciated. MP

Comment on ParseExcel:: wackiness
Replies are listed 'Best First'.
Re: ParseExcel:: wackiness
by jmcnamara (Monsignor) on Oct 05, 2010 at 15:29 UTC
    The data being returned is in Excel's UTF-16 Unicode format.

    You will need to specify a Unicode formatter in the parse() method to handle the data. Something like this:

    #!/usr/bin/perl use warnings; use strict; use Spreadsheet::ParseExcel; use Spreadsheet::ParseExcel::FmtJapan; my $filename = 'file.xls'; my $parser = Spreadsheet::ParseExcel->new(); my $formatter = Spreadsheet::ParseExcel::FmtJapan->new(); my $workbook = $parser->parse($filename, $formatter); if ( !defined $workbook ) { die "Parsing error: ", $parser->error(), ".\n"; } # Set your output encoding to something like this. binmode STDOUT, ':utf8'; for my $worksheet ( $workbook->worksheets() ) { print "Worksheet name: ", $worksheet->get_name(), "\n\n"; my ( $row_min, $row_max ) = $worksheet->row_range(); my ( $col_min, $col_max ) = $worksheet->col_range(); for my $row ( $row_min .. $row_max ) { for my $col ( $col_min .. $col_max ) { my $cell = $worksheet->get_cell( $row, $col ); next unless $cell; print " Row, Col = ($row, $col)\n"; print " Value = ", $cell->value(), "\n" +; print " Unformatted = ", $cell->unformatted(), "\n" +; print "\n"; } } } __END__

    Ignore the fact that the formatter is called FmtJapan, it is also a general purpose Unicode handler.

    --
    John.

      Thanks for that, John! I ran into the same thing a few months ago, and merely did the same as misterperl (edited the wackiness out manually).

      Your module has come in handy many times, I appreciate your work...
      most helpful thanks..

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://863619]
Approved by johngg
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others contemplating the Monastery: (16)
As of 2015-07-29 12:41 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    The top three priorities of my open tasks are (in descending order of likelihood to be worked on) ...









    Results (263 votes), past polls