PerlMonks  

Re^2: Writing International Phonetic Alphabet symbols to Excel?

by Porculus (Hermit)
on Sep 24, 2009 at 20:46 UTC (#797317)


in reply to Re: Writing International Phonetic Alphabet symbols to Excel?
in thread Writing International Phonetic Alphabet symbols to Excel?

There is no "Microsoft's own character set" to worry about. Excel uses totally standard Unicode internally.

I've recently written a lot of Unicode text into Excel files with Perl on Solaris, and I've never had to worry about encodings at all.


Re^3: Writing International Phonetic Alphabet symbols to Excel?
by rdfield (Priest) on Sep 25, 2009 at 07:49 UTC
    Looks like it was just the CSV files generated from Excel - working with the spreadsheet directly sounds like a better option.

    rdfield

Re^3: Writing International Phonetic Alphabet symbols to Excel?
by rdfield (Priest) on Nov 20, 2009 at 15:45 UTC

    After doing some testing on this, it looks like MS uses latin1 (or something very close) internally.

    I used a slightly modified version of the example script from Spreadsheet::ParseExcel, replacing the "print" with a database insert.

    While loading data from a spreadsheet into a PostgreSQL database, I kept hitting the following database error on cells containing symbols such as 0xae (decimal 174 in Latin-1, the "registered" symbol, ®):
    DBD::Pg::db do failed: ERROR: invalid byte sequence for encoding "UTF8": 0xae
    After setting the client encoding to latin1 (keeping the database at UTF8):
    $dbh->do("set client_encoding to latin1");

    the data went in OK.
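
A minimal sketch of why the server rejected the byte, using only the core Encode module. It assumes the cell value reached the database driver as the single byte 0xAE; that assumption is inferred from the error message, not verified against the original script.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Assumed input: the lone Latin-1 byte for the "registered" sign.
my $bytes = "\xAE";

# Strict UTF-8 decoding rejects it: 0xAE is a continuation byte with no lead byte.
my $copy = $bytes;    # decode() with a CHECK flag may modify its argument
my $ok = eval { decode('UTF-8', $copy, Encode::FB_CROAK()); 1 };
my $valid_utf8 = $ok ? 1 : 0;

# Read as Latin-1 it is U+00AE, which UTF-8 encodes as the two bytes C2 AE.
my $char     = decode('ISO-8859-1', $bytes);
my $utf8_hex = join ' ', map { sprintf '%02X', ord } split //, encode('UTF-8', $char);

print "valid as UTF-8: $valid_utf8\n";    # prints "valid as UTF-8: 0"
print "UTF-8 bytes: $utf8_hex\n";         # prints "UTF-8 bytes: C2 AE"
```

Setting client_encoding to latin1 asks the server to do the same Latin-1 to UTF-8 conversion on its side, which would explain why the insert then succeeded.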

    If there is a different/better way to process this, I'd be interested to know.

    Update: there is a better way...
    #!/usr/bin/perl
    use warnings;
    use strict;
    use Spreadsheet::ParseExcel;

    # %col_mapping and store_in_database() come from the rest of the script (not shown).
    my $parser   = Spreadsheet::ParseExcel->new();
    my $workbook = $parser->Parse('Book1.xls');
    binmode(STDOUT, ":utf8");

    foreach my $worksheet ( $workbook->worksheets() ) {
        my ( $row_min, $row_max ) = $worksheet->row_range();
        my ( $col_min, $col_max ) = $worksheet->col_range();

        foreach my $row ( 1 .. $row_max ) {
            foreach my $col ( $col_min .. $col_max ) {
                my $cell = $worksheet->get_cell( $row, $col );
                next unless $cell;
                next unless defined($col_mapping{$col});

                my $value = $cell->value();
                utf8::upgrade($value);    # store the string as UTF-8 internally
                ...
                store_in_database($value);
                ...
            }
        }
    }
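
A small sketch of what utf8::upgrade does to a value like the one above. It changes only Perl's internal storage of the string, not its characters. The literal "\xAE" here stands in for a cell value and is an assumption for illustration; how a given DBD::Pg version treats the internal UTF-8 flag is not shown and may vary.

```perl
use strict;
use warnings;

my $value = "\xAE";    # one character, stored internally as one byte
my $flag_before = utf8::is_utf8($value) ? 1 : 0;

utf8::upgrade($value);    # switch the internal storage to UTF-8

my $flag_after = utf8::is_utf8($value) ? 1 : 0;
my $same_char  = ( $value eq "\xAE" ) ? 1 : 0;    # still U+00AE
my $len        = length $value;                   # still one character

print "flag before: $flag_before, after: $flag_after\n";    # prints "flag before: 0, after: 1"
print "same character: $same_char, length: $len\n";         # prints "same character: 1, length: 1"
```

Because the string is now held as the bytes C2 AE internally, a driver that passes the internal buffer straight to the server would presumably send valid UTF-8, with no need to change client_encoding.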

    rdfield
