PerlMonks  

Re^3: Writing International Phonetic Alphabet symbols to Excel?

by rdfield (Priest)
on Nov 20, 2009 at 15:45 UTC


in reply to Re^2: Writing International Phonetic Alphabet symbols to Excel?
in thread Writing International Phonetic Alphabet symbols to Excel?

After doing some testing on this, it looks like MS Excel uses latin1 (or something very close) internally.

I used a slightly modified version of the example script from Spreadsheet::ParseExcel, replacing the "print" with a database insert.

While loading data from a spreadsheet into a PostgreSQL database, cells containing symbols such as 0xae (decimal 174 in latin1, the "registered" symbol, ®) constantly triggered the database error:
DBD::Pg::db do failed: ERROR: invalid byte sequence for encoding "UTF8": 0xae
After setting the client encoding to latin1 (keeping the database at UTF8):
$dbh->do("set client_encoding to latin1");

the data went in OK.
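
For anyone wondering why the server complained, here is a minimal sketch using only the core Encode module. It assumes the cell value reached the database as the single byte 0xae; the variable names are mine, not from the original script. A lone 0xae is not a valid UTF-8 sequence, but read as latin1 it is U+00AE, which takes two bytes in UTF-8. That conversion is roughly what the server does once client_encoding is set to latin1.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The single byte that PostgreSQL rejected (assumed to be what was sent).
my $raw = "\xAE";

# Strict UTF-8 decoding fails: 0xAE on its own is only a continuation byte.
my $copy = $raw;    # decode() with a CHECK value may modify its input
my $ok = eval { decode('UTF-8', $copy, Encode::FB_CROAK()); 1 };
print "valid UTF-8? ", ( $ok ? "yes" : "no" ), "\n";    # prints "no"

# Read as latin1 it is U+00AE, which becomes two bytes in UTF-8.
my $char = decode('iso-8859-1', $raw);
my $utf8 = encode('UTF-8', $char);
printf "as UTF-8: %s\n", unpack('H*', $utf8);           # prints "c2ae"
```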

If there is a different/better way to process this, I'd be interested to know.

Update: there is a better way...
#!/usr/bin/perl
use warnings;
use strict;
use Spreadsheet::ParseExcel;

my $parser   = Spreadsheet::ParseExcel->new();
my $workbook = $parser->Parse('Book1.xls');
binmode(STDOUT, ":utf8");

# %col_mapping and store_in_database() are not shown here.
foreach my $worksheet ( $workbook->worksheets() ) {
    my ( $row_min, $row_max ) = $worksheet->row_range();
    my ( $col_min, $col_max ) = $worksheet->col_range();

    # Rows start at 1 here, so row 0 is skipped.
    foreach my $row ( 1 .. $row_max ) {
        foreach my $col ( $col_min .. $col_max ) {
            my $cell = $worksheet->get_cell( $row, $col );
            next unless $cell;
            next unless defined($col_mapping{$col});

            my $value = $cell->value();
            # Switch Perl's internal representation of the string to UTF-8.
            utf8::upgrade($value);
            ...
            store_in_database($value);
            ...
        }
    }
}
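
A small sketch of what utf8::upgrade does to such a value, using nothing outside core Perl. The sample string is an assumption standing in for a cell value. The character itself does not change, only Perl's internal storage of it. My guess, not something confirmed above, is that the insert then succeeds because the driver is handed UTF-8 bytes.

```perl
use strict;
use warnings;

my $value = "\xAE";    # one character, stored internally as one byte

my $before = utf8::is_utf8($value) ? 1 : 0;
utf8::upgrade($value); # same character, now stored internally as UTF-8
my $after  = utf8::is_utf8($value) ? 1 : 0;

# Length and code point are unchanged by the upgrade.
printf "flag before=%d after=%d, length=%d, ord=0x%02x\n",
    $before, $after, length($value), ord($value);

# The UTF-8 byte sequence for this character, for comparison.
my $bytes = $value;
utf8::encode($bytes);
printf "UTF-8 bytes: %s\n", unpack('H*', $bytes);    # prints "c2ae"
```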

rdfield
