
Re: dealing with encoding while converting data from MySQL to Postgres

by Anonymous Monk
on Dec 10, 2011 at 11:52 UTC ( [id://942792]=note )


in reply to dealing with encoding while converting data from MySQL to Postgres

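This is a tough situation. Personally, I would go through every row and every text column in the original database and check whether the raw bytes are valid UTF-8. Note that Encode::is_utf8() does not tell you that: it only reports whether perl's internal UTF8 flag is set on the scalar. A strict decode such as Encode::decode("UTF-8", $str, Encode::FB_CROAK) wrapped in an eval is a better test. If the value turns out not to be valid UTF-8, try decoding it with Encode::decode("cp1252", $str) ("cp1252" is a good bet for many Western European languages), and only then insert it into the Postgres database. This will probably still leave you with some corrupted entries, which you can figure out how to detect and fix later.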
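
A minimal sketch of the check-then-fall-back idea above, assuming each column value arrives as raw bytes (not already decoded by the database driver). The helper name bytes_to_text is made up for illustration. It uses a strict UTF-8 decode as the validity test, since Encode::is_utf8() only reports perl's internal flag, and it treats cp1252 as the fallback purely on the guess made above.

```perl
use strict;
use warnings;
use Encode qw(decode FB_CROAK LEAVE_SRC);

# Hypothetical helper: turn the raw bytes of one text column into a
# Perl character string.
sub bytes_to_text {
    my ($bytes) = @_;
    return undef unless defined $bytes;    # pass SQL NULL through untouched

    # Strict decode: dies on anything that is not well-formed UTF-8.
    # LEAVE_SRC keeps $bytes intact so the fallback can still use it.
    my $text = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return $text if defined $text;

    # Assumption: whatever is not valid UTF-8 is Windows-1252.
    return decode('cp1252', $bytes);
}
```

Pure ASCII passes the UTF-8 test unchanged, so only rows with high bytes are affected. Text that was stored in some third encoding will still come out wrong, which is where the "corrupted entries" mentioned above would come from.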

Replies are listed 'Best First'.
Re^2: dealing with encoding while converting data from MySQL to Postgres
by Anonymous Monk on Dec 10, 2011 at 12:09 UTC

    ...and even then you may have to watch out for doubly-encoded UTF-8 (maybe the database driver does that, maybe it was inserted into the database that way, maybe ...). You should also look into Text::Iconv for your conversion, as it works with raw bytes and does not care about perl's "utf8 bit."
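
One way to look for the doubly-encoded case, sketched with the core Encode module rather than Text::Iconv. It assumes the first (wrong) pass read the UTF-8 bytes as ISO-8859-1; if cp1252 was involved instead, the reverse step would need to change. The helper name undo_double_utf8 is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(decode encode FB_CROAK LEAVE_SRC);

# Hypothetical helper: given an already-decoded character string, undo one
# layer of double encoding if the string looks doubly encoded.
sub undo_double_utf8 {
    my ($text) = @_;

    # Pure ASCII cannot be doubly encoded; characters above U+00FF cannot
    # have come from misreading single bytes as ISO-8859-1.
    return $text if $text !~ /[^\x00-\x7F]/;
    return $text if $text =~ /[^\x00-\xFF]/;

    # Map each character back to the byte it was misread from, then see
    # whether those bytes form well-formed UTF-8.
    my $bytes = encode('iso-8859-1', $text);
    my $inner = eval { decode('UTF-8', $bytes, FB_CROAK | LEAVE_SRC) };
    return defined $inner ? $inner : $text;
}
```

This is a heuristic: a string that legitimately contains a sequence such as "Ã©" would be "repaired" too, so it is probably worth logging every value it changes rather than trusting it blindly.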
