PerlMonks  

Re: Malformed UTF-8 character error after fetching data from Postgresql

by sundialsvc4 (Abbot)
on Jul 17, 2014 at 18:00 UTC ( [id://1094098]=note )


in reply to Malformed UTF-8 character error after fetching data from Postgresql

Very respectfully to you, stefbv, I’m not so sure that you’re correct on this.

When I read section 23.3.3 of this PostgreSQL doc page, it states that automatic conversion between client and server character sets is provided, and that UTF8<->WIN is a supported combination.

What I am suspicious of is that Perl is treating the data as UTF8 within Data::Dumper. It should be easy to query the database directly to be sure that the characters were stored (translated) correctly, and that the received characters are in CP1251. I suspect that Perl thinks that it’s dealing with UTF8 when the bytes it received are actually CP1251.
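One way to check that suspicion without involving the database at all is to look at the raw octets. The sketch below is only an illustration: the helper names is_valid_utf8 and cp1251_to_text are made up for this post, and the sample bytes are a hypothetical stand-in for whatever the fetch actually returned.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical sample: the Russian word "test" as CP1251 octets.
# In a real check these would be the raw bytes fetched from the database.
my $bytes = "\xF2\xE5\xF1\xF2";

# Returns 1 if the octets form well-formed UTF-8, 0 otherwise.
sub is_valid_utf8 {
    my ($octets) = @_;
    my $copy = $octets;    # decode() with a CHECK value may modify its argument
    return eval { decode('UTF-8', $copy, Encode::FB_CROAK); 1 } ? 1 : 0;
}

# Interprets the octets as CP1251 and returns a Perl character string.
sub cp1251_to_text {
    my ($octets) = @_;
    return decode('cp1251', $octets);
}

printf "valid UTF-8? %s\n", is_valid_utf8($bytes) ? 'yes' : 'no';
printf "code points as CP1251: %s\n",
    join ' ', map { sprintf 'U+%04X', ord } split //, cp1251_to_text($bytes);
```

If the octets fail the UTF-8 check but decode to sensible Cyrillic code points as CP1251, then the data arrived in CP1251 and something downstream is merely labelling it as UTF8.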

Replies are listed 'Best First'.
Re^2: Malformed UTF-8 character error after fetching data from Postgresql
by stefbv (Curate) on Jul 17, 2014 at 18:19 UTC

    That is interesting, and I found another phrase in the DBD::Pg docs that sounds like I was mistaken:

    pg_enable_utf8 (integer)

    DBD::Pg specific attribute. The behavior of DBD::Pg with regards to this flag has changed as of version 3.0.0. ...

    "Note that the value of client_encoding is only checked on connection time. If you change the client_encoding to/from 'UTF8' after connecting, you can set pg_enable_utf8 to -1 to force DBD::Pg to read in the new client_encoding and act accordingly."

      Thanks. That seems to be the problem. When I connect to my database the client encoding is UTF8, and by default DBD::Pg sets the internal Perl UTF8 flag to true. This causes problems when I change the client encoding after connecting: Perl assumes that my fetched data is UTF8, which it no longer is. When I set pg_enable_utf8 to 0 everything is fine. A better solution is to set pg_enable_utf8 to -1 after changing the client_encoding, as suggested in the DBD::Pg documentation.
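For anyone landing here later, the fix described above might look roughly like this. It is a sketch, not tested against a live server: set_client_encoding is a hypothetical helper name, the connection details in the usage comment are placeholders, and the only DBD::Pg specifics relied on are the pg_enable_utf8 attribute and the -1 value quoted from its documentation.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of DBI or DBD::Pg): change the client
# encoding on an existing handle, then ask DBD::Pg to re-read it.
sub set_client_encoding {
    my ($dbh, $encoding) = @_;

    # Encoding names are plain identifiers; refuse anything else, since
    # the name is interpolated into the SQL statement below.
    die "suspicious encoding name: $encoding\n"
        unless $encoding =~ /\A\w+\z/;

    $dbh->do("SET client_encoding TO '$encoding'");

    # Per the DBD::Pg docs, -1 forces the driver to check the new
    # client_encoding instead of keeping the value seen at connect time.
    $dbh->{pg_enable_utf8} = -1;

    return $dbh;
}

# Usage sketch (database name and credentials are placeholders):
#   use DBI;
#   my $dbh = DBI->connect('dbi:Pg:dbname=mydb', $user, $pass,
#                          { RaiseError => 1 });
#   set_client_encoding($dbh, 'WIN1251');
```

Setting the attribute right after the SET statement keeps the two steps together, so the driver's idea of the encoding cannot drift from the server's.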

      My instinct here is that Data::Dumper is the one that is confused, not anything in the DBI Stack.

        If you don't have a test/demo case to back it up, you'd be better off not trusting your "instinct". Data::Dumper is actually pretty good about avoiding and clearing up confusion (unless of course you don't know how to read its output).
