in reply to DBD::Pg encodes Perlstring to UTF-8 bytes instead of WIN1252 regardless client encoding
I'm going to describe how DBD::mysql works. I suspect DBD::Pg works the same way.
Perl has two ways of storing strings internally: one byte per character, or UTF-8 with the scalar's UTF8 flag set. DBI or DBD::mysql looks at the internal buffer of scalars without checking which storage format was used, so every time you pass a string, it's as if you actually passed
    use Encode qw( is_utf8 encode_utf8 );
    is_utf8($string) ? encode_utf8($string) : $string
This is a bug, but it almost always does the right thing.
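To make the two storage formats concrete, here is a minimal sketch using only core modules. The helper internal_bytes is hypothetical; it just mimics the expression above to show which bytes a driver would see if it read the buffer without checking the storage format.

```perl
use strict;
use warnings;
use Encode qw( is_utf8 encode_utf8 );

# Hypothetical helper: the bytes seen by code that reads a scalar's
# internal buffer without looking at the UTF8 flag.
sub internal_bytes {
    my ($s) = @_;
    return is_utf8($s) ? encode_utf8($s) : $s;
}

my $downgraded = "\xE9";        # one character, stored as one byte
my $upgraded   = $downgraded;
utf8::upgrade($upgraded);       # same character, now stored as UTF-8

# Perl itself considers the two strings identical.
print $downgraded eq $upgraded ? "equal\n" : "different\n";

# The internal buffers differ, though.
printf "%vX\n", internal_bytes($downgraded);   # E9
printf "%vX\n", internal_bytes($upgraded);     # C3.A9
```

Two strings that compare equal in Perl can therefore reach the database as different bytes, which is why the storage format matters here.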
Workaround:
- If you have a decoded string (a string of Unicode code points), you can use the following:
    use Encode qw( encode_utf8 );
    $dbh->do("SET NAMES utf8");
    my $sth = ...;
    $sth->execute(encode_utf8($decoded));
- If you have a string encoded using cp1252, you can use the following:
    use Encode qw( decode );
    $dbh->do("SET NAMES utf8");
    my $sth = ...;
    $sth->execute(decode('cp1252', $encoded));
- If you have a string encoded using cp1252 and you want to avoid any encoding and decoding on the Perl side, you can use the following:
    sub _d { my ($s) = @_; utf8::downgrade($s); $s }
    $dbh->do("SET NAMES cp1252");
    my $sth = ...;
    $sth->execute(_d($encoded));
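The three workarounds can be checked without a database by looking at the bytes that would be handed over. This is only a sketch: internal_bytes is a hypothetical stand-in for what the driver reads, and the euro sign (0x80 in cp1252, U+20AC in Unicode) is an arbitrary test character.

```perl
use strict;
use warnings;
use Encode qw( is_utf8 encode_utf8 decode );

# Hypothetical stand-in for the driver reading the internal buffer.
sub internal_bytes { my ($s) = @_; is_utf8($s) ? encode_utf8($s) : $s }

# Force the one-byte-per-character storage format.
sub _d { my ($s) = @_; utf8::downgrade($s); $s }

my $encoded = "\x80";                        # euro sign as cp1252 bytes
my $decoded = decode('cp1252', $encoded);    # U+20AC

# Workaround 1: decoded string, explicitly encoded (SET NAMES utf8).
my $case1 = internal_bytes(encode_utf8($decoded));

# Workaround 2: cp1252 bytes decoded in Perl (SET NAMES utf8).
my $case2 = internal_bytes(decode('cp1252', $encoded));

# Workaround 3: cp1252 bytes passed through, even if they happen
# to be stored in the upgraded format (SET NAMES cp1252).
my $upgraded = $encoded;
utf8::upgrade($upgraded);
my $without_d = internal_bytes($upgraded);
my $case3     = internal_bytes(_d($upgraded));

printf "%vX\n", $case1;       # E2.82.AC
printf "%vX\n", $case2;       # E2.82.AC
printf "%vX\n", $without_d;   # C2.80
printf "%vX\n", $case3;       # 80
```

The third line of output is the failure mode that _d guards against: an upgraded scalar holding cp1252 bytes would otherwise be sent as C2 80 instead of 80.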
Notes:
- Passing mysql_enable_utf8=>1 to DBI->connect does $dbh->do("SET NAMES utf8"); for you. Later changes to mysql_enable_utf8 do not.
- is_utf8 always returns true for strings returned by Encode::decode and Encode::decode_utf8.
- is_utf8 always returns false for strings returned by Encode::encode and Encode::encode_utf8, and for strings modified in place by Encode::from_to.
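The last two notes can be observed directly. A small sketch with non-ASCII test data (the byte values are arbitrary examples, not from the original question):

```perl
use strict;
use warnings;
use Encode qw( is_utf8 encode decode encode_utf8 decode_utf8 from_to );

my $cp1252_bytes = "\xE9";        # e-acute encoded as cp1252
my $utf8_bytes   = "\xC3\xA9";    # e-acute encoded as UTF-8

# Decoding produces strings stored in the upgraded (UTF-8) format.
my $dec_a = decode('cp1252', $cp1252_bytes);
my $dec_b = decode_utf8($utf8_bytes);

# Encoding produces strings stored in the downgraded (byte) format.
my $enc_a = encode('cp1252', $dec_a);
my $enc_b = encode_utf8($dec_a);

# from_to converts its first argument in place.
my $converted = $cp1252_bytes;
from_to($converted, 'cp1252', 'UTF-8');

printf "decode:      %d\n", is_utf8($dec_a)     ? 1 : 0;   # 1
printf "decode_utf8: %d\n", is_utf8($dec_b)     ? 1 : 0;   # 1
printf "encode:      %d\n", is_utf8($enc_a)     ? 1 : 0;   # 0
printf "encode_utf8: %d\n", is_utf8($enc_b)     ? 1 : 0;   # 0
printf "from_to:     %d\n", is_utf8($converted) ? 1 : 0;   # 0
```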
Re^2: DBD::Pg encodes Perlstring to UTF-8 bytes instead of WIN1252 regardless client encoding
by mje (Curate) on Feb 04, 2014 at 11:06 UTC | |
by ikegami (Patriarch) on Mar 27, 2014 at 19:10 UTC |