http://www.perlmonks.org?node_id=995061


in reply to I'm having a lot of trouble getting UTF-8 output via Perl/DBI on OS X

Hello.

Redirect its output to a file and open that file in a browser, with the browser's encoding set to UTF-8. What do you see?

I see nothing wrong with your Perl script: it gets UTF-8 encoded bytes from the db and prints those encoded bytes... so I don't understand what is happening.

Update:

Sorry, maybe I was looking in the wrong direction.
If you do 'SET NAMES', what does the output look like?

$dbh->do("SET NAMES 'utf8'");
regards

Re^2: I'm having a lot of trouble getting UTF-8 output via Perl/DBI on OS X
by Cody Fendant (Hermit) on Sep 22, 2012 at 08:07 UTC
    $dbh->do("SET NAMES 'utf8'");

    That works! Thank you!

    So, what's going on? It's actually MySQL auto-converting from UTF-8 when I never asked it to?

      Cody Fendant:

      I just looked at the DBD::mysql documentation, and there's a setting mysql_enable_utf8 that defaults to off. I've not used DBD::mysql, but based on the docs, it looks like you can just add the setting in the connect statement like this:

      $dbh= DBI->connect("DBI:mysql:test;mysql_enable_utf8=1", "root", "");

      ++ for the OP: clear, detailed and didn't seem to miss any pertinent information.

      ...roboticus

      When your only tool is a hammer, all problems look like your thumb.

      Hello, Cody Fendant, roboticus.

      PostgreSQL and MySQL both seem to have one encoding for server storage and another for the client connection. So I guess 'select hex(field) from table' shows the bytes as stored on the server, and when the client receives the value it is converted to the client's encoding.

      "mysql_enable_utf8=1" and "SET NAMES 'utf8'" both set the encoding for the client. Both of them gave me good results with DBD::mysql version "4.008".

      In fact, I am not a MySQL person... Please correct me if I have said anything wrong.

      P.S.: With my DBD::mysql version, the result is UTF-8 bytes, not Perl characters. utf8::is_utf8 will tell you whether a value is a decoded character string or bytes.
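
A minimal sketch of the bytes-versus-characters check mentioned in the P.S., with no database involved. The value in $bytes is a hand-written stand-in for what a driver might return (the UTF-8 encoding of a left double quotation mark); it is an assumption for illustration, not output from DBD::mysql. Note that utf8::is_utf8 only reports Perl's internal flag, so treat it as a debugging hint rather than a guarantee.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Assumed stand-in for a value fetched from the database:
# the three UTF-8 bytes that encode U+201C (left double quotation mark).
my $bytes = "\xE2\x80\x9C";

# Decode the byte string into a Perl character string.
my $chars = decode('UTF-8', $bytes);

# The byte string has the flag off and length 3;
# the decoded string has the flag on and length 1.
printf "bytes: is_utf8=%d length=%d\n",
    (utf8::is_utf8($bytes) ? 1 : 0), length $bytes;
printf "chars: is_utf8=%d length=%d\n",
    (utf8::is_utf8($chars) ? 1 : 0), length $chars;
```

If a fetched value behaves like $bytes here, it still needs decoding before Perl will treat it as text.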

Re^2: I'm having a lot of trouble getting UTF-8 output via Perl/DBI on OS X
by Cody Fendant (Hermit) on Sep 22, 2012 at 08:04 UTC
    Good idea. Outputting to a file gives me quotes which look fine in Latin (ISO-8859-1) and question marks in UTF-8. The bytes are 93 94 20 20 91 92.
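
For what it's worth, those six bytes can be inspected without the database. The sketch below assumes they are Windows-1252 (cp1252), which is what browsers generally apply when a page is labelled ISO-8859-1; under that assumption 93 94 91 92 are the curly double and single quotes. It decodes the bytes and shows what the same text looks like once encoded as UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The bytes reported in the file output: 93 94 20 20 91 92.
my $raw = "\x93\x94\x20\x20\x91\x92";

# Assumption: the bytes are cp1252, not real UTF-8.
my $chars = decode('cp1252', $raw);

# Show the code point of each decoded character.
my @codepoints = map { sprintf 'U+%04X', ord } split //, $chars;
print "@codepoints\n";

# Re-encode as UTF-8 and show the resulting bytes in hex.
my $utf8 = encode('UTF-8', $chars);
print join(' ', map { sprintf '%02X', ord } split //, $utf8), "\n";
```

Each curly quote takes one byte in cp1252 but three bytes in UTF-8, so single bytes like 93 or 94 arriving at the client suggest the data was converted to a Latin-style encoding somewhere on the way.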