PerlMonks  

Re: Need tipps for identifying utf-8 problems with Dancer and MySQL

by McA (Priest)
on May 05, 2014 at 12:15 UTC ( [id://1085051]=note )


in reply to Need tipps for identifying utf-8 problems with Dancer and MySQL

Hi,

Just a few hints for debugging:

1) Make sure your MySQL installation stores the data in UTF-8. Take the table in question and run show create table blabla\G in the mysql client. If the table declares its own charset, you can see it on the last line of the output (DEFAULT CHARSET=...). Also check via show global variables like '%char%';.

2) If your data is stored as UTF-8, check whether the MySQL connect option mysql_enable_utf8 is set to true.

3) Insert a debug statement right after fetching data from the database. Once you have a string in hand, do the following:

my $utf8_flag = utf8::is_utf8($string) ? 1 : 0;
print STDERR "For String '$string' UTF8-Flag is: $utf8_flag\n";

If your whole code runs in a UTF-8 environment, then you should get a '1' there (at least for strings that contain non-ASCII characters).
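Steps 2 and 3 could be sketched roughly like this. The DSN, user name and password are placeholders I made up, and utf8_report is a hypothetical helper, not something from DBI. The connect call itself is left as a comment because it needs DBI and DBD::mysql plus a running server.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Step 2: DBI connect arguments with mysql_enable_utf8 switched on.
# DSN, user name and password are placeholders, not values from this thread.
my @connect_args = (
    'dbi:mysql:database=myapp;host=localhost',
    'myuser',
    'mypassword',
    { RaiseError => 1, mysql_enable_utf8 => 1 },
);
# With DBI and DBD::mysql installed:
# my $dbh = DBI->connect(@connect_args);

# Step 3: report the UTF8 flag and the length in characters of a string.
# (hypothetical helper name)
sub utf8_report {
    my ($string) = @_;
    my $flag = utf8::is_utf8($string) ? 1 : 0;
    return sprintf 'flag=%d length=%d', $flag, length $string;
}

# Raw UTF-8 bytes, as a driver that does not decode would hand them over.
my $bytes = "caf\xC3\xA9";

# The same data decoded into a Perl character string.
my $decoded = decode('UTF-8', $bytes);

print STDERR 'bytes:   ', utf8_report($bytes),   "\n";
print STDERR 'decoded: ', utf8_report($decoded), "\n";
```

The byte string reports flag 0 and five "characters", the decoded one reports flag 1 and four characters. Comparing the length is often more telling than the flag alone.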

The output you get is a sign of double encoding. Look for places where this kind of double encoding could happen. Rule of thumb: don't encode until the data leaves your program, at the output boundaries.
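To see what double encoding looks like at the byte level, here is a minimal sketch using only the core Encode module. The variable names are mine and the single accented character is just sample data.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# One character: e with acute accent (U+00E9).
my $chars = "\x{E9}";

# Correct: encode once at the output boundary.
my $once = encode('UTF-8', $chars);

# Bug: encoding the already encoded bytes again. Each byte is treated
# as a Latin-1 character and encoded separately.
my $twice = encode('UTF-8', $once);

printf STDERR "encoded once:  %s\n", unpack('H*', $once);
printf STDERR "encoded twice: %s\n", unpack('H*', $twice);

# A UTF-8 reader decodes the doubled bytes once and ends up with
# two characters (U+00C3 and U+00A9) instead of one.
my $seen_by_reader = decode('UTF-8', $twice);
printf STDERR "characters seen by a UTF-8 reader: %d\n", length $seen_by_reader;
```

The second encode turns the two bytes c3 a9 into the four bytes c3 83 c2 a9, which is why one accented character shows up as two on a UTF-8 page.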

Best regards
McA

Re^2: Need tipps for identifying utf-8 problems with Dancer and MySQL
by kwetal (Initiate) on Jun 17, 2014 at 14:32 UTC

    Thanks for the advice on utf8::is_utf8! I stumbled on the same problem, but I am using sqlite3. When reading the string from the database, I can see that it contains the right bytes(*), but the utf8_flag is 0.

    How do I convince Perl that the string from the database is really a UTF-8 string? I think that I need to open the sqlite database with some option so that all strings read from the database will receive the utf8-flag.

    I tried utf8::upgrade, but it does not work: on the web page the single accented character shows up as 2 accented characters.

    (*) Printing to STDERR, which is connected to a UTF-8 terminal, shows the correct accented character.

      Hi,

      First of all, I don't know sqlite3. There are some players in the game: sqlite3 and DBD::xxx. When the DBD driver for sqlite you use does not decode the byte strings which come from the sqlite database, then you have to do it yourself.

      use Encode qw(decode);
      my $decoded_string = decode('UTF-8', $byte_string_from_sqlite);
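As to why the utf8::upgrade attempt gave two accented characters while decode does the right thing, a small sketch with core modules only may help. The byte string below stands in for what comes back from the database. It is sample data, not taken from the original problem.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# UTF-8 bytes for one accented character (U+00E9), undecoded.
my $from_db = "\xC3\xA9";

# utf8::upgrade only changes Perl's internal storage. The string still
# means two characters, U+00C3 and U+00A9, now with the flag switched on.
my $upgraded = $from_db;
utf8::upgrade($upgraded);

# decode interprets the bytes as UTF-8 and yields one character.
my $decoded = decode('UTF-8', $from_db);

printf STDERR "upgraded: flag=%d length=%d\n",
    utf8::is_utf8($upgraded) ? 1 : 0, length $upgraded;
printf STDERR "decoded:  flag=%d length=%d\n",
    utf8::is_utf8($decoded) ? 1 : 0, length $decoded;

# What ends up on a UTF-8 web page in each case.
printf STDERR "upgraded on output: %s\n", unpack('H*', encode('UTF-8', $upgraded));
printf STDERR "decoded on output:  %s\n", unpack('H*', encode('UTF-8', $decoded));
```

Both strings carry the flag, but only the decoded one has length 1. So the flag by itself does not prove that the string was decoded correctly.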

      Which driver 'DBD::xxx' are you using?

      UPDATE: Have a look at http://search.cpan.org/~ishigaki/DBD-SQLite/lib/DBD/SQLite.pm#DRIVER_PRIVATE_ATTRIBUTES. I'm pretty sure that is what you are looking for: sqlite_unicode

      Regards
      McA

        $dbh->{sqlite_unicode} = 1;

        Well yes! That's exactly the answer to my problem. Thank you very much.

        That explains why I couldn't find it in perldoc DBD; clearly I was browsing the wrong doc.
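For anyone finding this later, a minimal sketch of the sqlite_unicode setting passed at connect time, assuming DBI and DBD::SQLite are installed. The in-memory database, the table name and the sample value are made up for illustration, and the database part is skipped when the modules are missing.

```perl
use strict;
use warnings;

# Only run the database part when DBI and DBD::SQLite are available.
my $have_sqlite = eval { require DBI; require DBD::SQLite; 1 };
my $name;    # stays undef when the modules are missing

if ($have_sqlite) {
    # sqlite_unicode can be given as a connect attribute instead of
    # being set on the handle afterwards.
    my $dbh = DBI->connect('dbi:SQLite:dbname=:memory:', '', '',
        { RaiseError => 1, sqlite_unicode => 1 });

    # Hypothetical table and sample data: a four character city name
    # with three non-ASCII characters.
    $dbh->do('CREATE TABLE people (name TEXT)');
    $dbh->do('INSERT INTO people (name) VALUES (?)', undef,
        "\x{141}\x{F3}d\x{17A}");

    ($name) = $dbh->selectrow_array('SELECT name FROM people');

    printf STDERR "flag=%d length=%d\n",
        utf8::is_utf8($name) ? 1 : 0, length $name;

    $dbh->disconnect;
}
else {
    print STDERR "DBI or DBD::SQLite not installed, skipping\n";
}
```

With the attribute on, the fetched value should come back as a decoded character string, so the flag is set and the length counts characters rather than bytes.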
