http://www.perlmonks.org?node_id=1003756

jacob has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I need some help fetching Russian text through DBI. When I fetch the data, it prints only ???? instead of the characters, but when I assign Russian text directly to a variable, it prints fine. Need help, thanks!

Re: Reading russian characters
by afoken (Chancellor) on Nov 14, 2012 at 06:44 UTC

    Show the relevant code.

    Make sure your DBD is configured to use Unicode. This is usually done during DBI->connect(), using an attribute value. DBD::Pg needs pg_enable_utf8 => 1, DBD::mysql needs mysql_enable_utf8 => 1, DBD::SQLite needs sqlite_unicode => 1 (but that breaks BLOBs). Some DBDs can handle Unicode automatically, like DBD::Oracle (but you have to set either $ENV{'NLS_LANG'} or $ENV{'NLS_NCHAR'} to AL32UTF8 before the Oracle DLLs are loaded, i.e. in a BEGIN block as early in your script as possible). DBD::ODBC handles Unicode automatically if it was compiled with Unicode support (the default on Windows).
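
    A minimal sketch of what that looks like for DBD::mysql. The DSN, credentials, table and column names are hypothetical placeholders, not taken from the original question; for another driver, swap in the attribute named above.

```perl
use strict;
use warnings;
use DBI;

# Hypothetical connection details; replace with your own.
my $dsn  = 'dbi:mysql:database=test;host=localhost';
my $user = 'user';
my $pass = 'secret';

my $dbh = DBI->connect($dsn, $user, $pass, {
    RaiseError        => 1,
    mysql_enable_utf8 => 1,   # DBD::mysql; pg_enable_utf8 for DBD::Pg, sqlite_unicode for DBD::SQLite
});

# Encode character strings as UTF-8 on output (assumes a UTF-8 terminal).
binmode(STDOUT, ':encoding(UTF-8)');

# Hypothetical table and column.
my ($name) = $dbh->selectrow_array('SELECT name FROM people LIMIT 1');
print "$name\n";

$dbh->disconnect;
```

    The binmode call covers the output side: even with the driver returning proper character strings, printing them without an encoding layer can still produce warnings or garbage.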

    Alexander

    --
    Today I will gladly share my knowledge and experience, for there are no sweeter words than "I told you so". ;-)
Re: Reading russian characters
by stefbv (Curate) on Nov 14, 2012 at 08:13 UTC

    In addition to the information provided by afoken, DBD::Firebird needs ib_enable_utf8 => 1.

    Ștefan

Re: Reading russian characters
by Anonymous Monk on Nov 14, 2012 at 14:23 UTC
    Also, find a way to look at the actual bytes (say, in hexadecimal) that are being rendered as "????" on output. (How are you generating that output? To the console? A web page?) Question marks probably just mean a display-only issue: the bytes are there and correct, but the system doesn't know which charset to use to display them. It could also be that they're not there, because they were corrupted earlier. Looking at the bytes is the only way to know for sure.
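
    One possible way to do that check in Perl, as a rough sketch. The helper name hex_dump and the sample strings are made up for illustration; in practice you would pass in the value fetched from the database.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Return the ordinal of every character in the string as hex.
# Values above FF mean Perl holds decoded characters; pairs such as
# D0 9F are raw UTF-8 bytes; 3F is a literal question mark.
sub hex_dump {
    my ($s) = @_;
    return join ' ', map { sprintf '%02X', ord($_) } split //, $s;
}

my $decoded = "\x{41F}\x{440}";            # two Cyrillic letters as Perl characters
my $raw     = encode('UTF-8', $decoded);   # the same text as UTF-8 bytes
my $lost    = '??';                        # data already replaced by question marks

print hex_dump($decoded), "\n";   # 41F 440
print hex_dump($raw),     "\n";   # D0 9F D1 80
print hex_dump($lost),    "\n";   # 3F 3F
```

    If the dump of the fetched value shows 3F for every character, the text was lost before it reached Perl; anything else points at a decoding or display problem.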

      I remember that I got a literal ? (chr 63) from a database whenever it had a character not representable by the current connection encoding. But I can't remember which database behaved like that.

      Java often also behaves like this. This is documented in http://docs.oracle.com/javase/6/docs/api/java/nio/charset/CharsetEncoder.html.
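
      Perl's own Encode module shows the same kind of substitution, which may help illustrate the effect. This is only a sketch of the general mechanism, assuming the default behaviour of encode() without a CHECK argument; it does not show what any particular database does.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $text = "\x{41F}\x{440}\x{438}\x{432}\x{435}\x{442}";   # six Cyrillic letters

# ISO-8859-1 has no Cyrillic, so each character is replaced
# by the substitution character, a literal "?" (chr 63).
my $latin1 = encode('ISO-8859-1', $text);
print "$latin1\n";                    # ??????

# UTF-8 can represent every character; each letter here takes two bytes.
my $utf8 = encode('UTF-8', $text);
printf "%d bytes\n", length $utf8;    # 12 bytes
```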

      Alexander