http://www.perlmonks.org?node_id=374960

olivierp has asked for the wisdom of the Perl Monks concerning the following question:

Fellow Monks,
I need to manipulate Unicode data stored in an MS SQL server, and would rather
use DBI + DBD::ODBC than my current Win32::OLE solution, for easier portability
(the scripts currently run on Win32, but could be moved to a Unix box...)

However, I am unable to correctly fetch from MSSQL's nvarchar, nchar and ntext fields
with DBI/DBD ODBC. I only get "??????" characters from the DB

I've included a small sample script that can be run to test this, provided you have non-ASCII values in the table.
use utf8; use strict; use DBI; use Win32::OLE; # Required to get Unicode correctly CP specifies Perl's "internal" enc +oding # LCID 65001 foces OLE to use (or stop translating) UTF8 # Beware that $variant->Time and $variant->Date will need an LCID in a +ddition # to the format string, e.g. 1033 for US English Win32::OLE->Option(CP => Win32::OLE::CP_UTF8, LCID => 65001); my $odbc_dsn = qq/driver={SQL Server};server=$database_name;AutoTransl +ate=OFF/; my $ado_oledb_dsn = <<EODSN; Provider=sqloledb; Data Source=$database_name; Integrated Security=SSPI EODSN my $sql =<<EOFSQL; select unicode_text from unicode_table where ut_id = 1 EOFSQL my $dbh = DBI->connect("dbi:ODBC:$odbc_dsn", "", ""); open FH, ">:utf8", "test_odbc.txt"; print FH ($dbh->selectrow_array($sql))[0]; close FH; $dbh->disconnect; $dbh = DBI->connect("dbi:ADO:$ado_oledb_dsn", "", ""); open FH, ">:utf8", "test_ado.txt"; print FH ($dbh->selectrow_array($sql))[0]; close FH; $dbh->disconnect; my $odbc_connection = Win32::OLE->new('ADODB.Connection'); $odbc_connection->Open($ado_oledb_dsn); my $recordset = Win32::OLE->new('ADODB.Recordset'); $recordset->Open($sql,$odbc_connection); open FH, ">:utf8", "test_ole.txt"; print FH $recordset->Fields(0)->Value; close FH; $odbc_connection->Close();

test_odbc.txt contains this: ?????????
test_ole and test_ado.txt contains the correct value, a string in Japanese.

I am running this on WinXP / Win2k, with DBI 1.43 & DBD::ODBC 1.07 on Activestate perl 5.8.0
I have tried various DBI/DBD::ODBC combinations on As 5.8.0 & 5.8.3, but to no avail.
Has anyone encountered this and found a solution ?

Update

I've found that DBD::ADO would be something to consider, as I found out it uses Win32::OLE itself, and
thus can coerce Windows into using CP_UTF8 / 65001 as Codepage
Before trying this out deeper, I'm wondering what would happen if Win32::OLE were already initialized
by another script/module, before DBD::ADO gets called and changes the Codepage. I'm asking, as my data
access functions are actually stored in a package that I then include. Does anyone have tips/pointers for this ?
Regards,

--
Olivier

Replies are listed 'Best First'.
Re: DBD-ODBC: How to use Unicode on Win32 ?
by jZed (Prior) on Jul 16, 2004 at 17:44 UTC
    Try putting "use utf8;" at the top of your script.
      As far as I understand use utf8, as of Perl 5.8.x is only required when the script itself
      contains Unicode. I've nonetheless given it a try, but I still get mangled characters through DBD::ODBC. Thanks anyways,

      --
      Olivier
Re: DBD-ODBC: How to use Unicode on Win32 ?
by Zaxo (Archbishop) on Jul 17, 2004 at 05:12 UTC

    Winders does unicode in utf16.

    After Compline,
    Zaxo