http://www.perlmonks.org?node_id=913326


in reply to Re^8: Japanese character in Linux
in thread Japanese character in Linux

Looks like this site can't handle Japanese either :)

Anyway, I believe we can assume that DBD::Oracle does not do any conversion, so the data obtained from the database comes in UTF-16 encoding and should be passed back to the database in the same encoding.

Now, that's the goal. How to achieve it depends on what you are actually doing. I've already told you what you should do to use the obtained data in Perl regular expressions. If you want to insert some Japanese text into the database, then you would do something like this:

use utf8;
use Encode;
my $data_to_be_inserted = Encode::encode("UTF-16", "Some Japanese text");
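As a rough illustration of what that call produces (a sketch, not code from the original poster): Encode's "UTF-16" encoder emits big-endian data with a byte order mark in front, while "UTF-16BE" emits the same code units without one. Whether the database side wants the BOM is not something this thread establishes, so treat that choice as an assumption. The sample string is an arbitrary one, written with \x{...} escapes so the source file stays pure ASCII.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Three Japanese characters (U+65E5 U+672C U+8A9E), written as escapes.
my $text = "\x{65E5}\x{672C}\x{8A9E}";

my $with_bom = encode("UTF-16",   $text);   # BOM + big-endian code units
my $no_bom   = encode("UTF-16BE", $text);   # big-endian, no BOM

# Dump bytes as hex so nothing depends on the terminal encoding.
sub hex_dump { join " ", map { sprintf "%02X", ord } split //, $_[0] }

print "UTF-16   : ", hex_dump($with_bom), "\n";
print "UTF-16BE : ", hex_dump($no_bom),   "\n";

# Decoding the bytes gives back the original character string.
my $round_trip = decode("UTF-16", $with_bom);
print "round trip ok\n" if $round_trip eq $text;
```

The hex dump makes the difference between the two byte strings visible without printing any Japanese to the terminal.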

I assume we've clarified the issue with terminal output. If you have other problems, it is better to describe them in more detail and show the code that does not work. Otherwise, I can only recommend that you read the documents mentioned in the very first answer to your post.

Re^10: Japanese character in Linux
by Anonymous Monk on Jul 11, 2011 at 12:58 UTC
    $sSQL  = " select japanese_longname, \n";
    $sSQL .= " japanese_shortname \n";
    $sSQL .= " from <table name> \n";
    $sSQL .= " where instrument_id = '1301' \n";
    $sSQL .= " and instrument_type = 'ST' \n";
    #print("Sql i d $sSQL\n");
    $dbFOX_sth=$dbFOX->prepare($sSQL);
    $dbFOX_sth->execute();
    if ( @row = $dbFOX_sth->fetchrow_array ) {
        foreach ( @row) { $_ = Encode::decode_utf8( $_ ); }
        ( $sInstrumentNameJ, $sInstrumentShortJ ) = @row;
    }

    Here the values stored in $sInstrumentNameJ and $sInstrumentShortJ are passed to an Oracle stored proc which inserts them into NVARCHAR2 columns. When we hardcode values with ASCII data like "ABC" it works fine, but Japanese data comes out as junk.

      Well, I would consider using "Encode::decode_utf8" very wrong here. At least from your previous messages it followed that the data inserted into the database was in UTF-16 encoding, so when you apply the "decode_utf8" function to it, you create a mess and nothing else. Of course, this function does not hurt pure ASCII data.

      You should use "Encode::decode("UTF-16", $_)".
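To make the difference concrete, here is a minimal sketch that compares the two calls on the same input. It assumes, as in the discussion above, that the fetched bytes are UTF-16 with a byte order mark; the sample string is an arbitrary stand-in for a fetched column. Note that decode("UTF-16", ...) relies on that BOM, so for data without one, "UTF-16BE" or "UTF-16LE" would have to be named explicitly.

```perl
use strict;
use warnings;
use Encode qw(encode decode decode_utf8);

my $text = "\x{65E5}\x{672C}\x{8A9E}";

# Stand-in for a fetched column value (assumed: UTF-16 bytes with BOM).
my $from_db = encode("UTF-16", $text);

my $right = decode("UTF-16", $from_db);   # interprets the bytes as UTF-16
my $wrong = decode_utf8($from_db);        # malformed UTF-8 gets substituted

print "decode(UTF-16) matches original: ", ($right eq $text ? "yes" : "no"), "\n";
print "decode_utf8    matches original: ", ($wrong eq $text ? "yes" : "no"), "\n";

# Pure ASCII bytes pass through decode_utf8 unchanged, which is why "ABC" worked.
my $ascii = decode_utf8("ABC");
print "ASCII unchanged: ", ($ascii eq "ABC" ? "yes" : "no"), "\n";
```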

        We have resolved the issue. The following changes were made in the Perl loader to handle Japanese. Environment variable setting:

        $ENV{'NLS_NCHAR'} = 'AL32UTF8';

        Encoding from Shift JIS to UTF8 after data is fetched from Sybase:

        Encode::from_to($sInstrumentNameJ, "shiftjis", "utf8");  #added for testing
        Encode::from_to($sInstrumentShortJ, "shiftjis", "utf8"); #added for testing
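For readers unfamiliar with Encode::from_to, a small sketch of what the conversion step does. It rewrites the scalar in place, turning Shift JIS bytes into UTF-8 bytes, and returns the new length in bytes. The input below is a made-up stand-in for a value fetched from Sybase, not data from the actual loader.

```perl
use strict;
use warnings;
use Encode qw(encode decode from_to);

# Stand-in for a fetched value: Shift JIS bytes for U+65E5 U+672C U+8A9E.
my $name = encode("shiftjis", "\x{65E5}\x{672C}\x{8A9E}");
my $sjis_len = length $name;

# Convert in place; the return value is the new length in bytes.
my $utf8_len = from_to($name, "shiftjis", "utf8");

print "Shift JIS bytes: $sjis_len\n";
print "UTF-8 bytes    : $utf8_len\n";

# The result is still a byte string; decoding it yields three characters.
my $chars = decode("UTF-8", $name);
print "characters     : ", length($chars), "\n";
```

After the call the scalar holds UTF-8 bytes rather than a decoded character string, which is the form the poster then binds with ora_csform.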

        Binding the Japanese data parameters to be passed, as below:

        use DBD::Oracle qw(:ora_types ORA_OCI SQLCS_NCHAR );
        $dbGOSTky_sth->bind_param(":sInstrumentNameJ",$sInstrumentNameJ,{ora_csform => SQLCS_NCHAR});
        $dbGOSTky_sth->bind_param(":sInstrumentShortJ",$sInstrumentShortJ,{ora_csform => SQLCS_NCHAR});

      When we hardcode values with ASCII data like "ABC" it works fine, but Japanese data comes out as junk.

      Hardcode it using \N{U+0104}?
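A minimal sketch of that suggestion: writing the test value with \N{U+...} escapes keeps the script itself pure ASCII, so the editor's and terminal's encodings drop out of the picture. U+0104 is a Latin letter; the code points below are an arbitrary Japanese sample chosen for illustration. Whether to bind the character string or its encoded bytes depends on the driver setup and is left open here.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hardcoded Japanese test value, written entirely in ASCII source.
my $name_j = "\N{U+65E5}\N{U+672C}\N{U+8A9E}";

# Encoded form, in case the bytes are what gets passed on.
my $utf8_bytes = encode("UTF-8", $name_j);

printf "%d characters, %d UTF-8 bytes\n", length($name_j), length($utf8_bytes);
```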