Re^7: Japanese character in Linux

by andal (Hermit)
on Jul 08, 2011 at 09:56 UTC ( [id://913315]=note )


in reply to Re^6: Japanese character in Linux
in thread Japanese character in Linux

Quite a few things are wrong here. First of all, with the "C" locale it is not possible to see any Japanese text, or in fact any text outside of ASCII, so you have to change your locale first. Assuming that you use xterm as your terminal emulator, try starting it this way:

LC_CTYPE="en_US.UTF-8" xterm &
Then check the output of the locale command in that new terminal window. It should produce something like
LANG=
LC_CTYPE=en_US.UTF-8
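The same check can be made from inside Perl, which is handy when the script is started by something other than an interactive shell. This is only a sketch: it assumes a platform where the core modules POSIX and I18N::Langinfo are available, and the warning text is illustrative.

```perl
use strict;
use warnings;
use POSIX qw(setlocale LC_CTYPE);
use I18N::Langinfo qw(langinfo CODESET);

# Adopt the locale from the environment (LANG, LC_ALL, LC_CTYPE).
# setlocale returns undef if the requested locale is not installed.
my $locale = setlocale(LC_CTYPE, "");

# Ask the C library which character encoding that locale implies.
my $codeset = langinfo(CODESET);

printf "LC_CTYPE=%s codeset=%s\n", $locale // "(unavailable)", $codeset;
warn "locale codeset is not UTF-8; Japanese output will likely be garbled\n"
    unless $codeset =~ /^utf-?8$/i;
```

Under the "C" locale the reported codeset is typically an ASCII variant, which matches the symptom described above.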

After that, you should convert your text to UTF-8. Since you haven't provided the Japanese text corresponding to the hexdump, and since the hexdump looks like UTF-16, I'll assume that it is UTF-16. In that case you should output the variable like this:

use Encode;
Encode::from_to($your_variable, "UTF-16", "UTF-8");
print $your_variable, "\n";
If this does not produce the expected result, then your terminal emulator may be using a font without Japanese glyphs :)
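A small self-contained sketch of that conversion follows. The sample characters are made up for illustration. One detail worth knowing: when decoding, Encode's plain "UTF-16" relies on a byte order mark to pick the byte order, so data without a BOM needs "UTF-16LE" or "UTF-16BE" instead. Which of these applies to your data is something you have to establish yourself.

```perl
use strict;
use warnings;
use Encode qw(encode decode from_to);

# Sample input: U+6975 U+6D0B as little-endian UTF-16 octets, no BOM.
my $octets = encode("UTF-16LE", "\x{6975}\x{6D0B}");

# Plain "UTF-16" needs a BOM to pick the byte order; without one it croaks.
my $copy = $octets;
my $bom_error = eval { decode("UTF-16", $copy); 1 } ? "" : $@;

# Naming the byte order explicitly works. from_to converts in place and
# returns the new length in octets (undef on failure).
my $length = from_to($octets, "UTF-16LE", "UTF-8");

print "BOM-less decode failed: $bom_error" if $bom_error;
print "converted to $length UTF-8 octets\n";
binmode STDOUT;    # $octets now holds raw UTF-8 bytes
print $octets, "\n";
```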

Actually, sending the variable to the terminal is probably not the most important thing. You have to work with that value inside Perl, and depending on what you want to do, you have to convert your variables back and forth. For example, to do pattern matching on it, you should first decode it:

use utf8;
use Encode;
my $converted = Encode::decode("UTF-16", $your_variable);
$converted =~ /some Japanese text/;
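To illustrate why the decode step matters, here is a sketch that runs the same pattern against raw octets and against the decoded string. The sample data and the choice of "UTF-16BE" are assumptions. The pattern is written with `\x{...}` escapes, so `use utf8` is not needed here; it is only required when Japanese literals are typed directly into the source file.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Stand-in for data read from elsewhere: UTF-16BE octets for U+6975 U+6D0B.
my $your_variable = encode("UTF-16BE", "\x{6975}\x{6D0B}");

# Matching against raw octets compares bytes to characters and fails.
my $raw_match = $your_variable =~ /\x{6975}\x{6D0B}/ ? 1 : 0;

# After decoding, the string holds characters and the same pattern matches.
my $converted     = decode("UTF-16BE", $your_variable);
my $decoded_match = $converted =~ /\x{6975}\x{6D0B}/ ? 1 : 0;

# Unicode properties such as \p{Han} also only make sense on decoded text.
my $han_count = () = $converted =~ /\p{Han}/g;

print "raw: $raw_match, decoded: $decoded_match, Han characters: $han_count\n";
```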

Replies are listed 'Best First'.
Re^8: Japanese character in Linux
by Anonymous Monk on Jul 08, 2011 at 10:15 UTC

    Japanese characters are iŠ”j‹É—m

    (株)極洋 Hex Dump is 81698a94816a8bc9976d. We are doing a Sybase to Oracle migration project. We successfully migrated the data to Oracle using Characterset16 in the SQL*Loader file. Now we are finding an issue when the application inserts data from Perl. We are not concerned about displaying the Japanese data in the console. We want Perl to be able to handle the Japanese data after the migration. Please advise. Thanks, Praful
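A quick way to test a guess about the encoding is to turn the hex dump back into octets and decode it strictly. The sketch below tries Shift_JIS (Encode's "cp932"); that choice is an assumption, not something stated in the thread, and any other candidate encoding can be substituted for it.

```perl
use strict;
use warnings;
use Encode qw(decode);

# The hex dump quoted above, turned back into raw octets.
my $octets = pack("H*", "81698a94816a8bc9976d");

# Assumption: the bytes are Shift_JIS / CP932. FB_CROAK makes decode die on
# any invalid sequence; LEAVE_SRC keeps $octets unchanged.
my $text = decode("cp932", $octets, Encode::FB_CROAK() | Encode::LEAVE_SRC());

# List the decoded code points so the result can be checked without a
# terminal that renders Japanese.
my @codepoints = map { sprintf "U+%04X", ord } split //, $text;
print scalar(@codepoints), " characters: @codepoints\n";

binmode STDOUT, ":encoding(UTF-8)";
print $text, "\n";
```

With these bytes the strict decode succeeds and yields five characters: 株 between fullwidth parentheses, followed by 極洋. That agrees with the text quoted next to the hex dump, which suggests (but does not prove) that this particular value is Shift_JIS rather than UTF-16.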

      Looks like this site can't handle Japanese either :)

      Anyway, I believe we can assume that DBD::Oracle does not do any conversion, so the data obtained from the database comes in UTF-16 encoding and should be passed to the database in the same encoding.

      That's the goal. How to achieve it depends on what you are actually doing. I've already told you what to do to use the obtained data in Perl regular expressions. If you want to insert some Japanese text into the database, then you would do something like this:

      use utf8;
      use Encode;
      my $data_to_be_inserted = Encode::encode("UTF-16", "Some Japanese text");
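Before sending encoded data anywhere, it can help to look at the exact octets each UTF-16 variant produces. The sketch below uses two sample characters chosen for illustration. Which variant the database side expects is an assumption you would need to verify against your own driver and database settings.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $text = "\x{6975}\x{6D0B}";    # two sample characters, U+6975 U+6D0B

# "UTF-16" writes a byte order mark followed by big-endian code units;
# the BE and LE variants fix the byte order and write no BOM.
my %hex = map { $_ => unpack("H*", encode($_, $text)) }
          qw(UTF-16 UTF-16BE UTF-16LE);

print "$_: $hex{$_}\n" for sort keys %hex;
```

The plain "UTF-16" form carries two extra leading bytes (fe ff), which matters if the receiving side treats the value as bare code units.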

      I assume we've clarified the issue with terminal output. If you have other problems, it is better to describe them in more detail, showing the code that does not work. Otherwise, I can only recommend that you read the documents mentioned in the very first answer to your post.

        $sSQL  = " select japanese_longname, \n";
        $sSQL .= " japanese_shortname \n";
        $sSQL .= " from <table name> \n";
        $sSQL .= " where instrument_id = '1301' \n";
        $sSQL .= " and instrument_type = 'ST' \n";
        #print("Sql i d $sSQL\n");
        $dbFOX_sth = $dbFOX->prepare($sSQL);
        $dbFOX_sth->execute();
        if ( @row = $dbFOX_sth->fetchrow_array ) {
            # Decode every fetched column as UTF-8.
            foreach (@row) {
                $_ = Encode::decode_utf8($_);
            }
            ( $sInstrumentNameJ, $sInstrumentShortJ ) = @row;
        }

        Here the values stored in $sInstrumentNameJ and $sInstrumentShortJ are passed to an Oracle stored procedure which inserts into NVARCHAR2 columns. When we hardcode ASCII values like "ABC" it works fine, but Japanese data comes out as junk.
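One possible source of the junk is the decode step itself. `Encode::decode_utf8` called without a CHECK argument does not fail on input that is not UTF-8; it silently substitutes U+FFFD for each malformed sequence. The sketch below assumes, purely for illustration, that the driver returns the octets from the hex dump posted earlier in the thread; whether it really does depends on the driver and on the NLS settings.

```perl
use strict;
use warnings;
use Encode qw(decode decode_utf8);

# Assumption: the driver hands back the octets shown in the hex dump.
my $from_db = pack("H*", "81698a94816a8bc9976d");

# decode_utf8 without a CHECK argument replaces each malformed sequence
# with U+FFFD instead of failing, so the damage is silent.
my $as_utf8  = decode_utf8($from_db);
my $replaced = () = $as_utf8 =~ /\x{FFFD}/g;

# Strict decoding turns the same problem into an error that can be caught.
my $strict_ok = eval {
    decode("UTF-8", $from_db, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    1;
} ? 1 : 0;

print "replacement characters: $replaced, strict decode ok: $strict_ok\n";
```

Decoding strictly while debugging makes a wrong encoding guess visible at the point where it happens, instead of later in the stored procedure.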

