Hello fellow Monks,
I am trying to convert a Chinese string to hexadecimal (the final form) and vice versa, but I also wanted to see the intermediate steps.
I successfully converted the string to UTF-8, and then planned to treat the encoded bytes as ASCII characters and convert them to hex with the help of String::HexConvert. If that worked, I would reverse the process to get back to the original form.
I managed to accomplish my task for both UTF-8 and UCS-2 encoding. I thought of sharing the script, since I was not able to find further information online.
A similar question is "How do I convert a sequence of hexes (D0 D6) to Chinese characters (中)?". I used the modules Text::Unidecode and String::HexConvert.
Sample code:
#!/usr/bin/perl
use utf8;
use strict;
use warnings;
use feature 'say';
use Text::Unidecode;
use Encode qw(decode encode);
use String::HexConvert ':all';

binmode( STDOUT, ':utf8' );

my $Chinese = '北亰'; # Chinese characters for Bei Jing (U+5317 U+4EB0)

say 'UTF-8';
my $utf8 = encode( 'UTF-8', $Chinese );           # characters -> UTF-8 bytes
my $ascii2hexUTF8 = ascii_to_hex($utf8);          # bytes -> hex string
# Match byte pairs directly; split(/(..)/) would also return empty fields
say join(' ', $ascii2hexUTF8 =~ /(..)/g);
my $hex2ascciiUTF8 = hex_to_ascii($ascii2hexUTF8); # hex string -> bytes
my $strUTF8 = decode( 'UTF-8', $hex2ascciiUTF8);   # bytes -> characters
say $strUTF8;
say unidecode( $strUTF8 );
say '';

say 'UCS-2';
my $ucs2 = encode("UCS-2BE", $Chinese);
my $ascii2hexUCS2 = ascii_to_hex($ucs2);
say join(' ', $ascii2hexUCS2 =~ /(..)/g);
my $hex2ascciiUCS2 = hex_to_ascii($ascii2hexUCS2);
my $strUCS2 = decode("UCS-2BE", $hex2ascciiUCS2);
say $strUCS2;
say unidecode( $strUCS2 );

__END__
$ perl chinese.pl
UTF-8
e5 8c 97 e4 ba b0
北亰
Bei Jing

UCS-2
53 17 4e b0
北亰
Bei Jing
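For comparison, the same round trip can probably be done with core Perl only, using pack/unpack with the H template instead of String::HexConvert. This is just a minimal sketch: the helper names str_to_hex and hex_to_str are made up for illustration, and I am assuming the hex string contains only hex digits and optional whitespace.

```perl
#!/usr/bin/perl
use utf8;                       # the string literal below is UTF-8 in the source
use strict;
use warnings;
use Encode qw(decode encode);

binmode( STDOUT, ':encoding(UTF-8)' );

# Hypothetical helper: characters -> space-separated hex of the encoded bytes.
sub str_to_hex {
    my ( $encoding, $string ) = @_;
    my $bytes = encode( $encoding, $string );    # characters -> bytes
    return join ' ', unpack( '(H2)*', $bytes );  # one two-digit group per byte
}

# Hypothetical helper: hex digits (whitespace optional) -> characters.
sub hex_to_str {
    my ( $encoding, $hex ) = @_;
    ( my $digits = $hex ) =~ s/\s+//g;           # drop the separators
    my $bytes = pack( 'H*', $digits );           # hex digits -> bytes
    return decode( $encoding, $bytes );          # bytes -> characters
}

my $Chinese = '北亰';
for my $encoding ( 'UTF-8', 'UCS-2BE' ) {
    my $hex  = str_to_hex( $encoding, $Chinese );
    my $back = hex_to_str( $encoding, $hex );
    print "$encoding: $hex -> $back\n";
}
```

The encoding name is passed in so that the same two helpers cover both the UTF-8 and the UCS-2BE case from the script above.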
I hope this helps someone else as much as it helped me. BR.