PerlMonks  

Encoding issues

by manishrathi (Beadle)
on Dec 01, 2012 at 20:51 UTC (#1006641)
manishrathi has asked for the wisdom of the Perl Monks concerning the following question:

I was scripting in Perl and came across an encoding issue. I am now confused by it, although it is not exactly a Perl issue. I tried asking on different blogs, but that was not much help.

I am describing my confusion here; please let me know the correct idea. Please don't just point to links, as they are more confusing. I did a lot of research, but nothing I found explains it properly.

1) If I want to use a different encoding, that encoding will assign different bits to each character than UTF-8 does. How can I apply this new encoding to the input device (keyboard), since the keyboard is already typing according to UTF-8? If my understanding is right, end users who read a page in an encoding other than UTF-8 do not face this problem, because the meta tag in the HTML header tells the client browser which encoding to use, and the browser applies it automatically. But how can a person who wants to type in this different encoding do that?

2) Secondly, if I want to type Hindi characters using a different character encoding, how can I do that if I don't have a Hindi keyboard? Do I need to type in the hex code for each character? Thanks

Re: Encoding issues
by moritz (Cardinal) on Dec 01, 2012 at 21:42 UTC
    1) If I want to use a different encoding, that encoding will assign different bits to each character than UTF-8 does. How can I apply this new encoding to the input device (keyboard), since the keyboard is already typing according to UTF-8?

    Encode::from_to can translate UTF-8 into other encodings.
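As a minimal sketch of the Encode::from_to suggestion above (the sample character and the UTF-16BE target are assumptions chosen for illustration, not part of the original reply): from_to converts a string of octets in place and returns the new length in octets, or undef on failure.

```perl
use strict;
use warnings;
use Encode qw(from_to);

# Octets of DEVANAGARI LETTER A (U+0905) encoded as UTF-8.
my $octets = "\xE0\xA4\x85";

# Convert the octets in place from UTF-8 to UTF-16BE.
my $len = from_to($octets, "UTF-8", "UTF-16BE");
defined $len or die "conversion failed";

# Show the resulting octets as hex.
my $hex = join " ", map { sprintf("%02X", ord($_)) } split(//, $octets);
print "$len octets: $hex\n";
```

The character is the same before and after; only the octets that represent it change (three octets in UTF-8, two in UTF-16BE).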

    Secondly, if I want to type Hindi characters using a different character encoding, how can I do that if I don't have a Hindi keyboard?

    Producing Hindi characters is a question of the input method that your operating system (or editor, or terminal, or whatever) offers, and it is usually not tied to a particular encoding.

    You would do well to read up on the different subsystems that are involved (the operating system's input methods, key codes, character encodings), because you seem to be mixing them up right now.
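On the "hex code" part of the question: inside Perl source, a character can be written by code point or by Unicode name without any Hindi keyboard or input method. This is a small sketch that is not from the original reply; the variable names are hypothetical, and it assumes the terminal displays UTF-8.

```perl
use strict;
use warnings;
use charnames ':full';   # enables \N{NAME}; loaded automatically on newer Perls

# Encode characters as UTF-8 when printing (assumes a UTF-8 terminal).
binmode STDOUT, ":encoding(UTF-8)";

my $by_hex  = "\x{0905}";                  # DEVANAGARI LETTER A by code point
my $by_name = "\N{DEVANAGARI LETTER A}";   # the same character by Unicode name

print "same character\n" if $by_hex eq $by_name;
printf "U+%04X %s\n", ord($by_hex), $by_hex;
```

Either spelling names the character itself; which octets end up on disk or on the wire is decided separately by the output encoding.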

Re: Encoding issues
by 2teez (Priest) on Dec 01, 2012 at 22:00 UTC

    While it is most common for modern systems to support or use UTF-8 for filehandles, you may need to use other encodings.
    You can use either the open pragma, use open qw(:std :utf8); OR
    binmode, like so: binmode FILEHANDLE, LAYER. i.e.

    binmode STDIN, ":encoding(UTF-8)";

    Please note that you can specify a different encoding in place of UTF-8. E.g.
    binmode STDIN, ":encoding(UTF-16)"; # OR binmode STDIN, ":encoding(cp1252)";
    Please check the Perl Unicode Cookbook; it will help a lot.
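To see what the encoding layer does, here is a hedged sketch that writes the same string through binmode with three different layers. It uses in-memory filehandles rather than STDIN so the octets can be inspected; the sample text, the %hex hash, and the choice of UTF-16BE (which avoids a byte order mark) are assumptions for illustration.

```perl
use strict;
use warnings;

my $text = "caf\x{E9}";   # "café" as a character string
my %hex;                  # encoding name => resulting octets in hex

for my $enc ("UTF-8", "cp1252", "UTF-16BE") {
    my $buffer = "";
    # Open an in-memory filehandle, then set its layer with binmode.
    open(my $fh, ">", \$buffer) or die "open: $!";
    binmode($fh, ":encoding($enc)") or die "binmode: $!";
    print {$fh} $text;
    close($fh) or die "close: $!";

    $hex{$enc} = join " ", map { sprintf("%02X", ord($_)) } split(//, $buffer);
    printf "%-8s %s\n", $enc, $hex{$enc};
}
```

The program's string never changes; only the layer decides which octets leave the filehandle.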

    2.
    If I get your question right:
    To set your keyboard to type another language:
    If you are on Windows, go to Control Panel --> Region and Language, click on Keyboards and Languages, click on Change Keyboards, then click on Add under Installed services. If the language you want is installed, you can add it and use it.

    If you tell me, I'll forget.
    If you show me, I'll remember.
    if you involve me, I'll understand.
    --- Author unknown to me
Re: Encoding issues
by remiah (Hermit) on Dec 02, 2012 at 01:17 UTC

    Sorry, I don't know Hindi, but I hope this is similar.
    Take Letter A and Hiragana Letter A, for example.

    #submitting A
    1.keyboard          type A 
    2.input method      no input method
    3.browser           display string A
    4.submit            sending A to server 
    5.cgi on server     receive A from client
    
    #submitting HIRAGANA Letter A
    1.keyboard          type A     
    2.input method      convert to HIRAGANA_LETTER_A "???"
    3.browser           display HIRAGANA_LETTER_A "???"
    4.submit            sending HIRAGANA_LETTER_A encoded with "UTF-8"
    5.cgi on server     receive "UTF-8" HIRAGANA_LETTER_A from client
    
    "???" stands for an encoding I don't know; it could be UTF-16 or some system encoding.
    Maybe you are confused because you lack a Hindi input method.
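The last two steps of the second walk-through can be sketched in Perl. This is an illustration under assumptions, not the poster's code: UTF-16BE stands in for the unknown "???" encoding, and the helper hex_octets is hypothetical.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: render a string of octets as space-separated hex.
sub hex_octets {
    my ($octets) = @_;
    return join " ", map { sprintf("%02X", ord($_)) } split(//, $octets);
}

my $char = "\x{3042}";                    # HIRAGANA LETTER A

my $utf16 = encode("UTF-16BE", $char);    # one possible in-memory form (steps 2-3)
my $utf8  = encode("UTF-8", $char);       # what is sent on submit (step 4)

# Step 5: the CGI script decodes the received octets back into a character.
my $received = decode("UTF-8", $utf8);

print "UTF-16BE: ", hex_octets($utf16), "\n";
print "UTF-8:    ", hex_octets($utf8), "\n";
print "round trip ok\n" if $received eq $char;
```

The octets differ between encodings, but decoding with the right encoding name recovers the same character on the server.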
