http://www.perlmonks.org?node_id=1051031

msinfo has asked for the wisdom of the Perl Monks concerning the following question:

Hi monks,

My recent project requires me to use UTF characters for input as well as output.

I tried the things below, but they never worked for me. I am using a Windows machine.

use utf8;
use Encode;
#binmode(STDOUT, ":utf8")
binmode STDOUT, ":utf8";
use open qw/:std :utf8/;
use open OUT => ':utf8';
use feature 'unicode_strings';
use Unicode::UCD 'charinfo';
print "एचटीएमल"; # actually it was html written in devnagri
# browser and cmd prompt couldn't render it in utf format

Replies are listed 'Best First'.
Re: perl writing urf8/utf16 output to cmd screen and html page
by kcott (Archbishop) on Aug 27, 2013 at 13:28 UTC

    G'day msinfo,

    When posting Unicode, it's best to use <pre>...</pre> or <tt>...</tt> tags instead of <code>...</code> or <c>...</c> tags. Here's an example:

    $ perl -Mstrict -Mwarnings -le '
        use utf8;
        binmode STDOUT => ":utf8";
        print "एचटीएमल";
    '
    एचटीएमल
    

    Can you run that code on your command-line? [I don't use Perl on MSWin but I believe you'll need to swap the single- and double-quotes.] What output do you get?

    The entity references also render correctly as normal text (at least for me). With this markup:

    <p>&#2319;&#2330;&#2335;&#2368;&#2319;&#2350;&#2354;</p>

    I get this:

    एचटीएमल

    How does that look in your browser?
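For anyone who wants to produce that kind of markup from a script, here is a minimal sketch. The helper name `to_entities` is made up for illustration, and the file is assumed to be saved as UTF-8 so that `use utf8` can read the literal. Because the output is pure ASCII, it does not depend on how the console or browser is configured.

```perl
use strict;
use warnings;
use utf8;    # assumption: this source file is saved as UTF-8

# Hypothetical helper: replace every non-ASCII character with a decimal
# numeric character reference and leave ASCII characters untouched.
sub to_entities {
    my ($text) = @_;
    $text =~ s/([^\x00-\x7F])/'&#' . ord($1) . ';'/ge;
    return $text;
}

# The output contains only ASCII, so no binmode on STDOUT is needed.
print to_entities("<p>एचटीएमल</p>"), "\n";
```

If the literal is read correctly, the printed line should match the `<p>...</p>` markup quoted above.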

    [I'm unfamiliar with this script. You say it's "devnagri". I searched for this and couldn't find it; the search engine suggested "Devanagari". Is that what you meant?]

    -- Ken

      There seems to be another problem with writing Unicode characters: when I change the encoding in Notepad++ to Unicode, the web page doesn't render in the browser. When I switch back to ANSI it renders properly, but then I can't write Unicode characters.
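One way to take the editor's encoding out of the picture while testing is to keep the source file pure ASCII and write the characters as `\x{...}` escapes. This is only a sketch of that workaround, not a diagnosis of the Notepad++ setting. The hex values are the decimal entity codes from the first reply converted to hexadecimal.

```perl
use strict;
use warnings;

# The same seven characters written as escapes (decimal 2319 2330 2335
# 2368 2319 2350 2354 in hex), so the file contains only ASCII and
# "use utf8" is not required.
my $text = "\x{090F}\x{091A}\x{091F}\x{0940}\x{090F}\x{092E}\x{0932}";

# Encode the character string to UTF-8 bytes on the way out.
binmode STDOUT, ':encoding(UTF-8)';
print $text, "\n";
```

If this version works when saved as ANSI, the remaining problem is probably in how the file with literal characters is being saved rather than in the output layer.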
Re: perl writing urf8/utf16 output to cmd screen and html page
by Anonymous Monk on Aug 27, 2013 at 01:48 UTC
      No, that was done automatically by PerlMonks. I tried typing the UTF-8 text with a language keyboard, but it got changed into character codes.
Re: perl writing urf8/utf16 output to cmd screen and html page
by moritz (Cardinal) on Aug 27, 2013 at 05:24 UTC
      For HTML I used this:
      print $cgi->start_html({-head=>meta({-http_equiv => 'Content-Type', -content => 'text/html', -charset=>'utf-8'}), -title=>'Test'});
      I am currently going through the links provided in the first reply.
Re: perl writing urf8/utf16 output to cmd screen and html page
by thewebsi (Scribe) on Aug 27, 2013 at 07:57 UTC

    I ran your test script (replacing the character entity codes with actual UTF-8 characters) and it works fine for me. So the problem is with your command prompt console not interpreting UTF-8 characters correctly. Try printing a UTF-8 character manually in the console using echo Ʃ for example, and see what happens.
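To separate "Perl produced the wrong bytes" from "the console cannot display them", a small check like the following may help. It is a sketch that encodes two of the characters explicitly with the core `Encode` module and prints the resulting octets as hex, which any console can show.

```perl
use strict;
use warnings;
use Encode qw(encode);

my $text  = "\x{090F}\x{091A}";       # first two characters of the test string
my $bytes = encode('UTF-8', $text);   # character string -> UTF-8 octets

# Hex dump of the octets; the dump itself is plain ASCII.
my $hex = join ' ', map { sprintf '%02X', ord($_) } split //, $bytes;
print "$hex\n";    # prints: E0 A4 8F E0 A4 9A
```

If the hex matches, Perl's side is fine and the console is the thing to look at. On Windows, `chcp` shows the active code page and `chcp 65001` switches it to UTF-8; whether the glyphs then appear also depends on the console font, so treat that as something to try rather than a guaranteed fix.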

    I also ran your CGI script (adding actual UTF-8 characters for testing) and it worked fine too. Check your browser's character encoding setting (for example, in Firefox it's under View->Character Encoding). If it isn't UTF-8 already, set it manually for testing.
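The browser normally picks its encoding from the `Content-Type` header, so it is worth checking that the script declares the charset there. Below is a minimal sketch that builds the response by hand, without CGI.pm; the helper name `build_page` is invented for illustration, and it is an assumption that the web server accepts headers terminated by a bare newline.

```perl
use strict;
use warnings;

# Hypothetical helper: build a complete CGI response as a character string.
# The charset is declared in the HTTP header and again in a <meta> tag.
sub build_page {
    my ($body) = @_;
    return "Content-Type: text/html; charset=utf-8\n\n"
         . "<!DOCTYPE html>\n"
         . "<html><head><meta charset=\"utf-8\"><title>Test</title></head>\n"
         . "<body><p>$body</p></body></html>\n";
}

# Encode the character string to UTF-8 bytes on output.
binmode STDOUT, ':encoding(UTF-8)';
print build_page("\x{090F}\x{091A}\x{091F}\x{0940}\x{090F}\x{092E}\x{0932}");
```

With the charset declared in the header, the browser should not need a manual encoding override.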

    There are also some quick guides on setting up UTF-8 in Perl.