PerlMonks  

print UTF-8 problem

by HelenCr (Monk)
on Feb 15, 2012 at 19:13 UTC (#954045)
HelenCr has asked for the wisdom of the Perl Monks concerning the following question:

Hi wizards, I seek your wisdom. I am running ActivePerl 5.14 on Windows 7.
I am trying to write a program that will read in a conversion table, then work through a file and replace certain patterns with other patterns, all of it in Unicode (UTF-8). Here is the beginning of the program:
    #!/usr/local/bin/perl
    # Load a conversion table from CONVTABLE to %ConvTable.
    # Then find matches in a file and convert them.
    use strict;
    use warnings;
    use Encode;
    use 5.014;
    use utf8;
    use autodie;
    use warnings qw< FATAL utf8 >;
    use open qw< :std :utf8 >;
    use charnames qw< :full >;
    use feature qw< unicode_strings >;

    my ($i,$j,$InputFile, $OutputFile,$word,$from,$to,$linetoprint);
    my (@line, @lineout);
    my %ConvTable; # Conversion hash

    print 'Conversion table: opening file: E:\My Documents\Perl\Conversion table.txt'."\n";
    my $sta= open (CONVTABLE, "<:encoding(utf8)", 'E:\My Documents\Perl\Conversion table.txt');
    binmode STDOUT, ':utf8'; # output should be in UTF-8

    # Load conversion hash
    while (<CONVTABLE>) {
        chomp;
        print "$_\n";
        # etc ...
        # etc ...
It turns out that at this point, it says:
    wide character in print at (eval 155)[E:/Active Perl/lib/Perl5DB.pl:640] line 2, <CONVTABLE> line 1, etc...
Why is that? I think I've gone through all the necessary prescriptions for correct handling of Unicode, decoding and encoding into UTF-8?
And how to fix it?
TIA
Helen
Note: I may cross-post on StackOverflow

Re: print UTF-8 problem
by Anonymous Monk on Feb 15, 2012 at 19:25 UTC

    It turns out that at this point, it says:

    perl5db.pl is not your program :) What does your program say when you run it without the debugger?
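One plausible reading of the message, not confirmed anywhere in this thread: the debugger prints through its own output handle, which may not carry the `:utf8` layer that the program put on STDOUT. The sketch below only reproduces the general mechanism. It prints the same wide string to two in-memory handles (both hypothetical stand-ins, not the debugger's real handle), one without and one with an encoding layer, and collects any warnings.

```perl
use strict;
use warnings;

# Collect warnings instead of printing them, so the effect is easy to inspect.
my @warnings;
local $SIG{__WARN__} = sub { push @warnings, @_ };

# Four Hebrew letters; every code point is above 0xFF, i.e. "wide".
my $text = "\x{05E9}\x{05DC}\x{05D5}\x{05DD}";

# Handle WITHOUT an encoding layer: perl warns "Wide character in print".
open my $raw, '>', \my $raw_buffer or die "open: $!";
print {$raw} $text;
close $raw;

# Handle WITH an explicit UTF-8 layer: the same print is silent.
open my $enc, '>:encoding(UTF-8)', \my $enc_buffer or die "open: $!";
print {$enc} $text;
close $enc;

print scalar(@warnings), " warning(s) collected\n";
print "  $_" for @warnings;
```

If only the layerless handle warns, that is consistent with the warning in the question coming from the debugger's handle rather than from the program's own STDOUT.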

      Running the program without the debugger, it says:
          Name "main::INPUT" used only once: possible typo at Conv.pl line 58.
          Name "main::CONVTABLE" used only once: possible typo at Conv.pl line 26.
          Name "main::OUTPUT" used only once: possible typo at Conv.pl line 71.
          Conversion table: opening file: E:\My Documents\Perl\Conversion table.txt
          ∩╗┐England, Germany
          he, she
          the, HOMHOM
          <╫╫╫╫│11> <╫╫╫╫╫╫>
          <╫ ╫╫│ ╫> <╫╫╫╫>
          <╫╫│╫╫│╫> <╫╫╫╫╫>
          <Γ╫╫╫╫╫╫><╫╫⌐╫╫╫╫╫>
      It prints gibberish to the Windows console (aka "DOS box"), instead of the right UTF-8 characters.
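The gibberish is what you would expect if the console decodes the program's UTF-8 bytes with a legacy code page. As a sketch, assuming the console uses cp437 (the thread never states the actual code page), the helper below (a hypothetical name, not part of the original program) encodes a string as UTF-8 and then decodes those bytes as cp437. Applied to a byte order mark, it yields three characters, which would match the three stray characters at the start of the table output if the file begins with a BOM.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Hypothetical helper: show how a string's UTF-8 bytes look when a console
# interprets them with a different code page.
sub as_seen_in_codepage {
    my ($text, $codepage) = @_;
    my $bytes = encode('UTF-8', $text);   # what the program writes
    return decode($codepage, $bytes);     # what the console displays
}

my $bom  = "\x{FEFF}";                    # byte order mark: bytes EF BB BF in UTF-8
my $seen = as_seen_in_codepage($bom, 'cp437');   # cp437 is an assumption

# Print the code points in ASCII so the result is readable on any console.
printf "U+%04X ", ord for split //, $seen;
print "\n";
# prints: U+2229 U+2557 U+2510
```

Under the same assumption, the lead byte 0xD7 that starts every Hebrew letter in UTF-8 decodes to U+256B, the box-drawing character repeated throughout the output.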

        It prints gibberish to the Windows console (aka "DOS box"), instead of the right UTF-8 characters.

        What makes you think your console understands UTF-8? Type chcp at the prompt, prepend "cp" to the number, and use that as the encoding.
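The advice above can be sketched in Perl roughly as follows. The helper name `encoding_from_chcp` and the UTF-8 fallback are assumptions made for illustration, and whether Encode knows a given `cpNNN` name depends on the installation.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): turn the text printed by
# `chcp`, e.g. "Active code page: 437", into an encoding name like "cp437".
sub encoding_from_chcp {
    my ($chcp_output) = @_;
    return undef unless defined $chcp_output;
    # Take the last run of digits, because the label text is localized.
    my ($number) = $chcp_output =~ /(\d+)\D*\z/;
    return undef unless defined $number;
    # 65001 is the Windows code page number for UTF-8.
    return 'UTF-8' if $number == 65001;
    return "cp$number";
}

# Only query the console on Windows.
my $encoding = $^O eq 'MSWin32' ? encoding_from_chcp(scalar `chcp`) : undef;
$encoding //= 'UTF-8';   # assumption: fall back to UTF-8 when chcp is unavailable

# Encode output for the console's code page instead of forcing UTF-8.
binmode STDOUT, ":encoding($encoding)";
print "Console encoding: $encoding\n";
```

Characters that the code page cannot represent will still not display correctly. Matching the code page only makes sure the representable ones come out right.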

        There aren't 71 lines in what you posted. Please run the code you actually posted.
Re: print UTF-8 problem
by ikegami (Pope) on Feb 15, 2012 at 20:08 UTC
      I will return in a couple of hours
