
Unicode UTF16 - Unknown encoding error

by Anonymous Monk
on Jun 25, 2007 at 10:41 UTC ( #623138=perlquestion )
Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi, I'm working with files that consist of Unicode (UTF16) characters. I've used the following code to read the file and got the desired output:
    open (FSIN, "$_") || die ("Unable to open SGML.3d file $_");
    my $sfile = <FSIN>;
    $sfile = encode("ascii", decode("UTF16", $sfile));
Now, the actual problem begins here. After converting the .pl script to an .exe, I ran the .exe on the same input file, but I'm getting an error like 'Unknown encoding 'UTF16' at'. Can anyone throw some light on this? cheers--B

Replies are listed 'Best First'.
Re: Unicode UTF16 - Unknown encoding error
by shmem (Chancellor) on Jun 25, 2007 at 11:36 UTC
    'Unknown encoding 'UTF16' at'.

    You want "UTF-16" or "UTF-16LE". See Encode::Unicode. Also, Encode has from_to, so I'd probably write

    use Encode qw(from_to);
    open (FSIN, "$_") || die ("Unable to open SGML.3d file $_");
    my $sfile = <FSIN>;
    from_to($sfile, 'UTF-16LE', 'ascii'); # or latin1, iso-8859-1 ...


    _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                  /\_¯/(q    /
    ----------------------------  \__(m.====·.(_("always off the crowd"))."·
    ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
      Shmem, I'm still getting the same error. --B
        Looking into the Encode Module -
        sub getEncoding {
            my ($class, $name, $skip_external) = @_;
            ref($name) && $name->can('renew') and return $name;
            exists $Encoding{$name} and return $Encoding{$name};
            my $lc = lc $name;
            exists $Encoding{$lc} and return $Encoding{$lc};
            my $oc = $class->find_alias($name);
            defined($oc) and return $oc;
            $lc ne $name and $oc = $class->find_alias($lc);
            defined($oc) and return $oc;
            unless ($skip_external) {
                if (my $mod = $ExtModule{$name} || $ExtModule{$lc}) {
                    $mod =~ s,::,/,g;
                    $mod .= '.pm';
                    eval { require $mod; };    # <--- HERE
                    exists $Encoding{$name} and return $Encoding{$name};
                }
            }
            return;
        }

        It seems that the line marked with <--- HERE loads the necessary encodings at runtime. I don't know about perl2exe, PerlApp or such, but a packager will definitely not know which modules your script needs without running it.

        Put this at the end of your script:

        print STDERR "$_\n" for grep {/Encode/} sort keys %INC;

        and explicitly include any Encode/*.pm files you see. I guess you need to include Encode/Unicode.pm, so start your script with

        use Encode;
        use Encode::Unicode;
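A minimal sketch of how one might check, from inside the script, that the UTF-16 encodings actually resolve once Encode::Unicode is loaded at compile time. It assumes a standard Encode installation; the %resolved hash and the list of names are illustrative only, and whether a given packager picks up the explicit `use` line is something to verify with that tool.

```perl
use strict;
use warnings;
use Encode qw(find_encoding);
use Encode::Unicode;    # explicit, so a static dependency scan can see it

# Illustrative list of encoding names to probe.
my @names = ('UTF-16', 'UTF-16LE', 'UTF-16BE');

# find_encoding returns an encoding object, or undef if the name is unknown.
my %resolved;
for my $name (@names) {
    my $enc = find_encoding($name);
    $resolved{$name} = defined $enc ? $enc->name : undef;
}

for my $name (@names) {
    printf STDERR "%-9s => %s\n", $name,
        defined $resolved{$name} ? $resolved{$name} : 'NOT FOUND';
}
```

Running this both as a plain script and inside the packaged .exe should show whether the encodings go missing only after packaging.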


        _($_=" "x(1<<5)."?\n".q·/)Oo.  G°\        /
                                      /\_¯/(q    /
        ----------------------------  \__(m.====·.(_("always off the crowd"))."·
        ");sub _{s./.($e="'Itrs `mnsgdq Gdbj O`qkdq")=~y/"-y/#-z/;$e.e && print}
        Maybe you need binmode on that filehandle? Windows almost always needs binmode.
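A rough sketch of the binmode suggestion, assuming the input is little-endian UTF-16 with a BOM. The temporary file and its contents are made up for illustration; the point is only that the handle is put into binary mode and the whole file is read as octets before decoding, so that no CRLF translation can split a two-byte code unit.

```perl
use strict;
use warnings;
use Encode qw(decode);
use File::Temp qw(tempfile);

# Hypothetical sample data: BOM followed by "hi\r\n" in UTF-16LE.
my ($tmp_fh, $path) = tempfile(UNLINK => 1);
binmode $tmp_fh;
print {$tmp_fh} "\xFF\xFE" . "h\0i\0\r\0\n\0";
close $tmp_fh or die "Unable to close $path: $!";

open my $in, '<', $path or die "Unable to open $path: $!";
binmode $in;                            # raw octets, no newline translation
my $octets = do { local $/; <$in> };    # slurp the whole file
close $in;

# The BOM tells the 'UTF-16' decoder which byte order to use.
my $text = decode('UTF-16', $octets);
```

Slurping rather than reading line by line is a deliberate choice here: a line-oriented read on raw UTF-16 octets stops after the 0x0A byte and leaves the second byte of the newline behind.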

        I'm not really a human, but I play one on earth. Cogito ergo sum a bum
Re: Unicode UTF16 - Unknown encoding error
by john_oshea (Priest) on Jun 25, 2007 at 11:01 UTC

    At first glance, it would appear that whatever you're using to convert from .pl -> .exe is not bundling in the modules you're specifying in your code (Encode by the looks of it). What are you using to do the conversion?

      I'm using the following module for the conversion: use Encode; I'm unable to paste my input file here, but I can give some more information. If I open the input file in Word (with "Confirm conversion at Open" enabled), it opens in 'Unicode' mode. If I open the same file in Notepad, WordPad, or some text editors (UltraEdit), it displays properly. If I open it in other text editors (Epsilon, Emacs), it shows junk characters.

        Yes, but what are you using to convert your Perl script to an .exe? PAR, PerlApp, something else? That's going to be where your problem is by the sounds of it.

Re: Unicode UTF16 - Unknown encoding error
by ikegami (Pope) on Jun 26, 2007 at 05:12 UTC

    Encoding UTF-16 requires that a BOM be provided in the data to decode to determine whether UTF-16le or UTF-16be was used. The error you are getting is the error that is generated when the first character is not a UTF-16le or UTF-16be BOM character.

    To decode UTF-16le data without a leading BOM, specify UTF-16le as the charset.
    To decode UTF-16be data without a leading BOM, specify UTF-16be as the charset.

    Update: Never mind. That's not the error message given by the case I've explained.
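The BOM rules described above can be sketched as follows. The byte strings are invented sample data. How plain 'UTF-16' treats input with no BOM appears to depend on the installed Encode version, so that case is only wrapped in eval here and no particular outcome is claimed.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Invented sample data: the text "hi" in both byte orders.
my $le_no_bom = "h\0i\0";
my $be_no_bom = "\0h\0i";
my $le_bom    = "\xFF\xFE" . $le_no_bom;

# With a BOM, plain 'UTF-16' works out the byte order itself.
my $from_bom = decode('UTF-16', $le_bom);

# Without a BOM, name the byte order explicitly.
my $from_le = decode('UTF-16LE', $le_no_bom);
my $from_be = decode('UTF-16BE', $be_no_bom);

# Plain 'UTF-16' on BOM-less data: trap any failure rather than assume one.
my $maybe = eval { decode('UTF-16', $le_no_bom) };
my $err   = $@;
print STDERR "UTF-16 without BOM: ",
    ($err ? "died: $err" : "returned a string\n");
```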
