Hi, I've got a little problem reading Unicode characters from an Excel sheet. I thought use utf8; would do the trick, but from Perl's point of view my cells contain only question marks instead of the Unicode characters:
Text => ???
Value => ???
Value2 => ???
This is perl, v5.6.1 built for MSWin32-x86-multi-thread, from ActiveState. I tried it with Spreadsheet::ParseExcel before, with the same result. Feels like I'm forgetting something important :-\
    Actually, I was faced with the same problem.

    After fixing other issues (thank god for PerlMonks!), I found the cure:

    use Win32::OLE qw(CP_UTF8);
    ...
    # Work in unicode!
    $Win32::OLE::CP = CP_UTF8;
    ...
    You can use Unicode::String to unpack() the string to look at each unicode char (which was what I had to do).


      Actually, using Unicode::String as a container for your data is not needed (in fact, it will croak on accented chars and other punctuation). Just use the string as you 'normally' do, e.g. to look at each char:

      for my $uchar (split(//, $text)) {
          my $ord = ord($uchar);
          ...
      }
      While it seems natural to me now, it took me some time to realize that the cause of my troubles with Unicode strings was *using* Unicode::String... :-)
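
      For anyone who wants to try the per-character loop without Excel at hand, here is a minimal sketch. It assumes a Perl with the core Encode module (5.8 or later, so not the 5.6.1 mentioned in the question) and assumes the text arrives as raw UTF-8 bytes; if your strings are already character strings, skip the decode step. The helper name code_points is hypothetical, not part of any module.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical helper: takes raw UTF-8 bytes and returns the list of
# Unicode code points, using the same split // + ord approach as above.
sub code_points {
    my ($bytes) = @_;
    my $text = decode('UTF-8', $bytes);   # bytes -> Perl character string
    return map { ord } split //, $text;
}

# "\xC3\xA9" is the UTF-8 encoding of U+00E9 (e with acute accent).
my @points = code_points("caf\xC3\xA9");
printf "U+%04X\n", $_ for @points;
```

      Without the decode step, split // would walk the individual bytes, and the accented character would show up as two values (0xC3 and 0xA9) instead of one code point.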


    Do you have an example of your Unicode string so that I can test? Odds are you are going to be playing with Variant (specifically VT_BSTR), but I don't want to steer you in the wrong direction.



