PerlMonks  

How to know if string is utf8 encoded or decoded.

by Anonymous Monk
on Jul 25, 2019 at 12:10 UTC (#11103384=perlquestion)

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi Monks,

How can I tell whether a string or filename is UTF-8 encoded or already decoded, and how can I avoid encoding or decoding a string twice?
For me, Encode::decode works for one string but not for another.
For example, it works with the filename "test1℗ὓ.txt" but not with the filename "1669-SCC-H˘pitauxdeSaint-Maurice-POC.PIF".

Thank you.

Replies are listed 'Best First'.
Re: How to know if string is utf8 encoded or decoded.
by haukex (Chancellor) on Jul 25, 2019 at 13:17 UTC
Re: How to know if string is utf8 encoded or decoded.
by choroba (Archbishop) on Jul 25, 2019 at 12:17 UTC
    What do you mean by "is working"? The following works when the script is saved as UTF-8 and run in a UTF-8 terminal:
    #!/usr/bin/perl
    use warnings;
    use strict;
    use feature qw{ say };
    use utf8;
     
    use Encode;
     
    my @strings = ('test1℗ὓ.txt', '1669-SCC-H˘pitauxdeSaint-Maurice-POC.PIF');
     
    for my $string (@strings) {
        say encode('UTF-8', $string);
    }
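
On the "encoded twice" part of the question, a minimal sketch may help. It assumes a made-up sample string (`caf\x{E9}`, not one of the filenames from the question) and only shows how double encoding becomes visible in the byte length, and that decoding exactly once restores the characters:

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# Sample data is an assumption for illustration only.
my $chars = "caf\x{E9}";                 # 4 characters ("cafe" with e-acute)
my $once  = encode('UTF-8', $chars);     # 5 bytes: e-acute becomes C3 A9
my $twice = encode('UTF-8', $once);      # 7 bytes: C3 and A9 are each encoded again

printf "characters: %d, encoded once: %d, encoded twice: %d\n",
    length($chars), length($once), length($twice);

# Decoding exactly once restores the original characters.
my $back = decode('UTF-8', $once);
print $back eq $chars ? "round trip ok\n" : "round trip failed\n";
```

The usual way to avoid the problem is to decode input once where it enters the program and encode once where it leaves, rather than trying to guess the state of a string later.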
    


      Thank you for the reply.
      The line below is the one that seems to have the issue:
      eval { $result = ConvertEncoding($string, "utf8", 'MIME-Header') };
      # ConvertEncoding uses the code below to decode the string:

      eval { $unicode = Encode::decode($from, $str); };
      if ($@) {
          &ConvertEncodingError("($from -> utf8)\n$@");
          return $str;
      }

        Wrapping the code in eval only makes sense if you ask decode to die when it cannot decode the input:
        eval { $unicode = Encode::decode($from, $str, Encode::FB_CROAK); };
        if ($@) {
            &ConvertEncodingError("($from -> utf8)\n$@");
            return $str;
        }
        You should be aware that in case of a decoding error, $str will be overwritten, because a CHECK value without Encode::LEAVE_SRC lets decode modify its input in place.
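
As a rough sketch of that advice: the helper name `try_decode` and the sample byte strings below are hypothetical, not from the original code. It combines `Encode::FB_CROAK` with `Encode::LEAVE_SRC` so that a failed decode dies inside the eval and the input is left untouched:

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical helper: returns the decoded character string, or undef
# if $bytes is not valid input for the encoding named in $from.
sub try_decode {
    my ($from, $bytes) = @_;
    my $chars = eval {
        # FB_CROAK makes decode die on malformed input;
        # LEAVE_SRC keeps decode from modifying $bytes in place.
        Encode::decode($from, $bytes, Encode::FB_CROAK | Encode::LEAVE_SRC);
    };
    return $chars;   # undef when the eval died
}

my $good = "caf\xC3\xA9s";   # valid UTF-8 bytes (assumed sample)
my $bad  = "caf\xE9s";       # Latin-1 byte \xE9, malformed as UTF-8

for my $bytes ($good, $bad) {
    my $chars = try_decode('UTF-8', $bytes);
    print defined $chars
        ? "decoded to " . length($chars) . " characters\n"
        : "not valid UTF-8, left as is\n";
}
```

Note that a successful strict decode only shows that the bytes are valid UTF-8. It cannot prove that the string was not already decoded, so this is a heuristic rather than a reliable test.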

        eval {
            $unicode = Encode::decode($from, $str);   # here $from is 'MIME-Header'
        };
        if ($@) {
            &ConvertEncodingError("($from -> utf8)\n$@");
            return $str;
        }
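
For reference, here is a small sketch of what decoding with 'MIME-Header' looks like on input that really is a MIME encoded-word. The header value is an invented sample, not data from this thread. The result is a character string, so it still has to be encoded once on output:

```perl
use strict;
use warnings;
use Encode qw(decode);

# Invented sample: a Q-encoded word carrying UTF-8 bytes.
my $header = '=?UTF-8?Q?caf=C3=A9?=';

# decode returns Perl characters, not UTF-8 bytes.
my $name = decode('MIME-Header', $header);

printf "length: %d, last code point: U+%04X\n",
    length($name), ord(substr($name, -1));

# Encode exactly once, on the way out.
binmode(STDOUT, ':encoding(UTF-8)');
print "$name\n";
```

A plain filename that is not wrapped in `=?charset?Q?...?=` or `=?charset?B?...?=` contains no encoded-words, which may be one reason the two filenames in the question behave differently.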

Node Type: perlquestion [id://11103384]
Approved by haukex