
Re^5: Why does Encode::Repair only correctly fix one of these two tandem characters?

by ikegami (Pope)
on Aug 11, 2014 at 01:49 UTC


in reply to Re^4: Why does Encode::Repair only correctly fix one of these two tandem characters?
in thread Why does Encode::Repair only correctly fix one of these two tandem characters?

The most common garbage produced by Perl code is a mix of UTF-8 and latin-1. It happens when you forget to specify the output encoding.

print "\N{LATIN CAPITAL LETTER E WITH ACUTE}";
print "\N{BLACK SPADE SUIT}";

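The first string contains only characters below 256, so Perl prints each one as a single byte and has no way of knowing you did something wrong. The second string contains a character that doesn't fit in a byte, so Perl warns ("Wide character in print") and guesses you meant to encode it using UTF-8. You end up with a mix of raw code points (effectively latin-1) and UTF-8.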
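To see the mix concretely, here is a minimal sketch that captures what print writes to a handle and dumps it as hex. The helper name printed_bytes and the use of an in-memory filehandle are my own choices for illustration; the second half shows the usual cure of declaring the output encoding on the handle.

```perl
use strict;
use warnings;
use charnames ':full';   # needed for \N{...} on perls older than 5.16

# Hypothetical helper: print $string to an in-memory handle opened with
# the given layer ('' for none) and return the raw bytes that came out.
sub printed_bytes {
    my ($string, $layer) = @_;
    my $buffer = '';
    open(my $fh, ">$layer", \$buffer) or die "Cannot open in-memory handle: $!";
    {
        no warnings 'utf8';   # hide "Wide character in print" for the demo
        print {$fh} $string;
    }
    close($fh) or die "Cannot close in-memory handle: $!";
    return $buffer;
}

my $e_acute = "\N{LATIN CAPITAL LETTER E WITH ACUTE}";   # U+00C9
my $spade   = "\N{BLACK SPADE SUIT}";                    # U+2660

# No encoding layer: U+00C9 goes out as one byte, U+2660 as UTF-8.
my $mixed = printed_bytes($e_acute, '') . printed_bytes($spade, '');
print unpack('H*', $mixed), "\n";   # c9e299a0

# With an encoding layer, both characters are encoded consistently.
my $clean = printed_bytes($e_acute, ':encoding(UTF-8)')
          . printed_bytes($spade,   ':encoding(UTF-8)');
print unpack('H*', $clean), "\n";   # c389e299a0
```

For STDOUT, the equivalent of the second case is binmode(STDOUT, ':encoding(UTF-8)') or use open ':std', ':encoding(UTF-8)' near the top of the script.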

This can be fixed using Encoding::FixLatin.
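A rough sketch of the repair, assuming Encoding::FixLatin is installed from CPAN and that the damaged data is the four-byte string from the print example (a latin-1 byte followed by a UTF-8 sequence). The variable names are made up for illustration.

```perl
use strict;
use warnings;
use Encoding::FixLatin qw(fix_latin);

# Mixed input: 0xC9 is a lone latin-1 byte (E with acute),
# 0xE2 0x99 0xA0 is a well-formed UTF-8 sequence (black spade suit).
my $mixed = "\xC9\xE2\x99\xA0";

# fix_latin leaves well-formed UTF-8 sequences alone and treats the
# remaining high bytes as latin-1, returning a decoded character string.
my $fixed = fix_latin($mixed);

# Declare the output encoding so the repaired string is not mangled again.
binmode(STDOUT, ':encoding(UTF-8)');
print join(' ', map { sprintf 'U+%04X', ord } split //, $fixed), "\n";
print "$fixed\n";
```

Note that this repairs already-damaged data after the fact; setting the output encoding at the source avoids producing the mix in the first place.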
