Re^4: text encodings and perl

by andal (Hermit)
on Nov 15, 2010 at 12:51 UTC ( #871460=note )

in reply to Re^3: text encodings and perl
in thread text encodings and perl

Summary: Strings internally stored as Latin 1 can be perfectly fine text strings. Trying to use is_utf8 to determine whether a string holds characters or octets is wrong.

Well, I never said anything against this truth. I guess the misunderstanding comes from the use of the terms "characters" and "octets". These terms are used by perlunicode, so I've used them here as well. In no way am I implying that strings without the utf8 flag will never hold "characters". Of course perl will find "characters" in those strings in the contexts where it should find "characters". The opposite is also true: perl will find "octets" in strings with the utf8 flag set when the context demands it.
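To make the point about the flag concrete, here is a minimal sketch (the variable names are mine, purely for illustration). It builds the same text twice, once in perl's internal Latin-1 storage and once upgraded to internal UTF-8, and shows that only the flag and the octet-level view differ.

```perl
use strict;
use warnings;
use bytes ();    # load bytes::length without enabling the pragma

# "\xE9" is U+00E9. Perl stores this literal as one byte per
# character, so the utf8 flag is off.
my $latin1 = "caf\xE9";

# Same characters, but force the internal UTF-8 representation.
my $upgraded = $latin1;
utf8::upgrade($upgraded);

printf "flag: %d vs %d\n",
    utf8::is_utf8($latin1)   ? 1 : 0,
    utf8::is_utf8($upgraded) ? 1 : 0;          # flag: 0 vs 1

# Character context: both are the same four-character text string.
printf "equal: %d\n", $latin1 eq $upgraded ? 1 : 0;              # equal: 1
printf "chars: %d vs %d\n", length($latin1), length($upgraded);   # chars: 4 vs 4

# Octet context: the internal storage shows through.
printf "octets: %d vs %d\n",
    bytes::length($latin1), bytes::length($upgraded);            # octets: 4 vs 5
```

So the flag tells you how perl stores the string, not whether the developer meant it as text or as binary data.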

In my original post, the word "character" stood for CORRECT characters, not just whatever characters perl deduces. So, if the developer called Encode::decode, the character values will be correct; otherwise they'll be correct only if the octets happen to be Latin-1 encoded. I hope this clarifies things.
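A small sketch of what I mean by "correct", assuming the octets arrive as UTF-8 from a file or socket (the sample data and variable names are made up for the example):

```perl
use strict;
use warnings;
use Encode qw(decode);

# Octets as they might arrive from outside: U+00E9 encoded as UTF-8.
my $utf8_octets = "\xC3\xA9";

# Without decoding, perl treats each octet as one Latin-1 character,
# so it deduces two characters (U+00C3 and U+00A9).
my $undecoded = $utf8_octets;

# With decoding, the two octets become the single character U+00E9.
my $decoded = decode('UTF-8', $utf8_octets);

printf "undecoded: %d chars\n", length($undecoded);   # undecoded: 2 chars
printf "decoded: %d char, U+%04X\n",
    length($decoded), ord($decoded);                  # decoded: 1 char, U+00E9

# If the octets really are Latin-1, skipping decode happens to give
# the same characters that an explicit decode would.
my $latin1_octets = "\xE9";
my $lucky = decode('ISO-8859-1', $latin1_octets) eq $latin1_octets;
printf "latin1 matches without decode: %d\n", $lucky ? 1 : 0;   # 1
```

In the undecoded case perl still finds "characters", they are just not the ones the sender meant.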
