PerlMonks  

Re^2: text encodings and perl

by Anonymous Monk
on Nov 15, 2010 at 20:30 UTC


in reply to Re: text encodings and perl
in thread text encodings and perl

... Unicode is such a system.
This is just so wrong. For one, Unicode is not an encoding. Rather, UTF-8, UTF-16 etc. are encodings. And a rather common one of them, UTF-8, is variable-width, i.e. not the same number of bytes per character...


variable-width encodings
by tchrist (Pilgrim) on Apr 10, 2011 at 14:24 UTC
    For one, Unicode is not an encoding. Rather, UTF-8, UTF-16 etc. are encodings. And a rather common one of them, UTF-8, is variable-width, i.e. not the same number of bytes per character.

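    Both UTF‑8 and UTF‑16 are variable‐width encodings. The essential difference is the size of the code units: 8 bits for UTF‑8 and 16 bits for UTF‑16. There is a great deal of Java and Windows code (but not necessarily both) out there that screws this up by treating UTF‑16 as if it were UCS‑2. It very much is not: UCS‑2 is limited to one 16‑bit unit per character, whereas UTF‑16 encodes code points above U+FFFF as surrogate pairs of two units.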

    Plus, UCS‑2 isn’t even a valid Unicode encoding in the first place. UTF‑8, UTF‑16, and UTF‑32 are, and of those, only the last is fixed‐width: it always uses exactly one 32‑bit code unit per code point. UTF‑16 is problematic and annoying in several ways that do not affect either UTF‑8 or UTF‑32, but that doesn’t make it fixed width.

    So the same statement as you’ve made about UTF‑8 applies equally well, mutatis mutandis, to UTF‑16: “UTF‑16 is also a variable‐width encoding, i.e. not the same number of 16‑bit code units per character.” It would be a very, very good idea to remain ever conscious of this, given how much harm has been done by negligent programmers who have not done so.
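
The point about code units can be checked directly. What follows is only a minimal sketch, written in Python rather than Perl purely for illustration; the helper names `code_units` and `widths` and the sample characters are assumptions made for this example, not anything from the thread. It counts how many code units each encoding needs for a single character, using the little-endian codec variants so that no byte order mark is added.

```python
# Sample characters given as escapes: U+0041, U+00E9, U+20AC, U+1F600.
SAMPLES = ["A", "\u00e9", "\u20ac", "\U0001F600"]


def code_units(ch: str, encoding: str, unit_bytes: int) -> int:
    """Return how many code units `encoding` needs for the character `ch`.

    `unit_bytes` is the size of one code unit in bytes (1, 2, or 4).
    """
    return len(ch.encode(encoding)) // unit_bytes


def widths(ch: str) -> tuple:
    """Code units needed in UTF-8, UTF-16, and UTF-32, in that order."""
    return (
        code_units(ch, "utf-8", 1),      # 8-bit code units
        code_units(ch, "utf-16-le", 2),  # 16-bit code units, no BOM
        code_units(ch, "utf-32-le", 4),  # 32-bit code units, no BOM
    )


if __name__ == "__main__":
    for ch in SAMPLES:
        u8, u16, u32 = widths(ch)
        print(f"U+{ord(ch):04X}: UTF-8={u8} UTF-16={u16} UTF-32={u32}")
```

For U+1F600, which lies above U+FFFF, the UTF‑16 count is 2 (a surrogate pair) while the UTF‑32 count stays at 1. Code that assumes one 16‑bit unit per character breaks on exactly that kind of input.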

      Wait... the tchrist? Where have you been all these years, man?

        To say that I am subfond of writing clumsy ʜᴛᴍʟ merely to chat is gravely understating matters. And I haven’t found the pod option around here yet.
Re^3: text encodings and perl
by sundialsvc4 (Abbot) on Nov 17, 2010 at 20:50 UTC

    Thank you for the clarification. I have revised the post, humbly eating my own words.
