PerlMonks  

Re: Re: Re: Re: XML Simple Charset Q?

by jkahn (Friar)
on Nov 25, 2002 at 20:03 UTC


in reply to Re: Re: Re: XML Simple Charset Q?
in thread XML Simple Charset Q?

Er, it's important to note that although the ISO-8859-1 code points are the same as the Unicode code points below 256, the encodings are not the same: in UTF-8, values above 127 are encoded as multiple bytes, whereas in ISO-8859-1 every value is encoded as a single byte.
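
To make the distinction concrete, here is a minimal sketch using the core Encode module (my choice for illustration; nothing earlier in the thread requires it). It takes one code point, U+00E9, and shows the bytes each encoding produces. The variable names are just placeholders.

```perl
use strict;
use warnings;
use Encode qw(encode);

# U+00E9 (LATIN SMALL LETTER E WITH ACUTE) is code point 233 in both
# ISO-8859-1 and Unicode.
my $char = chr(0xE9);

# Same code point, two different byte sequences.
my $latin1 = encode('ISO-8859-1', $char);   # one byte
my $utf8   = encode('UTF-8',      $char);   # two bytes

printf "code point: U+%04X\n", ord($char);
printf "ISO-8859-1: %s (%d byte)\n",  unpack('H*', $latin1), length $latin1;   # e9
printf "UTF-8:      %s (%d bytes)\n", unpack('H*', $utf8),   length $utf8;     # c3a9
```

The number 233 never changes; only the way it is written to the byte stream does.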

I bet most people in this conversation know this, but it's worth spelling out. For me, it wasn't so long ago that I didn't know the difference between a code point and an encoding.

Links to that subject:


Re: Re: Re: Re: Re: XML Simple Charset Q?
by John M. Dlugosz (Monsignor) on Nov 25, 2002 at 21:59 UTC
    Right (I updated my post to clarify). His regex takes single-byte characters in the range 0x80-0xFF and recodes them as HTML escape codes. Same number, just a different way of persisting it to the output stream.
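
The regex itself isn't quoted in this node, so the following is only a guess at its general shape: a substitution that rewrites each byte in 0x80-0xFF as a decimal numeric character reference. The input string and variable names are hypothetical.

```perl
use strict;
use warnings;

# Hypothetical input: "café" as ISO-8859-1 bytes (0xE9 is the single byte for é).
my $text = "caf\xE9";

# Copy the string, then replace every byte in 0x80-0xFF with &#NNN;
# where NNN is the byte's value, which is also its Unicode code point.
(my $escaped = $text) =~ s/([\x80-\xFF])/'&#' . ord($1) . ';'/ge;

print "$escaped\n";   # caf&#233;
```

This only works because Latin-1 byte values and Unicode code points coincide below 256; the same substitution applied to UTF-8 bytes would escape each byte of a multi-byte sequence separately and produce the wrong characters.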

    Inspired by that, I showed that the same idea can convert from UTF-8 by using the utf8 pragma and the extended \x escape codes in the regex, and meanwhile encode to Latin-1 by using pack.
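
The code from that earlier post isn't reproduced here, so this is a rough sketch of the same idea rather than the original. As an assumption on my part, it decodes the UTF-8 input with Encode::decode instead of relying on the utf8 pragma, then uses a regex to escape anything that won't fit in Latin-1 and pack to write the rest as single bytes. The input bytes are made up for the example.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Hypothetical input: UTF-8 bytes for U+00E9 (C3 A9) followed by U+20AC (E2 82 AC).
my $bytes = "\xC3\xA9\xE2\x82\xAC";
my $chars = decode('UTF-8', $bytes);   # now two characters, not five bytes

# Code points above 255 have no Latin-1 byte, so turn them into
# numeric character references first.
$chars =~ s/([^\x00-\xFF])/'&#' . ord($1) . ';'/ge;

# Every remaining code point is below 256; pack each one into a single byte.
my $latin1 = pack 'C*', map { ord($_) } split(//, $chars);

printf "%d bytes: %s\n", length $latin1, unpack('H*', $latin1);
```

Here U+00E9 ends up as the single byte E9, while U+20AC, which Latin-1 cannot represent, survives as the ASCII text `&#8364;`.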
