I see question marks, but I'm not sure if there's a question in there. You seem to have a good grasp of the concept.
You'd get the right result, at the cost of confusing your readers. You'd be saying you're doing one thing (changing the internal format) while actually doing another (changing the encoding of the string).
Correct, iso-8859-1 cannot encode U+201C. cp1252 can. cp1252 is Microsoft's extension of iso-8859-1. It's a commonly used encoding in the Windows world, which is why U+201C is encountered frequently.
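To make the difference concrete, here is a minimal sketch using the core Encode module. The variable names are mine, not from the thread. It encodes U+201C with both encodings, then repeats the iso-8859-1 attempt with a CHECK value as the third argument to encode, so the failure is reported instead of being replaced by a substitution character.

```perl
use strict;
use warnings;
use Encode qw( encode );

my $quote = "\x{201C}";   # LEFT DOUBLE QUOTATION MARK

# cp1252 has a slot for U+201C: the single byte 0x93.
my $cp1252 = encode('cp1252', $quote);
print unpack('H*', $cp1252), "\n";   # 93

# iso-8859-1 has no such slot. With the default error handling,
# encode substitutes "?" rather than failing.
my $latin1 = encode('iso-8859-1', $quote);
print "$latin1\n";

# A CHECK value as the third argument changes that behaviour.
# FB_CROAK makes encode die; LEAVE_SRC keeps it from modifying $quote.
my $ok = eval {
    encode('iso-8859-1', $quote, Encode::FB_CROAK() | Encode::LEAVE_SRC());
    1;
};
print $ok ? "encoded\n" : "failed: $@";
```

Without LEAVE_SRC, a non-zero CHECK lets encode modify its source argument in place, so either pass that flag or encode a copy.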
Indeed. I have used that very code to make sure a sub was only given bytes before calling a function that expects to only get bytes. At the same time, it makes sure the bytes aren't internally encoded as UTF-8. Most XS functions can't handle that (which is really a bug in the XS function).
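A sketch of that kind of guard, assuming the code under discussion is utf8::downgrade (the quoted snippet is not part of this post). The helper name as_bytes is hypothetical. With a true second argument, utf8::downgrade returns false instead of dying when the string contains a character above 0xFF, so the caller can choose the error message.

```perl
use strict;
use warnings;

# Hypothetical helper: returns a copy of $s stored in the downgraded
# (non-UTF-8) internal format, or dies if $s contains a character
# above 0xFF and therefore cannot be a string of bytes.
sub as_bytes {
    my ($s) = @_;
    utf8::downgrade($s, 1)
        or die "Wide character found; expected bytes\n";
    return $s;
}

my $bytes = "caf\xE9";
utf8::upgrade($bytes);   # same string, now stored internally as UTF-8

my $safe = as_bytes($bytes);
print utf8::is_utf8($safe) ? "upgraded\n" : "downgraded\n";

# U+201C does not fit in a byte, so the guard rejects it.
my $ok = eval { as_bytes("\x{201C}"); 1 };
print $ok ? "accepted\n" : "rejected: $@";
```

The string's value is the same before and after the call; only its internal storage changes, which is the distinction drawn above between changing the internal format and changing the encoding.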
How encode handles errors is configurable using its third parameter.

In reply to Re^3: Decoding, Encoding string, how to? by ikegami