What Voodoo Encoding does RTF use for > ASCII Chars?
by tosh (Scribe)
on Mar 20, 2012 at 21:06 UTC
tosh has asked for the
wisdom of the Perl Monks concerning the following question:
I'm using templates that will eventually create RTF docs. All is well until (stop me if you've heard this one before...) non-ASCII characters. Rerun!!
But wait!! This is actually a little bit different. According to the RTF specification on encoding there are two kinds of escape: either \'HEX or \uVOODOO. Why there are two kinds is beyond me, and I haven't figured out when to use one or the other.
So while I struggle on with the above, I have this problem here:
If I create an RTF document in OS X with the TextEdit program and put some nice accents in it, like say:
à, è, ì, ò, ù
Then they are encoded in the document as: \'e0, \'e8, \'ec, \'f2, \'f9
Can anyone help me figure out by what witchcraft this was done, because straight up
Does not work.
And don't get me started on using Unicode::Escape: it's too slow and also doesn't match the RTF specs.
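
For what it's worth, the hex values TextEdit wrote above are the same numbers as the Unicode code points of those letters (à is U+00E0, è is U+00E8, and so on), which is also their byte value in Windows-1252 for the range 0xA0..0xFF. Below is a minimal sketch of an escaper built on that observation. It assumes the RTF header declares \ansicpg1252 (TextEdit files typically do, but check yours), that the input is already a decoded Perl character string rather than raw UTF-8 bytes, and that anything outside 0xA0..0xFF can go out as \uN with a "?" fallback character. The name rtf_escape is made up for illustration and is not from any module.

```perl
use strict;
use warnings;

# Hypothetical helper: escape a decoded character string for an RTF body.
# Assumes the document header declares \ansicpg1252.
sub rtf_escape {
    my ($text) = @_;
    my $out = '';
    for my $ch (split //, $text) {
        my $cp = ord $ch;
        if ($ch eq '\\' || $ch eq '{' || $ch eq '}') {
            $out .= "\\$ch";                   # RTF control characters
        }
        elsif ($cp < 0x80) {
            $out .= $ch;                       # plain ASCII passes through
        }
        elsif ($cp >= 0xA0 && $cp <= 0xFF) {
            # In this range the cp1252 byte equals the code point.
            $out .= sprintf "\\'%02x", $cp;
        }
        else {
            # \uN takes a signed 16-bit decimal; characters above U+FFFF
            # are written as a UTF-16 surrogate pair, one \u per unit.
            my @units = $cp > 0xFFFF
                ? (0xD800 + (($cp - 0x10000) >> 10),
                   0xDC00 + (($cp - 0x10000) & 0x3FF))
                : ($cp);
            for my $u (@units) {
                $u -= 0x10000 if $u > 0x7FFF;  # wrap into the signed range
                $out .= "\\u$u?";              # "?" is the fallback char
            }
        }
    }
    return $out;
}

# The five accented letters from the TextEdit test, written as code points
# so the example does not depend on the source file's encoding.
print rtf_escape("\x{e0}, \x{e8}, \x{ec}, \x{f2}, \x{f9}"), "\n";
# prints: \'e0, \'e8, \'ec, \'f2, \'f9
```

If the template data arrives as UTF-8 bytes, it would need to go through Encode::decode('UTF-8', ...) first; feeding undecoded bytes to a routine like this yields two \'hh escapes per accented letter, which is one common way to end up with garbage.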