in reply to Re: Is utf8, ascii ? in thread Is utf8, ascii ?
I see. I'm new to this encoding stuff, but now I understand: check, guess, try to encode, and if that fails, discard.
At the moment I just want to discard; later, when I have time, I'll do more tests.
But my next question was: if I check for a valid UTF-8 string and discard anything that fails, will this discard the string if it is ASCII?
Re^3: Is utf8, ascii ?
by clinton (Priest) on Aug 07, 2007 at 19:43 UTC
No. U+0000 to U+007F (the first 128 Unicode characters) are each represented in UTF-8 by a single byte, the same byte that is used in ASCII. So ASCII (7-bit ASCII, not e.g. ISO-8859-* or Windows-1252) is a subset of UTF-8, and a pure-ASCII string always passes a UTF-8 validity check.
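If you want to convince yourself, here is a minimal sketch of the check. It is written in Python only for illustration; the helper name `is_valid_utf8` and the sample strings are made up for this example, and the same idea should carry over to any language with a strict UTF-8 decoder.

```python
def is_valid_utf8(data: bytes) -> bool:
    """Return True if the bytes decode as strict UTF-8 (hypothetical helper)."""
    try:
        data.decode("utf-8")  # error handling is 'strict' by default
        return True
    except UnicodeDecodeError:
        return False


# Sample inputs, chosen for illustration only.
ascii_bytes = "plain ASCII text".encode("ascii")
utf8_bytes = "caf\u00e9".encode("utf-8")      # e-acute becomes two bytes
latin1_bytes = "caf\u00e9".encode("latin-1")  # e-acute becomes the lone byte 0xE9

for label, sample in [("ascii", ascii_bytes),
                      ("utf-8", utf8_bytes),
                      ("latin-1", latin1_bytes)]:
    print(label, is_valid_utf8(sample))

# Every 7-bit byte decodes to the same character under both encodings.
same_for_all_7bit = all(
    bytes([b]).decode("utf-8") == bytes([b]).decode("ascii")
    for b in range(128)
)
print(same_for_all_7bit)
```

The ASCII and UTF-8 samples pass, while the ISO-8859-1 sample fails because 0xE9 on its own is not a complete UTF-8 sequence. So a "discard if not valid UTF-8" filter keeps plain ASCII and only drops data that uses the high bit in a non-UTF-8 way.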