in reply to A UTF8 round trip with MySQL
That is a nice summary, although the only MySQL-specific thing in it is {mysql_enable_utf8 => 1} :-).
You have one misleading bit of information, though:
For Perl to know whether the data it receives from an external source (which could be a string, or binary data such as an image) as a string of bytes or as a UTF-8 string, it uses the internal UTF8 flag.
This is a very dangerous assumption! The UTF8 flag is an internal flag that has nothing to do with anything that is external. If it is set, Perl assumes that it wrote the UTF8 buffer itself, and does no further checks. Blindly setting the UTF8 flag is dangerous because it can lead to internally corrupted scalars: malformed UTF8 data.
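To make the danger concrete, here is a minimal sketch, assuming a reasonably modern Perl with the core Encode module. The sample string is made up for illustration: it ends in a lone Latin-1 byte that is not valid UTF-8. Setting the flag by hand performs no validation, so the scalar ends up flagged but malformed.

```perl
use strict;
use warnings;
use Encode ();

# Hypothetical external input: "caf" followed by a single Latin-1 byte.
# On its own, \xE9 is NOT a valid UTF-8 sequence.
my $bytes = "caf\xE9";

# Blindly switch the internal flag on. Nothing is checked or converted;
# Perl is simply told to trust the buffer as it is.
my $flagged = $bytes;
Encode::_utf8_on($flagged);

# utf8::is_utf8 reports the state of the internal flag.
my $flag_is_set = utf8::is_utf8($flagged) ? 1 : 0;

# utf8::valid reports whether the internal buffer is consistent
# with that flag.
my $is_well_formed = utf8::valid($flagged) ? 1 : 0;

print "flag set: $flag_is_set, well-formed: $is_well_formed\n";
```

The flag is set, yet the buffer is not well-formed UTF8, which is exactly the kind of internally corrupted scalar described above.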
The :utf8 layer should not be used on input filehandles. Use :encoding(UTF-8) instead. The _utf8_on function should not be used on external input. Use decode("UTF-8", ...), or possibly decode("UTF8", ...) or decode_utf8(...) instead. You do this correctly.
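A short sketch of the safer alternatives, again assuming the core Encode module. The sample byte strings are made up, and the in-memory filehandle only stands in for a real input file. FB_CROAK asks decode to die on malformed input, and LEAVE_SRC keeps it from modifying the source buffer.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Illustrative inputs: valid UTF-8 bytes for "cafe" with an e-acute,
# and an invalid variant ending in a lone Latin-1 byte.
my $good_bytes = "caf\xC3\xA9";
my $bad_bytes  = "caf\xE9";

my $check = Encode::FB_CROAK | Encode::LEAVE_SRC;

# decode() validates while converting bytes to characters.
my $text = decode('UTF-8', $good_bytes, $check);
my $text_length = length $text;    # counted in characters, not bytes

# Malformed input is rejected instead of being silently trusted.
my $bad_accepted = eval { decode('UTF-8', $bad_bytes, $check); 1 } ? 1 : 0;

# The same validation on a filehandle: :encoding(UTF-8), not :utf8.
# An in-memory handle is used here in place of a real file.
my $buffer = $good_bytes;
open my $fh, '<:encoding(UTF-8)', \$buffer
    or die "cannot open in-memory handle: $!";
my $line = <$fh>;
close $fh;
my $line_length = length $line;

print "decoded: $text_length chars, bad input accepted: $bad_accepted, ",
      "read: $line_length chars\n";
```

Both routes produce a character string whose internal buffer Perl has checked itself, so the flag, if set, can be trusted.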
The UTF8 flag indicates only that the string's internal buffer is UTF8 encoded, regardless of the source and history of the string.