That's not strange. You're seeing Unicode codepoints, which for the characters in question happen to be identical to their ISO-8859-1 encodings. Add "\N{EURO SIGN}" to the string and you get "\x{20ac}": again, that's the codepoint, not its UTF-8 encoding.
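
To make the difference visible, here is a minimal sketch. It assumes a test string of "ä" plus the euro sign, which is my choice and not necessarily what the earlier test used, and hexdump() is a hypothetical helper I made up for this post. It prints the same characters three ways: as codepoints, as ISO-8859-1 bytes, and as UTF-8 bytes.

```perl
use strict;
use warnings;
use charnames ':full';   # needed for \N{...} on Perls older than 5.16
use Encode qw(encode);

# Assumed test data: a character string holding U+00E4 and U+20AC.
my $chars = "\N{LATIN SMALL LETTER A WITH DIAERESIS}\N{EURO SIGN}";

# Hypothetical helper: show every element of a string as \x{..}.
sub hexdump {
    my ($string) = @_;
    return join '', map { sprintf('\\x{%x}', ord($_)) } split //, $string;
}

# What you see when you dump the character string: codepoints.
my $codepoints = hexdump($chars);

# ISO-8859-1 bytes of "ä" alone (the euro sign does not exist in ISO-8859-1).
my $latin1 = hexdump(encode('ISO-8859-1', "\N{LATIN SMALL LETTER A WITH DIAERESIS}"));

# The actual UTF-8 encoding of the whole string.
my $utf8 = hexdump(encode('UTF-8', $chars));

print "codepoints: $codepoints\n";   # \x{e4}\x{20ac}
print "ISO-8859-1: $latin1\n";       # \x{e4}
print "UTF-8:      $utf8\n";         # \x{c3}\x{a4}\x{e2}\x{82}\x{ac}
```

For "ä" the codepoint and the ISO-8859-1 byte are both e4, which is why the dump can be mistaken for an encoding. The UTF-8 bytes look different for both characters.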

"Everything is UTF-8" is one of the most frequent false assumptions I encounter when dealing with non-ASCII characters.


