That's not strange. You're seeing Unicode codepoints, which for the characters in question happen to be identical to their ISO-8859-1 encodings. Add "\N{EURO SIGN}" to the string and you get "\x{20ac}": again, that's the codepoint and not the UTF-8 encoding, which would be the three bytes E2 82 AC.
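To make the difference visible, here is a minimal sketch that dumps the same string twice: once as characters and once after encoding to UTF-8. The sample character "é" and the helper hex_dump are my own choices for illustration, not taken from the code discussed in this thread.

```perl
use strict;
use warnings;
use charnames ':full';    # needed for \N{...} on older perls, harmless on newer ones
use Encode qw(encode);

# A character string: each element is a Unicode codepoint, not a byte.
# "e with acute" is an assumed sample character; U+00E9 is also its ISO-8859-1 byte.
my $chars = "\N{LATIN SMALL LETTER E WITH ACUTE}\N{EURO SIGN}";

# Encoding produces a byte string that holds the UTF-8 representation.
my $bytes = encode('UTF-8', $chars);

# Hypothetical helper: the ordinal of every element of a string, in hex.
sub hex_dump {
    my ($string) = @_;
    return join ' ', map { sprintf '%x', ord($_) } split //, $string;
}

print 'codepoints:  ', hex_dump($chars), "\n";    # e9 20ac
print 'UTF-8 bytes: ', hex_dump($bytes), "\n";    # c3 a9 e2 82 ac
```

The first line shows what you were looking at: two codepoints, one of which happens to fit in a single ISO-8859-1 byte. Only the second line is UTF-8, and there neither character is a single byte.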

"Everything is UTF-8" is one of the most frequent false assumptions I encounter when dealing with non-ASCII characters.


In reply to Re^6: UTF8 versus \w in pattern matching (basic test) by haj
in thread UTF8 versus \w in pattern matching by mldvx4
