in reply to Re^4: How to sanely handle unicode in perl?
in thread How to sanely handle unicode in perl?

\xc3\xb6 is not the right byte sequence for an ö from a Latin-1 terminal; it is the UTF-8 encoding. That means it can only be issued by a UTF-8 encoded source (and still mean ö). So what you are asking to do sanely strikes me as... strange. If it were coming from a Latin-1 encoded source, it would be \xf6. To do encoding properly you have to know what you are receiving, decode it with that, know what your output layer is, and encode it to that. It's not easy, but it's not magical either. Without the right steps at the right layers it's literally guesswork and impossible to do robustly.
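
As a rough illustration of "decode what you receive, encode for where it goes", here is a minimal sketch using the core Encode module. The variable names are made up for the example, and it assumes the input really is UTF-8 and the output really wants Latin-1; it is not the code from the original question.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Bytes as a UTF-8 source would send them: "ö" is the two bytes C3 B6.
my $utf8_bytes = "\xc3\xb6";

# Input boundary: decode with the encoding the source is known to use.
my $char = decode('UTF-8', $utf8_bytes);

# Output boundary: encode for what the receiving side expects
# (assumed here to be a Latin-1 terminal).
my $latin1_bytes = encode('ISO-8859-1', $char);

printf "characters: %d, code point: U+%04X\n", length($char), ord($char);
printf "latin-1 bytes: %s\n", unpack('H*', $latin1_bytes);
# prints:
#   characters: 1, code point: U+00F6
#   latin-1 bytes: f6
```

Between the two boundaries the program works with one character, U+00F6, and the byte representation only matters at the edges.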

Re^6: How to sanely handle unicode in perl?
by Sec (Monk) on Mar 23, 2015 at 10:26 UTC
    Please check the source. I explicitly state that the pipe that produces \xc3\xb6 is utf-8. So what you wrote does not apply to my code.

    In fact choroba found out that it works as intended if I prepend ":raw" to the encoding. (Which is unintuitive to me, but kind of makes sense in retrospect)
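
    For readers who have not seen the layer syntax, this is a minimal sketch of what stacking ":raw" in front of an encoding layer looks like. It reads from an in-memory filehandle instead of a pipe so that it runs on its own; the variable names are illustrative and it does not try to reproduce the original code. The idea, as I understand it, is that ":raw" removes any layers that would otherwise transform the bytes, so the encoding layer sees them unmodified.

```perl
use strict;
use warnings;

# Stand-in for the UTF-8 pipe: the bytes C3 B6 followed by a newline.
my $bytes = "\xc3\xb6\n";

# ":raw" first, then the encoding layer that decodes the UTF-8 bytes.
open(my $fh, '<:raw:encoding(UTF-8)', \$bytes)
    or die "cannot open in-memory handle: $!";

my $line = <$fh>;
chomp $line;
close $fh;

printf "characters: %d, code point: U+%04X\n", length($line), ord($line);
# prints: characters: 1, code point: U+00F6
```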

      Maybe you misunderstand my point. If you run that code in a Latin-1 terminal you are sending UTF-8 and expecting it to act properly. It makes no sense and can't work without goofy and unrealistic hoops.