Re^8: Is there some universal Unicode+UTF8 switch?

by Anonymous Monk
on Sep 02, 2019 at 13:36 UTC

in reply to Re^7: Is there some universal Unicode+UTF8 switch?
in thread Is there some universal Unicode+UTF8 switch?

It might be unpleasant to you, but he's mostly correct. This site is indeed a pretty good illustration of the problem. It doesn't use UTF-8 at all; it uses Windows-1252. It's actually the browser that does the HTML escaping (both ways). As for why? Presumably because updating Perlmonks to use UTF-8 would be too much work for anyone to bother with.
Re^9: Is there some universal Unicode+UTF8 switch?
by haj (Chaplain) on Sep 02, 2019 at 15:26 UTC

    All I can say is "it is not Perl's fault".

    Using Windows-1252 encoding is weird, but not incorrect. If your desired content type is text/html, then the whole set of Unicode characters is available to you not only in UTF-8, but in any encoding supported by the browsers. Both HTML::Entities and Encode can do the necessary mapping of Unicode characters to entities, which are plain ASCII strings that the browsers understand.
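To illustrate the mapping, here is a minimal sketch using only the core Encode module. The sample string and variable names are made up for the example, and this is not a claim about how the site itself does it. Characters that Windows-1252 cannot represent fall back to decimal character references, while everything else is encoded as ordinary single bytes.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical sample: Omega (U+03A9), e-acute (U+00E9), euro sign (U+20AC).
my $text = "\x{3A9} caf\x{E9} \x{20AC}";

# Encode to Windows-1252. Anything the code page cannot represent falls back
# to a decimal character reference (&#NNN;). LEAVE_SRC keeps $text unmodified.
my $bytes = encode('cp1252', $text,
    Encode::FB_HTMLCREF() | Encode::LEAVE_SRC());

# Escape bytes outside printable ASCII so the result can be inspected anywhere.
(my $shown = $bytes) =~ s/([^\x20-\x7E])/sprintf('\\x%02X', ord $1)/ge;
print "$shown\n";    # prints: &#937; caf\xE9 \x80
```

Omega has no slot in Windows-1252, so it becomes `&#937;`; the other two characters do, so they stay as single bytes. If you would rather have pure ASCII output with named entities, `encode_entities` from HTML::Entities (a CPAN module, not core) is the other route mentioned above.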

    I'm not going to defend the source code running this site either. I consider it pretty stale, but you can let software rot in just about any programming language. In Perl, of course, the software will continue to run and run and run as the years go by.

      I'm actually curious... Has anyone ever looked into it (converting Perlmonks to UTF-8)? Someone must have, surely? Windows-1252 definitely causes problems.

