in reply to Proper Unicode handling in Perl
in thread Is there some universal Unicode+UTF8 switch?
Nice progress! You don't even need the Encode module :)
This is a pretty straightforward way to deal with Unicode and UTF-8.
The remaining mentions of UTF-8 in your code are all justified:
- use utf8; tells Perl that your source code comes with UTF-8 encoded literals.
- binmode STDOUT, ':utf8'; makes Perl spit out the strings in @html properly UTF-8 encoded. You can encode any Unicode character in UTF-8, so no problems here.
- Content-Type: text/html; charset=utf-8 tells the browser that it has to handle the byte stream as UTF-8 and decode the characters accordingly.
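As a rough illustration of the first two points (not taken from your code; the variable names and the in-memory filehandle are just assumptions for the demo), this sketch shows that with use utf8 a literal is a string of characters, and that it only becomes UTF-8 bytes once it passes through an output layer:

```perl
use strict;
use warnings;
use utf8;    # the literals below are decoded from the UTF-8 source file

my $title = 'Мой тест';
print length($title), "\n";    # counts characters, not bytes

# Write through a :utf8 layer into an in-memory buffer (a stand-in for
# STDOUT here) so the encoded result can be inspected afterwards.
my $bytes = '';
open my $fh, '>:utf8', \$bytes or die "cannot open in-memory handle: $!";
print {$fh} $title;
close $fh;

print length($bytes), "\n";    # counts the UTF-8 encoded bytes
```

The first number is the character count and the second the byte count; each Cyrillic letter takes two bytes in UTF-8, so the second number is larger. This file must itself be saved as UTF-8 for the sketch to behave as described.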
There are two caveats:
- Obviously, you need to save your source code in UTF-8 encoding.
- You must check whether the JSON data might, in some circumstances, contain characters which have a special meaning in HTML, in particular < and &. This has nothing to do with Unicode, though.
I'm adding the relevant stuff to your sub display_html:
    sub display_html {
        use HTML::Entities;
        my $html_encoded = encode_entities(shift, '<>&"');
        my @html = (
            '<!DOCTYPE html>',
            '<html>',
            '<head>',
            '<meta charset="UTF-8">',
            '<title>Мой тест</title>',
            '</head>',
            '<body>',
            $html_encoded // 'Статус — ОК', # soft OR: 0 and empty string accepted
            '</body>',
            '</html>'
        );
        # to avoid "wide character" warnings:
        binmode STDOUT, ':utf8';
        print "Content-Type: text/html; charset=utf-8\n\n";
        print join("\n", @html);
    }
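If you want to see what the encode_entities call does on its own, here is a small sketch. The sample string and the names $raw and $safe are made up for the demonstration; the second argument is the same set of unsafe characters used in the sub above.

```perl
use strict;
use warnings;
use utf8;
use HTML::Entities qw(encode_entities);

# Hypothetical input containing every character from the unsafe set.
my $raw  = 'x < y && "Статус" > 0';
my $safe = encode_entities($raw, '<>&"');

binmode STDOUT, ':utf8';
print "$safe\n";    # x &lt; y &amp;&amp; &quot;Статус&quot; &gt; 0
```

Only the four listed characters are replaced; the Cyrillic text passes through unchanged, which is what you want when the page is served as UTF-8 anyway.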