a new example

Raymond has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: a new example (Wide character in print ) (diagnostics) by Anonymous Monk on Jul 28, 2013 at 22:59 UTC
diagnostics, splain

Wide character in print at 4.pl line 5. (#1)
(S utf8) Perl met a wide character (>255) when it wasn't expecting one. This warning is by default on for I/O (like print). The easiest way to quiet this warning is simply to add the :utf8 layer to the output, e.g. binmode STDOUT, ':utf8'. Another way to turn off the warning is to add no warnings 'utf8'; but that is often closer to cheating. In general, you are supposed to explicitly mark the filehandle with an encoding, see open and "binmode" in perlfunc.

`[Wide character in print ]` Wide character in print -> Wide character in print
`[Wide character in]` Wide character in -> Wide Character in Print Warning
How do I compose an effective node title?
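A minimal sketch of what the diagnostic describes, assuming an in-memory filehandle as a stand-in for STDOUT so that the written bytes and any warnings can be inspected. The helper name `write_with_layer` is hypothetical and not part of the original post.

```perl
use strict;
use warnings;

# Hypothetical helper: print $text to an in-memory filehandle opened with
# the given mode (e.g. '>' or '>:encoding(UTF-8)') and return the raw bytes
# written, followed by any warnings raised while printing.
sub write_with_layer {
    my ($mode, $text) = @_;
    my @warnings;
    local $SIG{__WARN__} = sub { push @warnings, $_[0] };

    my $buffer = '';
    open my $fh, $mode, \$buffer or die "Cannot open in-memory handle: $!";
    print {$fh} $text;
    close $fh;

    return ($buffer, @warnings);
}

my $text = "\x{0394}";    # GREEK CAPITAL LETTER DELTA, a code point above 255

# No layer on the handle: Perl has not been told how to encode the character.
my ($raw, @without_layer) = write_with_layer('>', $text);

# Explicit encoding layer: the handle is marked as UTF-8.
my ($encoded, @with_layer) = write_with_layer('>:encoding(UTF-8)', $text);

printf "without layer: %d warning(s)\n", scalar @without_layer;
printf "with layer:    %d warning(s), %d byte(s)\n",
    scalar @with_layer, length $encoded;
```

The same idea applies to STDOUT via binmode or the open pragma; the in-memory handle is only used here to keep the sketch self-checking.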
Re: a new example by kcott (Archbishop) on Jul 29, 2013 at 04:27 UTC
G'day Raymond,

Here's a few examples that may clarify the differences between handling UTF-8 in your output and in your source code. Here's how those characters should render: Ä = `Ä` and Δ = `Δ`. I've used `<pre>...</pre>` tags so that the characters (e.g. `Δ`), and not the entities (e.g. `&#916;`), are displayed.

Baseline code generating "Wide character" warning:

    $ perl -Mstrict -Mwarnings -E '
        say "\xC4 and \x{0394} look different";
    '
    Wide character in say at -e line 2.
    Ä and Δ look different

Using binmode function to specify UTF-8 output:

    $ perl -Mstrict -Mwarnings -E '
        binmode STDOUT => ":utf8";
        say "\xC4 and \x{0394} look different";
    '
    Ä and Δ look different

Using the open pragma to specify UTF-8 output:

    $ perl -Mstrict -Mwarnings -E '
        use open qw{:std :utf8};
        say "\xC4 and \x{0394} look different";
    '
    Ä and Δ look different

Attempting to use UTF-8 in the source code without letting Perl know:

    $ perl -Mstrict -Mwarnings -E '
        binmode STDOUT => ":utf8";
        say "\xC4 and \x{0394} look different";
        say "Ä and \x{c4} look different";
        say "Δ and \x{0394} look different";
    '
    Ä and Δ look different
    Ã„ and Ä look different
    Î” and Δ look different

Using the utf8 pragma to tell Perl there's UTF-8 in the source code:

    $ perl -Mstrict -Mwarnings -E '
        use utf8;
        binmode STDOUT => ":utf8";
        say "\xC4 and \x{0394} look different";
        say "Ä and \x{c4} look the same";
        say "Δ and \x{0394} look the same";
    '
    Ä and Δ look different
    Ä and Ä look the same
    Δ and Δ look the same

-- Ken
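The garbled output in the fourth example can be reproduced at the byte level. The following is a sketch, assuming the source file is saved as UTF-8, that uses the core Encode module to imitate what happens to a literal "Ä" with and without the utf8 pragma. The variable names and the `hex_bytes` helper are illustrative only.

```perl
use strict;
use warnings;
use Encode qw(encode decode);

# The UTF-8 encoding of "Ä" (U+00C4) is the two bytes C3 84. Without
# `use utf8`, a literal "Ä" in a UTF-8 source file reaches Perl as those
# two bytes, and each byte is treated as a separate character.
my $source_bytes = "\xC3\x84";

# Without the pragma: the two "characters" are encoded again on output,
# which yields four bytes (double encoding).
my $double_encoded = encode('UTF-8', $source_bytes);

# With the pragma, the effect is as if the bytes were decoded into one
# character first, so the output layer encodes it exactly once.
my $char         = decode('UTF-8', $source_bytes);
my $encoded_once = encode('UTF-8', $char);

# Render a byte string as space-separated hex pairs.
sub hex_bytes { join ' ', map { sprintf '%02X', ord } split //, shift }

printf "undecoded literal: %d chars, output bytes %s\n",
    length $source_bytes, hex_bytes($double_encoded);
printf "decoded literal:   %d char,  output bytes %s\n",
    length $char, hex_bytes($encoded_once);
```

The four-byte result is what a UTF-8 terminal then displays as two unrelated characters instead of one.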
5.pl i have another new example by Raymond (Novice) on Jul 29, 2013 at 23:09 UTC
    #!/usr/bin/perl
    use strict;
    use warnings;

    my $string = "This is what you have";
    print $string;
    #this part does not print:
    substr($string, 5, 2) = "wasn't"; #change "is" to "wasn't"
    substr($string, -12) = "ondrous"; #"this wasn't wondrous"
    substr($string, 0, 1) = ""; #delete first character
    substr($string, -10) = ""; #delete last 10 characters
    #printing problem end here
Re: 5.pl i have another new example by ww (Archbishop) on Jul 30, 2013 at 00:41 UTC
Uh... well, yeah, it is a new example of something. Please share the 'of what' as I can't see the relevance in a thread devoted to encoding/decoding utf8.

Oh, yes. When I print the part you've labeled as non-printing, I see this:

    #!/usr/bin/perl
    use strict;
    use warnings;
    # 1046930

    my $string = "This is what you have" . "\n";;
    print $string;
    #this part does not print:
    substr($string, 5, 2) = "wasn't"; #change "is" to "wasn't"
    print $string . "\n";
    substr($string, -12) = "ondrous"; #"this wasn't wondrous"
    print $string . "\n";
    substr($string, 0, 1) = ""; #delete first character
    print $string . "\n";
    substr($string, -10) = ""; #delete last 10 characters
    print $string . "\n";
    #printing problem end here

    =head out:
    This is what you have
    This wasn't what you have
    This wasn't whondrous
    his wasn't whondrous
    his wasn't
    =cut

...which is at some variance with what your comments suggest you expected.

If I've misconstrued your question or the logic needed to answer it, I offer my apologies to all those electrons which were inconvenienced by the creation of this post.
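One possible reading of the variance, offered as an assumption rather than something stated in the thread: the comments in the original script describe the string without a trailing newline, and appending "\n" shifts every negative offset by one character. The sketch below applies the same four substr edits to whatever string it is given; the helper name `apply_edits` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: apply the four substr edits from the script above
# to a copy of the input and return the string as it looks after each step.
sub apply_edits {
    my ($string) = @_;
    my @steps;

    substr($string, 5, 2) = "wasn't";    # replace "is" with "wasn't"
    push @steps, $string;
    substr($string, -12) = "ondrous";    # replace the last 12 characters
    push @steps, $string;
    substr($string, 0, 1) = "";          # delete the first character
    push @steps, $string;
    substr($string, -10) = "";           # delete the last 10 characters
    push @steps, $string;

    return @steps;
}

# No trailing newline here, so negative offsets count back from the final "e".
print "$_\n" for apply_edits("This is what you have");
```

With the newline-free string the second step gives "This wasn't wondrous", matching the comment. Either way, nothing appears on screen after the edits unless the string is printed again.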
Re: 5.pl i have another new example by kcott (Archbishop) on Jul 30, 2013 at 05:34 UTC
This makes no sense as a reply to what I wrote, nor does it make any sense in the context of this thread. If you'd care to clarify your intent, that would be good. :-)

-- Ken
Re^2: 5.pl i have another new example by Raymond (Novice) on Jul 30, 2013 at 19:08 UTC
Re: a new example by Loops (Curate) on Jul 28, 2013 at 22:48 UTC
You'll want to add `use feature 'unicode_strings'; use open qw(:std :utf8); use utf8;` to the top of your script to enable Unicode support. Alternatively, you could use utf8::all from CPAN, which handles even more Unicode edge cases and can be included in your script in a single line.
Re^2: a new example by chromatic (Archbishop) on Jul 28, 2013 at 23:46 UTC
Only the open pragma fixes the problem. There's no literal (high bit) UTF-8 character in the source code, so the utf8 pragma does nothing here, and character interpolation is, to my knowledge, unaffected by the Unicode bug, so the `unicode_strings` feature does nothing to fix the problem here either. The problem is solely the missing encoding discipline on the output filehandle.
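A small sketch of that point, assuming a Perl recent enough to know the `unicode_strings` feature (5.12 or later): the same escape-built string is constructed under three lexical settings and compared. The hash keys and the `code_points` helper are illustrative names only.

```perl
use strict;
use warnings;

# Build the same string under three different lexical settings.
my %built;
{
    no utf8;
    $built{plain} = "\xC4 and \x{0394}";
}
{
    use utf8;    # governs how non-ASCII literals in the source are read
    $built{utf8_pragma} = "\xC4 and \x{0394}";
}
{
    use feature 'unicode_strings';    # governs character semantics
    $built{unicode_strings} = "\xC4 and \x{0394}";
}

# Render a string as a list of code points, e.g. "U+00C4 U+0020".
sub code_points { join ' ', map { sprintf 'U+%04X', ord } split //, shift }

# Escapes name code points directly, so every setting yields the same string.
for my $setting (sort keys %built) {
    printf "%-16s %s\n", $setting, code_points($built{$setting});
}
```

Since the strings are identical, the only remaining variable is how the output filehandle encodes them.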
Re: a new example by Khen1950fx (Canon) on Jul 29, 2013 at 01:34 UTC
The simplest way to fix the problem is to use utf8::all:

    #!/usr/bin/perl -l
    use strict;
    use warnings;
    use utf8::all;

    print "\xC4 and \x{0394} look different";

Returns:

    Ä and Δ look different

