Re: "Unrecognized character" stops perl cold
by moritz (Cardinal) on Jan 17, 2008 at 19:28 UTC
|
In which encoding is your script? (in vim: :se fileencoding).
You can try hexdump -C script.pl to inspect it even further.
Chances are that your system's nroff (or whatever manpage processor you're using) substituted some quotes with a "fancy" quote that now causes trouble.
|
The encoding is 'utf-8'. I think you're right, it's the quotes. See below: each quote shows up as "..." in hexdump's ASCII column, and the bytes behind it are 'e2 80 99', the UTF-8 encoding of U+2019 (RIGHT SINGLE QUOTATION MARK). Now where did I put that nroff book...
00000000  75 73 65 20 4d 49 4d 45  3a 3a 4c 69 74 65 3b 0a  |use MIME::Lite;.|
00000010  23 23 23 20 43 72 65 61  74 65 20 61 20 6e 65 77  |### Create a new|
00000020  20 73 69 6e 67 6c 65 2d  70 61 72 74 20 6d 65 73  | single-part mes|
00000030  73 61 67 65 2c 20 74 6f  20 73 65 6e 64 20 61 20  |sage, to send a |
00000040  47 49 46 20 66 69 6c 65  3a 0a 24 6d 73 67 20 3d  |GIF file:.$msg =|
00000050  20 4d 49 4d 45 3a 3a 4c  69 74 65 2d 3e 6e 65 77  | MIME::Lite->new|
00000060  28 0a 46 72 6f 6d 20 20  20 20 20 3d 3e e2 80 99  |(.From     =>...|
00000070  6d 65 40 6d 79 68 6f 73  74 2e 63 6f 6d e2 80 99  |me@myhost.com...|
00000080  2c 0a 54 6f 20 20 20 20  20 20 20 3d 3e e2 80 99  |,.To       =>...|
00000090  79 6f 75 40 79 6f 75 72  68 6f 73 74 2e 63 6f 6d  |you@yourhost.com|
000000a0  e2 80 99 2c 0a 43 63 20  20 20 20 20 20 20 3d 3e  |...,.Cc       =>|
000000b0  e2 80 99 73 6f 6d 65 40  6f 74 68 65 72 2e 63 6f  |...some@other.co|
000000c0  6d 2c 20 73 6f 6d 65 40  6d 6f 72 65 2e 63 6f 6d  |m, some@more.com|
000000d0  e2 80 99 2c 0a 53 75 62  6a 65 63 74 20 20 3d 3e  |...,.Subject  =>|
000000e0  e2 80 99 48 65 6c 6c 6f  6f 6f 6f 6f 6f 2c 20 6e  |...Helloooooo, n|
000000f0  75 72 73 65 21 e2 80 99  2c 0a 54 79 70 65 20 20  |urse!...,.Type  |
00000100  20 20 20 3d 3e e2 80 99  69 6d 61 67 65 2f 67 69  |   =>...image/gi|
00000110  66 e2 80 99 2c 0a 45 6e  63 6f 64 69 6e 67 20 3d  |f...,.Encoding =|
00000120  3e e2 80 99 62 61 73 65  36 34 e2 80 99 2c 0a 50  |>...base64...,.P|
00000130  61 74 68 20 20 20 20 20  3d 3e e2 80 99 68 65 6c  |ath     =>...hel|
00000140  6c 6f 6e 75 72 73 65 2e  67 69 66 e2 80 99 0a 29  |lonurse.gif....)|
00000150  3b 0a 24 6d 73 67 2d 3e  73 65 6e 64 3b 20 23 20  |;.$msg->send; # |
00000160  73 65 6e 64 20 76 69 61  20 64 65 66 61 75 6c 74  |send via default|
00000170  0a 0a                                             |..|
00000172
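If reading a hexdump by eye gets tedious, something along these lines can point at the offending characters directly. This is only a rough sketch: find_non_ascii is a name made up for this example, and it assumes the script is UTF-8 encoded (as reported above). It prints the line, column, code point and Unicode name of every non-ASCII character in the files named on the command line.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use charnames ();    # gives us charnames::viacode at run time

# Hypothetical helper: takes the raw bytes of a script and returns one
# report string per non-ASCII character found in it.
sub find_non_ascii {
    my ($bytes) = @_;
    my $text = $bytes;
    # Decode in place; if the bytes are not valid UTF-8 the string is left
    # alone and each high byte is reported as its Latin-1 code point.
    utf8::decode($text);

    my @report;
    my $line_no = 0;
    for my $line ( split /\n/, $text ) {
        $line_no++;
        while ( $line =~ /([^\x00-\x7F])/g ) {
            my $code = ord $1;
            my $name = charnames::viacode($code) // 'unknown';
            # pos() sits just after the match, which is the 1-based column
            push @report, sprintf 'line %d, col %d: U+%04X %s',
                $line_no, pos($line), $code, $name;
        }
    }
    return @report;
}

# Only do file I/O when file names are given on the command line.
for my $file (@ARGV) {
    open my $fh, '<:raw', $file or die "Cannot open $file: $!";
    local $/;    # slurp the whole file
    my $bytes = <$fh> // '';
    print "$file: $_\n" for find_non_ascii($bytes);
}
```

Run as, say, perl find_non_ascii.pl script.pl (the file name is just an example). For the script dumped above, every hit should come back as U+2019.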
|
Would a simple use utf8; solve your problem? See perldoc utf8.
Jim
|
Re: "Unrecognized character" stops perl cold
by naChoZ (Curate) on Jan 17, 2008 at 20:12 UTC
|
I find this very annoying also. It's because the perldoc renderer is doing silly things like replacing regular '-' dash/hyphen characters with a multi-byte Unicode hyphen (its UTF-8 encoding starts with chr(226), instead of the normal chr(45) character) and replacing single ticks with the "prettier" version of the single quote. I have not been able to figure out how to get this to display properly without switching completely to raw or text mode.
My solution was to bind a key in vim to run this silly little script. So whenever I paste some code from some perldoc I'm viewing, I run this over the code.
#!/usr/bin/perl -n
s/‐/-/g;
s/−/-/g;
s/’/'/g;
print;
I notice it renders oddly in the perlmonks node, but you get the idea. I literally just copied and pasted the offending symbols into this script.
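Since the pasted symbols don't survive rendering well, the same idea can be spelled with code points so the source stays pure ASCII. A minimal sketch, assuming the three characters in the script above are U+2010 (HYPHEN), U+2212 (MINUS SIGN) and U+2019 (RIGHT SINGLE QUOTATION MARK); asciify is a made-up name for this example.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use open qw(:std :utf8);    # treat STDIN/STDOUT as UTF-8

# Hypothetical helper: swap typographic characters for their ASCII cousins.
sub asciify {
    my ($text) = @_;
    $text =~ s/[\x{2010}\x{2212}]/-/g;    # HYPHEN, MINUS SIGN
    $text =~ s/\x{2019}/'/g;              # RIGHT SINGLE QUOTATION MARK
    return $text;
}

# Small demonstration on a string as it might be pasted from perldoc.
my $sample = "perl \x{2212}e \x{2019}print 1\x{2019}\n";
print asciify($sample);    # prints: perl -e 'print 1'
```

To use it as a filter from vim like the original, the demonstration lines could be swapped for print asciify($_) while <STDIN>;.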
--
naChoZ
Therapy is expensive. Popping bubble wrap is cheap. You choose.
|
vmap ,qq :<C-U>%s/’/'/g<CR>
nmap ,qq :%s/’/'/g<CR>
where the first quote is entered as a digraph: Ctrl-V Ctrl-K '9.
At that point "comma q q" is mapped in vim to replace all the bad fancy quotes with good single quotes.
|
That will only handle that one character. The problem is there are other characters that are modified as well. After my previous post the other day, I ended up going and making a much more verbose version of that script. Now the bad characters can be referred to by name. Plus I added a silly way of making it display the identity of characters to make it easier to find more that need to be fixed.
#!/usr/bin/perl -n
#use strict;
#use warnings;
use charnames ();
use open qw(:std :utf8); # "use encoding" is deprecated in newer perls
$|++;
my $chars = {
'HYPHEN' => '-', # \x{2010}
'MINUS SIGN' => '-', # \x{2212}
'FIGURE DASH' => '-', # \x{2012}
'RIGHT SINGLE QUOTATION MARK' => "'", # \x{2019}
'BOX DRAWINGS LIGHT VERTICAL' => '|', # \x{2502}
};
# If the first character is an equals sign, skip it and
# display the identity of each remaining character.
#
if (/^=/) {
for my $index ( 1 .. length($_) - 1 ) {
my $char = substr( $_, $index, 1 );
print $char . " "
. sprintf( "\\x{%04X}", ord($char) )
. " = '"
. charnames::viacode( ord($char) )
. "'\n" ;
}
} else {
for my $cname ( keys %$chars ) {
my $char = chr( charnames::vianame($cname) );
s/$char/$chars->{$cname}/g;
}
print;
}
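For what it's worth, the names can also be used directly through \N{...} escapes, which lets the whole table be applied in one substitution instead of one pass per name. This is only a sketch of that variation, not a drop-in replacement for the script above: to_ascii and %ascii_for are names invented here, and the table simply repeats the five entries listed above.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use charnames ':full';    # enables \N{CHARACTER NAME} escapes

# Same table as above, keyed by the characters themselves.
my %ascii_for = (
    "\N{HYPHEN}"                      => '-',
    "\N{MINUS SIGN}"                  => '-',
    "\N{FIGURE DASH}"                 => '-',
    "\N{RIGHT SINGLE QUOTATION MARK}" => "'",
    "\N{BOX DRAWINGS LIGHT VERTICAL}" => '|',
);

# Build one character class from the keys so a line is scanned only once.
my $class = join '', map { quotemeta } keys %ascii_for;
my $bad   = qr/([$class])/;

# Hypothetical helper: replace every listed character with its ASCII stand-in.
sub to_ascii {
    my ($text) = @_;
    $text =~ s/$bad/$ascii_for{$1}/g;
    return $text;
}

print to_ascii(
    "\N{BOX DRAWINGS LIGHT VERTICAL} it\N{RIGHT SINGLE QUOTATION MARK}s 5 \N{MINUS SIGN} 3\n"
);    # prints: | it's 5 - 3
```

Adding a new offender is then a one-line change to the hash, and the identify mode of the original script is still handy for finding out what the name should be.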
--
naChoZ
Therapy is expensive. Popping bubble wrap is cheap. You choose.
Re: "Unrecognized character" stops perl cold
by NetWallah (Canon) on Jan 17, 2008 at 19:31 UTC
|
|
It's more like a hat really.
Re: "Unrecognized character" stops perl cold
by Anonymous Monk on Dec 25, 2013 at 08:55 UTC
|
|
thank you very much buddy!!!