PerlMonks
Since you already know which sequence of encoding and decoding steps leads to the broken output, the easiest way with Encode::Repair is this:

    use 5.010;
    use strict;
    use warnings;
    use Encode::Repair qw(repair_encoding);

    my $broken = '敒›剕䕇呎';
    say repair_encoding($broken, [decode => 'utf-8', encode => 'UTF-16LE']);
    __END__
    # output: Re: URGENT

But it also works with learn_recoding:

    use 5.010;
    use strict;
    use warnings;
    use Encode::Repair qw(repair_encoding learn_recoding);
    binmode STDOUT, ':encoding(UTF-8)';

    my $broken = '敒›剕䕇呎';
    my $pattern = learn_recoding(
        from      => $broken,
        to        => 'Re: URGENT',
        encodings => ['UTF-8', 'UTF-16LE', 'UTF-16BE'],
    );
    if ($pattern) {
        say repair_encoding($broken, $pattern);
    }

So, what did you try?

(Updated to use pre tags instead of code, because code tags badly break most non-ASCII chars.)

In reply to Re: How to Fix Character Encoding Damaged Text Using Perl?
by moritz
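
For readers who want to see what that recoding chain amounts to, here is a minimal sketch using only the core Encode module. It assumes the damage came from the plain ASCII bytes of the subject line being misread as UTF-16LE, which is an interpretation of the example above rather than something the thread states outright. The variable names ($original_bytes, $garbled, $repaired) are made up for illustration.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# The intended text as raw bytes (pure ASCII, so also valid UTF-8).
my $original_bytes = 'Re: URGENT';

# Assumed cause of the damage: the bytes are misread as UTF-16LE,
# so each pair of bytes becomes a single 16-bit character.
my $garbled = decode('UTF-16LE', $original_bytes);

# Manual repair: turn the characters back into their UTF-16LE bytes,
# then decode those bytes with the encoding they really had.
my $repaired = decode('UTF-8', encode('UTF-16LE', $garbled));

binmode STDOUT, ':encoding(UTF-8)';
print "garbled:  $garbled\n";
print "repaired: $repaired\n";
```

Unlike the Encode::Repair examples, this starts from a decoded character string rather than from the UTF-8 bytes of a string literal, so only the encode and decode pair is needed to undo the damage.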