PerlMonks  

Re^2: converting text file encodings

by andal (Hermit)
on May 06, 2011 at 08:37 UTC (#903361)


in reply to Re: converting text file encodings
in thread converting text file encodings

Well, the function Encode::encode takes a third argument that defines how "bad" characters (those that cannot be represented in the target encoding) are handled. I believe that with its help one can replace them with something other than a question mark (?). For example:

use strict;
use Encode;

# my raw data
my $data = "\xe0\xe1\x02\n";

# interpret it as text in encoding CP1251
my $txt = Encode::decode("cp1251", $data);

# convert text to octets in encoding Latin1.
# Replace bad ones with X
my $nd = Encode::encode("latin1", $txt, sub { return "X" });

# just to see the result in the terminal which uses UTF-8 encoding
Encode::from_to($nd, "latin1", "UTF-8");
print $nd, "\n";
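
As far as I understand the Encode documentation, the code reference passed as the third argument receives the ordinal value of each character that cannot be mapped, so the replacement does not have to be a fixed string. The sketch below is only an illustration of that idea, not something from the original thread: it reuses the same sample bytes, the variable names $tagged and $html are made up for the example, and it also tries one of the predefined fallback constants (Encode::FB_HTMLCREF, combined with Encode::LEAVE_SRC so that $txt is left untouched).

```perl
use strict;
use warnings;
use Encode ();

# Same sample as in the post: two Cyrillic letters in CP1251 plus a control byte.
my $txt = Encode::decode("cp1251", "\xe0\xe1\x02\n");

# Coderef fallback: called with the ordinal of each unmappable character,
# so the replacement can show which character was lost.
my $tagged = Encode::encode("latin1", $txt, sub { sprintf "<U+%04X>", shift });

# Predefined fallback: emit HTML numeric character references instead.
# LEAVE_SRC keeps encode() from modifying $txt in place.
my $html = Encode::encode("latin1", $txt,
    Encode::FB_HTMLCREF | Encode::LEAVE_SRC);

# Both results are plain ASCII here, so they print safely on any terminal.
print $tagged;
print $html;
```

Whether a marker like <U+0430> or an HTML reference is the better choice depends on where the converted text ends up; the coderef form is the more flexible of the two.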

