Beefy Boxes and Bandwidth Generously Provided by pair Networks
We don't bite newbies here... much
 
PerlMonks  

Re^2: converting text file encodings

by andal (Friar)
on May 06, 2011 at 08:37 UTC ( #903361=note: print w/ replies, xml ) Need Help??


in reply to Re: converting text file encodings
in thread converting text file encodings

Well, function Encode::encode takes third argument that allows to define handling for "bad" characters. I believe, with the help of that one can replace them with other than question mark (?). For example

use strict; use Encode; # my raw data my $data = "\xe0\xe1\x02\n"; # interpret it as text in encoding CP1251 my $txt = Encode::decode("cp1251", $data); # convert text to octets in encoding Latin1. # Replace bad ones with X my $nd = Encode::encode("latin1", $txt, sub{ return "X" }); # just to see the result in the terminal which uses UTF-8 encoding Encode::from_to($nd, "latin1", "UTF-8"); print $nd, "\n"


Comment on Re^2: converting text file encodings
Download Code

Log In?
Username:
Password:

What's my password?
Create A New User
Node Status?
node history
Node Type: note [id://903361]
help
Chatterbox?
and the web crawler heard nothing...

How do I use this? | Other CB clients
Other Users?
Others about the Monastery: (11)
As of 2015-01-30 09:55 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    My top resolution in 2015 is:

















    Results (248 votes), past polls