PerlMonks  

Encoding a hash in perl before saving it as a CSV file

by dep1078 (Initiate)
on Aug 11, 2010 at 15:05 UTC ( [id://854405]=perlquestion )

dep1078 has asked for the wisdom of the Perl Monks concerning the following question:

Hi folks, I've been trying to solve this issue for weeks with no success at all. I'm getting a CSV file from a website. It's not encoded in UTF-8, so when you save it, the French characters are displayed wrongly, i.e. the accents are lost. I'm also getting an error: Failed: Cannot decode string with wide characters at C:/Perl/lib/Encode.pm line 174. Could someone kindly look into this code and let me know what I'm doing wrong? Your help is much appreciated! Here is where my function is getting called:
    my ($confn,$pegfn,$pirfn,$pittmfn) =
        map { "$conf::backup_dir/$_-$tim.csv" } ("consum", "peg", "pir", "pittm");
    $mech->follow_link(url_regex=>qr/sitestat\.com.*pir.*export_csv/i);
    $mech->save_content_binary($pirfn);
Here is my save_content_binary method:
    sub save_content_binary {
        my $self = shift;
        my $filename = shift;
        open( my $fh, '>:utf8', $filename )
            or $self->die( "Unable to create $filename: $!" );
        binmode $fh;
        decode("utf-8",$self->content);
        print {$fh} $self->content
            or $self->die( "Unable to write to $filename: $!" );
        close $fh
            or $self->die( "Unable to close $filename: $!" );
        return;
    }
I know I'm meant to use the encode or decode functions, but I'm not sure which one or when.

Replies are listed 'Best First'.
Re: Encoding a hash in perl before saving it as a CSV file
by zentara (Archbishop) on Aug 11, 2010 at 17:16 UTC
Re: Encoding a hash in perl before saving it as a CSV file
by graff (Chancellor) on Aug 12, 2010 at 02:26 UTC
    This error message:
    Cannot decode string with wide characters ...
    only shows up when you pass Encode::decode() a string that Perl already holds as decoded characters (i.e. its utf8 flag is on) and that contains at least one character above U+00FF. This is sensible: decode() expects raw bytes, so if the string has already been decoded, "converting" it to utf8 again would be a mistake.
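To make that concrete, here is a minimal sketch that reproduces the message outside of any web-scraping code. The sample word "cœur" and the variable names are my own assumptions, chosen only because œ (U+0153) lies above U+00FF; nothing here comes from the original poster's data.

```perl
use strict;
use warnings;
use Encode qw(decode);

# Raw UTF-8 bytes for "cœur", as they might arrive off the wire
# (illustrative sample data, not taken from the original post).
my $bytes = "c\xC5\x93ur";

# First decode: bytes -> characters. This is the correct direction.
my $chars = decode('UTF-8', $bytes);

# Second decode: the input now holds a character above U+00FF,
# so Encode refuses it and dies.
my $err = '';
eval { decode('UTF-8', $chars); 1 } or $err = $@;

printf "length in bytes: %d, in characters: %d\n",
    length $bytes, length $chars;
print "second decode failed: $err" if $err;
```

The rule of thumb this illustrates: decode once, when bytes come in; encode once (or let an IO layer do it) when characters go out.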

    So based on the code snippet you posted, I would conclude that the $self->content thing is returning a utf8 string, and you shouldn't try to decode it -- just make sure that when you print the content to a file handle, the file handle has been set to use the utf8 IO layer -- and one sure way to do that is:

    binmode $fh, ":utf8";
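Putting that advice together, a save routine might look something like the sketch below. It is written as a standalone function with a hypothetical name (save_text_as_utf8) rather than as a method on the original object, and it assumes, as concluded above, that the content passed in is already a decoded character string. It uses the stricter ':encoding(UTF-8)' layer in place of ':utf8'; that is my substitution, and either should produce UTF-8 output here.

```perl
use strict;
use warnings;

# Hypothetical helper: $text must already be a decoded character
# string, not raw bytes.
sub save_text_as_utf8 {
    my ($text, $filename) = @_;

    # The layer encodes characters to UTF-8 bytes on output.
    open( my $fh, '>:encoding(UTF-8)', $filename )
        or die "Unable to create $filename: $!";

    # No bare "binmode $fh" here: with no layer argument it switches
    # the handle back to raw bytes and undoes the UTF-8 setting.
    # No decode() call either, since the text is already characters.
    print {$fh} $text
        or die "Unable to write to $filename: $!";

    close $fh
        or die "Unable to close $filename: $!";
    return;
}
```

With the names from the original post, the call would be something like save_text_as_utf8($mech->content, $pirfn).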
