PerlMonks  

Re: Perl and Postgresql: Invalid byte sequence for encoding "UTF8"

by ides (Deacon)
on Dec 22, 2006 at 14:47 UTC (id://591337)


in reply to Perl and Postgresql: Invalid byte sequence for encoding "UTF8"

I ran into this recently myself with a project to insert E-mail messages into a PostgreSQL database. I solved it like so:

    use utf8;
    use Encode;

    my $possibly_bad_utf8_data = get_data();
    my $good_data = encode( "UTF-8", $possibly_bad_utf8_data );

The capitalization of the UTF-8 in the call to encode() is important: it tells encode to be "strict" about the UTF-8. It turns out use utf8; isn't 100% strict, and hence the data can't be inserted into a strict-mode PostgreSQL database.
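For data that arrives as raw bytes, a related approach is to decode it strictly first and then re-encode it. The following is only a minimal sketch of that idea, not the code from this post: the helper names sanitize_utf8 and is_valid_utf8 are made up for illustration, and it assumes the input is a byte string that is supposed to contain UTF-8.

```perl
use strict;
use warnings;
use Encode qw(decode encode);

# Hypothetical helper (not from the post): takes raw bytes that are
# supposed to be UTF-8 and returns bytes that are valid strict UTF-8.
sub sanitize_utf8 {
    my ($bytes) = @_;    # copy, so the caller's data is left alone
    # Strict decode; with the default CHECK value, malformed bytes are
    # replaced by U+FFFD (the Unicode replacement character).
    my $chars = decode('UTF-8', $bytes);
    # Turn the character string back into strict UTF-8 bytes.
    return encode('UTF-8', $chars);
}

# Hypothetical helper: true if the bytes already are strict UTF-8.
sub is_valid_utf8 {
    my ($bytes) = @_;    # copy, because FB_CROAK may modify its source
    return eval { decode('UTF-8', $bytes, Encode::FB_CROAK()); 1 } ? 1 : 0;
}

# "\xFF" can never appear in UTF-8, so the first string is rejected.
print is_valid_utf8("abc\xFFdef"),  "\n";    # prints 0
print is_valid_utf8("caf\xC3\xA9"), "\n";    # prints 1
```

Whether replacing bad bytes with U+FFFD is acceptable depends on the data; for e-mail bodies in an unknown charset it may be better to detect the real encoding than to paper over it.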

Hope this helps.

Frank Wiles <frank@revsys.com>
www.revsys.com

Re^2: Perl and Postgresql: Invalid byte sequence for encoding "UTF8"
by StoneTable (Beadle) on Dec 23, 2006 at 18:59 UTC

    Remarkable, thanks!

    I had tried using Encode before, but missed the "UTF-8" bit apparently. It's working perfectly now.

      It is NOT the capitalization that is needed. Encode is case-insensitive for the encoding name. It is the hyphen that makes the difference, as this example shows:
      use Encode qw(resolve_alias);

      my @aliases = ('utf-8', 'UTF-8', 'utf8',);
      for my $alias ( @aliases ) {
          my $canonical_name = Encode::resolve_alias($alias);
          print "$alias \t has canonical name $canonical_name\n";
      }
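As far as the Encode documentation describes it, the hyphenated names resolve to the strict encoder, whose canonical name is utf-8-strict, while the unhyphenated ones resolve to the lax utf8 encoder. A small sketch that checks this by name (is_strict_utf8_name is a made-up helper, not part of Encode):

```perl
use strict;
use warnings;
use Encode qw(find_encoding);

# Hypothetical helper: report whether an encoding name selects the
# strict UTF-8 encoder rather than Perl's lax "utf8" one.
sub is_strict_utf8_name {
    my ($name) = @_;
    my $enc = find_encoding($name) or return 0;    # unknown name
    return $enc->name eq 'utf-8-strict' ? 1 : 0;
}

# Case does not matter, the hyphen does.
for my $name ('utf-8', 'UTF-8', 'utf8', 'UTF8') {
    printf "%-6s %s\n", $name, is_strict_utf8_name($name) ? 'strict' : 'lax';
}
```

Running the same check against the installed Encode version is a quick way to confirm which encoder a given spelling actually selects.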
