Encoding SQLite Accents Ordering

by welle (Beadle)
on Dec 04, 2010 at 11:11 UTC (#875357=perlquestion)
welle has asked for the wisdom of the Perl Monks concerning the following question:


I'm seeking your wisdom concerning an encoding issue. I'm reading a SQLite database with data encoded in UTF-8. I need to order the data alphabetically and print it out. The problem is with accented characters. I am using the following code (it worked perfectly with an older SQLite DB without UTF-8 data):

    sub alpha_order {
        $dbh = DBI->connect( "dbi:SQLite:files/database/data.db" )
            || die "Can not connect: $DBI::errstr";

        foreach my $row_db (
            sort { deaccent($a->[2]) cmp deaccent($b->[2]) or $a->[2] cmp $b->[2] }
            @$all_db_orderd
        ) {
            my ($ID, $col1, $col2) = @$row_db;
            #encoding in utf8 for printing
            $col1 = Encode::decode_utf8( $col1 );
            $col2 = Encode::decode_utf8( $col2 );
            #Printing data out
        }
    }

    sub deaccent {
        my $in = $_[0];
        return lc($in) unless ( $in =~ y/\xC0-\xFF// ); #short circuit if no upper chars
        # transliterate
        $in =~ tr/ÀÁÂÃÄÅàáâãäåÇçÈÉÊËèéêëÌÍÎÏìíîïÒÓÔÕÖØòóôõöøÑñÙÚÛÜùúûüÝÿý/AAAAAAaaaaaaCcEEEEeeeeIIIIiiiiOOOOOOooooooNnUUUUuuuuYyy/;
        $in =~ tr/'//d;
        return lc($in);
    }

As I said before, the ordering worked well (order->abco) with my old SQLite DB (no UTF-8 data). Now, unfortunately, I populate the DB with UTF-8 data, and when I read the data out of the DB and order it with the above script, it is not ordered properly. I get something like order->abco

Any idea what I am doing wrong? Encoding issues in Perl drive me crazy... Welle

Re: Encoding SQLite Accents Ordering
by moritz (Cardinal) on Dec 04, 2010 at 15:58 UTC

    You're (probably correctly) decoding the incoming data from SQLite. But then you also need to store your script in UTF-8, and use utf8; to tell perl about the encoding of the script itself.
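To make that advice concrete, here is a minimal sketch, not welle's actual script. It assumes the column values arrive as raw UTF-8 bytes (the @raw list is a hypothetical stand-in for rows fetched from SQLite) and that the file is saved as UTF-8. The point it illustrates is the order of operations: decode first, then compare, with use utf8 in effect so the accented literals inside tr/// are read as single characters.

```perl
use strict;
use warnings;
use utf8;                       # this source file is saved as UTF-8
use Encode qw(decode);

# Hypothetical stand-in for values fetched from SQLite: raw UTF-8 bytes.
my @raw = ("\xC3\xA9t\xC3\xA9", "zebra", "eta", "\xC3\x89cole");

# Decode once, right after fetching, so later operations see characters.
my @names = map { decode('UTF-8', $_) } @raw;

sub deaccent {
    my $in = shift;
    # With "use utf8" each accented literal below is one character.
    $in =~ tr/ÀÁÂÃÄÅàáâãäåÇçÈÉÊËèéêëÌÍÎÏìíîïÒÓÔÕÖØòóôõöøÑñÙÚÛÜùúûüÝÿý/AAAAAAaaaaaaCcEEEEeeeeIIIIiiiiOOOOOOooooooNnUUUUuuuuYyy/;
    return lc $in;
}

# Compare the de-accented forms first, then fall back to the original strings.
my @sorted = sort { deaccent($a) cmp deaccent($b) or $a cmp $b } @names;

binmode STDOUT, ':encoding(UTF-8)';   # encode characters again on output
print "$_\n" for @sorted;
```

In the posted code, deaccent() is called inside sort while decode_utf8 only runs later inside the loop, so the comparison sees bytes rather than characters. Moving the decoding ahead of the sort, as above, is one way to address that.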

      I am struggling with a similar problem. I tried out what welle posted; my script is encoded in UTF-8 and has "use utf8". It seems as if the tr/// is not working properly.
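One thing worth checking when tr/// seems to do nothing is whether the string is still undecoded bytes at the moment tr/// runs. The sketch below is only an illustration under that assumption: $bytes is a hypothetical value standing in for database output. It also shows a swapped-in alternative to the hand-written table, namely decomposing with NFD from the core Unicode::Normalize module and dropping combining marks. Letters without a canonical decomposition, such as Ø, are left unchanged by this approach.

```perl
use strict;
use warnings;
use Encode qw(decode);
use Unicode::Normalize qw(NFD);

# Hypothetical value as it might come back from the database:
# the two UTF-8 bytes that encode "é".
my $bytes = "\xC3\xA9";
my $chars = decode('UTF-8', $bytes);

# Two bytes versus one character; a tr/// table written in terms of
# characters can only match "é" in the decoded form.
printf "bytes: length %d, chars: length %d\n", length($bytes), length($chars);

# Alternative to a tr/// table: decompose, then remove combining marks.
sub strip_marks {
    my $s = NFD(shift);      # "é" becomes "e" followed by U+0301
    $s =~ s/\p{Mn}//g;       # drop nonspacing (combining) marks
    return lc $s;
}

print strip_marks($chars), "\n";
```

If the two lengths printed differ for your own data, the decoding step has not happened yet where the comparison is made.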
