PerlMonks  

Encoding SQLite Accents Ordering

by welle (Beadle)
on Dec 04, 2010 at 11:11 UTC ( [id://875357]=perlquestion )

welle has asked for the wisdom of the Perl Monks concerning the following question:

Hello!

I'm searching for your wisdom concerning an encoding issue. I'm reading a SQLite database with data encoded in UTF-8. I need to order the data alphabetically and print it out. The problem is with accented characters. I am using the following code (it worked perfectly with an older SQLite DB without UTF-8 data):

    sub alpha_order {
        $dbh = DBI->connect( "dbi:SQLite:files/database/data.db" )
            || die "Cannot connect: $DBI::errstr";
        foreach my $row_db (
            sort { deaccent($a->[2]) cmp deaccent($b->[2]) or $a->[2] cmp $b->[2] }
            @$all_db_orderd
        ) {
            my ($ID, $col1, $col2) = @$row_db;
            # encoding in utf8 for printing
            $col1 = Encode::decode_utf8( $col1 );
            $col2 = Encode::decode_utf8( $col2 );
            # Printing data out
        }
    }

    sub deaccent {
        my $in = $_[0];
        return lc($in) unless ( $in =~ y/\xC0-\xFF// ); # short circuit if no upper chars
        # transliterate
        $in =~ tr/ÀÁÂÃÄÅàáâãäåÇçÈÉÊËèéêëÌÍÎÏìíîïÒÓÔÕÖØòóôõöøÑñÙÚÛÜùúûüÝÿý/AAAAAAaaaaaaCcEEEEeeeeIIIIiiiiOOOOOOooooooNnUUUUuuuuYyy/;
        $in =~ tr/'//d;
        return lc($in);
    }

As I said before, it worked well (order: aäbcoö) with my old SQLite DB (no UTF-8 data). Unfortunately, I now populate the DB with UTF-8 data, and when I read the data back and sort it with the above script, it is not ordered properly. I get something like this instead (order: abcoäö).

Any idea what I am doing wrong? Encoding issues in Perl drive me crazy.... Welle

Replies are listed 'Best First'.
Re: Encoding SQLite Accents Ordering
by moritz (Cardinal) on Dec 04, 2010 at 15:58 UTC

    You're (probably correctly) decoding the incoming data from SQLite. But then you also need to store your script in UTF-8, and use utf8; to tell perl about the encoding of the script itself.
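
    A minimal sketch of that advice, assuming the rows come back from SQLite as undecoded UTF-8 byte strings. The @raw_rows data is a hypothetical stand-in for the query result, and the column layout is illustrative only. The point it tries to show is that the values are decoded before sort runs, so deaccent() sees characters rather than bytes:

```perl
use strict;
use warnings;
use utf8;                       # this source file itself is saved as UTF-8
use Encode qw(decode encode);

# Fold accented Latin-1 letters to ASCII. Expects a decoded character string.
sub deaccent {
    my ($in) = @_;
    $in =~ tr/ÀÁÂÃÄÅàáâãäåÇçÈÉÊËèéêëÌÍÎÏìíîïÒÓÔÕÖØòóôõöøÑñÙÚÛÜùúûüÝÿý/AAAAAAaaaaaaCcEEEEeeeeIIIIiiiiOOOOOOooooooNnUUUUuuuuYyy/;
    $in =~ tr/'//d;
    return lc $in;
}

# Hypothetical stand-in for rows fetched from the database as raw UTF-8 bytes.
my @raw_rows = map { [ $_->[0], encode('UTF-8', $_->[1]) ] }
    ( [ 1, 'ö' ], [ 2, 'b' ], [ 3, 'ä' ], [ 4, 'o' ], [ 5, 'a' ], [ 6, 'c' ] );

# Decode first, then sort on the decoded strings.
my @rows   = map { [ $_->[0], decode('UTF-8', $_->[1]) ] } @raw_rows;
my @sorted = sort {
    deaccent($a->[1]) cmp deaccent($b->[1]) or $a->[1] cmp $b->[1]
} @rows;

# Encode again on the way out.
binmode STDOUT, ':encoding(UTF-8)';
print join('', map { $_->[1] } @sorted), "\n";
```

    In the original code the decode_utf8 calls sit inside the loop body, so they run only after sort has already compared the undecoded values. DBD::SQLite also has a sqlite_unicode connect attribute that returns already-decoded strings, which may make the explicit decode step unnecessary.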

      I am struggling with a similar problem. I tried what welle posted; my script is encoded in UTF-8 and has "use utf8". It seems that the tr/ is not working properly.
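
      If the hand-written tr/ table keeps misbehaving, one possible alternative (not what the posts above use) is to decompose each string with the core Unicode::Normalize module and strip the combining marks. The sketch below assumes the strings are already decoded character strings; deaccent_nfd and the sample words are hypothetical names and data chosen for illustration:

```perl
use strict;
use warnings;
use utf8;                            # literals below are UTF-8 in the source
use Unicode::Normalize qw(NFD);

# Strip accents by canonical decomposition instead of a tr/ lookup table.
sub deaccent_nfd {
    my ($in) = @_;
    my $out = NFD($in);              # e.g. "ä" becomes "a" + COMBINING DIAERESIS
    $out =~ s/\p{Mn}+//g;            # drop the nonspacing combining marks
    $out =~ tr/'//d;                 # same apostrophe removal as the original
    return lc $out;
}

my @words  = ('öl', 'ost', 'ära', 'arm', 'bil');
my @sorted = sort {
    deaccent_nfd($a) cmp deaccent_nfd($b) or $a cmp $b
} @words;

binmode STDOUT, ':encoding(UTF-8)';
print join(' ', @sorted), "\n";
```

      Letters without a canonical decomposition, such as Ø, pass through unchanged, so this is not a drop-in replacement for every entry in the tr/ table.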

Node Type: perlquestion [id://875357]
Approved by LanX