Re: Unaccenting characters

I'm not a big fan of such big tables, so instead I'd propose this:

use 5.010;
use strict;
use warnings;
use utf8;
use Unicode::Normalize qw/NFKD/;

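sub unaccent {
    # Decompose: each accented letter becomes base letter + combining mark(s)
    my $s = NFKD shift;
    # Remove every combining mark (Unicode general category M)
    $s =~ s/\pM//g;
    return $s;
}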

say unaccent "Les Misérables";
__END__
Output:
Les Miserables

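The NFKD normalization form splits each accented character into its base character followed by one or more separate combining marks (it also replaces compatibility characters such as ligatures with their plain equivalents), and the substitution then removes all the marks (\pM).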
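To make the decomposition step visible, here is a small sketch of what happens to a single character. It is only an illustration: the variable names are mine, and the code points are written as \x{...} escapes so that the source file's encoding doesn't matter. It also shows the one place where NFD and NFKD differ for this purpose, using the "fi" ligature (U+FB01) as an example.

```perl
use strict;
use warnings;
use Unicode::Normalize qw/NFD NFKD/;

# "e with acute" as a single precomposed code point (U+00E9)
my $e_acute    = "\x{E9}";
my $decomposed = NFD($e_acute);

# After decomposition: base letter "e" plus U+0301 COMBINING ACUTE ACCENT
my $codepoints = join ' ', map { sprintf 'U+%04X', ord } split //, $decomposed;
print "$codepoints\n";    # U+0065 U+0301

# Removing the marks leaves only the base letter
(my $stripped = $decomposed) =~ s/\pM//g;
print "$stripped\n";      # e

# NFD leaves the "fi" ligature alone, NFKD expands it to two plain letters
my $ligature = "\x{FB01}";
my $nfd_len  = length NFD($ligature);
my $nfkd     = NFKD($ligature);
print "$nfd_len\n";       # 1
print "$nfkd\n";          # fi
```

One caveat: this only works for characters that actually have a decomposition. Letters such as "ø" or "ß" are not defined as a base letter plus a mark, so they pass through unchanged; if you need those mapped too, you'd still need a small table for the leftovers.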

(Unicode::Normalize has been a core module since perl 5.8, and you really, really don't want to use anything older than that for Unicode stuff.)

Perl 6 - the future is here, just unevenly distributed
