in reply to Sorting Vietnamese text
Update: Sorry, there are some errors in the first code example below. In particular, the constructor for the collator should be this:
    my $Collator = Unicode::Collate::Locale->new(locale => 'vi');
Then the sort method will work as intended. Try it with actual Vietnamese words.
Unicode::Collate::Locale ought to help. The example code below does not use code tags because of a display bug with UTF-8 text.
    #!/usr/bin/env perl
    use v5.14;
    use warnings;
    use utf8::all;
    use Unicode::Collate::Locale;

    # Erroneous: per the update above, this should be new(locale => 'vi')
    my $Collator = Unicode::Collate::Locale->new('vi');
    my @unsorted = qw( a..7 ả..3 à..9 ạ..5 ã..4 á..1 ă..6 à..2 á..8 );
    my @sorted = $Collator->sort(@unsorted);
    say "unsorted\n@unsorted";
    say "sorted\n@sorted";

Output is as follows.
    unsorted
    a..7 ả..3 à..9 ạ..5 ã..4 á..1 ă..6 à..2 á..8
    sorted
    á..1 à..2 ả..3 ã..4 ạ..5 ă..6 a..7 á..8 à..9
Update #2: The code below is actually a correct example.
    #!/usr/bin/env perl
    use v5.14;
    use warnings;
    use utf8::all;
    use Unicode::Collate::Locale;

    my $Collator = Unicode::Collate::Locale->new(locale => 'vi');
    my @unsorted = ('á', 'ả', 'ã', 'à', 'ậ', 'ă', 'ạ', 'ẫ', 'a', 'ẩ' );
    my @sorted = $Collator->sort(@unsorted);
    say "unsorted\n@unsorted";
    say "sorted\n@sorted";

Giving the output:
    unsorted
    á ả ã à ậ ă ạ ẫ a ẩ
    sorted
    a à ả ã á ạ ă ẩ ẫ ậ
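
For whole words rather than single letters, something along these lines might be a reasonable starting point. It is only a sketch: the word list and the @people records are made up for illustration, and it uses the core utf8 and open pragmas in place of utf8::all so that it runs without extra CPAN modules. The getlocale call is there as a sanity check, since it reports which tailoring was actually loaded ('default' would mean the 'vi' request fell back to the untailored collation).

```perl
#!/usr/bin/env perl
use v5.14;
use warnings;
use utf8;                             # this source file contains UTF-8 literals
use open qw(:std :encoding(UTF-8));   # encode STDOUT/STDERR as UTF-8
use Unicode::Collate::Locale;

my $Collator = Unicode::Collate::Locale->new(locale => 'vi');

# Report the tailoring that was actually selected for this collator.
say 'locale in use: ', $Collator->getlocale;

# Illustrative word list (hypothetical data, not from the original question).
my @words        = qw( má ma mả mã mạ mà );
my @sorted_words = $Collator->sort(@words);
say "@sorted_words";

# Sorting records by one field: use the collator's cmp method inside sort.
# The @people records are hypothetical.
my @people  = ( { name => 'Ánh' }, { name => 'An' }, { name => 'Ân' } );
my @by_name = sort { $Collator->cmp( $a->{name}, $b->{name} ) } @people;
say join ' ', map { $_->{name} } @by_name;
```

The words should come out in the same tone order as the single letters in the output above. Using cmp inside a sort block is handy whenever the strings to compare are buried in a larger data structure.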