PerlMonks
The synopsis for Unicode::Collate does a reasonable job of setting the stage, but there is a nice discussion in chapter 6 of Programming Perl (the camel book), 4th edition, as well. You might also look at the Unicode Technical Standard #10: Unicode Collation Algorithm.

Here's a brief example of doing comparisons at a lower (more relaxed) level using Unicode::Collate.

    use strict;
    use warnings FATAL => 'utf8';
    use utf8;
    use Unicode::Collate;

    binmode STDOUT, ':encoding(UTF-8)';

    my( $x, $y, $z ) = qw( α ά ὰ );

    my $c = Unicode::Collate->new;
    print "\nStrict collation rules: Level 4 (default)\n";
    print "\t cmp('α','ά'): ", $c->cmp( $x, $y ), "\n";
    print "\t cmp('ά','ὰ'): ", $c->cmp( $y, $z ), "\n";
    print "\t cmp('α','ὰ'): ", $c->cmp( $x, $z ), "\n";

    my $rc = Unicode::Collate->new( level => 1 );
    print "\nRelaxed collation rules: Level 1\n";
    print "\t cmp('α','ά'): ", $rc->cmp( $x, $y ), "\n";
    print "\t cmp('ά','ὰ'): ", $rc->cmp( $y, $z ), "\n";
    print "\t cmp('α','ὰ'): ", $rc->cmp( $x, $z ), "\n\n";

And the output...

    Strict collation rules: Level 4 (default)
         cmp('α','ά'): -1
         cmp('ά','ὰ'): -1
         cmp('α','ὰ'): -1

    Relaxed collation rules: Level 1
         cmp('α','ά'): 0
         cmp('ά','ὰ'): 0
         cmp('α','ὰ'): 0

And if the reason for doing comparisons is to handle sorting, Unicode::Collate does that too (you don't need to explicitly use Perl's core sort).

Dave

In reply to Re^3: getting Unicode character names from string
by davido
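
To illustrate the sorting remark in the post, here is a minimal sketch using the collator's sort method. The three sample words are made up for the illustration (they are not from the original post); they were chosen so that the accented alphas differ only in their accents while the second letters differ at the primary level. Treat it as a rough demonstration rather than a definitive recipe.

```perl
use strict;
use warnings;
use utf8;
use Unicode::Collate;

binmode STDOUT, ':encoding(UTF-8)';

# Hypothetical sample data: same base letter (alpha) with different
# accents, followed by different second letters.
my @words = qw( ὰβ αγ άα );

# Collation-aware sort: the second letter (a primary difference)
# decides the order before the accents on the first letter do.
my $collator = Unicode::Collate->new;
my @collated = $collator->sort( @words );

# Perl's core sort compares code points, so the accented forms are
# ordered by where they happen to sit in the Unicode code charts.
my @codepoint = sort @words;

print "Unicode::Collate: @collated\n";
print "core sort:        @codepoint\n";
```

With the default collation table, the collated list should come out as άα, ὰβ, αγ, while the core sort puts αγ ahead of ὰβ because U+03B1 is lower than U+1F70.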