Re^2: New Alphabet Sort Order

by Polyglot (Pilgrim)
on Apr 03, 2011 at 21:57 UTC (#897243)

in reply to Re: New Alphabet Sort Order
in thread New Alphabet Sort Order

Because some of the alphabetical order is dependent upon character combinations (more than one character together), the tr/// approach is not adequate, though a nice idea. I may need to use something more along the lines of mapping characters and sequences, and then sorting based on that map.



Replies are listed 'Best First'.
Re^3: New Alphabet Sort Order
by BrowserUk (Pope) on Apr 03, 2011 at 23:16 UTC
    Because some of the alphabetical order is dependent upon character combinations (more than one character together), the tr/// approach is not adequate,

    Okay, but you can still use essentially the same mechanism. Just set up a hash with the ordering and a re to pick out the 'characters'. Then use the re in combination with the hash to perform the mapping. This way, you should be able to cater for any mapping you can describe.

    If you run this, you'll see the numbers sort before the consonants, which sort before the vowels, which sort before the (artificial) diphthongs (CH, SH, TH, WH):

    #! perl -slw
    use strict;

    # Collation order: digits, consonants, vowels, then the two-character units.
    my @order = (
        0 .. 9,
        'B'..'D', 'F'..'H', 'J'..'N', 'P'..'T', 'V'..'Z',
        'b'..'d', 'f'..'h', 'j'..'n', 'p'..'t', 'v'..'z',
        'A', 'E', 'I', 'O', 'U',
        'a', 'e', 'i', 'o', 'u',
        'CH', 'SH', 'TH', 'WH',
        'ch', 'sh', 'th', 'wh',
    );

    # Longest alternatives first, so 'CH' is tried before 'C'.
    my $re = join '|', sort{ length $b <=> length $a } @order;

    # Map each unit to a single character whose code point is its rank.
    my $n = 0;
    my %map = map{ $_ => chr( $n++ ) } @order;

    # Rewrite a word into its sort key; unmapped characters pass through unchanged.
    sub trans {
        my $in = shift;
        $in =~ s[($re)]{ $map{ $1 } }ge;
        return $in;
    }

    chomp( my @data = map{ split ' ' } <DATA> );
    my @sorted = sort{ trans( $a ) cmp trans( $b ) } @data;
    print for @sorted;

    __DATA__
    I've been asked to help with a project involving some Lao script. I need
    to alphabetize lists of words in Lao. However, Lao characters are only
    barely defined in Perl, e.g. \p{InLao} to identify a Lao character, and I
    have been unable to find a predefined localedef or similar for Lao.
    Searching perlmonks revealed virtually nothing on localedef, and as it
    turns out, perl may use it, but it seems to come from a C library. It
    appears a new Lao alphabet routine is needed. I may have to generate the
    rules for's the tough part: Lao is not a typical job for an alphabetic
    sort. Lao words are first sorted by consonant order. Vowels follow
    consonants in terms of alphabetical order, but not necessarily in terms of
    chronological order. For example, some vowels appear before the consonant
    even though they are pronounced after the consonant, and the alphabetical
    order follows pronunciation. After the typical list of single-character
    consonants, Lao has some "diphthong" consonants (double-character ones)
    which have their own alphabetical placements. All of this adds up to a
    challenging puzzle for a perl enthusiast. I welcome your thoughts on how
    this could be done, and/or how it should be done in a way that would
    follow standard practice and be able to serve the entire Perl community
    for Lao script.
    I have already developed a "" module (not yet submitted to CPAN, and may
    need to use a different namespace) that will identify Lao characters by
    consonant, vowel, punctuation, and tone marks, and will further classify
    the consonants by their Lao classes (high/mid/low). So I have the tools
    for distinguishing at the character level, e.g.
    \p{Lao::InLaoCons}\p{Lao::InLaoTone}\p{Lao::InLaoVowel}, but need to map
    the characters to an alphabetical order, and this part seems beyond my
    experience.
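
    For longer lists, the same hash-plus-regex idea can be combined with a Schwartzian transform so that each sort key is built once rather than inside every comparison. What follows is only a minimal sketch under stated assumptions: the toy alphabet and the names sort_key and sort_words are hypothetical, chosen for illustration, and are not taken from the post above or from any Lao module.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical toy alphabet: consonants first, then vowels, then digraphs.
my @order = ( 'b', 'c', 'h', 's', 't', 'a', 'e', 'i', 'o', 'ch', 'sh', 'th' );

# Longest alternatives first, so 'ch' wins over 'c' followed by 'h'.
my $re = join '|',
    map  { quotemeta($_) }
    sort { length($b) <=> length($a) } @order;

# Each unit maps to one character whose code point is its rank.
my $n = 0;
my %rank = map { $_ => chr( $n++ ) } @order;

# Build the sort key for a single word.
sub sort_key {
    my ($word) = @_;
    ( my $key = $word ) =~ s/($re)/$rank{$1}/g;
    return $key;
}

# Compute every key once, sort on the keys, then recover the words.
sub sort_words {
    return map  { $_->[1] }
           sort { $a->[0] cmp $b->[0] }
           map  { [ sort_key($_), $_ ] } @_;
}

my @sorted = sort_words(qw( chat these host ache bat tie seat shoe cab ));
print "$_\n" for @sorted;
```

    With this toy alphabet the words come out as bat, cab, host, seat, tie, ache, chat, shoe, these: consonant-initial words first, then the vowel-initial one, then those starting with a digraph. Characters outside the alphabet are left unmapped, so a real collation would need to decide where they belong.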

    Examine what is said, not who speaks -- Silence betokens consent -- Love the truth but pardon error.
    "Science is about questioning the status quo. Questioning authority".
    In the absence of evidence, opinion is indistinguishable from prejudice.
