in reply to Re^5: Sorting Vietnamese text
in thread Sorting Vietnamese text
Thanks again, but what I'm looking for is that every word starting with ỳ comes before any word starting with ỷ, so even that sort order isn't quite right. Also, why are the entries with ỷ not all together? Instead of

sorted ỳ : ỷ : ỳ ạch : ỷ eo : yêu nhau : yêu quí : ỷ lại : ỷ thế :

it should be

sorted ỳ : ỳ ạch : ỷ : ỷ eo : ỷ lại : ỷ thế : yêu nhau : yêu quí :

This is how all paper dictionaries do it, regardless of which order they use for the tone marks. I'm beginning to wonder if I'm the only person who's ever cared about this before :)

By the way, the reason I'm doing this is that I'm planning to release a large (>50,000 words) Vietnamese-English dictionary (as a single UTF-8 file) under the CC license (essentially free to use for any purpose), and I'd like to make it available in "properly" sorted order. I've done similar projects for Chinese, Esperanto, and Interlingua already (see www.denisowski.org), but those are a lot easier to sort :)
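The order asked for above seems to amount to comparing entries syllable by syllable, and within each syllable comparing the letters first and the tone mark last. Here is a minimal sketch of that rule, written in Python rather than Perl purely for illustration. The alphabet order, the tone order, and the names `syllable_key` and `entry_key` are assumptions for this example, not something taken from the thread or from any particular dictionary.

```python
import unicodedata

# Assumed 29-letter Vietnamese alphabet order.
ALPHABET = "aăâbcdđeêghiklmnoôơpqrstuưvxy"
LETTER_RANK = {ch: i for i, ch in enumerate(ALPHABET)}

# Assumed tone order: no mark, grave, hook above, tilde, acute, dot below.
# Dictionaries differ here; change the numbers to match the one you follow.
TONE_RANK = {"\u0300": 1, "\u0309": 2, "\u0303": 3, "\u0301": 4, "\u0323": 5}


def syllable_key(syllable):
    """Return (letter ranks, tone rank) so the tone only breaks ties."""
    tone = 0
    letters = []
    for ch in unicodedata.normalize("NFD", syllable.lower()):
        if ch in TONE_RANK:
            # Tone marks are set aside and compared after the letters.
            tone = TONE_RANK[ch]
        elif unicodedata.combining(ch) and letters:
            # Circumflex, breve and horn belong to the letter (â, ă, ơ, ...).
            letters[-1] = unicodedata.normalize("NFC", letters[-1] + ch)
        else:
            letters.append(ch)
    # Characters outside ALPHABET sort after it, by code point.
    ranks = tuple(LETTER_RANK.get(ch, len(ALPHABET) + ord(ch)) for ch in letters)
    return (ranks, tone)


def entry_key(entry):
    """Compare whole entries one syllable at a time."""
    return tuple(syllable_key(s) for s in entry.split())


words = ["ỳ", "ỷ", "ỳ ạch", "ỷ eo", "yêu nhau", "yêu quí", "ỷ lại", "ỷ thế"]
print(" : ".join(sorted(words, key=entry_key)))
```

With this key, the syllable ỳ sorts before ỷ because only the tone differs, and both sort before yêu because the bare letter y is a prefix of yêu. Multi-syllable entries then group under their first syllable. Punctuation, hyphenated entries and capitalisation rules are not handled in this sketch.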
Any other ideas? Thanks again for the help!
Re^7: Sorting Vietnamese text
by Atacama (Sexton) on Dec 25, 2013 at 04:28 UTC