in reply to Re^3: sorting Chinese characters
As it turns out, when I re-sorted the index in a debugging session in C# and diffed the Perl index against the C# index, there were fewer differences than unreachable keys. A large block of mis-sorted entries was disrupting the binary search for nearby entries (including 'one'), and two or three Chinese characters sorted differently between the two languages in a few places, which caused the rest of the dead ends.
By excluding these few entries during index creation, I now have a 100% match.
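For anyone chasing the same kind of dead ends: a binary search only works if the index is sorted under the same comparison the searcher uses, so one quick check is to scan for adjacent keys that break that order. This is only a minimal sketch, not the code I actually ran. The sub name out_of_order_positions and the sample keys are made up for illustration, and the comparator is assumed to be whatever the reading side uses.

```perl
use strict;
use warnings;

# Return the positions i where $keys->[i] sorts after $keys->[i+1]
# under the comparator $cmp, i.e. where the sorted-order invariant
# that a binary search relies on is broken.
sub out_of_order_positions {
    my ($keys, $cmp) = @_;
    my @bad;
    for my $i (0 .. $#{$keys} - 1) {
        push @bad, $i if $cmp->($keys->[$i], $keys->[$i + 1]) > 0;
    }
    return @bad;
}

# Hypothetical index keys; 'one' is deliberately misplaced.
my @index = qw(apple one banana cherry);
my @bad   = out_of_order_positions(\@index, sub { $_[0] cmp $_[1] });
print "order breaks after position $_ ($index[$_])\n" for @bad;
```

Any key reported here can be fixed or left out when the index is built, so the searcher never walks into a region that is out of order.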
If I get some time, it'll be interesting to find out why those few characters are being sorted in a different order. It might be a bug in either C# or Perl.
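One possible explanation, and it is only a guess since I have not checked which characters are involved: Perl's cmp on decoded strings compares by Unicode code point, whereas an ordinal comparison in C# works on UTF-16 code units, and the two orders disagree for characters outside the Basic Multilingual Plane. The sketch below reproduces that disagreement from the Perl side. The two sample characters and the sub names are chosen for illustration and are not taken from my index.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Compare by Unicode code point, which is what cmp does on
# decoded character strings when 'use locale' is not in effect.
sub by_codepoint {
    my ($x, $y) = @_;
    return $x cmp $y;
}

# Compare by UTF-16 code unit: comparing the big-endian UTF-16
# bytes bytewise gives the same order as comparing 16-bit units.
sub by_utf16_unit {
    my ($x, $y) = @_;
    return encode('UTF-16BE', $x) cmp encode('UTF-16BE', $y);
}

my $ext_b  = "\x{20000}";  # CJK Extension B ideograph, outside the BMP
my $compat = "\x{F900}";   # CJK compatibility ideograph, inside the BMP

printf "code point order:  %d\n", by_codepoint($ext_b, $compat);
printf "UTF-16 unit order: %d\n", by_utf16_unit($ext_b, $compat);
```

The two results have opposite signs because U+20000 is stored in UTF-16 as a surrogate pair starting with 0xD840, which is below 0xF900. If the odd characters turn out to be inside the BMP, this is not the cause, and a culture-sensitive comparison on the C# side would be the next thing to look at.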
Anyway, thanks for your help.
larryk perl -le "s,,reverse killer,e,y,rifle,lycra,,print"