PerlMonks  

Re^4: best sort

by jpl (Monk)
on Aug 16, 2011 at 16:27 UTC (#920519)


in reply to Re^3: best sort
in thread best sort

If those were the things I was looking into, I would certainly not want a bare sort for any of them.

If you anticipate repeated elements, you'd probably want a tie-breaking

$a cmp $b
(the bare sort comparison function) on all but the first, because otherwise non-identical terms can compare equal and identical elements may not end up adjacent in the sorted output. But I think we are in fundamental agreement: you need to know what you are comparing and how. That may be easier said than done. Knowing that "machenry" is a Scottish name but "machinery" is not (or is it? and if not, why not?) makes "knowing what you are comparing" non-trivial.
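
To make the tie-breaking point concrete, here is a minimal sketch. The word list is made up, and lc is only an assumed stand-in for whatever folding the primary comparison really does; the point is just that a final $a cmp $b pulls identical strings together when the primary key cannot tell them apart.

```perl
use strict;
use warnings;

# Hypothetical sample data: the same words in two capitalizations.
my @words = qw(word Word apple word Word Apple);

# Primary key: case-insensitive comparison (lc is an assumption, standing in
# for the real folding). "word" and "Word" compare equal at this level.
# Tie-breaker: plain string comparison, so identical strings become adjacent.
my @sorted = sort { lc($a) cmp lc($b) || $a cmp $b } @words;

print "@sorted\n";    # prints: Apple apple Word Word word word
```

Without the || $a cmp $b, the relative order of "word" and "Word" would be left to the sort implementation rather than to the comparison.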


Re^5: best sort
by tchrist (Pilgrim) on Aug 16, 2011 at 17:30 UTC
    If those were the things I was looking into, I would certainly not want a bare sort for any of them.
    If you anticipate repeated elements, you'd probably want a tie-breaking
    $a cmp $b
    Well, sure. Here’s some of the code to generate one of those:
    say $_->{PHRASE} for sort {
            # most vowels first, then per-vowel counts, all descending
            $b->{TOTAL_VOWELS}  <=> $a->{TOTAL_VOWELS}
         || $b->{MAX_ANY_VOWEL} <=> $a->{MAX_ANY_VOWEL}
         || $b->{NUM_OF_A}      <=> $a->{NUM_OF_A}
         || $b->{NUM_OF_E}      <=> $a->{NUM_OF_E}
         || $b->{NUM_OF_I}      <=> $a->{NUM_OF_I}
         || $b->{NUM_OF_O}      <=> $a->{NUM_OF_O}
         || $b->{NUM_OF_U}      <=> $a->{NUM_OF_U}
         || $b->{NUM_OF_Y}      <=> $a->{NUM_OF_Y}
            # then ascending by folded text, and finally by record number
         || $a->{DICTFOLD}      cmp $b->{DICTFOLD}
         || $a->{RECNO}         <=> $b->{RECNO}
    } @records;
    Look more reasonable?
      It helps a lot with the "how I am comparing" part, although if anyone guessed in advance that this was how "By vowels:" sorted, I'd like to solicit their advice on the outcome of upcoming NFL games. It's still not exactly what I had in mind for bringing all identical terms together. The
      $a->{RECNO} <=> $b->{RECNO}
      is unnecessary if the sort is stable, as sort() is, by default. Since different words can compare equal under the influence of
      $a->{DICTFOLD} cmp $b->{DICTFOLD}
      (if I'm correctly guessing what DICTFOLD is), I still might see
      word Word word Word
      in the sorted output, when I might have preferred
      word word Word Word
      which makes it easier to determine if words are "unique" without repeating all the complicated logic. So I would prefer
      $a->{ORIGINAL} cmp $b->{ORIGINAL}
      as the final tie-breaker. That's neither better nor worse than your code, merely different.
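
A small sketch of the difference between the two final tie-breakers. The records are invented, and DICTFOLD is assumed here to be nothing more than a lowercased copy of ORIGINAL, which is only a guess at what the real field holds.

```perl
use strict;
use warnings;

# Hypothetical records. DICTFOLD => lc(...) is an assumption for illustration.
my @input = qw(word Word word Word);
my $recno = 0;
my @records = map { +{ ORIGINAL => $_, DICTFOLD => lc $_, RECNO => $recno++ } } @input;

# Final tie-breaker on record number: equal folded keys stay in input order.
my @by_recno = sort {
    $a->{DICTFOLD} cmp $b->{DICTFOLD} || $a->{RECNO} <=> $b->{RECNO}
} @records;

# Final tie-breaker on the original string: identical strings are grouped.
my @by_original = sort {
    $a->{DICTFOLD} cmp $b->{DICTFOLD} || $a->{ORIGINAL} cmp $b->{ORIGINAL}
} @records;

my $recno_out    = join ' ', map { $_->{ORIGINAL} } @by_recno;
my $original_out = join ' ', map { $_->{ORIGINAL} } @by_original;

print "$recno_out\n";       # prints: word Word word Word
print "$original_out\n";    # prints: Word Word word word
```

With a plain cmp and no locale in effect, "Word" sorts ahead of "word", so the two groups come out in the opposite order from the illustration above. The identical terms are adjacent either way, which is the property being asked for.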
