Beefy Boxes and Bandwidth Generously Provided by pair Networks
Problems? Is your data what you think it is?

Comment on

( #3333=superdoc: print w/replies, xml ) Need Help??


Regarding "my_sort_index": Oh, yeah, I totally forgot to write that.

Well, when writing comparison routines, you first want to determine what all the cases are that you want to handle, and what to do for each case. For ensuring that strings beginning with "index." sort to the top of the list, there are four cases:

  1. Neither string has the desired prefix: For this case, we simply return 0 and let the next bit of code choose the order.
  2. Only the first string has the desired prefix: Here, we return -1 to make the first string rise to the top of the list.
  3. Only the second string has the desired prefix: Here, we return +1 to make the second string rise to the top of the list.
  4. Both strings have the desired prefix: In this case, we don't really care which one is first. The first three cases ensure that the strings with the prefix float to the top. So we'll return 0 to let the next bit of code check the rest of the order.

So, here's my (untested!) version of the my_sort_index routine:

sub my_sort_index { my ($c, $d) = @_; # We want any item starting with "index." to sort to the top of # the list. The rest of the ordering is handled by the caller. # So there are four cases we must handle: # # Matches? # $c $d return # No No 0 # Yes No -1 # No Yes +1 # Yes Yes 0 (-1 + +1 == 0) # # In the fourth case, when both strings have the "index." prefix, # we want them to sort via the normal method so that "index.1" wil +l # always sort before "index.2". my $return_value = 0; --$return_value if $c =~ /^index\./; ++$return_value if $d =~ /^index\./; return $return_value; }

We could enforce the order in the fourth case, but it would make the subroutine a bit less useful. Why? It makes the subroutine harder to use with other subroutines for a sorting job, because it forces an order beyond just making sure that strings prefixed with "index." go first. In other words, if neither string began with index, it would use a different ordering than if both strings began with index. That can be problem.

Suppose you want all strings with "index." to sort to the top of the list, but you want everything sorted in reverse order beyond that, you'd have to have another version of your code. With the code we have now, though, we could do:

my @sorted = sort { my_index_sort($a,$b) || $b cmp $a } @list;

Regarding the Schwartzian transform: I wanted you to be aware of the technique, because it can let you do some pretty interesting things. When I use them, it's generally because I think to myself, "Oh, this would be *so* much easier to sort if everything looked like XYZ". Then I try to figure out an easy way to take what I have and turn it into XYZ.

In the case of sorting names, for example. As you know, to handle names well, you'll have a good bit of testing and comparison and such to arrange the name(s). and there are quite a few special cases to worry about. In this case, I'd think to myself. "Oh, this would be *so* much easier to sort if the names were split into first, middle and last name with the extra bits at the end. Then I could just sort them normally." Then I figure out how to do that.

sub convert_name_to_L_F_M_Extras { # Lots of ugly code to split the name into bits, determine which b +it goes where, and # return an array of [ LastName, FirstName, MiddleName, Extras ], +like so: # "Dr. John Smith" => [ Smith, John, undef, Dr. ] # "Eucalyptus A. Tree, Esquire" => [ Tree, Eucalyptus, A., Esquire + ] return [ $last, $first, $middle, $rest ]; } my @list = ( "Dr. John Smith", "Eucalyptus A. Tree, Esquire", "Robotic +us", "Lady Aleena, Baker of cookies" ); my @transformed_list = map { [ convert_name_to_L_F_M_Extras($_), $_ ], + } @list; # Now looks like: # @transformed_list = ( # [ [ "Smith", "John", undef, "Dr" ], "Dr. John Smith" +], # [ [ "Tree", "Eucalyptus", "A.", "Esquire" ], "Eucalyptus A. Tr +ee, Esquire" ], # ... # ); sub sort_by_name { # $a and $b contain an arrayref, the first entry has an arrayref w +ith last, first, etc. # Sort by last name: $a->[0][0] cmp $b->[0][0] # If they have the same last name, sort by first name: || $a->[0][1] cmp $b->[0][1] ... etc ... } my @sorted = sort {sort_by_name($a,$b)} @transformed_list; # Now throw away the transformed names and keep only the list of sorte +d names. @sorted = map { $_->[1] } @sorted;

Now, having said all that, I mainly mentioned it to help you shift your perspective on sorting. I don't know that you'll necessarily be able to use it, and the case I created above is my best guess of how it *might* be useful to you.


When your only tool is a hammer, all problems look like your thumb.

In reply to Re^3: RFC: I rewrote a custom sort function, but not sure if it is the best way. by roboticus
in thread RFC: I rewrote a custom sort function, but not sure if it is the best way. by Lady_Aleena

Use:  <p> text here (a paragraph) </p>
and:  <code> code here </code>
to format your post; it's "PerlMonks-approved HTML":

  • Posts are HTML formatted. Put <p> </p> tags around your paragraphs. Put <code> </code> tags around your code and data!
  • Titles consisting of a single word are discouraged, and in most cases are disallowed outright.
  • Read Where should I post X? if you're not absolutely sure you're posting in the right place.
  • Please read these before you post! —
  • Posts may use any of the Perl Monks Approved HTML tags:
    a, abbr, b, big, blockquote, br, caption, center, col, colgroup, dd, del, div, dl, dt, em, font, h1, h2, h3, h4, h5, h6, hr, i, ins, li, ol, p, pre, readmore, small, span, spoiler, strike, strong, sub, sup, table, tbody, td, tfoot, th, thead, tr, tt, u, ul, wbr
  • You may need to use entities for some characters, as follows. (Exception: Within code tags, you can put the characters literally.)
            For:     Use:
    & &amp;
    < &lt;
    > &gt;
    [ &#91;
    ] &#93;
  • Link using PerlMonks shortcuts! What shortcuts can I use for linking?
  • See Writeup Formatting Tips and other pages linked from there for more info.
  • Log In?

    What's my password?
    Create A New User
    and all is quiet...

    How do I use this? | Other CB clients
    Other Users?
    Others studying the Monastery: (8)
    As of 2018-06-20 16:17 GMT
    Find Nodes?
      Voting Booth?
      Should cpanminus be part of the standard Perl release?

      Results (116 votes). Check out past polls.