PerlMonks  

sorting by numbers then alphabetically

by Anonymous Monk
on Dec 23, 2010 at 17:15 UTC

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hello monks

I am a Perl beginner and not sure how to do the following. I have an array ref to an array of objects, and I need to sort these objects by a particular property, accessed as $slice->seq_region_name. This property is text: some objects will contain letters (X, Y or MT) and the others will contain numbers (normally 1 to about 30). I would like to sort the $slice objects by seq_region_name so that the numbers come first, in ascending numerical order, and the letters come next, in alphabetical order.

thanks a lot

Replies are listed 'Best First'.
Re: sorting by numbers then alphabetically
by eff_i_g (Curate) on Dec 23, 2010 at 17:18 UTC
      I was just looking at that, but it doesn't do what I want, does it? It sorts by letters then numbers, doesn't it?
        Per the docs:
        Under natural sorting, strings are splitted at word and number boundaries, and the resulting substrings are compared as follows:

        * numeric substrings are compared numerically
        * alphabetic substrings are compared lexically
        * numeric substrings come always before alphabetic substrings
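The three rules quoted above can be sketched in plain Perl. This is only an illustration of the rules, not the module's actual implementation; the helper names natural_tokens and natural_cmp are made up for this example.

```perl
use strict;
use warnings;

# Hypothetical helper: split a string into runs of digits and runs of letters.
sub natural_tokens {
    my ($str) = @_;
    return $str =~ /(\d+|[[:alpha:]]+)/g;
}

# Hypothetical comparator that follows the three quoted rules.
sub natural_cmp {
    my ($x, $y) = @_;
    my @x = natural_tokens($x);
    my @y = natural_tokens($y);
    while (@x && @y) {
        my ($p, $q) = (shift @x, shift @y);
        my $p_num = $p =~ /^\d/;
        my $q_num = $q =~ /^\d/;
        my $r = $p_num && $q_num ? $p <=> $q     # numeric vs numeric
              : $p_num           ? -1            # numbers come before letters
              : $q_num           ?  1
              :                     $p cmp $q;   # alphabetic vs alphabetic
        return $r if $r;
    }
    return @x <=> @y;   # the string with substrings left over sorts last
}

my @names  = qw(X 10 MT 2 Y 1);
my @sorted = sort { natural_cmp($a, $b) } @names;
print "@sorted\n";   # 1 2 10 MT X Y
```

With names that are either all digits or all letters, as in the question, each string yields a single substring, so only the first pass of the loop matters.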
        I came back and looked at this thing. Looks like it would do close to what you want. From your problem description, my understanding is that you have just either numeric or alphabetic strings, so it doesn't appear that there would be any substring splitting. However, the "non-module" sort function I show below is easy to write and also incorporates the case-insensitive feature.
Re: sorting by numbers then alphabetically
by Tux (Canon) on Dec 23, 2010 at 17:27 UTC

    Schwartzian transform

    my @sorted =
        map  { $_->[0] }
        sort { $a->[1] <=> $b->[1] || $a->[2] cmp $b->[2] }
        map  {
            my $s = $_->seq_region_name;
            [ $_, $s =~ m/^[0-9]+$/ ? $s : 999, $s ]
        } @unsorted;

    Enjoy, Have FUN! H.Merijn
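For readers unsure how the one-liner above works, here is the same pipeline unrolled into three named steps. It is a sketch: the Slice package is a hypothetical stand-in for the real objects and provides only seq_region_name.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the real slice objects.
{
    package Slice;
    sub new             { my ($class, $name) = @_; return bless { name => $name }, $class }
    sub seq_region_name { return $_[0]{name} }
}

my @unsorted = map { Slice->new($_) } qw(X 10 MT 2 Y 1);

# Step 1 (decorate): pair each object with a numeric key and a string key.
# Non-numeric names get 999 so they land after every real number.
my @decorated = map {
    my $s = $_->seq_region_name;
    [ $_, ($s =~ /^[0-9]+$/ ? $s : 999), $s ];
} @unsorted;

# Step 2 (sort): numeric key first; ties (all the 999s) fall back to string order.
my @ordered = sort { $a->[1] <=> $b->[1] || $a->[2] cmp $b->[2] } @decorated;

# Step 3 (undecorate): drop the keys and keep the original objects.
my @sorted = map { $_->[0] } @ordered;

print join(' ', map { $_->seq_region_name } @sorted), "\n";   # 1 2 10 MT X Y
```

The string comparison with cmp is case sensitive. Storing lc($s) as the third element in step 1 would make the alphabetic part case-insensitive.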
      That worked a treat, though I have no idea how it works! Is it case sensitive? I don't want it to be. If I could manipulate the object properties directly I would set them and test, but I can't.
        That uses a technique called a Schwartzian Transform, but you don't need to use it here. Aside from that, I'm not 100% sure that the comparison is going to work in all cases. It may very well. Below is another way.

        Basically: compare using "cmp" if both things contain a non-digit; use "<=>" if both things are pure digits; if there is a mixture (one thing is pure digits, one thing is not), then the pure-digit thing appears before the other in sort order. The sort subroutine needs to return -1 (a<b), 0 (a=b) or 1 (a>b), and you can use any code you want to produce that.

        Perhaps you would want, say, a function that sorts the days of the week: in Europe weeks start on Monday, but in the US weeks start on Sunday. You could do that by making a subroutine that returns the appropriate -1, 0, 1 values. Cool.

        @sorted = sort by_num_then_letter @unsorted;

        sub by_num_then_letter {
            # lowercase both names so the comparison is case-insensitive
            my $a_region = lc($a->seq_region_name);
            my $b_region = lc($b->seq_region_name);

            my $a_non_digit = ($a_region =~ /\D/);
            my $b_non_digit = ($b_region =~ /\D/);

            if ($a_non_digit and $b_non_digit) {
                return $a_region cmp $b_region;    # both contain letters: compare as strings
            }
            elsif (!$a_non_digit and !$b_non_digit) {
                return $a_region <=> $b_region;    # both pure digits: compare as numbers
            }
            elsif ($a_non_digit) {
                return 1;     # a has letters, b is digits only: a sorts after b
            }
            else {
                return -1;    # a is digits only, b has letters: a sorts before b
            }
        }
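The days-of-the-week idea mentioned above could be sketched along these lines. The make_day_sorter helper and the English day names are assumptions made for the example; the point is only that the comparison returns a negative, zero or positive value based on a rank table.

```perl
use strict;
use warnings;

my @days = qw(Sunday Monday Tuesday Wednesday Thursday Friday Saturday);

# Hypothetical helper: build a comparator for a given first day of the week.
sub make_day_sorter {
    my ($first_day) = @_;
    my ($offset) = grep { $days[$_] eq $first_day } 0 .. $#days;
    die "unknown day: $first_day" unless defined $offset;

    # Rank of each day, counted from the chosen first day.
    my %rank;
    $rank{ $days[ ($offset + $_) % 7 ] } = $_ for 0 .. 6;

    return sub { $rank{ $_[0] } <=> $rank{ $_[1] } };
}

my $us = make_day_sorter('Sunday');
my $eu = make_day_sorter('Monday');

my @input     = qw(Wednesday Sunday Monday Saturday);
my @us_sorted = sort { $us->($a, $b) } @input;
my @eu_sorted = sort { $eu->($a, $b) } @input;

print "@us_sorted\n";   # Sunday Monday Wednesday Saturday
print "@eu_sorted\n";   # Monday Wednesday Saturday Sunday
```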
Re: sorting by numbers then alphabetically
by JavaFan (Canon) on Dec 23, 2010 at 17:24 UTC
    Untested, beware of typos:
    $ref = [
        map  { $$_[0] }
        sort { $$a[1] cmp $$b[1] }
        map  {
            my $srn = $_->seq_region_name;
            [ $_, sprintf "%03d;%s", ($srn =~ /[0-9]/ ? ($srn, "") : (+0, $srn)) ]
        } @$ref
    ];

      Using pack will speed this up enormously. Besides that, he wanted the numbers before the strings, so default to 999 instead of 0:

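      # "n" packs a big-endian 16-bit integer, so the byte-wise cmp agrees with
      # numeric order on any platform (native "s" only guarantees that on big-endian machines)
      $ref = [
          map  { $_->[0] }
          sort { $a->[1] cmp $b->[1] }
          map  {
              my $srn = lc $_->seq_region_name;
              [ $_, pack "nA*", $srn =~ /^[0-9]+$/ ? $srn : 999, $srn ]
          } @$ref
      ];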

      Enjoy, Have FUN! H.Merijn
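To see why a packed key lets a plain cmp do the whole job, it can help to look at the bytes. The sketch below uses plain strings instead of objects and a hypothetical packed_key helper. It assumes the big-endian "n" template, which keeps byte order and numeric order in agreement for values up to 65535.

```perl
use strict;
use warnings;

# Hypothetical helper: build one string key per name so a plain cmp can order them.
sub packed_key {
    my ($name) = @_;
    my $lower = lc $name;
    # 16-bit big-endian number first (999 for non-numeric names), then the name itself.
    return pack 'nA*', ($lower =~ /^[0-9]+$/ ? $lower : 999), $lower;
}

my @names  = qw(X 10 mt 2 Y 1);
my @sorted = map  { $_->[0] }
             sort { $a->[1] cmp $b->[1] }
             map  { [ $_, packed_key($_) ] } @names;

print "@sorted\n";                             # 1 2 10 mt X Y
print unpack('H*', packed_key('10')), "\n";    # 000a3130
print unpack('H*', packed_key('MT')), "\n";    # 03e76d74
```

The first two bytes carry the number, so every numeric name sorts before the 999 group, and within that group the lowercased name decides the order.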
Re: sorting by numbers then alphabetically
by AnomalousMonk (Archbishop) on Dec 23, 2010 at 18:44 UTC

    Another way, using decorate-undecorate, and including the latest requirement for case-insensitive sorting (but see also Sort::Maker):

    >perl -wMstrict -le
    "my $ar = [ qw(X1 Mt10 mT2 y20 Y1 Y10 mt20 x20 X10) ];
     ;;
     my @sorted =
       map { undecorate($_) }
       sort
       map { print qq{'$_'}; $_ }
       map { decorate($_) }
       @$ar
       ;
     ;;
     print qq{'$_'} for @sorted;
     ;;
     use constant WIDTH => 10;
     use constant SEP => ':';
     ;;
     sub decorate {
       my ($num) = $_[0] =~ m{ (\d+) }xms;
       my ($alpha) = $_[0] =~ m{ ([[:alpha:]]+) }xms;
       return sprintf '%0*d%s%s%s', WIDTH, $num, lc($alpha), SEP, $_[0];
       }
     ;;
     sub undecorate {
       return substr $_[0], 1 + index $_[0], SEP;
       }
    "
    '0000000001x:X1'
    '0000000010mt:Mt10'
    '0000000002mt:mT2'
    '0000000020y:y20'
    '0000000001y:Y1'
    '0000000010y:Y10'
    '0000000020mt:mt20'
    '0000000020x:x20'
    '0000000010x:X10'
    'X1'
    'Y1'
    'mT2'
    'Mt10'
    'X10'
    'Y10'
    'mt20'
    'x20'
    'y20'

    Note: The  map { print qq{'$_'};  $_ } step above is just to show the effect of decoration.

    Also: The two regexes could be combined into one in the  decorate() function.
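One way the combined regex might look is sketched below. It assumes every name is letters followed by digits, as in the sample data, whereas the two separate regexes accept either order; the die on a non-matching name is also an addition for this example.

```perl
use strict;
use warnings;

use constant WIDTH => 10;
use constant SEP   => ':';

# A single regex captures the leading letters and the trailing digits together.
sub decorate {
    my ($name) = @_;
    my ($alpha, $num) = $name =~ m{ \A ([[:alpha:]]+) (\d+) \z }xms
        or die "unexpected name: $name";
    return sprintf '%0*d%s%s%s', WIDTH, $num, lc($alpha), SEP, $name;
}

# Everything after the first separator is the original name.
sub undecorate {
    my ($decorated) = @_;
    return substr $decorated, 1 + index $decorated, SEP;
}

my @names     = qw(X1 Mt10 mT2 y20 Y1 Y10 mt20 x20 X10);
my @decorated = map { decorate($_) } @names;
my @sorted    = map { undecorate($_) } sort @decorated;

print "@sorted\n";   # X1 Y1 mT2 Mt10 X10 Y10 mt20 x20 y20
```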

    Update: See also Re: Help Manipulating Sort with Subroutines for another variation of this 'pattern'.
