Re: sorting by numbers then alphabetically
by eff_i_g (Curate) on Dec 23, 2010 at 17:18 UTC
|
| [reply] |
|
I was just looking at that, but it doesn't do what I want, does it? It sorts by letters and then by numbers, doesn't it?
| [reply] |
|
Per the docs:
Under natural sorting, strings are splitted at word and number boundaries, and the resulting substrings are compared as follows:
* numeric substrings are compared numerically
* alphabetic substrings are compared lexically
* numeric substrings come always before alphabetic substrings
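To make the three quoted rules concrete, here is a minimal hand-rolled sketch of a comparison that follows them. It is not the module's implementation; the helper names `tokens` and `natural_cmp` are made up for illustration, and it assumes the strings contain only ASCII letters and digits.

```perl
use strict;
use warnings;

# Split a string into runs of digits and runs of letters.
sub tokens { return $_[0] =~ /(\d+|[[:alpha:]]+)/g }

# Compare two strings token by token, following the three quoted rules.
sub natural_cmp {
    my @x = tokens($_[0]);
    my @y = tokens($_[1]);
    while (@x && @y) {
        my ($p, $q) = (shift @x, shift @y);
        my $p_num = $p =~ /^\d/;
        my $q_num = $q =~ /^\d/;
        my $r = $p_num && $q_num ? $p <=> $q    # both numeric: compare as numbers
              : $p_num           ? -1           # a number comes before letters
              : $q_num           ?  1
              :                     $p cmp $q;  # both alphabetic: compare as strings
        return $r if $r;
    }
    return @x <=> @y;   # equal so far: the string with tokens left over sorts last
}

my @sorted = sort { natural_cmp($a, $b) } qw(X 10 MT 2 Y 1);
print "@sorted\n";   # 1 2 10 MT X Y
```

With purely numeric or purely alphabetic names, as in the question, each string is a single token, so this gives numbers first (in numeric order) and then letters.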
| [reply] |
|
I came back and looked at this thing. It looks like it would do something close to what you want. From your problem description, my understanding is that each string is either purely numeric or purely alphabetic, so it doesn't appear that any substring splitting would happen. However, the "non-module" sort function I show below is easy to write and also incorporates the case-insensitive feature.
Under natural sorting, strings are splitted at word and number
boundaries, and the resulting substrings are compared as follows:
* numeric substrings are compared numerically
* alphabetic substrings are compared lexically
* numeric substrings come always before alphabetic substrings
| [reply] [d/l] |
Re: sorting by numbers then alphabetically
by Tux (Canon) on Dec 23, 2010 at 17:27 UTC
|
my @sorted =
map { $_->[0] }
sort { $a->[1] <=> $b->[1] || $a->[2] cmp $b->[2] }
map { my $s = $_->seq_region_name;
[ $_, $s =~ m/^[0-9]+$/ ? $s : 999, $s ]
}
@unsorted;
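For anyone wondering how the chain above works, here is a sketch that unpacks it into three separate steps. It uses plain strings as stand-ins for the seq_region_name values (an assumption, since the real objects are not shown), and it lower-cases the text key, which the code above does not do; as posted, the cmp is case sensitive.

```perl
use strict;
use warnings;

my @unsorted = qw(X 10 MT 2 Y 1);   # hypothetical stand-ins for seq_region_name values

# Step 1 (decorate): pair each item with its precomputed sort keys.
# Layout: [ original, numeric key, lower-cased text key ]
my @decorated = map {
    [ $_, /^[0-9]+$/ ? $_ : 999, lc $_ ]
} @unsorted;

# Step 2 (sort): order by the numeric key first, then by the text key.
my @ordered = sort { $a->[1] <=> $b->[1] || $a->[2] cmp $b->[2] } @decorated;

# Step 3 (undecorate): throw the keys away and keep the original items.
my @sorted = map { $_->[0] } @ordered;

print "@sorted\n";   # 1 2 10 MT X Y
```

The 999 is a sentinel that pushes every non-numeric name behind the numbers, so this only works as long as the numeric names stay below 999.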
Enjoy, Have FUN! H.Merijn
| [reply] [d/l] |
|
That worked a treat, though I have no idea how it works! Is it case sensitive? I don't want it to be. If I could manipulate the object properties directly, I would set the properties and test, but I can't.
| [reply] |
|
That uses a technique called the Schwartzian Transform, but you don't need it here. Beyond that, I'm not 100% sure the comparison will work in all cases, though it may very well. Below is another way.
Basically: compare using "cmp" if both things contain a non-digit, and use "<=>" if both things are pure digits. If there is a mixture (one thing is pure digits, the other is not), the pure-digit thing comes first in the sort order. The sort subroutine needs to return a negative value (a<b), 0 (a=b), or a positive value (a>b), typically -1, 0, or 1, and you can use any code you want to do that. For example, you might want a function that sorts the days of the week: in Europe weeks start on Monday, but in the US weeks start on Sunday. You could do that with a subroutine that returns the appropriate -1, 0, 1 values. Cool.
@sorted = sort by_num_then_letter @unsorted;
sub by_num_then_letter
{
    my $a_region = lc ($a->region_name);
    my $b_region = lc ($b->region_name);
    my $a_non_digit = ($a_region =~ /\D/);
    my $b_non_digit = ($b_region =~ /\D/);
    if ($a_non_digit and $b_non_digit)
    {
        return $a_region cmp $b_region;
    }
    elsif (!$a_non_digit and !$b_non_digit)
    {
        return $a_region <=> $b_region;
    }
    elsif ($a_non_digit)  # only b is pure digits, and digits always sort first
    {
        return 1;   # a > b
    }
    else                  # only a is pure digits
    {
        return -1;  # a < b
    }
}
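The weekday idea mentioned above could look something like the sketch below. The helper name `week_order` and the sample input are made up for illustration; the point is only that the comparator can be any code that returns a negative, zero, or positive value.

```perl
use strict;
use warnings;

my @days = qw(Sunday Monday Tuesday Wednesday Thursday Friday Saturday);

# Build a comparator for a week that starts on the given index into @days
# (0 = Sunday for the US convention, 1 = Monday for the European one).
sub week_order {
    my ($first) = @_;
    my %rank;
    $rank{ $days[ ($first + $_) % 7 ] } = $_ for 0 .. 6;
    return sub { $rank{ $_[0] } <=> $rank{ $_[1] } };
}

my $us = week_order(0);
my $eu = week_order(1);

my @input     = qw(Wednesday Sunday Monday);
my @us_sorted = sort { $us->($a, $b) } @input;
my @eu_sorted = sort { $eu->($a, $b) } @input;

print "@us_sorted\n";   # Sunday Monday Wednesday
print "@eu_sorted\n";   # Monday Wednesday Sunday
```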
| [reply] [d/l] [select] |
Re: sorting by numbers then alphabetically
by JavaFan (Canon) on Dec 23, 2010 at 17:24 UTC
|
Untested, beware of typos:
$ref = [map {$$_[0]}
sort {$$a[1] cmp $$b[1]}
map {my $srn = $_->seq_region_name;
[$_, sprintf "%03d;%s", ($srn =~ /[0-9]/ ? ($srn, "") : (0, $srn))]}
@$ref];
| [reply] [d/l] |
|
$ref = [
map { $_->[0] }
sort { $a->[1] cmp $b->[1] }
map { my $srn = lc $_->seq_region_name;
[ $_, pack "nA*", $srn =~ /^[0-9]+$/ ? $srn : 999, $srn ]  # "n" is big-endian, so cmp orders the packed numbers correctly on any platform
}
@$ref ];
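The replies above fold both sort keys into one string so that a single cmp does the whole job. A rough sketch of that idea with a printable key, so the keys can be inspected, might look like this. It assumes plain strings instead of the real objects, numeric names below 999, and a made-up helper name `key_for`.

```perl
use strict;
use warnings;

my @names = qw(X 10 MT 2 Y 1);   # hypothetical stand-ins for seq_region_name values

# Build one string key per name: a zero-padded number, then the lower-cased text.
sub key_for {
    my ($name) = @_;
    my $num = $name =~ /^[0-9]+$/ ? $name : 999;   # 999 = sentinel for non-numeric names
    return sprintf '%03d;%s', $num, lc $name;
}

my @sorted =
    map  { $_->[0] }
    sort { $a->[1] cmp $b->[1] }
    map  { [ $_, key_for($_) ] }
    @names;

print key_for($_), "\n" for @names;   # 999;x  010;10  999;mt  002;2  999;y  001;1
print "@sorted\n";                    # 1 2 10 MT X Y
```

Zero-padding is what makes a string comparison agree with numeric order here: "002" sorts before "010", whereas "2" would sort after "10".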
Enjoy, Have FUN! H.Merijn
| [reply] [d/l] [select] |
Re: sorting by numbers then alphabetically
by AnomalousMonk (Archbishop) on Dec 23, 2010 at 18:44 UTC
|
Another way, using decorate-undecorate, and including the latest requirement for case-insensitive sorting (but see also Sort::Maker):
>perl -wMstrict -le
"my $ar = [
qw(X1 Mt10 mT2 y20 Y1 Y10 mt20 x20 X10)
];
;;
my @sorted =
map { undecorate($_) }
sort
map { print qq{'$_'}; $_ }
map { decorate($_) }
@$ar
;
;;
print qq{'$_'} for @sorted;
;;
use constant WIDTH => 10;
use constant SEP => ':';
;;
sub decorate {
my ($num) = $_[0] =~ m{ (\d+) }xms;
my ($alpha) = $_[0] =~ m{ ([[:alpha:]]+) }xms;
return sprintf '%0*d%s%s%s', WIDTH, $num, lc($alpha), SEP, $_[0];
}
;;
sub undecorate {
return substr $_[0], 1 + index $_[0], SEP;
}
"
'0000000001x:X1'
'0000000010mt:Mt10'
'0000000002mt:mT2'
'0000000020y:y20'
'0000000001y:Y1'
'0000000010y:Y10'
'0000000020mt:mt20'
'0000000020x:x20'
'0000000010x:X10'
'X1'
'Y1'
'mT2'
'Mt10'
'X10'
'Y10'
'mt20'
'x20'
'y20'
Note: The map { print qq{'$_'}; $_ } step above is just to show the effect of decoration.
Also: The two regexes could be combined into one in the decorate() function.
Update: See also Re: Help Manipulating Sort with Subroutines for another variation of this 'pattern'.
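As a rough sketch of the single-regex variant mentioned in the note above: this version assumes every item is one run of letters followed by one run of digits, as in the sample data, and it anchors the match to that shape. The original two regexes make no such assumption, so this is a narrower variation and not a drop-in replacement.

```perl
use strict;
use warnings;
use constant WIDTH => 10;
use constant SEP   => ':';

sub decorate {
    # One match captures both parts; assumes letters first, then digits.
    my ($alpha, $num) = $_[0] =~ m{ \A ([[:alpha:]]+) (\d+) \z }xms;
    return sprintf '%0*d%s%s%s', WIDTH, $num, lc($alpha), SEP, $_[0];
}

sub undecorate {
    return substr $_[0], 1 + index $_[0], SEP;
}

my @names     = qw(X1 Mt10 mT2 y20 Y1 Y10 mt20 x20 X10);
my @decorated = map { decorate($_) } @names;
my @sorted    = map { undecorate($_) } sort @decorated;

print "@sorted\n";   # X1 Y1 mT2 Mt10 X10 Y10 mt20 x20 y20
```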
| [reply] [d/l] [select] |