The reason this is hard to do is that you can't simply compare the strings character by character. You have to break your input strings into larger chunks: the letter chunks and the digit chunks. Then you have to, essentially, do multiple sorts on any two strings, alternating between letters (compared as strings) and digits (compared as numbers). And you have to keep them in sync (i.e. make sure you are always comparing digits to digits and letters to letters).
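To make the problem concrete, here is a minimal sketch (not part of the solution below) showing what a plain string sort does to a9 and a12, and how a string breaks into chunks. The chunks helper is a hypothetical name used only for illustration, and it assumes the strings contain nothing but ASCII letters and digits.

```perl
use strict;
use warnings;

# A plain string sort compares character by character, so '1' lt '9'
# puts "a12" ahead of "a9".
my @plain = sort { $a cmp $b } qw(a9 a12);
print "plain sort: @plain\n";

# Hypothetical helper: split a string into alternating runs of letters and digits.
sub chunks {
    my ($s) = @_;
    return $s =~ /([A-Za-z]+|\d+)/g;
}

print join(' | ', chunks('a2bb5')), "\n";
```

The first line printed shows a12 before a9, which is the wrong order for this purpose. The second shows the pieces (a, 2, bb, 5) that have to be compared pairwise instead.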
This is my attempt. I imagine it can be improved upon. The Schwartzian Transform avoids a lot of duplicate work by splitting each string into its letter/digit components once, in the initial map, rather than on every comparison. The regex always creates a field for letters first, even if it is empty. That keeps them in sync. The sort routine itself is pretty straightforward, but it is lengthy, so I put it in its own function.
my %hash = map {split} <DATA>;
for my $key ( map  { $_->[0] }
              sort custom_sort
              map  {[ $_, $hash{$_} =~ /([A-Z]*)(\d+|[A-Z]+)/ig ]}
              keys %hash ) {
    print "Key = $key, Value = $hash{$key}\n";
}
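sub custom_sort {
    # Element 0 is the hash key; the (letters, digits) pairs start at index 1.
    my $i = 1;
    for (;;) {
        # Whichever string runs out of fields first sorts first.
        return -1 unless defined $a->[$i];
        return 1 unless defined $b->[$i];
        # Compare the letter fields as strings, then the following fields as numbers.
        my $c = $a->[$i] cmp $b->[$i] || $a->[$i+1] <=> $b->[$i+1];
        return $c if $c;
        $i += 2;
    }
}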
__DATA__
a 2
b 1
c 12
d a12
e a9
f a2b3
g a2b4
h b2b4
i 12
j a2bb5
k a2bb5a
l a2bb5b
The output from that:
Key = b, Value = 1
Key = a, Value = 2
Key = c, Value = 12
Key = i, Value = 12
Key = f, Value = a2b3
Key = g, Value = a2b4
Key = j, Value = a2bb5
Key = l, Value = a2bb5b
Key = k, Value = a2bb5a
Key = e, Value = a9
Key = d, Value = a12
Key = h, Value = b2b4
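Caveat: This code assumes your strings only ever contain digits and letters. You'll have to modify the regex to suit your needs if other characters are to be permitted. Also, capitalization matters: the /i flag lets the regex match both cases, but cmp still sorts uppercase before lowercase. Finally, a trailing run of letters with no digits after it lands in the numeric slot, where <=> treats it as 0, so a2bb5a and a2bb5b compare as equal and can come out in either order (as l and k do in the output above).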
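If trailing letters and case matter for your data, one possible variation is sketched below. It is not the code above. It tokenizes each value into [letters, digits] pairs, so a run of letters with no digits after it still lands in a letter slot and is compared with cmp rather than <=>. The names tokenize and natural_cmp are hypothetical, the lc calls are an assumption that you want case ignored, and it still assumes values made only of ASCII letters and digits.

```perl
use strict;
use warnings;

# Hypothetical helper: break a string into [letters, digits] pairs.
# Either half of a pair may be empty, but never both.
sub tokenize {
    my ($s) = @_;
    my @pairs;
    while ($s =~ /([A-Za-z]*)(\d*)/g) {
        last if $1 eq '' && $2 eq '';   # end of string or an unexpected character
        push @pairs, [ $1, $2 ];
    }
    return \@pairs;
}

# Hypothetical comparator over two tokenized values.
sub natural_cmp {
    my ($x, $y) = @_;
    my $n = @$x < @$y ? @$x : @$y;
    for my $i (0 .. $n - 1) {
        my ($xl, $xd) = @{ $x->[$i] };
        my ($yl, $yd) = @{ $y->[$i] };
        # Letters as strings (case ignored), then digits as numbers.
        # A missing digit run counts as -1 so it sorts before any number.
        my $c = lc($xl) cmp lc($yl)
             || ($xd eq '' ? -1 : $xd) <=> ($yd eq '' ? -1 : $yd);
        return $c if $c;
    }
    return @$x <=> @$y;   # fewer pairs sorts first; 0 when the values match
}

my %hash = ( a => '2', b => '1', d => 'a12', e => 'a9',
             j => 'a2bb5', k => 'a2bb5a', l => 'a2bb5b' );

# Same Schwartzian Transform shape: tokenize once, sort, then unwrap.
# Equal values fall back to comparing the keys so the order is repeatable.
my @sorted_keys =
    map  { $_->[0] }
    sort { natural_cmp($a->[1], $b->[1]) || $a->[0] cmp $b->[0] }
    map  { [ $_, tokenize($hash{$_}) ] }
    keys %hash;

print "Key = $_, Value = $hash{$_}\n" for @sorted_keys;
```

With this sample data the keys come out as b, a, j, k, l, e, d, so a2bb5a now sorts ahead of a2bb5b.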
-sauoq
"My two cents aren't worth a dime.";