Beefy Boxes and Bandwidth Generously Provided by pair Networks
Perl Monk, Perl Meditation

Sorting an array on two computed fields

by jesuashok (Curate)
on Dec 16, 2005 at 10:05 UTC ( #517195=perlquestion: print w/replies, xml ) Need Help??
jesuashok has asked for the wisdom of the Perl Monks concerning the following question:


Please give me a Idea to sort the following array.

#!/usr/local/bin/perl my @array = ( '1-1', '2-5', '1-10', '1-3','2-1'); print "value :" , $_ , ":\n" foreach ( sort @array );
value :1-1: value :1-10: value :1-3: value :2-1: value :2-5:
But I want to have the following Result
value :1-1: value :1-3: value :1-10: value :2-1: value :2-5:
But I want the result should be in sorted order

"Keep pouring your ideas"

2005-12-18 Retitled by Arunbear, as per consideration.
Original title: 'Interpreting each and element of the array efficiently'

Replies are listed 'Best First'.
Re: Sorting an array on two computed fields
by davido (Archbishop) on Dec 16, 2005 at 10:11 UTC

    Something like this?

    my @sorted = map { $_->[2] } sort { $a->[0] <=> $b->[0] or $a->[1] <=> $b->[1] } map { [ split( /-/, $_ ), $_ ] } @unsorted;

    ...untested, but it ought to work. The biggest problem is figuring out the best use of whitespace in making such a construct look pretty.

    Update: For the record, I just tested it, and yes, it works as advertised. ;) The Schwartzian Transform isn't really a necessary part of it all, but to me, it makes it easier to read than if I had put all the splitting in the sort code block. Also, the or is used for "fall through". If the first comparison evaluates to equality, the logic falls through to the second comparison.

    Update-2: Here is it without the Schwartzian Transform. ...maybe it's not that ugly after all, though as the data set grows it may become less efficient:

    my @sorted = sort { my @a = split /-/, $a; my @b = split /-/, $b; $a[0] <=> $b[0] or $a[1] <=> $b[1] } @unsorted;


      Well, still using map, this seems a good candidate for Guttman-Rossler, and such an approach is also compatible with jesuashok's new requirement:
      #!/usr/bin/perl -l use strict; use warnings; my @data=qw/1-1 6 2-5 1-10 7 1-3 2-1/; print for map +(split)[1], sort map { (my $s=$_) =~ s/(\d+)/sprintf '%06d', $1/ge; "$s $_"; } @data; __END__

      Of course this is exclusively for illustrational purposes. In more complex or realistic situations he will have to find a suitable separator, provided that this is possible, and maybe he will have to work with index and substr instead, although pattern matching should be already optimized on fixed strings (as opposed to "real" patterns).

      Oh, and of course he must know in advance that the size of his numbers has an upper bound, which seems to be the case.

      it may become less efficient...

      It may, or it will? I guess what I am asking is, when does it pay to do the S.T. and why? Does it pay in this case? (I mean, if the data set was a lot larger?)

      UPDATE: Maybe an answer at Re: When does it pay to use the schwartzian transform?

      sorry for changing the requirement again what will happen if the array contains as follows:-
      my @array = ( '1-1', '6', '2-5', '1-10', '7', '1-3','2-1');
      "Keep pouring your ideas"

        ...expanding on my version "without Schwartzian Transform":

        my @sorted = sort { my @a = split /-/, $a; my @b = split /-/, $b; $a[0] <=> $b[0] or ( defined( $a[1] ) && defined( $b[1] ) and $a[1] <=> $b[1] ) } @unsorted;

        As before, the worst part is figuring out how to tab the whole thing to keep it readable. ;) You need the defined test so that fall-through doesn't occur on single-term items.

        Update: Hmm, there is a problem here though. You didn't define whether "6" should come before, after, or between "6-0" and "6-10". My solution didn't define it either, which means that the definition of the problem is inadequate, and that the solution is equally inadequate. But the point to all this is that you can hand-craft your sort routine. ...just think through what you want it to do, and craft your code. I need to get some sleep, so I'll leave you in the capable hands of the rest of the PerlMonks. :) You can do it, really. Dig in and let us know when you get stuck.


        sorry for changing the requirement again what will happen if the array contains as follows:
        my @array = ( '1-1', '6', '2-5', '1-10', '7', '1-3','2-1');
        Well, then it depends on how you want '6' to compare to the other entries. If it's the ehm natural one, then again Sort::Naturally should be appropriate!
Re: Sorting an array on two computed fields
by blazar (Canon) on Dec 16, 2005 at 10:12 UTC

    I think you want Sort::Naturally.

    But if it's just for this restricted task, then of course you can roll your own code. Well, you can try your hand at the general problem too.

    Having asked this very question myself some time ago in clpmisc I'll point you there: search for "sorting strings numerically"; e.g.: Benjamin Goldberg's reply and this thread. Unfortunately I cannot get from Google all the relevant posts in one thread.

Re: Sorting an array on two computed fields
by salva (Abbot) on Dec 16, 2005 at 11:25 UTC
    use Sort::Key qw(keysort); my @array = ( '1-1', '2-5', '2', '6', '1-10', '1-3','2-1'); my @sorted = keysort { pack "N*", split /-/, $_ } @array;
    update: or with the new Sort::Key::Natural I have just released:
    use Sort::Key::Natural 'natsort'; my @sorted = natsort @array;
Re: Sorting an array on two computed fields
by TedPride (Priest) on Dec 16, 2005 at 16:09 UTC
    blazar's solution works properly, unlike some of the other solutions given above (++ for him), but mine is more efficient: 2.08 vs 3.68 seconds for 1000 iterations with 100 randomly generated items. Smaller sets show much the same ratio.
    use strict; use warnings; ### Randomly generate numbers my @array; for (0..100) { my @n; push @n, (int rand 15) + 1; for (0..((int rand 3) - 1)) { push @n, int rand 15; } push @array, join '-', @n; } ### Sort the numbers my @arr = map { $_ = join '-', @$_ } sort { mysort($a, $b) } map { $_ = [split '-'] } @array; print join "\n", @arr; ### mysort does the actual comparisons sub mysort { my ($s1, $s2) = ($#{$_[0]}, $#{$_[1]}); for (0..($s1 < $s2 ? $s2 : $s1)) { no warnings; return $_ if $_ = $_[0][$_] <=> $_[1][$_]; } ### N comes before N-0 return ($s1 < $s2 ? -1 : ($s2 > $s1 ? 1 : 0)); }
      It seems you are ignoring my solution using Sort::Key. On my computer it runs more than 2.5 times faster than yours!
      Which solutions don't work properly?
Re: Sorting an array on two computed fields
by TedPride (Priest) on Dec 16, 2005 at 22:29 UTC
    That didn't come with my Perl installation, so I'm adding 10 or 15 minutes for install time. Mine is still faster!

    Edit: I will say, however, that yours looked the most promising. I tried it first, and was unhappy when I found I'd have to install a module to see how fast it ran :\

      keep adding elements to the array and at some point my solution will start outperforming yours, even considering the installation time ;-)

Log In?

What's my password?
Create A New User
Node Status?
node history
Node Type: perlquestion [id://517195]
Approved by Corion
[Your Mother]: Stick with what you know. Or post a SOPW and if it's not complicated I guarantee someone will answer; me if it's straightforward and no one else gets it better first.
[1nickt]: ++YourMother (Grammar Vigilante of Bristol)
[Lady_Aleena]: Um, how does grep do multiline? print "$file:$_" if grep { /get_(array|hash| data)\(.+\)/ } $_; only returns the same 18 lines the command line grep does.
[Lady_Aleena]: my scratchpad has what I have so far.
[choroba]: what do you mean by multiline?
[Discipulus]: thanks 1nickt!
[choroba]: grep is a filter, it selects elements from a list based on a boolean condition
[moritz]: you need to read the file in one go to do a multiline regex match
[erix]: ha ha, I like the Vigilante
[Lady_Aleena]: Forget it. I'm getting angry with it.

How do I use this? | Other CB clients
Other Users?
Others chilling in the Monastery: (11)
As of 2017-05-23 19:45 GMT
Find Nodes?
    Voting Booth?