Beefy Boxes and Bandwidth Generously Provided by pair Networks
Welcome to the Monastery
 
PerlMonks  

Sorting large sets of geometric coordinates

by Anonymous Monk
on Apr 19, 2006 at 20:29 UTC ( [id://544435]=perlquestion: print w/replies, xml ) Need Help??

Anonymous Monk has asked for the wisdom of the Perl Monks concerning the following question:

Hi,

I could use some guidance on the best way to tackle a somewhat complex sorting problem I need to work though. I will try to keep this as generic as possible.

I have to grab 4 numbers from every line in a file. Each line is formatted in a very predictable pattern. Say I name each number like so:-

$row_lx
$row_ly
$row_ux
$row_uy

NOTE: Many of the $row_lx values will be the same.
NOTE: MOST of the $row_ly values will be different.

Here is a nice, neat, golden output. Imagine all of the following lines were originally completely out of order:-

((1.0 0.1) (99.0 0.2))
((1.0 0.2) (99.0 0.3))
((1.0 0.3) (99.0 0.4))
((5.0 0.4) (48.0 0.5))
((48.1 0.4) (99.0 0.5))
((1.0 0.5) (99.0 0.6))
((1.0 0.6) (13.0 0.7))
((13.1 0.6) (99.0 0.7))

Notice that the second number (highest priority), $row_ly, always increases, and sometimes it is repeated. When it does repeat, always print the $row_lx numbers in order (second priority).

I hope this is clear enough. Any recommendations? I think I need a hash with two primary key values, but I don't know how to sort the hash to produce the above output.

Thanks,
-Anonymous Monk #N

  • Comment on Sorting large sets of geometric coordinates

Replies are listed 'Best First'.
Re: Sorting large sets of geometric coordinates
by ikegami (Patriarch) on Apr 19, 2006 at 20:55 UTC

    A hash isn't appropriate, because nothing is guaranteed to be unique (as far as you've let on). You also ask to sort, and hashes are unordered.

    my @unsorted_data; while (<DATA>) { my @fields = /([\d.]+)/g or next; push(@unsorted_data, \@fields); } my @sorted_data = sort { $a->[1] <=> $b->[1] || $a->[0] <=> $b->[0] } @unsorted_data; print("(($_->[0] $_->[1]) ($_->[2] $_->[3]))\n") foreach @sorted_data;

    Update: Tested. Fixed the print.

    Update: Fixed the error discussed below.

      my @sorted_data = sort { $a->[1] <=> $b->[1] || $a->[0] <=> $a->[0] } @unsorted_data;
      Wouldn't the second part of that sort be a no-op?

      thor

      The only easy day was yesterday

        It handles ties in $row_ly. <=> returns 0 if the two things it compares are equal. For example, if the compare function were to be called to compare the following two rows,
        ((5.0 0.4) (48.0 0.5)) ((48.1 0.4) (99.0 0.5))
        The first <=> would return 0 since $a->[1] (0.4) and $b->[1] (0.4) are both equal. It would then go on to compare $a->[0] (5.0) with $b->[0] (48.1)

        Update: OOPS! Copy and paste error. The code should read:

        my @sorted_data = sort { $a->[1] <=> $b->[1] || $a->[0] <=> $b->[0] } @unsorted_data;

        Nod to BrowserUk for the head's up.

Re: Sorting large sets of geometric coordinates
by GrandFather (Saint) on Apr 19, 2006 at 21:01 UTC

    The spaceship (<=>) returns 0 when the two items being compared are equal so use an or to fall back to a secondary comparison if the first compare finds a match.

    use strict; use warnings; my @lines = <DATA>; @lines = map {[/(\d+\.\d+)/g]} @lines; @lines = sort {$a->[1] <=> $b->[1] or $a->[0] <=> $b->[0]} @lines; print "@$_\n" for @lines; __DATA__ 48.1 0.4 99.0 0.5 1.0 0.2 99.0 0.3 1.0 0.1 99.0 0.2 1.0 0.3 99.0 0.4 1.0 0.5 99.0 0.6 13.1 0.6 99.0 0.7 5.0 0.4 48.0 0.5 1.0 0.6 13.0 0.7

    Prints:

    1.0 0.1 99.0 0.2 1.0 0.2 99.0 0.3 1.0 0.3 99.0 0.4 5.0 0.4 48.0 0.5 48.1 0.4 99.0 0.5 1.0 0.5 99.0 0.6 1.0 0.6 13.0 0.7 13.1 0.6 99.0 0.7

    DWIM is Perl's answer to Gödel
Re: Sorting large sets of geometric coordinates
by davido (Cardinal) on Apr 19, 2006 at 21:32 UTC

    I'm not 100% clear on column priorities, so I'm kind of guessing the following, and you can tell us if I'm wrong:

    Priority: 3rd 1st 2nd 4th ((1.0 0.1)(99.0 0.2))

    Much of that is nothing but a guess. But let's assume it's accurate so that you can see an example of how you might handle the sorting:

    my @coordinates_unsorted; while ( <DATA> ) { push @coordinates_unsorted, [ m/([\d.]+)/g ]; } my( @coordinates_sorted ) = sort { $a->[1] <=> $b->[1] or $a->[2] <=> $b->[2] or $a->[0] <=> $b->[0] or $a->[3] <=> $b->[3] } @coordinates_unsorted; foreach my $coordinate_ref ( @coordinates_sorted ) { printf "((%2.1f %2.1f) (%2.1f %2.1f))\n", @{$coordinate_ref}; }

    An adaptation on that theme ought to work out for you.

    And here is essentially the same thing, but using a Schwartzian Transform. ...in this case, readability suffers, IMHO.

    my( @coordinates_sorted ) = map{ sprintf "((%2.1f %2.1f) (%2.1f %2.1f))\n", @{$_} } sort { $a->[1] <=> $b->[1] or $a->[2] <=> $b->[2] or $a->[0] <=> $b->[0] or $a->[3] <=> $b->[3] } map{ [ m/([\d.]+)/g ] } <DATA>; { local $, = "\n"; print @coordinates_sorted; }

    Dave

Re: Sorting large sets of geometric coordinates
by salva (Canon) on Apr 20, 2006 at 08:44 UTC
    You can create fast multikey sorting functions on the fly with Sort::Key::Multi:
    use Sort::Key::Multi qw(nn_keysort) # the "nn" prefix means # to use two numeric keys my @data; while (<DATA>) { my @f = /([\d.]+)/g or next; push(@data, \@f); } my @sorted = nn_keysort { $_->[1], $_->[0] } @data; printf("((%f %f) (%f %f))\n", @$_) foreach @sorted;

Log In?
Username:
Password:

What's my password?
Create A New User
Domain Nodelet?
Node Status?
node history
Node Type: perlquestion [id://544435]
Approved by Paladin
help
Chatterbox?
and the web crawler heard nothing...

How do I use this?Last hourOther CB clients
Other Users?
Others avoiding work at the Monastery: (3)
As of 2024-04-25 17:56 GMT
Sections?
Information?
Find Nodes?
Leftovers?
    Voting Booth?

    No recent polls found