Sorting issues

Stamp_Guy has asked for the wisdom of the Perl Monks concerning the following question:

    my @sorted=sort{my $one=substr($a,rindex($a,'|'));
        my $two=substr($b,rindex($b,'|'));
    ($one <=> $two) } @database;

This block of code takes my tiny pipe-delimited database and sorts it by the number in the last field. When it gets to 100, it puts it right in front of the 10. This data comes from a flat-file database (don't flame me; I've been over all that about flat files enough). The number is the last field, so it is possible it has a newline character attached. That's the only thing I can think of that would cause that. I don't know where to put a chomp in this block, though.

Anyone else got any ideas? I am totally stumped and could use any help I can get on this.

-Stamp_Guy



Re (tilly) 1: Sorting issues
by tilly (Archbishop) on May 26, 2001 at 21:06 UTC

rindex returns the position of the last '|' in the string. You then take a substring starting there.

You are then taking strings that look like "|10" and "|100" and doing a numerical comparison. Neither looks like a number, so that turns into 0 being compared with 0. Add 1 to the rindex and that problem would go away.
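The diagnosis above can be checked with a minimal sketch. The sample record and the variable names here are assumptions made for illustration; only the substr/rindex expressions come from the original question.

```perl
use strict;
use warnings;

my $record = "d|e|f|100\n";   # hypothetical sample line, as read from the flat file

# Without +1 the substring starts at the pipe itself.
my $with_pipe  = substr($record, rindex($record, '|'));       # "|100\n"
# With +1 it starts at the first character after the pipe.
my $after_pipe = substr($record, rindex($record, '|') + 1);   # "100\n"

my ($bad, $good);
{
    no warnings 'numeric';    # silence "isn't numeric" for this demonstration
    $bad = $with_pipe + 0;    # the leading '|' is not numeric, so this is 0
}
$good = $after_pipe + 0;      # the trailing newline does not prevent conversion

print "with pipe: $bad, after pipe: $good\n";   # prints "with pipe: 0, after pipe: 100"
```

Since every key numifies to 0 in the original block, the comparison says all records are equal and the sort has nothing to order them by.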

BTW, if you had turned warnings on, you would have been told about this up front. Secondly, if you had used an existing database, the problem would never have arisen. And finally, the moral is that when debugging it is really bad to guess wildly at what the problem could be. Instead, methodically work through what you think should happen and what is actually happening until you find the discrepancy (which could be anywhere).


Re: Sorting issues
by chipmunk (Parson) on May 26, 2001 at 21:18 UTC

#!/usr/local/bin/perl -w

use strict;

my @database = <DATA>;

my @sorted=sort{my $one=substr($a,rindex($a,'|'));
                my $two=substr($b,rindex($b,'|'));
                ($one <=> $two) } @database;


print @sorted;

__DATA__
a|b|c|10
d|e|f|100
g|h|i|2

Argument "|10\n" isn't numeric in ncmp at tmp.pl line 9, <DATA> chunk 3.
Argument "|100\n" isn't numeric in ncmp at tmp.pl line 9, <DATA> chunk 3.
Argument "|100\n" isn't numeric in ncmp at tmp.pl line 9, <DATA> chunk 3.
Argument "|2\n" isn't numeric in ncmp at tmp.pl line 9, <DATA> chunk 3.
a|b|c|10
d|e|f|100
g|h|i|2

You can easily fix this error by adding 1 to each rindex(). However, I would suggest making another change as well. You're doing a fair amount of work in the sort subroutine, which makes it inefficient. A Schwartzian Transform moves the work out of the sort sub:

my @sorted = map $_->[1],
             sort { $a->[0] <=> $b->[0] }
             map [ substr($_,rindex($_,'|')+1), $_ ],
             @database;
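To make the efficiency point concrete, here is a rough sketch that counts how often the key is extracted in each approach. The helper last_field, the counter, and the sample data are hypothetical names added for this illustration; the exact count for the plain sort depends on Perl's sort implementation, so only the relative difference matters.

```perl
use strict;
use warnings;

my @database = ("a|b|c|10\n", "d|e|f|100\n", "g|h|i|2\n", "j|k|l|7\n", "m|n|o|55\n");

my $calls = 0;
# Hypothetical helper: extract the last field and count each invocation.
sub last_field { $calls++; return substr($_[0], rindex($_[0], '|') + 1) }

# Plain sort: the key is recomputed for both operands of every comparison.
my @naive = sort { last_field($a) <=> last_field($b) } @database;
my $naive_calls = $calls;

# Schwartzian Transform: the key is computed exactly once per record.
$calls = 0;
my @st = map  { $_->[1] }
         sort { $a->[0] <=> $b->[0] }
         map  { [ last_field($_), $_ ] } @database;
my $st_calls = $calls;

print "plain sort: $naive_calls key extractions, transform: $st_calls\n";
```

Both variants produce the same order; the transform simply trades a little memory (one array reference per record) for fewer key extractions.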

But, there's another option that would work well here, and should be even more efficient because it uses an optimized sort sub. This one is often called the Guttman-Rosler Transform:

my $width = 10;  # must be at least as big as the longest field being compared
my @sorted = map substr($_, index($_, '|')+1),
             sort
             map sprintf("%0${width}d|%s", substr($_,rindex($_,'|')+1), $_),
             @database;
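The same transform can be unrolled into its three steps to show what the intermediate strings look like. This is only a sketch; the sample data and the names @keyed and @ordered are assumptions for illustration.

```perl
use strict;
use warnings;

my @database = ("a|b|c|10\n", "d|e|f|100\n", "g|h|i|2\n");
my $width = 10;   # must be at least as wide as the longest number

# Step 1: prefix each record with its zero-padded key and a pipe,
# e.g. "0000000010|a|b|c|10\n".
my @keyed = map { sprintf("%0${width}d|%s", substr($_, rindex($_, '|') + 1), $_) } @database;

# Step 2: a plain string sort now orders by number, because the
# zero-padded keys all have the same width.
my @ordered = sort @keyed;

# Step 3: strip the key by dropping everything up to the first pipe.
my @sorted = map { substr($_, index($_, '|') + 1) } @ordered;

print @sorted;
```

The default string comparison is what makes this fast: no sort block is called at all, so the whole comparison runs inside Perl's built-in sort.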


Re: Sorting issues
by epoptai (Curate) on May 26, 2001 at 20:59 UTC

The chomp can go inside the sort block, after the two substr calls:

my @sorted = sort{ 
my $one=substr($a,rindex($a,'|'));
my $two=substr($b,rindex($b,'|'));
chomp($one,$two);
($one <=> $two) } @database;

Update: I only answered the question of where to put a chomp in that block. See the following replies for a more in-depth analysis of the sorting problem.

--
Check out my Perlmonks Related Scripts like framechat, reputer, and xNN.

Edit: chipmunk 2001-05-26


Re: Sorting issues
by the_slycer (Chaplain) on May 26, 2001 at 21:14 UTC

my @sorted=sort{
   my $one=substr($a,rindex($a,'|'));
   my $two=substr($b,rindex($b,'|'));
   $one =~ s/\D//g;
   $two =~ s/\D//g;
   ($one <=> $two) 
} @database;


Re: Re: Sorting issues

by merlyn (Sage) on May 27, 2001 at 18:17 UTC

-- Randal L. Schwartz, Perl hacker


Re: Sorting issues
by larryk (Friar) on May 27, 2001 at 00:36 UTC

#!/usr/bin/perl -w
use strict;

my @database = <DATA>;

sub g($) {(split/\|/,shift)[$#_]}

my @sorted = sort {g$a<=>g$b} @database;

print @sorted;

__DATA__
a|b|c|10
d|e|f|g|h|9
i|100
j|k|2
l|m|n|o|11

"Argument is futile - you will be ignorralated!"


Re: Re: Sorting issues

by blue_cowdawg (Monsignor) on May 28, 2001 at 02:26 UTC

You forgot something: you need to add a

chomp @database;

merlyn

Peter L. Berghold	Schooner Technology Consulting, Inc.
Peter@Berghold.Net	www.berghold.net


Re: Re: Re: Sorting issues

by larryk (Friar) on May 28, 2001 at 14:11 UTC

I purposely left out the chomp because it works without it.

++ for you if you can tell me when the \n will screw up the sort.

The fix to dump the newlines is: sub g($){(split/\||\n/,shift)[-1]};
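As a small sketch of the difference between the two split patterns (the sample record and variable names are assumptions for illustration):

```perl
use strict;
use warnings;

my $record = "a|b|c|10\n";   # hypothetical sample line

# Splitting on '|' only: the last field keeps its newline.
my $kept     = (split /\|/, $record)[-1];      # "10\n"

# Splitting on '|' or newline: the newline is consumed as a separator,
# and the trailing empty field it leaves behind is dropped by split.
my $stripped = (split /\||\n/, $record)[-1];   # "10"

# Both compare equal numerically, which is why the sort works either way here.
print "same number\n" if $kept == $stripped;   # prints "same number"
```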

"Argument is futile - you will be ignorralated!"


Re: Sorting issues
by toma (Vicar) on May 27, 2001 at 07:01 UTC

Data::Table

use Data::Table;
my $t= Data::Table::fromCSV("data.csv");
$t->sort('exp', 0, 1); #Sort table by col 'exp', numeric, descending
print $t->csv;

name,exp,level
merlyn,14272,10
tilly,13067,10
dkubb,1076,7
LD2,807,6
toma,17,1

I don't plan on spending any more time debugging CSV-type parsers, and I have an easy migration path for upgrading to using a relational database.

It should work perfectly the first time! - toma


Re: Sorting issues
by zeidrik (Scribe) on May 28, 2001 at 15:08 UTC

1 2 3 |10
a b c |20
e f g |100
x n f |1

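#!/usr/bin/perl -w
use strict;
my %H;
open(R, "database") or die "Cannot open database: $!";
foreach my $line (<R>) {
    # Key each record by the number after the pipe.
    # Note: records that share a number overwrite each other.
    $H{$1} = $line if $line =~ /\|(\d+)/;
}
close(R);
# Print the records in ascending numeric order of their keys.
foreach (sort { $a <=> $b } keys %H) { print $H{$_} }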

x n f |1
1 2 3 |10
a b c |20
e f g |100


Re: Sorting issues
by Stamp_Guy (Monk) on May 29, 2001 at 04:12 UTC

Thanks for the help, guys. Chipmunk, thanks for taking the time to explain it. That made a lot more sense.
