Re (tilly) 1: Sorting issues
by tilly (Archbishop) on May 26, 2001 at 21:06 UTC
|
Your rindex is giving you the index of the last |.
You then take a substring starting there.
You are then taking strings that look like "|10" and "|100"
and doing a numerical comparison. Neither looks like a
number, so that turns into 0 being compared with 0. Add 1
to the rindex and that problem would go away.
BTW if you turned warnings on, you would have been told
about this up front. Secondly if you had used an existing
database then the problem would never have arisen. And
finally the moral is that when debugging it is really bad
to wildly guess at what you think the problem could be.
Instead methodically work through what you think should
happen and what is happening until you find the
discrepancy (which could be anywhere).
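A minimal sketch of the diagnosis above. The sample records are an assumption (the original poster's data file is not shown in this post); the snippet shows what substr returns with and without the +1, and sorts with the corrected offset:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample records, standing in for the real data file.
my @database = ("a|b|c|10\n", "d|e|f|100\n", "g|h|i|2\n");

# Without +1 the substring still starts with the pipe ("|10\n"),
# which numifies to 0 and triggers an "isn't numeric" warning under <=>.
my $with_pipe = substr($database[0], rindex($database[0], '|'));

# With +1 the substring starts at the first digit ("10\n").
my $digits_only = substr($database[0], rindex($database[0], '|') + 1);

# Sort numerically on the field after the last pipe.
my @sorted = sort {
    substr($a, rindex($a, '|') + 1) <=> substr($b, rindex($b, '|') + 1)
} @database;

print @sorted;
```

The trailing newline is left in place here because it does not interfere with the numeric comparison.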
Re: Sorting issues
by chipmunk (Parson) on May 26, 2001 at 21:18 UTC
|
I built this test script around your code:
#!/usr/local/bin/perl -w
use strict;
my @database = <DATA>;
my @sorted=sort{my $one=substr($a,rindex($a,'|'));
my $two=substr($b,rindex($b,'|'));
($one <=> $two) } @database;
print @sorted;
__DATA__
a|b|c|10
d|e|f|100
g|h|i|2
which produced the following output:
Argument "|10\n" isn't numeric in ncmp at tmp.pl line 9, <DATA> chunk 3.
Argument "|100\n" isn't numeric in ncmp at tmp.pl line 9, <DATA> chunk 3.
Argument "|100\n" isn't numeric in ncmp at tmp.pl line 9, <DATA> chunk 3.
Argument "|2\n" isn't numeric in ncmp at tmp.pl line 9, <DATA> chunk 3.
a|b|c|10
d|e|f|100
g|h|i|2
The warnings reveal the problem quite conveniently; the substr() is including the pipe character, so all the values being compared are numerically equal to zero and thus to each other.
You can easily fix this error by adding 1 to each rindex(). However, I would suggest making another change as well. You're doing a fair amount of work in the sort subroutine, which makes it inefficient. A Schwartzian Transform moves the work out of the sort sub:
my @sorted = map $_->[1],
sort { $a->[0] <=> $b->[0] }
map [ substr($_,rindex($_,'|')+1), $_ ],
@database;
But, there's another option that would work well here, and should be even more efficient because it uses an optimized sort sub. This one is often called the Guttman-Rosler Transform:
my $width = 10; # must be at least as big as the longest field being compared
my @sorted = map substr($_, index($_, '|')+1),
sort
map sprintf("%0${width}d|%s", substr($_,rindex($_,'|')+1), $_),
@database;
As you can see, this is very similar to the Schwartzian Transform, but the intermediate value is a string instead of an anonymous array.
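To make the intermediate strings concrete, here is a rough sketch that splits the same transform into three named steps and prints the packed keys. The sample records and the names @packed and @sorted_packed are assumptions added for illustration:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample records, matching the test script earlier in this post.
my @database = ("a|b|c|10\n", "d|e|f|100\n", "g|h|i|2\n");
my $width = 10;

# Step 1: prefix each record with its zero-padded sort key and a pipe.
my @packed = map {
    sprintf("%0${width}d|%s", substr($_, rindex($_, '|') + 1), $_)
} @database;
print "packed: $_" for @packed;

# Step 2: a plain string sort now orders the records numerically,
# because every key has the same width.
my @sorted_packed = sort @packed;

# Step 3: drop everything up to and including the first pipe (the key).
my @sorted = map { substr($_, index($_, '|') + 1) } @sorted_packed;
print @sorted;
```

This sketch assumes the keys are non-negative integers no longer than $width digits; negative numbers would not order correctly under a plain string sort of zero-padded keys.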
Re: Sorting issues
by epoptai (Curate) on May 26, 2001 at 20:59 UTC
|
Chomp just before the comparison:
my @sorted = sort{
my $one=substr($a,rindex($a,'|'));
my $two=substr($b,rindex($b,'|'));
chomp($one,$two);
($one <=> $two) } @database;
Update: I only answered the question of where to
put a chomp in that block. See the following replies for
a more in-depth analysis of the sorting problem.
--
Check out my Perlmonks Related Scripts like framechat,
reputer, and xNN.
Edit: chipmunk 2001-05-26
Re: Sorting issues
by the_slycer (Chaplain) on May 26, 2001 at 21:14 UTC
|
Your $one and $two are not being compared as the numbers you expect. Both contain a '|' in front of the digits, so under <=> each one is treated as 0.
my @sorted=sort{
my $one=substr($a,rindex($a,'|'));
my $two=substr($b,rindex($b,'|'));
$one =~ s/\D//g;
$two =~ s/\D//g;
($one <=> $two)
} @database;
This sorts it properly. I'm sure there's a better way to get at that last number, but I'm just stripping out anything that's not a digit.
HTH
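One possible "better way" to get at that last number, sketched here as an assumption rather than anything the original poster used: capture the digits that follow the final pipe with a single regex. The helper name last_number and the sample records are hypothetical.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical sample records.
my @database = ("a|b|c|10\n", "d|e|f|100\n", "g|h|i|2\n");

# Return the run of digits after the last pipe, or 0 if there is none.
sub last_number {
    my ($line) = @_;
    return $line =~ /\|(\d+)\s*$/ ? $1 : 0;
}

my @sorted = sort { last_number($a) <=> last_number($b) } @database;
print @sorted;
```

Unlike stripping every non-digit, the anchored pattern only accepts digits that sit at the end of the line, so stray digits elsewhere in the field cannot leak into the key.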
Re: Sorting issues
by larryk (Friar) on May 27, 2001 at 00:36 UTC
|
Forgive my simple brain but am I missing the point?
#!/usr/bin/perl -w
use strict;
my @database = <DATA>;
sub g($) {(split/\|/,shift)[$#_]}
my @sorted = sort {g$a<=>g$b} @database;
print @sorted;
__DATA__
a|b|c|10
d|e|f|g|h|9
i|100
j|k|2
l|m|n|o|11
"Argument is futile - you will be ignorralated!"
|
chomp @database;
before the sort. And as merlyn points out this is a good application of the Schwartzian Transform.
Peter L. Berghold | Schooner Technology Consulting, Inc. |
Peter@Berghold.Net | www.berghold.net |
|
I purposely left out the chomp because it works without it.
++ for you if you can tell me when the \n will screw up the sort.
The fix to dump the newlines is: sub g($){(split/\||\n/,shift)[-1]};
"Argument is futile - you will be ignorralated!"
Re: Sorting issues
by toma (Vicar) on May 27, 2001 at 07:01 UTC
|
I've enjoyed good results with the Data::Table module,
which provides functions for reading tables from CSV or a database.
It can output HTML tables or CSV. It also provides functions for
sorting, mapping, reordering, slicing, etc. I like to use
a header on the CSV file, so that I can refer to columns by
name.
use Data::Table;
my $t= Data::Table::fromCSV("data.csv");
$t->sort('exp', 0, 1); #Sort table by col 'exp', numeric, descending
print $t->csv;
For my test file, this program prints:
name,exp,level
merlyn,14272,10
tilly,13067,10
dkubb,1076,7
LD2,807,6
toma,17,1
I don't plan on spending any more
time debugging CSV-type parsers, and I have an easy migration
path for upgrading to using a relational database.
It should work perfectly the first time! - toma
Re: Sorting issues
by zeidrik (Scribe) on May 28, 2001 at 15:08 UTC
|
If you insist on "sort", here is my way to do it:
for a database like
1 2 3 |10
a b c |20
e f g |100
x n f |1
the code
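#!/usr/bin/perl -w
use strict;
my %H;
open(R,"database") or die "Cannot open database: $!";
# Key each line by the number after the pipe.
# Note: lines that share the same number overwrite each other.
while (<R>) { $H{$1}=$_ if /\|(\d+)/ }
close(R);
foreach (sort {$a<=>$b} keys %H){print $H{$_}}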
does exactly:
x n f |1
1 2 3 |10
a b c |20
e f g |100
Enjoy...
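One caveat with keying a hash on the number: two lines with the same number collide, and only the last one survives. A hedged sketch of a variation that keeps them all by storing a list of lines per key; the in-memory @lines array (with a duplicate 10 added) is an assumption standing in for the "database" file:

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical records; the last one repeats the key 10 on purpose.
my @lines = (
    "1 2 3 |10\n", "a b c |20\n", "e f g |100\n",
    "x n f |1\n",  "q r s |10\n",
);

# Collect every line under its number instead of overwriting.
my %H;
for my $line (@lines) {
    push @{ $H{$1} }, $line if $line =~ /\|(\d+)/;
}

# Walk the keys in numeric order and flatten the per-key lists.
my @sorted = map { @{ $H{$_} } } sort { $a <=> $b } keys %H;
print @sorted;
```

Lines that share a key come out in their original input order.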
Re: Sorting issues
by Stamp_Guy (Monk) on May 29, 2001 at 04:12 UTC
|
Thanks for the help, guys. Chipmunk, thanks for taking the time to explain it. That made a lot more sense.