numita has asked for the wisdom of the Perl Monks concerning the following question:
Hello everyone, I tried to write code for a ROC curve using the Perl module Statistics::ROC, but when I run the program it gives the error "value out of range for table lookup (2):0.711752437,1" at line 16.
use Statistics::ROC;
open(fh,"<roc_tp_fp.txt");
while ( <fh> ) {
($a,$b)=split/,/;
push @AoA, [ split ];
}
for $aref ( @AoA ) {
#print "[ @$aref ],";
}
@curves=roc('decrease',0.95,@AoA);
print "$curves[0][2][0] $curves[0][2][1] \n";
The input file roc_tp_fp.txt looks as follows:
0.9883817,1
0.770431568,1
0.983195895,1
0.812109932,1
0.901505931,1
0.72431528,1
0.73553418,1
0.724572657,1
Re: plotting roc curve using roc package
by Athanasius (Bishop) on Aug 02, 2015 at 07:54 UTC
Hello numita, and welcome to the Monastery!
I can see three problems with your code (there might be others):
First, you do not have:
use strict;
use warnings;
at the head of your script. Get into the habit of always adding these pragmata, and of using lexical variables whenever possible.
Second, this loop:
while ( <fh> ) {
($a,$b)=split/,/;
push @AoA, [ split ];
}
almost certainly doesn’t do what you want. The first call to split does nothing (because the results are never used); the second call results in @AoA containing this (obtained via Data::Dump):
What you need is something like this:
which produces the following output:
Well, we’re getting closer, but we’re still getting the same error message. Which brings us to the third problem: the input data is almost certainly incorrect. In the examples given in Statistics::ROC’s documentation, the second “true” value in each data pair is zero (i.e. false) for around half the pairs. In your data, the second value is always 1 (true). I’m no mathematician, but I’m guessing that the input data you have supplied is invalid (or at least incomplete) for this algorithm.
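One way to test that guess before calling roc is to count how many pairs carry each truth value. The following is only a minimal sketch: the helper name class_counts is hypothetical (it is not part of Statistics::ROC), and it assumes every pair has the form [value, truth] with truth being 0 or 1.

```perl
use strict;
use warnings;

# Hypothetical helper (not part of Statistics::ROC): count how many
# [value, truth] pairs carry each truth label.
sub class_counts {
    my @pairs = @_;
    my %count = (0 => 0, 1 => 0);
    $count{ $_->[1] }++ for @pairs;
    return %count;
}

# First three pairs from the original post: every truth value is 1.
my @AoA = ([0.9883817, 1], [0.770431568, 1], [0.983195895, 1]);

my %count = class_counts(@AoA);
if ($count{0} == 0 || $count{1} == 0) {
    warn "Only one class present; a ROC curve needs both 0 and 1 labels\n";
}
```

With the data shown in the question the warning fires, which is consistent with the suspicion that the input is incomplete rather than the parsing alone being at fault.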
Hope that helps,
Re: plotting roc curve using roc package
by GrandFather (Saint) on Aug 02, 2015 at 07:52 UTC
($a,$b)=split/,/;
push @AoA, [ split ];
is bogus. It should probably be:
chomp;
push @AoA, [split /,/];
In addition, shouldn't you have at least one x, 0 value? Looks to me like the module doesn't handle cases where all truth values are the same.
Premature optimization is the root of all job security
Hello GrandFather,
chomp;
push @AoA, [split /,/];
This will work only if there is no more than one pair of values on each line of input data (which, from the OP, I’m guessing is not the case). Otherwise, @AoA will end up like this:
[
0.9883817,
"1 0.770431568",
"1 0.983195895",
"1 0.812109932",
"1 0.901505931",
"1 0.72431528",
"1 0.73553418",
"1 0.724572657",
1,
]
:-(
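If the input really does hold several pairs per line, one way to cope is to split each line on whitespace first and then split every token on the comma. This is only a sketch: parse_pairs is a hypothetical helper name, and the several-pairs-per-line layout is an assumption about the file that the original post does not confirm.

```perl
use strict;
use warnings;

# Hypothetical helper: turn lines such as "0.98,1 0.77,1" (or one pair
# per line) into a list of [value, truth] array references.
sub parse_pairs {
    my @lines = @_;
    my @pairs;
    for my $line (@lines) {
        # split ' ' skips leading whitespace and drops the trailing newline
        for my $token (split ' ', $line) {
            my ($value, $truth) = split /,/, $token;
            push @pairs, [$value, $truth];
        }
    }
    return @pairs;
}

# Made-up sample lines: two pairs on the first line, one on the second.
my @AoA = parse_pairs("0.9883817,1 0.770431568,1\n", "0.983195895,1\n");
print scalar(@AoA), " pairs; first value $AoA[0][0], truth $AoA[0][1]\n";
```

Because the inner split works on one token at a time, the same helper handles both layouts (one pair per line, or many pairs per line).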
Re: plotting roc curve using roc package
by Laurent_R (Canon) on Aug 02, 2015 at 07:54 UTC
Hmmm, you probably want this:
while ( <fh> ) {
($a,$b)=split/,/;
push @AoA, [ $a, $b ];
}
Update: you also probably need to chomp your data lines.
Hmmmm, no he probably doesn't want that.
$a and $b are special variables and even in sample code should be avoided. In fact more effort should go into making sample code clean and clear than even production code because you are providing example code for other people. You could write your sample like:
while (<$fh>) {
my ($value, $truth) = split /,/;
push @groups, [$value, $truth];
}
which hints at using lexical file handles, avoids special variables, uses correctly scoped sensibly named lexical variables and uses consistent white space.
Update: replaced ) with ] - thanks Laurent_R
Premature optimization is the root of all job security
Hum, yes, GrandFather, you're right, I wouldn't write such code, but here I only wanted to point to the obvious error in the code shown, i.e. that the first split was useless (because the result is never used) and that the second split did not split anything but probably only removed trailing spaces and newline character from $_. But you're right that the OP should use strictures and warnings, should not use the $a and $b special variables, should use meaningful variable names, and so on.
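The behaviour described here is easy to check in isolation. The sketch below uses a single sample line in the format of the posted data and compares a bare split (whitespace splitting on $_) with split /,/ before and after chomp; the variable names are illustrative only.

```perl
use strict;
use warnings;

my $line = "0.9883817,1\n";   # sample line in the format of roc_tp_fp.txt

# A bare split works like split ' ', $_: it splits on whitespace only,
# so the comma-separated pair stays in one piece (minus the newline).
my @bare = do { local $_ = $line; split };

# Splitting on the comma without chomp leaves the newline on the last field.
my @unchomped = split /,/, $line;

# chomp first, then split on the comma, to get two clean fields.
my $copy = $line;
chomp $copy;
my @fields = split /,/, $copy;

printf "bare split: %d field(s)\n", scalar @bare;
printf "split /,/ without chomp: last field %s a newline\n",
    $unchomped[-1] =~ /\n\z/ ? "ends with" : "has no";
printf "chomp + split /,/: %d field(s)\n", scalar @fields;
```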
Re: plotting roc curve using roc package
by pme (Monsignor) on Aug 02, 2015 at 08:03 UTC
#!/usr/bin/perl
use strict;
use warnings;
use Data::Dumper;
use Statistics::ROC;
open(my $fh, "<roc_tp_fp.txt") or die "cannot open file 'roc_tp_fp.txt': $!\n";
my @AoA;
while ( <$fh> ) {
chomp;
push @AoA, [ split /,/ ];
}
close $fh;
print Dumper( \@AoA ) . "\n";
my @curves = roc('decrease', 0.95, @AoA);
print Dumper( \@curves ) . "\n";