http://www.perlmonks.org?node_id=666541

smilly has asked for the wisdom of the Perl Monks concerning the following question:

I tried to run my program using the WordNet::Similarity module, which works from the command line for me, but I get undef values in my output matrix even though my input is a simple four-word corpus (one word per line). Is there any special reason?
#! /usr/local/bin/perl -w
use strict;
use warnings;
use WordNet::QueryData;
use WordNet::Similarity::random;
use WordNet::Similarity::path;
use WordNet::Similarity::wup;
use WordNet::Similarity::lch;
use WordNet::Similarity::jcn;
use WordNet::Similarity::res;
use WordNet::Similarity::lin;
use WordNet::Similarity::hso;
use WordNet::Similarity::lesk;
use WordNet::Similarity::vector;
use WordNet::Similarity::vector_pairs;
use Data::Dumper;

my $Infile = shift;
my $Outfile = shift;
my $Measure = shift;
my (@sim , $simi);

unless (defined $Infile and defined $Outfile and defined $Measure) {
    print STDERR "Undefined input\n";
    print STDERR "Usage: simmat.pl inputfile outputfile measure()\n";
    exit 1;
}

print STDERR "Loading WordNet... ";
my $wn = WordNet::QueryData->new;
die "Unable to create WordNet object.\n" if(!$wn);
print STDERR "done.\n";

open (INPUT, "$Infile") || die "can't open the input file";
my @words = <INPUT>;
close (INPUT) ;

for my $i (0 .. $#words) {
    for my $j ( ($i+1) .. $#words) {
        $sim[$i][$j] = similarity( $words[$i], $words[$j]);
        $sim[$j][$i] = $sim[$i][$j];
    }
}

sub similarity {
    my ( $w1, $w2 ) = @_;
    $simi = 1;
    my $obj = $Measure -> new($wn);
    my $simi = $obj-> getRelatedness("$w1#n#1", "$w2#n#1");
    return $simi;
}

open (OUTPUT, ">$Outfile");
print OUTPUT Dumper(\@sim);
close(OUTPUT);

output :
$VAR1 = [
          [ undef, undef, undef, undef ],
          [ undef, undef, undef, undef ],
          [ undef, undef, undef, undef ],
          [ undef, undef, undef ]
        ];

Re: 'undef' in the matrix instead of values!!!
by halley (Prior) on Feb 06, 2008 at 13:49 UTC
    I'm not seeing what kind of package $Measure is, as it appears you're grabbing it from @ARGV. Regardless, you should see what this returns, because I think it's the key.
    $Measure->getRelatedness("hotdog\n#n#1", "hamburger\n#n#1");
    If you look too closely, the answer will chomp you in the rear.

    --
    [ e d @ h a l l e y . c c ]
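
The hint above points at the trailing newline that every line read from the input file still carries. A minimal sketch of stripping it before building the sense strings follows. It assumes an in-memory stand-in for the input file; the sample words and the helper name `sense_key` are illustrative and not part of the original script.

```perl
use strict;
use warnings;

# Stand-in for the input corpus: one word per line (sample words are made up).
my $corpus = "hotdog\nhamburger\npizza\nsalad\n";
open(my $in, '<', \$corpus) or die "can't open in-memory input: $!";

my @words = <$in>;      # each element still ends with "\n"
close($in);

chomp(@words);          # remove the trailing newline from every element

# Hypothetical helper: build the "word#pos#sense" string passed to getRelatedness.
sub sense_key {
    my ($word) = @_;
    return "$word#n#1";
}

my @keys = map { sense_key($_) } @words;
print "$_\n" for @keys;
```

In the original script the equivalent change would presumably be a single `chomp(@words);` right after `my @words = <INPUT>;`.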


      Suppose my script (the code above) is named simmat.pl. I then run it like this: simmat.pl inputcorpus.txt output.txt WordNet::Similarity::res
      so the measure is WordNet::Similarity::res.

        You can't include a module into your program just by putting its name in the arguments list. They must be included at compile time by using the use statement or at execution time by using require and import.
        Maybe one way to do what you want could be:

        my $Measure = shift;
        eval "require $Measure;import $Measure;";
        ## Now, luckily you can use it:
        $Measure->new();
        citromatik
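
A slightly more defensive variant of the run-time loading idea above might look like the sketch below. The helper name `load_class` and the name-validation regex are assumptions added for illustration, and the core module List::Util stands in for a measure such as WordNet::Similarity::res so that the example runs without WordNet installed.

```perl
use strict;
use warnings;

# Hypothetical helper: load a class given by name at run time and die with
# the reason if loading fails. Returns the class name on success.
sub load_class {
    my ($class) = @_;

    # Accept only plausible package names before passing them to string eval.
    die "Invalid module name '$class'\n"
        unless $class =~ /\A[A-Za-z_]\w*(?:::\w+)*\z/;

    eval "require $class; 1" or die "Cannot load $class: $@";
    return $class;
}

# List::Util is only a stand-in for a name taken from the command line.
my $class = load_class('List::Util');
print "Loaded $class\n";
```

Checking the result of the `eval` matters here: a misspelled measure name would otherwise only surface later, when `->new` is called on a class that was never loaded.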
Re: 'undef' in the matrix instead of values!!!
by apl (Monsignor) on Feb 06, 2008 at 13:33 UTC
    Did you ever print out $#words after @words was populated? (It's possible your loops never executed.)

    Is it possible for getRelatedness to return an undef?

      Thanks for the reply. I tried printing $#words before and it gave me 3, because I have only 4 words. Moreover, my loop works, as I checked before. getRelatedness shouldn't return any undef value; that's what puzzles me. At least for two similar values it shouldn't.
        I'd debug (or put prints) in sub similarity. Does $obj get a value other than undef? Does $simi?
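
One way to make such debug prints more revealing is to dump the values with Data::Dumper's `Useqq` option, which shows control characters as escapes. This is only a sketch: the helper name `show` and the sample word are made up for illustration, and in the real script the values to inspect would be the arguments and return value inside sub similarity.

```perl
use strict;
use warnings;
use Data::Dumper;

$Data::Dumper::Useqq  = 1;  # print control characters as escapes such as \n
$Data::Dumper::Terse  = 1;  # omit the "$VAR1 = " prefix
$Data::Dumper::Indent = 0;  # keep each dump on a single line

# Hypothetical debugging helper: render any value, including undef, visibly.
sub show {
    my ($label, $value) = @_;
    return "$label = " . Dumper($value);
}

my $w1 = "hotdog\n";   # a word as read from a file, before chomp
print STDERR show('sense string', "$w1#n#1"), "\n";
print STDERR show('result', undef), "\n";
```

Printed this way, an embedded newline in a sense string is visible as a literal `\n` escape instead of an easily overlooked line break.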