Nested Cycle - Statistic measure

remluvr has asked for the wisdom of the Perl Monks concerning the following question:

Hi everyone.
My question is the following. I have the following structure:

1    beast-n    into    transform-v    356.9551
2    beast-n    obj    kill-v    266.2511
3    beast-n    obj    see-v    252.3623
4    beast-n    prd    become-v    250.9534
5    beast-n    obj    tame-v    224.6948
6    beast-n    into    turn-v    191.9883
7    beast-n    obj    call-v    171.4000
8    beast-n    sbj_intr    devour-v    165.3228
9    beast-n    obj    hunt-v    155.7637
10    beast-n    obj    fight-v    150.4370
11    beast-n    obj    slay-v    150.3982
1    frog-n    obj    find-v    322.5589
2    frog-n    into    turn-v    307.3012
3    frog-n    sbj_intr    jump-v    235.0503
4    frog-n    coord-1    toad-n    207.3611
5    frog-n    obj    see-v    207.2610
6    frog-n    obj    eat-v    204.1762
7    frog-n    obj    kill-v    64.6689

Using these data, I need to implement a statistical measure to check the relevance of a given semantic relation with regard to the words it occurs with.
Apart from the list above, I have two words as input (sticking to the previous example, let's say they are beast-n and frog-n). If a given feature occurring with the first word also occurs with the second, I have to compute the precision of the feature with regard to the first word. If I am at rank 1 and I find a feature that also occurs with the second word, my precision is 1, because it is computed as found_relevant_feat/rank_found. In the example above, the only feature that occurs with both beast and frog is kill-v. My precision would then be 1/2: 1 is the number of relevant features found up to that rank, and 2 is the rank at which it occurs.
I also have to find the rank at which the given feature was found with the second word (in this case, 7). When I am done with this, I also need to know the total number of occurrences of the first word and of the second word (in the example above, 11 for beast-n and 7 for frog-n).

my ($prop, $rank, $score);
my ($prop2, $rank2, $score2);

while (my ($name1, $aref) = each %matrice) {
    my $num = 0;

    foreach my $item (@$aref) {
        $count_feat_trovate = 0;
        ($prop, $rank, $score) = split(',', $item);
        my $lastrank = &lastrank2($name2, $prop);
        while (my ($name2, $aref2) = each %matrice) {
            foreach my $item1 (@$aref2) {
                ($prop2, $rank2, $score2) = split(',', $item1);

                if ($prop eq $prop2) {
                    $count_feat_trovate++;      # number of features found
                    $rank_trovato_2 = $rank2;   # rank at which the feature was found for the second element
                }
            }
        }
        $last_rank_2 = $rank2;
    }
    $last_rank_1 = $rank;
}

I guess I just made a lot of mess without achieving anything. Any ideas on how to retrieve the needed data?
Thanks
Giulia
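
A minimal sketch of the measure as described above, under two assumptions: a feature is the relation plus the word that follows it (e.g. "obj kill-v"), and each such pair appears at most once per word, so the number of distinct features equals the number of rows. The names parse_rows and shared_feature_stats are hypothetical and do not come from the original post. On the sample rows this reading finds three shared features (obj kill-v, obj see-v, into turn-v) rather than one, so the intended filter may differ from what is assumed here.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample rows copied from the question: rank, word, relation, feature, score.
my $data = <<'END';
1    beast-n    into    transform-v    356.9551
2    beast-n    obj    kill-v    266.2511
3    beast-n    obj    see-v    252.3623
4    beast-n    prd    become-v    250.9534
5    beast-n    obj    tame-v    224.6948
6    beast-n    into    turn-v    191.9883
7    beast-n    obj    call-v    171.4000
8    beast-n    sbj_intr    devour-v    165.3228
9    beast-n    obj    hunt-v    155.7637
10    beast-n    obj    fight-v    150.4370
11    beast-n    obj    slay-v    150.3982
1    frog-n    obj    find-v    322.5589
2    frog-n    into    turn-v    307.3012
3    frog-n    sbj_intr    jump-v    235.0503
4    frog-n    coord-1    toad-n    207.3611
5    frog-n    obj    see-v    207.2610
6    frog-n    obj    eat-v    204.1762
7    frog-n    obj    kill-v    64.6689
END

# Build word => { "relation feature" => rank } from whitespace-separated rows.
sub parse_rows {
    my ($text) = @_;
    my %ranks;
    for my $line (split /\n/, $text) {
        my ($rank, $word, $rel, $feat) = split ' ', $line;
        next unless defined $feat;    # skip blank or short lines
        $ranks{$word}{"$rel $feat"} = $rank;
    }
    return \%ranks;
}

# Walk the features of $word1 in rank order; for each one that also occurs
# with $word2, record precision = shared features found so far / current rank.
sub shared_feature_stats {
    my ($ranks, $word1, $word2) = @_;
    my ($f1, $f2) = @{$ranks}{$word1, $word2};
    my $found = 0;
    my @hits;
    for my $key (sort { $f1->{$a} <=> $f1->{$b} } keys %$f1) {
        next unless exists $f2->{$key};
        $found++;
        push @hits, {
            feature   => $key,
            rank1     => $f1->{$key},
            rank2     => $f2->{$key},
            precision => $found / $f1->{$key},
        };
    }
    return {
        hits   => \@hits,
        total1 => scalar keys %$f1,    # distinct features seen with $word1
        total2 => scalar keys %$f2,    # distinct features seen with $word2
    };
}

my $stats = shared_feature_stats(parse_rows($data), 'beast-n', 'frog-n');
for my $hit (@{ $stats->{hits} }) {
    printf "%-12s rank1=%d rank2=%d precision=%.3f\n",
        @{$hit}{qw(feature rank1 rank2 precision)};
}
print "totals: $stats->{total1} and $stats->{total2}\n";
```

Storing the rank per word and feature in a hash of hashes replaces the nested loops with a single lookup per feature, and avoids calling each on %matrice inside another each on the same hash, where both loops would share one iterator.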


Replies are listed 'Best First'.
Re: Nested Cycle - Statistic measure by Marshall (Canon) on Mar 12, 2012 at 17:14 UTC
Excuse me for my misunderstanding. Can you clarify? What is special about "kill-v"?

#!/usr/bin/perl -w
use strict;

my %animals;
my %verbs;

# You say, "In the example above, the only feat that occurs
# both with beast and frog is kill-v".
#
# I don't see it
# Obviously there is something that I don't understand...
#
# this simple code gets multiple things in common
# I don't understand how to filter out "see-v" and "turn-v"
# Prints:
# turn-v: beast-n frog-n
# see-v: beast-n frog-n
# kill-v: beast-n frog-n

while (<DATA>) {
    my ($animal, $verb) = (split(' ', $_))[1,3];
    $animals{$animal}++;
    push @{$verbs{$verb}}, $animal;
}

foreach my $verb (keys %verbs) {
    if (@{$verbs{$verb}} > 1) {
        print "$verb: @{$verbs{$verb}}", "\n";
    }
}

__DATA__
1 beast-n into transform-v 356.9551
2 beast-n obj kill-v 266.2511
3 beast-n obj see-v 252.3623
4 beast-n prd become-v 250.9534
5 beast-n obj tame-v 224.6948
6 beast-n into turn-v 191.9883
7 beast-n obj call-v 171.4000
8 beast-n sbj_intr devour-v 165.3228
9 beast-n obj hunt-v 155.7637
10 beast-n obj fight-v 150.4370
11 beast-n obj slay-v 150.3982
1 frog-n obj find-v 322.5589
2 frog-n into turn-v 307.3012
3 frog-n sbj_intr jump-v 235.0503
4 frog-n coord-1 toad-n 207.3611
5 frog-n obj see-v 207.2610
6 frog-n obj eat-v 204.1762
7 frog-n obj kill-v 64.6689
Re^2: Nested Cycle - Statistic measure by remluvr (Sexton) on Mar 12, 2012 at 20:29 UTC
Kill-v is the only feature that occurs with both beast-n and frog-n. I have been given two words (frog-n and beast-n, for example), and I only have to take care of features that occur with both of them. I hope this clarifies it a little.
Thanks again
Giulia
Re^3: Nested Cycle - Statistic measure by Not_a_Number (Prior) on Mar 12, 2012 at 21:52 UTC
"Kill-v is the only feature that occurs with both beast-n and frog-n."

Sorry, no. I can see that's not right even by eyeballing your data. And Marshall's code above (did you actually bother to run it?) proves that your claim is wrong! What haven't you told us about `turn-v` and `see-v`?
Re^4: Nested Cycle - Statistic measure by remluvr (Sexton) on Mar 12, 2012 at 22:14 UTC

