PerlMonks  

Re^2: count trigrams of a whole file

by lakssreedhar (Acolyte)
on Dec 21, 2012 at 05:24 UTC ( [id://1009862]=note )


in reply to Re: count trigrams of a whole file
in thread count trigrams of a whole file

@frozenwithjoy The trigrams come out perfectly with the code above, but their frequency counts differ.

Replies are listed 'Best First'.
Re^3: count trigrams of a whole file
by frozenwithjoy (Priest) on Dec 21, 2012 at 05:42 UTC
    Can you show some examples of how they are differing?

      For a text file:

      hai! how are you?
      will you come to Canada this weekend?
      hai! Hello! I am fine.
      No I am not coming to Canada this weekend.
      I will come to Canada next week:
      I will meet you next month at Canada

      The output is:

      hai!howare 6
      howareyou? 6
      areyou? 1
      areyou?will 5
      you?willyou 5
      willyoucome 5
      youcometo 5
      cometoCanada 7
      toCanadathis 8
      Canadathisweekend? 5
      thisweekend? 1
      thisweekend?hai! 4
      weekend?hai!Hello! 4
      hai!Hello!I 4
      Hello!Iam 4
      Iamfine. 4
      amfine. 1
      amfine.No 3
      fine.NoI 3
      NoIam 3
      Iamnot 3
      amnotcoming 3
      notcomingto 3
      comingtoCanada 3
      Canadathisweekend. 3
      thisweekend. 1
      thisweekend.I 2
      weekend.Iwill 2
      Iwillcome 2
      willcometo 2
      toCanadanext 2
      Canadanextweek: 2
      nextweek: 1
      nextweek:I 1
      week:Iwill 1
      Iwillmeet 1
      willmeetyou 1
      meetyounext 1
      younextmonth 1
      nextmonthat 1
      monthatCanada 1
      atCanada 1

        That's strange. I'm getting the expected output for that input:
        trigram frequencies in your text:
        hai!howare 1
        howareyou? 1
        areyou?will 1
        you?willyou 1
        willyoucome 1
        youcometo 1
        cometoCanada 2
        toCanadathis 2
        Canadathisweekend? 1
        thisweekend?hai! 1
        weekend?hai!Hello! 1
        hai!Hello!I 1
        Hello!Iam 1
        Iamfine. 1
        amfine.No 1
        fine.NoI 1
        NoIam 1
        Iamnot 1
        amnotcoming 1
        notcomingto 1
        comingtoCanada 1
        Canadathisweekend. 1
        thisweekend.I 1
        weekend.Iwill 1
        Iwillcome 1
        willcometo 1
        toCanadanext 1
        Canadanextweek: 1
        nextweek:I 1
        week:Iwill 1
        Iwillmeet 1
        willmeetyou 1
        meetyounext 1
        younextmonth 1
        nextmonthat 1
        monthatCanada 1
        This is from running your code with the minor change I suggested:
        #!/usr/bin/env perl
        use strict;
        use warnings;
        use autodie;
        use feature 'say';

        my @trigrams;
        my @trigramfrequency;
        my @words;

        # Read the whole input first, collecting every word.
        while (<DATA>) {
            push @words, split /\s/;
        }

        # Count trigrams once, after all words have been read.
        for ( my $i = 0 ; $i < $#words - 1 ; $i++ ) {
            my $trigram = $words[$i] . $words[ $i + 1 ] . $words[ $i + 2 ];
            my $found   = -1;
            if (@trigrams) {
              SEARCHtrigramINDEX:
                for ( my $index = 0 ; $index <= $#trigrams ; $index++ ) {
                    if ( $trigrams[$index] eq $trigram ) {
                        $found = $index;
                        last SEARCHtrigramINDEX;
                    }
                }
            }
            if ( $found > -1 ) {
                $trigramfrequency[$found]++;
            }
            else {
                push @trigrams, $trigram;
                $trigramfrequency[$#trigrams]++;
            }
        }

        print "trigram frequencies in your text:\n";

        # Stop at the last index ($#trigrams); "<= @trigrams" reads one past the end.
        for ( my $index = 0 ; $index <= $#trigrams ; $index++ ) {
            print "$trigrams[$index] $trigramfrequency[$index]\n";
        }

        __DATA__
        hai! how are you?
        will you come to Canada this weekend?
        hai! Hello! I am fine.
        No I am not coming to Canada this weekend.
        I will come to Canada next week:
        I will meet you next month at Canada
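
The inflated counts reported earlier in the thread are consistent with one possible cause. This is only a guess, since the code that produced them is not shown in this post: the counting loop may be running inside the `while` loop that reads the file, so each pass recounts every word accumulated so far, with a loop bound that runs one index too far. The sketch below contrasts that with counting once after all input is read. The helper names `count_once` and `count_per_line` are hypothetical, and a hash is swapped in for the parallel arrays used above.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper: count trigrams once, after all words are collected.
sub count_once {
    my @words = map { split ' ', $_ } @_;
    my %count;
    for my $i ( 0 .. $#words - 2 ) {
        $count{ join '', @words[ $i .. $i + 2 ] }++;
    }
    return \%count;
}

# Hypothetical reconstruction of the suspected bug (an assumption, not the
# original poster's code): the counting loop runs after every input line over
# all words seen so far, and its bound goes one index too far, so a two-word
# fragment is also counted on each pass.
sub count_per_line {
    my ( @words, %count );
    for my $line (@_) {
        push @words, split ' ', $line;
        for my $i ( 0 .. $#words - 1 ) {
            my $third = $words[ $i + 2 ] // '';    # missing past the end
            $count{ $words[$i] . $words[ $i + 1 ] . $third }++;
        }
    }
    return \%count;
}

my @lines = (
    'hai! how are you?',
    'will you come to Canada this weekend?',
    'hai! Hello! I am fine.',
    'No I am not coming to Canada this weekend.',
    'I will come to Canada next week:',
    'I will meet you next month at Canada',
);

my $once     = count_once(@lines);
my $per_line = count_per_line(@lines);

for my $t ( 'hai!howare', 'cometoCanada', 'toCanadathis' ) {
    printf "%-14s once=%d per-line=%d\n", $t, $once->{$t}, $per_line->{$t};
}
```

On the six input lines from the thread, the per-line variant gives 6, 7 and 8 for these three trigrams and also produces fragments such as `areyou?` with a count of 1, which matches the figures reported above. Counting once gives 1, 2 and 2, matching the expected output.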
