PerlMonks  

Re^2: count trigrams of a whole file

by lakssreedhar (Acolyte)
on Dec 21, 2012 at 05:24 UTC (#1009862)


in reply to Re: count trigrams of a whole file
in thread count trigrams of a whole file

@frozenwithjoy The trigrams come out correctly with the code above, but the frequency counts of the trigrams differ.


Re^3: count trigrams of a whole file
by frozenwithjoy (Curate) on Dec 21, 2012 at 05:42 UTC
    Can you show some examples of how they are differing?

      For a text file
      hai! how are you?

      will you come to Canada this weekend?

      hai! Hello! I am fine.

      No I am not coming to Canada this weekend.

      I will come to Canada next week:

      I will meet you next month at Canada

      The output is

      hai!howare 6

      howareyou? 6

      areyou? 1

      areyou?will 5

      you?willyou 5

      willyoucome 5

      youcometo 5

      cometoCanada 7

      toCanadathis 8

      Canadathisweekend? 5

      thisweekend? 1

      thisweekend?hai! 4

      weekend?hai!Hello! 4

      hai!Hello!I 4

      Hello!Iam 4

      Iamfine. 4

      amfine. 1

      amfine.No 3

      fine.NoI 3

      NoIam 3

      Iamnot 3

      amnotcoming 3

      notcomingto 3

      comingtoCanada 3

      Canadathisweekend. 3

      thisweekend. 1

      thisweekend.I 2

      weekend.Iwill 2

      Iwillcome 2

      willcometo 2

      toCanadanext 2

      Canadanextweek: 2

      nextweek: 1

      nextweek:I 1

      week:Iwill 1

      Iwillmeet 1

      willmeetyou 1

      meetyounext 1

      younextmonth 1

      nextmonthat 1

      monthatCanada 1

      atCanada 1
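The counts above follow a regular pattern: trigrams from the first input line are counted 6 times, those from the second line 5 times, and so on down to 1, and a two-word entry with a count of 1 appears wherever a line ended. One reading consistent with this (an assumption, since the script that produced this output is not shown in this post) is that the counting loop runs inside the line-reading loop, re-scanning every word read so far, with a loop bound one position too high. A minimal sketch that reproduces the behaviour, where `count_per_line` is a hypothetical name:

```perl
use strict;
use warnings;

# Hypothetical reconstruction, not the poster's script: the counting loop
# sits INSIDE the line loop, so each new line re-counts all earlier words.
sub count_per_line {
    my @lines = @_;
    my ( @words, %count );
    for my $line (@lines) {
        push @words, split ' ', $line;

        # The bound is one too high, so the final pass joins only two words.
        for my $i ( 0 .. $#words - 1 ) {
            my $last = $i + 2 > $#words ? $#words : $i + 2;
            $count{ join '', @words[ $i .. $last ] }++;
        }
    }
    return \%count;
}

my $count = count_per_line(
    'hai! how are you?',
    'will you come to Canada this weekend?',
    'hai! Hello! I am fine.',
    'No I am not coming to Canada this weekend.',
    'I will come to Canada next week:',
    'I will meet you next month at Canada',
);
print "$_ $count->{$_}\n"
    for 'hai!howare', 'areyou?', 'cometoCanada', 'toCanadathis';
```

If this is what is happening, moving the counting loop after the end of the `while` loop and stopping at the last full trigram should bring every count back to the expected value.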

        That's strange. I'm getting the expected output for that input:
        trigram frequencies in your text:
        hai!howare 1
        howareyou? 1
        areyou?will 1
        you?willyou 1
        willyoucome 1
        youcometo 1
        cometoCanada 2
        toCanadathis 2
        Canadathisweekend? 1
        thisweekend?hai! 1
        weekend?hai!Hello! 1
        hai!Hello!I 1
        Hello!Iam 1
        Iamfine. 1
        amfine.No 1
        fine.NoI 1
        NoIam 1
        Iamnot 1
        amnotcoming 1
        notcomingto 1
        comingtoCanada 1
        Canadathisweekend. 1
        thisweekend.I 1
        weekend.Iwill 1
        Iwillcome 1
        willcometo 1
        toCanadanext 1
        Canadanextweek: 1
        nextweek:I 1
        week:Iwill 1
        Iwillmeet 1
        willmeetyou 1
        meetyounext 1
        younextmonth 1
        nextmonthat 1
        monthatCanada 1
        This is from running your code with the minor change I suggested:
        #!/usr/bin/env perl
        use strict;
        use warnings;
        use autodie;
        use feature 'say';

        my @trigrams;
        my @trigramfrequency;
        my @words;

        # Read the whole input first; count only after every word is collected.
        while (<DATA>) {
            push @words, split /\s/;
        }

        # Stop two short of the last index so each trigram has three words.
        for ( my $i = 0 ; $i < $#words - 1 ; $i++ ) {
            my $trigram = $words[$i] . $words[ $i + 1 ] . $words[ $i + 2 ];
            my $found   = -1;
            if (@trigrams) {
              SEARCHtrigramINDEX:
                for ( my $index = 0 ; $index <= $#trigrams ; $index++ ) {
                    if ( $trigrams[$index] eq $trigram ) {
                        $found = $index;
                        last SEARCHtrigramINDEX;
                    }
                }
            }
            if ( $found > -1 ) {
                $trigramfrequency[$found]++;
            }
            else {
                push @trigrams, $trigram;
                $trigramfrequency[$#trigrams]++;
            }
        }

        print "trigram frequencies in your text:\n";

        # Bound is $#trigrams; "<= @trigrams" would read one element past the end.
        for ( my $index = 0 ; $index <= $#trigrams ; $index++ ) {
            print "$trigrams[$index] $trigramfrequency[$index]\n";
        }

        __DATA__
        hai! how are you?
        will you come to Canada this weekend?
        hai! Hello! I am fine.
        No I am not coming to Canada this weekend.
        I will come to Canada next week:
        I will meet you next month at Canada
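The same tally can also be kept in a hash keyed by trigram, which removes the linear search through `@trigrams` for every new trigram. The following is a sketch of that alternative, not code from the thread; `count_trigrams` is a hypothetical helper name, and it returns the trigrams in first-seen order together with their counts so the printed order matches the listing above.

```perl
use strict;
use warnings;

# Hypothetical helper: counts trigrams over a flat list of words.
# Returns (\@order, \%count): first-seen order and frequency per trigram.
sub count_trigrams {
    my @words = @_;
    my ( @order, %count );

    # The range is empty when fewer than three words are given.
    for my $i ( 0 .. $#words - 2 ) {
        my $trigram = join '', @words[ $i .. $i + 2 ];

        # Post-increment yields the old count: 0 (false) on first sight.
        push @order, $trigram unless $count{$trigram}++;
    }
    return ( \@order, \%count );
}

my @lines = (
    'hai! how are you?',
    'will you come to Canada this weekend?',
    'hai! Hello! I am fine.',
    'No I am not coming to Canada this weekend.',
    'I will come to Canada next week:',
    'I will meet you next month at Canada',
);

# Collect every word first, then count once over the whole list.
my @words = map { split ' ', $_ } @lines;
my ( $order, $count ) = count_trigrams(@words);

print "trigram frequencies in your text:\n";
print "$_ $count->{$_}\n" for @$order;
```

The hash lookup replaces the inner search loop, and counting happens once, after all lines have been read.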
