PerlMonks  

Re^4: count trigrams of a whole file

by lakssreedhar (Acolyte)
on Dec 21, 2012 at 06:11 UTC ( #1009872=note )


in reply to Re^3: count trigrams of a whole file
in thread count trigrams of a whole file

For a text file containing:
hai! how are you?

will you come to Canada this weekend?

hai! Hello! I am fine.

No I am not coming to Canada this weekend.

I will come to Canada next week:

I will meet you next month at Canada

The output is:

hai!howare 6

howareyou? 6

areyou? 1

areyou?will 5

you?willyou 5

willyoucome 5

youcometo 5

cometoCanada 7

toCanadathis 8

Canadathisweekend? 5

thisweekend? 1

thisweekend?hai! 4

weekend?hai!Hello! 4

hai!Hello!I 4

Hello!Iam 4

Iamfine. 4

amfine. 1

amfine.No 3

fine.NoI 3

NoIam 3

Iamnot 3

amnotcoming 3

notcomingto 3

comingtoCanada 3

Canadathisweekend. 3

thisweekend. 1

thisweekend.I 2

weekend.Iwill 2

Iwillcome 2

willcometo 2

toCanadanext 2

Canadanextweek: 2

nextweek: 1

nextweek:I 1

week:Iwill 1

Iwillmeet 1

willmeetyou 1

meetyounext 1

younextmonth 1

nextmonthat 1

monthatCanada 1

atCanada 1


Re^5: count trigrams of a whole file
by frozenwithjoy (Curate) on Dec 21, 2012 at 20:26 UTC
    That's strange. I'm getting the expected output for that input:
    trigram frequencies in your text:
    hai!howare 1
    howareyou? 1
    areyou?will 1
    you?willyou 1
    willyoucome 1
    youcometo 1
    cometoCanada 2
    toCanadathis 2
    Canadathisweekend? 1
    thisweekend?hai! 1
    weekend?hai!Hello! 1
    hai!Hello!I 1
    Hello!Iam 1
    Iamfine. 1
    amfine.No 1
    fine.NoI 1
    NoIam 1
    Iamnot 1
    amnotcoming 1
    notcomingto 1
    comingtoCanada 1
    Canadathisweekend. 1
    thisweekend.I 1
    weekend.Iwill 1
    Iwillcome 1
    willcometo 1
    toCanadanext 1
    Canadanextweek: 1
    nextweek:I 1
    week:Iwill 1
    Iwillmeet 1
    willmeetyou 1
    meetyounext 1
    younextmonth 1
    nextmonthat 1
    monthatCanada 1
    This is from running your code with the minor change I suggested:
    #!/usr/bin/env perl
    use strict;
    use warnings;
    use autodie;
    use feature 'say';

    my @trigrams;
    my @trigramfrequency;
    my @words;

    # Read every line first so trigrams can span line boundaries.
    while (<DATA>) {
        push @words, split /\s/;
    }

    # Slide a three-word window over the whole word list.
    for ( my $i = 0 ; $i < $#words - 1 ; $i++ ) {
        my $trigram = $words[$i] . $words[ $i + 1 ] . $words[ $i + 2 ];
        my $found   = -1;
        if (@trigrams) {
          SEARCHtrigramINDEX:
            for ( my $index = 0 ; $index <= $#trigrams ; $index++ ) {
                if ( $trigrams[$index] eq $trigram ) {
                    $found = $index;
                    last SEARCHtrigramINDEX;
                }
            }
        }
        if ( $found > -1 ) {
            $trigramfrequency[$found]++;
        }
        else {
            push @trigrams, $trigram;
            $trigramfrequency[$#trigrams]++;
        }
    }

    print "trigram frequencies in your text:\n";

    # Stop at $#trigrams; "<= @trigrams" would read one element past the end.
    for ( my $index = 0 ; $index <= $#trigrams ; $index++ ) {
        print "$trigrams[$index] $trigramfrequency[$index]\n";
    }

    __DATA__
    hai! how are you?
    will you come to Canada this weekend?
    hai! Hello! I am fine.
    No I am not coming to Canada this weekend.
    I will come to Canada next week:
    I will meet you next month at Canada
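The same counting can be sketched with a hash, which replaces the linear search through @trigrams with a single lookup. This is only a sketch, not the code from the thread: count_trigrams is a hypothetical helper name, the sample text is held in a heredoc instead of __DATA__, and split ' ' is swapped in for split /\s/ because it skips leading whitespace and collapses runs of whitespace instead of producing empty words.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the thread): counts word trigrams.
# Returns a hash ref of counts and an array ref of trigrams in the
# order they were first seen.
sub count_trigrams {
    my @words = @_;
    my ( %count, @order );

    # The range is empty when there are fewer than three words.
    for my $i ( 0 .. $#words - 2 ) {
        my $trigram = join '', @words[ $i .. $i + 2 ];

        # Post-increment yields the old count, so this pushes only
        # the first time a trigram is seen.
        push @order, $trigram unless $count{$trigram}++;
    }
    return ( \%count, \@order );
}

# Sample text from the post above.
my $text = <<'END_TEXT';
hai! how are you?
will you come to Canada this weekend?
hai! Hello! I am fine.
No I am not coming to Canada this weekend.
I will come to Canada next week:
I will meet you next month at Canada
END_TEXT

# Collect all words first, then slide the window once over the full list.
my @words = split ' ', $text;

my ( $count, $order ) = count_trigrams(@words);
print "trigram frequencies in your text:\n";
print "$_ $count->{$_}\n" for @$order;
```

Collecting every word before counting matters here. As a guess only, since the code that produced the counts of 6, 5, 4 and so on is not shown in this node: those numbers, together with two-word entries such as "areyou? 1", are what one would expect if the counting loop ran inside the read loop, so that earlier lines were recounted each time a new line was read.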

      explain the code
