PerlMonks  

Re^5: count trigrams of a whole file

by frozenwithjoy (Curate)
on Dec 21, 2012 at 20:26 UTC (#1009956)


in reply to Re^4: count trigrams of a whole file
in thread count trigrams of a whole file

That's strange. I'm getting the expected output for that input:

    trigram frequencies in your text:
    hai!howare 1
    howareyou? 1
    areyou?will 1
    you?willyou 1
    willyoucome 1
    youcometo 1
    cometoCanada 2
    toCanadathis 2
    Canadathisweekend? 1
    thisweekend?hai! 1
    weekend?hai!Hello! 1
    hai!Hello!I 1
    Hello!Iam 1
    Iamfine. 1
    amfine.No 1
    fine.NoI 1
    NoIam 1
    Iamnot 1
    amnotcoming 1
    notcomingto 1
    comingtoCanada 1
    Canadathisweekend. 1
    thisweekend.I 1
    weekend.Iwill 1
    Iwillcome 1
    willcometo 1
    toCanadanext 1
    Canadanextweek: 1
    nextweek:I 1
    week:Iwill 1
    Iwillmeet 1
    willmeetyou 1
    meetyounext 1
    younextmonth 1
    nextmonthat 1
    monthatCanada 1

This is from running your code with the minor change I suggested:

    #!/usr/bin/env perl
    use strict;
    use warnings;
    use autodie;
    use feature 'say';

    my @trigrams;
    my @trigramfrequency;
    my @words;

    # Collect every whitespace-separated word from the data section.
    while (<DATA>) {
        push @words, split /\s/;
    }

    # Slide a three-word window over the word list.
    for ( my $i = 0 ; $i < $#words - 1 ; $i++ ) {
        my $trigram = $words[$i] . $words[ $i + 1 ] . $words[ $i + 2 ];

        # Linear search for a trigram that has already been seen.
        my $found = -1;
        if (@trigrams) {
          SEARCHtrigramINDEX:
            for ( my $index = 0 ; $index <= $#trigrams ; $index++ ) {
                if ( $trigrams[$index] eq $trigram ) {
                    $found = $index;
                    last SEARCHtrigramINDEX;
                }
            }
        }

        if ( $found > -1 ) {
            $trigramfrequency[$found]++;
        }
        else {
            push @trigrams, $trigram;
            $trigramfrequency[$#trigrams]++;
        }
    }

    print "trigram frequencies in your text:\n";

    # Stop at the last valid index; "<= @trigrams" runs one element past the end.
    for ( my $index = 0 ; $index <= $#trigrams ; $index++ ) {
        print "$trigrams[$index] $trigramfrequency[$index]\n";
    }

    __DATA__
    hai! how are you? will you come to Canada this weekend? hai! Hello! I am fine. No I am not coming to Canada this weekend. I will come to Canada next week: I will meet you next month at Canada
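
For comparison, here is a minimal sketch of the same count done with a hash instead of the two parallel arrays and the linear search. It is not the code from this thread: `count_trigrams` is a hypothetical helper name, the sample text is just the last sentence of the input above, and it uses `split ' '` rather than `split /\s/`, which also skips leading whitespace.

```perl
#!/usr/bin/env perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): count word trigrams
# with a hash, and remember first-seen order so the report can be printed
# in the same order the array-based version would use.
sub count_trigrams {
    my @words = @_;
    my ( %count, @order );

    # The range is empty when there are fewer than three words.
    for my $i ( 0 .. $#words - 2 ) {
        my $trigram = join '', @words[ $i .. $i + 2 ];

        # Post-increment returns the old count, which is false the first time.
        push @order, $trigram unless $count{$trigram}++;
    }
    return ( \%count, \@order );
}

# Sample text taken from the end of the input used in this thread.
my $text = 'I will come to Canada next week: I will meet you next month at Canada';

my ( $count, $order ) = count_trigrams( split ' ', $text );

print "trigram frequencies in your text:\n";
print "$_ $count->{$_}\n" for @$order;
```

The hash lookup replaces the inner search loop, so each trigram is handled in roughly constant time rather than being compared against every trigram seen so far. Whether that matters depends on how large the file is.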


Re^6: count trigrams of a whole file
by Anonymous Monk on Jan 05, 2013 at 06:30 UTC

    explain the code
