Re: count trigrams of a whole file

by Anonymous Monk
on Dec 20, 2012 at 10:16 UTC (#1009716)

in reply to count trigrams of a whole file

A couple of improvements you can make:

  • On each line, include the last two elements from the previous line in your array (if they were defined). That way you handle the overlapping cases without needing to read the whole file into memory.
  • A hash is a perfect tool for keeping track of the counts. Then you can do away with your loop to search the array.
  • In general, the C-style three-part for loop (for (init; test; step)) is considered poor style in Perl. There is almost always a clearer option: foreach (@array), for my $i (0..$#array), etc.
    #!/usr/bin/perl
    use strict;
    use warnings;

    my %trigrams;
    my @words;

    while (<DATA>) {
        # Prepend the last two words of the previous line (if any), so that
        # trigrams spanning a line break are counted too.
        # split ' ' skips leading whitespace and splits on runs of whitespace,
        # so no empty "words" are produced.
        @words = ( $words[-2] // (), $words[-1] // (), split(' ', $_) );
        $trigrams{"@words[$_..$_+2]"}++ for (0..$#words-2);
    }

    print "trigram frequencies in your text:\n";

    # Sort the trigrams in descending order of frequency.
    for (sort { $trigrams{$b} <=> $trigrams{$a} } keys %trigrams) {
        print "$_: $trigrams{$_}\n";
    }

    __DATA__
    I went there! Me
    She also went there.
    Did you know that I went there!
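To illustrate the third point in the list, here is a small sketch of the same loop written three ways. The array contents and variable names are made up for the comparison; they are not from the original question.

```perl
use strict;
use warnings;

my @words = qw(I went there);

# C-style loop: works, but the index bookkeeping is easy to get wrong.
my @c_style;
for (my $i = 0; $i < @words; $i++) {
    push @c_style, "$i=$words[$i]";
}

# Range over the indices when the position is needed.
my @by_index;
for my $i (0 .. $#words) {
    push @by_index, "$i=$words[$i]";
}

# Iterate over the elements directly when it is not.
my @upper;
foreach my $word (@words) {
    push @upper, uc $word;
}

print "@c_style\n@by_index\n@upper\n";
```

The first two loops build the same list; the range form just leaves no room for an off-by-one in the test or the increment.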

Re^2: count trigrams of a whole file
by lakssreedhar (Acolyte) on Dec 21, 2012 at 18:28 UTC

    I need the words and their counts printed in the order in which the words appear in the file.

      Thanks, I got it.
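      The thread does not show the solution that was found. One possible approach, assuming "words" here means the trigrams, is to keep an array of trigrams in first-seen order next to the hash of counts. The sub name count_trigrams_in_order and the in-memory sample text are hypothetical, chosen only for this sketch.

```perl
use strict;
use warnings;

# Returns (\@order, \%count): trigrams in order of first appearance,
# plus how often each one occurs.
sub count_trigrams_in_order {
    my ($fh) = @_;
    my (%count, @order, @words);
    while (my $line = <$fh>) {
        # Carry over the last two words so trigrams can span lines.
        my @carry = @words >= 2 ? @words[-2, -1] : @words;
        @words = (@carry, split ' ', $line);
        for my $i (0 .. $#words - 2) {
            my $trigram = "@words[$i .. $i + 2]";
            # Record the trigram only the first time it is seen.
            push @order, $trigram unless $count{$trigram}++;
        }
    }
    return (\@order, \%count);
}

# Sample text read through an in-memory filehandle.
my $text = "I went there! Me\nShe also went there.\nDid you know that I went there!\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
my ($order, $count) = count_trigrams_in_order($fh);
close $fh;

print "$_: $count->{$_}\n" for @$order;
```

      A plain hash does not remember insertion order, which is why the separate @order array is needed.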
