Re: count trigrams of a whole file

by Anonymous Monk
on Dec 20, 2012 at 10:16 UTC (#1009716)

in reply to count trigrams of a whole file

A couple of improvements you can make:
  • On each line, include the last two elements from the previous line in your array (if they were defined). That way you handle the overlapping cases without needing to read the whole file into memory.
  • A hash is a perfect tool for keeping track of the counts. Then you can do away with your loop to search the array.
  • In general, it is bad style to use the C-style (three-part) for loop in Perl. There is almost always a better option: foreach (@array), for my $i (0..$#array), etc.
#!/usr/bin/perl
use strict;
use warnings;

my %trigrams;
my @words;
while (<DATA>) {
    # Prepend the last two words of the previous line (if any) to this line's words.
    # split ' ' skips leading whitespace and treats runs of whitespace as one separator.
    @words = ( $words[-2] // (), $words[-1] // (), split(' ', $_) );
    $trigrams{"@words[$_..$_+2]"}++ for (0..$#words-2);
}

print "trigram frequencies in your text:\n";
# Sort the trigrams in descending order of frequency.
for (sort { $trigrams{$b} <=> $trigrams{$a} } keys %trigrams) {
    print "$_: $trigrams{$_}\n";
}

__DATA__
I went there! Me She also went there. Did you know that I went there!
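To illustrate the third point above, here is a small sketch comparing the C-style loop with the two alternatives mentioned. The @words array and its contents are made up for the example; only the loop forms themselves come from the post.

```perl
use strict;
use warnings;

my @words = qw(I went there);   # hypothetical sample data

# C-style loop: works, but the index bookkeeping is easy to get wrong.
my @c_style;
for (my $i = 0; $i < @words; $i++) {
    push @c_style, "$i:$words[$i]";
}

# Range loop: a reasonable choice when the index itself is needed.
my @by_index;
for my $i (0 .. $#words) {
    push @by_index, "$i:$words[$i]";
}

# foreach over the elements: a reasonable choice when only the values matter.
my @upper;
foreach my $word (@words) {
    push @upper, uc $word;
}

print "@c_style\n";    # same result as the range loop
print "@by_index\n";
print "@upper\n";
```

The range form keeps the index without a hand-written termination test, which is where off-by-one mistakes usually creep in.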

Replies are listed 'Best First'.
Re^2: count trigrams of a whole file
by lakssreedhar (Acolyte) on Dec 21, 2012 at 18:28 UTC

    I need the words and their counts printed in the order in which the words appear in the file.

      Thanks, I got it.
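For anyone with the same follow-up question, one possible approach is to remember each trigram the first time it is seen and print in that order. This is only a sketch, not the solution the poster arrived at: the sub name trigrams_in_order and the @order array are hypothetical, the line breaks in the sample text are assumed, and "order given in the file" is taken to mean order of first appearance.

```perl
use strict;
use warnings;

# Hypothetical helper: reads lines from a filehandle and returns a list of
# [trigram, count] pairs in order of first appearance.
sub trigrams_in_order {
    my ($fh) = @_;
    my (%count, @order, @words);
    while (my $line = <$fh>) {
        # Carry over the last two words of the previous line, if present.
        my @carry = @words >= 2 ? @words[-2, -1] : @words;
        @words = (@carry, split ' ', $line);
        for my $i (0 .. $#words - 2) {
            my $trigram = "@words[$i .. $i + 2]";
            # Record the trigram only the first time it is seen.
            push @order, $trigram unless $count{$trigram}++;
        }
    }
    return map { [ $_, $count{$_} ] } @order;
}

# Example run on an in-memory filehandle (sample text from the post above,
# with assumed line breaks).
my $text = "I went there! Me\nShe also went there.\nDid you know that I went there!\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";
printf "%s: %d\n", @$_ for trigrams_in_order($fh);
close $fh;
```

The hash still does the counting; the extra array only preserves insertion order, since Perl hashes do not keep their keys in any particular order.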
