PerlMonks  

Re: count trigrams of a whole file

by frozenwithjoy (Curate)
on Dec 20, 2012 at 08:50 UTC (#1009704)


in reply to count trigrams of a whole file

UPDATE:

A quick fix (if your file isn't too large or if you have sufficient RAM) would be to populate @words like so:

while (<>) { push @words, split /\s/; }

That way, you can move on to your for loop and should get the result you want. This was the result I got after making this change:

trigram frequencies in your text:
iwentthere! 1
wentthere!she 1
there!shealso 1
shealsowent 1
alsowentthere. 1
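
For illustration, here is a minimal sketch of the whole-file approach: it gathers every word first and only then counts trigrams, in a single pass. It is not the code from the thread. It swaps the parallel @trigrams / @trigramfrequency arrays for a hash, and it splits on ' ' (runs of whitespace) rather than /\s/ so that repeated spaces cannot produce empty words. The sub name count_trigrams and the two sample lines are hypothetical; the sample text is only a guess at an input that would give the result quoted above.

```perl
use strict;
use warnings;

# Hypothetical helper: count trigrams over a flat list of words.
# Returns a hash ref of counts and an array ref of trigrams in first-seen order.
sub count_trigrams {
    my @words = @_;
    my ( %count, @order );
    for my $i ( 0 .. $#words - 2 ) {    # empty range if fewer than 3 words
        my $trigram = join '', @words[ $i .. $i + 2 ];
        # Remember the trigram the first time it is seen, then bump its count.
        push @order, $trigram unless $count{$trigram}++;
    }
    return ( \%count, \@order );
}

# Assumed sample input standing in for the file; with a real file,
# loop over <> instead of @lines.
my @lines = ( "i went there!\n", "she also went there.\n" );

# Collect the words of every line before counting anything.
my @words;
push @words, split ' ', $_ for @lines;

my ( $count, $order ) = count_trigrams(@words);
print "trigram frequencies in your text:\n";
print "$_ $count->{$_}\n" for @$order;
```

The hash lookup replaces the inner SEARCHtrigramINDEX scan, so each trigram is found in one step instead of a walk over everything seen so far.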

ORIGINAL POST:

Hi lakssreedhar, I didn't change anything, but I felt the need to reformat your code for better readability:

@trigrams = ();
while (<>) {
    @words = split /\s/, $_;
    for ( $i = 0 ; $i < $#words - 1 ; $i++ ) {
        $trigram = $words[$i] . $words[ $i + 1 ] . $words[ $i + 2 ];
        $found = -1;
        if (@trigrams) {
          SEARCHtrigramINDEX:
            for ( $index = 0 ; $index <= $#trigrams ; $index++ ) {
                if ( $trigrams[$index] eq $trigram ) {
                    $found = $index;
                    last SEARCHtrigramINDEX;
                }
            }
        }
        if ( $found > -1 ) {
            $trigramfrequency[$found]++;
        }
        else {
            push @trigrams, $trigram;
            $trigramfrequency[$#trigrams]++;
        }
    }
}
print "trigram frequencies in your text:\n";
for ( $index = 0 ; $index <= @trigrams ; $index++ ) {
    print "$trigrams[$index] $trigramfrequency[$index]\n";
}


Re^2: count trigrams of a whole file
by lakssreedhar (Acolyte) on Dec 20, 2012 at 09:09 UTC
    Thanks, I got it.
Re^2: count trigrams of a whole file
by lakssreedhar (Acolyte) on Dec 21, 2012 at 05:24 UTC

    @frozenwithjoy The trigrams come out perfectly with the above code, but their frequency counts differ from what I expect.

      Can you show some examples of how they are differing?

        For a text file
        hai! how are you?
        will you come to Canada this weekend?
        hai! Hello! I am fine.
        No I am not coming to Canada this weekend.
        I will come to Canada next week:
        I will meet you next month at Canada

        The output is

        hai!howare 6
        howareyou? 6
        areyou? 1
        areyou?will 5
        you?willyou 5
        willyoucome 5
        youcometo 5
        cometoCanada 7
        toCanadathis 8
        Canadathisweekend? 5
        thisweekend? 1
        thisweekend?hai! 4
        weekend?hai!Hello! 4
        hai!Hello!I 4
        Hello!Iam 4
        Iamfine. 4
        amfine. 1
        amfine.No 3
        fine.NoI 3
        NoIam 3
        Iamnot 3
        amnotcoming 3
        notcomingto 3
        comingtoCanada 3
        Canadathisweekend. 3
        thisweekend. 1
        thisweekend.I 2
        weekend.Iwill 2
        Iwillcome 2
        willcometo 2
        toCanadanext 2
        Canadanextweek: 2
        nextweek: 1
        nextweek:I 1
        week:Iwill 1
        Iwillmeet 1
        willmeetyou 1
        meetyounext 1
        younextmonth 1
        nextmonthat 1
        monthatCanada 1
        atCanada 1
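
The counts above look consistent with the counting loop still running inside the while loop, so that the growing @words is recounted each time another line is read. This is an inference from the numbers, not something the thread confirms. Under that reading, a trigram from the first of six lines is counted 6 times, and cometoCanada, which occurs on lines 2 and 5, gets 5 + 2 = 7. The two-word entries such as areyou? would likewise fit a loop bound that runs one step past the last complete trigram. The sketch below contrasts the two placements; the helper name add_trigrams is hypothetical, and the input is assumed to be the six lines quoted above with no blank lines between them.

```perl
use strict;
use warnings;

# Assumed input: the six lines from the post, without blank lines.
my @lines = (
    "hai! how are you?",
    "will you come to Canada this weekend?",
    "hai! Hello! I am fine.",
    "No I am not coming to Canada this weekend.",
    "I will come to Canada next week:",
    "I will meet you next month at Canada",
);

# Hypothetical helper: add the trigram counts of @$words into %$count.
sub add_trigrams {
    my ( $words, $count ) = @_;
    for my $i ( 0 .. $#{$words} - 2 ) {
        my $trigram = join '', @{$words}[ $i .. $i + 2 ];
        $count->{$trigram}++;
    }
    return;
}

my ( @words, %inside, %after );
for my $line (@lines) {
    push @words, split ' ', $line;
    # Counting inside the read loop: every earlier line is counted again.
    add_trigrams( \@words, \%inside );
}
# Counting once, after all lines have been read.
add_trigrams( \@words, \%after );

printf "%-14s inside=%d after=%d\n", $_, $inside{$_}, $after{$_}
    for qw(hai!howare cometoCanada toCanadathis);
```

With this input, the "inside" counts for those three trigrams come out as 6, 7 and 8, matching the posted output, while the "after" counts are 1, 2 and 2.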
