Top five words by occurrence
by ghettofinger (Monk) on Jul 18, 2005 at 16:37 UTC
ghettofinger has asked for the wisdom of the Perl Monks concerning the following question:
I am trying to put together a script that goes through a text file and shows me the top 5 words sorted by occurrence. Here is where I am so far:
As you can see, I am only interested in words over 5 characters in length. I am unsure how to sort the results by number of occurrences. I am also having problems with punctuation showing up in my results. I have an example below:
Is this because I am using split? Is there a better way to go about this? I am sure I will start missing words that have apostrophes too. Also, how should I sort this, and how can I then take only the top 5?
Any help or advice is appreciated.
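
For reference, here is a minimal sketch of one way the three questions above (punctuation, sorting, top 5) could be handled together. It is not the code from the original post, which is not reproduced here. The sub name `top_words` and its `$min_len` and `$limit` parameters are hypothetical, and the sketch assumes ASCII letters only and treats words case-insensitively. Instead of calling split on whitespace, it matches words with a regex, so surrounding punctuation never becomes part of a word while internal apostrophes are kept.

```perl
use strict;
use warnings;

# Hypothetical helper: returns up to $limit [word, count] pairs,
# most frequent first, for words longer than $min_len characters.
sub top_words {
    my ($text, $min_len, $limit) = @_;
    my %count;

    # Match words directly rather than splitting on whitespace, so
    # commas, periods, etc. are never attached to a word. An apostrophe
    # between letters (as in "doesn't") stays part of the word.
    while ($text =~ /([A-Za-z]+(?:'[A-Za-z]+)*)/g) {
        my $word = lc $1;    # assumption: "The" and "the" count as one word
        $count{$word}++ if length($word) > $min_len;
    }

    # Sort by descending count, breaking ties alphabetically.
    my @sorted = sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count;

    # Keep only the first $limit words (fewer if the text is short).
    $#sorted = $limit - 1 if @sorted > $limit;

    return map { [ $_, $count{$_} ] } @sorted;
}

my $sample = "Gardens, gardens everywhere! The garden's keeper doesn't "
           . "mention gardens; the keeper mentions weather. Weather?";

printf "%-10s %d\n", @$_ for top_words($sample, 5, 5);
```

With the sample text, "gardens" comes out first with a count of 3, followed by "keeper" and "weather" with 2 each. To run this against a file, the text could be read into `$text` first, or the matching loop could be applied line by line while reading.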