|
|
| The stupid question is the question not asked | |
| PerlMonks |
How can I print out a word-frequency or line-frequency summary?by faq_monk (Initiate) |
| on Oct 08, 1999 at 00:25 UTC ( [id://669]=perlfaq nodetype: print w/replies, xml ) | Need Help?? |
|
Current Perl documentation can be found at perldoc.perl.org. Here is our local, out-dated (pre-5.6) version: To do this, you have to parse out each word in the input stream. We'll pretend that by word you mean chunk of alphabetics, hyphens, or apostrophes, rather than the non-whitespace chunk idea of a word given in the previous question:
while (<>) {
while ( /(\b[^\W_\d][\w'-]+\b)/g ) { # misses "`sheep'"
$seen{$1}++;
}
}
while ( ($word, $count) = each %seen ) {
print "$count $word\n";
}
If you wanted to do the same thing for lines, you wouldn't need a regular expression:
while (<>) {
$seen{$_}++;
}
while ( ($line, $count) = each %seen ) {
print "$count $line";
}
If you want these output in a sorted order, see the section on Hashes.
|
|