
Hello, Monks (first post).

I briefly learned Perl in my Programming Languages course last spring. I thought it was great, but I haven't used it much since. I want to start learning the language again, and to learn it correctly this time. I'm aware that there are many ways to do anything in Perl, so let me know if anything in my code below can be improved or made more "Perl-ish":

The point of this code is to read a .txt file, find the most frequently used word, and report the number of times it's used. If several words are tied for most frequent, the program just chooses the first match. The program currently seems to work fine, but as I said, please let me know if anything can be improved upon!

##
# FILE: mostFreqWord.pl
# AUTHOR: Daniel Jones
# CREATED: 12/18/2013
# MODIFIED: 21/18/2013
##

die "ERROR: Must enter one file name.\n" unless $#ARGV == 0;

open FILE, "<", $ARGV[0] or die "Could not open $ARGV[0] for reading.\n";

#the hash to contain the word-count pairs.
my %hash;
my @lines = <FILE>;

#go through each line in the file
foreach my $line(@lines){
    #skip non-word characters
    my @words = split /\W+/, $line;

    #go through each word in the line
    foreach my $word(@words){
        chomp $word;

        #force all words to lowercase
        $word = lc $word;

        #remove beginning/trailing whitespace
        $word =~ s/^\s+|\s+$//g;

        #split yields an empty first field when a line starts with a non-word character
        next unless length $word;

        my $key = $word;

        #if the word exists, increment its value.
        #otherwise, set it to 1.
        if(exists $hash{$key}){
            $hash{$key}++;
        }
        else{
            $hash{$key} = 1;
        }
    }
}
close FILE;

my @values;

#get the values from the hash
foreach my $key(keys %hash){
    push @values, $hash{$key};
}

#sort numerically; the default sort compares as strings and would put 10 before 9
@values = sort { $a <=> $b } @values;

my @keys = keys %hash;
my $idx = 0;
my $bestVal = $values[-1];
my $bestKey;

foreach my $key(@keys){
    if ($hash{$key} == $bestVal){
        $bestKey = $key;
        last;
    }
}

print "The most frequent word in $ARGV[0] is $bestKey, which was seen $bestVal times.\n";
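
For comparison, here is a minimal sketch of a more compact way to do the same count. It is one possible approach, not a definitive rewrite. The sub name most_frequent_word and the %first_seen hash are hypothetical, made up for this illustration. It also assumes that "first match" should mean the word that appears earliest in the text, whereas the program above takes whichever tied word keys happens to return first.

```perl
use strict;
use warnings;

# Hypothetical helper (name chosen for this sketch): reads lines from a
# filehandle and returns the most frequent word together with its count.
# Assumption: ties go to the word that appeared first in the input.
sub most_frequent_word {
    my ($fh) = @_;
    my (%count, %first_seen);
    my $position = 0;

    while (my $line = <$fh>) {
        # Matching \w+ pulls out only runs of word characters, so there are
        # no empty fields, newlines, or stray whitespace to clean up.
        for my $word (lc($line) =~ /\w+/g) {
            $first_seen{$word} = $position++ unless exists $first_seen{$word};
            $count{$word}++;    # a missing key starts at 0 automatically
        }
    }

    # Highest count first; among equal counts, earliest first appearance.
    my ($best) = sort {
        $count{$b} <=> $count{$a} || $first_seen{$a} <=> $first_seen{$b}
    } keys %count;

    return defined $best ? ($best, $count{$best}) : ();
}

# Demo on an in-memory filehandle, so no file on disk is needed.
my $text = "The cat saw the dog.\nA dog saw a cat!\n";
open my $fh, '<', \$text or die "Could not open in-memory text: $!";
my ($word, $times) = most_frequent_word($fh);
close $fh;

print "The most frequent word is '$word', which was seen $times times.\n";
```

To run it on a real file, open the name from $ARGV[0] with a lexical filehandle and pass that handle to the sub. Reading line by line with while also avoids loading the whole file into an array first.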


In reply to Find most frequently used word in text file. by jonesd14
