PerlMonks  

Counting words

by raveguy2k (Novice)
on Aug 23, 2002 at 21:39 UTC (#192458, perlquestion)

raveguy2k has asked for the wisdom of the Perl Monks concerning the following question:

How could I count the number of words in my .txt file?

Here is the code I have, but it doesn't seem to work:

#!/usr/bin/perl
use warnings;
use strict;

my($count, $pattern);
my $input = shift;
$pattern = \b\w+\b;
open(FILE,"$input") or die "Error: $!\n";
$count = 0;
while(<FILE>) {
    if(/$pattern/) {
        $count++;
    }
}
print "Word Count = ", $count, "\n\n";

Any suggestions?

Replies are listed 'Best First'.
Re: Counting words
by fruiture (Curate) on Aug 23, 2002 at 23:03 UTC

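    There are 2 errors in your code: a syntax error, '$pattern = \b\w+\b;' (the regex has to be quoted, e.g. with qr//), and a logical one: if(/$pattern/) increments $count at most once per line, so you count the lines that contain a word, not the words. Try this: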

    my ($count,$pattern,$file) = ( 0 , qr{\b\w+\b} , shift );
    # declares and initializes the three variables
    # see perlop for qr//

    open FILE,$file or die "Error: $!";
    # please read `perldoc -q 'quoting.*vars'`

    while(<FILE>){
        $count += () = /$pattern/g;
        # that's kind of an idiom for counting the matches
        # it forces list context on m//g, but forces scalar
        # context on the resulting empty(?) list
        # that returns the number of elements that were
        # assigned to the list, although
        # in the end it's all thrown away, never reaching any
        # accessible memory outside the perl-internals...
        # -- it counts the number of matches
    }
    close FILE;

    print "Word Count = $count\n";

    HTH

    UPDATE: yes, qr// instead of qx//, i mixed 'em up, because i don't often use qr//. Thanks to sauoq and wog

    --
    http://fruiture.de
       $count += () ... kind of an idiom

      I've always hated that idiom. It's so satanic.

      #!/usr/bin/perl
      use strict;
      use warnings;

      my $COUNT = 0;
      while(<DATA>){
          $COUNT++ while m{\b\w+\b}g;
      }
      die "COUNT $COUNT";
      __END__
      Hello there boss! How are you doing today.
      I really hope you don't slip on that banana peel.
      I count 22 words.

      ____________________________________________________
      ** The Third rule of perl club is a statement of fact: pod is sexy.

      I think fruiture meant to write qr instead of qx ("Quote Regex" as opposed to "Quote eXecute"). Of course, checking perlop, as recommended, would hopefully let one figure this out...
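
The three approaches in this subthread can be compared side by side. The following is only a sketch: the sample lines and the variable names ($line_hits, $list_count, $loop_count) are made up for illustration and appear in none of the posts. It shows why the original if() test undercounts, and that the list-assignment idiom and the while loop arrive at the same total.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample input is an assumption made for this illustration.
my @lines = (
    "one two three\n",
    "four five\n",
    "\n",
    "six\n",
);

# qr// compiles the regex; a bare \b\w+\b is a syntax error.
my $pattern = qr{\b\w+\b};

my ($line_hits, $list_count, $loop_count) = (0, 0, 0);

for my $line (@lines) {
    # Logic of the original question: at most one increment per line,
    # so this counts lines that contain a word, not words.
    $line_hits++ if $line =~ /$pattern/;

    # List assignment evaluated in scalar context yields the number of
    # right-hand-side elements, i.e. the number of matches.
    $list_count += () = $line =~ /$pattern/g;

    # Scalar-context m//g finds one match per iteration.
    $loop_count++ while $line =~ /$pattern/g;
}

print "lines with a word:  $line_hits\n";
print "words (list idiom): $list_count\n";
print "words (while loop): $loop_count\n";
```

With this sample input the first counter ends at 3 and both word counters end at 6. Which of the two word-counting forms to use is largely a matter of taste.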

Re: Counting words
by PetaMem (Priest) on Aug 24, 2002 at 20:25 UTC
    Hi there,

    Well, since you use #!/usr/bin/perl you probably have wc at your fingertips. wc -w <file> could do what you want. You could just backtick it in your Perl script. But then again, this wouldn't be portable.

    Then, there is CPAN. String::ParseWords could be interesting. Or look there! wc in a portable manner!

    For a simple task all this may seem good enough. But when it comes to real word counting you first need to tokenize your text. Tokenizing is an art in itself. Have a look at This Book if you're really into it. And don't forget to make it Unicode-safe. :-)

    Bye
     PetaMem
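
The point about tokenizing shows up even in the sample text posted earlier in this thread: \b\w+\b treats the apostrophe in "don't" as a word boundary. The sketch below contrasts that pattern with a hypothetical refinement that lets apostrophes join word characters. The second pattern and the variable names $naive and $apos are assumptions for illustration only, not a general-purpose tokenizer.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Sample text taken from the __END__ section posted earlier in the thread.
my $text = "Hello there boss! How are you doing today. "
         . "I really hope you don't slip on that banana peel. "
         . "I count 22 words.";

# Plain \w+ runs: the apostrophe splits "don't" into "don" and "t".
my $naive = () = $text =~ /\b\w+\b/g;

# Hypothetical refinement: allow apostrophes between word characters.
my $apos = () = $text =~ /\w+(?:'\w+)*/g;

print "naive: $naive, apostrophe-aware: $apos\n";
```

Here the naive pattern reports 23 and the apostrophe-aware one reports 22, which matches a count by hand. Hyphenated words, numbers with separators, and abbreviations would each need their own decisions. For non-ASCII text the input also has to be decoded into character strings (for example via an :encoding(UTF-8) layer on open) before \w can match letters outside ASCII.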

Node Type: perlquestion [id://192458]
Approved by Mr. Muskrat