PerlMonks  

Re: compute the occurrence of words

by vinoth.ree (Prior)
on Feb 13, 2013 at 14:05 UTC (#1018541)


in reply to compute the occurrence of words

Also at the moment the code is returning numeric values which I need to exclude.

Then what do you expect from this code? It gives each word and its count.


Re^2: compute the occurrence of words
by BigGer (Novice) on Feb 13, 2013 at 14:17 UTC

    The line $data = <FH>; is an error and I have removed it. I am looking to count the occurrences of each word used in a document, excluding numbers. Hope that clarifies my question. G

      In which case you will also have to define "numbers" :) Integers? Floats? E-notation? Roman numerals? Only ASCII digits, or also other Unicode numerals?

      Let me assume simple integers and floats represented in ASCII (no triad-sep, radix-sep = '.'), so valid numbers include 1234 and 0.23, but not DCVII, 2.34e12 or 1,234,567.00.

      my %count;
      while (<FH>) {
          $count{lc $_}++ for grep { !m{^[0-9]+(\.[0-9]+)?$} } m/\w+/g;
      }
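      The assumed number format can be checked on its own against the examples listed above. This is only a minimal sketch: is_number is a hypothetical helper name introduced for the illustration, and the pattern is the one from the loop.

```perl
use strict;
use warnings;

# Assumed definition from the post: ASCII digits with an optional '.' fraction,
# no thousands separator and no exponent.
my $number = qr{^[0-9]+(\.[0-9]+)?$};

# is_number is a hypothetical helper, not part of any module.
sub is_number { return $_[0] =~ $number ? 1 : 0 }

# Classify the examples mentioned in the post.
for my $token ('1234', '0.23', 'DCVII', '2.34e12', '1,234,567.00') {
    printf "%-14s %s\n", $token, is_number($token) ? 'number' : 'not a number';
}
```

      With this definition only 1234 and 0.23 are classified as numbers; the Roman numeral, the e-notation value and the comma-separated value are not.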

      For a complete regular expression to match integers and reals, I'd like to refer to Regexp::Common (see $RE{num}).

      Update: /me just realized that this is overly complex. Since neither . nor , is included in \w, \w+ can only ever match integers without a triad-sep or a fraction, which reduces the loop line to

      $count{lc $_}++ for grep { !m{^[0-9]+$} } m/\w+/g;
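      For anyone who wants to try this without a data file, here is a self-contained sketch of the reduced loop. The sample text is made up for the illustration, and it reads from an in-memory filehandle instead of FH. The scan uses \w+ without anchors so that every word on a line is found.

```perl
use strict;
use warnings;

# Sample text invented for this illustration.
my $text = "The cat saw 2 dogs.\nThe dogs cost 0.23 or 1,234 each; the cat ran.\n";

# An in-memory filehandle stands in for the document being read.
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my %count;
while (my $line = <$fh>) {
    # Extract every \w+ token on the line, drop tokens made only of
    # ASCII digits, and count the rest case-insensitively.
    $count{lc $_}++ for grep { !m{^[0-9]+$} } $line =~ m/\w+/g;
}
close $fh;

# Print the words in alphabetical order with their counts.
printf "%-5s %d\n", $_, $count{$_} for sort keys %count;
```

      Note that 0.23 is split into the tokens 0 and 23, and 1,234 into 1 and 234, and all of them are dropped. A token such as abc123 would still be counted as a word, because \w also matches digits and the underscore.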

      Enjoy, Have FUN! H.Merijn

        Thanks, H.Merijn. That's perfect. I will go and read up on the hash function. G

      ... count ... but excluding numbers.

      This just confuses me. Can you provide a small input list of words and a corresponding output list showing the non-numeric 'count' you desire for the given input?
