PerlMonks  

Re^2: compute the occurrence of words

by BigGer (Novice)
on Feb 13, 2013 at 14:17 UTC (#1018543)


in reply to Re: compute the occurrence of words
in thread compute the occurrence of words

The line $data = <FH>; was an error, and I have removed it. I am looking to count the occurrences of each word used in a document, excluding numbers. Hope that clarifies my question. G


Re^3: compute the occurrence of words
by AnomalousMonk (Abbot) on Feb 13, 2013 at 14:25 UTC
    ... count ... but excluding numbers.

    This just confuses me. Can you provide a small input list of words and a corresponding output list showing the non-numeric 'count' you desire for the given input?

Re^3: compute the occurrence of words
by Tux (Monsignor) on Feb 13, 2013 at 14:27 UTC

    In which case you will also have to define "numbers" :) Integers? Floats? E-notation? Roman numerals? Only ASCII digits, or also other Unicode numerals?

    Let me assume simple integers and floats represented in ASCII (no triad-sep, radix-sep = '.'), so valid numbers include 1234 and 0.23, but not DCVII, 2.34e12 or 1,234,567.00.

    my %count;
    while (<FH>) {
        $count{lc $_}++ for grep { !m{^[0-9]+(\.[0-9]+)?$} } m/\w+/g;
    }

    For a complete regular expression to match integers and reals, I'd like to refer to Regexp::Common (see $RE{num}).

    update: /me just realized that it is overly complex: . is not included in \w, so \w+ can only ever match integers without a triad-sep or radix-sep. That reduces the loop-line to

    $count{lc $_}++ for grep { !m{^[0-9]+$} } m/\w+/g;
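
For anyone who wants to try the loop without an input file, here is a minimal self-contained sketch. The sample text, the in-memory filehandle $fh (standing in for the thread's FH) and the sorted report at the end are assumptions added for illustration, not part of the original reply. The word match is deliberately unanchored (m/\w+/g) so that every word on a line is picked up, not only lines that consist of a single word.

```perl
use strict;
use warnings;

# Hypothetical sample text; an in-memory filehandle stands in for FH
# so the sketch runs without an input file.
my $text = "The cat saw 3 cats.\nThe dog, aged 12, saw the cat 2 times.\n";
open my $fh, '<', \$text or die "Cannot open in-memory file: $!";

my %count;
while (my $line = <$fh>) {
    # Extract every \w+ token, drop the all-digit ones, and count the
    # rest case-insensitively.
    $count{lc $_}++ for grep { !m{^[0-9]+$} } $line =~ m/\w+/g;
}
close $fh;

# Report the most frequent words first; ties are broken alphabetically.
for my $word (sort { $count{$b} <=> $count{$a} || $a cmp $b } keys %count) {
    printf "%-6s %d\n", $word, $count{$word};
}
```

With this sample, "the" is counted three times and the tokens 3, 12 and 2 never reach the hash. Note that \w also matches digits and underscores, so a mixed token such as abc123 would still be counted as a word; whether that is wanted depends on how "numbers" is defined.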

    Enjoy, Have FUN! H.Merijn

      Thanks, H.Merijn. That's perfect. I will go and read up on the hash function. G
