PerlMonks  

Re: Is it possible to find the matching words and the percentage of matching words between two texts?

by McDarren (Abbot)
on Dec 21, 2012 at 08:48 UTC ([id://1009889])


in reply to Is it possible to find the matching words and the percentage of matching words between two texts?

A simple approach would be to build two hashes from the strings, and then compare the hashes.

So you might do something like:
    my %foo;
    my $string = 'Poet Blake had a milky white cat. He used to call it Pussy.';
    # Count how often each whitespace-separated word occurs.
    for my $word (split /\s+/, $string) {
        $foo{$word}++;
    }
You do the same for the second string (building, say, %bar), and then to compare you simply iterate through one of the hashes and increment a counter for each word that is also present in the other hash. Something like so:
    my $cnt = 0;
    for my $word (keys %foo) {
        # exists checks for the key itself, whatever its count
        $cnt++ if exists $bar{$word};
    }
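
Putting the two steps together, a minimal self-contained sketch might look like the following. The `word_counts` helper name and the second string are made up for illustration. It also lowercases the text and strips punctuation, which is an assumption that goes beyond the snippets above, so that 'cat.' and 'cat' count as the same word.

```perl
use strict;
use warnings;

# Hypothetical helper: map each distinct word in a string to its count.
# Lowercasing and stripping punctuation are assumptions, not part of the
# snippets above; drop them for a plain whitespace split.
sub word_counts {
    my ($text) = @_;
    my %count;
    for my $word (split ' ', lc $text) {
        $word =~ s/[^\w']//g;          # remove punctuation such as a trailing "."
        next unless length $word;      # skip tokens that were pure punctuation
        $count{$word}++;
    }
    return %count;
}

my %foo = word_counts('Poet Blake had a milky white cat. He used to call it Pussy.');
my %bar = word_counts('Blake had a white dog. He used to call it Rex.');   # made-up second text

# Count the distinct words of %foo that also occur in %bar.
my $cnt = 0;
for my $word (keys %foo) {
    $cnt++ if exists $bar{$word};
}

print "distinct words in first text:  ", scalar(keys %foo), "\n";
print "distinct words in second text: ", scalar(keys %bar), "\n";
print "words in common:               $cnt\n";
```

Because the counter walks the keys of %foo, a word that is repeated in the text is still counted only once.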

To find the number of distinct words in either string (repeated words are counted once, since each word is a single hash key), you simply count the number of keys in the hash, e.g.

my $word_count = scalar keys %foo;

And then it's just a simple calculation: the number of matching words divided by a word count, times 100.
The obvious question is which word count to divide by when the two strings contain a different number of words. But I'm sure you can decide that.
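
As a rough illustration of how the choice of denominator changes the result, here is a sketch with made-up word sets standing in for the two hashes. The variable names are illustrative, and none of the three percentages is "the" right one; it depends on what you want to measure.

```perl
use strict;
use warnings;

# Made-up word sets standing in for the two hashes.
my %foo = map { $_ => 1 } qw(poet blake had a milky white cat);
my %bar = map { $_ => 1 } qw(blake had a white dog);

my $common = grep { exists $bar{$_} } keys %foo;   # distinct words present in both
my $n_foo  = keys %foo;                            # distinct words in the first text
my $n_bar  = keys %bar;                            # distinct words in the second text

# Three possible denominators (real code should guard against empty texts).
my $pct_of_foo = 100 * $common / $n_foo;                        # share of the first text's words
my $pct_of_bar = 100 * $common / $n_bar;                        # share of the second text's words
my $pct_union  = 100 * $common / ($n_foo + $n_bar - $common);   # share of all distinct words (Jaccard index)

printf "%.1f%% of first, %.1f%% of second, %.1f%% of combined vocabulary\n",
    $pct_of_foo, $pct_of_bar, $pct_union;
```

The third variant is symmetric, so it gives the same answer whichever text you treat as the first.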

hope this helps,
Darren

Replies are listed 'Best First'.
Re^2: Is it possible to find the matching words and the percentage of matching words between two texts?
by supriyoch_2008 (Monk) on Dec 21, 2012 at 10:34 UTC

    Hi McDarren,

    Thanks for your prompt reply. I shall try to solve my problem using the code you gave.

    Regards
