http://www.perlmonks.org?node_id=1009889


in reply to Is it possible to find the matching words and the percentage of matching words between two texts?

A simple approach would be to build two hashes from the strings, and then compare the hashes.

So you might do something like:
my %foo;
my $string = 'Poet Blake had a milky white cat. He used to call it Pussy.';
for my $word (split /\s+/, $string) {
    $foo{$word}++;
}
You do the same for the second string (building, say, %bar), and then to compare you iterate through one of the hashes and increment a counter for each word that is also present in the other hash. Something like so:
my $cnt = 0;
for my $word (keys %foo) {
    $cnt++ if exists $bar{$word};
}
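Putting the two steps together, here is a minimal sketch of how it might look. The sub names word_counts and common_words are just names I made up for illustration, the second string is invented, and the lowercasing and punctuation trimming are my own assumption (with a plain whitespace split, "cat." and "cat" would count as different words).

```perl
use strict;
use warnings;

# Build a hash of word => occurrence count for one string.
# Lowercasing and trimming punctuation are assumptions added here,
# so that "cat." and "Cat" are treated as the same word.
sub word_counts {
    my ($text) = @_;
    my %seen;
    for my $word (split ' ', lc $text) {
        $word =~ s/^\W+|\W+$//g;    # strip leading/trailing non-word characters
        next unless length $word;
        $seen{$word}++;
    }
    return \%seen;
}

# Return the sorted list of words that are keys in both hashes.
sub common_words {
    my ($first, $second) = @_;
    my @shared = sort grep { exists $second->{$_} } keys %{$first};
    return @shared;
}

my $foo = word_counts('Poet Blake had a milky white cat. He used to call it Pussy.');
my $bar = word_counts('Blake had a white cat that he adored.');    # invented second string

my @common = common_words($foo, $bar);
print scalar(@common), " common words: @common\n";
```

With these two strings it should report 6 common words (a blake cat had he white). Returning the list rather than just a count gives you the matching words themselves, which the original question also asked for.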

To find the number of distinct words in either string, you simply count the number of keys in the hash (a word that occurs several times is only counted once), e.g.

my $word_count = scalar keys %foo;

And then it's just a simple calculation: divide the number of matching words by the word count and multiply by 100.
The obvious question is how your calculation should look if the two strings contain a different number of words. But I'm sure you can decide that.
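To make the "different number of words" question concrete, here is a rough sketch that computes the percentage against three possible denominators: the distinct words of the first string, of the second, or of both combined. The sub match_percent and its $basis argument are hypothetical names of mine, and the 'union' option (the Jaccard index expressed as a percentage) is just one choice I am suggesting, not something the question asked for.

```perl
use strict;
use warnings;

# Percentage of shared distinct words, given two word => count hashes.
# $basis picks the denominator: 'first', 'second', or anything else
# for the union of both word sets (an assumption, see above).
sub match_percent {
    my ($first, $second, $basis) = @_;

    # grep in scalar context returns the number of matching keys
    my $shared = grep { exists $second->{$_} } keys %{$first};

    my $total =
          $basis eq 'first'  ? scalar keys %{$first}
        : $basis eq 'second' ? scalar keys %{$second}
        :                      keys(%{$first}) + keys(%{$second}) - $shared;

    return $total ? 100 * $shared / $total : 0;    # avoid dividing by zero
}

my %foo = map { $_ => 1 } qw(poet blake had a milky white cat);
my %bar = map { $_ => 1 } qw(blake had a black cat);

printf "%-6s %.1f%%\n", $_, match_percent(\%foo, \%bar, $_)
    for qw(first second union);
```

Here four words are shared, so you get roughly 57.1% of the first word set, 80.0% of the second, and 50.0% of the union. Which of these is "the" percentage depends on what you want the number to mean.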

hope this helps,
Darren

Re^2: Is it possible to find the matching words and the percentage of matching words between two texts?
by supriyoch_2008 (Monk) on Dec 21, 2012 at 10:34 UTC

    Hi McDarren,

    Thanks for your prompt reply. I shall try to solve my problem using the code you provided.

    Regards