Hi Perlmonks,
I would like to find the words that match between an original text and a new text, and the percentage
of matching words relative to the total number of words in the new text. For example, suppose I have an original text
$a="Poet Blake had a milky white cat. He used to call it Pussy."; and a new text
$b="Poet Blake had a white cat and used to call it Pussy.";
Eleven words match between these two texts: Poet, Blake, had, a, white, cat, used, to, call, it, Pussy.
The new text $b contains 12 words, so the percentage of matching words in the new
text is (11/12)*100, i.e. 91.67%. Is it possible to get these results with a Perl program?
This question continues one of my earlier nodes. I tried a script that used a module called plagiarized.pm, but it failed.
I am at my wit's end. I received some suggestions on this from PerlMonks earlier, but I could not
turn them into a working script.
Is it possible to use a simple Perl script to find the matching words between two texts and the percentage of matching words?
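To make the goal concrete, here is a minimal sketch of the calculation described above, using core Perl only. It rests on several assumptions that are not settled by the question itself: words are runs of `\w` characters, comparison is case-insensitive, and every word of the new text that occurs anywhere in the original counts as a match (word order and repetition are ignored). The helper names `words_of` and `match_stats` are hypothetical, and the variables are called `$original` and `$new` rather than `$a` and `$b`, because `$a` and `$b` are the special variables used by `sort`.

```perl
use strict;
use warnings;

# Split a text into lowercase words; punctuation is dropped.
# Assumption: a "word" is any run of \w characters.
sub words_of {
    my ($text) = @_;
    return map { lc $_ } ($text =~ /(\w+)/g);
}

# Return (matched words, total words in new text, percentage).
# Assumption: a word of the new text matches if it occurs
# anywhere in the original text, regardless of position.
sub match_stats {
    my ($original, $new) = @_;
    my %in_original = map { $_ => 1 } words_of($original);
    my @new_words   = words_of($new);
    my @matched     = grep { $in_original{$_} } @new_words;
    my $percent     = @new_words
        ? 100 * scalar(@matched) / scalar(@new_words)
        : 0;    # avoid division by zero for an empty new text
    return (\@matched, scalar(@new_words), $percent);
}

my $original = "Poet Blake had a milky white cat. He used to call it Pussy.";
my $new      = "Poet Blake had a white cat and used to call it Pussy.";

my ($matched, $total, $percent) = match_stats($original, $new);
printf "Matched %d of %d words (%.2f%%): %s\n",
    scalar(@$matched), $total, $percent, join(", ", @$matched);
```

With the two example texts this reports 11 of 12 words matched (91.67%), with only "and" unmatched. If repeated words should be counted only as often as they appear in the original, the `%in_original` hash would need to hold counts instead of flags.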