Re: Finding duplicate text in a paragraph

by hbm (Hermit)
on Aug 20, 2012 at 19:52 UTC (#988496)


in reply to Finding duplicate text in a paragraph

Something like this?

use strict;
use warnings;

my $text = 'The cat jumped over the dog. Smart cat! He jumped over the dog. The cat jumped over the dog. Smart cat!';

my %seen;
my $longest = '';

# Walk the text one terminated sentence at a time.
while ($text =~ /\s*(.+?[!?.])/g) {
    # Keep the longest sentence that has already been seen.
    $longest = $1 if $seen{$1}++ && length $1 > length $longest;
}
print $longest, $/;

Prints:

The cat jumped over the dog.

But it's a whole other challenge if you mean a "phrase" rather than a (terminated) sentence.
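For the "phrase" case, one possible starting point is a backreference inside a lookahead. This is only a sketch, not something from the post above: the name longest_repeated_phrase is made up for illustration, a "phrase" is assumed to mean two or more words, and matching is exact (case and the punctuation between words must agree). It rescans the rest of the text for every candidate, so it is likely only practical for paragraph-sized input.

```perl
use strict;
use warnings;

# Hypothetical helper: longest run of two or more words that appears
# again later in the text. Matching is exact, including case and the
# punctuation between words.
sub longest_repeated_phrase {
    my ($text) = @_;
    my $longest = '';
    # The outer lookahead keeps each match zero-width, so every word
    # boundary is tried as a starting point. The inner lookahead requires
    # a second, later copy of the captured phrase.
    while ($text =~ /\b(?=(\w+(?:\W+\w+)+)\b(?=.*?\b\1\b))/gs) {
        $longest = $1 if length $1 > length $longest;
    }
    return $longest;
}

print longest_repeated_phrase('red fish blue fish red fish'), "\n";
```

For that input the only phrase that occurs twice is "red fish", so that is what gets printed.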


Re^2: Finding duplicate text in a paragraph
by Jester (Novice) on Aug 20, 2012 at 20:40 UTC
    Ah, I am kind of embarrassed not to have thought of this. I was so focused on lookahead, for some reason, but yeah, this works. The thing is, I need to match more than one sentence: potentially whole sections of text can be duplicated, and I need to find them. I will try your solution and see how it goes. Thanks!
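If whole sections can be duplicated, one way to extend the sentence approach is to look for the longest run of consecutive sentences that occurs twice. The sketch below assumes the same sentence regex as the reply above and that the two copies must not overlap; longest_repeated_run is a hypothetical name, and the nested loops are quadratic in the number of sentences.

```perl
use strict;
use warnings;

# Hypothetical helper: longest run of consecutive sentences that occurs
# at least twice without overlapping itself. Returns the run as a list.
sub longest_repeated_run {
    my ($text) = @_;
    my @sentences = $text =~ /\s*(.+?[!?.])/g;

    # Try the longest possible run first, so the first hit is the answer.
    for (my $n = int(@sentences / 2); $n >= 1; $n--) {
        my %first_start;
        for my $i (0 .. @sentences - $n) {
            my $key = join "\0", @sentences[$i .. $i + $n - 1];
            if (!exists $first_start{$key}) {
                # Remember where this run was first seen.
                $first_start{$key} = $i;
            }
            elsif ($i >= $first_start{$key} + $n) {
                # A later copy that does not overlap the first one.
                return @sentences[$i .. $i + $n - 1];
            }
        }
    }
    return;
}

my $text = 'The cat jumped over the dog. Smart cat! He jumped over the dog. '
         . 'The cat jumped over the dog. Smart cat!';

print join(' ', longest_repeated_run($text)), "\n";
```

On the sample text this prints "The cat jumped over the dog. Smart cat!", because that two-sentence run appears both at the start and at the end.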
Re^2: Finding duplicate text in a paragraph
by Jester (Novice) on Aug 20, 2012 at 20:45 UTC
    Oh, in fact this does not quite work. For that example, I would need it to print both "The cat jumped over the dog." and "Smart cat!", since both strings are repeated.
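A small change to the loop above should cover this: instead of keeping only the longest repeat, collect every sentence the second time it is seen. This is a minimal sketch with the same sentence regex; repeated_sentences is a name chosen here for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: every sentence that occurs more than once,
# listed in the order in which each one first repeats.
sub repeated_sentences {
    my ($text) = @_;
    my (%seen, @repeated);
    while ($text =~ /\s*(.+?[!?.])/g) {
        # Record a sentence exactly once, on its second appearance.
        push @repeated, $1 if ++$seen{$1} == 2;
    }
    return @repeated;
}

my $text = 'The cat jumped over the dog. Smart cat! He jumped over the dog. '
         . 'The cat jumped over the dog. Smart cat!';

print "$_\n" for repeated_sentences($text);
```

With the sample text this prints "The cat jumped over the dog." and then "Smart cat!", one per line.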
