pattern sequence dispersed within text
by nicemank (Novice) on Oct 01, 2012 at 11:20 UTC
nicemank has asked for the
wisdom of the Perl Monks concerning the following question:
I want to find a pattern whose parts are dispersed within a text. The search pattern can be any word, and the text can be anything.
So (here goes):
I split a word into character pairs. Say the name is 'helen' (case irrelevant). That's got 5 letters; so it is two pairs and a single letter: 'he', 'le' and 'n'.
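The splitting step described above can be sketched in a few lines of Perl. This is only an illustration of the idea, not the code from the original post; the helper name `split_pairs` is made up for the example.

```perl
use strict;
use warnings;

# Split a word into consecutive two-character parts. An odd-length word
# leaves a single character as its last part. Case is folded because the
# post says case is irrelevant.
# NOTE: split_pairs is a hypothetical helper name, not from the post.
sub split_pairs {
    my ($word) = @_;
    my @parts = lc($word) =~ /(..?)/g;   # two chars, or one at the end
    return @parts;
}

my @parts = split_pairs('Helen');
print join(', ', @parts), "\n";   # prints: he, le, n
```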
I want to pick out the words that contain those parts, in sequence. Sometimes the parts are conveniently in the correct order in the text, such as:
1. xxxhexxxxxxxx xle xxxxxx nxxx
From this I want: xxxhexxxxxxxx xle nxxx
But they may not be in quite the right order. There may be repetitions and/or parts in the wrong order:
2. xxxhexxxxxxxx xle xxnxle nxxx xxnxxx xnxxhexx nxxxxx xlexxxxxx nxnx xxxx
I'd like to get:
xxxhexxxxxxxx xle xxnxle xnxxhexx xlexxxxxx nxnx
'xxnxle ' appears because it contains the final 'n'. The fact that it contains an additional 'le' just does not matter.
But what I actually get is:
xxxhexxxxxxxx xle xxnxle xxnxxx xnxxhexx xlexxxxxx nxnx
In other words, it should always pick out the input sequence in the correct order if it is there, and it should do so repeatedly if the sequence repeats. Anything extraneous should be discarded where possible.
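One way to read the rule just stated is as a greedy scan over the words of the text: keep track of which part is needed next, keep a word only if it contains that part, and start again from the first part once the last one has been found. The following is a minimal sketch of that reading, not the code that produced the wrong output reported in this post. The function name `pick_in_sequence` is made up, and two behaviours are assumptions the post does not spell out: a word supplies at most one part, and a trailing run that never reaches its last part is dropped.

```perl
use strict;
use warnings;

# Keep only the words that carry the next expected part, cycling through
# the parts for as long as the text allows.
# ASSUMPTIONS (not stated in the post): a word supplies at most one part,
# and an unfinished trailing cycle is discarded.
# NOTE: pick_in_sequence is a hypothetical name, not from the post.
sub pick_in_sequence {
    my ($pattern, $text) = @_;
    my @parts = lc($pattern) =~ /(..?)/g;   # 'helen' -> he, le, n
    return '' unless @parts;

    my (@kept, @cycle);
    my $want = 0;                            # index of the part needed next
    for my $word (split ' ', $text) {
        # index() is a literal substring test, so nothing needs escaping
        next if index(lc($word), $parts[$want]) < 0;
        push @cycle, $word;
        if (++$want == @parts) {             # one full run of parts is complete
            push @kept, @cycle;
            @cycle = ();
            $want  = 0;
        }
    }
    return join ' ', @kept;
}

my $words = 'xxxhexxxxxxxx xle xxnxle nxxx xxnxxx xnxxhexx nxxxxx xlexxxxxx nxnx xxxx';
print pick_in_sequence('helen', $words), "\n";
# prints: xxxhexxxxxxxx xle xxnxle xnxxhexx xlexxxxxx nxnx
```

With this reading, 'nxxx' and 'xxnxxx' are skipped because after the first 'n' the scan is waiting for 'he' again, and 'nxxxxx' is skipped because the scan is waiting for 'le' at that point.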
Given an input (for instance, $words = 'xxxhexxxxxxxx xle xxnxle nxxx xxnxxx xnxxhexx nxxxxx xlexxxxxx nxnx xxxx'), it prints 'xxxhexxxxxxxx xle xxnxle xxnxxx xnxxhexx xlexxxxxx nxnx' as above, which is wrong: the extra 'xxnxxx' should have been discarded.
I realise this is a complicated question, but any help would be gratefully received.