<?xml version="1.0" encoding="windows-1252"?>
<node id="999100" title="Re^4: Using Look-ahead and Look-behind" created="2012-10-15 11:09:40" updated="2012-10-15 11:09:40">
<type id="11">
note</type>
<author id="999097">
JohnN</author>
<data>
<field name="doctext">
&lt;p&gt;I have a dumb question.&lt;/p&gt;

&lt;p&gt;This code works well (THANKS Roy!) when looking for DNA string matches within a genome sequence, but not when the *? is changed to {50,100}?.&lt;/p&gt;

e.g.&lt;br&gt; 
&lt;code&gt;
/CCGG            # match starting at DNA sequence CCGG
  (
    (?:
      (?!CCGG)   #   make sure we're not finding duplicates mid-stream
      .          #   accept any character
    )*?          # any number of times BUT not greedily &lt;====
  )
  AATT           # and ending at AATT
/x;
&lt;/code&gt;

&lt;p&gt;versus&lt;/p&gt;

&lt;code&gt;
/CCGG
  (
    (?:
      (?!CCGG)
      .
    ){50,100}?   # &lt;====
  )
  AATT           # and ending at AATT
/x;
&lt;/code&gt;

&lt;p&gt;The latter pattern returns matches with no CCGG inside them, but some matches do contain an extra AATT before the closing one.  The first pattern returns matches with no internal CCGG or AATT.&lt;/p&gt;

&lt;p&gt;&lt;b&gt;A follow-up:&lt;/b&gt;  The following code snippet fixes my problem, and I have NO idea why! I tried it out of desperation.&lt;/p&gt;

&lt;code&gt;
/CCGG
  (
    (?:
      (?!AATT|CCGG)  # &lt;============= also reject an AATT starting here
      .              # accept any character
    ){50,100}?       # here the "?" is not required but I'm anal
  )
  AATT               # and ending at AATT
/x;
&lt;/code&gt;
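
&lt;p&gt;A rough sketch for reproducing the difference. The sequence below is made up for illustration (it is not real genome data), and the variable names are my own. As far as I can tell, {50,100} forces at least 50 characters to be consumed before AATT is even tried, so an AATT sitting inside those first 50 characters gets swallowed into the capture. With * the non-greedy match stops at the very first AATT. Adding AATT to the look-ahead forbids it anywhere inside the capture.&lt;/p&gt;

```perl
use strict;
use warnings;

# Made-up test sequence (an assumption for illustration, not real genome data):
# two CCGG...AATT fragments. The first one has an extra AATT only 10 bases
# after its CCGG; the second has none before its closing AATT.
my $seq = 'CCGG' . ('T' x 10) . 'AATT' . ('G' x 60) . 'AATT'
        . 'CCGG' . ('T' x 60) . 'AATT';

# Original pattern: only CCGG is excluded from the captured stretch.
my $ccgg_only = qr/CCGG((?:(?!CCGG).){50,100}?)AATT/;

# Follow-up pattern: AATT is excluded from the captured stretch as well.
my $both = qr/CCGG((?:(?!AATT|CCGG).){50,100}?)AATT/;

# In list context with /g, each match contributes its captured stretch.
my @loose  = $seq =~ /$ccgg_only/g;
my @strict = $seq =~ /$both/g;

for my $hit (@loose) {
    my $inner = () = $hit =~ /AATT/g;   # count AATT inside the capture
    printf "CCGG-only: length %d, internal AATT %d\n", length($hit), $inner;
}
for my $hit (@strict) {
    my $inner = () = $hit =~ /AATT/g;
    printf "AATT|CCGG: length %d, internal AATT %d\n", length($hit), $inner;
}
```

&lt;p&gt;On this made-up input the CCGG-only pattern returns a 74-base capture with one internal AATT plus a clean 60-base capture, while the AATT|CCGG pattern returns only the 60-base one. Note that the first fragment is dropped rather than trimmed: its nearest AATT is only 10 bases from the CCGG, which is below the minimum of 50.&lt;/p&gt;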
</field>
<field name="root_node">
518444</field>
<field name="parent_node">
911397</field>
</data>
</node>
