<?xml version="1.0" encoding="windows-1252"?>
<node id="155967" title="Re: Re: Re: More Power to your Regex" created="2002-04-02 05:04:30" updated="2005-07-19 14:08:11">
<type id="11">
note</type>
<author id="132236">
Juerd</author>
<data>
<field name="doctext">
&lt;p&gt;&lt;blockquote&gt;&lt;em&gt;
 I've certainly never tested it with conditionals.
&lt;/em&gt;&lt;/blockquote&gt;&lt;/p&gt;

&lt;p&gt;
I think it is indeed the conditional, because it works fine when I rewrite it without one:
&lt;code&gt;
%
    ^
    \s*
    (                               # &lt;1&gt;
        # Single tags like &lt;foo/&gt;
        &lt;
        \s*
        [a-zA-Z:]+
        (?:
            \s+[a-zA-Z:]+               # attribute name, must be non-empty
            \s* = \s*
            (?:'[^']*'|"[^"]*")
        )*
        \s*
        /\s*
        &gt;
    |
        # Tags in pairs like &lt;foo&gt;content&lt;/foo&gt;
        &lt;
        \s*
        ([a-zA-Z:]+)                # &lt;2/&gt;
        (?:
            \s+[a-zA-Z:]+               # attribute name, must be non-empty
            \s* = \s*
            (?:'[^']*'|"[^"]*")
        )*
        \s*
        &gt;

        (?:[^&lt;&gt;]* | (?1))*

        &lt;\s*/\s*\2\s*&gt;
    )                               # &lt;/1&gt;
    \s*
    $
%x
&lt;/code&gt;
&lt;code&gt;
&lt;foo&gt;&lt;bar&gt;&lt;/bar&gt;&lt;/foo&gt; # Match
&lt;foo&gt;&lt;bar&gt;&lt;/foo&gt;&lt;/bar&gt; # No match
&lt;foo&gt;&lt;bar/&gt;&lt;/foo&gt;      # Match
&lt;foo&gt;&lt;bar&gt;&lt;/foo&gt;       # No match
&lt;foo bar=baz/&gt;         # No match
&lt;foo bar="baz"&gt;        # No match
&lt;foo bar="baz"/&gt;       # Match
&lt; fooo   /  &gt;          # Match
&lt;foo/&gt;foo              # No match
foo&lt;foo/&gt;              # No match
&lt;foo&gt;foo&lt;/foo&gt;         # Match
&lt;foo&gt;&lt;bar/&gt;foo&lt;/foo&gt;   # Match
#&lt;a&gt;&lt;b&gt;&lt;c&gt;&lt;/c&gt;&lt;/b&gt;&lt;/a&gt; # No match (WRONG: this should match)
&lt;/code&gt;
That still leaves the problem with nesting three levels deep, as the last test case shows.
&lt;/p&gt;

&lt;p&gt;&lt;font color=green&gt;&lt;pre&gt;
U28geW91IGNhbiBhbGwgcm90MTMgY
W5kIHBhY2soKS4gQnV0IGRvIHlvdS
ByZWNvZ25pc2UgQmFzZTY0IHdoZW4
geW91IHNlZSBpdD8gIC0tIEp1ZXJk
&lt;/pre&gt;&lt;/font&gt;&lt;/p&gt;</field>
<field name="root_node">
155869</field>
<field name="parent_node">
155957</field>
</data>
</node>
