<?xml version="1.0" encoding="windows-1252"?>
<node id="969247" title="Re: why is regex not matching final character?" created="2012-05-07 07:31:04" updated="2012-05-07 07:31:04">
<type id="11">
note</type>
<author id="634253">
AnomalousMonk</author>
<data>
<field name="doctext">
&lt;blockquote&gt;&lt;i&gt;
I thought &lt;c&gt;\w&lt;/C&gt; would gobble up the whole table name first and &lt;c&gt;[^\s]&lt;/C&gt; would stop the gobbling at the first space.
&lt;/I&gt;&lt;/BLOCKQUOTE&gt;

&lt;p&gt;
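And that's just what happens, up to a point. In, e.g., 'users', the &lt;c&gt; (\w*) &lt;/C&gt; first gobbles all of 'users', but the &lt;c&gt; [^\s] &lt;/C&gt; that follows must still match one non-whitespace character, so the regex engine backtracks one character: the &lt;c&gt; (\w*) &lt;/C&gt; ends up gobbling (and capturing) 'user', and the &lt;c&gt; [^\s] &lt;/C&gt; gobbles (and swallows) 's'. But that's all just what [moritz] said.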
&lt;/P&gt;
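
&lt;p&gt;
To see the capture directly, here is a minimal sketch. The sample string is an assumption (the original query is not shown in this node), and the second regex, which simply drops the trailing &lt;c&gt; [^\s] &lt;/C&gt;, is only there for contrast and is not necessarily the fix you want.
&lt;/P&gt;

&lt;c&gt;
use strict;
use warnings;

# Hypothetical input; the real query string is not shown in this thread.
my $sql = 'update users set name = 1';

# Original regex: [^\s] must consume one non-whitespace character,
# so the greedy \w* has to give back the final 's'.
my ($captured) = $sql =~ /^update\ (\w*)[^\s]/;
print "with [^\\s]:    '$captured'\n";   # prints 'user'

# For contrast: without the trailing class, nothing forces backtracking.
my ($whole) = $sql =~ /^update\ (\w*)/;
print "without [^\\s]: '$whole'\n";      # prints 'users'
&lt;/C&gt;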

&lt;p&gt;
Another way of looking at the regex (or at any regex that uses only pre-Perl 5.7 syntax) is with [cpan://YAPE::Regex::Explain].
&lt;/P&gt;

&lt;c&gt;
&gt;perl -wMstrict -le
"use YAPE::Regex::Explain;
 ;;
 my $rx = qr/^update\ (\w*)[^\s]/;
 print YAPE::Regex::Explain-&gt;new($rx)-&gt;explain;
"
The regular expression:

(?-imsx:^update\ (\w*)[^\s])

matches as follows:

NODE                     EXPLANATION
----------------------------------------------------------------------
(?-imsx:                 group, but do not capture (case-sensitive)
                         (with ^ and $ matching normally) (with . not
                         matching \n) (matching whitespace and #
                         normally):
----------------------------------------------------------------------
  ^                        the beginning of the string
----------------------------------------------------------------------
  update                   'update'
----------------------------------------------------------------------
  \                        ' '
----------------------------------------------------------------------
  (                        group and capture to \1:
----------------------------------------------------------------------
    \w*                      word characters (a-z, A-Z, 0-9, _) (0 or
                             more times (matching the most amount
                             possible))
----------------------------------------------------------------------
  )                        end of \1
----------------------------------------------------------------------
  [^\s]                    any character except: whitespace (\n, \r,
                           \t, \f, and " ")
----------------------------------------------------------------------
)                        end of grouping
----------------------------------------------------------------------
&lt;/C&gt;
</field>
<field name="root_node">
969212</field>
<field name="parent_node">
969212</field>
</data>
</node>
