I am currently working on fixing some problems with the current rules for what \d, \s and \w should match. It turns out that the current definitions lead to logical inconsistencies in the regex engine which cannot be resolved without changing the definitions, and thus breaking something out there.

Unfortunately, the current behaviour is really close to what people expect: almost all of the time the rules DWIM nicely. It is only on edge cases and certain consistency checks that things fall down. This means that any "fixing" of the default rules causes a lot of stuff to break, which in turn means that we have to make do by adding new modifier flags to control things and pretty much leave the defaults alone.

I am currently working on adding the following set of mutually exclusive flags and behaviour.

Modifier  Semantics      \w              \s             \d
/u        Unicode        \p{IsWord}      \p{IsSpace}    [0-9]
/a        ASCII/Perl     [A-Za-z0-9_]    [ \t\r\n]      [0-9]
/b        Broken/Legacy  same as perl 5.8               [0-9]
/l        "use locale"   same semantics as under use locale in 5.8.x
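
To make the Unicode versus ASCII rows concrete, here is a rough sketch in Python rather than Perl, since Python's re module already exposes a similar split through its re.ASCII flag. It is only an analogy: it does not use the proposed Perl modifiers, the helper name classify is made up for illustration, and Python's ASCII \s also includes \f and \v, which the table above does not list.

```python
import re

# Hypothetical helper (not from the original post): report whether a single
# character matches \w and \s in Python's Unicode mode and in its ASCII mode.
def classify(ch):
    result = {}
    for mode, flags in (("unicode", 0), ("ascii", re.ASCII)):
        result[mode] = {
            "w": re.fullmatch(r"\w", ch, flags) is not None,
            "s": re.fullmatch(r"\s", ch, flags) is not None,
        }
    return result

if __name__ == "__main__":
    # LATIN SMALL LETTER E WITH ACUTE, NO-BREAK SPACE, underscore, tab
    for ch in ("\u00e9", "\u00a0", "_", "\t"):
        print(repr(ch), classify(ch))
```

Characters in the ASCII range should come out the same in both modes; it is the non-ASCII ones, such as an accented letter or a no-break space, where the two sets of rules part ways.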

Most of this is pretty much a given. The main question is \d under the /b modifier (which will likely be the default). I think it makes a lot of sense to change the default of \d to match only the "computing digits" [0-9] and not "any digit in Unicode". I think it is likely to fix more things than it will break. For those of you working outside English/Latin scripts: how much do you depend on \d matching your native digits?
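
As an illustration of what is at stake with \d, the sketch below again uses Python as a stand-in, where \d on a str pattern matches any Unicode decimal digit by default. The function names and sample strings are assumptions made up for this example; the point is only to show the kind of input that stops matching once \d means [0-9].

```python
import re

ASCII_DIGITS = re.compile(r"[0-9]+")   # the "computing digits" only
UNICODE_DIGITS = re.compile(r"\d+")    # any Unicode decimal digit (Python 3 str default)

# Hypothetical helpers, named here for illustration only.
def is_ascii_number(text):
    return ASCII_DIGITS.fullmatch(text) is not None

def is_unicode_number(text):
    return UNICODE_DIGITS.fullmatch(text) is not None

if __name__ == "__main__":
    # "2018", three ARABIC-INDIC digits, and a mix of both kinds
    for sample in ("2018", "\u0663\u0664\u0665", "12\u0663"):
        print(repr(sample), is_ascii_number(sample), is_unicode_number(sample))
```

Input validation that assumes "digits" means 0-9 would accept the second and third samples under the Unicode rule, which is the sort of surprise that narrowing \d is meant to avoid; code that relies on native digits matching would see the opposite breakage.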

Relevant links: Regarding the new \w regexp escape in 5.11


In reply to About \d \w and \s by demerphq
