I asked in the chatterbox about multidimensional regular expressions and was given a referral to a clean white room with rubber walls. I'm hoping that putting this into a proper node will give the question wider exposure, and maybe someone has already implemented this. I did a CPAN/Google search and didn't come up with anything that really matches. At its heart a regular expression is just a concise series of assertions joined by AND/OR junctions (with side effects). There's no particular reason that it shouldn't exist already. All I can think of is that no one has needed it before. It'd be interesting to take a whack at the idea. (This is where you post a pointer to the right CPAN module.)

Consider first that something like m/(?:this|that)/ is a basic one-dimensional expression. An AoA is a two-dimensional expression, an AoAoA is three-dimensional, etc. (Just for kicks, think about locally expanded dimensions or what it'd mean to mix hashes in.) I'm not even sure how best to describe the grammar for this, but if someone else has already implemented it then I don't have to (I don't have to anyway, but that's not the point). I'm initially thinking of something like \[{ ... } and \]{ ... } (reading the overload section of perlre leads me to think this syntax is OK). So before I get any further with this I'd like to know whether this already exists somewhere, or whether it doesn't exist for a reason. I'm also not sure if there is a way to rebind the currently executing regular expression to another string. I can work around that by using (??{}), but I'd rather not resort to that hack.

This is a very contrived example of how this might be used. So far most of the data I work with is distinguished by being in different fields, which sort of removes the point of this technique. In general though, imagine you were going to do a pattern match against a multidimensional bitmap. I /think/ this has applications there. Or maybe not. It's an idea anyway, and if it's just oddball I'm interested in hearing why.

Update 0: It occurs to me that you all might get more mileage out of this if I explain my original inspiration. A few weeks ago John M. Dlugosz was talking about unifying substr, splice, shift, unshift and other array functions with the string functions. The problem is, once you start treating strings like arrays, people like me start wanting to treat arrays like strings, which is why this even occurred to me.

Update 1: I think the main problem with this is rebinding the running regex with another string. You can play tricks like (?(?{more expressions...})continue in this expression|(?#fail)(?!)) but that doesn't quite strike me as a good idea.
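The conditional trick from Update 1 can be tried on a two-row grid. This is only a rough sketch under stated assumptions: the helper name find_vertical_pair and the sample rows are made up for illustration, it assumes Perl 5.18 or later (where code blocks in a literal pattern close over lexicals properly), and it only peeks at a second string from inside the running match rather than truly rebinding the regex.

```perl
use strict;
use warnings;

# Hypothetical helper: find the column where an "A" in the top row sits
# directly above a "B" in the bottom row. Returns -1 if there is none.
sub find_vertical_pair {
    my ($top, $bottom) = @_;

    # After "A" matches, pos() is one past it, so pos() - 1 is its column.
    # The code condition looks at the same column of the other string.
    # Empty "yes" branch: carry on. "No" branch: (?!) forces a backtrack.
    return $top =~ /A(?(?{ substr($bottom, pos() - 1, 1) eq 'B' })|(?!))/
        ? $-[0]
        : -1;
}

print find_vertical_pair('..A..A..', '..x..B..'), "\n";   # 5
print find_vertical_pair('..A..',    '..x..'),    "\n";   # -1
```

It works for one fixed neighbour, but every extra direction means another hand-written substr call, which is part of why it doesn't feel like a good idea.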

Update 2: Taking into consideration both merlyn's comments and my response to princepawn, I think the basis of this ought to be a metasequence like \[{dimension,direction} for switching into a different dimension (like a tangled ball of string) and \]{//xpath/expression} for the original idea of skipping around in the data. The first metasequence is probably the more conservative in that all it does is add right angles to regex. The second metasequence is more interesting in that it would allow you to specify a location to jump to, perhaps somewhat like setting pos() while in the middle of an expression.
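For comparison, here is what jumping around with pos() looks like today, between matches rather than in the middle of one. It is a minimal sketch: word_at is a hypothetical helper name and the sample string is invented. It relies only on pos() as an lvalue and on the \G anchor with /gc.

```perl
use strict;
use warnings;

# Hypothetical helper: read one word starting exactly at $offset,
# or return undef if no word starts there.
sub word_at {
    my ($string, $offset) = @_;
    pos($string) = $offset;                       # choose where \G anchors
    return $string =~ /\G(\w+)/gc ? $1 : undef;   # scalar-context match
}

my $text = 'alpha beta gamma';
print word_at($text, 6),  "\n";   # beta
print word_at($text, 11), "\n";   # gamma
```

A \]{...} metasequence would amount to doing the pos() assignment from inside the pattern, and against a location in a data structure rather than an offset in one string.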

Update 3: I'd just like to note that the use of the sequences \[{...} and \]{...} is entirely arbitrary and just based on \N{...}. If you have a better syntax then please speak up.

@matches = [ "LISTOP",
             "OP",
             "COP",
             "BINOP",
             [ "LOOP",
               [ "OP",
                 "UNOP",
                 [ "OP",
                   "UNOP",
                   [ "SVOP" ],
                 ],
               ],
               "UNOP",
               [ "LOGOP",
                 [ "OP",
                   "LISTOP",
                   [ "COP",
                     "LISTOP",
                     [ "OP",
                       "UNOP",
                       [ "SVOP" ],
                     ],
                     "OP",
                     "COP" ],
                 ],
               ],
             ],
           ] =~ m[(SVOP\[{[-1,-1]}UNOP)]g;

print Dumper(\@matches);
$VAR => [# match 0
         [ "UNOP", [ "SVOP" ] ],
         # Match 1
         [ "UNOP", [ "SVOP" ] ] ];

# empty those end SVOP strings
s[(?<=UNOP\[{[0,1],[1,1]})SVOP][];
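Until something like that syntax exists, the same result can be had with an ordinary recursive walk. The sketch below is only one reading of what the hypothetical m[...]g above is meant to return: find_pairs is an invented name, and "a string immediately followed by an array reference whose first element is the leaf" is my assumption about the intended match, not part of any existing module.

```perl
use strict;
use warnings;

# Hypothetical helper: collect every [ $parent, $child ] pair in which the
# string $parent is immediately followed by an array ref starting with $leaf.
sub find_pairs {
    my ($tree, $parent, $leaf) = @_;
    my @found;
    for my $i (0 .. $#$tree) {
        my $item = $tree->[$i];
        if (ref $item eq 'ARRAY') {
            push @found, find_pairs($item, $parent, $leaf);   # descend
            next;
        }
        my $next = $tree->[$i + 1];   # undef past the end of the array
        if (    $item eq $parent
            and ref $next eq 'ARRAY'
            and @$next
            and !ref $next->[0]
            and $next->[0] eq $leaf) {
            push @found, [ $item, $next ];
        }
    }
    return @found;
}

# A cut-down version of the op tree from the example.
my $tree = [ "LISTOP", "OP",
             [ "LOOP", [ "OP", "UNOP", [ "SVOP" ] ],
               "UNOP", [ "LOGOP", [ "OP", "UNOP", [ "SVOP" ] ] ] ] ];

my @matches = find_pairs($tree, "UNOP", "SVOP");
print scalar(@matches), "\n";   # 2
```

Each match holds a reference to the original child array, so emptying the SVOP strings afterwards is a plain assignment through that reference. What the walk lacks is everything a regex gives you for free: alternation, quantifiers and backtracking across dimensions.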

In reply to Multidimensional regular expressions by diotalevi
