PerlMonks  

Re^7: Parsing and translating Perl Regexes ( PPIx::Regexp::xplain ppixregexplain.pl _desc.pl )

by Anonymous Monk
on Nov 09, 2013 at 20:19 UTC (#1061862)


in reply to Re^6: Parsing and translating Perl Regexes ( PPIx::Regexp::xplain ppixregexplain.pl _desc.pl )
in thread Parsing and translating Perl Regexes

Thanks for the clarification.

It was the intent that the actual modifiers in effect be propagated so that you actually know what is in effect at any point in the Regex. Whether I did it right is another question. Or whether I did it in an obscure way and then didn't document worth a hoot. At any rate, I think the knowledge that the token for the "f" in /(?i)foo/ is not case-sensitive should be available somewhere.
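To make the scoping concrete, here is a small illustration of how Perl itself treats an embedded (?i). It uses only core Perl, not PPIx::Regexp, and the helper name matches() is made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if $string matches $regex, 0 otherwise.
sub matches {
    my ( $string, $regex ) = @_;
    return $string =~ $regex ? 1 : 0;
}

# (?i) at the top level applies to everything after it.
my $whole = matches( 'FOO', qr/(?i)foo/ );

# Without it, the match is case-sensitive.
my $plain = matches( 'FOO', qr/foo/ );

# (?i) inside a group lasts only until that group closes, so only
# the "f" is case-insensitive here.
my $first_only = matches( 'Foo', qr/((?i)f)oo/ );
my $rest_exact = matches( 'FOO', qr/((?i)f)oo/ );
```

The last two cases show why the modifier state has to be tracked per token: the "f" and the "oo" in the same regex are matched under different rules.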

On the other hand, I am not sure how seriously I am going to take /foo/aad, since it does not compile (at least under 5.18.1). On the gripping hand, there are already representations of invalid code, so maybe the "d" could become an invalid token. The disgusting thing is that it looks like I actually programmed semantics for this case, and that's definitely wrong.

As for munging around with other people's name spaces, I believe it is generally frowned upon. But I have also done it when desperate. It appears you have a genuine need to attach extra functionality to the PPIx::Regexp classes. And the strict O-O way requires you to subclass all however-many-there-are of them, and you STILL have to go through and rebless everything the parser spits out -- or I have to figure out how to make it use your classes as an option. The fact of the matter is that Perl does Aspect-Oriented programming right out of the box, so we may as well recognize the fact.
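A minimal sketch of what adding a method from outside looks like. The classes My::Element and My::Token and the method xplain are stand-ins invented for the example; they are not the real PPIx::Regexp classes.

```perl
use strict;
use warnings;

# Hypothetical stand-ins for a parser's class hierarchy.
package My::Element;
sub new     { my ( $class, %arg ) = @_; return bless {%arg}, $class }
sub content { return $_[0]{content} }

package My::Token;
our @ISA = ('My::Element');

package main;

# Outside code defines a sub directly in the base class's name space.
# Every subclass inherits it, so nothing the "parser" returns has to
# be subclassed or reblessed.
sub My::Element::xplain {
    my ($self) = @_;
    return ref($self) . ': ' . $self->content;
}

my $token = My::Token->new( content => 'foo' );
my $text  = $token->xplain;    # 'My::Token: foo'
```

The cost is the one discussed above: the added name lives in someone else's package, so it can collide with a method the owner adds later.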

What I'm currently thinking about is reserving to myself all subroutine names that begin with ASCII a-w, plus all that begin with one or two underscores, plus all the all-uppercase ones like DESTROY (which I actually use), AUTOLOAD (which I don't (yet)) and so on. Anything else would be fair game. If you plan to release your code as a CPAN module, I might need to document what parts of the name space you are using (and therefore break your anonymity to some hopefully-minimal extent), in case someone else wants to try the same thing.
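The proposed rule can be written down as a quick check. This is only one reading of the paragraph above: the function name is hypothetical, and treating digits and underscores as allowed inside an "all-uppercase" name is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: 1 if $name falls in the reserved space, else 0.
sub is_reserved_name {
    my ($name) = @_;
    return 1 if $name =~ /\A[a-w]/;                 # begins with ASCII a-w
    return 1 if $name =~ /\A_{1,2}(?!_)/;           # one or two leading underscores
    return 1 if $name =~ /\A[A-Z][A-Z0-9_]*\z/;     # all-uppercase (assumed form)
    return 0;                                       # fair game
}

my @fair_game = grep { !is_reserved_name($_) }
    qw( content xplain DESTROY _private Xplain yadda );
# @fair_game is ( 'xplain', 'Xplain', 'yadda' )
```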

Yes, I thought about having PPIx::Regexp actually explain what the tokens were, but I had no pressing need. The problem I was trying to solve was that I was helping out with Perl::Critic, and they were using a different regex parser, which was weird, unmaintained, and started throwing warnings about Perl 5.12 (or maybe 5.14).

Tom Wyant


Re^8: Parsing and translating Perl Regexes ( PPIx::Regexp::xplain ppixregexplain.pl _desc.pl )
by Anonymous Monk on Nov 11, 2013 at 00:51 UTC

    Status update:

    The "\g10" thing was an out-and-out bug, caused by blindly reblessing backreferences over and above the number of capture groups present. Only things of the form \10 should be so reblessed.
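The distinction comes from Perl itself: \10 is a backreference only when at least ten groups precede it, while \g10 is always a backreference. A short core-Perl illustration (it does not use PPIx::Regexp):

```perl
use strict;
use warnings;

# One group only: \10 is the octal escape for chr(8), not a backreference.
my $octal_match = "a\x08" =~ /\A(a)\10\z/ ? 1 : 0;

# Ten groups: \10 now refers to the tenth capture group.
my $ten_groups    = '(a)' x 10 . '\10';
my $backref_match = ( 'a' x 11 ) =~ /\A$ten_groups\z/ ? 1 : 0;

# \g10 is never octal, so with one group the pattern cannot compile.
# The pattern is interpolated so the failure happens at run time.
my $pattern  = '(a)\g10';
my $regex    = eval { qr/$pattern/ };
my $compiled = defined $regex ? 1 : 0;
```

Here $octal_match and $backref_match are both 1, and $compiled is 0.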

    The failure to recognize /foo/aia as equivalent to /foo/aai is also a bug.
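One way to make the comparison independent of order is to count the letters rather than look at the string as written. A minimal sketch with a hypothetical helper name; it makes no attempt to reject invalid combinations.

```perl
use strict;
use warnings;

# Hypothetical helper: sort the modifier letters, keeping repeats, so
# that strings differing only in order compare equal.
sub canonical_modifiers {
    my ($mods) = @_;
    my %count;
    $count{$_}++ for split //, $mods;
    return join '', map { $_ x $count{$_} } sort keys %count;
}

my $same = canonical_modifiers('aia') eq canonical_modifiers('aai') ? 1 : 0;
```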

    The thing with recognizing /foo/ad as /foo/d is more problematic, partly because my design goal was never to distinguish valid regexes from invalid ones, but only to parse valid ones "correctly". The practical problem is that I can't do anything about the error at the point I might detect it, since the code at that point needs to consider also stuff fed in from (e.g.) "use re '/x';". So for the moment nothing is going to be done about those.

    On the other hand, I have come up with a method on PPIx::Regexp::Element (i.e. inherited by all PPIx::Regexp objects) that will tell whether a given modifier is asserted. I'm sure there are all sorts of edge cases that I have not considered, but in "/(?-i:foo)/i" it correctly says /i is _not_ asserted on the "f".
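As a rough illustration of the idea, the sketch below tracks modifier scope over a raw pattern string. It is not the PPIx::Regexp method described above: the function name and signature are hypothetical, and it assumes only (?flags), (?flags:...) and ordinary groups, ignoring character classes, comments and (?^...).

```perl
use strict;
use warnings;

# Hypothetical helper, NOT the PPIx::Regexp API. Returns 1 if modifier
# $mod is in effect at character $offset of pattern body $body, given
# the trailing modifiers $trailing; otherwise 0.
sub modifier_in_effect {
    my ( $body, $trailing, $offset, $mod ) = @_;
    my %current = map { $_ => 1 } split //, $trailing;
    my @saved;    # modifier sets to restore when a group closes
    pos($body) = 0;
    while ( pos($body) < length $body ) {
        last if pos($body) >= $offset;
        if ( $body =~ /\G\(\?([a-z]*)(?:-([a-z]*))?([:)])/gc ) {
            my ( $on, $off, $end ) = ( $1, defined $2 ? $2 : '', $3 );
            push @saved, {%current} if $end eq ':';    # scoped to this group
            $current{$_} = 1 for split //, $on;
            delete $current{$_} for split //, $off;
        }
        elsif ( $body =~ /\G\(/gc ) { push @saved, {%current} }
        elsif ( $body =~ /\G\)/gc ) { %current = %{ pop @saved } if @saved }
        else                        { $body =~ /\G\\?./gcs }    # literal or escape
    }
    return $current{$mod} ? 1 : 0;
}

# The "f" in /(?-i:foo)/i sits at offset 5 of the body.
my $on_f = modifier_in_effect( '(?-i:foo)', 'i', 5, 'i' );    # 0
```

With the same helper, the "f" at offset 4 of '(?i)foo' with no trailing modifiers reports 1, which is the case from the earlier post.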

    Unfortunately I am looking at a very busy week, and probably will not get anything published until very late in the week at the earliest.

    Tom Wyant
