PerlMonks  

Re: perl look ahead regular expression that is optional?

by talexb (Canon)
on Oct 03, 2013 at 20:37 UTC ( #1056793=note )


in reply to perl look ahead regular expression that is optional?

This is an interesting question -- I like brain-teasers as much as the next developer, so I wrote a solution. Except it doesn't work, because I think the example you have is faulty. Perhaps someone else can find the flaw in my reasoning.

    Find a string 'foobar' that is followed by two sets of optional characters
    • the first set of characters might contain a b or c, in that order. Only zero or one a, zero or one b, and zero or one c is allowed.
    • The second set of characters is always preceded by an X (to delimit that this is the second set) and might contain a b or c, Only zero or one a, zero or one b, and zero or one c is allowed in the second set.

OK -- foobar, then optionally (zero or one 'a', 'b', 'c'), then optionally ('X' followed by zero or one 'a', 'b', 'c'). But then there's a list of examples, and #5 seems to break the rule about the second group ..

    foobarabcX must not match, nothing follows 'X'

I'm not sure this is right. The second optional group has been defined as 'X', followed by zero or one 'a', 'b' and 'c'. Which means I could have an 'X' followed by zero 'a', 'b' and 'c'. Which means that foobarabcX is a valid pattern.
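To make that reading concrete, here is a minimal sketch of the rules taken literally. The pattern is my own illustration, not anything from the original question, and anchoring the whole string with ^ and \z is an assumption made so that each test string is checked end to end.

```perl
use strict;
use warnings;

# Literal reading of the rules (illustrative, not the OP's regex):
# 'foobar', then optional a, b, c in that order, then optionally
# an 'X' followed by optional a, b, c in that order.
my $literal = qr/^foobar a?b?c? (?: X a?b?c? )? \z/x;

# Report how each sample string fares under the literal reading.
for my $s (qw(foobar foobarabc foobarabcXa foobarabcX foobarcb)) {
    printf "%-12s %s\n", $s, ($s =~ $literal ? 'match' : 'no match');
}
```

Under this reading foobarabcX matches, because the second group is allowed to be empty, while foobarcb does not, because the letters are out of order.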

Anyone else see the flaw?

Alex / talexb / Toronto

Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.


Re^2: perl look ahead regular expression that is optional?
by Tanktalus (Canon) on Oct 03, 2013 at 23:50 UTC

    Management comes up with these contradictory requirements all the time :)

    But, really, this just looks like they mean that there must be SOMETHING after the X, but it could be a, b, OR c, or more than one of those, in relative order.

    So, Xa is ok, so is Xb, Xc, Xab, Xac, Xbc, and Xabc. Just X by itself is invalid - it's not allowed to be empty.
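One way this stricter reading could be expressed is with a lookahead right after the X. This is only a sketch under that interpretation and is not necessarily the regex posted in this thread; the name $strict and the ^ ... \z anchoring are assumptions for the example.

```perl
use strict;
use warnings;

# Stricter reading (illustrative): an X is only allowed when at least
# one of a, b, c follows it. The lookahead (?=[abc]) checks for that
# without consuming anything, so a?b?c? still enforces the order.
my $strict = qr/^foobar a?b?c? (?: X (?=[abc]) a?b?c? )? \z/x;

# Report how each sample string fares under the stricter reading.
for my $s (qw(foobar foobarabcXa foobarXbc foobarabcX foobarXba)) {
    printf "%-12s %s\n", $s, ($s =~ $strict ? 'match' : 'no match');
}
```

Here foobarabcX is rejected because nothing from [abc] follows the X, and foobarXba is rejected because the letters after the X are out of order.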

    I seem to recall such types of ambiguity to be common both among teachers and managers. Something about not having to implement it themselves combined with "I know what I mean. Why don't you?".

    And, based on my tests,

    should do the job. But if the OP doesn't understand why while trying to hand it in as their own work, they'll find that they're going to get further and further behind as their course progresses.

      Awesome, it works! 13 years programming Perl, and I'm still learning new sh*t.

      Well, I'm glad you understand it -- I don't. And now I need to peel open my Camel (4th edition) and figure out what your regexp does. After 15 years, I'm still learning too. :/

      Alex / talexb / Toronto

      Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

      Thanks for posting your solution to this -- I've managed OK with regexps, but your post encouraged me to get a little more familiar with how lookaround assertions work. I'm going to document the regexp you've provided in order to show that I believe the instructions are still incorrect (see also my post here where I list all of the possible legal patterns that I think exist).

      Alex / talexb / Toronto

      Thanks PJ. We owe you so much. Groklaw -- RIP -- 2003 to 2013.

        I think you're taking the wrong set of instructions as definitive. Maybe you should play more D&D - where every rule seems to have an exception somewhere else :)

        ) # .. end of first group (is this missing a '?' to show it's optional?)
        Nope. The first group exists, even if it's empty. That is called out explicitly. The only reason for the ? after the non-capturing group is because the X itself is optional - and not present when the second group is not present.

        As for your other post, again, you're using the more precise instructions as definitive instead of the less precise instructions (with explicit examples). If we take the examples as definitive, and fill in their ambiguity with the precise instructions instead of the other way around, we see that the X is always followed by something. That the precise instructions would seem to allow X without anything after them becomes moot because we are starting with the higher-level instructions which already said that wasn't allowed (through the example).
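The gap between the two readings can be counted by brute force. The sketch below builds every candidate of the form foobar + set, optionally followed by X + set, and checks it against two illustrative patterns; $literal and $strict are my own stand-ins for the two interpretations, not regexes quoted from the thread.

```perl
use strict;
use warnings;

# All in-order subsets of 'abc', including the empty one:
# '', a, b, ab, c, ac, bc, abc
my @sets;
for my $mask (0 .. 7) {
    my $s = '';
    $s .= 'a' if $mask & 1;
    $s .= 'b' if $mask & 2;
    $s .= 'c' if $mask & 4;
    push @sets, $s;
}

# Illustrative patterns for the two readings (assumptions, see lead-in).
my $literal = qr/^foobar a?b?c? (?: X a?b?c? )? \z/x;
my $strict  = qr/^foobar a?b?c? (?: X (?=[abc]) a?b?c? )? \z/x;

# Candidates: foobar + first set, optionally followed by X + second set.
my @candidates;
for my $first (@sets) {
    push @candidates, "foobar$first";
    push @candidates, "foobar${first}X$_" for @sets;
}

my @literal_ok   = grep { $_ =~ $literal } @candidates;
my @strict_ok    = grep { $_ =~ $strict  } @candidates;
my @only_literal = grep { $_ !~ $strict  } @literal_ok;

printf "literal reading accepts %d strings\n", scalar @literal_ok;  # 72
printf "strict reading accepts  %d strings\n", scalar @strict_ok;   # 64
print  "accepted only by the literal reading:\n";
print  "  $_\n" for @only_literal;                                  # 8 strings, each ending in X
```

The two readings disagree only on the eight strings that end in a bare X, which is exactly the case the example rules out.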

        The problem is that communication is a two-way street, and you're using your definitions of "set" and "might" and "zero or one" instead of theirs. That theirs is nonsensical is completely beside the point - they showed us their flawed dictionary through the examples. Not everyone understands that "set" can include the null or empty set. Or that their use of "might" is ambiguous (they mean "might include a, might include b, might include c" and left off the "but must include something" bit, thinking that "set" means "something" and can never be empty).

        Think about this as a list of items in perl:

        @r = (@$a, @b, c,); # c would be a function call!
        In Perl, those commas, which represent "X" in the OP's question, are required, except for the last one. However, in other languages, such as C (in a function call's argument list, for instance), that last one is an error. And that follows the requirement - you can have that X, but only if something follows it, though what that is can be anything. To complete the simile, "a" becomes "@", "b" becomes "$", and c becomes "a name (such as a, b, c here)". The OP just can't allow that trailing comma, because he's reading something more like C than Perl. (Of course, in Perl, "c", being "a name", isn't actually optional unless the first two are also missing.)
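For anyone who wants to see the trailing-comma behaviour in action, here is a small sketch. The names $first, @second and third() are made up for the example and stand in for the @$a, @b and c above.

```perl
use strict;
use warnings;

my $first  = ['a1', 'a2'];    # array reference, flattened with @$first
my @second = ('b1');
sub third { return 'c1' }     # stands in for the function call "c"

# Perl accepts a trailing comma in a list and ignores it,
# so both lists end up with the same elements.
my @with_trailing    = (@$first, @second, third(),);
my @without_trailing = (@$first, @second, third());

print scalar(@with_trailing), " elements: @with_trailing\n";
print scalar(@without_trailing), " elements: @without_trailing\n";
```

Both lines report 4 elements, so the trailing comma adds nothing to the list.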

        The hardest part about any spec, as always, is figuring out what the author meant as opposed to what they said. And that means understanding who the author is and where they're coming from. In this case, it was a wild-assed guess. But that's one of the beauties of the site - I'm not regularly the one to predict what a petitioner is asking, but it's pretty often that someone does predict it. It just happened to be me this time. :)
