
Re: Common Regex Gotchas -- "(:?"

by shenme (Priest)
on Sep 29, 2005 at 18:28 UTC ( #496199=note )

in reply to Common Regex Gotchas

Summary: "(:?" is a quiet guy, but not as well-mannered and quick as that "(?:" fellow.

When the regex syntax was extended to include features like zero-width negative look-ahead, the authors tried very hard to use syntax that did not duplicate any existing 'real' regex code. So they started all the new syntax with '(?'. It turns out that this makes typos a bit too easy, and far too quiet.

I came across the following in a CPAN module:

What the RE does matters less than the fact that 1) it doesn't work as intended, and 2) it doesn't fail (loudly).

The writer intended to use "(?:", the clustering (non-capturing) group. This is used when you need to group a subexpression without capturing what it matches. For instance, you might want to say that a complex inner match is optional, e.g.

... ( contains \s+ (?:this|that)? \s+ item ) ...

But tyops happen. What is the result if you reverse the ':' and '?' characters? Nothing drastic, usually.

In "(:? pattern )" the '?' reverts to its original meaning as a quantifier, so the ':' becomes an optionally matched literal character. The parentheses also revert to their original meaning as a capturing group.

So usually the only result is that the regex is a bit slower and captures more substrings. It might also allow a stray ':' input character. If you weren't monitoring how many captures come back from a successful match you might never notice the typo.
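
A minimal sketch of the difference, assuming a simplified version of the example pattern above. The input strings and the group_count helper are made up for illustration and are not from the CPAN module in question:

```perl
use strict;
use warnings;

# Intended: "(?:" clusters the alternation without capturing it.
my $good = qr/contains\s+(?:this|that)?\s+item/;

# Typo: "(:?" makes ':' an optional literal and turns the group into a capture.
my $typo = qr/contains\s+(:?this|that)?\s+item/;

# Hypothetical helper: number of capture groups seen by a successful match.
# $#+ is the index of the last group in the most recent successful match.
sub group_count {
    my ($re, $string) = @_;
    return unless $string =~ $re;
    return $#+;
}

print group_count($good, 'contains this item'), "\n";   # 0 capture groups
print group_count($typo, 'contains this item'), "\n";   # 1 capture group

# The typo also lets a stray ':' through; the intended pattern rejects it.
print "typo accepts stray colon\n" if 'contains :this item' =~ $typo;
print "good rejects stray colon\n" if 'contains :this item' !~ $good;
```

Both patterns accept the valid input, which is why the typo goes unnoticed unless something checks the capture count or feeds in a stray ':'.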

But note that this typo could occur with any single character "(?X" syntax. You might notice it right away if your "(#? comment )" caused syntax errors (under /x, for instance, the '#' starts a comment that swallows the closing parenthesis). And you should notice it when your input matching tests fail on "fore(=?fend)". But otherwise these typos will fail silently.
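
A rough sketch of those two cases, assuming the test input 'forefend' and patterns built from the fragments quoted above. The variable names are illustrative only:

```perl
use strict;
use warnings;

my $intended = qr/fore(?=fend)/;   # zero-width look-ahead: consumes only "fore"
my $typo     = qr/fore(=?fend)/;   # optional '=' plus a capturing group

# Wrap each pattern in a capture to see how much text it consumes.
my ($matched_intended) = 'forefend' =~ /($intended)/;
my ($matched_typo)     = 'forefend' =~ /($typo)/;

print "intended matched: $matched_intended\n";   # fore
print "typo matched:     $matched_typo\n";       # forefend

# The comment typo: compiled at run time so a failure can be trapped by eval.
my $comment_typo = '(#? comment )';

# Without /x this is just an optional '#' followed by literal text.
my $compiles_plain = eval { qr/$comment_typo/; 1 };

# With /x the '#' starts a comment, so the '(' is never closed.
my $compiles_x = eval { qr/$comment_typo/x; 1 };

print "plain: ", ($compiles_plain ? "compiles" : "fails"), "\n";
print "/x:    ", ($compiles_x     ? "compiles" : "fails"), "\n";
```

The look-ahead typo shows up only if a test checks what was consumed or what follows the match, and the comment typo is loud only when the pattern is compiled with /x.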

Now this is a minor gotcha. Except that it is found in 15 nodes here, with another node mentioning it in an aside, and another node discovering the typo in a book. I wonder if it is in your code?

perlre - Extended Patterns
