NetWallah has asked for the wisdom of the Perl Monks concerning the following question:
Esteemed RegEx-Monkers:
In parsing log file lines that contain XML-ish content with a single regex, I'm having trouble understanding the subtleties of optional capture.
The string I'm parsing is like:
<blah1 phase="2" type="MyType" more_keys="Values" <Unwanted/> <SomeTagIwant><k1="v1"></SomeTagIwant>
And I'm trying to extract the content of the "type", and the tag name of a tag that ends with "TagIwant".
The Tag may or may not be present.
I'm able to capture both pieces with the RE:
\btype="([^"]+)".+<(\w+TagIwant\b)
but the match fails if I append a "?" to the expression, in an attempt to make it optional.
I.e. this fails:
perl -E '$x=q|<blah1 phase="2" type="MyType" more_keys="Values" <Unwanted/> <SomeTagIwant><k1="v1"></SomeTagIwant>|; say for $x=~/\btype="([^"]+)".+<(\w+TagIwant\b)?/'
Which returns only "MyType", and not the second expected capture of "SomeTagIwant".
The "\b" is an attempt to deal with variations like <SomeTagIwant/> and <SomeTagIwant k3="v3" /> .
I'm hoping for (1) Explanations for why the "?" fails, and (2) Suggestions on how to fix it.
All power corrupts, but we need electricity.
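A minimal sketch of what seems to be going on, and one possible fix (not necessarily what the replies suggested). In the original pattern only the capture group is optional; the ".+<" before it is still required and greedy, so it runs to the last "<" in the line, which is the one in "</SomeTagIwant>". The optional group cannot match at "/", so it matches zero times and the regex succeeds with an undefined second capture. The variation below, which is an assumption about the intent, wraps the lazy skip and the tag in one optional non-capturing group; the variable names are made up for illustration.

```perl
use strict;
use warnings;
use feature 'say';

# Sample line from the question.
my $line = q|<blah1 phase="2" type="MyType" more_keys="Values" <Unwanted/> <SomeTagIwant><k1="v1"></SomeTagIwant>|;

# Original attempt: greedy ".+<" lands on "</SomeTagIwant>", where the
# optional group can only match zero times, so the second capture is undef.
my @orig = $line =~ /\btype="([^"]+)".+<(\w+TagIwant\b)?/;

# Possible fix: make "skip ahead and find the tag" optional as a whole,
# and skip lazily (.*?) so the tag is tried before giving up.
my $re    = qr/\btype="([^"]+)"(?:.*?<(\w+TagIwant\b))?/;
my @fixed = $line =~ $re;

# A line without the wanted tag still matches; the second capture is undef.
my @no_tag = q|<blah1 phase="2" type="MyType" more_keys="Values" <Unwanted/>| =~ $re;

say "original: ", join ", ", map { $_ // "(undef)" } @orig;
say "fixed:    ", join ", ", map { $_ // "(undef)" } @fixed;
say "no tag:   ", join ", ", map { $_ // "(undef)" } @no_tag;
```

With the sample line this should print "MyType, (undef)" for the original pattern, "MyType, SomeTagIwant" for the variation, and "MyType, (undef)" for the line without the tag.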
Replies are listed 'Best First'.
Re: Regex Optional capture doesn't
by haukex (Archbishop) on Oct 05, 2017 at 18:11 UTC
Re: Regex Optional capture doesn't
by LanX (Saint) on Oct 05, 2017 at 16:34 UTC
by NetWallah (Canon) on Oct 05, 2017 at 16:39 UTC
by LanX (Saint) on Oct 05, 2017 at 16:47 UTC
by LanX (Saint) on Oct 05, 2017 at 16:55 UTC
by NetWallah (Canon) on Oct 05, 2017 at 17:40 UTC
Re: Regex Optional capture doesn't
by Laurent_R (Canon) on Oct 05, 2017 at 17:47 UTC
by NetWallah (Canon) on Oct 05, 2017 at 17:53 UTC
by Laurent_R (Canon) on Oct 05, 2017 at 22:18 UTC