PerlMonks
Re: How to strip comments and whitespace from a regex defined with /x?
by Laurent_R (Canon) on Jan 19, 2018 at 20:55 UTC [id://1207561]
Hi jh
Let me first say that if you intend to do that with a regex, or even several regexes, I am afraid this is going to be quite difficult. To quote from the documentation on the x modifier:

"A single /x tells the regular expression parser to ignore most whitespace that is neither backslashed nor within a bracketed character class. You can use this to break up your regular expression into more readable parts. Also, the "#" character is treated as a metacharacter introducing a comment that runs up to the pattern's closing delimiter, or to the end of the current line if the pattern extends onto the next line. Hence, this is very much like an ordinary Perl code comment. (You can include the closing delimiter within the comment only if you precede it with a backslash, so be careful!)"

So it means, for example, that you can't just remove everything that comes on a line after a # pound sign, because a pound sign that is part of a bracketed character class is a literal character, not the start of a comment. That means in turn that you need to detect character classes (and that, in itself, is far from trivial). Also, for any pound sign you find, you need to check that it is not escaped by a backslash. Assuming that you build a bunch of regexes dealing correctly with pound signs, you then need to deal with white space, which is also quite difficult because the same exceptions (backslashed, or within a bracketed character class) apply.

So, in brief, it is certainly possible to use regexes to do that, but it is likely to be complex and very difficult. FWIW, I can think of the following alternatives:
Maybe some other monk(s) will be able to suggest a better solution, but that's what I can think of at the moment. Please also note that, starting with Perl 5.26, there is also an /xx modifier with different rules.
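To make the rules quoted above a bit more concrete, here is a minimal sketch of a character-by-character scanner (rather than a regex) that tracks backslashes and bracketed character classes. The name strip_x is made up for this illustration, and the code rests on several assumptions: it only follows the single /x rules (not /xx, which also ignores blanks inside bracketed classes), it treats only what \s matches as whitespace, it receives the pattern text without its delimiters, and it does not try to handle \Q...\E, (?{ ... }) code blocks or interpolated variables. Removing whitespace may also fuse two adjacent tokens in rare cases (a backreference followed by a digit, for instance), so take it as a starting point only.

```perl
use strict;
use warnings;

# Hypothetical helper (not from the original post): remove /x-style
# comments and insignificant whitespace from the source text of a pattern.
sub strip_x {
    my ($src)    = @_;
    my $out      = '';
    my $in_class = 0;                # true while inside a [...] class
    my $len      = length $src;
    my $i        = 0;

    while ($i < $len) {
        my $c = substr $src, $i, 1;

        if ($c eq '\\' && $i + 1 < $len) {
            # Backslashed character: copy both characters untouched.
            $out .= substr $src, $i, 2;
            $i += 2;
        }
        elsif ($in_class) {
            if ($c eq '[' && substr($src, $i) =~ /\A(\[:\^?\w+:\])/) {
                # POSIX class such as [:alpha:]; its ] does not end the class.
                $out .= $1;
                $i += length $1;
            }
            else {
                $in_class = 0 if $c eq ']';
                $out .= $c;          # everything in a class is literal
                $i++;
            }
        }
        elsif ($c eq '[') {
            # Start of a class; a leading ^ and a leading ] belong to it.
            my ($open) = substr($src, $i) =~ /\A(\[\^?\]?)/;
            $out .= $open;
            $i += length $open;
            $in_class = 1;
        }
        elsif (substr($src, $i, 3) eq '(?#') {
            # Inline comment: drop everything up to the first ")".
            my $end = index $src, ')', $i;
            $i = $end < 0 ? $len : $end + 1;
        }
        elsif ($c eq '#') {
            # /x comment: drop everything up to the end of the line.
            my $end = index $src, "\n", $i;
            $i = $end < 0 ? $len : $end;
        }
        elsif ($c =~ /\s/) {
            $i++;                    # unescaped whitespace outside a class
        }
        else {
            $out .= $c;
            $i++;
        }
    }
    return $out;
}

my $src = <<'END';
^ \s* ( [#\s]+ )   # a class containing a literal pound sign
\# \  (?#inline) foo # trailing comment
END

print strip_x($src), "\n";    # prints ^\s*([#\s]+)\#\ foo
```

The pound sign inside the class, the backslashed pound sign and the backslashed space all survive, while both kinds of comments and the remaining whitespace are dropped.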