Best Practice: Order of regex modifiers?

LanX has asked for the wisdom of the Perl Monks concerning the following question:

Replies are listed 'Best First'.
Re: Best Practice: Order of regex modifiers? by haukex (Archbishop) on Feb 01, 2017 at 15:28 UTC
Hi Rolf, Most regexes I write only have a few modifiers, so it's hard to get confused no matter what order they are in, and in those cases I don't see order as a problem. Although I don't have a strong opinion on this, if I did have to settle on a standard, I might consider the order that Perl uses when regexes are stringified, e.g. `$ perl -wMstrict -le 'print qr/abc/msixpoun'` prints `(?upmsixn:abc)`. Unfortunately, this order doesn't match the documentation, `qr/STRING/msixpodualn`, which is another possible ordering... Update: Also, I often place those modifiers that change the behavior of the regex, like `/gc`, first, so they're immediately obvious. Regards, -- Hauke D
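To see the stringification order on a given perl, one can compile the same modifiers in two different source orders and compare the results. This is a minimal sketch: the exact stringified form (a leading `^`, which flags are shown) differs between Perl versions, so it only demonstrates that the order typed in the source does not survive compilation.

```perl
use strict;
use warnings;

# Two regexes with the same modifiers, typed in a different order.
my $first  = qr/abc/msix;
my $second = qr/abc/xism;

# Stringification uses Perl's own internal order, not the order typed,
# so both lines print identically on a given perl.
print "$first\n";
print "$second\n";
print "same stringification\n" if "$first" eq "$second";
```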
Re^2: Best Practice: Order of regex modifiers? by LanX (Saint) on Feb 01, 2017 at 15:35 UTC
Hi Hauke, Thanks, I ignored the "natural order" of perldocs ;-) > Update: Also, I often place those modifiers that change the behavior of the regex, like /gc, first, so they're immediately obvious. Well all modifiers change the behaviour of a regex, don't you think? (I think that's why they are called modifiers ;-) This leads to my suggestion to order (or at least group) by importance... Cheers Rolf (addicted to the Perl Programming Language and ☆☆☆☆ :) Je suis Charlie!)
Re^3: Best Practice: Order of regex modifiers? by haukex (Archbishop) on Feb 01, 2017 at 15:46 UTC
Hi Rolf,
> Well all modifiers change the behaviour of a regex, don't you think?
Well yes, but: some change how the pattern of the regex is treated, like `/xmsialud`, whereas some change how the regex operator, like `m//`, behaves. For example, the return values of `m//` are quite different from those of `m//g`, and `/g` doesn't affect how the pattern is treated. Regards, -- Hauke D
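The distinction between pattern-level and operator-level modifiers can be seen in the list-context return values of `m//`. A small sketch (the sample string and variable names are made up for illustration):

```perl
use strict;
use warnings;

my $text = 'a1b22c333';

# Without /g and without capture groups, list-context m// returns (1) on success.
my @without_g = $text =~ /\d+/;

# With /g, list-context m// returns every matched substring instead.
my @with_g = $text =~ /\d+/g;

# /i changes how the pattern is interpreted; the return convention is unchanged.
my @with_i = $text =~ /A\d/i;

print "without /g: @without_g\n";   # without /g: 1
print "with /g:    @with_g\n";      # with /g:    1 22 333
print "with /i:    @with_i\n";      # with /i:    1
```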
Re^4: Best Practice: Order of regex modifiers? by LanX (Saint) on Feb 02, 2017 at 17:27 UTC
Re: [OT - Separator character]: Best Practice: Order of regex modifiers? by AnomalousMonk (Archbishop) on Feb 01, 2017 at 17:43 UTC
I've often thought that a separator character would be useful in regex modifier strings to improve readability. Literal numbers have the `_` (underscore) for this reason, and I don't see why this separator cannot be "overloaded" for use in regexes. E.g., rather than `m{ ... }xmsgco` or `s{ ... }{...}xmsgeepo` one might write `m{ ... }xms_gc_o` or `s{ ... }{...}xms_geep_o` (just to fabricate some extreme cases). Of course, my own personal practice is always to use an `/xms` modifier tail, so a separator would always fall after this mandatory group if there were additional modifiers. Give a man a fish: `<%-{-{-{-<`
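Perl has no such separator, but the idea can be approximated in user code for the pattern-level modifiers by translating a grouped string into an inline `(?flags)` group. The helper name `compile_grouped` and its restriction to `m`, `s`, `i`, `x` are assumptions made for this sketch; operator-level modifiers such as `/g`, `/c` or `/e` cannot be applied this way.

```perl
use strict;
use warnings;

# Hypothetical helper (not a core feature): accept a modifier string with
# "_" separators and apply it through an inline (?flags) group.
sub compile_grouped {
    my ($pattern, $grouped) = @_;
    (my $flags = $grouped) =~ tr/_//d;    # drop the separators
    die "only m, s, i, x are handled in this sketch\n" if $flags =~ /[^msix]/;
    my $source = length $flags ? "(?$flags)$pattern" : $pattern;
    return qr/$source/;
}

my $re = compile_grouped('^ foo \s+ bar $', 'xms_i');
print "matched\n" if "Foo   BAR" =~ $re;
```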
Re^2: [OT - Separator character]: Best Practice: Order of regex modifiers? by LanX (Saint) on Feb 01, 2017 at 18:29 UTC
Agreed! Or a grouping syntax `s{ ... }{...}{xms geep o}`. Anyway I think you could already use the `x` as such if you are careful about whitespace. Like `s{ ... }{...}msxgeepxo`. OTOH it doesn't look much better! BTW: the use of `o` seems to be discouraged. http://stackoverflow.com/questions/550258/does-the-o-modifier-for-perl-regular-expressions-still-provide-any-benefit Cheers Rolf (addicted to the Perl Programming Language and ☆☆☆☆ :) Je suis Charlie!)
Re^3: [OT - Separator character]: Best Practice: Order of regex modifiers? by AnomalousMonk (Archbishop) on Feb 01, 2017 at 19:08 UTC
> ... the use of o seems to be discouraged.
I only latched onto `/o` because I was casting about for something to use in a manufactured example. AFAIU, the `/o` modifier is only useful now in those very limited cases in which one wishes to prevent recompilation of a `qr// m// s///` even when interpolated `Regexp` objects or strings have changed. My understanding is that these operators will not now recompile on each execution unless an interpolated regex/string has changed. Give a man a fish: `<%-{-{-{-<`
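The effect described above can be observed with a loop whose interpolated variable changes between iterations. A minimal sketch, with made-up sample data:

```perl
use strict;
use warnings;

my (@plain, @with_o);
for my $word (qw(cat dog)) {
    # No /o: the pattern is recompiled because the interpolated string changed.
    push @plain, ('cat dog' =~ /($word)/)[0];

    # /o: the pattern is compiled on first execution and then frozen,
    # so the second iteration still looks for "cat".
    push @with_o, ('cat dog' =~ /($word)/o)[0];
}

print "plain: @plain\n";    # plain: cat dog
print "/o:    @with_o\n";   # /o:    cat cat
```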
Re^4: Separator character: Best Practice: Order of regex modifiers? by Anonymous Monk on Feb 02, 2017 at 02:45 UTC
Re: Best Practice: Order of regex modifiers? by hippo (Bishop) on Feb 02, 2017 at 09:47 UTC
Alphabetical. Surely if you are trying to eye-parse some code and want to know if a particular modifier has been applied, this is the clearest and fastest approach to use. Perhaps this question could be submitted to the poll ideas quest 2017?
Re^2: Best Practice: Order of regex modifiers? by LanX (Saint) on Feb 02, 2017 at 14:05 UTC
> Alphabetical. Surely if you are trying to eye-parse some code and want to know if a particular modifier has been applied, this is the clearest and fastest approach to use.

I disagree. First, one should separate modifiers which are `s///`-only from standard `m//` modifiers (the latter, or at least most of them, can also be pre-compiled into the regex using `qr//`). Then ordering by (and/or) category, seniority (new vs established), frequency² and memorizing makes sense. For instance `/a /d /l /u` are perlre#Character-set-modifiers¹ but are mostly listed as `/dual` for obvious reasons: the word "dual" is far easier to remember. (I'd even argue that `/i` belongs to the same category, but with much higher frequency.) So I'd say divide and conquer: humans can grasp sets with 5 to 7 elements far more easily, so 5 categories with at most 5 elements should fit. (... because of connectivity problems the rest of the post got lost :/ ... TL; don't want to rewrite and posting by tethering thru mobile) So my bet at the moment is the following order by categories, respecting frequency and memorization:

Categories:
Syntax: x
Line: m, s
Matching: n, p
Character: i, d, u, a, l
Operation: g, c, (r)
Substitution-only: r, e, ee, o

Cheers Rolf (addicted to the Perl Programming Language and ☆☆☆☆ :) Je suis Charlie!)

¹ not sure why the deep linking doesn't work (for me); it seems the anchor is missing.
² in 5.10, perlre only listed 7 modifiers and already did a categorization: "g and c: Unlike i, m, s and x, these two flags affect the way the regex is used"
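The category ordering proposed above could be applied mechanically. The following is only a sketch: the helper name `order_by_category` is hypothetical, `r` (listed in two categories) is arbitrarily ranked once in its substitution-only position, and unknown letters are sorted last.

```perl
use strict;
use warnings;

# Category order taken from the post above: Syntax, Line, Matching,
# Character, Operation, Substitution-only.
my @category_order = qw(x  m s  n p  i d u a l  g c  r e o);
my %rank;
@rank{@category_order} = (0 .. $#category_order);

# Hypothetical helper: reorder a modifier string by category rank.
# Letters not in the table sort last, alphabetically.
sub order_by_category {
    my ($modifiers) = @_;
    return join '', sort {
        ($rank{$a} // 99) <=> ($rank{$b} // 99) or $a cmp $b
    } split //, $modifiers;
}

print order_by_category('xegis'), "\n";     # xsige
print order_by_category('geepxms'), "\n";   # xmspgee
```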
Re: Best Practice: Order of regex modifiers? by kcott (Archbishop) on Feb 02, 2017 at 15:07 UTC
G'day Rolf,

Purely for readability, I generally try to keep the modifiers in alphabetical order. In that respect, I concur with ++hippo's response.

I'm pretty sure that the "`xms` default" came about because that was the order they were introduced in the book "Perl Best Practices":

Always use the `/x` flag. (pp. 236-237)
Always use the `/m` flag. (pp. 237-239)
Always use the `/s` flag. (pp. 240-241)

Some modifiers can be applied to the regex itself (e.g. `/x`); others, to any operation the regex is involved in (e.g. `/g`); and others to only a specific operation (e.g. `/e`). It's a fatal error to use them in the wrong places:

`$ perl -E 'say qr{}x'`
`(?^ux:)`
`$ perl -E 'say qr{}g'`
`Unknown regexp modifier "/g" at -e line 1, near "say "`
`Execution of -e aborted due to compilation errors.`
`$ perl -E 'say m{}g'`
`$ perl -E 'say m{}e'`
`Unknown regexp modifier "/e" at -e line 1, near "say "`
`Execution of -e aborted due to compilation errors.`
`$ perl -E 'say s{}{}e'`
`1`

I can see some benefit in keeping those grouped together: your initial example of `xegis` would become `isxge`. I'm also not averse to ++AnomalousMonk's suggestion of using a separator. In which case, `xegis` would become `isx_g_e`.

Overall, I'm not too bothered by personal preferences regarding modifier ordering: deciding upon a single style, and using it consistently, is far more important, in my opinion.

-- Ken
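The same checks can be scripted rather than run one at a time from the shell. This sketch uses string `eval` to compile, but never execute, each operator/modifier pair; the helper name `accepts_modifier` is hypothetical.

```perl
use strict;
use warnings;

# Hypothetical helper: compile "<operator><modifier>" inside an anonymous
# sub via string eval and report whether compilation succeeded.
# The sub is never called, so nothing is actually matched or substituted.
sub accepts_modifier {
    my ($operator, $modifier) = @_;
    local $@;
    my $compiled = eval "my \$unused = sub { local \$_ = ''; ${operator}${modifier} }; 1";
    return $compiled ? 1 : 0;
}

for my $case ([ 'qr{}', 'x' ], [ 'qr{}', 'g' ], [ 'm{}', 'g' ],
              [ 'm{}', 'e' ], [ 's{}{}', 'e' ]) {
    my ($operator, $modifier) = @$case;
    printf "%-6s /%s: %s\n", $operator, $modifier,
        accepts_modifier($operator, $modifier) ? 'ok' : 'compile error';
}
```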
Re: Best Practice: Order of regex modifiers? by choroba (Cardinal) on Feb 02, 2017 at 16:55 UTC
Reminds me of Re: Memorizing The s/// Option List For Fun and Profit.
($q=q:Sq=~/;[c](.)(.)/;chr(-||-|5+lengthSq)`"S|oS2"`map{chr |+ord }map{substrSq`S_+|`|}3E|-|`7**2-3:)=~y+S|`+$1,++print+eval$q,q,a,
Re^2: Best Practice: Order of regex modifiers? by LanX (Saint) on Feb 02, 2017 at 17:30 UTC
Oh, B::Deparse is a good point, too. :) ... and possibly use re 'debug'; (?) Cheers Rolf (addicted to the Perl Programming Language and ☆☆☆☆ :) Je suis Charlie!)
Re: Best Practice: Order of regex modifiers? ( s///gexis s///mexig ) by Anonymous Monk on Feb 02, 2017 at 02:48 UTC
Hi, I mostly don't think about it too much, but some things are memorable, like s///gexis s///mexig s///gimx s///gmix s///gsix s///gesr

