http://www.perlmonks.org?node_id=1004231


in reply to Capturing all instances of a repeating sub-pattern in regex

It seems odd that you are doing all this in a substitution rather than iterating over the results of the match, e.g. while ($text =~ /$re/g) {...}, particularly since you are already resorting to some fairly complex machinery, including the e modifier. However, you could do this in one fell swoop by capturing the entire block you want to reparse and applying a sub-regex to it in list context with the g modifier, which returns the full list of matches:

$text =~ s/ (.+?) (?:\s*\n)+ ((?: \d+ (?:\sPSI)? (?:\s*\n)+ ){4}) /push @r, [ $1, $2 =~ m|\d+ (?:\sPSI)?|xg ]/esgx;

If I were the maintenance guy that followed you, I would not think friendly thoughts when I saw that.

You could get something a little more maintainable by explicitly expanding your {4} terms:

$text =~ s/ (.+?) (?:\s*\n)+ (?: (\d+ (?:\sPSI)?) (?:\s*\n)+ ) (?: (\d+ (?:\sPSI)?) (?:\s*\n)+ ) (?: (\d+ (?:\sPSI)?) (?:\s*\n)+ ) (?: (\d+ (?:\sPSI)?) (?:\s*\n)+ ) /push @r, [ $1, $2, $3, $4, $5 ]/esgx;

which could be refactored into:

my $sub_re = qr/ (\d+ (?:\sPSI)?) (?:\s*\n)+ /x;
$text =~ s/ (.+?) (?:\s*\n)+ $sub_re $sub_re $sub_re $sub_re /push @r, [ $1, $2, $3, $4, $5 ]/esgx;

None of this changes the fact that using a substitution instead of a loop fundamentally obscures what the code is doing.
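
For comparison, here is a minimal sketch of the loop version. The sample data is my own invention (I am assuming a label line followed by four readings, which is only a guess at your real input), and I have anchored the label with ^ and the m modifier instead of using the s modifier, so this is not a drop-in equivalent of the substitution. Copying $1 and $2 into lexicals before running the inner match keeps the inner match from touching the capture variables.

```perl
use strict;
use warnings;

# Hypothetical sample input: a label line followed by four readings,
# each reading being digits with an optional " PSI" suffix.
my $text = <<'END';
Tank A
100 PSI
200
300 PSI
400

Tank B
10
20 PSI
30
40 PSI
END

# One reading, without its trailing newline(s).
my $reading = qr/ \d+ (?:\sPSI)? /x;

my @r;
while ($text =~ / ^ (.+?) (?:\s*\n)+ ( (?: $reading (?:\s*\n)+ ){4} ) /mgx) {
    # Copy the captures before the inner match runs.
    my ($label, $block) = ($1, $2);

    # A g-modified match in list context returns every reading in the block.
    push @r, [ $label, $block =~ /$reading/g ];
}

print join(', ', @$_), "\n" for @r;
```

With the sample data above, this prints one line per record: "Tank A, 100 PSI, 200, 300 PSI, 400" and "Tank B, 10, 20 PSI, 30, 40 PSI". The logic that was crammed into the replacement half of the substitution now sits in an ordinary loop body, where it can be read and debugged a line at a time.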


#11929 First ask yourself `How would I do this without a computer?' Then have the computer do it the same way.

Re^2: Capturing all instances of a repeating sub-pattern in regex
by wanna_code_perl (Friar) on Nov 17, 2012 at 12:57 UTC

    Thanks for this reply. I appreciate the comments and that, even though you had reservations about doing this in production code (which I would too, and this is not!), you had a go at answering the question I asked.

    I'll share a bit more, just in case you're wondering what I'm thinking by going with an approach like this. It's really just a personal challenge to see if (and how) I could tackle something a different way than I normally would. Similar to obfu/JAPH/golf in that it's amusing and enlightening. I've never tried anything quite so arcane as to capture every instance of a repeating multi-line pattern within a pattern. I only got so far with it before I got stumped, which is why I naturally came here for sage advice next.

    That's the main reason, anyway. The data itself came from a personal engineering project that quite literally exploded (and set off a few car alarms) after I got what I needed out of it. So it's rather unlikely I'll ever need to maintain this code, and even then, the much more difficult task would be rebuilding the thing, not to mention the municipal permit. :-)