http://www.perlmonks.org?node_id=87581


in reply to List-to-Range generation

OK, it took me a few minutes to completely understand how and why this works. Here is my dissected version of the regex:
s/(\d+)              # first number (group #1)
  (?:                # group #2
    ,                # followed by a comma
    (                # group #3
      (??{$++1})     # match previous number + 1 (group 4)
    )                # end group #3
  )+                 # end group #4, repeat
 /$1-$+/gx;          # substitute for the first number followed by the
                     # last matched one
Group #1 matches the first number in a sequence of numbers. Then (??{$+ + 1}) is used to match "the last number plus one" ($+ stands for whatever was matched by the last set of grouping parentheses). For the second number in a sequence, the "last number" is the one matched by group #1. For subsequent numbers (because of the +), the "last number" is whatever (??{$++1}) matched on the previous pass. The loop therefore repeats until the "last number plus one" part no longer matches (that is, until a non-consecutive number is found), and the whole match is then replaced with the first number (group #1), a dash, and the last number matched.
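As a quick way to try the substitution described above, here is a minimal sketch that wraps it in a sub. The name list_to_range is hypothetical, and the sample input is chosen so that every number is clearly separated from its neighbours. The regex itself is the one from the post, written without /x.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse runs of consecutive integers
# in a comma-separated list into "first-last" ranges.
sub list_to_range {
    my ($list) = @_;
    # $++1 parses as "$+ + 1": the successor of the last captured number.
    $list =~ s/(\d+)(?:,((??{$++1})))+/$1-$+/g;
    return $list;
}

print list_to_range('1,2,3,5,7,8,9,11'), "\n";   # prints 1-3,5,7-9,11
```

Numbers that stand alone (5 and 11 here) are left untouched, because the repeated group has to match at least once before the substitution fires.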

At first glance, I thought the double parentheses around (??{$++1}) were unnecessary, but without them it does not work, and here is why: $+ contains what was matched by the last set of parentheses, not the current set. Doubling the parentheses therefore makes $+ contain the last thing matched by the current expression. Very clever!
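To see the effect of the extra parentheses, the following sketch runs both variants on the same three-number run. The variable names are made up for illustration. The only difference between the two patterns is the capturing pair around (??{...}).

```perl
use strict;
use warnings;

my $single = '1,2,3';
my $double = '1,2,3';

# Capturing pair removed: nothing inside the loop captures,
# so $+ never moves past the value of $1.
$single =~ s/(\d+)(?:,(??{$++1}))+/$1-$+/g;

# Version from the post: the inner capture updates $+ on every pass.
$double =~ s/(\d+)(?:,((??{$++1})))+/$1-$+/g;

print "without inner capture: $single\n";
print "with inner capture:    $double\n";
```

Only the second variant should collapse the whole run into 1-3. The first keeps computing the successor of the same number and stops extending the match after the first pair.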

--ZZamboni

Re: Re: List-to-Range generation
by japhy (Canon) on Jun 11, 2001 at 21:17 UTC
    Almost.
    m{
      (\d+)            # \1 start -- digits -- \1 end
      (?:
        ,              # ,
        (              # \2 start
          (??{$++1})   # evaluate '$+ + 1' as a regex
        )+             # \2 end (and try again)
      )
    }
    The $+ refers to the last successful captured pattern, and that capture must have been closed. So the first time the (??{...}) is reached, $+ is $1's value. The next time, it's $2's (first) value, and then $2's new value, and so on.
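The progression described above can be observed by recording $+ each time the code block runs. This is only a sketch: the @seen array and the sample string are assumptions added for illustration, and the pattern is the one-line form of the regex from the parent post.

```perl
use strict;
use warnings;

# Package variable, so the regex code block can reach it without closures.
our @seen;

# Record $+ on every evaluation, then return the successor as the sub-pattern.
'4,5,6,9' =~ /(\d+)(?:,((??{ push @seen, $+; $+ + 1 })))+/
    or die "no match";

print "values of \$+ seen by the code block: @seen\n";
printf "after the match: \$1=%s \$2=%s\n", $1, $2;
```

The first recorded value comes from $1. The later ones come from successive values of $2. The final attempt, which looks for 7 and finds 9, is recorded as well before the engine gives up on that iteration.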

    japhy -- Perl and Regex Hacker
Re^2: List-to-Range generation
by rmocster (Novice) on Jul 21, 2016 at 22:12 UTC
    Is there any chance of restructuring it to handle zero-padded numbers? I'd like to convert (0001,0002,0003,011,012,013,015) to "0001-0003,011-013,015". Thanks a lot.

      Our venerable learned brother ZZamboni graces our humble monastery with his esteemed presence only infrequently these days. He last visited some 18 months ago, so you might be waiting a while for a direct reply.

      Other interested parties might wish to know that rmocster subsequently posted his own SoPW question (Convert an array of numbers into a range). You might wish therefore to follow that thread to see not only the context but the ensuing discussion.
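For the zero-padded case asked about above, one possible adaptation pads the computed successor to the width of the previous number with sprintf. This is a sketch only and is not taken from the linked thread. The name padded_list_to_range is hypothetical, and the approach assumes that the numbers within a run share the same width.

```perl
use strict;
use warnings;

# Hypothetical helper: collapse runs of consecutive zero-padded numbers.
# Assumption: numbers within a run are padded to the same width.
sub padded_list_to_range {
    my ($list) = @_;
    # Build the successor of the last capture, padded to that capture's length.
    $list =~ s/(\d+)(?:,((??{ sprintf '%0*d', length($+), $+ + 1 })))+/$1-$+/g;
    return $list;
}

print padded_list_to_range('0001,0002,0003,011,012,013,015'), "\n";
# prints 0001-0003,011-013,015
```

The '%0*d' format takes the width from the argument list, so a four-digit 0003 looks for 0004 while a three-digit 013 looks for 014.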