http://www.perlmonks.org?node_id=89587


in reply to Unrolling the loop technique

OK, I'll try my best at explaining the "unrolling the loop" mechanism (I don't have MRE at hand, so feel free to correct me if I am blatantly wrong!):

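I use this technique in 2 cases: matching a delimited string in which the end delimiter can be escaped, and matching a string that ends with a multi-character delimiter.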

In the first case here is how you want to match:
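As a minimal sketch of the general shape (the helper name unrolled_delimited and its three parameters are hypothetical, chosen for illustration only): opening delimiter, a run of "normal" characters, then any number of (escaped character, run of normal characters), then the closing delimiter.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper (not from MRE): build an unrolled regex for a string
# with single-character delimiters, where the escape char protects the
# character that follows it.
sub unrolled_delimited {
    my ($open, $close, $escape) = map { quotemeta($_) } @_;
    return qr{
        $open                      # opening delimiter
        (  [^$close$escape]*       # normal: neither end delimiter nor escape
           (?: $escape .           # special: the escape char plus any one char
               [^$close$escape]*   # normal again
           )*                      # repeat
        )
        $close                     # closing delimiter
    }xs;
}

my $dq    = unrolled_delimited('"', '"', '\\');
my @found = 'toto "a \" string" "b" tata' =~ /$dq/g;
print join(' | ', @found), "\n";   # prints: a \" string | b
```

Because the two character classes and the escape alternative can never match the same character, the engine has only one way to walk through the string, which is the point of unrolling.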

Now if you want to match a string with a multi-character end delimiter here is how to do it:
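A minimal sketch of that shape, assuming an end delimiter of at least 2 characters (the helper name unrolled_multichar is hypothetical): a run of characters that are not the first character of the end delimiter, then any number of (that first character when it is not followed by the rest of the delimiter, another run), then the end delimiter itself.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Hypothetical helper: build an unrolled regex for text closed by a
# multi-character delimiter. Assumes the closing delimiter has at least
# 2 characters.
sub unrolled_multichar {
    my ($open, $close) = @_;
    my $first = quotemeta(substr($close, 0, 1));   # first char of end delimiter
    my $rest  = quotemeta(substr($close, 1));      # remainder of end delimiter
    $open = quotemeta($open);
    return qr{
        $open                    # opening delimiter
        (  [^$first]*            # normal: not the first char of the end delimiter
           (?: $first (?!$rest)  # special: that char, when the rest does not follow
               [^$first]*        # normal again
           )*                    # repeat
        )
        $first $rest             # the complete end delimiter
    }xs;
}

my $comment = unrolled_multichar('/*', '*/');
my @found   = 'a /*foo**/ b /*/*/ c' =~ /$comment/g;
print join(' | ', @found), "\n";   # prints: foo* | /
```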

A potential pitfall: make sure you don't consume the character just after the first character of the end delimiter; only look at it. Otherwise input like **/ (where the first character of the end delimiter appears twice in a row, once as a regular character and once as the start of the end delimiter) would not be processed properly.
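To make the pitfall concrete, here is a small sketch comparing two variants for C-like comments (the sub names comment_naive and comment_lookahead are made up for this illustration; the `//` operator needs Perl 5.10 or later). The first consumes the character after `*`, the second only peeks at it with a negative lookahead.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Wrong: [^/] consumes the char after '*', so in '**/' it eats the '*'
# that should have started the end delimiter.
sub comment_naive {
    my ($text) = @_;
    return $text =~ m{/\*([^*]*(?:\*[^/][^*]*)*)\*/} ? $1 : undef;
}

# Right: (?!/) only checks that '/' does not follow, and consumes nothing.
sub comment_lookahead {
    my ($text) = @_;
    return $text =~ m{/\*([^*]*(?:\*(?!/)[^*]*)*)\*/} ? $1 : undef;
}

for my $text ('/*foo*bar*/', '/*foo**/') {
    my $naive = comment_naive($text)     // '(no match)';
    my $peek  = comment_lookahead($text) // '(no match)';
    print "$text  naive: $naive  lookahead: $peek\n";
}
```

Both variants agree on /*foo*bar*/, but on /*foo**/ the naive one finds no match at all while the lookahead one captures foo*.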

I guess a couple of examples might be appropriate.

First, matching a double-quoted string in which double quotes can be escaped using \":

#!/bin/perl -w
use strict;

while( <DATA>)
  { next if(/^\s*#/);                  # skip comments in DATA
    chomp;
    # split data into string to match and expected result(s)
    my( $string, @expected)= split /\s*=>\s*/;
    while( $string=~ m{"               # the start delimiter
                       ([^\\"]*        # anything but the end of the string or the escape char
                         (?:\\.        # the escape char preceding an escaped char (any char)
                            [^\\"]*    # anything but the end of the string or the escape char
                         )*)           # repeat
                       "}gx)           # the end delimiter
      { my $match= $1;
        my $expected= shift @expected;
        unless( $match eq $expected)
          { print "unexpected result line $.: found /$match/, expecting /$expected/\n"; }
      }
  }

__DATA__
# string to match                => expected result(s)
toto "a string" tata             => a string
toto "a string"                  => a string
toto "a string" tata             => a string
toto "a \" string" tata          => a \" string
toto "\" string" tata            => \" string
toto "a\"" tata                  => a\"
toto "\"" tata                   => \"
toto "\"\"" tata                 => \"\"
toto "string 1" "string 2" tata  => string 1 => string 2
toto "string 1                   =>
toto "string 1" "string          => string 1
toto "tata\\" tutu               => tata\\
toto "tata\\\"" tutu             => tata\\\"

And now how to match C-like comments:

#!/bin/perl -w
use strict;

while( <DATA>)
  { chomp;
    next if(/^\s*#/);                  # skip comments
    # split the data into the string to match and the expected result(s)
    my( $string, @expected)= split /\s*=>\s*/;
    while( $string=~ m{/\*             # the start delimiter
                       ([^*]*          # anything but the beginning of the end delimiter
                         (?:\*(?!/)    # the beginning of the end delimiter, not preceding the rest of it
                                       # (?!/) means "not before /", and does not use up the next char
                            [^*]*      # anything but the beginning of the end delimiter
                         )*)           # repeat
                       \*/}gx)         # the end delimiter
      { my $match= $1;
        my $expected= shift @expected || '';
        unless( $match eq $expected)
          { print "unexpected result line $.: found /$match/, expecting /$expected/\n"; }
      }
  }

__DATA__
# string            => result(s)
toto                =>
toto /*foo*/ tata   => foo
/*foo*/ tata        => foo
toto /*foo*/        => foo
toto /*foo*bar*/    => foo*bar
toto /*foo**/       => foo*
toto /**/           =>
/***/               => *
/*/*/               => /