$text =~ s{$subsRE}{$replacementLU{ $1 }}g;
This is where all the actual substitutions take place. The first part, {$subsRE}, looks for keyword matches (see below), and the /g modifier keeps looking until no more matches can be found. For each match found, the keyword, referenced by $1, is used as a hash key in the lookup table %replacementLU, and the value corresponding to that key is used for the substitution. So, for example, ARP_VULNERABILITY is replaced by y3.
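To see that line in action on its own, here is a minimal sketch. The table is a cut-down, hypothetical one (only ARP_VULNERABILITY => y3 comes from the thread; the other keys' values and the sample text are made up), and $subsRE is written out by hand here rather than generated:

```perl
use strict;
use warnings;

# Hypothetical lookup table; only ARP_VULNERABILITY => 'y3' is from the thread.
my %replacementLU = (
    ARP_VULNERABILITY => 'y3',
    EVENT_ID          => 'e1',
    RECORD_TYPE       => 'r7',
);

# Hand-written stand-in for $subsRE, covering just these three keys.
my $subsRE = qr{\b(ARP_VULNERABILITY|EVENT_ID|RECORD_TYPE)\b};

my $text = 'RECORD_TYPE=5; ARP_VULNERABILITY=1; EVENT_ID=9';

# Each matched keyword lands in $1 and is looked up in the hash.
$text =~ s{$subsRE}{$replacementLU{ $1 }}g;

print "$text\n";    # prints: r7=5; y3=1; e1=9
```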
OK, you knew all that, but where does $subsRE fit in? Let’s print it out to see what it looks like:
(?^:(?x) \b ( CONTENT_FILTERING_PROFILE_ID|QUOTA_GRANTED|ARP_VULNERABILITY|NETWORK_IDENTIFIER|ARP_PRIORITY_LEVEL|DEFAULT_BEARER_ID|EVENT_RESULT|EVENT_ID|QOS_PROFILE_ID|ARP_CAPABILITY|SYSTEM_IDENTIFIER|TRACKING_AREA_CODE|GX_RAR_RAA_TRANSACTION|SERVICE_AREA_CODE|RECORD_TYPE|RECORD_LENGTH|CHARGING_PROFILE_ID|QOS_ASSIGNED_TO_DEFAULT_BEARER|GX_CCR_CCA_TRANSACTION|RULE_REMOVED|BEARER_CONTROL_MODE|ROUTING_AREA_CODE|RULE_INSTALLED|CAUSE_PROTOCOL|SUBSCRIBERID ) \b)
As you can see, this says: match any one of the keywords provided it is preceded and followed by a word boundary (\b). The character | separating the keywords is the metacharacter for alternation; for example, A|B|C means: match either A, or B, or C. (See “Metacharacters” in perlre#Regular-Expressions.) Note the capturing parentheses: if any of the keywords is matched, it is captured into the next available capture variable (which in this case is $1).
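A small sketch of what the word boundaries buy you, using a two-keyword pattern and made-up input strings. Remember that the underscore counts as a word character, so there is no \b between MY_ and EVENT_ID:

```perl
use strict;
use warnings;

# Two-keyword version of the pattern, same shape as $subsRE.
my $re = qr{(?x) \b ( EVENT_ID | RECORD_TYPE ) \b};

my @captured;
for my $input ('EVENT_ID=4', 'MY_EVENT_ID=4', 'EVENT_IDS=4', 'x RECORD_TYPE y') {
    # $1 is only read when the match succeeded.
    push @captured, $input =~ $re ? $1 : 'no match';
}

print join(', ', @captured), "\n";
# prints: EVENT_ID, no match, no match, RECORD_TYPE
```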
OK, so where did this monster $subsRE come from? It would be no fun constructing this by hand, so johngg harnessed Perl to do the work. Note that qr// is the Perl regex quote operator: it converts a string into a regular expression (see perlop#Regexp-Quote-Like-Operators). (?x) is the /x modifier in inline form; it lets the pattern contain whitespace for readability. The string argument to qr// is constructed by interpolating the keys of the hash %replacementLU into the string. But just saying this:
qr{(?x) \b ( keys %replacementLU ) \b};
wouldn’t work because Perl would think you want to match the literal characters keys %replacementLU. Perl will interpolate when it sees a $ (for a scalar) or an @ (for an array), so we need to give Perl a construct like this: @{ ... }. But that says, dereference (something) to get an array. So we need to convert keys %replacementLU (which returns a list) into an array reference, which we do by creating an anonymous array with square brackets. So
@{ [ keys %replacementLU ] }
is the Perlish idiom for interpolating the contents of the list returned by the keys function into the string.
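You can try the idiom in an ordinary double-quoted string, which interpolates the same way. This is just a sketch with a hypothetical two-key hash; sort is added only so the output order is predictable:

```perl
use strict;
use warnings;

my %h = ( ALPHA => 1, BETA => 2 );    # hypothetical hash for illustration

my $plain = "keys: keys %h";                  # a % is not interpolated in a string
my $idiom = "keys: @{ [ sort keys %h ] }";    # list elements are interpolated
my $sum   = "sum: @{ [ 2 + 3 ] }";            # any expression works inside [ ... ]

print "$plain\n$idiom\n$sum\n";
# prints:
# keys: keys %h
# keys: ALPHA BETA
# sum: 5
```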
Now all we need is to separate the elements of the list with | (alternation) characters. Normally, when a list is interpolated, the elements are separated by spaces. But actually they’re separated by whatever is the contents of the special variable $", for which a space is the default. By changing it to |, we get the elements of the list separated by alternation characters, which gives us the regex we want.
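Here is a quick sketch of the separator at work, using a hypothetical three-element array. The join line is only there for comparison: interpolating an array gives the same string as joining it on the current value of $".

```perl
use strict;
use warnings;

my @keywords = qw(EVENT_ID RECORD_TYPE SUBSCRIBERID);

my $spaced = "@keywords";    # default separator is a single space

$" = q{|};                   # global change, tolerable in a throwaway script
my $piped  = "@keywords";
my $joined = join q{|}, @keywords;    # same string, built explicitly

print "$spaced\n$piped\n$joined\n";
# prints:
# EVENT_ID RECORD_TYPE SUBSCRIBERID
# EVENT_ID|RECORD_TYPE|SUBSCRIBERID
# EVENT_ID|RECORD_TYPE|SUBSCRIBERID
```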
johngg could have just said:
$" = q{|};
my $subsRE = qr{(?x) \b ( @{ [ keys %replacementLU ] } ) \b};
but that would leave $" set to |, which might interfere with other parts of the script. It’s better practice to localise any temporary changes made to global variables. The syntax:
my $subsRE = do {
local $" = q{|};
qr{(?x) \b ( @{ [ keys %replacementLU ] } ) \b};
};
uses local to limit the scope of the assignment, and takes advantage of the fact that the do { ... }; syntax (1) provides an enclosing scope for the local $" = q{|}; assignment; and (2) returns the value of its final statement, in this case the regex returned by qr//.
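To convince yourself that the change really is confined to the do block, here is a sketch with a hypothetical two-key table (only the y3 value is from the thread). sort is added just to make the printed regex repeatable:

```perl
use strict;
use warnings;

# Hypothetical table; only ARP_VULNERABILITY => 'y3' is from the thread.
my %replacementLU = ( ARP_VULNERABILITY => 'y3', EVENT_ID => 'e1' );

my $subsRE = do {
    local $" = q{|};    # in effect only until the closing brace
    qr{(?x) \b ( @{ [ sort keys %replacementLU ] } ) \b};
};

# Outside the block the separator is a space again.
my @pair  = qw(a b);
my $after = "@pair";

my $text = 'EVENT_ID and ARP_VULNERABILITY';
$text =~ s{$subsRE}{$replacementLU{ $1 }}g;

print "$after\n$text\n";
# prints:
# a b
# e1 and y3
```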
As Hannibal Smith liked to say: “I love it when a plan comes together.” :-)
Hope that helps,