Try something like this:
c:\@Work\Perl\monks\lewars>perl -wMstrict -le
"use Data::Dump qw(dd);
;;
my $xlate_file = 'lookup.dat';
open my $fh_xlate, '<', $xlate_file or die qq{opening '$xlate_file': $!};
;;
my %xlate =
map { m{ \A (\S+) \s+ (.+?) \s+ \z }xms }
<$fh_xlate>
;
dd \%xlate;
close $fh_xlate or die qq{closing '$xlate_file': $!};
;;
my ($rx_ref) =
map qr{ \b (?: $_) \b }xms,
join ' | ',
map quotemeta,
reverse sort
keys %xlate
;
print $rx_ref;
;;
my $master_file = 'master.dat';
open my $fh_master, '<', $master_file or die qq{opening '$master_file': $!};
;;
my $master = do { local $/; <$fh_master> };
close $fh_master or die qq{closing '$master_file': $!};
;;
$master =~ s{ ($rx_ref) }{$xlate{$1}}xmsg;
print qq{[[$master]]};
"
{
Ref00004 => "https://dealerportal4.xx.com/siteminderagent/forms/xx.fcc;ACS=0",
Ref00005 => "https://sso.xx.com/siteminderagent/forms/xx.fcc;ACS=0;REL=0",
Ref00006 => "https://secure3.xx.com/siteminderagent/forms/xx.fcc;ACS=0;REL=0",
Ref00007 => "https:///siteminderagent/cert/smgetcred.scc?cert",
Ref00008 => "https://secure4.xx.com/siteminderagent/forms/xx.fcc;ACS=0;REL=0",
Ref00009 => "https://vbos-uat.xx.com/siteminderagent/forms/xx.fcc;ACS=0;REL=0",
}
(?msx-i: \b (?: Ref00009 | Ref00008 | Ref00007 | Ref00006 | Ref00005 | Ref00004) \b )
[[<Property Name="CA.SM::AuthScheme.IsUsedbyAdmin">
<BooleanValue>false</BooleanValue>
</Property>
<Property Name="CA.SM::AuthScheme.Desc">
<StringValue>TCP portal auth scheme</StringValue>
</Property>
<Property Name="CA.SM::AuthScheme.Level">
<NumberValue>5</NumberValue>
</Property>
<Property Name="CA.SM::AuthScheme.IsTemplate">
<BooleanValue>false</BooleanValue>
</Property>
<Property Name="CA.SM::AuthScheme.Param">
<LinkValue><XREF>https://sso.xx.com/siteminderagent/forms/xx.fcc;ACS=0;REL=0</XREF></LinkValue>
</Property>
<Property Name="CA.SM::AuthScheme.Library">
]]
Notes:
- This approach slurps the entire master file into memory. That should work fine for a 38 MB or even a 380 MB file, but it will not scale indefinitely to larger files.
- The regex for matching references uses \b, so it assumes a reference string is always bounded by a non-\w character (or by the start or end of the file). If this is not the case, adjust as needed.
- The substitution replaces Ref00004-like strings anywhere and everywhere in the file. If you need the replacement done only in certain places, e.g., between certain tags, adjust the match regex as needed or perhaps use an XML parser.
- The example code only prints to standard output; adjust as needed.
- Update: No validation is done on the content of the lookup.dat file. It might be wise to consider adding some.
- Update: I think the regex for extracting URLs from the lookup data file will support embedded whitespace in the URL, but I haven't tested this. Caveat Programmor.
- Update: The regex for extracting reference placeholders and URLs from records in the lookup file is very naive. For instance, a bare \S+ is what matches a reference placeholder. Personally, I would feel better with a more specific match, maybe something like
  qr{ (?<! [[:alpha:]]) Ref \d{5} (?! \d) }xms
  Likewise, I'm sure there are canned regexes for matching URLs available.
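Two of the notes above (no slurping, and some validation of the lookup records) can be sketched together as a small script. This is only an untested-in-production sketch, not a drop-in replacement: the helper names parse_lookup_record and translate_stream are made up for illustration, the URL check is a deliberately crude https?:// followed by non-whitespace (so it assumes URLs have no embedded whitespace), and in-memory filehandles stand in for lookup.dat and master.dat.

```perl
use strict;
use warnings;

# Stricter placeholder pattern from the note above: 'Ref' plus exactly five digits.
my $rx_placeholder = qr{ (?<! [[:alpha:]]) Ref \d{5} (?! \d) }xms;

# Hypothetical helper: parse one lookup record.
# Returns (placeholder, url), or an empty list if the record looks invalid.
# Assumption: a URL is 'http://' or 'https://' followed by non-whitespace.
sub parse_lookup_record {
    my ($record) = @_;
    return $record =~ m{ \A ($rx_placeholder) \s+ (https?:// \S+) \s* \z }xms;
}

# Hypothetical helper: copy $in to $out one line at a time, translating
# placeholders, so the whole master file is never held in memory.
sub translate_stream {
    my ($in, $out, $xlate) = @_;
    while (my $line = <$in>) {
        # Placeholders missing from the lookup table are left untouched.
        $line =~ s{ ($rx_placeholder) }{ exists $xlate->{$1} ? $xlate->{$1} : $1 }xmsge;
        print {$out} $line;
    }
    return;
}

# Demo data standing in for lookup.dat and master.dat.
my $lookup_data = "Ref00005 https://sso.xx.com/siteminderagent/forms/xx.fcc;ACS=0;REL=0\n"
                . "bogus record\n";
my $master_data = "<XREF>Ref00005</XREF>\n<XREF>Ref00099</XREF>\n";

# Build the translation table, skipping (and reporting) invalid records.
open my $fh_lookup, '<', \$lookup_data or die "opening in-memory lookup: $!";
my %xlate;
while (my $record = <$fh_lookup>) {
    my ($ref, $url) = parse_lookup_record($record);
    if (!defined $ref) {
        warn "skipping invalid lookup record at line $.\n";
        next;
    }
    $xlate{$ref} = $url;
}
close $fh_lookup or die "closing in-memory lookup: $!";

# Translate the master data line by line into a result string.
my $result = '';
open my $fh_master, '<', \$master_data or die "opening in-memory master: $!";
open my $fh_out,    '>', \$result      or die "opening in-memory output: $!";
translate_stream($fh_master, $fh_out, \%xlate);
close $fh_out    or die "closing in-memory output: $!";
close $fh_master or die "closing in-memory master: $!";

print $result;
```

Because Ref-like placeholders never span a line break in the sample data, line-at-a-time processing gives the same result as slurping; if your real data can split a placeholder across lines, this sketch will not handle it.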
Update: For a good discussion of the technique used above to build the $rx_ref regex matching object, see Building Regex Alternations Dynamically by haukex.
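For readers who just want the gist of that technique, here is a minimal standalone sketch. It is a variation on the one-liner above rather than a copy of it: the keys are sorted longest-first instead of with reverse sort, and the lookup table (including the short 'Ref1' key) is invented purely to show the ordering.

```perl
use strict;
use warnings;

# Hypothetical lookup table; 'Ref1' and 'Ref10' exist only to show ordering.
my %xlate = (
    'Ref1'     => 'ONE',
    'Ref10'    => 'TEN',
    'Ref00004' => 'FOUR',
);

# Build the alternation:
#   - longest keys first, so a key is always tried before any shorter key
#     that is a prefix of it;
#   - quotemeta escapes any regex metacharacters in a key.
my $alternation =
    join ' | ',
    map  { quotemeta }
    sort { length $b <=> length $a or $a cmp $b }
    keys %xlate;

# Under /x the spaces around ' | ' are ignored by the regex engine.
my $rx_ref = qr{ \b (?: $alternation ) \b }xms;

my $text = 'Ref10 Ref1 Ref00004 Ref2';
(my $out = $text) =~ s{ ($rx_ref) }{$xlate{$1}}xmsg;

print "$out\n";    # prints: TEN ONE FOUR Ref2
```

With the \b anchors in place the ordering is rarely decisive, but it costs little and keeps the regex correct if the anchors are later loosened.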