http://www.perlmonks.org?node_id=989341

Mac1 has asked for the wisdom of the Perl Monks concerning the following question:

Hello Monks, I have been attempting to replace names within a given log file with a different set of names, but I want it to be configurable. For example, since I know the names in the log file, I would create an external file 'OriginalNames.txt' containing the known names that I want to replace. I would then create another file 'AlternateNames.txt' containing the replacement names.

The original log file would be opened and parsed, searching for every instance of the names listed in 'OriginalNames.txt' and replacing each one found with the corresponding name from 'AlternateNames.txt'.

However, I am running into a brick wall. I put together some of the code below, but I am not sure how to loop through to make the substitutions. As always, your suggestions would help. Thanks again.

    open(ORIGINALNAMES,"./OriginalNames.txt"); # Eg. Jupiter Mars Earth etc.
    @originalNames = <ORIGINALNAMES>;
    open(ALTERNATENAMES,"./AlternateNames.txt"); # Eg. Host1 Host2 Host3 etc.
    @alternateNames = <ALTERNATENAMES>;
    open(LOG,"./log.txt") or die; ## Log file containing names wanting to be replaced eg. Jupiter Mars Earth etc.
    @log = <LOG>;
    chomp @log;
    ## Should I do a for
    for($i = 0; $i<=$#log; $i++) {
        ### Not sure how to proceed to loop through log file, find the original names and then substitute with the Alternate Names..
    }

Replies are listed 'Best First'.
Re: Array Element Substitution
by Athanasius (Archbishop) on Aug 23, 2012 at 16:13 UTC

    Hello Mac1, and welcome to the Monastery!

    For the task outlined, I would build a hash with names-to-be-replaced as the keys, and their replacements as the values.

    However, I think you should first clarify the format of your input files. Do you intend that each occurrence of “Jupiter” be replaced with “Host1”, each occurrence of “Mars” be replaced with “Host2”, etc.? If so, this scheme looks very brittle. For example, a single name out of place in either file would throw out the replacements for all words following.

    Also, did you intend Jupiter Mars Earth to follow one another on the same line (as shown), or to appear on separate lines? This makes a difference to the way the input files are read in and parsed.

    Please specify the precise formats of the two input files, and their relationship to each other. This will make it easier for the monks to help you (and may help to clarify your own thinking about the problem).

    Athanasius <°(((><contra mundum

      Thanks for your response. A hash approach would be great, but I am not sure how to make the substitutions. The actual 'OriginalNames.txt' would contain the names one after another, such as:

      Earth

      Jupiter

      Venus

      Mars

      The 'AlternateNames.txt' file would contain the exact replacements in the same order, aligned with 'OriginalNames.txt', such as:

      Host1

      Host2

      Host3

      Host4

      Thus, Earth would be replaced by Host1, Jupiter by Host2, Venus by Host3, and Mars by Host4.
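A minimal sketch of the hash Athanasius suggests, assuming the two files hold one name per line in matching order as described above. The helper names `build_name_map` and `read_names` are made up for illustration; the length check is one way to catch the "name out of place" brittleness he warns about, though it cannot detect two names swapped within a file.

```perl
use strict;
use warnings;

# Hypothetical helper: pair two equally long lists of names into a hash.
sub build_name_map {
    my ($old_names, $new_names) = @_;
    die "Name lists differ in length\n"
        unless @$old_names == @$new_names;
    my %map;
    @map{@$old_names} = @$new_names;    # hash slice: old name => new name
    return \%map;
}

# Hypothetical helper: read one name per line, skipping blank lines.
sub read_names {
    my ($path) = @_;
    open my $fh, '<', $path or die "Cannot open $path: $!\n";
    my @names;
    while (my $line = <$fh>) {
        $line =~ s/^\s+|\s+$//g;        # trim whitespace and the newline
        push @names, $line if length $line;
    }
    close $fh;
    return \@names;
}

# Inline lists stand in for read_names('OriginalNames.txt') and
# read_names('AlternateNames.txt').
my $map = build_name_map(
    [qw(Earth Jupiter Venus Mars)],
    [qw(Host1 Host2 Host3 Host4)],
);
print "$_ => $map->{$_}\n" for sort keys %$map;
```

With the real files, the two array references would come from `read_names` instead of the inline lists.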

Re: Array Element Substitution
by Kenosis (Priest) on Aug 23, 2012 at 20:51 UTC

    aaron_baugher provided a good solution. Nevertheless, here's a longer (and not necessarily prettier nor essentially operationally different) option. It uses only one old/new file, where the old/new word pairs are tab delimited, e.g.:

        Earth	Host1
        Jupiter	Host2
        Venus	Host3
        Mars	Host4

    It uses File::Slurp for file read/write operations. The regex matches the old words on word boundaries, in case you don't want to replace embedded 'words':

        use Modern::Perl;
        use File::Slurp qw/read_file write_file/;

        my $logFile    = 'log.txt';
        my $oldNewFile = 'oldNewFile.txt';
        my $text       = read_file $logFile;

        for(read_file $oldNewFile){
            my ($old, $new) = split;
            $text =~ s/\b$old\b/$new/g
        }

        say $text;
        #write_file( $logFile, $text );

    If you're satisfied with the replacement results shown in the printed output, you can uncomment the #write_file... line to save the changes to log.txt. I recommend running it on a test file, first.
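To illustrate the word-boundary point, here is a small sketch. The helper `replace_word` is hypothetical, and the `\Q...\E` escaping is an addition that is not in Kenosis's code; it is there on the assumption that a host name might contain regex metacharacters such as a dot.

```perl
use strict;
use warnings;

# Hypothetical helper: replace whole-word occurrences of $old with $new.
sub replace_word {
    my ($text, $old, $new) = @_;
    # \Q...\E escapes regex metacharacters in the name;
    # \b anchors the match at word boundaries.
    $text =~ s/\b\Q$old\E\b/$new/g;
    return $text;
}

my $line = "Mars is up; Marshall is not a host.";
print replace_word($line, 'Mars', 'Host4'), "\n";
# prints "Host4 is up; Marshall is not a host."
```

Without the `\b` anchors, the "Mars" inside "Marshall" would be rewritten as well.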

    Hope this helps!

      Hi, I apologize for the delay in responding, but this helped. I added some parts and inserted some additional error checking etc. Overall, I liked the approach using the slurp module. Thanks again very much.

        You're most welcome, Mac1! Am glad it helped...

Re: Array Element Substitution
by aaron_baugher (Curate) on Aug 23, 2012 at 20:00 UTC

    Short, ugly version below. Responsible error checking and file handling are left to the student.

        #!/usr/bin/env perl
        use Modern::Perl;
        my @old = `cat old.txt`; chomp @old;
        my @new = `cat new.txt`; chomp @new;
        my %h;
        @h{@old} = @new;
        my $text = `cat log.txt`;
        $text =~ s/$_/$h{$_}/g for @old;
        print $text;

    Aaron B.
    Available for small or large Perl jobs; see my home node.

Re: Array Element Substitution
by cheekuperl (Monk) on Aug 24, 2012 at 03:03 UTC
    The way I see it, you have a certain orig_name=>new_name mapping (a hash) where $hash{$orig_name}=$new_name. And you wish to replace all orig_name in your log_file with new_name.
        #Open the file and slurp it into an array @lines
        #Search orig_name in line and replace it with new_name
        map{s/$orig_name/$new_name/g} @lines;
        #Write the transformed array into file again
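The outline above could be fleshed out roughly as follows. This is only a sketch: `substitute_lines` is a made-up helper, the sample lines stand in for the slurped log file, and the `\b` and `\Q...\E` guards are additions that the outline does not mention.

```perl
use strict;
use warnings;

# Hypothetical helper: apply every old => new pair to each line in place.
sub substitute_lines {
    my ($lines, $map) = @_;
    for my $line (@$lines) {            # $line aliases the array element
        for my $old (keys %$map) {
            $line =~ s/\b\Q$old\E\b/$map->{$old}/g;
        }
    }
    return $lines;
}

my @lines = ("Earth pinged Mars\n", "Venus is down\n");
my %map   = (Earth => 'Host1', Venus => 'Host3', Mars => 'Host4');
substitute_lines(\@lines, \%map);
print @lines;
```

Because the pairs are applied one after another, a replacement that happens to equal another old name could be rewritten a second time, depending on hash key order; with names like Host1..Host4 that does not arise.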
Re: Array Element Substitution
by Marshall (Canon) on Aug 24, 2012 at 10:17 UTC
    I guess that this would also be a solution.
    Adapt to your application..
        #!/usr/bin/perl -w
        use strict;

        my $heredoc = <<END;
        This is a whole bunch of stuff with Jupiter and Mars and Earth and Venus and Saturn.
        Also could be Earth and Earth and something else like Neptune.
        END

        my $xlatedoc = <<END;
        Jupiter host1
        Mars host2
        Earth host3
        END

        my %xlate;
        open (XLATE, '<', \$xlatedoc) or die "xlatedoc failed: $!\n";
        while (<XLATE>)
        {
            my ($planet, $host) = split;  #no chomp() needed
            $xlate{$planet}=$host;
        }
        close (XLATE);

        my @planets = keys %xlate;
        my $all_planets = join ('|',@planets);

        open (IN, '<', \$heredoc) or die "heredoc failed: $!\n";
        while (<IN>)
        {
            s/($all_planets)/$xlate{$1}/g;
            # $all_planets is an OR expression
            # $1 winds up being which one of the OR'ed terms matched
            # that gets translated to the new term
            print;
        }
        __END__
        This is a whole bunch of stuff with host1 and host2 and host3 and Venus and Saturn.
        Also could be host3 and host3 and something else like Neptune.
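A possible hardening of the single-pass alternation above, offered as a sketch rather than as Marshall's method: the hypothetical helper `build_name_regex` escapes each name, tries longer names first, and adds word boundaries. The name 'Mars-2' in the comment is invented purely to show why ordering can matter.

```perl
use strict;
use warnings;

# Hypothetical helper: build one alternation regex from the hash keys.
sub build_name_regex {
    my ($map) = @_;
    die "Empty name map\n" unless %$map;
    # Longest names first, so a name that extends another
    # (e.g. a made-up 'Mars-2' next to 'Mars') is tried first.
    my $alternation = join '|',
        map  { quotemeta }
        sort { length($b) <=> length($a) or $a cmp $b }
        keys %$map;
    return qr/\b($alternation)\b/;
}

my %xlate = (Jupiter => 'host1', Mars => 'host2', Earth => 'host3');
my $re    = build_name_regex(\%xlate);

my $line = "Jupiter and Mars and Earth and Venus";
$line =~ s/$re/$xlate{$1}/g;
print "$line\n";    # prints "host1 and host2 and host3 and Venus"
```

Since each position in the text is matched once, a replacement is never itself rewritten, which sidesteps the ordering concerns of applying the pairs one at a time.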