PerlMonks  

Search & Replace Across Two Files

by sophix (Sexton)
on Nov 21, 2010 at 11:18 UTC

sophix has asked for the wisdom of the Perl Monks concerning the following question:

Hi all,

I am stuck with this search&replace problem and would appreciate your help.

What I would like to do is to find the ids in a file and replace them with their descriptions provided in another text file.

Here is the reference file - tab-delimited

301609239	kitchen
114672573	study room
20531742	living room
89242146	terrace
301781720	balcony

And here is the file to be read, in which the replacements should be made:

ABCDE
+ +
+P237S S{301609239} + + +
+R372W W{114672573} + + +
+K576R R{20531742} R{89242146} + + +
+R1059Q Q{126723431} Q{6934272}
STRKE
+T216I I{301781720} I{148237434} + + +
+V275I I{149632297} I{47224534} + + +
+R13H H{126333615} + + +
+F113L L{301781720} L{148237434} + + +
+G135S S{147902132} S{47224534} S{125864042} S{107921834} + + +
+T307A A{224050516} A{126333615} + + +
+L217F F{149632297} F{147902132}

The desired output would look like this:

ABCDE
+ +
+P237S S{kitchen} + + +
+R372W W{study room} + + +
+K576R R{living room} R{terrace} + + +
+R1059Q Q{126723431} Q{6934272}
STRKE
+T216I I{balcony} I{148237434} + + +
+V275I I{149632297} I{47224534} + + +
+R13H H{126333615} + + +
+F113L L{301781720} L{148237434} + + +
+G135S S{147902132} S{47224534} S{125864042} S{107921834} + + +
+T307A A{224050516} A{126333615} + + +
+L217F F{149632297} F{147902132}

And here is the script that I wrote, but it does not work. It does not even generate an error.

#!/usr/bin/perl -w
my $data = 'reference.txt';
open INFILE, '<', $data;
while(<INFILE>) {
    my $line = $_;
    chomp($line);
    my ($id, $description) = split /\t/, $line, 2;
    my $data2 = "readthisfile.txt";
    open INFILE2, "<", $data2;
    while(<INFILE2>) {
        my $line2 = $_;
        chomp($line2);
        $line2 =~ s/$id/$description/g;
    }
}

Replies are listed 'Best First'.
Re: Search & Replace Across Two Files
by NetWallah (Canon) on Nov 21, 2010 at 15:06 UTC
    Your code reads the entire INFILE2 for each line of INFILE.

    I have re-written the code to help you get the logic right.

    This code is fairly efficient (there is room for improvement, but it would increase complexity). I have deliberately made it more verbose and simpler than it needs to be.
    See the discussion in Alternation vs. looping for multiple searches for improvement ideas.
    Untested.

    Assumes INFILE contents are smaller than INFILE2, AND small enough to fit into memory.

    If this is not the case, you need to consider using a database.

    #!/usr/bin/perl -w
    use strict;
    use warnings;

    my %subst;
    my $data = 'reference.txt';
    open my $INFILE, '<', $data or die "cannot read $data: $!";
    # Parentheses are required here: "defined my $line = ..." does not compile
    while (defined(my $line = <$INFILE>)) {
        chomp($line);
        my ($id, $description) = split /\t/, $line, 2;
        $subst{$id} = $description;
    }
    close $INFILE;

    my $data2 = 'readthisfile.txt';
    open my $INFILE2, '<', $data2 or die "cannot read $data2: $!";
    while (defined(my $line2 = <$INFILE2>)) {
        ## chomp($line2); Dont chomp - leave newline in for printing
        my $changes = 0;
        for my $id (keys %subst) {
            if ($line2 =~ s/$id/$subst{$id}/g) {
                # This line was changed
                $changes++;
            }
        }
        print $line2 if $changes; # Prints only changed lines
    }
    close $INFILE2;

         Syntactic sugar causes cancer of the semicolon.        --Alan Perlis
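
A rough sketch of the alternation idea referred to in this reply, for comparison with the loop over keys: all ids are joined into one regex, so each line is scanned once. The sample data is copied from the question. The helper name replace_ids and the decision to match an id only when it fills a whole {...} pair are assumptions of this sketch, not part of the posted code.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# Reference data copied from the question, held in memory for this sketch.
my %subst = (
    '301609239' => 'kitchen',
    '114672573' => 'study room',
    '20531742'  => 'living room',
    '89242146'  => 'terrace',
    '301781720' => 'balcony',
);

# One alternation of all ids, each escaped with quotemeta.
my $alternation = join '|', map { quotemeta($_) } keys %subst;

# Assumption: an id only counts when it fills a whole {...} pair,
# so a short id can never match inside a longer number.
my $re = qr/\{($alternation)\}/;

# replace_ids is a hypothetical helper name.
sub replace_ids {
    my ($line) = @_;
    $line =~ s/$re/'{' . $subst{$1} . '}'/ge;
    return $line;
}

print replace_ids("+K576R\tR{20531742}\tR{89242146}\n");
print replace_ids("+R1059Q\tQ{126723431}\tQ{6934272}\n");
```

Ids that are not in the hash never match the pattern, so those lines pass through unchanged. If the hash were empty, the alternation would be empty too, and that case would need a separate guard.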

Re: Search & Replace Across Two Files
by JavaFan (Canon) on Nov 21, 2010 at 11:45 UTC
    All you are doing is reading in data - but you never write or print your modified data. So, what do you expect your program to do?

    Oh, and as a general observation, do check the return value of open. Opening a file may fail.
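
To make both points concrete, here is a minimal sketch that checks the return value of open and actually prints the modified lines. The helper names open_for_reading and print_replaced are made up for illustration, and the demonstration reads from an in-memory string rather than the real files so that it can run anywhere.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# open_for_reading is a hypothetical helper: die with the OS error
# message if the file cannot be opened.
sub open_for_reading {
    my ($path) = @_;
    open my $fh, '<', $path
        or die "Cannot open '$path' for reading: $!";
    return $fh;
}

# print_replaced (also a made-up name) reads lines from $in, replaces
# every literal occurrence of $id with $description, and prints each
# line to $out, so the result is not silently discarded.
sub print_replaced {
    my ($in, $out, $id, $description) = @_;
    while (my $line = <$in>) {
        $line =~ s/\Q$id\E/$description/g;
        print {$out} $line;
    }
    return;
}

# Demonstration with an in-memory filehandle standing in for the data file.
my $input = "+P237S\tS{301609239}\t+\n";
open my $in, '<', \$input or die "Cannot open in-memory input: $!";
print_replaced($in, \*STDOUT, '301609239', 'kitchen');
close $in;
```

With real files, open_for_reading('readthisfile.txt') would take the place of the in-memory handle.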

Re: Search & Replace Across Two Files
by generator (Pilgrim) on Nov 21, 2010 at 12:01 UTC
    Well, for one thing, both files are being opened for "read only" access ("<"). "+<" will allow both read and write.

    You might consider reading the first file's code/room pairs into a hash (presuming that each numeric code is unique), then opening the second file and replacing each code (the hash key) with the description (the hash value) looked up from the first file.

    <><

    generator
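
The hash suggestion above could be sketched roughly as follows. The names load_pairs and lookup_ids are hypothetical, the reference data is read from an in-memory string for the sake of a runnable example, and the sketch assumes that every id in the data file is a run of digits inside braces.

```perl
#!/usr/bin/perl
use strict;
use warnings;

# load_pairs (made-up name) reads "id<TAB>description" lines from a
# filehandle and returns a reference to an id => description hash.
sub load_pairs {
    my ($fh) = @_;
    my %description_for;
    while (my $line = <$fh>) {
        chomp $line;
        my ($id, $description) = split /\t/, $line, 2;
        next unless defined $id && defined $description;   # skip malformed lines
        $description_for{$id} = $description;
    }
    return \%description_for;
}

# lookup_ids (made-up name) replaces each {digits} whose digits are a
# known key; unknown ids are put back unchanged.
sub lookup_ids {
    my ($line, $description_for) = @_;
    $line =~ s/\{(\d+)\}/'{' . (exists $description_for->{$1} ? $description_for->{$1} : $1) . '}'/ge;
    return $line;
}

# Demonstration with two pairs from the question.
my $reference = "301609239\tkitchen\n114672573\tstudy room\n";
open my $ref_fh, '<', \$reference or die "Cannot open in-memory reference: $!";
my $description_for = load_pairs($ref_fh);
close $ref_fh;

print lookup_ids("+R372W\tW{114672573}\t+\n", $description_for);
```

Because the lookup happens once per id found in the line, the cost per line does not grow with the size of the reference file.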

      "+<" will allow both read and write.

      In most cases like this one, modifying an input file is a really bad idea.
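
One way to follow that advice is to write the result to a separate file and leave the input untouched. The sketch below is only an illustration: rewrite_copy is a hypothetical helper, the output name readthisfile.replaced.txt is invented, and the demonstration runs in a temporary directory created with the core File::Temp module.

```perl
#!/usr/bin/perl
use strict;
use warnings;
use File::Temp qw(tempdir);

# rewrite_copy is a hypothetical helper: it reads $in_path, passes each
# line through $transform, and writes the result to $out_path.
# The input file is only ever opened for reading.
sub rewrite_copy {
    my ($in_path, $out_path, $transform) = @_;
    open my $in,  '<', $in_path  or die "Cannot read '$in_path': $!";
    open my $out, '>', $out_path or die "Cannot write '$out_path': $!";
    while (my $line = <$in>) {
        print {$out} $transform->($line);
    }
    close $in;
    close $out or die "Cannot close '$out_path': $!";
    return;
}

# Demonstration in a throwaway directory so no real file is touched.
my $dir      = tempdir(CLEANUP => 1);
my $in_path  = "$dir/readthisfile.txt";
my $out_path = "$dir/readthisfile.replaced.txt";   # made-up output name

open my $seed, '>', $in_path or die "Cannot create '$in_path': $!";
print {$seed} "+P237S\tS{301609239}\t+\n";
close $seed or die "Cannot close '$in_path': $!";

rewrite_copy($in_path, $out_path, sub {
    my ($line) = @_;
    $line =~ s/\{301609239\}/{kitchen}/g;
    return $line;
});
```

If the original really has to be replaced, the new file can be renamed over it once it has been written and checked.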

Re: Search & Replace Across Two Files
by Generoso (Prior) on Nov 22, 2010 at 04:01 UTC

    Try this code to see if it is what you are looking for.

    #!/usr/bin/perl -w
    my $data = 'reference.txt';
    open INFILE, '<', $data;
    my %hash = ();
    while (<INFILE>) {
        chomp($_);
        my ($id,$description)=split(/\t/);
        #print $id,' << ',$description,"\n";
        $hash{ $id } = $description;
    }
    #print "\n";
    #while ( my ($key, $value) = each(%hash) ) {
    #    print "$key => $value\n";
    # }
    my $data2 = "readthisfile.txt";
    open INFILE2, "<", $data2;
    while(<INFILE2>) {
        chomp($_);
        @parts = split /\t/;
        print "Original \t",++$i,': ',$_,"\n";
        print "Replaced \t",$i,': ';
        foreach $parl(@parts){
            if ($parl =~ m/(.*\{)([0-9]+)(\}.*)/) {
                $parl = $1.$hash{$2}.$3 if exists $hash{$2};
            }
            print $parl,"\t";
        }
        print "\n";
    }

    perl "D:\Temp\repalce1.pl"
    Process started >>>
    Original 1: ABCDE
    Replaced 1: ABCDE
    Original 2: + +
    Replaced 2:
    Original 3: +P237S S{301609239} + + +
    Replaced 3: +P237S S{kitchen}
    Original 4: +R372W W{114672573} + + +
    Replaced 4: +R372W W{study room}
    Original 5: +K576R R{20531742} R{89242146} + + +
    Replaced 5: +K576R R{living room} R{terrace}
    Original 6: +R1059Q Q{126723431} Q{6934272}
    Replaced 6: +R1059Q Q{126723431} Q{6934272}
    Original 7:
    Replaced 7:
    Original 8: STRKE
    Replaced 8: STRKE
    Original 9:
    Replaced 9:
    Original 10: +T216I I{301781720} I{148237434} + + +
    Replaced 10: +T216I I{balcony} I{148237434}
    Original 11: +V275I I{149632297} I{47224534} + + +
    Replaced 11: +V275I I{149632297} I{47224534}
    Original 12: +R13H H{126333615} + + +
    Replaced 12: +R13H H{126333615}
    Original 13: +F113L L{301781720} L{148237434} + + +
    Replaced 13: +F113L L{balcony} L{148237434}
    Original 14: +G135S S{147902132} S{47224534} S{125864042} + S{107921834} + +
    Replaced 14: +G135S S{147902132} S{47224534} S{125864042} + S{107921834}
    Original 15: +T307A A{224050516} A{126333615} + + +
    Replaced 15: +T307A A{224050516} A{126333615}
    Original 16: +L217F F{149632297} F{147902132}
    Replaced 16: +L217F F{149632297} F{147902132}
    <<< Process finished.
    ================ READY ================
