Re^22: search and replace strings in different files in a directory

by PitifulProgrammer (Acolyte)
on Sep 09, 2014 at 10:23 UTC ( [id://1099968]=note )


in reply to Re^21: search and replace strings in different files in a directory
in thread search and replace strings in different files in a directory

Dear Monks

Thanks a mil for your comments on how to comment one's code. I am sorry for my improper use of some technical terms, which led to some confusion. I did not know that there was so much to consider.

I went through some of the suggested links and promise to do better in the next project(s). They were very helpful and will surely improve how I structure my code and how I go about coding in general.

I changed the code a bit, i.e. replaced move with copy, and got the results my colleagues specified. I will put the script to the test next time. There might be some issues with running the script on the server, and not everybody has Perl installed, so I guess I'll get the txt file and run the script on my machine.

I would however like to post the recent version here so that it is accessible to others. It would also be grand if I got some feedback on the new comments.

Yes, before I forget, one of you mentioned that the following line would not quite match as intended.

    sub Replace {
        my( $in, $bak ) = @_;
        path( $in )->copy( $bak ); #copy $in to $bak
        my $infh  = path( $bak )->openr_raw;
        my $outfh = path( $in )->openrw_raw;
        while( <$infh> ) {
            s{&}{&amp;}g; ## will match more than what you want fix it
            s{&amp;amp;}{&amp;}g;
            s{\s>\s}{&gt;}g;
            s{\s<\s}{&lt;}g;
            print $outfh $_;
        }
        close $infh;
        close $outfh;
    }

That contributor was right, but the substitution only goes wrong if the source file has a particular structure in terms of the items to be substituted. I have not yet found out which structure, since all the recent substitutions proved to be OK.

I'll let you guys know, or some of you might have an idea.
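
One structure that appears to trigger the problem is a line that already contains an entity other than &amp;amp;, for example &amp;lt; or a numeric entity. The first substitution escapes its ampersand, and the second substitution only undoes the &amp;amp;amp; case. A minimal sketch, assuming the two ampersand substitutions are applied exactly as in Replace (the helper name fix_two_step and the sample strings are made up for illustration):

```perl
use strict;
use warnings;

# Hypothetical helper: applies the two ampersand substitutions
# from Replace() to a single string and returns the result.
sub fix_two_step {
    my( $line ) = @_;
    $line =~ s{&}{&amp;}g;          # escapes every ampersand, also those inside entities
    $line =~ s{&amp;amp;}{&amp;}g;  # only repairs what used to be &amp;
    return $line;
}

# Sample inputs, made up for illustration
for my $sample ( 'Fred & Wilma', 'Barney &amp; Betty', 'a &lt; b', '&#169; 2014' ) {
    printf "%-20s => %s\n", $sample, fix_two_step( $sample );
}
```

The first two samples come out as wanted, while the last two end up double-escaped, because nothing converts the extra &amp;amp; in front of lt; or #169; back.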

Thanks a mil to all contributors for your patience and providing the bits and pieces which have created this wonderful script.

Thank you and keep it going.

Kind regards

C
Re^23: search and replace strings in different files in a directory
by PitifulProgrammer (Acolyte) on Sep 09, 2014 at 10:28 UTC

    Dear all

    This is my final (slightly anonymised) version of the code, which is working for me as intended.

    #!/usr/bin/perl --
    use 5.014;
    use strict;
    use warnings;
    use Path::Tiny qw/ path /;
    use POSIX();
    use autodie qw/ close /;
    use File::BOM;
    use Carp::Always;
    use Data::Dump qw/ dd /;

    Main( @ARGV );
    exit( 0 );

    sub Main {
        #my( $infile_paths ) = @_; #if run via
        my( $infile_paths ) = 'C:\dev\test_paths.txt';
        chomp $infile_paths;
        my @paths = GetPaths( $infile_paths );
        for my $path ( @paths ){
            RetrieveAndBackupXML( $path );
        }
        return @paths;
    } ## end sub Main

    sub GetPaths {
        use File::BOM;
        ## my @paths = path( shift )->lines_utf8;
        my @paths = path( shift )->lines( { binmode => ":via(File::BOM)" } );
        s/\s+$// for @paths; # "chomp"
        return @paths;
    } ## end sub GetPaths

    sub RetrieveAndBackupXML {
        my( $directory ) = shift; ## same as shift @_ ##
        my $date = POSIX::strftime( '%Y-%m-%d', localtime ); #suffix for the backup-file, e.g. 2014-08-01
        my $bak = "$date.bak";
        my @xml_files = path( $directory )->children( qr/\.xml$/ );
        for my $file ( @xml_files ) {
            Replace( $file, "$file-$bak" );
        }
    } ## end sub RetrieveAndBackupXML

    # Fix xml entities and create a copy of the original file before editing
    sub Replace {
        my( $in, $bak ) = @_;
        path( $in )->copy( $bak ); #create a copy of $in with the ending(s) specified in $bak
        my $infh  = path( $bak )->openr_raw;
        my $outfh = path( $in )->openrw_raw;
        while( <$infh> ) {
            s{&}{&amp;}g; ## In some case does not match as intended
            s{&amp;amp;}{&amp;}g;
            s{\s>\s}{&gt;}g;
            s{\s<\s}{&lt;}g;
            print $outfh $_;
        }
        close $infh;
        close $outfh;
    } ## end sub Replace

      I notice you have this:

      while( <$infh> ) {
          s{&}{&amp;}g; ## In some case does not match as intended
          s{&amp;amp;}{&amp;}g;
          ...
      }

      presumably because, when the input line already contains &amp;, the first substitution changes it to &amp;amp;, so the second substitution is needed to change it back again! Better to replace these two substitutions with a single substitution using a negative look-ahead assertion (?!...). Proof-of-concept:

      14:25 >perl -wE "my @s = ('Fred & Wilma', 'Barney &amp; Betty'); for (@s) { s{&(?!amp;)}{&amp;}g }; say for @s;"
      Fred &amp; Wilma
      Barney &amp; Betty

      14:25 >

      See “Look-Around Assertions” in perlre#Extended-Patterns.
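
      If the input may also contain other entities (&amp;lt;, &amp;gt;, numeric ones), the same look-ahead idea can be widened so that anything shaped like an entity is left alone. This is only a sketch: the helper name escape_bare_ampersands and the sample strings are made up for illustration, and the pattern checks the shape of an entity, not whether the name is actually defined.

```perl
use strict;
use warnings;

# Hypothetical helper: escape only those ampersands that do not already
# start something shaped like an entity (&name;  &#123;  &#x1F;).
sub escape_bare_ampersands {
    my( $line ) = @_;
    $line =~ s{&(?!(?:[A-Za-z][A-Za-z0-9]*|#[0-9]+|#x[0-9A-Fa-f]+);)}{&amp;}g;
    return $line;
}

# Sample inputs, made up for illustration
for my $sample ( 'Fred & Wilma', 'Barney &amp; Betty', 'a &lt; b', '&#169; 2014', 'R&D' ) {
    printf "%-20s => %s\n", $sample, escape_bare_ampersands( $sample );
}
```

      Because only the shape is tested, a bare ampersand that happens to be followed by letters and a semicolon would be skipped as well, so it is worth checking the result against real data.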

      Hope that helps,

      Athanasius <°(((><contra mundum Iustus alius egestas vitae, eros Piratica,

        Dear Athanasius

        Thanks a mil for posting your regular expression.

        It is quite funny: the line that caught your interest was no longer part of the code. I must have posted this particular version by accident.

        However, this will surely resolve some issues to come.

        Your help is much appreciated.

        Thanks a mil again. I will bookmark the extended regex patterns; I am sure I will be needing them soonish.

        Kind regards

        C.
